Two sorted arrays: find the k-th largest element
2013-02-28 00:12
Related problem 1: median of two sorted arrays
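Related problem 2: find the k-th element in two sorted arrays
Related problem 3: two sorted arrays, find the k-th largest element
Given two sorted arrays A and B, of sizes m and n respectively, find the k-th element of the array obtained by merging A and B (the discussion below treats this as the k-th smallest). Assume there are no duplicate elements.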
The trivial way, O(m + n):
Merge both arrays and the k-th smallest element could be accessed directly. Merging would require extra space of O(m+n). The linear run time is pretty good, but could we improve it even further?
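The merge approach can be sketched as follows. This is a minimal illustration, not code from the original post: the function name kthByMerging is made up here, k is assumed to be 1-based, and the inputs are assumed to be sorted in ascending order.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical helper (name chosen for this sketch).
// Assumes a and b are sorted ascending and 1 <= k <= a.size() + b.size().
int kthByMerging(const std::vector<int>& a, const std::vector<int>& b, int k) {
    assert(k >= 1 && static_cast<std::size_t>(k) <= a.size() + b.size());
    std::vector<int> merged;                 // the O(m + n) extra space
    merged.reserve(a.size() + b.size());
    std::merge(a.begin(), a.end(), b.begin(), b.end(),
               std::back_inserter(merged));  // O(m + n) time
    return merged[k - 1];                    // k-th smallest, 1-based
}
```

For A = {1, 3, 5, 7} and B = {2, 4, 6, 8}, calling kthByMerging(A, B, 4) returns 4.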
A better way, O(k):
This improves on the method above, thanks to the readers who suggested it. (See the comments by Martin for an implementation.) Using two pointers, you can traverse both arrays without actually merging them, and so without the extra space. Both pointers start at the heads of A and B respectively, and at each step the pointer at the smaller of the two current elements is advanced by one. The k-th smallest element is reached after a total of k steps. This algorithm is very similar to finding the intersection of two sorted arrays.
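A minimal sketch of the two-pointer walk is given here. It is not the reader's implementation mentioned above; the name kthByTwoPointers is hypothetical, k is assumed to be 1-based, and both inputs are assumed to be sorted ascending.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (name chosen for this sketch).
// Assumes a and b are sorted ascending and 1 <= k <= a.size() + b.size().
int kthByTwoPointers(const std::vector<int>& a, const std::vector<int>& b, int k) {
    assert(k >= 1 && static_cast<std::size_t>(k) <= a.size() + b.size());
    std::size_t i = 0, j = 0;  // heads of the unread parts of a and b
    int last = 0;              // element consumed by the most recent step
    for (int step = 0; step < k; ++step) {
        // Advance the pointer at the smaller current element;
        // if b is exhausted, a must still have elements left.
        if (j == b.size() || (i < a.size() && a[i] < b[j])) {
            last = a[i++];
        } else {
            last = b[j++];
        }
    }
    return last;               // the k-th element consumed is the k-th smallest
}
```

The loop runs exactly k times and uses O(1) extra space, which matches the O(k) bound stated in the heading.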
The best solution, but non-trivial, O(lg m + lg n):
Although the above solution improves both run time and space complexity, it only works well for small values of k; since k can be as large as m + n, it is still linear in the worst case. Could we improve the run time further?
The logarithmic complexity in the heading gives us one important hint. Binary search is a great example of achieving logarithmic complexity by halving its search space in each iteration. Therefore, to achieve a complexity of O(lg m + lg n), we must halve the search space of A and B in each iteration.
We approach this tricky problem by comparing middle elements of A and B, which we call Ai and Bj. If Ai is between Bj-1 and Bj, we have just found the (i + j + 1)-th smallest element. Why? Exactly i elements of A (A0 to Ai-1) and j elements of B (B0 to Bj-1) are smaller than Ai, so i + j elements precede it in the merged order. Therefore, if we choose i and j such that i + j = k - 1, we are able to find the k-th smallest element. This is an important invariant that we must maintain for the correctness of this algorithm.
Summarizing the above,
Maintaining the invariant
i + j = k - 1,
If Bj-1 < Ai < Bj, then Ai must be the k-th smallest,
or else if Ai-1 < Bj < Ai, then Bj must be the k-th smallest.
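The two conditions can be checked for one particular split as in the sketch below. The struct SplitResult and the function splitFindsKth are hypothetical names introduced for illustration. Out-of-range neighbours are treated as minus or plus infinity using INT_MIN and INT_MAX, which assumes neither value occurs in the arrays.

```cpp
#include <cassert>
#include <climits>
#include <vector>

// Hypothetical result type for this sketch.
struct SplitResult {
    bool found;  // true if the split (i, j) pins down the k-th smallest
    int value;   // the k-th smallest when found, otherwise unused (0)
};

// Hypothetical helper: tests the two conditions for a split with i + j == k - 1.
// Assumes sorted inputs without duplicates and without INT_MIN / INT_MAX.
SplitResult splitFindsKth(const std::vector<int>& a, const std::vector<int>& b,
                          int i, int j) {
    const int m = static_cast<int>(a.size());
    const int n = static_cast<int>(b.size());
    assert(0 <= i && i <= m && 0 <= j && j <= n);
    const int Ai_1 = (i == 0) ? INT_MIN : a[i - 1];
    const int Bj_1 = (j == 0) ? INT_MIN : b[j - 1];
    const int Ai = (i == m) ? INT_MAX : a[i];
    const int Bj = (j == n) ? INT_MAX : b[j];
    if (Bj_1 < Ai && Ai < Bj) return {true, Ai};  // Bj-1 < Ai < Bj
    if (Ai_1 < Bj && Bj < Ai) return {true, Bj};  // Ai-1 < Bj < Ai
    return {false, 0};                            // neither: must subdivide
}
```

With A = {1, 3, 5, 7}, B = {2, 4, 6, 8} and k = 4, the split i = 1, j = 2 satisfies neither condition (Ai = 3, Bj-1 = 4), while the split i = 2, j = 1 satisfies the second one (3 < 4 < 5) and yields 4.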
If one of the above conditions is satisfied, we are done. If not, we use i and j as pivot indices to subdivide the arrays. But how? Which portion should we discard? What about Ai and Bj themselves?
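We observe that, when neither condition above holds, Ai < Bj implies Ai < Bj-1, and likewise Bj < Ai implies Bj < Ai-1. Why? If Ai < Bj but Bj-1 < Ai, then Bj-1 < Ai < Bj and the first condition would have held; the other case is symmetric.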
Using the above relationship, it becomes clear that when Ai < Bj, neither Ai nor any element in its lower portion can be the k-th smallest element. The same goes for Bj and its upper portion. Therefore, we can conveniently discard Ai together with its lower portion and Bj together with its upper portion.
If you are still not convinced why the above argument is true, try drawing blocks representing the elements of A and B. Try to visualize inserting the blocks of A up to Ai in front of Bj-1. You can easily see that no element in the inserted blocks could ever be the k-th smallest. For the latter, keep the invariant i + j = k - 1 in mind to reason about why Bj and its upper portion could never be the k-th smallest.
On the other hand, the case for Ai > Bj is just the other way around. Easy.
Below is the code; I have inserted lots of assertions (a highly recommended programming style, by the way) to help you understand it. Note that the code is an example of tail recursion, so you could convert it to an iterative method in a straightforward manner. However, I leave it as it is, since this is how I derived the solution and it seems more natural to express it recursively.
Another side note concerns the choice of i and j. The code below subdivides both arrays using their sizes as weights. The reason is that it might be able to guess the k-th element quicker (as long as A and B do not differ in an extreme way, e.g. all elements in A being smaller than those in B). If you are wondering, yes, you could choose i to be A's middle. In theory, you could choose any values for i and j as long as the invariant i + j = k - 1 is satisfied and both indices stay in range (0 <= i <= m and 0 <= j <= n, as the assertions in the code require).
#include <cassert>
#include <climits>

int findKthSmallest(int A[], int m, int B[], int n, int k) {
  assert(m >= 0); assert(n >= 0); assert(k > 0); assert(k <= m+n);

  int i = (int)((double)m / (m+n) * (k-1));
  int j = (k-1) - i;

  assert(i >= 0); assert(j >= 0); assert(i <= m); assert(j <= n);
  // invariant: i + j = k-1
  // Note: A[-1] = -INF and A[m] = +INF to maintain invariant
  int Ai_1 = ((i == 0) ? INT_MIN : A[i-1]);
  int Bj_1 = ((j == 0) ? INT_MIN : B[j-1]);
  int Ai   = ((i == m) ? INT_MAX : A[i]);
  int Bj   = ((j == n) ? INT_MAX : B[j]);

  if (Bj_1 < Ai && Ai < Bj)
    return Ai;
  else if (Ai_1 < Bj && Bj < Ai)
    return Bj;

  assert((Ai > Bj && Ai_1 > Bj) ||
         (Ai < Bj && Ai < Bj_1));

  // if none of the cases above, then it is either:
  if (Ai < Bj)
    // exclude Ai and below portion
    // exclude Bj and above portion
    return findKthSmallest(A+i+1, m-i-1, B, j, k-i-1);
  else /* Bj < Ai */
    // exclude Ai and above portion
    // exclude Bj and below portion
    return findKthSmallest(A, i, B+j+1, n-j-1, k-j-1);
}
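As noted earlier, the tail recursion can be unrolled into a loop. The following is one possible sketch of that conversion, not the author's code. The name findKthSmallestIterative is hypothetical, the arrays are passed as std::vector with window offsets instead of raw pointers, and the weighted split is computed with integer arithmetic rather than the double cast. It keeps the same assumptions as the recursive version: sorted inputs, no duplicates, and no element equal to INT_MIN or INT_MAX.

```cpp
#include <cassert>
#include <climits>
#include <vector>

// Hypothetical iterative variant (name and signature chosen for this sketch).
int findKthSmallestIterative(const std::vector<int>& a,
                             const std::vector<int>& b, int k) {
    int aLo = 0, m = static_cast<int>(a.size());  // window a[aLo .. aLo+m-1]
    int bLo = 0, n = static_cast<int>(b.size());  // window b[bLo .. bLo+n-1]
    assert(k > 0 && k <= m + n);
    for (;;) {
        // Weighted split; invariant: i + j == k - 1.
        const int i = static_cast<int>(static_cast<long long>(m) * (k - 1) / (m + n));
        const int j = (k - 1) - i;
        assert(i >= 0 && j >= 0 && i <= m && j <= n);
        const int Ai_1 = (i == 0) ? INT_MIN : a[aLo + i - 1];
        const int Bj_1 = (j == 0) ? INT_MIN : b[bLo + j - 1];
        const int Ai = (i == m) ? INT_MAX : a[aLo + i];
        const int Bj = (j == n) ? INT_MAX : b[bLo + j];
        if (Bj_1 < Ai && Ai < Bj) return Ai;
        if (Ai_1 < Bj && Bj < Ai) return Bj;
        if (Ai < Bj) {
            // Exclude Ai and its lower portion, Bj and its upper portion.
            aLo += i + 1; m -= i + 1; n = j; k -= i + 1;
        } else {
            // Exclude Bj and its lower portion, Ai and its upper portion.
            bLo += j + 1; n -= j + 1; m = i; k -= j + 1;
        }
    }
}
```

Each branch of the loop mirrors one of the two recursive calls: the updated offsets and sizes are exactly the arguments the recursive version would pass on.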