从两个排序数组中找到第 k 个最小的元素答案

【问题标题】：Find kth smallest element from two sorted arrays从两个排序数组中找到第 k 个最小的元素
【发布时间】：2017-03-04 11:57:19
【问题描述】：

给定两个大小为 M 和 N 的排序数组。我试图实现一个时间复杂度为 O(logM+logN) 的算法。该策略基本上是根据长度条件比较两个子数组中的中间索引元素。

    // Test case 1
    // Works for all position except when kth is 6
    int[] num1 = {6,7,8,9,10,11,12};
    int[] num2 = {1,2,3,4,5};

    // Test case 2
    // Always print the next smallest element
    int[] num3 = {1,3,5,7,9};
    int[] num4 = {2,4,6,8,10,12,14,16,18,20,22,24,30,40,50,56,77,35};


public static int findKth(int[] A, int p1, int r1, int[] B, int p2, int r2, int k){


    if (p1 > r1) { 
        return B[p2+k-1];
    } else if (p2 > r2) {
        return A[p1+k-1];
    }


    int midA = p1 + (int)Math.floor((r1-p1)/2);// Middle element from subarray A
    int midB = p2 + (int)Math.floor((r2-p2)/2);// Middle element from subarray B

    /**
     * Compare the sum of number of elements from left-subarray up to middle element. 
     */
    if ((midA-p1+midB-p2+2) < k) { 
        // We don't need to the left-subarray based on the comparisons between middle element
        if (A[midA] > B[midB]) {
            return findKth(A, p1, r1, B, midB+1, r2, k-(midB-p2+1)); //
        } else {
            return findKth(A, midA+1, r1, B, p2, r2, k-(midA-p1+1)); //
        }
    } else {
        // We don't need to the right-subarray based on the comparisons between middle element.
        if (A[midA] > B[midB]) {
            return findKth(A, p1, midA-1, B, p2, r2, k);
        } else {
            return findKth(A, p1, r1, B, p2, midB-1, k);
        }
    }
}

我觉得我使用的策略应该是正确的。但是对于上面显示的两个测试用例，它会在某个特定的第 k 个值中打印错误的输出。所以我猜我的策略一定有问题。谁能简要描述一下这个实现的哪一部分是不正确的？谢谢！

【问题讨论】：

我猜你可以通过使用 minheap/maxheap 来解决这个问题，这是获取第 k 个最小元素的最佳方法之一
@Geeky 我知道还有很多其他方法可以解决这个问题。但是你能发现我在这个算法上做错了什么吗？

标签： arrays algorithm time-complexity

【解决方案1】：

如果我能在你的 sn-p 中找到错误，我会更新我的答案。现在你可以看看我的代码，它的逻辑和你的完全一样，除了：

该策略基本上是比较两个中间的索引元素子数组基于它们的长度条件。

我的代码的简单性和小尺寸的主要区别是我通过调用函数并在第一个数组/索引对不小的情况下交换其参数来避免一些 if-else 条件（对于长度条件）。

public static int findKth(int[] A, int i, int[] B, int j, int k) {

    // Here is the simple trick. We've just changed the parameter order if first array is not smaller.
    // so that later we won't need to write if-else conditions to check smaller/greater stuff
    if((A.length - i) > (B.length - j))
    {
        return findKth(B, j, A, i, k);
    }

    if(i >= A.length) 
    {
        return B[j + k - 1];
    }
    if(k == 1)
    {
        return Math.min(A[i], B[j]);
    }

    int aMid = Math.min(k / 2, A.length - i);
    int bMid = k - aMid;

    if(A[i + aMid - 1] <= B[j + bMid - 1])
    {
        return findKth(A, i + aMid, B, j, k - aMid);
    }

    return findKth(A, i, B, j + bMid, k - bMid);
}

public static int findKthSmallestElement(int[] A, int[] B, int k)
{
    if(k > A.length + B.length)
        return -1;

    return findKth(A, 0, B, 0, k);
}

时间复杂度为O(log(m + n))。

【讨论】：

刚刚发现错误。无论如何感谢您的努力！

【解决方案2】：

StackOverflow 上已经有很好的解决方案，例如here 和here，所以我将只专注于识别错误：

这一行：

if ((midA-p1+midB-p2+2) < k) {

应该修改为使用<=:

if ((midA-p1+midB-p2+2) <= k) {

您可以通过这个小示例了解为什么需要这样做：

A = {1,3}
B = {2}

其他变量将是：

p1 = p2 = r2 = midA = midB = 0
r1 = 1

所以表达式midA-p1+midB-p2+2 将是2。在这种情况下，您显然想丢弃 A 的一个元素，而不是 B 的唯一元素。

请注意，通常可以包含 k 案例来执行 if 块，因为您永远不会丢弃 k（即太多) 元素：作为最后一个参数传递给递归调用的表达式总是最多k-1。

另一方面，当if 表达式为k 时转到else 部分是错误的，如midA 或midB em> 很可能是那个 k^th 元素的索引，它会被扔掉。

【讨论】：

谢谢，我认为你是对的。由于这种情况，该程序不适用于某些 kth 值。有时我只是发现定义这些边界条件有点棘手。程序现在返回所有输入数组和第 k 个值的正确值。
不客气。如果此答案解决了您的问题，请考虑将其标记为已接受（单击答案旁边的灰色复选标记）。

【解决方案3】：

#include <bits/stdc++.h>
using namespace std;

int findKthElement(int a[],int start1,int end1,int b[],int start2,int end2,int k){
    if(start1 >= end1)return b[start2+k-1];
    if(start2 >= end2)return a[start1+k-1];
    if(k==1)return min(a[start1],b[start2]);
    int aMax = INT_MAX;
    int bMax = INT_MAX;
    if(start1+k/2-1 < end1) aMax = a[start1 + k/2 - 1];
    if(start2+k/2-1 < end2) bMax = b[start2 + k/2 - 1];

    if(aMax > bMax){
        return findKthElement(a,start1,end1,b,start2+k/2,end2,k-k/2);
    }
    else{
        return findKthElement(a,start1 + k/2,end1,b,start2,end2,k-k/2);
    }
}

int main(void){
    int t;
    scanf("%d",&t);
    while(t--){
        int n,m,k;
        cout<<"Enter the size of 1st Array"<<endl;
        cin>>n;
        int arr[n];
        cout<<"Enter the Element of 1st Array"<<endl;
        for(int i = 0;i<n;i++){
            cin>>arr[i];
        }
        cout<<"Enter the size of 2nd Array"<<endl;
        cin>>m;
        int arr1[m];
        cout<<"Enter the Element of 2nd Array"<<endl;
        for(int i = 0;i<m;i++){
            cin>>arr1[i];
        }
        cout<<"Enter The Value of K";
        cin>>k;
        sort(arr,arr+n);
        sort(arr1,arr1+m);
        cout<<findKthElement(arr,0,n,arr1,0,m,k)<<endl;
    }

    return 0;
}

时间复杂度为 O(log(min(m,n)))

【讨论】：