JavaScript program to find common elements in two arrays: answers

【Question title】: Javascript Program for find common elements in two array
【Posted】: 2019-02-11 16:25:50
【Question description】:

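I recently had the following interview question: suppose we have two sorted arrays of different lengths. We need to find the elements common to both arrays.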

var a=[1,2,3,4,5,6,7,8,9,10];
var b = [2,4,5,7,11,15];
for(var i=0;i<a.length;i++){
    for(var j=0;j<b.length;j++){
        if(a[i]==b[j]){
            console.log(a[i],b[j])
        }
    }
}

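I wrote it as above. The interviewer then said: now assume a has 2000 elements and b has 3000 elements. How would you write it more efficiently?

Please explain your answer with sample code, so I can understand it more clearly.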

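【Question discussion】:

Arrays of objects? Ints? Strings?
Can a single array contain the same element two or more times?
Since they are sorted: binary search. Runs in O(log n) rather than O(n^2). See also stackoverflow.com/questions/22697936/…
Possible duplicate of Simplest code for array intersection in javascript
O(n) complexity is possible. Find the minimum value across the two arrays, then find the next higher value for each item. Record the matches along the way.

Tags: javascript algorithm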

【Solution 1】:

I sometimes find it convenient to convert one of the lists into a hash set.

var hashA = {};
for(var i=0; i<a.length; i++) {hashA[a[i]] = true;}

Then you can search the hash set.

for(var i=0; i<b.length; i++) {if(hashA[b[i]]) {console.log(b[i]);}}

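This is of course not as fast as binary search, because you have to spend time building the hash set, but it is not bad, and if you need to keep the list around and run many searches against it in the future, it may be the best option. Also, I know JavaScript objects are not just hash sets, it is complicated, but it works well most of the time.

Honestly though, for 3000 items I would not change the code. That is still not large enough to be a problem. It will run in 30 ms. So it also depends on how often it runs. Once an hour? Forget it. Once per millisecond? Definitely optimize it.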
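
For reference, the same idea can be written with the built-in `Set`, which sidesteps the quirks of using a plain object as a hash set (keys coerced to strings, inherited properties). This is only a minimal sketch; the function name `commonWithSet` is made up for illustration and does not come from the answer above.

```javascript
// Hypothetical helper (not from the answer above): intersect two arrays
// by loading one into a Set and probing it with the other.
function commonWithSet(a, b) {
  const seen = new Set(a);        // one pass over a
  const result = [];
  for (const value of b) {        // one pass over b
    if (seen.has(value)) {
      result.push(value);
    }
  }
  return result;
}

// Arrays taken from the question.
console.log(commonWithSet([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], [2, 4, 5, 7, 11, 15]));
// logs the array [2, 4, 5, 7]
```

Because `Set` compares values without converting them to strings, the number 1 and the string "1" stay distinct here, unlike with object keys.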

【Discussion】:

【Solution 2】:

The easiest way!!

var a = [1,2,3,4,5,6,7,8,9,10];
var b = [2,4,5,7,11,15];

for(let i of a){
  if(b.includes(i)){
    console.log(i)
  }
}


--------- OR --------------

var c = a.filter(value => b.includes(value))
console.log(c)

【Discussion】:

【Solution 3】:

Not sure, but this may help

let num1 = [2, 3, 6, 6, 5];
let num2 = [1, 3, 6, 4];
var array3 = num1.filter((x) => {
  return num2.indexOf(x) != -1
})
console.log(array3);

【Discussion】:

【Solution 4】:

If we are talking about an algorithm for finding the common elements between two arrays, here is my take.

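function common(arr1, arr2) {
 // Keep the elements of arr1 that also occur in arr2.
 // The original also called newArr.concat(...) but discarded the result,
 // so that line had no effect and is dropped here.
 // Note: indexOf scans arr2 on every call, so this is still quadratic.
 return arr1.filter(function(v){ return arr2.indexOf(v) >= 0;});
}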

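But if you also have to think about performance, then you should try other approaches as well.

First check the performance of JavaScript loops here; it will help you figure out the best approach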

https://dzone.com/articles/performance-check-on-different-type-of-for-loops-a

https://hackernoon.com/javascript-performance-test-for-vs-for-each-vs-map-reduce-filter-find-32c1113f19d7

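【Discussion】:

This results in exactly the same complexity (if not worse)
It is better than creating a loop inside a loop. If you use a loop inside a loop, the iteration count is 2000*3000 (the array lengths), whereas in my code it will be 2000 + 3000. Any other ideas?
Your code is not 2000 + 3000 (i.e. linear); using .indexOf just hides the quadratic behavior. It is still there.
But I have shared my view on the question. I timed both functions. My function runs faster than the loop function.
@ArifRathod So what? It is not faster in big-O terms. It is still quadratic: a constant-factor improvement is irrelevant to an interview question about algorithmic complexity. Let me put it another way: if the arrays had 20 million and 30 million elements respectively, would you still think your answer is fast enough?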
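
To make the point of this exchange concrete, here is a small sketch that counts element comparisons. `countingIndexOf` is a hypothetical stand-in that mimics the linear scan `indexOf` has to perform; it is not how any engine actually implements it, and the input arrays are invented to hit the worst case.

```javascript
let comparisons = 0;

// Hypothetical stand-in for Array.prototype.indexOf that counts comparisons.
function countingIndexOf(arr, target) {
  for (let i = 0; i < arr.length; i++) {
    comparisons++;
    if (arr[i] === target) return i;
  }
  return -1;
}

// Worst case: no element of a occurs in b, so every scan runs to the end.
const a = Array.from({ length: 2000 }, (_, i) => i);         // 0..1999
const b = Array.from({ length: 3000 }, (_, i) => i + 2000);  // 2000..4999

const common = a.filter((v) => countingIndexOf(b, v) >= 0);
console.log(common.length, comparisons);
// logs 0 and 6000000
```

With these inputs the filter performs 2000 × 3000 = 6,000,000 comparisons rather than 2000 + 3000, even though only one loop is visible in the source.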

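【Solution 5】:

You can build a hash from the first array (whether or not the arrays are sorted), then iterate over the second array and check whether each element exists in the hash!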

let arr1 = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150],
  arr2 = [15,30,45,60,75,90,105,120,135,150,165],
  hash = arr1.reduce((h,e)=> (h[e]=1, h), {}), //iterate first array once
  common = arr2.filter(v=>hash[v]); //iterate second array once

console.log('Common elements: ', common);

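【Discussion】:

【Solution 6】:

Since the arrays are sorted, binary search is the key.

Basically, you are searching for an item in an array.

You compare the item with the element at the middle index of the array (length / 2).

If both are equal, you have found it.

If the item is lower than the element at the middle index, compare it with the element at index length / 4 -> ((0 + length / 2) / 2); if it is higher, compare it with the element at index ((length / 2) + length) / 2 (the middle of the upper half), and so on.

That way, if for example you have to search for an item in an array of length 40,000, in the worst case you find out that the item is not in the array after 16 comparisons: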

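I am searching for "something" in an array with 40000 indexes; the minimum index where I can find it is 0 and the maximum is 39999.

"something" > arr[20000]. Let's assume that. I now know that the minimum index to search is 20001 and the maximum is 39999. I am now looking at the middle one, (20000 + 39999) / 2.

Now, "something" < arr[30000], which limits the search to indexes 20001 to 29999. (20000 + 30000) / 2 = 25000.

"something" > arr[25000], so I have to search from 25001 to 29999. (25000 + 30000) / 2 = 27500

"something" < arr[27500], so I have to search from 25001 to 27499. (25000 + 27500) / 2 = 26250

"something" > arr[26250], so I have to search from 26251 to 27499. (26250 + 27500) / 2 = 26875

"something" < arr[26875], so I have to search from 26251 to 26874. (26250 + 26875) / 2 = 26563

And so on... Of course, you have to round to avoid floating-point indexes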
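
The figure of 16 comparisons can be checked by counting how many times a range of 40,000 indexes can be halved before nothing is left. The helper name `worstCaseComparisons` is an assumption made for this sketch.

```javascript
// Hypothetical helper: number of halvings needed to shrink a range of
// n candidates to zero, i.e. the worst-case comparison count of a
// binary search over n sorted elements.
function worstCaseComparisons(n) {
  let steps = 0;
  while (n > 0) {
    n = Math.floor(n / 2);  // each comparison discards half of the range
    steps++;
  }
  return steps;
}

console.log(worstCaseComparisons(40000)); // 16
console.log(worstCaseComparisons(3000));  // 12
```

For the interviewer's 3000-element array, each lookup therefore costs at most 12 comparisons.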

var iteration = 0;

// Binary search for item in the sorted array arr.
// Side effect: removes the elements of arr known to be smaller than item,
// so later searches for larger items work on a shorter array.
function bSearch(item, arr)
{
    var minimumIndex = 0;
    var maximumIndex = arr.length - 1;
    var found = false;

    // Stop once the range is empty; this also covers an empty array.
    while (minimumIndex <= maximumIndex)
    {
        ++iteration;
        var index = Math.floor((minimumIndex + maximumIndex) / 2);
        if (item == arr[index])
        {
            found = true;
            break;
        }
        if (item < arr[index])
            maximumIndex = index - 1;
        else
            minimumIndex = index + 1;
    }
    arr.splice(0, minimumIndex);
    return (found);
}

// Example input taken from the question; both arrays must be sorted ascending.
var arrA = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
var arrB = [2, 4, 5, 7, 11, 15];

for (var i = 0; i < arrA.length; ++i)
{
    if (bSearch(arrA[i], arrB))
        console.log(arrA[i]);
}
console.log("number of iterations : " + iteration);

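【Discussion】:

I would be happy to upvote this if you post working code.
No, binary search does help with finding one element in a sorted array, but not with comparing two sorted arrays.
@Bergi I know, right, but nothing stops you from looping over the first array and calling a binary search function. I will edit my answer.
@Cid That is still quite inefficient and not what the interviewer was looking for
@Bergi Also, you are wrong about efficiency. This is the right answer for the case where the sizes are significantly unequal. constant * log2 x quickly becomes much smaller than constant + x as x grows.

【Solution 7】:

We can iterate over one array and look for duplicates in the other, but each time we find a match, we move to the matched element + 1 for the next iteration of the nested loop. It works because both arrays are sorted, so after each match the part of the array left to compare is shorter (from left to right).

We can also break out of the nested loop as soon as the element of the inner array is greater than the current element of the outer array (shorter from right to left), because we will never find a match (the arrays are ordered, so only greater values remain). Here is an example that finds the duplicates in two arrays of 10k elements in roughly 15 ms: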

var arr = [];
var arr2 = [];

for(let i = 0; i<9999; i++){
    arr.push(i);
    arr2.push(i+4999)
}

var k = 0;//<-- the index we start to compare
var res = [];

for (let i = 0; i < arr2.length; i++) {
  for (let j = k; j < arr.length; j++) {
    if (arr2[i] === arr[j]) {
      res.push(arr2[i]);
      k = j + 1;//<-- updates the index
      break;
    } else if (arr[j] > arr2[i]) {//<-- there is no need to keep going
      break;
    }
  }
}

console.log(res.length)

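I did not print res because it has 5000 elements.

【Discussion】:

【Solution 8】:

Since both arrays are sorted, just save the index of the latest match, then start your inner loop from that index.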

var lastMatchedIndex = 0;
for(var i=0;i<a.length;i++){
    for(var j=lastMatchedIndex;j<b.length;j++){
        if(a[i]==b[j]){
            console.log(a[i],b[j]);
            lastMatchedIndex = j;
            break;
        }
    }
}

==================

Update:

As Xufox mentioned in the comments, once b[j] is greater than a[i] you have to break out of the inner loop, because there is no point in continuing it.

var lastMatchedIndex = 0;
for(var i=0;i<a.length;i++){
    for(var j=lastMatchedIndex;j<b.length;j++){
        if(a[i]==b[j]){
            console.log(a[i],b[j]);
            lastMatchedIndex = j;
            break;
        }
        if(a[i]<b[j]){
            // b[j] and everything after it is larger than a[i]: stop scanning
            lastMatchedIndex = j;
            break;
        }
    }
}

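【Discussion】:

This improvement prevents checking items of b that are too low, but it does not prevent checking items that are too high. There should be an if(a[i] < b[j]){ break; }, otherwise the worst-case complexity is still O(n²).
@Xufox Yes, you are absolutely right. Should I edit my code and add yours?
If you want to, sure.

【Solution 9】:

The optimal strategy is to keep the number of comparisons and array reads to a minimum.

In theory, what you want is to alternate which list you are advancing through, to avoid unnecessary comparisons. Given that the lists are sorted, we know that no number to the left of any index in a list can be greater than the value at that index.

Assume list A = [1,5] and list B = [1,1,3,4,5,6], with indexes a and b both starting at 0; you would want your code to proceed like this: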

A[a] == 1, B[b] == 1
A[a] == B[b] --> add indexes to results and increase b (B[b] == 1)
A[a] == B[b] --> add indexes to results and increase b (B[b] == 3)
A[a] < B[b] --> don't add indexes to results and increase a (A[a] == 5)
A[a] > B[b] --> don't add indexes to results and increase b (B[b] == 4)
A[a] > B[b] --> don't add indexes to results and increase b (B[b] == 5)
A[a] == B[b] --> add indexes to results and increase b (B[b] == 6)
A[a] < B[b] --> don't add indexes to results and increase a (A is at the end, so we terminate and return results)

Below is my JavaScript implementing the algorithm above:

//Parameters
var listA = [];
var listB = [];
//Parameter initialization
(function populateListA() {
    var value = 0;
    while (listA.length < 200) {
        listA.push(value);
        value += Math.round(Math.random());
    }
})();
(function populateListB() {
    var value = 0;
    while (listB.length < 300) {
        listB.push(value);
        value += Math.round(Math.random());
    }
})();
//Searcher function
function findCommon(listA, listB) {
    //List of results to return
    var results = [];
    //Initialize indexes
    var indexA = 0;
    var indexB = 0;
    //Loop through list a
    while (indexA < listA.length) {
        //Get value of A
        var valueA = listA[indexA];
        var result_1 = void 0;
        //Get last result or make a first result
        if (results.length < 1) {
            result_1 = {
                value: valueA,
                indexesInA: [],
                indexesInB: []
            };
            results.push(result_1);
        }
        else {
            result_1 = results[results.length - 1];
        }
        //If higher than last result, make new result
        //Push index to result
        if (result_1.value < valueA) {
            //Make new object
            result_1 = {
                value: valueA,
                indexesInA: [indexA],
                indexesInB: []
            };
            //Push to list
            results.push(result_1);
        }
        else {
            //Add indexA to list
            result_1.indexesInA.push(indexA);
        }
        //Loop through list b
        while (indexB < listB.length) {
            //Get value of B
            var valueB = listB[indexB];
            //If b is less than a, move up list b
            if (valueB < valueA) {
                indexB++;
                continue;
            }
            //If b is greater than a, break and move up list a
            if (valueB > valueA) {
                break;
            }
            //If b matches a, append index to result
            result_1.indexesInB.push(indexB);
            //Move up list B
            indexB++;
        }
        //Move up list A
        indexA++;
    }
    //Return all results with values in both lists
    return results.filter(function (result) { return result.indexesInB.length > 0; });
}
//Run
var result = findCommon(listA, listB);
//Output
console.log(result);

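【Discussion】:

【Solution 10】:

You can use a nested approach that keeps an index for each array and looks for values by incrementing the indexes. If equal values are found, increment both indexes.

Time complexity: at most O(n+m), where n is the length of array a and m is the length of array b.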

var a = [1, 2, 3, 4, 5, 6, 8, 10, 11, 15], // left side
    b = [3, 7, 8, 11, 12, 13, 15, 17],     // right side
    i = 0,                                 // index for a
    j = 0;                                 // index for b

while (i < a.length && j < b.length) {     // prevent running forever
    while (a[i] < b[j]) {                  // check left side
        ++i;                               // increment index
    }
    while (b[j] < a[i]) {                  // check right side
        ++j;                               // increment
    }
    if (a[i] === b[j]) {                   // check equalness
        console.log(a[i], b[j]);           // output or collect
        ++i;                               // increment indices
        ++j;
    }
}
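
As a sketch of the same two-index walk packaged as a reusable function, the version below also skips repeated values so that each common value is reported once. The name `intersectSorted` and the decision to report each value only once are assumptions made for this example; both inputs are assumed to be sorted in ascending order.

```javascript
// Hypothetical helper: intersection of two ascending sorted arrays,
// reporting each common value once even if it is repeated in the inputs.
function intersectSorted(a, b) {
  const result = [];
  let i = 0;
  let j = 0;
  while (i < a.length && j < b.length) {
    if (a[i] < b[j]) {
      i++;                      // a is behind, advance it
    } else if (b[j] < a[i]) {
      j++;                      // b is behind, advance it
    } else {
      const value = a[i];
      result.push(value);
      // Skip every further copy of this value in both arrays.
      while (i < a.length && a[i] === value) i++;
      while (j < b.length && b[j] === value) j++;
    }
  }
  return result;
}

console.log(intersectSorted([1, 1, 3, 5, 5, 8], [1, 5, 5, 5, 9]));
// logs the array [1, 5]
```

Each index only ever moves forward, so the total work stays bounded by the combined length of the two arrays.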

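【Discussion】:

This works like a charm only if every element is unique
@Cid, if there are duplicates within the same array, you need to add another while loop that runs until the repeated value is gone.
@MBo For the case where the sizes are significantly unequal, binary search will be more efficient than this answer. As x grows, constant * log2 x quickly becomes much smaller than constant + x.
@MBo I am not sure what you mean. For example, 2000 * log2 40000 ≈ 30000. 2000 * log2 400000 ≈ 37000. How is that exotic?
@גלעד ברקן Aha, now I get it. I accidentally had the opposite situation in mind (searching for the elements of the long list in the short list). So it is worth choosing the method according to the size ratio.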
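
The conclusion of this thread, choosing the method according to the size ratio, might be sketched as follows. Everything here is illustrative: the names `binarySearch`, `mergeIntersect`, and `intersectAdaptive` are hypothetical, the switch-over rule (m · log2 n versus m + n) is a rough cost model assumed for the example rather than a measured threshold, and the arrays are assumed to be sorted ascending with no repeated values.

```javascript
// Standard binary search on a sorted array; true if target is present.
function binarySearch(sorted, target) {
  let low = 0;
  let high = sorted.length - 1;
  while (low <= high) {
    const mid = Math.floor((low + high) / 2);
    if (sorted[mid] === target) return true;
    if (sorted[mid] < target) low = mid + 1;
    else high = mid - 1;
  }
  return false;
}

// Linear two-index merge; at most a.length + b.length steps.
function mergeIntersect(a, b) {
  const result = [];
  let i = 0;
  let j = 0;
  while (i < a.length && j < b.length) {
    if (a[i] < b[j]) i++;
    else if (b[j] < a[i]) j++;
    else { result.push(a[i]); i++; j++; }
  }
  return result;
}

function intersectAdaptive(a, b) {
  const [small, large] = a.length <= b.length ? [a, b] : [b, a];
  // Assumed cost model: m * log2(n) for searching vs. m + n for merging.
  const searchCost = small.length * Math.log2(large.length + 1);
  const mergeCost = small.length + large.length;
  if (searchCost < mergeCost) {
    // Few lookups into a long array: binary search each small element.
    return small.filter((v) => binarySearch(large, v));
  }
  return mergeIntersect(small, large);
}

// 5 lookups into 40000 even numbers: the binary-search branch is taken.
const evens = Array.from({ length: 40000 }, (_, i) => i * 2);
console.log(intersectAdaptive([3, 10, 500, 79998, 80000], evens));
// logs the array [10, 500, 79998]
```

With the arrays from the question (10 and 6 elements) the cost model picks the merge branch instead, which matches the observation in the comments that binary search only pays off when one array is much longer than the other.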