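【Question title】: Removing duplicates from array is only returning one object
【Posted】: 2018-11-09 05:22:53
【Question】:

I am trying to remove duplicate entries from this JSON, but it only returns one object and I don't understand where I'm going wrong.

The code is below.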

// exemplary array of objects (id 'NewLive' occurs twice)
var arr = [
{"jobcodeid":{"S":"Etc_new"}},
{"jobcodeid":{"S":"NewLive"}},
{"jobcodeid":{"S":"NewLiveVid"}},
{"jobcodeid":{"S":"New_Live"}},
{"jobcodeid":{"S":"New_Live_Vid"}},
{"jobcodeid":{"S":"Newest"}},
{"jobcodeid":{"S":"NewestLive"}},
{"jobcodeid":{"S":"NewestLiveVid"}},
{"jobcodeid":{"S":"Very_New_Vid"}},
{"jobcodeid":{"S":"Etc_new"}},
{"jobcodeid":{"S":"NewLive"}},
{"jobcodeid":{"S":"NewLiveVid"}},
{"jobcodeid":{"S":"New_Live"}},
{"jobcodeid":{"S":"New_Live_Vid"}},
{"jobcodeid":{"S":"Newest"}},
{"jobcodeid":{"S":"NewestLive"}},
{"jobcodeid":{"S":"NewestLiveVid"}},
{"jobcodeid":{"S":"Very_New_Vid"}}
],
    obj = {}, new_arr = [];

// in the end the last unique object will be considered
arr.forEach(function(v){
    obj[v['id']] = v;
   console.log(JSON.stringify(new_arr));
});
new_arr = Object.keys(obj).map(function(id) { return obj[id]; });

console.log(JSON.stringify(new_arr));

I have also attached a codepen.

https://codepen.io/anon/pen/oQXJWK

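【Comments】:

  • Are you just trying to return the same objects without the duplicates?
  • @Jacques Yes, I am
  • obj[v['id']] = v; should be obj[v['jobcodeid']] = v;. Since there is no id, the key is undefined for every object, so you are basically overwriting the value with the next one each time. Hence you only get 1.
  • for (const job of arr) { if (!obj[job.jobcodeid.S]) { obj[job.jobcodeid.S] = true; new_arr.push(job); } }

Tags: javascript arrays node.js json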


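【Solution 1】:

Your code returns a single element because you are using v['id'], but the objects have no id property, so throughout the loop you are setting obj[undefined] (that is, the key "undefined") over and over again.

In the code you linked, though, this looks correct and the code seems to work as expected.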
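
The overwriting described above can be reproduced in a few lines. This is only a minimal sketch; the names rows and seen are made up for the illustration and do not come from the question.

```javascript
// Two distinct records, neither of which has an "id" property.
const rows = [
  { jobcodeid: { S: "Etc_new" } },
  { jobcodeid: { S: "NewLive" } },
];

const seen = {};
for (const row of rows) {
  // row["id"] is undefined; as a property key it is coerced to the string "undefined",
  // so every iteration writes to the same slot.
  seen[row["id"]] = row;
}

console.log(Object.keys(seen)); // [ 'undefined' ]
console.log(seen["undefined"] === rows[1]); // true: only the last record survives
```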

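In case anyone lands on this question looking for how to remove duplicates from an array in JavaScript, here are a few options:

The classic way: good old for loop

This is essentially the solution you used: iterate over the array, check whether the key has already been added to the result, and if it has not, add the element to the result.

Example: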

const result = [];
const knownIDs = new Set();
for (const item of input) {
  if (!knownIDs.has(item.jobcodeid.S)) {
    result.push(item);
    knownIDs.add(item.jobcodeid.S);
  }
}

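To Map and back

To filter out duplicates, you can convert the elements into a Map of key -> value and then convert back to an array. This works because keys are unique in a Map, so duplicates are eliminated automatically. The main advantage of this approach is that the code is simple, which leaves less room for bugs.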

console.log(
  Array.from(
    new Map(
      input.map(i => [i.jobcodeid.S, i])
    ).values()
  )
)
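
One detail of the Map approach that may matter when duplicates are not fully identical: setting an existing key replaces its value but keeps the key's original position. The sketch below illustrates this with a hypothetical tag field and a made-up sample array, neither of which is part of the original data.

```javascript
// "tag" is a hypothetical extra field used only to tell the duplicates apart.
const sample = [
  { jobcodeid: { S: "A" }, tag: "first A" },
  { jobcodeid: { S: "B" }, tag: "only B" },
  { jobcodeid: { S: "A" }, tag: "second A" },
];

// Same technique as above: key each item by its id, then read the values back.
const viaMap = Array.from(
  new Map(sample.map(i => [i.jobcodeid.S, i])).values()
);

// The later "A" overwrote the earlier one, but stays in the first "A" position.
console.log(viaMap.map(i => i.tag)); // [ 'second A', 'only B' ]
```

If the first occurrence should win instead, the loop-based and filter-based variants in this answer behave that way.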

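Filter and Set

Another option is to use a Set to record the known IDs and filter to remove items whose ID is already known. The advantage of this approach is that it may be easier to read, because the intent is explicit. It is also more performant than converting to a Map and back.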

const knownKeys = new Set();
console.log(
  input.filter(i => {
    // Drop items whose id has already been seen
    if (knownKeys.has(i.jobcodeid.S)) {
      return false;
    }
    knownKeys.add(i.jobcodeid.S);
    return true;
  })
);

See them in action:

const input = [{"jobcodeid":{"S":"Etc_new"}},{"jobcodeid":{"S":"NewLive"}},{"jobcodeid":{"S":"NewLiveVid"}},{"jobcodeid":{"S":"New_Live"}},{"jobcodeid":{"S":"New_Live_Vid"}},{"jobcodeid":{"S":"Newest"}},{"jobcodeid":{"S":"NewestLive"}},{"jobcodeid":{"S":"NewestLiveVid"}},{"jobcodeid":{"S":"Very_New_Vid"}},{"jobcodeid":{"S":"Etc_new"}},{"jobcodeid":{"S":"NewLive"}},{"jobcodeid":{"S":"NewLiveVid"}},{"jobcodeid":{"S":"New_Live"}},{"jobcodeid":{"S":"New_Live_Vid"}},{"jobcodeid":{"S":"Newest"}},{"jobcodeid":{"S":"NewestLive"}},{"jobcodeid":{"S":"NewestLiveVid"}},{"jobcodeid":{"S":"Very_New_Vid"}}];

// Classic for loop
const result = [];
const knownIDs = new Set();
for (const item of input) {
  if (!knownIDs.has(item.jobcodeid.S)) {
    result.push(item);
    knownIDs.add(item.jobcodeid.S);
  }
}

console.log(result.map(r => r.jobcodeid.S));

// To Map and back
console.log(
  Array.from(
    new Map(
      input.map(i => [i.jobcodeid.S, i])
    ).values()
  )
)

// filter and set
const knownKeys = new Set();
console.log(
  input.filter(i => {
    // Drop items whose id has already been seen
    if (knownKeys.has(i.jobcodeid.S)) {
      return false;
    }
    knownKeys.add(i.jobcodeid.S);
    return true;
  })
);

For the record, I benchmarked the accepted solution, my solutions, and the performance improvement from Jacques' answer:

accepted solution x 1,892,585 ops/sec ±3.48% (89 runs sampled)
Map and back x 495,116 ops/sec ±2.27% (90 runs sampled)
Set and filter x 1,600,833 ops/sec ±1.98% (90 runs sampled)
Jacques x 2,110,510 ops/sec ±0.98% (92 runs sampled)
Fastest is Jacques

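As you can see, Jacques' solution is the fastest in this run: roughly 12% faster than the accepted solution, about 1.3x the throughput of Set and filter, and more than 4x that of Map and back. So if your goal is to filter large arrays, or performance is key, that is the solution to pick.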

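【Comments】:

    【Solution 2】:

    First, you have to use obj[v['jobcodeid']] = v; instead of obj[v['id']] = v;.

    But since v['jobcodeid'] is an object, JS converts it to a string when it is used as a key, i.e. [object Object], so the final array still contains only one element.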

    // exemplary array of objects (id 'NewLive' occurs twice)
    var arr=[{"jobcodeid":{"S":"Etc_new"}},{"jobcodeid":{"S":"NewLive"}},{"jobcodeid":{"S":"NewLiveVid"}},{"jobcodeid":{"S":"New_Live"}},{"jobcodeid":{"S":"New_Live_Vid"}},{"jobcodeid":{"S":"Newest"}},{"jobcodeid":{"S":"NewestLive"}},{"jobcodeid":{"S":"NewestLiveVid"}},{"jobcodeid":{"S":"Very_New_Vid"}},{"jobcodeid":{"S":"Etc_new"}},{"jobcodeid":{"S":"NewLive"}},{"jobcodeid":{"S":"NewLiveVid"}},{"jobcodeid":{"S":"New_Live"}},{"jobcodeid":{"S":"New_Live_Vid"}},{"jobcodeid":{"S":"Newest"}},{"jobcodeid":{"S":"NewestLive"}},{"jobcodeid":{"S":"NewestLiveVid"}},{"jobcodeid":{"S":"Very_New_Vid"}}], obj = {}, new_arr = [];
    
    // in the end the last unique object will be considered
    arr.forEach(function(v){
        obj[v['jobcodeid']] = v;
    });
    new_arr = Object.keys(obj).map(function(id) { return obj[id]; });
    
    console.log(JSON.stringify(new_arr));
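
The string conversion can also be observed directly. A minimal sketch, with v, w and lookup as illustrative names only:

```javascript
const v = { jobcodeid: { S: "Etc_new" } };
const w = { jobcodeid: { S: "NewLive" } };

const lookup = {};
lookup[v["jobcodeid"]] = v; // key becomes "[object Object]"
lookup[w["jobcodeid"]] = w; // same key again, so v is overwritten

console.log(String(v.jobcodeid)); // [object Object]
console.log(Object.keys(lookup)); // [ '[object Object]' ]
```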

    You should use v.jobcodeid.S as the key of the object.

    // exemplary array of objects (id 'NewLive' occurs twice)
    var arr=[{"jobcodeid":{"S":"Etc_new"}},{"jobcodeid":{"S":"NewLive"}},{"jobcodeid":{"S":"NewLiveVid"}},{"jobcodeid":{"S":"New_Live"}},{"jobcodeid":{"S":"New_Live_Vid"}},{"jobcodeid":{"S":"Newest"}},{"jobcodeid":{"S":"NewestLive"}},{"jobcodeid":{"S":"NewestLiveVid"}},{"jobcodeid":{"S":"Very_New_Vid"}},{"jobcodeid":{"S":"Etc_new"}},{"jobcodeid":{"S":"NewLive"}},{"jobcodeid":{"S":"NewLiveVid"}},{"jobcodeid":{"S":"New_Live"}},{"jobcodeid":{"S":"New_Live_Vid"}},{"jobcodeid":{"S":"Newest"}},{"jobcodeid":{"S":"NewestLive"}},{"jobcodeid":{"S":"NewestLiveVid"}},{"jobcodeid":{"S":"Very_New_Vid"}}], obj = {}, new_arr = [];
    
    // in the end the last unique object will be considered
    arr.forEach(function(v){
        obj[v.jobcodeid.S] = v;
    });
    new_arr = Object.keys(obj).map(function(id) { return obj[id]; });
    
    console.log(JSON.stringify(new_arr));

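    【Comments】:

    • I missed that in my comment. +1
    • Have you tested the code? Could you update the codepen with the same?
    • You can get new_arr via Object.values: new_arr = Object.values(obj)
    【Solution 3】:

    Posting an answer to show another, more efficient way to do it.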

    var arr = [
      {"jobcodeid":{"S":"Etc_new"}
      },
      {"jobcodeid":{"S":"NewLive"}
      },
    {"jobcodeid":{"S":"NewLiveVid"}},
    {"jobcodeid":{"S":"New_Live"}},
    {"jobcodeid":{"S":"New_Live_Vid"}},
    {"jobcodeid":{"S":"Newest"}},
    {"jobcodeid":{"S":"NewestLive"}},
    {"jobcodeid":{"S":"NewestLiveVid"}},
    {"jobcodeid":{"S":"Very_New_Vid"}},
    {"jobcodeid":{"S":"Etc_new"}},
    {"jobcodeid":{"S":"NewLive"}},
    {"jobcodeid":{"S":"NewLiveVid"}},
    {"jobcodeid":{"S":"New_Live"}},
    {"jobcodeid":{"S":"New_Live_Vid"}},
    {"jobcodeid":{"S":"Newest"}},
    {"jobcodeid":{"S":"NewestLive"}},
    {"jobcodeid":{"S":"NewestLiveVid"}},
    {"jobcodeid":{"S":"Very_New_Vid"}}
    ],
        obj = {}, new_arr = [];
    
    // the first object seen for each jobcodeid.S is kept; later duplicates are skipped
    for (const job of arr) {
        if (!obj[job.jobcodeid.S]) {
            obj[job.jobcodeid.S] = true;
            new_arr.push(job);
        }
    }
    
    console.log(JSON.stringify(new_arr));

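    This answer always runs N iterations. Iterating over the keys after collecting the unique values can run up to 2N iterations. (Edited to drop the Big O/complexity wording and be clearer.)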
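
A caveat on the plain-object lookup used in this answer, offered as a hedged side note: a check like !obj[key] also sees properties inherited from Object.prototype, so an id such as "constructor" would be treated as already seen. None of the ids in the question collide like this; the jobs array below is a hypothetical example built to show the effect, with a Set-based variant of the same single-pass loop for comparison.

```javascript
const jobs = [
  { jobcodeid: { S: "constructor" } }, // hypothetical id that matches an inherited property name
  { jobcodeid: { S: "NewLive" } },
  { jobcodeid: { S: "NewLive" } },
];

// Plain-object lookup: jobs[0] is skipped because ({}).constructor is already truthy.
const seenObj = {};
const viaObject = [];
for (const job of jobs) {
  if (!seenObj[job.jobcodeid.S]) {
    seenObj[job.jobcodeid.S] = true;
    viaObject.push(job);
  }
}

// Set lookup: only ids that were explicitly added count as seen.
const seenSet = new Set();
const viaSet = [];
for (const job of jobs) {
  if (!seenSet.has(job.jobcodeid.S)) {
    seenSet.add(job.jobcodeid.S);
    viaSet.push(job);
  }
}

console.log(viaObject.map(j => j.jobcodeid.S)); // [ 'NewLive' ]
console.log(viaSet.map(j => j.jobcodeid.S)); // [ 'constructor', 'NewLive' ]
```

Creating the lookup with Object.create(null) would be another way to avoid inherited keys while keeping the plain-object style.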

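    【Comments】:

    • O(2n) and O(n) are exactly the same -- en.wikipedia.org/wiki/Big_O_notation
    • You're right, I shouldn't have used big O to explain this, since it is technically used to describe how complexity grows as the input size increases. My point was that one solution could take up to twice as long to complete as the other.
    • Looping twice doesn't necessarily mean slower, but you are running fewer operations in total, so you're right that this code will execute faster. I ran some quick benchmarks confirming that this solution runs about 2x faster than the second-best solution :)
    【Solution 4】:

    You just need to use Set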

    const arr = [
        { jobcodeid: { S: "Etc_new" } },
        { jobcodeid: { S: "NewLive" } },
        { jobcodeid: { S: "NewLiveVid" } },
        { jobcodeid: { S: "New_Live" } },
        { jobcodeid: { S: "New_Live_Vid" } },
        { jobcodeid: { S: "Newest" } },
        { jobcodeid: { S: "NewestLive" } },
        { jobcodeid: { S: "NewestLiveVid" } },
        { jobcodeid: { S: "Very_New_Vid" } },
        { jobcodeid: { S: "Etc_new" } },
        { jobcodeid: { S: "NewLive" } },
        { jobcodeid: { S: "NewLiveVid" } },
        { jobcodeid: { S: "New_Live" } },
        { jobcodeid: { S: "New_Live_Vid" } },
        { jobcodeid: { S: "Newest" } },
        { jobcodeid: { S: "NewestLive" } },
        { jobcodeid: { S: "NewestLiveVid" } },
        { jobcodeid: { S: "Very_New_Vid" } }
    ];
    
    const uniqueItems = [...new Set(arr.map(i => i.jobcodeid.S))]
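
Note that the Set in this snippet holds the jobcodeid.S strings, so uniqueItems is an array of ids rather than of the original objects. If the objects themselves are needed, one possible follow-up step is sketched below; items, uniqueIds and uniqueObjects are names chosen for the illustration.

```javascript
const items = [
  { jobcodeid: { S: "Etc_new" } },
  { jobcodeid: { S: "NewLive" } },
  { jobcodeid: { S: "Etc_new" } },
];

// Unique ids, in order of first appearance.
const uniqueIds = [...new Set(items.map(i => i.jobcodeid.S))];

// Look up the first object carrying each id.
// Array.prototype.find scans the array once per id, so this is quadratic in the worst case.
const uniqueObjects = uniqueIds.map(id => items.find(i => i.jobcodeid.S === id));

console.log(uniqueIds); // [ 'Etc_new', 'NewLive' ]
console.log(JSON.stringify(uniqueObjects)); // [{"jobcodeid":{"S":"Etc_new"}},{"jobcodeid":{"S":"NewLive"}}]
```

For large arrays, the single-pass Set or Map variants from the other answers avoid the repeated scans.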
    

    【Comments】:

      【Solution 5】:

      Try this as well.. another way to solve this problem

      var arr = [
        {"jobcodeid":{"S":"Etc_new"}
        },
        {"jobcodeid":{"S":"NewLive"}
        },
      {"jobcodeid":{"S":"NewLiveVid"}},
      {"jobcodeid":{"S":"New_Live"}},
      {"jobcodeid":{"S":"New_Live_Vid"}},
      {"jobcodeid":{"S":"Newest"}},
      {"jobcodeid":{"S":"NewestLive"}},
      {"jobcodeid":{"S":"NewestLiveVid"}},
      {"jobcodeid":{"S":"Very_New_Vid"}},
      {"jobcodeid":{"S":"Etc_new"}},
      {"jobcodeid":{"S":"NewLive"}},
      {"jobcodeid":{"S":"NewLiveVid"}},
      {"jobcodeid":{"S":"New_Live"}},
      {"jobcodeid":{"S":"New_Live_Vid"}},
      {"jobcodeid":{"S":"Newest"}},
      {"jobcodeid":{"S":"NewestLive"}},
      {"jobcodeid":{"S":"NewestLiveVid"}},
      {"jobcodeid":{"S":"Very_New_Vid"}}
      ],
          obj = {}, new_arr = [];
      
      
      arr.forEach(function(v){
          // skip v if an item with the same jobcodeid.S is already in new_arr
          for (var i = 0; i < new_arr.length; i++) {
            if (new_arr[i].jobcodeid.S === v.jobcodeid.S) {
               return;
            }
          }
          new_arr.push(v);
      });
      
      console.log(new_arr);
      

      【Comments】:

      • While this works, please consider the code complexity here. You have a loop inside a loop, which is very inefficient.