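【Question title】: Remove duplicates from list with objects based on an attribute of an object
【Posted】: 2021-07-09 19:12:51
【Question】:
So I have an ArrayList of objects, each with a name and a timestamp as attributes. I need a way that, when it finds a duplicate name in the ArrayList, removes the object with the smaller timestamp. It looks easy in theory, but I'm really stuck. What I tried is something like this: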
    for (int i = 0; i < travellers.size(); i++) {
        for (int j = 0; j < travellers.size(); j++) {
            if (travellers.get(i).getName().equals(sorted_travellers.get(j).getName())) {
                if (travellers.get(i).getTimestamp() < sorted_travellers.get(j).getTimestamp()) {
                    sorted_travellers.remove(j);
                } else if (travellers.get(i).getTimestamp() > sorted_travellers.get(j).getTimestamp()) {
                    sorted_travellers.remove(i);
                }
            }
        }
    }
【Comments】:
Tags:
java
arraylist
duplicates
attributes
【Solution 1】:
You can try the following approach:
    List<Traveller> travellers = new ArrayList<>();
    travellers.add(new Traveller("test", 0L));
    travellers.add(new Traveller("test", 1L));

    // One entry per name; merge() keeps whichever traveller has the larger timestamp.
    Map<String, Traveller> latestTravellers = new HashMap<>();
    for (var traveller : travellers) {
        latestTravellers.merge(traveller.getName(), traveller,
                (existing, incoming) -> incoming.getTimestamp() > existing.getTimestamp() ? incoming : existing);
    }
    Collection<Traveller> updatedTravellers = latestTravellers.values();
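For reference, here is the same idea as a small self-contained sketch that can be compiled and run. The `Traveller` class below is a hypothetical stand-in (the question does not show the real one), and the names `MergeDedupDemo` and `latestByName` are made up for illustration. It swaps `HashMap` for `LinkedHashMap` so the result keeps the order in which names were first seen, which `HashMap` does not guarantee.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class MergeDedupDemo {
    // Hypothetical stand-in for the asker's class; only a name and a timestamp are assumed.
    static final class Traveller {
        private final String name;
        private final long timestamp;

        Traveller(String name, long timestamp) {
            this.name = name;
            this.timestamp = timestamp;
        }

        String getName() { return name; }
        long getTimestamp() { return timestamp; }
    }

    // Returns a new list holding, for each name, the traveller with the largest timestamp.
    static List<Traveller> latestByName(List<Traveller> travellers) {
        // LinkedHashMap iterates in first-insertion order of the keys.
        Map<String, Traveller> latest = new LinkedHashMap<>();
        for (Traveller t : travellers) {
            latest.merge(t.getName(), t,
                    (existing, incoming) ->
                            incoming.getTimestamp() > existing.getTimestamp() ? incoming : existing);
        }
        // Copy the values so the result is a standalone list, not a view of the map.
        return new ArrayList<>(latest.values());
    }

    public static void main(String[] args) {
        List<Traveller> input = List.of(
                new Traveller("test", 0L),
                new Traveller("other", 5L),
                new Traveller("test", 1L));
        for (Traveller t : latestByName(input)) {
            System.out.println(t.getName() + " " + t.getTimestamp());
        }
    }
}
```

With the sample input in `main`, this prints `test 1` followed by `other 5`. The input list itself is left untouched.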
【Solution 2】:
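    public static List<Traveler> filter(List<Traveler> travelers) {
        // Names already emitted; Set.add returns false for a name seen before.
        // The stateful filter is only safe on a sequential stream.
        Set<String> seen = new HashSet<>();
        return travelers.stream()
                // Newest first, so the first occurrence of each name is the one to keep.
                .sorted((t1, t2) -> t2.timestamp.compareTo(t1.timestamp))
                .filter(t -> seen.add(t.name))
                .collect(Collectors.toList());
    }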
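If you would rather avoid mutable state inside the stream pipeline, one possible alternative is to collect into a map keyed by name and let the merge function pick the newer entry. This is only a sketch: the `Traveler` class with getters, the wrapper class `ToMapDedupDemo`, and the method name `latestPerName` are assumptions made for the example. It uses the standard `Collectors.toMap` overload that takes a merge function and a map supplier, together with `BinaryOperator.maxBy`.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.stream.Collectors;

class ToMapDedupDemo {
    // Hypothetical stand-in for the class used in the answer above.
    static final class Traveler {
        private final String name;
        private final long timestamp;

        Traveler(String name, long timestamp) {
            this.name = name;
            this.timestamp = timestamp;
        }

        String getName() { return name; }
        long getTimestamp() { return timestamp; }
    }

    // Returns one traveler per name: the one with the largest timestamp.
    static List<Traveler> latestPerName(List<Traveler> travelers) {
        Map<String, Traveler> latest = travelers.stream()
                .collect(Collectors.toMap(
                        Traveler::getName,
                        Function.identity(),
                        // On a name collision, keep the traveler with the greater timestamp.
                        BinaryOperator.maxBy(Comparator.comparingLong(Traveler::getTimestamp)),
                        // LinkedHashMap keeps names in first-seen order.
                        LinkedHashMap::new));
        return new ArrayList<>(latest.values());
    }

    public static void main(String[] args) {
        List<Traveler> input = List.of(
                new Traveler("ann", 10L),
                new Traveler("bob", 3L),
                new Traveler("ann", 42L));
        for (Traveler t : latestPerName(input)) {
            System.out.println(t.getName() + " " + t.getTimestamp());
        }
    }
}
```

Unlike the sorted version, this does not need a sort, and the result is ordered by first appearance of each name rather than by timestamp.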
【Solution 3】:
If you don't mind using StreamEx, it can be done easily:
    StreamEx.of(travelers)
            .sorted(Comparator.comparing(Traveler::getName))
            .collapse((t1, t2) -> t1.getName().equals(t2.getName()),
                    (t1, t2) -> {
                        if (t1.getTimestamp().compareTo(t2.getTimestamp()) > 0) {
                            return t1;
                        }
                        return t2;
                    }).toList();
Another solution, which allocates no extra collection and whose sort runs in O(n log n) time, could be:
    // reversed() applies to the whole chain: names descending, then timestamps descending.
    travelers.sort(Comparator.comparing(Traveler::getName)
            .thenComparing(Traveler::getTimestamp).reversed()
    );
    // The list is now sorted so that items with the same name are adjacent
    // and the one with the highest timestamp comes first.
    Iterator<Traveler> iterator = travelers.iterator();
    Traveler current = null;
    Traveler next = null;
    while (iterator.hasNext()) {
        if (current == null) {
            current = iterator.next();
        } else {
            next = iterator.next();
            if (next.getName().equals(current.getName())) {
                // Same name as the kept item, so this is an older duplicate.
                iterator.remove();
            } else {
                current = next;
            }
        }
    }
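A runnable variant of this sort-then-scan idea might look like the sketch below. It makes two assumptions of its own: a hypothetical `Traveler` class with a `long` timestamp, and a sort order of name ascending with timestamp descending, obtained by reversing only the timestamp comparator instead of the whole chain. The names `SortDedupDemo` and `keepLatestInPlace` are invented for the example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;

class SortDedupDemo {
    // Hypothetical stand-in for the class used in the answer above.
    static final class Traveler {
        private final String name;
        private final long timestamp;

        Traveler(String name, long timestamp) {
            this.name = name;
            this.timestamp = timestamp;
        }

        String getName() { return name; }
        long getTimestamp() { return timestamp; }
    }

    // Sorts the list in place, then removes every entry except the newest one per name.
    static void keepLatestInPlace(List<Traveler> travelers) {
        // Name ascending; within one name, newest timestamp first.
        travelers.sort(Comparator.comparing(Traveler::getName)
                .thenComparing(Comparator.comparingLong(Traveler::getTimestamp).reversed()));

        String previousName = null;
        for (Iterator<Traveler> it = travelers.iterator(); it.hasNext(); ) {
            Traveler t = it.next();
            if (t.getName().equals(previousName)) {
                // Same name as the entry just kept, so this one is older (or tied).
                it.remove();
            } else {
                previousName = t.getName();
            }
        }
    }

    public static void main(String[] args) {
        List<Traveler> travelers = new ArrayList<>(List.of(
                new Traveler("bob", 3L),
                new Traveler("ann", 10L),
                new Traveler("ann", 42L),
                new Traveler("bob", 9L)));
        keepLatestInPlace(travelers);
        for (Traveler t : travelers) {
            System.out.println(t.getName() + " " + t.getTimestamp());
        }
    }
}
```

One caveat worth keeping in mind: on an `ArrayList`, each `Iterator.remove()` shifts the remaining elements, so a list with many duplicates spends more time in the removal pass than in the sort. The original order of the list is also lost.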
【Solution 4】:
To remove the duplicates and keep only the record with the latest timestamp for each name, you can do this:
    List<Traveller> travellers = ...

    // Determine traveller objects to keep.
    Map<String, Traveller> byName = new HashMap<>();
    for (Traveller traveller : travellers) {
        byName.merge(traveller.getName(), traveller,
                (a, b) -> a.getTimestamp() > b.getTimestamp() ? a : b);
    }

    // Remove traveller objects that shouldn't be kept.
    // Use iterator to prevent ConcurrentModificationException.
    for (Iterator<Traveller> iter = travellers.iterator(); iter.hasNext(); ) {
        Traveller traveller = iter.next();
        if (traveller != byName.get(traveller.getName())) {
            iter.remove();
        }
    }