Processing a HashMap using the Java 8 Stream API

【Question title】: Processing HashMap using Java 8 Stream API
【Posted】: 2017-09-19 09:56:41
【Question】:

I have a hash table of the form

Map<String, Map<String,Double>>

I need to process it and create another one with the same structure.

The following example illustrates the goal:

INPUT HASH TABLE
----------------------------
|       |   12/7/2000 5.0  |
| id 1  |   13/7/2000 4.5  |
|       |   14/7/2000 3.4  |
  ...
| id N  |      ....        |

 OUTPUT HASH TABLE
|  id 1 |    1/1/1800 max(5,4.5,3.4) |
  ...             ...

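In particular, the output must have the same keys (id 1, ..., id N), and each inner hash table must have one fixed key (1/1/1800) and one processed value.

My current (non-working) code: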

output = input.entrySet()
                        .stream()
                        .collect(
                                Collectors.toMap(entry -> entry.getKey(), 
                                        entry -> Collectors.toMap(
                                                e -> "1/1/2000",
                                                e -> {
                                            // Get input array
                                            List<Object> list = entry.getValue().values().stream()
                                                    .collect(Collectors.toList());

                                            DescriptiveStatistics stats = new DescriptiveStatistics();

                                            // Remove the NaN values from the input array
                                            list.forEach(v -> {
                                                if(!new Double((double)v).isNaN()) 
                                                    stats.addValue((double)v);
                                            });

                                            double value = stats.max();                         

                                            return value;
                                        }));

Where is the problem?

Thanks

【Question comments】:

What error are you getting?
Cannot convert Map

Tags: java hashmap java-stream

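【Solution 1】:

The problem is that you call a second Collectors.toMap inside the value function of the first Collectors.toMap. Collectors.toMap returns a Collector, not a Map, so it should be passed to a method that accepts a Collector.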
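To make the distinction concrete, here is a minimal sketch that is separate from the question's data. The class and method names (CollectorVsMap, lengthByWord, lengths) are made up for illustration. It only shows that the result of Collectors.toMap is a Collector, and that a Map appears once that Collector is handed to collect().

```java
import java.util.Map;
import java.util.stream.Collector;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class CollectorVsMap {
    // Collectors.toMap(...) only describes how to build a map;
    // what it returns is a Collector, not a Map.
    static Collector<String, ?, Map<String, Integer>> lengthByWord() {
        return Collectors.toMap(w -> w, w -> w.length());
    }

    // A Map only comes into existence when the Collector is passed to collect().
    static Map<String, Integer> lengths(String... words) {
        return Stream.of(words).collect(lengthByWord());
    }
}
```

In the question's code, the value function of the outer toMap returns such a Collector where a Map<String,Double> is expected, which is presumably what the compiler complains about.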

Here is one way to achieve what you want:

Map<String, Map<String,Double>>
output = input.entrySet()
              .stream()
              .collect(Collectors.toMap(e -> e.getKey(),
                                        e -> Collections.singletonMap (
                                            "1/1/1800",
                                            e.getValue()
                                             .values()
                                             .stream()
                                             .filter (d->!Double.isNaN (d))
                                             .mapToDouble (Double::doubleValue)
                                             .max()
                                             .orElse(0.0))));

Note that there is no need for a second Collectors.toMap. Each inner Map of the output has exactly one entry, so you can create them with Collections.singletonMap.
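For anyone who wants to try the approach above, here is a self-contained sketch. The class name MaxPerId, the method name maxPerId and the sample data (including the NaN entry) are assumptions added for illustration; the stream pipeline itself is the one from the answer.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

class MaxPerId {
    // Fixed key of every inner output map, as required by the question.
    static final String FIXED_KEY = "1/1/1800";

    // Same outer keys as the input; each inner map has the single entry
    // FIXED_KEY -> maximum of the non-NaN input values.
    static Map<String, Map<String, Double>> maxPerId(Map<String, Map<String, Double>> input) {
        return input.entrySet()
                .stream()
                .collect(Collectors.toMap(
                        e -> e.getKey(),
                        e -> Collections.singletonMap(
                                FIXED_KEY,
                                e.getValue().values().stream()
                                        .filter(d -> !Double.isNaN(d))
                                        .mapToDouble(Double::doubleValue)
                                        .max()
                                        .orElse(0.0)))); // 0.0 if no usable value is left
    }

    public static void main(String[] args) {
        Map<String, Double> series = new LinkedHashMap<>();
        series.put("12/7/2000", 5.0);
        series.put("13/7/2000", 4.5);
        series.put("14/7/2000", Double.NaN); // hypothetical NaN sample
        Map<String, Map<String, Double>> input = new LinkedHashMap<>();
        input.put("id 1", series);
        System.out.println(maxPerId(input));
    }
}
```

Keep in mind that Collections.singletonMap returns an immutable map, so this only fits if the inner maps are not modified afterwards.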

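【Comments】:

Filter on !isNaN.
The operation to call (max, sum, ...) has to be invoked via reflection, so I need to do some processing on the inner hash table. I am using the Apache Commons Math library (see the stats variable).
@DodgyCodeException Thanks. Added the filter.
@Fab Well, instead of calling max(), you can call reduce() and pass it a DoubleBinaryOperator that performs the required operation.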
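The reduce() suggestion could look roughly like the sketch below. The class name PluggableAggregation and the method name aggregate are invented for this example, and it assumes the operation is an associative function of two doubles, which is what DoubleStream.reduce expects.

```java
import java.util.Collections;
import java.util.Map;
import java.util.function.DoubleBinaryOperator;
import java.util.stream.Collectors;

class PluggableAggregation {
    // Like the max() version, but the aggregation is passed in by the caller.
    static Map<String, Map<String, Double>> aggregate(
            Map<String, Map<String, Double>> input, DoubleBinaryOperator op) {
        return input.entrySet()
                .stream()
                .collect(Collectors.toMap(
                        e -> e.getKey(),
                        e -> Collections.singletonMap(
                                "1/1/1800",
                                e.getValue().values().stream()
                                        .filter(d -> !Double.isNaN(d))
                                        .mapToDouble(Double::doubleValue)
                                        .reduce(op)      // OptionalDouble, empty if no values remain
                                        .orElse(0.0))));
    }
}
```

It would be called as aggregate(input, Math::max) or aggregate(input, Double::sum). Statistics that are not a simple pairwise reduction, such as a mean or a percentile, would not fit this shape.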
@Eran OK, nice option. But is there no way to fix the original code?

【Solution 2】:

Your original code can be fixed by using Collections.singletonMap instead of Collectors.toMap:

Map<String, Map<String,Double>> output = input.entrySet()
                .stream()
                .collect(
                        Collectors.toMap(entry -> entry.getKey(), 
                            entry -> {
                                // Get input array
                                List<Object> list = entry.getValue().values().stream()
                                        .collect(Collectors.toList());

                                DescriptiveStatistics stats = new DescriptiveStatistics();

                                // Remove the NaN values from the input array
                                list.forEach(v -> {
                                    if(!new Double((double)v).isNaN()) 
                                        stats.addValue((double)v);
                                });

                                double value = stats.max();                         

                                return Collections.singletonMap("1/1/2000", value);
                            }));

Or by making the nested Collectors.toMap part of an actual stream operation:

Map<String, Map<String,Double>> output = input.entrySet()
                .stream()
                .collect(Collectors.toMap(entry -> entry.getKey(), 
                            entry -> Stream.of(entry.getValue()).collect(Collectors.toMap(
                                    e -> "1/1/2000",
                                    e -> {
                                // Get input array
                                List<Object> list = e.values().stream()
                                        .collect(Collectors.toList());

                                DescriptiveStatistics stats = new DescriptiveStatistics();

                                // Remove the NaN values from the input array
                                list.forEach(v -> {
                                    if(!new Double((double)v).isNaN()) 
                                        stats.addValue((double)v);
                                });

                                double value = stats.max();                         

                                return value;
                            }))));

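though that is quite a baroque solution.

That said, you should be aware that the standard DoubleSummaryStatistics makes DescriptiveStatistics unnecessary here, and if you only want the maximum, both are unnecessary.

Further, if a List is truly required, List<Object> list = e.values().stream().collect(Collectors.toList()); can be simplified to List<Object> list = new ArrayList<>(e.values());. Here, however, Collection<Double> list = e.values(); is sufficient, and typing the collection with Double instead of Object makes the subsequent type casts unnecessary.

Applying these improvements to the first variant, you get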

Map<String, Map<String,Double>> output = input.entrySet()
            .stream()
            .collect(
                    Collectors.toMap(entry -> entry.getKey(), 
                        entry -> {
                            Collection<Double> list = entry.getValue().values();
                            DoubleSummaryStatistics stats = new DoubleSummaryStatistics();
                            list.forEach(v -> {
                                if(!Double.isNaN(v)) stats.accept(v);
                            });
                            double value = stats.getMax();                         
                            return Collections.singletonMap("1/1/2000", value);
                        }));

But, as said, DoubleSummaryStatistics is still more than is needed to get the maximum:

Map<String, Map<String,Double>> output = input.entrySet()
            .stream()
            .collect(Collectors.toMap(entry -> entry.getKey(), 
                                      entry -> {
                                          double max = Double.NEGATIVE_INFINITY;
                                          for(double d: entry.getValue().values())
                                              if(d > max) max = d;
                                          return Collections.singletonMap("1/1/2000", max);
                                      }));

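Note that an ordering comparison between doubles (<, <=, >, >=) always evaluates to false if at least one operand is NaN. So by using the right operator, i.e. "value that may be NaN" > "current maximum, which is never NaN", we don't need an extra condition.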
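A small sketch of that NaN behaviour, with a made-up class name (NaNComparison) and sample values. It also contrasts the loop with Math.max, which returns NaN as soon as one argument is NaN; DoubleStream.max() is documented to behave the same way, which is why the stream version needs the explicit filter.

```java
import java.util.Arrays;
import java.util.List;

class NaNComparison {
    // Same loop shape as in the answer: a NaN never passes "d > max",
    // so it can never become the current maximum.
    static double maxIgnoringNaN(Iterable<Double> values) {
        double max = Double.NEGATIVE_INFINITY;
        for (double d : values)
            if (d > max) max = d;
        return max;
    }

    public static void main(String[] args) {
        List<Double> values = Arrays.asList(5.0, Double.NaN, 4.5);
        System.out.println(Double.NaN > 1.0);          // false
        System.out.println(maxIgnoringNaN(values));    // 5.0
        System.out.println(Math.max(Double.NaN, 5.0)); // NaN
    }
}
```

One difference from Solution 1: for an empty or all-NaN inner map this loop yields Double.NEGATIVE_INFINITY, whereas the stream version with orElse(0.0) yields 0.0.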

Now you can replace the loop with a stream operation, and you will end up with Eran's solution. The choice is yours.

【Comments】: