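【Question title】: Java 8 stream.map on the same stream with different mapping functions?
【Posted】: 2018-09-06 18:54:42
【Question】:

Can someone help me optimize the code below? I don't want to stream over the same list three times. I have to iterate over the same list and apply different mapping functions. Can anyone suggest a better solution?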

List<Dummy> dummy = getDummyData(); //Assume we are getting data from some source
List<NewDummy> newDummyList = dummy.stream().map(eachDummy -> mapper.map(eachDummy, NewDummy.class)).collect(Collectors.toList());

if(someCondition) {
  final BigDecimal amount1 = dummy.stream().filter(eachDummy -> Optional.ofNullable(eachDummy.getAmount1()).isPresent())
                                  .map(Dummy::getAmount1).reduce(BigDecimal.ZERO, BigDecimal::add);
  final BigDecimal amount2 = dummy.stream().filter(eachDummy -> Optional.ofNullable(eachDummy.getAmount2()).isPresent())
                                  .map(Dummy::getAmount2).reduce(BigDecimal.ZERO, BigDecimal::add);

  return new DummyObject(newDummyList, amount1, amount2);
} else {
  return new DummyObject(newDummyList);
}

【Comments】:

  • Your eachDummy::getAmount1 and eachDummy::getAmount2 should be Dummy::getAmount1 and Dummy::getAmount2, respectively.
  • @TomaszLinkowski Yes, that's correct. I will update it.

Tags: lambda java-8 functional-programming java-stream


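【Solution 1】:

This looks like an ideal use case for a custom collector. Before that, though, I think you can simplify the summing of the amounts as follows: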

BigDecimal amount1 = dummy.stream()
    .map(Dummy::getAmount1)
    .filter(Objects::nonNull)
    .reduce(BigDecimal::add).orElse(BigDecimal.ZERO);
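
The same reduction can be pulled into a small generic helper, so that both amounts reuse one null-safe sum. The sketch below is only an illustration: the class name AmountSums, the helper name sum, and the cut-down Dummy type are assumptions and do not come from the question.

```java
import java.math.BigDecimal;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Objects;
import java.util.function.Function;

final class AmountSums {

    // Hypothetical stand-in for the question's Dummy type.
    static final class Dummy {
        private final BigDecimal amount1;
        private final BigDecimal amount2;

        Dummy(BigDecimal amount1, BigDecimal amount2) {
            this.amount1 = amount1;
            this.amount2 = amount2;
        }

        BigDecimal getAmount1() { return amount1; }
        BigDecimal getAmount2() { return amount2; }
    }

    // Sums the values returned by getter, skipping nulls; an empty input yields ZERO.
    static <T> BigDecimal sum(Collection<T> items, Function<? super T, BigDecimal> getter) {
        return items.stream()
                .map(getter)
                .filter(Objects::nonNull)
                .reduce(BigDecimal::add)
                .orElse(BigDecimal.ZERO);
    }

    public static void main(String[] args) {
        List<Dummy> dummy = Arrays.asList(
                new Dummy(new BigDecimal("1.50"), null),
                new Dummy(null, new BigDecimal("2.00")),
                new Dummy(new BigDecimal("2.25"), new BigDecimal("3.00")));

        System.out.println(sum(dummy, Dummy::getAmount1)); // 3.75
        System.out.println(sum(dummy, Dummy::getAmount2)); // 5.00
    }
}
```

This still streams the list once per amount; it only removes the duplicated pipeline.
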

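Now, the custom collector. You can accumulate instances of Dummy into an instance of a dedicated local class declared inside a static utility method: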

static Collector<Dummy, ?, DummyObject> toDummyObject(
        Function<Dummy, NewDummy> mapper, 
        boolean someCondition) {

    class Accumulator {
        List<NewDummy> newDummyList = new ArrayList<>();
        BigDecimal amount1 = BigDecimal.ZERO;
        BigDecimal amount2 = BigDecimal.ZERO;

        public void add(Dummy dummy) {
            newDummyList.add(mapper.apply(dummy));
        }

        public void addAndSum(Dummy dummy) {
            if (dummy.getAmount1() != null) amount1 = amount1.add(dummy.getAmount1());
            if (dummy.getAmount2() != null) amount2 = amount2.add(dummy.getAmount2());
            add(dummy);
        }

        public Accumulator merge(Accumulator another) {
            newDummyList.addAll(another.newDummyList);
            return this;
        }

        public Accumulator mergeAndSum(Accumulator another) {
            amount1 = amount1.add(another.amount1);
            amount2 = amount2.add(another.amount2);
            return merge(another);
        }

        public DummyObject finish() {
            return someCondition ?
                new DummyObject(newDummyList, amount1, amount2) :
                new DummyObject(newDummyList);
        }
    }

    return Collector.of(
        Accumulator::new, 
        someCondition ? Accumulator::addAndSum : Accumulator::add,
        someCondition ? Accumulator::mergeAndSum : Accumulator::merge,
        Accumulator::finish);
}

Now we're ready:

dummy.stream().collect(toDummyObject(
    eachDummy -> mapper.map(eachDummy, NewDummy.class), 
    someCondition));

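【Comments】:

  • @federico Thanks for suggesting a good alternative. Performance-wise, don't you think that using reduce to compute the sums, as in the original approach, would outperform the code you suggest here? Basically, if I want to compare your suggested approach with the one in the question, how should I do it?
  • @prats Thanks for your words. Search here for jmh microbenchmark and you'll find very useful information. I believe that (theoretically) my version would perform slightly better than yours, but only slightly. Both versions have the same time complexity. I think having 3 separate loops or streams is OK; maybe you could refactor the summing of the amounts into a private method.
  • After debugging the code, I realized that the combiner (mergeAndSum or merge) is never called, because this is a sequential stream. So we might change it to a parallel stream to take advantage of the combiner.
  • @prats You can try it and measure, but I doubt you'll find any improvement.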
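
The observation about the combiner can be checked with a small experiment. The sketch below is not the collector from the answer: CombinerDemo and countingCollector are hypothetical names, and the collector simply gathers integers into a list while counting combiner invocations. In current JDK implementations a sequential collect does not call the combiner, although the Collector contract does not promise this; the parallel count depends on how the stream is split, so no specific number is assumed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collector;
import java.util.stream.IntStream;
import java.util.stream.Stream;

final class CombinerDemo {

    // A list collector whose combiner bumps a counter each time it is invoked.
    static Collector<Integer, List<Integer>, List<Integer>> countingCollector(AtomicInteger combinerCalls) {
        return Collector.of(
                ArrayList::new,
                List::add,
                (left, right) -> {
                    combinerCalls.incrementAndGet();
                    left.addAll(right);
                    return left;
                });
    }

    // Collects 0..9999 sequentially or in parallel, recording combiner calls.
    static List<Integer> collect(boolean parallel, AtomicInteger combinerCalls) {
        Stream<Integer> numbers = IntStream.range(0, 10_000).boxed();
        if (parallel) {
            numbers = numbers.parallel();
        }
        return numbers.collect(countingCollector(combinerCalls));
    }

    public static void main(String[] args) {
        AtomicInteger sequentialCalls = new AtomicInteger();
        AtomicInteger parallelCalls = new AtomicInteger();

        List<Integer> sequential = collect(false, sequentialCalls);
        List<Integer> parallel = collect(true, parallelCalls);

        System.out.println("combiner calls (sequential): " + sequentialCalls.get());
        System.out.println("combiner calls (parallel):   " + parallelCalls.get());
        System.out.println("same result: " + sequential.equals(parallel));
    }
}
```

Both runs should produce the same list, which is the point of a correct combiner. Whether the parallel run is actually faster is a separate question that only a benchmark can answer.
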
【Solution 2】:

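I agree with Federico that a Collector seems to be the best choice here.

However, instead of implementing one very specialized Collector, I prefer to implement a few generic "building blocks" and then use them to compose the Collector I need in a given case.

Assuming: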

interface Mapper<T> {
    T map(Dummy dummy, Class<T> type);
}

This is how DummyObject is constructed with my solution:

Collector<Dummy, ?, DummyObject> dummyObjectCollector = someCondition
        ? toDummyObjectWithSums(mapper)
        : toDummyObjectWithoutSums(mapper);
return dummy.stream().collect(dummyObjectCollector);

This is how I would write the use-case-specific Collectors:

private static Collector<Dummy, ?, DummyObject> toDummyObjectWithoutSums(Mapper<NewDummy> mapper) {
    return Collectors.collectingAndThen(toNewDummyList(mapper), DummyObject::new);
}

private static Collector<Dummy, ?, List<NewDummy>> toNewDummyList(Mapper<NewDummy> mapper) {
    return Collectors.mapping(dummy -> mapper.map(dummy, NewDummy.class), Collectors.toList());
}

private static Collector<Dummy, ?, DummyObject> toDummyObjectWithSums(Mapper<NewDummy> mapper) {
    return ExtraCollectors.collectingBoth(
            toNewDummyList(mapper),
            sumGroupCollector(),
            (newDummyList, amountSumPair) -> new DummyObject(
                    newDummyList, amountSumPair.getAmountSum1(), amountSumPair.getAmountSum2()
            )
    );
}

private static Collector<Dummy, ?, AmountSumPair> sumGroupCollector() {
    return ExtraCollectors.collectingBoth(
            summingAmount(Dummy::getAmount1),
            summingAmount(Dummy::getAmount2),
            AmountSumPair::new
    );
}

static Collector<Dummy, ?, BigDecimal> summingAmount(Function<Dummy, BigDecimal> getter) {
    return Collectors.mapping(getter,
            ExtraCollectors.filtering(Objects::nonNull,
                    ExtraCollectors.summingBigDecimal()
            )
    );
}

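private static class AmountSumPair {
    private final BigDecimal amountSum1;
    private final BigDecimal amountSum2;

    AmountSumPair(BigDecimal amountSum1, BigDecimal amountSum2) {
        this.amountSum1 = amountSum1;
        this.amountSum2 = amountSum2;
    }

    BigDecimal getAmountSum1() { return amountSum1; }
    BigDecimal getAmountSum2() { return amountSum2; }
}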

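Finally, let's look at the generic "building blocks" (which I put in the ExtraCollectors class):

  • summingBigDecimal: quite obvious
  • filtering: also fairly obvious (corresponds to Stream.filter)
  • collectingBoth: this is the most interesting one:
    1. it takes two Collectors (both operating on T but returning different results, i.e. Collector<T, ?, R1> and Collector<T, ?, R2>)
    2. and combines them into a single Collector<T, ?, R> using a BiFunction<R1, R2, R>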
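
As a side note, on JDK 9 or later the first two blocks do not need to be hand-written, because Collectors.filtering is part of the JDK. The sketch below shows one way summingAmount might look using only JDK collectors. It assumes JDK 9+, and the JdkSumming class and its one-field Dummy are hypothetical stand-ins, not part of this answer.

```java
import java.math.BigDecimal;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.function.Function;
import java.util.stream.Collector;
import java.util.stream.Collectors;

final class JdkSumming {

    // Hypothetical stand-in for the thread's Dummy type, reduced to one nullable amount.
    static final class Dummy {
        private final BigDecimal amount1;

        Dummy(BigDecimal amount1) {
            this.amount1 = amount1;
        }

        BigDecimal getAmount1() { return amount1; }
    }

    // Maps each element to an amount, drops nulls, then sums what is left.
    static <T> Collector<T, ?, BigDecimal> summingAmount(Function<? super T, BigDecimal> getter) {
        Collector<BigDecimal, ?, BigDecimal> sum =
                Collectors.reducing(BigDecimal.ZERO, BigDecimal::add);
        Collector<BigDecimal, ?, BigDecimal> nonNullSum =
                Collectors.filtering(Objects::nonNull, sum); // Collectors.filtering needs JDK 9+
        return Collectors.mapping(getter, nonNullSum);
    }

    public static void main(String[] args) {
        List<Dummy> dummy = Arrays.asList(
                new Dummy(new BigDecimal("1.25")),
                new Dummy(null),
                new Dummy(new BigDecimal("2.50")));

        BigDecimal total = dummy.stream().collect(summingAmount(Dummy::getAmount1));
        System.out.println(total); // 3.75
    }
}
```

On Java 8 the hand-written filtering collector is still needed.
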

Here is the ExtraCollectors class:

final class ExtraCollectors {

    static Collector<BigDecimal, ?, BigDecimal> summingBigDecimal() {
        return Collectors.reducing(BigDecimal.ZERO, BigDecimal::add);
    }

    static <T, A, R> Collector<T, A, R> filtering(
            Predicate<T> filter, Collector<T, A, R> downstream) {
        return Collector.of(
                downstream.supplier(),
                (A acc, T t) -> {
                    if (filter.test(t)) {
                        downstream.accumulator().accept(acc, t);
                    }
                },
                downstream.combiner(),
                downstream.finisher(),
                downstream.characteristics().toArray(new Collector.Characteristics[0])
        );
    }

    static <T, R1, R2, R> Collector<T, ?, R> collectingBoth(
            Collector<T, ?, R1> collector1, Collector<T, ?, R2> collector2, BiFunction<R1, R2, R> biFinisher) {
        return collectingBoth(new BiCollectorHandler<>(collector1, collector2), biFinisher);
    }

    // method needed to capture A1 and A2
    private static <T, A1, R1, A2, R2, R> Collector<T, ?, R> collectingBoth(
            BiCollectorHandler<T, A1, R1, A2, R2> biCollectorHandler, BiFunction<R1, R2, R> biFinisher) {
        return Collector.<T, BiCollectorHandler<T, A1, R1, A2, R2>.BiAccumulator, R>of(
                biCollectorHandler::newBiAccumulator,
                BiCollectorHandler.BiAccumulator::accept,
                BiCollectorHandler.BiAccumulator::combine,
                biAccumulator -> biAccumulator.finish(biFinisher)
        );
    }
}

And here is the BiCollectorHandler class (used internally by ExtraCollectors.collectingBoth):

final class BiCollectorHandler<T, A1, R1, A2, R2> {

    private final Collector<T, A1, R1> collector1;
    private final Collector<T, A2, R2> collector2;

    BiCollectorHandler(Collector<T, A1, R1> collector1, Collector<T, A2, R2> collector2) {
        this.collector1 = collector1;
        this.collector2 = collector2;
    }

    BiAccumulator newBiAccumulator() {
        return new BiAccumulator(collector1.supplier().get(), collector2.supplier().get());
    }

    final class BiAccumulator {

        final A1 acc1;
        final A2 acc2;

        private BiAccumulator(A1 acc1, A2 acc2) {
            this.acc1 = acc1;
            this.acc2 = acc2;
        }

        void accept(T t) {
            collector1.accumulator().accept(acc1, t);
            collector2.accumulator().accept(acc2, t);
        }

        BiAccumulator combine(BiAccumulator other) {
            A1 combined1 = collector1.combiner().apply(acc1, other.acc1);
            A2 combined2 = collector2.combiner().apply(acc2, other.acc2);
            return new BiAccumulator(combined1, combined2);
        }

        <R> R finish(BiFunction<R1, R2, R> biFinisher) {
            R1 result1 = collector1.finisher().apply(acc1);
            R2 result2 = collector2.finisher().apply(acc2);
            return biFinisher.apply(result1, result2);
        }
    }
}

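【Comments】:

  • +1 for the work (I bet it took quite a while to write), and by the way, a BiCollector may be coming to JDK 12.
  • @Eugene Ha, Tagir Valeev came up with this before me ;) Good to know, thanks for the info! I hope this feature makes it into JDK 12. And yes, it took some time to write, a bit longer than I expected, but that's the thing about SO: solving problems like this one is addictive once you start ;)
  • This is a great piece of work, Tomasz, and I agree that having these building blocks is the better approach. I also really liked your initial analysis; I think it showed the other side of the coin, namely that you don't need a specialized collector for this: just refactor the sums of the amounts into a private method and you will be OK streaming 3 times over the list.
  • @FedericoPeraltaSchaffner Thanks! :) About my initial answer: it was actually wrong, because I misunderstood the question and streamed List<Dummy> only once and List<NewDummy> twice. Of course, streaming List<Dummy> three times could also be an option, but since the OP wrote "Assume we are getting data from some source", I assumed that such streaming was considered expensive.
  • Update: the "BiCollector" functionality will arrive in JDK 12 in the form of Collectors.teeing: JDK-8209685.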
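
For readers on JDK 12 or later, a minimal sketch of how Collectors.teeing could replace collectingBoth is shown below. The types here are hypothetical stand-ins and not the thread's real classes: Dummy is cut down to a name and one nullable amount, and Summary plays the role of DummyObject.

```java
import java.math.BigDecimal;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.Collectors;

final class TeeingDemo {

    // Hypothetical stand-in for Dummy: a name plus one nullable amount.
    static final class Dummy {
        final String name;
        final BigDecimal amount;

        Dummy(String name, BigDecimal amount) {
            this.name = name;
            this.amount = amount;
        }
    }

    // Hypothetical stand-in for DummyObject: the mapped list plus the total.
    static final class Summary {
        final List<String> names;
        final BigDecimal total;

        Summary(List<String> names, BigDecimal total) {
            this.names = names;
            this.total = total;
        }
    }

    // Feeds every element to both downstream collectors in a single pass,
    // then merges the two results with the Summary constructor.
    static Collector<Dummy, ?, Summary> toSummary() {
        Collector<Dummy, ?, List<String>> names =
                Collectors.mapping(d -> d.name, Collectors.toList());
        Collector<Dummy, ?, BigDecimal> total =
                Collectors.reducing(BigDecimal.ZERO,
                        d -> d.amount == null ? BigDecimal.ZERO : d.amount,
                        BigDecimal::add);
        return Collectors.teeing(names, total, Summary::new); // JDK 12+
    }

    public static void main(String[] args) {
        List<Dummy> dummy = Arrays.asList(
                new Dummy("a", new BigDecimal("1.5")),
                new Dummy("b", null),
                new Dummy("c", new BigDecimal("2.5")));

        Summary summary = dummy.stream().collect(toSummary());
        System.out.println(summary.names + " " + summary.total); // [a, b, c] 4.0
    }
}
```

To sum two amounts as in the question, one teeing collector can be nested inside another, in the same way sumGroupCollector nests collectingBoth.
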