【Question title】: Custom Java 8 Collector
【Posted】: 2018-10-24 18:39:54
【Question】:

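I want to understand how to implement a custom collector.

Say I need to

(1) analyze words, for example by building a letter-frequency map, and (2) be able to combine two results into one.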

class CharHistogram implements Collector<String, Map<Character, Integer>, Map<Character, Integer>> {



    public static CharHistogram toCharHistogram(){
        return new CharHistogram();
    }

    @Override
    public Supplier<Map<Character, Integer>> supplier() {
        SysOut.print("supplier invoked");
        return HashMap::new;
    }

    @Override
    public BiConsumer<Map<Character, Integer>, String> accumulator() {
        SysOut.print("accumulator invoked");
        return (map, val) -> {
            SysOut.print(val +" processed");
            char[] characters = val.toCharArray();
            for (char character : characters) {
                int count = 1;
                if (map.containsKey(character)) {
                    count = map.get(character);
                    count++;
                }
                map.put(character, count);
            }
        };
    }

    @Override
    public BinaryOperator<Map<Character, Integer>> combiner() {
        SysOut.print("combiner invoked");
        return (map1, map2) -> {
            SysOut.print(map1+" merged to "+map2);
            map2.forEach((k, v) -> map1.merge(k, v, (v1, v2) -> v1 + v2));
            return map1;
        };
    }

    @Override
    public Function<Map<Character, Integer>, Map<Character, Integer>> finisher() {
        SysOut.print("finisher invoked");
        return Function.identity();
    }

    @Override
    public Set<java.util.stream.Collector.Characteristics> characteristics() {
        return Collections.unmodifiableSet(EnumSet.of(Characteristics.IDENTITY_FINISH, Characteristics.UNORDERED));
    }

}

Client code:

CharHistogram charStatsState = CharHistogram.toCharHistogram();

Map<Character, Integer> charCountMap = Arrays.asList("apple","orange","orange").stream().collect(charStatsState);
SysOut.print(charCountMap);
charCountMap = Arrays.asList("pears","pears","orange").stream().collect(charStatsState);
SysOut.print(charCountMap);

Output:

[main]: supplier invoked
[main]: accumulator invoked
[main]: combiner invoked
[main]: apple processed
[main]: orange processed
[main]: orange processed
[main]: {p=2, a=3, r=2, e=3, g=2, l=1, n=2, o=2}
[main]: supplier invoked
[main]: accumulator invoked
[main]: combiner invoked
[main]: pears processed
[main]: pears processed
[main]: orange processed
[main]: {p=2, a=3, r=3, s=2, e=3, g=1, n=1, o=1}

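I don't see the combiner or the finisher functions being invoked, and I believe they need to be designed properly to achieve what I am looking for.

What am I missing?

Edit:

A possible approach that supports both streaming and combining. However, the code below does not work.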

class CharStreamHistogram implements Function<String, Map<Character, Integer>>{

    private int totalCharactersRead;
    private Map<Character, Integer> histogram;

    public int getTotalCharactersRead() {
        return totalCharactersRead;
    }
    public Map<Character, Integer> getHistogram() {
        return histogram;
    }
    public void setHistogram(Map<Character, Integer> histogram) {
        this.histogram = histogram;
    }
    public void setTotalCharactersRead(int totalCharactersRead) {
        this.totalCharactersRead = totalCharactersRead;
    }

    public Map<Character, Integer> combine(Map<Character, Integer>  map2) {
        Map<Character, Integer> map1 = this.histogram;
        map2.forEach((k, v) -> map1.merge(k, v, (v1, v2) -> v1 + v2));
        return map2;
    }


    @Override
    public Map<Character, Integer> apply(String val) {
        char[] characters = val.toCharArray();
        totalCharactersRead += characters.length;
        for (char character : characters) {
            int count = 1;
            if (histogram.containsKey(character)) {
                count = histogram.get(character);
                count++;
            }
            histogram.put(character, count);
        }
        return histogram;
    }

} 

public static <T> Collector<T, ?, CharStreamHistogram> summarizeCharStream(
             CharStreamHistogram histogram) { //TODO: is this correct?
        Collector charStatsState = new Collector<String, CharStreamHistogram, CharStreamHistogram>() {

            @Override
            public Supplier<CharStreamHistogram> supplier() {
                return CharStreamHistogram::new;
            }

            @Override
            public BiConsumer<CharStreamHistogram, String> accumulator() {
                //TODO: What to do here?
                return null;
            }

            @Override
            public BinaryOperator<CharStreamHistogram> combiner() {
                BinaryOperator binaryOperator = (l, r) -> {
                    l.combine(r); //TODO: Something like this?
                };
                return binaryOperator;
            }

            @Override
            public Function<CharStreamHistogram, CharStreamHistogram> finisher() {
                //TODO: What to do here?
                return null;
            }

            @Override
            public Set<java.util.stream.Collector.Characteristics> characteristics() {
                return Collections.unmodifiableSet(EnumSet.of(Characteristics.UNORDERED));
            }
        };
        return charStatsState;
    }

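【Comments】:

  • Your Collector should not have any state. Your accumulator object should contain both totalCharactersRead and the map.
  • @LouisWasserman Thanks. I have removed it for now, but I will come back to it later.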
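
The first comment above can be read as follows: the collector itself stays stateless, and the mutable accumulation container created by the supplier carries both the running total and the map. Below is a minimal sketch of that idea, not the asker's final code. The class name `CharStats` and its methods `accept`, `combine`, and `collector` are hypothetical names chosen for illustration; only `Collector.of`, `Map.merge`, and `String.chars` are standard JDK APIs.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.stream.Collector;

// Hypothetical accumulation container: the supplier creates a fresh instance
// for every (sub)stream, so no state lives in the Collector itself.
class CharStats {
    private final Map<Character, Integer> histogram = new HashMap<>();
    private int totalCharactersRead;

    // Accumulator step: fold one word into this container.
    void accept(String word) {
        totalCharactersRead += word.length();
        word.chars().forEach(c -> histogram.merge((char) c, 1, Integer::sum));
    }

    // Combiner step: fold another container into this one and return this one.
    CharStats combine(CharStats other) {
        other.histogram.forEach((k, v) -> histogram.merge(k, v, Integer::sum));
        totalCharactersRead += other.totalCharactersRead;
        return this;
    }

    Map<Character, Integer> getHistogram() {
        return histogram;
    }

    int getTotalCharactersRead() {
        return totalCharactersRead;
    }

    // No finisher is passed, so Collector.of adds IDENTITY_FINISH automatically.
    static Collector<String, ?, CharStats> collector() {
        return Collector.of(CharStats::new, CharStats::accept, CharStats::combine);
    }
}

class CharStatsDemo {
    public static void main(String[] args) {
        CharStats stats = Arrays.asList("apple", "orange").stream()
                .collect(CharStats.collector());
        System.out.println(stats.getTotalCharactersRead()); // 5 + 6 characters
        System.out.println(stats.getHistogram());
    }
}
```

Because each container is created by the supplier, the same collector instance can be reused across several `collect` calls without results leaking from one call into the next.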

Tags: java lambda java-8 collectors


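【Solution 1】:

You have declared Characteristics.IDENTITY_FINISH, which explicitly means that the finisher function will not be called. The combiner function is only called for parallel streams. In your output, "combiner invoked" only shows that the combiner() method was asked to return the function; the function itself never ran, because the stream was sequential.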
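
The behavior described above can be observed with a small experiment. This is a sketch under assumptions: the class `CombinerDemo` and the method `histogram` are hypothetical names, and the two counters exist only to record how often the combiner and finisher functions actually run. The collector declares IDENTITY_FINISH just like the one in the question.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collector;

class CombinerDemo {
    // Hypothetical helper: a char-histogram collector that counts how many times
    // its combiner and finisher functions are invoked.
    static Collector<String, Map<Character, Integer>, Map<Character, Integer>> histogram(
            AtomicInteger combines, AtomicInteger finishes) {
        return Collector.of(
                HashMap::new,
                (map, word) -> word.chars().forEach(c -> map.merge((char) c, 1, Integer::sum)),
                (left, right) -> {
                    combines.incrementAndGet();
                    right.forEach((k, v) -> left.merge(k, v, Integer::sum));
                    return left;
                },
                map -> {
                    finishes.incrementAndGet();
                    return map;
                },
                // Same characteristics as in the question.
                Collector.Characteristics.IDENTITY_FINISH,
                Collector.Characteristics.UNORDERED);
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("apple", "orange", "orange");

        AtomicInteger combines = new AtomicInteger();
        AtomicInteger finishes = new AtomicInteger();
        words.stream().collect(histogram(combines, finishes));
        System.out.println("sequential: combiner calls = " + combines + ", finisher calls = " + finishes);

        combines.set(0);
        finishes.set(0);
        words.parallelStream().collect(histogram(combines, finishes));
        // The number of combiner calls depends on how the stream is split.
        System.out.println("parallel: combiner calls = " + combines + ", finisher calls = " + finishes);
    }
}
```

In the sequential run both counters stay at zero. In the parallel run the partial maps have to be merged, so the combiner count is normally greater than zero, while the finisher is still skipped because of IDENTITY_FINISH.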

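【Discussion】:

  • I don't really understand what finisher() does. Is the code above correct for the use case?
  • Yes, apart from the state in the collector. For this use case, neither the combiner nor the finisher is necessary.
  • But I need to be able to combine 2 results. How do I do that?
  • @John You only need that capability when the stream runs in parallel; beyond that, I don't really understand your question here.
  • @John If you want to learn how to write a custom collector, you should still stick to best practices and learn those. First, there is no reason to implement Collector yourself. You can simply call Collector.of(…), passing three or four functions and optional characteristics. When you omit the finisher, IDENTITY_FINISH is added automatically. Furthermore, your combiner function shows that you know Map.merge, so why not use it in the accumulator as well? And don't copy the entire string into a char[] array just for a for loop. Use string.chars().forEach(…)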
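
Taken together, the suggestions in the last comment could look roughly like the sketch below. It is one possible reading, not code posted in the thread: `CharHistograms`, `merge`, and `toCharHistogram` are hypothetical names, and reusing the combiner as a plain static method to join two finished results is an assumption about what "combine 2 results" means.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.stream.Collector;
import java.util.stream.Stream;

class CharHistograms {
    // Merges 'right' into 'left' and returns 'left'. Used as the collector's
    // combiner and, directly, to combine two finished results.
    static Map<Character, Integer> merge(Map<Character, Integer> left,
                                         Map<Character, Integer> right) {
        right.forEach((k, v) -> left.merge(k, v, Integer::sum));
        return left;
    }

    // Built with Collector.of; no finisher is given, so IDENTITY_FINISH is implied.
    static Collector<String, ?, Map<Character, Integer>> toCharHistogram() {
        return Collector.of(
                HashMap::new,
                (map, word) -> word.chars().forEach(c -> map.merge((char) c, 1, Integer::sum)),
                CharHistograms::merge,
                Collector.Characteristics.UNORDERED);
    }

    public static void main(String[] args) {
        Map<Character, Integer> first =
                Stream.of("apple", "orange", "orange").collect(toCharHistogram());
        Map<Character, Integer> second =
                Stream.of("pears", "pears", "orange").collect(toCharHistogram());

        // Copy 'first' so that neither input map is modified by the merge.
        Map<Character, Integer> combined = merge(new HashMap<>(first), second);
        System.out.println(combined);
    }
}
```

Exposing the merge step as an ordinary method means the same logic serves parallel streams and the explicit "combine two results" requirement from the question.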