【Question Title】: Project Reactor - How to create a sliding window Flux from a Java 8 stream
【Posted】: 2017-06-17 14:59:51
【Question】:

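Java 8 Streams cannot be reused. This creates a puzzle: how do you reuse a stream when building a sliding-window Flux to compute a relation such as x(i)*x(i-1)?

The following code is based on the idea of a shift operator: I shift the first stream with skip(1) to create the second one.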

Flux<Integer> primary = Flux.fromStream(IntStream.rangeClosed(1, 10).boxed()); // 1..10, as in the diagram below
Flux<Integer> secondary = primary.skip(1);
primary.zipWith(secondary)
        .map(t -> t.getT1() * t.getT2())
        .subscribe(System.out::println);

Here is a visual representation of the code above:

1 2 3 4 5 6 7 8 9 10
v v v v v v v v v v  skip(1)
2 3 4 5 6 7 8 9 10
v v v v v v v v v v  zipWith
1 2, 2 3, 3 4, 4 5, 5 6, 6 7, 7 8, 8 9, 9 10 <- sliding window of length 2
v v v v v v v v v v  multiply
2 6 12 20 30 42 56 72 90

Unfortunately, this code fails with:

java.lang.IllegalStateException: stream has already been operated upon or closed
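
The exception comes from the JDK rather than from Reactor: `primary.zipWith(secondary)` subscribes to `primary` twice (once directly and once through `secondary`), and each subscription has to start a new traversal of the same `Stream` object. The sketch below reproduces that with the JDK alone; the class and method names are made up for illustration.

```java
import java.util.Iterator;
import java.util.function.Supplier;
import java.util.stream.IntStream;
import java.util.stream.Stream;

final class StreamReuseDemo {

    // Returns true if starting a second traversal of the same Stream throws.
    static boolean secondTraversalFails() {
        Stream<Integer> stream = IntStream.rangeClosed(1, 10).boxed();
        Iterator<Integer> first = stream.iterator(); // first traversal: allowed
        first.next();
        try {
            stream.iterator();                       // second traversal of the same Stream
            return false;
        } catch (IllegalStateException e) {
            return true;                             // stream has already been operated upon or closed
        }
    }

    // A Supplier hands out a fresh Stream for every traversal, so nothing is reused.
    static int sumTwice(Supplier<Stream<Integer>> source) {
        int a = source.get().mapToInt(Integer::intValue).sum();
        int b = source.get().mapToInt(Integer::intValue).sum();
        return a + b;
    }

    public static void main(String[] args) {
        System.out.println(secondTraversalFails());                                 // prints true
        System.out.println(sumTwice(() -> IntStream.rangeClosed(1, 10).boxed()));   // prints 110
    }
}
```

Handing out a fresh stream per traversal avoids the exception, but it re-executes the source, which is exactly the cost the question wants to avoid for large inputs.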

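The obvious workaround is to cache the elements and make sure the cache size is greater than or equal to the stream size:

Flux<Integer> primary = Flux.fromStream(IntStream.rangeClosed(1, 10).boxed()).cache(10);

or to replace the stream with a Flux source:

Flux<Integer> primary = Flux.range(1, 10); // start = 1, count = 10

The second workaround simply re-executes the original sequence for the skip(1) branch.

However, an efficient solution needs only a buffer of size 2. This matters if the stream happens to be a large file: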

Files.lines(Paths.get(megaFile));

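How can I buffer the stream efficiently, so that subscribing to the primary Flux more than once neither reads everything into memory nor re-executes the source?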
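
To make the "buffer of size 2" requirement concrete, here is a minimal plain-Java sketch (no Reactor involved) that computes x(i)*x(i-1) in a single pass while remembering only the previous element. `PairwiseProducts` and `products` are hypothetical names; the result list exists only so the output can be inspected, and a streaming version would emit each product instead of collecting it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

final class PairwiseProducts {

    // One pass over the source; the only state kept is the previous element.
    static List<Integer> products(Iterator<Integer> source) {
        List<Integer> out = new ArrayList<>();
        if (!source.hasNext()) {
            return out;
        }
        int previous = source.next();
        while (source.hasNext()) {
            int current = source.next();
            out.add(previous * current);
            previous = current;
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
        System.out.println(products(input.iterator())); // prints [2, 6, 12, 20, 30, 42, 56, 72, 90]
    }
}
```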

【Comments】:

    Tags: java project-reactor


    【Solution 1】:

    I finally found a solution, although it is not buffer-oriented. The idea is to first solve the problem for a sliding window of size 2:

    Flux<Integer> primary = Flux.fromStream(IntStream.range(0, 10).boxed());
    primary.flatMap(num -> Flux.just(num, num))
        .skip(1)
        .buffer(2)
        .filter(list -> list.size() == 2)
        .map(list -> Arrays.toString(list.toArray()))
        .subscribe(System.out::println);
    

    The process is visualized below (drawn for the values 1 to 9; the code above uses 0 to 9):

    1 2 3 4 5 6 7 8 9 
    V V V V V V V V V    Flux.just(num, num)
    1 1 2 2 3 3 4 4 5 5 6 6 7 7 8 8 9 9
    V V V V V V V V V    skip(1)
    1 2 2 3 3 4 4 5 5 6 6 7 7 8 8 9 9
    V V V V V V V V V    buffer(2)
    1 2, 2 3, 3 4, 4 5, 5 6, 6 7, 7 8, 8 9, 9
    V V V V V V V V V    filter
    1 2, 2 3, 3 4, 4 5, 5 6, 6 7, 7 8, 8 9
    

    Here is the output:

    [0, 1]
    [1, 2]
    [2, 3]
    [3, 4]
    [4, 5]
    [5, 6]
    [6, 7]
    [7, 8]
    [8, 9]
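
The duplicate / skip / buffer / filter steps above can also be traced with plain collections. The following is only a model of the operator chain on an in-memory list, not Reactor code; `DuplicateSkipBuffer` and `pairs` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class DuplicateSkipBuffer {

    static <T> List<List<T>> pairs(List<T> input) {
        // Step 1: emit every element twice, like flatMap(num -> Flux.just(num, num)).
        List<T> doubled = new ArrayList<>();
        for (T item : input) {
            doubled.add(item);
            doubled.add(item);
        }
        // Step 2: drop the first element, like skip(1).
        List<T> shifted = doubled.isEmpty() ? doubled : doubled.subList(1, doubled.size());
        // Steps 3 and 4: cut into chunks of two, like buffer(2); the loop bound
        // leaves out the trailing one-element chunk, like the size filter.
        List<List<T>> result = new ArrayList<>();
        for (int i = 0; i + 1 < shifted.size(); i += 2) {
            result.add(new ArrayList<>(shifted.subList(i, i + 2)));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(pairs(Arrays.asList(0, 1, 2, 3))); // prints [[0, 1], [1, 2], [2, 3]]
    }
}
```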
    

    I then generalized the idea above into a solution for an arbitrary sliding window size:

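    import java.util.Arrays;
    import java.util.LinkedList;
    import java.util.List;
    import java.util.stream.Collectors;
    import java.util.stream.IntStream;
    import java.util.stream.Stream;

    import reactor.core.publisher.Flux;

    public class SlidingWindow {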
    
        public static void main(String[] args) {
            System.out.println("Different sliding windows for sequence 0 to 9:");
            SlidingWindow flux = new SlidingWindow();
            for (int windowSize = 1; windowSize < 5; windowSize++) {
                flux.slidingWindow(windowSize, IntStream.range(0, 10).boxed())
                    .map(SlidingWindow::listToString)
                    .subscribe(System.out::print);
                System.out.println();
            }
    
            // Show the stream difference: x(i) - x(i-1)
            List<Integer> sequence = Arrays.asList(10, 12, 11, 9, 13, 17, 21);
            System.out.println("Show difference 'x(i)-x(i-1)' for " + listToString(sequence));
            flux.slidingWindow(2, sequence.stream())
                .doOnNext(SlidingWindow::printlist)
                .map(list -> list.get(1) - list.get(0))
                .subscribe(System.out::println);
            System.out.println();
        }
    
        public <T> Flux<List<T>> slidingWindow(int windowSize, Stream<T> stream) {
            if (windowSize > 0) {
                Flux<List<T>> flux = Flux.fromStream(stream).map(ele -> Arrays.asList(ele));
                for (int i = 1; i < windowSize; i++) {
                    flux = addDepth(flux);
                }
                return flux;
            } else {
                return Flux.empty();
            }
        }
    
        protected <T> Flux<List<T>> addDepth(Flux<List<T>> flux) {
            return flux.flatMap(list -> Flux.just(list, list))
                .skip(1)
                .buffer(2)
                .filter(list -> list.size() == 2)
                .map(list -> flatten(list));
        }
    
        protected <T> List<T> flatten(List<List<T>> list) {
            LinkedList<T> newl = new LinkedList<>(list.get(1));
            newl.addFirst(list.get(0).get(0));
            return newl;
        }
    
        // Formats a window as "[ a, b, c ], " (the trailing separator is intentional).
        static String listToString(List<?> list) {
            return list.stream()
                .map(Object::toString)
                .collect(Collectors.joining(", ", "[ ", " ], "));
        }
    
        static void printlist(List<?> list) {
            System.out.print(listToString(list));
        }
    
    }
    

    The output of the code above is:

    Different sliding windows for sequence 0 to 9:
    [ 0 ], [ 1 ], [ 2 ], [ 3 ], [ 4 ], [ 5 ], [ 6 ], [ 7 ], [ 8 ], [ 9 ], 
    [ 0, 1 ], [ 1, 2 ], [ 2, 3 ], [ 3, 4 ], [ 4, 5 ], [ 5, 6 ], [ 6, 7 ], [ 7, 8 ], [ 8, 9 ], 
    [ 0, 1, 2 ], [ 1, 2, 3 ], [ 2, 3, 4 ], [ 3, 4, 5 ], [ 4, 5, 6 ], [ 5, 6, 7 ], [ 6, 7, 8 ], [ 7, 8, 9 ], 
    [ 0, 1, 2, 3 ], [ 1, 2, 3, 4 ], [ 2, 3, 4, 5 ], [ 3, 4, 5, 6 ], [ 4, 5, 6, 7 ], [ 5, 6, 7, 8 ], [ 6, 7, 8, 9 ], 
    
    Show difference 'x(i)-x(i-1)' for [ 10, 12, 11, 9, 13, 17, 21 ], 
    [ 10, 12 ], 2
    [ 12, 11 ], -1
    [ 11, 9 ], -2
    [ 9, 13 ], 4
    [ 13, 17 ], 4
    [ 17, 21 ], 4
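
The generalization works by induction: each `addDepth` call pairs up consecutive windows of size k, and `flatten` turns every pair into one window of size k+1 by putting the first element of the earlier window in front of the later window. A minimal plain-Java sketch of that single step, using hypothetical names (`WindowGrowth`, `merge`, `grow`) and an in-memory list instead of a Flux:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;

final class WindowGrowth {

    // Mirrors flatten(): head of the earlier window + the whole later window.
    static <T> List<T> merge(List<T> earlier, List<T> later) {
        LinkedList<T> merged = new LinkedList<>(later);
        merged.addFirst(earlier.get(0));
        return merged;
    }

    // One addDepth-like step: combine each window with its successor.
    static <T> List<List<T>> grow(List<List<T>> windows) {
        List<List<T>> out = new ArrayList<>();
        for (int i = 0; i + 1 < windows.size(); i++) {
            out.add(merge(windows.get(i), windows.get(i + 1)));
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<Integer>> size2 = Arrays.asList(
                Arrays.asList(0, 1), Arrays.asList(1, 2), Arrays.asList(2, 3));
        System.out.println(grow(size2)); // prints [[0, 1, 2], [1, 2, 3]]
    }
}
```

Each step shortens the sequence of windows by one, which matches the output above: ten windows of size 1, nine of size 2, eight of size 3, and seven of size 4.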
    

    【Comments】:

    • You can write primary.flatMap(num -> Flux.just(num, num)).skip(1).buffer(2) as primary.buffer(2, 1)
    【Solution 2】:

    I have implemented the following solution:

    public <T> Flux<Flux<T>> toSlidingWindow(Flux<T> source, int size) {
        return toSlidingWindow(source, deque -> {
            while (deque.size() > size) {
                deque.poll();
            }
            return Flux.fromIterable(deque);
        });
    }
    
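    public <T> Flux<Flux<T>> toSlidingWindow(Flux<T> source, Function<Deque<T>, Flux<T>> dequePruneFunction) {
        // Window state shared by all elements of the source; mutated as each element arrives.
        AtomicReference<Deque<T>> dequeAtomicReference = new AtomicReference<>(new LinkedList<>());
        return source.map(ohlc -> {
            Deque<T> deque = dequeAtomicReference.get();
            deque.offer(ohlc);
            return dequePruneFunction.apply(deque);
        });
    }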
    

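    This can be a fixed-size sliding window, or you can pass a custom function that decides the extent of each window.

    If using it like this causes any multithreading problems, you can copy the Deque inside an acquire/release block, which AtomicReference appears to support (getAcquire/setRelease, available since Java 9). This ensures that a window Flux that has already been emitted is not mutated by other threads.

    Something like this: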

    public <T> Flux<Flux<T>> toSlidingWindowAsync(Flux<T> source, int size) {
        return toSlidingWindowAsync(source, deque -> {
            while (deque.size() > size) {
                deque.poll();
            }
            return Flux.fromIterable(new LinkedList<>(deque));
        });
    }
    
    public <T> Flux<Flux<T>> toSlidingWindowAsync(Flux<T> source, Function<Deque<T>, Flux<T>> dequePruneFunction) {
        AtomicReference<Deque<T>> dequeAtomicReference = new AtomicReference<>(new LinkedList<>());
        return source.map(ohlc -> {
            Deque<T> deque = dequeAtomicReference.getAcquire();
            deque.offer(ohlc);
            Flux<T> windowFlux = dequePruneFunction.apply(deque);
            dequeAtomicReference.setRelease(deque);
            return windowFlux;
        });
    }
    

    This copies the Deque for every sliding window that is emitted.
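
Why the copy matters can be seen without Reactor or threads: a window that merely wraps the shared deque keeps changing as later elements arrive, while a copy stays fixed. The sketch below is a plain-Java model of the prune-and-copy step with hypothetical names (`WindowSnapshot`, `snapshots`); it says nothing about the memory-ordering side of getAcquire/setRelease.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class WindowSnapshot {

    // Pushes the values through a deque capped at `size` and records a copy of
    // the deque after every push; the copies stay fixed while the deque moves on.
    static List<List<Integer>> snapshots(int size, int... values) {
        Deque<Integer> deque = new ArrayDeque<>();
        List<List<Integer>> windows = new ArrayList<>();
        for (int value : values) {
            deque.offer(value);
            while (deque.size() > size) {
                deque.poll(); // drop the oldest element
            }
            windows.add(new ArrayList<>(deque)); // copy, not a live view
        }
        return windows;
    }

    public static void main(String[] args) {
        System.out.println(snapshots(2, 1, 2, 3, 4)); // prints [[1], [1, 2], [2, 3], [3, 4]]
    }
}
```

As in the answer's code, a window is produced for every incoming element, so the first windows are shorter than the requested size.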

    【Comments】:

      【Solution 3】:

      If you are using Reactor Core 3 (I am not sure when this operator was released), you can simply use

          Flux.fromStream(IntStream.rangeClosed(1, 10).boxed())
                  .buffer(2, 1)
                  .skipLast(1)
                  .map(t -> t.stream().reduce(1, (a, b) -> a * b)) // identity 1 yields an Integer rather than an Optional
                  .subscribe(System.out::println);
      

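      The magic is in the buffer(2, 1) part: here maxSize is 2 and skip is 1. Because maxSize is greater than skip, this creates overlapping buffers (i.e. a sliding window) over the Flux and emits each buffer as a List. The skipLast(1) is needed because the last buffer contains only a single element (the 10), so it has to be dropped.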
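
The overlapping-buffer behaviour described above can be modelled on an in-memory list. This is a plain-Java approximation of the maxSize/skip semantics for the overlapping case only, written under the assumption that buffers still open at the end are emitted partially filled (which is what the single-element last buffer in the answer shows); it is not Reactor's implementation, and `OverlapBufferModel` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class OverlapBufferModel {

    // A new buffer starts every `skip` elements and holds up to `maxSize`
    // elements; buffers near the end of the input come out shorter.
    static <T> List<List<T>> buffer(List<T> input, int maxSize, int skip) {
        List<List<T>> out = new ArrayList<>();
        for (int start = 0; start < input.size(); start += skip) {
            int end = Math.min(start + maxSize, input.size());
            out.add(new ArrayList<>(input.subList(start, end)));
        }
        return out;
    }

    // Models buffer(2, 1) followed by skipLast(1) and the multiplication.
    static List<Integer> products(List<Integer> input) {
        List<List<Integer>> windows = buffer(input, 2, 1);
        List<Integer> out = new ArrayList<>();
        // Leave out the trailing single-element buffer, then multiply each pair.
        for (List<Integer> w : windows.subList(0, Math.max(0, windows.size() - 1))) {
            out.add(w.get(0) * w.get(1));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(buffer(Arrays.asList(1, 2, 3), 2, 1)); // prints [[1, 2], [2, 3], [3]]
        System.out.println(products(Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)));
        // prints [2, 6, 12, 20, 30, 42, 56, 72, 90]
    }
}
```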

      【Comments】:
