【问题标题】:Why filtering an unsorted list is faster than filtering a sorted list为什么过滤未排序列表比过滤排序列表更快
【发布时间】:2014-02-25 04:54:45
【问题描述】:

我一直在使用 Java 8 Streams - API,我决定对 stream()parallelStream() 流进行微基准测试。正如预期的那样,parallelStream() 的速度是原来的两倍,但弹出了其他东西 - 如果我在将数据传递给filter 之前对数据进行排序,那么filter->map->collect 的结果要比传递未排序的数据多 5-8 倍列表。

未分类

(Stream) Elapsed time [ns] : 53733996 (53 ms)
(ParallelStream) Elapsed time [ns] : 25901907 (25 ms)

已排序

(Stream) Elapsed time [ns] : 336976149 (336 ms)
(ParallelStream) Elapsed time [ns] : 204781387 (204 ms)

这里是代码

package com.github.svetlinzarev.playground.javalang.lambda;

import static java.lang.Long.valueOf;

import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.stream.Collectors;

import com.github.svetlinzarev.playground.util.time.Stopwatch;

public class MyFirstLambda {
    private static final int ELEMENTS = 1024 * 1024 * 16;

    private static List<Integer> getRandom(int nElements) {
        final Random random = new Random();
        final List<Integer> data = new ArrayList<Integer>(nElements);
        for (int i = 0; i < MyFirstLambda.ELEMENTS; i++) {
            data.add(random.nextInt(MyFirstLambda.ELEMENTS));
        }
        return data;
    }

    private static void benchStream(List<Integer> data) {
        final Stopwatch stopwatch = new Stopwatch();
        final List<Long> smallLongs = data.stream()
                .filter(i -> i.intValue() < 16)
                .map(Long::valueOf)
                .collect(Collectors.toList());
        stopwatch.log("Stream");
        System.out.println(smallLongs);
    }

    private static void benchParallelStream(List<Integer> data) {
        final Stopwatch stopwatch = new Stopwatch();
        final List<Long> smallLongs = data.parallelStream()
                .filter(i -> i.intValue() < 16)
                .map(Long::valueOf)
                .collect(Collectors.toList());
        stopwatch.log("ParallelStream");
        System.out.println(smallLongs);
    }

    public static void main(String[] args) {
        final List<Integer> data = MyFirstLambda.getRandom(MyFirstLambda.ELEMENTS);
        // Collections.sort(data, (first, second) -> first.compareTo(second)); //<- Sort the data

        MyFirstLambda.benchStream(data);
        MyFirstLambda.benchParallelStream(data);

        MyFirstLambda.benchStream(data);
        MyFirstLambda.benchParallelStream(data);

        MyFirstLambda.benchStream(data);
        MyFirstLambda.benchParallelStream(data);

        MyFirstLambda.benchStream(data);
        MyFirstLambda.benchParallelStream(data);

        MyFirstLambda.benchStream(data);
        MyFirstLambda.benchParallelStream(data);
    }
}

更新

这是一个更好的基准代码

package com.github.svetlinzarev.playground.javalang.lambda;

import static java.lang.Long.valueOf;

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;
import java.util.stream.Collectors;

import com.github.svetlinzarev.playground.util.time.Stopwatch;

public class MyFirstLambda {
    private static final int ELEMENTS = 1024 * 1024 * 10;
    private static final int SMALLER_THAN = 16;
    private static final int WARM_UP_ITERRATIONS = 1000;

    private static List<Integer> getRandom(int nElements) {
        final Random random = new Random();
        final List<Integer> data = new ArrayList<Integer>(nElements);
        for (int i = 0; i < MyFirstLambda.ELEMENTS; i++) {
            data.add(random.nextInt(MyFirstLambda.ELEMENTS));
        }
        return data;
    }

    private static List<Long> filterStream(List<Integer> data) {
        final List<Long> smallLongs = data.stream()
                .filter(i -> i.intValue() < MyFirstLambda.SMALLER_THAN)
                .map(Long::valueOf)
                .collect(Collectors.toList());
        return smallLongs;
    }

    private static List<Long> filterParallelStream(List<Integer> data) {
        final List<Long> smallLongs = data.parallelStream()
                .filter(i -> i.intValue() < MyFirstLambda.SMALLER_THAN)
                .map(Long::valueOf)
                .collect(Collectors.toList());
        return smallLongs;
    }

    private static long filterAndCount(List<Integer> data) {
        return data.stream()
                .filter(i -> i.intValue() < MyFirstLambda.SMALLER_THAN)
                .count();
    }

    private static long filterAndCountinParallel(List<Integer> data) {
        return data.parallelStream()
                .filter(i -> i.intValue() < MyFirstLambda.SMALLER_THAN)
                .count();
    }

    private static void warmUp(List<Integer> data) {
        for (int i = 0; i < MyFirstLambda.WARM_UP_ITERRATIONS; i++) {
            MyFirstLambda.filterStream(data);
            MyFirstLambda.filterParallelStream(data);
            MyFirstLambda.filterAndCount(data);
            MyFirstLambda.filterAndCountinParallel(data);
        }
    }

    private static void benchmark(List<Integer> data, String message) throws InterruptedException {
        System.gc();
        Thread.sleep(1000); // Give it enough time to complete the GC cycle

        final Stopwatch stopwatch = new Stopwatch();
        MyFirstLambda.filterStream(data);
        stopwatch.log("Stream: " + message);

        System.gc();
        Thread.sleep(1000); // Give it enough time to complete the GC cycle

        stopwatch.reset();
        MyFirstLambda.filterParallelStream(data);
        stopwatch.log("ParallelStream: " + message);

        System.gc();
        Thread.sleep(1000); // Give it enough time to complete the GC cycle

        stopwatch.reset();
        MyFirstLambda.filterAndCount(data);
        stopwatch.log("Count: " + message);

        System.gc();
        Thread.sleep(1000); // Give it enough time to complete the GC cycle

        stopwatch.reset();
        MyFirstLambda.filterAndCount(data);
        stopwatch.log("Count in parallel: " + message);
    }

    public static void main(String[] args) throws InterruptedException {
        final List<Integer> data = MyFirstLambda.getRandom(MyFirstLambda.ELEMENTS);

        MyFirstLambda.warmUp(data);
        MyFirstLambda.benchmark(data, "UNSORTED");

        Collections.sort(data, (first, second) -> first.compareTo(second));
        MyFirstLambda.benchmark(data, "SORTED");

        Collections.sort(data, (first, second) -> second.compareTo(first));
        MyFirstLambda.benchmark(data, "IN REVERSE ORDER");

    }
}

结果还是相似的:

   16:09:20.470 [main] INFO  c.g.s.playground.util.time.Stopwatch - (Stream: UNSORTED) Elapsed time [ns] : 66812263 (66 ms)
16:09:22.149 [main] INFO  c.g.s.playground.util.time.Stopwatch - (ParallelStream: UNSORTED) Elapsed time [ns] : 39580682 (39 ms)
16:09:23.875 [main] INFO  c.g.s.playground.util.time.Stopwatch - (Count: UNSORTED) Elapsed time [ns] : 97852866 (97 ms)
16:09:25.537 [main] INFO  c.g.s.playground.util.time.Stopwatch - (Count in parallel: UNSORTED) Elapsed time [ns] : 94884189 (94 ms)
16:09:35.608 [main] INFO  c.g.s.playground.util.time.Stopwatch - (Stream: SORTED) Elapsed time [ns] : 361717676 (361 ms)
16:09:38.439 [main] INFO  c.g.s.playground.util.time.Stopwatch - (ParallelStream: SORTED) Elapsed time [ns] : 150115808 (150 ms)
16:09:41.308 [main] INFO  c.g.s.playground.util.time.Stopwatch - (Count: SORTED) Elapsed time [ns] : 338335743 (338 ms)
16:09:44.209 [main] INFO  c.g.s.playground.util.time.Stopwatch - (Count in parallel: SORTED) Elapsed time [ns] : 370968432 (370 ms)
16:09:50.693 [main] INFO  c.g.s.playground.util.time.Stopwatch - (Stream: IN REVERSE ORDER) Elapsed time [ns] : 352036140 (352 ms)
16:09:53.323 [main] INFO  c.g.s.playground.util.time.Stopwatch - (ParallelStream: IN REVERSE ORDER) Elapsed time [ns] : 151044664 (151 ms)
16:09:56.159 [main] INFO  c.g.s.playground.util.time.Stopwatch - (Count: IN REVERSE ORDER) Elapsed time [ns] : 359281197 (359 ms)
16:09:58.991 [main] INFO  c.g.s.playground.util.time.Stopwatch - (Count in parallel: IN REVERSE ORDER) Elapsed time [ns] : 353177542 (353 ms)

那么,我的问题是为什么过滤未排序列表比过滤排序列表更快?

【问题讨论】:

  • 我假设您已经多次迭代此基准并计算了平均值和标准值。您给出的数字的偏差。否则你的数字是垃圾。你知道,PC 有一个调度程序,所以 CPU 时间和挂钟时间几乎永远不会匹配。
  • @Stefano Sanfilippo - 是的,我有。但我对确切的数字不感兴趣,但为什么对数据进行排序会减慢进程
  • @Andrei Andrei - 我很熟悉这个 SO 问题,这正是我问的原因 - 因为它表现出 OPPOSITE 行为
  • 为什么是重复的?这是关于缓存未命中的问题,而另一个问题是关于分支预测的。

标签: java lambda java-8


【解决方案1】:

当您使用未排序列表时,所有元组都在 记忆顺序。它们已在 RAM 中连续分配。 CPU喜欢 顺序访问内存,因为它们可以推测性地请求 下一个缓存行,以便在需要时始终存在。

当您对列表进行排序时,您会将其随机排列,因为 您的排序键是随机生成的。这意味着内存 对元组成员的访问是不可预测的。 CPU 无法预取 内存,几乎每次访问元组都是缓存未命中。

这是一个很好的例子,展示了 GC 内存的特定优势 管理:被分配在一起的数据结构 一起使用表现非常好。他们有很大的地方性 参考。

缓存未命中的损失超过了保存的分支预测 在这种情况下会受到惩罚。

这个问题已被接受,也回答了我的问题: Why is processing a sorted array slower than an unsorted array?

当我创建原始的List 排序时 - 即它的元素顺序在内存中,执行时间没有区别,当List 填充随机数时它等于unsorted 版本。

【讨论】:

    猜你喜欢
    • 2023-04-04
    • 2017-03-11
    • 1970-01-01
    • 1970-01-01
    • 1970-01-01
    • 2012-08-23
    • 2019-03-31
    • 1970-01-01
    相关资源
    最近更新 更多