Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
117 views
in Technique[技术] by (71.8m points)

Java 8: performance of Streams vs Collections

I'm new to Java 8. I still don't know the API in depth, but I've made a small informal benchmark to compare the performance of the new Streams API vs the good old Collections.

The test consists in filtering a list of Integer, and for each even number, calculate the square root and storing it in a result List of Double.

Here is the code:

    public static void main(String[] args) {
        //Calculating square root of even numbers from 1 to N       
        int min = 1;
        int max = 1000000;

        List<Integer> sourceList = new ArrayList<>();
        for (int i = min; i < max; i++) {
            sourceList.add(i);
        }

        List<Double> result = new LinkedList<>();


        //Collections approach
        long t0 = System.nanoTime();
        long elapsed = 0;
        for (Integer i : sourceList) {
            if(i % 2 == 0){
                result.add(Math.sqrt(i));
            }
        }
        elapsed = System.nanoTime() - t0;       
        System.out.printf("Collections: Elapsed time: %d ns (%f seconds)%n", elapsed, elapsed / Math.pow(10, 9));


        //Stream approach
        Stream<Integer> stream = sourceList.stream();       
        t0 = System.nanoTime();
        result = stream.filter(i -> i%2 == 0).map(i -> Math.sqrt(i)).collect(Collectors.toList());
        elapsed = System.nanoTime() - t0;       
        System.out.printf("Streams: Elapsed time: %d ns (%f seconds)%n", elapsed, elapsed / Math.pow(10, 9));


        //Parallel stream approach
        stream = sourceList.stream().parallel();        
        t0 = System.nanoTime();
        result = stream.filter(i -> i%2 == 0).map(i -> Math.sqrt(i)).collect(Collectors.toList());
        elapsed = System.nanoTime() - t0;       
        System.out.printf("Parallel streams: Elapsed time: %d ns (%f seconds)%n", elapsed, elapsed / Math.pow(10, 9));      
    }.

And here are the results for a dual core machine:

    Collections: Elapsed time:        94338247 ns   (0,094338 seconds)
    Streams: Elapsed time:           201112924 ns   (0,201113 seconds)
    Parallel streams: Elapsed time:  357243629 ns   (0,357244 seconds)

For this particular test, streams are about twice as slow as collections, and parallelism doesn't help (or either I'm using it the wrong way?).

Questions:

  • Is this test fair? Have I made any mistake?
  • Are streams slower than collections? Has anyone made a good formal benchmark on this?
  • Which approach should I strive for?

Updated results.

I ran the test 1k times after JVM warmup (1k iterations) as advised by @pveentjer:

    Collections: Average time:      206884437,000000 ns     (0,206884 seconds)
    Streams: Average time:           98366725,000000 ns     (0,098367 seconds)
    Parallel streams: Average time: 167703705,000000 ns     (0,167704 seconds)

In this case streams are more performant. I wonder what would be observed in an app where the filtering function is only called once or twice during runtime.

Question&Answers:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)
  1. Stop using LinkedList for anything but heavy removing from the middle of the list using iterator.

  2. Stop writing benchmarking code by hand, use JMH.

Proper benchmarks:

@OutputTimeUnit(TimeUnit.NANOSECONDS)
@BenchmarkMode(Mode.AverageTime)
@OperationsPerInvocation(StreamVsVanilla.N)
public class StreamVsVanilla {
    public static final int N = 10000;

    static List<Integer> sourceList = new ArrayList<>();
    static {
        for (int i = 0; i < N; i++) {
            sourceList.add(i);
        }
    }

    @Benchmark
    public List<Double> vanilla() {
        List<Double> result = new ArrayList<>(sourceList.size() / 2 + 1);
        for (Integer i : sourceList) {
            if (i % 2 == 0){
                result.add(Math.sqrt(i));
            }
        }
        return result;
    }

    @Benchmark
    public List<Double> stream() {
        return sourceList.stream()
                .filter(i -> i % 2 == 0)
                .map(Math::sqrt)
                .collect(Collectors.toCollection(
                    () -> new ArrayList<>(sourceList.size() / 2 + 1)));
    }
}

Result:

Benchmark                   Mode   Samples         Mean   Mean error    Units
StreamVsVanilla.stream      avgt        10       17.588        0.230    ns/op
StreamVsVanilla.vanilla     avgt        10       10.796        0.063    ns/op

Just as I expected stream implementation is fairly slower. JIT is able to inline all lambda stuff but doesn't produce as perfectly concise code as vanilla version.

Generally, Java 8 streams are not magic. They couldn't speedup already well-implemented things (with, probably, plain iterations or Java 5's for-each statements replaced with Iterable.forEach() and Collection.removeIf() calls). Streams are more about coding convenience and safety. Convenience -- speed tradeoff is working here.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...