Java 8+ Functional Programming :: Collections & Streams (1)

The Concept of a Collection

A collection represents a group of objects, known as its elements.

 

The Collection Interface

JDK 9 defines this interface as follows:

public interface Collection extends Iterable {
    int size();
    boolean isEmpty();
    boolean contains(Object o);
    Iterator iterator();
    Object[] toArray();
     T[] toArray(T[] a);
    boolean add(E e);
    boolean remove(Object o);
    boolean containsAll(Collection c);
    boolean addAll(Collection c);
    boolean removeAll(Collection c);
    default boolean removeIf(Predicate filter) { /* ... */ }
    boolean retainAll(Collection c);
    void clear();
    boolean equals(Object o);
    int hashCode();
    default Spliterator spliterator() { /* ... */ }

    /**
     * Returns a sequential {@code Stream} with this collection as its source.
     *
     * 

This method should be overridden when the {@link #spliterator()} * method cannot return a spliterator that is {@code IMMUTABLE}, * {@code CONCURRENT}, or late-binding. (See {@link #spliterator()} * for details.) * * @implSpec * The default implementation creates a sequential {@code Stream} from the * collection's {@code Spliterator}. * * @return a sequential {@code Stream} over the elements in this collection * @since 1.8 */ default Stream stream() { return StreamSupport.stream(spliterator(), false); } default Stream parallelStream() { /* ... */ } }

 

The Concept of a Stream

From JDK 9:

A sequence of elements supporting sequential and parallel aggregate perations. The following example illustrates an aggregate operation using Stream and IntStream:

    int sum = widgets.stream()
                     .filter(w -> w.getColor() == RED)
                     .mapToInt(w -> w.getWeight())
                     .sum();

In this example, widgets is a Collection. We create a stream of Widget objects via Collection#stream Collection.stream(), filter it to produce a stream containing only the red widgets, and then transform it into a stream of int values representing the weight of each red widget. Then this stream is summed to produce a total weight.

In addition to Stream, which is a stream of object references, there are primitive specializations for IntStream, LongStream, and DoubleStream, all of which are referred to as “streams” and conform to the characteristics and restrictions described here.

To perform a computation, stream operations are composed into a stream pipeline. A stream pipeline consists of a source (which might be an array, a collection, a generator function, an I/O channel, etc), zero or more intermediate operations (which transform a stream into another stream, such as Stream#filter(Predicate)), and a terminal operation (which produces a result or side-effect, such as Stream#count() or Stream#forEach(Consumer)). Streams are lazy; computation on the source data is only performed when the terminal operation is initiated, and source elements are consumed only as needed.

A stream implementation is permitted significant latitude in optimizing the computation of the result. For example, a stream implementation is free to elide operations (or entire stages) from a stream pipeline — and therefore elide invocation of behavioral parameters — if it can prove that it would not affect the result of the computation. This means that side-effects of behavioral parameters may not always be executed and should not be relied upon, unless otherwise specified (such as by the terminal operations forEach and forEachOrdered). (For a specific example of such an optimization, see the API note documented on the #count operation. For more detail, see the side-effects section of the stream package documentation.)

Collections and streams, while bearing some superficial similarities, have different goals. Collections are primarily oncerned with the efficient management of, and access to, their elements. By contrast, streams do not provide a means to directly access or manipulate their elements, and are instead concerned with declaratively describing their source and the
computational operations which will be performed in aggregate on that source. However, if the provided stream operations do not offer the desired functionality, the #iterator() and #spliterator() operations can be used to perform a controlled traversal.

A stream pipeline, like the “widgets” example above, can be viewed as a query on the stream source. Unless the source was explicitly designed for concurrent modification (such as a ConcurrentHashMap), unpredictable or erroneous behavior may result from modifying the stream source while it is being queried.

Most stream operations accept parameters that describe user-specified behavior, such as the lambda expression w -> w.getWeight() passed to mapToInt in the example above. To preserve correct behavior, these behavioral parameters:

  • must be non-interfering (they do not modify the stream source); and
  • in most cases must be stateless (their result should not depend on any state that might change during execution
    of the stream pipeline).

Such parameters are always instances of a functional interface such as java.util.function.Function, and are often lambda expressions or method references. Unless otherwise specified these parameters must be non-null.

A stream should be operated on (invoking an intermediate or terminal stream operation) only once. This rules out, for example, “forked” streams, where the same source feeds two or more pipelines, or multiple traversals of the same stream. A stream implementation may throw IllegalStateException if it detects that the stream is being reused. However, since some stream operations may return their receiver rather than a new stream object, it may not be possible to detect reuse in all cases.

Streams have a #close() method and implement AutoCloseable. Operating on a stream after it has been closed will throw IllegalStateException. Most stream instances do not actually need to be closed after use, as they are backed by collections, arrays, or generating functions, which require no special resource management. Generally, only streams whose source is an IO channel, such as those returned by Files#lines(Path), will require closing. If a stream does require closing, it must be opened as a resource within a try-with-resources statement or similar control structure to ensure that it is closed promptly after its operations have completed.

Stream pipelines may execute either sequentially or in parallel. This execution mode is a property of the stream. Streams are created with an initial choice of sequential or parallel execution. (For example, Collection#stream() Collection.stream() creates a sequential stream, and Collection#parallelStream() Collection.parallelStream() creates a parallel one.) This choice of execution mode may be modified by the #sequential() or #parallel() methods, and may be queried with the #isParallel() method.

@param the type of the stream elements
@since 1.8
@see IntStream
@see LongStream
@see DoubleStream
@see java.util.stream>

 

The Stream Interface

/*
 * Copyright (c) 2012, 2016, Oracle and/or its affiliates. All rights reserved.
 * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
 *
 * This code is free software; you can redistribute it and/or modify it
 * under the terms of the GNU General Public License version 2 only, as
 * published by the Free Software Foundation.  Oracle designates this
 * particular file as subject to the "Classpath" exception as provided
 * by Oracle in the LICENSE file that accompanied this code.
 *
 * This code is distributed in the hope that it will be useful, but WITHOUT
 * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
 * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
 * version 2 for more details (a copy is included in the LICENSE file that
 * accompanied this code).
 *
 * You should have received a copy of the GNU General Public License version
 * 2 along with this work; if not, write to the Free Software Foundation,
 * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
 *
 * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
 * or visit www.oracle.com if you need additional information or have any
 * questions.
 */
package java.util.stream;

import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Collection;
import java.util.Comparator;
import java.util.Objects;
import java.util.Optional;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiConsumer;
import java.util.function.BiFunction;
import java.util.function.BinaryOperator;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.IntFunction;
import java.util.function.Predicate;
import java.util.function.Supplier;
import java.util.function.ToDoubleFunction;
import java.util.function.ToIntFunction;
import java.util.function.ToLongFunction;
import java.util.function.UnaryOperator;

public interface Stream extends BaseStream> {

    Stream filter(Predicate predicate);
    
     Stream map(Function mapper);
    
    IntStream mapToInt(ToIntFunction mapper);
    
    LongStream mapToLong(ToLongFunction mapper);
    
    DoubleStream mapToDouble(ToDoubleFunction mapper);
    
     Stream flatMap(Function> mapper);
    
    IntStream flatMapToInt(Function mapper);
    
    LongStream flatMapToLong(Function mapper);
    
    DoubleStream flatMapToDouble(Function mapper);
    
    Stream distinct();
    
    Stream sorted();
    
    Stream sorted(Comparator comparator);
    
    Stream peek(Consumer action);
    
    Stream limit(long maxSize);
    
    Stream skip(long n);
 
    default Stream takeWhile(Predicate predicate) {
        Objects.requireNonNull(predicate);
        // Reuses the unordered spliterator, which, when encounter is present,
        // is safe to use as long as it configured not to split
        return StreamSupport.stream(
                new WhileOps.UnorderedWhileSpliterator.OfRef.Taking<>(spliterator(), true, predicate),
                isParallel()).onClose(this::close);
    }
    
    default Stream dropWhile(Predicate predicate) {
        Objects.requireNonNull(predicate);
        // Reuses the unordered spliterator, which, when encounter is present,
        // is safe to use as long as it configured not to split
        return StreamSupport.stream(
                new WhileOps.UnorderedWhileSpliterator.OfRef.Dropping<>(spliterator(), true, predicate),
                isParallel()).onClose(this::close);
    }

    void forEach(Consumer action);
    
    void forEachOrdered(Consumer action);
    
    Object[] toArray();
    
     A[] toArray(IntFunction generator);
    
    T reduce(T identity, BinaryOperator accumulator);
    
    Optional reduce(BinaryOperator accumulator);
    
     U reduce(U identity,
                 BiFunction accumulator,
                 BinaryOperator combiner);

     R collect(Supplier supplier,
                  BiConsumer accumulator,
                  BiConsumer combiner);

     R collect(Collector collector);
    
    Optional min(Comparator comparator);
    
    Optional max(Comparator comparator);
    
    long count();
    
    boolean anyMatch(Predicate predicate);
    
    boolean allMatch(Predicate predicate);
    
    boolean noneMatch(Predicate predicate);
    
    Optional findFirst();
    
    Optional findAny();
    
    public static Builder builder() {
        return new Streams.StreamBuilderImpl<>();
    }

    public static Stream empty() {
        return StreamSupport.stream(Spliterators.emptySpliterator(), false);
    }
    
    public static Stream of(T t) {
        return StreamSupport.stream(new Streams.StreamBuilderImpl<>(t), false);
    }

    public static Stream ofNullable(T t) {
        return t == null ? Stream.empty()
                         : StreamSupport.stream(new Streams.StreamBuilderImpl<>(t), false);
    }

    @SafeVarargs
    @SuppressWarnings("varargs") // Creating a stream from an array is safe
    public static Stream of(T... values) {
        return Arrays.stream(values);
    }

    public static Stream iterate(final T seed, final UnaryOperator f) {
        Objects.requireNonNull(f);
        Spliterator spliterator = new Spliterators.AbstractSpliterator<>(Long.MAX_VALUE,
               Spliterator.ORDERED | Spliterator.IMMUTABLE) {
            T prev;
            boolean started;

            @Override
            public boolean tryAdvance(Consumer action) {
                Objects.requireNonNull(action);
                T t;
                if (started)
                    t = f.apply(prev);
                else {
                    t = seed;
                    started = true;
                }
                action.accept(prev = t);
                return true;
            }
        };
        return StreamSupport.stream(spliterator, false);
    }

    public static Stream iterate(T seed, Predicate hasNext, UnaryOperator next) {
        Objects.requireNonNull(next);
        Objects.requireNonNull(hasNext);
        Spliterator spliterator = new Spliterators.AbstractSpliterator<>(Long.MAX_VALUE,
               Spliterator.ORDERED | Spliterator.IMMUTABLE) {
            T prev;
            boolean started, finished;

            @Override
            public boolean tryAdvance(Consumer action) {
                Objects.requireNonNull(action);
                if (finished)
                    return false;
                T t;
                if (started)
                    t = next.apply(prev);
                else {
                    t = seed;
                    started = true;
                }
                if (!hasNext.test(t)) {
                    prev = null;
                    finished = true;
                    return false;
                }
                action.accept(prev = t);
                return true;
            }

            @Override
            public void forEachRemaining(Consumer action) {
                Objects.requireNonNull(action);
                if (finished)
                    return;
                finished = true;
                T t = started ? next.apply(prev) : seed;
                prev = null;
                while (hasNext.test(t)) {
                    action.accept(t);
                    t = next.apply(t);
                }
            }
        };
        return StreamSupport.stream(spliterator, false);
    }

    public static Stream generate(Supplier s) {
        Objects.requireNonNull(s);
        return StreamSupport.stream(
                new StreamSpliterators.InfiniteSupplyingSpliterator.OfRef<>(Long.MAX_VALUE, s), false);
    }

    public static  Stream concat(Stream a, Stream b) {
        Objects.requireNonNull(a);
        Objects.requireNonNull(b);

        @SuppressWarnings("unchecked")
        Spliterator split = new Streams.ConcatSpliterator.OfRef<>(
                (Spliterator) a.spliterator(), (Spliterator) b.spliterator());
        Stream stream = StreamSupport.stream(split, a.isParallel() || b.isParallel());
        return stream.onClose(Streams.composedClose(a, b));
    }

    public interface Builder extends Consumer {

        @Override
        void accept(T t);

        default Builder add(T t) {
            accept(t);
            return this;
        }

        Stream build();
    }
}

 

This entry was posted in functional interface, functional programming, java stream, lambda, lambda expression and tagged , , , , . Bookmark the permalink.

Leave a Reply