Java Streams

Posted by Wahab Ahmad on Friday, October 13, 2023

What is a Stream

Streams were introduced in Java 8 and live in the java.util.stream package of the standard library. They are used when we have a sequence of elements on which we want to perform aggregate operations. A stream pipeline consists of a source, zero or more intermediate operations, and a terminal operation. Streams are lazy: building the pipeline does not process any elements, and computation on the source only starts when the terminal operation is invoked. It is important to note that a stream does not store data, and in that sense it is not a data structure.
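To make the laziness concrete, here is a minimal sketch that records when the filter actually runs. The class name LazyStreamDemo and the log list are illustrative and not part of any library:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LazyStreamDemo {
    // Records the order of events so that the lazy evaluation becomes visible.
    static List<String> trace() {
        List<String> log = new ArrayList<>();
        Stream<Integer> pipeline = Arrays.asList(1, 2, 3).stream()
            .filter(x -> {
                log.add("filter " + x);
                return x % 2 == 1;
            });
        // At this point the pipeline is only described; no element was examined.
        log.add("pipeline built");
        // The terminal operation triggers the actual work.
        List<Integer> odds = pipeline.collect(Collectors.toList());
        log.add("result " + odds);
        return log;
    }

    public static void main(String[] args) {
        trace().forEach(System.out::println);
    }
}
```

The "pipeline built" entry is logged before any "filter" entry, which shows that the filter only runs once collect is called.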

Why Streams vs Collections

There are many similarities between how streams and collections are used. However, the main difference lies in the management of and access to elements. Collections provide direct access to their elements along with efficient ways to manage them. Streams provide no way to directly access or manipulate individual elements. Rather, the goal of a stream is to give you a way to declaratively describe the computation performed on the data from the source.

Example from the Java docs:

 int sum = widgets.stream()
      .filter(w -> w.getColor() == RED)
      .mapToInt(w -> w.getWeight())
      .sum();

How are streams used

As described before, there are two types of operations that can be performed on a stream. Namely:

  • Intermediate Operations
  • Terminal Operations

Intermediate operations transform one stream into another, so several of them can be chained one after the other. The terminal operation then converts the stream into the final result, such as a collection or a single value.
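As a rough illustration of how the two kinds of operations fit together, the sketch below chains three intermediate operations and ends with one terminal operation. The class and method names are made up for this example:

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

class PipelineDemo {
    static List<Integer> squaresOfEvensDescending(List<Integer> numbers) {
        return numbers.stream()                  // source
            .filter(x -> x % 2 == 0)             // intermediate: keep even numbers
            .map(x -> x * x)                     // intermediate: square each one
            .sorted(Comparator.reverseOrder())   // intermediate: largest first
            .collect(Collectors.toList());       // terminal: build the result list
    }

    public static void main(String[] args) {
        // prints [36, 16, 4]
        System.out.println(squaresOfEvensDescending(Arrays.asList(2, 3, 4, 5, 6)));
    }
}
```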

Intermediate Operations

Map

Map transforms each element of a stream into another value, producing a new stream:

List<Integer> number = Arrays.asList(2,3,4,5);
List<Integer> square = number.stream().map(x->x*x).collect(Collectors.toList());

There are also built-in variants that map elements to a primitive stream. For instance, mapToInt converts each element to an int and returns an IntStream.
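A short sketch of mapToInt, assuming a list of words as input. The resulting IntStream offers numeric operations such as sum and max that a plain Stream does not have. MapToIntDemo and its methods are illustrative names:

```java
import java.util.Arrays;
import java.util.List;

class MapToIntDemo {
    // Sum of the lengths of all words.
    static int totalLength(List<String> words) {
        return words.stream().mapToInt(String::length).sum();
    }

    // Length of the longest word, or 0 for an empty list.
    static int longest(List<String> words) {
        return words.stream().mapToInt(String::length).max().orElse(0);
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("stream", "map", "filter");
        System.out.println(totalLength(words)); // prints 15
        System.out.println(longest(words));     // prints 6
    }
}
```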

Filter

Filter keeps only the elements that satisfy a predicate:

List<Integer> number = Arrays.asList(2,3,4,5);
List<Integer> even = number.stream().filter(x->x%2==0).collect(Collectors.toList());

The distinct method is used in a similar way, except that it takes no predicate and simply removes duplicate elements.
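For completeness, a minimal sketch of distinct on a list with repeated values. The class name is illustrative:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class DistinctDemo {
    // Removes duplicates; for an ordered source the first occurrence is kept.
    static List<Integer> unique(List<Integer> numbers) {
        return numbers.stream().distinct().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // prints [2, 3, 5]
        System.out.println(unique(Arrays.asList(2, 3, 2, 5, 3)));
    }
}
```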

Sort

Sorted orders the elements of a stream, by default in their natural order:

List<Integer> number = Arrays.asList(2,3,4,5,1);
List<Integer> sorted = number.stream().sorted().collect(Collectors.toList());

Flat Map

Sometimes we are dealing with data structures that are nested (for example, List<List<String>>), but we would like to perform operations on the items inside each group. In this case, we flatten the nested elements into a single stream:

    List<List<String>> namesNested = Arrays.asList(
      Arrays.asList("Jeff", "Bezos"),
      Arrays.asList("Bill", "Gates"),
      Arrays.asList("Mark", "Zuckerberg"));

    List<String> namesFlatStream = namesNested.stream()
      .flatMap(Collection::stream)
      .collect(Collectors.toList());

Terminal Operations

Collect

Used to collect the results of a stream into a data structure. Several examples of this have been shown above.

We can also collect a stream of strings and join them as we deem fit:

String joinedName = people.stream()
  .map(Person::getName)
  .collect(Collectors.joining(", "));
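Collectors.joining also has an overload that takes a prefix and a suffix. The following self-contained sketch uses plain strings instead of the Person type; the class name and sample names are illustrative:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class JoiningDemo {
    // Joins the names with ", " and wraps the result in square brackets.
    static String bracketed(List<String> names) {
        return names.stream().collect(Collectors.joining(", ", "[", "]"));
    }

    public static void main(String[] args) {
        // prints [Ada, Grace, Linus]
        System.out.println(bracketed(Arrays.asList("Ada", "Grace", "Linus")));
    }
}
```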

For Each

Loop through and access each element of the stream result:

List<Integer> number = Arrays.asList(2,3,4,5);
number.stream().map(x->x*x).forEach(y->System.out.println(y));

Reduce

Reduce the elements of a stream to a single value. Takes a binary operator as a parameter. For example, here we sum all the numbers into res where the initial value of res is 0.

List<Integer> number = Arrays.asList(2,3,4,5);
Integer result = number.stream().reduce(0, (res, i) -> res+i);
// or, equivalently, with a method reference
Integer sameResult = number.stream().reduce(0, Integer::sum);
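reduce can also be called without an initial value. In that case it returns an Optional, because an empty stream has nothing to reduce. A minimal sketch, with illustrative names:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

class ReduceDemo {
    // No identity value is given, so the result is wrapped in an Optional.
    static Optional<Integer> max(List<Integer> numbers) {
        return numbers.stream().reduce(Integer::max);
    }

    public static void main(String[] args) {
        System.out.println(max(Arrays.asList(2, 3, 4, 5)));          // prints Optional[5]
        System.out.println(max(Collections.<Integer>emptyList()));   // prints Optional.empty
    }
}
```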

Find First

Returns the first element that makes it through the pipeline, wrapped in an Optional. The Optional is empty if no element remains.

Employee employee = Stream.of(empIds)
  .map(employeeRepository::findById)
  .filter(e -> e != null)
  .filter(e -> e.getSalary() > 100000)
  .findFirst()
  .orElse(null);
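The example above depends on an Employee type and a repository. A self-contained sketch of the same idea, using plain salary values and illustrative names, could look like this:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

class FindFirstDemo {
    // Returns the first salary above the threshold, if there is one.
    static Optional<Integer> firstOver(List<Integer> salaries, int threshold) {
        return salaries.stream()
            .filter(s -> s > threshold)
            .findFirst();
    }

    public static void main(String[] args) {
        List<Integer> salaries = Arrays.asList(90000, 120000, 150000);
        System.out.println(firstOver(salaries, 100000)); // prints Optional[120000]
        System.out.println(firstOver(salaries, 200000)); // prints Optional.empty
    }
}
```

Returning the Optional instead of calling orElse(null) leaves it to the caller to decide what should happen when nothing matches.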

All Match, Any Match and None Match

These operations take a predicate and return a boolean. They can be used to check whether a condition holds for any, all, or none of the elements in the stream.

List<Integer> intList = Arrays.asList(2, 4, 5, 6, 8);

boolean allEven = intList.stream().allMatch(i -> i % 2 == 0);
boolean oneEven = intList.stream().anyMatch(i -> i % 2 == 0);
boolean noneMultipleOfThree = intList.stream().noneMatch(i -> i % 3 == 0);
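One detail worth knowing is how these operations behave on an empty stream: allMatch and noneMatch return true, while anyMatch returns false. The sketch below checks this next to the list used above; the class and method names are illustrative:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class MatchDemo {
    static boolean allEven(List<Integer> values) {
        return values.stream().allMatch(i -> i % 2 == 0);
    }

    static boolean anyEven(List<Integer> values) {
        return values.stream().anyMatch(i -> i % 2 == 0);
    }

    static boolean noneEven(List<Integer> values) {
        return values.stream().noneMatch(i -> i % 2 == 0);
    }

    public static void main(String[] args) {
        List<Integer> intList = Arrays.asList(2, 4, 5, 6, 8);
        System.out.println(allEven(intList));  // prints false, because of 5
        System.out.println(anyEven(intList));  // prints true
        System.out.println(noneEven(intList)); // prints false

        List<Integer> empty = Collections.emptyList();
        System.out.println(allEven(empty));    // prints true
        System.out.println(anyEven(empty));    // prints false
        System.out.println(noneEven(empty));   // prints true
    }
}
```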

Partition By

Using a boolean predicate we can partition a stream into two groups:

    Map<Boolean, List<Integer>> isEven = intList.stream().collect(
      Collectors.partitioningBy(i -> i % 2 == 0));
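To show what the resulting map looks like, here is a small runnable sketch. The map always contains both the true and the false key, even when one of the groups is empty. PartitionDemo is an illustrative name:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class PartitionDemo {
    // Splits the values into even (key true) and odd (key false).
    static Map<Boolean, List<Integer>> byParity(List<Integer> values) {
        return values.stream().collect(
            Collectors.partitioningBy(i -> i % 2 == 0));
    }

    public static void main(String[] args) {
        Map<Boolean, List<Integer>> parts = byParity(Arrays.asList(2, 4, 5, 6, 8));
        System.out.println(parts.get(true));  // prints [2, 4, 6, 8]
        System.out.println(parts.get(false)); // prints [5]
    }
}
```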

Group By

If we want to partition the stream into more than two groups we can use groupingBy:

    // Group by first character of first name
    // (charAt(0) is autoboxed to Character; the Character constructor is deprecated since Java 9)
    Map<Character, List<Person>> groupByAlphabet = people.stream().collect(
      Collectors.groupingBy(e -> e.getName().charAt(0)));

Mapping

Grouping gives us a list of the original items in each group. However, sometimes we want each group to hold a different data type, such as a single field of each item. This can be achieved through mapping:

    Map<Character, List<Integer>> idGroupedByAlphabet = people.stream().collect(
      Collectors.groupingBy(p -> p.getName().charAt(0),
        Collectors.mapping(Person::getId, Collectors.toList())));
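The grouping snippets rely on a Person type that is not shown in this post. The sketch below fills that gap with a hypothetical Person class (only getId and getName are assumed) so that the grouping can be run end to end:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class GroupingDemo {
    // Hypothetical stand-in for the Person type used in the snippets above.
    static class Person {
        private final int id;
        private final String name;

        Person(int id, String name) {
            this.id = id;
            this.name = name;
        }

        int getId() { return id; }
        String getName() { return name; }
    }

    // Made-up sample data for the example.
    static List<Person> sample() {
        return Arrays.asList(
            new Person(1, "Alice"), new Person(2, "Bob"), new Person(3, "Anna"));
    }

    // Groups people by the first letter of their name and keeps only their ids.
    static Map<Character, List<Integer>> idsByInitial(List<Person> people) {
        return people.stream().collect(
            Collectors.groupingBy(p -> p.getName().charAt(0),
                Collectors.mapping(Person::getId, Collectors.toList())));
    }

    public static void main(String[] args) {
        Map<Character, List<Integer>> ids = idsByInitial(sample());
        System.out.println(ids.get('A')); // prints [1, 3]
        System.out.println(ids.get('B')); // prints [2]
    }
}
```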

Dealing with Nulls

To avoid writing the following code:

Stream<Integer> result = number != null
        ? Stream.of(number)
        : Stream.empty();

We can use the ofNullable method, added in Java 9:

Stream<Integer> result = Stream.ofNullable(number);
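One place where ofNullable is handy is inside flatMap, where it drops null elements from a stream. A minimal sketch, assuming Java 9 or later and using illustrative names:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class OfNullableDemo {
    // Each null becomes an empty stream, so it vanishes after flattening.
    static List<String> withoutNulls(List<String> values) {
        return values.stream()
            .flatMap(Stream::ofNullable)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // prints [a, b]
        System.out.println(withoutNulls(Arrays.asList("a", null, "b")));
    }
}
```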

Parallel Streams

Another advantage of streams is that they are easy to parallelize. All you have to do is call .parallel() on the stream of your choice. However, it is important to meet certain conditions:

  1. You have to ensure that the code is thread safe.
  2. You have to ensure that the order in which elements are processed does not impact the result. For example, findFirst does not work well with concurrent execution, because it has to respect the encounter order.

    empList.stream().parallel().forEach(e -> e.salaryIncrement(10.0));
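A simple way to satisfy both conditions is to avoid shared mutable state and let the stream combine the partial results itself. The following sketch, with illustrative names, uses sum and collect for that purpose:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class ParallelDemo {
    // sum() combines the partial sums of all threads; no shared variable is needed.
    static long sumOfSquares(int n) {
        return IntStream.rangeClosed(1, n)
            .parallel()
            .mapToLong(i -> (long) i * i)
            .sum();
    }

    // collect() merges per-thread results and keeps the encounter order of the list.
    static List<Integer> doubled(List<Integer> values) {
        return values.parallelStream()
            .map(x -> x * 2)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(1000));                 // prints 333833500
        System.out.println(doubled(Arrays.asList(1, 2, 3, 4))); // prints [2, 4, 6, 8]
    }
}
```

By contrast, adding elements to a shared ArrayList from inside a parallel forEach would not be thread safe.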

Benefits

  • Makes bulk operations convenient and fast
  • Lazy evaluation - reduces the amount of memory required and improves performance
  • Offers easy parallel processing
  • Improves readability and is concise

References: https://stackify.com/streams-guide-java-8/