Minborg

Minborg
Minborg

Friday, December 18, 2015

My Previous Post on Video

Escape Analysis Aftermath


My post on Java Escape Analysis has triggered a lot of interest. I made a follow up post on the effect of inlining on Escape Analysis.

A company that provides Java programming training filmed the post and had one person summarize the content. The company is called Webucator.

Have a look at the presentation here.



Keep on hackin'



Wednesday, December 16, 2015

Java 8: The JVM Can Re-capture Objects That Have Escaped

Background


In my previous post, I wrote about Escape Analysis and how the JVM can allocate non-escaping objects on the stack rather than on the heap. I immediately got a very interesting question from Caleb Cushing asking if Objects that actually can escape could be optimized anyhow, provided that that escaped object is reasonably contained by the caller.

Read this post and find out the answer!


A Simple Example

Let's assume that we have the following simple Person class:

public class Person {

    private final String firstName;
    private final String middleName;
    private final String lastName;

    public Person(String firstName, String middleName, String lastName) {
        this.firstName = requireNonNull(firstName);  // Cannot be null
        this.middleName = middleName;                // Can be null
        this.lastName = requireNonNull(lastName);    // Cannot be null
    }

    public String getFirstName() {
        return firstName;
    }

    public Optional<String> getMiddleName() {
        return Optional.ofNullable(middleName);
    }

    public String getLastName() {
        return lastName;
    }

}

Now, if we call the method Person::getMiddleName, it is obvious that the Optional object will escape the method because it is returned by the method and becomes visible to anyone calling the method. Thus, it will be classified as GlobalEscape and must be allocated on the heap. However, this is not necessarily the case. The JVM will sometimes be able to allocate it on the stack, despite the fact that it escapes the method. How is that possible?


What is Escape Analysis (EA)?

Before you read on, I encourage you to read my previous post because it will be more easy to understand what is going on. The post describes the fundamental aspects of EA.

How Can GlobalEscape Objects Still Live on the Stack?

It turns out that the C2 compiler is able to do EA not only over single methods, but over larger chunks of code that is inlined by the compiler. Inlining is an optimization scheme where the code is "flattened" to eliminate redundant calls. So, one or several layers of calls are flattened to a sequential list of instructions. The compiler then evaluates EA, not on the individual methods, but on the entire inlined code block. So, even though an object might escape a particular method, it might not be able to escape the larger inlined code block. 

A Demonstration of Inlined Escape Analysis

public class Main2 {

    public static void main(String[] args) throws IOException {

        Person p = new Person("Johan", "Sebastian", "Bach");

        count(p);
        System.gc();
        System.out.println("Press any key to continue");
        System.in.read();
        long sum = count(p);

        System.out.println(sum);
        System.out.println("Press any key to continue2");
        System.in.read();

        sum = count(p);

        System.out.println(sum);
        System.out.println("Press any key to exit");
        System.in.read();

    }

    private static long count(Person p) {
        long count = 0;
        for (int i = 0; i < 1_000_000; i++) {
            if (p.getMiddleName().isPresent()) {
                count++;
            }
        }
        return count;

    }

}

The code above will create a single instance of a Person and then it will call that Person's getMiddleName() method a large number of times. We will do it in three steps where the first step is just for warming up and then GC away all the objects that were created. The two following steps will not remove anything from the heap and we will be able to examine the heap between each step.We can use the following JVM parameters when we run the code:

-server
-XX:BCEATraceLevel=3
-XX:+PrintCompilation
-XX:+UnlockDiagnosticVMOptions
-XX:+PrintInlining
-verbose:gc
-XX:MaxInlineSize=256
-XX:FreqInlineSize=1024
-XX:MaxBCEAEstimateSize=1024
-XX:MaxInlineLevel=22
-XX:CompileThreshold=10
-Xmx4g
-Xms4g


After the first run, we get the following heap usage (after the System.gc() call cleaned up all our Optionals)

pemi$ jps | grep Main2
74886 Main2
 num     #instances         #bytes  class name
----------------------------------------------
   1:            95       42952184  [I
   2:          1062         101408  [C
   3:           486          55384  java.lang.Class
   4:           526          25944  [Ljava.lang.Object;
   5:            13          25664  [B
   6:          1040          24960  java.lang.String
   7:            74           5328  java.lang.reflect.Field

The two following steps gave:

pemi$ jmap -histo 74886 | head

 num     #instances         #bytes  class name
----------------------------------------------
   1:            95       39019792  [I
   2:        245760        3932160  java.util.Optional
   3:          1063         101440  [C
   4:           486          55384  java.lang.Class
   5:           526          25944  [Ljava.lang.Object;
   6:            13          25664  [B
   7:          1041          24984  java.lang.String
pemi$ jmap -histo 74886 | head

 num     #instances         #bytes  class name
----------------------------------------------
   1:            95       39019544  [I
   2:        245760        3932160  java.util.Optional
   3:          1064         101472  [C
   4:           486          55384  java.lang.Class
   5:           526          25944  [Ljava.lang.Object;
   6:            13          25664  [B
   7:          1042          25008  java.lang.String

No new Optionals were created between step two and step three and thus, EA was eventually able to eliminate the creation of the Optional instances on the heap even though they escaped the initial method where they were created and returned. This means that we can use an appropriate level of abstraction and still retain performant code.

Conclusions

Escape Analysis can work on several layers in our code. EA can optimize away heap allocation even though objects escapes one or several methods.  As with EA in general, we do not get a guarantee that we will get the optimizations we are expecting in all cases.

The open-source project Speedment that I am contributing to, often returns Streams containing entities or Optionals. The fact that EA works on several layers makes the application code run faster. The JVM is able to inline code from the Speedment library into the application code itself and then, using EA, temporary return objects are never allocated on the heap. So, Speedment developers can enjoy a nice API while still retaining high performance and low latency
.



Wednesday, December 2, 2015

Do Not Let Your Java Objects Escape

Background

I am working on the Open Source project Speedment and for us contributors, it is important to use code that people can understand and improve. It is also important that performance is good, otherwise people are likely to use some other solution. 

Escape Analysis allows us to write performant code at the same time as we can use good code style with appropriate abstractions.


This is Escape Analysis

Escape Analysis (also abbreviated as "EA") allows the Java compiler to optimize our code in many ways.  Please consider the following simple Point class:

public class Point {

    private final int x, y;

    public Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public String toString() {
        final StringBuilder sb = new StringBuilder()
                .append("(")
                .append(x)
                .append(", ")
                .append(y)
                .append(")");
        return sb.toString();
    }

}

Each time we call the Point::toString method, it looks like a new StringBuilder object is created. However, as we can see, the StringBuilder object is not visible from outside the method. It cannot be observed neither from outside the method nor by another thread running the same piece of code (because that other thread would se its own version of the StringBuilder).

So, after calling the method some million times, there might be millions of StringBuilder objects lying around? Not so! By employing EA, the compiler can allocate the StringBuilder on the stack instead. So when our method returns, the object is automatically deleted upon return, as the stack pointer is restored to the previous value it had before the method was called.

Escape analysis has been available for a relatively long time in Java. In the beginning we had to enable it using command line options, but nowadays it is used by default. Java 8 has an improved Escape Analysis compared to previous Java versions.


How It Works

Based on EA, an object's escape state will take on one of three distinct values:
  • GlobalEscape: An object may escape the method and/or the thread. Clearly, if an object is returned as the result of a method, its state is GlobalEscape. The same is true for objects that are stored in static fields or in fields of an object that itself is of state GlobalEscape. Also, if we override the finalize() method, the object will always be classified as GlobalEscape and thus, it will be allocated on the heap. This is logical, because eventually the object will be visible to the JVM's finalizer. There are also some other conditions that will render our object's status GlobalEscape.
  • ArgEscape: An object that is passed as an argument to a method but cannot otherwise be observed outside the method or by other threads.
  • NoEscape: An object that cannot escape the method or thread at all.


GlobalEscape and ArgEscape objects must be allocated on the heap, but for ArgEscape objects it is possible to remove some locking and memory synchronization overhead because these objects are only visible from the calling thread.

The NoEscape objects may be allocated freely, for example on the stack instead of on the heap. In fact, under some circumstances, it is not even necessary to construct an object at all, but instead only the object's scalar values, such as an int for the object Integer. Synchronization may be removed too, because we know that only this thread will use the objects. For example, if we were to use the somewhat ancient StringBuffer (which as opposed to StringBuilder has synchronized methods), then these synchronizations could safely be removed.

EA is currently only available under the C2 HotSpot Compiler so we have to make sure that we run in -server mode.


Why It Matters

In theory, NoEscape objects objects can be allocated on the stack or even in CPU registers using EA,  giving very fast execution.

When we allocate objects on the heap, we start to drain our CPU caches because objects are placed on different addresses on the heap possibly far away from each other. This way we will quickly deplete our L1 CPU cache and performance will decrease. With EA and stack allocation on the other hand, we are using memory that (most likely) is already in the L1 cache anyhow.  So, EA and stack allocation will improve our localization of data. This is good from a performance standpoint.

Obviously, the garbage collects needs to run much less frequently when we are using EA with stack allocation. This is perhaps the biggest performance advantage. Recall that each time the JVM runs a complete heap scan, we take performance out of our CPUs and the CPU caches will quickly deplete. Not to mention if we have virtual memory paged out on our server, whereby GC is devastating for performance.

The most important advantage of EA is not performance though. EA allows us to use local abstractions like Lambdas, Functions, Streams, Iterators etc. without any significant performance penalty so that we can write better and more readable code. Code that describes what we are doing rather than how it is done.

A Small Example

public class Main {

    public static void main(String[] args) throws IOException {
        Point p = new Point(100, 200);

        sum(p);
        System.gc();
        System.out.println("Press any key to continue");
        System.in.read();
        long sum = sum(p);

        System.out.println(sum);
        System.out.println("Press any key to continue2");
        System.in.read();
        
        sum = sum(p);

        System.out.println(sum);
        System.out.println("Press any key to exit");
        System.in.read();

    }

    private static long sum(Point p) {
        long sumLen = 0;
        for (int i = 0; i < 1_000_000; i++) {
            sumLen += p.toString().length();
        }
        return sumLen;

    }

}


The code above will create a single instance of a Point and then it will call that Point's toString() method a large number of times. We will do it in three steps where the first step is just for warming up and then GC away all the objects that were created. The two following steps will not remove anything from the heap and we will be able to examine the heap between each step.

If we run the program with the following parameters, we will be able to see what is going on within the JVM:

-server
-XX:BCEATraceLevel=3
-XX:+PrintCompilation
-XX:+UnlockDiagnosticVMOptions
-XX:+PrintInlining
-verbose:gc
-XX:MaxInlineSize=256
-XX:FreqInlineSize=1024
-XX:MaxBCEAEstimateSize=1024
-XX:MaxInlineLevel=22
-XX:CompileThreshold=10
-Xmx4g
-Xms4g

And yes, that is a huge pile of parameters but we really want to be able to see what is going on.

After the first run, we get the following heap usage (after the System.gc() call cleaned up all our StringBuilders)

pemi$ jps | grep Main
50903 Main
pemi$ jmap -histo 50903 | head
 num     #instances         #bytes  class name

----------------------------------------------
   1:            95       42952184  [I
   2:          1079         101120  [C
   3:           485          55272  java.lang.Class
   4:           526          25936  [Ljava.lang.Object;
   5:            13          25664  [B
   6:          1057          25368  java.lang.String
   7:            74           5328  java.lang.reflect.Field

The two following steps gave:

pemi$ jmap -histo 50903 | head
 num     #instances         #bytes  class name
----------------------------------------------
   1:       2001080       88101152  [C
   2:           100       36777992  [I
   3:       1001058       24025392  java.lang.String
   4:         64513        1548312  java.lang.StringBuilder
   5:           485          55272  java.lang.Class
   6:           526          25936  [Ljava.lang.Object;
   7:            13          25664  [B


pemi$ jmap -histo 50903 | head
 num     #instances         #bytes  class name
----------------------------------------------
   1:       4001081      176101184  [C
   2:       2001059       48025416  java.lang.String
   3:           105       32152064  [I
   4:         64513        1548312  java.lang.StringBuilder
   5:           485          55272  java.lang.Class
   6:           526          25936  [Ljava.lang.Object;
   7:            13          25664  [B

As can be seen, EA was eventually able to eliminate the creation of the StringBuilder instances on the heap. There were only 64K created compared to the 2M Stings. A big improvement!

Conclusions

The advantages of Escape Analysis are nice in theory but they are somewhat difficult to understand and predict. We do not get a guarantee that we will get the optimizations we are expecting in all cases but it seems to work reasonably well under common conditions.

Check out open-source Speedment and see if you can spot the places where we rely on Escape Analysis.

Hopefully, this post contributed to shed some light on EA so that you opt to write good code over "performant" code.

I would like to thank Peter Lawery for the tips and suggestions I got from him in connection with writing this post.

Read more on Objects in general here


Thursday, November 12, 2015

Easily Create Database Content with Java 8

Database Connectivity Now and Then

Spire and Duke adding stuff
to a database.
I remember back in the old (Java) days, when we were sitting up late nights and experimented a lot with Java and databases. In the beginning of the Java era, there was not much support for database connectivity and so we had to basically write your own database classes and handle ResultSets, Connections and SQLExceptions by ourself.

Nowadays, we expect the simple things just to happen! Suppose that we have an existing database and we want to add or update information in it using a Java application. How can we do that in a simple way without having to write a lot of boilerplate code?

I have contributed a lot to the Java 8 Open Source project Speedment, that can be used to very easily extract Java code from existing database schemas and start coding applications directly.

Let's take it for a spin.

Example

Let's say we have a MySQL database table that is supposed to contain data on various countries. The table could look something like this:

mysql> explain country +------------+-------------+------+-----+---------+----------------+ | Field      | Type        | Null | Key | Default | Extra          | +------------+-------------+------+-----+---------+----------------+ | id         | int(11)     | NO   | PRI | NULL    | auto_increment | | name       | varchar(45) | YES  | UNI | NULL    |                | | local_name | varchar(45) | YES  |     | NULL    |                | | code       | int(11)     | YES  |     | NULL    |                | | domain     | varchar(10) | YES  |     | NULL    |                +------------+-------------+------+-----+---------+----------------+ 5 rows in set (0.00 sec)

Let us further pretend that we have the task of populating the table with a few countries and see how that can be solved.

Setting up the Speedment Project

In order to set up a project with Speedment, we need to include a few lines in our POM.xml file, connect to the database and generate code. Read more on how to do this here!.

Also, check out this film how easy it is:

Initializing the Database Connection

Now that we have our database domain model generated automatically, we can start with the actual coding for inserting data into the database. First, we need to setup our Java 8 database project like this:

// Setup Speedment speedment = new JavapotApplication().withPassword("javapot").build(); Manager<Country> countries = speedment.managerOf(Country.class);

The JavapotApplication class was generated automatically from the database schema and contains all meta data (like columns and tables) of the database. Note that we manually need to provide the password since this is not stored in the meta data model (for security reasons). The countries variable is a "handle" for the table we are about to work with.

There is really no "magic" going on with the generation. We can see all the generated Java files in clear text and we can change them or introduce our own versions if we want.

Inserting Data in the Database

Once the setup is made, is is very easy to insert data rows in the database like this:

try { countries.newInstance() .setName("United States") .setLocalName("United States") .setCode(1) .setDomain(".us") .persist(); countries.newInstance() .setName("Germany") .setLocalName("Deutschland") .setCode(49) .setDomain(".de") .persist(); // Needless to say, you can call the setters in any order. countries.newInstance() .setDomain(".uk") .setCode(44) .setName("United Kingdom") .setLocalName("United Kingdom") .setDomain(".uk") .persist(); countries.newInstance() .setName("Sweden") .setLocalName("Sverige") .setCode(40) // Intentionally wrong, should be 46!! .setDomain(".se") .persist(); } catch (SpeedmentException se) { // Handle the exception here }

The newInstance() method creates a new empty Country object and then we just use the setters to initialize the country. After all parameters are set, we call the persist() method to store the object in the database. If there is an error during the database insert, a SpeedmentException will be thrown, allowing you to examine why (for example if you are trying to insert two countries with the same name).  I intentionally picked the wrong country call code for Sweden (förlÃ¥t Sverige) so that we can learn how to update data in our database too.

Updating Data in the Database

If you want to update an existing row in the database you can do it like this:

countries.stream() .filter(Country.NAME.equal("Sweden")) .findAny() .ifPresent(c -> c.setCode(46).update());

This will create a Stream with all the counties that have the name "Sweden" (which, evidently, is only one country) and then it will try to find that country and if it is present, it will take that country and set the code to 46 (which is the correct calling code for Sweden) and then it will update the selected country in the database. It is important to understand that even though our country table might contain a large number of countries, it will only include those countries in the stream that are satisfying an equivalent query of "select * from country where name='Sweden' " in this case.

Now our database looks like this:
mysql> select * from country; +----+----------------+----------------+------+--------+ | id | name           | local_name     | code | domain | +----+----------------+----------------+------+--------+ |  1 | United States  | United States  |    1 | .us    | |  2 | Germany        | Deutschland    |   49 | .de    | |  3 | United Kingdom | United Kingdom |   44 | .uk    | |  4 | Sweden         | Sverige       |   46 | .se    | +----+----------------+----------------+------+--------+ 4 rows in set (0.00 sec)

Success! Mission accomplished!

Contributing 

Read more about Speedment Open Source on www.speedment.org and that's the place to be if you want to learn more things like how the API looks like and how you use Speedment in your projects. Speedment is here on GitHub. You can contribute by submitting comments on gitter or download the source code and create pull requests with your own code contributions.

Conclusions

With Java 8, you can easily write database applications with almost no extra manual code work. There are tools that automatically can extract your domain model from a database schema.

These days, we Java programmers can can put more time on the actual problem (and perhaps get some more well deserved sleep) instead of fiddling around with basic database functionality.



Thursday, October 15, 2015

Java 8, Query Databases Using Streams

Mapping Streams
When I wrote my first Java database application back in the late 90's, I had to do everything myself. There was a lot of code, error capturing and object conversions going on and the code rapidly became very difficult to maintain and, I have to admit, error prone.

Even today, when you work with databases, it is often cumbersome to bridge between the two fundamentally different worlds of an object oriented language like Java and a relational database. Often, you need write your own mapping layer or you can use an Object Relational Mapper (ORM) like Hibernate. It is often convenient to use an ORM, but sometimes it is not so easy to configure it in the right way. More often than not, the ORM also slows down your application.

Recently, I have been contributing a lot to a new Open Source project named "Speedment" that we contributors are hoping will make life easier for us Java 8 database application developers.

What is Speedment Open Source?

Speedment Open Source is a new library that provides many interesting Java 8 features. It is written entirely in Java 8 from start. Speedment uses standard Streams for querying the database and thanks to that, you do not have to learn any new query API. You do not have to think at all about JDBC, ResultSet and other database specific things either.

Speedment analyses an existing database and generates code automatically. That way, you do not have to write a single line of code for database entities. The generated code even contains auto-generated JavaDocs. This means that you do not have to write entity classes like "User" or "Book" in your Java code, instead you just create (or use an existing) the database, connect Speedment to it and Speedment will analyze the database structure and generate the entity classes by analyzing the database structure.

Spire

Because Speedment's Open Source mascot is a hare, I use hares in many of my examples. Let us suppose that we have an existing (MySQL) table called "hare" that looks like this:






mysql> explain hare;
+-------+-------------+------+-----+---------+----------------+
| Field | Type        | Null | Key | Default | Extra          |
+-------+-------------+------+-----+---------+----------------+
| id    | int(11)     | NO   | PRI | NULL    | auto_increment |
| name  | varchar(45) | NO   |     | NULL    |                |
| color | varchar(45) | NO   |     | NULL    |                |
| age   | int(11)     | NO   |     | NULL    |                |
+-------+-------------+------+-----+---------+----------------+
4 rows in set (0.01 sec)

Then Speedment will generate a corresponding entity (JavaDocs removed for brevity) :

public interface Hare extends Entity<Hare> {
   
    public final static ReferenceComparableField<Hare, Integer> ID = new ReferenceComparableFieldImpl<>("id", Hare::getId, Hare::setId);
    public final static ReferenceComparableStringField<Hare> NAME = new ReferenceComparableStringFieldImpl<>("name", Hare::getName, Hare::setName);
    public final static ReferenceComparableStringField<Hare> COLOR = new ReferenceComparableStringFieldImpl<>("color", Hare::getColor, Hare::setColor);
    public final static ReferenceComparableField<Hare, Integer> AGE = new ReferenceComparableFieldImpl<>("age", Hare::getAge, Hare::setAge);
    
    Integer getId();
    String getName();
    String getColor();
    Integer getAge();

    Hare setId(Integer id);
    Hare setName(String name);
    Hare setColor(String color);
    Hare setAge(Integer age);
    
    /** Graph-like traversal methods eliminating JOINs */
    Stream<Carrot> findCarrotsByOwner();
    Stream<Carrot> findCarrotsByRival();
    Stream<Carrot> findCarrots();
}

I will explain the find*() methods in a separate post. They can be used instead of SQL joins.

Queries

Here is an example how it could look like when you are querying the Hare table :

List<Hare> oldHares = hares.stream()
    .filter(AGE.greaterThan(8))
    .collect(toList());

Smart Streams

Now, it looks like the code above will stream over all rows in the "hare" database table, but it is actually not. The Stream is "smart" and will, upon reaching its Terminal Operation collect(), analyze the filter Predicate and conclude that it is actually the "hare.age" column that is compared to 8 and thus, it will be able to reduce the stream of hares to something corresponding to "select * from hare where age > 8". If you are using several filters, they are of course further combined to reduce the stream even more.

Just to show the principle, here is another stream with more operations:

long noOldHares = hares.stream()
    .filter(AGE.greaterThan(8))
    .mapToInt(Hare::getAge)
    .sorted()
    .count();

When this Stream reaches its Terminal Operation count(), it will inspect its own pipeline. It will then conclude that it can reduce the AGE Predicate just like in the previous example. It will furthermore conclude that the operations mapToInt() and sorted() do not change the outcome of count() and thus, these operations can be eliminated all together. So the statement is reduced to "select count(*) from hare where age > 8".

This means that you can use Java 8 streams while you do not have to care so much about how the streams are translated to SQL.

How to Download and Contribute

Read more about Speedment Open Source on www.speedment.org and that's the place to be if you want to learn more things like how the API looks like and how you use Speedment in your projects. Speedment is here on GitHub. You can contribute by submitting comments on gitter or download the source code and create pull requests with your own code contributions.

Conclusions

Looking back on some of my old projects at the dawn of the Java era, one of my database classes (with over 100 lines) could now be reduced to a single line of Java 8 code. That is Moores law, but inverted! In 14 years (=7 Moore cycles) the number of rows halved about 7 times. That is progress!

Speedment and Java 8 is the new super-awesome alternative to traditional ORMs.

Saturday, July 4, 2015

An O(1) n-factorial Support Class for Java 8

Background


In some of my posts, I have talked about calculating n-factorial (i.e. n!) and I have received some comments about performance. In my post Compute factorials using Java 8 streams, I presented a number of ways to implement an n-factorial method and in a more recent post, Java 8, Master Permutations, I used one of those methods in the main solution for generating permutations.

In this post I present a very simple (or even trivial), yet high performance, n-factorial support class for Java 8.

Implementation


If a factorial method is to return a long, there are only 21 valid input values that can be used (read more about why in this post) namely 0, 1, ..., 20.  This fact allows us to pre-calculate all results and just use a lookup array like this:

public class Factorials {

    private Factorials() {
    }

    private static final long[] FACTORIALS = {
        1L,
        1L,
        2L,
        6L,
        24L,
        120L,
        720L,
        5040L,
        40320L,
        362880L,
        3628800L,
        39916800L,
        479001600L,
        6227020800L,
        87178291200L,
        1307674368000L,
        20922789888000L,
        355687428096000L,
        6402373705728000L,
        121645100408832000L,
        2432902008176640000L
    };

    public static long factorial(int n) {
        return FACTORIALS[n];
    }
    
    public static LongStream stream() {
        return LongStream.of(FACTORIALS);
    }

}
As can be seen, the factorial method will complete in O(1) time (i.e. in constant time regardless of the input parameter). In addition to the factorial() method, I have also added a stream() function that allows us to conveniently obtain a stream of all n-factorials.

If we use an argument outside the definition set, an ArrayIndexOutOfBoundsException will be thrown. You might want to clean up this behavior and throw a more relevant exception like this:
public static long factorial(int n) {
    if (n > 20 || n < 0) throw new IllegalArgumentException(n + " is out of range");
    return FACTORIALS[n];
}

Conclusion


When the definition set for a method is limited, it may sometimes be a good idea to eagerly pre-calculate all the values.






Friday, July 3, 2015

Java 8, Master Permutations

Background

Every morning when you wake up, you need to perform a set of tasks before you go to work or do something else you'd like. Have you ever thought of each and every possible order you can start your day in?

Using Permutations, you can try all combinations of an input set. As you will learn in this post, there is an astonishing number of combinations even for a moderately numbered set. For example, if you have just ten items, you can combine them in over three million ways.

In this post, we will build a small Java 8 support class named Permutations that can be used in unit testing, games strategy evaluation, problem solving and in many other powerful applications.

The Mathematics

You do not have to read this chapter in order to understand this post. However, it helps. So, try to stay on as long as you can before skipping to the next chapter.

The term permutation relates to the process of arranging all the members of a set in an order or, if the set is already ordered, rearranging (or mathematically speaking permutating) the order of the set. For example, if we have a set {1, 2, 3} we can arrange that set in six different ways; {1, 2, 3}, {1, 3, 2}, {2, 1, 3}, {2, 3, 1}, {3, 1, 2}, {3, 2, 1}.

The number of permutations for n distinct objects in a set is said to be n-factorial, mathematically written as n!. The value of n! can be calculated as the factor of one multiplied by all positive integers less than or equal to n. So, if we take our previous set {1, 2, 3}, we will see that it has three distinct members. Thus, there will be 3! = 1*1*2*3 = 6 unique combinations. Encouragingly enough, this is consistent with the example above, where we listed all the six combinations. The observant reader now concludes that one may ignore the serial factor of one since one is the multiplicative identity number, i.e. for any number a, it is true that 1*a = a. In pure English, this means that you can skip multiplying with 1 since you will get back the same result anyhow. So, n! is really 1 multiplied with 2*3*....*n.

Now, if we produce a list of the number of distinct members in a set and calculate the corresponding n-factorial, we will arrive at the following table:

0!   1 (by definition, 1 without multiplying with anything)
1!   1 (obviously you can only arrange a single item in one way)
2!   2
3!   6 (as in the example above)
4!   24
5!   120 (the largest n-factorial value that can fit in a byte (max 2^7 = 127))
6!   720
7!   5040 (the largest n-factorial value that can fit in a short (max 2^15 = 32768))
8!   40320 
9!   362880
10!   3628800
11!   39916800
12! 479001600 (the largest n-factorial value that can fit in an int (max 2^31  = 2147483647))
13! 6227020800
14! 87178291200
15! 1307674368000
16! 20922789888000
17! 355687428096000
18! 6402373705728000
19! 121645100408832000
20! 2432902008176640000 (the largest n-factorial value that can fit in a long (max 2^63))
21! 51090942171709440000
...
42! 1405006117752879898543142606244511569936384000000000 (~number of atoms on earth)

As  can be seen, you have to take care when applying permutations in your code, because you might experience extremely long execution times even for a small number of items. If you, for example, can evaluate any given combination in 1 ms and you have 16 items, the wall clock time for a single thread will be 16!/1000 s > 600 years!

The Permutations class

There are many ways to implement a permutator in a programming language. In this post, my objective was to present the shortest possible code for a permutator in Java 8, so if you can come up with a shorter one, please leave a comment below. Thus, I have opted for simplicity rather than performance.

Secondly, I wanted to work with Java 8 Streams, allowing integration with the powerful Stream functions that are available.

Implementation

The first method we need is the n-factorial function that calculates the number of unique combination of a distinct set. I have written a post on this before and you can read the details here. This is how it looks like:

public long factorial(int n) {
    if (n > 20 || n < 0) throw new IllegalArgumentException(n + " is out of range");
    return LongStream.rangeClosed(2, n).reduce(1, (a, b) -> a * b);
}
Note that the function will only handle an input range of  0  <= n <= 20 because 0 is the lowest defined input value for n-factorial and 20 corresponds to the maximum return value a long can hold.

Next thing we need is a method that returns a numbered permutation of any given input List. So if we, for example, have a List with two items "A" and "B", the first permutation (0) is {A, B} and the second permutation (1) is {B, A}. If we have a List with tree items {A, B, C}, then we have the following permutations:

no   Permutation
--   ---------------
0 {A, B, C}
1 {A, C, B}
2 {B, A, C}
3 {B, C, A}
4 {C, A, B}
5 {C, B, A}

We need to make yet another table of a List before the pattern that I am going to exploit becomes obvious. The complete permutations for the List {A, B, C, D} looks like this:

no   Permutation
--   ---------------
0 {A, B, C, D}
1 {A, B, D, C}
2 {A, C, B, D}
3 {A, C, D, B}
4 {A, D, B, C}
5 {A, D, C, B}
6 {B, A, C, D}
7 {B, A, D, C}
8 {B, C, A, D}
9 {B, C, D, A}
10 {B, D, A, C}
11 {B, D, C, A}
12 {C, A, B, D}
13 {C, A, D, B}
14 {C, B, A, D}
15 {C, B, D, A} (Used in an example below)
16 {C, D, A, B}
17 {C, D, B, A}
18 {D, A, B, C}
19 {D, A, C, B}
20 {D, B, A, C}
21 {D, B, C, A}
22 {D, C, A, B}
23 {D, C, B, A}

Take a look at the first item in each row and you will see that it is changing in sequence of the input list. Let n be the number of items in the input List (n = 4) and subFactorial be (n-1)! (subFactorial = (4-1)! = 3!  = 6) then the first item in the list for permutation no is at index no/subFactorial. In the example above, the first item for permutation 15 is at index no/subFactorial = 15/6 = 2, which corresponds to "C" (remember that "A" is at index 0, "B" is at index 1 and so forth).

This means that we, for any input List, can reduce the problem by picking the first item according to the algorithm above and then recursively invoke the same method again, but now with a reduced number of items in the input list. A classical divide-and-conquer approach. This is how it looks like:

private static <T> List<T> permutationHelper(long no, LinkedList<T> in, List<T> out) {
    if (in.isEmpty()) return out;
    long subFactorial = factorial(in.size() - 1);
    out.add(in.remove((int) (no / subFactorial)));
    return permutationHelper(no % subFactorial, in, out);
}
This helper method takes the permutation number no, an input List in with the initial ordered items and then an out List used to build up the result as the recursion progresses.

As common recursion practice dictates, we start with the base case where we handle the trivial case where recursion shall stop. If the in List is empty, then we are ready and just return the now completed out list.

If the in List was not empty, then we first calculate the previously mentioned subFactorial which is (in.size()-1)! as explained above. This value is then used later on in the method.

The next line is really all logic in the method. We first calculate the index of the item we want to put first in the out List. The index of the item is (int)(no/subFactorial) and we remove this from the in List. As the remove method conveniently returns the item we just removed, we can use remove's return value directly and just add it to the end of the out List in one step.

Now that we have reduced the problem from n to n-1 we need to recurse in a proper way. We call our permutationHelper() method with the same in and out lists we got as input parameters but the permutaion number no is now the remainder of the previous division. This is logical because we handled only the corresponding whole part of the division before. So, we invoke with no%subFactorial which is the remainder that remains to handle.

This is really all there is to do. Those five innocent lines can execute in under 1 ms or perhaps in over 2000 years depending on your input.

The helper method is just there to help us write the real permutator that we are exposing in the class. Here is the exposed wrapper:

public static <T> List<T> permutation(long no, List<T> items) {
    return permutationHelper(
        no,
        new LinkedList<>(Objects.requireNonNull(items)),
        new ArrayList<>()
    );
}
We provide the permutation number no and the List we want to use as input for our method and we just create the in and out List and call the helper method. Simple.

So now we can compute any permutation of a given List. But the Stream support remains, the one  that can give us streams of permutated lists. Providing such a method is really trivial now that we got our permutation() method. This is how it can be done:

public static <T> Stream<Stream<T>> of(T... items) {
    List<T> itemList = Arrays.asList(items);
    return LongStream.range(0, factorial(items.length))
            .mapToObj(no -> permutation(no, itemList).stream());
}

All the hard work is done by the LongStream which supports laziness, parallelism, etc for free and we only map the long Stream to what we would like to have. The long stream provides the input longs 0, 1, 2, ..., (items.lenght)! and then we map those longs to the corresponding permutation. It is really simple.

Note that I have opted to return a Stream of Stream rather than a Stream of List or something similar. If you want a Stream of List, you just change the return type and the mapping.

Now that we have completed everything we need, I will provide the entire Permutations class before we start putting it to use with some examples:

public class Permutations {
    
    private Permutations() {
    }

    public static long factorial(int n) {
        if (n > 20 || n < 0) throw new IllegalArgumentException(n + " is out of range");
        return LongStream.rangeClosed(2, n).reduce(1, (a, b) -> a * b);
    }

    public static <T> List<T> permutation(long no, List<T> items) {
        return permutationHelper(no,
              new LinkedList<>(Objects.requireNonNull(items)),
              new ArrayList<>());
    }

    private static <T> List<T> permutationHelper(long no, LinkedList<T> in, List<T> out) {
        if (in.isEmpty()) return out;
        long subFactorial = factorial(in.size() - 1);
        out.add(in.remove((int) (no / subFactorial)));
        return permutationHelper((int) (no % subFactorial), in, out);
    }

    @SafeVarargs
    @SuppressWarnings("varargs") // Creating a List from an array is safe
    public static <T> Stream<Stream<T>> of(T... items) {
        List<T> itemList = Arrays.asList(items);
        return LongStream.range(0, factorial(items.length))
                .mapToObj(no -> permutation(no, itemList).stream());
    }

}

Examples


We start with testing our factorial() method a bit with the following code snippet:
List<String> items = Arrays.asList("A", "B", "C");
long permutations = Permutations.factorial(items.size());
System.out.println(items + " can be combined in " + permutations + " different ways:");
which will print out:
[A, B, C] can be combined in 6 different ways:
This looks encouraging. We know from the previous chapter that one can combine three distinct objects in six ways. So now we want to see facts on the table: What are those combinations? One way of seeing them all is to use our permutation() method. If we add the following code to our existing snippet:
LongStream.range(0, permutations).forEachOrdered(i -> {
        System.out.println(i + ": " + Permutations.permutation(i, items));
    });
We will get the following output:
[A, B, C] can be combined in 6 different ways:
0: [A, B, C]
1: [A, C, B]
2: [B, A, C]
3: [B, C, A]
4: [C, A, B]
5: [C, B, A]
Wow! It works. It looks exactly the same as in the table in the previous chapter. Now let's take the Permutaions.of() method for a spin like this:
Permutations.of("A", "B", "C")
        .map(s -> s.collect(toList()))
        .forEachOrdered(System.out::println);
The Permutations.of() method will provide all permutations of {A, B, C} and then we will collect those permutations into lists and subsequently print the lists like this:
[A, B, C]
[A, C, B]
[B, A, C]
[B, C, A]
[C, A, B]
[C, B, A]
Sometimes we are interested in just generating a single sequence in all possible orders and this can also be done easily in a similar way like this:
Permutations.of("A", "B", "C")
        .flatMap(Function.identity())
        .forEachOrdered(System.out::print);
Which will produce:
ABCACBBACBCACABCBA

Parallelism

As I mentioned, the Stream we get from Permutations.of() supports parallelism which is nice if you have a large number of permutaions and you want to put all your CPU:s to use. To be able to examine which threads are executing what, we will use a small support method:
private static <T> void printThreadInfo(Stream<T> s) {
    System.out.println(Thread.currentThread().getName() + " handles " + s.collect(toList()));
}
Now, let us examine which threads are being used by running the following line:
Permutations.of("A", "B", "C").forEach(Main::printThreadInfo);
We will see something like this:
main handles [A, B, C]
main handles [A, C, B]
main handles [B, A, C]
main handles [B, C, A]
main handles [C, A, B]
main handles [C, B, A]
This is logical. All our permutations are handled by the same thread (because we did not ask for a parallel stream). Now, let's modify the test line so it looks like this:
Permutations.of("A", "B", "C").parallel().forEach(Main::printThreadInfo);
This will produce something similar to this;
main handles [B, C, A]
main handles [C, B, A]
ForkJoinPool.commonPool-worker-2 handles [C, A, B]
ForkJoinPool.commonPool-worker-2 handles [A, B, C]
ForkJoinPool.commonPool-worker-2 handles [B, A, C]
ForkJoinPool.commonPool-worker-1 handles [A, C, B]
Apparently, on my computer, two combinations continued to execute on the main thread where as there were two other threads that (unevenly) shared the rest of the work.

Laziness

It should be noted that the Stream in Permutations.of() produces the sequence of combinations lazily (or using other words, as they are needed). So, if we set up an enormous amount of combinations, but only need one, the stream will only produce that one and only combination. Let's take an example with 16 input items which corresponds to an extremely large number of permutations, something comparable with the national dept:
System.out.println(
        Permutations.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16)
        .findFirst()
        .get()                
        .collect(toList())
);
This line will complete instantly and will return [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16] which, unsurprisingly, is the first combination.

Testing

Permutations are great for testing! How many times have you asked yourself: "Have I tried all the possible combinations". Using Permutations,  you can now be sure that you have tested every input combinations like in this simple outline (obviously, you need to replace System.out.println with something more useful):
Runnable setup = () -> System.out.println("setup connection");
Runnable send = () -> System.out.println("send data");
Runnable close = () -> System.out.println("close connection");

Permutations.of(setup, send, close)
    .flatMap(Function.identity())
    .forEachOrdered(Runnable::run);

The Morning

So, how does the optimal morning look like? Well, if you are alone at home, your day might start in many different ways, perhaps like this:
Permutations.of(
        "Get up",
        "Brush teeths",
        "Eat breakfast",
        "Get dressed",
        "Find wallet",
        "Find computer"
)
        .map(s -> s.collect(toList()))
        .forEach(System.out::println);
So now you can pick your favorite from:
[Get up, Brush teeths, Eat breakfast, Get dressed, Find wallet, Find computer]
[Get up, Brush teeths, Eat breakfast, Get dressed, Find computer, Find wallet]
[Get up, Brush teeths, Eat breakfast, Find wallet, Get dressed, Find computer]
[Get up, Brush teeths, Eat breakfast, Find wallet, Find computer, Get dressed]
[Get up, Brush teeths, Eat breakfast, Find computer, Get dressed, Find wallet]
[Get up, Brush teeths, Eat breakfast, Find computer, Find wallet, Get dressed]

... (713 other combinations) ...

[Find computer, Find wallet, Get dressed, Eat breakfast, Brush teeths, Get up]
With some preparation, even the last combination is really doable...

Conclusions

Permutation is a powerful tool that is worth mastering.

In this post, I have devised a very short and simple, yet reasonably efficient, implementation of a permutation support class for Java 8.

If you are writing unit tests, you should definitely know how to use permutations.

Take care, not to hit the "asymptotic wall" and limit the number of items that you work with when you are generating permutations.


Life is about picking the right choices, is it not?




Thursday, May 7, 2015

A Java 8 functional view on POJOs (Plain Ordinary Java Object)

Background

Since Java was first created back in the 90's, developers have used Plain Ordinary Java Objects (or short "POJOs"). Now that Java 8 is released, we can have another view on the legacy POJOs and work with them in a completely new way. Read this post and try to master the new opportunities with POJOs and Java 8.

Example of a POJO

Consider the following simple POJO that models a City that can have a name and an associated short code:
public class City {

    private String name;
    private String code;

    public City() {
    }

    public City(String name, String code) {
        this.name = name;
        this.code = code;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public String getCode() {
        return code;
    }

    public void setCode(String code) {
        this.code = code;
    }

}
The idea with POJOs is to compose a more precise Object that consists of one or several other objects so that we can build up more and more complex representations of different models we might need. For the moment, we disregard that the POJO should also have additional equals()hashCode() and  toString() methods.

With this class we can easily create cities like this:
// Use a constructor with parameters to create a City
City sf = new City("San Francisco", "SF");
     
// Use a default constructor with no parameters to create a City
City la = new City();
// Set the members using setters
la.setName("Los Angeles");
la.setCode("LA");
This is all old school, but what is really a getter and a setter in a Java 8 functional context? How can you model different constructors in Java 8?

A Functional View of Getters

A getter is something that can take a POJO and convert it into something else. This corresponds to a Java 8 Function. In our City example, the getName() getter corresponds to a Java 8 Function<City, String> because the getter takes a City and turns it into a String. In other words it maps a City to a String. We can easily prove our postulate by the following code snippet:

// Use the City's method references and assign them to functions
Function<City, String> getNameFunction = City::getName;
Function<City, String> getCodeFunction = City::getCode;

System.out.println("The code for "
        + getNameFunction.apply(sf)
        + " is "
        + getCodeFunction.apply(sf));

-> The code for San Francisco is SF
So, now you can use a Function instead of calling the getter directly. This can be very useful when you want to use different getters programatically in your code, for example by providing the getter as a Function in methods you call. It is also handy because it allows you to use the methods programatically without resorting to reflection in some cases.

A Functional View of Setters

The functional view on setters is a slightly more complex than for getters. Here we really use two parameters: the POJO itself and some value we want to set in the POJO. As opposed to a getter, we do not want to return anything. The way to go in Java 8 is to use something called a BiConsumer. As its prefix implies, a BiConsumer takes  two parameter and it "consumes" them both, meaning that they are "swallowed" so that nothing is returned. Again, we can test this thesis very easily like this:
// Use the City's method references and assign them to biconsumers
BiConsumer<City, String> setNameBiConsumer = City::setName;
BiConsumer<City, String> setCodeBiConsumer = City::setCode;
     
City ny = new City();
setNameBiConsumer.accept(ny, "New York");
setCodeBiConsumer.accept(ny, "NY");
So, now you can also use your setters programatically in your Java 8 code.

A Functional View of Constructors

The way of constructing objects in Java is sometimes considered as "magic" but it is really nothing special about a constructor from a functional point of view. A constructor is something like a Function that takes anything from zero to a large number of parameters and produces an Object.

The Default Constructor

In our example we had two constructors and now we will start with looking at the default constructor City(). The constructor apparently creates a City without using any parameter, so in a Java 8 context it is really a Supplier and more specifically, since it supplies cities, it is a Supplier<City>.  Again, we can prove this statement by trying this simple code:
// Use the City's constructor method reference to create
// a default constructor reference.
Supplier<City> defaultConstructor = City::new;

City sd = defaultConstructor.get();
sd.setName("San Diego");
sd.setCode("SD");
This is great! Now we can create different objects using simple constructor references. Very useful in dynamic programming code where you do not know the type of object you want to create until you run your code. Note how we obtain a method reference for the constructor using the ::new reference.

Parameter Constructors

Now it becomes a bit more tricky. A constructor that takes parameters is something more than just a Supplier because it needs the input parameters before the object can be created. In our example we have a constructor with two parameters and thus we can use Java 8's BiFunction. The BiFunction takes two parameters and can return something else, potentially depending on the two parameters. Because our City constructor takes two strings we want to use a BiFunction<String, String, City> which implies that we take two strings and maps them to a City. Again, we put together some code to show the concept:
// Use the City's constructor method reference to create
// a two-parameter constructor reference.
BiFunction<String, String, City> twoParameterConstructor = City::new;

City dc = twoParameterConstructor.apply("Washington, D.C.", "DC");
But wait there! We used the same method reference City::new for both the parameter-less default constructor and the two-parameter constructor, and yet Java 8 can distinguish between them! How can City::new have two different meanings? The answer is that Java 8 is able to infer which constructor is shall select, because it is able to match the different constructors against the expected result. So, because we want a BiFunction, it understands that it shall return a method reference to the two-parameter constructor and not the default constructor with no parameters!

Conclusions

In this post, we have devised a way of obtaining a fully functional view of POJOs. By mastering these techniques, you will be able to shake new life into your old POJOs and your Java 8 code.

Good luck with your functions!







Friday, April 3, 2015

Return Constructed Values in Java 8

Background

Very often when you are writing a method, you create an object, work with it a bit and then return the object or something that depends on the newly created object. In Java 8, you can do this in one step.

This (short) post shows a simple, yet very convenient and power-full, way of writing small methods i Java 8 where you do not have to write so much boiler plate code as you used to do in previous Java versions.

The pre-Java 8 Way

Suppose you want to create a method that is using a StringBuilder, add some stuff to the builder and then return a String using the StringBuilder's toString() method. If you are using a pre-Java 8 version you would most likely do it like this:

    public String poem() {
        final StringBuilder sb = new StringBuilder();
        sb.append("Hey there my dearest baby small, ");
        sb.append("soon you will be big and tall");
        return sb.toString();
    }
and if you would like to build an unmodifiable Set of Strings you could do something like this:

    public Set<String> suits() {
        final Set<String> result = new HashSet<>();
        result.add("Hearts");
        result.add("Spades");
        result.add("Diamonds");
        result.add("Clubs");
        return Collections.unmodifiableSet(result);
    }
As can be seen above, we first create something (i.e. a StringBuilder or a Set<String>), then we mutate that object (i.e. sb.append() or result.add()) and then finally we are finishing things up by applying some kind of function on the thing we created in the first place.

One Java 8 Way

By writing a small utility method like this...

    public <T, R> R with(Supplier<T> supplier, Consumer<T> mutator, Function<T, R> finisher) {
        final T start = supplier.get();
        mutator.accept(start);
        return finisher.apply(start);
    }

... we could make our methods a bit smaller. The method above takes two generic types, namely T which is the initial type we are working with (e.g. a StringBuilder) and R which is the return type (e.g. String). First the supplier is called so that we can obtain the initial start object (e.g. an instance of a StringBuilder). Then the mutator is called on the initial start object (e.g. we add stings to the StringBuilder). Then, lastly, the finisher is applied so that the right return type may be obtained (e.g. the initial StringBuilder instance is converted to a String). We could now start writing the previous examples in another way:

    public String poem() {
        return with(StringBuilder::new,
                sb -> sb
                .append("Hey there my dearest baby small, ")
                .append("soon you will be big and tall"),
                StringBuilder::toString);
    }
The first (Supplier) parameter in the call of the with() method is StringBuilder::new which is a method reference to StringBuilder's parameter-less constructor. The second (Consumer) argument will receive the newly created instance of the StringBuilder and it will then mutate it (i.e. call add() to add stuff to it). Finally, the third (Function) argument will take the newly created mutated instance and modify it to something else. In this case we will (again) us a method reference to obtain the object we are returning. In this case StringBuilder::toString will convert the mutated newly created object and map it to a String using the parameter-less function toString().

Here is the other example that builds an unmodifiable Set:

    public Set<String> suits() {
        return with(HashSet<String>::new, s -> {
            s.add("Hearts");
            s.add("Spades");
            s.add("Diamonds");
            s.add("Clubs");
        }, Collections::unmodifiableSet);
    }
This is sometimes refereed to as "Dependency Inversion". If this is an improvement over the pre-Java 8 ways is up to the reader to decide. I think one advantage is that you focus more on what to do rather than how to do it. On the other hand it is not so much less code as one might have hoped for and it is less clear what is going on for some one else not familiar with the concept.

If you do not want to run the finisher (perhaps you are satisfied with just a Set and does not need an unmodifiable Set), you could always skip the last step as shown below:

    public Set<String> suitsMutable() {

        return with(HashSet<String>::new, s -> {
            s.add("Hearts");
            s.add("Spades");
            s.add("Diamonds");
            s.add("Clubs");
        }, Function.identity());
    }
The static Function.identity() Function is (as its name implies) an identity function, meaning that it will always return exactly the same instance that you are providing to the function.


Using Java 8 Streams

Another way of accomplishing the same thing without using any utility method would be to use Java 8's built in Stream functions. The methods would then look something like this:

    public String poem() {
        return Stream.of("Hey there my dearest baby small, ", "soon you will be big and tall")
                .collect(joining());
    }

    public Set<String> suits() {
        return Collections.unmodifiableSet(
                Stream.of("Hearts", "Spades", "Diamonds", "Clubs")
                .collect(toSet()));
    }
Note that I have imported Collectors.joining and Collectors.toSet statically (this makes the code a bit smaller).

This is certainly an improvement over the previous methods. Less code and more in line with Java 8 standard coding pattern. However, sometimes you do not have the required collectors or finishers if you work with custom return objects. In some of those cases you can use the Collectors.collectingAndThen() method which allows you to run a normal collector and then apply a finisher on it. In the suits method example, it could look like this:

    public Set<String> suits() {
        return Stream.of("Hearts", "Spades", "Diamonds", "Clubs")
                .collect(collectingAndThen(toSet(), Collections::unmodifiableSet));
    }
Again, I have imported the collectingAndThen() method statically.


Convenience Methods

If you create a convenience methods like this:

   public <T> Set<T> immutableSetOf(T... members) {
        return Stream.of(members)
                .collect(collectingAndThen(toSet(), Collections::unmodifiableSet));
    }
then your can obtain your immutable Set very easily like this:

   public Set<String> suitsSimple() {
        return immutableSetOf("Hearts", "Spades", "Diamonds", "Clubs");
    }

Conclusions

Consider using Streams in your methods if you want to write simple and compact methods. Know your collectors and you will be able to save code space and time. A support method like the with() method can be useful in some circumstances.