I won’t let Java confuse you #5: let’s talk lambdas

Lambdas are interesting structures. What are they exactly? And why can’t we use non-final local variables in lambda bodies?

Welcome back to the Java-unconfusing series! It’s our safe place to take a look at simple pieces of Java code that don’t always behave exactly as we expect. We look under the hood in order to understand why Java does what it does, so we are never confused again.

Context

Years ago, when we were only starting to use the Streams API in Java, it took some time to switch the mindset and properly use the new approach.

For example, in our lambda, we might have tried modifying a variable defined outside the lambda body, which wasn’t possible. But why?

What are lambdas

Suppose we have a method that accepts a Supplier<Object>:

private static void lambdaFun(Supplier<Object> s) {}

Supplier is one of the standard functional interfaces. A functional interface is an interface that has exactly one abstract method. In this case, it’s the get method, which takes no arguments and, for a Supplier<Object>, returns an Object.

When we write a lambda, we are essentially writing an implementation of the specific functional interface expected in the context. So a lambda like () -> 1, when passed to our lambdaFun method, is treated as an implementation of Supplier<Object>, because that’s what lambdaFun expects.

Depending on the expected type, the same () -> 1 can be a Supplier<Integer> or even a custom type that we define ourselves. Everything depends on the context.
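To make this concrete, here is a small sketch in which the same lambda text, () -> 1, is given three different target types. The class TargetTypingDemo and the interface NumberSource are hypothetical names made up for this illustration.

```java
import java.util.function.Supplier;

// Hypothetical demo class, not part of the original example.
class TargetTypingDemo {

    // A custom functional interface: exactly one abstract method.
    @FunctionalInterface
    interface NumberSource {
        int next();
    }

    static Object asObjectSupplier() {
        Supplier<Object> s = () -> 1;   // 1 is boxed to Integer and returned as Object
        return s.get();
    }

    static Integer asIntegerSupplier() {
        Supplier<Integer> s = () -> 1;  // same lambda text, different target type
        return s.get();
    }

    static int asCustomType() {
        NumberSource s = () -> 1;       // next() returns a primitive int, so no boxing
        return s.next();
    }

    public static void main(String[] args) {
        System.out.println(asObjectSupplier());
        System.out.println(asIntegerSupplier());
        System.out.println(asCustomType());
    }
}
```

The lambda itself carries no type information; the compiler picks the interface from the variable or parameter it is assigned to.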

Annoying lambda use-case

Let’s take a look at this code:

public static void test(int x) {
    int y = 1;
    lambdaFun(() -> {
        y++;
        return x + y;
    });
}

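In this case, the code won’t compile, and we get an error along the lines of: Variable used in lambda expression should be final or effectively final. The exact wording depends on the tool; javac, for instance, says: local variables referenced from a lambda expression must be final or effectively final. What does “effectively final” even mean?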

What is “effectively final”

The Java compiler does a lot of smart things during compilation. It can analyse how a local variable is used to determine whether its value ever changes, even if the variable is not marked as final. If a variable is defined within a method body, nothing from outside can modify it. So the compiler only needs to check whether it is reassigned anywhere inside the method. If it’s not, the compiler can safely conclude that the variable is effectively final: it behaves exactly as if we had declared it final.
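As a rough illustration, in the sketch below y is never declared final, yet the lambda compiles because y is assigned only once. EffectivelyFinalDemo and sum are hypothetical names used only here.

```java
import java.util.function.Supplier;

// Hypothetical demo class, not part of the original example.
class EffectivelyFinalDemo {

    static int sum(int x) {
        int y = 1;                           // no final keyword, but never reassigned
        Supplier<Integer> s = () -> x + y;   // allowed: x and y are effectively final
        // y++;                              // uncommenting this line would break the lambda above,
        //                                   // because y would no longer be effectively final
        return s.get();
    }

    public static void main(String[] args) {
        System.out.println(sum(2));          // prints 3
    }
}
```

A handy rule of thumb: if adding the final keyword to the declaration would not cause a compilation error, the variable is effectively final.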

Why is that a problem for lambdas

Lambdas don’t allow the use of local variables from outside the lambda body unless those variables are “final or effectively final”. In other words, a lambda can read such variables but can never reassign them. Why is that?

Remember that when we write a lambda body, we’re implementing a functional interface. So, in our Supplier case, we’re saying: this is the body of the get method of a Supplier implementation that is magically created “in the air”.

Let me explain.

When we compile our code, under the hood a new separate method is created. Say we have this code:

public static void test(int x) {
    lambdaFun(() -> x);
}

The () -> x becomes the body of a separate method named lambda$test$0, which the compiler generates. We can see it in the bytecode:

private static java.lang.Object lambda$test$0(int);
descriptor: (I)Ljava/lang/Object;
flags: (0x100a) ACC_PRIVATE, ACC_STATIC, ACC_SYNTHETIC
Code:
  stack=1, locals=1, args_size=1
     0: iload_0
     1: invokestatic  #28                 // Method java/lang/Integer.valueOf:(I)Ljava/lang/Integer;
     4: areturn
  LineNumberTable:
    line 7: 0
  LocalVariableTable:
    Start  Length  Slot  Name   Signature
        0       5     0     x   I

The synthetic class that actually implements the Supplier interface and uses this method as its implementation is generated later, at runtime, and only then is the actual instance created. This happens only when the lambda expression is actually reached during execution, which saves resources when it isn’t.

So, why can’t we freely use outer local variables in our lambdas? Simply because, internally, lambdas are little methods. Any local variable they use is passed to them (captured) as a method argument. Inside a lambda body, we are looking at a copied value, not the actual variable.
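The idea can be approximated by hand. The sketch below is only an analogy: the real class is generated by the JVM at runtime and looks different, and the names DesugaredLambdaDemo, CapturingSupplier and lambdaBody are invented for this illustration.

```java
import java.util.function.Supplier;

// Hand-written approximation of what () -> x amounts to; not the actual generated code.
class DesugaredLambdaDemo {

    // Plays the role of the compiler-generated lambda$test$0(int).
    private static Object lambdaBody(int x) {
        return x;                          // boxed via Integer.valueOf, as in the bytecode
    }

    // Plays the role of the class that is created at runtime.
    private static final class CapturingSupplier implements Supplier<Object> {
        private final int capturedX;       // a copy of x, stored when the instance is created

        CapturingSupplier(int x) {
            this.capturedX = x;
        }

        @Override
        public Object get() {
            return lambdaBody(capturedX);  // the copy is handed over as a plain argument
        }
    }

    static Supplier<Object> test(int x) {
        return new CapturingSupplier(x);   // roughly equivalent to: return () -> x;
    }

    public static void main(String[] args) {
        System.out.println(test(42).get());
    }
}
```

Written this way, it is easier to see that the lambda body never touches the caller’s x itself, only a value copied from it.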

Still, why so final?

It’s easy to see why a local variable defined outside can’t be modified inside the lambda body. But why does it have to be final, in the sense that it can’t be modified even after the lambda definition, or even after its execution?

Let’s imagine this limitation doesn’t exist.

We might want to define a lambda and then modify a variable that it uses. The lambda received a copy of the value at the moment it was created, so it would keep working with the old value while the code around it sees the new one. Likewise, anything the lambda did to its own copy would be invisible outside.

This all might look really confusing to a developer. The syntax makes us think that the lambda and the enclosing method share one variable, but that’s not true.

So the compiler needs to prevent the expectation that changes made to a variable after the lambda definition are visible in the lambda body. They can’t be.

It becomes especially important in a multithreaded environment. We might want to execute a lambda in a new Thread and modify the variable at a moment that looks to us like “after the lambda execution”. But can we be so sure? Has our new Thread already started and executed our lambda when we modify the variable? If the variable were truly shared, which value would the lambda see, the old one or the amended one?

Given all of the above, the compiler is protecting us from confusion and wrong expectations.
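Here is a minimal sketch of that copy behaviour. Because counter is reassigned, it can’t be captured directly, so the code copies it into a separate variable first. CaptureTimingDemo, counter and snapshot are hypothetical names chosen for this example.

```java
import java.util.function.Supplier;

// Hypothetical demo class, not part of the original example.
class CaptureTimingDemo {

    static int run() {
        int counter = 10;
        counter += 5;                            // counter is reassigned, so it is not effectively final

        int snapshot = counter;                  // assigned once: effectively final
        Supplier<Integer> s = () -> snapshot;    // the value 15 is copied into the lambda here

        counter = 999;                           // the lambda knows nothing about this change
        return s.get();
    }

    public static void main(String[] args) {
        System.out.println(run());               // prints 15
    }
}
```

The explicit snapshot variable makes it obvious in the source which value the lambda works with, which is exactly the clarity the compiler rule enforces.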

What about fields?

You might have noticed that there is no problem with reading and modifying fields from our lambdas. Unlike a local variable, which exists with a certain state in a specific method frame, a field is visible to all the methods of the class, independent of any frame state. As we remember, the lambda body becomes a separate method under the hood. If a field is used in the lambda body, its value is not captured as an argument to that method. Instead, the method can read and overwrite the field directly, just as any normal method can.

That’s easy when the field and the method are located in the same class. But what if we would like to use a reference to an external method in our lambda instead of generating one in the bytecode? For example, like this:
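A short sketch of a lambda that modifies an instance field. FieldAccessDemo, counter and incrementer are hypothetical names used only for illustration.

```java
import java.util.function.Supplier;

// Hypothetical demo class, not part of the original example.
class FieldAccessDemo {

    private int counter = 0;                 // an instance field, not a local variable

    Supplier<Integer> incrementer() {
        // The lambda works with the live field of this object, not with a copied value.
        return () -> ++counter;
    }

    int getCounter() {
        return counter;
    }

    public static void main(String[] args) {
        FieldAccessDemo demo = new FieldAccessDemo();
        Supplier<Integer> inc = demo.incrementer();
        inc.get();
        inc.get();
        System.out.println(demo.getCounter());   // prints 2
    }
}
```

Both calls change the same field, so the update is visible to every other method of the object.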

Stream.of(1, 2, 3).forEach(System.out::println);

In that case no new method is created at compile time. Instead, the existing method System.out.println() serves as our functional interface implementation. However, the syntax doesn’t allow us to pass any of our fields to the method reference, and the external method might not even be able to access them.
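For comparison, the sketch below does the same work once with a lambda and once with a method reference bound to a list. MethodReferenceDemo and its method names are hypothetical; the remark about generated methods describes what javac typically does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical demo class, not part of the original example.
class MethodReferenceDemo {

    static List<Integer> viaLambda() {
        List<Integer> out = new ArrayList<>();
        // A lambda body: javac typically emits a synthetic method for it.
        Stream.of(1, 2, 3).forEach(n -> out.add(n));
        return out;
    }

    static List<Integer> viaMethodReference() {
        List<Integer> out = new ArrayList<>();
        // A method reference: the existing add method of this list is used directly.
        Stream.of(1, 2, 3).forEach(out::add);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(viaLambda());
        System.out.println(viaMethodReference());
    }
}
```

Just like System.out::println, out::add is tied to one specific object, and the only input it receives is the stream element.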

Let’s play

Digging through the JLS (the Java Language Specification), I stumbled upon a couple of interesting examples of lambda bodies using local variables.

Let’s have a look.

#1

void m1(int x) {
    int y = 1;
    lambdaFun(() -> x + y);
}

In this example, both x and y are “effectively final”. They are defined within the m1 method, so they are guaranteed not to be modified from outside. x gets its value when m1 is called and is never modified afterward. y is also assigned only once.

#2

What about this one?

void m2(int x) {
    int y;
    y = 1;
    lambdaFun(() -> x + y);
}

y is declared without a value but then assigned exactly once, so it’s still effectively final. We’re good.

#3

void m3(int x) {
    int y;
    if (x == 2) y = 1;
    lambdaFun(() -> x + y);
}

Variable y is assigned only conditionally. In JLS terms, it is not definitely assigned before the lambda. We can’t use an uninitialized variable, because then there is no value for our lambda to capture (to feed as an argument into the method created under the hood). That won’t compile!

#4

void m4(int x) {
    int y;
    if (x == 2) y = 1; else y = 2;
    lambdaFun(() -> x + y);
}

We do have a conditional assignment to y, but the else branch covers all the remaining cases, so whichever branch runs, y is assigned a value once and only once. We’re good!

#5

void m5(int x) {
    int y;
    if (x == 2) y = 1;
    y = 2;
    lambdaFun(() -> x + y);
}

Here y may already hold a value when y = 2 runs, so it can be assigned twice. That means y is not effectively final. Won’t compile!

#6

void m6(int x) {
    lambdaFun(() -> x + 1);
    x++;
}

Even though x is modified only after our lambda definition, x is still not effectively final. Won’t compile!

#7

void m7(int x) {
    lambdaFun(() -> x = 1);
}

Won’t compile either: the lambda body tries to modify x.

#8

void m8() {
    int y;
    lambdaFun(() -> y = 1);
}

What about this one? y doesn’t have a value yet, so it can’t be captured by the lambda. On top of that, we are trying to modify it within the lambda body. None of that is going to work…

#9

void m9(String[] arr) {
    for (String s : arr) {
        lambdaFun(() -> s);
    }
}

Here, each iteration gets a brand-new variable s. Each of these variables is definitely assigned a value, and that is guaranteed to happen only once. Everything is alright!

#10

void m10(String[] arr) {
    for (int i = 0; i < arr.length; i++) {
        lambdaFun(() -> arr[i]);
    }
}

Unlike in the previous example, here we are reusing the same variable i and modifying its value by incrementing it on each iteration. Variable i is not effectively final, so this won’t work.
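One common way around this, sketched below, is to copy i into a fresh local variable inside the loop body, which gives the lambda something effectively final to capture. LoopCaptureDemo, collect and index are hypothetical names, and the suppliers are stored in a list only so that the result can be inspected.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical demo class, not part of the original example.
class LoopCaptureDemo {

    static List<String> collect(String[] arr) {
        List<Supplier<String>> suppliers = new ArrayList<>();
        for (int i = 0; i < arr.length; i++) {
            int index = i;                       // a new variable on each iteration, assigned once
            suppliers.add(() -> arr[index]);     // arr is never reassigned, so it can be captured too
        }

        List<String> results = new ArrayList<>();
        for (Supplier<String> s : suppliers) {
            results.add(s.get());                // each lambda kept its own copy of index
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(collect(new String[] {"a", "b", "c"}));   // prints [a, b, c]
    }
}
```

This mirrors the for-each case from #9: the loop counter keeps changing, but every lambda sees a variable that never does.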

#11

void m11(int x) {
    int y = 1;
    if (x == 2) y = 1;
    lambdaFun(() -> x + y);
}

I know, it looks like we are not changing the value of y here. Yet y is declared with an initializer and then may be assigned again, and the compiler looks at assignments, not at the values being assigned. So the variable is not effectively final. This won’t compile!

Conclusion

Lambdas are interesting structures. But we can get a feeling for how they are going to behave simply by realizing one thing: a lambda body becomes a separate method. Any local variables that we use in the lambda body are captured as method arguments, so we can access only copies of their values. The compiler protects us by refusing to let a lambda use local variables that are not final or effectively final.

Hope this was educative and fun! Here are the related docs.

If you liked this little post, take a look at the previous posts from the series: