The Ultimate Breakdown of Kotlin Coroutines. Part 1. State-Machines and Suspension.
Introduction
Throughout 2020 I wrote a document on coroutines codegen, which aimed mostly at my colleagues at JetBrains and Google to help them understand the pitfalls of the codegen and which design decisions we made. These blog posts are a simplified and trimmed down version of the documentation for a larger audience of experienced Kotlin programmers, who just want to know how the compiler turns sequential code into suspendable and become more proficient at their jobs. The blogposts will not cover kotlinx.coroutines internals. However, I think, with the knowledge of coroutines internals, it will be easier to understand, for example, how Flow's backpressure works and why it is essential for Flow operators to be inline with crossinline lambda parameter (spoilers: this is an optimization).
I decided to write the blogposts now, before Kotlin 1.5 release, since Kotlin 1.5 brings new JVM_IR back-end with a lot of improvements in code generations, tail-call optimization of suspend functions and so many bug fixes, that I lost count.
The blogposts are JVM-centric, that means it explains how things work in JVM back-end since this is the area I am most familiar with and since in JVM, there are guaranties of backward compatibility, which the compiler shall obey in both so-called "Old JVM" back-end, and in the new JVM_IR one. The new back-end's naming can differ from the official documentation: the document uses the "IR" suffix, while the official documentation omits it.
The things I omitted in this breakdown, but present in the documentation are compiler internals, and the shape of generated classes, since the last thing I want is to confuse my readers and overwhelm them with the information, which is helpful for the compiler writers, but not necessarily for language users.
Throughout the series term "coroutine" will represent either a suspend lambda or a suspend function, which is different from the usual definition of coroutines - something like a lightweight thread. The document reuses the term since "suspend lambda or function" is wordy, and when it requires the typical definition, I will say explicitly "a coroutine in a broad sense."
The last couple of words before I get to the meat - I will use a lot of code to explain different concepts, most of it actually suspends, and it might get overwhelming really fast. I will try my best to make the example short and digestible, however, coroutines is a complex and intermingled topic, and it is hard to kill the three-headed hydra one head at a time, with its heads, namely suspend, resume and state-machine, supporting each other. I hope, I've done a fine job of explaining the concepts, but if something is unclear, feel free to point to the part, which needs to be reworked.
With all the introductory note out of the way, let me begin by introducing suspend lambdas.
Suspend Lambda
Suspend lambda is an example of a coroutine, and the compiler turns ordinary, sequential code inside the lambda into suspendable. The example shows a simple suspend lambda:
suspend fun dummy() {}
suspend fun main() {
val lambda: suspend () -> Unit = {
dummy()
println(1)
dummy()
println(2)
}
lambda()
}
which, upon running, will print 1
and 2
, as expected.
One can call a suspend function only from other suspend function or suspend
lambda, but it can call ordinary, non-suspendable functions. For example, both
dummy
and println
are used only inside the lambda. Because one cannot call
suspendable functions from ordinary, we can imagine two worlds: suspendable and
ordinary. Alternatively, one can consider them as two different colors, and we
color the program by using the "suspend" modifier.
The lambda, creatively named lambda
, contains two suspend calls (dummy
) and
one from the main
function to the lambda itself, but there is no suspension.
Let us add it:
import kotlin.coroutines.*
var c: Continuation<Unit>? = null
suspend fun suspendMe() = suspendCoroutine<Unit> { continuation ->
println("Suspended")
c = continuation
}
suspend fun main() {
val lambda: suspend () -> Unit = {
suspendMe()
println(1)
suspendMe()
println(2)
}
lambda()
}
Now, when we run the code, it prints Suspended
and nothing else; it does not
even finish the execution of the program, since lambda
is, in fact, suspended,
and it suspends suspend fun main
as well.
To fix the issue with the suspension of main
, we need to cross a boundary
between suspendable and ordinary worlds and make main
ordinary, so, when it
starts a coroutine, and the coroutine suspends, main
does not. Since one
cannot call a suspendable function from an ordinary one, there are special
functions, so-called coroutine builders, whose sole purpose is to create a
coroutine, run it, and when it suspends, return execution to the caller. Other
than that, they act like other ordinary functions. Let's name ours, I don't
know, builder
:
fun builder(c: suspend () -> Unit) {
c.startCoroutine(object: Continuation<Unit> {
override val context = EmptyCoroutineContext
override fun resumeWith(result: Result<Unit>) {
result.getOrThrow()
}
})
}
A separate section explains the exact mechanism of starting a coroutine (in a
broad sense) and writing their builders. For now, consider builder
as a
boilerplate to cross the worlds.
Now, when we change main
to use the builder and not suspend itself
fun main() {
val lambda: suspend () -> Unit = {
suspendMe()
println(1)
suspendMe()
println(2)
}
builder {
lambda()
}
}
and then run the example, it will print expected Suspended
, but this time it
will exit the program.
Additionally, when we change main
to resume the lambda
import kotlin.coroutines.*
var c: Continuation<Unit>? = null
suspend fun suspendMe() = suspendCoroutine<Unit> { continuation ->
println("Suspended")
c = continuation
}
fun builder(c: suspend () -> Unit) {
c.startCoroutine(object: Continuation<Unit> {
override val context = EmptyCoroutineContext
override fun resumeWith(result: Result<Unit>) {
result.getOrThrow()
}
})
}
fun main() {
val lambda: suspend () -> Unit = {
suspendMe()
println(1)
suspendMe()
println(2)
}
builder {
lambda()
}
c?.resume(Unit)
}
it will print
Suspended
1
Suspended
lambda
is resumed and then suspended once more. If we add a couple more
c?.resume(Unit)
fun main() {
val lambda: suspend () -> Unit = {
suspendMe()
println(1)
suspendMe()
println(2)
}
builder {
lambda()
}
c?.resume(Unit)
c?.resume(Unit)
c?.resume(Unit)
}
we will get
Suspended
1
Suspended
2
Exception in thread "main" java.lang.IllegalStateException: Already resumed
The last line is what we get when we try to resume a finished continuation.
A lot happens in this little example. Unfortunately, I cannot break everything down in just one blogpost, so, let's stick with state-machines for the rest of it.
State-Machine
The compiler turns sequential code into suspendable by using state-machines. It distributes suspending calls between states in a state-machine. The relationship between the calls and the states is almost one-to-one: each call gets a state, but not each state gets a call. The state ends with the call, and the compiler places all instructions preceding the call in the same state before the call. It places all instructions after the last call before (implicit) return statement in a separate state.
For example, having
dummy()
println(1)
dummy()
println(2)
the compiler splits the code in the following way
==========
dummy()
----------
println(1)
dummy()
----------
println(2)
==========
where function boundaries are represented by ==========
and state boundaries
are represented by ----------
. The compiler after splitting the function
generates the following code (simplified for now):
val $result: Any? = null
when(this.label) {
0 -> {
this.label = 1
$result = dummy(this)
if ($result == COROUTINE_SUSPENDED) return COROUTINE_SUSPENDED
goto 1
}
1 -> {
println(1)
this.label = 2
$result = dummy(this)
if ($result == COROUTINE_SUSPENDED) return COROUTINE_SUSPENDED
goto 2
}
2 -> {
println(2)
return Unit
}
else -> {
throw IllegalStateException("call to 'resume' before 'invoke' with coroutine")
}
}
Then it puts the state-machine inside the invokeSuspend
function. Thus, in
addition to the usual for lambda captured parameters, <init>
and invoke
, we have label
field and invokeSuspend
function.
At the beginning of the function label
's value is 0
. Before the call, we set
it to 1
. During the call, two things can happen:
dummy
returns a result, in this case,Unit
. When this happens, execution continues as if it was sequential code, jumping to the next state.dummy
suspends. When a suspending function suspends, it returns theCOROUTINE_SUSPENDED
marker. So, ifdummy
suspends, the caller also suspends by returning the sameCOROUTINE_SUSPENDED
. Furthermore, all the suspend functions in call stack suspend, returning COROUTINE_SUSPEND, until we reach the coroutine builder function, which just returns.
Upon resume, invokeSuspend
is called again, but this time label
is 1
, so
the execution jumps directly to the second state, and the caller does not
execute the first call to dummy
again. This way, the lambda's execution can be
suspended and resumed. Thus, the lambda is turned into a coroutine, which is, by
definition, is a suspendable unit of code.
That is the reason why we need to turn linear code into a state machine.
On a closing note, the state machine should be flat; in other words, there
should be no state-machine inside a state-machine state. Otherwise, inner
state-machine states will rewrite label
, breaking the whole suspend-resume
machinery and leading to weird behavior, ranging from ClassCastException to
infinite loops. Similar buggy behavior happens when several suspending calls are
in one state: when the first call suspends, and then the execution resumes,
skipping all the remaining code in the state. Both these bugs were quite
frequent in the early days of coroutines inlining.
It the following post I will explain continuation passing style, which is the reason to use coroutine builders to start a coroutine (in a broad sence, of course), and the resumption process.
Comments
Post a Comment