Dataflow Analysis in `rustc`

# Dataflow Analysis

If you work on the MIR, you will frequently come across various flavors of [dataflow analysis][wiki]. `rustc` uses dataflow to find uninitialized variables, determine what variables are live across a generator `yield` statement, and compute which `Place`s are borrowed at a given point in the control-flow graph. Dataflow analysis is a fundamental concept in modern compilers, and knowledge of the subject will be helpful to prospective contributors.

However, this documentation is not a general introduction to dataflow analysis. It is merely a description of the framework used to define these analyses in `rustc`. It assumes that the reader is familiar with the core ideas as well as some basic terminology, such as "transfer function", "fixpoint" and "lattice". If you're unfamiliar with these terms, or if you want a quick refresher, [*Static Program Analysis*] by Anders Møller and Michael I. Schwartzbach is an excellent, freely available textbook. For those who prefer audiovisual learning, we previously recommended a series of short lectures by the Goethe University Frankfurt on YouTube, but it has since been deleted. See [this PR][pr-1295] for the context and [this comment][pr-1295-comment] for the alternative lectures.

## Defining a Dataflow Analysis

A dataflow analysis is defined by the [`Analysis`] trait. In addition to the type of the dataflow state, this trait defines the initial value of that state at entry to each block, as well as the direction of the analysis, either forward or backward. The domain of your dataflow analysis must be a [lattice][] (strictly speaking a join-semilattice) with a well-behaved `join` operator. See documentation for the [`lattice`] module, as well as the [`JoinSemiLattice`] trait, for more information.

### Transfer Functions and Effects

The dataflow framework in `rustc` allows each statement (and terminator) inside a basic block to define its own transfer function. For brevity, these individual transfer functions are known as "effects". Each effect is applied successively in dataflow order, and together they define the transfer function for the entire basic block. It's also possible to define an effect for particular outgoing edges of some terminators (e.g. [`apply_call_return_effect`] for the `success` edge of a `Call` terminator). Collectively, these are referred to as "per-edge effects".

### "Before" Effects

Observant readers of the documentation may notice that there are actually *two* possible effects for each statement and terminator, the "before" effect and the unprefixed (or "primary") effect. The "before" effects are applied immediately before the unprefixed effect **regardless of the direction of the analysis**. In other words, a backward analysis will apply the "before" effect and then the "primary" effect when computing the transfer function for a basic block, just like a forward analysis.

The vast majority of analyses should use only the unprefixed effects: having multiple effects for each statement makes it difficult for consumers to know where they should be looking. However, the "before" variants can be useful in some scenarios, such as when the effect of the right-hand side of an assignment statement must be considered separately from the left-hand side.

### Convergence

Your analysis must converge to "fixpoint", otherwise it will run forever. Converging to fixpoint is just another way of saying "reaching equilibrium". In order to reach equilibrium, your analysis must obey some laws. One of the laws it must obey is that the bottom value[^bottom-purpose] joined with some other value equals the second value. Or, as an equation:

> *bottom* join *x* = *x*

Another law is that your analysis must have a "top value" such that

> *top* join *x* = *top*

Having a top value ensures that your semilattice has a finite height, and the law stated above ensures that once the dataflow state reaches top, it will no longer change (the fixpoint will be top).
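
To make the two laws above concrete, here is a minimal, self-contained sketch of a "flat" lattice and a tiny fixpoint loop. Everything in it is hypothetical and written only for illustration: `Flat`, its `join` method, and `iterate_to_fixpoint` are not items from `rustc`, and the loop is a stand-in for a real dataflow engine rather than a description of how the framework iterates. The `bool` returned by `join` is simply this sketch's way of reporting whether the state changed.

```rust
/// Hypothetical toy lattice for illustration; not a type from `rustc`.
/// `Bottom` means "no information yet", `Elem(n)` means "exactly `n`",
/// and `Top` means "conflicting information".
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Flat {
    Bottom,
    Elem(i32),
    Top,
}

impl Flat {
    /// Joins `other` into `self` and returns `true` if `self` changed.
    fn join(&mut self, other: Flat) -> bool {
        let joined = match (*self, other) {
            // Law 1: bottom join x = x (in either argument order).
            (Flat::Bottom, x) | (x, Flat::Bottom) => x,
            // Two equal elements join to themselves.
            (Flat::Elem(a), Flat::Elem(b)) if a == b => Flat::Elem(a),
            // Law 2: top join x = top. Distinct elements also join to top.
            _ => Flat::Top,
        };
        let changed = joined != *self;
        *self = joined;
        changed
    }
}

/// Hypothetical driver: models a loop header with one entry edge and one
/// back edge. `back_edge` plays the role of the loop body's transfer
/// function. Returns the number of rounds needed before nothing changes.
fn iterate_to_fixpoint(
    header: &mut Flat,
    entry: Flat,
    back_edge: impl Fn(Flat) -> Flat,
) -> usize {
    let mut rounds = 0;
    loop {
        rounds += 1;
        // Join the state flowing in from the entry edge.
        let changed_by_entry = header.join(entry);
        // Run the loop body on the current state, then join the result.
        let from_body = back_edge(*header);
        let changed_by_loop = header.join(from_body);
        // Equilibrium: a full round that changes nothing.
        if !(changed_by_entry || changed_by_loop) {
            return rounds;
        }
    }
}
```

Starting from `Flat::Bottom` with an entry state of `Flat::Elem(1)` and a back edge that always yields `Flat::Elem(2)`, the loop stops after two rounds with the header at `Flat::Top`: the first round moves the state up to top, and the second confirms that joining anything into top changes nothing. In this toy lattice a state can only move upward twice (from bottom to an element, then to top), which is why the loop is guaranteed to stop.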

This section introduces dataflow analysis within the `rustc` compiler, explaining its use in various tasks like uninitialized variable detection and borrow checking. It outlines how to define dataflow analyses using the `Analysis` trait, emphasizing the importance of lattices and the `join` operator. The concept of transfer functions, or "effects," is detailed, including "before" effects and per-edge effects. Finally, it stresses the necessity for analyses to converge to a fixpoint, adhering to laws involving bottom and top values to ensure termination.