AST Lowering, MIR Lowering, and Code Generation in the Rust Compiler

Parsing is organized by semantic construct. Separate `parse_*` methods can be found in the [`rustc_parse`][rustc_parse_parser_dir] directory. Each source file is named after the construct it parses. For example, the following files are found in the `parser` directory:

- [`expr.rs`](https://github.com/rust-lang/rust/blob/master/compiler/rustc_parse/src/parser/expr.rs)
- [`pat.rs`](https://github.com/rust-lang/rust/blob/master/compiler/rustc_parse/src/parser/pat.rs)
- [`ty.rs`](https://github.com/rust-lang/rust/blob/master/compiler/rustc_parse/src/parser/ty.rs)
- [`stmt.rs`](https://github.com/rust-lang/rust/blob/master/compiler/rustc_parse/src/parser/stmt.rs)

This naming scheme is used across many compiler stages. You will find either a file or directory with the same name across the parsing, lowering, type checking, [Typed High-level Intermediate Representation (`THIR`)][thir] lowering, and [Mid-level Intermediate Representation (`MIR`)][mir] building sources.

Macro-expansion, `AST`-validation, name-resolution, and early linting also take place during the lexing and parsing stage.

`AST` nodes are returned from the parser, while the standard [`Diag`] API is used for error handling. Generally, Rust's compiler will try to recover from errors by parsing a superset of Rust's grammar, while also emitting an error type.

### `AST` lowering

Next, the `AST` is converted into [High-Level Intermediate Representation (`HIR`)][hir], a more compiler-friendly representation of the `AST`. This process is called "lowering" and involves a lot of desugaring (the expansion and formalizing of shortened or abbreviated syntax constructs) of things like loops and `async fn`.

We then use the `HIR` to do [*type inference*] (the process of automatically detecting the type of an expression), [*trait solving*] (the process of pairing up an impl with each reference to a `trait`), and [*type checking*].
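To make the idea of desugaring more concrete, here is a rough sketch of what lowering a `for` loop amounts to. The function names `sum_with_for` and `sum_desugared` are made up for illustration, and the second function is a hand-written approximation in surface Rust, not the actual `HIR` that `rustc` produces.

```rust
// What the user writes: a `for` loop over a slice.
fn sum_with_for(values: &[i32]) -> i32 {
    let mut total = 0;
    for v in values {
        total += *v;
    }
    total
}

// Approximately what the loop means once the sugar is removed:
// a plain `loop` that drives an iterator by hand.
fn sum_desugared(values: &[i32]) -> i32 {
    let mut total = 0;
    // The iterated expression is converted into an iterator exactly once.
    let mut iter = IntoIterator::into_iter(values);
    loop {
        // Each pass asks the iterator for its next item.
        match Iterator::next(&mut iter) {
            Some(v) => total += *v,
            None => break, // the iterator is exhausted, so leave the loop
        }
    }
    total
}

fn main() {
    let data = [1, 2, 3, 4];
    // Both forms compute the same result.
    assert_eq!(sum_with_for(&data), sum_desugared(&data));
    println!("both forms sum to {}", sum_with_for(&data));
}
```

After a rewrite like this, later stages only need to understand `loop` and `match`, not every surface-level looping construct.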
Type checking is the process of converting the types found in the `HIR` ([`hir::Ty`]), which represent what the user wrote, into the internal representation used by the compiler ([`Ty<'tcx>`]). It's called type checking because the information is used to verify the type safety, correctness and coherence of the types used in the program.

### `MIR` lowering

The `HIR` is further lowered to `MIR` (used for [borrow checking]) by first constructing the `THIR` (an even more desugared `HIR` used for pattern and exhaustiveness checking), which is then converted into `MIR`.

We do [many optimizations on the MIR][mir-opt] because it is still generic at this point, which improves both later code generation and compilation speed. It is also easier to do some optimizations at the `MIR` level than at the `LLVM-IR` level. For example, LLVM doesn't seem to be able to optimize the pattern that the [`simplify_try`] `MIR`-opt looks for.

Rust code is also [_monomorphized_] during code generation, which means making copies of all the generic code with the type parameters replaced by concrete types. To do this, we need to collect a list of the concrete types to generate code for. This is called _monomorphization collection_ and it happens at the `MIR` level.

### Code generation

We then begin what is simply called _code generation_ or _codegen_. The [code generation stage][codegen] is when higher-level representations of source are turned into an executable binary. Since `rustc` uses LLVM for code generation, the first step is to convert the `MIR` to `LLVM-IR`. This is where the `MIR` is actually monomorphized. The `LLVM-IR` is passed to LLVM, which does a lot more optimizations on it and then emits machine code, which is basically assembly code with additional low-level types and annotations added (e.g. an ELF object or `WASM`). The different libraries/binaries are then linked together to produce the final binary.
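As an illustration of monomorphization, the sketch below places one generic function next to hand-written stand-ins for the concrete copies the compiler would conceptually generate. The names `largest`, `largest_i32`, and `largest_f64` are hypothetical; the real copies are produced internally from `MIR` and never appear as source code.

```rust
// Generic source: written once, usable with any `T` that can be
// compared and copied. Panics on an empty slice (fine for a sketch).
fn largest<T: PartialOrd + Copy>(items: &[T]) -> T {
    let mut max = items[0];
    for &item in items {
        if item > max {
            max = item;
        }
    }
    max
}

// Hand-written equivalent of the copy specialized for `T = i32`.
fn largest_i32(items: &[i32]) -> i32 {
    let mut max = items[0];
    for &item in items {
        if item > max {
            max = item;
        }
    }
    max
}

// Hand-written equivalent of the copy specialized for `T = f64`.
fn largest_f64(items: &[f64]) -> f64 {
    let mut max = items[0];
    for &item in items {
        if item > max {
            max = item;
        }
    }
    max
}

fn main() {
    let ints = [3, 7, 2];
    let floats = [1.5, 0.25, 9.0];
    // These two calls are the "uses" that monomorphization collection
    // would record: `largest::<i32>` and `largest::<f64>`.
    assert_eq!(largest(&ints), largest_i32(&ints));
    assert_eq!(largest(&floats), largest_f64(&floats));
    println!("largest int: {}, largest float: {}", largest(&ints), largest(&floats));
}
```

Only the instantiations that are actually used get a copy, which is why the set of concrete types has to be collected before machine code is generated.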
## How it does it

Now that we have a high-level view of what the compiler does to your code, let's take a high-level view of _how_ it does all that stuff. There are a lot of constraints and conflicting goals that the compiler needs to satisfy/optimize for. For example,

This section explains the subsequent steps in the Rust compilation process after parsing. It covers AST lowering to HIR (High-Level Intermediate Representation), type inference, trait solving, and type checking. It then describes MIR (Mid-level Intermediate Representation) lowering via THIR, followed by MIR optimizations. Finally, it details code generation, which involves converting MIR to LLVM-IR, monomorphization, and linking libraries/binaries to produce the final executable.