rustc

2nd chunk of `src/overview.md`
Parsing is organized by semantic construct. Separate
`parse_*` methods can be found in the [`rustc_parse`][rustc_parse_parser_dir]
directory. Each source file is named after the construct it parses. For example,
the following files are found in the `parser` directory:

- [`expr.rs`](https://github.com/rust-lang/rust/blob/master/compiler/rustc_parse/src/parser/expr.rs)
- [`pat.rs`](https://github.com/rust-lang/rust/blob/master/compiler/rustc_parse/src/parser/pat.rs)
- [`ty.rs`](https://github.com/rust-lang/rust/blob/master/compiler/rustc_parse/src/parser/ty.rs)
- [`stmt.rs`](https://github.com/rust-lang/rust/blob/master/compiler/rustc_parse/src/parser/stmt.rs)
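
To make the "one `parse_*` method per construct" idea concrete, here is a minimal sketch of a toy recursive-descent parser. Everything in it (`Token`, `Expr`, `Stmt`, `Parser`, and helpers such as `bump`, `eat`, and `expect`) is a hypothetical stand-in invented for illustration, not the real `rustc_parse` types or signatures. The comments only indicate which file a method of that kind would belong to under the naming scheme.

```rust
// Toy tokens and AST nodes. These are illustrative, not rustc's real types.
#[derive(Debug, Clone, PartialEq)]
enum Token {
    Let,
    Ident(String),
    Eq,
    Int(i64),
    Plus,
    Semi,
}

#[derive(Debug, PartialEq)]
enum Expr {
    Int(i64),
    Var(String),
    Add(Box<Expr>, Box<Expr>),
}

#[derive(Debug, PartialEq)]
enum Stmt {
    Let { name: String, init: Expr },
}

struct Parser {
    tokens: Vec<Token>,
    pos: usize,
}

impl Parser {
    fn new(tokens: Vec<Token>) -> Self {
        Parser { tokens, pos: 0 }
    }

    // Returns the current token (if any) and advances past it.
    fn bump(&mut self) -> Option<Token> {
        let tok = self.tokens.get(self.pos).cloned();
        self.pos += 1;
        tok
    }

    // Consumes the current token only if it equals `expected`.
    fn eat(&mut self, expected: &Token) -> bool {
        if self.tokens.get(self.pos) == Some(expected) {
            self.pos += 1;
            true
        } else {
            false
        }
    }

    // Like `eat`, but a mismatch is reported as an error.
    fn expect(&mut self, expected: Token) -> Result<(), String> {
        if self.eat(&expected) {
            Ok(())
        } else {
            Err(format!("expected {:?}", expected))
        }
    }

    // Statement parsing: the kind of method that would live in `stmt.rs`.
    fn parse_stmt(&mut self) -> Result<Stmt, String> {
        self.expect(Token::Let)?;
        let name = match self.bump() {
            Some(Token::Ident(name)) => name,
            other => return Err(format!("expected identifier, found {:?}", other)),
        };
        self.expect(Token::Eq)?;
        let init = self.parse_expr()?;
        self.expect(Token::Semi)?;
        Ok(Stmt::Let { name, init })
    }

    // Expression parsing: the kind of method that would live in `expr.rs`.
    fn parse_expr(&mut self) -> Result<Expr, String> {
        let mut lhs = self.parse_atom()?;
        while self.eat(&Token::Plus) {
            let rhs = self.parse_atom()?;
            lhs = Expr::Add(Box::new(lhs), Box::new(rhs));
        }
        Ok(lhs)
    }

    fn parse_atom(&mut self) -> Result<Expr, String> {
        match self.bump() {
            Some(Token::Int(n)) => Ok(Expr::Int(n)),
            Some(Token::Ident(name)) => Ok(Expr::Var(name)),
            other => Err(format!("expected expression, found {:?}", other)),
        }
    }
}
```

Feeding the tokens for `let x = 1 + 2;` to `parse_stmt` yields a `Stmt::Let` whose initializer is an `Expr::Add` node. The statement method delegates to the expression method, which is why splitting the code by construct keeps each file focused.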

This naming scheme is used across many compiler stages. You will find either a
file or directory with the same name across the parsing, lowering, type
checking, [Typed High-level Intermediate Representation (`THIR`)][thir] lowering, and
[Mid-level Intermediate Representation (`MIR`)][mir] building sources.

Macro-expansion, `AST`-validation, name-resolution, and early linting also take
place during the lexing and parsing stage.

`AST` nodes are returned from the parser, while the standard [`Diag`] API is used
for error handling. Generally, Rust's compiler will try to recover from errors
by parsing a superset of Rust's grammar, while also emitting an error.
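
The recovery idea can be sketched with a toy parser that accepts more input than its "real" grammar allows. When it sees something that is not a valid expression, it records a diagnostic, produces a placeholder node, and keeps going. The names here (`Expr::Error`, `Parser`, `errors`, `parse_all`) are hypothetical and the diagnostics are plain strings. This is an illustration of the general technique under those assumptions, not how rustc's parser or the `Diag` API is implemented.

```rust
// Toy AST: `Error` is a placeholder node standing in for unparsable input.
#[derive(Debug, PartialEq)]
enum Expr {
    Int(i64),
    Error,
}

struct Parser<'a> {
    words: Vec<&'a str>,
    pos: usize,
    // Collected diagnostics. Plain strings here, purely for illustration.
    errors: Vec<String>,
}

impl<'a> Parser<'a> {
    fn new(src: &'a str) -> Self {
        Parser {
            words: src.split_whitespace().collect(),
            pos: 0,
            errors: Vec::new(),
        }
    }

    // Parses one whitespace-separated word. Callers must ensure input remains.
    fn parse_expr(&mut self) -> Expr {
        let word = self.words[self.pos];
        self.pos += 1;
        match word.parse::<i64>() {
            Ok(n) => Expr::Int(n),
            Err(_) => {
                // Recover: report the problem, emit a placeholder, continue.
                self.errors
                    .push(format!("expected integer, found `{}`", word));
                Expr::Error
            }
        }
    }

    // Parses the whole input, never stopping at the first error.
    fn parse_all(&mut self) -> Vec<Expr> {
        let mut exprs = Vec::new();
        while self.pos < self.words.len() {
            exprs.push(self.parse_expr());
        }
        exprs
    }
}
```

Parsing `"1 oops 3"` this way produces three nodes, the middle one being `Expr::Error`, along with one recorded diagnostic. Later stages can still look at the nodes on either side of the bad one, so a single mistake does not hide every other error in the input.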

### `AST` lowering

Next the `AST` is converted into [High-Level Intermediate Representation
(`HIR`)][hir], a more compiler-friendly representation of the `AST`. This process
is called "lowering" and involves a lot of desugaring (the expansion and
formalizing of shortened or abbreviated syntax constructs) of things like loops
and `async fn`.
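
As an illustration of what desugaring means, the two functions below compute the same sum. The first uses a `for` loop. The second is a hand-written approximation of what that loop stands for: a plain `loop` that calls `next()` on an iterator and matches on the result. The function names are made up for this example, and the code the compiler actually produces during lowering differs in detail.

```rust
/// Sums a slice using ordinary `for` loop syntax.
fn sum_with_for(values: &[i32]) -> i32 {
    let mut total = 0;
    for v in values {
        total += *v;
    }
    total
}

/// Roughly what the `for` loop above means once the sugar is removed:
/// obtain an iterator, then repeatedly call `next()` until it yields `None`.
fn sum_desugared(values: &[i32]) -> i32 {
    let mut total = 0;
    let mut iter = IntoIterator::into_iter(values);
    loop {
        match iter.next() {
            Some(v) => total += *v,
            None => break,
        }
    }
    total
}
```

Both functions return the same result for any input. Only the surface syntax differs, which is why later compiler stages only need to understand `loop` and `match`.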

We then use the `HIR` to do [*type inference*] (the process of automatic
detection of the type of an expression), [*trait solving*] (the process of
pairing up an impl with each reference to a `trait`), and [*type checking*]. Type
checking is the process of converting the types found in the `HIR` ([`hir::Ty`]),
which represent what the user wrote, into the internal representation used by
the compiler ([`Ty<'tcx>`]). It's called type checking because the information
is used to verify the type safety, correctness and coherence of the types used
in the program.
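
From the user's side, type inference and trait solving look something like the small sketch below. The `Shape` trait and the `Square` and `Circle` types are invented for this example. The comments describe what the compiler has to work out, not how it does so internally.

```rust
trait Shape {
    fn area(&self) -> f64;
}

struct Square(f64);
struct Circle(f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

impl Shape for Circle {
    fn area(&self) -> f64 {
        std::f64::consts::PI * self.0 * self.0
    }
}

fn total_area() -> f64 {
    // Type inference: there is no annotation here. The element type is
    // determined to be `f64` from the values pushed below.
    let mut areas = Vec::new();

    // Trait solving: each `.area()` call has to be paired with the matching
    // impl (`impl Shape for Square`, then `impl Shape for Circle`).
    areas.push(Square(2.0).area());
    areas.push(Circle(1.0).area());

    // Type checking: the sum must agree with the declared return type `f64`.
    areas.iter().sum()
}
```

If one of the pushes added a `String` instead, inference would find conflicting requirements for the element type and the program would be rejected.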

### `MIR` lowering

The `HIR` is further lowered to `MIR` (used for [borrow checking]). This is
done by first constructing the `THIR` (an even more desugared `HIR`, used for
pattern and exhaustiveness checking), which is then converted into `MIR`.
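
The two checks mentioned above are easiest to see from the user's side. In the sketch below (the `Signal` type and both function names are made up for illustration), the first function only compiles because its `match` covers every variant. The second only compiles because the shared borrow is no longer in use when the vector is mutated.

```rust
enum Signal {
    Red,
    Yellow,
    Green,
}

// Exhaustiveness checking: removing any one arm makes this a compile error,
// because the `match` would no longer cover every `Signal` variant.
fn wait_seconds(signal: &Signal) -> u32 {
    match signal {
        Signal::Red => 30,
        Signal::Yellow => 5,
        Signal::Green => 0,
    }
}

// Borrow checking: the shared borrow held by `first` is last used when `len`
// is computed, so the later mutation of `words` is accepted.
fn push_after_read(words: &mut Vec<String>) -> usize {
    let first = words.first();
    let len = first.map_or(0, |w| w.len());
    words.push(String::from("extra"));
    len
}
```

Using `first` again after the `push` would turn the second function into a borrow error, since the shared borrow would then still be live at the point of mutation.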

We do [many optimizations on the MIR][mir-opt] because it is still generic, which
improves the code we generate later and also speeds up compilation. Some
optimizations are easier to do at the `MIR` level than at the `LLVM-IR` level. For
example, LLVM doesn't seem to be able to optimize the pattern that the
[`simplify_try`] `MIR`-opt looks for.

Rust code is also [_monomorphized_] during code generation, which means making
copies of all the generic code with the type parameters replaced by concrete
types. To do this, we need to collect a list of what concrete types to generate
code for. This is called _monomorphization collection_ and it happens at the
`MIR` level.
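
To picture what monomorphization produces, compare a generic function with the copies one would write by hand for each concrete type it is used with. The names `largest`, `largest_i32`, and `largest_f64` are made up for this sketch. The compiler generates its copies internally rather than as source code like this.

```rust
/// Generic version: a single definition in the source.
fn largest<T: PartialOrd + Copy>(items: &[T]) -> Option<T> {
    let mut iter = items.iter().copied();
    let mut best = iter.next()?;
    for item in iter {
        if item > best {
            best = item;
        }
    }
    Some(best)
}

/// Hand-written stand-in for the copy generated when `T = i32`.
fn largest_i32(items: &[i32]) -> Option<i32> {
    let mut iter = items.iter().copied();
    let mut best = iter.next()?;
    for item in iter {
        if item > best {
            best = item;
        }
    }
    Some(best)
}

/// Hand-written stand-in for the copy generated when `T = f64`.
fn largest_f64(items: &[f64]) -> Option<f64> {
    let mut iter = items.iter().copied();
    let mut best = iter.next()?;
    for item in iter {
        if item > best {
            best = item;
        }
    }
    Some(best)
}
```

If a program calls `largest` only with `i32` and `f64` slices, then those two types are what collection would need to find for this function. No copy is needed for any type that the function is never used with.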


### Code generation

We then begin what is simply called _code generation_ or _codegen_. The [code
generation stage][codegen] is when higher-level representations of source are
turned into an executable binary. Since `rustc` uses LLVM for code generation,
the first step is to convert the `MIR` to `LLVM-IR`. This is where the `MIR` is
actually monomorphized. The `LLVM-IR` is passed to LLVM, which performs many more
optimizations on it and emits machine code. This is basically assembly code
with additional low-level types and annotations added (e.g. an ELF object or
`WASM`). The different libraries/binaries are then linked together to produce
the final binary.


## How it does it

Now that we have a high-level view of what the compiler does to your code,
let's take a high-level view of _how_ it does all that stuff. There are a lot
of constraints and conflicting goals that the compiler needs to
satisfy/optimize for. For example,
