This is a performance and memory optimization in which we allocate the values in
a special allocator called an
_[arena]_. Then, we pass
around references to the values allocated in the arena. This allows us to make
sure that identical values (e.g. types in your program) are only allocated once
and can be compared cheaply by comparing pointers. Many of the intermediate
representations are interned.
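
To make the idea concrete, here is a minimal sketch of interning. The `Interner` type is hypothetical and is not how `rustc` implements it: it leaks each allocation with `Box::leak` as a stand-in for a real arena, and it only handles strings. It does show the two properties described above: equal values are allocated once, and equality can be checked by comparing pointers.

```rust
use std::collections::HashSet;

/// A toy string interner (hypothetical; rustc's real interners are backed by
/// arenas and are considerably more elaborate).
struct Interner {
    // Each distinct value is stored exactly once. Leaking the allocation
    // stands in for arena allocation: the value lives for the whole program.
    seen: HashSet<&'static str>,
}

impl Interner {
    fn new() -> Self {
        Interner { seen: HashSet::new() }
    }

    /// Returns the canonical reference for `value`, allocating only if an
    /// equal value has not been interned before.
    fn intern(&mut self, value: &str) -> &'static str {
        if let Some(&existing) = self.seen.get(value) {
            return existing;
        }
        let stored: &'static str = Box::leak(value.to_owned().into_boxed_str());
        self.seen.insert(stored);
        stored
    }
}

fn main() {
    let mut interner = Interner::new();
    let a = interner.intern("Vec<u32>");
    let b = interner.intern(&String::from("Vec<u32>"));
    let c = interner.intern("Vec<u64>");

    // Equal values share one allocation, so equality is a pointer comparison.
    assert!(std::ptr::eq(a, b));
    assert!(!std::ptr::eq(a, c));
    println!("interned {} distinct values", interner.seen.len());
}
```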


### Queries

The first big implementation choice is Rust's use of the _query_ system in its
compiler. The Rust compiler _is not_ organized as a series of passes over the
code which execute sequentially. Instead, it is organized around queries that
are evaluated on demand. This design makes incremental compilation possible:
if the user makes a change to their program and recompiles, we want to do as
little redundant work as possible to output the new binary.

In `rustc`, all the major steps above are organized as a bunch of queries that
call each other. For example, there is a query to ask for the type of something
and another to ask for the optimized `MIR` of a function. These queries can call
each other and are all tracked through the query system. The results of the
queries are cached on disk so that the compiler can tell which queries' results
changed from the last compilation and only redo those. This is how incremental
compilation works.
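
The following toy sketch illustrates the shape of this design: queries that call each other and memoize their results. It is not `rustc`'s query system. The `Ctxt` struct and its fields are hypothetical, the method names merely echo real query names, results are plain strings, and the cache lives only in memory rather than on disk.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the compiler context that owns the query caches.
struct Ctxt {
    sources: HashMap<u32, String>,    // input: item id -> pretend source text
    type_cache: HashMap<u32, String>, // memoized results of `type_of`
    mir_cache: HashMap<u32, String>,  // memoized results of `optimized_mir`
    executions: u32,                  // how many query bodies actually ran
}

impl Ctxt {
    fn new() -> Self {
        Ctxt {
            sources: HashMap::new(),
            type_cache: HashMap::new(),
            mir_cache: HashMap::new(),
            executions: 0,
        }
    }

    /// Toy query: compute (or look up) the "type" of an item.
    fn type_of(&mut self, id: u32) -> String {
        if let Some(cached) = self.type_cache.get(&id) {
            return cached.clone();
        }
        self.executions += 1;
        let result = format!("type({})", self.sources[&id]);
        self.type_cache.insert(id, result.clone());
        result
    }

    /// Toy query that depends on another query.
    fn optimized_mir(&mut self, id: u32) -> String {
        if let Some(cached) = self.mir_cache.get(&id) {
            return cached.clone();
        }
        self.executions += 1;
        let ty = self.type_of(id); // one query calling another
        let result = format!("mir[{}]", ty);
        self.mir_cache.insert(id, result.clone());
        result
    }
}

fn main() {
    let mut cx = Ctxt::new();
    cx.sources.insert(1, "fn f()".to_string());

    let first = cx.optimized_mir(1);
    let second = cx.optimized_mir(1); // served from the cache
    assert_eq!(first, second);
    // `optimized_mir` and `type_of` each ran exactly once.
    assert_eq!(cx.executions, 2);
}
```

Asking for the same result twice runs each query body only once. The real system additionally tracks which queries depended on which, so that it can decide what to redo after a change.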

In principle, for the query-fied steps, we do each of the above for each item
individually. For example, we will take the `HIR` for a function and use queries
to ask for the `LLVM-IR` for that HIR. This drives the generation of optimized
`MIR`, which drives the borrow checker, which drives the generation of `MIR`, and
so on.

... except that this is very over-simplified. In fact, some queries are not
cached on disk, and some parts of the compiler have to run for all code anyway
for correctness even if the code is dead code (e.g. the borrow checker). For
example, [currently the `mir_borrowck` query is first executed on all functions
of a crate.][passes] Then the codegen backend invokes the
`collect_and_partition_mono_items` query, which first recursively requests the
`optimized_mir` for all reachable functions (which in turn runs `mir_borrowck`
for each of those functions) and then creates codegen units. This kind of split
will need to remain so that unreachable functions still have their errors emitted.
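
A rough sketch of why the split matters, using an invented toy model: `Func`, `check_all`, `reachable_from`, and `sample_crate` are all hypothetical and only mimic the two phases. Checking every function finds the error in a function that a reachability walk from the entry point would never visit.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical toy function record: the functions it calls, and whether a
/// pretend borrow check would reject it.
struct Func {
    callees: Vec<&'static str>,
    has_borrow_error: bool,
}

/// Phase 1: check every function in the crate, reachable or not.
fn check_all(funcs: &BTreeMap<&'static str, Func>) -> Vec<&'static str> {
    funcs
        .iter()
        .filter(|(_, f)| f.has_borrow_error)
        .map(|(name, _)| *name)
        .collect()
}

/// Phase 2: walk the call graph from the entry point to find the functions
/// that would need code generated for them.
fn reachable_from(
    funcs: &BTreeMap<&'static str, Func>,
    entry: &'static str,
) -> BTreeSet<&'static str> {
    let mut seen = BTreeSet::new();
    let mut stack = vec![entry];
    while let Some(name) = stack.pop() {
        if seen.insert(name) {
            if let Some(f) = funcs.get(name) {
                stack.extend(f.callees.iter().copied());
            }
        }
    }
    seen
}

/// A three-function toy crate: `unused` is dead code with an error in it.
fn sample_crate() -> BTreeMap<&'static str, Func> {
    let mut funcs = BTreeMap::new();
    funcs.insert("main", Func { callees: vec!["helper"], has_borrow_error: false });
    funcs.insert("helper", Func { callees: vec![], has_borrow_error: false });
    funcs.insert("unused", Func { callees: vec![], has_borrow_error: true });
    funcs
}

fn main() {
    let funcs = sample_crate();
    let errors = check_all(&funcs);
    let reachable = reachable_from(&funcs, "main");

    // The error is reported even though `unused` is never reached.
    assert_eq!(errors, vec!["unused"]);
    assert!(!reachable.contains("unused"));
}
```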


Moreover, the compiler wasn't originally built to use a query system; the query
system has been retrofitted into the compiler, so parts of it are not query-fied
yet. Also, LLVM isn't our code, so that isn't query-fied either. The plan is to
eventually query-fy all of the steps listed in the previous section,
but as of <!-- date-check --> November 2022, only the steps between `HIR` and
`LLVM-IR` are query-fied. That is, lexing, parsing, name resolution, and macro
expansion are done all at once for the whole program.

One other thing to mention here is the all-important "typing context",
[`TyCtxt`], which is a giant struct that is at the center of all things.
(Note that the name is mostly historic. This is _not_ a "typing context" in the
sense of `Γ` or `Δ` from type theory. The name is retained because that's what
the struct is called in the source code.) All
queries are defined as methods on the [`TyCtxt`] type, and the in-memory query
cache is stored there too. In the code, there is usually a variable called
`tcx` which is a handle on the typing context. You will also see lifetimes with
the name `'tcx`, which means that something is tied to the lifetime of the
[`TyCtxt`] (usually it is stored or interned there).
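
As a loose illustration of what such a lifetime expresses, consider the sketch below. `Context`, `Handle`, and `TypeRef` are hypothetical stand-ins, not the real definitions: a small copyable handle is passed around by value, queries are methods on it, and the results borrow from data owned by the context, so they cannot outlive it.

```rust
/// Hypothetical stand-in for the data owned by the compiler context.
struct Context {
    type_names: Vec<String>,
}

/// A cheap, copyable handle, loosely analogous to how `tcx` is passed around.
#[derive(Clone, Copy)]
struct Handle<'tcx> {
    cx: &'tcx Context,
}

/// A value tied to the context's lifetime: it borrows data stored there.
#[derive(Clone, Copy, Debug, PartialEq)]
struct TypeRef<'tcx> {
    name: &'tcx str,
}

impl<'tcx> Handle<'tcx> {
    /// A toy "query" defined as a method on the handle. The result carries
    /// `'tcx` because it points into the context.
    fn type_at(self, index: usize) -> TypeRef<'tcx> {
        TypeRef { name: &self.cx.type_names[index] }
    }
}

/// Code that needs compiler data simply takes the handle as a parameter.
fn describe<'tcx>(tcx: Handle<'tcx>, index: usize) -> String {
    let ty = tcx.type_at(index);
    format!("item {} has type {}", index, ty.name)
}

fn main() {
    let context = Context {
        type_names: vec!["u32".to_string(), "bool".to_string()],
    };
    let tcx = Handle { cx: &context };

    // The handle is `Copy`, so it can be passed by value repeatedly.
    println!("{}", describe(tcx, 0));
    println!("{}", describe(tcx, 1));
}
```

If `context` were dropped while a `TypeRef<'tcx>` was still in use, the borrow checker would reject the program, which is the guarantee a `'tcx` annotation conveys.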


### `ty::Ty`

Types are really important in Rust, and they form the core of a lot of compiler
analyses. The main type (in the compiler) that represents types (in the user's
program) is [`rustc_middle::ty::Ty`][ty]. This is so important that we have a whole chapter
on [`ty::Ty`][ty], but for now, we just want to mention that it exists and is the way
`rustc` represents types!

Also note that the [`rustc_middle::ty`] module defines the [`TyCtxt`] struct we mentioned before.
