Queries: Demand-Driven Compilation in Rust

# Queries: demand-driven compilation

As described in [the high-level overview of the compiler][hl], the Rust compiler
is still (as of July 2021) transitioning from a traditional "pass-based"
setup to a "demand-driven" system. The compiler query system is the key to
rustc's demand-driven organization.

The idea is pretty simple. Instead of entirely independent passes
(parsing, type-checking, etc.), a set of function-like *queries*
compute information about the input source. For example,
there is a query called `type_of` that, given the [`DefId`] of
some item, will compute the type of that item and return it to you.

Query execution is *memoized*. The first time you invoke a
query, it will go do the computation, but the next time, the result is
returned from a hashtable. Moreover, query execution fits nicely into
*incremental computation*; the idea is roughly that, when you invoke a
query, the result *may* be returned to you by loading stored data
from disk.[^incr-comp-detail]

Eventually, we want the entire compiler control-flow to be query driven.
There will effectively be one top-level query (`compile`) that will run
compilation on a crate; this will in turn demand information about that
crate, starting from the *end*. For example:

- The `compile` query might demand to get a list of codegen-units
  (i.e. modules that need to be compiled by LLVM).
- But computing the list of codegen-units would invoke some subquery
  that returns the list of all modules defined in the Rust source.
- That query in turn would invoke something asking for the HIR.
- This keeps going further and further back until we wind up doing the
  actual parsing.

Although this vision is not fully realized, large sections of the
compiler (for example, generating [MIR](./mir/index.md)) currently work exactly like this.

[^incr-comp-detail]: The chapter on incremental compilation in detail gives a more
in-depth description of what queries are and how they work.
If you intend to write a query of your own, this is a good read.

## Invoking queries

Invoking a query is simple.
The [`TyCtxt`] ("type context") struct offers a method
for each defined query. For example, to invoke the `type_of`
query, you would just do this:

```rust,ignore
let ty = tcx.type_of(some_def_id);
```

## How the compiler executes a query

So you may be wondering what happens when you invoke a query
method. The answer is that, for each query, the compiler maintains a
cache. If your query has already been executed, the answer is
simple: we clone the return value out of the cache and return it
(therefore, you should try to ensure that the return types of queries
are cheaply cloneable; insert an `Rc` if necessary).

### Providers

If, however, the query is *not* in the cache, then the compiler will
try to find a suitable **provider**. A provider is a function that has
been defined and linked into the compiler somewhere that contains the
code to compute the result of the query.

**Providers are defined per-crate.** The compiler maintains,
internally, a table of providers for every crate, at least
conceptually. Right now, there are really two sets: the providers for
queries about the **local crate** (that is, the one being compiled)
and providers for queries about **external crates** (that is,
dependencies of the local crate). Note that what determines the crate
that a query is targeting is not the *kind* of query, but the *key*.
For example, when you invoke `tcx.type_of(def_id)`, that could be a
local query or an external query, depending on what crate the `def_id`
is referring to (see the [`self::keys::Key`][Key] trait for more
information on how that works).

Providers always have the same signature:

```rust,ignore
fn provider<'tcx>(
    tcx: TyCtxt<'tcx>,
    key: QUERY_KEY,
) -> QUERY_RESULT {
    ...
}
```

Providers take two arguments: the `tcx` and the query key.
They return the result of the query.

### How providers are set up

When the tcx is created, it is given the providers by its creator using
the [`Providers`][providers_struct] struct. This struct is generated by

The Rust compiler is transitioning to a demand-driven system using queries. Queries are function-like and compute information about the source code, with memoization for efficiency and support for incremental computation. The compiler aims to use queries for all control flow, starting from the `compile` query. Invoking a query uses the `TyCtxt` struct. The compiler caches query results and uses providers (functions that compute query results) when a result is not cached. Providers are defined per-crate and are selected based on the query key.
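
The caching and provider-selection behavior summarized above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not how rustc implements queries: `ToyTcx`, `ToyDefId`, `TypeOfProvider`, `LOCAL_CRATE`, the two `provide_*` functions, and `demo` are all hypothetical names invented here, and a `String` stands in for the real type representation. It shows one memoized query (`type_of`) whose provider is picked from the *key* (which crate the id belongs to) rather than from the kind of query.

```rust
use std::cell::{Cell, RefCell};
use std::collections::HashMap;

/// Hypothetical marker for "the crate being compiled" in this sketch.
const LOCAL_CRATE: u32 = 0;

/// Hypothetical stand-in for a `DefId`: a crate number plus an item index.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct ToyDefId {
    krate: u32,
    index: u32,
}

/// Providers share one shape: they take the context and the key,
/// and return the query result.
type TypeOfProvider = fn(&ToyTcx, ToyDefId) -> String;

/// Toy "type context": two provider slots and a cache for one query.
struct ToyTcx {
    local_type_of: TypeOfProvider,
    extern_type_of: TypeOfProvider,
    type_of_cache: RefCell<HashMap<ToyDefId, String>>,
    /// Counts provider invocations so memoization is observable.
    provider_calls: Cell<u32>,
}

impl ToyTcx {
    fn new(local_type_of: TypeOfProvider, extern_type_of: TypeOfProvider) -> Self {
        ToyTcx {
            local_type_of,
            extern_type_of,
            type_of_cache: RefCell::new(HashMap::new()),
            provider_calls: Cell::new(0),
        }
    }

    /// The query method: return a cached clone if present, otherwise
    /// run the provider chosen by the key and remember its result.
    fn type_of(&self, key: ToyDefId) -> String {
        // The cache borrow ends with this statement, so a provider is
        // free to invoke further queries on the same context.
        let cached = self.type_of_cache.borrow().get(&key).cloned();
        if let Some(hit) = cached {
            return hit;
        }

        // The key, not the kind of query, decides which provider runs.
        let provider = if key.krate == LOCAL_CRATE {
            self.local_type_of
        } else {
            self.extern_type_of
        };

        self.provider_calls.set(self.provider_calls.get() + 1);
        let result = provider(self, key);
        self.type_of_cache.borrow_mut().insert(key, result.clone());
        result
    }
}

/// Hypothetical provider for items in the local crate.
fn provide_local_type_of(_tcx: &ToyTcx, key: ToyDefId) -> String {
    format!("local type #{}", key.index)
}

/// Hypothetical provider for items in external crates.
fn provide_extern_type_of(_tcx: &ToyTcx, key: ToyDefId) -> String {
    format!("extern type #{} from crate {}", key.index, key.krate)
}

/// Runs three queries and reports how many times a provider executed.
fn demo() -> u32 {
    let tcx = ToyTcx::new(provide_local_type_of, provide_extern_type_of);
    let local = ToyDefId { krate: LOCAL_CRATE, index: 7 };
    let external = ToyDefId { krate: 3, index: 1 };

    let _ = tcx.type_of(local); // computed by the local provider
    let _ = tcx.type_of(local); // served from the cache
    let _ = tcx.type_of(external); // computed by the external provider

    tcx.provider_calls.get()
}
```

Calling `demo()` from a `main` function runs three queries but only two provider invocations, because the repeated local query is answered from the cache. Returning a clone from the cache is also why the text recommends cheaply cloneable result types: every cache hit pays for one clone.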