Macro Expansion and AST Integration in Rust

# Macro expansion  Rust has a very powerful macro system. In the previous chapter, we saw how the parser sets aside macros to be expanded (using temporary [placeholders]). This chapter is about the process of expanding those macros iteratively until we have a complete [*Abstract Syntax Tree* (AST)][ast] for our crate with no unexpanded macros (or a compile error). First, we discuss the algorithm that expands and integrates macro output into ASTs. Next, we take a look at how hygiene data is collected. Finally, we look at the specifics of expanding different types of macros. Many of the algorithms and data structures described below are in [`rustc_expand`], with fundamental data structures in [`rustc_expand::base`][base]. Also of note, `cfg` and `cfg_attr` are treated specially from other macros, and are handled in [`rustc_expand::config`][cfg]. ## Expansion and AST Integration Firstly, expansion happens at the crate level. Given a raw source code for a crate, the compiler will produce a massive AST with all macros expanded, all modules inlined, etc. The primary entry point for this process is the [`MacroExpander::fully_expand_fragment`][fef] method. With few exceptions, we use this method on the whole crate (see ["Eager Expansion"](#eager-expansion) below for more detailed discussion of edge case expansion issues). At a high level, [`fully_expand_fragment`][fef] works in iterations. We keep a queue of unresolved macro invocations (i.e. macros we haven't found the definition of yet). We repeatedly try to pick a macro from the queue, resolve it, expand it, and integrate it back. If we can't make progress in an iteration, this represents a compile error. Here is the [algorithm][original]: 1. Initialize a `queue` of unresolved macros. 2. Repeat until `queue` is empty (or we make no progress, which is an error): 1. [Resolve](./name-resolution.md) imports in our partially built crate as much as possible. 2. Collect as many macro [`Invocation`s][inv] as possible from our partially built crate (`fn`-like, attributes, derives) and add them to the queue. 3. Dequeue the first element and attempt to resolve it. 4. If it's resolved: 1. Run the macro's expander function that consumes a [`TokenStream`] or AST and produces a [`TokenStream`] or [`AstFragment`] (depending on the macro kind). (A [`TokenStream`] is a collection of [`TokenTree`s][tt], each of which are a token (punctuation, identifier, or literal) or a delimited group (anything inside `()`/`[]`/`{}`)). - At this point, we know everything about the macro itself and can call [`set_expn_data`] to fill in its properties in the global data; that is the [hygiene] data associated with [`ExpnId`] (see [Hygiene][hybelow] below). 2. Integrate that piece of AST into the currently-existing though partially-built AST. This is essentially where the "token-like mass" becomes a proper set-in-stone AST with side-tables. It happens as follows: - If the macro produces tokens (e.g. a proc macro), we parse into an AST, which may produce parse errors. - During expansion, we create [`SyntaxContext`]s (hierarchy 2) (see [Hygiene][hybelow] below). - These three passes happen one after another on every AST fragment freshly expanded from a macro: - [`NodeId`]s are assigned by [`InvocationCollector`]. This also collects new macro calls from this new AST piece and adds them to the queue. - ["Def paths"][defpath] are created and [`DefId`]s are assigned to them by [`DefCollector`]. - Names are put into modules (from the resolver's point of view) by [`BuildReducedGraphVisitor`]. 3. After expanding a single macro and integrating its output, continue to the next iteration of [`fully_expand_fragment`][fef].

This section describes the macro expansion process in Rust, which iteratively expands macros until a complete Abstract Syntax Tree (AST) is formed. The process involves resolving macro invocations, expanding them using expander functions, and integrating the resulting AST fragments back into the main AST. The process also entails assigning NodeIds, creating Def paths and DefIds, and putting names into modules.