Expansion Order and Macro Definition Hierarchies in Rust's Hygiene System

attached, such as some desugared syntax (non-macro-expanded nodes are considered to just have the "root" context, as described below). Throughout the compiler, we use [`rustc_span::Span`s][span] to refer to code locations. This struct also has hygiene information attached to it, as we will see later.

Because macro invocations and definitions can be nested, the syntax context of a node must be a hierarchy. For example, if we expand a macro and there is another macro invocation or definition in the generated output, then the syntax context should reflect the nesting.

However, it turns out that there are actually a few types of context we may want to track for different purposes. Thus, there is not just one but _three_ expansion hierarchies that together comprise the hygiene information for a crate.

All of these hierarchies need some sort of "macro ID" to identify individual elements in the chain of expansions. This ID is [`ExpnId`]. All macros receive an integer ID, assigned sequentially starting from 0 as we discover new macro calls. All hierarchies start at [`ExpnId::root`][rootid], which is its own parent.

The [`rustc_span::hygiene`][hy] module contains all of the hygiene-related algorithms (with the exception of some hacks in [`Resolver::resolve_crate_root`][hacks]) and structures related to hygiene and expansion that are kept in global data.

The actual hierarchies are stored in [`HygieneData`][hd]. This is a global piece of data containing hygiene and expansion info that can be accessed from any [`Ident`] without any context.

### The Expansion Order Hierarchy

The first hierarchy tracks the order of expansions, i.e., when a macro invocation is in the output of another macro.

Here, the children in the hierarchy will be the "innermost" tokens. The [`ExpnData`] struct itself contains a subset of properties from both the macro definition and the macro call, available through global data. [`ExpnData::parent`][edp] tracks the child-to-parent link in this hierarchy.
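The child-to-parent links of the expansion order hierarchy can be pictured with a small standalone model. This is only a sketch: `ToyExpnId`, `ToyExpnData`, `ToyHygieneData` and their methods are hypothetical names invented for illustration and are not the compiler's types or API. It assumes nothing beyond what is stated above: IDs are handed out in discovery order starting from 0, and the root is its own parent. The demo registers a hypothetical macro `foo` whose output contains a `println!` invocation.

```rust
/// Toy stand-in for `ExpnId`: an index into a table of expansions.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ToyExpnId(usize);

/// Toy stand-in for the slice of `ExpnData` that matters here.
struct ToyExpnData {
    /// Child-to-parent link in the expansion order hierarchy.
    parent: ToyExpnId,
    /// Name of the macro, kept only for printing.
    name: &'static str,
}

/// Toy stand-in for the global table (hypothetical, not `HygieneData`).
struct ToyHygieneData {
    expns: Vec<ToyExpnData>,
}

impl ToyHygieneData {
    const ROOT: ToyExpnId = ToyExpnId(0);

    fn new() -> Self {
        // The root expansion gets ID 0 and is its own parent.
        ToyHygieneData {
            expns: vec![ToyExpnData { parent: Self::ROOT, name: "root" }],
        }
    }

    /// Registers a newly discovered macro call; IDs grow in discovery order.
    fn fresh_expn(&mut self, parent: ToyExpnId, name: &'static str) -> ToyExpnId {
        let id = ToyExpnId(self.expns.len());
        self.expns.push(ToyExpnData { parent, name });
        id
    }

    /// Follows parent links up to the root and returns the chain root-first.
    fn chain(&self, mut id: ToyExpnId) -> Vec<&'static str> {
        let mut names = Vec::new();
        loop {
            names.push(self.expns[id.0].name);
            if id == Self::ROOT {
                break;
            }
            id = self.expns[id.0].parent;
        }
        names.reverse();
        names
    }
}

/// Models a macro `foo` whose output contains a `println!` invocation.
fn expansion_chain_demo() -> Vec<&'static str> {
    let mut data = ToyHygieneData::new();
    let foo = data.fresh_expn(ToyHygieneData::ROOT, "foo");
    let println = data.fresh_expn(foo, "println");
    data.chain(println)
}
```

Calling `expansion_chain_demo()` walks the links from the innermost expansion back to the root, so the returned chain lists `root`, then `foo`, then `println`.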
For example:

```rust,ignore
macro_rules! foo { () => { println!(); } }

fn main() { foo!(); }
```

In this code, the AST nodes that are finally generated would have hierarchy `root -> id(foo) -> id(println)`.

### The Macro Definition Hierarchy

The second hierarchy tracks the order of macro definitions, i.e., when expanding one macro reveals another macro definition in its output. This one is a bit tricky and more complex than the other two hierarchies.

[`SyntaxContext`][sc] represents a whole chain in this hierarchy via an ID. [`SyntaxContextData`][scd] contains data associated with the given [`SyntaxContext`][sc]; mostly it is a cache for results of filtering that chain in different ways. [`SyntaxContextData::parent`][scdp] is the child-to-parent link here, and [`SyntaxContextData::outer_expns`][scdoe] are individual elements in the chain. The "chaining-operator" is [`SyntaxContext::apply_mark`][am] in compiler code.

A [`Span`][span], mentioned above, is actually just a compact representation of a code location and [`SyntaxContext`][sc]. Likewise, an [`Ident`] is just an interned [`Symbol`] + `Span` (i.e. an interned string + hygiene data).

For built-in macros, we use the context: [`SyntaxContext::empty().apply_mark(expn_id)`], and such macros are considered to be defined at the hierarchy root. We do the same for `proc macro`s because we haven't implemented cross-crate hygiene yet.

If a token had context `X` before being produced by a macro, then after being produced by the macro it has context `X -> macro_id`. Here are some examples:

Example 0:

```rust,ignore
macro m() { ident }

m!();
```

Here `ident`, which initially has context [`SyntaxContext::root`][scr], has context `ROOT -> id(m)` after it's produced by `m`.

Example 1:

```rust,ignore
macro m() { macro n() { ident } }

m!();
n!();
```

In this example the `ident` has context `ROOT` initially, then `ROOT -> id(m)` after the first expansion, then `ROOT -> id(m) -> id(n)`.

Example 2:

This section covers two of the three expansion hierarchies that make up Rust's hygiene system: the expansion order hierarchy and the macro definition hierarchy. The expansion order hierarchy tracks which macro invocation produced which, while the macro definition hierarchy tracks the order in which macro definitions are revealed during expansion. It also explains the roles of `SyntaxContext`, `SyntaxContextData`, `Span`, and `Ident` in maintaining hygiene information.
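The rule "context `X` becomes `X -> macro_id`" from the macro definition hierarchy can also be sketched as a toy model. The names `ToyCtxt`, `ToyCtxtData`, `ToyContexts`, and the toy `apply_mark` below are hypothetical. They only mimic the chaining idea and leave out everything else the real `SyntaxContext::apply_mark` has to handle. Reusing an existing context when the same mark is applied twice is a simplification chosen for this sketch, not a description of the compiler's tables.

```rust
use std::collections::HashMap;

/// Toy stand-in for `SyntaxContext`: an ID naming a whole chain of marks.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct ToyCtxt(usize);

/// Toy stand-in for `SyntaxContextData`.
struct ToyCtxtData {
    /// Child-to-parent link: the chain without its last mark.
    parent: ToyCtxt,
    /// The last mark in the chain (a macro name in this sketch).
    outer_expn: &'static str,
}

/// Hypothetical table of contexts; not the compiler's data structure.
struct ToyContexts {
    data: Vec<ToyCtxtData>,
    interned: HashMap<(ToyCtxt, &'static str), ToyCtxt>,
}

impl ToyContexts {
    const ROOT: ToyCtxt = ToyCtxt(0);

    fn new() -> Self {
        ToyContexts {
            data: vec![ToyCtxtData { parent: Self::ROOT, outer_expn: "ROOT" }],
            interned: HashMap::new(),
        }
    }

    /// The "chaining operator": turns context `X` into `X -> macro_id`.
    fn apply_mark(&mut self, ctxt: ToyCtxt, macro_id: &'static str) -> ToyCtxt {
        if let Some(&existing) = self.interned.get(&(ctxt, macro_id)) {
            return existing;
        }
        let new = ToyCtxt(self.data.len());
        self.data.push(ToyCtxtData { parent: ctxt, outer_expn: macro_id });
        self.interned.insert((ctxt, macro_id), new);
        new
    }

    /// Renders a context in the `ROOT -> id(m) -> id(n)` notation.
    fn render(&self, mut ctxt: ToyCtxt) -> String {
        let mut parts = Vec::new();
        while ctxt != Self::ROOT {
            parts.push(format!("id({})", self.data[ctxt.0].outer_expn));
            ctxt = self.data[ctxt.0].parent;
        }
        parts.push("ROOT".to_string());
        parts.reverse();
        parts.join(" -> ")
    }
}

/// Replays Example 1: `ident` is produced by `m`, then by `n`.
fn example_1_contexts() -> (String, String) {
    let mut cx = ToyContexts::new();
    let after_m = cx.apply_mark(ToyContexts::ROOT, "m");
    let after_n = cx.apply_mark(after_m, "n");
    (cx.render(after_m), cx.render(after_n))
}
```

In this model, `example_1_contexts()` yields the two contexts that Example 1 describes for `ident`: first `ROOT -> id(m)`, then `ROOT -> id(m) -> id(n)`.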