6th chunk of `src/macro-expansion.md`
- [`SyntaxExtension`] - a lowered macro representation. It contains the macro's
  expander function, which transforms a [`TokenStream`] or AST into another
  [`TokenStream`] or AST, plus some additional data such as stability or a list
  of unstable features allowed inside the macro.
- [`SyntaxExtensionKind`] - expander functions may have several different
  signatures (take one token stream, or two, or a piece of AST, etc.). This is
  an `enum` that lists them.
- [`BangProcMacro`]/[`TTMacroExpander`]/[`AttrProcMacro`]/[`MultiItemModifier`] -
  `trait`s representing the expander function signatures.
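
The relationship between the kind `enum` and the expander `trait`s can be pictured with a deliberately tiny model. This is only a sketch under stated assumptions: every name below (`Tokens`, `ToyBangExpander`, `ToyAttrExpander`, `ToyExtensionKind`, `ToyExtension`, `run`) is hypothetical and much simpler than the real rustc types; it only shows the shape "one trait per signature, one enum variant per trait".

```rust
/// Toy token stream: just a list of token strings (an assumption for this sketch).
type Tokens = Vec<String>;

/// Signature 1: one token stream in, one out.
trait ToyBangExpander {
    fn expand(&self, input: Tokens) -> Tokens;
}

/// Signature 2: two token streams in (attribute arguments and the item), one out.
trait ToyAttrExpander {
    fn expand(&self, args: Tokens, item: Tokens) -> Tokens;
}

/// Hypothetical counterpart of a "kind" enum: one variant per expander signature.
enum ToyExtensionKind {
    Bang(Box<dyn ToyBangExpander>),
    Attr(Box<dyn ToyAttrExpander>),
}

/// Hypothetical counterpart of a lowered macro: the expander plus extra data.
struct ToyExtension {
    kind: ToyExtensionKind,
    allowed_unstable_features: Vec<&'static str>,
}

/// A toy expander that emits its input twice.
struct Twice;
impl ToyBangExpander for Twice {
    fn expand(&self, input: Tokens) -> Tokens {
        let mut out = input.clone();
        out.extend(input);
        out
    }
}

/// A toy expander that puts the attribute arguments in front of the item.
struct PrependArgs;
impl ToyAttrExpander for PrependArgs {
    fn expand(&self, mut args: Tokens, item: Tokens) -> Tokens {
        args.extend(item);
        args
    }
}

/// Splits a string on whitespace to build a toy token stream.
fn toks(src: &str) -> Tokens {
    src.split_whitespace().map(String::from).collect()
}

/// Dispatches on the kind; `second` is only used by the two-stream signature.
fn run(ext: &ToyExtension, first: Tokens, second: Tokens) -> Tokens {
    match &ext.kind {
        ToyExtensionKind::Bang(e) => e.expand(first),
        ToyExtensionKind::Attr(e) => e.expand(first, second),
    }
}
```

The caller only needs to know which variant it holds in order to supply the right number of inputs, which is the role the kind `enum` plays in the description above.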


## Macros By Example

MBEs have their own parser distinct from the Rust parser. When macros are
expanded, we may invoke the MBE parser to parse and expand a macro. The
MBE parser, in turn, may call the Rust parser when it needs to bind a
metavariable (e.g. `$my_expr`) while parsing the contents of a macro
invocation. The code for MBE expansion is in
[`compiler/rustc_expand/src/mbe/`][code_dir].

### Example

```rust,ignore
macro_rules! printer {
    (print $mvar:ident) => {
        println!("{}", $mvar);
    };
    (print twice $mvar:ident) => {
        println!("{}", $mvar);
        println!("{}", $mvar);
    };
}
```

Here `$mvar` is called a _metavariable_. Unlike normal variables, rather than
binding to a value _at runtime_, a metavariable binds _at compile time_ to a
tree of _tokens_. A _token_ is a single "unit" of the grammar, such as an
identifier (e.g. `foo`) or punctuation (e.g. `=>`). There are also other
special tokens, such as `EOF`, which indicates that there are no more
tokens. Paired parentheses-like characters (`(`...`)`, `[`...`]`, and
`{`...`}`) form _token trees_: each one includes the opening and closing
delimiters and all the tokens in between (Rust requires that parentheses-like
characters be balanced). Having macro expansion operate on token streams
rather than on the raw bytes of a source file abstracts away a lot of complexity.
The macro expander (and much of the rest of the compiler) doesn't consider
the exact line and column of some syntactic construct in the code; it considers
which constructs are used in the code. Using tokens allows us to care about
_what_ without worrying about _where_. For more information about tokens, see
the [Parsing][parsing] chapter of this book.
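
One way to observe token trees from ordinary Rust code is a small recursive macro that counts them. The sketch below is illustrative only: `count_tts!` and the helper functions are hypothetical names, not part of rustc. It relies on the `tt` fragment specifier, which matches either a single token or a whole delimited group.

```rust
/// Counts the token trees passed to the macro.
/// `tt` matches one token, or one delimited group including its contents.
macro_rules! count_tts {
    () => { 0usize };
    ($head:tt $($tail:tt)*) => { 1usize + count_tts!($($tail)*) };
}

/// Three identifiers are three token trees.
fn flat() -> usize {
    count_tts!(a b c)
}

/// `a`, `(b c)`, `[d, e]` and `{ f }` are four token trees:
/// each delimited group counts once, regardless of what it contains.
fn nested() -> usize {
    count_tts!(a (b c) [d, e] { f })
}

/// The punctuation `=>` is a single token for the purposes of MBE matching.
fn arrow() -> usize {
    count_tts!(=>)
}
```

The counts depend only on which tokens and groups appear, not on where they sit in the source text, which is the "what, not where" point made above.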

```rust,ignore
printer!(print foo); // `foo` is a variable
```

The process of expanding the macro invocation into the syntax tree
`println!("{}", foo)` and then expanding the syntax tree into a call to
`Display::fmt` is one common example of _macro expansion_.
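
To see both arms of the example macro in action, here is a minimal sketch. It assumes a hypothetical variant named `show!` that mirrors `printer!` but uses `format!` instead of `println!`, so that each expansion yields a `String` that can be inspected; the helper functions are likewise made up for illustration.

```rust
/// Hypothetical variant of `printer!`: same matchers, but each arm
/// produces a `String` rather than printing.
macro_rules! show {
    (print $mvar:ident) => {
        format!("{}", $mvar)
    };
    (print twice $mvar:ident) => {
        format!("{}\n{}", $mvar, $mvar)
    };
}

/// Matches the first arm: `$mvar` binds to the identifier token `foo`.
fn show_once() -> String {
    let foo = 42;
    show!(print foo)
}

/// The first arm fails here (after `twice` is taken as `$mvar`, `foo` is left
/// over), so the second arm is tried and matches.
fn show_twice() -> String {
    let foo = 42;
    show!(print twice foo)
}
```

Note that `$mvar` is bound to the identifier `foo` at compile time; the value `42` only comes into play when the expanded code runs.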

### The MBE parser

There are two parts to MBE expansion done by the macro parser:
  1. parsing the definition, and
  2. parsing the invocations.

We think of the MBE parser as a regex parser based on a nondeterministic
finite automaton (NFA), since it uses an algorithm similar in spirit to the
[Earley parsing algorithm](https://en.wikipedia.org/wiki/Earley_parser). The
macro parser is defined in
[`compiler/rustc_expand/src/mbe/macro_parser.rs`][code_mp].
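
To make the NFA analogy concrete, the following is a toy simulation, not the algorithm used in `macro_parser.rs`. It assumes a made-up pattern language (`Pat`) with literal tokens and "zero or more of this token", and it tracks the set of all pattern positions that are still possible after each input token. It records no metavariable bindings and knows nothing about token trees.

```rust
use std::collections::BTreeSet;

/// Toy pattern element (hypothetical; unrelated to rustc's matcher types).
enum Pat {
    /// Exactly this token.
    Tok(&'static str),
    /// Zero or more repetitions of this token.
    ZeroOrMore(&'static str),
}

/// Adds every position reachable without consuming input:
/// a `ZeroOrMore` element may always be skipped.
fn close(pattern: &[Pat], positions: &mut BTreeSet<usize>) {
    let mut work: Vec<usize> = positions.iter().copied().collect();
    while let Some(i) = work.pop() {
        if let Some(Pat::ZeroOrMore(_)) = pattern.get(i) {
            if positions.insert(i + 1) {
                work.push(i + 1);
            }
        }
    }
}

/// Returns true if `tokens` as a whole matches `pattern`.
fn matches(pattern: &[Pat], tokens: &[&str]) -> bool {
    // Start with position 0 plus everything reachable by skipping.
    let mut current: BTreeSet<usize> = BTreeSet::from([0]);
    close(pattern, &mut current);

    for tok in tokens {
        let mut next = BTreeSet::new();
        for &i in &current {
            match pattern.get(i) {
                // A literal consumes the token and advances.
                Some(Pat::Tok(t)) if t == tok => {
                    next.insert(i + 1);
                }
                // A repetition consumes the token and stays in place.
                Some(Pat::ZeroOrMore(t)) if t == tok => {
                    next.insert(i);
                }
                _ => {}
            }
        }
        close(pattern, &mut next);
        if next.is_empty() {
            return false; // no position can accept this token
        }
        current = next;
    }
    // Success means some thread reached the end of the pattern.
    current.contains(&pattern.len())
}
```

Keeping a set of positions, rather than committing to one and backtracking, is the sense in which this kind of matcher is "nondeterministic".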

The interface of the macro parser is as follows (this is slightly simplified):

```rust,ignore
fn parse_tt(
    &mut self,
    parser: &mut Cow<'_, Parser<'_>>,
    matcher: &[MatcherLoc]
) -> ParseResult
```

The macro parser uses the following items:

- `parser` is a reference to the state of a normal Rust parser,
  including the token stream and parsing session. The token stream is what we
  are about to ask the MBE parser to parse. We will consume the raw stream of
  tokens and output a binding of metavariables to corresponding token trees.
  The parsing session can be used to report parser errors.
- `matcher` is a sequence of [`MatcherLoc`]s that we want to match
  the token stream against. They're converted from token trees before matching.

In the analogy of a regex parser, the token stream is the input and we are
matching it against the pattern defined by `matcher`. Using our examples, the
token stream could be the stream of tokens containing the inside of the example

Title: Macros By Example (MBE) Parsing and Expansion
Summary
This section explains Macros By Example (MBE), detailing their parser, which is distinct from the Rust parser, and the process of macro expansion. It discusses metavariables, tokens, and token trees, emphasizing how MBEs operate on token streams rather than raw source code. The MBE parser is described as a nondeterministic finite automaton (NFA) based regex parser. The section outlines the two parts of MBE expansion: parsing the definition and parsing the invocations. It explains the interface of the macro parser, including the `parser` and `matcher` variables, and their roles in matching the token stream against the defined pattern.