Macro Invocation Parsing and Procedural Macros

To match non-terminals of kinds such as `ident`, `block`, `expr`, etc., the macro parser must call back to the normal Rust parser. Both the definition and invocation of macros are parsed using the parser in a process which is non-intuitively self-referential.

The code to parse macro _definitions_ is in [`compiler/rustc_expand/src/mbe/macro_rules.rs`][code_mr]. It defines the pattern for matching a macro definition as `$( $lhs:tt => $rhs:tt );+`. In other words, a `macro_rules` definition should have in its body at least one occurrence of a token tree followed by `=>` followed by another token tree. When the compiler comes to a `macro_rules` definition, it uses this pattern to match the two token trees per the rules of the definition of the macro, _thereby utilizing the macro parser itself_. In our example definition, the metavariable `$lhs` would match the patterns of both arms: `(print $mvar:ident)` and `(print twice $mvar:ident)`. And `$rhs` would match the bodies of both arms: `{ println!("{}", $mvar); }` and `{ println!("{}", $mvar); println!("{}", $mvar); }`. The parser keeps this knowledge around for when it needs to expand a macro invocation.

When the compiler comes to a macro invocation, it parses that invocation using the NFA-based macro parser described above. However, the matcher used is the first token tree (`$lhs`) extracted from the arms of the macro _definition_. Using our example, we would try to match the token stream `print foo` from the invocation against the matchers `print $mvar:ident` and `print twice $mvar:ident` that we previously extracted from the definition. The algorithm is exactly the same, but when the macro parser comes to a place in the current matcher where it needs to match a _non-terminal_ (e.g. `$mvar:ident`), it calls back to the normal Rust parser to get the contents of that non-terminal. In this case, the Rust parser would look for an `ident` token, which it finds (`foo`) and returns to the macro parser. Then, the macro parser proceeds with parsing as normal.
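The matching described above can be tried out with a small, self-contained sketch. The macro name `printer` is an assumption made for illustration, and the arm bodies are adapted to build a `String` with `format!` instead of calling `println!`, so that the result of each expansion can be inspected. The matchers themselves are the two from the example definition.

```rust
// Hypothetical macro name `printer`. The matchers mirror the example
// definition; the bodies return a String (an adaptation for illustration)
// rather than printing directly.
macro_rules! printer {
    // Matcher: the literal token `print`, then a non-terminal of kind `ident`.
    (print $mvar:ident) => {
        format!("{}", $mvar)
    };
    // Matcher: the literal tokens `print twice`, then an `ident`.
    (print twice $mvar:ident) => {
        format!("{}{}", $mvar, $mvar)
    };
}

fn main() {
    let foo = 3;

    // Token stream `print foo`: `print` matches literally, and `foo` is
    // bound to `$mvar` as an `ident`. Only the first arm matches.
    println!("{}", printer!(print foo));

    // Token stream `print twice foo`: the first arm fails because a token
    // is left over after the `ident`, so the second arm is the one used.
    println!("{}", printer!(print twice foo));
}
```

Here `$mvar` is bound to the identifier `foo` itself, not to its value, so the expansion refers to the local variable `foo` at the call site.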
Also, note that exactly one of the matchers from the various arms should match the invocation; if there is more than one match, the parse is ambiguous, while if there are no matches at all, there is a syntax error.

For more information about the macro parser's implementation, see the comments in [`compiler/rustc_expand/src/mbe/macro_parser.rs`][code_mp].

## Procedural Macros

Procedural macros are also expanded during parsing. However, rather than having a parser in the compiler, proc macros are implemented as custom, third-party crates. The compiler will compile the proc macro crate and the specially annotated functions in it (i.e. the proc macros themselves), passing them a stream of tokens. A proc macro can then transform the token stream and output a new token stream, which is synthesized into the AST.

The token stream type used by proc macros is _stable_, so `rustc` does not use it internally. The compiler's (unstable) token stream is defined in [`rustc_ast::tokenstream::TokenStream`][rustcts]. This is converted into the stable [`proc_macro::TokenStream`][stablets] and back in [`rustc_expand::proc_macro`][pm] and [`rustc_expand::proc_macro_server`][pms]. Since the Rust ABI is currently unstable, we use the C ABI for this conversion.

### Custom Derive

Custom derives are a special type of proc macro.

### Macros By Example and Macros 2.0

There is a legacy and mostly undocumented effort to improve the MBE system by giving it more hygiene-related features, better scoping and visibility rules, etc. Internally, this uses the same machinery as today's MBEs, with some additional syntactic sugar, and these macros are allowed to be in namespaces.

This section explains how macro invocations are parsed using an NFA-based macro parser, comparing the invocation's token stream against the matchers extracted from the macro definition. It details the interaction between the macro parser and the normal Rust parser when matching non-terminals. The section then transitions to procedural macros, which are implemented as external crates. It describes how the compiler compiles these crates, passes them a stream of tokens, and synthesizes the resulting token stream into the AST. It mentions the conversion between the compiler's unstable token stream and the stable token stream used by proc macros, highlighting the use of the C ABI for this conversion. Finally, it touches upon custom derives and the legacy effort to improve the MBE system.