`ident`, `block`, `expr`, etc., the macro parser must call back to the normal
Rust parser. Both the definition and invocation of macros are parsed using
the macro parser, in a process that is non-intuitively self-referential.
The code to parse macro _definitions_ is in
[`compiler/rustc_expand/src/mbe/macro_rules.rs`][code_mr]. It defines the
pattern for matching a macro definition as `$( $lhs:tt => $rhs:tt );+`. In
other words, a `macro_rules` definition should have in its body at least one
occurrence of a token tree followed by `=>` followed by another token tree.
When the compiler comes to a `macro_rules` definition, it uses this pattern to
match the two token trees of each rule in the macro's definition, _thereby
utilizing the macro parser itself_. In our example definition, the
metavariable `$lhs` would match the patterns of both arms: `(print
$mvar:ident)` and `(print twice $mvar:ident)`. And `$rhs` would match the
bodies of both arms: `{ println!("{}", $mvar); }` and `{ println!("{}", $mvar);
println!("{}", $mvar); }`. The parser keeps this knowledge around for when it
needs to expand a macro invocation.
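
To make the self-referential step more concrete, here is a minimal sketch that
reuses the same pattern in an ordinary `macro_rules` macro. The names
`split_arms` and `arms` are hypothetical and exist only for illustration; the
compiler does not define its matcher this way, but the splitting of each arm
into an `$lhs` and an `$rhs` token tree is the same idea.

```rust
// Hypothetical helper: splits a list of arms into (lhs, rhs) pairs using the
// pattern quoted above, and renders each token tree as a string.
macro_rules! split_arms {
    ($( $lhs:tt => $rhs:tt );+) => {
        vec![ $( (stringify!($lhs), stringify!($rhs)) ),+ ]
    };
}

// The two arms from the example definition, passed in as plain tokens.
fn arms() -> Vec<(&'static str, &'static str)> {
    split_arms!(
        (print $mvar:ident) => { println!("{}", $mvar); };
        (print twice $mvar:ident) => { println!("{}", $mvar); println!("{}", $mvar); }
    )
}

fn main() {
    for (lhs, rhs) in arms() {
        // The exact spacing produced by `stringify!` is not guaranteed.
        println!("lhs = {}", lhs);
        println!("rhs = {}", rhs);
    }
}
```

Each `$lhs` and `$rhs` here is a single delimited token tree, which is why a
whole matcher such as `(print $mvar:ident)` can be captured by one `tt`
fragment.
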
When the compiler comes to a macro invocation, it parses that invocation using
the NFA-based macro parser described above. However, the matcher used is the
first token tree (`$lhs`) extracted from each arm of the macro
_definition_. Using our example, we would try to match the token stream `print
foo` from the invocation against the matchers `print $mvar:ident` and `print
twice $mvar:ident` that we previously extracted from the definition. The
algorithm is exactly the same, but when the macro parser comes to a place in the
current matcher where it needs to match a _non-terminal_ (e.g. `$mvar:ident`),
it calls back to the normal Rust parser to get the contents of that
non-terminal. In this case, the Rust parser would look for an `ident` token,
which it finds (`foo`) and returns to the macro parser. Then, the macro parser
proceeds in parsing as normal. Also, note that the arms are tried in order and
the first matcher that matches the invocation is used; if no arm matches at
all, there is a syntax error. If a single matcher can match the input in more
than one way, the parse is ambiguous, which is also an error.
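
As a rough illustration of matching an invocation against the extracted
matchers, the sketch below uses the two matchers from the example. The macro
name `which_arm`, the function `demo`, and the `format!` bodies are
assumptions made so that the result can be inspected; the example definition
itself uses `println!` in its bodies.

```rust
// Hypothetical variant of the example macro: same matchers, but each arm
// returns a String so we can see which arm was selected.
macro_rules! which_arm {
    (print $mvar:ident) => {
        format!("once: {}", $mvar)
    };
    (print twice $mvar:ident) => {
        format!("twice: {} {}", $mvar, $mvar)
    };
}

fn demo() -> (String, String) {
    let foo = "hello";
    (
        // `print foo` matches the first matcher; `$mvar` binds to `foo`.
        which_arm!(print foo),
        // `print twice foo` fails against the first matcher (the token `foo`
        // is left over after `$mvar` binds to `twice`) and matches the second.
        which_arm!(print twice foo),
    )
}

fn main() {
    let (once, twice) = demo();
    println!("{}", once);
    println!("{}", twice);
}
```
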
For more information about the macro parser's implementation, see the comments
in [`compiler/rustc_expand/src/mbe/macro_parser.rs`][code_mp].
## Procedural Macros
Procedural macros are also expanded during parsing. However, rather than
having a parser in the compiler, proc macros are implemented as custom,
third-party crates. The compiler compiles the proc macro crate and calls the
specially annotated functions in it (i.e. the proc macros themselves), passing
them a stream of tokens. A proc macro can then transform the token stream and
output a new token stream, which is synthesized into the AST.
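
The "token stream in, token stream out" shape can be sketched with a toy
model. Everything below is hypothetical: `Token` and `expand_answer` are
stand-ins invented for illustration and are not the real API. A real
function-like proc macro is a function from `proc_macro::TokenStream` to
`proc_macro::TokenStream` that lives in a crate built with
`proc-macro = true`, so it cannot be shown as a single runnable file.

```rust
// Toy stand-in for a token; the real types live in the `proc_macro` crate.
#[derive(Debug, Clone, PartialEq)]
enum Token {
    Ident(String),
    Punct(char),
    Literal(String),
}

// Hypothetical "macro": receives tokens and returns new tokens, replacing
// every identifier named `answer` with the literal `42`.
fn expand_answer(input: Vec<Token>) -> Vec<Token> {
    input
        .into_iter()
        .map(|tok| match tok {
            Token::Ident(name) if name == "answer" => Token::Literal("42".to_string()),
            other => other,
        })
        .collect()
}

fn main() {
    // Models the tokens of `x = answer;`.
    let input = vec![
        Token::Ident("x".to_string()),
        Token::Punct('='),
        Token::Ident("answer".to_string()),
        Token::Punct(';'),
    ];
    println!("{:?}", expand_answer(input));
}
```
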
The token stream type used by proc macros is _stable_, so `rustc` does not
use it internally. The compiler's (unstable) token stream is defined in
[`rustc_ast::tokenstream::TokenStream`][rustcts]. This is converted into the
stable [`proc_macro::TokenStream`][stablets] and back in
[`rustc_expand::proc_macro`][pm] and [`rustc_expand::proc_macro_server`][pms].
Since the Rust ABI is currently unstable, we use the C ABI for this conversion.
<!-- TODO(rylev): more here. [#1160](https://github.com/rust-lang/rustc-dev-guide/issues/1160) -->
### Custom Derive
Custom derives are a special type of proc macro.
### Macros By Example and Macros 2.0
There is a legacy and mostly undocumented effort to improve the MBE system
by giving it more hygiene-related features, better scoping and visibility
rules, etc. Internally, these macros use the same machinery as today's MBEs,
with some additional syntactic sugar, and are allowed to be in namespaces.
<!-- TODO(rylev): more? [#1160](https://github.com/rust-lang/rustc-dev-guide/issues/1160) -->