Advanced LLVM IR Handling and Optimization Analysis

`cargo llvm-ir [options] path` prints the LLVM IR for a particular function at `path`. (`cargo install cargo-asm` installs both `cargo asm` and `cargo llvm-ir`.) `--build-type=debug` emits code for debug builds, and other useful options are available. Debug info can clutter the LLVM IR output considerably, so `RUSTFLAGS="-C debuginfo=0"` is very useful.

`RUSTFLAGS="-C save-temps"` outputs LLVM bitcode (not the same as IR) at different stages during compilation, which is sometimes useful. The bitcode is written to `.bc` files in the compiler's output directory, which is set via the `--out-dir DIR` argument to `rustc`.

* If you are hitting an assertion failure or segmentation fault from the LLVM backend when invoking `rustc` itself, it is a good idea to try passing each of these `.bc` files to the `llc` command and see if you get the same failure. (LLVM developers often prefer a bug reduced to a `.bc` file over one that uses a Rust crate for its minimized reproduction.)
* To get human-readable versions of the LLVM bitcode, convert the bitcode (`.bc`) files to `.ll` files using `llvm-dis`, which should be available in the local build of rustc.

Note that rustc emits different IR depending on whether `-O` is enabled, even without LLVM's optimizations, so if you want to play with the IR rustc emits, you can run:

```bash
$ rustc +local my-file.rs --emit=llvm-ir -O -C no-prepopulate-passes \
    -C codegen-units=1
$ OPT=./build/$TRIPLE/llvm/bin/opt
$ $OPT -S -O2 < my-file.ll > my
```

If you just want to get the LLVM IR during the LLVM pipeline, e.g. to see which IR causes an optimization-time assertion to fail, or to see when LLVM performs a particular optimization, you can pass the rustc flag `-C llvm-args=-print-after-all`, and possibly add `-C llvm-args='-filter-print-funcs=EXACT_FUNCTION_NAME'` (e.g. `-C llvm-args='-filter-print-funcs=_ZN11collections3str21_$LT$impl$u20$str$GT$7replace17hbe10ea2e7c809b0bE'`).
That produces a lot of output on standard error, so you'll want to redirect it to a file. Also, if you are using neither `-filter-print-funcs` nor `-C codegen-units=1`, then, because the multiple codegen units run in parallel, the printouts will be interleaved and unreadable.

* One caveat to the aforementioned methodology: the `-print` family of options to LLVM only prints the IR unit that the pass runs on (e.g., just a function), and does not include any referenced declarations, globals, metadata, etc. This means you cannot in general feed the output of `-print` into `llc` to reproduce a given problem.
* Within LLVM itself, calling `F.getParent()->dump()` at the beginning of `SafeStackLegacyPass::runOnFunction` will dump the whole module, which may provide a better basis for reproduction. (However, you should be able to get that same dump from the `.bc` files dumped by `-C save-temps`.)

If you want just the IR for a specific function (say, you want to see why it causes an assertion or doesn't optimize correctly), you can use `llvm-extract`, e.g.

```bash
$ ./build/$TRIPLE/llvm/bin/llvm-extract \
    -func='_ZN11collections3str21_$LT$impl$u20$str$GT$7replace17hbe10ea2e7c809b0bE' \
    -S \
    < unextracted.ll \
    > extracted.ll
```

### Investigate LLVM optimization passes

If you are seeing incorrect behavior due to an optimization pass, a very handy LLVM option is `-opt-bisect-limit`, which takes an integer denoting the index value of the highest pass to run. Index values for taken passes are stable from run to run; by coupling this with software that automates bisecting the search space based on the resulting program, an errant pass can be quickly determined. When an `-opt-bisect-limit` is specified, all runs are displayed to standard error, along with their index and output indicating if the pass was run or skipped. Setting the limit to an index of -1 (e.g., `RUSTFLAGS="-C llvm-args=-opt-bisect-limit=-1"`) will show all passes and

This section details how to extract and analyze LLVM IR for debugging Rust code. It covers using `cargo llvm-ir` and `RUSTFLAGS` to obtain IR and bitcode, converting bitcode to human-readable IR with `llvm-dis`, and the impact of optimization levels on generated IR. The text explains how to inspect IR during the LLVM pipeline using `-C llvm-args=-print-after-all` and filter function output with `-filter-print-funcs`. Techniques for extracting IR for specific functions using `llvm-extract` and investigating optimization passes with `-opt-bisect-limit` are described, allowing for precise isolation of problematic optimization steps.
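
The bisection that `-opt-bisect-limit` enables can be scripted. The following is a minimal sketch, not an established tool: `find_first_bad` and `check_with_limit` are hypothetical helper names, and `my-file.rs` and `./run-test.sh` are placeholders for your own source file and test script. It assumes the program behaves correctly at the lower limit and incorrectly at the upper limit.

```bash
#!/usr/bin/env bash
# Binary search for the lowest -opt-bisect-limit at which a check starts failing.

# find_first_bad LO HI CHECK...
#   CHECK is invoked as `CHECK <limit>`; exit status 0 means "good", nonzero "bad".
#   Assumption: limit LO is known good and limit HI is known bad.
find_first_bad() {
    local lo=$1 hi=$2 mid
    shift 2
    while (( hi - lo > 1 )); do
        mid=$(( (lo + hi) / 2 ))
        if "$@" "$mid" >/dev/null 2>&1; then
            lo=$mid   # still correct with passes up to index mid
        else
            hi=$mid   # already broken with passes up to index mid
        fi
    done
    echo "$hi"
}

# Hypothetical check: rebuild with the given limit, then run a test script.
# my-file.rs and ./run-test.sh are placeholders, not real project files.
check_with_limit() {
    rustc -O -C codegen-units=1 \
        -C "llvm-args=-opt-bisect-limit=$1" \
        my-file.rs -o my-file \
        && ./run-test.sh ./my-file
}

# Example invocation (the upper bound is a guess; pick one at which the
# miscompilation is known to reproduce):
#   find_first_bad 0 5000 check_with_limit
```

The printed index is the first limit at which the check fails; the pass with that index in the `-opt-bisect-limit` output on standard error is the one to inspect. Using `-C codegen-units=1` here is a design choice to keep the pass output from interleaving.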