Profiling Rustc with Perf

# Profiling with perf

This is a guide for how to profile rustc with [perf](https://perf.wiki.kernel.org/index.php/Main_Page).

## Initial steps

- Get a clean checkout of rust-lang/master, or whatever it is you want
  to profile.
- Set the following settings in your `bootstrap.toml`:
  - `debuginfo-level = 1` - enables line debuginfo
  - `jemalloc = false` - lets you do memory use profiling with valgrind
  - leave everything else at the defaults
- Run `./x build` to get a full build
- Make a rustup toolchain pointing to that result
  - see [the "build and run" section for instructions][b-a-r]

## Gathering a perf profile

perf is an excellent tool on Linux that can be used to gather and
analyze all kinds of information. Mostly it is used to figure out
where a program spends its time. It can also be used for other sorts
of events, though, like cache misses and so forth.

### The basics

The basic `perf` command is this:

```bash
perf record -F99 --call-graph dwarf XXX
```

The `-F99` tells perf to sample at 99 Hz, which avoids generating too
much data for longer runs (why 99 Hz you ask? It is often chosen
because it is unlikely to be in lockstep with other periodic
activity). The `--call-graph dwarf` tells perf to get call-graph
information from debuginfo, which is accurate. The `XXX` is the
command you want to profile. So, for example, you might do:

```bash
perf record -F99 --call-graph dwarf cargo +<toolchain> rustc
```

to run `cargo` -- here `<toolchain>` should be the name of the toolchain
you made in the beginning. But there are some things to be aware of:

- You probably don't want to profile the time spent building
  dependencies. So something like `cargo build; cargo clean -p $C` may
  be helpful (where `$C` is the crate name)
  - Though usually I just do `touch src/lib.rs` and rebuild instead. =)
- You probably don't want incremental messing about with your
  profile. So something like `CARGO_INCREMENTAL=0` can be helpful.

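
The caveats above can be combined into one small helper. The following is only a sketch, not part of perf or cargo: the function name `profile_crate` and the `DRY_RUN` variable are made up for illustration, and it assumes a library crate whose root is `src/lib.rs`. By default it only prints the `perf record` command it would run, so you can check it before profiling.

```bash
#!/usr/bin/env bash
# Hypothetical helper (illustrative names): profile only the final crate,
# with incremental compilation disabled.
profile_crate() {
    local toolchain="$1"

    # The perf invocation from the text, run with CARGO_INCREMENTAL=0.
    local -a cmd=(env CARGO_INCREMENTAL=0
                  perf record -F99 --call-graph dwarf
                  cargo "+${toolchain}" rustc)

    if [[ "${DRY_RUN:-1}" == "1" ]]; then
        # Dry run (the default): print the command instead of executing it.
        echo "${cmd[*]}"
        return 0
    fi

    # Build once first so dependencies are compiled outside the profile.
    CARGO_INCREMENTAL=0 cargo "+${toolchain}" build
    # Assumption: a library crate; touching its root forces only it to rebuild.
    touch src/lib.rs
    # Now record just the rebuild of the crate itself.
    "${cmd[@]}"
}
```

Calling `profile_crate <toolchain>` prints the command; `DRY_RUN=0 profile_crate <toolchain>` actually builds the dependencies, touches `src/lib.rs`, and records the profile.
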
To avoid the `addr2line xxx/elf: could not read first record` error when
reading data collected from `cargo`, you may need to use the latest version
of `addr2line`:

```bash
cargo install addr2line --features="bin"
```

### Gathering a perf profile from a `perf.rust-lang.org` test

Often we want to analyze a specific test from `perf.rust-lang.org`.
The easiest way to do that is to use the [rustc-perf][rustc-perf]
benchmarking suite; this approach is described [here](with_rustc_perf.md).

Instead of using the benchmark suite CLI, you can also profile the benchmarks manually. First,
you need to clone the [rustc-perf][rustc-perf] repository:

```bash
git clone https://github.com/rust-lang/rustc-perf
```

and then find the source code of the test that you want to profile. Sources for the tests
are found in [the `collector/compile-benchmarks` directory][compile-time dir]
and [the `collector/runtime-benchmarks` directory][runtime dir]. So let's
go into the directory of a specific test; we'll use `clap-rs` as an example:

```bash
cd collector/compile-benchmarks/clap-3.1.6
```

In this case, let's say we want to profile the `cargo check`
performance. In that case, I would first run some basic commands to
build the dependencies:

```bash
# Setup: first clean out any old results and build the dependencies:
cargo +<toolchain> clean
CARGO_INCREMENTAL=0 cargo +<toolchain> check
```

(Again, `<toolchain>` should be replaced with the name of the
toolchain we made in the first step.)

Next: we want to record the execution time for *just* the clap-rs crate,
running cargo check. I tend to use `cargo rustc` for this, since it
also allows me to add explicit flags, which we'll do later on.

```bash
touch src/lib.rs
CARGO_INCREMENTAL=0 perf record -F99 --call-graph dwarf cargo rustc --profile check --lib
```

Note that final command: it's a doozy!
It uses the `cargo rustc` command, which executes rustc with (potentially) additional options; the `--profile check` and `--lib` options specify that we are doing a

This document describes how to profile rustc using the perf tool on Linux. It covers initial setup steps, including configuring the `bootstrap.toml` file, building the rust compiler, and creating a rustup toolchain. The document outlines the basic perf command for gathering performance data, along with tips to avoid profiling unrelated tasks, such as building dependencies or incremental compilation. It also details how to profile specific tests from the `perf.rust-lang.org` benchmark suite, using both the benchmarking CLI and manual methods.