Stable Identifiers, Fingerprints, and Dual Dependency Graphs in Incremental Compilation

`DefPath` is "stable" across changes made to the source code while a `DefId` isn't. There is also the `DefPathHash` which is just a 128-bit hash value of the `DefPath`. The two contain the same information and we mostly use the `DefPathHash` because it simpler to handle, being `Copy` and self-contained. This principle of stable identifiers is used to make the data in the on-disk cache resilient to source code changes. Instead of storing a `DefId`, we store the `DefPathHash` and when we deserialize something from the cache, we map the `DefPathHash` to the corresponding `DefId` in the *current* compilation session (which is just a simple hash table lookup). The `HirId`, used for identifying HIR components that don't have their own `DefId`, is another such stable ID. It is (conceptually) a pair of a `DefPath` and a `LocalId`, where the `LocalId` identifies something (e.g. a `hir::Expr`) locally within its "owner" (e.g. a `hir::Item`). If the owner is moved around, the `LocalId`s within it are still the same. ### Checking Query Results For Changes: HashStable And Fingerprints In order to do red-green-marking we often need to check if the result of a query has changed compared to the result it had during the previous compilation session. There are two performance problems with this though: - We'd like to avoid having to load the previous result from disk just for doing the comparison. We already computed the new result and will use that. Also loading a result from disk will "pollute" the interners with data that is unlikely to ever be used. - We don't want to store each and every result in the on-disk cache. For example, it would be wasted effort to persist things to disk that are already available in upstream crates. The compiler avoids these problems by using so-called `Fingerprint`s. Each time a new query result is computed, the query engine will compute a 128 bit hash value of the result. We call this hash value "the `Fingerprint` of the query result". The hashing is (and has to be) done "in a stable way". This means that whenever something is hashed that might change in between compilation sessions (e.g. a `DefId`), we instead hash its stable equivalent (e.g. the corresponding `DefPath`). That's what the whole `HashStable` infrastructure is for. This way `Fingerprint`s computed in two different compilation sessions are still comparable. The next step is to store these fingerprints along with the dependency graph. This is cheap since fingerprints are just bytes to be copied. It's also cheap to load the entire set of fingerprints together with the dependency graph. Now, when red-green-marking reaches the point where it needs to check if a result has changed, it can just compare the (already loaded) previous fingerprint to the fingerprint of the new result. This approach works rather well but it's not without flaws: - There is a small possibility of hash collisions. That is, two different results could have the same fingerprint and the system would erroneously assume that the result hasn't changed, leading to a missed update. We mitigate this risk by using a high-quality hash function and a 128 bit wide hash value. Due to these measures the practical risk of a hash collision is negligible. - Computing fingerprints is quite costly. It is the main reason why incremental compilation can be slower than non-incremental compilation. We are forced to use a good and thus expensive hash function, and we have to map things to their stable equivalents while doing the hashing. ### A Tale Of Two DepGraphs: The Old And The New The initial description of dependency tracking glosses over a few details that quickly become a head scratcher when actually trying to implement things. In particular it's easy to overlook that we are actually dealing with *two* dependency graphs: The one we built during the previous compilation session and the one that we are building for the current compilation session. When a compilation session starts, the compiler loads the previous dependency

Incremental compilation uses stable identifiers like `DefPathHash` and `HirId` to maintain data validity across compilation sessions. To check for changes in query results, it uses `Fingerprint`s (128-bit hash values) computed using `HashStable` infrastructure, avoiding the need to load previous results from disk. These fingerprints are stored alongside the dependency graph for efficient comparison during red-green marking. The process involves dealing with two dependency graphs: the one from the previous compilation and the one being built for the current session.