3rd chunk of `Documentation/gitformat-commit-graph.adoc`
 with corrected commit date offsets that cannot be
      stored within 31 bits.
    * Generation Data Overflow chunk is present only when Generation Data
      chunk is present and at least one corrected commit date offset cannot
      be stored within 31 bits.

==== Extra Edge List (ID: {'E', 'D', 'G', 'E'}) [Optional]
      This list of 4-byte values stores the second through nth parents for
      all octopus merges. The second parent value in the commit data stores
      an array position within this list, with the most-significant bit set.
      Starting at that array position, iterate through this list of commit
      positions for the parents until reaching a value with the most-significant
      bit set. The remaining bits of that value give the position of the last
      parent.
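
The walk described above can be sketched as follows. This is a minimal
illustration, not Git's implementation: the function name and the constants
are hypothetical, and the 4-byte values are assumed to be big-endian
(network byte order).

```python
import struct

LAST_EDGE_FLAG = 0x80000000  # most-significant bit of a 4-byte value
POSITION_MASK = 0x7FFFFFFF   # remaining 31 bits


def octopus_extra_parents(edge_chunk, second_parent_field):
    """Return the graph positions of the second through nth parents.

    edge_chunk is the raw EDGE chunk; second_parent_field is the second
    parent value taken from the commit data (hypothetical parameter names).
    """
    if not second_parent_field & LAST_EDGE_FLAG:
        raise ValueError("second parent value does not point into the list")
    index = second_parent_field & POSITION_MASK
    parents = []
    while True:
        # Assumption: each entry is an unsigned big-endian 32-bit integer.
        (value,) = struct.unpack_from(">I", edge_chunk, 4 * index)
        parents.append(value & POSITION_MASK)
        if value & LAST_EDGE_FLAG:  # flag marks the last parent
            return parents
        index += 1


# Two runs of extra parents stored back to back: [5, 7, 9] and [3].
edges = struct.pack(">4I", 5, 7, LAST_EDGE_FLAG | 9, LAST_EDGE_FLAG | 3)
print(octopus_extra_parents(edges, LAST_EDGE_FLAG | 0))  # [5, 7, 9]
print(octopus_extra_parents(edges, LAST_EDGE_FLAG | 3))  # [3]
```

The loop stops on the flagged entry rather than on a stored count, which is
why several octopus merges can share one flat list.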

==== Bloom Filter Index (ID: {'B', 'I', 'D', 'X'}) (N * 4 bytes) [Optional]
    * The ith entry, BIDX[i], stores the number of bytes in all Bloom filters
      from commit 0 to commit i (inclusive) in lexicographic order. The Bloom
      filter for the i-th commit spans from BIDX[i-1] to BIDX[i] (plus header
      length), where BIDX[-1] is 0.
    * The BIDX chunk is ignored if the BDAT chunk is not present.
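
As a rough illustration of the cumulative layout above, the following sketch
computes the byte range of one commit's filter inside the BDAT chunk. The
function name is hypothetical, the entries are assumed to be big-endian, and
the header length of 12 bytes follows from the three unsigned 32-bit integers
that open the BDAT chunk.

```python
import struct

BDAT_HEADER_LEN = 12  # three unsigned 32-bit integers


def bloom_filter_span(bidx_chunk, i):
    """Return (start, end) byte offsets of commit i's filter within BDAT."""
    # Assumption: BIDX entries are unsigned big-endian 32-bit integers.
    (end,) = struct.unpack_from(">I", bidx_chunk, 4 * i)
    if i > 0:
        (start,) = struct.unpack_from(">I", bidx_chunk, 4 * (i - 1))
    else:
        start = 0  # BIDX[-1] is defined as 0
    return BDAT_HEADER_LEN + start, BDAT_HEADER_LEN + end


# Three commits with filters of 2, 0 and 8 bytes: cumulative sizes 2, 2, 10.
bidx = struct.pack(">3I", 2, 2, 10)
print(bloom_filter_span(bidx, 0))  # (12, 14)
print(bloom_filter_span(bidx, 1))  # (14, 14)
print(bloom_filter_span(bidx, 2))  # (14, 22)
```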

==== Bloom Filter Data (ID: {'B', 'D', 'A', 'T'}) [Optional]
    * It starts with a header consisting of three unsigned 32-bit integers:
      - Version of the hash algorithm being used. We currently support
	value 2 which corresponds to the 32-bit version of the murmur3 hash
	implemented exactly as described in
	https://en.wikipedia.org/wiki/MurmurHash#Algorithm and the double
	hashing technique using seed values 0x293ae76f and 0x7e646e2c as
	described in https://doi.org/10.1007/978-3-540-30494-4_26 "Bloom Filters
	in Probabilistic Verification". Version 1 Bloom filters have a bug that appears
	when char is signed and the repository has path names that have characters >=
	0x80; Git supports reading and writing them, but this ability will be removed
	in a future version of Git.
      - The number of times a path is hashed and hence the number of bit positions
	      that cumulatively determine whether a file is present in the commit.
      - The minimum number of bits 'b' per entry in the Bloom filter. If the filter
	      contains 'n' entries, then the filter size is the minimum number of 64-bit
	      words that contain n*b bits.
    * The rest of the chunk is the concatenation of all the computed Bloom
      filters for the commits in lexicographic order.
    * Note: Commits with no changes or more than 512 changes have Bloom filters
      of length one, with either all bits set to zero or one respectively.
    * The BDAT chunk is present if and only if BIDX is present.
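
The header fields above can be tied together with a short sketch: a 32-bit
murmur3 hash, double hashing to derive the bit positions for a path, and the
filter size rule. This is an illustration under stated assumptions, not Git's
code. All function names are hypothetical, the two seeds are passed in by the
caller, and combining the two hashes as `h0 + i * h1` (mod 2^32) before
reducing modulo the filter's bit count is an assumed form of double hashing.

```python
MASK32 = 0xFFFFFFFF


def _rotl32(x, r):
    return ((x << r) | (x >> (32 - r))) & MASK32


def murmur3_32(data, seed):
    """32-bit murmur3 over bytes, treating every byte as unsigned."""
    c1, c2 = 0xCC9E2D51, 0x1B873593
    h = seed & MASK32
    nblocks = len(data) // 4
    for i in range(nblocks):
        k = int.from_bytes(data[4 * i:4 * i + 4], "little")
        k = _rotl32((k * c1) & MASK32, 15)
        k = (k * c2) & MASK32
        h = _rotl32(h ^ k, 13)
        h = (h * 5 + 0xE6546B64) & MASK32
    tail = data[4 * nblocks:]
    if tail:  # 1 to 3 leftover bytes
        k = int.from_bytes(tail, "little")
        k = _rotl32((k * c1) & MASK32, 15)
        h ^= (k * c2) & MASK32
    h ^= len(data)
    # Final avalanche mix.
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & MASK32
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & MASK32
    h ^= h >> 16
    return h


def bloom_bit_positions(path, num_hashes, filter_bits, seed0, seed1):
    """Derive num_hashes bit positions for a path (assumed combination)."""
    h0 = murmur3_32(path, seed0)
    h1 = murmur3_32(path, seed1)
    return [((h0 + i * h1) & MASK32) % filter_bits for i in range(num_hashes)]


def filter_size_in_words(n_entries, bits_per_entry):
    """Minimum number of 64-bit words that contain n*b bits."""
    return (n_entries * bits_per_entry + 63) // 64


words = filter_size_in_words(7, 10)  # 70 bits need 2 words
positions = bloom_bit_positions(b"Documentation/Makefile", 7, 64 * words, 1, 2)
print(words, positions)
```

The seeds `1` and `2` in the usage lines are placeholders chosen only to make
the example run; a reader reproducing real filters would pass the seed values
quoted in the header description. The size helper covers only the general
rule and does not model the length-one special cases noted above.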

==== Base Graphs List (ID: {'B', 'A', 'S', 'E'}) [Optional]
      This list of H-byte hashes describes a set of B commit-graph files that
      form a commit-graph chain. The graph position for the ith commit in this
      file's OID Lookup chunk is equal to i plus the number of commits in all
      base graphs.  If B is non-zero, this chunk must exist.
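
A small sketch of the two ideas above: splitting the chunk into B hashes of
H bytes each, and offsetting a local index by the size of the base graphs.
The function names are hypothetical, and the per-base commit counts are
assumed to have been read from the base files already.

```python
def split_base_hashes(base_chunk, hash_len):
    """Split the raw BASE chunk into a list of H-byte hashes."""
    if len(base_chunk) % hash_len != 0:
        raise ValueError("chunk length is not a multiple of the hash length")
    return [base_chunk[i:i + hash_len]
            for i in range(0, len(base_chunk), hash_len)]


def graph_position(i, base_commit_counts):
    """Graph position of the ith commit in this file's OID Lookup chunk."""
    return i + sum(base_commit_counts)


# Two base graphs with 100 and 25 commits; H = 20 for this example.
chunk = bytes(range(20)) + bytes(range(20, 40))
print(len(split_base_hashes(chunk, 20)))  # 2
print(graph_position(3, [100, 25]))       # 128
```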

=== TRAILER:

	H-byte HASH-checksum of all of the above.
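
Checking the trailer could look roughly like the sketch below. It assumes
the whole file is in memory and that the hash algorithm name is known to the
caller; SHA-1 is used here purely as an example. The function name is
hypothetical.

```python
import hashlib


def trailer_is_valid(file_bytes, hash_name="sha1"):
    """Compare the trailing H bytes with a hash of everything before them."""
    hasher = hashlib.new(hash_name)
    hash_len = hasher.digest_size  # H, e.g. 20 for SHA-1
    if len(file_bytes) < hash_len:
        return False
    body, trailer = file_bytes[:-hash_len], file_bytes[-hash_len:]
    hasher.update(body)
    return hasher.digest() == trailer


# Build a toy file: arbitrary body bytes followed by their checksum.
body = b"CGPH" + bytes(16)
good = body + hashlib.sha1(body).digest()
print(trailer_is_valid(good))                 # True
print(trailer_is_valid(good[:-1] + b"\x00"))  # False unless the last byte was 0
```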

== Historical Notes:

The Generation Data (GDA2) and Generation Data Overflow (GDO2) chunks have
the number '2' in their chunk IDs because a previous version of Git wrote
possibly erroneous data in these chunks with the IDs "GDAT" and "GDOV". By
changing the IDs, newer versions of Git will silently ignore those older
chunks and write the new information without trusting the incorrect data.

GIT
---
Part of the linkgit:git[1] suite

Title: Git Commit-Graph Format: Additional Chunks and Historical Notes
Summary
The Git commit-graph format includes additional chunks such as Extra Edge List, Bloom Filter Index, Bloom Filter Data, and Base Graphs List, which store information about commit parents, Bloom filters, and commit-graph chains, and also notes the historical context of certain chunk IDs to ensure compatibility and data integrity across different Git versions.