5th chunk of `Documentation/gitformat-pack.adoc`
== multi-pack-index (MIDX) files have the following format:

The multi-pack-index files refer to multiple pack-files and loose objects.

In order to allow extensions that add extra data to the MIDX, we organize
the body into "chunks" and provide a lookup table at the beginning of the
body. The header includes certain length values, such as the number of packs,
the number of base MIDX files, hash lengths and types.

All 4-byte numbers are in network order.

HEADER:

	4-byte signature:
	    The signature is: {'M', 'I', 'D', 'X'}

	1-byte version number:
	    Git only writes or recognizes version 1.

	1-byte Object Id Version:
	    We infer the length of object IDs (OIDs) from this value:
		1 => SHA-1
		2 => SHA-256
	    If the hash type does not match the repository's hash algorithm,
	    the multi-pack-index file should be ignored with a warning
	    presented to the user.

	1-byte number of "chunks"

	1-byte number of base multi-pack-index files:
	    This value is currently always zero.

	4-byte number of pack files
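
The header fields above add up to 12 bytes. As an illustration only, the
following Python sketch decodes them with the standard `struct` module. The
function name `parse_midx_header` and the returned dictionary keys are
hypothetical and not part of Git.

```python
import struct

MIDX_SIGNATURE = b"MIDX"
# Big-endian (network order): signature, version, OID version,
# number of chunks, number of base MIDX files, number of pack files.
HEADER_FORMAT = ">4sBBBBI"
OID_LENGTHS = {1: 20, 2: 32}  # digest sizes of SHA-1 and SHA-256 in bytes


def parse_midx_header(data: bytes) -> dict:
    """Decode the fixed-size MIDX header found at the start of `data`."""
    size = struct.calcsize(HEADER_FORMAT)  # 12 bytes
    if len(data) < size:
        raise ValueError("data is too short to hold a MIDX header")
    signature, version, oid_version, num_chunks, num_base, num_packs = (
        struct.unpack(HEADER_FORMAT, data[:size])
    )
    if signature != MIDX_SIGNATURE:
        raise ValueError("bad MIDX signature")
    if version != 1:
        raise ValueError("unsupported MIDX version")
    if oid_version not in OID_LENGTHS:
        raise ValueError("unknown Object Id Version")
    return {
        "version": version,
        "oid_version": oid_version,
        "oid_length": OID_LENGTHS[oid_version],
        "num_chunks": num_chunks,
        "num_base_midx": num_base,
        "num_packs": num_packs,
    }
```

A caller would still need to compare `oid_version` with the repository's
hash algorithm and ignore the file with a warning on a mismatch, as
described above.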

CHUNK LOOKUP:

	(C + 1) * 12 bytes providing the chunk offsets, where C is the
	number of chunks given in the header:
	    The first 4 bytes of each entry hold the chunk id. A value of 0
	    marks the terminating entry. The other 8 bytes give the offset
	    in the current file at which the chunk starts. (Chunks are
	    provided in file-order, so you can infer the length of a chunk
	    from the position of the next chunk if necessary.)

	The CHUNK LOOKUP matches the table of contents from
	the chunk-based file format, see linkgit:gitformat-chunk[5].

	The remaining data in the body is described one chunk at a time, and
	these chunks may be given in any order. Chunks are required unless
	otherwise specified.
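
A minimal sketch of reading the chunk lookup, assuming the table starts
directly after the 12-byte header and that the offset in the terminating
entry marks the end of the last chunk (as in the chunk-based file format).
The name `parse_chunk_lookup` is hypothetical.

```python
import struct

HEADER_SIZE = 12
TOC_ENTRY = struct.Struct(">4sQ")  # 4-byte chunk id, 8-byte file offset
TERMINATOR = b"\x00\x00\x00\x00"


def parse_chunk_lookup(data: bytes, num_chunks: int) -> dict:
    """Return {chunk_id: (start, end)} byte ranges from the lookup table."""
    entries = []
    for i in range(num_chunks + 1):
        position = HEADER_SIZE + i * TOC_ENTRY.size
        entries.append(TOC_ENTRY.unpack_from(data, position))
    if entries[-1][0] != TERMINATOR:
        raise ValueError("chunk lookup lacks its terminating entry")
    ranges = {}
    # Chunks are in file order, so each chunk ends where the next entry starts.
    for (chunk_id, start), (_, end) in zip(entries, entries[1:]):
        ranges[chunk_id] = (start, end)
    return ranges
```

Keying the result by chunk id lets a reader look up required chunks such
as `PNAM` directly and treat a missing key as an absent optional chunk.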

CHUNK DATA:

	Packfile Names (ID: {'P', 'N', 'A', 'M'})
	    Stores the names of packfiles as a sequence of NUL-terminated
	    strings. There is no extra padding between the filenames,
	    and they are listed in lexicographic order. The chunk itself
	    is padded at the end with between 0 and 3 NUL bytes to make the
	    chunk size a multiple of 4 bytes.
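
As a sketch of how the `PNAM` layout could be read, the following Python
function splits the chunk into names and checks that only NUL padding
remains. The name `parse_pack_names` is hypothetical, and decoding the
names as UTF-8 is an assumption made for readability.

```python
def parse_pack_names(chunk: bytes, num_packs: int) -> list:
    """Split a PNAM chunk into `num_packs` NUL-terminated names."""
    names = []
    pos = 0
    for _ in range(num_packs):
        # bytes.index raises ValueError when a terminator is missing.
        end = chunk.index(b"\x00", pos)
        names.append(chunk[pos:end].decode("utf-8"))
        pos = end + 1
    # Whatever follows the last name must be NUL padding only.
    if any(chunk[pos:]):
        raise ValueError("unexpected data after the last pack name")
    return names
```

The position of a name in the returned list is the pack-int-id that other
chunks use to refer to that pack.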

	Bitmapped Packfiles (ID: {'B', 'T', 'M', 'P'})
	    Stores a table whose entries each hold two 4-byte unsigned
	    integers in network order.
	    Each table entry corresponds to a single pack (in the order that
	    they appear above in the `PNAM` chunk). The values for each table
	    entry are as follows:
	    - The first bit position (in pseudo-pack order, see below) to
	      contain an object from that pack.
	    - The number of bits whose objects are selected from that pack.
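
The `BTMP` table can be decoded one pack at a time. This is a sketch only;
`parse_bitmapped_packs` is a hypothetical name.

```python
import struct


def parse_bitmapped_packs(chunk: bytes, num_packs: int) -> list:
    """Return one (first_bit_position, num_bits) pair per pack, in PNAM order."""
    entries = []
    for pack_int_id in range(num_packs):
        # Each entry is two 4-byte unsigned integers in network order.
        first_bit, num_bits = struct.unpack_from(">II", chunk, pack_int_id * 8)
        entries.append((first_bit, num_bits))
    return entries
```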

	OID Fanout (ID: {'O', 'I', 'D', 'F'})
	    The ith entry, F[i], stores the number of OIDs with first
	    byte at most i. Thus F[255] stores the total
	    number of objects.
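
Because F[i] counts the OIDs whose first byte is at most i, the OIDs that
start with a given byte occupy a contiguous range of positions. The sketch
below assumes the chunk holds 256 entries of 4 bytes each in network
order. `parse_fanout` and `fanout_range` are hypothetical names.

```python
import struct


def parse_fanout(chunk: bytes) -> list:
    """Decode the OID Fanout chunk into a list of 256 cumulative counts."""
    if len(chunk) != 256 * 4:
        raise ValueError("fanout chunk must be 1024 bytes long")
    return list(struct.unpack(">256I", chunk))


def fanout_range(fanout: list, first_byte: int) -> tuple:
    """Return the half-open range [lo, hi) of OIDs starting with `first_byte`."""
    lo = fanout[first_byte - 1] if first_byte > 0 else 0
    hi = fanout[first_byte]
    return lo, hi
```

An empty range (lo equal to hi) means no object in the MIDX has an OID
starting with that byte.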

	OID Lookup (ID: {'O', 'I', 'D', 'L'})
	    The OIDs for all objects in the MIDX are stored in lexicographic
	    order in this chunk.
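
Since the OIDs are sorted, a position can be found by binary search over
fixed-width records. The following is a minimal sketch, assuming each
record is exactly as long as the OID being searched for. `find_oid` is a
hypothetical name, and the optional `lo` and `hi` bounds could come from
the OID Fanout chunk.

```python
from typing import Optional


def find_oid(oid_lookup: bytes, oid: bytes, lo: int = 0,
             hi: Optional[int] = None) -> int:
    """Return the position of `oid` in the OID Lookup chunk, or -1 if absent."""
    width = len(oid)
    if hi is None:
        hi = len(oid_lookup) // width
    while lo < hi:
        mid = (lo + hi) // 2
        candidate = oid_lookup[mid * width:(mid + 1) * width]
        if candidate < oid:      # bytes compare lexicographically
            lo = mid + 1
        elif candidate > oid:
            hi = mid
        else:
            return mid
    return -1
```

The returned position is the index used for the same object in the Object
Offsets chunk.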

	Object Offsets (ID: {'O', 'O', 'F', 'F'})
	    Stores two 4-byte values for every object.
	    1: The pack-int-id for the pack storing this object.
	    2: The offset within the pack.
		If all offsets are less than 2^32, then the large offset chunk
		will not exist and offsets are stored as in IDX v1.
		If there is at least one offset value larger than 2^32-1, then
		the large offset chunk must exist, and offsets larger than
		2^31-1 must be stored in it instead. If the large offset chunk
		exists and the most significant bit of the 4-byte value (bit 31)
		is set, then clearing that bit yields the row in the large
		offsets chunk that contains the 8-byte offset of this object.
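
The offset rule above can be sketched as follows. `object_location` is a
hypothetical name; `loff` is passed as `None` when the large offset chunk
does not exist, in which case the 4-byte value is used as the offset
unchanged.

```python
import struct
from typing import Optional

LARGE_OFFSET_FLAG = 0x80000000  # most significant bit of the 4-byte value


def object_location(ooff: bytes, loff: Optional[bytes], position: int) -> tuple:
    """Return (pack_int_id, offset) for the object at `position`."""
    pack_int_id, raw = struct.unpack_from(">II", ooff, position * 8)
    if loff is not None and raw & LARGE_OFFSET_FLAG:
        # Clearing the flag bit gives the row in the large offsets chunk.
        row = raw & (LARGE_OFFSET_FLAG - 1)
        (offset,) = struct.unpack_from(">Q", loff, row * 8)
        return pack_int_id, offset
    return pack_int_id, raw
```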

	[Optional] Object Large Offsets (ID: {'L', 'O', 'F', 'F'})
	    8-byte offsets into large packfiles.

	[Optional] Bitmap pack order (ID: {'R', 'I', 'D', 'X'})
	    A list of MIDX positions (one per object in the MIDX, num_objects in
	    total, each a 4-byte unsigned integer in network byte order), sorted
	    according to their relative bitmap/pseudo-pack positions.
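
As an illustration, the `RIDX` chunk can be read as a flat array in which
index i is a pseudo-pack position and the value is a MIDX position. The
sketch below also builds the inverse mapping. Both function names are
hypothetical.

```python
import struct


def parse_bitmap_pack_order(chunk: bytes, num_objects: int) -> list:
    """Return MIDX positions listed in pseudo-pack order."""
    if len(chunk) != 4 * num_objects:
        raise ValueError("RIDX chunk must hold one 4-byte entry per object")
    return list(struct.unpack(f">{num_objects}I", chunk))


def pseudo_pack_positions(order: list) -> list:
    """Invert the list: result[midx_position] is the pseudo-pack position."""
    inverse = [0] * len(order)
    for pseudo_pos, midx_pos in enumerate(order):
        inverse[midx_pos] = pseudo_pos
    return inverse
```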

TRAILER:

	Index checksum of the above contents.
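
A sketch of checking the trailer, under the assumption that the checksum
is computed with the hash algorithm selected by the header's Object Id
Version and covers every byte that precedes it. `verify_trailer` is a
hypothetical name.

```python
import hashlib

# Assumption: the trailer uses the same hash as the Object Id Version.
HASH_BY_OID_VERSION = {1: "sha1", 2: "sha256"}


def verify_trailer(data: bytes, oid_version: int) -> bool:
    """Check that the trailing checksum matches the rest of the file."""
    hasher = hashlib.new(HASH_BY_OID_VERSION[oid_version])
    size = hasher.digest_size
    if len(data) < size:
        raise ValueError("data is too short to hold a checksum")
    body, trailer = data[:-size], data[-size:]
    hasher.update(body)
    return hasher.digest() == trailer
```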

== multi-pack-index reverse indexes

Similar to the pack-based reverse index, the multi-pack index can also
be used to generate a reverse index.

Instead of mapping between offset, pack-, and

Title: Multi-Pack-Index (MIDX) File Format
Summary
The MIDX file format is used by Git to refer to multiple pack-files and loose objects. A file consists of a header, a chunk lookup table, the chunk data (packfile names, bitmapped packfiles, OID fanout, OID lookup, object offsets, and the optional large offsets and bitmap pack order chunks), and a trailing checksum. All 4-byte numbers are stored in network order.