force recomputation of all deltas can significantly reduce the
final packfile size (30-50% smaller can be quite typical).
Instead of running `git repack`, you can also run `git gc --aggressive`,
which also optimizes other things after an import (e.g. packs loose
refs). As noted in the "AGGRESSIVE" section in linkgit:git-gc[1], the
`--aggressive` option finds new deltas by passing the `-f` option to
linkgit:git-repack[1]. For the reasons elaborated on above, using
`--aggressive` after a fast-import is one of the few cases where it is
known to be worthwhile.
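
For frontends that drive the import from a script, the two post-import
options above could be wrapped roughly as follows. This is a minimal
sketch: the helper names `repack_command` and `repack` are hypothetical,
and only the `git repack -a -d -f` and `git gc --aggressive` invocations
themselves come from Git.

```python
import subprocess


def repack_command(aggressive_gc=False):
    """Return the argv for the post-import repack step (hypothetical helper)."""
    if aggressive_gc:
        # Recomputes deltas and also performs other housekeeping,
        # such as packing loose refs.
        return ["git", "gc", "--aggressive"]
    # -a -d: repack everything into one pack and delete redundant packs.
    # -f:    force recomputation of all deltas.
    return ["git", "repack", "-a", "-d", "-f"]


def repack(repo_path, aggressive_gc=False):
    """Run the chosen command inside repo_path; raises if git fails."""
    subprocess.run(repack_command(aggressive_gc), cwd=repo_path, check=True)
```

Calling `repack(path)` after fast-import has exited runs the plain
repack; passing `aggressive_gc=True` selects `git gc --aggressive`
instead.
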

MEMORY UTILIZATION
------------------
There are a number of factors which affect how much memory fast-import
requires to perform an import. Like critical sections of core
Git, fast-import uses its own memory allocators to amortize any overheads
associated with malloc. In practice, fast-import tends to amortize
malloc overhead to 0 because it makes large block allocations.

per object
~~~~~~~~~~
fast-import maintains an in-memory structure for every object written
during this execution. On a 32-bit system the structure is 32 bytes;
on a 64-bit system it is 40 bytes (due to the larger pointer sizes).
Objects in the table are not deallocated until fast-import terminates.
Importing 2 million objects on a 32-bit system will require
approximately 64 MiB of memory.
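
The estimate above is a simple multiplication, which can be sketched as
follows. The function names are hypothetical; the per-object sizes are
the ones quoted in the text. Note that 2 million objects at 32 bytes
each is 64 million bytes, or roughly 61 MiB, which is in line with the
rounded figure given above.

```python
# Per-object structure size in bytes, keyed by pointer width in bits
# (figures taken from the text above).
BYTES_PER_OBJECT = {32: 32, 64: 40}


def object_table_bytes(num_objects, pointer_bits=64):
    """Estimate the memory held by the object table for one import."""
    return num_objects * BYTES_PER_OBJECT[pointer_bits]


def to_mib(num_bytes):
    """Convert a byte count to MiB (1 MiB = 1024 * 1024 bytes)."""
    return num_bytes / (1024 * 1024)


print(f"{to_mib(object_table_bytes(2_000_000, 32)):.1f} MiB")  # 61.0 MiB
print(f"{to_mib(object_table_bytes(2_000_000, 64)):.1f} MiB")  # 76.3 MiB
```
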
The object table is actually a hashtable keyed on the object name
(the unique SHA-1). This storage configuration allows fast-import to reuse
an existing or already written object and avoid writing duplicates
to the output packfile. Duplicate blobs are surprisingly common
in an import, typically due to branch merges in the source.
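
The deduplication idea can be illustrated with a small sketch. This is
not fast-import's actual implementation: the `ObjectTable` class and
its `store` method are hypothetical, a Python dict stands in for the
hash table, and only blobs written in the current run are modeled. The
blob naming scheme (SHA-1 over a `blob <size>` header, a NUL byte, and
the content) is Git's own.

```python
import hashlib


def blob_name(data: bytes) -> str:
    # Git names a blob by hashing "blob <size>", a NUL byte, then the content.
    header = b"blob %d\0" % len(data)
    return hashlib.sha1(header + data).hexdigest()


class ObjectTable:
    """Toy object table keyed on object name (hypothetical sketch)."""

    def __init__(self):
        self._entries = {}   # object name -> index into self.written
        self.written = []    # stands in for the output packfile

    def store(self, data: bytes):
        """Return (name, was_written); duplicates are not written again."""
        name = blob_name(data)
        if name in self._entries:
            return name, False
        self._entries[name] = len(self.written)
        self.written.append(data)
        return name, True


table = ObjectTable()
table.store(b"hello\n")
table.store(b"hello\n")    # same content, e.g. reached via a merged branch
print(len(table.written))  # 1
```
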

per mark
~~~~~~~~
Marks are stored in a sparse array, using 1 pointer (4 bytes or 8
bytes, depending on pointer size) per mark. Although the array
is sparse, frontends are still strongly encouraged to use marks
between 1 and n, where n is the total number of marks required for
this import.
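
One way for a frontend to follow this advice is to hand out marks
sequentially instead of reusing identifiers from the source system. The
sketch below assumes a hypothetical `MarkAllocator` class and
`write_blob` helper, and the keys (`r90210` and so on) are made-up
source revision names; only the `blob`, `mark` and `data` command
syntax comes from the fast-import stream format.

```python
import io


class MarkAllocator:
    """Hands out marks 1, 2, 3, ... so the marks used stay in 1..n."""

    def __init__(self):
        self._next = 1
        self._by_key = {}

    def mark_for(self, key):
        # Reuse the mark if this source item has been seen before.
        if key not in self._by_key:
            self._by_key[key] = self._next
            self._next += 1
        return self._by_key[key]


def write_blob(out, mark, data):
    """Emit one blob command with an exact byte count data section."""
    out.write(b"blob\n")
    out.write(b"mark :%d\n" % mark)
    out.write(b"data %d\n" % len(data))
    out.write(data)
    out.write(b"\n")  # optional trailing LF after the raw data


marks = MarkAllocator()
stream = io.BytesIO()
for key, content in [("r90210", b"first"), ("r555000", b"second")]:
    write_blob(stream, marks.mark_for(key), content)
```

Here the two blobs receive marks 1 and 2, whereas using the source
identifiers directly would have spread the marks over a much wider
range of the sparse array.
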

per branch
~~~~~~~~~~
Branches are classified as active and inactive. The memory usage
of the two classes is significantly different.

Inactive branches are stored in a structure which uses 96 or 120
bytes (on 32-bit or 64-bit systems, respectively), plus the length of
the branch name (typically under 200 bytes), per branch. fast-import will
easily