Home Explore Blog CI



git

3rd chunk of `Documentation/git-fast-import.adoc`
97252f436c59bd0e134f96ae9518e67bbdd8beb77aee70fe0000000100000fa6
	Maximum delta depth, for blob and tree deltification.
	Default is 50.

--export-pack-edges=<file>::
	After creating a packfile, print a line of data to
	<file> listing the filename of the packfile and the last
	commit on each branch that was written to that packfile.
	This information may be useful after importing projects
	whose total object set exceeds the 4 GiB packfile limit,
	as these commits can be used as edge points during calls
	to 'git pack-objects'.

--max-pack-size=<n>::
	Maximum size of each output packfile.
	The default is unlimited.

fastimport.unpackLimit::
	See linkgit:git-config[1]

PERFORMANCE
-----------
The design of fast-import allows it to import large projects in a minimum
amount of memory usage and processing time.  Assuming the frontend
is able to keep up with fast-import and feed it a constant stream of data,
import times for projects holding 10+ years of history and containing
100,000+ individual commits are generally completed in just 1-2
hours on quite modest (~$2,000 USD) hardware.

Most bottlenecks appear to be in foreign source data access (the
source just cannot extract revisions fast enough) or disk IO (fast-import
writes as fast as the disk will take the data).  Imports will run
faster if the source data is stored on a different drive than the
destination Git repository (due to less IO contention).


DEVELOPMENT COST
----------------
A typical frontend for fast-import tends to weigh in at approximately 200
lines of Perl/Python/Ruby code.  Most developers have been able to
create working importers in just a couple of hours, even though it
is their first exposure to fast-import, and sometimes even to Git.  This is
an ideal situation, given that most conversion tools are throw-away
(use once, and never look back).


PARALLEL OPERATION
------------------
Like 'git push' or 'git fetch', imports handled by fast-import are safe to
run alongside parallel `git repack -a -d` or `git gc` invocations,
or any other Git operation (including 'git prune', as loose objects
are never used by fast-import).

fast-import does not lock the branch or tag refs it is actively importing.
After the import, during its ref update phase, fast-import tests each
existing branch ref to verify the update will be a fast-forward
update (the commit stored in the ref is contained in the new
history of the commit to be written).  If the update is not a
fast-forward update, fast-import will skip updating that ref and instead
prints a warning message.  fast-import will always attempt to update all
branch refs, and does not stop on the first failure.

Branch updates can be forced with --force, but it's recommended that
this only be used on an otherwise quiet repository.  Using --force
is not necessary for an initial import into an empty repository.


TECHNICAL DISCUSSION
--------------------
fast-import tracks a set of branches in memory.  Any branch can be created
or modified at any point during the import process by sending a
`commit` command on the input stream.  This design allows a frontend
program to process an unlimited number of branches simultaneously,
generating commits in the order they are available from the source
data.  It also simplifies the frontend programs considerably.

fast-import does not use or alter the current working directory, or any
file within it.  (It does however update the current Git repository,
as referenced by `GIT_DIR`.)  Therefore an import frontend may use
the working directory for its own purposes, such as extracting file
revisions from the foreign source.  This ignorance of the working
directory also allows fast-import to run very quickly, as it does not
need to perform any costly file update operations when switching
between branches.

INPUT FORMAT
------------
With the exception of raw file data (which Git does not interpret)
the fast-import input format is text (ASCII) based.  This text based
format simplifies development and debugging of frontend programs,
especially when a higher level language

Title: Git Fast Import Performance, Development, and Technical Details
Summary
The git-fast-import command is designed for efficient import of large projects, with options for tuning performance and handling parallel operations, and its text-based input format simplifies development and debugging of frontend programs, making it easy to create working importers with minimal code.