4th chunk of `Documentation/git-sparse-checkout.adoc`
 tree,
then its absence is ignored. Git will avoid populating the contents of
those files, which makes a sparse checkout helpful when working in a
repository with many files, but only a few are important to the current
user.

The `$GIT_DIR/info/sparse-checkout` file is used to define the
skip-worktree reference bitmap. When Git updates the working
directory, it updates the skip-worktree bits in the index based
on this file. The files matching the patterns in the file will
appear in the working directory, and the rest will not.
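
The relationship between the patterns file and the skip-worktree bits
can be observed in a scratch repository.  The following is only a
minimal sketch under stated assumptions: the paths `src/main.c` and
`docs/notes.txt` and the pattern `/src/` are hypothetical, and the file
is written by hand and applied with `git read-tree -m -u` purely for
illustration, where one would normally let the `set` and `add`
subcommands populate it.

```shell
#!/bin/sh
set -e

# Scratch repository; every path below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"
mkdir -p src docs
echo 'int main(void) { return 0; }' > src/main.c
echo 'notes' > docs/notes.txt
git add .
git commit -q -m "initial"

# Enable sparse checkout in non-cone mode and write one
# gitignore-style pattern by hand (illustration only).
git config core.sparseCheckout true
git config core.sparseCheckoutCone false
mkdir -p .git/info
printf '/src/\n' > .git/info/sparse-checkout

# Updating the working directory recomputes the skip-worktree bits
# from the patterns file.
git read-tree -m -u HEAD

# Entries tagged 'S' have the skip-worktree bit set; 'H' entries are
# populated in the working directory as usual.
git ls-files -t
```

With this pattern, `docs/notes.txt` should be tagged as skip-worktree
and be absent from the working directory, while `src/main.c` remains
present.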

INTERNALS -- NON-CONE PROBLEMS
------------------------------

The `$GIT_DIR/info/sparse-checkout` file populated by the `set` and
`add` subcommands is defined to be a bunch of patterns (one per line)
using the same syntax as `.gitignore` files.  In cone mode, these
patterns are restricted to matching directories (and users only ever
need supply or see directory names), while in non-cone mode any
gitignore-style pattern is permitted.  Using the full gitignore-style
patterns in non-cone mode has a number of shortcomings:

  * Fundamentally, it makes various worktree-updating processes (pull,
    merge, rebase, switch, reset, checkout, etc.) require O(N*M) pattern
    matches, where N is the number of patterns and M is the number of
    paths in the index.  This scales poorly.

  * Avoiding the scaling issue requires limiting the number of
    patterns, by specifying a leading directory name or glob.

  * Passing globs on the command line is error-prone as users may
    forget to quote the glob, causing the shell to expand it into all
    matching files and pass them all individually along to
    sparse-checkout set/add.  While this could also be a problem with
    e.g. "git grep -- *.c", mistakes with grep/log/status appear in
    the immediate output.  With sparse-checkout, the mistake gets
    recorded at the time the sparse-checkout command is run and might
    not be problematic until the user later switches branches or rebases
    or merges, thus putting a delay between the user's error and when
    they have a chance to catch/notice it.

  * Related to the previous item, sparse-checkout has an 'add'
    subcommand but no 'remove' subcommand.  Even if a 'remove'
    subcommand were added, undoing an accidental unquoted glob runs
    the risk of "removing too much", as it may remove entries that had
    been included before the accidental add.

  * Non-cone mode uses gitignore-style patterns to select what to
    *include* (with the exception of negated patterns), while
    .gitignore files use gitignore-style patterns to select what to
    *exclude* (with the exception of negated patterns).  The
    documentation on gitignore-style patterns usually does not talk in
    terms of matching or non-matching, but in terms of what the user
    wants to "exclude".  This can cause confusion for users trying to learn how
    to specify sparse-checkout patterns to get their desired behavior.

  * Every other git subcommand that wants to provide "special path
    pattern matching" of some sort uses pathspecs, but non-cone mode
    for sparse-checkout uses gitignore patterns, which feels
    inconsistent.

  * It has edge cases where the "right" behavior is unclear.  Two examples:

    First, two users are in a subdirectory, and the first runs
       git sparse-checkout set '/toplevel-dir/*.c'
    while the second runs
       git sparse-checkout set relative-dir
    Should those arguments be transliterated into
       current/subdirectory/toplevel-dir/*.c
    and
       current/subdirectory/relative-dir
    before inserting into the sparse-checkout file?  The user who typed
    the first command is probably aware that arguments to set/add are
    supposed to be patterns in non-cone mode, and probably would not be
    happy with such a transliteration.  However, many gitignore-style
    patterns are just paths, which might be what the user who typed the
    second command was thinking, and they'd be upset if their

Title: Git Sparse Checkout Internals: Non-Cone Mode Problems
Summary
The git sparse-checkout file uses patterns to define which files to include in the working directory. Non-cone mode allows for gitignore-style patterns, but this has several shortcomings, including poor scalability, error-prone glob passing, and inconsistent behavior. These issues can lead to confusion and unexpected behavior, making it challenging for users to manage their sparse checkouts effectively.