of which can be left unspecified. Upon
checkout, when the `smudge` command is specified, the command is
fed the blob object from its standard input, and its standard
output is used to update the worktree file. Similarly, the
`clean` command is used to convert the contents of worktree file
upon checkin. By default these commands process only a single
blob and terminate. If a long running `process` filter is used
in place of `clean` and/or `smudge` filters, then Git can process
all blobs with a single filter command invocation for the entire
life of a single Git command, for example `git add --all`. If a
long running `process` filter is configured then it always takes
precedence over a configured single blob filter. See section
below for the description of the protocol used to communicate with
a `process` filter.
One use of the content filtering is to massage the content into a shape
that is more convenient for the platform, filesystem, and the user to use.
For this mode of operation, the key phrase here is "more convenient" and
not "turning something unusable into usable". In other words, the intent
is that if someone unsets the filter driver definition, or does not have
the appropriate filter program, the project should still be usable.
Another use of the content filtering is to store the content that cannot
be directly used in the repository (e.g. a UUID that refers to the true
content stored outside Git, or an encrypted content) and turn it into a
usable form upon checkout (e.g. download the external content, or decrypt
the encrypted content).
These two filters behave differently, and by default, a filter is taken as
the former, massaging the contents into more convenient shape. A missing
filter driver definition in the config, or a filter driver that exits with
a non-zero status, is not an error but makes the filter a no-op passthru.
You can declare that a filter turns a content that by itself is unusable
into a usable content by setting the filter.<driver>.required configuration
variable to `true`.
Note: Whenever the clean filter is changed, the repo should be renormalized:
$ git add --renormalize .
For example, in .gitattributes, you would assign the `filter`
attribute for paths.
------------------------
*.c filter=indent
------------------------
Then you would define a "filter.indent.clean" and "filter.indent.smudge"
configuration in your .git/config to specify a pair of commands to
modify the contents of C programs when the source files are checked
in ("clean" is run) and checked out (no change is made because the
command is "cat").
------------------------
[filter "indent"]
clean = indent
smudge = cat
------------------------
For best results, `clean` should not alter its output further if it is
run twice ("clean->clean" should be equivalent to "clean"), and
multiple `smudge` commands should not alter `clean`'s output
("smudge->smudge->clean" should be equivalent to "clean"). See the
section on merging below.
The "indent" filter is well-behaved in this regard: it will not modify
input that is already correctly indented. In this case, the lack of a
smudge filter means that the clean filter _must_ accept its own output
without modifying it.
If a filter _must_ succeed in order to make the stored contents usable,
you can declare that the filter is `required`, in the configuration:
------------------------
[filter "crypt"]
clean = openssl enc ...
smudge = openssl enc -d ...
required
------------------------
Sequence "%f" on the filter command line is replaced with the name of
the file the filter is working on. A filter might use this in keyword
substitution. For example:
------------------------
[filter "p4"]
clean = git-p4-filter --clean %f
smudge = git-p4-filter --smudge %f
------------------------
Note that "%f" is the name of the path that is being worked on. Depending
on the version that is being filtered, the corresponding file on disk may
not exist, or may have different contents. So, smudge