installed), add the following section to your
`$GIT_DIR/config` file (or `$HOME/.gitconfig` file):
------------------------
[diff "jpg"]
	textconv = exif
------------------------
NOTE: The text conversion is generally a one-way conversion;
in this example, we lose the actual image contents and focus
just on the text data. This means that diffs generated by
textconv are _not_ suitable for applying. For this reason,
only `git diff` and the `git log` family of commands (i.e.,
log, whatchanged, show) will perform text conversion. `git
format-patch` will never generate this output. If you want to
send somebody a text-converted diff of a binary file (e.g.,
because it quickly conveys the changes you have made), you
should generate it separately and send it as a comment _in
addition to_ the usual binary diff that you might send.

Because text conversion can be slow, especially when `git log -p`
performs a large number of conversions, Git provides a mechanism
to cache the output and reuse it in future diffs. To enable
caching, set the `cachetextconv` variable in your diff driver's
config. For example:
------------------------
[diff "jpg"]
	textconv = exif
	cachetextconv = true
------------------------
This will cache the result of running `exif` on each blob
indefinitely. If you change the textconv config variable for a
diff driver, Git will automatically invalidate the cache entries
and re-run the textconv filter. If you want to invalidate the
cache manually (e.g., because your version of `exif` was updated
and now produces better output), you can remove it with
`git update-ref -d refs/notes/textconv/jpg` (where
`jpg` is the name of the diff driver, as in the example above).
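The manual invalidation can be wrapped in a small helper. The following is a minimal sketch, not part of Git: the function name `clear_textconv_cache` is hypothetical, and it assumes it is run inside the repository whose cache should be dropped.

```shell
#!/bin/sh
# Hypothetical helper: drop the textconv cache for one diff driver.
# Usage: clear_textconv_cache <driver-name>
clear_textconv_cache() {
    ref="refs/notes/textconv/$1"
    # Delete the ref only if it exists, so the helper is safe to re-run.
    if git rev-parse --verify --quiet "$ref" >/dev/null; then
        git update-ref -d "$ref"
    fi
}
```

Calling `clear_textconv_cache jpg` would then remove the cache for the `jpg` driver from the example, and Git would rebuild it on the next text-converted diff.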
Choosing textconv versus external diff
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
If you want to show differences between binary or specially-formatted
blobs in your repository, you can either use an external diff
command or use textconv to convert them to a diff-able text format.
Which method you choose depends on your exact situation.
The advantage of using an external diff command is flexibility. You are
not bound to find line-oriented changes, nor is it necessary for the
output to resemble unified diff. You are free to locate and report
changes in the most appropriate way for your data format.
A textconv, by comparison, is much more limiting. You provide a
transformation of the data into a line-oriented text format, and Git
uses its regular diff tools to generate the output. There are several
advantages to choosing this method:
1. Ease of use. It is often much simpler to write a binary-to-text
   transformation than it is to perform your own diff. In many cases,
   existing programs can be used as textconv filters (e.g., exif,
   odt2txt).

2. Git diff features. By performing only the transformation step
   yourself, you can still utilize many of Git's diff features,
   including colorization, word-diff, and combined diffs for merges.

3. Caching. Textconv caching can speed up repeated diffs, such as those
   you might trigger by running `git log -p`.
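To illustrate the first point, a textconv filter can be as small as the sketch below. It relies on the textconv calling convention (the program receives the name of a file to convert as its single argument and writes the resulting text to standard output). The name `bin2text` is hypothetical.

```shell
#!/bin/sh
# Sketch of a generic textconv filter (hypothetical name: bin2text).
# Prints one byte per line as two hex digits, so that a single changed
# byte shows up as a one-line change in the resulting diff.
bin2text() {
    # -An: no offsets; -v: do not collapse repeated lines; -tx1: hex bytes.
    od -An -v -tx1 "$1" | tr -s ' ' '\n' | sed '/^$/d'
}

# When run as a script, convert the file named by the first argument.
if [ "$#" -gt 0 ]; then
    bin2text "$1"
fi
```

Saved as an executable script named `bin2text` somewhere on your `PATH`, this could be selected with `textconv = bin2text` in a diff driver's config section.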
Marking files as binary
^^^^^^^^^^^^^^^^^^^^^^^
Git usually guesses correctly whether a blob contains text or binary
data by examining the beginning of the contents. However, sometimes you
may want to override its decision, either because a blob contains binary
data later in the file, or because the content, while technically
composed of text characters, is opaque to a human reader. For example,
many PostScript files contain only ASCII characters, but produce noisy
and meaningless diffs.
The simplest way to mark a file as binary is to unset the diff
attribute in the `.gitattributes` file:
------------------------
*.ps -diff
------------------------
This will cause Git to generate `Binary files differ` (or a binary
patch, if binary patches are enabled) instead of a regular diff.
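One way to confirm that the attribute took effect is `git check-attr`. The sketch below builds a throwaway repository purely for illustration. The path `figure.ps` is hypothetical and does not need to exist.

```shell
#!/bin/sh
# Sketch: check that "*.ps -diff" unsets the diff attribute.
# The scratch repository and the name "figure.ps" are illustrative only.
repo=$(mktemp -d)
git init -q "$repo"
printf '*.ps -diff\n' > "$repo/.gitattributes"

# Ask Git which value the "diff" attribute has for a matching path.
result=$(git -C "$repo" check-attr diff -- figure.ps)
echo "$result"    # prints "figure.ps: diff: unset"
```

In a real repository, only the `git check-attr diff -- <path>` call is needed.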
However, one may also want to specify other diff driver attributes. For
example, you might want to use