5th chunk of `Documentation/gitattributes.adoc`
 `foo.ps1` with
a `working-tree-encoding` enabled Git client, then `foo.ps1` will be
stored as UTF-8 internally. A client without `working-tree-encoding`
support will check out `foo.ps1` as a UTF-8 encoded file. This will
typically cause trouble for the users of this file.
+
If a Git client that does not support the `working-tree-encoding`
attribute adds a new file `bar.ps1`, then `bar.ps1` will be
stored "as-is" internally (in this example probably as UTF-16).
A client with `working-tree-encoding` support will interpret the
internal contents as UTF-8 and try to convert it to UTF-16 on checkout.
That operation will fail and cause an error.

- Reencoding content to non-UTF encodings can cause errors as the
  conversion might not be UTF-8 round trip safe. If you suspect that
  your encoding is not round trip safe, then add it to
  `core.checkRoundtripEncoding` to make Git check the round trip
  encoding (see linkgit:git-config[1]). SHIFT-JIS (Japanese character
  set) is known to have round trip issues with UTF-8 and is checked by
  default.

- Reencoding content requires resources that might slow down certain
  Git operations (e.g. 'git checkout' or 'git add').
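The round trip concern above can be checked by hand before relying on an
encoding. The following is a minimal sketch, assuming `iconv` and `cmp` are
installed; the helper name `roundtrip_safe` and the sample file are
hypothetical and not part of Git.

```shell
# Hypothetical helper: succeeds only if FILE is byte-for-byte identical
# after converting ENCODING -> UTF-8 -> ENCODING.
roundtrip_safe() {
    file=$1
    encoding=$2
    iconv -f "$encoding" -t UTF-8 "$file" |
        iconv -f UTF-8 -t "$encoding" |
        cmp -s - "$file"
}

# Build a small UTF-16LE sample file and run the check on it.
sample=$(mktemp)
printf 'hello\n' | iconv -f UTF-8 -t UTF-16LE >"$sample"

if roundtrip_safe "$sample" UTF-16LE
then
    roundtrip_result=safe
else
    roundtrip_result=unsafe
fi
echo "$roundtrip_result"
rm -f "$sample"
```

To test a real file, pass its path and the encoding you intend to use with
`working-tree-encoding`. A failure suggests the encoding is a candidate for
`core.checkRoundtripEncoding`.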

Use the `working-tree-encoding` attribute only if you cannot store a file
in UTF-8 encoding and if you want Git to be able to process the content
as text.

As an example, use the following attributes if your '*.ps1' files are
UTF-16 encoded with byte order mark (BOM) and you want Git to perform
automatic line ending conversion based on your platform.

------------------------
*.ps1		text working-tree-encoding=UTF-16
------------------------

Use the following attributes if your '*.ps1' files are UTF-16 little
endian encoded without BOM and you want Git to use Windows line endings
in the working directory (use `UTF-16LE-BOM` instead of `UTF-16LE` if
you want UTF-16 little endian with BOM).
It is highly recommended to explicitly define the line endings with
`eol` whenever the `working-tree-encoding` attribute is used, to avoid
ambiguity.

------------------------
*.ps1		text working-tree-encoding=UTF-16LE eol=crlf
------------------------
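To confirm which attributes Git actually applies to a path, `git check-attr`
can be queried. The sketch below assumes a throwaway repository created with
`mktemp`; the variable names are illustrative only, and the file `foo.ps1`
does not need to exist for the query to work.

```shell
# Create a scratch repository holding only the attribute line shown above.
attr_repo=$(mktemp -d)
git init -q "$attr_repo"
printf '*.ps1\ttext working-tree-encoding=UTF-16LE eol=crlf\n' \
    >"$attr_repo/.gitattributes"

# Ask Git which values it resolves for a matching path.
attrs=$(git -C "$attr_repo" check-attr text working-tree-encoding eol -- foo.ps1)
echo "$attrs"
```

The command prints one `path: attribute: value` line for each attribute
requested.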

You can get a list of all available encodings on your platform with the
following command:

------------------------
iconv --list
------------------------

If you do not know the encoding of a file, then you can use the `file`
command to guess the encoding:

------------------------
file foo.ps1
------------------------


`ident`
^^^^^^^

When the attribute `ident` is set for a path, Git replaces
`$Id$` in the blob object with `$Id:`, followed by the
40-character hexadecimal blob object name, followed by a dollar
sign `$` upon checkout.  Any byte sequence that begins with
`$Id:` and ends with `$` in the worktree file is replaced
with `$Id$` upon check-in.
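The expansion can be observed in a scratch repository. This is a minimal
sketch, assuming a default (SHA-1) repository; the file name `version.txt`
and the variable names are hypothetical. The file is checked out again from
the index so that the expansion is triggered.

```shell
# Scratch repository with `ident` enabled for *.txt files.
ident_repo=$(mktemp -d)
git init -q "$ident_repo"
printf '*.txt ident\n' >"$ident_repo/.gitattributes"
printf '%s\n' '$Id$' >"$ident_repo/version.txt"
git -C "$ident_repo" add .gitattributes version.txt

# The blob keeps the collapsed keyword.
stored=$(git -C "$ident_repo" cat-file -p :version.txt)
blob=$(git -C "$ident_repo" rev-parse :version.txt)

# Remove the worktree copy and check it out again to expand the keyword.
rm "$ident_repo/version.txt"
git -C "$ident_repo" checkout -- version.txt
expanded=$(cat "$ident_repo/version.txt")

echo "stored:   $stored"
echo "expanded: $expanded"
```

The worktree copy should now contain the blob object name between `$Id:` and
the closing `$`, while the blob itself still holds `$Id$`.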


`filter`
^^^^^^^^

A `filter` attribute can be set to a string value that names a
filter driver specified in the configuration.

A filter driver consists of a `clean` command and a `smudge`
command, either of which can be left unspecified.  Upon
checkout, when the `smudge` command is specified, the command is
fed the blob object from its standard input, and its standard
output is used to update the worktree file.  Similarly, the
`clean` command is used to convert the contents of the worktree file
upon check-in. By default, these commands process only a single
blob and terminate. If a long running `process` filter is used
in place of `clean` and/or `smudge` filters, then Git can process
all blobs with a single filter command invocation for the entire
life of a single Git command, for example `git add --all`. If a
long running `process` filter is configured then it always takes
precedence over a configured single blob filter. See the section
below for a description of the protocol used to communicate with
a `process` filter.
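As an illustration of single blob filters, the sketch below configures a
hypothetical driver named `trimws` whose `clean` command strips trailing
whitespace with `sed` and whose `smudge` command is `cat` (a no-op). The
driver name, file name, and variable names are assumptions made for this
example, not something Git ships.

```shell
# Scratch repository with a hypothetical "trimws" filter driver.
filter_repo=$(mktemp -d)
git init -q "$filter_repo"
git -C "$filter_repo" config filter.trimws.clean "sed -e 's/[[:space:]]*\$//'"
git -C "$filter_repo" config filter.trimws.smudge cat
printf '*.txt filter=trimws\n' >"$filter_repo/.gitattributes"

# The worktree file has trailing spaces; `clean` runs when it is added.
printf 'hello   \n' >"$filter_repo/note.txt"
git -C "$filter_repo" add .gitattributes note.txt
cleaned=$(git -C "$filter_repo" cat-file -p :note.txt)

# `smudge` runs when the file is checked out again.
rm "$filter_repo/note.txt"
git -C "$filter_repo" checkout -- note.txt
smudged=$(cat "$filter_repo/note.txt")

echo "blob:     [$cleaned]"
echo "worktree: [$smudged]"
```

Because `smudge` is `cat` here, the checked-out file matches the cleaned
blob. A driver that needs to reverse its `clean` step would supply a real
`smudge` command instead.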

One use of the content filtering is to massage the content into a shape
that is more convenient for the platform, filesystem, and the user to use.
For this mode of operation,

Title: Git Attributes for Text Encoding and Filtering
Summary
The `working-tree-encoding` attribute lets Git reencode a file between UTF-8 in the repository and another encoding in the working tree, with caveats about clients that lack support, round trip safety, and performance. The `ident` attribute expands `$Id$` to the blob object name on checkout and collapses it again on check-in. The `filter` attribute names a filter driver whose `clean`, `smudge`, or long running `process` commands transform content on check-in and checkout.