Git Attributes for Text Encoding and Filtering

  `foo.ps1` with a `working-tree-encoding` enabled Git client, then
  `foo.ps1` will be stored as UTF-8 internally. A client without
  `working-tree-encoding` support will check out `foo.ps1` as a UTF-8
  encoded file. This will typically cause trouble for the users of this
  file.
+
If a Git client that does not support the `working-tree-encoding`
attribute adds a new file `bar.ps1`, then `bar.ps1` will be
stored "as-is" internally (in this example probably as UTF-16).
A client with `working-tree-encoding` support will interpret the
internal contents as UTF-8 and try to convert it to UTF-16 on checkout.
That operation will fail and cause an error.

- Reencoding content to non-UTF encodings can cause errors as the
  conversion might not be UTF-8 round trip safe. If you suspect that
  your encoding is not round trip safe, then add it to
  `core.checkRoundtripEncoding` to make Git check the round trip
  encoding (see linkgit:git-config[1]). SHIFT-JIS (Japanese character
  set) is known to have round trip issues with UTF-8 and is checked by
  default.

- Reencoding content requires resources that might slow down certain
  Git operations (e.g. 'git checkout' or 'git add').

Use the `working-tree-encoding` attribute only if you cannot store a file
in UTF-8 encoding and if you want Git to be able to process the content
as text.

As an example, use the following attributes if your '*.ps1' files are
UTF-16 encoded with byte order mark (BOM) and you want Git to perform
automatic line ending conversion based on your platform.

------------------------
*.ps1		text working-tree-encoding=UTF-16
------------------------

Use the following attributes if your '*.ps1' files are UTF-16 little
endian encoded without BOM and you want Git to use Windows line endings
in the working directory (use `UTF-16LE-BOM` instead of `UTF-16LE` if
you want UTF-16 little endian with BOM).
Please note, it is highly recommended to explicitly define the line
endings with `eol` if the `working-tree-encoding` attribute is used to
avoid ambiguity.

------------------------
*.ps1		text working-tree-encoding=UTF-16LE eol=crlf
------------------------

You can get a list of all available encodings on your platform with the
following command:

------------------------
iconv --list
------------------------

If you do not know the encoding of a file, then you can use the `file`
command to guess the encoding:

------------------------
file foo.ps1
------------------------


`ident`
^^^^^^^

When the attribute `ident` is set for a path, Git replaces
`$Id$` in the blob object with `$Id:`, followed by the
40-character hexadecimal blob object name, followed by a dollar
sign `$` upon checkout. Any byte sequence that begins with
`$Id:` and ends with `$` in the worktree file is replaced
with `$Id$` upon check-in.


`filter`
^^^^^^^^

A `filter` attribute can be set to a string value that names a
filter driver specified in the configuration.

A filter driver consists of a `clean` command and a `smudge`
command, either of which can be left unspecified. Upon
checkout, when the `smudge` command is specified, the command is
fed the blob object from its standard input, and its standard
output is used to update the worktree file. Similarly, the
`clean` command is used to convert the contents of the worktree
file upon check-in. By default these commands process only a single
blob and terminate. If a long running `process` filter is used
in place of `clean` and/or `smudge` filters, then Git can process
all blobs with a single filter command invocation for the entire
life of a single Git command, for example `git add --all`. If a
long running `process` filter is configured then it always takes
precedence over a configured single blob filter. See the section
below for the description of the protocol used to communicate with
a `process` filter.
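The single blob contract described above (content in on standard input, converted content out on standard output) can be illustrated with a minimal Python sketch that mimics the `$Id$` handling from the `ident` section as a `clean`/`smudge` pair. This is not Git's implementation. The function names `blob_name`, `clean`, and `smudge` are hypothetical, the object name assumes a SHA-1 repository, and the single spaces placed around the object name are an assumption about the usual expanded form.

```python
import hashlib
import re
import sys

# Matches an already expanded keyword such as b"$Id: 1234abcd $" on one line.
EXPANDED_ID = re.compile(rb"\$Id:[^$\n]*\$")


def blob_name(content: bytes) -> str:
    """Return the SHA-1 object name of a blob holding `content`.

    Assumption: a SHA-1 repository, where the name is the hash of the
    header b"blob <size>\\0" followed by the raw content.
    """
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()


def clean(worktree: bytes) -> bytes:
    """Check-in direction: collapse every expanded keyword back to $Id$."""
    return EXPANDED_ID.sub(b"$Id$", worktree)


def smudge(blob: bytes) -> bytes:
    """Checkout direction: expand $Id$ using the name of the stored blob."""
    expanded = b"$Id: " + blob_name(blob).encode("ascii") + b" $"
    return blob.replace(b"$Id$", expanded)


# Hypothetical command line use: `python idfilter.py clean` or
# `python idfilter.py smudge`, reading stdin and writing stdout.
if __name__ == "__main__" and len(sys.argv) == 2 and sys.argv[1] in ("clean", "smudge"):
    transform = clean if sys.argv[1] == "clean" else smudge
    sys.stdout.buffer.write(transform(sys.stdin.buffer.read()))
```

Because `clean` undoes what `smudge` adds, a file that passes through both directions ends up unchanged, which is the property a filter pair needs so that a checkout followed by a check-in does not produce a spurious modification. The built-in `ident` attribute already covers this case, so the sketch is purely illustrative.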
One use of the content filtering is to massage the content into a shape that is more convenient for the platform, filesystem, and the user to use. For this mode of operation,

In summary, the `working-tree-encoding`, `ident`, and `filter` attributes let Git convert file encodings, expand `$Id$` keywords, and transform content on checkout and check-in. `working-tree-encoding` in particular comes with compatibility, round trip, and performance caveats, so use it only for files that cannot be stored as UTF-8.