Home Explore Blog Models CI



nixpkgs

1st chunk of `lib/fileset/README.md`
d16687c3fffe6e31fd677ab4096b7850e8eba980a0855d510000000100000fa5
# File set library

This is the internal contributor documentation.
The user documentation is [in the Nixpkgs manual](https://nixos.org/manual/nixpkgs/unstable/#sec-fileset).

## Goals

The main goal of the file set library is to be able to select local files that should be added to the Nix store.
It should have the following properties:
- Easy:
  The functions should have obvious semantics, be low in number and be composable.
- Safe:
  Throw early and helpful errors when mistakes are detected.
- Lazy:
  Only compute values when necessary.

Non-goals are:
- Efficient:
  If the abstraction proves itself worthwhile but too slow, it can be still be optimized further.

## Tests

Tests are declared in [`tests.sh`](./tests.sh) and can be run using
```
./tests.sh
```

## Benchmark

A simple benchmark against the HEAD commit can be run using
```
./benchmark.sh HEAD
```

This is intended to be run manually and is not checked by CI.

## Internal representation

The internal representation is versioned in order to allow file sets from different Nixpkgs versions to be composed with each other, see [`internal.nix`](./internal.nix) for the versions and conversions between them.
This section describes only the current representation, but past versions will have to be supported by the code.

### `fileset`

An attribute set with these values:

- `_type` (constant string `"fileset"`):
  Tag to indicate this value is a file set.

- `_internalVersion` (constant `3`, the current version):
  Version of the representation.

- `_internalIsEmptyWithoutBase` (bool):
  Whether this file set is the empty file set without a base path.
  If `true`, `_internalBase*` and `_internalTree` are not set.
  This is the only way to represent an empty file set without needing a base path.

  Such a value can be used as the identity element for `union` and the return value of `unions []` and co.

- `_internalBase` (path):
  Any files outside of this path cannot influence the set of files.
  This is always a directory and should be as long as possible.
  This is used by `lib.fileset.toSource` to check that all files are under the `root` argument

- `_internalBaseRoot` (path):
  The filesystem root of `_internalBase`, same as `(lib.path.splitRoot _internalBase).root`.
  This is here because this needs to be computed anyway, and this computation shouldn't be duplicated.

- `_internalBaseComponents` (list of strings):
  The path components of `_internalBase`, same as `lib.path.subpath.components (lib.path.splitRoot _internalBase).subpath`.
  This is here because this needs to be computed anyway, and this computation shouldn't be duplicated.

- `_internalTree` ([filesetTree](#filesettree)):
  A tree representation of all included files under `_internalBase`.

- `__noEval` (error):
  An error indicating that directly evaluating file sets is not supported.

## `filesetTree`

One of the following:

- `{ <name> = filesetTree; }`:
  A directory with a nested `filesetTree` value for directory entries.
  Entries not included may either be omitted or set to `null`, as necessary to improve efficiency or laziness.

- `"directory"`:
  A directory with all its files included recursively, allowing early cutoff for some operations.
  This specific string is chosen to be compatible with `builtins.readDir` for a simpler implementation.

- `"regular"`, `"symlink"`, `"unknown"` or any other non-`"directory"` string:
  A nested file with its file type.
  These specific strings are chosen to be compatible with `builtins.readDir` for a simpler implementation.
  Distinguishing between different file types is not strictly necessary for the functionality this library,
  but it does allow nicer printing of file sets.

- `null`:
  A file or directory that is excluded from the tree.
  It may still exist on the file system.

## API design decisions

This section justifies API design decisions.

### Internal structure

The representation of the file set data type is internal and can be changed over time.

Title: Nixpkgs File Set Library: Internal Documentation
Summary
This document serves as internal contributor documentation for the Nixpkgs file set library, distinct from the user manual. Its primary goal is to facilitate the selection of local files for the Nix store, emphasizing ease of use, safety, and laziness over efficiency. The document outlines how to run tests (`tests.sh`) and a manual benchmark (`benchmark.sh`). It details the versioned internal representation of a `fileset`, which includes metadata like type, version, base path information, and a `filesetTree`. The `filesetTree` structure recursively represents included files and directories, using specific strings for file types or `null` for excluded items. The API design ensures that the internal representation can evolve without breaking external usage.