  Entries not included may either be omitted or set to `null`, as necessary to improve efficiency or laziness.

- `"directory"`:
  A directory with all its files included recursively, allowing early cutoff for some operations.
  This specific string is chosen to be compatible with `builtins.readDir` for a simpler implementation.

- `"regular"`, `"symlink"`, `"unknown"` or any other non-`"directory"` string:
  A nested file with its file type.
  These specific strings are chosen to be compatible with `builtins.readDir` for a simpler implementation.
  Distinguishing between different file types is not strictly necessary for the functionality of this library,
  but it does allow nicer printing of file sets.

- `null`:
  A file or directory that is excluded from the tree.
  It may still exist on the file system.
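
As an illustration of the values listed above, a tree could look roughly like the following attribute set.
This is a minimal sketch: the file and directory names are hypothetical, and since the representation is internal, the exact shape may differ between versions of the library.

```nix
# Hypothetical tree for a project directory; all names are made up for illustration.
{
  # Every file below `src` is included, so operations can stop descending here.
  "src" = "directory";

  # A single included file, tagged with its file type as reported by `builtins.readDir`.
  "README.md" = "regular";

  # A partially included directory is itself a nested tree.
  "docs" = {
    "index.md" = "regular";
    # Explicitly excluded; the directory may still exist on the file system.
    "drafts" = null;
  };

  # Excluded entries can be `null` or omitted altogether.
  "result" = null;
}
```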

## API design decisions

This section justifies API design decisions.

### Internal structure

The representation of the file set data type is internal and can be changed over time.

Arguments:
- (+) The point of this library is to provide high-level functions; users don't need to be concerned with how it's implemented.
- (+) It allows adjustments to the representation, which is especially useful in the early days of the library.
- (+) It still allows the representation to be stabilized later if necessary and if it has proven itself.

### Influence tracking

File set operations internally track the top-most directory that could influence the exact contents of a file set.
Specifically, `toSource` requires that the given `fileset` is completely determined by files within the directory specified by the `root` argument.
For example, even with `dir/file.txt` being the only file in `./.`, `toSource { root = ./dir; fileset = ./.; }` gives an error.
This is because `fileset` could just as well be the result of filtering `./.` in a way that excludes `dir`.

Arguments:
- (+) This gives us the guarantee that adding new files to a project never breaks a file set expression.
  This is also true in a lesser form for removed files:
  only removing files explicitly referenced by paths can break a file set expression.
- (+) This restriction can be removed later if we discover it's too restrictive.
- (-) It leads to errors when a sensible result could sometimes be returned, such as in the above example.
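
The example above can be sketched as a small expression.
It assumes a project layout where `./dir` exists next to this file and that `lib` is the Nixpkgs library passed in by the caller; the attribute names `rejected`, `accepted` and `wholeProject` are hypothetical.

```nix
# A sketch, assuming `lib` is the Nixpkgs library and `./dir` exists.
{ lib }:
let
  fs = lib.fileset;
in
{
  # Expected to fail when evaluated: the file set `./.` could be influenced
  # by files outside of `root = ./dir`.
  rejected = fs.toSource {
    root = ./dir;
    fileset = ./.;
  };

  # Narrowing the file set to the root keeps it fully determined by `./dir`.
  accepted = fs.toSource {
    root = ./dir;
    fileset = ./dir;
  };

  # Alternatively, widen the root so that it covers the whole file set.
  wholeProject = fs.toSource {
    root = ./.;
    fileset = ./.;
  };
}
```

Because Nix is lazy, only the attribute that is actually evaluated can trigger the error.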

### Empty file set without a base

There is a special representation for an empty file set without a base path.
This is used for return values that should be empty when there is no base path that would make sense.

Arguments:
- Alternative: This could also be represented using `_internalBase = /.` and `_internalTree = null`.
  - (+) Removes the need for a special representation.
  - (-) Due to [influence tracking](#influence-tracking),
    `union empty ./.` would have `/.` as the base path.
    This would prevent `toSource { root = ./.; fileset = union empty ./.; }` from working,
    contrary to what one would expect.
  - (-) Assuming that there can be multiple filesystem roots (as established by the [path library](../path/README.md)),
    `union empty pathWithAnotherFilesystemRoot` would have to cause an error,
    contrary to what one would expect.
- Alternative: Do not have such a value, and throw an error whenever it would be needed as a return value.
  - (+) Removes the need for a special representation.
  - (-) Leaves us with no identity element for `union` and no reasonable return value for `unions []`.
    From a set theory perspective, which has a well-known notion of empty sets, this is unintuitive.
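
The behavior this design aims for can be sketched as follows.
The sketch assumes that `lib` is the Nixpkgs library passed in by the caller; the local name `empty` is only a binding for `unions [ ]`, used here to mirror the notation above.

```nix
# A sketch, assuming `lib` is the Nixpkgs library.
{ lib }:
let
  fs = lib.fileset;

  # The empty file set without a base path, obtained as the union of no file sets.
  empty = fs.unions [ ];
in
# Since `empty` has no base path of its own, the union is determined by `./.` alone,
# so influence tracking accepts `root = ./.`.
fs.toSource {
  root = ./.;
  fileset = fs.union empty ./.;
}
```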

### No intersection for lists

While there is `intersection a b`, there is no function `intersections [ a b c ]`.

Arguments:
- (+) There is no known use case for such a function; it can be added later if a use case arises.
- (+) There is no suitable return value for `intersections [ ]`; see also "Nullary intersections" in [Wikipedia's list of set identities and relations](https://en.wikipedia.org/w/index.php?title=List_of_set_identities_and_relations&oldid=1177174035#Definitions).
  - (-) The function could throw an error for that case.
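
If an intersection over several file sets is ever needed, one possible workaround is to fold `intersection` over a list that is non-empty by construction.
The helper name `intersectAll` and the paths below are hypothetical and not part of the library; `lib` is assumed to be the Nixpkgs library passed in by the caller.

```nix
# A sketch, assuming `lib` is the Nixpkgs library and the paths exist.
{ lib }:
let
  # Takes the first file set separately, so the nullary case cannot occur.
  intersectAll = first: rest:
    builtins.foldl' lib.fileset.intersection first rest;
in
intersectAll ./. [ ./src ./src/lib ]
```

Requiring the first element as its own argument sidesteps the question of what an intersection of zero file sets should return.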
