Home Explore Blog Models CI



nixpkgs

3rd chunk of `lib/fileset/README.md`
2a38619263eb83b5b2fc7f90b0d29358d1dab3cc90d9dd230000000100000fc2
  - (-) With the assumption that there can be multiple filesystem roots (as established with the [path library](../path/README.md)),
    this would have to cause an error with `union empty pathWithAnotherFilesystemRoot`,
    which is not as one would expect.
- Alternative: Do not have such a value and error when it would be needed as a return value
  - (+) Removes the need for a special representation.
  - (-) Leaves us with no identity element for `union` and no reasonable return value for `unions []`.
    From a set theory perspective, which has a well-known notion of empty sets, this is unintuitive.

### No intersection for lists

While there is `intersection a b`, there is no function `intersections [ a b c ]`.

Arguments:
- (+) There is no known use case for such a function, it can be added later if a use case arises
- (+) There is no suitable return value for `intersections [ ]`, see also "Nullary intersections" [here](https://en.wikipedia.org/w/index.php?title=List_of_set_identities_and_relations&oldid=1177174035#Definitions)
  - (-) Could throw an error for that case
  - (-) Create a special value to represent "all the files" and return that
    - (+) Such a value could then not be used with `fileFilter` unless the internal representation is changed considerably
  - (-) Could return the empty file set
    - (+) This would be wrong in set theory
- (-) Inconsistent with `union` and `unions`

### Intersection base path

The base path of the result of an `intersection` is the longest base path of the arguments.
E.g. the base path of `intersection ./foo ./foo/bar` is `./foo/bar`.
Meanwhile `intersection ./foo ./bar` returns the empty file set without a base path.

Arguments:
- Alternative: Use the common prefix of all base paths as the resulting base path
  - (-) This is unnecessarily strict, because the purpose of the base path is to track the directory under which files _could_ be in the file set.
    It should be as long as possible.
    All files contained in `intersection ./foo ./foo/bar` will be under `./foo/bar` (never just under `./foo`), and `intersection ./foo ./bar` will never contain any files (never under `./.`).
    This would lead to `toSource` having to unexpectedly throw errors for cases such as `toSource { root = ./foo; fileset = intersect ./foo base; }`, where `base` may be `./bar` or `./.`.
  - (-) There is no benefit to the user, since base path is not directly exposed in the interface

### Empty directories

File sets can only represent a _set_ of local files.
Directories on their own are not representable.

Arguments:
- (+) There does not seem to be a sensible set of combinators when directories can be represented on their own.
  Here's some possibilities:
  - `./.` represents the files in `./.` _and_ the directory itself including its subdirectories, meaning that even if there's no files, the entire structure of `./.` is preserved

    In that case, what should `fileFilter (file: false) ./.` return?
    It could return the entire directory structure unchanged, but with all files removed, which would not be what one would expect.

    Trying to have a filter function that also supports directories will lead to the question of:
    What should the behavior be if `./foo` itself is excluded but all of its contents are included?
    It leads to having to define when directories are recursed into, but then we're effectively back at how the `builtins.path`-based filters work.

  - `./.` represents all files in `./.` _and_ the directory itself, but not its subdirectories, meaning that at least `./.` will be preserved even if it's empty.

    In that case, `intersection ./. ./foo` should only include files and no directories themselves, since `./.` includes only `./.` as a directory, and same for `./foo`, so there's no overlap in directories.
    But intuitively this operation should result in the same as `./foo` – everything else is just confusing.
- (+) This matches how Git only supports files, so developers should already be used to it.

Title: Nixpkgs File Set Library: Advanced API Design Considerations
Summary
This document continues the justification for API design decisions within the Nixpkgs file set library. It explains the deliberate exclusion of a function for `intersections` of a list of file sets, citing a lack of use cases and the difficulty in defining a semantically correct return value for an empty input list, unlike `union` and `unions`. It details the policy for determining an `intersection`'s base path: it is the longest possible base path that contains all resulting files, not merely the common prefix of argument base paths, to avoid unnecessary restrictions and unexpected errors for users. Finally, it justifies why file sets represent only files and not empty directories themselves, arguing that explicit directory representation would complicate combinator logic, lead to unintuitive filtering behavior, and break consistency with Git's file-based model.