Entries not included may either be omitted or set to `null`, as necessary to improve efficiency or laziness.
- `"directory"`:
A directory with all its files included recursively, allowing early cutoff for some operations.
This specific string is chosen to be compatible with `builtins.readDir` for a simpler implementation.
- `"regular"`, `"symlink"`, `"unknown"` or any other non-`"directory"` string:
A nested file with its file type.
These specific strings are chosen to be compatible with `builtins.readDir` for a simpler implementation.
Distinguishing between different file types is not strictly necessary for the functionality of this library,
but it does allow nicer printing of file sets.
- `null`:
A file or directory that is excluded from the tree.
It may still exist on the file system.
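
As an illustration of the value kinds listed above, a tree for a hypothetical project might look like the following sketch. All file names are made up, and the nested attribute set for a partially included directory is an assumption based on the entries described in this section, not a guaranteed layout of the internal representation.

```nix
# Hypothetical tree value; every name here is illustrative only.
{
  # Fully included directory: operations can stop here without recursing.
  src = "directory";
  # Included files, using the same type strings that builtins.readDir returns.
  "README.md" = "regular";
  latest = "symlink";
  # Excluded entry; it may still exist on the file system.
  result = null;
  # Partially included directory (assumed to be a nested attribute set).
  docs = {
    "index.md" = "regular";
    drafts = null;
  };
}
```

Since excluded entries may also be omitted, dropping `result` and `drafts` from this sketch would describe the same file set.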
## API design decisions
This section justifies API design decisions.
### Internal structure
The representation of the file set data type is internal and can be changed over time.
Arguments:
- (+) The point of this library is to provide high-level functions; users don't need to be concerned with how it's implemented.
- (+) It allows adjustments to the representation, which is especially useful in the early days of the library.
- (+) It still allows the representation to be stabilized later if necessary and if it has proven itself.
### Influence tracking
File set operations internally track the top-most directory that could influence the exact contents of a file set.
Specifically, `toSource` requires that the given `fileset` is completely determined by files within the directory specified by the `root` argument.
For example, even with `dir/file.txt` being the only file in `./.`, `toSource { root = ./dir; fileset = ./.; }` gives an error.
This is because `fileset` may as well be the result of filtering `./.` in a way that excludes `dir`.
Arguments:
- (+) This gives us the guarantee that adding new files to a project never breaks a file set expression.
This is also true in a lesser form for removed files:
only removing files explicitly referenced by paths can break a file set expression.
- (+) This restriction can be lifted later if we discover it's too restrictive.
- (-) It leads to errors when a sensible result could sometimes be returned, such as in the above example.
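
The behavior described in this section can be sketched as follows. This assumes nixpkgs is available as `<nixpkgs>` and that the expression lives next to a `dir` directory containing `file.txt`; the attribute names `accepted` and `rejected` are hypothetical.

```nix
let
  # Assumption: nixpkgs is on NIX_PATH.
  lib = import <nixpkgs/lib>;
in
{
  # Accepted: the file set is completely determined by files within ./dir.
  accepted = lib.fileset.toSource {
    root = ./dir;
    fileset = ./dir;
  };

  # Gives an error when evaluated: ./. could be influenced by files
  # outside of ./dir, even if dir/file.txt is currently the only file.
  rejected = lib.fileset.toSource {
    root = ./dir;
    fileset = ./.;
  };
}
```

Because attribute sets are lazy, evaluating only `accepted` does not trigger the error from `rejected`.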
### Empty file set without a base
There is a special representation for an empty file set without a base path.
This is used for return values that should be empty when there is no base path that would make sense.
Arguments:
- Alternative: This could also be represented using `_internalBase = /.` and `_internalTree = null`.
  - (+) Removes the need for a special representation.
  - (-) Due to [influence tracking](#influence-tracking),
    `union empty ./.` would have `/.` as the base path,
    which would then prevent `toSource { root = ./.; fileset = union empty ./.; }` from working,
    which is not as one would expect.
  - (-) With the assumption that there can be multiple filesystem roots (as established with the [path library](../path/README.md)),
    this would have to cause an error with `union empty pathWithAnotherFilesystemRoot`,
    which is not as one would expect.
- Alternative: Do not have such a value and error when it would be needed as a return value.
  - (+) Removes the need for a special representation.
  - (-) Leaves us with no identity element for `union` and no reasonable return value for `unions []`.
    From a set theory perspective, which has a well-known notion of empty sets, this is unintuitive.
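
A minimal sketch of the identity-element behavior this design aims for, assuming nixpkgs is available as `<nixpkgs>` and that `unions [ ]` evaluates to the base-less empty file set; the local name `empty` is only a binding for this example:

```nix
let
  # Assumption: nixpkgs is on NIX_PATH.
  lib = import <nixpkgs/lib>;
  inherit (lib.fileset) union unions toSource;

  # Assumed to be the empty file set without a base path.
  empty = unions [ ];
in
# With the special representation, the union takes its base path from ./.
# only, so using ./. as the root is expected to work.
toSource {
  root = ./.;
  fileset = union empty ./.;
}
```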
### No intersection for lists
While there is `intersection a b`, there is no function `intersections [ a b c ]`.
Arguments:
- (+) There is no known use case for such a function; it can be added later if a use case arises.
- (+) There is no suitable return value for `intersections [ ]`; see also "Nullary intersections" [here](https://en.wikipedia.org/w/index.php?title=List_of_set_identities_and_relations&oldid=1177174035#Definitions).
  - (-) Could throw an error for that case.
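
If such a function is ever needed, callers can fold `intersection` over a non-empty list themselves. The following is only a sketch: `intersectionsNonEmpty` is a hypothetical helper that is not part of the library, and the paths `./src` and `./.` are placeholders that must exist for the expression to evaluate.

```nix
let
  # Assumption: nixpkgs is on NIX_PATH.
  lib = import <nixpkgs/lib>;

  # Hypothetical helper, not provided by the library.
  # It throws for the empty list, since there is no suitable return value.
  intersectionsNonEmpty = filesets:
    if filesets == [ ] then
      throw "intersectionsNonEmpty: expected at least one file set"
    else
      lib.foldl' lib.fileset.intersection (lib.head filesets) (lib.tail filesets);
in
intersectionsNonEmpty [ ./src ./. ]
```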