Home Explore Blog CI



git

8th chunk of `Documentation/MyFirstObjectWalk.adoc`
46ba98c55a112e313b9372a6637b9b03e4ab61203a4744dd0000000100000fa2

----

NOTE: This output is intended to be machine-parsed. Therefore, we are not
sending it to `trace_printf()`, and we are not localizing it - we need scripts
to be able to count on the formatting to be exactly the way it is shown here.
If we were intending this output to be read by humans, we would need to localize
it with `_()`.

Finally, we'll ask `cmd_walken()` to use the object walk instead. Discussing
command line options is out of scope for this tutorial, so we'll just hardcode
a branch we can change at compile time. Where you call `final_rev_info_setup()`
and `walken_commit_walk()`, instead branch like so:

----
	if (1) {
		add_head_to_pending(&rev);
		walken_object_walk(&rev);
	} else {
		final_rev_info_setup(argc, argv, prefix, &rev);
		walken_commit_walk(&rev);
	}
----

NOTE: For simplicity, we've avoided all the filters and sorts we applied in
`final_rev_info_setup()` and simply added `HEAD` to our pending queue. If you
want, you can certainly use the filters we added before by moving
`final_rev_info_setup()` out of the conditional and removing the call to
`add_head_to_pending()`.

Now we can try to run our command! It should take noticeably longer than the
commit walk, but an examination of the output will give you an idea why. Your
output should look similar to this example, but with different counts:

----
Object walk completed. Found 55733 commits, 100274 blobs, 0 tags, and 104210 trees.
----

This makes sense. We have more trees than commits because the Git project has
lots of subdirectories which can change, plus at least one tree per commit. We
have no tags because we started on a commit (`HEAD`) and while tags can point to
commits, commits can't point to tags.

NOTE: You will have different counts when you run this yourself! The number of
objects grows along with the Git project.

=== Adding a Filter

There are a handful of filters that we can apply to the object walk laid out in
`Documentation/rev-list-options.adoc`. These filters are typically useful for
operations such as creating packfiles or performing a partial clone. They are
defined in `list-objects-filter-options.h`. For the purposes of this tutorial we
will use the "tree:1" filter, which causes the walk to omit all trees and blobs
which are not directly referenced by commits reachable from the commit in
`pending` when the walk begins. (`pending` is the list of objects which need to
be traversed during a walk; you can imagine a breadth-first tree traversal to
help understand. In our case, that means we omit trees and blobs not directly
referenced by `HEAD` or `HEAD`'s history, because we begin the walk with only
`HEAD` in the `pending` list.)

For now, we are not going to track the omitted objects, so we'll replace those
parameters with `NULL`. For the sake of simplicity, we'll add a simple
build-time branch to use our filter or not. Preface the line calling
`traverse_commit_list()` with the following, which will remind us which kind of
walk we've just performed:

----
	if (0) {
		/* Unfiltered: */
		trace_printf(_("Unfiltered object walk.\n"));
	} else {
		trace_printf(
			_("Filtered object walk with filterspec 'tree:1'.\n"));

		parse_list_objects_filter(&rev->filter, "tree:1");
	}
	traverse_commit_list(rev, walken_show_commit,
			     walken_show_object, NULL);
----

The `rev->filter` member is usually built directly from a command
line argument, so the module provides an easy way to build one from a string.
Even though we aren't taking user input right now, we can still build one with
a hardcoded string using `parse_list_objects_filter()`.

With the filter spec "tree:1", we are expecting to see _only_ the root tree for
each commit; therefore, the tree object count should be less than or equal to
the number of commits. (For an example of why that's true: `git commit --revert`
points to the same tree object as its grandparent.)

=== Counting Omitted Objects

We also have the capability to enumerate all objects which were omitted by a
filter, like

Title: Filtering the Object Walk
Summary
This section explains how to apply filters to the object walk in Git, specifically using the 'tree:1' filter to omit trees and blobs not directly referenced by commits, and how to count omitted objects, with code examples and explanations of the filter's effects on the object counts.