Home Explore Blog CI



git

6th chunk of `Documentation/MyFirstObjectWalk.adoc`
3d837a7548c21c463a959508af8fb5da6589053ad28cfa270000000100000fa7
 indicates that commits can be reordered after they're written, for
example with `git rebase`.

Let's try one more reordering of commits. `rev_info` exposes a `reverse` flag.
Set that flag somewhere inside of `final_rev_info_setup()`:

----
static void final_rev_info_setup(int argc, const char **argv, const char *prefix,
		struct rev_info *rev)
{
	...

	rev->reverse = 1;

	...
}
----

Run your walk again and note the difference in order. (If you remove the grep
pattern, you should see the last commit this call gives you as your current
HEAD.)

== Basic Object Walk

So far we've been walking only commits. But Git has more types of objects than
that! Let's see if we can walk _all_ objects, and find out some information
about each one.

We can base our work on an example. `git pack-objects` prepares all kinds of
objects for packing into a bitmap or packfile. The work we are interested in
resides in `builtin/pack-objects.c:get_object_list()`; examination of that
function shows that the all-object walk is being performed by
`traverse_commit_list()` or `traverse_commit_list_filtered()`. Those two
functions reside in `list-objects.c`; examining the source shows that, despite
the name, these functions traverse all kinds of objects. Let's have a look at
the arguments to `traverse_commit_list()`.

- `struct rev_info *revs`: This is the `rev_info` used for the walk. If
  its `filter` member is not `NULL`, then `filter` contains information for
  how to filter the object list.
- `show_commit_fn show_commit`: A callback which will be used to handle each
  individual commit object.
- `show_object_fn show_object`: A callback which will be used to handle each
  non-commit object (so each blob, tree, or tag).
- `void *show_data`: A context buffer which is passed in turn to `show_commit`
  and `show_object`.

In addition, `traverse_commit_list_filtered()` has an additional parameter:

- `struct oidset *omitted`: A linked-list of object IDs which the provided
  filter caused to be omitted.

It looks like these methods use callbacks we provide instead of needing us
to call it repeatedly ourselves. Cool! Let's add the callbacks first.

For the sake of this tutorial, we'll simply keep track of how many of each kind
of object we find. At file scope in `builtin/walken.c` add the following
tracking variables:

----
static int commit_count;
static int tag_count;
static int blob_count;
static int tree_count;
----

Commits are handled by a different callback than other objects; let's do that
one first:

----
static void walken_show_commit(struct commit *cmt, void *buf)
{
	commit_count++;
}
----

The `cmt` argument is fairly self-explanatory. But it's worth mentioning that
the `buf` argument is actually the context buffer that we can provide to the
traversal calls - `show_data`, which we mentioned a moment ago.

Since we have the `struct commit` object, we can look at all the same parts that
we looked at in our earlier commit-only walk. For the sake of this tutorial,
though, we'll just increment the commit counter and move on.

The callback for non-commits is a little different, as we'll need to check
which kind of object we're dealing with:

----
static void walken_show_object(struct object *obj, const char *str, void *buf)
{
	switch (obj->type) {
	case OBJ_TREE:
		tree_count++;
		break;
	case OBJ_BLOB:
		blob_count++;
		break;
	case OBJ_TAG:
		tag_count++;
		break;
	case OBJ_COMMIT:
		BUG("unexpected commit object in walken_show_object\n");
	default:
		BUG("unexpected object type %s in walken_show_object\n",
			type_name(obj->type));
	}
}
----

Again, `obj` is fairly self-explanatory, and we can guess that `buf` is the same
context pointer that `walken_show_commit()` receives: the `show_data` argument
to `traverse_commit_list()` and `traverse_commit_list_filtered()`. Finally,
`str` contains the name of the object, which ends up being something like
`foo.txt` (blob), `bar/baz` (tree), or `v1.2.3` (tag).

To help assure us that we aren't double-counting commits,

Title: Implementing an Object Walk in Git
Summary
This section discusses implementing an object walk in Git that can handle all types of objects, including commits, tags, blobs, and trees, using the `traverse_commit_list()` and `traverse_commit_list_filtered()` functions, and defines callback functions `walken_show_commit()` and `walken_show_object()` to track and process each type of object.