= My First Object Walk
== What's an Object Walk?
The object walk is a key concept in Git - this is the process that underpins
operations like object transfer and fsck. Beginning from a given commit, the
list of objects is found by walking parent relationships between commits (commit
X based on commit W) and containment relationships between objects (tree Y is
contained within commit X, and blob Z is located within tree Y, giving our
working tree for commit X something like `y/z.txt`).
A related concept is the revision walk, which is focused on commit objects and
their parent relationships and does not delve into other object types. The
revision walk is used for operations like `git log`.
=== Related Reading
- `Documentation/user-manual.adoc` under "Hacking Git" contains some coverage of
the revision walker in its various incarnations.
- `revision.h`
- https://eagain.net/articles/git-for-computer-scientists/[Git for Computer Scientists]
gives a good overview of the types of objects in Git and what your object
walk is really describing.
== Setting Up
Create a new branch from `master`.
----
git checkout -b revwalk origin/master
----
We'll put our fiddling into a new command. For fun, let's name it `git walken`.
Open up a new file `builtin/walken.c` and set up the command handler:
----
/*
* "git walken"
*
* Part of the "My First Object Walk" tutorial.
*/
#include "builtin.h"
#include "trace.h"
int cmd_walken(int argc, const char **argv, const char *prefix)
{
trace_printf(_("cmd_walken incoming...\n"));
return 0;
}
----
NOTE: `trace_printf()`, defined in `trace.h`, differs from `printf()` in
that it can be turned on or off at runtime. For the purposes of this
tutorial, we will write `walken` as though it is intended for use as
a "plumbing" command: that is, a command which is used primarily in
scripts, rather than interactively by humans (a "porcelain" command).
So we will send our debug output to `trace_printf()` instead.
When running, enable trace output by setting the environment variable `GIT_TRACE`.
Add usage text and `-h` handling, like all subcommands should consistently do
(our test suite will notice and complain if you fail to do so).
We'll need to include the `parse-options.h` header.
----
#include "parse-options.h"
...
int cmd_walken(int argc, const char **argv, const char *prefix)
{
const char * const walken_usage[] = {
N_("git walken"),
NULL,
};
struct option options[] = {
OPT_END()
};
argc = parse_options(argc, argv, prefix, options, walken_usage, 0);
...
}
----
Also add the relevant line in `builtin.h` near `cmd_whatchanged()`:
----
int cmd_walken(int argc, const char **argv, const char *prefix);
----
Include the command in `git.c` in `commands[]` near the entry for `whatchanged`,
maintaining alphabetical ordering:
----
{ "walken", cmd_walken, RUN_SETUP },
----
Add it to the `Makefile` near the line for `builtin/worktree.o`:
----
BUILTIN_OBJS += builtin/walken.o
----
Build and test out your command, without forgetting to ensure the `DEVELOPER`
flag is set, and with `GIT_TRACE` enabled so the debug output can be seen:
----
$ echo DEVELOPER=1 >>config.mak
$ make
$ GIT_TRACE=1 ./bin-wrappers/git walken
----
NOTE: For a more exhaustive overview of the new command process, take a look at
`Documentation/MyFirstContribution.adoc`.
NOTE: A reference implementation can be found at
https://github.com/nasamuffin/git/tree/revwalk.
=== `struct rev_cmdline_info`
The definition of `struct rev_cmdline_info` can be found in `revision.h`.
This struct is contained within the `rev_info` struct and is used to reflect
parameters provided by the user over the CLI.
`nr` represents the number of `rev_cmdline_entry` present in the array.
`alloc` is used by the `ALLOC_GROW` macro. Check `alloc.h` - this variable is
used to track the allocated size of the list.
Per entry, we find:
`item` is the object provided upon which to base the object walk. Items in Git
can be blobs, trees, commits,