Home Explore Blog CI



git

9th chunk of `Documentation/git-bisect-lk2009.adoc`
7e3a0764946b3b9df152c0f20f1f270729e15e42b5dd45f70000000100000fa6
 bisection points
all happened to be in an area where all the commits are
untestable. And in this case the user was asked to test many
untestable commits, which could be very inefficient.

Indeed untestable commits are often untestable because a breakage was
introduced at one time, and that breakage was fixed only after many
other commits were introduced.

This breakage is of course most of the time unrelated to the breakage
we are trying to locate in the commit graph. But it prevents us to
know if the interesting "bad behavior" is present or not.

So it is a fact that commits near an untestable commit have a high
probability of being untestable themselves. And the best bisection
commits are often found together too (due to the bisection algorithm).

This is why it is a bad idea to just chose the next best unskipped
bisection commit when the first one has been skipped.

We found that most commits on the graph may give quite a lot of
information when they are tested. And the commits that will not on
average give a lot of information are the one near the good and bad
commits.

So using a PRNG with a bias to favor commits away from the good and
bad commits looked like a good choice.

One obvious improvement to this algorithm would be to look for a
commit that has an associated value near the one of the best bisection
commit, and that is on another branch, before using the PRNG. Because
if such a commit exists, then it is not very likely to be untestable
too, so it will probably give more information than a nearly randomly
chosen one.

Checking merge bases
~~~~~~~~~~~~~~~~~~~~

There is another tweak in the bisection algorithm that has not been
described in the "bisection algorithm" above.

We supposed in the previous examples that the "good" commits were
ancestors of the "bad" commit. But this is not a requirement of "git
bisect".

Of course the "bad" commit cannot be an ancestor of a "good" commit,
because the ancestors of the good commits are supposed to be
"good". And all the "good" commits must be related to the bad commit.
They cannot be on a branch that has no link with the branch of the
"bad" commit. But it is possible for a good commit to be related to a
bad commit and yet not be neither one of its ancestor nor one of its
descendants.

For example, there can be a "main" branch, and a "dev" branch that was
forked of the main branch at a commit named "D" like this:

-------------
A-B-C-D-E-F-G  <--main
       \
        H-I-J  <--dev
-------------

The commit "D" is called a "merge base" for branch "main" and "dev"
because it's the best common ancestor for these branches for a merge.

Now let's suppose that commit J is bad and commit G is good and that
we apply the bisection algorithm like it has been previously
described.

As described in step 1) b) of the bisection algorithm, we remove all
the ancestors of the good commits because they are supposed to be good
too.

So we would be left with only:

-------------
H-I-J
-------------

But what happens if the first bad commit is "B" and if it has been
fixed in the "main" branch by commit "F"?

The result of such a bisection would be that we would find that H is
the first bad commit, when in fact it's B. So that would be wrong!

And yes it can happen in practice that people working on one branch
are not aware that people working on another branch fixed a bug! It
could also happen that F fixed more than one bug or that it is a
revert of some big development effort that was not ready to be
released.

In fact development teams often maintain both a development branch and
a maintenance branch, and it would be quite easy for them if "git
bisect" just worked when they want to bisect a regression on the
development branch that is not on the maintenance branch. They should
be able to start bisecting using:

-------------
$ git bisect start dev main
-------------

To enable that additional nice feature, when a bisection is started
and when some good commits are not ancestors of the bad commit,

Title: Advanced Git Bisect Algorithm
Summary
This text discusses the limitations of the standard Git bisect algorithm when dealing with untestable commits and multiple branches, and describes additional tweaks and improvements to the algorithm, including the use of a pseudo-random number generator to favor commits away from good and bad commits, and the consideration of merge bases to handle cases where good commits are not ancestors of bad commits, allowing for more accurate and efficient bisecting in complex branch scenarios.