===================
Bisecting LLVM code
===================

Introduction
============

``git bisect`` is a useful tool for finding which revision caused a bug.

This document describes how to use ``git bisect``. In particular, while LLVM
has a mostly linear history, it has a few merge commits that added projects --
and these merged the linear history of those projects. As a consequence, the
LLVM repository has multiple roots: One "normal" root, and then one for each
toplevel project that was developed out-of-tree and then merged later.
As of early 2020, the only such merged project is MLIR, but flang will likely
be merged in a similar way soon.

Basic operation
===============

See https://git-scm.com/docs/git-bisect for a good overview. In summary
(``f00ba`` stands for a revision that is known to be good):

  .. code-block:: bash

     git bisect start
     git bisect bad main
     git bisect good f00ba

git will check out a revision in between. Try to reproduce your problem at
that revision, and run ``git bisect good`` or ``git bisect bad`` depending on
the result. Repeat until git reports the first bad commit.
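Because each step roughly halves the range of candidate revisions, a bisect
over N commits needs about log2(N) steps. The following sketch is not part of
git: ``bisect_steps`` is a hypothetical helper that computes this estimate so
you can judge how long a bisect will take before starting it. git prints its
own estimate after every step, and that number may differ by one.

```shell
# Hypothetical helper (not a git command): estimate how many bisect steps
# are needed to narrow down N candidate commits by repeated halving.
bisect_steps() {
  n=$1
  steps=0
  while [ "$n" -gt 1 ]; do
    n=$(( (n + 1) / 2 ))   # each step leaves at most half of the candidates
    steps=$(( steps + 1 ))
  done
  echo "$steps"
}

# Usage inside llvm-project, where f00ba stands for the known-good revision:
#   bisect_steps "$(git rev-list --count f00ba..main)"
bisect_steps 1000
```

Multiplying the result by the time one build-and-test cycle takes gives a
rough idea of the total bisect time.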

If you can't reproduce the problem at the current commit (for example because
the build is broken), run ``git bisect skip`` and git will pick a nearby
alternate commit.

(To abort a bisect, run ``git bisect reset``. If git complains about not
being able to reset, do the usual ``git checkout -f main; git reset --hard
origin/main`` dance and try again.)

``git bisect run``
==================

A single bisect step often requires first building clang, and then compiling
a large code base with just-built clang. This can take a long time, so it's
good if it can happen completely automatically. ``git bisect run`` can do
this for you if you write a run script that reproduces the problem
automatically. Writing the script can take 10-20 minutes, but it's almost
always worth it -- you can do something else while the bisect runs (such
as writing this document).

Here's an example run script, saved as ``run.sh``. It assumes that you're in
``llvm-project`` and that you have a sibling ``llvm-build-project`` build
directory where you configured CMake to use Ninja. You have a file
``repro.c`` in the current directory that makes clang crash at trunk, but it
worked fine at revision ``f00ba``.

  .. code-block:: bash

     #!/bin/sh
     # Build clang. If the build fails, `exit 125` causes this
     # revision to be skipped.
     ninja -C ../llvm-build-project clang || exit 125

     # The exit status of the last command decides the verdict:
     # 0 marks the revision good, 1-127 (except 125) marks it bad.
     ../llvm-build-project/bin/clang repro.c

To make sure your run script works, it's a good idea to run ``./run.sh`` by
hand and tweak the script until it works. Then run ``git bisect good`` or
``git bisect bad`` manually once based on the result of the script
(check ``echo $?`` after your script ran), and only then run ``git bisect run
./run.sh``. Don't forget to mark your run script as executable
(``chmod +x run.sh``): ``git bisect run`` doesn't check for that, it just
assumes the run script failed each time.

Once your run script works, run ``git bisect run ./run.sh`` and a few hours
later you'll know which commit caused the regression.

(This is a very simple run script. Often, you want to use just-built clang
to build a different project and then run a built executable of that project
in the run script.)
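One pitfall with run scripts that end in the reproducer itself:
``git bisect run`` treats exit status 125 as "skip" and any status above 127
as a request to abort the whole bisect, and a shell reports a process killed
by a signal as 128 plus the signal number. A minimal sketch that collapses
every failure of the reproducer into exit status 1 could look like this
(``run_repro`` is a hypothetical helper name, not something git or LLVM
provides):

```shell
# Hypothetical helper: run a reproducer and reduce its exit status to
# 0 (good) or 1 (bad), so a crash can neither skip nor abort the bisect.
run_repro() {
  if "$@"; then
    return 0
  fi
  return 1
}

# Sketch of how run.sh could use it (same paths as the example script):
#   ninja -C ../llvm-build-project clang || exit 125
#   run_repro ../llvm-build-project/bin/clang repro.c
#   exit $?
run_repro sh -c 'exit 139' || echo "crash-like status maps to $?"
run_repro true && echo "success maps to $?"
```

This treats any failure as "bad". If the reproducer can also fail for
unrelated reasons, it is safer to check its output for the specific error
you are chasing.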

Bisecting across multiple roots
===============================

Here's how LLVM's history currently looks:

  .. code-block:: none

     A-o-o-......-o-D-o-o-HEAD
                   /
       B-o-...-o-C-

``A`` is the first commit in LLVM ever, ``97724f18c79c``.

``B`` is the first commit in MLIR, ``aed0d21a62db``.

``D`` is the merge commit that merged MLIR into the main LLVM repository,
``0f0d0ed1c78f``.

``C`` is the last commit in MLIR before it got merged, ``0f0d0ed1c78f^2``. (The
``^n`` modifier selects the n'th parent of a merge commit.)
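To see how a merged-in project creates a second root, and what ``^2``
resolves to, the following sketch builds a throwaway repository with the same
shape as the diagram. All branch names and commit messages in it are made up
for the demonstration, and it assumes a git version that supports
``--allow-unrelated-histories``.

```shell
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q .
# Pass an identity per command so the sketch does not depend on global config.
g() { git -c user.name=demo -c user.email=demo@example.com "$@"; }

# "Normal" root (A in the diagram) plus one more commit.
g commit -q --allow-empty -m "main root"
g commit -q --allow-empty -m "main work"
main_branch=$(git symbolic-ref --short HEAD)

# Second, unrelated root (B), standing in for an out-of-tree project.
git checkout -q --orphan side
g commit -q --allow-empty -m "side root"
g commit -q --allow-empty -m "side tip"

# Merge the side history (D); its second parent is the side tip (C).
git checkout -q "$main_branch"
g merge -q --allow-unrelated-histories -m "merge side" side

root_count=$(git rev-list --max-parents=0 HEAD | wc -l | tr -d ' ')
second_parent_subject=$(git log -1 --format=%s 'HEAD^2')
echo "roots: $root_count"
echo "HEAD^2: $second_parent_subject"
```

In an ``llvm-project`` checkout, ``git rev-list --max-parents=0 HEAD`` lists
the root commits in the same way, and ``git rev-parse 0f0d0ed1c78f^2`` prints
the full hash of ``C``.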

``git bisect`` goes through all parent revisions. Due to the way MLIR was
merged, at ``C`` and at every earlier revision on that branch, *only* the
``mlir/`` directory exists, and nothing else does. A run script that builds
clang therefore cannot succeed at any of those revisions.

As of early 2020, there is no flag to ``git bisect`` to tell it to not
descend into all reachable commits. Ideally, we'd want to tell it to only
follow the first parent of ``D``. (Git 2.29, released later in 2020, added
``git bisect start --first-parent`` for this purpose; the workarounds below
still apply to older Git versions.)

The best workaround is to pass a list of directories to ``git bisect``.
If you know the bug is due to a change in llvm, clang, or compiler-rt, use:

  .. code-block:: bash

     git bisect start -- clang llvm compiler-rt

That way, the commits in ``mlir`` are never evaluated.

Alternatively, ``git bisect skip aed0d21a6 aed0d21a6..0f0d0ed1c78f^2``
explicitly skips all commits on that branch. It takes 1.5 minutes to run on a
fast machine, and makes ``git bisect log`` output unreadable. (``aed0d21a6`` is
listed twice because git ranges exclude the revision listed on the left,
so it needs to be skipped explicitly.)

More Resources
==============

https://git-scm.com/book/en/v2/Git-Tools-Revision-Selection