====================================
LLVM bugpoint tool: design and usage
====================================

.. contents::
   :local:

Description
===========

``bugpoint`` narrows down the source of problems in LLVM tools and passes.  It
can be used to debug three types of failures: optimizer crashes, miscompilations
by optimizers, and bad native code generation (including problems in the static
and JIT compilers).  It aims to reduce large test cases to small, useful ones.
For example, if ``opt`` crashes while optimizing a file, ``bugpoint`` will
identify the optimization (or combination of optimizations) that causes the
crash, and reduce the file down to a small example which triggers the crash.

For detailed case scenarios, such as debugging ``opt``, or one of the LLVM code
generators, see :doc:`HowToSubmitABug`.

Design Philosophy
=================

``bugpoint`` is designed to be a useful tool without requiring any hooks into
the LLVM infrastructure at all.  It works with any and all LLVM passes and code
generators, and does not need to "know" how they work.  Because of this, it may
appear to do stupid things or miss obvious simplifications.  ``bugpoint`` is
also designed to trade off programmer time for computer time in the
compiler-debugging process; consequently, it may take a long period of
(unattended) time to reduce a test case, but we feel it is still worth it. Note
that ``bugpoint`` is generally very quick unless debugging a miscompilation
where each test of the program (which requires executing it) takes a long time.

Automatic Debugger Selection
----------------------------

``bugpoint`` reads each ``.bc`` or ``.ll`` file specified on the command line
and links them together into a single module, called the test program.  If any
LLVM passes are specified on the command line, it runs these passes on the test
program.  If any of the passes crash, or if they produce malformed output (which
causes the verifier to abort), ``bugpoint`` starts the `crash debugger`_.

Otherwise, if the ``-output`` option was not specified, ``bugpoint`` runs the
test program with the "safe" backend (which is assumed to generate good code) to
generate a reference output.  Once ``bugpoint`` has a reference output for the
test program, it tries executing it with the selected code generator.  If the
selected code generator crashes, ``bugpoint`` starts the `crash debugger`_ on
the code generator.  Otherwise, if the resulting output differs from the
reference output, it assumes the difference resulted from a code generator
failure, and starts the `code generator debugger`_.

Finally, if the output of the selected code generator matches the reference
output, ``bugpoint`` runs the test program after all of the LLVM passes have
been applied to it.  If its output differs from the reference output, it assumes
the difference resulted from a failure in one of the LLVM passes, and enters the
`miscompilation debugger`_.  Otherwise, there is no problem ``bugpoint`` can
debug.
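
The selection order described in this section can be summarized as a small
decision function.  The following is only an illustrative sketch, not code taken
from ``bugpoint``: the ``Observations`` struct, the ``Debugger`` enum, and
``selectDebugger`` are hypothetical names, and each field simply stands for the
outcome of one of the checks described above.  How the reference output is
obtained (the "safe" backend or ``-output``) is not modeled.

.. code-block:: c++

  // Hypothetical summary of what was observed about the test program.
  struct Observations {
    bool passesCrashOrBreakVerifier;      // a pass crashed or emitted malformed IR
    bool codegenCrashes;                  // the selected code generator crashed
    bool codegenOutputMatchesReference;   // unoptimized program vs. reference output
    bool optimizedOutputMatchesReference; // program after the passes vs. reference
  };

  enum class Debugger { Crash, CodeGenerator, Miscompilation, None };

  // Mirrors the order of the checks in the prose: pass crashes first, then
  // code generator crashes, then code generator output, then optimized output.
  Debugger selectDebugger(const Observations &O) {
    if (O.passesCrashOrBreakVerifier)
      return Debugger::Crash;
    if (O.codegenCrashes)
      return Debugger::Crash;
    if (!O.codegenOutputMatchesReference)
      return Debugger::CodeGenerator;
    if (!O.optimizedOutputMatchesReference)
      return Debugger::Miscompilation;
    return Debugger::None; // nothing for bugpoint to debug
  }

In this sketch, a program whose passes run cleanly and whose unoptimized output
matches the reference, but whose optimized output differs, maps to
``Debugger::Miscompilation``.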

.. _crash debugger:

Crash debugger
--------------

If an optimizer or code generator crashes, ``bugpoint`` will try as hard as it
can to reduce the list of passes (for optimizer crashes) and the size of the
test program.  First, ``bugpoint`` figures out which combination of optimizer
passes triggers the bug. This is useful when debugging a problem exposed by
``opt``, for example, because it runs over 38 passes.

Next, ``bugpoint`` tries removing functions from the test program, to reduce its
size.  Usually it is able to reduce a test program to a single function, when
debugging intraprocedural optimizations.  Once the number of functions has been
reduced, it attempts to delete various edges in the control flow graph, to
reduce the size of the function as much as possible.  Finally, ``bugpoint``
deletes any individual LLVM instructions whose absence does not eliminate the
failure.  At the end, ``bugpoint`` should tell you what passes crash, give you a
bitcode file, and give you instructions on how to reproduce the failure with
``opt`` or ``llc``.
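
The general idea of shrinking a list (of passes, functions, or instructions)
while a failure keeps reproducing can be sketched as follows.  This is a
deliberately simple greedy reduction written for illustration; it is not
``bugpoint``'s actual reduction algorithm, and ``reduceList`` and the
``stillFails`` predicate are hypothetical names.  The predicate is assumed to
rerun the failing tool on the candidate list and report whether the failure is
still present.

.. code-block:: c++

  #include <cstddef>
  #include <functional>
  #include <string>
  #include <utility>
  #include <vector>

  using FailurePredicate =
      std::function<bool(const std::vector<std::string> &)>;

  // Repeatedly drops single elements as long as the failure still reproduces.
  // The result is a list from which no single element can be removed without
  // making the failure disappear.
  std::vector<std::string> reduceList(std::vector<std::string> Items,
                                      const FailurePredicate &stillFails) {
    bool Changed = true;
    while (Changed) {
      Changed = false;
      for (std::size_t I = 0; I < Items.size();) {
        std::vector<std::string> Candidate = Items;
        Candidate.erase(Candidate.begin() + static_cast<std::ptrdiff_t>(I));
        if (stillFails(Candidate)) {
          // Element I was not needed; keep the shorter list and retry at the
          // same index, which now holds the next element.
          Items = std::move(Candidate);
          Changed = true;
        } else {
          ++I; // Element I is required to trigger the failure.
        }
      }
    }
    return Items;
  }

For example, if a failure only reproduces when both ``instcombine`` and ``gvn``
are present, reducing the list ``mem2reg, instcombine, simplifycfg, gvn, licm``
with this sketch leaves just ``instcombine, gvn``.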

.. _code generator debugger:

Code generator debugger
-----------------------

The code generator debugger attempts to narrow down the amount of code that is
being miscompiled by the selected code generator.  To do this, it takes the test
program and partitions it into two pieces: one piece which it compiles with the
"safe" backend (into a shared object), and one piece which it runs with either
the JIT or the static LLC compiler.  It uses several techniques to reduce the
amount of code pushed through the LLVM code generator, to reduce the potential
scope of the problem.  After it is finished, it emits two bitcode files (called
"test" [to be compiled with the code generator] and "safe" [to be compiled with
the "safe" backend], respectively), and instructions for reproducing the
problem.  The code generator debugger assumes that the "safe" backend produces
good code.

.. _miscompilation debugger:

Miscompilation debugger
-----------------------

The miscompilation debugger works similarly to the code generator debugger.  It
works by splitting the test program into two pieces, running the optimizations
specified on one piece, linking the two pieces back together, and then executing
the result.  It attempts to narrow down the list of passes to the one (or few)
which are causing the miscompilation, then reduce the portion of the test
program which is being miscompiled.  The miscompilation debugger assumes that
the selected code generator is working properly.

Advice for using bugpoint
=========================

``bugpoint`` can be a remarkably useful tool, but it sometimes works in
non-obvious ways.  Here are some hints and tips:

* In the code generator and miscompilation debuggers, ``bugpoint`` only works
  with programs that have deterministic output.  Thus, if the program outputs
  ``argv[0]``, the date, time, or any other "random" data, ``bugpoint`` may
  misinterpret differences in these data, when output, as the result of a
  miscompilation.  Programs should be temporarily modified to disable outputs
  that are likely to vary from run to run.

* In the `crash debugger`_, ``bugpoint`` does not distinguish different crashes
  during reduction. Thus, if a new crash or miscompilation happens, ``bugpoint``
  will continue with the new crash instead. If you would like to stick to a
  particular crash, you should write check scripts to validate the error
  message; see ``-compile-command`` in :doc:`CommandGuide/bugpoint`.

* In the code generator and miscompilation debuggers, debugging will go faster
  if you manually modify the program or its inputs to reduce the runtime, but
  still exhibit the problem.

* ``bugpoint`` is extremely useful when working on a new optimization: it helps
  track down regressions quickly.  To avoid having to relink ``bugpoint`` every
  time you change your optimization, however, have ``bugpoint`` dynamically load
  your optimization with the ``-load`` option.

* ``bugpoint`` can generate a lot of output and run for a long period of time.
  It is often useful to capture the output of the program to a file.  For
  example, in the C shell, you can run:

  .. code-block:: console

    $ bugpoint  ... |& tee bugpoint.log

  to get a copy of ``bugpoint``'s output in the file ``bugpoint.log``, as well
  as on your terminal.

* ``bugpoint`` cannot debug problems with the LLVM linker. If ``bugpoint``
  crashes before you see its "All input ok" message, you might try ``llvm-link
  -v`` on the same set of input files. If that also crashes, you may be
  experiencing a linker bug.

* ``bugpoint`` is useful for proactively finding bugs in LLVM.  Invoking
  ``bugpoint`` with the ``-find-bugs`` option will cause the list of specified
  optimizations to be randomized and applied to the program. This process will
  repeat until a bug is found or the user kills ``bugpoint``.

* ``bugpoint`` can produce IR which contains long names. Run ``opt
  -metarenamer`` over the IR to rename everything using easy-to-read,
  metasyntactic names. Alternatively, run ``opt -strip -instnamer`` to rename
  everything with very short (often purely numeric) names.

What to do when bugpoint isn't enough
=====================================

Sometimes, ``bugpoint`` is not enough. In particular, InstCombine and
TargetLowering both have visitor-structured code with lots of potential
transformations.  If the process of using bugpoint has still left you with too
much code to figure out and the problem seems to be in instcombine, the
following steps may help.  These same techniques are useful with TargetLowering
as well.

Turn on ``-debug-only=instcombine`` and see which transformations within
instcombine are firing by filtering for lines with "``IC``" in them.

At this point, you have a decision to make.  Is the number of transformations
small enough to step through them using a debugger?  If so, then try that.

If there are too many transformations, then a source modification approach may
be helpful.  In this approach, you can modify the source code of instcombine to
disable just those transformations that are being performed on your test input
and perform a binary search over the set of transformations.  One set of places
to modify is the "``visit*``" methods of ``InstCombiner`` (*e.g.*
``visitICmpInst``): add a "``return false``" as the first line of the method.

If that still doesn't remove enough, then change the caller of
``InstCombiner::DoOneIteration``, ``InstCombiner::runOnFunction``, to limit the
number of iterations.

You may also find it useful to use "``-stats``" now to see what parts of
instcombine are firing.  This can guide where to put additional reporting code.

At this point, if the number of transformations is still too large, then
inserting code to limit whether or not to execute the body of the code in the
visit function can be helpful.  Add a static counter which is incremented on
every invocation of the function.  Then add code which simply returns false on
desired ranges.  For example:

.. code-block:: c++

  // Count every invocation of this visit method.
  static int calledCount = 0;
  calledCount++;
  // Skip every call except 212 and 217.  LLVM_DEBUG bodies only run when
  // debug output is enabled.
  LLVM_DEBUG(if (calledCount < 212) return false);
  LLVM_DEBUG(if (calledCount > 217) return false);
  LLVM_DEBUG(if (calledCount == 213) return false);
  LLVM_DEBUG(if (calledCount == 214) return false);
  LLVM_DEBUG(if (calledCount == 215) return false);
  LLVM_DEBUG(if (calledCount == 216) return false);
  LLVM_DEBUG(dbgs() << "visitXor calledCount: " << calledCount << "\n");
  LLVM_DEBUG(dbgs() << "I: "; I->dump());

could be added to ``visitXor`` to limit ``visitXor`` to being applied only to
calls 212 and 217. This is from an actual test case and raises an important
point---a simple binary search may not be sufficient, as transformations that
interact may require isolating more than one call.  In TargetLowering, use
``return SDNode();`` instead of ``return false;``.
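
The same call-count gating can be expressed with an explicit set of enabled
call indices, which is convenient when more than one interacting call has to be
isolated.  The following standalone sketch is not part of LLVM; ``CallGate``
and its members are hypothetical names introduced only to illustrate the
technique outside of a real ``visit*`` method.

.. code-block:: c++

  #include <set>
  #include <utility>

  // Hypothetical helper: counts calls (starting at 1) and reports whether the
  // current call is one of the explicitly enabled ones.
  class CallGate {
  public:
    explicit CallGate(std::set<int> EnabledCalls)
        : Enabled(std::move(EnabledCalls)) {}

    // Increments the call counter and returns true only for enabled calls.
    bool shouldRun() {
      ++CalledCount;
      return Enabled.count(CalledCount) != 0;
    }

    // Total number of calls seen so far.
    int callsSeen() const { return CalledCount; }

  private:
    std::set<int> Enabled;
    int CalledCount = 0;
  };

A gate constructed as ``CallGate Gate({212, 217});`` lets only the 212th and
217th calls through, matching the example above; changing the set between runs
is one way to search over combinations of calls.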

Now that the set of transformations is down to a manageable size, try
examining the output to see if you can figure out which transformations are
being done.  If that can be figured out, then do the usual debugging.  If it
isn't obvious which code corresponds to the transformation being performed, set
a breakpoint after the call-count-based disabling and step through the code.
Alternatively, you can use "``printf``" style debugging to report waypoints.