====================================
LLVM bugpoint tool: design and usage
====================================

.. contents::
   :local:

Description
===========

``bugpoint`` narrows down the source of problems in LLVM tools and passes. It
can be used to debug three types of failures: optimizer crashes, miscompilations
by optimizers, and bad native code generation (including problems in the static
and JIT compilers). It aims to reduce large test cases to small, useful ones.
For example, if ``opt`` crashes while optimizing a file, it will identify the
optimization (or combination of optimizations) that causes the crash, and reduce
the file down to a small example which triggers the crash.

For detailed case scenarios, such as debugging ``opt``, or one of the LLVM code
generators, see :doc:`HowToSubmitABug`.

Design Philosophy
=================

``bugpoint`` is designed to be a useful tool without requiring any hooks into
the LLVM infrastructure at all. It works with any and all LLVM passes and code
generators, and does not need to "know" how they work. Because of this, it may
appear to do stupid things or miss obvious simplifications. ``bugpoint`` is
also designed to spend computer time in order to save programmer time in the
compiler-debugging process; consequently, it may take a long period of
(unattended) time to reduce a test case, but we feel it is still worth it.
Note that ``bugpoint`` is generally very quick unless debugging a miscompilation
where each test of the program (which requires executing it) takes a long time.

Automatic Debugger Selection
----------------------------

``bugpoint`` reads each ``.bc`` or ``.ll`` file specified on the command line
and links them together into a single module, called the test program. If any
LLVM passes are specified on the command line, it runs these passes on the test
program. If any of the passes crash, or if they produce malformed output (which
causes the verifier to abort), ``bugpoint`` starts the `crash debugger`_.

Otherwise, if the ``-output`` option was not specified, ``bugpoint`` runs the
test program with the "safe" backend (which is assumed to generate good code) to
generate a reference output. Once ``bugpoint`` has a reference output for the
test program, it tries executing it with the selected code generator. If the
selected code generator crashes, ``bugpoint`` starts the `crash debugger`_ on
the code generator. Otherwise, if the resulting output differs from the
reference output, it assumes the difference resulted from a code generator
failure, and starts the `code generator debugger`_.

Finally, if the output of the selected code generator matches the reference
output, ``bugpoint`` runs the test program after all of the LLVM passes have
been applied to it. If its output differs from the reference output, it assumes
the difference resulted from a failure in one of the LLVM passes, and enters the
`miscompilation debugger`_.
Otherwise, there is no problem ``bugpoint`` can
debug.

.. _crash debugger:

Crash debugger
--------------

If an optimizer or code generator crashes, ``bugpoint`` will try as hard as it
can to reduce the list of passes (for optimizer crashes) and the size of the
test program. First, ``bugpoint`` figures out which combination of optimizer
passes triggers the bug. This is useful when debugging a problem exposed by
``opt``, for example, because it runs over 38 passes.

Next, ``bugpoint`` tries removing functions from the test program, to reduce its
size. Usually it is able to reduce a test program to a single function, when
debugging intraprocedural optimizations. Once the number of functions has been
reduced, it attempts to delete various edges in the control flow graph, to
reduce the size of the function as much as possible. Finally, ``bugpoint``
deletes any individual LLVM instructions whose absence does not eliminate the
failure. At the end, ``bugpoint`` should tell you what passes crash, give you a
bitcode file, and give you instructions on how to reproduce the failure with
``opt`` or ``llc``.

.. _code generator debugger:

Code generator debugger
-----------------------

The code generator debugger attempts to narrow down the amount of code that is
being miscompiled by the selected code generator.
To do this, it takes the test
program and partitions it into two pieces: one piece which it compiles with the
"safe" backend (into a shared object), and one piece which it runs with either
the JIT or the static LLC compiler. It uses several techniques to reduce the
amount of code pushed through the LLVM code generator, to reduce the potential
scope of the problem. After it is finished, it emits two bitcode files (called
"test" [to be compiled with the code generator] and "safe" [to be compiled with
the "safe" backend], respectively), and instructions for reproducing the
problem. The code generator debugger assumes that the "safe" backend produces
good code.

.. _miscompilation debugger:

Miscompilation debugger
-----------------------

The miscompilation debugger works similarly to the code generator debugger. It
works by splitting the test program into two pieces, running the optimizations
specified on one piece, linking the two pieces back together, and then executing
the result. It attempts to narrow down the list of passes to the one (or few)
which are causing the miscompilation, then reduce the portion of the test
program which is being miscompiled. The miscompilation debugger assumes that
the selected code generator is working properly.

Advice for using bugpoint
=========================

``bugpoint`` can be a remarkably useful tool, but it sometimes works in
non-obvious ways.
Here are some hints and tips:

* In the code generator and miscompilation debuggers, ``bugpoint`` only works
  with programs that have deterministic output. Thus, if the program outputs
  ``argv[0]``, the date, time, or any other "random" data, ``bugpoint`` may
  misinterpret differences in these data, when output, as the result of a
  miscompilation. Programs should be temporarily modified to disable outputs
  that are likely to vary from run to run.

* In the `crash debugger`_, ``bugpoint`` does not distinguish different crashes
  during reduction. Thus, if a new crash or miscompilation happens, ``bugpoint``
  will continue with the new crash instead. If you would like to stick to a
  particular crash, you should write check scripts to validate the error
  message; see ``-compile-command`` in :doc:`CommandGuide/bugpoint`.

* In the code generator and miscompilation debuggers, debugging will go faster
  if you manually modify the program or its inputs to reduce the runtime, but
  still exhibit the problem.

* ``bugpoint`` is extremely useful when working on a new optimization: it helps
  track down regressions quickly. To avoid having to relink ``bugpoint`` every
  time you change your optimization, however, have ``bugpoint`` dynamically load
  your optimization with the ``-load`` option.

* ``bugpoint`` can generate a lot of output and run for a long period of time.
  It is often useful to capture the output of the program to a file. For
  example, in the C shell, you can run:

  .. code-block:: console

    $ bugpoint ... |& tee bugpoint.log

  to get a copy of ``bugpoint``'s output in the file ``bugpoint.log``, as well
  as on your terminal.

* ``bugpoint`` cannot debug problems with the LLVM linker. If ``bugpoint``
  crashes before you see its "All input ok" message, you might try ``llvm-link
  -v`` on the same set of input files. If that also crashes, you may be
  experiencing a linker bug.

* ``bugpoint`` is useful for proactively finding bugs in LLVM. Invoking
  ``bugpoint`` with the ``-find-bugs`` option will cause the list of specified
  optimizations to be randomized and applied to the program. This process will
  repeat until a bug is found or the user kills ``bugpoint``.

* ``bugpoint`` can produce IR which contains long names. Run ``opt
  -metarenamer`` over the IR to rename everything using easy-to-read,
  metasyntactic names. Alternatively, run ``opt -strip -instnamer`` to rename
  everything with very short (often purely numeric) names.

What to do when bugpoint isn't enough
=====================================

Sometimes, ``bugpoint`` is not enough. In particular, InstCombine and
TargetLowering both have visitor-structured code with lots of potential
transformations. If the process of using bugpoint has left you with still too
much code to figure out and the problem seems to be in instcombine, the
following steps may help. These same techniques are useful with TargetLowering
as well.


Turn on ``-debug-only=instcombine`` and see which transformations within
instcombine are firing by selecting out lines with "``IC``" in them.

At this point, you have a decision to make. Is the number of transformations
small enough to step through them using a debugger? If so, then try that.

If there are too many transformations, then a source modification approach may
be helpful. In this approach, you can modify the source code of instcombine to
disable just those transformations that are being performed on your test input
and perform a binary search over the set of transformations. One set of places
to modify is the "``visit*``" methods of ``InstCombiner`` (*e.g.*
``visitICmpInst``): add a "``return false``" as the first line of the
method.

If that still doesn't remove enough, then change the caller of
``InstCombiner::DoOneIteration``, which is ``InstCombiner::runOnFunction``, to
limit the number of iterations.

You may also find it useful to use "``-stats``" now to see what parts of
instcombine are firing. This can guide where to put additional reporting code.

At this point, if the number of transformations is still too large, then
inserting code to limit whether or not to execute the body of the code in the
visit function can be helpful. Add a static counter which is incremented on
every invocation of the function. Then add code which simply returns false on
desired ranges. For example:

.. code-block:: c++

  // Count every invocation of this visit method.
  static int calledCount = 0;
  calledCount++;
  // Skip every call outside the range [212, 217].
  LLVM_DEBUG(if (calledCount < 212) return false);
  LLVM_DEBUG(if (calledCount > 217) return false);
  // Within the range, skip calls 213 through 216 as well.
  LLVM_DEBUG(if (calledCount == 213) return false);
  LLVM_DEBUG(if (calledCount == 214) return false);
  LLVM_DEBUG(if (calledCount == 215) return false);
  LLVM_DEBUG(if (calledCount == 216) return false);
  // Report which calls are still allowed through.
  LLVM_DEBUG(dbgs() << "visitXor calledCount: " << calledCount << "\n");
  LLVM_DEBUG(dbgs() << "I: "; I->dump());

could be added to ``visitXor`` to limit it to being applied only to
calls 212 and 217. This is from an actual test case and raises an important
point: a simple binary search may not be sufficient, as transformations that
interact may require isolating more than one call. In TargetLowering, use
``return SDNode();`` instead of ``return false;``.

Now that the number of transformations is down to a manageable number, try
examining the output to see if you can figure out which transformations are
being done. If that can be figured out, then do the usual debugging. If it
isn't obvious which code corresponds to the transformation being performed, set
a breakpoint after the call-count-based disabling and step through the code.
Alternatively, you can use "``printf``" style debugging to report waypoints.
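
The chain of per-call ``if`` lines in the example above can become unwieldy
when several interacting calls must be isolated. As a rough sketch only, the
list of enabled call numbers could be kept in one place and queried from the
visit method. The helper names ``parseEnabledCalls`` and ``isCallEnabled`` are
hypothetical and are not part of LLVM:

.. code-block:: c++

  #include <algorithm>
  #include <sstream>
  #include <string>
  #include <vector>

  // Hypothetical helper (not an LLVM API): turns a comma-separated list of
  // call numbers such as "212,217" into a sorted vector. std::stoi throws
  // if an entry is not a number.
  std::vector<int> parseEnabledCalls(const std::string &Spec) {
    std::vector<int> Calls;
    std::istringstream In(Spec);
    std::string Item;
    while (std::getline(In, Item, ','))
      if (!Item.empty())
        Calls.push_back(std::stoi(Item));
    std::sort(Calls.begin(), Calls.end());
    return Calls;
  }

  // Hypothetical helper: returns true if the transformation should be
  // allowed to run on the given call number. Enabled must be sorted.
  bool isCallEnabled(const std::vector<int> &Enabled, int CallNumber) {
    return std::binary_search(Enabled.begin(), Enabled.end(), CallNumber);
  }

Under these assumptions, the body of a ``visit*`` method would increment its
static counter and return early whenever ``isCallEnabled`` returns false for
the current count, so narrowing the search means editing a single string such
as ``"212,217"``.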