1f0c3a346SStepan Dyatkovskiy================================= 2f0c3a346SStepan DyatkovskiyMergeFunctions pass, how it works 3f0c3a346SStepan Dyatkovskiy================================= 4f0c3a346SStepan Dyatkovskiy 5f0c3a346SStepan Dyatkovskiy.. contents:: 6f0c3a346SStepan Dyatkovskiy :local: 7f0c3a346SStepan Dyatkovskiy 8f0c3a346SStepan DyatkovskiyIntroduction 9f0c3a346SStepan Dyatkovskiy============ 10f0c3a346SStepan DyatkovskiySometimes code contains equal functions, or functions that does exactly the same 11f0c3a346SStepan Dyatkovskiything even though they are non-equal on the IR level (e.g.: multiplication on 2 12f0c3a346SStepan Dyatkovskiyand 'shl 1'). It could happen due to several reasons: mainly, the usage of 136373f5ddSAditya Kumartemplates and automatic code generators. Though, sometimes the user itself could 14f0c3a346SStepan Dyatkovskiywrite the same thing twice :-) 15f0c3a346SStepan Dyatkovskiy 16f0c3a346SStepan DyatkovskiyThe main purpose of this pass is to recognize such functions and merge them. 17f0c3a346SStepan Dyatkovskiy 186373f5ddSAditya KumarThis document is the extension to pass comments and describes the pass logic. It 196373f5ddSAditya Kumardescribes the algorithm that is used in order to compare functions and 206373f5ddSAditya Kumarexplains how we could combine equal functions correctly to keep the module 216373f5ddSAditya Kumarvalid. 22f0c3a346SStepan Dyatkovskiy 236373f5ddSAditya KumarMaterial is brought in a top-down form, so the reader could start to learn pass 246373f5ddSAditya Kumarfrom high level ideas and end with low-level algorithm details, thus preparing 256373f5ddSAditya Kumarhim or her for reading the sources. 26f0c3a346SStepan Dyatkovskiy 276373f5ddSAditya KumarThe main goal is to describe the algorithm and logic here and the concept. If 286373f5ddSAditya Kumaryou *don't want* to read the source code, but want to understand pass 296373f5ddSAditya Kumaralgorithms, this document is good for you. The author tries not to repeat the 306373f5ddSAditya Kumarsource-code and covers only common cases to avoid the cases of needing to 316373f5ddSAditya Kumarupdate this document after any minor code changes. 32f0c3a346SStepan Dyatkovskiy 33f0c3a346SStepan Dyatkovskiy 34f0c3a346SStepan DyatkovskiyWhat should I know to be able to follow along with this document? 35f0c3a346SStepan Dyatkovskiy----------------------------------------------------------------- 36f0c3a346SStepan Dyatkovskiy 376373f5ddSAditya KumarThe reader should be familiar with common compile-engineering principles and 386373f5ddSAditya KumarLLVM code fundamentals. In this article, we assume the reader is familiar with 396373f5ddSAditya Kumar`Single Static Assignment 406373f5ddSAditya Kumar<http://en.wikipedia.org/wiki/Static_single_assignment_form>`_ 416373f5ddSAditya Kumarconcept and has an understanding of 4272fd1033SSylvestre Ledru`IR structure <https://llvm.org/docs/LangRef.html#high-level-structure>`_. 43f0c3a346SStepan Dyatkovskiy 446373f5ddSAditya KumarWe will use terms such as 4572fd1033SSylvestre Ledru"`module <https://llvm.org/docs/LangRef.html#high-level-structure>`_", 4672fd1033SSylvestre Ledru"`function <https://llvm.org/docs/ProgrammersManual.html#the-function-class>`_", 47f0c3a346SStepan Dyatkovskiy"`basic block <http://en.wikipedia.org/wiki/Basic_block>`_", 4872fd1033SSylvestre Ledru"`user <https://llvm.org/docs/ProgrammersManual.html#the-user-class>`_", 4972fd1033SSylvestre Ledru"`value <https://llvm.org/docs/ProgrammersManual.html#the-value-class>`_", 506373f5ddSAditya Kumar"`instruction 5172fd1033SSylvestre Ledru<https://llvm.org/docs/ProgrammersManual.html#the-instruction-class>`_". 52f0c3a346SStepan Dyatkovskiy 536373f5ddSAditya KumarAs a good starting point, the Kaleidoscope tutorial can be used: 54f0c3a346SStepan Dyatkovskiy 55f0c3a346SStepan Dyatkovskiy:doc:`tutorial/index` 56f0c3a346SStepan Dyatkovskiy 576373f5ddSAditya KumarIt's especially important to understand chapter 3 of tutorial: 58f0c3a346SStepan Dyatkovskiy 59d8b9735aSEtienne Bergeron:doc:`tutorial/LangImpl03` 60f0c3a346SStepan Dyatkovskiy 616373f5ddSAditya KumarThe reader should also know how passes work in LLVM. They could use this 626373f5ddSAditya Kumararticle as a reference and start point here: 63f0c3a346SStepan Dyatkovskiy 64f0c3a346SStepan Dyatkovskiy:doc:`WritingAnLLVMPass` 65f0c3a346SStepan Dyatkovskiy 666373f5ddSAditya KumarWhat else? Well perhaps the reader should also have some experience in LLVM pass 67f0c3a346SStepan Dyatkovskiydebugging and bug-fixing. 68f0c3a346SStepan Dyatkovskiy 69f0c3a346SStepan DyatkovskiyNarrative structure 70f0c3a346SStepan Dyatkovskiy------------------- 716373f5ddSAditya KumarThe article consists of three parts. The first part explains pass functionality 726373f5ddSAditya Kumaron the top-level. The second part describes the comparison procedure itself. 736373f5ddSAditya KumarThe third part describes the merging process. 74f0c3a346SStepan Dyatkovskiy 756373f5ddSAditya KumarIn every part, the author tries to put the contents in the top-down form. 766373f5ddSAditya KumarThe top-level methods will first be described followed by the terminal ones at 776373f5ddSAditya Kumarthe end, in the tail of each part. If the reader sees the reference to the 78dfcd3dcfSChandler Carruthmethod that wasn't described yet, they will find its description a bit below. 79f0c3a346SStepan Dyatkovskiy 80f0c3a346SStepan DyatkovskiyBasics 81f0c3a346SStepan Dyatkovskiy====== 82f0c3a346SStepan Dyatkovskiy 83f0c3a346SStepan DyatkovskiyHow to do it? 84f0c3a346SStepan Dyatkovskiy------------- 856373f5ddSAditya KumarDo we need to merge functions? The obvious answer is: Yes, that is quite a 866373f5ddSAditya Kumarpossible case. We usually *do* have duplicates and it would be good to get rid 876373f5ddSAditya Kumarof them. But how do we detect duplicates? This is the idea: we split functions 886373f5ddSAditya Kumarinto smaller bricks or parts and compare the "bricks" amount. If equal, 896373f5ddSAditya Kumarwe compare the "bricks" themselves, and then do our conclusions about functions 90f0c3a346SStepan Dyatkovskiythemselves. 91f0c3a346SStepan Dyatkovskiy 926373f5ddSAditya KumarWhat could the difference be? For example, on a machine with 64-bit pointers 936373f5ddSAditya Kumar(let's assume we have only one address space), one function stores a 64-bit 946373f5ddSAditya Kumarinteger, while another one stores a pointer. If the target is the machine 95f0c3a346SStepan Dyatkovskiymentioned above, and if functions are identical, except the parameter type (we 966373f5ddSAditya Kumarcould consider it as a part of function type), then we can treat a ``uint64_t`` 976373f5ddSAditya Kumarand a ``void*`` as equal. 98f0c3a346SStepan Dyatkovskiy 996373f5ddSAditya KumarThis is just an example; more possible details are described a bit below. 100f0c3a346SStepan Dyatkovskiy 1016373f5ddSAditya KumarAs another example, the reader may imagine two more functions. The first 1026a6a83c6SSimon Pilgrimfunction performs a multiplication by 2, while the second one performs an 1036a6a83c6SSimon Pilgrimlogical left shift by 1. 104f0c3a346SStepan Dyatkovskiy 105f0c3a346SStepan DyatkovskiyPossible solutions 106f0c3a346SStepan Dyatkovskiy^^^^^^^^^^^^^^^^^^ 107f0c3a346SStepan DyatkovskiyLet's briefly consider possible options about how and what we have to implement 108f0c3a346SStepan Dyatkovskiyin order to create full-featured functions merging, and also what it would 1096373f5ddSAditya Kumarmean for us. 110f0c3a346SStepan Dyatkovskiy 1116373f5ddSAditya KumarEqual function detection obviously supposes that a "detector" method to be 1126373f5ddSAditya Kumarimplemented and latter should answer the question "whether functions are equal". 1136373f5ddSAditya KumarThis "detector" method consists of tiny "sub-detectors", which each answers 114f0c3a346SStepan Dyatkovskiyexactly the same question, but for function parts. 115f0c3a346SStepan Dyatkovskiy 116f0c3a346SStepan DyatkovskiyAs the second step, we should merge equal functions. So it should be a "merger" 117f0c3a346SStepan Dyatkovskiymethod. "Merger" accepts two functions *F1* and *F2*, and produces *F1F2* 118f0c3a346SStepan Dyatkovskiyfunction, the result of merging. 119f0c3a346SStepan Dyatkovskiy 1206373f5ddSAditya KumarHaving such routines in our hands, we can process a whole module, and merge all 121f0c3a346SStepan Dyatkovskiyequal functions. 122f0c3a346SStepan Dyatkovskiy 123f0c3a346SStepan DyatkovskiyIn this case, we have to compare every function with every another function. As 1246373f5ddSAditya Kumarthe reader may notice, this way seems to be quite expensive. Of course we could 125f0c3a346SStepan Dyatkovskiyintroduce hashing and other helpers, but it is still just an optimization, and 126f0c3a346SStepan Dyatkovskiythus the level of O(N*N) complexity. 127f0c3a346SStepan Dyatkovskiy 128f0c3a346SStepan DyatkovskiyCan we reach another level? Could we introduce logarithmical search, or random 129f0c3a346SStepan Dyatkovskiyaccess lookup? The answer is: "yes". 130f0c3a346SStepan Dyatkovskiy 131f0c3a346SStepan DyatkovskiyRandom-access 132f0c3a346SStepan Dyatkovskiy""""""""""""" 1336373f5ddSAditya KumarHow it could this be done? Just convert each function to a number, and gather 1346373f5ddSAditya Kumarall of them in a special hash-table. Functions with equal hashes are equal. 1356373f5ddSAditya KumarGood hashing means, that every function part must be taken into account. That 1366373f5ddSAditya Kumarmeans we have to convert every function part into some number, and then add it 137*e8fa9014SKazu Hiratainto the hash. The lookup-up time would be small, but such an approach adds some 1386373f5ddSAditya Kumardelay due to the hashing routine. 139f0c3a346SStepan Dyatkovskiy 140f0c3a346SStepan DyatkovskiyLogarithmical search 141f0c3a346SStepan Dyatkovskiy"""""""""""""""""""" 1426373f5ddSAditya KumarWe could introduce total ordering among the functions set, once ordered we 143f0c3a346SStepan Dyatkovskiycould then implement a logarithmical search. Lookup time still depends on N, 144f0c3a346SStepan Dyatkovskiybut adds a little of delay (*log(N)*). 145f0c3a346SStepan Dyatkovskiy 146f0c3a346SStepan DyatkovskiyPresent state 147f0c3a346SStepan Dyatkovskiy""""""""""""" 1486373f5ddSAditya KumarBoth of the approaches (random-access and logarithmical) have been implemented 1496373f5ddSAditya Kumarand tested and both give a very good improvement. What was most 1506373f5ddSAditya Kumarsurprising is that logarithmical search was faster; sometimes by up to 15%. The 1516373f5ddSAditya Kumarhashing method needs some extra CPU time, which is the main reason why it works 1526373f5ddSAditya Kumarslower; in most cases, total "hashing" time is greater than total 1536373f5ddSAditya Kumar"logarithmical-search" time. 154f0c3a346SStepan Dyatkovskiy 155f0c3a346SStepan DyatkovskiySo, preference has been granted to the "logarithmical search". 156f0c3a346SStepan Dyatkovskiy 157f0c3a346SStepan DyatkovskiyThough in the case of need, *logarithmical-search* (read "total-ordering") could 158f0c3a346SStepan Dyatkovskiybe used as a milestone on our way to the *random-access* implementation. 159f0c3a346SStepan Dyatkovskiy 1606373f5ddSAditya KumarEvery comparison is based either on the numbers or on the flags comparison. In 1616373f5ddSAditya Kumarthe *random-access* approach, we could use the same comparison algorithm. 1626373f5ddSAditya KumarDuring comparison, we exit once we find the difference, but here we might have 1636373f5ddSAditya Kumarto scan the whole function body every time (note, it could be slower). Like in 1646373f5ddSAditya Kumar"total-ordering", we will track every number and flag, but instead of 1656373f5ddSAditya Kumarcomparison, we should get the numbers sequence and then create the hash number. 1666373f5ddSAditya KumarSo, once again, *total-ordering* could be considered as a milestone for even 1676373f5ddSAditya Kumarfaster (in theory) random-access approach. 168f0c3a346SStepan Dyatkovskiy 169f0c3a346SStepan DyatkovskiyMergeFunctions, main fields and runOnModule 170f0c3a346SStepan Dyatkovskiy^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 1716373f5ddSAditya KumarThere are two main important fields in the class: 172f0c3a346SStepan Dyatkovskiy 173f0c3a346SStepan Dyatkovskiy``FnTree`` – the set of all unique functions. It keeps items that couldn't be 174f0c3a346SStepan Dyatkovskiymerged with each other. It is defined as: 175f0c3a346SStepan Dyatkovskiy 176f0c3a346SStepan Dyatkovskiy``std::set<FunctionNode> FnTree;`` 177f0c3a346SStepan Dyatkovskiy 178f0c3a346SStepan DyatkovskiyHere ``FunctionNode`` is a wrapper for ``llvm::Function`` class, with 179f0c3a346SStepan Dyatkovskiyimplemented “<” operator among the functions set (below we explain how it works 180f0c3a346SStepan Dyatkovskiyexactly; this is a key point in fast functions comparison). 181f0c3a346SStepan Dyatkovskiy 182f0c3a346SStepan Dyatkovskiy``Deferred`` – merging process can affect bodies of functions that are in 1836373f5ddSAditya Kumar``FnTree`` already. Obviously, such functions should be rechecked again. In this 1846373f5ddSAditya Kumarcase, we remove them from ``FnTree``, and mark them to be rescanned, namely 185f0c3a346SStepan Dyatkovskiyput them into ``Deferred`` list. 186f0c3a346SStepan Dyatkovskiy 187f0c3a346SStepan DyatkovskiyrunOnModule 188f0c3a346SStepan Dyatkovskiy""""""""""" 189f0c3a346SStepan DyatkovskiyThe algorithm is pretty simple: 190f0c3a346SStepan Dyatkovskiy 191f0c3a346SStepan Dyatkovskiy1. Put all module's functions into the *worklist*. 192f0c3a346SStepan Dyatkovskiy 193f0c3a346SStepan Dyatkovskiy2. Scan *worklist*'s functions twice: first enumerate only strong functions and 194f0c3a346SStepan Dyatkovskiythen only weak ones: 195f0c3a346SStepan Dyatkovskiy 1966373f5ddSAditya Kumar 2.1. Loop body: take a function from *worklist* (call it *FCur*) and try to 197f0c3a346SStepan Dyatkovskiy insert it into *FnTree*: check whether *FCur* is equal to one of functions 1986373f5ddSAditya Kumar in *FnTree*. If there *is* an equal function in *FnTree* 1996373f5ddSAditya Kumar (call it *FExists*): merge function *FCur* with *FExists*. Otherwise add 2006373f5ddSAditya Kumar the function from the *worklist* to *FnTree*. 201f0c3a346SStepan Dyatkovskiy 2026373f5ddSAditya Kumar3. Once the *worklist* scanning and merging operations are complete, check the 2036373f5ddSAditya Kumar*Deferred* list. If it is not empty: refill the *worklist* contents with 2046373f5ddSAditya Kumar*Deferred* list and redo step 2, if the *Deferred* list is empty, then exit 2056373f5ddSAditya Kumarfrom method. 206f0c3a346SStepan Dyatkovskiy 207f0c3a346SStepan DyatkovskiyComparison and logarithmical search 208f0c3a346SStepan Dyatkovskiy""""""""""""""""""""""""""""""""""" 209f0c3a346SStepan DyatkovskiyLet's recall our task: for every function *F* from module *M*, we have to find 2106373f5ddSAditya Kumarequal functions *F`* in the shortest time possible , and merge them into a 2116373f5ddSAditya Kumarsingle function. 212f0c3a346SStepan Dyatkovskiy 2136373f5ddSAditya KumarDefining total ordering among the functions set allows us to organize 2146373f5ddSAditya Kumarfunctions into a binary tree. The lookup procedure complexity would be 2156373f5ddSAditya Kumarestimated as O(log(N)) in this case. But how do we define *total-ordering*? 216f0c3a346SStepan Dyatkovskiy 217f0c3a346SStepan DyatkovskiyWe have to introduce a single rule applicable to every pair of functions, and 2186373f5ddSAditya Kumarfollowing this rule, then evaluate which of them is greater. What kind of rule 2196373f5ddSAditya Kumarcould it be? Let's declare it as the "compare" method that returns one of 3 220f0c3a346SStepan Dyatkovskiypossible values: 221f0c3a346SStepan Dyatkovskiy 222f0c3a346SStepan Dyatkovskiy-1, left is *less* than right, 223f0c3a346SStepan Dyatkovskiy 224f0c3a346SStepan Dyatkovskiy0, left and right are *equal*, 225f0c3a346SStepan Dyatkovskiy 226f0c3a346SStepan Dyatkovskiy1, left is *greater* than right. 227f0c3a346SStepan Dyatkovskiy 228f0c3a346SStepan DyatkovskiyOf course it means, that we have to maintain 229f0c3a346SStepan Dyatkovskiy*strict and non-strict order relation properties*: 230f0c3a346SStepan Dyatkovskiy 231f0c3a346SStepan Dyatkovskiy* reflexivity (``a <= a``, ``a == a``, ``a >= a``), 232f0c3a346SStepan Dyatkovskiy* antisymmetry (if ``a <= b`` and ``b <= a`` then ``a == b``), 233f0c3a346SStepan Dyatkovskiy* transitivity (``a <= b`` and ``b <= c``, then ``a <= c``) 234f0c3a346SStepan Dyatkovskiy* asymmetry (if ``a < b``, then ``a > b`` or ``a == b``). 235f0c3a346SStepan Dyatkovskiy 2366373f5ddSAditya KumarAs mentioned before, the comparison routine consists of 2376373f5ddSAditya Kumar"sub-comparison-routines", with each of them also consisting of 2386373f5ddSAditya Kumar"sub-comparison-routines", and so on. Finally, it ends up with primitive 239f0c3a346SStepan Dyatkovskiycomparison. 240f0c3a346SStepan Dyatkovskiy 2416373f5ddSAditya KumarBelow, we will use the following operations: 242f0c3a346SStepan Dyatkovskiy 2436373f5ddSAditya Kumar#. ``cmpNumbers(number1, number2)`` is a method that returns -1 if left is less 244f0c3a346SStepan Dyatkovskiy than right; 0, if left and right are equal; and 1 otherwise. 245f0c3a346SStepan Dyatkovskiy 2466373f5ddSAditya Kumar#. ``cmpFlags(flag1, flag2)`` is a hypothetical method that compares two flags. 247f0c3a346SStepan Dyatkovskiy The logic is the same as in ``cmpNumbers``, where ``true`` is 1, and 248f0c3a346SStepan Dyatkovskiy ``false`` is 0. 249f0c3a346SStepan Dyatkovskiy 2506373f5ddSAditya KumarThe rest of the article is based on *MergeFunctions.cpp* source code 2516373f5ddSAditya Kumar(found in *<llvm_dir>/lib/Transforms/IPO/MergeFunctions.cpp*). We would like 2526373f5ddSAditya Kumarto ask reader to keep this file open, so we could use it as a reference 2536373f5ddSAditya Kumarfor further explanations. 254f0c3a346SStepan Dyatkovskiy 2556373f5ddSAditya KumarNow, we're ready to proceed to the next chapter and see how it works. 256f0c3a346SStepan Dyatkovskiy 257f0c3a346SStepan DyatkovskiyFunctions comparison 258f0c3a346SStepan Dyatkovskiy==================== 259f0c3a346SStepan DyatkovskiyAt first, let's define how exactly we compare complex objects. 260f0c3a346SStepan Dyatkovskiy 2616373f5ddSAditya KumarComplex object comparison (function, basic-block, etc) is mostly based on its 2626373f5ddSAditya Kumarsub-object comparison results. It is similar to the next "tree" objects 263f0c3a346SStepan Dyatkovskiycomparison: 264f0c3a346SStepan Dyatkovskiy 265f0c3a346SStepan Dyatkovskiy#. For two trees *T1* and *T2* we perform *depth-first-traversal* and have 266f0c3a346SStepan Dyatkovskiy two sequences as a product: "*T1Items*" and "*T2Items*". 267f0c3a346SStepan Dyatkovskiy 2686373f5ddSAditya Kumar#. We then compare chains "*T1Items*" and "*T2Items*" in 2696373f5ddSAditya Kumar the most-significant-item-first order. The result of items comparison 2706373f5ddSAditya Kumar would be the result of *T1* and *T2* comparison itself. 271f0c3a346SStepan Dyatkovskiy 272f0c3a346SStepan DyatkovskiyFunctionComparator::compare(void) 273f0c3a346SStepan Dyatkovskiy--------------------------------- 2746373f5ddSAditya KumarA brief look at the source code tells us that the comparison starts in the 275f0c3a346SStepan Dyatkovskiy“``int FunctionComparator::compare(void)``” method. 276f0c3a346SStepan Dyatkovskiy 2776373f5ddSAditya Kumar1. The first parts to be compared are the function's attributes and some 2786373f5ddSAditya Kumarproperties that is outside the “attributes” term, but still could make the 2796373f5ddSAditya Kumarfunction different without changing its body. This part of the comparison is 2806373f5ddSAditya Kumarusually done within simple *cmpNumbers* or *cmpFlags* operations (e.g. 2816373f5ddSAditya Kumar``cmpFlags(F1->hasGC(), F2->hasGC())``). Below is a full list of function's 282f0c3a346SStepan Dyatkovskiyproperties to be compared on this stage: 283f0c3a346SStepan Dyatkovskiy 284f0c3a346SStepan Dyatkovskiy * *Attributes* (those are returned by ``Function::getAttributes()`` 285f0c3a346SStepan Dyatkovskiy method). 286f0c3a346SStepan Dyatkovskiy 287f0c3a346SStepan Dyatkovskiy * *GC*, for equivalence, *RHS* and *LHS* should be both either without 288f0c3a346SStepan Dyatkovskiy *GC* or with the same one. 289f0c3a346SStepan Dyatkovskiy 290f0c3a346SStepan Dyatkovskiy * *Section*, just like a *GC*: *RHS* and *LHS* should be defined in the 291f0c3a346SStepan Dyatkovskiy same section. 292f0c3a346SStepan Dyatkovskiy 293f0c3a346SStepan Dyatkovskiy * *Variable arguments*. *LHS* and *RHS* should be both either with or 294f0c3a346SStepan Dyatkovskiy without *var-args*. 295f0c3a346SStepan Dyatkovskiy 296f0c3a346SStepan Dyatkovskiy * *Calling convention* should be the same. 297f0c3a346SStepan Dyatkovskiy 298f0c3a346SStepan Dyatkovskiy2. Function type. Checked by ``FunctionComparator::cmpType(Type*, Type*)`` 299f0c3a346SStepan Dyatkovskiymethod. It checks return type and parameters type; the method itself will be 300f0c3a346SStepan Dyatkovskiydescribed later. 301f0c3a346SStepan Dyatkovskiy 302f0c3a346SStepan Dyatkovskiy3. Associate function formal parameters with each other. Then comparing function 303f0c3a346SStepan Dyatkovskiybodies, if we see the usage of *LHS*'s *i*-th argument in *LHS*'s body, then, 304f0c3a346SStepan Dyatkovskiywe want to see usage of *RHS*'s *i*-th argument at the same place in *RHS*'s 305f0c3a346SStepan Dyatkovskiybody, otherwise functions are different. On this stage we grant the preference 306f0c3a346SStepan Dyatkovskiyto those we met later in function body (value we met first would be *less*). 307f0c3a346SStepan DyatkovskiyThis is done by “``FunctionComparator::cmpValues(const Value*, const Value*)``” 308f0c3a346SStepan Dyatkovskiymethod (will be described a bit later). 309f0c3a346SStepan Dyatkovskiy 310f0c3a346SStepan Dyatkovskiy4. Function body comparison. As it written in method comments: 311f0c3a346SStepan Dyatkovskiy 312f0c3a346SStepan Dyatkovskiy“We do a CFG-ordered walk since the actual ordering of the blocks in the linked 313f0c3a346SStepan Dyatkovskiylist is immaterial. Our walk starts at the entry block for both functions, then 314f0c3a346SStepan Dyatkovskiytakes each block from each terminator in order. As an artifact, this also means 315f0c3a346SStepan Dyatkovskiythat unreachable blocks are ignored.” 316f0c3a346SStepan Dyatkovskiy 317f0c3a346SStepan DyatkovskiySo, using this walk we get BBs from *left* and *right* in the same order, and 318f0c3a346SStepan Dyatkovskiycompare them by “``FunctionComparator::compare(const BasicBlock*, const 319f0c3a346SStepan DyatkovskiyBasicBlock*)``” method. 320f0c3a346SStepan Dyatkovskiy 321f0c3a346SStepan DyatkovskiyWe also associate BBs with each other, like we did it with function formal 322f0c3a346SStepan Dyatkovskiyarguments (see ``cmpValues`` method below). 323f0c3a346SStepan Dyatkovskiy 324f0c3a346SStepan DyatkovskiyFunctionComparator::cmpType 325f0c3a346SStepan Dyatkovskiy--------------------------- 3266373f5ddSAditya KumarConsider how type comparison works. 327f0c3a346SStepan Dyatkovskiy 328f0c3a346SStepan Dyatkovskiy1. Coerce pointer to integer. If left type is a pointer, try to coerce it to the 329f0c3a346SStepan Dyatkovskiyinteger type. It could be done if its address space is 0, or if address spaces 330f0c3a346SStepan Dyatkovskiyare ignored at all. Do the same thing for the right type. 331f0c3a346SStepan Dyatkovskiy 332f0c3a346SStepan Dyatkovskiy2. If left and right types are equal, return 0. Otherwise we need to give 333f0c3a346SStepan Dyatkovskiypreference to one of them. So proceed to the next step. 334f0c3a346SStepan Dyatkovskiy 335f0c3a346SStepan Dyatkovskiy3. If types are of different kind (different type IDs). Return result of type 3366373f5ddSAditya KumarIDs comparison, treating them as numbers (use ``cmpNumbers`` operation). 337f0c3a346SStepan Dyatkovskiy 338f0c3a346SStepan Dyatkovskiy4. If types are vectors or integers, return result of their pointers comparison, 339f0c3a346SStepan Dyatkovskiycomparing them as numbers. 340f0c3a346SStepan Dyatkovskiy 341f0c3a346SStepan Dyatkovskiy5. Check whether type ID belongs to the next group (call it equivalent-group): 342f0c3a346SStepan Dyatkovskiy 343f0c3a346SStepan Dyatkovskiy * Void 344f0c3a346SStepan Dyatkovskiy 345f0c3a346SStepan Dyatkovskiy * Float 346f0c3a346SStepan Dyatkovskiy 347f0c3a346SStepan Dyatkovskiy * Double 348f0c3a346SStepan Dyatkovskiy 349f0c3a346SStepan Dyatkovskiy * X86_FP80 350f0c3a346SStepan Dyatkovskiy 351f0c3a346SStepan Dyatkovskiy * FP128 352f0c3a346SStepan Dyatkovskiy 353f0c3a346SStepan Dyatkovskiy * PPC_FP128 354f0c3a346SStepan Dyatkovskiy 355f0c3a346SStepan Dyatkovskiy * Label 356f0c3a346SStepan Dyatkovskiy 357f0c3a346SStepan Dyatkovskiy * Metadata. 358f0c3a346SStepan Dyatkovskiy 359f0c3a346SStepan Dyatkovskiy If ID belongs to group above, return 0. Since it's enough to see that 360f0c3a346SStepan Dyatkovskiy types has the same ``TypeID``. No additional information is required. 361f0c3a346SStepan Dyatkovskiy 362f0c3a346SStepan Dyatkovskiy6. Left and right are pointers. Return result of address space comparison 363f0c3a346SStepan Dyatkovskiy(numbers comparison). 364f0c3a346SStepan Dyatkovskiy 365f0c3a346SStepan Dyatkovskiy7. Complex types (structures, arrays, etc.). Follow complex objects comparison 366f0c3a346SStepan Dyatkovskiytechnique (see the very first paragraph of this chapter). Both *left* and 367f0c3a346SStepan Dyatkovskiy*right* are to be expanded and their element types will be checked the same 368f0c3a346SStepan Dyatkovskiyway. If we get -1 or 1 on some stage, return it. Otherwise return 0. 369f0c3a346SStepan Dyatkovskiy 370f0c3a346SStepan Dyatkovskiy8. Steps 1-6 describe all the possible cases, if we passed steps 1-6 and didn't 3716373f5ddSAditya Kumarget any conclusions, then invoke ``llvm_unreachable``, since it's quite an 372f0c3a346SStepan Dyatkovskiyunexpectable case. 373f0c3a346SStepan Dyatkovskiy 374f0c3a346SStepan DyatkovskiycmpValues(const Value*, const Value*) 375f0c3a346SStepan Dyatkovskiy------------------------------------- 376f0c3a346SStepan DyatkovskiyMethod that compares local values. 377f0c3a346SStepan Dyatkovskiy 3786373f5ddSAditya KumarThis method gives us an answer to a very curious question: whether we could 3796373f5ddSAditya Kumartreat local values as equal, and which value is greater otherwise. It's 3806373f5ddSAditya Kumarbetter to start from example: 381f0c3a346SStepan Dyatkovskiy 3826373f5ddSAditya KumarConsider the situation when we're looking at the same place in left 3836373f5ddSAditya Kumarfunction "*FL*" and in right function "*FR*". Every part of *left* place is 3846373f5ddSAditya Kumarequal to the corresponding part of *right* place, and (!) both parts use 3856373f5ddSAditya Kumar*Value* instances, for example: 386f0c3a346SStepan Dyatkovskiy 387124f2593SRenato Golin.. code-block:: text 388f0c3a346SStepan Dyatkovskiy 389f0c3a346SStepan Dyatkovskiy instr0 i32 %LV ; left side, function FL 390f0c3a346SStepan Dyatkovskiy instr0 i32 %RV ; right side, function FR 391f0c3a346SStepan Dyatkovskiy 392f0c3a346SStepan DyatkovskiySo, now our conclusion depends on *Value* instances comparison. 393f0c3a346SStepan Dyatkovskiy 3946373f5ddSAditya KumarThe main purpose of this method is to determine relation between such values. 395f0c3a346SStepan Dyatkovskiy 3966373f5ddSAditya KumarWhat can we expect from equal functions? At the same place, in functions 3976373f5ddSAditya Kumar"*FL*" and "*FR*" we expect to see *equal* values, or values *defined* at 3986373f5ddSAditya Kumarthe same place in "*FL*" and "*FR*". 399f0c3a346SStepan Dyatkovskiy 4006373f5ddSAditya KumarConsider a small example here: 401f0c3a346SStepan Dyatkovskiy 402124f2593SRenato Golin.. code-block:: text 403f0c3a346SStepan Dyatkovskiy 404f0c3a346SStepan Dyatkovskiy define void %f(i32 %pf0, i32 %pf1) { 405f0c3a346SStepan Dyatkovskiy instr0 i32 %pf0 instr1 i32 %pf1 instr2 i32 123 406f0c3a346SStepan Dyatkovskiy } 407f0c3a346SStepan Dyatkovskiy 408124f2593SRenato Golin.. code-block:: text 409f0c3a346SStepan Dyatkovskiy 410f0c3a346SStepan Dyatkovskiy define void %g(i32 %pg0, i32 %pg1) { 411f0c3a346SStepan Dyatkovskiy instr0 i32 %pg0 instr1 i32 %pg0 instr2 i32 123 412f0c3a346SStepan Dyatkovskiy } 413f0c3a346SStepan Dyatkovskiy 4146373f5ddSAditya KumarIn this example, *pf0* is associated with *pg0*, *pf1* is associated with 4156373f5ddSAditya Kumar*pg1*, and we also declare that *pf0* < *pf1*, and thus *pg0* < *pf1*. 416f0c3a346SStepan Dyatkovskiy 417f0c3a346SStepan DyatkovskiyInstructions with opcode "*instr0*" would be *equal*, since their types and 418f0c3a346SStepan Dyatkovskiyopcodes are equal, and values are *associated*. 419f0c3a346SStepan Dyatkovskiy 4206373f5ddSAditya KumarInstructions with opcode "*instr1*" from *f* is *greater* than instructions 4216373f5ddSAditya Kumarwith opcode "*instr1*" from *g*; here we have equal types and opcodes, but 4226373f5ddSAditya Kumar"*pf1* is greater than "*pg0*". 423f0c3a346SStepan Dyatkovskiy 4246373f5ddSAditya KumarInstructions with opcode "*instr2*" are equal, because their opcodes and 425f0c3a346SStepan Dyatkovskiytypes are equal, and the same constant is used as a value. 426f0c3a346SStepan Dyatkovskiy 4276373f5ddSAditya KumarWhat we associate in cmpValues? 428f0c3a346SStepan Dyatkovskiy^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 429f0c3a346SStepan Dyatkovskiy* Function arguments. *i*-th argument from left function associated with 430f0c3a346SStepan Dyatkovskiy *i*-th argument from right function. 431f0c3a346SStepan Dyatkovskiy* BasicBlock instances. In basic-block enumeration loop we associate *i*-th 432f0c3a346SStepan Dyatkovskiy BasicBlock from the left function with *i*-th BasicBlock from the right 433f0c3a346SStepan Dyatkovskiy function. 434f0c3a346SStepan Dyatkovskiy* Instructions. 435f0c3a346SStepan Dyatkovskiy* Instruction operands. Note, we can meet *Value* here we have never seen 436f0c3a346SStepan Dyatkovskiy before. In this case it is not a function argument, nor *BasicBlock*, nor 4376373f5ddSAditya Kumar *Instruction*. It is a global value. It is a constant, since it's the only 4386373f5ddSAditya Kumar supposed global here. The method also compares: Constants that are of the 4396373f5ddSAditya Kumar same type and if right constant can be losslessly bit-casted to the left 4406373f5ddSAditya Kumar one, then we also compare them. 441f0c3a346SStepan Dyatkovskiy 442f0c3a346SStepan DyatkovskiyHow to implement cmpValues? 443f0c3a346SStepan Dyatkovskiy^^^^^^^^^^^^^^^^^^^^^^^^^^^ 4446373f5ddSAditya Kumar*Association* is a case of equality for us. We just treat such values as equal, 4456373f5ddSAditya Kumarbut, in general, we need to implement antisymmetric relation. As mentioned 4466373f5ddSAditya Kumarabove, to understand what is *less*, we can use order in which we 4476373f5ddSAditya Kumarmeet values. If both values have the same order in a function (met at the same 4486373f5ddSAditya Kumartime), we then treat values as *associated*. Otherwise – it depends on who was 449f0c3a346SStepan Dyatkovskiyfirst. 450f0c3a346SStepan Dyatkovskiy 4516373f5ddSAditya KumarEvery time we run the top-level compare method, we initialize two identical 4526373f5ddSAditya Kumarmaps (one for the left side, another one for the right side): 453f0c3a346SStepan Dyatkovskiy 454f0c3a346SStepan Dyatkovskiy``map<Value, int> sn_mapL, sn_mapR;`` 455f0c3a346SStepan Dyatkovskiy 456f0c3a346SStepan DyatkovskiyThe key of the map is the *Value* itself, the *value* – is its order (call it 457f0c3a346SStepan Dyatkovskiy*serial number*). 458f0c3a346SStepan Dyatkovskiy 459f0c3a346SStepan DyatkovskiyTo add value *V* we need to perform the next procedure: 460f0c3a346SStepan Dyatkovskiy 461f0c3a346SStepan Dyatkovskiy``sn_map.insert(std::make_pair(V, sn_map.size()));`` 462f0c3a346SStepan Dyatkovskiy 4636373f5ddSAditya KumarFor the first *Value*, map will return *0*, for the second *Value* map will 4646373f5ddSAditya Kumarreturn *1*, and so on. 465f0c3a346SStepan Dyatkovskiy 4666373f5ddSAditya KumarWe can then check whether left and right values met at the same time with 4676373f5ddSAditya Kumara simple comparison: 468f0c3a346SStepan Dyatkovskiy 469f0c3a346SStepan Dyatkovskiy``cmpNumbers(sn_mapL[Left], sn_mapR[Right]);`` 470f0c3a346SStepan Dyatkovskiy 471f0c3a346SStepan DyatkovskiyOf course, we can combine insertion and comparison: 472f0c3a346SStepan Dyatkovskiy 473f0c3a346SStepan Dyatkovskiy.. code-block:: c++ 474f0c3a346SStepan Dyatkovskiy 475f0c3a346SStepan Dyatkovskiy std::pair<iterator, bool> 476f0c3a346SStepan Dyatkovskiy LeftRes = sn_mapL.insert(std::make_pair(Left, sn_mapL.size())), RightRes 477f0c3a346SStepan Dyatkovskiy = sn_mapR.insert(std::make_pair(Right, sn_mapR.size())); 478f0c3a346SStepan Dyatkovskiy return cmpNumbers(LeftRes.first->second, RightRes.first->second); 479f0c3a346SStepan Dyatkovskiy 480f0c3a346SStepan DyatkovskiyLet's look, how whole method could be implemented. 481f0c3a346SStepan Dyatkovskiy 4826373f5ddSAditya Kumar1. We have to start with the bad news. Consider function self and 483f0c3a346SStepan Dyatkovskiycross-referencing cases: 484f0c3a346SStepan Dyatkovskiy 485f0c3a346SStepan Dyatkovskiy.. code-block:: c++ 486f0c3a346SStepan Dyatkovskiy 487f0c3a346SStepan Dyatkovskiy // self-reference unsigned fact0(unsigned n) { return n > 1 ? n 488f0c3a346SStepan Dyatkovskiy * fact0(n-1) : 1; } unsigned fact1(unsigned n) { return n > 1 ? n * 489f0c3a346SStepan Dyatkovskiy fact1(n-1) : 1; } 490f0c3a346SStepan Dyatkovskiy 491f0c3a346SStepan Dyatkovskiy // cross-reference unsigned ping(unsigned n) { return n!= 0 ? pong(n-1) : 0; 492f0c3a346SStepan Dyatkovskiy } unsigned pong(unsigned n) { return n!= 0 ? ping(n-1) : 0; } 493f0c3a346SStepan Dyatkovskiy 494f0c3a346SStepan Dyatkovskiy.. 495f0c3a346SStepan Dyatkovskiy 496f0c3a346SStepan Dyatkovskiy This comparison has been implemented in initial *MergeFunctions* pass 497f0c3a346SStepan Dyatkovskiy version. But, unfortunately, it is not transitive. And this is the only case 498f0c3a346SStepan Dyatkovskiy we can't convert to less-equal-greater comparison. It is a seldom case, 4-5 4996373f5ddSAditya Kumar functions of 10000 (checked in test-suite), and, we hope, the reader would 500f0c3a346SStepan Dyatkovskiy forgive us for such a sacrifice in order to get the O(log(N)) pass time. 501f0c3a346SStepan Dyatkovskiy 502f0c3a346SStepan Dyatkovskiy2. If left/right *Value* is a constant, we have to compare them. Return 0 if it 503f0c3a346SStepan Dyatkovskiyis the same constant, or use ``cmpConstants`` method otherwise. 504f0c3a346SStepan Dyatkovskiy 505f0c3a346SStepan Dyatkovskiy3. If left/right is *InlineAsm* instance. Return result of *Value* pointers 506f0c3a346SStepan Dyatkovskiycomparison. 507f0c3a346SStepan Dyatkovskiy 508f0c3a346SStepan Dyatkovskiy4. Explicit association of *L* (left value) and *R* (right value). We need to 509f0c3a346SStepan Dyatkovskiyfind out whether values met at the same time, and thus are *associated*. Or we 5106373f5ddSAditya Kumarneed to put the rule: when we treat *L* < *R*. Now it is easy: we just return 5116373f5ddSAditya Kumarthe result of numbers comparison: 512f0c3a346SStepan Dyatkovskiy 513f0c3a346SStepan Dyatkovskiy.. code-block:: c++ 514f0c3a346SStepan Dyatkovskiy 515f0c3a346SStepan Dyatkovskiy std::pair<iterator, bool> 516f0c3a346SStepan Dyatkovskiy LeftRes = sn_mapL.insert(std::make_pair(Left, sn_mapL.size())), 517f0c3a346SStepan Dyatkovskiy RightRes = sn_mapR.insert(std::make_pair(Right, sn_mapR.size())); 518f0c3a346SStepan Dyatkovskiy if (LeftRes.first->second == RightRes.first->second) return 0; 519f0c3a346SStepan Dyatkovskiy if (LeftRes.first->second < RightRes.first->second) return -1; 520f0c3a346SStepan Dyatkovskiy return 1; 521f0c3a346SStepan Dyatkovskiy 5226373f5ddSAditya KumarNow when *cmpValues* returns 0, we can proceed the comparison procedure. 5236373f5ddSAditya KumarOtherwise, if we get (-1 or 1), we need to pass this result to the top level, 5246373f5ddSAditya Kumarand finish comparison procedure. 525f0c3a346SStepan Dyatkovskiy 526f0c3a346SStepan DyatkovskiycmpConstants 527f0c3a346SStepan Dyatkovskiy------------ 528f0c3a346SStepan DyatkovskiyPerforms constants comparison as follows: 529f0c3a346SStepan Dyatkovskiy 5306373f5ddSAditya Kumar1. Compare constant types using ``cmpType`` method. If the result is -1 or 1, 5316373f5ddSAditya Kumargoto step 2, otherwise proceed to step 3. 532f0c3a346SStepan Dyatkovskiy 533f0c3a346SStepan Dyatkovskiy2. If types are different, we still can check whether constants could be 534f0c3a346SStepan Dyatkovskiylosslessly bitcasted to each other. The further explanation is modification of 535f0c3a346SStepan Dyatkovskiy``canLosslesslyBitCastTo`` method. 536f0c3a346SStepan Dyatkovskiy 537f0c3a346SStepan Dyatkovskiy 2.1 Check whether constants are of the first class types 538f0c3a346SStepan Dyatkovskiy (``isFirstClassType`` check): 539f0c3a346SStepan Dyatkovskiy 540f0c3a346SStepan Dyatkovskiy 2.1.1. If both constants are *not* of the first class type: return result 541f0c3a346SStepan Dyatkovskiy of ``cmpType``. 542f0c3a346SStepan Dyatkovskiy 543f0c3a346SStepan Dyatkovskiy 2.1.2. Otherwise, if left type is not of the first class, return -1. If 544f0c3a346SStepan Dyatkovskiy right type is not of the first class, return 1. 545f0c3a346SStepan Dyatkovskiy 546f0c3a346SStepan Dyatkovskiy 2.1.3. If both types are of the first class type, proceed to the next step 547f0c3a346SStepan Dyatkovskiy (2.1.3.1). 548f0c3a346SStepan Dyatkovskiy 549f0c3a346SStepan Dyatkovskiy 2.1.3.1. If types are vectors, compare their bitwidth using the 550f0c3a346SStepan Dyatkovskiy *cmpNumbers*. If result is not 0, return it. 551f0c3a346SStepan Dyatkovskiy 552f0c3a346SStepan Dyatkovskiy 2.1.3.2. Different types, but not a vectors: 553f0c3a346SStepan Dyatkovskiy 554f0c3a346SStepan Dyatkovskiy * if both of them are pointers, good for us, we can proceed to step 3. 555f0c3a346SStepan Dyatkovskiy * if one of types is pointer, return result of *isPointer* flags 556f0c3a346SStepan Dyatkovskiy comparison (*cmpFlags* operation). 557f0c3a346SStepan Dyatkovskiy * otherwise we have no methods to prove bitcastability, and thus return 558f0c3a346SStepan Dyatkovskiy result of types comparison (-1 or 1). 559f0c3a346SStepan Dyatkovskiy 560f0c3a346SStepan DyatkovskiySteps below are for the case when types are equal, or case when constants are 561f0c3a346SStepan Dyatkovskiybitcastable: 562f0c3a346SStepan Dyatkovskiy 563f0c3a346SStepan Dyatkovskiy3. One of constants is a "*null*" value. Return the result of 564f0c3a346SStepan Dyatkovskiy``cmpFlags(L->isNullValue, R->isNullValue)`` comparison. 565f0c3a346SStepan Dyatkovskiy 566f0c3a346SStepan Dyatkovskiy4. Compare value IDs, and return result if it is not 0: 567f0c3a346SStepan Dyatkovskiy 568f0c3a346SStepan Dyatkovskiy.. code-block:: c++ 569f0c3a346SStepan Dyatkovskiy 570f0c3a346SStepan Dyatkovskiy if (int Res = cmpNumbers(L->getValueID(), R->getValueID())) 571f0c3a346SStepan Dyatkovskiy return Res; 572f0c3a346SStepan Dyatkovskiy 5736373f5ddSAditya Kumar5. Compare the contents of constants. The comparison depends on the kind of 574f0c3a346SStepan Dyatkovskiyconstants, but on this stage it is just a lexicographical comparison. Just see 575f0c3a346SStepan Dyatkovskiyhow it was described in the beginning of "*Functions comparison*" paragraph. 5766373f5ddSAditya KumarMathematically, it is equal to the next case: we encode left constant and right 577f0c3a346SStepan Dyatkovskiyconstant (with similar way *bitcode-writer* does). Then compare left code 578f0c3a346SStepan Dyatkovskiysequence and right code sequence. 579f0c3a346SStepan Dyatkovskiy 580f0c3a346SStepan Dyatkovskiycompare(const BasicBlock*, const BasicBlock*) 581f0c3a346SStepan Dyatkovskiy--------------------------------------------- 582f0c3a346SStepan DyatkovskiyCompares two *BasicBlock* instances. 583f0c3a346SStepan Dyatkovskiy 584f0c3a346SStepan DyatkovskiyIt enumerates instructions from left *BB* and right *BB*. 585f0c3a346SStepan Dyatkovskiy 586f0c3a346SStepan Dyatkovskiy1. It assigns serial numbers to the left and right instructions, using 587f0c3a346SStepan Dyatkovskiy``cmpValues`` method. 588f0c3a346SStepan Dyatkovskiy 589f0c3a346SStepan Dyatkovskiy2. If one of left or right is *GEP* (``GetElementPtr``), then treat *GEP* as 5906373f5ddSAditya Kumargreater than other instructions. If both instructions are *GEPs* use ``cmpGEP`` 591f0c3a346SStepan Dyatkovskiymethod for comparison. If result is -1 or 1, pass it to the top-level 592f0c3a346SStepan Dyatkovskiycomparison (return it). 593f0c3a346SStepan Dyatkovskiy 594f0c3a346SStepan Dyatkovskiy 3.1. Compare operations. Call ``cmpOperation`` method. If result is -1 or 595f0c3a346SStepan Dyatkovskiy 1, return it. 596f0c3a346SStepan Dyatkovskiy 597f0c3a346SStepan Dyatkovskiy 3.2. Compare number of operands, if result is -1 or 1, return it. 598f0c3a346SStepan Dyatkovskiy 599f0c3a346SStepan Dyatkovskiy 3.3. Compare operands themselves, use ``cmpValues`` method. Return result 600f0c3a346SStepan Dyatkovskiy if it is -1 or 1. 601f0c3a346SStepan Dyatkovskiy 602f0c3a346SStepan Dyatkovskiy 3.4. Compare type of operands, using ``cmpType`` method. Return result if 603f0c3a346SStepan Dyatkovskiy it is -1 or 1. 604f0c3a346SStepan Dyatkovskiy 605f0c3a346SStepan Dyatkovskiy 3.5. Proceed to the next instruction. 606f0c3a346SStepan Dyatkovskiy 607f0c3a346SStepan Dyatkovskiy4. We can finish instruction enumeration in 3 cases: 608f0c3a346SStepan Dyatkovskiy 609f0c3a346SStepan Dyatkovskiy 4.1. We reached the end of both left and right basic-blocks. We didn't 6106373f5ddSAditya Kumar exit on steps 1-3, so contents are equal, return 0. 611f0c3a346SStepan Dyatkovskiy 612f0c3a346SStepan Dyatkovskiy 4.2. We have reached the end of the left basic-block. Return -1. 613f0c3a346SStepan Dyatkovskiy 6146373f5ddSAditya Kumar 4.3. Return 1 (we reached the end of the right basic block). 615f0c3a346SStepan Dyatkovskiy 616f0c3a346SStepan DyatkovskiycmpGEP 617f0c3a346SStepan Dyatkovskiy------ 618f0c3a346SStepan DyatkovskiyCompares two GEPs (``getelementptr`` instructions). 619f0c3a346SStepan Dyatkovskiy 620f0c3a346SStepan DyatkovskiyIt differs from regular operations comparison with the only thing: possibility 621f0c3a346SStepan Dyatkovskiyto use ``accumulateConstantOffset`` method. 622f0c3a346SStepan Dyatkovskiy 623f0c3a346SStepan DyatkovskiySo, if we get constant offset for both left and right *GEPs*, then compare it as 624f0c3a346SStepan Dyatkovskiynumbers, and return comparison result. 625f0c3a346SStepan Dyatkovskiy 626f0c3a346SStepan DyatkovskiyOtherwise treat it like a regular operation (see previous paragraph). 627f0c3a346SStepan Dyatkovskiy 628f0c3a346SStepan DyatkovskiycmpOperation 629f0c3a346SStepan Dyatkovskiy------------ 630f0c3a346SStepan DyatkovskiyCompares instruction opcodes and some important operation properties. 631f0c3a346SStepan Dyatkovskiy 632f0c3a346SStepan Dyatkovskiy1. Compare opcodes, if it differs return the result. 633f0c3a346SStepan Dyatkovskiy 634f0c3a346SStepan Dyatkovskiy2. Compare number of operands. If it differs – return the result. 635f0c3a346SStepan Dyatkovskiy 636f0c3a346SStepan Dyatkovskiy3. Compare operation types, use *cmpType*. All the same – if types are 637f0c3a346SStepan Dyatkovskiydifferent, return result. 638f0c3a346SStepan Dyatkovskiy 639f0c3a346SStepan Dyatkovskiy4. Compare *subclassOptionalData*, get it with ``getRawSubclassOptionalData`` 640f0c3a346SStepan Dyatkovskiymethod, and compare it like a numbers. 641f0c3a346SStepan Dyatkovskiy 642f0c3a346SStepan Dyatkovskiy5. Compare operand types. 643f0c3a346SStepan Dyatkovskiy 6446373f5ddSAditya Kumar6. For some particular instructions, check equivalence (relation in our case) of 6456373f5ddSAditya Kumarsome significant attributes. For example, we have to compare alignment for 646f0c3a346SStepan Dyatkovskiy``load`` instructions. 647f0c3a346SStepan Dyatkovskiy 648f0c3a346SStepan DyatkovskiyO(log(N)) 649f0c3a346SStepan Dyatkovskiy--------- 650f0c3a346SStepan DyatkovskiyMethods described above implement order relationship. And latter, could be used 651f0c3a346SStepan Dyatkovskiyfor nodes comparison in a binary tree. So we can organize functions set into 652f0c3a346SStepan Dyatkovskiythe binary tree and reduce the cost of lookup procedure from 653f0c3a346SStepan DyatkovskiyO(N*N) to O(log(N)). 654f0c3a346SStepan Dyatkovskiy 655f0c3a346SStepan DyatkovskiyMerging process, mergeTwoFunctions 656f0c3a346SStepan Dyatkovskiy================================== 657f0c3a346SStepan DyatkovskiyOnce *MergeFunctions* detected that current function (*G*) is equal to one that 658f0c3a346SStepan Dyatkovskiywere analyzed before (function *F*) it calls ``mergeTwoFunctions(Function*, 659f0c3a346SStepan DyatkovskiyFunction*)``. 660f0c3a346SStepan Dyatkovskiy 661f0c3a346SStepan DyatkovskiyOperation affects ``FnTree`` contents with next way: *F* will stay in 662f0c3a346SStepan Dyatkovskiy``FnTree``. *G* being equal to *F* will not be added to ``FnTree``. Calls of 663f0c3a346SStepan Dyatkovskiy*G* would be replaced with something else. It changes bodies of callers. So, 664f0c3a346SStepan Dyatkovskiyfunctions that calls *G* would be put into ``Deferred`` set and removed from 665f0c3a346SStepan Dyatkovskiy``FnTree``, and analyzed again. 666f0c3a346SStepan Dyatkovskiy 667f0c3a346SStepan DyatkovskiyThe approach is next: 668f0c3a346SStepan Dyatkovskiy 669f0c3a346SStepan Dyatkovskiy1. Most wished case: when we can use alias and both of *F* and *G* are weak. We 670f0c3a346SStepan Dyatkovskiymake both of them with aliases to the third strong function *H*. Actually *H* 671f0c3a346SStepan Dyatkovskiyis *F*. See below how it's made (but it's better to look straight into the 672f0c3a346SStepan Dyatkovskiysource code). Well, this is a case when we can just replace *G* with *F* 673f0c3a346SStepan Dyatkovskiyeverywhere, we use ``replaceAllUsesWith`` operation here (*RAUW*). 674f0c3a346SStepan Dyatkovskiy 675f0c3a346SStepan Dyatkovskiy2. *F* could not be overridden, while *G* could. It would be good to do the 676f0c3a346SStepan Dyatkovskiynext: after merging the places where overridable function were used, still use 677f0c3a346SStepan Dyatkovskiyoverridable stub. So try to make *G* alias to *F*, or create overridable tail 678f0c3a346SStepan Dyatkovskiycall wrapper around *F* and replace *G* with that call. 679f0c3a346SStepan Dyatkovskiy 680f0c3a346SStepan Dyatkovskiy3. Neither *F* nor *G* could be overridden. We can't use *RAUW*. We can just 681f0c3a346SStepan Dyatkovskiychange the callers: call *F* instead of *G*. That's what 682f0c3a346SStepan Dyatkovskiy``replaceDirectCallers`` does. 683f0c3a346SStepan Dyatkovskiy 6846373f5ddSAditya KumarBelow is a detailed body description. 685f0c3a346SStepan Dyatkovskiy 686f0c3a346SStepan DyatkovskiyIf “F” may be overridden 687f0c3a346SStepan Dyatkovskiy------------------------ 688f0c3a346SStepan DyatkovskiyAs follows from ``mayBeOverridden`` comments: “whether the definition of this 68984666a19SSylvestre Ledruglobal may be replaced by something non-equivalent at link time”. If so, that's 690f0c3a346SStepan Dyatkovskiyok: we can use alias to *F* instead of *G* or change call instructions itself. 691f0c3a346SStepan Dyatkovskiy 692f0c3a346SStepan DyatkovskiyHasGlobalAliases, removeUsers 693f0c3a346SStepan Dyatkovskiy^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 694f0c3a346SStepan DyatkovskiyFirst consider the case when we have global aliases of one function name to 695f0c3a346SStepan Dyatkovskiyanother. Our purpose is make both of them with aliases to the third strong 696f0c3a346SStepan Dyatkovskiyfunction. Though if we keep *F* alive and without major changes we can leave it 697f0c3a346SStepan Dyatkovskiyin ``FnTree``. Try to combine these two goals. 698f0c3a346SStepan Dyatkovskiy 699f0c3a346SStepan DyatkovskiyDo stub replacement of *F* itself with an alias to *F*. 700f0c3a346SStepan Dyatkovskiy 701f0c3a346SStepan Dyatkovskiy1. Create stub function *H*, with the same name and attributes like function 702f0c3a346SStepan Dyatkovskiy*F*. It takes maximum alignment of *F* and *G*. 703f0c3a346SStepan Dyatkovskiy 704f0c3a346SStepan Dyatkovskiy2. Replace all uses of function *F* with uses of function *H*. It is the two 705f0c3a346SStepan Dyatkovskiysteps procedure instead. First of all, we must take into account, all functions 706f0c3a346SStepan Dyatkovskiyfrom whom *F* is called would be changed: since we change the call argument 707f0c3a346SStepan Dyatkovskiy(from *F* to *H*). If so we must to review these caller functions again after 708f0c3a346SStepan Dyatkovskiythis procedure. We remove callers from ``FnTree``, method with name 709f0c3a346SStepan Dyatkovskiy``removeUsers(F)`` does that (don't confuse with ``replaceAllUsesWith``): 710f0c3a346SStepan Dyatkovskiy 711f0c3a346SStepan Dyatkovskiy 2.1. ``Inside removeUsers(Value* 712f0c3a346SStepan Dyatkovskiy V)`` we go through the all values that use value *V* (or *F* in our context). 713f0c3a346SStepan Dyatkovskiy If value is instruction, we go to function that holds this instruction and 714f0c3a346SStepan Dyatkovskiy mark it as to-be-analyzed-again (put to ``Deferred`` set), we also remove 715f0c3a346SStepan Dyatkovskiy caller from ``FnTree``. 716f0c3a346SStepan Dyatkovskiy 717f0c3a346SStepan Dyatkovskiy 2.2. Now we can do the replacement: call ``F->replaceAllUsesWith(H)``. 718f0c3a346SStepan Dyatkovskiy 719f0c3a346SStepan Dyatkovskiy3. *H* (that now "officially" plays *F*'s role) is replaced with alias to *F*. 720f0c3a346SStepan DyatkovskiyDo the same with *G*: replace it with alias to *F*. So finally everywhere *F* 721f0c3a346SStepan Dyatkovskiywas used, we use *H* and it is alias to *F*, and everywhere *G* was used we 722f0c3a346SStepan Dyatkovskiyalso have alias to *F*. 723f0c3a346SStepan Dyatkovskiy 724f0c3a346SStepan Dyatkovskiy4. Set *F* linkage to private. Make it strong :-) 725f0c3a346SStepan Dyatkovskiy 726f0c3a346SStepan DyatkovskiyNo global aliases, replaceDirectCallers 727f0c3a346SStepan Dyatkovskiy^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 7286373f5ddSAditya KumarIf global aliases are not supported. We call ``replaceDirectCallers``. Just 729f0c3a346SStepan Dyatkovskiygo through all calls of *G* and replace it with calls of *F*. If you look into 7306373f5ddSAditya Kumarthe method you will see that it scans all uses of *G* too, and if use is callee 7316373f5ddSAditya Kumar(if user is call instruction and *G* is used as what to be called), we replace 7326373f5ddSAditya Kumarit with use of *F*. 733f0c3a346SStepan Dyatkovskiy 734f0c3a346SStepan DyatkovskiyIf “F” could not be overridden, fix it! 735f0c3a346SStepan Dyatkovskiy""""""""""""""""""""""""""""""""""""""" 736f0c3a346SStepan Dyatkovskiy 737f0c3a346SStepan DyatkovskiyWe call ``writeThunkOrAlias(Function *F, Function *G)``. Here we try to replace 7386373f5ddSAditya Kumar*G* with alias to *F* first. The next conditions are essential: 739f0c3a346SStepan Dyatkovskiy 740f0c3a346SStepan Dyatkovskiy* target should support global aliases, 741f0c3a346SStepan Dyatkovskiy* the address itself of *G* should be not significant, not named and not 742f0c3a346SStepan Dyatkovskiy referenced anywhere, 743f0c3a346SStepan Dyatkovskiy* function should come with external, local or weak linkage. 744f0c3a346SStepan Dyatkovskiy 745f0c3a346SStepan DyatkovskiyOtherwise we write thunk: some wrapper that has *G's* interface and calls *F*, 746f0c3a346SStepan Dyatkovskiyso *G* could be replaced with this wrapper. 747f0c3a346SStepan Dyatkovskiy 748f0c3a346SStepan Dyatkovskiy*writeAlias* 749f0c3a346SStepan Dyatkovskiy 750f0c3a346SStepan DyatkovskiyAs follows from *llvm* reference: 751f0c3a346SStepan Dyatkovskiy 752f0c3a346SStepan Dyatkovskiy“Aliases act as *second name* for the aliasee value”. So we just want to create 7536373f5ddSAditya Kumara second name for *F* and use it instead of *G*: 754f0c3a346SStepan Dyatkovskiy 755f0c3a346SStepan Dyatkovskiy1. create global alias itself (*GA*), 756f0c3a346SStepan Dyatkovskiy 757f0c3a346SStepan Dyatkovskiy2. adjust alignment of *F* so it must be maximum of current and *G's* alignment; 758f0c3a346SStepan Dyatkovskiy 759f0c3a346SStepan Dyatkovskiy3. replace uses of *G*: 760f0c3a346SStepan Dyatkovskiy 761f0c3a346SStepan Dyatkovskiy 3.1. first mark all callers of *G* as to-be-analyzed-again, using 762f0c3a346SStepan Dyatkovskiy ``removeUsers`` method (see chapter above), 763f0c3a346SStepan Dyatkovskiy 764f0c3a346SStepan Dyatkovskiy 3.2. call ``G->replaceAllUsesWith(GA)``. 765f0c3a346SStepan Dyatkovskiy 766f0c3a346SStepan Dyatkovskiy4. Get rid of *G*. 767f0c3a346SStepan Dyatkovskiy 768f0c3a346SStepan Dyatkovskiy*writeThunk* 769f0c3a346SStepan Dyatkovskiy 770f0c3a346SStepan DyatkovskiyAs it written in method comments: 771f0c3a346SStepan Dyatkovskiy 772f0c3a346SStepan Dyatkovskiy“Replace G with a simple tail call to bitcast(F). Also replace direct uses of G 773f0c3a346SStepan Dyatkovskiywith bitcast(F). Deletes G.” 774f0c3a346SStepan Dyatkovskiy 775f0c3a346SStepan DyatkovskiyIn general it does the same as usual when we want to replace callee, except the 776f0c3a346SStepan Dyatkovskiyfirst point: 777f0c3a346SStepan Dyatkovskiy 778f0c3a346SStepan Dyatkovskiy1. We generate tail call wrapper around *F*, but with interface that allows use 779f0c3a346SStepan Dyatkovskiyit instead of *G*. 780f0c3a346SStepan Dyatkovskiy 781f0c3a346SStepan Dyatkovskiy2. “As-usual”: ``removeUsers`` and ``replaceAllUsesWith`` then. 782f0c3a346SStepan Dyatkovskiy 783f0c3a346SStepan Dyatkovskiy3. Get rid of *G*. 784f0c3a346SStepan Dyatkovskiy 785f0c3a346SStepan Dyatkovskiy 786