xref: /llvm-project/llvm/docs/MergeFunctions.rst (revision e8fa9014cce41e1057e295b035dda73420612686)
1f0c3a346SStepan Dyatkovskiy=================================
2f0c3a346SStepan DyatkovskiyMergeFunctions pass, how it works
3f0c3a346SStepan Dyatkovskiy=================================
4f0c3a346SStepan Dyatkovskiy
5f0c3a346SStepan Dyatkovskiy.. contents::
6f0c3a346SStepan Dyatkovskiy   :local:
7f0c3a346SStepan Dyatkovskiy
8f0c3a346SStepan DyatkovskiyIntroduction
9f0c3a346SStepan Dyatkovskiy============
10f0c3a346SStepan DyatkovskiySometimes code contains equal functions, or functions that does exactly the same
11f0c3a346SStepan Dyatkovskiything even though they are non-equal on the IR level (e.g.: multiplication on 2
12f0c3a346SStepan Dyatkovskiyand 'shl 1'). It could happen due to several reasons: mainly, the usage of
136373f5ddSAditya Kumartemplates and automatic code generators. Though, sometimes the user itself could
14f0c3a346SStepan Dyatkovskiywrite the same thing twice :-)
15f0c3a346SStepan Dyatkovskiy
16f0c3a346SStepan DyatkovskiyThe main purpose of this pass is to recognize such functions and merge them.
17f0c3a346SStepan Dyatkovskiy
186373f5ddSAditya KumarThis document is the extension to pass comments and describes the pass logic. It
196373f5ddSAditya Kumardescribes the algorithm that is used in order to compare functions and
206373f5ddSAditya Kumarexplains how we could combine equal functions correctly to keep the module
216373f5ddSAditya Kumarvalid.
22f0c3a346SStepan Dyatkovskiy
236373f5ddSAditya KumarMaterial is brought in a top-down form, so the reader could start to learn pass
246373f5ddSAditya Kumarfrom high level ideas and end with low-level algorithm details, thus preparing
256373f5ddSAditya Kumarhim or her for reading the sources.
26f0c3a346SStepan Dyatkovskiy
276373f5ddSAditya KumarThe main goal is to describe the algorithm and logic here and the concept. If
286373f5ddSAditya Kumaryou *don't want* to read the source code, but want to understand pass
296373f5ddSAditya Kumaralgorithms, this document is good for you. The author tries not to repeat the
306373f5ddSAditya Kumarsource-code and covers only common cases to avoid the cases of needing to
316373f5ddSAditya Kumarupdate this document after any minor code changes.
32f0c3a346SStepan Dyatkovskiy
33f0c3a346SStepan Dyatkovskiy
34f0c3a346SStepan DyatkovskiyWhat should I know to be able to follow along with this document?
35f0c3a346SStepan Dyatkovskiy-----------------------------------------------------------------
36f0c3a346SStepan Dyatkovskiy
376373f5ddSAditya KumarThe reader should be familiar with common compile-engineering principles and
386373f5ddSAditya KumarLLVM code fundamentals. In this article, we assume the reader is familiar with
396373f5ddSAditya Kumar`Single Static Assignment
406373f5ddSAditya Kumar<http://en.wikipedia.org/wiki/Static_single_assignment_form>`_
416373f5ddSAditya Kumarconcept and has an understanding of
4272fd1033SSylvestre Ledru`IR structure <https://llvm.org/docs/LangRef.html#high-level-structure>`_.
43f0c3a346SStepan Dyatkovskiy
446373f5ddSAditya KumarWe will use terms such as
4572fd1033SSylvestre Ledru"`module <https://llvm.org/docs/LangRef.html#high-level-structure>`_",
4672fd1033SSylvestre Ledru"`function <https://llvm.org/docs/ProgrammersManual.html#the-function-class>`_",
47f0c3a346SStepan Dyatkovskiy"`basic block <http://en.wikipedia.org/wiki/Basic_block>`_",
4872fd1033SSylvestre Ledru"`user <https://llvm.org/docs/ProgrammersManual.html#the-user-class>`_",
4972fd1033SSylvestre Ledru"`value <https://llvm.org/docs/ProgrammersManual.html#the-value-class>`_",
506373f5ddSAditya Kumar"`instruction
5172fd1033SSylvestre Ledru<https://llvm.org/docs/ProgrammersManual.html#the-instruction-class>`_".
52f0c3a346SStepan Dyatkovskiy
536373f5ddSAditya KumarAs a good starting point, the Kaleidoscope tutorial can be used:
54f0c3a346SStepan Dyatkovskiy
55f0c3a346SStepan Dyatkovskiy:doc:`tutorial/index`
56f0c3a346SStepan Dyatkovskiy
576373f5ddSAditya KumarIt's especially important to understand chapter 3 of tutorial:
58f0c3a346SStepan Dyatkovskiy
59d8b9735aSEtienne Bergeron:doc:`tutorial/LangImpl03`
60f0c3a346SStepan Dyatkovskiy
616373f5ddSAditya KumarThe reader should also know how passes work in LLVM. They could use this
626373f5ddSAditya Kumararticle as a reference and start point here:
63f0c3a346SStepan Dyatkovskiy
64f0c3a346SStepan Dyatkovskiy:doc:`WritingAnLLVMPass`
65f0c3a346SStepan Dyatkovskiy
666373f5ddSAditya KumarWhat else? Well perhaps the reader should also have some experience in LLVM pass
67f0c3a346SStepan Dyatkovskiydebugging and bug-fixing.
68f0c3a346SStepan Dyatkovskiy
69f0c3a346SStepan DyatkovskiyNarrative structure
70f0c3a346SStepan Dyatkovskiy-------------------
716373f5ddSAditya KumarThe article consists of three parts. The first part explains pass functionality
726373f5ddSAditya Kumaron the top-level. The second part describes the comparison procedure itself.
736373f5ddSAditya KumarThe third part describes the merging process.
74f0c3a346SStepan Dyatkovskiy
756373f5ddSAditya KumarIn every part, the author tries to put the contents in the top-down form.
766373f5ddSAditya KumarThe top-level methods will first be described followed by the terminal ones at
776373f5ddSAditya Kumarthe end, in the tail of each part. If the reader sees the reference to the
78dfcd3dcfSChandler Carruthmethod that wasn't described yet, they will find its description a bit below.
79f0c3a346SStepan Dyatkovskiy
80f0c3a346SStepan DyatkovskiyBasics
81f0c3a346SStepan Dyatkovskiy======
82f0c3a346SStepan Dyatkovskiy
83f0c3a346SStepan DyatkovskiyHow to do it?
84f0c3a346SStepan Dyatkovskiy-------------
856373f5ddSAditya KumarDo we need to merge functions? The obvious answer is: Yes, that is quite a
866373f5ddSAditya Kumarpossible case. We usually *do* have duplicates and it would be good to get rid
876373f5ddSAditya Kumarof them. But how do we detect duplicates? This is the idea: we split functions
886373f5ddSAditya Kumarinto smaller bricks or parts and compare the "bricks" amount. If equal,
896373f5ddSAditya Kumarwe compare the "bricks" themselves, and then do our conclusions about functions
90f0c3a346SStepan Dyatkovskiythemselves.
91f0c3a346SStepan Dyatkovskiy
926373f5ddSAditya KumarWhat could the difference be? For example, on a machine with 64-bit pointers
936373f5ddSAditya Kumar(let's assume we have only one address space), one function stores a 64-bit
946373f5ddSAditya Kumarinteger, while another one stores a pointer. If the target is the machine
95f0c3a346SStepan Dyatkovskiymentioned above, and if functions are identical, except the parameter type (we
966373f5ddSAditya Kumarcould consider it as a part of function type), then we can treat a ``uint64_t``
976373f5ddSAditya Kumarand a ``void*`` as equal.
98f0c3a346SStepan Dyatkovskiy
996373f5ddSAditya KumarThis is just an example; more possible details are described a bit below.
100f0c3a346SStepan Dyatkovskiy
1016373f5ddSAditya KumarAs another example, the reader may imagine two more functions. The first
1026a6a83c6SSimon Pilgrimfunction performs a multiplication by 2, while the second one performs an
1036a6a83c6SSimon Pilgrimlogical left shift by 1.
104f0c3a346SStepan Dyatkovskiy
105f0c3a346SStepan DyatkovskiyPossible solutions
106f0c3a346SStepan Dyatkovskiy^^^^^^^^^^^^^^^^^^
107f0c3a346SStepan DyatkovskiyLet's briefly consider possible options about how and what we have to implement
108f0c3a346SStepan Dyatkovskiyin order to create full-featured functions merging, and also what it would
1096373f5ddSAditya Kumarmean for us.
110f0c3a346SStepan Dyatkovskiy
1116373f5ddSAditya KumarEqual function detection obviously supposes that a "detector" method to be
1126373f5ddSAditya Kumarimplemented and latter should answer the question "whether functions are equal".
1136373f5ddSAditya KumarThis "detector" method consists of tiny "sub-detectors", which each answers
114f0c3a346SStepan Dyatkovskiyexactly the same question, but for function parts.
115f0c3a346SStepan Dyatkovskiy
116f0c3a346SStepan DyatkovskiyAs the second step, we should merge equal functions. So it should be a "merger"
117f0c3a346SStepan Dyatkovskiymethod. "Merger" accepts two functions *F1* and *F2*, and produces *F1F2*
118f0c3a346SStepan Dyatkovskiyfunction, the result of merging.
119f0c3a346SStepan Dyatkovskiy
1206373f5ddSAditya KumarHaving such routines in our hands, we can process a whole module, and merge all
121f0c3a346SStepan Dyatkovskiyequal functions.
122f0c3a346SStepan Dyatkovskiy
123f0c3a346SStepan DyatkovskiyIn this case, we have to compare every function with every another function. As
1246373f5ddSAditya Kumarthe reader may notice, this way seems to be quite expensive. Of course we could
125f0c3a346SStepan Dyatkovskiyintroduce hashing and other helpers, but it is still just an optimization, and
126f0c3a346SStepan Dyatkovskiythus the level of O(N*N) complexity.
127f0c3a346SStepan Dyatkovskiy
128f0c3a346SStepan DyatkovskiyCan we reach another level? Could we introduce logarithmical search, or random
129f0c3a346SStepan Dyatkovskiyaccess lookup? The answer is: "yes".
130f0c3a346SStepan Dyatkovskiy
131f0c3a346SStepan DyatkovskiyRandom-access
132f0c3a346SStepan Dyatkovskiy"""""""""""""
1336373f5ddSAditya KumarHow it could this be done? Just convert each function to a number, and gather
1346373f5ddSAditya Kumarall of them in a special hash-table. Functions with equal hashes are equal.
1356373f5ddSAditya KumarGood hashing means, that every function part must be taken into account. That
1366373f5ddSAditya Kumarmeans we have to convert every function part into some number, and then add it
137*e8fa9014SKazu Hiratainto the hash. The lookup-up time would be small, but such an approach adds some
1386373f5ddSAditya Kumardelay due to the hashing routine.
139f0c3a346SStepan Dyatkovskiy
140f0c3a346SStepan DyatkovskiyLogarithmical search
141f0c3a346SStepan Dyatkovskiy""""""""""""""""""""
1426373f5ddSAditya KumarWe could introduce total ordering among the functions set, once ordered we
143f0c3a346SStepan Dyatkovskiycould then implement a logarithmical search. Lookup time still depends on N,
144f0c3a346SStepan Dyatkovskiybut adds a little of delay (*log(N)*).
145f0c3a346SStepan Dyatkovskiy
146f0c3a346SStepan DyatkovskiyPresent state
147f0c3a346SStepan Dyatkovskiy"""""""""""""
1486373f5ddSAditya KumarBoth of the approaches (random-access and logarithmical) have been implemented
1496373f5ddSAditya Kumarand tested and both give a very good improvement. What was most
1506373f5ddSAditya Kumarsurprising is that logarithmical search was faster; sometimes by up to 15%. The
1516373f5ddSAditya Kumarhashing method needs some extra CPU time, which is the main reason why it works
1526373f5ddSAditya Kumarslower; in most cases, total "hashing" time is greater than total
1536373f5ddSAditya Kumar"logarithmical-search" time.
154f0c3a346SStepan Dyatkovskiy
155f0c3a346SStepan DyatkovskiySo, preference has been granted to the "logarithmical search".
156f0c3a346SStepan Dyatkovskiy
157f0c3a346SStepan DyatkovskiyThough in the case of need, *logarithmical-search* (read "total-ordering") could
158f0c3a346SStepan Dyatkovskiybe used as a milestone on our way to the *random-access* implementation.
159f0c3a346SStepan Dyatkovskiy
1606373f5ddSAditya KumarEvery comparison is based either on the numbers or on the flags comparison. In
1616373f5ddSAditya Kumarthe *random-access* approach, we could use the same comparison algorithm.
1626373f5ddSAditya KumarDuring comparison, we exit once we find the difference, but here we might have
1636373f5ddSAditya Kumarto scan the whole function body every time (note, it could be slower). Like in
1646373f5ddSAditya Kumar"total-ordering", we will track every number and flag, but instead of
1656373f5ddSAditya Kumarcomparison, we should get the numbers sequence and then create the hash number.
1666373f5ddSAditya KumarSo, once again, *total-ordering* could be considered as a milestone for even
1676373f5ddSAditya Kumarfaster (in theory) random-access approach.
168f0c3a346SStepan Dyatkovskiy
169f0c3a346SStepan DyatkovskiyMergeFunctions, main fields and runOnModule
170f0c3a346SStepan Dyatkovskiy^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1716373f5ddSAditya KumarThere are two main important fields in the class:
172f0c3a346SStepan Dyatkovskiy
173f0c3a346SStepan Dyatkovskiy``FnTree``  – the set of all unique functions. It keeps items that couldn't be
174f0c3a346SStepan Dyatkovskiymerged with each other. It is defined as:
175f0c3a346SStepan Dyatkovskiy
176f0c3a346SStepan Dyatkovskiy``std::set<FunctionNode> FnTree;``
177f0c3a346SStepan Dyatkovskiy
178f0c3a346SStepan DyatkovskiyHere ``FunctionNode`` is a wrapper for ``llvm::Function`` class, with
179f0c3a346SStepan Dyatkovskiyimplemented “<” operator among the functions set (below we explain how it works
180f0c3a346SStepan Dyatkovskiyexactly; this is a key point in fast functions comparison).
181f0c3a346SStepan Dyatkovskiy
182f0c3a346SStepan Dyatkovskiy``Deferred`` – merging process can affect bodies of functions that are in
1836373f5ddSAditya Kumar``FnTree`` already. Obviously, such functions should be rechecked again. In this
1846373f5ddSAditya Kumarcase, we remove them from ``FnTree``, and mark them to be rescanned, namely
185f0c3a346SStepan Dyatkovskiyput them into ``Deferred`` list.
186f0c3a346SStepan Dyatkovskiy
187f0c3a346SStepan DyatkovskiyrunOnModule
188f0c3a346SStepan Dyatkovskiy"""""""""""
189f0c3a346SStepan DyatkovskiyThe algorithm is pretty simple:
190f0c3a346SStepan Dyatkovskiy
191f0c3a346SStepan Dyatkovskiy1. Put all module's functions into the *worklist*.
192f0c3a346SStepan Dyatkovskiy
193f0c3a346SStepan Dyatkovskiy2. Scan *worklist*'s functions twice: first enumerate only strong functions and
194f0c3a346SStepan Dyatkovskiythen only weak ones:
195f0c3a346SStepan Dyatkovskiy
1966373f5ddSAditya Kumar   2.1. Loop body: take a function from *worklist*  (call it *FCur*) and try to
197f0c3a346SStepan Dyatkovskiy   insert it into *FnTree*: check whether *FCur* is equal to one of functions
1986373f5ddSAditya Kumar   in *FnTree*. If there *is* an equal function in *FnTree*
1996373f5ddSAditya Kumar   (call it *FExists*): merge function *FCur* with *FExists*. Otherwise add
2006373f5ddSAditya Kumar   the function from the *worklist* to *FnTree*.
201f0c3a346SStepan Dyatkovskiy
2026373f5ddSAditya Kumar3. Once the *worklist* scanning and merging operations are complete, check the
2036373f5ddSAditya Kumar*Deferred* list. If it is not empty: refill the *worklist* contents with
2046373f5ddSAditya Kumar*Deferred* list and redo step 2, if the *Deferred* list is empty, then exit
2056373f5ddSAditya Kumarfrom method.
206f0c3a346SStepan Dyatkovskiy
207f0c3a346SStepan DyatkovskiyComparison and logarithmical search
208f0c3a346SStepan Dyatkovskiy"""""""""""""""""""""""""""""""""""
209f0c3a346SStepan DyatkovskiyLet's recall our task: for every function *F* from module *M*, we have to find
2106373f5ddSAditya Kumarequal functions *F`* in the shortest time possible , and merge them into a
2116373f5ddSAditya Kumarsingle function.
212f0c3a346SStepan Dyatkovskiy
2136373f5ddSAditya KumarDefining total ordering among the functions set allows us to organize
2146373f5ddSAditya Kumarfunctions into a binary tree. The lookup procedure complexity would be
2156373f5ddSAditya Kumarestimated as O(log(N)) in this case. But how do we define *total-ordering*?
216f0c3a346SStepan Dyatkovskiy
217f0c3a346SStepan DyatkovskiyWe have to introduce a single rule applicable to every pair of functions, and
2186373f5ddSAditya Kumarfollowing this rule, then evaluate which of them is greater. What kind of rule
2196373f5ddSAditya Kumarcould it be? Let's declare it as the "compare" method that returns one of 3
220f0c3a346SStepan Dyatkovskiypossible values:
221f0c3a346SStepan Dyatkovskiy
222f0c3a346SStepan Dyatkovskiy-1, left is *less* than right,
223f0c3a346SStepan Dyatkovskiy
224f0c3a346SStepan Dyatkovskiy0, left and right are *equal*,
225f0c3a346SStepan Dyatkovskiy
226f0c3a346SStepan Dyatkovskiy1, left is *greater* than right.
227f0c3a346SStepan Dyatkovskiy
228f0c3a346SStepan DyatkovskiyOf course it means, that we have to maintain
229f0c3a346SStepan Dyatkovskiy*strict and non-strict order relation properties*:
230f0c3a346SStepan Dyatkovskiy
231f0c3a346SStepan Dyatkovskiy* reflexivity (``a <= a``, ``a == a``, ``a >= a``),
232f0c3a346SStepan Dyatkovskiy* antisymmetry (if ``a <= b`` and ``b <= a`` then ``a == b``),
233f0c3a346SStepan Dyatkovskiy* transitivity (``a <= b`` and ``b <= c``, then ``a <= c``)
234f0c3a346SStepan Dyatkovskiy* asymmetry (if ``a < b``, then ``a > b`` or ``a == b``).
235f0c3a346SStepan Dyatkovskiy
2366373f5ddSAditya KumarAs mentioned before, the comparison routine consists of
2376373f5ddSAditya Kumar"sub-comparison-routines", with each of them also consisting of
2386373f5ddSAditya Kumar"sub-comparison-routines", and so on. Finally, it ends up with primitive
239f0c3a346SStepan Dyatkovskiycomparison.
240f0c3a346SStepan Dyatkovskiy
2416373f5ddSAditya KumarBelow, we will use the following operations:
242f0c3a346SStepan Dyatkovskiy
2436373f5ddSAditya Kumar#. ``cmpNumbers(number1, number2)`` is a method that returns -1 if left is less
244f0c3a346SStepan Dyatkovskiy   than right; 0, if left and right are equal; and 1 otherwise.
245f0c3a346SStepan Dyatkovskiy
2466373f5ddSAditya Kumar#. ``cmpFlags(flag1, flag2)`` is a hypothetical method that compares two flags.
247f0c3a346SStepan Dyatkovskiy   The logic is the same as in ``cmpNumbers``, where ``true`` is 1, and
248f0c3a346SStepan Dyatkovskiy   ``false`` is 0.
249f0c3a346SStepan Dyatkovskiy
2506373f5ddSAditya KumarThe rest of the article is based on *MergeFunctions.cpp* source code
2516373f5ddSAditya Kumar(found in *<llvm_dir>/lib/Transforms/IPO/MergeFunctions.cpp*). We would like
2526373f5ddSAditya Kumarto ask reader to keep this file open, so we could use it as a reference
2536373f5ddSAditya Kumarfor further explanations.
254f0c3a346SStepan Dyatkovskiy
2556373f5ddSAditya KumarNow, we're ready to proceed to the next chapter and see how it works.
256f0c3a346SStepan Dyatkovskiy
257f0c3a346SStepan DyatkovskiyFunctions comparison
258f0c3a346SStepan Dyatkovskiy====================
259f0c3a346SStepan DyatkovskiyAt first, let's define how exactly we compare complex objects.
260f0c3a346SStepan Dyatkovskiy
2616373f5ddSAditya KumarComplex object comparison (function, basic-block, etc) is mostly based on its
2626373f5ddSAditya Kumarsub-object comparison results. It is similar to the next "tree" objects
263f0c3a346SStepan Dyatkovskiycomparison:
264f0c3a346SStepan Dyatkovskiy
265f0c3a346SStepan Dyatkovskiy#. For two trees *T1* and *T2* we perform *depth-first-traversal* and have
266f0c3a346SStepan Dyatkovskiy   two sequences as a product: "*T1Items*" and "*T2Items*".
267f0c3a346SStepan Dyatkovskiy
2686373f5ddSAditya Kumar#. We then compare chains "*T1Items*" and "*T2Items*" in
2696373f5ddSAditya Kumar   the most-significant-item-first order. The result of items comparison
2706373f5ddSAditya Kumar   would be the result of *T1* and *T2* comparison itself.
271f0c3a346SStepan Dyatkovskiy
272f0c3a346SStepan DyatkovskiyFunctionComparator::compare(void)
273f0c3a346SStepan Dyatkovskiy---------------------------------
2746373f5ddSAditya KumarA brief look at the source code tells us that the comparison starts in the
275f0c3a346SStepan Dyatkovskiy“``int FunctionComparator::compare(void)``” method.
276f0c3a346SStepan Dyatkovskiy
2776373f5ddSAditya Kumar1. The first parts to be compared are the function's attributes and some
2786373f5ddSAditya Kumarproperties that is outside the “attributes” term, but still could make the
2796373f5ddSAditya Kumarfunction different without changing its body. This part of the comparison is
2806373f5ddSAditya Kumarusually done within simple *cmpNumbers* or *cmpFlags* operations (e.g.
2816373f5ddSAditya Kumar``cmpFlags(F1->hasGC(), F2->hasGC())``). Below is a full list of function's
282f0c3a346SStepan Dyatkovskiyproperties to be compared on this stage:
283f0c3a346SStepan Dyatkovskiy
284f0c3a346SStepan Dyatkovskiy  * *Attributes* (those are returned by ``Function::getAttributes()``
285f0c3a346SStepan Dyatkovskiy    method).
286f0c3a346SStepan Dyatkovskiy
287f0c3a346SStepan Dyatkovskiy  * *GC*, for equivalence, *RHS* and *LHS* should be both either without
288f0c3a346SStepan Dyatkovskiy    *GC* or with the same one.
289f0c3a346SStepan Dyatkovskiy
290f0c3a346SStepan Dyatkovskiy  * *Section*, just like a *GC*: *RHS* and *LHS* should be defined in the
291f0c3a346SStepan Dyatkovskiy    same section.
292f0c3a346SStepan Dyatkovskiy
293f0c3a346SStepan Dyatkovskiy  * *Variable arguments*. *LHS* and *RHS* should be both either with or
294f0c3a346SStepan Dyatkovskiy    without *var-args*.
295f0c3a346SStepan Dyatkovskiy
296f0c3a346SStepan Dyatkovskiy  * *Calling convention* should be the same.
297f0c3a346SStepan Dyatkovskiy
298f0c3a346SStepan Dyatkovskiy2. Function type. Checked by ``FunctionComparator::cmpType(Type*, Type*)``
299f0c3a346SStepan Dyatkovskiymethod. It checks return type and parameters type; the method itself will be
300f0c3a346SStepan Dyatkovskiydescribed later.
301f0c3a346SStepan Dyatkovskiy
302f0c3a346SStepan Dyatkovskiy3. Associate function formal parameters with each other. Then comparing function
303f0c3a346SStepan Dyatkovskiybodies, if we see the usage of *LHS*'s *i*-th argument in *LHS*'s body, then,
304f0c3a346SStepan Dyatkovskiywe want to see usage of *RHS*'s *i*-th argument at the same place in *RHS*'s
305f0c3a346SStepan Dyatkovskiybody, otherwise functions are different. On this stage we grant the preference
306f0c3a346SStepan Dyatkovskiyto those we met later in function body (value we met first would be *less*).
307f0c3a346SStepan DyatkovskiyThis is done by “``FunctionComparator::cmpValues(const Value*, const Value*)``”
308f0c3a346SStepan Dyatkovskiymethod (will be described a bit later).
309f0c3a346SStepan Dyatkovskiy
310f0c3a346SStepan Dyatkovskiy4. Function body comparison. As it written in method comments:
311f0c3a346SStepan Dyatkovskiy
312f0c3a346SStepan Dyatkovskiy“We do a CFG-ordered walk since the actual ordering of the blocks in the linked
313f0c3a346SStepan Dyatkovskiylist is immaterial. Our walk starts at the entry block for both functions, then
314f0c3a346SStepan Dyatkovskiytakes each block from each terminator in order. As an artifact, this also means
315f0c3a346SStepan Dyatkovskiythat unreachable blocks are ignored.”
316f0c3a346SStepan Dyatkovskiy
317f0c3a346SStepan DyatkovskiySo, using this walk we get BBs from *left* and *right* in the same order, and
318f0c3a346SStepan Dyatkovskiycompare them by “``FunctionComparator::compare(const BasicBlock*, const
319f0c3a346SStepan DyatkovskiyBasicBlock*)``” method.
320f0c3a346SStepan Dyatkovskiy
321f0c3a346SStepan DyatkovskiyWe also associate BBs with each other, like we did it with function formal
322f0c3a346SStepan Dyatkovskiyarguments (see ``cmpValues`` method below).
323f0c3a346SStepan Dyatkovskiy
324f0c3a346SStepan DyatkovskiyFunctionComparator::cmpType
325f0c3a346SStepan Dyatkovskiy---------------------------
3266373f5ddSAditya KumarConsider how type comparison works.
327f0c3a346SStepan Dyatkovskiy
328f0c3a346SStepan Dyatkovskiy1. Coerce pointer to integer. If left type is a pointer, try to coerce it to the
329f0c3a346SStepan Dyatkovskiyinteger type. It could be done if its address space is 0, or if address spaces
330f0c3a346SStepan Dyatkovskiyare ignored at all. Do the same thing for the right type.
331f0c3a346SStepan Dyatkovskiy
332f0c3a346SStepan Dyatkovskiy2. If left and right types are equal, return 0. Otherwise we need to give
333f0c3a346SStepan Dyatkovskiypreference to one of them. So proceed to the next step.
334f0c3a346SStepan Dyatkovskiy
335f0c3a346SStepan Dyatkovskiy3. If types are of different kind (different type IDs). Return result of type
3366373f5ddSAditya KumarIDs comparison, treating them as numbers (use ``cmpNumbers`` operation).
337f0c3a346SStepan Dyatkovskiy
338f0c3a346SStepan Dyatkovskiy4. If types are vectors or integers, return result of their pointers comparison,
339f0c3a346SStepan Dyatkovskiycomparing them as numbers.
340f0c3a346SStepan Dyatkovskiy
341f0c3a346SStepan Dyatkovskiy5. Check whether type ID belongs to the next group (call it equivalent-group):
342f0c3a346SStepan Dyatkovskiy
343f0c3a346SStepan Dyatkovskiy   * Void
344f0c3a346SStepan Dyatkovskiy
345f0c3a346SStepan Dyatkovskiy   * Float
346f0c3a346SStepan Dyatkovskiy
347f0c3a346SStepan Dyatkovskiy   * Double
348f0c3a346SStepan Dyatkovskiy
349f0c3a346SStepan Dyatkovskiy   * X86_FP80
350f0c3a346SStepan Dyatkovskiy
351f0c3a346SStepan Dyatkovskiy   * FP128
352f0c3a346SStepan Dyatkovskiy
353f0c3a346SStepan Dyatkovskiy   * PPC_FP128
354f0c3a346SStepan Dyatkovskiy
355f0c3a346SStepan Dyatkovskiy   * Label
356f0c3a346SStepan Dyatkovskiy
357f0c3a346SStepan Dyatkovskiy   * Metadata.
358f0c3a346SStepan Dyatkovskiy
359f0c3a346SStepan Dyatkovskiy   If ID belongs to group above, return 0. Since it's enough to see that
360f0c3a346SStepan Dyatkovskiy   types has the same ``TypeID``. No additional information is required.
361f0c3a346SStepan Dyatkovskiy
362f0c3a346SStepan Dyatkovskiy6. Left and right are pointers. Return result of address space comparison
363f0c3a346SStepan Dyatkovskiy(numbers comparison).
364f0c3a346SStepan Dyatkovskiy
365f0c3a346SStepan Dyatkovskiy7. Complex types (structures, arrays, etc.). Follow complex objects comparison
366f0c3a346SStepan Dyatkovskiytechnique (see the very first paragraph of this chapter). Both *left* and
367f0c3a346SStepan Dyatkovskiy*right* are to be expanded and their element types will be checked the same
368f0c3a346SStepan Dyatkovskiyway. If we get -1 or 1 on some stage, return it. Otherwise return 0.
369f0c3a346SStepan Dyatkovskiy
370f0c3a346SStepan Dyatkovskiy8. Steps 1-6 describe all the possible cases, if we passed steps 1-6 and didn't
3716373f5ddSAditya Kumarget any conclusions, then invoke ``llvm_unreachable``, since it's quite an
372f0c3a346SStepan Dyatkovskiyunexpectable case.
373f0c3a346SStepan Dyatkovskiy
374f0c3a346SStepan DyatkovskiycmpValues(const Value*, const Value*)
375f0c3a346SStepan Dyatkovskiy-------------------------------------
376f0c3a346SStepan DyatkovskiyMethod that compares local values.
377f0c3a346SStepan Dyatkovskiy
3786373f5ddSAditya KumarThis method gives us an answer to a very curious question: whether we could
3796373f5ddSAditya Kumartreat local values as equal, and which value is greater otherwise. It's
3806373f5ddSAditya Kumarbetter to start from example:
381f0c3a346SStepan Dyatkovskiy
3826373f5ddSAditya KumarConsider the situation when we're looking at the same place in left
3836373f5ddSAditya Kumarfunction "*FL*" and in right function "*FR*". Every part of *left* place is
3846373f5ddSAditya Kumarequal to the corresponding part of *right* place, and (!) both parts use
3856373f5ddSAditya Kumar*Value* instances, for example:
386f0c3a346SStepan Dyatkovskiy
387124f2593SRenato Golin.. code-block:: text
388f0c3a346SStepan Dyatkovskiy
389f0c3a346SStepan Dyatkovskiy   instr0 i32 %LV   ; left side, function FL
390f0c3a346SStepan Dyatkovskiy   instr0 i32 %RV   ; right side, function FR
391f0c3a346SStepan Dyatkovskiy
392f0c3a346SStepan DyatkovskiySo, now our conclusion depends on *Value* instances comparison.
393f0c3a346SStepan Dyatkovskiy
3946373f5ddSAditya KumarThe main purpose of this method is to determine relation between such values.
395f0c3a346SStepan Dyatkovskiy
3966373f5ddSAditya KumarWhat can we expect from equal functions? At the same place, in functions
3976373f5ddSAditya Kumar"*FL*" and "*FR*" we expect to see *equal* values, or values *defined* at
3986373f5ddSAditya Kumarthe same place in "*FL*" and "*FR*".
399f0c3a346SStepan Dyatkovskiy
4006373f5ddSAditya KumarConsider a small example here:
401f0c3a346SStepan Dyatkovskiy
402124f2593SRenato Golin.. code-block:: text
403f0c3a346SStepan Dyatkovskiy
404f0c3a346SStepan Dyatkovskiy  define void %f(i32 %pf0, i32 %pf1) {
405f0c3a346SStepan Dyatkovskiy    instr0 i32 %pf0 instr1 i32 %pf1 instr2 i32 123
406f0c3a346SStepan Dyatkovskiy  }
407f0c3a346SStepan Dyatkovskiy
408124f2593SRenato Golin.. code-block:: text
409f0c3a346SStepan Dyatkovskiy
410f0c3a346SStepan Dyatkovskiy  define void %g(i32 %pg0, i32 %pg1) {
411f0c3a346SStepan Dyatkovskiy    instr0 i32 %pg0 instr1 i32 %pg0 instr2 i32 123
412f0c3a346SStepan Dyatkovskiy  }
413f0c3a346SStepan Dyatkovskiy
4146373f5ddSAditya KumarIn this example, *pf0* is associated with *pg0*, *pf1* is associated with
4156373f5ddSAditya Kumar*pg1*, and we also declare that *pf0* < *pf1*, and thus *pg0* < *pf1*.
416f0c3a346SStepan Dyatkovskiy
417f0c3a346SStepan DyatkovskiyInstructions with opcode "*instr0*" would be *equal*, since their types and
418f0c3a346SStepan Dyatkovskiyopcodes are equal, and values are *associated*.
419f0c3a346SStepan Dyatkovskiy
4206373f5ddSAditya KumarInstructions with opcode "*instr1*" from *f* is *greater* than instructions
4216373f5ddSAditya Kumarwith opcode "*instr1*" from *g*; here we have equal types and opcodes, but
4226373f5ddSAditya Kumar"*pf1* is greater than "*pg0*".
423f0c3a346SStepan Dyatkovskiy
4246373f5ddSAditya KumarInstructions with opcode "*instr2*" are equal, because their opcodes and
425f0c3a346SStepan Dyatkovskiytypes are equal, and the same constant is used as a value.
426f0c3a346SStepan Dyatkovskiy
4276373f5ddSAditya KumarWhat we associate in cmpValues?
428f0c3a346SStepan Dyatkovskiy^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
429f0c3a346SStepan Dyatkovskiy* Function arguments. *i*-th argument from left function associated with
430f0c3a346SStepan Dyatkovskiy  *i*-th argument from right function.
431f0c3a346SStepan Dyatkovskiy* BasicBlock instances. In basic-block enumeration loop we associate *i*-th
432f0c3a346SStepan Dyatkovskiy  BasicBlock from the left function with *i*-th BasicBlock from the right
433f0c3a346SStepan Dyatkovskiy  function.
434f0c3a346SStepan Dyatkovskiy* Instructions.
435f0c3a346SStepan Dyatkovskiy* Instruction operands. Note, we can meet *Value* here we have never seen
436f0c3a346SStepan Dyatkovskiy  before. In this case it is not a function argument, nor *BasicBlock*, nor
4376373f5ddSAditya Kumar  *Instruction*. It is a global value. It is a constant, since it's the only
4386373f5ddSAditya Kumar  supposed global here. The method also compares: Constants that are of the
4396373f5ddSAditya Kumar  same type and if right constant can be losslessly bit-casted to the left
4406373f5ddSAditya Kumar  one, then we also compare them.
441f0c3a346SStepan Dyatkovskiy
442f0c3a346SStepan DyatkovskiyHow to implement cmpValues?
443f0c3a346SStepan Dyatkovskiy^^^^^^^^^^^^^^^^^^^^^^^^^^^
4446373f5ddSAditya Kumar*Association* is a case of equality for us. We just treat such values as equal,
4456373f5ddSAditya Kumarbut, in general, we need to implement antisymmetric relation. As mentioned
4466373f5ddSAditya Kumarabove, to understand what is *less*, we can use order in which we
4476373f5ddSAditya Kumarmeet values. If both values have the same order in a function (met at the same
4486373f5ddSAditya Kumartime), we then treat values as *associated*. Otherwise – it depends on who was
449f0c3a346SStepan Dyatkovskiyfirst.
450f0c3a346SStepan Dyatkovskiy
4516373f5ddSAditya KumarEvery time we run the top-level compare method, we initialize two identical
4526373f5ddSAditya Kumarmaps (one for the left side, another one for the right side):
453f0c3a346SStepan Dyatkovskiy
454f0c3a346SStepan Dyatkovskiy``map<Value, int> sn_mapL, sn_mapR;``
455f0c3a346SStepan Dyatkovskiy
456f0c3a346SStepan DyatkovskiyThe key of the map is the *Value* itself, the *value* – is its order (call it
457f0c3a346SStepan Dyatkovskiy*serial number*).
458f0c3a346SStepan Dyatkovskiy
459f0c3a346SStepan DyatkovskiyTo add value *V* we need to perform the next procedure:
460f0c3a346SStepan Dyatkovskiy
461f0c3a346SStepan Dyatkovskiy``sn_map.insert(std::make_pair(V, sn_map.size()));``
462f0c3a346SStepan Dyatkovskiy
4636373f5ddSAditya KumarFor the first *Value*, map will return *0*, for the second *Value* map will
4646373f5ddSAditya Kumarreturn *1*, and so on.
465f0c3a346SStepan Dyatkovskiy
4666373f5ddSAditya KumarWe can then check whether left and right values met at the same time with
4676373f5ddSAditya Kumara simple comparison:
468f0c3a346SStepan Dyatkovskiy
469f0c3a346SStepan Dyatkovskiy``cmpNumbers(sn_mapL[Left], sn_mapR[Right]);``
470f0c3a346SStepan Dyatkovskiy
471f0c3a346SStepan DyatkovskiyOf course, we can combine insertion and comparison:
472f0c3a346SStepan Dyatkovskiy
473f0c3a346SStepan Dyatkovskiy.. code-block:: c++
474f0c3a346SStepan Dyatkovskiy
475f0c3a346SStepan Dyatkovskiy  std::pair<iterator, bool>
476f0c3a346SStepan Dyatkovskiy    LeftRes = sn_mapL.insert(std::make_pair(Left, sn_mapL.size())), RightRes
477f0c3a346SStepan Dyatkovskiy    = sn_mapR.insert(std::make_pair(Right, sn_mapR.size()));
478f0c3a346SStepan Dyatkovskiy  return cmpNumbers(LeftRes.first->second, RightRes.first->second);
479f0c3a346SStepan Dyatkovskiy
480f0c3a346SStepan DyatkovskiyLet's look, how whole method could be implemented.
481f0c3a346SStepan Dyatkovskiy
4826373f5ddSAditya Kumar1. We have to start with the bad news. Consider function self and
483f0c3a346SStepan Dyatkovskiycross-referencing cases:
484f0c3a346SStepan Dyatkovskiy
485f0c3a346SStepan Dyatkovskiy.. code-block:: c++
486f0c3a346SStepan Dyatkovskiy
487f0c3a346SStepan Dyatkovskiy  // self-reference unsigned fact0(unsigned n) { return n > 1 ? n
488f0c3a346SStepan Dyatkovskiy  * fact0(n-1) : 1; } unsigned fact1(unsigned n) { return n > 1 ? n *
489f0c3a346SStepan Dyatkovskiy  fact1(n-1) : 1; }
490f0c3a346SStepan Dyatkovskiy
491f0c3a346SStepan Dyatkovskiy  // cross-reference unsigned ping(unsigned n) { return n!= 0 ? pong(n-1) : 0;
492f0c3a346SStepan Dyatkovskiy  } unsigned pong(unsigned n) { return n!= 0 ? ping(n-1) : 0; }
493f0c3a346SStepan Dyatkovskiy
494f0c3a346SStepan Dyatkovskiy..
495f0c3a346SStepan Dyatkovskiy
496f0c3a346SStepan Dyatkovskiy  This comparison has been implemented in initial *MergeFunctions* pass
497f0c3a346SStepan Dyatkovskiy  version. But, unfortunately, it is not transitive. And this is the only case
498f0c3a346SStepan Dyatkovskiy  we can't convert to less-equal-greater comparison. It is a seldom case, 4-5
4996373f5ddSAditya Kumar  functions of 10000 (checked in test-suite), and, we hope, the reader would
500f0c3a346SStepan Dyatkovskiy  forgive us for such a sacrifice in order to get the O(log(N)) pass time.
501f0c3a346SStepan Dyatkovskiy
502f0c3a346SStepan Dyatkovskiy2. If left/right *Value* is a constant, we have to compare them. Return 0 if it
503f0c3a346SStepan Dyatkovskiyis the same constant, or use ``cmpConstants`` method otherwise.
504f0c3a346SStepan Dyatkovskiy
505f0c3a346SStepan Dyatkovskiy3. If left/right is *InlineAsm* instance. Return result of *Value* pointers
506f0c3a346SStepan Dyatkovskiycomparison.
507f0c3a346SStepan Dyatkovskiy
508f0c3a346SStepan Dyatkovskiy4. Explicit association of *L* (left value) and *R*  (right value). We need to
509f0c3a346SStepan Dyatkovskiyfind out whether values met at the same time, and thus are *associated*. Or we
5106373f5ddSAditya Kumarneed to put the rule: when we treat *L* < *R*. Now it is easy: we just return
5116373f5ddSAditya Kumarthe result of numbers comparison:
512f0c3a346SStepan Dyatkovskiy
513f0c3a346SStepan Dyatkovskiy.. code-block:: c++
514f0c3a346SStepan Dyatkovskiy
515f0c3a346SStepan Dyatkovskiy   std::pair<iterator, bool>
516f0c3a346SStepan Dyatkovskiy     LeftRes = sn_mapL.insert(std::make_pair(Left, sn_mapL.size())),
517f0c3a346SStepan Dyatkovskiy     RightRes = sn_mapR.insert(std::make_pair(Right, sn_mapR.size()));
518f0c3a346SStepan Dyatkovskiy   if (LeftRes.first->second == RightRes.first->second) return 0;
519f0c3a346SStepan Dyatkovskiy   if (LeftRes.first->second < RightRes.first->second) return -1;
520f0c3a346SStepan Dyatkovskiy   return 1;
521f0c3a346SStepan Dyatkovskiy
5226373f5ddSAditya KumarNow when *cmpValues* returns 0, we can proceed the comparison procedure.
5236373f5ddSAditya KumarOtherwise, if we get (-1 or 1), we need to pass this result to the top level,
5246373f5ddSAditya Kumarand finish comparison procedure.
525f0c3a346SStepan Dyatkovskiy
526f0c3a346SStepan DyatkovskiycmpConstants
527f0c3a346SStepan Dyatkovskiy------------
528f0c3a346SStepan DyatkovskiyPerforms constants comparison as follows:
529f0c3a346SStepan Dyatkovskiy
5306373f5ddSAditya Kumar1. Compare constant types using ``cmpType`` method. If the result is -1 or 1,
5316373f5ddSAditya Kumargoto step 2, otherwise proceed to step 3.
532f0c3a346SStepan Dyatkovskiy
533f0c3a346SStepan Dyatkovskiy2. If types are different, we still can check whether constants could be
534f0c3a346SStepan Dyatkovskiylosslessly bitcasted to each other. The further explanation is modification of
535f0c3a346SStepan Dyatkovskiy``canLosslesslyBitCastTo`` method.
536f0c3a346SStepan Dyatkovskiy
537f0c3a346SStepan Dyatkovskiy   2.1 Check whether constants are of the first class types
538f0c3a346SStepan Dyatkovskiy   (``isFirstClassType`` check):
539f0c3a346SStepan Dyatkovskiy
540f0c3a346SStepan Dyatkovskiy   2.1.1. If both constants are *not* of the first class type: return result
541f0c3a346SStepan Dyatkovskiy   of ``cmpType``.
542f0c3a346SStepan Dyatkovskiy
543f0c3a346SStepan Dyatkovskiy   2.1.2. Otherwise, if left type is not of the first class, return -1. If
544f0c3a346SStepan Dyatkovskiy   right type is not of the first class, return 1.
545f0c3a346SStepan Dyatkovskiy
546f0c3a346SStepan Dyatkovskiy   2.1.3. If both types are of the first class type, proceed to the next step
547f0c3a346SStepan Dyatkovskiy   (2.1.3.1).
548f0c3a346SStepan Dyatkovskiy
549f0c3a346SStepan Dyatkovskiy   2.1.3.1. If types are vectors, compare their bitwidth using the
550f0c3a346SStepan Dyatkovskiy   *cmpNumbers*. If result is not 0, return it.
551f0c3a346SStepan Dyatkovskiy
552f0c3a346SStepan Dyatkovskiy   2.1.3.2. Different types, but not a vectors:
553f0c3a346SStepan Dyatkovskiy
554f0c3a346SStepan Dyatkovskiy   * if both of them are pointers, good for us, we can proceed to step 3.
555f0c3a346SStepan Dyatkovskiy   * if one of types is pointer, return result of *isPointer* flags
556f0c3a346SStepan Dyatkovskiy     comparison (*cmpFlags* operation).
557f0c3a346SStepan Dyatkovskiy   * otherwise we have no methods to prove bitcastability, and thus return
558f0c3a346SStepan Dyatkovskiy     result of types comparison (-1 or 1).
559f0c3a346SStepan Dyatkovskiy
560f0c3a346SStepan DyatkovskiySteps below are for the case when types are equal, or case when constants are
561f0c3a346SStepan Dyatkovskiybitcastable:
562f0c3a346SStepan Dyatkovskiy
563f0c3a346SStepan Dyatkovskiy3. One of constants is a "*null*" value. Return the result of
564f0c3a346SStepan Dyatkovskiy``cmpFlags(L->isNullValue, R->isNullValue)`` comparison.
565f0c3a346SStepan Dyatkovskiy
566f0c3a346SStepan Dyatkovskiy4. Compare value IDs, and return result if it is not 0:
567f0c3a346SStepan Dyatkovskiy
568f0c3a346SStepan Dyatkovskiy.. code-block:: c++
569f0c3a346SStepan Dyatkovskiy
570f0c3a346SStepan Dyatkovskiy  if (int Res = cmpNumbers(L->getValueID(), R->getValueID()))
571f0c3a346SStepan Dyatkovskiy    return Res;
572f0c3a346SStepan Dyatkovskiy
5736373f5ddSAditya Kumar5. Compare the contents of constants. The comparison depends on the kind of
574f0c3a346SStepan Dyatkovskiyconstants, but on this stage it is just a lexicographical comparison. Just see
575f0c3a346SStepan Dyatkovskiyhow it was described in the beginning of "*Functions comparison*" paragraph.
5766373f5ddSAditya KumarMathematically, it is equal to the next case: we encode left constant and right
577f0c3a346SStepan Dyatkovskiyconstant (with similar way *bitcode-writer* does). Then compare left code
578f0c3a346SStepan Dyatkovskiysequence and right code sequence.
579f0c3a346SStepan Dyatkovskiy
580f0c3a346SStepan Dyatkovskiycompare(const BasicBlock*, const BasicBlock*)
581f0c3a346SStepan Dyatkovskiy---------------------------------------------
582f0c3a346SStepan DyatkovskiyCompares two *BasicBlock* instances.
583f0c3a346SStepan Dyatkovskiy
584f0c3a346SStepan DyatkovskiyIt enumerates instructions from left *BB* and right *BB*.
585f0c3a346SStepan Dyatkovskiy
586f0c3a346SStepan Dyatkovskiy1. It assigns serial numbers to the left and right instructions, using
587f0c3a346SStepan Dyatkovskiy``cmpValues`` method.
588f0c3a346SStepan Dyatkovskiy
589f0c3a346SStepan Dyatkovskiy2. If one of left or right is *GEP* (``GetElementPtr``), then treat *GEP* as
5906373f5ddSAditya Kumargreater than other instructions. If both instructions are *GEPs* use ``cmpGEP``
591f0c3a346SStepan Dyatkovskiymethod for comparison. If result is -1 or 1, pass it to the top-level
592f0c3a346SStepan Dyatkovskiycomparison (return it).
593f0c3a346SStepan Dyatkovskiy
594f0c3a346SStepan Dyatkovskiy   3.1. Compare operations. Call ``cmpOperation`` method. If result is -1 or
595f0c3a346SStepan Dyatkovskiy   1, return it.
596f0c3a346SStepan Dyatkovskiy
597f0c3a346SStepan Dyatkovskiy   3.2. Compare number of operands, if result is -1 or 1, return it.
598f0c3a346SStepan Dyatkovskiy
599f0c3a346SStepan Dyatkovskiy   3.3. Compare operands themselves, use ``cmpValues`` method. Return result
600f0c3a346SStepan Dyatkovskiy   if it is -1 or 1.
601f0c3a346SStepan Dyatkovskiy
602f0c3a346SStepan Dyatkovskiy   3.4. Compare type of operands, using ``cmpType`` method. Return result if
603f0c3a346SStepan Dyatkovskiy   it is -1 or 1.
604f0c3a346SStepan Dyatkovskiy
605f0c3a346SStepan Dyatkovskiy   3.5. Proceed to the next instruction.
606f0c3a346SStepan Dyatkovskiy
607f0c3a346SStepan Dyatkovskiy4. We can finish instruction enumeration in 3 cases:
608f0c3a346SStepan Dyatkovskiy
609f0c3a346SStepan Dyatkovskiy   4.1. We reached the end of both left and right basic-blocks. We didn't
6106373f5ddSAditya Kumar   exit on steps 1-3, so contents are equal, return 0.
611f0c3a346SStepan Dyatkovskiy
612f0c3a346SStepan Dyatkovskiy   4.2. We have reached the end of the left basic-block. Return -1.
613f0c3a346SStepan Dyatkovskiy
6146373f5ddSAditya Kumar   4.3. Return 1 (we reached the end of the right basic block).
615f0c3a346SStepan Dyatkovskiy
616f0c3a346SStepan DyatkovskiycmpGEP
617f0c3a346SStepan Dyatkovskiy------
618f0c3a346SStepan DyatkovskiyCompares two GEPs (``getelementptr`` instructions).
619f0c3a346SStepan Dyatkovskiy
620f0c3a346SStepan DyatkovskiyIt differs from regular operations comparison with the only thing: possibility
621f0c3a346SStepan Dyatkovskiyto use ``accumulateConstantOffset`` method.
622f0c3a346SStepan Dyatkovskiy
623f0c3a346SStepan DyatkovskiySo, if we get constant offset for both left and right *GEPs*, then compare it as
624f0c3a346SStepan Dyatkovskiynumbers, and return comparison result.
625f0c3a346SStepan Dyatkovskiy
626f0c3a346SStepan DyatkovskiyOtherwise treat it like a regular operation (see previous paragraph).
627f0c3a346SStepan Dyatkovskiy
628f0c3a346SStepan DyatkovskiycmpOperation
629f0c3a346SStepan Dyatkovskiy------------
630f0c3a346SStepan DyatkovskiyCompares instruction opcodes and some important operation properties.
631f0c3a346SStepan Dyatkovskiy
632f0c3a346SStepan Dyatkovskiy1. Compare opcodes, if it differs return the result.
633f0c3a346SStepan Dyatkovskiy
634f0c3a346SStepan Dyatkovskiy2. Compare number of operands. If it differs – return the result.
635f0c3a346SStepan Dyatkovskiy
636f0c3a346SStepan Dyatkovskiy3. Compare operation types, use *cmpType*. All the same – if types are
637f0c3a346SStepan Dyatkovskiydifferent, return result.
638f0c3a346SStepan Dyatkovskiy
639f0c3a346SStepan Dyatkovskiy4. Compare *subclassOptionalData*, get it with ``getRawSubclassOptionalData``
640f0c3a346SStepan Dyatkovskiymethod, and compare it like a numbers.
641f0c3a346SStepan Dyatkovskiy
642f0c3a346SStepan Dyatkovskiy5. Compare operand types.
643f0c3a346SStepan Dyatkovskiy
6446373f5ddSAditya Kumar6. For some particular instructions, check equivalence (relation in our case) of
6456373f5ddSAditya Kumarsome significant attributes. For example, we have to compare alignment for
646f0c3a346SStepan Dyatkovskiy``load`` instructions.
647f0c3a346SStepan Dyatkovskiy
648f0c3a346SStepan DyatkovskiyO(log(N))
649f0c3a346SStepan Dyatkovskiy---------
650f0c3a346SStepan DyatkovskiyMethods described above implement order relationship. And latter, could be used
651f0c3a346SStepan Dyatkovskiyfor nodes comparison in a binary tree. So we can organize functions set into
652f0c3a346SStepan Dyatkovskiythe binary tree and reduce the cost of lookup procedure from
653f0c3a346SStepan DyatkovskiyO(N*N) to O(log(N)).
654f0c3a346SStepan Dyatkovskiy
655f0c3a346SStepan DyatkovskiyMerging process, mergeTwoFunctions
656f0c3a346SStepan Dyatkovskiy==================================
657f0c3a346SStepan DyatkovskiyOnce *MergeFunctions* detected that current function (*G*) is equal to one that
658f0c3a346SStepan Dyatkovskiywere analyzed before (function *F*) it calls ``mergeTwoFunctions(Function*,
659f0c3a346SStepan DyatkovskiyFunction*)``.
660f0c3a346SStepan Dyatkovskiy
661f0c3a346SStepan DyatkovskiyOperation affects ``FnTree`` contents with next way: *F* will stay in
662f0c3a346SStepan Dyatkovskiy``FnTree``. *G* being equal to *F* will not be added to ``FnTree``. Calls of
663f0c3a346SStepan Dyatkovskiy*G* would be replaced with something else. It changes bodies of callers. So,
664f0c3a346SStepan Dyatkovskiyfunctions that calls *G* would be put into ``Deferred`` set and removed from
665f0c3a346SStepan Dyatkovskiy``FnTree``, and analyzed again.
666f0c3a346SStepan Dyatkovskiy
667f0c3a346SStepan DyatkovskiyThe approach is next:
668f0c3a346SStepan Dyatkovskiy
669f0c3a346SStepan Dyatkovskiy1. Most wished case: when we can use alias and both of *F* and *G* are weak. We
670f0c3a346SStepan Dyatkovskiymake both of them with aliases to the third strong function *H*. Actually *H*
671f0c3a346SStepan Dyatkovskiyis *F*. See below how it's made (but it's better to look straight into the
672f0c3a346SStepan Dyatkovskiysource code). Well, this is a case when we can just replace *G* with *F*
673f0c3a346SStepan Dyatkovskiyeverywhere, we use ``replaceAllUsesWith`` operation here (*RAUW*).
674f0c3a346SStepan Dyatkovskiy
675f0c3a346SStepan Dyatkovskiy2. *F* could not be overridden, while *G* could. It would be good to do the
676f0c3a346SStepan Dyatkovskiynext: after merging the places where overridable function were used, still use
677f0c3a346SStepan Dyatkovskiyoverridable stub. So try to make *G* alias to *F*, or create overridable tail
678f0c3a346SStepan Dyatkovskiycall wrapper around *F* and replace *G* with that call.
679f0c3a346SStepan Dyatkovskiy
680f0c3a346SStepan Dyatkovskiy3. Neither *F* nor *G* could be overridden. We can't use *RAUW*. We can just
681f0c3a346SStepan Dyatkovskiychange the callers: call *F* instead of *G*.  That's what
682f0c3a346SStepan Dyatkovskiy``replaceDirectCallers`` does.
683f0c3a346SStepan Dyatkovskiy
6846373f5ddSAditya KumarBelow is a detailed body description.
685f0c3a346SStepan Dyatkovskiy
686f0c3a346SStepan DyatkovskiyIf “F” may be overridden
687f0c3a346SStepan Dyatkovskiy------------------------
688f0c3a346SStepan DyatkovskiyAs follows from ``mayBeOverridden`` comments: “whether the definition of this
68984666a19SSylvestre Ledruglobal may be replaced by something non-equivalent at link time”. If so, that's
690f0c3a346SStepan Dyatkovskiyok: we can use alias to *F* instead of *G* or change call instructions itself.
691f0c3a346SStepan Dyatkovskiy
692f0c3a346SStepan DyatkovskiyHasGlobalAliases, removeUsers
693f0c3a346SStepan Dyatkovskiy^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
694f0c3a346SStepan DyatkovskiyFirst consider the case when we have global aliases of one function name to
695f0c3a346SStepan Dyatkovskiyanother. Our purpose is  make both of them with aliases to the third strong
696f0c3a346SStepan Dyatkovskiyfunction. Though if we keep *F* alive and without major changes we can leave it
697f0c3a346SStepan Dyatkovskiyin ``FnTree``. Try to combine these two goals.
698f0c3a346SStepan Dyatkovskiy
699f0c3a346SStepan DyatkovskiyDo stub replacement of *F* itself with an alias to *F*.
700f0c3a346SStepan Dyatkovskiy
701f0c3a346SStepan Dyatkovskiy1. Create stub function *H*, with the same name and attributes like function
702f0c3a346SStepan Dyatkovskiy*F*. It takes maximum alignment of *F* and *G*.
703f0c3a346SStepan Dyatkovskiy
704f0c3a346SStepan Dyatkovskiy2. Replace all uses of function *F* with uses of function *H*. It is the two
705f0c3a346SStepan Dyatkovskiysteps procedure instead. First of all, we must take into account, all functions
706f0c3a346SStepan Dyatkovskiyfrom whom *F* is called would be changed: since we change the call argument
707f0c3a346SStepan Dyatkovskiy(from *F* to *H*). If so we must to review these caller functions again after
708f0c3a346SStepan Dyatkovskiythis procedure. We remove callers from ``FnTree``, method with name
709f0c3a346SStepan Dyatkovskiy``removeUsers(F)`` does that (don't confuse with ``replaceAllUsesWith``):
710f0c3a346SStepan Dyatkovskiy
711f0c3a346SStepan Dyatkovskiy   2.1. ``Inside removeUsers(Value*
712f0c3a346SStepan Dyatkovskiy   V)`` we go through the all values that use value *V* (or *F* in our context).
713f0c3a346SStepan Dyatkovskiy   If value is instruction, we go to function that holds this instruction and
714f0c3a346SStepan Dyatkovskiy   mark it as to-be-analyzed-again (put to ``Deferred`` set), we also remove
715f0c3a346SStepan Dyatkovskiy   caller from ``FnTree``.
716f0c3a346SStepan Dyatkovskiy
717f0c3a346SStepan Dyatkovskiy   2.2. Now we can do the replacement: call ``F->replaceAllUsesWith(H)``.
718f0c3a346SStepan Dyatkovskiy
719f0c3a346SStepan Dyatkovskiy3. *H* (that now "officially" plays *F*'s role) is replaced with alias to *F*.
720f0c3a346SStepan DyatkovskiyDo the same with *G*: replace it with alias to *F*. So finally everywhere *F*
721f0c3a346SStepan Dyatkovskiywas used, we use *H* and it is alias to *F*, and everywhere *G* was used we
722f0c3a346SStepan Dyatkovskiyalso have alias to *F*.
723f0c3a346SStepan Dyatkovskiy
724f0c3a346SStepan Dyatkovskiy4. Set *F* linkage to private. Make it strong :-)
725f0c3a346SStepan Dyatkovskiy
726f0c3a346SStepan DyatkovskiyNo global aliases, replaceDirectCallers
727f0c3a346SStepan Dyatkovskiy^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7286373f5ddSAditya KumarIf global aliases are not supported. We call ``replaceDirectCallers``. Just
729f0c3a346SStepan Dyatkovskiygo through all calls of *G* and replace it with calls of *F*. If you look into
7306373f5ddSAditya Kumarthe method you will see that it scans all uses of *G* too, and if use is callee
7316373f5ddSAditya Kumar(if user is call instruction and *G* is used as what to be called), we replace
7326373f5ddSAditya Kumarit with use of *F*.
733f0c3a346SStepan Dyatkovskiy
734f0c3a346SStepan DyatkovskiyIf “F” could not be overridden, fix it!
735f0c3a346SStepan Dyatkovskiy"""""""""""""""""""""""""""""""""""""""
736f0c3a346SStepan Dyatkovskiy
737f0c3a346SStepan DyatkovskiyWe call ``writeThunkOrAlias(Function *F, Function *G)``. Here we try to replace
7386373f5ddSAditya Kumar*G* with alias to *F* first. The next conditions are essential:
739f0c3a346SStepan Dyatkovskiy
740f0c3a346SStepan Dyatkovskiy* target should support global aliases,
741f0c3a346SStepan Dyatkovskiy* the address itself of  *G* should be not significant, not named and not
742f0c3a346SStepan Dyatkovskiy  referenced anywhere,
743f0c3a346SStepan Dyatkovskiy* function should come with external, local or weak linkage.
744f0c3a346SStepan Dyatkovskiy
745f0c3a346SStepan DyatkovskiyOtherwise we write thunk: some wrapper that has *G's* interface and calls *F*,
746f0c3a346SStepan Dyatkovskiyso *G* could be replaced with this wrapper.
747f0c3a346SStepan Dyatkovskiy
748f0c3a346SStepan Dyatkovskiy*writeAlias*
749f0c3a346SStepan Dyatkovskiy
750f0c3a346SStepan DyatkovskiyAs follows from *llvm* reference:
751f0c3a346SStepan Dyatkovskiy
752f0c3a346SStepan Dyatkovskiy“Aliases act as *second name* for the aliasee value”. So we just want to create
7536373f5ddSAditya Kumara second name for *F* and use it instead of *G*:
754f0c3a346SStepan Dyatkovskiy
755f0c3a346SStepan Dyatkovskiy1. create global alias itself (*GA*),
756f0c3a346SStepan Dyatkovskiy
757f0c3a346SStepan Dyatkovskiy2. adjust alignment of *F* so it must be maximum of current and *G's* alignment;
758f0c3a346SStepan Dyatkovskiy
759f0c3a346SStepan Dyatkovskiy3. replace uses of *G*:
760f0c3a346SStepan Dyatkovskiy
761f0c3a346SStepan Dyatkovskiy   3.1. first mark all callers of *G* as to-be-analyzed-again, using
762f0c3a346SStepan Dyatkovskiy   ``removeUsers`` method (see chapter above),
763f0c3a346SStepan Dyatkovskiy
764f0c3a346SStepan Dyatkovskiy   3.2. call ``G->replaceAllUsesWith(GA)``.
765f0c3a346SStepan Dyatkovskiy
766f0c3a346SStepan Dyatkovskiy4. Get rid of *G*.
767f0c3a346SStepan Dyatkovskiy
768f0c3a346SStepan Dyatkovskiy*writeThunk*
769f0c3a346SStepan Dyatkovskiy
770f0c3a346SStepan DyatkovskiyAs it written in method comments:
771f0c3a346SStepan Dyatkovskiy
772f0c3a346SStepan Dyatkovskiy“Replace G with a simple tail call to bitcast(F). Also replace direct uses of G
773f0c3a346SStepan Dyatkovskiywith bitcast(F). Deletes G.”
774f0c3a346SStepan Dyatkovskiy
775f0c3a346SStepan DyatkovskiyIn general it does the same as usual when we want to replace callee, except the
776f0c3a346SStepan Dyatkovskiyfirst point:
777f0c3a346SStepan Dyatkovskiy
778f0c3a346SStepan Dyatkovskiy1. We generate tail call wrapper around *F*, but with interface that allows use
779f0c3a346SStepan Dyatkovskiyit instead of *G*.
780f0c3a346SStepan Dyatkovskiy
781f0c3a346SStepan Dyatkovskiy2. “As-usual”: ``removeUsers`` and ``replaceAllUsesWith`` then.
782f0c3a346SStepan Dyatkovskiy
783f0c3a346SStepan Dyatkovskiy3. Get rid of *G*.
784f0c3a346SStepan Dyatkovskiy
785f0c3a346SStepan Dyatkovskiy
786