; This test verifies that we can outline a singleton instance (i.e., an instance that does not repeat)
; by running two codegen rounds.
; It also verifies that the caches for the two codegen rounds work correctly.

; REQUIRES: asserts
; RUN: rm -rf %t
; RUN: split-file %s %t

; 0. Base case without a cache.
; Verify that each outlined instance is a singleton under global outlining for ThinLTO.
; The outlined functions will be identical, so the linker can fold them with ICF.
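; As a rough sketch of the expected shape (assumed for illustration; the exact
; registers, immediate formatting, and outlined symbol names are not checked by
; this test), each caller keeps only its distinct first argument and branches
; to an outlined tail like the following, which is common to @f1, @f2, and @f3:
;   mov w1, #1
;   mov w2, #2
;   b   _g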
; RUN: opt -module-hash -module-summary %t/thin-one.ll -o %t/thin-one.bc
; RUN: opt -module-hash -module-summary %t/thin-two.ll -o %t/thin-two.bc
; RUN: llvm-lto2 run %t/thin-one.bc %t/thin-two.bc -o %t/thinlto \
; RUN:  -r %t/thin-one.bc,_f3,px -r %t/thin-one.bc,_g,x \
; RUN:  -r %t/thin-two.bc,_f1,px -r %t/thin-two.bc,_f2,px -r %t/thin-two.bc,_g,x \
; RUN:  -codegen-data-thinlto-two-rounds

; thin-one.ll will have one outlining instance (matched in the global outlined hash tree)
; RUN: llvm-objdump -d %t/thinlto.1 | FileCheck %s --check-prefix=THINLTO-1
; THINLTO-1: _OUTLINED_FUNCTION{{.*}}>:
; THINLTO-1-NEXT:  mov
; THINLTO-1-NEXT:  mov
; THINLTO-1-NEXT:  b

; thin-two.ll will have two outlining instances (matched in the global outlined hash tree)
; RUN: llvm-objdump -d %t/thinlto.2 | FileCheck %s --check-prefix=THINLTO-2
; THINLTO-2: _OUTLINED_FUNCTION{{.*}}>:
; THINLTO-2-NEXT:  mov
; THINLTO-2-NEXT:  mov
; THINLTO-2-NEXT:  b
; THINLTO-2: _OUTLINED_FUNCTION{{.*}}>:
; THINLTO-2-NEXT:  mov
; THINLTO-2-NEXT:  mov
; THINLTO-2-NEXT:  b

; 1. Run this with a cache for the first time.
; RUN: rm -rf %t.cache
; RUN: llvm-lto2 run %t/thin-one.bc %t/thin-two.bc -o %t/thinlto-cold \
; RUN:  -r %t/thin-one.bc,_f3,px -r %t/thin-one.bc,_g,x \
; RUN:  -r %t/thin-two.bc,_f1,px -r %t/thin-two.bc,_f2,px -r %t/thin-two.bc,_g,x \
; RUN:  -codegen-data-thinlto-two-rounds -cache-dir %t.cache -debug-only=lto -thinlto-threads 1 > %t.log-cold.txt 2>&1
; RUN: cat %t.log-cold.txt | FileCheck %s --check-prefix=COLD
; RUN: diff %t/thinlto.1 %t/thinlto-cold.1
; RUN: diff %t/thinlto.2 %t/thinlto-cold.2

; COLD: [FirstRound] Cache Miss for {{.*}}thin-one.bc
; COLD: [FirstRound] Cache Miss for {{.*}}thin-two.bc
; COLD: [SecondRound] Cache Miss for {{.*}}thin-one.bc
; COLD: [SecondRound] Cache Miss for {{.*}}thin-two.bc

; There are two input bitcode files, and each one uses three cache entries:
; the CG and IR caches for the first round, and the CG cache for the second round.
; So the total number of cache files is 2 * 3 = 6.
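; A sketch of the expected cache layout after this cold run (entry roles only;
; the actual file names are hash-based and are not assumed here):
;   thin-one.bc -> first-round CG, first-round IR, second-round CG
;   thin-two.bc -> first-round CG, first-round IR, second-round CG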
; RUN: ls %t.cache | count 6

; 2. Without any changes, simply re-running it will hit the cache.
; RUN: llvm-lto2 run %t/thin-one.bc %t/thin-two.bc -o %t/thinlto-warm \
; RUN:  -r %t/thin-one.bc,_f3,px -r %t/thin-one.bc,_g,x \
; RUN:  -r %t/thin-two.bc,_f1,px -r %t/thin-two.bc,_f2,px -r %t/thin-two.bc,_g,x \
; RUN:  -codegen-data-thinlto-two-rounds -cache-dir %t.cache -debug-only=lto -thinlto-threads 1 > %t.log-warm.txt 2>&1
; RUN: cat %t.log-warm.txt | FileCheck %s --check-prefix=WARM
; RUN: diff %t/thinlto.1 %t/thinlto-warm.1
; RUN: diff %t/thinlto.2 %t/thinlto-warm.2

; WARM-NOT: Cache Miss

; 3. Assume thin-one.ll has been modified to thin-one-modified.ll.
; The merged CG data remains unchanged as this modification does not affect the hash tree built from thin-two.bc.
; Therefore, both the first-round and second-round runs update only thin-one.bc.
; RUN: opt -module-hash -module-summary %t/thin-one-modified.ll -o %t/thin-one.bc
; RUN: llvm-lto2 run %t/thin-one.bc %t/thin-two.bc -o %t/thinlto-warm-modified \
; RUN:  -r %t/thin-one.bc,_f3,px -r %t/thin-one.bc,_g,x \
; RUN:  -r %t/thin-two.bc,_f1,px -r %t/thin-two.bc,_f2,px -r %t/thin-two.bc,_g,x \
; RUN:  -codegen-data-thinlto-two-rounds -cache-dir %t.cache -debug-only=lto -thinlto-threads 1 > %t.log-warm-modified.txt 2>&1
; RUN: cat %t.log-warm-modified.txt | FileCheck %s --check-prefix=WARM-MODIFIED
; RUN: not diff %t/thinlto.1 %t/thinlto-warm-modified.1
; RUN: diff %t/thinlto.2 %t/thinlto-warm-modified.2

; WARM-MODIFIED: [FirstRound] Cache Miss for {{.*}}thin-one.bc
; WARM-MODIFIED-NOT: [FirstRound] Cache Miss for {{.*}}thin-two.bc
; WARM-MODIFIED: [SecondRound] Cache Miss for {{.*}}thin-one.bc
; WARM-MODIFIED-NOT: [SecondRound] Cache Miss for {{.*}}thin-two.bc

; 4. Additionally, thin-two.ll has been modified to thin-two-modified.ll.
; In this case, the merged CG data, which is global, is updated.
; The first-round run updates only the thin-two.bc module,
; because the module thin-one.bc remains the same as in step 3 above.
; The second-round run, however, updates all modules, resulting in different binaries.
; RUN: opt -module-hash -module-summary %t/thin-one-modified.ll -o %t/thin-one.bc
; RUN: opt -module-hash -module-summary %t/thin-two-modified.ll -o %t/thin-two.bc
; RUN: llvm-lto2 run %t/thin-one.bc %t/thin-two.bc -o %t/thinlto-warm-modified-all \
; RUN:  -r %t/thin-one.bc,_f3,px -r %t/thin-one.bc,_g,x \
; RUN:  -r %t/thin-two.bc,_f1,px -r %t/thin-two.bc,_f2,px -r %t/thin-two.bc,_g,x \
; RUN:  -codegen-data-thinlto-two-rounds -cache-dir %t.cache -debug-only=lto -thinlto-threads 1 > %t.log-warm-modified-all.txt 2>&1
; RUN: cat %t.log-warm-modified-all.txt | FileCheck %s --check-prefix=WARM-MODIFIED-ALL
; RUN: not diff %t/thinlto.1 %t/thinlto-warm-modified-all.1
; RUN: not diff %t/thinlto.2 %t/thinlto-warm-modified-all.2

; WARM-MODIFIED-ALL-NOT: [FirstRound] Cache Miss for {{.*}}thin-one.bc
; WARM-MODIFIED-ALL: [FirstRound] Cache Miss for {{.*}}thin-two.bc
; WARM-MODIFIED-ALL: [SecondRound] Cache Miss for {{.*}}thin-one.bc
; WARM-MODIFIED-ALL: [SecondRound] Cache Miss for {{.*}}thin-two.bc

; thin-one-modified.ll won't be outlined.
; RUN: llvm-objdump -d %t/thinlto-warm-modified-all.1 | FileCheck %s --check-prefix=THINLTO-1-MODIFIED-ALL
; THINLTO-1-MODIFIED-ALL-NOT: _OUTLINED_FUNCTION{{.*}}>:

; thin-two-modified.ll will have two (longer) outlining instances (matched in the global outlined hash tree)
; RUN: llvm-objdump -d %t/thinlto-warm-modified-all.2 | FileCheck %s --check-prefix=THINLTO-2-MODIFIED-ALL
; THINLTO-2-MODIFIED-ALL: _OUTLINED_FUNCTION{{.*}}>:
; THINLTO-2-MODIFIED-ALL:  mov
; THINLTO-2-MODIFIED-ALL:  mov
; THINLTO-2-MODIFIED-ALL:  mov
; THINLTO-2-MODIFIED-ALL:  b
; THINLTO-2-MODIFIED-ALL: _OUTLINED_FUNCTION{{.*}}>:
; THINLTO-2-MODIFIED-ALL:  mov
; THINLTO-2-MODIFIED-ALL:  mov
; THINLTO-2-MODIFIED-ALL:  mov
; THINLTO-2-MODIFIED-ALL:  b

; 5. Re-running it will hit the cache.
; RUN: llvm-lto2 run %t/thin-one.bc %t/thin-two.bc -o %t/thinlto-warm-again \
; RUN:  -r %t/thin-one.bc,_f3,px -r %t/thin-one.bc,_g,x \
; RUN:  -r %t/thin-two.bc,_f1,px -r %t/thin-two.bc,_f2,px -r %t/thin-two.bc,_g,x \
; RUN:  -codegen-data-thinlto-two-rounds -cache-dir %t.cache -debug-only=lto -thinlto-threads 1 > %t.log-warm-again.txt 2>&1
; RUN: cat %t.log-warm-again.txt | FileCheck %s --check-prefix=WARM-AGAIN
; RUN: diff %t/thinlto-warm-modified-all.1 %t/thinlto-warm-again.1
; RUN: diff %t/thinlto-warm-modified-all.2 %t/thinlto-warm-again.2

; WARM-AGAIN-NOT: Cache Miss
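
; Summary of the cache behavior that the steps above expect, read off the
; checks (miss = a miss is logged for the module in that round, hit = no miss
; is logged for it):
;   step  input change                    first round (one / two)  second round (one / two)
;   1     none, empty cache               miss / miss              miss / miss
;   2     none                            hit  / hit               hit  / hit
;   3     thin-one.ll modified            miss / hit               miss / hit
;   4     thin-two.ll modified as well    hit  / miss              miss / miss
;   5     none                            hit  / hit               hit  / hit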

;--- thin-one.ll
target datalayout = "e-m:o-i64:64-i128:128-n32:64-S128"
target triple = "arm64-apple-darwin"

declare i32 @g(i32, i32, i32)
define i32 @f3() minsize {
  %1 = call i32 @g(i32 30, i32 1, i32 2)
  ret i32 %1
}

;--- thin-one-modified.ll
target datalayout = "e-m:o-i64:64-i128:128-n32:64-S128"
target triple = "arm64-apple-darwin"

declare i32 @g(i32, i32, i32)
define i32 @f3() minsize {
  %1 = call i32 @g(i32 31, i32 1, i32 2)
  ret i32 %1
}

;--- thin-two.ll
target datalayout = "e-m:o-i64:64-i128:128-n32:64-S128"
target triple = "arm64-apple-darwin"

declare i32 @g(i32, i32, i32)
define i32 @f1() minsize {
  %1 = call i32 @g(i32 10, i32 1, i32 2)
  ret i32 %1
}
define i32 @f2() minsize {
  %1 = call i32 @g(i32 20, i32 1, i32 2)
  ret i32 %1
}

;--- thin-two-modified.ll
target datalayout = "e-m:o-i64:64-i128:128-n32:64-S128"
target triple = "arm64-apple-darwin"

declare i32 @g(i32, i32, i32)
define i32 @f1() minsize {
  %1 = call i32 @g(i32 10, i32 1, i32 2)
  ret i32 %1
}
define i32 @f2() minsize {
  %1 = call i32 @g(i32 10, i32 1, i32 2)
  ret i32 %1
}