#
e45de8a5 |
| 26-Sep-2016 |
Evandro Menezes <e.menezes@samsung.com> |
Add support to optionally limit the size of jump tables.
Many high-performance processors have a dedicated branch predictor for indirect branches, commonly used with jump tables. As sophisticated as such branch predictors are, they tend to have well-defined limits beyond which their effectiveness is hampered or even nullified. One such limit is the number of possible destinations for a given indirect branch that such branch predictors can handle.
This patch introduces a limit, which a target may set, on the number of destination addresses in a jump table.
Patch by: Evandro Menezes <e.menezes@samsung.com>, Aditya Kumar <aditya.k7@samsung.com>, Sebastian Pop <s.pop@samsung.com>.
Differential revision: https://reviews.llvm.org/D21940
llvm-svn: 282412
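As a rough illustration of how such a cap might be consulted when deciding whether to emit a jump table, here is a minimal sketch; the hook name getMaximumJumpTableSize() and the convention that 0 means "no limit" are assumptions based on this description, not code quoted from the patch:
```
// Hedged sketch: reject a jump table whose fan-out exceeds the target's cap.
static bool fitsJumpTableLimit(unsigned NumDestinations,
                               const llvm::TargetLoweringBase &TLI) {
  unsigned MaxSize = TLI.getMaximumJumpTableSize(); // assumed: 0 == unlimited
  return MaxSize == 0 || NumDestinations <= MaxSize;
}
```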
|
#
1ed771f5 |
| 14-Sep-2016 |
Sanjay Patel <spatel@rotateright.com> |
getVectorElementType().getSizeInBits() -> getScalarSizeInBits() ; NFCI
llvm-svn: 281495
|
#
92e33a3e |
| 09-Sep-2016 |
Saleem Abdulrasool <compnerd@compnerd.org> |
ARM: move the builtins libcall CC setup
Move the target-specific setup into the target-specific lowering setup. As pointed out by Anton, the initial change moved this too high up the stack, resulting in a layering violation (the target-generic code path set up target-specific bits). Sink this into the ARM-specific setup. NFC.
llvm-svn: 281088
|
#
02d9851c |
| 07-Sep-2016 |
Saleem Abdulrasool <compnerd@compnerd.org> |
CodeGen: ensure that libcalls are always AAPCS CC
The original commit was too aggressive about marking LibCalls as AAPCS. The libcalls contain libc/libm/libunwind calls which are not AAPCS, but C.
llvm-svn: 280833
|
#
a7ade33d |
| 07-Sep-2016 |
Saleem Abdulrasool <compnerd@compnerd.org> |
Revert "CodeGen: ensure that libcalls are always AAPCS CC"
This reverts SVN r280683. Revert until I figure out why this is breaking lli tests.
llvm-svn: 280778
|
#
a6519b1d |
| 06-Sep-2016 |
Saleem Abdulrasool <compnerd@compnerd.org> |
CodeGen: ensure that libcalls are always AAPCS CC
All of the builtins are designed to be invoked with ARM AAPCS CC even on ARM AAPCS VFP CC hosts. Tweak the default initialisation to ARM AAPCS CC rather than C CC for ARM/thumb targets.
The changes to the tests are necessary to ensure that the calling convention for the lowered library calls is honoured. Furthermore, these adjustments cause certain branch invocations to change to branch-and-link since the returned value needs to be moved across registers (d0 -> r0, r1).
llvm-svn: 280683
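A hedged sketch of what such a default looks like inside a target's lowering constructor (the loop shape is illustrative rather than quoted from the patch; setLibcallCallingConv, RTLIB::UNKNOWN_LIBCALL, and CallingConv::ARM_AAPCS are existing LLVM names):
```
// Sketch: default every runtime-library call to ARM AAPCS on ARM/Thumb,
// independent of the C calling convention used by surrounding code.
for (unsigned LCID = 0; LCID != RTLIB::UNKNOWN_LIBCALL; ++LCID)
  setLibcallCallingConv(static_cast<RTLIB::Libcall>(LCID),
                        CallingConv::ARM_AAPCS);
```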
|
#
b57d0a2f |
| 29-Aug-2016 |
Sanjay Patel <spatel@rotateright.com> |
[TargetLowering] remove fdiv and frem from canOpTrap() (PR29114)
Assuming the default FP env, we should not treat fdiv and frem any differently in terms of trapping behavior than any other FP op. Ie, FP ops do not trap with the default FP env.
This matches how we treat these ops in IR with isSafeToSpeculativelyExecute(). There's a similar bug in Constant::canTrap().
This bug manifests in PR29114: https://llvm.org/bugs/show_bug.cgi?id=29114 ...as a sequence of scalar divisions instead of a vector division on x86 for a <3 x float> type.
Differential Revision: https://reviews.llvm.org/D23974
llvm-svn: 279970
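A simplified sketch of the resulting behavior (an approximation of the hook, not the committed code; the real function also asserts that the type is legal):
```
// Sketch: with the default FP environment, FP ops (including fdiv/frem)
// never report trapping behavior; only integer div/rem do.
bool TargetLoweringBase::canOpTrap(unsigned Op, EVT VT) const {
  switch (Op) {
  case ISD::SDIV:
  case ISD::UDIV:
  case ISD::SREM:
  case ISD::UREM:
    return true;
  default:
    return false;
  }
}
```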
|
#
a5cc25e5 |
| 22-Aug-2016 |
Tim Shen <timshen91@gmail.com> |
[SSP] Do not set __guard_local to hidden for OpenBSD SSP
__guard_local is defined as long on OpenBSD. If the source file contains a definition of __guard_local, it mismatches with the int8 pointer type used in LLVM. In that case, Module::getOrInsertGlobal() returns a cast operation instead of a GlobalVariable. Trying to set the visibility on the cast operation leads to random segfaults (seen when compiling the OpenBSD kernel, which also runs with stack protection).
In the kernel, the hidden attribute does not matter. For userspace code, __guard_local is defined as hidden in the startup code. If a program re-defines __guard_local, the definition from the startup code will either win or the linker complains about multiple definitions (depending on whether the re-defined __guard_local is placed in the common segment or not).
It also matches what gcc on OpenBSD does.
Thanks Stefan Kempf <sisnkemp@gmail.com> for the patch!
Differential Revision: http://reviews.llvm.org/D23674
llvm-svn: 279449
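A hedged sketch of the OpenBSD guard lookup without the hidden-visibility write (the body approximates the in-tree helper of that era; treat it as illustrative rather than the exact committed code):
```
// Sketch: fetch __guard_local on OpenBSD. getOrInsertGlobal() may hand back
// a bitcast instead of a GlobalVariable when the source file declares
// __guard_local with a different type, so no visibility is forced here.
Value *TargetLoweringBase::getIRStackGuard(IRBuilder<> &IRB) const {
  if (getTargetMachine().getTargetTriple().isOSOpenBSD()) {
    Module &M = *IRB.GetInsertBlock()->getParent()->getParent();
    PointerType *PtrTy = Type::getInt8PtrTy(M.getContext());
    return M.getOrInsertGlobal("__guard_local", PtrTy);
  }
  return nullptr;
}
```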
|
#
f679530b |
| 04-Aug-2016 |
Nikolai Bozhenov <nikolai.bozhenov@intel.com> |
[X86] Heuristic to selectively build Newton-Raphson SQRT estimation
On modern Intel processors hardware SQRT in many cases is faster than RSQRT followed by Newton-Raphson refinement. The patch introduces a simple heuristic to choose between hardware SQRT instruction and Newton-Raphson software estimation.
The patch treats scalars and vectors differently. The heuristic is that for scalars the compiler should optimize for latency while for vectors it should optimize for throughput. It is based on the assumption that throughput bound code is likely to be vectorized.
Basically, the patch disables scalar NR for big cores and disables NR completely for Skylake. Firstly, scalar SQRT has shorter latency than NR code in big cores. Secondly, vector SQRT has been greatly improved in Skylake and has better throughput compared to NR.
Differential Revision: https://reviews.llvm.org/D21379
llvm-svn: 277725
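A hedged sketch of the selection criterion; the subtarget predicates hasFastScalarFSQRT()/hasFastVectorFSQRT() are assumptions modeled on the description above, not necessarily the exact feature names:
```
// Sketch: scalars are latency-bound, vectors throughput-bound, so NR is only
// worthwhile where the hardware SQRT is not already fast for that shape.
static bool shouldUseNewtonRaphsonSqrt(EVT VT, const X86Subtarget &ST) {
  if (VT.isVector())
    return !ST.hasFastVectorFSQRT(); // e.g. off on Skylake per the message
  return !ST.hasFastScalarFSQRT();   // off on big cores per the message
}
```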
|
#
941a705b |
| 28-Jul-2016 |
Matthias Braun <matze@braunis.de> |
MachineFunction: Return reference for getFrameInfo(); NFC
getFrameInfo() never returns nullptr so we should use a reference instead of a pointer.
llvm-svn: 277017
|
#
0af80cd6 |
| 15-Jul-2016 |
Justin Lebar <jlebar@google.com> |
[CodeGen] Take a MachineMemOperand::Flags in MachineFunction::getMachineMemOperand.
Summary: Previously we took an unsigned.
Hooray for type-safety.
Reviewers: chandlerc
Subscribers: dsanders, llvm-commits
Differential Revision: http://reviews.llvm.org/D22282
llvm-svn: 275591
|
#
e4f5e4f4 |
| 30-Jun-2016 |
Duncan P. N. Exon Smith <dexonsmith@apple.com> |
CodeGen: Use MachineInstr& in TargetLowering, NFC
This is a mechanical change to make TargetLowering API take MachineInstr& (instead of MachineInstr*), since the argument is expected to be a valid MachineInstr. In one case, changed a parameter from MachineInstr* to MachineBasicBlock::iterator, since it was used as an insertion point.
As a side effect, this removes a bunch of MachineInstr* to MachineBasicBlock::iterator implicit conversions, a necessary step toward fixing PR26753.
llvm-svn: 274287
|
#
bf2c03ee |
| 21-Jun-2016 |
Daniel Sanders <daniel.sanders@imgtec.com> |
[arm+x86] Make GNU variants behave like GNU w.r.t combining sin+cos into sincos.
Summary: canCombineSinCosLibcall() would previously combine sin+cos into sincos for GNUX32/GNUEABI/GNUEABIHF regardless of whether UnsafeFPMath was set. However, GNU would only combine them under UnsafeFPMath, because sincos does not set errno the way sin and cos do. It seems likely that this was an oversight.
Reviewers: t.p.northover
Subscribers: t.p.northover, aemerson, llvm-commits, rengolin
Differential Revision: http://reviews.llvm.org/D21431
llvm-svn: 273259
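A hedged sketch of the corrected predicate, reduced to the GNU case only (the helper name comes from the message; the body is illustrative, and the triple predicate is an assumption):
```
// Sketch: sincos does not set errno the way sin and cos do, so GNU
// environments only get the fused call under unsafe FP math.
static bool canCombineSinCosLibcall(const TargetMachine &TM) {
  if (TM.getTargetTriple().isGNUEnvironment())
    return TM.Options.UnsafeFPMath;
  return true; // other environments keep their own rules (elided here)
}
```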
|
#
148a6469 |
| 17-Jun-2016 |
James Y Knight <jyknight@google.com> |
Support expanding partial-word cmpxchg to full-word cmpxchg in AtomicExpandPass.
Many CPUs only have the ability to do a 4-byte cmpxchg (or ll/sc), not 1 or 2-byte. For those, you need to mask and shift the 1 or 2 byte values appropriately to use the 4-byte instruction.
This change adds support for cmpxchg-based instruction sets (only SPARC, in LLVM). The support can be extended for LL/SC-based PPC and MIPS in the future, supplanting the ISel expansions those architectures currently use.
Tests added for the IR transform and SPARCv9.
Differential Revision: http://reviews.llvm.org/D21029
llvm-svn: 273025
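A standalone illustration of the masking/shifting idea in plain C++ (little-endian byte numbering assumed; this sketches the technique, not the pass itself, and the word-aliasing cast is for exposition only):
```
#include <atomic>
#include <cstdint>

// Emulate a 1-byte compare-and-swap using only the containing aligned
// 4-byte word, the way a cmpxchg-only ISA would have to.
bool cmpxchg_u8_via_u32(uint8_t *Addr, uint8_t Expected, uint8_t Desired) {
  auto *WordAddr = reinterpret_cast<std::atomic<uint32_t> *>(
      reinterpret_cast<uintptr_t>(Addr) & ~uintptr_t(3));
  unsigned Shift = (reinterpret_cast<uintptr_t>(Addr) & 3) * 8;
  uint32_t Mask = uint32_t(0xFF) << Shift;

  uint32_t Old = WordAddr->load(std::memory_order_relaxed);
  for (;;) {
    if (uint8_t((Old & Mask) >> Shift) != Expected)
      return false;                                  // byte already differs
    uint32_t New = (Old & ~Mask) | (uint32_t(Desired) << Shift);
    if (WordAddr->compare_exchange_weak(Old, New))
      return true;                                   // our byte was swapped in
    // Failure refreshes Old: some other byte of the word changed; retry so
    // that neighbouring stores cannot make the byte-sized cmpxchg give up.
  }
}
```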
|
#
bd4243c5 |
| 09-Jun-2016 |
Davide Italiano <davide@freebsd.org> |
[CodeGen] Change getSDagStackGuard to get an internal sym.
Fixes a crash in the backend during an LTO build of rtld(1) in FreeBSD.
llvm-svn: 272262
|
#
22bfa832 |
| 07-Jun-2016 |
Etienne Bergeron <etienneb@google.com> |
[stack-protection] Add support for MSVC buffer security check
Summary: This patch adds support for the MSVC buffer security check implementation.
The buffer security check is turned on with the '/GS' compiler switch.
* https://msdn.microsoft.com/en-us/library/8dbf701c.aspx
* To be added to clang here: http://reviews.llvm.org/D20347
Some overview of buffer security check feature and implementation:
* https://msdn.microsoft.com/en-us/library/aa290051(VS.71).aspx
* http://www.ksyash.com/2011/01/buffer-overflow-protection-3/
* http://blog.osom.info/2012/02/understanding-vs-c-compilers-buffer.html
For the following example:
```
int example(int offset, int index) {
  char buffer[10];
  memset(buffer, 0xCC, index);
  return buffer[index];
}
```
The MSVC compiler is adding these instructions to perform stack integrity check:
```
        push        ebp
        mov         ebp,esp
        sub         esp,50h
[1]     mov         eax,dword ptr [__security_cookie (01068024h)]
[2]     xor         eax,ebp
[3]     mov         dword ptr [ebp-4],eax
        push        ebx
        push        esi
        push        edi
        mov         eax,dword ptr [index]
        push        eax
        push        0CCh
        lea         ecx,[buffer]
        push        ecx
        call        _memset (010610B9h)
        add         esp,0Ch
        mov         eax,dword ptr [index]
        movsx       eax,byte ptr buffer[eax]
        pop         edi
        pop         esi
        pop         ebx
[4]     mov         ecx,dword ptr [ebp-4]
[5]     xor         ecx,ebp
[6]     call        @__security_check_cookie@4 (01061276h)
        mov         esp,ebp
        pop         ebp
        ret
```
The instrumentation above is:
* [1] is loading the global security canary,
* [3] is storing the locally computed ([2]) canary to the guard slot,
* [4] is loading the guard slot and ([5]) re-computing the global canary,
* [6] is validating the resulting canary with '__security_check_cookie' and performing error handling.
Overview of the current stack-protection implementation:
* lib/CodeGen/StackProtector.cpp
  * There is a default stack-protection implementation applied on the intermediate representation.
  * The target can overload the 'getIRStackGuard' method if it has a standard location for the stack protector cookie.
  * An intrinsic 'Intrinsic::stackprotector' is added to the prologue. It will be expanded by the instruction selection pass (DAG or Fast).
  * Basic blocks are added to every instrumented function to receive the code for handling stack guard validation and error handling.
  * Guard manipulation and comparison are added directly to the intermediate representation.
* lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
* lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
  * There is an implementation that adds instrumentation during instruction selection (for better handling of sibling calls).
  * See the long comment above the 'class StackProtectorDescriptor' declaration.
  * The target needs to override 'getSDagStackGuard' to activate SDAG stack protection generation (note: getIRStackGuard MUST be nullptr).
  * 'getSDagStackGuard' returns the appropriate stack guard (security cookie).
  * The code is generated by 'SelectionDAGBuilder.cpp' and 'SelectionDAGISel.cpp'.
* include/llvm/Target/TargetLowering.h
  * Contains the function to retrieve the default Guard 'Value'; it should be overridden by each target to select which implementation is used and to provide the Guard 'Value'.
* lib/Target/X86/X86ISelLowering.cpp
  * Contains the x86 specialisation; Guard 'Value' used by the SelectionDAG algorithm.
Function-based Instrumentation:
* MSVC doesn't inline the stack guard comparison in every function. Instead, a call to '__security_check_cookie' is added to the epilogue before every return instruction.
* To support function-based instrumentation, this patch is:
  * adding a function to get the function-based check (llvm 'Value', see include/llvm/Target/TargetLowering.h). If provided, the stack protection instrumentation won't be inlined and a call to that function will be added to the prologue,
  * modifying SelectionDAGISel.cpp to avoid producing the basic blocks used for inline instrumentation,
  * generating the function-based instrumentation during the ISEL pass (SelectionDAGBuilder.cpp),
  * if FastISEL (not SelectionDAG), using the fallback that relies on the same function-based check implemented over the intermediate representation (StackProtector.cpp).
Modifications:
* adding support for MSVC (lib/Target/X86/X86ISelLowering.cpp)
* adding support for function-based instrumentation (lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp, .h)
Results
* IR generated instrumentation:
```
clang-cl /GS test.cc /Od /c -mllvm -print-isel-input
```
```
*** Final LLVM Code input to ISel ***

; Function Attrs: nounwind sspstrong
define i32 @"\01?example@@YAHHH@Z"(i32 %offset, i32 %index) #0 {
entry:
  %StackGuardSlot = alloca i8*                                   <<<-- Allocated guard slot
  %0 = call i8* @llvm.stackguard()                               <<<-- Loading Stack Guard value
  call void @llvm.stackprotector(i8* %0, i8** %StackGuardSlot)   <<<-- Prologue intrinsic call (store to Guard slot)
  %index.addr = alloca i32, align 4
  %offset.addr = alloca i32, align 4
  %buffer = alloca [10 x i8], align 1
  store i32 %index, i32* %index.addr, align 4
  store i32 %offset, i32* %offset.addr, align 4
  %arraydecay = getelementptr inbounds [10 x i8], [10 x i8]* %buffer, i32 0, i32 0
  %1 = load i32, i32* %index.addr, align 4
  call void @llvm.memset.p0i8.i32(i8* %arraydecay, i8 -52, i32 %1, i32 1, i1 false)
  %2 = load i32, i32* %index.addr, align 4
  %arrayidx = getelementptr inbounds [10 x i8], [10 x i8]* %buffer, i32 0, i32 %2
  %3 = load i8, i8* %arrayidx, align 1
  %conv = sext i8 %3 to i32
  %4 = load volatile i8*, i8** %StackGuardSlot                   <<<-- Loading Guard slot
  call void @__security_check_cookie(i8* %4)                     <<<-- Epilogue function-based check
  ret i32 %conv
}
```
* SelectionDAG generated instrumentation:
```
clang-cl /GS test.cc /O1 /c /FA
```
``` "?example@@YAHHH@Z": # @"\01?example@@YAHHH@Z" # BB#0: # %entry pushl %esi subl $16, %esp movl ___security_cookie, %eax <<<-- Loading Stack Guard value movl 28(%esp), %esi movl %eax, 12(%esp) <<<-- Store to Guard slot leal 2(%esp), %eax pushl %esi pushl $204 pushl %eax calll _memset addl $12, %esp movsbl 2(%esp,%esi), %esi movl 12(%esp), %ecx <<<-- Loading Guard slot calll @__security_check_cookie@4 <<<-- Epilogue function-based check movl %esi, %eax addl $16, %esp popl %esi retl ```
Reviewers: kcc, pcc, eugenis, rnk
Subscribers: majnemer, llvm-commits, hans, thakis, rnk
Differential Revision: http://reviews.llvm.org/D20346
llvm-svn: 272053
|
#
e4b3812e |
| 31-May-2016 |
Ahmed Bougacha <ahmed.bougacha@gmail.com> |
[CodeGen] Don't mark FMINNUM/FMAXNUM Expand twice. NFC.
They're already in the all_valuetypes() loop above.
llvm-svn: 271316
|
#
33772c53 |
| 28-Apr-2016 |
Craig Topper <craig.topper@gmail.com> |
[CodeGen] Default CTTZ_ZERO_UNDEF/CTLZ_ZERO_UNDEF to Expand in TargetLoweringBase. This is what the majority of the targets want and removes a bunch of code. Set it to Legal explicitly in the few cases where that's the desired behavior.
llvm-svn: 267853
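A sketch of the new default as it would sit in the TargetLoweringBase constructor (the loop shape is assumed from the surrounding style; targets that really support the operations mark them Legal themselves):
```
// Sketch: default CTTZ_ZERO_UNDEF/CTLZ_ZERO_UNDEF to Expand everywhere, so
// only targets that actually have the instruction need to say anything.
for (MVT VT : MVT::all_valuetypes()) {
  setOperationAction(ISD::CTTZ_ZERO_UNDEF, VT, Expand);
  setOperationAction(ISD::CTLZ_ZERO_UNDEF, VT, Expand);
}
```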
|
#
d66607bd |
| 26-Apr-2016 |
Sanjay Patel <spatel@rotateright.com> |
[CodeGenPrepare] use branch weight metadata to decide if a select should be turned into a branch
This is part of solving PR27344: https://llvm.org/bugs/show_bug.cgi?id=27344
CGP should undo the SimplifyCFG transform for the same reason that earlier patches have used this same mechanism: it's possible that passes between SimplifyCFG and CGP may be able to optimize the IR further with a select in place.
For the TLI hook default, >99% taken or not taken is chosen as the default threshold for a highly predictable branch. Even the most limited HW branch predictors will be correct on this branch almost all the time, so even a massive mispredict penalty would be overcome by the win from all the times the branch was predicted correctly.
As a follow-up, we could make the default target hook less conservative by using the SchedMachineModel's MispredictPenalty. Or we could just let targets override the default by implementing the hook with that and other target-specific options. Note that trying to statically determine mispredict rates for close-to-balanced profile weight data is generally impossible if the HW is sufficiently advanced. Ie, 50/50 taken/not-taken might still be 100% predictable.
Finally, note that this patch as-is will not solve PR27344 because the current __builtin_unpredictable() branch weight default values are 4 and 64. A proposal to change that is in D19435.
Differential Revision: http://reviews.llvm.org/D19488
llvm-svn: 267572
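A hedged sketch of the default hook described above (the name getPredictableBranchThreshold matches the in-tree API this work introduced, but treat the exact spelling as an assumption):
```
// Sketch: a branch taken (or not taken) more than 99% of the time is
// considered predictable enough that a real branch beats a select.
BranchProbability TargetLoweringBase::getPredictableBranchThreshold() const {
  return BranchProbability(99, 100);
}
```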
|
#
940d19a0 |
| 22-Apr-2016 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
TLI: Only iterate over integer vector types
Instead of iterating over all vectors and skipping integers.
llvm-svn: 267220
|
#
a1d8bc55 |
| 19-Apr-2016 |
Tim Shen <timshen91@gmail.com> |
[PPC, SSP] Support PowerPC Linux stack protection.
llvm-svn: 266809
|
#
7873fb9d |
| 12-Apr-2016 |
James Y Knight <jyknight@google.com> |
Pre-fill LibcallRoutineNames with nullptr.
And rearrange InitLibcallNames slightly.
llvm-svn: 266142
|
#
19f6cce4 |
| 12-Apr-2016 |
James Y Knight <jyknight@google.com> |
Add __atomic_* lowering to AtomicExpandPass.
(Recommit of r266002, with r266011, r266016, and not accidentally including an extra unused/uninitialized element in LibcallRoutineNames)
AtomicExpandPass can now lower atomic load, atomic store, atomicrmw, and cmpxchg instructions to __atomic_* library calls, when the target doesn't support atomics of a given size.
This is the first step towards moving all atomic lowering from clang into llvm. When all is done, the behavior of __sync_* builtins, __atomic_* builtins, and C11 atomics will be unified.
Previously LLVM would pass everything through to the ISelLowering code. There, unsupported atomic instructions would turn into __sync_* library calls. Because of that behavior, Clang currently avoids emitting llvm IR atomic instructions when this would happen, and emits __atomic_* library functions itself, in the frontend.
This change makes LLVM able to emit __atomic_* libcalls, and thus will eventually allow clang to depend on LLVM to do the right thing.
It is advantageous to do the new lowering to atomic libcalls in AtomicExpandPass, before ISel time, because it's important that all atomic operations for a given size either lower to __atomic_* libcalls (which may use locks), or native instructions which won't. No mixing and matching.
At the moment, this code is enabled only for SPARC, as a demonstration. The next commit will expand support to all of the other targets.
Differential Revision: http://reviews.llvm.org/D18200
llvm-svn: 266115
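A hedged sketch of the size gate behind the new lowering (getMaxAtomicSizeInBitsSupported is an assumption based on the API this work introduced, and the helper name is illustrative):
```
// Sketch: operations wider than the target's native atomic width go to the
// __atomic_* runtime; narrower ones never do, so lock-based and lock-free
// paths are never mixed for a given size.
static bool shouldUseAtomicLibcall(unsigned SizeInBits,
                                   const TargetLoweringBase &TLI) {
  return SizeInBits > TLI.getMaxAtomicSizeInBitsSupported();
}
```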
|
#
d41b54be |
| 12-Apr-2016 |
Rafael Espindola <rafael.espindola@gmail.com> |
This reverts commit r266002, r266011 and r266016.
They broke the msan bot.
Original message:
Add __atomic_* lowering to AtomicExpandPass.
AtomicExpandPass can now lower atomic load, atomic store, atomicrmw, and cmpxchg instructions to __atomic_* library calls, when the target doesn't support atomics of a given size.
This is the first step towards moving all atomic lowering from clang into llvm. When all is done, the behavior of __sync_* builtins, __atomic_* builtins, and C11 atomics will be unified.
Previously LLVM would pass everything through to the ISelLowering code. There, unsupported atomic instructions would turn into __sync_* library calls. Because of that behavior, Clang currently avoids emitting llvm IR atomic instructions when this would happen, and emits __atomic_* library functions itself, in the frontend.
This change makes LLVM able to emit __atomic_* libcalls, and thus will eventually allow clang to depend on LLVM to do the right thing.
It is advantageous to do the new lowering to atomic libcalls in AtomicExpandPass, before ISel time, because it's important that all atomic operations for a given size either lower to __atomic_* libcalls (which may use locks), or native instructions which won't. No mixing and matching.
At the moment, this code is enabled only for SPARC, as a demonstration. The next commit will expand support to all of the other targets.
Differential Revision: http://reviews.llvm.org/D18200
llvm-svn: 266062
|
#
b91d38c5 |
| 11-Apr-2016 |
James Y Knight <jyknight@google.com> |
Add __atomic_* lowering to AtomicExpandPass.
AtomicExpandPass can now lower atomic load, atomic store, atomicrmw, and cmpxchg instructions to __atomic_* library calls, when the target doesn't support atomics of a given size.
This is the first step towards moving all atomic lowering from clang into llvm. When all is done, the behavior of __sync_* builtins, __atomic_* builtins, and C11 atomics will be unified.
Previously LLVM would pass everything through to the ISelLowering code. There, unsupported atomic instructions would turn into __sync_* library calls. Because of that behavior, Clang currently avoids emitting llvm IR atomic instructions when this would happen, and emits __atomic_* library functions itself, in the frontend.
This change makes LLVM able to emit __atomic_* libcalls, and thus will eventually allow clang to depend on LLVM to do the right thing.
It is advantageous to do the new lowering to atomic libcalls in AtomicExpandPass, before ISel time, because it's important that all atomic operations for a given size either lower to __atomic_* libcalls (which may use locks), or native instructions which won't. No mixing and matching.
At the moment, this code is enabled only for SPARC, as a demonstration. The next commit will expand support to all of the other targets.
Differential Revision: http://reviews.llvm.org/D18200
llvm-svn: 266002
|