RATIONALE.md - OpenGrok cross reference for /llvm-project/libc/benchmarks/RATIONALE.md

1 # Benchmarking `llvm-libc`'s memory functions
9 -   **code size** (to reduce instruction cache pressure),
10 -   **Profile-Guided Optimization** friendliness,
11 -   **hyperthreading / multithreading** friendliness.
23 5.  **Cost-effectiveness**: Benchmark tests are economical.
31 peculiarities of designing good microbenchmarks for `llvm-libc` memory
36 As seen in the [README.md](README.md#stochastic-mode) the microbenchmarking
39 accurately down to the cycle**.
44  - [Performance
47  - [High Precision Event
50  - [Real-Time Clocks (RTC)](https://en.wikipedia.org/wiki/Real-time_clock): used
53 In theory, **Performance Counters** provide cycle-accurate measurement via the
60 order](https://en.wikipedia.org/wiki/Out-of-order_execution) and
72 are micro-architecture dependent: **it is generally not possible to compare two
73 micro-architectures exposing the same performance counters.**
76 meaning. Although we want to benchmark `llvm-libc` memory functions for all
78 triples](https://clang.llvm.org/docs/CrossCompilation.html#target-triple), there
83 -   Reading performance counters is done through kernel [System
86 -   [Interrupts](https://en.wikipedia.org/wiki/Interrupt#Processor_response)
88 -   If the system is already under monitoring (virtual machines or system wide
91 -   The kernel can decide to [migrate the
94 -   [Dynamic frequency
100 ### Cycle accuracy conclusion
103 inconsistent across micro-architectures and imprecise on modern CPUs for small
117 [SNR](https://en.wikipedia.org/wiki/Signal-to-noise_ratio).
119 ### Repeating code N-times until precision is sufficient
123 -   We measure the time it takes to run the code _N_ times (initially _N_ is 10
125 -   We deduce an approximation of the runtime of one iteration (= _runtime_ /
127 -   We increase _N_ by _X%_ and repeat the measurement (geometric progression).
128 -   We keep track of the _one iteration runtime approximation_ and build a
130 -   We stop the process when the difference between the weighted mean and the
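The repetition scheme listed above can be sketched as follows. This is a minimal illustration and not the `llvm-libc` benchmark implementation: the function name `estimate_iteration_time`, the 40% growth rate, the weighting of each estimate by its repetition count, the relative threshold `epsilon`, and the `max_rounds` cap are all assumptions chosen for the example.

```python
import time
from typing import Callable


def estimate_iteration_time(
    run: Callable[[], object],
    initial_n: int = 10,      # starting repetition count (assumed default)
    growth: float = 1.4,      # N grows by 40% per round (assumed value of X)
    epsilon: float = 0.01,    # relative convergence threshold (assumed)
    max_rounds: int = 50,     # safety cap so the loop always terminates
    clock: Callable[[], float] = time.perf_counter,
) -> float:
    """Return an estimate of the runtime of a single call to run()."""
    n = initial_n
    weighted_sum = 0.0   # sum of (estimate * weight), with weight = n
    total_weight = 0     # sum of the weights seen so far
    mean = 0.0
    for round_index in range(max_rounds):
        # Measure the time needed to run the code n times.
        start = clock()
        for _ in range(n):
            run()
        elapsed = clock() - start

        # Approximate the runtime of one iteration for this round.
        estimate = elapsed / n

        # Fold the estimate into a mean weighted by the repetition count.
        weighted_sum += estimate * n
        total_weight += n
        mean = weighted_sum / total_weight

        # Stop once the latest estimate agrees with the weighted mean.
        # The first round is skipped because the two are trivially equal.
        if round_index > 0 and abs(mean - estimate) <= epsilon * mean:
            break

        # Geometric progression: grow n, by at least one repetition.
        n = max(n + 1, int(n * growth))
    return mean
```

Passing the clock as a parameter is a design choice made for the sketch: it lets the loop be exercised with a deterministic fake clock instead of wall time.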
178 ### Effect of dynamic frequency scaling
180 Modern processors implement [dynamic frequency
181 scaling](https://en.wikipedia.org/wiki/Dynamic_frequency_scaling). In so-called
182 `performance` mode the CPU will increase its frequency and run faster than usual
185 limits, the number of cores currently in use, and the maximum frequency of the
188 **Decision: When benchmarking we want to make sure the dynamic frequency scaling
190 events are not impacted by frequency scaling.**
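One way to check the scaling state before a run is to read the governor of each CPU. The sketch below assumes a Linux system that exposes the cpufreq sysfs files under `/sys/devices/system/cpu`; the function names `governors` and `cpus_not_in_performance_mode` are hypothetical and are not part of `llvm-libc` or Google Benchmark.

```python
import glob
import os


def governors(cpu_root: str = "/sys/devices/system/cpu") -> dict:
    """Map each CPU directory name (e.g. 'cpu0') to its scaling governor.

    Assumes the Linux cpufreq sysfs layout; returns an empty dict when
    the files are absent (other OS, or cpufreq not available).
    """
    result = {}
    pattern = os.path.join(cpu_root, "cpu[0-9]*", "cpufreq", "scaling_governor")
    for path in sorted(glob.glob(pattern)):
        # path looks like <cpu_root>/cpuN/cpufreq/scaling_governor
        cpu = os.path.basename(os.path.dirname(os.path.dirname(path)))
        with open(path) as f:
            result[cpu] = f.read().strip()
    return result


def cpus_not_in_performance_mode(cpu_root: str = "/sys/devices/system/cpu") -> list:
    """List the CPUs whose governor is anything other than 'performance'."""
    return [cpu for cpu, gov in governors(cpu_root).items() if gov != "performance"]
```

A benchmark driver could call `cpus_not_in_performance_mode()` at startup and warn, or refuse to run, when the returned list is not empty.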
197 reservation](https://stackoverflow.com/questions/13583146/whole-one-core-dedicated-to-single-proces…
222 misses](https://en.wikipedia.org/wiki/Translation_lookaside_buffer#TLB-miss_handling).
228     We reuse some parts of Google Benchmark (detection of frequency scaling, CPU
232     -   Google Benchmark favors code-based configuration via macros and
242     -   Output of the measurements is done through a `BenchmarkReporter` class,