Lines Matching +full:cycle +full:- +full:frequency

1 # Benchmarking `llvm-libc`'s memory functions
9 - **code size** (to reduce instruction cache pressure),
10 - **Profile Guided Optimization** friendliness,
11 - **hyperthreading / multithreading** friendliness.
23 5. **Cost-effectiveness**: Benchmark tests are economical.
31 peculiarities of designing good microbenchmarks for `llvm-libc` memory
36 As seen in the [README.md](README.md#stochastic-mode) the microbenchmarking
39 accurately down to the cycle**.
44 - [Performance
47 - [High Precision Event
50 - [Real-Time Clocks (RTC)](https://en.wikipedia.org/wiki/Real-time_clock): used
53 In theory **Performance Counters** provide cycle accurate measurement via the
60 order](https://en.wikipedia.org/wiki/Out-of-order_execution) and
72 are micro-architecture dependent: **it is generally not possible to compare two
73 micro-architectures exposing the same performance counters.**
76 meaning. Although we want to benchmark `llvm-libc` memory functions for all
78 triples](https://clang.llvm.org/docs/CrossCompilation.html#target-triple), there
83 - Reading performance counters is done through Kernel [System
86 - [Interruptions](https://en.wikipedia.org/wiki/Interrupt#Processor_response)
88 - If the system is already under monitoring (virtual machines or system wide
91 - The Kernel can decide to [migrate the
94 - [Dynamic frequency
100 ### Cycle accuracy conclusion
103 inconsistent across micro-architectures and imprecise on modern CPUs for small
117 [SNR](https://en.wikipedia.org/wiki/Signal-to-noise_ratio).
119 ### Repeating code N-times until precision is sufficient
123 - We measure the time it takes to run the code _N_ times (initially _N_ is 10
125 - We deduce an approximation of the runtime of one iteration (= _runtime_ /
127 - We increase _N_ by _X%_ and repeat the measurement (geometric progression).
128 - We keep track of the _one iteration runtime approximation_ and build a
130 - We stop the process when the difference between the weighted mean and the
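The repetition loop described in the list above can be sketched as follows. This is a minimal illustration, not the benchmark framework's actual code. The function name, the `growth_percent` and `epsilon` parameters, the weighting of each estimate by its _N_, and the choice to compare the weighted mean against the latest estimate are all assumptions made for the example, since the excerpt does not spell them out.

```python
import time


def estimate_iteration_time(fn, initial_n=10, growth_percent=40,
                            epsilon=0.01, max_rounds=50,
                            timer=time.perf_counter):
    """Return (weighted mean runtime of one call to fn, last N used).

    All parameter names and defaults are hypothetical.
    """
    n = initial_n
    weighted_sum = 0.0   # sum of (estimate * weight)
    weight_total = 0.0   # sum of weights
    mean = None
    for _ in range(max_rounds):
        # Measure the time it takes to run the code N times.
        start = timer()
        for _i in range(n):
            fn()
        elapsed = timer() - start

        # Approximate the runtime of one iteration.
        estimate = elapsed / n

        # Assumption: weight each estimate by its N, so longer runs count more.
        weighted_sum += estimate * n
        weight_total += n
        new_mean = weighted_sum / weight_total

        # Assumption: stop once the weighted mean and the latest estimate
        # agree within a relative tolerance. The first round is skipped
        # because the mean then trivially equals the estimate.
        if mean is not None and new_mean > 0:
            if abs(new_mean - estimate) / new_mean < epsilon:
                return new_mean, n
        mean = new_mean

        # Increase N by X% (geometric progression), by at least one.
        n += max(1, n * growth_percent // 100)
    return mean, n
```

Passing a custom `timer` makes the loop easy to exercise with a deterministic fake clock; with the default `time.perf_counter` the result depends on the machine and its load.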
178 ### Effect of dynamic frequency scaling
180 Modern processors implement [dynamic frequency
181 scaling](https://en.wikipedia.org/wiki/Dynamic_frequency_scaling). In so-called
182 `performance` mode the CPU will increase its frequency and run faster than usual
185 limits, the number of cores currently in use, and the maximum frequency of the
188 **Decision: When benchmarking we want to make sure the dynamic frequency scaling
190 events are not impacted by frequency scaling.**
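On Linux, the cpufreq subsystem typically exposes each core's active scaling governor under sysfs, which gives one way to inspect the frequency scaling configuration before a benchmark run. The sketch below only lists the governors. The function name and the `sysfs_root` parameter are hypothetical, and the path layout is an assumption that holds on common Linux systems but not on other platforms.

```python
from pathlib import Path


def read_scaling_governors(sysfs_root="/sys/devices/system/cpu"):
    """Map each CPU directory name (e.g. "cpu0") to its scaling governor.

    Returns an empty dict when no cpufreq entries are readable.
    """
    governors = {}
    pattern = "cpu[0-9]*/cpufreq/scaling_governor"
    for path in sorted(Path(sysfs_root).glob(pattern)):
        cpu_name = path.parent.parent.name
        try:
            governors[cpu_name] = path.read_text().strip()
        except OSError:
            # Entry vanished or is unreadable; skip it.
            continue
    return governors


if __name__ == "__main__":
    for cpu, governor in read_scaling_governors().items():
        print(f"{cpu}: {governor}")
```

Taking the sysfs root as a parameter keeps the function testable against a temporary directory tree.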
197 reservation](https://stackoverflow.com/questions/13583146/whole-one-core-dedicated-to-single-proces…
222 misses](https://en.wikipedia.org/wiki/Translation_lookaside_buffer#TLB-miss_handling).
228 We reuse some parts of Google Benchmark (detection of frequency scaling, CPU
232 - Google Benchmark favors code-based configuration via macros and
242 - Output of the measurements is done through a `BenchmarkReporter` class,