Lines Matching full:kernel
1 # Optimizing Linux Kernel with BOLT
6 Many Linux applications spend a significant amount of their execution time in the kernel. Thus, when we consider code optimization for system performance, it is essential to improve the CPU utilization not only in the user-space applications and libraries but also in the kernel. BOLT has demonstrated double-digit gains while being applied to user-space programs. This guide shows how to apply BOLT to the x86-64 Linux kernel and enhance your system's performance. In our experiments, BOLT boosted database TPS by 2 percent when applied to the kernel compiled with the highest level optimizations, including PGO and LTO. The database spent ~40% of the time in the kernel and was quite sensitive to kernel performance.
10 While BOLT optimizations are not specific to the Linux kernel, certain quirks distinguish the kernel from user-level applications.
12 BOLT has been successfully applied to and tested with several flavors of the x86-64 Linux kernel.
17 BOLT operates on a statically-linked kernel executable, a.k.a. `vmlinux` binary. However, most Linux distributions use a `vmlinuz` compressed image for system booting. To use BOLT on the kernel, you must either repackage `vmlinuz` after BOLT optimizations or add steps for running BOLT into the kernel build and rebuild `vmlinuz`. Uncompressing `vmlinuz` and repackaging it with a new `vmlinux` binary falls beyond the scope of this guide, and at some point, we may add the capability to run BOLT directly on `vmlinuz`. Meanwhile, this guide focuses on steps for integrating BOLT into the kernel build process.
20 ### Building the Kernel
22 After downloading the kernel sources and configuration for your distribution, you should be able to build `vmlinuz` using the `make bzImage` command. Ideally, the kernel should binary match the kernel on the system you are about to optimize (the target system). The binary matching part is critical as BOLT performs profile matching and optimizations at the binary level. We recommend installing a freshly built kernel on the target system to avoid any discrepancies.
24 Note that the kernel build will produce several artifacts besides bzImage. The most important of them is the uncompressed `vmlinux` binary, which will be used in the next steps. Make sure to save this file.
28 Once the target system boots with the freshly-built kernel, start your workload, such as a database benchmark. While the system is under load, collect the kernel profile using perf:
52 Two changes are required for the kernel build. The first one is optional but highly recommended. It introduces a BOLT-reserved space into `vmlinux` code section:
56 --- a/arch/x86/kernel/vmlinux.lds.S
57 +++ b/arch/x86/kernel/vmlinux.lds.S
101 By improving the code layout, BOLT can boost the kernel's performance by up to 5% by reducing instruction cache misses and branch mispredictions. When measuring total system performance, you should scale this number accordingly based on the time your application spends in the kernel (excluding I/O time).
106 The timing and duration of the profiling may have a significant effect on the performance of the BOLTed kernel. If you don't know your workload well, it's recommended that you profile for the whole duration of the benchmark run. As longer times will result in larger `perf.data` files, you can lower the profiling frequency by providing a smaller value of `-F` flag. E.g., to record the kernel profile for half an hour, use the following command: