#
caba1a6f |
| 01-Dec-2022 |
ryo <ryo@NetBSD.org> |
Improve tprof(4)
- Multiple events can now be handled simultaneously. - Counters should be configured with TPROF_IOC_CONFIGURE_EVENT in advance, instead of being configured at TPROF_IOC_START. - T
Improve tprof(4)
- Multiple events can now be handled simultaneously. - Counters should be configured with TPROF_IOC_CONFIGURE_EVENT in advance, instead of being configured at TPROF_IOC_START. - The configured counters can be started and stopped repeatedly by PROF_IOC_START/TPROF_IOC_STOP. - The value of the performance counter can be obtained at any timing as a 64bit value with TPROF_IOC_GETCOUNTS. - Backend common parts are handled in tprof.c as much as possible, and functions on the tprof_backend side have been reimplemented to be more primitive. - The reset value of counter overflows for profiling can now be adjusted. It is calculated by default from the CPU clock (speed of cycle counter) and TPROF_HZ, but for some events the value may be too large to be sufficient for profiling. The event counter can be specified as a ratio to the default or as an absolute value when configuring the event counter. - Due to overall changes, API and ABI have been changed. TPROF_VERSION and TPROF_BACKEND_VERSION were updated.
show more ...
|
#
a087cb3c |
| 13-Jul-2018 |
maxv <maxv@NetBSD.org> |
Revamp tprof.
Rewrite the Intel backend to use the generic PMC interface, which is available on all Intel CPUs. Synchronize the AMD backend with the new interface.
The kernel identifies the PMC int
Revamp tprof.
Rewrite the Intel backend to use the generic PMC interface, which is available on all Intel CPUs. Synchronize the AMD backend with the new interface.
The kernel identifies the PMC interface, and gives its id to userland. Userland then queries the events itself (via cpuid etc). These events depend on the PMC interface.
The tprof utility is rewritten to allow the user to choose which event to count (which was not possible until now, the event was hardcoded in the backend). The command line format is based on usr.bin/pmc, eg:
tprof -e llc-misses:k -o output sleep 20
The man page is updated too, but the arguments will likely change soon anyway so it doesn't matter a lot.
The tprof utility has three tables:
Intel Architectural Version 1 Intel Skylake/Kabylake AMD Family 10h
A CPU can support a combination of tables. For example Kabylake has Intel-Architectural-Version-1 and its own Intel-Kabylake table.
For now the Intel Skylake/Kabylake table contains only one event, just to demonstrate that the combination of tables works. Tested on an Intel Core i5 Kabylake.
The code for AMD Family 10h is taken from the code I had written for usr.bin/pmc. I haven't tested it yet, but it's the same as pmc(1), so I guess it works as-is.
The whole thing is written in such a way that (I think) it is not complicated to add more CPU models, and more architectures (other than x86).
show more ...
|