.\"	$OpenBSD: pctr.4,v 1.7 2013/07/16 16:05:49 schwarze Exp $
.\"
.\" Pentium performance counter driver for OpenBSD.
.\" Copyright 1996 David Mazieres <dm@lcs.mit.edu>.
.\"
.\" Modification and redistribution in source and binary forms is
.\" permitted provided that due credit is given to the author and the
.\" OpenBSD project by leaving this copyright notice intact.
.\"
.Dd $Mdocdate: July 16 2013 $
.Dt PCTR 4 amd64
.Os
.Sh NAME
.Nm pctr
.Nd driver for CPU performance counters
.Sh SYNOPSIS
.Cd "pseudo-device pctr 1"
.Sh DESCRIPTION
The
.Nm
device provides access to the performance counters on AMD and Intel brand
processors, and to the TSC on others.
.Pp
Intel processors have two 40-bit performance counters which can be
programmed to count events such as cache misses, branch target buffer hits,
TLB misses, dual-issues, interrupts, pipeline flushes, and more.
While AMD processors have four 48-bit counters, their precision is decreased
to 40 bits.
.Pp
There is one
.Em ioctl
call to read the status of all counters, and one
.Em ioctl
call to program the function of each counter.
All require the following includes:
.Bd -literal -offset indent
#include <sys/types.h>
#include <machine/cpu.h>
#include <machine/pctr.h>
.Ed
.Pp
The current state of all counters can be read with the
.Dv PCIOCRD
.Em ioctl ,
which takes an argument of type
.Dv "struct pctrst" :
.Bd -literal -offset indent
#define PCTR_NUM	4
struct pctrst {
	u_int pctr_fn[PCTR_NUM];
	pctrval pctr_tsc;
	pctrval pctr_hwc[PCTR_NUM];
};
.Ed
.Pp
In this structure,
.Em pctr_fn
contains the functions of the counters, as previously set by the
.Dv PCIOCS0 ,
.Dv PCIOCS1 ,
.Dv PCIOCS2
and
.Dv PCIOCS3
ioctls (see below).
.Em pctr_hwc
contains the actual value of the hardware counters.
.Em pctr_tsc
is a free-running, 64-bit cycle counter.
.Pp
The functions of the counters can be programmed with ioctls
.Dv PCIOCS0 ,
.Dv PCIOCS1 ,
.Dv PCIOCS2
and
.Dv PCIOCS3
which require a writeable file descriptor and take an argument of type
.Dv "unsigned int" . \&
The meaning of this integer is dependent on the particular CPU.
.Ss Time stamp counter
The time stamp counter is available on most AMD and Intel CPUs.
It is set to zero at boot time, and then increments with each cycle.
Because the counter is 64 bits wide, it does not overflow in practice.
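.Pp
As a rough illustration of why overflow is not a practical concern,
the following sketch estimates how long a 64-bit cycle counter takes
to wrap.
The function name
.Fn tsc_wrap_years
and the constant clock rate passed to it are assumptions made for this
example only; they are not part of the driver.
At an assumed 3 GHz the result is roughly 195 years.
.Bd -literal -offset indent
/*
 * Years until a 64-bit cycle counter that started at zero wraps,
 * for a CPU running at a constant hz cycles per second.
 * Hypothetical helper, not provided by <machine/pctr.h>.
 */
double
tsc_wrap_years(double hz)
{
	const double cycles = 18446744073709551616.0;	/* 2^64 */
	const double secs_per_year = 365.25 * 24 * 60 * 60;

	return cycles / hz / secs_per_year;
}
.Ed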
.Pp
The time stamp counter can be read directly from user-mode using
the
.Fn rdtsc
macro, which returns a 64-bit value of type
.Em pctrval .
The following example illustrates a simple use of
.Fn rdtsc
to measure the execution time of a hypothetical subroutine called
.Fn functionx :
.Bd -literal -offset indent
void
time_functionx(void)
{
	pctrval tsc;

	tsc = rdtsc();
	functionx();
	tsc = rdtsc() - tsc;
	printf("Functionx took %llu cycles.\en", tsc);
}
.Ed
.Pp
The value of the time stamp counter is also returned by the
.Dv PCIOCRD
.Em ioctl ,
so that one can get an exact timestamp on readings of the hardware
event counters.
.Pp
The performance counters can be read directly from user-mode without
invoking the kernel.
The macro
.Fn rdpmc ctr
takes 0, 1, 2 or 3 as an argument to specify a counter, and returns that
counter's 40-bit value (which will be of type
.Em pctrval ) .
This is generally preferable to making a system call as it introduces
less distortion in measurements.
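.Pp
Because the counter values are 40 bits wide, the difference between two
readings may need to be reduced modulo 2^40.
The following is a minimal sketch of one way to do that; the names
.Dv CTR_BITS ,
.Dv CTR_MASK
and
.Fn ctr_delta
are hypothetical and are not defined by the driver, and the sketch
assumes the counter wrapped at most once between the two readings.
.Bd -literal -offset indent
#include <stdint.h>

#define CTR_BITS	40	/* counter width described above */
#define CTR_MASK	((UINT64_C(1) << CTR_BITS) - 1)

/*
 * Events counted between two readings of a 40-bit counter.
 * Unsigned subtraction followed by masking gives the right
 * answer even if the counter wrapped once in between.
 */
uint64_t
ctr_delta(uint64_t before, uint64_t after)
{
	return (after - before) & CTR_MASK;
}
.Ed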
.Pp
Counter functions supported by these CPUs contain several parts.
The most significant byte (an 8-bit integer shifted left by
.Dv PCTR_CM_SHIFT )
contains a
.Em "counter mask" .
If non-zero, this sets a threshold for the number of times an event
must occur in one cycle for the counter to be incremented.
The
.Em "counter mask"
can therefore be used to count cycles in which an event
occurs at least some number of times.
The next byte contains several flags:
.Bl -tag -width PCTR_EN
.It Dv PCTR_U
Enables counting of events that occur in user mode.
.It Dv PCTR_K
Enables counting of events that occur in kernel mode.
You must set at least one of
.Dv PCTR_K
and
.Dv PCTR_U
to count anything.
.It Dv PCTR_E
Counts edges rather than cycles.
For some functions this allows you
to get an estimate of the number of events rather than the number of
cycles occupied by those events.
.It Dv PCTR_EN
Enable counters.
This bit must be set in the function for counter 0
in order for either of the counters to be enabled.
This bit should probably be set in counter 1 as well.
.It Dv PCTR_I
Inverts the sense of the
.Em "counter mask" . \&
When this bit is set, the counter only increments on cycles in which
there are no
.Em more
events than specified in the
.Em "counter mask" .
.El
.Pp
The next byte (shifted left by
.Dv PCTR_UM_SHIFT )
contains flags specific to the event being counted, also known as the
.Em "unit mask" .
.Pp
For events dealing with the L2 cache, the following flags are valid
on Intel brand processors:
.Bl -tag -width PCTR_UM_M
.It Dv PCTR_UM_M
Count events involving modified cache coherency state lines.
.It Dv PCTR_UM_E
Count events involving exclusive cache coherency state lines.
.It Dv PCTR_UM_S
Count events involving shared cache coherency state lines.
.It Dv PCTR_UM_I
Count events involving invalid cache coherency state lines.
.El
.Pp
To measure all L2 cache activity, all these bits should be set.
They can be set with the macro
.Dv PCTR_UM_MESI
which contains the bitwise OR of all of the above.
.Pp
For event types dealing with bus transactions, there is another flag
that can be set in the
.Em "unit mask" :
.Bl -tag -width PCTR_UM_A
.It Dv PCTR_UM_A
Count all appropriate bus events, not just those initiated by the
processor.
.El
.Pp
Events marked
.Em (MESI)
require the
.Dv PCTR_UM_[MESI]
bits in the
.Em "unit mask" . \&
Events marked
.Em (A)
can take the
.Dv PCTR_UM_A
bit.
.Pp
Finally, the least significant byte of the counter function is the
event type to count.
A list of possible event functions can be obtained by running
.Xr pctr 1
with the
.Fl l
option.
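.Pp
The byte layout described above can be sketched as follows.
This is an illustration only: the helper
.Fn mk_ctrfn
and the
.Dv EX_CM_SHIFT
and
.Dv EX_UM_SHIFT
constants are hypothetical, with shift values inferred from the byte
positions given in this page, and the sketch assumes that the flag
macros are already positioned within the word.
Real programs should use the definitions from
.In machine/pctr.h
instead.
.Bd -literal -offset indent
#include <stdint.h>

/* Shifts inferred from the layout above (assumptions). */
#define EX_CM_SHIFT	24	/* counter mask: top byte */
#define EX_UM_SHIFT	8	/* unit mask: second-lowest byte */

/*
 * Assemble a counter function from its four parts.
 * cmask and umask are raw 8-bit values before shifting;
 * flags is the OR of already positioned flag bits;
 * event is the event type in the lowest byte.
 */
uint32_t
mk_ctrfn(uint8_t cmask, uint32_t flags, uint8_t umask, uint8_t event)
{
	return ((uint32_t)cmask << EX_CM_SHIFT) | flags |
	    ((uint32_t)umask << EX_UM_SHIFT) | event;
}
.Ed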
.Sh FILES
.Bl -tag -width /dev/pctr -compact
.It Pa /dev/pctr
.El
.Sh ERRORS
.Bl -tag -width "[ENODEV]"
.It Bq Er ENODEV
An attempt was made to set the counter functions on a CPU that does
not support counters.
.It Bq Er EINVAL
An invalid counter function was provided as an argument to the
.Dv PCIOCSx
.Em ioctl .
.It Bq Er EPERM
An attempt was made to set the counter functions, but the device was
not open for writing.
.El
.Sh SEE ALSO
.Xr pctr 1 ,
.Xr ioctl 2
.Sh HISTORY
A
.Nm
device first appeared in
.Ox 2.0 .
Support for the amd64 architecture appeared in
.Ox 4.3 .
.Sh AUTHORS
.An -nosplit
The
.Nm
device was written by
.An David Mazieres Aq Mt dm@lcs.mit.edu .
Support for the amd64 architecture was written by
.An Mike Belopuhov Aq Mt mikeb@openbsd.org .
.Sh BUGS
Not all counter functions are completely accurate.
Some of the functions may not make any sense at all.
Also, an interrupt may occur between invocations of
.Fn rdpmc
and/or
.Fn rdtsc ,
which can decrease the accuracy of measurements.
259