.\" $NetBSD: __builtin_prefetch.3,v 1.4 2024/09/07 20:33:53 rillig Exp $
.\"
.\" Copyright (c) 2010 Jukka Ruohonen <jruohonen@iki.fi>
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE NETBSD FOUNDATION, INC. AND CONTRIBUTORS
.\" ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
.\" TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
.\" PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE FOUNDATION OR CONTRIBUTORS
.\" BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
.\" CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
.\" SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
.\" INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
.\" CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
.\" ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
.\" POSSIBILITY OF SUCH DAMAGE.
.\"
.Dd December 22, 2010
.Dt __BUILTIN_PREFETCH 3
.Os
.Sh NAME
.Nm __builtin_prefetch
.Nd GNU extension to prefetch memory
.Sh SYNOPSIS
.Ft void
.Fn __builtin_prefetch "const void *addr" "..."
.Sh DESCRIPTION
The
.Fn __builtin_prefetch
function prefetches memory from
.Fa addr .
The rationale is to minimize cache-miss latency by
trying to move data into a cache before the data is accessed.
Possible use cases include frequently called sections of code
in which it is known that the data at a given address is likely
to be accessed soon.
.Pp
In addition to
.Fa addr ,
there are two optional
.Xr stdarg 3
arguments,
.Fa rw
and
.Fa locality .
Both must be compile-time constant integers.
The value of
.Fa rw
is either 0 or 1, corresponding to a read prefetch and a write prefetch,
respectively.
The default value of
.Fa rw
is 0.
The value of
.Fa locality
is between 0 and 3.
The higher the value, the higher the temporal locality of the data.
When
.Fa locality
is 0, it is assumed that there is little or no temporal locality in the data;
after access, it is not necessary to leave the data in the cache.
The default value of
.Fa locality
is 3.
.Pp
The
.Fn __builtin_prefetch
function translates into prefetch instructions
only if the architecture supports them.
If there is no support, the call is silently ignored:
.Xr gcc 1
issues no warning, and
.Fa addr
is evaluated only if it has side effects.
.Sh EXAMPLES
The following optimization appears in the heavily used
.Fn cpu_in_cksum
function that calculates checksums for the
.Xr inet 4
headers:
.Bd -literal -offset indent
while (mlen >= 32) {
	__builtin_prefetch(data + 32);
	partial += *(uint16_t *)data;
	partial += *(uint16_t *)(data + 2);
	partial += *(uint16_t *)(data + 4);

	\&...

	partial += *(uint16_t *)(data + 28);
	partial += *(uint16_t *)(data + 30);

	data += 32;
	mlen -= 32;

	\&...
.Ed
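.Pp
The optional
.Fa rw
and
.Fa locality
arguments can be illustrated with a minimal sketch.
The function names
.Fn sum_prefetch
and
.Fn scale_in_place ,
as well as the prefetch distance
.Dv PREFETCH_AHEAD ,
are hypothetical and chosen only for illustration;
a suitable distance depends on the hardware and should be found by measurement.
.Bd -literal -offset indent
```c
#include <stddef.h>
#include <stdint.h>

/* Assumed prefetch distance in elements; not a recommended value. */
#define PREFETCH_AHEAD	16

/* Read-only streaming pass: each element is used once. */
uint64_t
sum_prefetch(const uint32_t *v, size_t n)
{
	uint64_t sum = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (i + PREFETCH_AHEAD < n) {
			/* rw = 0 (read), locality = 0 (no reuse expected) */
			__builtin_prefetch(&v[i + PREFETCH_AHEAD], 0, 0);
		}
		sum += v[i];
	}
	return sum;
}

/* Read-modify-write pass: the data is about to be stored to. */
void
scale_in_place(uint32_t *v, size_t n, uint32_t factor)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (i + PREFETCH_AHEAD < n) {
			/* rw = 1 (write), locality = 3 (keep in cache) */
			__builtin_prefetch(&v[i + PREFETCH_AHEAD], 1, 3);
		}
		v[i] *= factor;
	}
}
```
.Ed
.Pp
Whether such hints help at all depends on the workload and the processor;
simple sequential access like this is often already handled well by
hardware prefetchers, so any gain should be confirmed by measurement.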
.Sh SEE ALSO
.Xr gcc 1 ,
.Xr attribute 3
.Rs
.%A Ulrich Drepper
.%T What Every Programmer Should Know About Memory
.%D November 21, 2007
.%U https://www.akkadia.org/drepper/cpumemory.pdf
.Re
.Sh CAVEATS
This is a non-standard, compiler-specific extension.
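.Pp
Code that must also build with other compilers can hide the extension
behind a macro.
The following is only a sketch of one possible approach;
the names
.Dv PREFETCH_READ
and
.Fn first_element
are hypothetical and are not provided by any system header.
.Bd -literal -offset indent
```c
/*
 * GCC and compatible compilers define __GNUC__.
 * Elsewhere the macro only evaluates its argument.
 */
#if defined(__GNUC__)
#define PREFETCH_READ(p)	__builtin_prefetch((p))
#else
#define PREFETCH_READ(p)	((void)(p))
#endif

/* Hypothetical caller, present only to show the macro in use. */
int
first_element(const int *v)
{
	PREFETCH_READ(v);
	return v[0];
}
```
.Ed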