.\"	$NetBSD: membar_ops.3,v 1.10 2022/04/09 23:32:52 riastradh Exp $
.\"
.\" Copyright (c) 2007, 2008 The NetBSD Foundation, Inc.
.\" All rights reserved.
.\"
.\" This code is derived from software contributed to The NetBSD Foundation
.\" by Jason R. Thorpe.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\" notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in the
.\" documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE NETBSD FOUNDATION, INC. AND CONTRIBUTORS
.\" ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
.\" TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
.\" PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE FOUNDATION OR CONTRIBUTORS
.\" BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
.\" CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
.\" SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
.\" INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
.\" CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
.\" ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
.\" POSSIBILITY OF SUCH DAMAGE.
.\"
.Dd March 30, 2022
.Dt MEMBAR_OPS 3
.Os
.Sh NAME
.Nm membar_ops ,
.Nm membar_acquire ,
.Nm membar_release ,
.Nm membar_producer ,
.Nm membar_consumer ,
.Nm membar_datadep_consumer ,
.Nm membar_sync
.Nd memory ordering barriers
.\" .Sh LIBRARY
.\" .Lb libc
.Sh SYNOPSIS
.In sys/atomic.h
.\"
.Ft void
.Fn membar_acquire "void"
.Ft void
.Fn membar_release "void"
.Ft void
.Fn membar_producer "void"
.Ft void
.Fn membar_consumer "void"
.Ft void
.Fn membar_datadep_consumer "void"
.Ft void
.Fn membar_sync "void"
.Sh DESCRIPTION
The
.Nm
family of functions prevents reordering of memory operations, as needed
for synchronization in multiprocessor execution environments that have
relaxed load and store order.
.Pp
In general, memory barriers must come in pairs \(em a barrier on one
CPU, such as
.Fn membar_release ,
must pair with a barrier on another CPU, such as
.Fn membar_acquire ,
in order to synchronize anything between the two CPUs.
Code using
.Nm
should generally be annotated with comments identifying how the
barriers are paired.
.Pp
The
.Nm
functions affect only operations on regular memory, not on device
memory; see
.Xr bus_space 9
and
.Xr bus_dma 9
for machine-independent interfaces for handling device memory and DMA
operations in device drivers.
.Pp
Unlike C11,
.Em all
memory operations \(em that is, all loads and stores on regular
memory \(em are affected by
.Nm ,
not just C11 atomic operations on
.Vt _Atomic\^ Ns -qualified
objects.
.Bl -tag -width abcd
.It Fn membar_acquire
Any load preceding
.Fn membar_acquire
will happen before all memory operations following it.
.Pp
A load followed by a
.Fn membar_acquire
implies a
.Em load-acquire
operation in the language of C11.
.Fn membar_acquire
should only be used after atomic read/modify/write operations, such as
.Xr atomic_cas_uint 3 .
For regular loads, instead of
.Li "x = *p; membar_acquire()" ,
you should use
.Li "x = atomic_load_acquire(p)" .
.Pp
.Fn membar_acquire
is typically used in code that implements locking primitives to ensure
that a lock protects its data, and is typically paired with
.Fn membar_release ;
see below for an example.
.It Fn membar_release
All memory operations preceding
.Fn membar_release
will happen before any store that follows it.
.Pp
A
.Fn membar_release
followed by a store implies a
.Em store-release
operation in the language of C11.
.Fn membar_release
should only be used before atomic read/modify/write operations, such as
.Xr atomic_inc_uint 3 .
For regular stores, instead of
.Li "membar_release(); *p = x" ,
you should use
.Li "atomic_store_release(p, x)" .
.Pp
.Fn membar_release
is typically paired with
.Fn membar_acquire ,
and is typically used in code that implements locking or reference
counting primitives.
Releasing a lock or reference count should use
.Fn membar_release ,
and acquiring a lock or handling an object after draining references
should use
.Fn membar_acquire ,
so that whatever happened before releasing will also have happened
before acquiring.
For example:
.Bd -literal -offset abcdefgh
/* thread A -- release a reference */
obj->state.mumblefrotz = 42;
KASSERT(valid(&obj->state));
membar_release();
atomic_dec_uint(&obj->refcnt);

/*
 * thread B -- busy-wait until last reference is released,
 * then lock it by setting refcnt to UINT_MAX
 */
while (atomic_cas_uint(&obj->refcnt, 0, -1) != 0)
	continue;
membar_acquire();
KASSERT(valid(&obj->state));
obj->state.mumblefrotz--;
.Ed
.Pp
In this example,
.Em if
the load in
.Fn atomic_cas_uint
in thread B witnesses the store in
.Fn atomic_dec_uint
in thread A setting the reference count to zero,
.Em then
everything in thread A before the
.Fn membar_release
is guaranteed to happen before everything in thread B after the
.Fn membar_acquire ,
as if the machine had sequentially executed:
.Bd -literal -offset abcdefgh
obj->state.mumblefrotz = 42;	/* from thread A */
KASSERT(valid(&obj->state));
\&...
KASSERT(valid(&obj->state));	/* from thread B */
obj->state.mumblefrotz--;
.Ed
.Pp
.Fn membar_release
followed by a store, serving as a
.Em store-release
operation, may also be paired with a subsequent load followed by
.Fn membar_acquire ,
serving as the corresponding
.Em load-acquire
operation.
However, you should use
.Xr atomic_store_release 9
and
.Xr atomic_load_acquire 9
instead in that situation, unless the store is an atomic
read/modify/write which requires a separate
.Fn membar_release .
.It Fn membar_producer
All stores preceding
.Fn membar_producer
will happen before any stores following it.
.Pp
.Fn membar_producer
has no analogue in C11.
.Pp
.Fn membar_producer
is typically used in code that produces data for read-only consumers
which use
.Fn membar_consumer ,
such as
.Sq seqlocked
snapshots of statistics; see below for an example.
.It Fn membar_consumer
All loads preceding
.Fn membar_consumer
will happen before any loads following it.
.Pp
.Fn membar_consumer
has no analogue in C11.
.Pp
.Fn membar_consumer
is typically used in code that reads data from producers which use
.Fn membar_producer ,
such as
.Sq seqlocked
snapshots of statistics.
For example:
.Bd -literal
struct {
	/* version number and in-progress bit */
	unsigned	seq;

	/* read-only statistics, too large for atomic load */
	unsigned	foo;
	int		bar;
	uint64_t	baz;
} *stats;

	/* producer (must be serialized, e.g. with mutex(9)) */
	stats->seq |= 1;	/* mark update in progress */
	membar_producer();
	stats->foo = count_foo();
	stats->bar = measure_bar();
	stats->baz = enumerate_baz();
	membar_producer();
	stats->seq++;		/* bump version number */

	/* consumer (in parallel w/ producer, other consumers) */
restart:
	while ((seq = stats->seq) & 1)	/* wait for update */
		SPINLOCK_BACKOFF_HOOK;
	membar_consumer();
	foo = stats->foo;	/* read out a candidate snapshot */
	bar = stats->bar;
	baz = stats->baz;
	membar_consumer();
	if (seq != stats->seq)	/* try again if version changed */
		goto restart;
.Ed
.It Fn membar_datadep_consumer
Same as
.Fn membar_consumer ,
but limited to loads of addresses dependent on prior loads, or
.Sq data-dependent
loads:
.Bd -literal -offset indent
int **pp, *p, v;

p = *pp;
membar_datadep_consumer();
v = *p;
consume(v);
.Ed
.Pp
.Fn membar_datadep_consumer
is typically paired with
.Fn membar_release
by code that initializes an object before publishing it.
However, you should use
.Xr atomic_store_release 9
and
.Xr atomic_load_consume 9
instead, to avoid obscure edge cases in case the consumer is not
read-only.
.Pp
.Fn membar_datadep_consumer
does not guarantee ordering of loads in branches, or
.Sq control-dependent
loads \(em you must use
.Fn membar_consumer
instead:
.Bd -literal -offset indent
int *ok, *p, v;

if (*ok) {
	membar_consumer();
	v = *p;
	consume(v);
}
.Ed
.Pp
Most CPUs do not reorder data-dependent loads (i.e., most CPUs
guarantee that cached values are not stale in that case), so
.Fn membar_datadep_consumer
is a no-op on those CPUs.
.It Fn membar_sync
All memory operations preceding
.Fn membar_sync
will happen before any memory operations following it.
.Pp
.Fn membar_sync
is a sequentially consistent acquire/release barrier, analogous to
.Li "atomic_thread_fence(memory_order_seq_cst)"
in C11.
.Pp
.Fn membar_sync
is typically paired with
.Fn membar_sync .
.Pp
.Fn membar_sync
is typically not needed except in exotic synchronization schemes,
such as Dekker's algorithm, that require store-before-load ordering.
If you are tempted to reach for it, see whether there is another way
to do what you're trying to do first.
.El
.Sh DEPRECATED MEMORY BARRIERS
The following memory barriers are deprecated.
They were imported from Solaris, which describes them as providing
ordering relative to
.Sq lock acquisition ,
but the documentation in
.Nx
disagreed with the implementation and usage on the semantics.
.Bl -tag -width abcd
.It Fn membar_enter
Originally documented as store-before-load/store, this was instead
implemented as load-before-load/store on some platforms, which is what
essentially all uses relied on.
Now it is implemented as an alias for
.Fn membar_sync
everywhere, meaning a full load/store-before-load/store sequential
consistency barrier, in order to guarantee both what the documentation
claimed
.Em and
what the implementation actually did.
.Pp
New code should use
.Fn membar_acquire
for load-before-load/store ordering, which is what most uses need, or
.Fn membar_sync
for store-before-load/store ordering, which typically appears only in
exotic synchronization schemes such as Dekker's algorithm.
.It Fn membar_exit
Alias for
.Fn membar_release .
This was originally meant to be paired with
.Fn membar_enter .
.Pp
New code should use
.Fn membar_release
instead.
.El
.Sh SEE ALSO
.Xr atomic_ops 3 ,
.Xr atomic_loadstore 9 ,
.Xr bus_dma 9 ,
.Xr bus_space 9
.Sh HISTORY
The
.Nm membar_ops
functions first appeared in
.Nx 5.0 .
.Pp
The data-dependent load barrier,
.Fn membar_datadep_consumer ,
first appeared in
.Nx 7.0 .
.Pp
The
.Fn membar_acquire
and
.Fn membar_release
functions first appeared, and the
.Fn membar_enter
and
.Fn membar_exit
functions were deprecated, in
.Nx 10.0 .
394