1.\" $NetBSD: membar_ops.3,v 1.10 2022/04/09 23:32:52 riastradh Exp $ 2.\" 3.\" Copyright (c) 2007, 2008 The NetBSD Foundation, Inc. 4.\" All rights reserved. 5.\" 6.\" This code is derived from software contributed to The NetBSD Foundation 7.\" by Jason R. Thorpe. 8.\" 9.\" Redistribution and use in source and binary forms, with or without 10.\" modification, are permitted provided that the following conditions 11.\" are met: 12.\" 1. Redistributions of source code must retain the above copyright 13.\" notice, this list of conditions and the following disclaimer. 14.\" 2. Redistributions in binary form must reproduce the above copyright 15.\" notice, this list of conditions and the following disclaimer in the 16.\" documentation and/or other materials provided with the distribution. 17.\" 18.\" THIS SOFTWARE IS PROVIDED BY THE NETBSD FOUNDATION, INC. AND CONTRIBUTORS 19.\" ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED 20.\" TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR 21.\" PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE FOUNDATION OR CONTRIBUTORS 22.\" BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR 23.\" CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF 24.\" SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS 25.\" INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN 26.\" CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) 27.\" ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE 28.\" POSSIBILITY OF SUCH DAMAGE. 29.\" 30.Dd March 30, 2022 31.Dt MEMBAR_OPS 3 32.Os 33.Sh NAME 34.Nm membar_ops , 35.Nm membar_acquire , 36.Nm membar_release , 37.Nm membar_producer , 38.Nm membar_consumer , 39.Nm membar_datadep_consumer , 40.Nm membar_sync 41.Nd memory ordering barriers 42.\" .Sh LIBRARY 43.\" .Lb libc 44.Sh SYNOPSIS 45.In sys/atomic.h 46.\" 47.Ft void 48.Fn membar_acquire "void" 49.Ft void 50.Fn membar_release "void" 51.Ft void 52.Fn membar_producer "void" 53.Ft void 54.Fn membar_consumer "void" 55.Ft void 56.Fn membar_datadep_consumer "void" 57.Ft void 58.Fn membar_sync "void" 59.Sh DESCRIPTION 60The 61.Nm 62family of functions prevent reordering of memory operations, as needed 63for synchronization in multiprocessor execution environments that have 64relaxed load and store order. 65.Pp 66In general, memory barriers must come in pairs \(em a barrier on one 67CPU, such as 68.Fn membar_release , 69must pair with a barrier on another CPU, such as 70.Fn membar_acquire , 71in order to synchronize anything between the two CPUs. 72Code using 73.Nm 74should generally be annotated with comments identifying how they are 75paired. 76.Pp 77.Nm 78affect only operations on regular memory, not on device 79memory; see 80.Xr bus_space 9 81and 82.Xr bus_dma 9 83for machine-independent interfaces to handling device memory and DMA 84operations for device drivers. 85.Pp 86Unlike C11, 87.Em all 88memory operations \(em that is, all loads and stores on regular 89memory \(em are affected by 90.Nm , 91not just C11 atomic operations on 92.Vt _Atomic\^ Ns -qualified 93objects. 94.Bl -tag -width abcd 95.It Fn membar_acquire 96Any load preceding 97.Fn membar_acquire 98will happen before all memory operations following it. 99.Pp 100A load followed by a 101.Fn membar_acquire 102implies a 103.Em load-acquire 104operation in the language of C11. 
.It Fn membar_release
All memory operations preceding
.Fn membar_release
will happen before any store that follows it.
.Pp
A
.Fn membar_release
followed by a store implies a
.Em store-release
operation in the language of C11.
.Fn membar_release
should only be used before atomic read/modify/write, such as
.Xr atomic_inc_uint 3 .
For regular stores, instead of
.Li "membar_release(); *p = x" ,
you should use
.Li "atomic_store_release(p, x)" .
.Pp
.Fn membar_release
is typically paired with
.Fn membar_acquire ,
and is typically used in code that implements locking or reference
counting primitives.
Releasing a lock or reference count should use
.Fn membar_release ,
and acquiring a lock or handling an object after draining references
should use
.Fn membar_acquire ,
so that whatever happened before releasing will also have happened
before acquiring.
For example:
.Bd -literal -offset abcdefgh
/* thread A -- release a reference */
obj->state.mumblefrotz = 42;
KASSERT(valid(&obj->state));
membar_release();
atomic_dec_uint(&obj->refcnt);

/*
 * thread B -- busy-wait until last reference is released,
 * then lock it by setting refcnt to UINT_MAX
 */
while (atomic_cas_uint(&obj->refcnt, 0, -1) != 0)
	continue;
membar_acquire();
KASSERT(valid(&obj->state));
obj->state.mumblefrotz--;
.Ed
.Pp
In this example,
.Em if
the load in
.Fn atomic_cas_uint
in thread B witnesses the store in
.Fn atomic_dec_uint
in thread A setting the reference count to zero,
.Em then
everything in thread A before the
.Fn membar_release
is guaranteed to happen before everything in thread B after the
.Fn membar_acquire ,
as if the machine had sequentially executed:
.Bd -literal -offset abcdefgh
obj->state.mumblefrotz = 42;	/* from thread A */
KASSERT(valid(&obj->state));
\&...
KASSERT(valid(&obj->state));	/* from thread B */
obj->state.mumblefrotz--;
.Ed
.Pp
.Fn membar_release
followed by a store, serving as a
.Em store-release
operation, may also be paired with a subsequent load followed by
.Fn membar_acquire ,
serving as the corresponding
.Em load-acquire
operation.
However, you should use
.Xr atomic_store_release 9
and
.Xr atomic_load_acquire 9
instead in that situation, unless the store is an atomic
read/modify/write which requires a separate
.Fn membar_release .
.It Fn membar_producer
All stores preceding
.Fn membar_producer
will happen before any stores following it.
.Pp
.Fn membar_producer
has no analogue in C11.
.Pp
.Fn membar_producer
is typically used in code that produces data for read-only consumers
which use
.Fn membar_consumer ,
such as
.Sq seqlocked
snapshots of statistics; see below for an example.
.It Fn membar_consumer
All loads preceding
.Fn membar_consumer
will complete before any loads after it.
.Pp
.Fn membar_consumer
has no analogue in C11.
.Pp
.Fn membar_consumer
is typically used in code that reads data from producers which use
.Fn membar_producer ,
such as
.Sq seqlocked
snapshots of statistics.
For example:
.Bd -literal
struct {
	/* version number and in-progress bit */
	unsigned seq;

	/* read-only statistics, too large for atomic load */
	unsigned foo;
	int bar;
	uint64_t baz;
} *stats;

	/* producer (must be serialized, e.g. with mutex(9)) */
	stats->seq |= 1;	/* mark update in progress */
	membar_producer();
	stats->foo = count_foo();
	stats->bar = measure_bar();
	stats->baz = enumerate_baz();
	membar_producer();
	stats->seq++;		/* bump version number */

	/* consumer (in parallel w/ producer, other consumers) */
restart:
	while ((seq = stats->seq) & 1)	/* wait for update */
		SPINLOCK_BACKOFF_HOOK;
	membar_consumer();
	foo = stats->foo;	/* read out a candidate snapshot */
	bar = stats->bar;
	baz = stats->baz;
	membar_consumer();
	if (seq != stats->seq)	/* try again if version changed */
		goto restart;
.Ed
.It Fn membar_datadep_consumer
Same as
.Fn membar_consumer ,
but limited to loads of addresses dependent on prior loads, or
.Sq data-dependent
loads:
.Bd -literal -offset indent
int **pp, *p, v;

p = *pp;
membar_datadep_consumer();
v = *p;
consume(v);
.Ed
.Pp
.Fn membar_datadep_consumer
is typically paired with
.Fn membar_release
by code that initializes an object before publishing it.
However, you should use
.Xr atomic_store_release 9
and
.Xr atomic_load_consume 9
instead, to avoid obscure edge cases in case the consumer is not
read-only.
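.Pp
For example, initializing an object and then publishing it through a
shared pointer with those interfaces might look like this; the
widgetp pointer, the w_count member, and use() are illustrative:
.Bd -literal -offset indent
/* publisher: initialize first, then publish with a release store */
w->w_count = 42;
atomic_store_release(&widgetp, w);

/* consumer: the consume load orders the dependent load after it */
w = atomic_load_consume(&widgetp);
use(w->w_count);
.Ed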
.Pp
.Fn membar_datadep_consumer
does not guarantee ordering of loads in branches, or
.Sq control-dependent
loads \(em you must use
.Fn membar_consumer
instead:
.Bd -literal -offset indent
int *ok, *p, v;

if (*ok) {
	membar_consumer();
	v = *p;
	consume(v);
}
.Ed
.Pp
Most CPUs do not reorder data-dependent loads (i.e., most CPUs
guarantee that cached values are not stale in that case), so
.Fn membar_datadep_consumer
is a no-op on those CPUs.
.It Fn membar_sync
All memory operations preceding
.Fn membar_sync
will happen before any memory operations following it.
.Pp
.Fn membar_sync
is a sequential consistency acquire/release barrier, analogous to
.Li "atomic_thread_fence(memory_order_seq_cst)"
in C11.
.Pp
.Fn membar_sync
is typically paired with
.Fn membar_sync .
.Pp
.Fn membar_sync
is typically not needed except in exotic synchronization schemes like
Dekker's algorithm that require store-before-load ordering.
If you are tempted to reach for it, see if there is another way to do
what you're trying to do first.
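.Pp
For example, each side of a Dekker-style handshake stores its own flag
and then checks the other side's flag, which requires ordering the
store before the load; the flag array and the thread indices i and j
are illustrative:
.Bd -literal -offset indent
/* thread i announces intent, then checks thread j */
atomic_store_relaxed(&flag[i], 1);
membar_sync();		/* order the store before the load */
if (atomic_load_relaxed(&flag[j]) == 0) {
	/* thread j has not announced intent; proceed */
} else {
	/* both announced; fall back to arbitration */
}
.Ed
Each thread's
.Fn membar_sync
here pairs with the other thread's
.Fn membar_sync .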
.El
.Sh DEPRECATED MEMORY BARRIERS
The following memory barriers are deprecated.
They were imported from Solaris, which describes them as providing
ordering relative to
.Sq lock acquisition ,
but in
.Nx
the documentation, the implementation, and actual usage disagreed on
the semantics.
.Bl -tag -width abcd
.It Fn membar_enter
Originally documented as store-before-load/store, this was instead
implemented as load-before-load/store on some platforms, which is what
essentially all uses relied on.
Now this is implemented as an alias for
.Fn membar_sync
everywhere, meaning a full load/store-before-load/store sequential
consistency barrier, in order to guarantee what the documentation
claimed
.Em and
what the implementation actually did.
.Pp
New code should use
.Fn membar_acquire
for load-before-load/store ordering, which is what most uses need, or
.Fn membar_sync
for store-before-load/store ordering, which typically only appears in
exotic synchronization schemes like Dekker's algorithm.
.It Fn membar_exit
Alias for
.Fn membar_release .
This was originally meant to be paired with
.Fn membar_enter .
.Pp
New code should use
.Fn membar_release
instead.
.El
.Sh SEE ALSO
.Xr atomic_ops 3 ,
.Xr atomic_loadstore 9 ,
.Xr bus_dma 9 ,
.Xr bus_space 9
.Sh HISTORY
The
.Nm membar_ops
functions first appeared in
.Nx 5.0 .
.Pp
The data-dependent load barrier,
.Fn membar_datadep_consumer ,
first appeared in
.Nx 7.0 .
.Pp
The
.Fn membar_acquire
and
.Fn membar_release
functions first appeared, and the
.Fn membar_enter
and
.Fn membar_exit
functions were deprecated, in
.Nx 10.0 .