1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
|
.\" $NetBSD: membar_ops.3,v 1.10 2022/04/09 23:32:52 riastradh Exp $
.\"
.\" Copyright (c) 2007, 2008 The NetBSD Foundation, Inc.
.\" All rights reserved.
.\"
.\" This code is derived from software contributed to The NetBSD Foundation
.\" by Jason R. Thorpe.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\" notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in the
.\" documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE NETBSD FOUNDATION, INC. AND CONTRIBUTORS
.\" ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
.\" TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
.\" PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE FOUNDATION OR CONTRIBUTORS
.\" BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
.\" CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
.\" SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
.\" INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
.\" CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
.\" ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
.\" POSSIBILITY OF SUCH DAMAGE.
.\"
.Dd March 30, 2022
.Dt MEMBAR_OPS 3
.Os
.Sh NAME
.Nm membar_ops ,
.Nm membar_acquire ,
.Nm membar_release ,
.Nm membar_producer ,
.Nm membar_consumer ,
.Nm membar_datadep_consumer ,
.Nm membar_sync
.Nd memory ordering barriers
.\" .Sh LIBRARY
.\" .Lb libc
.Sh SYNOPSIS
.In sys/atomic.h
.\"
.Ft void
.Fn membar_acquire "void"
.Ft void
.Fn membar_release "void"
.Ft void
.Fn membar_producer "void"
.Ft void
.Fn membar_consumer "void"
.Ft void
.Fn membar_datadep_consumer "void"
.Ft void
.Fn membar_sync "void"
.Sh DESCRIPTION
The
.Nm
family of functions prevent reordering of memory operations, as needed
for synchronization in multiprocessor execution environments that have
relaxed load and store order.
.Pp
In general, memory barriers must come in pairs \(em a barrier on one
CPU, such as
.Fn membar_release ,
must pair with a barrier on another CPU, such as
.Fn membar_acquire ,
in order to synchronize anything between the two CPUs.
Code using
.Nm
should generally be annotated with comments identifying how they are
paired.
.Pp
.Nm
affect only operations on regular memory, not on device
memory; see
.Xr bus_space 9
and
.Xr bus_dma 9
for machine-independent interfaces to handling device memory and DMA
operations for device drivers.
.Pp
Unlike C11,
.Em all
memory operations \(em that is, all loads and stores on regular
memory \(em are affected by
.Nm ,
not just C11 atomic operations on
.Vt _Atomic\^ Ns -qualified
objects.
.Bl -tag -width abcd
.It Fn membar_acquire
Any load preceding
.Fn membar_acquire
will happen before all memory operations following it.
.Pp
A load followed by a
.Fn membar_acquire
implies a
.Em load-acquire
operation in the language of C11.
.Fn membar_acquire
should only be used after atomic read/modify/write, such as
.Xr atomic_cas_uint 3 .
For regular loads, instead of
.Li "x = *p; membar_acquire()" ,
you should use
.Li "x = atomic_load_acquire(p)" .
.Pp
.Fn membar_acquire
is typically used in code that implements locking primitives to ensure
that a lock protects its data, and is typically paired with
.Fn membar_release ;
see below for an example.
.It Fn membar_release
All memory operations preceding
.Fn membar_release
will happen before any store that follows it.
.Pp
A
.Fn membar_release
followed by a store implies a
.Em store-release
operation in the language of C11.
.Fn membar_release
should only be used before atomic read/modify/write, such as
.Xr atomic_inc_uint 3 .
For regular stores, instead of
.Li "membar_release(); *p = x" ,
you should use
.Li "atomic_store_release(p, x)" .
.Pp
.Fn membar_release
is typically paired with
.Fn membar_acquire ,
and is typically used in code that implements locking or reference
counting primitives.
Releasing a lock or reference count should use
.Fn membar_release ,
and acquiring a lock or handling an object after draining references
should use
.Fn membar_acquire ,
so that whatever happened before releasing will also have happened
before acquiring.
For example:
.Bd -literal -offset abcdefgh
/* thread A -- release a reference */
obj->state.mumblefrotz = 42;
KASSERT(valid(&obj->state));
membar_release();
atomic_dec_uint(&obj->refcnt);
/*
* thread B -- busy-wait until last reference is released,
* then lock it by setting refcnt to UINT_MAX
*/
while (atomic_cas_uint(&obj->refcnt, 0, -1) != 0)
continue;
membar_acquire();
KASSERT(valid(&obj->state));
obj->state.mumblefrotz--;
.Ed
.Pp
In this example,
.Em if
the load in
.Fn atomic_cas_uint
in thread B witnesses the store in
.Fn atomic_dec_uint
in thread A setting the reference count to zero,
.Em then
everything in thread A before the
.Fn membar_release
is guaranteed to happen before everything in thread B after the
.Fn membar_acquire ,
as if the machine had sequentially executed:
.Bd -literal -offset abcdefgh
obj->state.mumblefrotz = 42; /* from thread A */
KASSERT(valid(&obj->state));
\&...
KASSERT(valid(&obj->state)); /* from thread B */
obj->state.mumblefrotz--;
.Ed
.Pp
.Fn membar_release
followed by a store, serving as a
.Em store-release
operation, may also be paired with a subsequent load followed by
.Fn membar_acquire ,
serving as the corresponding
.Em load-acquire
operation.
However, you should use
.Xr atomic_store_release 9
and
.Xr atomic_load_acquire 9
instead in that situation, unless the store is an atomic
read/modify/write which requires a separate
.Fn membar_release .
.It Fn membar_producer
All stores preceding
.Fn membar_producer
will happen before any stores following it.
.Pp
.Fn membar_producer
has no analogue in C11.
.Pp
.Fn membar_producer
is typically used in code that produces data for read-only consumers
which use
.Fn membar_consumer ,
such as
.Sq seqlocked
snapshots of statistics; see below for an example.
.It Fn membar_consumer
All loads preceding
.Fn membar_consumer
will complete before any loads after it.
.Pp
.Fn membar_consumer
has no analogue in C11.
.Pp
.Fn membar_consumer
is typically used in code that reads data from producers which use
.Fn membar_producer ,
such as
.Sq seqlocked
snapshots of statistics.
For example:
.Bd -literal
struct {
/* version number and in-progress bit */
unsigned seq;
/* read-only statistics, too large for atomic load */
unsigned foo;
int bar;
uint64_t baz;
} stats;
/* producer (must be serialized, e.g. with mutex(9)) */
stats->seq |= 1; /* mark update in progress */
membar_producer();
stats->foo = count_foo();
stats->bar = measure_bar();
stats->baz = enumerate_baz();
membar_producer();
stats->seq++; /* bump version number */
/* consumer (in parallel w/ producer, other consumers) */
restart:
while ((seq = stats->seq) & 1) /* wait for update */
SPINLOCK_BACKOFF_HOOK;
membar_consumer();
foo = stats->foo; /* read out a candidate snapshot */
bar = stats->bar;
baz = stats->baz;
membar_consumer();
if (seq != stats->seq) /* try again if version changed */
goto restart;
.Ed
.It Fn membar_datadep_consumer
Same as
.Fn membar_consumer ,
but limited to loads of addresses dependent on prior loads, or
.Sq data-dependent
loads:
.Bd -literal -offset indent
int **pp, *p, v;
p = *pp;
membar_datadep_consumer();
v = *p;
consume(v);
.Ed
.Pp
.Fn membar_datadep_consumer
is typically paired with
.Fn membar_release
by code that initializes an object before publishing it.
However, you should use
.Xr atomic_store_release 9
and
.Xr atomic_load_consume 9
instead, to avoid obscure edge cases in case the consumer is not
read-only.
.Pp
.Fn membar_datadep_consumer
does not guarantee ordering of loads in branches, or
.Sq control-dependent
loads \(em you must use
.Fn membar_consumer
instead:
.Bd -literal -offset indent
int *ok, *p, v;
if (*ok) {
membar_consumer();
v = *p;
consume(v);
}
.Ed
.Pp
Most CPUs do not reorder data-dependent loads (i.e., most CPUs
guarantee that cached values are not stale in that case), so
.Fn membar_datadep_consumer
is a no-op on those CPUs.
.It Fn membar_sync
All memory operations preceding
.Fn membar_sync
will happen before any memory operations following it.
.Pp
.Fn membar_sync
is a sequential consistency acquire/release barrier, analogous to
.Li "atomic_thread_fence(memory_order_seq_cst)"
in C11.
.Pp
.Fn membar_sync
is typically paired with
.Fn membar_sync .
.Pp
.Fn membar_sync
is typically not needed except in exotic synchronization schemes like
Dekker's algorithm that require store-before-load ordering.
If you are tempted to reach for it, see if there is another way to do
what you're trying to do first.
.El
.Sh DEPRECATED MEMORY BARRIERS
The following memory barriers are deprecated.
They were imported from Solaris, which describes them as providing
ordering relative to
.Sq lock acquisition ,
but the documentation in
.Nx
disagreed with the implementation and use on the semantics.
.Bl -tag -width abcd
.It Fn membar_enter
Originally documented as store-before-load/store, this was instead
implemented as load-before-load/store on some platforms, which is what
essentially all uses relied on.
Now this is implemented as an alias for
.Fn membar_sync
everywhere, meaning a full load/store-before-load/store sequential
consistency barrier, in order to guarantee what the documentation
claimed
.Em and
what the implementation actually did.
.Pp
New code should use
.Fn membar_acquire
for load-before-load/store ordering, which is what most uses need, or
.Fn membar_sync
for store-before-load/store ordering, which typically only appears in
exotic synchronization schemes like Dekker's algorithm.
.It Fn membar_exit
Alias for
.Fn membar_release .
This was originally meant to be paired with
.Fn membar_enter .
.Pp
New code should use
.Fn membar_release
instead.
.El
.Sh SEE ALSO
.Xr atomic_ops 3 ,
.Xr atomic_loadstore 9 ,
.Xr bus_dma 9 ,
.Xr bus_space 9
.Sh HISTORY
The
.Nm membar_ops
functions first appeared in
.Nx 5.0 .
.Pp
The data-dependent load barrier,
.Fn membar_datadep_consumer ,
first appeared in
.Nx 7.0 .
.Pp
The
.Fn membar_acquire
and
.Fn membar_release
functions first appeared, and the
.Fn membar_enter
and
.Fn membar_exit
functions were deprecated, in
.Nx 10.0 .
|