1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
|
.\" $NetBSD: condvar.9,v 1.31 2023/09/07 20:01:43 ad Exp $
.\"
.\" Copyright (c) 2006, 2007, 2008, 2020, 2023 The NetBSD Foundation, Inc.
.\" All rights reserved.
.\"
.\" This code is derived from software contributed to The NetBSD Foundation
.\" by Andrew Doran.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\" notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in the
.\" documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE NETBSD FOUNDATION, INC. AND CONTRIBUTORS
.\" ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
.\" TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
.\" PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE FOUNDATION OR CONTRIBUTORS
.\" BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
.\" CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
.\" SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
.\" INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
.\" CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
.\" ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
.\" POSSIBILITY OF SUCH DAMAGE.
.\"
.Dd September 7, 2023
.Dt CONDVAR 9
.Os
.Sh NAME
.Nm cv ,
.Nm condvar ,
.Nm cv_init ,
.Nm cv_destroy ,
.Nm cv_wait ,
.Nm cv_wait_sig ,
.Nm cv_timedwait ,
.Nm cv_timedwait_sig ,
.Nm cv_timedwaitbt ,
.Nm cv_timedwaitbt_sig ,
.Nm cv_signal ,
.Nm cv_broadcast ,
.Nm cv_has_waiters
.Nd condition variables
.Sh SYNOPSIS
.In sys/condvar.h
.Ft void
.Fn cv_init "kcondvar_t *cv" "const char *wmesg"
.Ft void
.Fn cv_destroy "kcondvar_t *cv"
.Ft void
.Fn cv_wait "kcondvar_t *cv" "kmutex_t *mtx"
.Ft int
.Fn cv_wait_sig "kcondvar_t *cv" "kmutex_t *mtx"
.Ft int
.Fn cv_timedwait "kcondvar_t *cv" "kmutex_t *mtx" "int ticks"
.Ft int
.Fn cv_timedwait_sig "kcondvar_t *cv" "kmutex_t *mtx" "int ticks"
.Ft int
.Fn cv_timedwaitbt "kcondvar_t *cv" "kmutex_t *mtx" "struct bintime *bt" \
"const struct bintime *epsilon"
.Ft int
.Fn cv_timedwaitbt_sig "kcondvar_t *cv" "kmutex_t *mtx" "struct bintime *bt" \
"const struct bintime *epsilon"
.Ft void
.Fn cv_signal "kcondvar_t *cv"
.Ft void
.Fn cv_broadcast "kcondvar_t *cv"
.Ft bool
.Fn cv_has_waiters "kcondvar_t *cv"
.Pp
.Cd "options DIAGNOSTIC"
.Cd "options LOCKDEBUG"
.Sh DESCRIPTION
Condition variables (CVs) are used in the kernel to synchronize access
to resources that are limited (for example, memory) and to wait for
pending I/O operations to complete.
.Pp
The
.Vt kcondvar_t
type provides storage for the CV object.
This should be treated as an opaque object and not examined directly by
consumers.
.Sh OPTIONS
.Bl -tag -width abcd
.It Cd "options DIAGNOSTIC"
.Pp
Kernels compiled with the
.Dv DIAGNOSTIC
option perform basic sanity checks on CV operations.
.It Cd "options LOCKDEBUG"
.Pp
Kernels compiled with the
.Dv LOCKDEBUG
option perform potentially CPU intensive sanity checks
on CV operations.
.El
.Sh FUNCTIONS
.Bl -tag -width abcd
.It Fn cv_init "cv" "wmesg"
.Pp
Initialize a CV for use.
No other operations can be performed on the CV until it has been initialized.
.Pp
The
.Fa wmesg
argument specifies a string of no more than 8 characters that describes
the resource or condition associated with the CV.
The kernel does not use this argument directly but makes it available for
utilities such as
.Xr ps 1
to display.
.It Fn cv_destroy "cv"
.Pp
Release resources used by a CV.
If there could be waiters, they should be awoken first with
.Fn cv_broadcast .
The CV must not be used afterwards.
.It Fn cv_wait "cv" "mtx"
.Pp
Cause the current LWP to wait non-interruptably for access to a resource,
or for an I/O operation to complete.
The LWP will resume execution when awoken by another thread using
.Fn cv_signal
or
.Fn cv_broadcast .
.Pp
.Fa mtx
specifies a kernel mutex to be used as an interlock, and must be held by the
calling LWP on entry to
.Fn cv_wait .
It will be released once the LWP has prepared to sleep, and will be reacquired
before
.Fn cv_wait
returns.
.Pp
A small window exists between testing for availability of a resource and
waiting for the resource with
.Fn cv_wait ,
in which the resource may become available again.
The interlock is used to guarantee that the resource will not be signalled
as available until the calling LWP has begun to wait for it.
.Pp
Non-interruptable waits have the potential to deadlock the system, and so must
be kept short (typically, under one second).
.Pp
.Fn cv_wait
is typically used within a loop or restartable code sequence, because it
may awaken spuriously.
The calling LWP should re-check the condition that caused the wait.
If necessary, the calling LWP may call
.Fn cv_wait
again to continue waiting.
.It Fn cv_wait_sig "cv" "mtx"
.Pp
As per
.Fn cv_wait ,
but causes the current LWP to wait interruptably.
If the LWP receives a signal, or is interrupted by another condition such
as its containing process exiting, the wait is ended early and an error
code returned.
.Pp
If
.Fn cv_wait_sig
returns as a result of a signal, the return value is
.Er ERESTART
if the signal
has the
.Dv SA_RESTART
property.
If awoken normally, the value is zero, and
.Er EINTR
under all other conditions.
.It Fn cv_timedwait "cv" "mtx" "ticks"
.Pp
As per
.Fn cv_wait ,
but will return early if a timeout specified by the
.Fa ticks
argument expires.
.Pp
.Fa ticks
is an architecture and system dependent value related to the number of
clock interrupts per second.
See
.Xr hz 9
for details.
The
.Xr mstohz 9
macro can be used to convert a timeout expressed in milliseconds to
one suitable for
.Fn cv_timedwait .
If the
.Fa ticks
argument is zero,
.Fn cv_timedwait
behaves exactly like
.Fn cv_wait .
.Pp
If the timeout expires before the LWP is awoken, the return value is
.Er EWOULDBLOCK .
If awoken normally, the return value is zero.
.It Fn cv_timedwait_sig "cv" "mtx" "ticks"
.Pp
As per
.Fn cv_wait_sig ,
but also accepts a timeout value and will return
.Er EWOULDBLOCK
if the timeout expires.
.It Fn cv_timedwaitbt "cv" "mtx" "bt" "epsilon"
.It Fn cv_timedwaitbt_sig "cv" "mtx" "bt" "epsilon"
.Pp
As per
.Fn cv_wait
and
.Fn cv_wait_sig ,
but will return early if the duration
.Fa bt
has elapsed, immediately if
.Fa bt
is zero.
On return,
.Fn cv_timedwaitbt
and
.Fn cv_timedwaitbt_sig
subtract the time elapsed from
.Fa bt
in place, or set it to zero if there is no time remaining.
.Pp
Note that
.Fn cv_timedwaitbt
and
.Fn cv_timedwaitbt_sig
may return zero indicating success, rather than
.Er EWOULDBLOCK ,
even if they set the timeout to zero; this means that the caller must
re-check the condition in order to avoid potentially losing a
.Fn cv_signal ,
but the
.Em next
wait will time out immediately.
.Pp
The hint
.Fa epsilon ,
which can be
.Dv DEFAULT_TIMEOUT_EPSILON
if in doubt, requests that the wakeup not be delayed more than
.Fa bt Li "+" Fa epsilon ,
so that the system can coalesce multiple wakeups within their
respective epsilons into a single high-resolution clock interrupt or
choose to use cheaper low-resolution clock interrupts instead.
.Pp
However, the system is still limited by its best clock interrupt
resolution and by scheduling competition, which may delay the wakeup by
more than
.Fa bt Li "+" Fa epsilon .
.It Fn cv_signal "cv"
.Pp
Awaken one LWP waiting on the specified condition variable.
Where there are waiters sleeping non-interruptaby, more than one
LWP may be awoken.
This can be used to avoid a "thundering herd" problem, where a large
number of LWPs are awoken following an event, but only one LWP can
process the event.
.Pp
The mutex passed to the wait function
.Po Fa mtx Pc
should be held or have been released immediately before
.Fn cv_signal
is called.
.Pp
(Note that
.Fn cv_signal
is erroneously named in that it does not send a signal in the traditional
sense to LWPs waiting on a CV.)
.It Fn cv_broadcast "cv"
.Pp
Awaken all LWPs waiting on the specified condition variable.
.Pp
As with
.Fn cv_signal ,
the mutex passed to the wait function
.Po Fa mtx Pc
should be held or have been released immediately before
.Fn cv_broadcast
is called.
.It Fn cv_has_waiters "cv"
.Pp
Return
.Dv true
if one or more LWPs are waiting on the specified condition variable.
.Pp
.Fn cv_has_waiters
cannot test reliably for interruptable waits.
It should only be used to test for non-interruptable waits
made using
.Fn cv_wait .
.Pp
.Fn cv_has_waiters
should only be used when making diagnostic assertions, and must
be called while holding the interlocking mutex passed to
.Fn cv_wait .
.El
.Sh EXAMPLES
Consuming a resource:
.Bd -literal
/*
* Lock the resource. Its mutex will also serve as the
* interlock.
*/
mutex_enter(&res->mutex);
/*
* Wait for the resource to become available. Timeout after
* five seconds. If the resource is not available within the
* allotted time, return an error.
*/
struct bintime timeout = { .sec = 5, .frac = 0 };
while (res->state == BUSY) {
error = cv_timedwaitbt(&res->condvar,
&res->mutex, &timeout, DEFAULT_TIMEOUT_EPSILON);
if (error) {
KASSERT(error == EWOULDBLOCK);
mutex_exit(&res->mutex);
return ETIMEDOUT;
}
}
/*
* It's now available to us. Take ownership of the
* resource, and consume it.
*/
res->state = BUSY;
mutex_exit(&res->mutex);
consume(res);
.Ed
.Pp
Releasing a resource for the next consumer to use:
.Bd -literal
mutex_enter(&res->mutex);
res->state = IDLE;
cv_signal(&res->condvar);
mutex_exit(&res->mutex);
.Ed
.Sh CODE REFERENCES
The core of the CV implementation is in
.Pa sys/kern/kern_condvar.c .
.Pp
The header file
.Pa sys/sys/condvar.h
describes the public interface.
.Sh SEE ALSO
.Xr sigaction 2 ,
.Xr membar_ops 3 ,
.Xr errno 9 ,
.Xr mstohz 9 ,
.Xr mutex 9 ,
.Xr rwlock 9
.Pp
.Rs
.%A Jim Mauro
.%A Richard McDougall
.%T Solaris Internals: Core Kernel Architecture
.%I Prentice Hall
.%D 2001
.%O ISBN 0-13-022496-0
.Re
.Sh HISTORY
The CV primitives first appeared in
.Nx 5.0 .
The
.Fn cv_timedwaitbt
and
.Fn cv_timedwaitbt_sig
primitives first appeared in
.Nx 9.0 .
|