<table class="head">
<tr>
<td class="head-ltitle">BPF(4)</td>
<td class="head-vol">Device Drivers Manual</td>
<td class="head-rtitle">BPF(4)</td>
</tr>
</table>
<div class="manual-text">
<section class="Sh">
<h1 class="Sh" id="NAME"><a class="permalink" href="#NAME">NAME</a></h1>
<p class="Pp"><code class="Nm">bpf</code> — <span class="Nd">Berkeley
Packet Filter</span></p>
</section>
<section class="Sh">
<h1 class="Sh" id="SYNOPSIS"><a class="permalink" href="#SYNOPSIS">SYNOPSIS</a></h1>
<p class="Pp"><code class="Cd">device bpf</code></p>
</section>
<section class="Sh">
<h1 class="Sh" id="DESCRIPTION"><a class="permalink" href="#DESCRIPTION">DESCRIPTION</a></h1>
<p class="Pp">The Berkeley Packet Filter provides a raw interface to data link
layers in a protocol independent fashion. All packets on the network, even
those destined for other hosts, are accessible through this mechanism.</p>
<p class="Pp">The packet filter appears as a character special device,
<span class="Pa">/dev/bpf</span>. After opening the device, the file
descriptor must be bound to a specific network interface with the
<code class="Dv">BIOCSETIF</code> ioctl. A given interface can be shared by
multiple listeners, and the filter underlying each descriptor will see an
identical packet stream.</p>
<p class="Pp">Associated with each open instance of a
<code class="Nm">bpf</code> file is a user-settable packet filter. Whenever
a packet is received by an interface, all file descriptors listening on that
interface apply their filter. Each descriptor that accepts the packet
receives its own copy.</p>
<p class="Pp">A packet can be sent out on the network by writing to a
<code class="Nm">bpf</code> file descriptor. The writes are unbuffered,
meaning only one packet can be processed per write. Currently, only writes
to Ethernets and SLIP links are supported.</p>
</section>
<section class="Sh">
<h1 class="Sh" id="BUFFER_MODES"><a class="permalink" href="#BUFFER_MODES">BUFFER
MODES</a></h1>
<p class="Pp"><code class="Nm">bpf</code> devices deliver packet data to the
application via memory buffers provided by the application. The buffer mode
is set using the <code class="Dv">BIOCSETBUFMODE</code> ioctl, and read
using the <code class="Dv">BIOCGETBUFMODE</code> ioctl.</p>
<section class="Ss">
<h2 class="Ss" id="Buffered_read_mode"><a class="permalink" href="#Buffered_read_mode">Buffered
read mode</a></h2>
<p class="Pp">By default, <code class="Nm">bpf</code> devices operate in the
<code class="Dv">BPF_BUFMODE_BUFFER</code> mode, in which packet data is
copied explicitly from kernel to user memory using the
<a class="Xr">read(2)</a> system call. The user process will declare a fixed
buffer size that will be used both for sizing internal buffers and for all
<a class="Xr">read(2)</a> operations on the file. This size is queried using
the <code class="Dv">BIOCGBLEN</code> ioctl, and is set using the
<code class="Dv">BIOCSBLEN</code> ioctl. Note that an individual packet
larger than the buffer size is necessarily truncated.</p>
</section>
<section class="Ss">
<h2 class="Ss" id="Zero-copy_buffer_mode"><a class="permalink" href="#Zero-copy_buffer_mode">Zero-copy
buffer mode</a></h2>
<p class="Pp"><code class="Nm">bpf</code> devices may also operate in the
<code class="Dv">BPF_BUFMODE_ZBUF</code> mode, in which packet data is
written directly into two user memory buffers by the kernel, avoiding both
system call and copying overhead. Buffers are of fixed (and equal) size,
page-aligned, and an integer multiple of the page size. The maximum zero-copy
buffer size is returned by the <code class="Dv">BIOCGETZMAX</code> ioctl.
Note that an individual packet larger than the buffer size is necessarily
truncated.</p>
<p class="Pp">The user process registers two memory buffers using the
<code class="Dv">BIOCSETZBUF</code> ioctl, which accepts a
<var class="Vt">struct bpf_zbuf</var> pointer as an argument:</p>
<div class="Bd Pp Li">
<pre>struct bpf_zbuf {
        void    *bz_bufa;
        void    *bz_bufb;
        size_t   bz_buflen;
};</pre>
</div>
<p class="Pp"><var class="Vt">bz_bufa</var> is a pointer to the userspace
address of the first buffer that will be filled, and
<var class="Vt">bz_bufb</var> is a pointer to the second buffer.
<code class="Nm">bpf</code> will then cycle between the two buffers as they
fill and are acknowledged.</p>
<p class="Pp">Each buffer begins with a fixed-length header to hold
synchronization and data length information for the buffer:</p>
<div class="Bd Pp Li">
<pre>struct bpf_zbuf_header {
        volatile u_int  bzh_kernel_gen; /* Kernel generation number. */
        volatile u_int  bzh_kernel_len; /* Length of data in the buffer. */
        volatile u_int  bzh_user_gen;   /* User generation number. */
        /* ...padding for future use... */
};</pre>
</div>
<p class="Pp">The header structure of each buffer, including all padding, should
be zeroed before it is configured using <code class="Dv">BIOCSETZBUF</code>.
Remaining space in the buffer will be used by the kernel to store packet
data, laid out in the same format as with buffered read mode.</p>
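<p class="Pp">The buffer requirements described above (two buffers of equal
size, page-aligned, a whole number of pages long, with a zeroed header) can be
sketched in portable C as follows. This is an illustrative sketch only:
<var class="Vt">struct zbuf_pair_sketch</var>,
<code class="Fn">round_to_pages</code>(), and
<code class="Fn">zbuf_pair_alloc</code>() are hypothetical names that are not
part of <code class="Nm">bpf</code>, and the use of
<a class="Xr">posix_memalign(3)</a> is an assumption about how a program might
obtain suitable memory.</p>

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical holder for two candidate zero-copy buffers (not a bpf type). */
struct zbuf_pair_sketch {
        void    *bufa;
        void    *bufb;
        size_t   buflen;
};

/* Round len up to a whole number of pages. */
static size_t
round_to_pages(size_t len, size_t pagesize)
{
        assert(pagesize != 0);
        return ((len + pagesize - 1) / pagesize * pagesize);
}

/*
 * Allocate two page-aligned buffers of equal size and zero them, which
 * also zeroes the header area (padding included) at the start of each.
 * Returns 0 on success and -1 on failure.
 */
static int
zbuf_pair_alloc(struct zbuf_pair_sketch *zp, size_t want)
{
        long ps = sysconf(_SC_PAGESIZE);
        size_t pagesize, len;
        void *a = NULL, *b = NULL;

        zp->bufa = zp->bufb = NULL;
        zp->buflen = 0;
        if (ps <= 0 || want == 0)
                return (-1);
        pagesize = (size_t)ps;
        len = round_to_pages(want, pagesize);
        if (posix_memalign(&a, pagesize, len) != 0)
                return (-1);
        if (posix_memalign(&b, pagesize, len) != 0) {
                free(a);
                return (-1);
        }
        memset(a, 0, len);
        memset(b, 0, len);
        zp->bufa = a;
        zp->bufb = b;
        zp->buflen = len;
        return (0);
}
```

<p class="Pp">A real program would presumably also check the chosen length
against the limit reported by <code class="Dv">BIOCGETZMAX</code> and then
copy the three values into a <var class="Vt">struct bpf_zbuf</var> before
issuing <code class="Dv">BIOCSETZBUF</code>; those steps are omitted here.</p>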
<p class="Pp">The kernel and the user process follow a simple acknowledgement
protocol via the buffer header to synchronize access to the buffer: when the
header generation numbers, <var class="Vt">bzh_kernel_gen</var> and
<var class="Vt">bzh_user_gen</var>, hold the same value, the kernel owns the
buffer, and when they differ, userspace owns the buffer.</p>
<p class="Pp">While the kernel owns the buffer, the contents are unstable and
may change asynchronously; while the user process owns the buffer, its
contents are stable and will not be changed until the buffer has been
acknowledged.</p>
<p class="Pp">Initializing the buffer headers to all zeros before registering the
buffer has the effect of assigning initial ownership of both buffers to the
kernel. The kernel signals that a buffer has been assigned to userspace by
modifying <var class="Vt">bzh_kernel_gen</var>, and userspace acknowledges
the buffer and returns it to the kernel by setting the value of
<var class="Vt">bzh_user_gen</var> to the value of
<var class="Vt">bzh_kernel_gen</var>.</p>
<p class="Pp">In order to avoid caching and memory re-ordering effects, the user
process must use atomic operations and memory barriers when checking for and
acknowledging buffers:</p>
<div class="Bd Pp Li">
<pre>#include <machine/atomic.h>

/*
 * Return ownership of a buffer to the kernel for reuse.
 */
static void
buffer_acknowledge(struct bpf_zbuf_header *bzh)
{

        atomic_store_rel_int(&bzh->bzh_user_gen, bzh->bzh_kernel_gen);
}

/*
 * Check whether a buffer has been assigned to userspace by the kernel.
 * Return true if userspace owns the buffer, and false otherwise.
 */
static int
buffer_check(struct bpf_zbuf_header *bzh)
{

        return (bzh->bzh_user_gen !=
            atomic_load_acq_int(&bzh->bzh_kernel_gen));
}</pre>
</div>
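<p class="Pp">To make the ownership rule easier to follow, the same handshake
can be modeled outside the kernel with C11 atomics. The sketch below is not the
<code class="Nm">bpf</code> implementation:
<var class="Vt">struct zbuf_gen_sketch</var> and the
<code class="Fn">sketch_*</code>() functions are hypothetical, and the
simulated producer increments its generation number, which is an assumption
(the text above only says that the kernel modifies
<var class="Vt">bzh_kernel_gen</var>).</p>

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the generation fields of the buffer header. */
struct zbuf_gen_sketch {
        atomic_uint     kernel_gen;     /* changed by the producer */
        atomic_uint     user_gen;       /* changed by the consumer */
};

/* Zeroed generations: equal values, so the producer owns the buffer. */
static void
sketch_init(struct zbuf_gen_sketch *h)
{
        atomic_init(&h->kernel_gen, 0);
        atomic_init(&h->user_gen, 0);
}

/* True when the generations differ, that is, the consumer owns the buffer. */
static bool
sketch_user_owns(struct zbuf_gen_sketch *h)
{
        unsigned int kgen, ugen;

        kgen = atomic_load_explicit(&h->kernel_gen, memory_order_acquire);
        ugen = atomic_load_explicit(&h->user_gen, memory_order_relaxed);
        return (ugen != kgen);
}

/* Consumer side: return the buffer by copying kernel_gen into user_gen. */
static void
sketch_user_ack(struct zbuf_gen_sketch *h)
{
        unsigned int kgen;

        kgen = atomic_load_explicit(&h->kernel_gen, memory_order_relaxed);
        atomic_store_explicit(&h->user_gen, kgen, memory_order_release);
}

/* Simulated producer: hand the buffer over by changing kernel_gen. */
static void
sketch_kernel_handoff(struct zbuf_gen_sketch *h)
{
        assert(!sketch_user_owns(h));   /* may only hand off what it owns */
        atomic_fetch_add_explicit(&h->kernel_gen, 1, memory_order_release);
}
```

<p class="Pp">Starting from <code class="Fn">sketch_init</code>(), the producer
owns the buffer; after <code class="Fn">sketch_kernel_handoff</code>() the
consumer owns it until it calls <code class="Fn">sketch_user_ack</code>(). The
release store and acquire load pair up so that data written before a handoff
is visible to the side that observes the new generation.</p>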
<p class="Pp">The user process may force the assignment of the next buffer, if
any data is pending, to userspace using the
<code class="Dv">BIOCROTZBUF</code> ioctl. This allows the user process to
retrieve data in a partially filled buffer before the buffer is full, such
as following a timeout; the process must recheck for buffer ownership using
the header generation numbers, as the buffer will not be assigned to
userspace if no data was present.</p>
<p class="Pp">As in the buffered read mode, <a class="Xr">kqueue(2)</a>,
<a class="Xr">poll(2)</a>, and <a class="Xr">select(2)</a> may be used to
sleep awaiting the availability of a completed buffer. They will return a
readable file descriptor when ownership of the next buffer is assigned to
user space.</p>
<p class="Pp">In the current implementation, the kernel may assign zero, one, or
both buffers to the user process; however, an earlier implementation
maintained the invariant that at most one buffer could be assigned to the
user process at a time. To ensure both progress and high performance, user
processes should acknowledge a completely processed buffer as quickly as
possible, returning it for reuse, and should not block waiting on a
second buffer while holding another buffer.</p>
</section>
</section>
<section class="Sh">
<h1 class="Sh" id="IOCTLS"><a class="permalink" href="#IOCTLS">IOCTLS</a></h1>
<p class="Pp">The <a class="Xr">ioctl(2)</a> command codes below are defined in
<code class="In"><<a class="In">net/bpf.h</a>></code>. All commands
require these includes:</p>
<div class="Bd Pp Li">
<pre>#include <sys/types.h>
#include <sys/time.h>
#include <sys/ioctl.h>
#include <net/bpf.h></pre>
</div>
<p class="Pp">Additionally, <code class="Dv">BIOCGETIF</code> and
<code class="Dv">BIOCSETIF</code> require
<code class="In"><<a class="In">sys/socket.h</a>></code> and
<code class="In"><<a class="In">net/if.h</a>></code>.</p>
<p class="Pp">In addition to <code class="Dv">FIONREAD</code> the following
commands may be applied to any open <code class="Nm">bpf</code> file. The
(third) argument to <a class="Xr">ioctl(2)</a> should be a pointer to the
type indicated.</p>
<dl class="Bl-tag">
<dt id="BIOCGETIFLIST"><a class="permalink" href="#BIOCGETIFLIST"><code class="Dv">BIOCGETIFLIST</code></a></dt>
<dd>(<code class="Li">struct bpf_iflist</code>) Returns a list of available
tapping points that can later be attached to with
<code class="Dv">BIOCSETIF</code>. On entry,
<var class="Vt">bi_ubuf</var> shall point to a user-supplied buffer and
<var class="Vt">bi_size</var> shall specify the length of the buffer, or 0 if
the request is used to determine the required length.
<var class="Vt">bi_count</var> can be used to limit the output to the first
<var class="Va">count</var> entries; otherwise it shall be 0. On return, if
the buffer was long enough to accommodate all desired entries, the
supplied buffer is filled with the NUL-terminated names of the available
tapping points and <var class="Vt">bi_count</var> is set to the number of
copied names. Otherwise <code class="Er">ENOSPC</code> is returned.</dd>
<dt id="BIOCGBLEN"><a class="permalink" href="#BIOCGBLEN"><code class="Dv">BIOCGBLEN</code></a></dt>
<dd>(<code class="Li">u_int</code>) Returns the required buffer length for
reads on <code class="Nm">bpf</code> files.</dd>
<dt id="BIOCSBLEN"><a class="permalink" href="#BIOCSBLEN"><code class="Dv">BIOCSBLEN</code></a></dt>
<dd>(<code class="Li">u_int</code>) Sets the buffer length for reads on
<code class="Nm">bpf</code> files. The buffer must be set before the file
is attached to an interface with <code class="Dv">BIOCSETIF</code>. If the
requested buffer size cannot be accommodated, the closest allowable size
will be set and returned in the argument. A read call will result in
<code class="Er">EINVAL</code> if it is passed a buffer that is not this
size.</dd>
<dt id="BIOCGDLT"><a class="permalink" href="#BIOCGDLT"><code class="Dv">BIOCGDLT</code></a></dt>
<dd>(<code class="Li">u_int</code>) Returns the type of the data link layer
underlying the attached interface. <code class="Er">EINVAL</code> is
returned if no interface has been specified. The device types, prefixed
with “<code class="Li">DLT_</code>”, are defined in
<code class="In"><<a class="In">net/bpf.h</a>></code>.</dd>
<dt id="BIOCGDLTLIST"><a class="permalink" href="#BIOCGDLTLIST"><code class="Dv">BIOCGDLTLIST</code></a></dt>
<dd>(<code class="Li">struct bpf_dltlist</code>) Returns an array of the
available types of the data link layer underlying the attached interface:
<div class="Bd Pp Bd-indent Li">
<pre>struct bpf_dltlist {
        u_int   bfl_len;
        u_int   *bfl_list;
};</pre>
</div>
<p class="Pp">The available types are returned in the array pointed to by
the <var class="Va">bfl_list</var> field while their length in u_int is
supplied to the <var class="Va">bfl_len</var> field.
<code class="Er">ENOMEM</code> is returned if there is not enough buffer
space and <code class="Er">EFAULT</code> is returned if a bad address is
encountered. The <var class="Va">bfl_len</var> field is modified on
return to indicate the actual length in u_int of the array returned. If
<var class="Va">bfl_list</var> is <code class="Dv">NULL</code>, the
<var class="Va">bfl_len</var> field is set to indicate the required
length of an array in u_int.</p>
</dd>
<dt id="BIOCSDLT"><a class="permalink" href="#BIOCSDLT"><code class="Dv">BIOCSDLT</code></a></dt>
<dd>(<code class="Li">u_int</code>) Changes the type of the data link layer
underlying the attached interface. <code class="Er">EINVAL</code> is
returned if no interface has been specified or the specified type is not
available for the interface.</dd>
<dt id="BIOCPROMISC"><a class="permalink" href="#BIOCPROMISC"><code class="Dv">BIOCPROMISC</code></a></dt>
<dd>Forces the interface into promiscuous mode. All packets, not just those
destined for the local host, are processed. Since more than one file can
be listening on a given interface, a listener that opened its interface
non-promiscuously may receive packets promiscuously. This problem can be
remedied with an appropriate filter.
<p class="Pp">The interface remains in promiscuous mode until all files
listening promiscuously are closed.</p>
</dd>
<dt id="BIOCFLUSH"><a class="permalink" href="#BIOCFLUSH"><code class="Dv">BIOCFLUSH</code></a></dt>
<dd>Flushes the buffer of incoming packets, and resets the statistics that are
  returned by <code class="Dv">BIOCGSTATS</code>.</dd>
<dt id="BIOCGETIF"><a class="permalink" href="#BIOCGETIF"><code class="Dv">BIOCGETIF</code></a></dt>
<dd>(<code class="Li">struct ifreq</code>) Returns the name of the hardware
interface that the file is listening on. The name is returned in the
ifr_name field of the <code class="Li">ifreq</code> structure. All other
fields are undefined.</dd>
<dt id="BIOCSETIF"><a class="permalink" href="#BIOCSETIF"><code class="Dv">BIOCSETIF</code></a></dt>
<dd>(<code class="Li">struct ifreq</code>) Sets the hardware interface
associated with the file. This command must be performed before any
packets can be read. The device is indicated by name using the
<code class="Li">ifr_name</code> field of the
<code class="Li">ifreq</code> structure. Additionally, performs the
actions of <code class="Dv">BIOCFLUSH</code>.</dd>
<dt id="BIOCSRTIMEOUT"><a class="permalink" href="#BIOCSRTIMEOUT"><code class="Dv">BIOCSRTIMEOUT</code></a></dt>
<dd style="width: auto;"> </dd>
<dt id="BIOCGRTIMEOUT"><a class="permalink" href="#BIOCGRTIMEOUT"><code class="Dv">BIOCGRTIMEOUT</code></a></dt>
<dd>(<code class="Li">struct timeval</code>) Sets or gets the read timeout
parameter. The argument specifies the length of time to wait before timing
out on a read request. This parameter is initialized to zero by
<a class="Xr">open(2)</a>, indicating no timeout.</dd>
<dt id="BIOCGSTATS"><a class="permalink" href="#BIOCGSTATS"><code class="Dv">BIOCGSTATS</code></a></dt>
<dd>(<code class="Li">struct bpf_stat</code>) Returns the following structure
of packet statistics:
<div class="Bd Pp Li">
<pre>struct bpf_stat {
        u_int   bs_recv;        /* number of packets received */
        u_int   bs_drop;        /* number of packets dropped */
};</pre>
</div>
<p class="Pp">The fields are:</p>
<dl class="Bl-hang Bd-indent">
<dt id="bs_recv"><a class="permalink" href="#bs_recv"><code class="Li">bs_recv</code></a></dt>
<dd>the number of packets received by the descriptor since opened or reset
(including any buffered since the last read call); and</dd>
<dt id="bs_drop"><a class="permalink" href="#bs_drop"><code class="Li">bs_drop</code></a></dt>
<dd>the number of packets which were accepted by the filter but dropped by
the kernel because of buffer overflows (i.e., the application's reads
are not keeping up with the packet traffic).</dd>
</dl>
</dd>
<dt id="BIOCIMMEDIATE"><a class="permalink" href="#BIOCIMMEDIATE"><code class="Dv">BIOCIMMEDIATE</code></a></dt>
<dd>(<code class="Li">u_int</code>) Enables or disables “immediate
mode”, based on the truth value of the argument. When immediate
mode is enabled, reads return immediately upon packet reception.
Otherwise, a read will block until either the kernel buffer becomes full
or a timeout occurs. This is useful for programs like
<a class="Xr">rarpd(8)</a> which must respond to messages in real time.
The default for a new file is off.</dd>
<dt id="BIOCSETF"><a class="permalink" href="#BIOCSETF"><code class="Dv">BIOCSETF</code></a></dt>
<dd style="width: auto;"> </dd>
<dt id="BIOCSETFNR"><a class="permalink" href="#BIOCSETFNR"><code class="Dv">BIOCSETFNR</code></a></dt>
<dd>(<code class="Li">struct bpf_program</code>) Sets the read filter program
used by the kernel to discard uninteresting packets. An array of
instructions and its length is passed in using the following structure:
<div class="Bd Pp Li">
<pre>struct bpf_program {
        u_int           bf_len;
        struct bpf_insn *bf_insns;
};</pre>
</div>
<p class="Pp">The filter program is pointed to by the
<code class="Li">bf_insns</code> field while its length in units of
‘<code class="Li">struct bpf_insn</code>’ is given by the
<code class="Li">bf_len</code> field. See section
<a class="Sx" href="#FILTER_MACHINE">FILTER MACHINE</a> for an
explanation of the filter language. The only difference between
<code class="Dv">BIOCSETF</code> and <code class="Dv">BIOCSETFNR</code>
is <code class="Dv">BIOCSETF</code> performs the actions of
<code class="Dv">BIOCFLUSH</code> while
<code class="Dv">BIOCSETFNR</code> does not.</p>
</dd>
<dt id="BIOCSETWF"><a class="permalink" href="#BIOCSETWF"><code class="Dv">BIOCSETWF</code></a></dt>
<dd>(<code class="Li">struct bpf_program</code>) Sets the write filter program
used by the kernel to control what type of packets can be written to the
interface. See the <code class="Dv">BIOCSETF</code> command for more
information on the <code class="Nm">bpf</code> filter program.</dd>
<dt id="BIOCVERSION"><a class="permalink" href="#BIOCVERSION"><code class="Dv">BIOCVERSION</code></a></dt>
<dd>(<code class="Li">struct bpf_version</code>) Returns the major and minor
version numbers of the filter language currently recognized by the kernel.
Before installing a filter, applications must check that the current
version is compatible with the running kernel. Version numbers are
compatible if the major numbers match and the application minor is less
than or equal to the kernel minor. The kernel version number is returned
in the following structure:
<div class="Bd Pp Li">
<pre>struct bpf_version {
        u_short bv_major;
        u_short bv_minor;
};</pre>
</div>
<p class="Pp" id="ioctl">The current version numbers are given by
<code class="Dv">BPF_MAJOR_VERSION</code> and
<code class="Dv">BPF_MINOR_VERSION</code> from
<code class="In"><<a class="In">net/bpf.h</a>></code>. An
incompatible filter may result in undefined behavior (most likely, an
error returned by
<a class="permalink" href="#ioctl"><code class="Fn">ioctl</code></a>()
or haphazard packet matching).</p>
</dd>
<dt id="BIOCGRSIG"><a class="permalink" href="#BIOCGRSIG"><code class="Dv">BIOCGRSIG</code></a></dt>
<dd style="width: auto;"> </dd>
<dt id="BIOCSRSIG"><a class="permalink" href="#BIOCSRSIG"><code class="Dv">BIOCSRSIG</code></a></dt>
<dd>(<code class="Li">u_int</code>) Sets or gets the receive signal. This
signal will be sent to the process or process group specified by
<code class="Dv">FIOSETOWN</code>. It defaults to
<code class="Dv">SIGIO</code>.</dd>
<dt id="BIOCSHDRCMPLT"><a class="permalink" href="#BIOCSHDRCMPLT"><code class="Dv">BIOCSHDRCMPLT</code></a></dt>
<dd style="width: auto;"> </dd>
<dt id="BIOCGHDRCMPLT"><a class="permalink" href="#BIOCGHDRCMPLT"><code class="Dv">BIOCGHDRCMPLT</code></a></dt>
<dd>(<code class="Li">u_int</code>) Sets or gets the status of the
“header complete” flag. Set to zero if the link level source
address should be filled in automatically by the interface output routine.
Set to one if the link level source address will be written, as provided,
to the wire. This flag is initialized to zero by default.</dd>
<dt id="BIOCSSEESENT"><a class="permalink" href="#BIOCSSEESENT"><code class="Dv">BIOCSSEESENT</code></a></dt>
<dd style="width: auto;"> </dd>
<dt id="BIOCGSEESENT"><a class="permalink" href="#BIOCGSEESENT"><code class="Dv">BIOCGSEESENT</code></a></dt>
<dd>(<code class="Li">u_int</code>) These commands are obsolete but left for
compatibility. Use <code class="Dv">BIOCSDIRECTION</code> and
<code class="Dv">BIOCGDIRECTION</code> instead. Sets or gets the flag
determining whether locally generated packets on the interface should be
returned by BPF. Set to zero to see only incoming packets on the
interface. Set to one to see packets originating locally and remotely on
the interface. This flag is initialized to one by default.</dd>
<dt id="BIOCSDIRECTION"><a class="permalink" href="#BIOCSDIRECTION"><code class="Dv">BIOCSDIRECTION</code></a></dt>
<dd style="width: auto;"> </dd>
<dt id="BIOCGDIRECTION"><a class="permalink" href="#BIOCGDIRECTION"><code class="Dv">BIOCGDIRECTION</code></a></dt>
<dd>(<code class="Li">u_int</code>) Sets or gets the setting determining
whether incoming, outgoing, or all packets on the interface should be
returned by BPF. Set to <code class="Dv">BPF_D_IN</code> to see only
incoming packets on the interface. Set to
<code class="Dv">BPF_D_INOUT</code> to see packets originating locally and
remotely on the interface. Set to <code class="Dv">BPF_D_OUT</code> to see
only outgoing packets on the interface. This setting is initialized to
<code class="Dv">BPF_D_INOUT</code> by default.</dd>
<dt id="BIOCSTSTAMP"><a class="permalink" href="#BIOCSTSTAMP"><code class="Dv">BIOCSTSTAMP</code></a></dt>
<dd style="width: auto;"> </dd>
<dt id="BIOCGTSTAMP"><a class="permalink" href="#BIOCGTSTAMP"><code class="Dv">BIOCGTSTAMP</code></a></dt>
<dd>(<code class="Li">u_int</code>) Sets or gets the format and resolution of
  the time stamps returned by BPF. Set to
<code class="Dv">BPF_T_MICROTIME</code>,
<code class="Dv">BPF_T_MICROTIME_FAST</code>,
<code class="Dv">BPF_T_MICROTIME_MONOTONIC</code>, or
<code class="Dv">BPF_T_MICROTIME_MONOTONIC_FAST</code> to get time stamps
in 64-bit <var class="Vt">struct timeval</var> format. Set to
<code class="Dv">BPF_T_NANOTIME</code>,
<code class="Dv">BPF_T_NANOTIME_FAST</code>,
<code class="Dv">BPF_T_NANOTIME_MONOTONIC</code>, or
<code class="Dv">BPF_T_NANOTIME_MONOTONIC_FAST</code> to get time stamps
in 64-bit <var class="Vt">struct timespec</var> format. Set to
<code class="Dv">BPF_T_BINTIME</code>,
<code class="Dv">BPF_T_BINTIME_FAST</code>,
<code class="Dv">BPF_T_BINTIME_MONOTONIC</code>, or
<code class="Dv">BPF_T_BINTIME_MONOTONIC_FAST</code> to get time stamps in
64-bit <var class="Vt">struct bintime</var> format. Set to
<code class="Dv">BPF_T_NONE</code> to ignore time stamp. All 64-bit time
stamp formats are wrapped in <var class="Vt">struct bpf_ts</var>. The
<code class="Dv">BPF_T_MICROTIME_FAST</code>,
<code class="Dv">BPF_T_NANOTIME_FAST</code>,
<code class="Dv">BPF_T_BINTIME_FAST</code>,
<code class="Dv">BPF_T_MICROTIME_MONOTONIC_FAST</code>,
<code class="Dv">BPF_T_NANOTIME_MONOTONIC_FAST</code>, and
<code class="Dv">BPF_T_BINTIME_MONOTONIC_FAST</code> are analogs of
corresponding formats without _FAST suffix but do not perform a full time
counter query, so their accuracy is one timer tick. The
<code class="Dv">BPF_T_MICROTIME_MONOTONIC</code>,
<code class="Dv">BPF_T_NANOTIME_MONOTONIC</code>,
<code class="Dv">BPF_T_BINTIME_MONOTONIC</code>,
<code class="Dv">BPF_T_MICROTIME_MONOTONIC_FAST</code>,
<code class="Dv">BPF_T_NANOTIME_MONOTONIC_FAST</code>, and
<code class="Dv">BPF_T_BINTIME_MONOTONIC_FAST</code> store the time
elapsed since kernel boot. This setting is initialized to
<code class="Dv">BPF_T_MICROTIME</code> by default.</dd>
<dt id="BIOCFEEDBACK"><a class="permalink" href="#BIOCFEEDBACK"><code class="Dv">BIOCFEEDBACK</code></a></dt>
<dd>(<code class="Li">u_int</code>) Set packet feedback mode. This allows
injected packets to be fed back as input to the interface when output via
the interface is successful. When the <code class="Dv">BPF_D_INOUT</code>
direction is set, an injected outgoing packet is not returned by BPF, to
avoid duplication. This flag is initialized to zero by default.</dd>
<dt id="BIOCLOCK"><a class="permalink" href="#BIOCLOCK"><code class="Dv">BIOCLOCK</code></a></dt>
<dd>Set the locked flag on the <code class="Nm">bpf</code> descriptor. This
prevents the execution of ioctl commands which could change the underlying
operating parameters of the device.</dd>
<dt id="BIOCGETBUFMODE"><a class="permalink" href="#BIOCGETBUFMODE"><code class="Dv">BIOCGETBUFMODE</code></a></dt>
<dd style="width: auto;"> </dd>
<dt id="BIOCSETBUFMODE"><a class="permalink" href="#BIOCSETBUFMODE"><code class="Dv">BIOCSETBUFMODE</code></a></dt>
<dd>(<code class="Li">u_int</code>) Get or set the current
<code class="Nm">bpf</code> buffering mode; possible values are
<code class="Dv">BPF_BUFMODE_BUFFER</code>, buffered read mode, and
<code class="Dv">BPF_BUFMODE_ZBUF</code>, zero-copy buffer mode.</dd>
<dt id="BIOCSETZBUF"><a class="permalink" href="#BIOCSETZBUF"><code class="Dv">BIOCSETZBUF</code></a></dt>
<dd>(<code class="Li">struct bpf_zbuf</code>) Set the current zero-copy buffer
locations; buffer locations may be set only once zero-copy buffer mode has
been selected, and prior to attaching to an interface. Buffers must be of
identical size, page-aligned, and an integer multiple of pages in size.
The three fields <var class="Vt">bz_bufa</var>,
<var class="Vt">bz_bufb</var>, and <var class="Vt">bz_buflen</var> must be
filled out. If buffers have already been set for this device, the ioctl
will fail.</dd>
<dt id="BIOCGETZMAX"><a class="permalink" href="#BIOCGETZMAX"><code class="Dv">BIOCGETZMAX</code></a></dt>
<dd>(<code class="Li">size_t</code>) Get the largest individual zero-copy
buffer size allowed. As two buffers are used in zero-copy buffer mode, the
limit (in practice) is twice the returned size. As zero-copy buffers
consume kernel address space, conservative selection of buffer size is
suggested, especially when there are multiple <code class="Nm">bpf</code>
descriptors in use on 32-bit systems.</dd>
<dt id="BIOCROTZBUF"><a class="permalink" href="#BIOCROTZBUF"><code class="Dv">BIOCROTZBUF</code></a></dt>
<dd>Force ownership of the next buffer to be assigned to userspace, if any
  data is present in the buffer. If no data is present, the buffer will remain
owned by the kernel. This allows consumers of zero-copy buffering to
implement timeouts and retrieve partially filled buffers. In order to
handle the case where no data is present in the buffer and therefore
ownership is not assigned, the user process must check
<var class="Vt">bzh_kernel_gen</var> against
<var class="Vt">bzh_user_gen</var>.</dd>
<dt id="BIOCSETVLANPCP"><a class="permalink" href="#BIOCSETVLANPCP"><code class="Dv">BIOCSETVLANPCP</code></a></dt>
<dd>Set the VLAN PCP bits to the supplied value.</dd>
</dl>
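<p class="Pp">The compatibility rule given under
<code class="Dv">BIOCVERSION</code> (major numbers match, and the application
minor is less than or equal to the kernel minor) can be written as a small
predicate. This is a sketch under stated assumptions:
<var class="Vt">struct version_sketch</var> is a hypothetical mirror of the two
fields of <var class="Vt">struct bpf_version</var>, and the version numbers in
the examples are made up for illustration.</p>

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mirror of the two fields of struct bpf_version. */
struct version_sketch {
        unsigned short  major;
        unsigned short  minor;
};

/*
 * Compatible when the major numbers match and the application minor
 * is less than or equal to the kernel minor.
 */
static bool
version_compatible(struct version_sketch app, struct version_sketch kern)
{
        return (app.major == kern.major && app.minor <= kern.minor);
}

/* Illustrative checks using made-up version numbers. */
static void
version_sketch_examples(void)
{
        struct version_sketch kern = { 1, 1 };
        struct version_sketch same = { 1, 1 };
        struct version_sketch older = { 1, 0 };
        struct version_sketch newer = { 1, 2 };
        struct version_sketch other = { 2, 0 };

        assert(version_compatible(same, kern));
        assert(version_compatible(older, kern));
        assert(!version_compatible(newer, kern));  /* app minor too new */
        assert(!version_compatible(other, kern));  /* major mismatch */
}
```

<p class="Pp">In a real program the application side would come from
<code class="Dv">BPF_MAJOR_VERSION</code> and
<code class="Dv">BPF_MINOR_VERSION</code>, and the kernel side from the
structure filled in by <code class="Dv">BIOCVERSION</code>.</p>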
</section>
<section class="Sh">
<h1 class="Sh" id="STANDARD_IOCTLS"><a class="permalink" href="#STANDARD_IOCTLS">STANDARD
IOCTLS</a></h1>
<p class="Pp"><code class="Nm">bpf</code> now supports several standard
<a class="Xr">ioctl(2)</a> commands that allow the user to do async and/or
non-blocking I/O to an open
<a class="permalink" href="#bpf"><i class="Em" id="bpf">bpf</i></a> file
descriptor.</p>
<dl class="Bl-tag">
<dt id="FIONREAD"><a class="permalink" href="#FIONREAD"><code class="Dv">FIONREAD</code></a></dt>
<dd>(<code class="Li">int</code>) Returns the number of bytes that are
immediately available for reading.</dd>
<dt id="SIOCGIFADDR"><a class="permalink" href="#SIOCGIFADDR"><code class="Dv">SIOCGIFADDR</code></a></dt>
<dd>(<code class="Li">struct ifreq</code>) Returns the address associated with
the interface.</dd>
<dt id="FIONBIO"><a class="permalink" href="#FIONBIO"><code class="Dv">FIONBIO</code></a></dt>
<dd>(<code class="Li">int</code>) Sets or clears non-blocking I/O. If arg is
non-zero, then doing a <a class="Xr">read(2)</a> when no data is available
will return -1 and <var class="Va">errno</var> will be set to
<code class="Er">EAGAIN</code>. If arg is zero, non-blocking I/O is
disabled. Note: setting this overrides the timeout set by
<code class="Dv">BIOCSRTIMEOUT</code>.</dd>
<dt id="FIOASYNC"><a class="permalink" href="#FIOASYNC"><code class="Dv">FIOASYNC</code></a></dt>
<dd>(<code class="Li">int</code>) Enables or disables async I/O. When enabled
(arg is non-zero), the process or process group specified by
<code class="Dv">FIOSETOWN</code> will start receiving
<code class="Dv">SIGIO</code> signals when packets arrive. Note that you must
do an <code class="Dv">FIOSETOWN</code> in order for this to take effect,
as the system will not default this for you. The signal may be changed via
<code class="Dv">BIOCSRSIG</code>.</dd>
<dt id="FIOSETOWN"><a class="permalink" href="#FIOSETOWN"><code class="Dv">FIOSETOWN</code></a></dt>
<dd style="width: auto;"> </dd>
<dt id="FIOGETOWN"><a class="permalink" href="#FIOGETOWN"><code class="Dv">FIOGETOWN</code></a></dt>
<dd>(<code class="Li">int</code>) Sets or gets the process or process group
(if negative) that should receive <code class="Dv">SIGIO</code> when
packets are available. The signal may be changed using
<code class="Dv">BIOCSRSIG</code> (see above).</dd>
</dl>
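<p class="Pp">As a minimal sketch of the non-blocking requests above, the
helpers below wrap <code class="Dv">FIONBIO</code> and
<code class="Dv">FIONREAD</code>. The <code class="Li">demo_</code> names are
hypothetical and not part of <code class="Nm">bpf</code>; the code assumes only
<a class="Xr">ioctl(2)</a> and <a class="Xr">read(2)</a>, so it applies to any
descriptor that accepts these requests.</p>

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Hypothetical helper: turn non-blocking I/O on (on != 0) or off (on == 0).
 * Returns 0 on success, -1 with errno set on failure.
 */
static int
demo_set_nonblocking(int fd, int on)
{
	return (ioctl(fd, FIONBIO, &on));
}

/* Hypothetical helper: bytes readable right now, or -1 on error. */
static int
demo_bytes_ready(int fd)
{
	int n = 0;

	if (ioctl(fd, FIONREAD, &n) == -1)
		return (-1);
	return (n);
}

/*
 * Hypothetical helper: read without blocking. Returns the byte count,
 * 0 when no data is available yet, or -1 on any other error.
 */
static ssize_t
demo_try_read(int fd, void *buf, size_t len)
{
	ssize_t r = read(fd, buf, len);

	if (r == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
		return (0);	/* nothing to read at the moment */
	return (r);
}
```

<p class="Pp">With a <code class="Nm">bpf</code> descriptor, the buffer passed
to <code class="Li">demo_try_read</code>() should have the size returned by
<code class="Dv">BIOCGBLEN</code>. A zero return here means only that no packet
data was available at that moment.</p>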
</section>
<section class="Sh">
<h1 class="Sh" id="BPF_HEADER"><a class="permalink" href="#BPF_HEADER">BPF
HEADER</a></h1>
<p class="Pp">One of the following structures is prepended to each packet
returned by <a class="Xr">read(2)</a> or via a zero-copy buffer:</p>
<div class="Bd Pp Li">
<pre>struct bpf_xhdr {
        struct bpf_ts   bh_tstamp;      /* time stamp */
        uint32_t        bh_caplen;      /* length of captured portion */
        uint32_t        bh_datalen;     /* original length of packet */
        u_short         bh_hdrlen;      /* length of bpf header (this struct
                                           plus alignment padding) */
};
struct bpf_hdr {
        struct timeval  bh_tstamp;      /* time stamp */
        uint32_t        bh_caplen;      /* length of captured portion */
        uint32_t        bh_datalen;     /* original length of packet */
        u_short         bh_hdrlen;      /* length of bpf header (this struct
                                           plus alignment padding) */
};</pre>
</div>
<p class="Pp">The fields, whose values are stored in host order, are:</p>
<p class="Pp"></p>
<dl class="Bl-tag Bl-compact">
<dt id="bh_tstamp"><a class="permalink" href="#bh_tstamp"><code class="Li">bh_tstamp</code></a></dt>
<dd>The time at which the packet was processed by the packet filter.</dd>
<dt id="bh_caplen"><a class="permalink" href="#bh_caplen"><code class="Li">bh_caplen</code></a></dt>
<dd>The length of the captured portion of the packet. This is the minimum of
the truncation amount specified by the filter and the length of the
packet.</dd>
<dt id="bh_datalen"><a class="permalink" href="#bh_datalen"><code class="Li">bh_datalen</code></a></dt>
<dd>The length of the packet off the wire. This value is independent of the
truncation amount specified by the filter.</dd>
<dt id="bh_hdrlen"><a class="permalink" href="#bh_hdrlen"><code class="Li">bh_hdrlen</code></a></dt>
<dd>The length of the <code class="Nm">bpf</code> header, which may not be
equal to
<a class="permalink" href="#sizeof"><code class="Fn" id="sizeof">sizeof</code></a>(<var class="Fa">struct
bpf_xhdr</var>) or <code class="Fn">sizeof</code>(<var class="Fa">struct
bpf_hdr</var>).</dd>
</dl>
<p class="Pp">The <code class="Li">bh_hdrlen</code> field exists to account for
padding between the header and the link level protocol. The purpose here is
to guarantee proper alignment of the packet data structures, which is
required on alignment sensitive architectures and improves performance on
many other architectures. The packet filter ensures that the
<var class="Vt">bpf_xhdr</var>, <var class="Vt">bpf_hdr</var> and the
network layer header will be word aligned. Currently,
<var class="Vt">bpf_hdr</var> is used when the time stamp is set to
<code class="Dv">BPF_T_MICROTIME</code>,
<code class="Dv">BPF_T_MICROTIME_FAST</code>,
<code class="Dv">BPF_T_MICROTIME_MONOTONIC</code>,
<code class="Dv">BPF_T_MICROTIME_MONOTONIC_FAST</code>, or
<code class="Dv">BPF_T_NONE</code> for backward compatibility reasons.
Otherwise, <var class="Vt">bpf_xhdr</var> is used. However,
<var class="Vt">bpf_hdr</var> may be deprecated in the near future. Suitable
precautions must be taken when accessing the link layer protocol fields on
alignment restricted machines. (This is not a problem on an Ethernet, since
the type field is a short falling on an even offset, and the addresses are
probably accessed in a bytewise fashion).</p>
<p class="Pp">Additionally, individual packets are padded so that each starts on
a word boundary. This requires that an application has some knowledge of how
to get from packet to packet. The macro
<code class="Dv">BPF_WORDALIGN</code> is defined in
<code class="In"><<a class="In">net/bpf.h</a>></code> to facilitate
this process. It rounds up its argument to the nearest word aligned value
(where a word is <code class="Dv">BPF_ALIGNMENT</code> bytes wide).</p>
<p class="Pp">For example, if ‘<code class="Li">p</code>’ points
to the start of a packet, this expression will advance it to the next
packet:</p>
<div class="Bd Bd-indent"><code class="Li">p = (char *)p +
BPF_WORDALIGN(p->bh_hdrlen + p->bh_caplen)</code></div>
<p class="Pp">For the alignment mechanisms to work properly, the buffer passed
to <a class="Xr">read(2)</a> must itself be word aligned. The
<a class="Xr">malloc(3)</a> function will always return an aligned
buffer.</p>
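<p class="Pp">The traversal described above can be sketched as follows. To
stay buildable without
<code class="In">&lt;net/bpf.h&gt;</code>, the sketch uses hypothetical
stand-ins: <code class="Li">struct demo_hdr</code> mirrors
<var class="Vt">bpf_hdr</var>, and <code class="Li">DEMO_WORDALIGN</code>
assumes a word of <code class="Li">sizeof(long)</code> bytes. Real code would
use <var class="Vt">bpf_hdr</var> or <var class="Vt">bpf_xhdr</var> and
<code class="Dv">BPF_WORDALIGN</code> instead.</p>

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/time.h>

/*
 * Hypothetical stand-ins; the sizeof(long) word size is an assumption
 * about BPF_ALIGNMENT, not something this sketch can verify.
 */
#define DEMO_ALIGNMENT	sizeof(long)
#define DEMO_WORDALIGN(x) \
	(((size_t)(x) + (DEMO_ALIGNMENT - 1)) & ~(DEMO_ALIGNMENT - 1))

struct demo_hdr {
	struct timeval	bh_tstamp;	/* time stamp */
	uint32_t	bh_caplen;	/* length of captured portion */
	uint32_t	bh_datalen;	/* original length of packet */
	unsigned short	bh_hdrlen;	/* header length incl. padding */
};

/* Bytes up to and including bh_hdrlen; trailing struct padding excluded. */
#define DEMO_HDR_MIN \
	(offsetof(struct demo_hdr, bh_hdrlen) + sizeof(unsigned short))

/*
 * Walk the n bytes returned by one read(2). Returns the number of complete
 * packets found and adds their captured lengths to *captured.
 */
static size_t
demo_walk(const unsigned char *buf, size_t n, uint32_t *captured)
{
	size_t off = 0, count = 0;

	while (off + DEMO_HDR_MIN <= n) {
		struct demo_hdr h;
		size_t rec;

		memcpy(&h, buf + off, DEMO_HDR_MIN);
		rec = (size_t)h.bh_hdrlen + h.bh_caplen;
		if (h.bh_hdrlen < DEMO_HDR_MIN || rec > n - off)
			break;		/* malformed or truncated record */
		/* Packet data starts at buf + off + h.bh_hdrlen. */
		*captured += h.bh_caplen;
		count++;
		off += DEMO_WORDALIGN(rec);	/* step to the next packet */
	}
	return (count);
}
```

<p class="Pp">The sketch trusts <code class="Li">bh_hdrlen</code> rather than
<code class="Li">sizeof</code> to locate the packet data, and it bounds-checks
each record so that a short or corrupt buffer ends the walk instead of reading
past the end.</p>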
</section>
<section class="Sh">
<h1 class="Sh" id="FILTER_MACHINE"><a class="permalink" href="#FILTER_MACHINE">FILTER
MACHINE</a></h1>
<p class="Pp">A filter program is an array of instructions, with all branches
forwardly directed, terminated by a
<a class="permalink" href="#return"><i class="Em" id="return">return</i></a>
instruction. Each instruction performs some action on the pseudo-machine
state, which consists of an accumulator, index register, scratch memory
store, and implicit program counter.</p>
<p class="Pp">The following structure defines the instruction format:</p>
<div class="Bd Pp Li">
<pre>struct bpf_insn {
        u_short         code;
        u_char          jt;
        u_char          jf;
        bpf_u_int32     k;
};</pre>
</div>
<p class="Pp">The <code class="Li">k</code> field is used in different ways by
different instructions, and the <code class="Li">jt</code> and
<code class="Li">jf</code> fields are used as offsets by the branch
instructions. The opcodes are encoded in a semi-hierarchical fashion. There
are eight classes of instructions: <code class="Dv">BPF_LD</code>,
<code class="Dv">BPF_LDX</code>, <code class="Dv">BPF_ST</code>,
<code class="Dv">BPF_STX</code>, <code class="Dv">BPF_ALU</code>,
<code class="Dv">BPF_JMP</code>, <code class="Dv">BPF_RET</code>, and
<code class="Dv">BPF_MISC</code>. Various other mode and operator bits are
or'd into the class to give the actual instructions. The classes and modes
are defined in
<code class="In"><<a class="In">net/bpf.h</a>></code>.</p>
<p class="Pp">Below are the semantics for each defined
<code class="Nm">bpf</code> instruction. We use the convention that A is the
accumulator, X is the index register, P[] packet data, and M[] scratch
memory store. P[i:n] gives the data at byte offset “i” in the
packet, interpreted as a word (n=4), unsigned halfword (n=2), or unsigned
byte (n=1). M[i] gives the i'th word in the scratch memory store, which is
only addressed in word units. The memory store is indexed from 0 to
<code class="Dv">BPF_MEMWORDS</code> - 1. <code class="Li">k</code>,
<code class="Li">jt</code>, and <code class="Li">jf</code> are the
corresponding fields in the instruction definition. “len”
refers to the length of the packet.</p>
<dl class="Bl-tag">
<dt id="BPF_LD"><a class="permalink" href="#BPF_LD"><code class="Dv">BPF_LD</code></a></dt>
<dd>These instructions copy a value into the accumulator. The type of the
source operand is specified by an “addressing mode” and can
be a constant (<code class="Dv">BPF_IMM</code>), packet data at a fixed
offset (<code class="Dv">BPF_ABS</code>), packet data at a variable offset
(<code class="Dv">BPF_IND</code>), the packet length
(<code class="Dv">BPF_LEN</code>), or a word in the scratch memory store
(<code class="Dv">BPF_MEM</code>). For <code class="Dv">BPF_IND</code> and
<code class="Dv">BPF_ABS</code>, the data size must be specified as a word
(<code class="Dv">BPF_W</code>), halfword (<code class="Dv">BPF_H</code>),
or byte (<code class="Dv">BPF_B</code>). The semantics of all the
recognized <code class="Dv">BPF_LD</code> instructions follow.
<div class="Bd Pp Li">
<pre>BPF_LD+BPF_W+BPF_ABS    A <- P[k:4]
BPF_LD+BPF_H+BPF_ABS    A <- P[k:2]
BPF_LD+BPF_B+BPF_ABS    A <- P[k:1]
BPF_LD+BPF_W+BPF_IND    A <- P[X+k:4]
BPF_LD+BPF_H+BPF_IND    A <- P[X+k:2]
BPF_LD+BPF_B+BPF_IND    A <- P[X+k:1]
BPF_LD+BPF_W+BPF_LEN    A <- len
BPF_LD+BPF_IMM          A <- k
BPF_LD+BPF_MEM          A <- M[k]</pre>
</div>
</dd>
<dt id="BPF_LDX"><a class="permalink" href="#BPF_LDX"><code class="Dv">BPF_LDX</code></a></dt>
<dd>These instructions load a value into the index register. Note that the
addressing modes are more restrictive than those of the accumulator loads,
but they include <code class="Dv">BPF_MSH</code>, a hack for efficiently
loading the IP header length.
<div class="Bd Pp Li">
<pre>BPF_LDX+BPF_W+BPF_IMM X <- k
BPF_LDX+BPF_W+BPF_MEM X <- M[k]
BPF_LDX+BPF_W+BPF_LEN X <- len
BPF_LDX+BPF_B+BPF_MSH X <- 4*(P[k:1]&0xf)</pre>
</div>
</dd>
<dt id="BPF_ST"><a class="permalink" href="#BPF_ST"><code class="Dv">BPF_ST</code></a></dt>
<dd>This instruction stores the accumulator into the scratch memory. We do not
need an addressing mode since there is only one possibility for the
destination.
<div class="Bd Pp Li">
<pre>BPF_ST M[k] <- A</pre>
</div>
</dd>
<dt id="BPF_STX"><a class="permalink" href="#BPF_STX"><code class="Dv">BPF_STX</code></a></dt>
<dd>This instruction stores the index register in the scratch memory store.
<div class="Bd Pp Li">
<pre>BPF_STX M[k] <- X</pre>
</div>
</dd>
<dt id="BPF_ALU"><a class="permalink" href="#BPF_ALU"><code class="Dv">BPF_ALU</code></a></dt>
<dd>The alu instructions perform operations between the accumulator and index
register or constant, and store the result back in the accumulator. For
binary operations, a source mode is required
(<code class="Dv">BPF_K</code> or <code class="Dv">BPF_X</code>).
<div class="Bd Pp Li">
<pre>BPF_ALU+BPF_ADD+BPF_K    A <- A + k
BPF_ALU+BPF_SUB+BPF_K    A <- A - k
BPF_ALU+BPF_MUL+BPF_K    A <- A * k
BPF_ALU+BPF_DIV+BPF_K    A <- A / k
BPF_ALU+BPF_MOD+BPF_K    A <- A % k
BPF_ALU+BPF_AND+BPF_K    A <- A & k
BPF_ALU+BPF_OR+BPF_K     A <- A | k
BPF_ALU+BPF_XOR+BPF_K    A <- A ^ k
BPF_ALU+BPF_LSH+BPF_K    A <- A << k
BPF_ALU+BPF_RSH+BPF_K    A <- A >> k
BPF_ALU+BPF_ADD+BPF_X    A <- A + X
BPF_ALU+BPF_SUB+BPF_X    A <- A - X
BPF_ALU+BPF_MUL+BPF_X    A <- A * X
BPF_ALU+BPF_DIV+BPF_X    A <- A / X
BPF_ALU+BPF_MOD+BPF_X    A <- A % X
BPF_ALU+BPF_AND+BPF_X    A <- A & X
BPF_ALU+BPF_OR+BPF_X     A <- A | X
BPF_ALU+BPF_XOR+BPF_X    A <- A ^ X
BPF_ALU+BPF_LSH+BPF_X    A <- A << X
BPF_ALU+BPF_RSH+BPF_X    A <- A >> X
BPF_ALU+BPF_NEG          A <- -A</pre>
</div>
</dd>
<dt id="BPF_JMP"><a class="permalink" href="#BPF_JMP"><code class="Dv">BPF_JMP</code></a></dt>
<dd>The jump instructions alter flow of control. Conditional jumps compare the
accumulator against a constant (<code class="Dv">BPF_K</code>) or the
index register (<code class="Dv">BPF_X</code>). If the result is true (or
non-zero), the true branch is taken, otherwise the false branch is taken.
Jump offsets are encoded in 8 bits so the longest jump is 256
instructions. However, the jump always (<code class="Dv">BPF_JA</code>)
opcode uses the 32 bit <code class="Li">k</code> field as the offset,
allowing arbitrarily distant destinations. All conditionals use unsigned
comparison conventions.
<div class="Bd Pp Li">
<pre>BPF_JMP+BPF_JA            pc += k
BPF_JMP+BPF_JGT+BPF_K     pc += (A > k) ? jt : jf
BPF_JMP+BPF_JGE+BPF_K     pc += (A >= k) ? jt : jf
BPF_JMP+BPF_JEQ+BPF_K     pc += (A == k) ? jt : jf
BPF_JMP+BPF_JSET+BPF_K    pc += (A & k) ? jt : jf
BPF_JMP+BPF_JGT+BPF_X     pc += (A > X) ? jt : jf
BPF_JMP+BPF_JGE+BPF_X     pc += (A >= X) ? jt : jf
BPF_JMP+BPF_JEQ+BPF_X     pc += (A == X) ? jt : jf
BPF_JMP+BPF_JSET+BPF_X    pc += (A & X) ? jt : jf</pre>
</div>
</dd>
<dt id="BPF_RET"><a class="permalink" href="#BPF_RET"><code class="Dv">BPF_RET</code></a></dt>
<dd>The return instructions terminate the filter program and specify the
amount of packet to accept (i.e., they return the truncation amount). A
return value of zero indicates that the packet should be ignored. The
return value is either a constant (<code class="Dv">BPF_K</code>) or the
accumulator (<code class="Dv">BPF_A</code>).
<div class="Bd Pp Li">
<pre>BPF_RET+BPF_A accept A bytes
BPF_RET+BPF_K accept k bytes</pre>
</div>
</dd>
<dt id="BPF_MISC"><a class="permalink" href="#BPF_MISC"><code class="Dv">BPF_MISC</code></a></dt>
<dd>The miscellaneous category was created for anything that does not fit into
the above classes, and for any new instructions that might need to be
added. Currently, these are the register transfer instructions that copy
the index register to the accumulator or vice versa.
<div class="Bd Pp Li">
<pre>BPF_MISC+BPF_TAX X <- A
BPF_MISC+BPF_TXA A <- X</pre>
</div>
</dd>
</dl>
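<p class="Pp">To make the semantics above concrete, here is a small
interpreter sketch for a subset of the instructions (absolute and indexed
loads, <code class="Dv">BPF_MSH</code>, constant-operand jumps, and returns).
It is not the kernel's implementation. The <code class="Li">demo_</code> and
<code class="Li">D_</code> names are hypothetical, and the numeric opcode
values are assumptions chosen to mirror the classes and modes named in this
section; the scratch store and ALU operations are omitted for brevity.</p>

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of struct bpf_insn; real code uses <net/bpf.h>. */
struct demo_insn {
	uint16_t code;
	uint8_t  jt;
	uint8_t  jf;
	uint32_t k;
};

/* Opcode bits named after the classes and modes above; values are assumed. */
enum {
	D_LD = 0x00, D_LDX = 0x01, D_JMP = 0x05, D_RET = 0x06,
	D_W = 0x00, D_H = 0x08, D_B = 0x10,
	D_ABS = 0x20, D_IND = 0x40, D_MSH = 0xa0,
	D_JA = 0x00, D_JEQ = 0x10, D_JGT = 0x20, D_JGE = 0x30, D_JSET = 0x40,
	D_K = 0x00, D_A = 0x10
};

/* P[i:n]: read n bytes (1, 2 or 4) at offset i, most significant first. */
static int
demo_fetch(const uint8_t *p, size_t len, uint64_t i, unsigned n, uint32_t *out)
{
	uint32_t v = 0;
	unsigned j;

	if (i > len || len - i < n)
		return (-1);		/* load past the end of the packet */
	for (j = 0; j < n; j++)
		v = (v << 8) | p[i + j];
	*out = v;
	return (0);
}

/* Run prog over the packet; returns bytes to accept, or 0 to reject. */
static uint32_t
demo_run(const struct demo_insn *prog, size_t ninsn, const uint8_t *p,
    size_t len)
{
	uint32_t A = 0, X = 0, v;
	size_t pc;

	for (pc = 0; pc < ninsn; pc++) {
		const struct demo_insn *in = &prog[pc];

		switch (in->code) {
		case D_LD | D_W | D_ABS:
			if (demo_fetch(p, len, in->k, 4, &A)) return (0);
			break;
		case D_LD | D_H | D_ABS:
			if (demo_fetch(p, len, in->k, 2, &A)) return (0);
			break;
		case D_LD | D_B | D_ABS:
			if (demo_fetch(p, len, in->k, 1, &A)) return (0);
			break;
		case D_LD | D_H | D_IND:
			if (demo_fetch(p, len, (uint64_t)X + in->k, 2, &A))
				return (0);
			break;
		case D_LDX | D_B | D_MSH:
			if (demo_fetch(p, len, in->k, 1, &v)) return (0);
			X = 4 * (v & 0xf);
			break;
		/* Offsets are relative to the following instruction. */
		case D_JMP | D_JA:           pc += in->k; break;
		case D_JMP | D_JEQ | D_K:    pc += (A == in->k) ? in->jt : in->jf; break;
		case D_JMP | D_JGT | D_K:    pc += (A > in->k) ? in->jt : in->jf; break;
		case D_JMP | D_JGE | D_K:    pc += (A >= in->k) ? in->jt : in->jf; break;
		case D_JMP | D_JSET | D_K:   pc += (A & in->k) ? in->jt : in->jf; break;
		case D_RET | D_K:            return (in->k);
		case D_RET | D_A:            return (A);
		default:                     return (0); /* not in this sketch */
		}
	}
	return (0);	/* ran off the end without a return instruction */
}
```

<p class="Pp">In this sketch a load that falls outside the packet rejects the
packet, and all comparisons are unsigned because
<code class="Li">A</code>, <code class="Li">X</code>, and
<code class="Li">k</code> are unsigned 32-bit values.</p>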
<p class="Pp" id="BPF_STMT">The <code class="Nm">bpf</code> interface provides
the following macros to facilitate array initializers:
<a class="permalink" href="#BPF_STMT"><code class="Fn">BPF_STMT</code></a>(<var class="Fa">opcode</var>,
<var class="Fa">operand</var>) and
<a class="permalink" href="#BPF_JUMP"><code class="Fn" id="BPF_JUMP">BPF_JUMP</code></a>(<var class="Fa">opcode</var>,
<var class="Fa">operand</var>, <var class="Fa">true_offset</var>,
<var class="Fa">false_offset</var>).</p>
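<p class="Pp">As an illustration of how such initializer macros can work, the
sketch below defines hypothetical equivalents,
<code class="Li">DEMO_STMT</code> and <code class="Li">DEMO_JUMP</code>, over a
local mirror of <var class="Vt">struct bpf_insn</var>. The brace-initializer
shape and the numeric opcode values are assumptions for illustration; real
programs should use the definitions from
<code class="In">&lt;net/bpf.h&gt;</code>.</p>

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of struct bpf_insn, in the same field order. */
struct demo_insn {
	uint16_t code;
	uint8_t  jt;
	uint8_t  jf;
	uint32_t k;
};

/* Assumed shape: plain brace initializers; statements leave jt and jf 0. */
#define DEMO_STMT(code, k)		{ (uint16_t)(code), 0, 0, (k) }
#define DEMO_JUMP(code, k, jt, jf)	{ (uint16_t)(code), (jt), (jf), (k) }

/* Pre-combined opcodes; the numeric values are assumptions. */
#define DEMO_LD_H_ABS	0x28	/* load halfword at a fixed offset */
#define DEMO_JEQ_K	0x15	/* jump if A equals the constant */
#define DEMO_RET_K	0x06	/* return a constant */

/* Accept whole packets whose halfword at offset 12 is 0x0800. */
static const struct demo_insn demo_prog[] = {
	DEMO_STMT(DEMO_LD_H_ABS, 12),
	DEMO_JUMP(DEMO_JEQ_K, 0x0800, 0, 1),
	DEMO_STMT(DEMO_RET_K, (uint32_t)-1),
	DEMO_STMT(DEMO_RET_K, 0),
};

/* Instruction count, derived from the array rather than hand-counted. */
static const size_t demo_prog_len = sizeof(demo_prog) / sizeof(demo_prog[0]);
```

<p class="Pp">Note the argument order: the jump macro takes the operand before
the two offsets, while the structure stores the offsets before
<code class="Li">k</code>.</p>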
</section>
<section class="Sh">
<h1 class="Sh" id="SYSCTL_VARIABLES"><a class="permalink" href="#SYSCTL_VARIABLES">SYSCTL
VARIABLES</a></h1>
<p class="Pp">A set of <a class="Xr">sysctl(8)</a> variables controls the
behaviour of the <code class="Nm">bpf</code> subsystem.</p>
<dl class="Bl-tag">
<dt id="net.bpf.optimize_writers"><var class="Va">net.bpf.optimize_writers</var>:
<span class="No">0</span></dt>
<dd>Various programs use BPF to send (but not receive) raw packets; cdpd,
  lldpd, dhcpd, and DHCP relays are good examples. They
  do not need incoming packets to be sent to them. Turning this option on
  attaches new BPF users to a write-only interface list until the
  program explicitly specifies a read filter via
  <a class="permalink" href="#pcap_set_filter"><code class="Fn" id="pcap_set_filter">pcap_set_filter</code></a>().
  This avoids the performance degradation such programs would otherwise cause
  on high-speed interfaces.</dd>
<dt id="net.bpf.stats"><var class="Va">net.bpf.stats</var>:</dt>
<dd>Binary interface for retrieving general statistics.</dd>
<dt id="net.bpf.zerocopy_enable"><var class="Va">net.bpf.zerocopy_enable</var>:
<span class="No">0</span></dt>
<dd>Permits zero-copy to be used with net BPF readers. Use with caution.</dd>
<dt id="net.bpf.maxinsns"><var class="Va">net.bpf.maxinsns</var>:
<span class="No">512</span></dt>
<dd>Maximum number of instructions that a BPF program can contain. Use the
  <a class="Xr">tcpdump(1)</a> <code class="Fl">-d</code> option to
  determine the approximate number of instructions for any filter.</dd>
<dt id="net.bpf.maxbufsize"><var class="Va">net.bpf.maxbufsize</var>:
<span class="No">524288</span></dt>
<dd>Maximum buffer size to allocate for the packet buffer.</dd>
<dt id="net.bpf.bufsize"><var class="Va">net.bpf.bufsize</var>:
<span class="No">4096</span></dt>
<dd>Default buffer size to allocate for the packet buffer.</dd>
</dl>
</section>
<section class="Sh">
<h1 class="Sh" id="EXAMPLES"><a class="permalink" href="#EXAMPLES">EXAMPLES</a></h1>
<p class="Pp">The following filter is taken from the Reverse ARP Daemon. It
accepts only Reverse ARP requests.</p>
<div class="Bd Pp Li">
<pre>struct bpf_insn insns[] = {
        BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 12),
        BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, ETHERTYPE_REVARP, 0, 3),
        BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 20),
        BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, ARPOP_REVREQUEST, 0, 1),
        BPF_STMT(BPF_RET+BPF_K, sizeof(struct ether_arp) +
                 sizeof(struct ether_header)),
        BPF_STMT(BPF_RET+BPF_K, 0),
};</pre>
</div>
<p class="Pp">This filter accepts only IP packets between host 128.3.112.15 and
128.3.112.35.</p>
<div class="Bd Pp Li">
<pre>struct bpf_insn insns[] = {
        BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 12),
        BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, ETHERTYPE_IP, 0, 8),
        BPF_STMT(BPF_LD+BPF_W+BPF_ABS, 26),
        BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x8003700f, 0, 2),
        BPF_STMT(BPF_LD+BPF_W+BPF_ABS, 30),
        BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x80037023, 3, 4),
        BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x80037023, 0, 3),
        BPF_STMT(BPF_LD+BPF_W+BPF_ABS, 30),
        BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x8003700f, 0, 1),
        BPF_STMT(BPF_RET+BPF_K, (u_int)-1),
        BPF_STMT(BPF_RET+BPF_K, 0),
};</pre>
</div>
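<p class="Pp">The constants in this filter are the two host addresses written
as 32-bit words, and offsets 26 and 30 are the IP source and destination
address fields behind a 14-byte Ethernet header. The hypothetical helper below
(not part of <code class="Nm">bpf</code>) shows how a dotted quad maps to the
value that a <code class="Dv">BPF_W</code> load places in the accumulator.</p>

```c
#include <stdint.h>

/*
 * Hypothetical helper: pack the dotted quad a.b.c.d into the 32-bit value
 * compared against the accumulator after a word load of an IPv4 address.
 */
static uint32_t
demo_ipv4_word(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
	return (((uint32_t)a << 24) | ((uint32_t)b << 16) |
	    ((uint32_t)c << 8) | (uint32_t)d);
}
```

<p class="Pp">For instance,
<code class="Li">demo_ipv4_word(128, 3, 112, 15)</code> yields
<code class="Li">0x8003700f</code> and
<code class="Li">demo_ipv4_word(128, 3, 112, 35)</code> yields
<code class="Li">0x80037023</code>, matching the filter above.</p>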
<p class="Pp">Finally, this filter returns only TCP finger packets. We must
parse the IP header to reach the TCP header. The
<code class="Dv">BPF_JSET</code> instruction checks that the IP fragment
offset is 0 so we are sure that we have a TCP header.</p>
<div class="Bd Pp Li">
<pre>struct bpf_insn insns[] = {
        BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 12),
        BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, ETHERTYPE_IP, 0, 10),
        BPF_STMT(BPF_LD+BPF_B+BPF_ABS, 23),
        BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, IPPROTO_TCP, 0, 8),
        BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 20),
        BPF_JUMP(BPF_JMP+BPF_JSET+BPF_K, 0x1fff, 6, 0),
        BPF_STMT(BPF_LDX+BPF_B+BPF_MSH, 14),
        BPF_STMT(BPF_LD+BPF_H+BPF_IND, 14),
        BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 79, 2, 0),
        BPF_STMT(BPF_LD+BPF_H+BPF_IND, 16),
        BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 79, 0, 1),
        BPF_STMT(BPF_RET+BPF_K, (u_int)-1),
        BPF_STMT(BPF_RET+BPF_K, 0),
};</pre>
</div>
</section>
<section class="Sh">
<h1 class="Sh" id="SEE_ALSO"><a class="permalink" href="#SEE_ALSO">SEE
ALSO</a></h1>
<p class="Pp"><a class="Xr">tcpdump(1)</a>, <a class="Xr">ioctl(2)</a>,
<a class="Xr">kqueue(2)</a>, <a class="Xr">poll(2)</a>,
<a class="Xr">select(2)</a>, <a class="Xr">ng_bpf(4)</a>,
<a class="Xr">bpf(9)</a></p>
<p class="Pp"><cite class="Rs"><span class="RsA">McCanne, S.</span> and
  <span class="RsA">Jacobson, V.</span>, <span class="RsT">An efficient,
extensible, and portable network monitor</span>.</cite></p>
</section>
<section class="Sh">
<h1 class="Sh" id="HISTORY"><a class="permalink" href="#HISTORY">HISTORY</a></h1>
<p class="Pp">The Enet packet filter was created in 1980 by Mike Accetta and
Rick Rashid at Carnegie-Mellon University. Jeffrey Mogul, at Stanford,
ported the code to <span class="Ux">BSD</span> and continued its development
from 1983 on. Since then, it has evolved into the Ultrix Packet Filter at
DEC, a STREAMS NIT module under SunOS 4.1, and BPF.</p>
</section>
<section class="Sh">
<h1 class="Sh" id="AUTHORS"><a class="permalink" href="#AUTHORS">AUTHORS</a></h1>
<p class="Pp"><span class="An">Steven McCanne</span>, of Lawrence Berkeley
Laboratory, implemented BPF in Summer 1990. Much of the design is due to
<span class="An">Van Jacobson</span>.</p>
<p class="Pp">Support for zero-copy buffers was added by <span class="An">Robert
N. M. Watson</span> under contract to Seccuris Inc.</p>
</section>
<section class="Sh">
<h1 class="Sh" id="BUGS"><a class="permalink" href="#BUGS">BUGS</a></h1>
<p class="Pp">The read buffer must be of a fixed size (returned by the
<code class="Dv">BIOCGBLEN</code> ioctl).</p>
<p class="Pp">A file that does not request promiscuous mode may receive
promiscuously received packets as a side effect of another file requesting
this mode on the same hardware interface. This could be fixed in the kernel
with additional processing overhead. However, we favor the model where all
files must assume that the interface is promiscuous, and if so desired, must
utilize a filter to reject foreign packets.</p>
<p class="Pp">The <code class="Dv">SEESENT</code>,
<code class="Dv">DIRECTION</code>, and <code class="Dv">FEEDBACK</code>
settings have been observed to work incorrectly on some interface types,
including those with hardware loopback rather than software loopback, and
point-to-point interfaces. They appear to function correctly on a broad
range of Ethernet-style interfaces.</p>
</section>
</div>
<table class="foot">
<tr>
<td class="foot-date">December 10, 2025</td>
<td class="foot-os">FreeBSD 15.0</td>
</tr>
</table>