.\"	$OpenBSD: bpf.4,v 1.19 2003/10/22 18:42:40 canacar Exp $
.\"     $NetBSD: bpf.4,v 1.7 1995/09/27 18:31:50 thorpej Exp $
.\"
.\" Copyright (c) 1990 The Regents of the University of California.
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that: (1) source code distributions
.\" retain the above copyright notice and this paragraph in its entirety, (2)
.\" distributions including binary code include the above copyright notice and
.\" this paragraph in its entirety in the documentation or other materials
.\" provided with the distribution, and (3) all advertising materials mentioning
.\" features or use of this software display the following acknowledgement:
.\" ``This product includes software developed by the University of California,
.\" Lawrence Berkeley Laboratory and its contributors.'' Neither the name of
.\" the University nor the names of its contributors may be used to endorse
.\" or promote products derived from this software without specific prior
.\" written permission.
.\" THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR IMPLIED
.\" WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
.\" MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
.\"
.\" This document is derived in part from the enet man page (enet.4)
.\" distributed with 4.3BSD Unix.
.\"
.Dd May 23, 1991
.Dt BPF 4
.Os
.Sh NAME
.Nm bpf
.Nd Berkeley Packet Filter
.Sh SYNOPSIS
.Cd "pseudo-device bpfilter 8"
.Sh DESCRIPTION
The Berkeley Packet Filter provides a raw interface to data link layers in
a protocol-independent fashion.
All packets on the network, even those destined for other hosts, are
accessible through this mechanism.
.Pp
The packet filter appears as a character special device,
.Pa /dev/bpf0 ,
.Pa /dev/bpf1 ,
etc.
After opening the device, the file descriptor must be bound to a specific
network interface with the
.Dv BIOCSETIF
ioctl.
A given interface can be shared between multiple listeners, and the filter
underlying each descriptor will see an identical packet stream.
The total number of open files is limited to the value given in the kernel
configuration; the example given in the
.Sx SYNOPSIS
above sets the limit to 8.
.Pp
A separate device file is required for each minor device.
If a file is in use, the open will fail and
.Va errno
will be set to
.Er EBUSY .
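.Pp
The probing this implies can be sketched as below.
This is only an illustration:
.Fn open_first_free_bpf
and its
.Fa max_units
argument are hypothetical names, and giving up at the first error other than
.Er EBUSY
is a choice made for the sketch.
.Bd -literal -offset indent
```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: try /dev/bpf0, /dev/bpf1, ... in turn and return
 * the first descriptor that opens, or -1 if none of the first
 * max_units devices can be opened.
 */
static int
open_first_free_bpf(int max_units)
{
	char path[32];
	int fd, i;

	assert(max_units >= 0);
	for (i = 0; i < max_units; i++) {
		snprintf(path, sizeof(path), "/dev/bpf%d", i);
		fd = open(path, O_RDONLY);
		if (fd != -1)
			return fd;	/* caller binds it with BIOCSETIF */
		if (errno != EBUSY)
			break;		/* missing device, no permission, ... */
	}
	return -1;
}
```
.Ed
.Pp
A caller would typically pass the limit from the kernel configuration, for
example
.Li open_first_free_bpf(8) ,
and then bind the returned descriptor with
.Dv BIOCSETIF .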
.Pp
Associated with each open instance of a
.Nm
file is a user-settable
packet filter.
Whenever a packet is received by an interface, all file descriptors
listening on that interface apply their filter.
Each descriptor that accepts the packet receives its own copy.
.Pp
Reads from these files return the next group of packets that have matched
the filter.
To improve performance, the buffer passed to read must be the same size as
the buffers used internally by
.Nm bpf .
This size is returned by the
.Dv BIOCGBLEN
ioctl (see below) and, under BSD, can be set with
.Dv BIOCSBLEN .
Note that an individual packet larger than this size is necessarily truncated.
.Pp
The packet filter will support any link level protocol that has fixed length
headers.
Currently, only Ethernet, SLIP, and PPP drivers have been modified to
interact with
.Nm bpf .
.Pp
Since packet data is in network byte order, applications should use the
.Xr byteorder 3
macros to extract multi-byte values.
.Pp
A packet can be sent out on the network by writing to a
.Nm
file descriptor.
Each descriptor can also have a user-settable filter
for controlling the writes.
Only packets matching the filter are sent out of the interface.
The writes are unbuffered, meaning only one packet can be processed per write.
.Pp
Once a descriptor is configured, further changes to the configuration
can be prevented using the
.Dv BIOCLOCK
ioctl.
.Ss Ioctls
The ioctl command codes below are defined in
.Aq Pa net/bpf.h .
All commands require these includes:
.Bd -unfilled -offset indent
.Cd #include <sys/types.h>
.Cd #include <sys/time.h>
.Cd #include <sys/ioctl.h>
.Cd #include <net/bpf.h>
.Ed
.Pp
Additionally,
.Dv BIOCGETIF
and
.Dv BIOCSETIF
require
.Aq Pa sys/socket.h
and
.Aq Pa net/if.h .
.Pp
The (third) argument to the
.Xr ioctl 2
call should be a pointer to the type indicated.
.Bl -tag -width Ds
.It Dv BIOCGBLEN ( Li int )
Returns the required buffer length for reads on
.Nm
files.
.It Dv BIOCSBLEN ( Li u_int )
Sets the buffer length for reads on
.Nm
files.
The buffer must be set before the file is attached to an interface with
.Dv BIOCSETIF .
If the requested buffer size cannot be accommodated, the closest allowable
size will be set and returned in the argument.
A read call will result in
.Er EIO
if it is passed a buffer that is not this size.
.It Dv BIOCGDLT ( Li u_int )
Returns the type of the data link layer underlying the attached interface.
.Er EINVAL
is returned if no interface has been specified.
The device types, prefixed with
.Dq DLT_ ,
are defined in
.Aq Pa net/bpf.h .
.It Dv BIOCPROMISC
Forces the interface into promiscuous mode.
All packets, not just those destined for the local host, are processed.
Since more than one file can be listening on a given interface, a listener
that opened its interface non-promiscuously may receive packets promiscuously.
This problem can be remedied with an appropriate filter.
.Pp
The interface remains in promiscuous mode until all files listening
promiscuously are closed.
.It Dv BIOCFLUSH
Flushes the buffer of incoming packets and resets the statistics that are
returned by
.Dv BIOCGSTATS .
.It Dv BIOCLOCK
This ioctl is designed to prevent the security issues associated
with an open
.Nm
descriptor in unprivileged programs.
Even with dropped privileges, an open
.Nm
descriptor can be abused by a rogue program to listen on any interface
on the system, send packets on these interfaces if the descriptor was
opened read-write, and send signals to arbitrary processes using the
signaling mechanism of
.Nm bpf .
By allowing only
.Dq known safe
ioctls, the
.Dv BIOCLOCK
ioctl prevents this abuse.
The allowable ioctls are
.Dv BIOCGBLEN ,
.Dv BIOCFLUSH ,
.Dv BIOCGDLT ,
.Dv BIOCGETIF ,
.Dv BIOCGRTIMEOUT ,
.Dv BIOCSRTIMEOUT ,
.Dv BIOCIMMEDIATE ,
.Dv BIOCGSTATS ,
.Dv BIOCVERSION ,
.Dv BIOCGRSIG ,
.Dv BIOCGHDRCMPLT ,
.Dv TIOCGPGRP ,
and
.Dv FIONREAD .
Use of any other ioctl is denied with error
.Er EPERM .
Once a descriptor is locked, it is not possible to unlock it.
A process with root privileges is not affected by the lock.
.Pp
A privileged program can open a
.Nm
device, drop privileges, set the interface, filters and modes on the
descriptor, and lock it.
Once the descriptor is locked, the system is safe
from further abuse through the descriptor.
Locking a descriptor does not prevent writes.
If the application does not need to send packets through
.Nm bpf ,
it can open the device read-only to prevent writing.
If sending packets is necessary, a write-filter can be set before locking the
descriptor to prevent arbitrary packets from being sent out.
.It Dv BIOCGETIF ( Li "struct ifreq" )
Returns the name of the hardware interface that the file is listening on.
The name is returned in the
.Fa ifr_name
field of the
.Li struct ifreq .
All other fields are undefined.
.It Dv BIOCSETIF ( Li "struct ifreq" )
Sets the hardware interface associated with the file.
This command must be performed before any packets can be read.
The device is indicated by name using the
.Fa ifr_name
field of the
.Li struct ifreq .
Additionally, performs the actions of
.Dv BIOCFLUSH .
.It Dv BIOCSRTIMEOUT , BIOCGRTIMEOUT ( Li "struct timeval" )
Set or get the read timeout parameter.
The
.Ar timeval
specifies the length of time to wait before timing out on a read request.
This parameter is initialized to zero by
.Xr open 2 ,
indicating no timeout.
.It Dv BIOCGSTATS ( Li "struct bpf_stat" )
Returns the following structure of packet statistics:
.Bd -literal -offset indent
struct bpf_stat {
	u_int bs_recv;
	u_int bs_drop;
};
.Ed
.Pp
The fields are:
.Bl -tag -width bs_recv
.It Fa bs_recv
Number of packets received by the descriptor since opened or reset (including
any buffered since the last read call).
.It Fa bs_drop
Number of packets which were accepted by the filter but dropped by the kernel
because of buffer overflows (i.e., the application's reads aren't keeping up
with the packet traffic).
.El
.It Dv BIOCIMMEDIATE ( Li u_int )
Enable or disable
.Dq immediate mode ,
based on the truth value of the argument.
When immediate mode is enabled, reads return immediately upon packet reception.
Otherwise, a read will block until either the kernel buffer becomes full or a
timeout occurs.
This is useful for programs like
.Xr rarpd 8 ,
which must respond to messages in real time.
The default for a new file is off.
.It Dv BIOCSETF ( Li "struct bpf_program" )
Sets the filter program used by the kernel to discard uninteresting packets.
An array of instructions and its length are passed in using the following
structure:
.Bd -literal -offset indent
struct bpf_program {
	int bf_len;
	struct bpf_insn *bf_insns;
};
.Ed
.Pp
The filter program is pointed to by the
.Fa bf_insns
field, while its length in units of
.Li struct bpf_insn
is given by the
.Fa bf_len
field.
Also, the actions of
.Dv BIOCFLUSH
are performed.
.Pp
See section
.Sx FILTER MACHINE
for an explanation of the filter language.
.It Dv BIOCSETWF ( Li "struct bpf_program" )
Sets the filter program used by the kernel to filter the packets
written to the descriptor before the packets are sent out on the
network.
See
.Dv BIOCSETF
for a description of the filter program.
This ioctl also acts as
.Dv BIOCFLUSH .
.Pp
Note that the filter operates on the packet data written to the descriptor.
If the
.Dq header complete
flag is not set, the kernel sets the link-layer source address
of the packet after filtering.
.It Dv BIOCVERSION ( Li "struct bpf_version" )
Returns the major and minor version numbers of the filter language currently
recognized by the kernel.
Before installing a filter, applications must check that the current version
is compatible with the running kernel.
Version numbers are compatible if the major numbers match and the application
minor is less than or equal to the kernel minor.
The kernel version number is returned in the following structure:
.Bd -literal -offset indent
struct bpf_version {
	u_short bv_major;
	u_short bv_minor;
};
.Ed
.Pp
The current version numbers are given by
.Dv BPF_MAJOR_VERSION
and
.Dv BPF_MINOR_VERSION
from
.Aq Pa net/bpf.h .
An incompatible filter may result in undefined behavior (most likely, an
error returned by
.Xr ioctl 2
or haphazard packet matching).
.It Dv BIOCSRSIG , BIOCGRSIG ( Li u_int )
Set or get the receive signal.
This signal will be sent to the process or process group specified by
.Dv FIOSETOWN .
It defaults to
.Dv SIGIO .
.It Dv BIOCSHDRCMPLT , BIOCGHDRCMPLT ( Li u_int )
Set or get the status of the
.Dq header complete
flag.
Set to zero if the link level source address should be filled in
automatically by the interface output routine.
Set to one if the link level source address will be written,
as provided, to the wire.
This flag is initialized to zero by default.
.El
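.Pp
The compatibility rule given under
.Dv BIOCVERSION
amounts to a two-part comparison.
A minimal sketch, assuming a hypothetical helper
.Fn bpf_version_ok
that is not part of
.Aq Pa net/bpf.h :
.Bd -literal -offset indent
```c
#include <assert.h>

/*
 * Hypothetical helper: nonzero if a filter built against the application
 * version may be installed on a kernel reporting the kernel version.
 */
static int
bpf_version_ok(unsigned int app_major, unsigned int app_minor,
    unsigned int kern_major, unsigned int kern_minor)
{
	/* Majors must match; the application minor must not be newer. */
	return app_major == kern_major && app_minor <= kern_minor;
}
```
.Ed
.Pp
In a real program the kernel values would come from the
.Li struct bpf_version
filled in by
.Dv BIOCVERSION ,
and the application values from
.Dv BPF_MAJOR_VERSION
and
.Dv BPF_MINOR_VERSION .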
.Ss Standard ioctls
.Nm
now supports several standard ioctls which allow the user to do asynchronous
and/or non-blocking I/O to an open
.Nm
file descriptor.
.Bl -tag -width Ds
.It Dv FIONREAD ( Li int )
Returns the number of bytes that are immediately available for reading.
.It Dv SIOCGIFADDR ( Li "struct ifreq" )
Returns the address associated with the interface.
.It Dv FIONBIO ( Li int )
Set or clear non-blocking I/O.
If the argument is non-zero, enable non-blocking I/O.
If the argument is zero, disable non-blocking I/O.
If non-blocking I/O is enabled, the return value of a read while no data
is available will be 0.
The non-blocking read behavior is different from performing non-blocking
reads on other file descriptors, which will return \-1 and set
.Va errno
to
.Er EAGAIN
if no data is available.
Note: setting this overrides the timeout set by
.Dv BIOCSRTIMEOUT .
.It Dv FIOASYNC ( Li int )
Enable or disable asynchronous I/O.
When enabled (argument is non-zero), the process or process group specified
by
.Dv FIOSETOWN
will start receiving
.Dv SIGIO
signals when packets arrive.
Note that you must perform an
.Dv FIOSETOWN
command in order for this to take effect, as the system will not do it by
default.
The signal may be changed via
.Dv BIOCSRSIG .
.It Dv FIOSETOWN , FIOGETOWN ( Li int )
Set or get the process or process group (if negative) that should receive
.Dv SIGIO
when packets are available.
The signal may be changed using
.Dv BIOCSRSIG
(see above).
.El
.Ss BPF header
The following structure is prepended to each packet returned by
.Xr read 2 :
.Bd -literal -offset indent
struct bpf_hdr {
	struct bpf_timeval bh_tstamp;
	u_int32_t	bh_caplen;
	u_int32_t	bh_datalen;
	u_int16_t	bh_hdrlen;
};
.Ed
.Pp
The fields, stored in host order, are as follows:
.Bl -tag -width Ds
.It Fa bh_tstamp
Time at which the packet was processed by the packet filter.
.It Fa bh_caplen
Length of the captured portion of the packet.
This is the minimum of the truncation amount specified by the filter and the
length of the packet.
.It Fa bh_datalen
Length of the packet off the wire.
This value is independent of the truncation amount specified by the filter.
.It Fa bh_hdrlen
Length of the BPF header, which may not be equal to
.Li sizeof(struct bpf_hdr) .
.El
.Pp
The
.Fa bh_hdrlen
field exists to account for padding between the header and the link level
protocol.
The purpose here is to guarantee proper alignment of the packet data
structures, which is required on alignment-sensitive architectures and
improves performance on many other architectures.
The packet filter ensures that the
.Fa bpf_hdr
and the network layer header will be word aligned.
Suitable precautions must be taken when accessing the link layer protocol
fields on alignment restricted machines.
(This isn't a problem on an Ethernet, since the type field is a
.Li short
falling on an even offset, and the addresses are probably accessed in a
bytewise fashion).
.Pp
Additionally, individual packets are padded so that each starts on a
word boundary.
This requires that an application has some knowledge of how to get from packet
to packet.
The macro
.Dv BPF_WORDALIGN
is defined in
.Aq Pa net/bpf.h
to facilitate this process.
It rounds up its argument to the nearest word aligned value (where a word is
.Dv BPF_ALIGNMENT
bytes wide).
For example, if
.Va p
points to the start of a packet, this expression will advance it to the
next packet:
.Pp
.Dl p = (char *)p + BPF_WORDALIGN(p->bh_hdrlen + p->bh_caplen);
.Pp
For the alignment mechanisms to work properly, the buffer passed to
.Xr read 2
must itself be word aligned.
.Xr malloc 3
will always return an aligned buffer.
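.Pp
The packet-to-packet arithmetic can be tried on a synthetic buffer.
The sketch below deliberately does not include
.Aq Pa net/bpf.h :
.Li struct demo_hdr ,
.Dv DEMO_ALIGNMENT ,
.Dv DEMO_WORDALIGN ,
.Fn demo_append
and
.Fn demo_walk
are hypothetical stand-ins that only mirror the layout described in this
section, with a simplified timestamp and an assumed 4-byte word.
.Bd -literal -offset indent
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-ins for BPF_ALIGNMENT, BPF_WORDALIGN and struct bpf_hdr. */
#define DEMO_ALIGNMENT	sizeof(uint32_t)
#define DEMO_WORDALIGN(x) (((x) + (DEMO_ALIGNMENT - 1)) & ~(DEMO_ALIGNMENT - 1))

struct demo_hdr {
	uint32_t	ts_sec;		/* stands in for bh_tstamp */
	uint32_t	ts_usec;
	uint32_t	bh_caplen;
	uint32_t	bh_datalen;
	uint16_t	bh_hdrlen;
};

/* Append one record with caplen zeroed payload bytes; returns next offset. */
static size_t
demo_append(unsigned char *buf, size_t off, uint32_t caplen)
{
	struct demo_hdr h;

	memset(&h, 0, sizeof(h));
	h.bh_caplen = h.bh_datalen = caplen;
	h.bh_hdrlen = (uint16_t)DEMO_WORDALIGN(sizeof(h));
	memcpy(buf + off, &h, sizeof(h));
	memset(buf + off + h.bh_hdrlen, 0, caplen);	/* payload follows header */
	return off + DEMO_WORDALIGN((size_t)h.bh_hdrlen + caplen);
}

/* Visit every record in buf[0..len); returns the count, sums bh_caplen. */
static size_t
demo_walk(const unsigned char *buf, size_t len, size_t *captured)
{
	size_t off = 0, n = 0;

	assert(captured != NULL);
	*captured = 0;
	while (off + sizeof(struct demo_hdr) <= len) {
		struct demo_hdr h;

		memcpy(&h, buf + off, sizeof(h));	/* no unaligned access */
		if (h.bh_hdrlen == 0 ||
		    (size_t)h.bh_hdrlen + h.bh_caplen > len - off)
			break;		/* malformed or truncated record */
		*captured += h.bh_caplen;
		n++;
		/* Same step as the BPF_WORDALIGN expression above. */
		off += DEMO_WORDALIGN((size_t)h.bh_hdrlen + h.bh_caplen);
	}
	return n;
}
```
.Ed
.Pp
With real data the loop body would hand
.Li buf + off + bh_hdrlen
and
.Fa bh_caplen
to the packet decoder instead of only counting bytes.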
.Ss Filter machine
A filter program is an array of instructions with all branches forwardly
directed, terminated by a
.Dq return
instruction.
Each instruction performs some action on the pseudo-machine state, which
consists of an accumulator, index register, scratch memory store, and
implicit program counter.
.Pp
The following structure defines the instruction format:
.Bd -literal -offset indent
struct bpf_insn {
	u_int16_t	code;
	u_char		jt;
	u_char		jf;
	u_int32_t	k;
};
.Ed
.Pp
The
.Fa k
field is used in different ways by different instructions, and the
.Fa jt
and
.Fa jf
fields are used as offsets by the branch instructions.
The opcodes are encoded in a semi-hierarchical fashion.
There are eight classes of instructions:
.Dv BPF_LD ,
.Dv BPF_LDX ,
.Dv BPF_ST ,
.Dv BPF_STX ,
.Dv BPF_ALU ,
.Dv BPF_JMP ,
.Dv BPF_RET ,
and
.Dv BPF_MISC .
Various other mode and operator bits are logically OR'd into the class to
give the actual instructions.
The classes and modes are defined in
.Aq Pa net/bpf.h .
Below are the semantics for each defined
.Nm
instruction.
We use the convention that A is the accumulator, X is the index register,
P[] packet data, and M[] scratch memory store.
P[i:n] gives the data at byte offset
.Dq i
in the packet, interpreted as a word (n=4), unsigned halfword (n=2), or
unsigned byte (n=1).
M[i] gives the i'th word in the scratch memory store, which is only addressed
in word units.
The memory store is indexed from 0 to
.Dv BPF_MEMWORDS Ns \-1 .
.Fa k ,
.Fa jt ,
and
.Fa jf
are the corresponding fields in the instruction definition.
.Dq len
refers to the length of the packet.
.Bl -tag -width Ds
.It Dv BPF_LD
These instructions copy a value into the accumulator.
The type of the source operand is specified by an
.Dq addressing mode
and can be a constant
.Pf ( Dv BPF_IMM ) ,
packet data at a fixed offset
.Pf ( Dv BPF_ABS ) ,
packet data at a variable offset
.Pf ( Dv BPF_IND ) ,
the packet length
.Pf ( Dv BPF_LEN ) ,
or a word in the scratch memory store
.Pf ( Dv BPF_MEM ) .
For
.Dv BPF_IND
and
.Dv BPF_ABS ,
the data size must be specified as a word
.Pf ( Dv BPF_W ) ,
halfword
.Pf ( Dv BPF_H ) ,
or byte
.Pf ( Dv BPF_B ) .
The semantics of all recognized
.Dv BPF_LD
instructions follow.
.Pp
.Bl -tag -width 32n -compact
.Sm off
.It Xo Dv BPF_LD No + Dv BPF_W No +
.Dv BPF_ABS
.Xc
.Sm on
A <- P[k:4]
.Sm off
.It Xo Dv BPF_LD No + Dv BPF_H No +
.Dv BPF_ABS
.Xc
.Sm on
A <- P[k:2]
.Sm off
.It Xo Dv BPF_LD No + Dv BPF_B No +
.Dv BPF_ABS
.Xc
.Sm on
A <- P[k:1]
.Sm off
.It Xo Dv BPF_LD No + Dv BPF_W No +
.Dv BPF_IND
.Xc
.Sm on
A <- P[X+k:4]
.Sm off
.It Xo Dv BPF_LD No + Dv BPF_H No +
.Dv BPF_IND
.Xc
.Sm on
A <- P[X+k:2]
.Sm off
.It Xo Dv BPF_LD No + Dv BPF_B No +
.Dv BPF_IND
.Xc
.Sm on
A <- P[X+k:1]
.Sm off
.It Xo Dv BPF_LD No + Dv BPF_W No +
.Dv BPF_LEN
.Xc
.Sm on
A <- len
.Sm off
.It Dv BPF_LD No + Dv BPF_IMM
.Sm on
A <- k
.Sm off
.It Dv BPF_LD No + Dv BPF_MEM
.Sm on
A <- M[k]
.El
.It Dv BPF_LDX
These instructions load a value into the index register.
Note that the addressing modes are more restricted than those of the
accumulator loads, but they include
.Dv BPF_MSH ,
a hack for efficiently loading the IP header length.
.Pp
.Bl -tag -width 32n -compact
.Sm off
.It Xo Dv BPF_LDX No + Dv BPF_W No +
.Dv BPF_IMM
.Xc
.Sm on
X <- k
.Sm off
.It Xo Dv BPF_LDX No + Dv BPF_W No +
.Dv BPF_MEM
.Xc
.Sm on
X <- M[k]
.Sm off
.It Xo Dv BPF_LDX No + Dv BPF_W No +
.Dv BPF_LEN
.Xc
.Sm on
X <- len
.Sm off
.It Xo Dv BPF_LDX No + Dv BPF_B No +
.Dv BPF_MSH
.Xc
.Sm on
X <- 4*(P[k:1]&0xf)
.El
.It Dv BPF_ST
This instruction stores the accumulator into the scratch memory.
We do not need an addressing mode since there is only one possibility for
the destination.
.Pp
.Bl -tag -width 32n -compact
.It Dv BPF_ST
M[k] <- A
.El
.It Dv BPF_STX
This instruction stores the index register in the scratch memory store.
.Pp
.Bl -tag -width 32n -compact
.It Dv BPF_STX
M[k] <- X
.El
.It Dv BPF_ALU
The ALU instructions perform operations between the accumulator and index
register or constant, and store the result back in the accumulator.
For binary operations, a source mode is required
.Pf ( Dv BPF_K
or
.Dv BPF_X ) .
.Pp
.Bl -tag -width 32n -compact
.Sm off
.It Xo Dv BPF_ALU No + Dv BPF_ADD No +
.Dv BPF_K
.Xc
.Sm on
A <- A + k
.Sm off
.It Xo Dv BPF_ALU No + Dv BPF_SUB No +
.Dv BPF_K
.Xc
.Sm on
A <- A - k
.Sm off
.It Xo Dv BPF_ALU No + Dv BPF_MUL No +
.Dv BPF_K
.Xc
.Sm on
A <- A * k
.Sm off
.It Xo Dv BPF_ALU No + Dv BPF_DIV No +
.Dv BPF_K
.Xc
.Sm on
A <- A / k
.Sm off
.It Xo Dv BPF_ALU No + Dv BPF_AND No +
.Dv BPF_K
.Xc
.Sm on
A <- A & k
.Sm off
.It Xo Dv BPF_ALU No + Dv BPF_OR No +
.Dv BPF_K
.Xc
.Sm on
A <- A | k
.Sm off
.It Xo Dv BPF_ALU No + Dv BPF_LSH No +
.Dv BPF_K
.Xc
.Sm on
A <- A << k
.Sm off
.It Xo Dv BPF_ALU No + Dv BPF_RSH No +
.Dv BPF_K
.Xc
.Sm on
A <- A >> k
.Sm off
.It Xo Dv BPF_ALU No + Dv BPF_ADD No +
.Dv BPF_X
.Xc
.Sm on
A <- A + X
.Sm off
.It Xo Dv BPF_ALU No + Dv BPF_SUB No +
.Dv BPF_X
.Xc
.Sm on
A <- A - X
.Sm off
.It Xo Dv BPF_ALU No + Dv BPF_MUL No +
.Dv BPF_X
.Xc
.Sm on
A <- A * X
.Sm off
.It Xo Dv BPF_ALU No + Dv BPF_DIV No +
.Dv BPF_X
.Xc
.Sm on
A <- A / X
.Sm off
.It Xo Dv BPF_ALU No + Dv BPF_AND No +
.Dv BPF_X
.Xc
.Sm on
A <- A & X
.Sm off
.It Xo Dv BPF_ALU No + Dv BPF_OR No +
.Dv BPF_X
.Xc
.Sm on
A <- A | X
.Sm off
.It Xo Dv BPF_ALU No + Dv BPF_LSH No +
.Dv BPF_X
.Xc
.Sm on
A <- A << X
.Sm off
.It Xo Dv BPF_ALU No + Dv BPF_RSH No +
.Dv BPF_X
.Xc
.Sm on
A <- A >> X
.Sm off
.It Dv BPF_ALU No + Dv BPF_NEG
.Sm on
A <- -A
.El
.It Dv BPF_JMP
The jump instructions alter flow of control.
Conditional jumps compare the accumulator against a constant
.Pf ( Dv BPF_K )
or the index register
.Pf ( Dv BPF_X ) .
If the result is true (or non-zero), the true branch is taken, otherwise the
false branch is taken.
Jump offsets are encoded in 8 bits so the longest jump is 256 instructions.
However, the jump always
.Pf ( Dv BPF_JA )
opcode uses the 32-bit
.Fa k
field as the offset, allowing arbitrarily distant destinations.
All conditionals use unsigned comparison conventions.
.Pp
.Bl -tag -width 32n -compact
.Sm off
.It Dv BPF_JMP No + Dv BPF_JA
.Sm on
pc += k
.Sm off
.It Xo Dv BPF_JMP No + Dv BPF_JGT No +
.Dv BPF_K
.Xc
.Sm on
pc += (A > k) ? jt : jf
.Sm off
.It Xo Dv BPF_JMP No + Dv BPF_JGE No +
.Dv BPF_K
.Xc
.Sm on
pc += (A >= k) ? jt : jf
.Sm off
.It Xo Dv BPF_JMP No + Dv BPF_JEQ No +
.Dv BPF_K
.Xc
.Sm on
pc += (A == k) ? jt : jf
.Sm off
.It Xo Dv BPF_JMP No + Dv BPF_JSET No +
.Dv BPF_K
.Xc
.Sm on
pc += (A & k) ? jt : jf
.Sm off
.It Xo Dv BPF_JMP No + Dv BPF_JGT No +
.Dv BPF_X
.Xc
.Sm on
pc += (A > X) ? jt : jf
.Sm off
.It Xo Dv BPF_JMP No + Dv BPF_JGE No +
.Dv BPF_X
.Xc
.Sm on
pc += (A >= X) ? jt : jf
.Sm off
.It Xo Dv BPF_JMP No + Dv BPF_JEQ No +
.Dv BPF_X
.Xc
.Sm on
pc += (A == X) ? jt : jf
.Sm off
.It Xo Dv BPF_JMP No + Dv BPF_JSET No +
.Dv BPF_X
.Xc
.Sm on
pc += (A & X) ? jt : jf
.El
.It Dv BPF_RET
The return instructions terminate the filter program and specify the
amount of packet to accept (i.e., they return the truncation amount)
or, for the write filter, the maximum acceptable size for the packet
(i.e., the packet is dropped if it is larger than the returned
amount).
A return value of zero indicates that the packet should be ignored/dropped.
The return value is either a constant
.Pf ( Dv BPF_K )
or the accumulator
.Pf ( Dv BPF_A ) .
.Pp
.Bl -tag -width 32n -compact
.It Dv BPF_RET No + Dv BPF_A
Accept A bytes.
.It Dv BPF_RET No + Dv BPF_K
Accept k bytes.
.El
.It Dv BPF_MISC
The miscellaneous category was created for anything that doesn't fit into
the above classes, and for any new instructions that might need to be added.
Currently, these are the register transfer instructions that copy the index
register to the accumulator or vice versa.
.Pp
.Bl -tag -width 32n -compact
.Sm off
.It Dv BPF_MISC No + Dv BPF_TAX
.Sm on
X <- A
.Sm off
.It Dv BPF_MISC No + Dv BPF_TXA
.Sm on
A <- X
.El
.El
.Pp
The
.Nm
interface provides the following macros to facilitate array initializers:
.Bd -filled -offset indent
.Dv BPF_STMT ( Ns Ar opcode ,
.Ar operand )
.Pp
.Dv BPF_JUMP ( Ns Ar opcode ,
.Ar operand ,
.Ar true_offset ,
.Ar false_offset )
.Ed
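.Pp
One plausible way to write such initializer macros is sketched below.
The names
.Li struct sketch_insn ,
.Dv SKETCH_STMT
and
.Dv SKETCH_JUMP
are hypothetical; the real definitions live in
.Aq Pa net/bpf.h
and may differ in detail.
The point of the sketch is the argument order: the operand is the second
macro argument but fills the last structure field,
.Fa k .
.Bd -literal -offset indent
```c
#include <assert.h>
#include <stdint.h>

/* Same field order as struct bpf_insn above; the name is hypothetical. */
struct sketch_insn {
	uint16_t	code;
	uint8_t		jt;
	uint8_t		jf;
	uint32_t	k;
};

/* Statements have no branch targets, so jt and jf are zero. */
#define SKETCH_STMT(code, k)		{ (uint16_t)(code), 0, 0, (k) }

/* The operand comes second in the call but last in the structure. */
#define SKETCH_JUMP(code, k, jt, jf)	{ (uint16_t)(code), (jt), (jf), (k) }
```
.Ed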
.Sh FILES
.Bl -tag -width /dev/bpf[0-9] -compact
.It Pa /dev/bpf[0-9]
BPF devices
.El
.Sh EXAMPLES
The following filter is taken from the Reverse ARP daemon.
It accepts only Reverse ARP requests.
.Bd -literal -offset indent
struct bpf_insn insns[] = {
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 12),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, ETHERTYPE_REVARP, 0, 3),
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 20),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, REVARP_REQUEST, 0, 1),
	BPF_STMT(BPF_RET+BPF_K, sizeof(struct ether_arp) +
	    sizeof(struct ether_header)),
	BPF_STMT(BPF_RET+BPF_K, 0),
};
.Ed
.Pp
This filter accepts only IP packets between hosts 128.3.112.15 and
128.3.112.35.
.Bd -literal -offset indent
struct bpf_insn insns[] = {
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 12),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, ETHERTYPE_IP, 0, 8),
	BPF_STMT(BPF_LD+BPF_W+BPF_ABS, 26),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x8003700f, 0, 2),
	BPF_STMT(BPF_LD+BPF_W+BPF_ABS, 30),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x80037023, 3, 4),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x80037023, 0, 3),
	BPF_STMT(BPF_LD+BPF_W+BPF_ABS, 30),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x8003700f, 0, 1),
	BPF_STMT(BPF_RET+BPF_K, (u_int)-1),
	BPF_STMT(BPF_RET+BPF_K, 0),
};
.Ed
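.Pp
The constants 0x8003700f and 0x80037023 in this filter are the two host
addresses written as 32-bit words, one octet per byte with the first octet
most significant.
A small sketch of that conversion, where
.Fn demo_ipv4
is a hypothetical helper used only for illustration:
.Bd -literal -offset indent
```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: pack a dotted quad a.b.c.d into one 32-bit word. */
static uint32_t
demo_ipv4(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
	return (uint32_t)a << 24 | (uint32_t)b << 16 | (uint32_t)c << 8 | d;
}
```
.Ed
.Pp
For instance,
.Li demo_ipv4(128, 3, 112, 15)
yields 0x8003700f, the value compared against the word loaded from packet
offset 26 or 30.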
.Pp
Finally, this filter returns only TCP finger packets.
We must parse the IP header to reach the TCP header.
The
.Dv BPF_JSET
instruction checks that the IP fragment offset is 0 so we are sure that we
have a TCP header.
.Bd -literal -offset indent
struct bpf_insn insns[] = {
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 12),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, ETHERTYPE_IP, 0, 10),
	BPF_STMT(BPF_LD+BPF_B+BPF_ABS, 23),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, IPPROTO_TCP, 0, 8),
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 20),
	BPF_JUMP(BPF_JMP+BPF_JSET+BPF_K, 0x1fff, 6, 0),
	BPF_STMT(BPF_LDX+BPF_B+BPF_MSH, 14),
	BPF_STMT(BPF_LD+BPF_H+BPF_IND, 14),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 79, 2, 0),
	BPF_STMT(BPF_LD+BPF_H+BPF_IND, 16),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 79, 0, 1),
	BPF_STMT(BPF_RET+BPF_K, (u_int)-1),
	BPF_STMT(BPF_RET+BPF_K, 0),
};
.Ed
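.Pp
To see how the jump offsets in the finger filter play out, the filter can be
traced with a toy interpreter.
This is only a sketch covering the seven instruction forms the filter uses:
.Li enum demo_op ,
.Li struct demo_insn ,
.Va demo_finger
and
.Fn demo_run
are hypothetical, the opcode encoding is invented rather than taken from
.Aq Pa net/bpf.h ,
and rejecting a packet when a load falls outside it is an assumption of the
sketch.
The constants 0x0800 and 6 stand for
.Dv ETHERTYPE_IP
and
.Dv IPPROTO_TCP .
.Bd -literal -offset indent
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum demo_op {			/* invented encoding, not <net/bpf.h> */
	DEMO_LDH_ABS,		/* A <- P[k:2] */
	DEMO_LDB_ABS,		/* A <- P[k:1] */
	DEMO_LDH_IND,		/* A <- P[X+k:2] */
	DEMO_LDXB_MSH,		/* X <- 4*(P[k:1]&0xf) */
	DEMO_JEQ_K,		/* pc += (A == k) ? jt : jf */
	DEMO_JSET_K,		/* pc += (A & k) ? jt : jf */
	DEMO_RET_K		/* accept k bytes */
};

struct demo_insn {
	enum demo_op	op;
	uint8_t		jt;
	uint8_t		jf;
	uint32_t	k;
};

/* The finger filter above, re-expressed with the invented opcodes. */
static const struct demo_insn demo_finger[] = {
	{ DEMO_LDH_ABS, 0, 0, 12 },
	{ DEMO_JEQ_K, 0, 10, 0x0800 },
	{ DEMO_LDB_ABS, 0, 0, 23 },
	{ DEMO_JEQ_K, 0, 8, 6 },
	{ DEMO_LDH_ABS, 0, 0, 20 },
	{ DEMO_JSET_K, 6, 0, 0x1fff },
	{ DEMO_LDXB_MSH, 0, 0, 14 },
	{ DEMO_LDH_IND, 0, 0, 14 },
	{ DEMO_JEQ_K, 2, 0, 79 },
	{ DEMO_LDH_IND, 0, 0, 16 },
	{ DEMO_JEQ_K, 0, 1, 79 },
	{ DEMO_RET_K, 0, 0, 0xffffffff },
	{ DEMO_RET_K, 0, 0, 0 },
};

/* Run prog over pkt; returns the number of bytes to accept, 0 to drop. */
static uint32_t
demo_run(const struct demo_insn *prog, size_t n, const uint8_t *pkt,
    size_t len)
{
	uint32_t A = 0, X = 0;
	size_t pc = 0;

	assert(prog != NULL && pkt != NULL);
	while (pc < n) {
		const struct demo_insn *in = &prog[pc++];
		size_t off = in->k;

		switch (in->op) {
		case DEMO_LDH_IND:
			off += X;
			/* FALLTHROUGH */
		case DEMO_LDH_ABS:
			if (off + 2 > len)
				return 0;
			A = (uint32_t)pkt[off] << 8 | pkt[off + 1];
			break;
		case DEMO_LDB_ABS:
			if (off + 1 > len)
				return 0;
			A = pkt[off];
			break;
		case DEMO_LDXB_MSH:
			if (off + 1 > len)
				return 0;
			X = 4 * (pkt[off] & 0xf);
			break;
		case DEMO_JEQ_K:	/* offsets count from the next insn */
			pc += (A == in->k) ? in->jt : in->jf;
			break;
		case DEMO_JSET_K:
			pc += (A & in->k) ? in->jt : in->jf;
			break;
		case DEMO_RET_K:
			return in->k;
		}
	}
	return 0;	/* ran off the end: treated as a drop in this sketch */
}
```
.Ed
.Pp
Feeding it an Ethernet frame whose type is IP, whose protocol byte is TCP,
whose fragment offset is zero and whose source or destination port is 79
reaches the accepting return; changing both ports to anything else reaches
the final return of zero.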
.Sh SEE ALSO
.Xr ioctl 2 ,
.Xr read 2 ,
.Xr select 2 ,
.Xr signal 3 ,
.Xr tcpdump 8
.Rs
.%A McCanne, S.
.%A Jacobson, V.
.%T "An efficient, extensible, and portable network monitor"
.Re
.Sh HISTORY
The Enet packet filter was created in 1980 by Mike Accetta and Rick Rashid
at Carnegie-Mellon University.
Jeffrey Mogul, at Stanford, ported the code to BSD and continued its
development from 1983 on.
Since then, it has evolved into the Ultrix Packet Filter at DEC, a STREAMS
NIT module under SunOS 4.1, and BPF.
.Sh AUTHORS
Steve McCanne of Lawrence Berkeley Laboratory implemented BPF in Summer 1990.
Much of the design is due to Van Jacobson.
.Sh BUGS
The read buffer must be of a fixed size (returned by the
.Dv BIOCGBLEN
ioctl).
.Pp
A file that does not request promiscuous mode may receive promiscuously
received packets as a side effect of another file requesting this mode on
the same hardware interface.
This could be fixed in the kernel with additional processing overhead.
However, we favor the model where all files must assume that the interface
is promiscuous, and if so desired, must utilize a filter to reject foreign
packets.
.Pp
Data link protocols with variable length headers are not currently supported.
974Data link protocols with variable length headers are not currently supported.
975