.\"	$OpenBSD: bpf.4,v 1.12 2001/10/05 14:45:53 mpech Exp $
.\"     $NetBSD: bpf.4,v 1.7 1995/09/27 18:31:50 thorpej Exp $
.\"
.\" Copyright (c) 1990 The Regents of the University of California.
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that: (1) source code distributions
.\" retain the above copyright notice and this paragraph in its entirety, (2)
.\" distributions including binary code include the above copyright notice and
.\" this paragraph in its entirety in the documentation or other materials
.\" provided with the distribution, and (3) all advertising materials mentioning
.\" features or use of this software display the following acknowledgement:
.\" ``This product includes software developed by the University of California,
.\" Lawrence Berkeley Laboratory and its contributors.'' Neither the name of
.\" the University nor the names of its contributors may be used to endorse
.\" or promote products derived from this software without specific prior
.\" written permission.
.\" THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR IMPLIED
.\" WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
.\" MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
.\"
.\" This document is derived in part from the enet man page (enet.4)
.\" distributed with 4.3BSD Unix.
.\"
.Dd May 23, 1991
.Dt BPF 4
.Os
.Sh NAME
.Nm bpf
.Nd Berkeley Packet Filter
.Sh SYNOPSIS
.Cd pseudo-device bpfilter 8
.Sh DESCRIPTION
The Berkeley Packet Filter provides a raw interface to data link layers in
a protocol-independent fashion.
All packets on the network, even those destined for other hosts, are
accessible through this mechanism.
.Pp
The packet filter appears as a character special device,
.Pa /dev/bpf0 ,
.Pa /dev/bpf1 ,
etc.
After opening the device, the file descriptor must be bound to a specific
network interface with the
.Dv BIOCSETIF
ioctl.
A given interface can be shared between multiple listeners and the filter
underlying each descriptor will see an identical packet stream.
The total number of open files is limited to the value given in the kernel
configuration; the example given in the
.Sx SYNOPSIS
above sets the limit to 8.
.Pp
A separate device file is required for each minor device.
If a file is in use, the open will fail and
.Va errno
will be set to
.Er EBUSY .
.Pp
Associated with each open instance of a
.Nm
file is a user-settable
packet filter.
Whenever a packet is received by an interface, all file descriptors
listening on that interface apply their filter.
Each descriptor that accepts the packet receives its own copy.
.Pp
Reads from these files return the next group of packets that have matched
the filter.
To improve performance, the buffer passed to read must be the same size as
the buffers used internally by
.Nm bpf .
This size is returned by the
.Dv BIOCGBLEN
ioctl (see below), and under BSD, can be set with
.Dv BIOCSBLEN .
Note that an individual packet larger than this size is necessarily truncated.
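.Pp
The steps described so far (open a free device, bind it to an interface,
and learn the required read size) might be combined as in the sketch below.
It is illustrative only: the helper name
.Fn hyp_bpf_open
is hypothetical, and the upper bound of 8 devices merely mirrors the
.Sx SYNOPSIS
example.
.Bd -literal -offset indent
#include <sys/types.h>
#include <sys/time.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <net/if.h>
#include <net/bpf.h>

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: open the first bpf device that is not busy,
 * bind it to ifname and store the required read(2) size in *blen.
 * Returns the descriptor, or -1 with errno set.
 */
int
hyp_bpf_open(const char *ifname, int *blen)
{
	struct ifreq ifr;
	char dev[32];
	int fd = -1, i, save;

	/* 8 matches the SYNOPSIS example; a real kernel may differ. */
	for (i = 0; i < 8 && fd == -1; i++) {
		snprintf(dev, sizeof(dev), "/dev/bpf%d", i);
		fd = open(dev, O_RDWR);
		if (fd == -1 && errno != EBUSY)
			return -1;
	}
	if (fd == -1)
		return -1;

	memset(&ifr, 0, sizeof(ifr));
	strlcpy(ifr.ifr_name, ifname, sizeof(ifr.ifr_name));
	if (ioctl(fd, BIOCSETIF, &ifr) == -1 ||
	    ioctl(fd, BIOCGBLEN, blen) == -1) {
		save = errno;
		close(fd);
		errno = save;
		return -1;
	}
	return fd;
}
.Ed
.Pp
A caller would then allocate
.Va blen
bytes and pass a buffer of exactly that size to
.Xr read 2 .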
.Pp
The packet filter will support any link level protocol that has fixed length
headers.
Currently, only Ethernet, SLIP, and PPP drivers have been modified to
interact with
.Nm bpf .
.Pp
Since packet data is in network byte order, applications should use the
.Xr byteorder 3
macros to extract multi-byte values.
.Pp
A packet can be sent out on the network by writing to a
.Nm
file descriptor.
The writes are unbuffered, meaning only one packet can be processed per write.
Currently, only writes to Ethernets and SLIP links are supported.
.Ss Ioctls
The ioctl command codes below are defined in
.Aq Pa net/bpf.h .
All commands require these includes:
.Pp
.Bd -offset indent
.Cd #include <sys/types.h>
.Cd #include <sys/time.h>
.Cd #include <sys/ioctl.h>
.Cd #include <net/bpf.h>
.Ed
.Pp
Additionally,
.Dv BIOCGETIF
and
.Dv BIOCSETIF
require
.Aq Pa net/if.h .
.Pp
The (third) argument to the
.Xr ioctl 2
call should be a pointer to the type indicated.
.Bl -tag -width Ds
.It Dv BIOCGBLEN Pf ( Li int Ns No )
Returns the required buffer length for reads on
.Nm
files.
.It Dv BIOCSBLEN Pf ( Li u_int Ns No )
Sets the buffer length for reads on
.Nm
files.
The buffer must be set before the file is attached to an interface with
.Dv BIOCSETIF .
If the requested buffer size cannot be accommodated, the closest allowable
size will be set and returned in the argument.
A read call will result in
.Er EIO
if it is passed a buffer that is not this size.
.It Dv BIOCGDLT Pf ( Li u_int Ns No )
Returns the type of the data link layer underlying the attached interface.
.Er EINVAL
is returned if no interface has been specified.
The device types, prefixed with
.Dq DLT_ ,
are defined in
.Aq Pa net/bpf.h .
.It Dv BIOCPROMISC
Forces the interface into promiscuous mode.
All packets, not just those destined for the local host, are processed.
Since more than one file can be listening on a given interface, a listener
that opened its interface non-promiscuously may receive packets promiscuously.
This problem can be remedied with an appropriate filter.
.Pp
The interface remains in promiscuous mode until all files listening
promiscuously are closed.
.It Dv BIOCFLUSH
Flushes the buffer of incoming packets and resets the statistics that are
returned by
.Dv BIOCGSTATS .
.It Dv BIOCGETIF Pf ( Li "struct ifreq" Ns No )
Returns the name of the hardware interface that the file is listening on.
The name is returned in the
.Fa ifr_name
field of the
.Li struct ifreq .
All other fields are undefined.
.It Dv BIOCSETIF Pf ( Li "struct ifreq" Ns No )
Sets the hardware interface associated with the file.
This command must be performed before any packets can be read.
The device is indicated by name using the
.Fa ifr_name
field of the
.Li struct ifreq .
Additionally, performs the actions of
.Dv BIOCFLUSH .
.It Xo Dv BIOCSRTIMEOUT , Dv BIOCGRTIMEOUT (
.Li struct timeval Ns No )
.Xc
Set or get the read timeout parameter.
The
.Ar timeval
specifies the length of time to wait before timing out on a read request.
This parameter is initialized to zero by
.Xr open 2 ,
indicating no timeout.
.It Dv BIOCGSTATS Pf ( Li "struct bpf_stat" Ns No )
Returns the following structure of packet statistics:
.Pp
.Bd -literal -offset indent
struct bpf_stat {
	u_int bs_recv;
	u_int bs_drop;
};
.Ed
.Pp
The fields are:
.Pp
.Bl -tag -width bs_recv
.It Fa bs_recv
Number of packets received by the descriptor since opened or reset (including
any buffered since the last read call).
.It Fa bs_drop
Number of packets which were accepted by the filter but dropped by the kernel
because of buffer overflows (i.e., the application's reads aren't keeping up
with the packet traffic).
.El
.It Dv BIOCIMMEDIATE Pf ( Li u_int Ns No )
Enable or disable
.Dq immediate mode ,
based on the truth value of the argument.
When immediate mode is enabled, reads return immediately upon packet reception.
Otherwise, a read will block until either the kernel buffer becomes full or a
timeout occurs.
This is useful for programs like
.Xr rarpd 8 ,
which must respond to messages in real time.
The default for a new file is off.
.It Dv BIOCSETF Pf ( Li "struct bpf_program" Ns No )
Sets the filter program used by the kernel to discard uninteresting packets.
An array of instructions and its length is passed in using the following
structure:
.Pp
.Bd -literal -offset indent
struct bpf_program {
	int bf_len;
	struct bpf_insn *bf_insns;
};
.Ed
.Pp
The filter program is pointed to by the
.Fa bf_insns
field while its length in units of
.Li struct bpf_insn
is given by the
.Fa bf_len
field.
Also, the actions of
.Dv BIOCFLUSH
are performed.
.Pp
See section
.Sx FILTER MACHINE
for an explanation of the filter language.
.It Dv BIOCVERSION Pf ( Li "struct bpf_version" Ns No )
Returns the major and minor version numbers of the filter language currently
recognized by the kernel.
Before installing a filter, applications must check that the current version
is compatible with the running kernel.
Version numbers are compatible if the major numbers match and the application
minor is less than or equal to the kernel minor.
The kernel version number is returned in the following structure:
.Pp
.Bd -literal -offset indent
struct bpf_version {
	u_short bv_major;
	u_short bv_minor;
};
.Ed
.Pp
The current version numbers are given by
.Dv BPF_MAJOR_VERSION
and
.Dv BPF_MINOR_VERSION
from
.Aq Pa net/bpf.h .
An incompatible filter may result in undefined behavior (most likely, an
error returned by
.Xr ioctl 2
or haphazard packet matching).
.It Xo Dv BIOCSRSIG , Dv BIOCGRSIG (
.Li u_int Ns No )
.Xc
Set or get the receive signal.
This signal will be sent to the process or process group specified by
.Dv FIOSETOWN .
It defaults to
.Dv SIGIO .
.It Xo Dv BIOCSHDRCMPLT , Dv BIOCGHDRCMPLT (
.Li u_int Ns No )
.Xc
Set or get the status of the ``header complete'' flag.
Set to zero if the link level source address should be filled in
automatically by the interface output routine.
Set to one if the link level source address will be written,
as provided, to the wire.
This flag is initialized to zero by default.
.El
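.Pp
The version rule given under
.Dv BIOCVERSION
and the structure passed to
.Dv BIOCSETF
can be combined into a small helper, sketched below.
The name
.Fn hyp_bpf_setfilter
and the use of
.Er EINVAL
to report a version mismatch are assumptions made for illustration.
.Bd -literal -offset indent
#include <sys/types.h>
#include <sys/time.h>
#include <sys/ioctl.h>
#include <net/bpf.h>

#include <errno.h>

/*
 * Hypothetical helper: install insns[0..ninsns-1] on fd after
 * checking that the kernel understands this filter language version.
 * Returns 0 on success, or -1 with errno set.
 */
int
hyp_bpf_setfilter(int fd, struct bpf_insn *insns, int ninsns)
{
	struct bpf_version bv;
	struct bpf_program prog;

	if (ioctl(fd, BIOCVERSION, &bv) == -1)
		return -1;

	/* Majors must match; our minor must not exceed the kernel's. */
	if (bv.bv_major != BPF_MAJOR_VERSION ||
	    bv.bv_minor < BPF_MINOR_VERSION) {
		errno = EINVAL;		/* assumed error code */
		return -1;
	}

	prog.bf_len = ninsns;
	prog.bf_insns = insns;
	return ioctl(fd, BIOCSETF, &prog);
}
.Ed
.Pp
With an instruction array named
.Va insns ,
a call might look like this:
.Pp
.Dl hyp_bpf_setfilter(fd, insns, sizeof(insns) / sizeof(insns[0]));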
.Ss Standard ioctls
.Nm
now supports several standard ioctls which allow the user to do asynchronous
and/or non-blocking I/O to an open
.Nm
file descriptor.
.Bl -tag -width Ds
.It Dv FIONREAD Pf ( Li int Ns No )
Returns the number of bytes that are immediately available for reading.
.It Dv SIOCGIFADDR Pf ( Li "struct ifreq" Ns No )
Returns the address associated with the interface.
.It Dv FIONBIO Pf ( Li int Ns No )
Set or clear non-blocking I/O.
If the argument is non-zero, then doing a read when no data is available will
return \-1 and
.Va errno
will be set to
.Er EWOULDBLOCK .
If the argument is zero, non-blocking I/O is disabled.
Note: setting this overrides the timeout set by
.Dv BIOCSRTIMEOUT .
.It Dv FIOASYNC Pf ( Li int Ns No )
Enable or disable asynchronous I/O.
When enabled (argument is non-zero), the process or process group specified
by
.Dv FIOSETOWN
will start receiving
.Dv SIGIO
signals when packets arrive.
Note that you must perform an
.Dv FIOSETOWN
command in order for this to take effect, as the system will not do it by
default.
The signal may be changed via
.Dv BIOCSRSIG .
.It Xo Dv FIOSETOWN , Dv FIOGETOWN (
.Li int Ns No )
.Xc
Set or get the process or process group (if negative) that should receive
.Dv SIGIO
when packets are available.
The signal may be changed using
.Dv BIOCSRSIG
(see above).
.El
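.Pp
The
.Dv FIONBIO
behaviour described above can be sketched with two small helpers.
Both names are hypothetical; the code relies only on
.Xr ioctl 2
and
.Xr read 2
and works on any descriptor that supports
.Dv FIONBIO .
.Bd -literal -offset indent
#include <sys/types.h>
#include <sys/ioctl.h>

#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: enable (on != 0) or disable non-blocking I/O. */
int
hyp_set_nonblock(int fd, int on)
{
	return ioctl(fd, FIONBIO, &on);
}

/*
 * Hypothetical helper: attempt one read without waiting.
 * Returns 1 and stores the byte count in *n if read(2) succeeded,
 * 0 if no data was available yet, and -1 on any other error.
 */
int
hyp_try_read(int fd, void *buf, size_t len, ssize_t *n)
{
	*n = read(fd, buf, len);
	if (*n >= 0)
		return 1;
	return (errno == EWOULDBLOCK) ? 0 : -1;
}
.Ed
.Pp
On a
.Nm
descriptor the buffer passed to
.Fn hyp_try_read
must still have the size reported by
.Dv BIOCGBLEN .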
.Ss BPF header
The following structure is prepended to each packet returned by
.Xr read 2 :
.Pp
.Bd -literal -offset indent
struct bpf_hdr {
	struct bpf_timeval bh_tstamp;
	u_int32_t	bh_caplen;
	u_int32_t	bh_datalen;
	u_int16_t	bh_hdrlen;
};
.Ed
.Pp
The fields, stored in host order, are as follows:
.Bl -tag -width Ds
.It Fa bh_tstamp
Time at which the packet was processed by the packet filter.
.It Fa bh_caplen
Length of the captured portion of the packet.
This is the minimum of the truncation amount specified by the filter and the
length of the packet.
.It Fa bh_datalen
Length of the packet off the wire.
This value is independent of the truncation amount specified by the filter.
.It Fa bh_hdrlen
Length of the BPF header, which may not be equal to
.Li sizeof(struct bpf_hdr) .
.El
.Pp
The
.Fa bh_hdrlen
field exists to account for padding between the header and the link level
protocol.
The purpose here is to guarantee proper alignment of the packet data
structures, which is required on alignment-sensitive architectures and
improves performance on many other architectures.
The packet filter ensures that the
.Fa bpf_hdr
and the network layer header will be word aligned.
Suitable precautions must be taken when accessing the link layer protocol
fields on alignment restricted machines.
(This isn't a problem on an Ethernet, since the type field is a
.Li short
falling on an even offset, and the addresses are probably accessed in a
bytewise fashion).
.Pp
Additionally, individual packets are padded so that each starts on a
word boundary.
This requires that an application has some knowledge of how to get from packet
to packet.
The macro
.Dv BPF_WORDALIGN
is defined in
.Aq Pa net/bpf.h
to facilitate this process.
It rounds up its argument to the nearest word aligned value (where a word is
.Dv BPF_ALIGNMENT
bytes wide).
For example, if
.Va p
points to the start of a packet, this expression will advance it to the
next packet:
.Pp
.Dl p = (char *)p + BPF_WORDALIGN(p->bh_hdrlen + p->bh_caplen);
.Pp
For the alignment mechanisms to work properly, the buffer passed to
.Xr read 2
must itself be word aligned.
.Xr malloc 3
will always return an aligned buffer.
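.Pp
The packet-to-packet stepping described above can be sketched as a loop over
one buffer returned by
.Xr read 2 .
To stay self-contained, the sketch uses stand-ins that are assumptions, not
the real definitions:
.Li struct hyp_bpf_hdr
guesses at the layout of
.Fa bh_tstamp ,
and
.Dv HYP_ALIGNMENT
fixes the word size at 4.
Real code should use
.Li struct bpf_hdr
and
.Dv BPF_WORDALIGN
from
.Aq Pa net/bpf.h .
.Bd -literal -offset indent
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for struct bpf_hdr; the timestamp layout is assumed. */
struct hyp_bpf_hdr {
	uint32_t	ts_sec;
	uint32_t	ts_usec;
	uint32_t	bh_caplen;
	uint32_t	bh_datalen;
	uint16_t	bh_hdrlen;
};

/* Assumed word size; real code should use BPF_ALIGNMENT. */
#define HYP_ALIGNMENT	4

/* Header bytes that carry data (excludes trailing struct padding). */
#define HYP_HDR_BYTES	(offsetof(struct hyp_bpf_hdr, bh_hdrlen) + 2)

/* Same rounding as BPF_WORDALIGN, for the assumed word size. */
size_t
hyp_wordalign(size_t x)
{
	return (x + (HYP_ALIGNMENT - 1)) & ~(size_t)(HYP_ALIGNMENT - 1);
}

/*
 * Count the records in the first len bytes of buf, where len is the
 * value returned by read(2).  Stops early at a malformed record.
 */
size_t
hyp_count_records(const unsigned char *buf, size_t len)
{
	struct hyp_bpf_hdr bh;
	size_t off = 0, n = 0;

	while (off < len && len - off >= HYP_HDR_BYTES) {
		memset(&bh, 0, sizeof(bh));
		memcpy(&bh, buf + off, HYP_HDR_BYTES);
		if (bh.bh_hdrlen < HYP_HDR_BYTES ||
		    bh.bh_hdrlen > len - off ||
		    bh.bh_caplen > len - off - bh.bh_hdrlen)
			break;
		/* Packet data starts at buf + off + bh.bh_hdrlen. */
		n++;
		off += hyp_wordalign((size_t)bh.bh_hdrlen + bh.bh_caplen);
	}
	return n;
}
.Ed
.Pp
The header is copied out with
.Xr memcpy 3
rather than accessed through a cast, so the sketch does not itself depend on
the alignment of
.Va buf .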
.Ss Filter machine
A filter program is an array of instructions with all branches forwardly
directed, terminated by a
.Dq return
instruction.
Each instruction performs some action on the pseudo-machine state, which
consists of an accumulator, index register, scratch memory store, and
implicit program counter.
.Pp
The following structure defines the instruction format:
.Pp
.Bd -literal -offset indent
struct bpf_insn {
	u_int16_t	code;
	u_char		jt;
	u_char		jf;
	u_int32_t	k;
};
.Ed
.Pp
The
.Fa k
field is used in different ways by different instructions, and the
.Fa jt
and
.Fa jf
fields are used as offsets by the branch instructions.
The opcodes are encoded in a semi-hierarchical fashion.
There are eight classes of instructions:
.Dv BPF_LD ,
.Dv BPF_LDX ,
.Dv BPF_ST ,
.Dv BPF_STX ,
.Dv BPF_ALU ,
.Dv BPF_JMP ,
.Dv BPF_RET ,
and
.Dv BPF_MISC .
Various other mode and operator bits are logically OR'd into the class to
give the actual instructions.
The classes and modes are defined in
.Aq Pa net/bpf.h .
Below are the semantics for each defined
.Nm
instruction.
We use the convention that A is the accumulator, X is the index register,
P[] packet data, and M[] scratch memory store.
P[i:n] gives the data at byte offset
.Dq i
in the packet, interpreted as a word (n=4), unsigned halfword (n=2), or
unsigned byte (n=1).
M[i] gives the i'th word in the scratch memory store, which is only addressed
in word units.
The memory store is indexed from 0 to
.Dv BPF_MEMWORDS Ns No \-1 .
.Fa k ,
.Fa jt ,
and
.Fa jf
are the corresponding fields in the instruction definition.
.Dq len
refers to the length of the packet.
.Pp
.Bl -tag -width Ds
.It Dv BPF_LD
These instructions copy a value into the accumulator.
The type of the source operand is specified by an
.Dq addressing mode
and can be a constant
.Pf ( Dv BPF_IMM ) ,
packet data at a fixed offset
.Pf ( Dv BPF_ABS ) ,
packet data at a variable offset
.Pf ( Dv BPF_IND ) ,
the packet length
.Pf ( Dv BPF_LEN ) ,
or a word in the scratch memory store
.Pf ( Dv BPF_MEM ) .
For
.Dv BPF_IND
and
.Dv BPF_ABS ,
the data size must be specified as a word
.Pf ( Dv BPF_W ) ,
halfword
.Pf ( Dv BPF_H ) ,
or byte
.Pf ( Dv BPF_B ) .
The semantics of all recognized
.Dv BPF_LD
instructions follow.
.Pp
.Bl -tag -width 32n -compact
.Sm off
.It Xo Dv BPF_LD No + Dv BPF_W No +
.Dv BPF_ABS
.Xc
.Sm on
A <- P[k:4]
.Sm off
.It Xo Dv BPF_LD No + Dv BPF_H No +
.Dv BPF_ABS
.Xc
.Sm on
A <- P[k:2]
.Sm off
.It Xo Dv BPF_LD No + Dv BPF_B No +
.Dv BPF_ABS
.Xc
.Sm on
A <- P[k:1]
.Sm off
.It Xo Dv BPF_LD No + Dv BPF_W No +
.Dv BPF_IND
.Xc
.Sm on
A <- P[X+k:4]
.Sm off
.It Xo Dv BPF_LD No + Dv BPF_H No +
.Dv BPF_IND
.Xc
.Sm on
A <- P[X+k:2]
.Sm off
.It Xo Dv BPF_LD No + Dv BPF_B No +
.Dv BPF_IND
.Xc
.Sm on
A <- P[X+k:1]
.Sm off
.It Xo Dv BPF_LD No + Dv BPF_W No +
.Dv BPF_LEN
.Xc
.Sm on
A <- len
.Sm off
.It Dv BPF_LD No + Dv BPF_IMM
.Sm on
A <- k
.Sm off
.It Dv BPF_LD No + Dv BPF_MEM
.Sm on
A <- M[k]
.El
.It Dv BPF_LDX
These instructions load a value into the index register.
Note that the addressing modes are more restricted than those of the
accumulator loads, but they include
.Dv BPF_MSH ,
a hack for efficiently loading the IP header length.
.Pp
.Bl -tag -width 32n -compact
.Sm off
.It Xo Dv BPF_LDX No + Dv BPF_W No +
.Dv BPF_IMM
.Xc
.Sm on
X <- k
.Sm off
.It Xo Dv BPF_LDX No + Dv BPF_W No +
.Dv BPF_MEM
.Xc
.Sm on
X <- M[k]
.Sm off
.It Xo Dv BPF_LDX No + Dv BPF_W No +
.Dv BPF_LEN
.Xc
.Sm on
X <- len
.Sm off
.It Xo Dv BPF_LDX No + Dv BPF_B No +
.Dv BPF_MSH
.Xc
.Sm on
X <- 4*(P[k:1]&0xf)
.El
.It Dv BPF_ST
This instruction stores the accumulator into the scratch memory.
We do not need an addressing mode since there is only one possibility for
the destination.
.Pp
.Bl -tag -width 32n -compact
.It Dv BPF_ST
M[k] <- A
.El
.It Dv BPF_STX
This instruction stores the index register in the scratch memory store.
.Pp
.Bl -tag -width 32n -compact
.It Dv BPF_STX
M[k] <- X
.El
.It Dv BPF_ALU
The ALU instructions perform operations between the accumulator and index
register or constant, and store the result back in the accumulator.
For binary operations, a source mode is required
.Pf ( Dv BPF_K
or
.Dv BPF_X ) .
.Pp
.Bl -tag -width 32n -compact
.Sm off
.It Xo Dv BPF_ALU No + BPF_ADD No +
.Dv BPF_K
.Xc
.Sm on
A <- A + k
.Sm off
.It Xo Dv BPF_ALU No + BPF_SUB No +
.Dv BPF_K
.Xc
.Sm on
A <- A - k
.Sm off
.It Xo Dv BPF_ALU No + BPF_MUL No +
.Dv BPF_K
.Xc
.Sm on
A <- A * k
.Sm off
.It Xo Dv BPF_ALU No + BPF_DIV No +
.Dv BPF_K
.Xc
.Sm on
A <- A / k
.Sm off
.It Xo Dv BPF_ALU No + BPF_AND No +
.Dv BPF_K
.Xc
.Sm on
A <- A & k
.Sm off
.It Xo Dv BPF_ALU No + BPF_OR No +
.Dv BPF_K
.Xc
.Sm on
A <- A | k
.Sm off
.It Xo Dv BPF_ALU No + BPF_LSH No +
.Dv BPF_K
.Xc
.Sm on
A <- A << k
.Sm off
.It Xo Dv BPF_ALU No + BPF_RSH No +
.Dv BPF_K
.Xc
.Sm on
A <- A >> k
.Sm off
.It Xo Dv BPF_ALU No + BPF_ADD No +
.Dv BPF_X
.Xc
.Sm on
A <- A + X
.Sm off
.It Xo Dv BPF_ALU No + BPF_SUB No +
.Dv BPF_X
.Xc
.Sm on
A <- A - X
.Sm off
.It Xo Dv BPF_ALU No + BPF_MUL No +
.Dv BPF_X
.Xc
.Sm on
A <- A * X
.Sm off
.It Xo Dv BPF_ALU No + BPF_DIV No +
.Dv BPF_X
.Xc
.Sm on
A <- A / X
.Sm off
.It Xo Dv BPF_ALU No + BPF_AND No +
.Dv BPF_X
.Xc
.Sm on
A <- A & X
.Sm off
.It Xo Dv BPF_ALU No + BPF_OR No +
.Dv BPF_X
.Xc
.Sm on
A <- A | X
.Sm off
.It Xo Dv BPF_ALU No + BPF_LSH No +
.Dv BPF_X
.Xc
.Sm on
A <- A << X
.Sm off
.It Xo Dv BPF_ALU No + BPF_RSH No +
.Dv BPF_X
.Xc
.Sm on
A <- A >> X
.Sm off
.It Dv BPF_ALU No + BPF_NEG
.Sm on
A <- -A
.El
.It Dv BPF_JMP
The jump instructions alter flow of control.
Conditional jumps compare the accumulator against a constant
.Pf ( Dv BPF_K )
or the index register
.Pf ( Dv BPF_X ) .
If the result is true (or non-zero), the true branch is taken, otherwise the
false branch is taken.
Jump offsets are encoded in 8 bits so the longest jump is 256 instructions.
However, the jump always
.Pf ( Dv BPF_JA )
opcode uses the 32-bit
.Fa k
field as the offset, allowing arbitrarily distant destinations.
All conditionals use unsigned comparison conventions.
.Pp
.Bl -tag -width 32n -compact
.Sm off
.It Dv BPF_JMP No + BPF_JA
.Sm on
pc += k
.Sm off
.It Xo Dv BPF_JMP No + BPF_JGT No +
.Dv BPF_K
.Xc
.Sm on
pc += (A > k) ? jt : jf
.Sm off
.It Xo Dv BPF_JMP No + BPF_JGE No +
.Dv BPF_K
.Xc
.Sm on
pc += (A >= k) ? jt : jf
.Sm off
.It Xo Dv BPF_JMP No + BPF_JEQ No +
.Dv BPF_K
.Xc
.Sm on
pc += (A == k) ? jt : jf
.Sm off
.It Xo Dv BPF_JMP No + BPF_JSET No +
.Dv BPF_K
.Xc
.Sm on
pc += (A & k) ? jt : jf
.Sm off
.It Xo Dv BPF_JMP No + BPF_JGT No +
.Dv BPF_X
.Xc
.Sm on
pc += (A > X) ? jt : jf
.Sm off
.It Xo Dv BPF_JMP No + BPF_JGE No +
.Dv BPF_X
.Xc
.Sm on
pc += (A >= X) ? jt : jf
.Sm off
.It Xo Dv BPF_JMP No + BPF_JEQ No +
.Dv BPF_X
.Xc
.Sm on
pc += (A == X) ? jt : jf
.Sm off
.It Xo Dv BPF_JMP No + BPF_JSET No +
.Dv BPF_X
.Xc
.Sm on
pc += (A & X) ? jt : jf
.El
.It Dv BPF_RET
The return instructions terminate the filter program and specify the amount
of packet to accept (i.e., they return the truncation amount).
A return value of zero indicates that the packet should be ignored.
The return value is either a constant
.Pf ( Dv BPF_K )
or the accumulator
.Pf ( Dv BPF_A ) .
.Pp
.Bl -tag -width 32n -compact
.It Dv BPF_RET No + Dv BPF_A
Accept A bytes.
.It Dv BPF_RET No + Dv BPF_K
Accept k bytes.
.El
.It Dv BPF_MISC
The miscellaneous category was created for anything that doesn't fit into
the above classes, and for any new instructions that might need to be added.
Currently, these are the register transfer instructions that copy the index
register to the accumulator or vice versa.
.Pp
.Bl -tag -width 32n -compact
.Sm off
.It Dv BPF_MISC No + Dv BPF_TAX
.Sm on
X <- A
.Sm off
.It Dv BPF_MISC No + Dv BPF_TXA
.Sm on
A <- X
.El
.El
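.Pp
As a worked instance of the
.Dv BPF_MSH
load listed under
.Dv BPF_LDX ,
an IPv4 header whose first byte is 0x45 gives X = 4 * (0x45 & 0xf) = 20,
the header length in bytes.
The hypothetical C helper below mirrors that computation; it is an
illustration of the arithmetic, not part of the
.Nm
interface.
.Bd -literal -offset indent
#include <stddef.h>

/* Hypothetical helper mirroring X <- 4*(P[k:1]&0xf). */
unsigned int
hyp_msh(const unsigned char *pkt, size_t k)
{
	return 4 * (pkt[k] & 0xf);
}
.Ed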
.Pp
The
.Nm
interface provides the following macros to facilitate array initializers:
.Pp
.Bd -offset indent
.Dv BPF_STMT Ns No ( Ns Ar opcode ,
.Ar operand Ns No )
.Pp
.Dv BPF_JUMP Ns No ( Ns Ar opcode ,
.Ar operand ,
.Ar true_offset ,
.Ar false_offset Ns No )
.Ed
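.Pp
To make the branch semantics concrete, the following sketch interprets a
three-instruction subset of the filter machine in user space.
It is not the kernel's implementation: the opcode names and values in the
.Li enum
are placeholders, and
.Li struct hyp_insn
only mirrors the fields of
.Li struct bpf_insn .
It shows that
.Fa jt
and
.Fa jf
are counted from the instruction following the jump.
.Bd -literal -offset indent
#include <stddef.h>
#include <stdint.h>

/* Placeholder opcodes; real programs use <net/bpf.h> constants. */
enum { HYP_LDH_ABS, HYP_JEQ_K, HYP_RET_K };

/* Mirrors the fields of struct bpf_insn. */
struct hyp_insn {
	uint16_t	code;
	uint8_t		jt;
	uint8_t		jf;
	uint32_t	k;
};

/* Run prog over pkt; return the bytes to accept, or 0 to reject. */
uint32_t
hyp_run(const struct hyp_insn *prog, size_t ninsns,
    const uint8_t *pkt, size_t len)
{
	uint32_t A = 0;
	size_t pc = 0;

	while (pc < ninsns) {
		const struct hyp_insn *in = &prog[pc++];

		switch (in->code) {
		case HYP_LDH_ABS:	/* A <- P[k:2], network order */
			if (in->k > len || len - in->k < 2)
				return 0;
			A = (uint32_t)pkt[in->k] << 8 | pkt[in->k + 1];
			break;
		case HYP_JEQ_K:		/* pc += (A == k) ? jt : jf */
			pc += (A == in->k) ? in->jt : in->jf;
			break;
		case HYP_RET_K:		/* accept k bytes */
			return in->k;
		default:		/* outside the sketched subset */
			return 0;
		}
	}
	return 0;
}

/* Same shape as the Reverse ARP filter, with literal values. */
const struct hyp_insn hyp_rarp[] = {
	{ HYP_LDH_ABS, 0, 0, 12 },
	{ HYP_JEQ_K, 0, 3, 0x8035 },	/* Reverse ARP Ethernet type */
	{ HYP_LDH_ABS, 0, 0, 20 },
	{ HYP_JEQ_K, 0, 1, 3 },		/* Reverse ARP request */
	{ HYP_RET_K, 0, 0, 42 },	/* 14 + 28 bytes */
	{ HYP_RET_K, 0, 0, 0 },
};
.Ed
.Pp
Because the program counter is advanced before the offset is added, a false
result at instruction 1 with
.Fa jf
equal to 3 lands on instruction 5, the rejecting return.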
.Sh EXAMPLES
The following filter is taken from the Reverse ARP daemon.
It accepts only Reverse ARP requests.
.Pp
.Bd -literal -offset indent
struct bpf_insn insns[] = {
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 12),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, ETHERTYPE_REVARP, 0, 3),
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 20),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, REVARP_REQUEST, 0, 1),
	BPF_STMT(BPF_RET+BPF_K, sizeof(struct ether_arp) +
	    sizeof(struct ether_header)),
	BPF_STMT(BPF_RET+BPF_K, 0),
};
.Ed
.Pp
This filter accepts only IP packets between host 128.3.112.15 and
128.3.112.35.
.Pp
.Bd -literal -offset indent
struct bpf_insn insns[] = {
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 12),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, ETHERTYPE_IP, 0, 8),
	BPF_STMT(BPF_LD+BPF_W+BPF_ABS, 26),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x8003700f, 0, 2),
	BPF_STMT(BPF_LD+BPF_W+BPF_ABS, 30),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x80037023, 3, 4),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x80037023, 0, 3),
	BPF_STMT(BPF_LD+BPF_W+BPF_ABS, 30),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x8003700f, 0, 1),
	BPF_STMT(BPF_RET+BPF_K, (u_int)-1),
	BPF_STMT(BPF_RET+BPF_K, 0),
};
.Ed
.Pp
Finally, this filter returns only TCP finger packets.
We must parse the IP header to reach the TCP header.
The
.Dv BPF_JSET
instruction checks that the IP fragment offset is 0 so we are sure that we
have a TCP header.
.Pp
.Bd -literal -offset indent
struct bpf_insn insns[] = {
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 12),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, ETHERTYPE_IP, 0, 10),
	BPF_STMT(BPF_LD+BPF_B+BPF_ABS, 23),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, IPPROTO_TCP, 0, 8),
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 20),
	BPF_JUMP(BPF_JMP+BPF_JSET+BPF_K, 0x1fff, 6, 0),
	BPF_STMT(BPF_LDX+BPF_B+BPF_MSH, 14),
	BPF_STMT(BPF_LD+BPF_H+BPF_IND, 14),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 79, 2, 0),
	BPF_STMT(BPF_LD+BPF_H+BPF_IND, 16),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 79, 0, 1),
	BPF_STMT(BPF_RET+BPF_K, (u_int)-1),
	BPF_STMT(BPF_RET+BPF_K, 0),
};
.Ed
.Sh SEE ALSO
.Xr ioctl 2 ,
.Xr read 2 ,
.Xr select 2 ,
.Xr signal 3 ,
.Xr tcpdump 8
.Rs
.%A McCanne, S., Jacobson V.
.%J "An efficient, extensible, and portable network monitor"
.Re
.Sh FILES
.Bl -tag -width /dev/bpf[0-9] -compact
.It Pa /dev/bpf[0-9]
BPF devices
.El
.Sh AUTHORS
Steve McCanne of Lawrence Berkeley Laboratory implemented BPF in Summer 1990.
Much of the design is due to Van Jacobson.
.Sh HISTORY
The Enet packet filter was created in 1980 by Mike Accetta and Rick Rashid
at Carnegie-Mellon University.
Jeffrey Mogul, at Stanford, ported the code to BSD and continued its
development from 1983 on.
Since then, it has evolved into the Ultrix Packet Filter at DEC, a STREAMS
NIT module under SunOS 4.1, and BPF.
.Sh BUGS
The read buffer must be of a fixed size (returned by the
.Dv BIOCGBLEN
ioctl).
.Pp
A file that does not request promiscuous mode may receive promiscuously
received packets as a side effect of another file requesting this mode on
the same hardware interface.
This could be fixed in the kernel with additional processing overhead.
However, we favor the model where all files must assume that the interface
is promiscuous, and if so desired, must utilize a filter to reject foreign
packets.
.Pp
Data link protocols with variable length headers are not currently supported.
915