.\"	$OpenBSD: bpf.4,v 1.10 2001/06/23 07:03:52 pjanzen Exp $
.\"     $NetBSD: bpf.4,v 1.7 1995/09/27 18:31:50 thorpej Exp $
.\"
.\" Copyright (c) 1990 The Regents of the University of California.
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that: (1) source code distributions
.\" retain the above copyright notice and this paragraph in its entirety, (2)
.\" distributions including binary code include the above copyright notice and
.\" this paragraph in its entirety in the documentation or other materials
.\" provided with the distribution, and (3) all advertising materials mentioning
.\" features or use of this software display the following acknowledgement:
.\" ``This product includes software developed by the University of California,
.\" Lawrence Berkeley Laboratory and its contributors.'' Neither the name of
.\" the University nor the names of its contributors may be used to endorse
.\" or promote products derived from this software without specific prior
.\" written permission.
.\" THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR IMPLIED
.\" WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
.\" MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
.\"
.\" This document is derived in part from the enet man page (enet.4)
.\" distributed with 4.3BSD Unix.
.\"
.Dd May 23, 1991
.Dt BPF 4
.Os
.Sh NAME
.Nm bpf
.Nd Berkeley Packet Filter
.Sh SYNOPSIS
.Cd pseudo-device bpfilter 8
.Sh DESCRIPTION
The Berkeley Packet Filter provides a raw interface to data link layers in
a protocol-independent fashion.
All packets on the network, even those destined for other hosts, are
accessible through this mechanism.
.Pp
The packet filter appears as a character special device,
.Pa /dev/bpf0 ,
.Pa /dev/bpf1 ,
etc.
After opening the device, the file descriptor must be bound to a specific
network interface with the
.Dv BIOCSETIF
ioctl.
A given interface can be shared between multiple listeners, and the filter
underlying each descriptor will see an identical packet stream.
The total number of open files is limited to the value given in the kernel
configuration; the example given in the
.Sx SYNOPSIS
above sets the limit to 8.
.Pp
A separate device file is required for each minor device.
If a file is in use, the open will fail and
.Va errno
will be set to
.Er EBUSY .
.Pp
Associated with each open instance of a
.Nm
file is a user-settable
packet filter.
Whenever a packet is received by an interface, all file descriptors
listening on that interface apply their filter.
Each descriptor that accepts the packet receives its own copy.
.Pp
Reads from these files return the next group of packets that have matched
the filter.
To improve performance, the buffer passed to read must be the same size as
the buffers used internally by
.Nm bpf .
This size is returned by the
.Dv BIOCGBLEN
ioctl (see below), and under BSD, can be set with
.Dv BIOCSBLEN .
Note that an individual packet larger than this size is necessarily truncated.
.Pp
The packet filter will support any link level protocol that has fixed length
headers.
Currently, only Ethernet, SLIP, and PPP drivers have been modified to
interact with
.Nm bpf .
.Pp
Since packet data is in network byte order, applications should use the
.Xr byteorder 3
macros to extract multi-byte values.
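.Pp
As a minimal sketch of the byte order advice above, the following C helper
reads a 16-bit field, such as the Ethernet type at byte offset 12, and
converts it to host order.
The name
.Fn get_be16
is hypothetical and is not part of
.Nm ;
the sketch assumes the caller has already checked that the two bytes lie
within the captured data.

```c
#include <arpa/inet.h>	/* ntohs() */
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: return the 16-bit field stored at byte offset
 * off within pkt, converted from network to host byte order.
 * The caller must ensure that off + 2 does not exceed the captured length.
 */
static uint16_t
get_be16(const unsigned char *pkt, size_t off)
{
	uint16_t v;

	/* Copy instead of casting so that unaligned offsets are safe. */
	memcpy(&v, pkt + off, sizeof(v));
	return ntohs(v);
}
```

.Pp
Copying the bytes out before converting avoids unaligned loads on
alignment restricted machines.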
.Pp
A packet can be sent out on the network by writing to a
.Nm
file descriptor.
The writes are unbuffered, meaning only one packet can be processed per write.
Currently, only writes to Ethernets and SLIP links are supported.
.Ss Ioctls
The ioctl command codes below are defined in
.Aq Pa net/bpf.h .
All commands require these includes:
.Pp
.Bd -offset indent
.Cd #include <sys/types.h>
.Cd #include <sys/time.h>
.Cd #include <sys/ioctl.h>
.Cd #include <net/bpf.h>
.Ed
.Pp
Additionally,
.Dv BIOCGETIF
and
.Dv BIOCSETIF
require
.Aq Pa net/if.h .
.Pp
The (third) argument to the
.Xr ioctl 2
call should be a pointer to the type indicated.
.Bl -tag -width Ds
.It Dv BIOCGBLEN Pf ( Li int Ns No )
Returns the required buffer length for reads on
.Nm
files.
.It Dv BIOCSBLEN Pf ( Li u_int Ns No )
Sets the buffer length for reads on
.Nm
files.
The buffer must be set before the file is attached to an interface with
.Dv BIOCSETIF .
If the requested buffer size cannot be accommodated, the closest allowable
size will be set and returned in the argument.
A read call will result in
.Er EIO
if it is passed a buffer that is not this size.
.It Dv BIOCGDLT Pf ( Li u_int Ns No )
Returns the type of the data link layer underlying the attached interface.
.Er EINVAL
is returned if no interface has been specified.
The device types, prefixed with
.Dq DLT_ ,
are defined in
.Aq Pa net/bpf.h .
.It Dv BIOCPROMISC
Forces the interface into promiscuous mode.
All packets, not just those destined for the local host, are processed.
Since more than one file can be listening on a given interface, a listener
that opened its interface non-promiscuously may receive packets promiscuously.
This problem can be remedied with an appropriate filter.
.Pp
The interface remains in promiscuous mode until all files listening
promiscuously are closed.
.It Dv BIOCFLUSH
Flushes the buffer of incoming packets and resets the statistics that are
returned by
.Dv BIOCGSTATS .
.It Dv BIOCGETIF Pf ( Li "struct ifreq" Ns No )
Returns the name of the hardware interface that the file is listening on.
The name is returned in the
.Fa ifr_name
field of the
.Li struct ifreq .
All other fields are undefined.
.It Dv BIOCSETIF Pf ( Li "struct ifreq" Ns No )
Sets the hardware interface associated with the file.
This command must be performed before any packets can be read.
The device is indicated by name using the
.Fa ifr_name
field of the
.Li struct ifreq .
Additionally, performs the actions of
.Dv BIOCFLUSH .
.It Xo Dv BIOCSRTIMEOUT , Dv BIOCGRTIMEOUT (
.Li struct timeval Ns No )
.Xc
Set or get the read timeout parameter.
The
.Ar timeval
specifies the length of time to wait before timing out on a read request.
This parameter is initialized to zero by
.Xr open 2 ,
indicating no timeout.
.It Dv BIOCGSTATS Pf ( Li "struct bpf_stat" Ns No )
Returns the following structure of packet statistics:
.Pp
.Bd -literal -offset indent
struct bpf_stat {
	u_int bs_recv;
	u_int bs_drop;
};
.Ed
.Pp
The fields are:
.Pp
.Bl -tag -width bs_recv
.It Fa bs_recv
Number of packets received by the descriptor since opened or reset (including
any buffered since the last read call).
.It Fa bs_drop
Number of packets which were accepted by the filter but dropped by the kernel
because of buffer overflows (i.e., the application's reads aren't keeping up
with the packet traffic).
.El
.It Dv BIOCIMMEDIATE Pf ( Li u_int Ns No )
Enable or disable
.Dq immediate mode ,
based on the truth value of the argument.
When immediate mode is enabled, reads return immediately upon packet reception.
Otherwise, a read will block until either the kernel buffer becomes full or a
timeout occurs.
This is useful for programs like
.Xr rarpd 8 ,
which must respond to messages in real time.
The default for a new file is off.
.It Dv BIOCSETF Pf ( Li "struct bpf_program" Ns No )
Sets the filter program used by the kernel to discard uninteresting packets.
An array of instructions and its length is passed in using the following
structure:
.Pp
.Bd -literal -offset indent
struct bpf_program {
	int bf_len;
	struct bpf_insn *bf_insns;
};
.Ed
.Pp
The filter program is pointed to by the
.Fa bf_insns
field while its length in units of
.Li struct bpf_insn
is given by the
.Fa bf_len
field.
Also, the actions of
.Dv BIOCFLUSH
are performed.
.Pp
See section
.Sx FILTER MACHINE
for an explanation of the filter language.
.It Dv BIOCVERSION Pf ( Li "struct bpf_version" Ns No )
Returns the major and minor version numbers of the filter language currently
recognized by the kernel.
Before installing a filter, applications must check that the current version
is compatible with the running kernel.
Version numbers are compatible if the major numbers match and the application
minor is less than or equal to the kernel minor.
The kernel version number is returned in the following structure:
.Pp
.Bd -literal -offset indent
struct bpf_version {
	u_short bv_major;
	u_short bv_minor;
};
.Ed
.Pp
The current version numbers are given by
.Dv BPF_MAJOR_VERSION
and
.Dv BPF_MINOR_VERSION
from
.Aq Pa net/bpf.h .
An incompatible filter may result in undefined behavior (most likely, an
error returned by
.Xr ioctl 2
or haphazard packet matching).
.It Xo Dv BIOCSRSIG , Dv BIOCGRSIG (
.Li u_int Ns No )
.Xc
Set or get the receive signal.
This signal will be sent to the process or process group specified by
.Dv FIOSETOWN .
It defaults to
.Dv SIGIO .
.It Xo Dv BIOCSHDRCMPLT , Dv BIOCGHDRCMPLT (
.Li u_int Ns No )
.Xc
Set or get the status of the
.Dq header complete
flag.
Set to zero if the link level source address should be filled in
automatically by the interface output routine.
Set to one if the link level source address will be written, as provided,
to the wire.
This flag is initialized to zero by default.
.El
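.Pp
The compatibility rule given under
.Dv BIOCVERSION
can be sketched in C as follows.
This is an illustration only: the structure and the function
.Fn version_compatible
are hypothetical stand-ins, so that the fragment compiles without
.Aq Pa net/bpf.h .
A real application would presumably fill the application side from
.Dv BPF_MAJOR_VERSION
and
.Dv BPF_MINOR_VERSION
and the kernel side from the result of the
.Dv BIOCVERSION
ioctl.

```c
/* Hypothetical stand-in for struct bpf_version from <net/bpf.h>. */
struct version {
	unsigned short major;
	unsigned short minor;
};

/*
 * Return non-zero if a filter built against version app may be
 * installed on a kernel reporting version kern: the major numbers
 * must match and the application minor must not exceed the kernel
 * minor.
 */
static int
version_compatible(struct version app, struct version kern)
{
	return app.major == kern.major && app.minor <= kern.minor;
}
```

.Pp
An application that finds the versions incompatible should not issue
.Dv BIOCSETF .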
.Ss Standard ioctls
.Nm
now supports several standard ioctls which allow the user to do asynchronous
and/or non-blocking I/O to an open
.Nm
file descriptor.
.Bl -tag -width Ds
.It Dv FIONREAD Pf ( Li int Ns No )
Returns the number of bytes that are immediately available for reading.
.It Dv SIOCGIFADDR Pf ( Li "struct ifreq" Ns No )
Returns the address associated with the interface.
.It Dv FIONBIO Pf ( Li int Ns No )
Set or clear non-blocking I/O.
If the argument is non-zero, then doing a read when no data is available will
return \-1 and
.Va errno
will be set to
.Er EWOULDBLOCK .
If the argument is zero, non-blocking I/O is disabled.
Note: setting this overrides the timeout set by
.Dv BIOCSRTIMEOUT .
.It Dv FIOASYNC Pf ( Li int Ns No )
Enable or disable asynchronous I/O.
When enabled (argument is non-zero), the process or process group specified
by
.Dv FIOSETOWN
will start receiving
.Dv SIGIO
signals when packets arrive.
Note that you must perform an
.Dv FIOSETOWN
command in order for this to take effect, as the system will not do it by
default.
The signal may be changed via
.Dv BIOCSRSIG .
.It Xo Dv FIOSETOWN , Dv FIOGETOWN (
.Li int Ns No )
.Xc
Set or get the process or process group (if negative) that should receive
.Dv SIGIO
when packets are available.
The signal may be changed using
.Dv BIOCSRSIG
(see above).
.El
.Ss BPF header
The following structure is prepended to each packet returned by
.Xr read 2 :
.Pp
.Bd -literal -offset indent
struct bpf_hdr {
	struct timeval bh_tstamp;
	u_long bh_caplen;
	u_long bh_datalen;
	u_short bh_hdrlen;
};
.Ed
.Pp
The fields, stored in host order, are as follows:
.Bl -tag -width Ds
.It Fa bh_tstamp
Time at which the packet was processed by the packet filter.
.It Fa bh_caplen
Length of the captured portion of the packet.
This is the minimum of the truncation amount specified by the filter and the
length of the packet.
.It Fa bh_datalen
Length of the packet off the wire.
This value is independent of the truncation amount specified by the filter.
.It Fa bh_hdrlen
Length of the BPF header, which may not be equal to
.Li sizeof(struct bpf_hdr) .
.El
.Pp
The
.Fa bh_hdrlen
field exists to account for padding between the header and the link level
protocol.
The purpose here is to guarantee proper alignment of the packet data
structures, which is required on alignment-sensitive architectures and
improves performance on many other architectures.
The packet filter ensures that the
.Fa bpf_hdr
and the network layer header will be word aligned.
Suitable precautions must be taken when accessing the link layer protocol
fields on alignment restricted machines.
(This isn't a problem on an Ethernet, since the type field is a
.Li short
falling on an even offset, and the addresses are probably accessed in a
bytewise fashion).
.Pp
Additionally, individual packets are padded so that each starts on a
word boundary.
This requires that an application has some knowledge of how to get from packet
to packet.
The macro
.Dv BPF_WORDALIGN
is defined in
.Aq Pa net/bpf.h
to facilitate this process.
It rounds up its argument to the nearest word aligned value (where a word is
.Dv BPF_ALIGNMENT
bytes wide).
For example, if
.Va p
is a
.Li struct bpf_hdr *
pointing to the start of a packet, this statement will advance it to the
next packet:
.Pp
.Dl p = (struct bpf_hdr *)((char *)p + BPF_WORDALIGN(p->bh_hdrlen + p->bh_caplen));
.Pp
For the alignment mechanisms to work properly, the buffer passed to
.Xr read 2
must itself be word aligned.
.Xr malloc 3
will always return an aligned buffer.
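.Pp
The packet-to-packet stepping described above can be sketched as a loop
over one buffer returned by
.Xr read 2 .
This is a hedged illustration, not the interface itself:
.Li struct sk_bpf_hdr ,
.Dv SK_WORDALIGN ,
.Dv SK_HDR_MIN
and
.Fn walk_buffer
are hypothetical stand-ins that mirror the layout shown in this section so
that the fragment compiles without
.Aq Pa net/bpf.h ,
and the word size of
.Li sizeof(long)
is an assumption.
Real code should use
.Li struct bpf_hdr
and
.Dv BPF_WORDALIGN .

```c
#include <stddef.h>
#include <string.h>
#include <sys/time.h>

/* Hypothetical mirror of struct bpf_hdr as shown in this section. */
struct sk_bpf_hdr {
	struct timeval bh_tstamp;
	unsigned long bh_caplen;
	unsigned long bh_datalen;
	unsigned short bh_hdrlen;
};

/* Assumed word size; the real value is BPF_ALIGNMENT. */
#define SK_ALIGNMENT	sizeof(long)
#define SK_WORDALIGN(x)	(((x) + (SK_ALIGNMENT - 1)) & ~(SK_ALIGNMENT - 1))
/* Bytes needed to read every header field up to and including bh_hdrlen. */
#define SK_HDR_MIN \
	(offsetof(struct sk_bpf_hdr, bh_hdrlen) + sizeof(unsigned short))

typedef void (*pkt_cb)(const unsigned char *data, size_t caplen, void *arg);

/*
 * Visit each packet in the n bytes at buf, calling cb (if not NULL)
 * with a pointer to the link level header and the captured length.
 * Returns the number of complete packets found.
 */
static size_t
walk_buffer(const unsigned char *buf, size_t n, pkt_cb cb, void *arg)
{
	size_t off = 0, count = 0;

	while (n - off >= SK_HDR_MIN) {
		struct sk_bpf_hdr h;
		size_t step;

		memset(&h, 0, sizeof(h));
		memcpy(&h, buf + off, SK_HDR_MIN);
		/* Stop on a malformed or truncated record. */
		if (h.bh_hdrlen < SK_HDR_MIN || h.bh_hdrlen > n - off ||
		    h.bh_caplen > n - off - h.bh_hdrlen)
			break;
		if (cb != NULL)
			cb(buf + off + h.bh_hdrlen, h.bh_caplen, arg);
		count++;
		/* The last record may lack its trailing padding. */
		step = SK_WORDALIGN(h.bh_hdrlen + h.bh_caplen);
		if (step > n - off)
			break;
		off += step;
	}
	return count;
}
```

.Pp
The sketch always uses
.Fa bh_hdrlen ,
never the size of the structure, to find the packet data, and it copies
the header out of the buffer so that it does not depend on the alignment
of the buffer itself.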
.Ss Filter machine
A filter program is an array of instructions with all branches forwardly
directed, terminated by a
.Dq return
instruction.
Each instruction performs some action on the pseudo-machine state, which
consists of an accumulator, index register, scratch memory store, and
implicit program counter.
.Pp
The following structure defines the instruction format:
.Pp
.Bd -literal -offset indent
struct bpf_insn {
	u_short code;
	u_char jt;
	u_char jf;
	long k;
};
.Ed
.Pp
The
.Fa k
field is used in different ways by different instructions, and the
.Fa jt
and
.Fa jf
fields are used as offsets by the branch instructions.
The opcodes are encoded in a semi-hierarchical fashion.
There are eight classes of instructions:
.Dv BPF_LD ,
.Dv BPF_LDX ,
.Dv BPF_ST ,
.Dv BPF_STX ,
.Dv BPF_ALU ,
.Dv BPF_JMP ,
.Dv BPF_RET ,
and
.Dv BPF_MISC .
Various other mode and operator bits are logically OR'd into the class to
give the actual instructions.
The classes and modes are defined in
.Aq Pa net/bpf.h .
Below are the semantics for each defined
.Nm
instruction.
We use the convention that A is the accumulator, X is the index register,
P[] packet data, and M[] scratch memory store.
P[i:n] gives the data at byte offset
.Dq i
in the packet, interpreted as a word (n=4), unsigned halfword (n=2), or
unsigned byte (n=1).
M[i] gives the i'th word in the scratch memory store, which is only addressed
in word units.
The memory store is indexed from 0 to
.Dv BPF_MEMWORDS Ns No \-1 .
.Fa k ,
.Fa jt ,
and
.Fa jf
are the corresponding fields in the instruction definition.
.Dq len
refers to the length of the packet.
.Pp
.Bl -tag -width Ds
.It Dv BPF_LD
These instructions copy a value into the accumulator.
The type of the source operand is specified by an
.Dq addressing mode
and can be a constant
.Pf ( Dv BPF_IMM ) ,
packet data at a fixed offset
.Pf ( Dv BPF_ABS ) ,
packet data at a variable offset
.Pf ( Dv BPF_IND ) ,
the packet length
.Pf ( Dv BPF_LEN ) ,
or a word in the scratch memory store
.Pf ( Dv BPF_MEM ) .
For
.Dv BPF_IND
and
.Dv BPF_ABS ,
the data size must be specified as a word
.Pf ( Dv BPF_W ) ,
halfword
.Pf ( Dv BPF_H ) ,
or byte
.Pf ( Dv BPF_B ) .
The semantics of all recognized
.Dv BPF_LD
instructions follow.
.Pp
.Bl -tag -width 32n -compact
.Sm off
.It Xo Dv BPF_LD No + Dv BPF_W No +
.Dv BPF_ABS
.Xc
.Sm on
A <- P[k:4]
.Sm off
.It Xo Dv BPF_LD No + Dv BPF_H No +
.Dv BPF_ABS
.Xc
.Sm on
A <- P[k:2]
.Sm off
.It Xo Dv BPF_LD No + Dv BPF_B No +
.Dv BPF_ABS
.Xc
.Sm on
A <- P[k:1]
.Sm off
.It Xo Dv BPF_LD No + Dv BPF_W No +
.Dv BPF_IND
.Xc
.Sm on
A <- P[X+k:4]
.Sm off
.It Xo Dv BPF_LD No + Dv BPF_H No +
.Dv BPF_IND
.Xc
.Sm on
A <- P[X+k:2]
.Sm off
.It Xo Dv BPF_LD No + Dv BPF_B No +
.Dv BPF_IND
.Xc
.Sm on
A <- P[X+k:1]
.Sm off
.It Xo Dv BPF_LD No + Dv BPF_W No +
.Dv BPF_LEN
.Xc
.Sm on
A <- len
.Sm off
.It Dv BPF_LD No + Dv BPF_IMM
.Sm on
A <- k
.Sm off
.It Dv BPF_LD No + Dv BPF_MEM
.Sm on
A <- M[k]
.El
.It Dv BPF_LDX
These instructions load a value into the index register.
Note that the addressing modes are more restricted than those of the
accumulator loads, but they include
.Dv BPF_MSH ,
a hack for efficiently loading the IP header length.
.Pp
.Bl -tag -width 32n -compact
.Sm off
.It Xo Dv BPF_LDX No + Dv BPF_W No +
.Dv BPF_IMM
.Xc
.Sm on
X <- k
.Sm off
.It Xo Dv BPF_LDX No + Dv BPF_W No +
.Dv BPF_MEM
.Xc
.Sm on
X <- M[k]
.Sm off
.It Xo Dv BPF_LDX No + Dv BPF_W No +
.Dv BPF_LEN
.Xc
.Sm on
X <- len
.Sm off
.It Xo Dv BPF_LDX No + Dv BPF_B No +
.Dv BPF_MSH
.Xc
.Sm on
X <- 4*(P[k:1]&0xf)
.El
.It Dv BPF_ST
This instruction stores the accumulator into the scratch memory.
We do not need an addressing mode since there is only one possibility for
the destination.
.Pp
.Bl -tag -width 32n -compact
.It Dv BPF_ST
M[k] <- A
.El
.It Dv BPF_STX
This instruction stores the index register in the scratch memory store.
.Pp
.Bl -tag -width 32n -compact
.It Dv BPF_STX
M[k] <- X
.El
.It Dv BPF_ALU
The ALU instructions perform operations between the accumulator and index
register or constant, and store the result back in the accumulator.
For binary operations, a source mode is required
.Pf ( Dv BPF_K
or
.Dv BPF_X ) .
.Pp
.Bl -tag -width 32n -compact
.Sm off
.It Xo Dv BPF_ALU No + BPF_ADD No +
.Dv BPF_K
.Xc
.Sm on
A <- A + k
.Sm off
.It Xo Dv BPF_ALU No + BPF_SUB No +
.Dv BPF_K
.Xc
.Sm on
A <- A - k
.Sm off
.It Xo Dv BPF_ALU No + BPF_MUL No +
.Dv BPF_K
.Xc
.Sm on
A <- A * k
.Sm off
.It Xo Dv BPF_ALU No + BPF_DIV No +
.Dv BPF_K
.Xc
.Sm on
A <- A / k
.Sm off
.It Xo Dv BPF_ALU No + BPF_AND No +
.Dv BPF_K
.Xc
.Sm on
A <- A & k
.Sm off
.It Xo Dv BPF_ALU No + BPF_OR No +
.Dv BPF_K
.Xc
.Sm on
A <- A | k
.Sm off
.It Xo Dv BPF_ALU No + BPF_LSH No +
.Dv BPF_K
.Xc
.Sm on
A <- A << k
.Sm off
.It Xo Dv BPF_ALU No + BPF_RSH No +
.Dv BPF_K
.Xc
.Sm on
A <- A >> k
.Sm off
.It Xo Dv BPF_ALU No + BPF_ADD No +
.Dv BPF_X
.Xc
.Sm on
A <- A + X
.Sm off
.It Xo Dv BPF_ALU No + BPF_SUB No +
.Dv BPF_X
.Xc
.Sm on
A <- A - X
.Sm off
.It Xo Dv BPF_ALU No + BPF_MUL No +
.Dv BPF_X
.Xc
.Sm on
A <- A * X
.Sm off
.It Xo Dv BPF_ALU No + BPF_DIV No +
.Dv BPF_X
.Xc
.Sm on
A <- A / X
.Sm off
.It Xo Dv BPF_ALU No + BPF_AND No +
.Dv BPF_X
.Xc
.Sm on
A <- A & X
.Sm off
.It Xo Dv BPF_ALU No + BPF_OR No +
.Dv BPF_X
.Xc
.Sm on
A <- A | X
.Sm off
.It Xo Dv BPF_ALU No + BPF_LSH No +
.Dv BPF_X
.Xc
.Sm on
A <- A << X
.Sm off
.It Xo Dv BPF_ALU No + BPF_RSH No +
.Dv BPF_X
.Xc
.Sm on
A <- A >> X
.Sm off
.It Dv BPF_ALU No + BPF_NEG
.Sm on
A <- -A
.El
.It Dv BPF_JMP
The jump instructions alter flow of control.
Conditional jumps compare the accumulator against a constant
.Pf ( Dv BPF_K )
or the index register
.Pf ( Dv BPF_X ) .
If the result is true (or non-zero), the true branch is taken, otherwise the
false branch is taken.
Jump offsets are encoded in 8 bits so the longest jump is 256 instructions.
However, the jump always
.Pf ( Dv BPF_JA )
opcode uses the 32-bit
.Fa k
field as the offset, allowing arbitrarily distant destinations.
All conditionals use unsigned comparison conventions.
.Pp
.Bl -tag -width 32n -compact
.Sm off
.It Dv BPF_JMP No + BPF_JA
.Sm on
pc += k
.Sm off
.It Xo Dv BPF_JMP No + BPF_JGT No +
.Dv BPF_K
.Xc
.Sm on
pc += (A > k) ? jt : jf
.Sm off
.It Xo Dv BPF_JMP No + BPF_JGE No +
.Dv BPF_K
.Xc
.Sm on
pc += (A >= k) ? jt : jf
.Sm off
.It Xo Dv BPF_JMP No + BPF_JEQ No +
.Dv BPF_K
.Xc
.Sm on
pc += (A == k) ? jt : jf
.Sm off
.It Xo Dv BPF_JMP No + BPF_JSET No +
.Dv BPF_K
.Xc
.Sm on
pc += (A & k) ? jt : jf
.Sm off
.It Xo Dv BPF_JMP No + BPF_JGT No +
.Dv BPF_X
.Xc
.Sm on
pc += (A > X) ? jt : jf
.Sm off
.It Xo Dv BPF_JMP No + BPF_JGE No +
.Dv BPF_X
.Xc
.Sm on
pc += (A >= X) ? jt : jf
.Sm off
.It Xo Dv BPF_JMP No + BPF_JEQ No +
.Dv BPF_X
.Xc
.Sm on
pc += (A == X) ? jt : jf
.Sm off
.It Xo Dv BPF_JMP No + BPF_JSET No +
.Dv BPF_X
.Xc
.Sm on
pc += (A & X) ? jt : jf
.El
.It Dv BPF_RET
The return instructions terminate the filter program and specify the amount
of packet to accept (i.e., they return the truncation amount).
A return value of zero indicates that the packet should be ignored.
The return value is either a constant
.Pf ( Dv BPF_K )
or the accumulator
.Pf ( Dv BPF_A ) .
.Pp
.Bl -tag -width 32n -compact
.It Dv BPF_RET No + Dv BPF_A
Accept A bytes.
.It Dv BPF_RET No + Dv BPF_K
Accept k bytes.
.El
.It Dv BPF_MISC
The miscellaneous category was created for anything that doesn't fit into
the above classes, and for any new instructions that might need to be added.
Currently, these are the register transfer instructions that copy the index
register to the accumulator or vice versa.
.Pp
.Bl -tag -width 32n -compact
.Sm off
.It Dv BPF_MISC No + Dv BPF_TAX
.Sm on
X <- A
.Sm off
.It Dv BPF_MISC No + Dv BPF_TXA
.Sm on
A <- X
.El
.El
.Pp
The
.Nm
interface provides the following macros to facilitate array initializers:
.Pp
.Bd -offset indent
.Dv BPF_STMT Ns No ( Ns Ar opcode ,
.Ar operand Ns No )
.Pp
.Dv BPF_JUMP Ns No ( Ns Ar opcode ,
.Ar operand ,
.Ar true_offset ,
.Ar false_offset Ns No )
.Ed
.Sh EXAMPLES
The following filter is taken from the Reverse ARP daemon.
It accepts only Reverse ARP requests.
.Pp
.Bd -literal -offset indent
struct bpf_insn insns[] = {
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 12),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, ETHERTYPE_REVARP, 0, 3),
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 20),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, REVARP_REQUEST, 0, 1),
	BPF_STMT(BPF_RET+BPF_K, sizeof(struct ether_arp) +
	    sizeof(struct ether_header)),
	BPF_STMT(BPF_RET+BPF_K, 0),
};
.Ed
.Pp
This filter accepts only IP packets between host 128.3.112.15 and
128.3.112.35.
.Pp
.Bd -literal -offset indent
struct bpf_insn insns[] = {
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 12),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, ETHERTYPE_IP, 0, 8),
	BPF_STMT(BPF_LD+BPF_W+BPF_ABS, 26),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x8003700f, 0, 2),
	BPF_STMT(BPF_LD+BPF_W+BPF_ABS, 30),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x80037023, 3, 4),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x80037023, 0, 3),
	BPF_STMT(BPF_LD+BPF_W+BPF_ABS, 30),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x8003700f, 0, 1),
	BPF_STMT(BPF_RET+BPF_K, (u_int)-1),
	BPF_STMT(BPF_RET+BPF_K, 0),
};
.Ed
.Pp
Finally, this filter returns only TCP finger packets.
We must parse the IP header to reach the TCP header.
The
.Dv BPF_JSET
instruction checks that the IP fragment offset is 0 so we are sure that we
have a TCP header.
.Pp
.Bd -literal -offset indent
struct bpf_insn insns[] = {
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 12),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, ETHERTYPE_IP, 0, 10),
	BPF_STMT(BPF_LD+BPF_B+BPF_ABS, 23),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, IPPROTO_TCP, 0, 8),
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 20),
	BPF_JUMP(BPF_JMP+BPF_JSET+BPF_K, 0x1fff, 6, 0),
	BPF_STMT(BPF_LDX+BPF_B+BPF_MSH, 14),
	BPF_STMT(BPF_LD+BPF_H+BPF_IND, 14),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 79, 2, 0),
	BPF_STMT(BPF_LD+BPF_H+BPF_IND, 16),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 79, 0, 1),
	BPF_STMT(BPF_RET+BPF_K, (u_int)-1),
	BPF_STMT(BPF_RET+BPF_K, 0),
};
.Ed
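.Pp
To make the control flow of the finger filter easier to follow, here is a
small user-space interpreter for just the instructions that filter uses.
It is a sketch under stated assumptions and not the kernel implementation:
the structure, the
.Dv F_
constants and the macros are hypothetical stand-ins whose numeric values
are assumed to match
.Aq Pa net/bpf.h ,
only
.Dv BPF_ABS
and
.Dv BPF_IND
loads,
.Dv BPF_MSH ,
.Dv BPF_JEQ ,
.Dv BPF_JSET
and
.Dv BPF_RET
with a constant are modelled, and an out-of-range load is assumed to
reject the packet.
The program passed in must end in a return instruction.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct bpf_insn. */
struct insn {
	uint16_t code;
	uint8_t jt, jf;
	uint32_t k;
};

/* Assumed to mirror the encodings in <net/bpf.h>. */
enum {
	F_LD = 0x00, F_LDX = 0x01, F_JMP = 0x05, F_RET = 0x06,	/* classes */
	F_W = 0x00, F_H = 0x08, F_B = 0x10,			/* sizes */
	F_ABS = 0x20, F_IND = 0x40, F_MSH = 0xa0,		/* modes */
	F_JEQ = 0x10, F_JSET = 0x40, F_K = 0x00			/* jumps */
};
#define STMT(c, k)		{ (c), 0, 0, (k) }
#define JUMP(c, k, jt, jf)	{ (c), (jt), (jf), (k) }

/* P[off:size] in network order; clears *ok when out of range. */
static uint32_t
load(const uint8_t *p, size_t len, size_t off, size_t size, int *ok)
{
	uint32_t v = 0;
	size_t i;

	if (off > len || size > len - off) {
		*ok = 0;
		return 0;
	}
	for (i = 0; i < size; i++)
		v = (v << 8) | p[off + i];
	return v;
}

/* Run prog over the packet; returns the number of bytes to accept. */
static uint32_t
run_filter(const struct insn *pc, const uint8_t *p, size_t len)
{
	uint32_t A = 0, X = 0;
	int ok = 1;

	for (;; pc++) {
		size_t size = (pc->code & 0x18) == F_H ? 2 :
		    (pc->code & 0x18) == F_B ? 1 : 4;

		switch (pc->code & 0x07) {
		case F_LD:	/* A <- P[k:n] or P[X+k:n] */
			A = load(p, len, ((pc->code & 0xe0) == F_IND ?
			    (size_t)X : 0) + pc->k, size, &ok);
			break;
		case F_LDX:	/* only BPF_MSH: X <- 4*(P[k:1]&0xf) */
			X = 4 * (load(p, len, pc->k, 1, &ok) & 0xf);
			break;
		case F_JMP:	/* offsets are relative to the next insn */
			if ((pc->code & 0xf0) == F_JSET)
				pc += (A & pc->k) ? pc->jt : pc->jf;
			else
				pc += (A == pc->k) ? pc->jt : pc->jf;
			break;
		case F_RET:
			return pc->k;
		default:
			return 0;
		}
		if (!ok)
			return 0;
	}
}

/* The finger filter from above, with 0x0800 for IP and 6 for TCP. */
static const struct insn finger[] = {
	STMT(F_LD+F_H+F_ABS, 12),
	JUMP(F_JMP+F_JEQ+F_K, 0x0800, 0, 10),
	STMT(F_LD+F_B+F_ABS, 23),
	JUMP(F_JMP+F_JEQ+F_K, 6, 0, 8),
	STMT(F_LD+F_H+F_ABS, 20),
	JUMP(F_JMP+F_JSET+F_K, 0x1fff, 6, 0),
	STMT(F_LDX+F_B+F_MSH, 14),
	STMT(F_LD+F_H+F_IND, 14),
	JUMP(F_JMP+F_JEQ+F_K, 79, 2, 0),
	STMT(F_LD+F_H+F_IND, 16),
	JUMP(F_JMP+F_JEQ+F_K, 79, 0, 1),
	STMT(F_RET+F_K, 0xffffffff),
	STMT(F_RET+F_K, 0),
};
```

.Pp
Calling
.Fn run_filter
with
.Va finger
and an Ethernet frame returns a non-zero truncation amount only when the
frame carries an unfragmented TCP segment whose source or destination port
is 79.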
.Sh SEE ALSO
.Xr ioctl 2 ,
.Xr read 2 ,
.Xr select 2 ,
.Xr signal 3 ,
.Xr tcpdump 8
.Rs
.%A McCanne, S., Jacobson V.
.%J "An efficient, extensible, and portable network monitor"
.Re
.Sh FILES
.Bl -tag -width /dev/bpf[0-9] -compact
.It Pa /dev/bpf[0-9]
BPF devices
.El
.Sh AUTHORS
Steve McCanne of Lawrence Berkeley Laboratory implemented BPF in Summer 1990.
Much of the design is due to Van Jacobson.
.Sh HISTORY
The Enet packet filter was created in 1980 by Mike Accetta and Rick Rashid
at Carnegie-Mellon University.
Jeffrey Mogul, at Stanford, ported the code to BSD and continued its
development from 1983 on.
Since then, it has evolved into the Ultrix Packet Filter at DEC, a STREAMS
NIT module under SunOS 4.1, and BPF.
.Sh BUGS
The read buffer must be of a fixed size (returned by the
.Dv BIOCGBLEN
ioctl).
.Pp
A file that does not request promiscuous mode may receive promiscuously
received packets as a side effect of another file requesting this mode on
the same hardware interface.
This could be fixed in the kernel with additional processing overhead.
However, we favor the model where all files must assume that the interface
is promiscuous, and if so desired, must utilize a filter to reject foreign
packets.
.Pp
Data link protocols with variable length headers are not currently supported.
.Pp
Under SunOS, if a
.Nm
application reads more than 2^31 bytes of data, read will fail with
.Er EINVAL .
You can either fix the bug in SunOS, or lseek to 0 when read fails for this
reason.
918reason.
919
920