.\"	$OpenBSD: bpf.4,v 1.29 2007/05/31 19:19:49 jmc Exp $
.\"     $NetBSD: bpf.4,v 1.7 1995/09/27 18:31:50 thorpej Exp $
.\"
.\" Copyright (c) 1990 The Regents of the University of California.
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that: (1) source code distributions
.\" retain the above copyright notice and this paragraph in its entirety, (2)
.\" distributions including binary code include the above copyright notice and
.\" this paragraph in its entirety in the documentation or other materials
.\" provided with the distribution, and (3) all advertising materials mentioning
.\" features or use of this software display the following acknowledgement:
.\" ``This product includes software developed by the University of California,
.\" Lawrence Berkeley Laboratory and its contributors.'' Neither the name of
.\" the University nor the names of its contributors may be used to endorse
.\" or promote products derived from this software without specific prior
.\" written permission.
.\" THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR IMPLIED
.\" WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
.\" MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
.\"
.\" This document is derived in part from the enet man page (enet.4)
.\" distributed with 4.3BSD Unix.
.\"
.Dd $Mdocdate: May 31 2007 $
.Dt BPF 4
.Os
.Sh NAME
.Nm bpf
.Nd Berkeley Packet Filter
.Sh SYNOPSIS
.Cd "pseudo-device bpfilter"
.Sh DESCRIPTION
The Berkeley Packet Filter provides a raw interface to data link layers in
a protocol-independent fashion.
All packets on the network, even those destined for other hosts, are
accessible through this mechanism.
.Pp
The packet filter appears as a character special device,
.Pa /dev/bpf0 ,
.Pa /dev/bpf1 ,
etc.
After opening the device, the file descriptor must be bound to a specific
network interface with the
.Dv BIOCSETIF
.Xr ioctl 2 .
A given interface can be shared between multiple listeners, and the filter
underlying each descriptor will see an identical packet stream.
.Pp
A separate device file is required for each minor device.
If a file is in use, the open will fail and
.Va errno
will be set to
.Er EBUSY .
The number of open files can be increased by creating additional
device nodes with the
.Xr MAKEDEV 8
script.
.Pp
Associated with each open instance of a
.Nm
file is a user-settable
packet filter.
Whenever a packet is received by an interface, all file descriptors
listening on that interface apply their filter.
Each descriptor that accepts the packet receives its own copy.
.Pp
Reads from these files return the next group of packets that have matched
the filter.
To improve performance, the buffer passed to read must be the same size as
the buffers used internally by
.Nm bpf .
This size is returned by the
.Dv BIOCGBLEN
.Xr ioctl 2
and can be set with
.Dv BIOCSBLEN .
Note that an individual packet larger than this size is necessarily truncated.
.Pp
The packet filter will support any link level protocol that has fixed length
headers.
Currently, only Ethernet, SLIP, and PPP drivers have been modified to
interact with
.Nm bpf .
.Pp
Since packet data is in network byte order, applications should use the
.Xr byteorder 3
macros to extract multi-byte values.
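.Pp
As a minimal sketch of the byte order advice above, the helper below reads
a 16-bit field such as the Ethernet type from captured packet data.
The name
read_be16
is hypothetical and not part of the bpf interface; only
ntohs()
from
byteorder(3)
is a real library call.

```c
#include <arpa/inet.h>	/* ntohs(); see byteorder(3) */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: return the 16-bit value stored in network byte
 * order at byte offset off of a captured packet.
 * memcpy() avoids an unaligned access; ntohs() converts to host order.
 */
static uint16_t
read_be16(const unsigned char *pkt, size_t off)
{
	uint16_t v;

	assert(pkt != NULL);
	memcpy(&v, pkt + off, sizeof(v));
	return ntohs(v);
}
```

On an Ethernet frame, read_be16(pkt, 12) yields the type field in host
order, ready to compare against constants such as ETHERTYPE_IP.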
.Pp
A packet can be sent out on the network by writing to a
.Nm
file descriptor.
Each descriptor can also have a user-settable filter
for controlling the writes.
Only packets matching the filter are sent out of the interface.
The writes are unbuffered, meaning only one packet can be processed per write.
.Pp
Once a descriptor is configured, further changes to the configuration
can be prevented using the
.Dv BIOCLOCK
.Xr ioctl 2 .
.Sh IOCTL INTERFACE
The
.Xr ioctl 2
command codes below are defined in
.Aq Pa net/bpf.h .
All commands require these includes:
.Bd -unfilled -offset indent
.Cd #include <sys/types.h>
.Cd #include <sys/time.h>
.Cd #include <sys/ioctl.h>
.Cd #include <net/bpf.h>
.Ed
.Pp
Additionally,
.Dv BIOCGETIF
and
.Dv BIOCSETIF
require
.Aq Pa sys/socket.h
and
.Aq Pa net/if.h .
.Pp
The (third) argument to the
.Xr ioctl 2
call should be a pointer to the type indicated.
.Pp
.Bl -tag -width Ds -compact
.It Dv BIOCGBLEN Fa "u_int *"
Returns the required buffer length for reads on
.Nm
files.
.Pp
.It Dv BIOCSBLEN Fa "u_int *"
Sets the buffer length for reads on
.Nm
files.
The buffer must be set before the file is attached to an interface with
.Dv BIOCSETIF .
If the requested buffer size cannot be accommodated, the closest allowable
size will be set and returned in the argument.
A read call will result in
.Er EIO
if it is passed a buffer that is not this size.
.Pp
.It Dv BIOCGDLT Fa "u_int *"
Returns the type of the data link layer underlying the attached interface.
.Er EINVAL
is returned if no interface has been specified.
The device types, prefixed with
.Dq DLT_ ,
are defined in
.Aq Pa net/bpf.h .
.Pp
.It Dv BIOCGDLTLIST Fa "struct bpf_dltlist *"
Returns an array of the available types of the data link layer
underlying the attached interface:
.Bd -literal -offset indent
struct bpf_dltlist {
	u_int bfl_len;
	u_int *bfl_list;
};
.Ed
.Pp
The available types are returned in the array pointed to by the
.Va bfl_list
field while their length in
.Vt u_int
is supplied to the
.Va bfl_len
field.
.Er ENOMEM
is returned if there is not enough buffer space and
.Er EFAULT
is returned if a bad address is encountered.
The
.Va bfl_len
field is modified on return to indicate the actual length in
.Vt u_int
of the array returned.
If
.Va bfl_list
is
.Dv NULL ,
the
.Va bfl_len
field is set to indicate the required length of the array in
.Vt u_int .
.Pp
.It Dv BIOCSDLT Fa "u_int *"
Changes the type of the data link layer underlying the attached interface.
.Er EINVAL
is returned if no interface has been specified or the specified
type is not available for the interface.
.Pp
.It Dv BIOCPROMISC
Forces the interface into promiscuous mode.
All packets, not just those destined for the local host, are processed.
Since more than one file can be listening on a given interface, a listener
that opened its interface non-promiscuously may receive packets promiscuously.
This problem can be remedied with an appropriate filter.
.Pp
The interface remains in promiscuous mode until all files listening
promiscuously are closed.
.Pp
.It Dv BIOCFLUSH
Flushes the buffer of incoming packets and resets the statistics that are
returned by
.Dv BIOCGSTATS .
.Pp
.It Dv BIOCLOCK
This ioctl is designed to prevent the security issues associated
with an open
.Nm
descriptor in unprivileged programs.
Even with dropped privileges, an open
.Nm
descriptor can be abused by a rogue program to listen on any interface
on the system, send packets on these interfaces if the descriptor was
opened read-write and send signals to arbitrary processes using the
signaling mechanism of
.Nm bpf .
By allowing only
.Dq known safe
ioctls, the
.Dv BIOCLOCK
ioctl prevents this abuse.
The allowable ioctls are
.Dv BIOCFLUSH ,
.Dv BIOCGBLEN ,
.Dv BIOCGDIRFILT ,
.Dv BIOCGDLT ,
.Dv BIOCGDLTLIST ,
.Dv BIOCGETIF ,
.Dv BIOCGHDRCMPLT ,
.Dv BIOCGRSIG ,
.Dv BIOCGRTIMEOUT ,
.Dv BIOCGSTATS ,
.Dv BIOCIMMEDIATE ,
.Dv BIOCLOCK ,
.Dv BIOCSRTIMEOUT ,
.Dv BIOCVERSION ,
.Dv TIOCGPGRP ,
and
.Dv FIONREAD .
Use of any other ioctl is denied with error
.Er EPERM .
Once a descriptor is locked, it is not possible to unlock it.
A process with root privileges is not affected by the lock.
.Pp
A privileged program can open a
.Nm
device, drop privileges, set the interface, filters and modes on the
descriptor, and lock it.
Once the descriptor is locked, the system is safe
from further abuse through the descriptor.
Locking a descriptor does not prevent writes.
If the application does not need to send packets through
.Nm bpf ,
it can open the device read-only to prevent writing.
If sending packets is necessary, a write-filter can be set before locking the
descriptor to prevent arbitrary packets from being sent out.
.Pp
.It Dv BIOCGETIF Fa "struct ifreq *"
Returns the name of the hardware interface that the file is listening on.
The name is returned in the
.Fa ifr_name
field of the
.Li struct ifreq .
All other fields are undefined.
.Pp
.It Dv BIOCSETIF Fa "struct ifreq *"
Sets the hardware interface associated with the file.
This command must be performed before any packets can be read.
The device is indicated by name using the
.Fa ifr_name
field of the
.Li struct ifreq .
Additionally, performs the actions of
.Dv BIOCFLUSH .
.Pp
.It Dv BIOCSRTIMEOUT Fa "struct timeval *"
.It Dv BIOCGRTIMEOUT Fa "struct timeval *"
Set or get the read timeout parameter.
The
.Ar timeval
specifies the length of time to wait before timing out on a read request.
This parameter is initialized to zero by
.Xr open 2 ,
indicating no timeout.
.Pp
.It Dv BIOCGSTATS Fa "struct bpf_stat *"
Returns the following structure of packet statistics:
.Bd -literal -offset indent
struct bpf_stat {
	u_int bs_recv;
	u_int bs_drop;
};
.Ed
.Pp
The fields are:
.Bl -tag -width bs_recv
.It Fa bs_recv
Number of packets received by the descriptor since opened or reset (including
any buffered since the last read call).
.It Fa bs_drop
Number of packets which were accepted by the filter but dropped by the kernel
because of buffer overflows (i.e., the application's reads aren't keeping up
with the packet traffic).
.El
.Pp
.It Dv BIOCIMMEDIATE Fa "u_int *"
Enable or disable
.Dq immediate mode ,
based on the truth value of the argument.
When immediate mode is enabled, reads return immediately upon packet reception.
Otherwise, a read will block until either the kernel buffer becomes full or a
timeout occurs.
This is useful for programs like
.Xr rarpd 8 ,
which must respond to messages in real time.
The default for a new file is off.
.Pp
.It Dv BIOCSETF Fa "struct bpf_program *"
Sets the filter program used by the kernel to discard uninteresting packets.
An array of instructions and its length are passed in using the following
structure:
.Bd -literal -offset indent
struct bpf_program {
	int bf_len;
	struct bpf_insn *bf_insns;
};
.Ed
.Pp
The filter program is pointed to by the
.Fa bf_insns
field, while its length in units of
.Li struct bpf_insn
is given by the
.Fa bf_len
field.
Also, the actions of
.Dv BIOCFLUSH
are performed.
.Pp
See section
.Sx FILTER MACHINE
for an explanation of the filter language.
.Pp
.It Dv BIOCSETWF Fa "struct bpf_program *"
Sets the filter program used by the kernel to filter the packets
written to the descriptor before the packets are sent out on the
network.
See
.Dv BIOCSETF
for a description of the filter program.
This ioctl also acts as
.Dv BIOCFLUSH .
.Pp
Note that the filter operates on the packet data written to the descriptor.
If the
.Dq header complete
flag is not set, the kernel sets the link-layer source address
of the packet after filtering.
.Pp
.It Dv BIOCVERSION Fa "struct bpf_version *"
Returns the major and minor version numbers of the filter language currently
recognized by the kernel.
Before installing a filter, applications must check that the current version
is compatible with the running kernel.
Version numbers are compatible if the major numbers match and the application
minor is less than or equal to the kernel minor.
The kernel version number is returned in the following structure:
.Bd -literal -offset indent
struct bpf_version {
	u_short bv_major;
	u_short bv_minor;
};
.Ed
.Pp
The current version numbers are given by
.Dv BPF_MAJOR_VERSION
and
.Dv BPF_MINOR_VERSION
from
.Aq Pa net/bpf.h .
An incompatible filter may result in undefined behavior (most likely, an
error returned by
.Xr ioctl 2
or haphazard packet matching).
.Pp
.It Dv BIOCSRSIG Fa "u_int *"
.It Dv BIOCGRSIG Fa "u_int *"
Set or get the receive signal.
This signal will be sent to the process or process group specified by
.Dv FIOSETOWN .
It defaults to
.Dv SIGIO .
.Pp
.It Dv BIOCSHDRCMPLT Fa "u_int *"
.It Dv BIOCGHDRCMPLT Fa "u_int *"
Set or get the status of the
.Dq header complete
flag.
Set to zero if the link level source address should be filled in
automatically by the interface output routine.
Set to one if the link level source address will be written,
as provided, to the wire.
This flag is initialized to zero by default.
.Pp
.It Dv BIOCGFILDROP Fa "u_int *"
.It Dv BIOCSFILDROP Fa "u_int *"
Get or set the status of the
.Dq filter drop
flag.
If non-zero, packets matching any filters will be reported to the
associated interface so that they can be dropped.
.Pp
.It Dv BIOCGDIRFILT Fa "u_int *"
.It Dv BIOCSDIRFILT Fa "u_int *"
Get or set the status of the
.Dq direction filter
flag.
If non-zero, packets matching the specified direction (either
.Dv BPF_DIRECTION_IN
or
.Dv BPF_DIRECTION_OUT )
will be ignored.
.El
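.Pp
The version rule given under BIOCVERSION can be sketched as a small
predicate.
This is an illustration only: the structure
ex_version
and the function
version_compatible
are hypothetical stand-ins, and a real program would compare
BPF_MAJOR_VERSION and BPF_MINOR_VERSION against the
struct bpf_version
filled in by the ioctl.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct bpf_version from <net/bpf.h>. */
struct ex_version {
	unsigned short major;
	unsigned short minor;
};

/*
 * Return 1 if a filter built for version app may be installed on a
 * kernel reporting version kern: the major numbers must match and the
 * application minor must not exceed the kernel minor.
 */
static int
version_compatible(const struct ex_version *app, const struct ex_version *kern)
{
	assert(app != NULL && kern != NULL);
	return app->major == kern->major && app->minor <= kern->minor;
}
```

A program would typically refuse to issue BIOCSETF when this check fails.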
.Ss Standard ioctls
.Nm
now supports several standard ioctls which allow the user to do asynchronous
and/or non-blocking I/O to an open
.Nm
file descriptor.
.Pp
.Bl -tag -width Ds -compact
.It Dv FIONREAD Fa "int *"
Returns the number of bytes that are immediately available for reading.
.Pp
.It Dv SIOCGIFADDR Fa "struct ifreq *"
Returns the address associated with the interface.
.Pp
.It Dv FIONBIO Fa "int *"
Set or clear non-blocking I/O.
If the argument is non-zero, enable non-blocking I/O.
If the argument is zero, disable non-blocking I/O.
If non-blocking I/O is enabled, the return value of a read while no data
is available will be 0.
The non-blocking read behavior is different from performing non-blocking
reads on other file descriptors, which will return \-1 and set
.Va errno
to
.Er EAGAIN
if no data is available.
Note: setting this overrides the timeout set by
.Dv BIOCSRTIMEOUT .
.Pp
.It Dv FIOASYNC Fa "int *"
Enable or disable asynchronous I/O.
When enabled (argument is non-zero), the process or process group specified
by
.Dv FIOSETOWN
will start receiving
.Dv SIGIO
signals when packets arrive.
Note that you must perform an
.Dv FIOSETOWN
command in order for this to take effect, as the system will not do it by
default.
The signal may be changed via
.Dv BIOCSRSIG .
.Pp
.It Dv FIOSETOWN Fa "int *"
.It Dv FIOGETOWN Fa "int *"
Set or get the process or process group (if negative) that should receive
.Dv SIGIO
when packets are available.
The signal may be changed using
.Dv BIOCSRSIG
(see above).
.El
.Ss BPF header
The following structure is prepended to each packet returned by
.Xr read 2 :
.Bd -literal -offset indent
struct bpf_hdr {
	struct bpf_timeval bh_tstamp;
	u_int32_t	bh_caplen;
	u_int32_t	bh_datalen;
	u_int16_t	bh_hdrlen;
};
.Ed
.Pp
The fields, stored in host order, are as follows:
.Bl -tag -width Ds
.It Fa bh_tstamp
Time at which the packet was processed by the packet filter.
.It Fa bh_caplen
Length of the captured portion of the packet.
This is the minimum of the truncation amount specified by the filter and the
length of the packet.
.It Fa bh_datalen
Length of the packet off the wire.
This value is independent of the truncation amount specified by the filter.
.It Fa bh_hdrlen
Length of the BPF header, which may not be equal to
.Li sizeof(struct bpf_hdr) .
.El
.Pp
The
.Fa bh_hdrlen
field exists to account for padding between the header and the link level
protocol.
The purpose here is to guarantee proper alignment of the packet data
structures, which is required on alignment-sensitive architectures and
improves performance on many other architectures.
The packet filter ensures that the
.Fa bpf_hdr
and the network layer header will be word aligned.
Suitable precautions must be taken when accessing the link layer protocol
fields on alignment restricted machines.
(This isn't a problem on an Ethernet, since the type field is a
.Li short
falling on an even offset, and the addresses are probably accessed in a
bytewise fashion).
.Pp
Additionally, individual packets are padded so that each starts on a
word boundary.
This requires that an application has some knowledge of how to get from packet
to packet.
The macro
.Dv BPF_WORDALIGN
is defined in
.Aq Pa net/bpf.h
to facilitate this process.
It rounds up its argument to the nearest word aligned value (where a word is
.Dv BPF_ALIGNMENT
bytes wide).
For example, if
.Va p
points to the start of a packet, this expression will advance it to the
next packet:
.Pp
.Dl p = (char *)p + BPF_WORDALIGN(p->bh_hdrlen + p->bh_caplen);
.Pp
For the alignment mechanisms to work properly, the buffer passed to
.Xr read 2
must itself be word aligned.
.Xr malloc 3
will always return an aligned buffer.
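.Pp
The packet-to-packet stepping described above can be sketched as a loop
over one read buffer.
To keep the sketch self-contained it does not include
net/bpf.h;
the structure
ex_hdr,
the macros
EX_ALIGNMENT
and
EX_WORDALIGN,
and the function
count_packets
are assumptions that only mimic the documented header fields and rounding
rule, and a 4-byte word is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed stand-ins for BPF_ALIGNMENT and BPF_WORDALIGN. */
#define EX_ALIGNMENT	sizeof(uint32_t)
#define EX_WORDALIGN(x)	(((x) + (EX_ALIGNMENT - 1)) & ~(EX_ALIGNMENT - 1))

/* Assumed stand-in for struct bpf_hdr; field order follows the text. */
struct ex_hdr {
	uint32_t	tv_sec;		/* first half of bh_tstamp */
	uint32_t	tv_usec;	/* second half of bh_tstamp */
	uint32_t	caplen;		/* bh_caplen */
	uint32_t	datalen;	/* bh_datalen */
	uint16_t	hdrlen;		/* bh_hdrlen */
};

/*
 * Count the packets in buf, which holds the len bytes returned by one
 * read, and store the sum of the captured lengths in *captured.
 */
static size_t
count_packets(const unsigned char *buf, size_t len, size_t *captured)
{
	size_t off = 0, n = 0;

	assert(buf != NULL && captured != NULL);
	*captured = 0;
	while (off < len && len - off >= sizeof(struct ex_hdr)) {
		struct ex_hdr h;
		size_t rec;

		memcpy(&h, buf + off, sizeof(h));
		rec = (size_t)h.hdrlen + h.caplen;
		if (h.hdrlen == 0 || rec > len - off)
			break;		/* malformed or truncated record */
		*captured += h.caplen;
		n++;
		/* The packet data itself starts at buf + off + h.hdrlen. */
		off += EX_WORDALIGN(rec);
	}
	return n;
}
```

The header is copied out with memcpy() so the sketch does not depend on the
alignment of the buffer; code that casts the buffer to a header pointer
relies on the alignment guarantee described above.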
.Ss Filter machine
A filter program is an array of instructions with all branches forwardly
directed, terminated by a
.Dq return
instruction.
Each instruction performs some action on the pseudo-machine state, which
consists of an accumulator, index register, scratch memory store, and
implicit program counter.
.Pp
The following structure defines the instruction format:
.Bd -literal -offset indent
struct bpf_insn {
	u_int16_t	code;
	u_char		jt;
	u_char		jf;
	u_int32_t	k;
};
.Ed
.Pp
The
.Fa k
field is used in different ways by different instructions, and the
.Fa jt
and
.Fa jf
fields are used as offsets by the branch instructions.
The opcodes are encoded in a semi-hierarchical fashion.
There are eight classes of instructions:
.Dv BPF_LD ,
.Dv BPF_LDX ,
.Dv BPF_ST ,
.Dv BPF_STX ,
.Dv BPF_ALU ,
.Dv BPF_JMP ,
.Dv BPF_RET ,
and
.Dv BPF_MISC .
Various other mode and operator bits are logically OR'd into the class to
give the actual instructions.
The classes and modes are defined in
.Aq Pa net/bpf.h .
Below are the semantics for each defined
.Nm
instruction.
We use the convention that A is the accumulator, X is the index register,
P[] packet data, and M[] scratch memory store.
P[i:n] gives the data at byte offset
.Dq i
in the packet, interpreted as a word (n=4), unsigned halfword (n=2), or
unsigned byte (n=1).
M[i] gives the i'th word in the scratch memory store, which is only addressed
in word units.
The memory store is indexed from 0 to
.Dv BPF_MEMWORDS Ns \-1 .
.Fa k ,
.Fa jt ,
and
.Fa jf
are the corresponding fields in the instruction definition.
.Dq len
refers to the length of the packet.
.Bl -tag -width Ds
.It Dv BPF_LD
These instructions copy a value into the accumulator.
The type of the source operand is specified by an
.Dq addressing mode
and can be a constant
.Pf ( Dv BPF_IMM ) ,
packet data at a fixed offset
.Pf ( Dv BPF_ABS ) ,
packet data at a variable offset
.Pf ( Dv BPF_IND ) ,
the packet length
.Pf ( Dv BPF_LEN ) ,
or a word in the scratch memory store
.Pf ( Dv BPF_MEM ) .
For
.Dv BPF_IND
and
.Dv BPF_ABS ,
the data size must be specified as a word
.Pf ( Dv BPF_W ) ,
halfword
.Pf ( Dv BPF_H ) ,
or byte
.Pf ( Dv BPF_B ) .
The semantics of all recognized
.Dv BPF_LD
instructions follow.
.Pp
.Bl -tag -width 32n -compact
.Sm off
.It Xo Dv BPF_LD No + Dv BPF_W No +
.Dv BPF_ABS
.Xc
.Sm on
A <- P[k:4]
.Sm off
.It Xo Dv BPF_LD No + Dv BPF_H No +
.Dv BPF_ABS
.Xc
.Sm on
A <- P[k:2]
.Sm off
.It Xo Dv BPF_LD No + Dv BPF_B No +
.Dv BPF_ABS
.Xc
.Sm on
A <- P[k:1]
.Sm off
.It Xo Dv BPF_LD No + Dv BPF_W No +
.Dv BPF_IND
.Xc
.Sm on
A <- P[X+k:4]
.Sm off
.It Xo Dv BPF_LD No + Dv BPF_H No +
.Dv BPF_IND
.Xc
.Sm on
A <- P[X+k:2]
.Sm off
.It Xo Dv BPF_LD No + Dv BPF_B No +
.Dv BPF_IND
.Xc
.Sm on
A <- P[X+k:1]
.Sm off
.It Xo Dv BPF_LD No + Dv BPF_W No +
.Dv BPF_LEN
.Xc
.Sm on
A <- len
.Sm off
.It Dv BPF_LD No + Dv BPF_IMM
.Sm on
A <- k
.Sm off
.It Dv BPF_LD No + Dv BPF_MEM
.Sm on
A <- M[k]
.El
.It Dv BPF_LDX
These instructions load a value into the index register.
Note that the addressing modes are more restricted than those of the
accumulator loads, but they include
.Dv BPF_MSH ,
a hack for efficiently loading the IP header length.
.Pp
.Bl -tag -width 32n -compact
.Sm off
.It Xo Dv BPF_LDX No + Dv BPF_W No +
.Dv BPF_IMM
.Xc
.Sm on
X <- k
.Sm off
.It Xo Dv BPF_LDX No + Dv BPF_W No +
.Dv BPF_MEM
.Xc
.Sm on
X <- M[k]
.Sm off
.It Xo Dv BPF_LDX No + Dv BPF_W No +
.Dv BPF_LEN
.Xc
.Sm on
X <- len
.Sm off
.It Xo Dv BPF_LDX No + Dv BPF_B No +
.Dv BPF_MSH
.Xc
.Sm on
X <- 4*(P[k:1]&0xf)
.El
.It Dv BPF_ST
This instruction stores the accumulator into the scratch memory.
We do not need an addressing mode since there is only one possibility for
the destination.
.Pp
.Bl -tag -width 32n -compact
.It Dv BPF_ST
M[k] <- A
.El
.It Dv BPF_STX
This instruction stores the index register in the scratch memory store.
.Pp
.Bl -tag -width 32n -compact
.It Dv BPF_STX
M[k] <- X
.El
.It Dv BPF_ALU
The ALU instructions perform operations between the accumulator and index
register or constant, and store the result back in the accumulator.
For binary operations, a source mode is required
.Pf ( Dv BPF_K
or
.Dv BPF_X ) .
.Pp
.Bl -tag -width 32n -compact
.Sm off
.It Xo Dv BPF_ALU No + BPF_ADD No +
.Dv BPF_K
.Xc
.Sm on
A <- A + k
.Sm off
.It Xo Dv BPF_ALU No + BPF_SUB No +
.Dv BPF_K
.Xc
.Sm on
A <- A - k
.Sm off
.It Xo Dv BPF_ALU No + BPF_MUL No +
.Dv BPF_K
.Xc
.Sm on
A <- A * k
.Sm off
.It Xo Dv BPF_ALU No + BPF_DIV No +
.Dv BPF_K
.Xc
.Sm on
A <- A / k
.Sm off
.It Xo Dv BPF_ALU No + BPF_AND No +
.Dv BPF_K
.Xc
.Sm on
A <- A & k
.Sm off
.It Xo Dv BPF_ALU No + BPF_OR No +
.Dv BPF_K
.Xc
.Sm on
A <- A | k
.Sm off
.It Xo Dv BPF_ALU No + BPF_LSH No +
.Dv BPF_K
.Xc
.Sm on
A <- A << k
.Sm off
.It Xo Dv BPF_ALU No + BPF_RSH No +
.Dv BPF_K
.Xc
.Sm on
A <- A >> k
.Sm off
.It Xo Dv BPF_ALU No + BPF_ADD No +
.Dv BPF_X
.Xc
.Sm on
A <- A + X
.Sm off
.It Xo Dv BPF_ALU No + BPF_SUB No +
.Dv BPF_X
.Xc
.Sm on
A <- A - X
.Sm off
.It Xo Dv BPF_ALU No + BPF_MUL No +
.Dv BPF_X
.Xc
.Sm on
A <- A * X
.Sm off
.It Xo Dv BPF_ALU No + BPF_DIV No +
.Dv BPF_X
.Xc
.Sm on
A <- A / X
.Sm off
.It Xo Dv BPF_ALU No + BPF_AND No +
.Dv BPF_X
.Xc
.Sm on
A <- A & X
.Sm off
.It Xo Dv BPF_ALU No + BPF_OR No +
.Dv BPF_X
.Xc
.Sm on
A <- A | X
.Sm off
.It Xo Dv BPF_ALU No + BPF_LSH No +
.Dv BPF_X
.Xc
.Sm on
A <- A << X
.Sm off
.It Xo Dv BPF_ALU No + BPF_RSH No +
.Dv BPF_X
.Xc
.Sm on
A <- A >> X
.Sm off
.It Dv BPF_ALU No + BPF_NEG
.Sm on
A <- -A
.El
.It Dv BPF_JMP
The jump instructions alter flow of control.
Conditional jumps compare the accumulator against a constant
.Pf ( Dv BPF_K )
or the index register
.Pf ( Dv BPF_X ) .
If the result is true (or non-zero), the true branch is taken, otherwise the
false branch is taken.
Jump offsets are encoded in 8 bits so the longest jump is 256 instructions.
However, the jump always
.Pf ( Dv BPF_JA )
opcode uses the 32-bit
.Fa k
field as the offset, allowing arbitrarily distant destinations.
All conditionals use unsigned comparison conventions.
.Pp
.Bl -tag -width 32n -compact
.Sm off
.It Dv BPF_JMP No + BPF_JA
.Sm on
pc += k
.Sm off
.It Xo Dv BPF_JMP No + BPF_JGT No +
.Dv BPF_K
.Xc
.Sm on
pc += (A > k) ? jt : jf
.Sm off
.It Xo Dv BPF_JMP No + BPF_JGE No +
.Dv BPF_K
.Xc
.Sm on
pc += (A >= k) ? jt : jf
.Sm off
.It Xo Dv BPF_JMP No + BPF_JEQ No +
.Dv BPF_K
.Xc
.Sm on
pc += (A == k) ? jt : jf
.Sm off
.It Xo Dv BPF_JMP No + BPF_JSET No +
.Dv BPF_K
.Xc
.Sm on
pc += (A & k) ? jt : jf
.Sm off
.It Xo Dv BPF_JMP No + BPF_JGT No +
.Dv BPF_X
.Xc
.Sm on
pc += (A > X) ? jt : jf
.Sm off
.It Xo Dv BPF_JMP No + BPF_JGE No +
.Dv BPF_X
.Xc
.Sm on
pc += (A >= X) ? jt : jf
.Sm off
.It Xo Dv BPF_JMP No + BPF_JEQ No +
.Dv BPF_X
.Xc
.Sm on
pc += (A == X) ? jt : jf
.Sm off
.It Xo Dv BPF_JMP No + BPF_JSET No +
.Dv BPF_X
.Xc
.Sm on
pc += (A & X) ? jt : jf
.El
.It Dv BPF_RET
The return instructions terminate the filter program and specify the
amount of packet to accept (i.e., they return the truncation amount)
or, for the write filter, the maximum acceptable size for the packet
(i.e., the packet is dropped if it is larger than the returned
amount).
A return value of zero indicates that the packet should be ignored/dropped.
The return value is either a constant
.Pf ( Dv BPF_K )
or the accumulator
.Pf ( Dv BPF_A ) .
.Pp
.Bl -tag -width 32n -compact
.It Dv BPF_RET No + Dv BPF_A
Accept A bytes.
.It Dv BPF_RET No + Dv BPF_K
Accept k bytes.
.El
.It Dv BPF_MISC
The miscellaneous category was created for anything that doesn't fit into
the above classes, and for any new instructions that might need to be added.
Currently, these are the register transfer instructions that copy the index
register to the accumulator or vice versa.
.Pp
.Bl -tag -width 32n -compact
.Sm off
.It Dv BPF_MISC No + Dv BPF_TAX
.Sm on
X <- A
.Sm off
.It Dv BPF_MISC No + Dv BPF_TXA
.Sm on
A <- X
.El
.El
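.Pp
As an illustration of the BPF_MSH addressing mode listed above, the
expression 4*(P[k:1]&0xf) can be written out in C.
The function name
ex_msh
is hypothetical; the sketch only restates the documented semantics and is
not kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * What BPF_LDX+BPF_B+BPF_MSH computes: X <- 4*(P[k:1]&0xf).
 * The low nibble of the first IP header byte is the header length in
 * 32-bit words, so the result is the header length in bytes.
 */
static uint32_t
ex_msh(const unsigned char *pkt, size_t k)
{
	assert(pkt != NULL);
	return 4 * (uint32_t)(pkt[k] & 0xf);
}
```

With a 14-byte Ethernet header the IP header starts at offset 14, so k is
14 there; an IP header without options has a first byte of 0x45.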
.Pp
The
.Nm
interface provides the following macros to facilitate array initializers:
.Bd -filled -offset indent
.Dv BPF_STMT ( Ns Ar opcode ,
.Ar operand )
.Pp
.Dv BPF_JUMP ( Ns Ar opcode ,
.Ar operand ,
.Ar true_offset ,
.Ar false_offset )
.Ed
.Sh FILES
.Bl -tag -width /dev/bpf[0-9] -compact
.It Pa /dev/bpf[0-9]
.Nm
devices
.El
.Sh EXAMPLES
The following filter is taken from the Reverse ARP daemon.
It accepts only Reverse ARP requests.
.Bd -literal -offset indent
struct bpf_insn insns[] = {
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 12),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, ETHERTYPE_REVARP, 0, 3),
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 20),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, REVARP_REQUEST, 0, 1),
	BPF_STMT(BPF_RET+BPF_K, sizeof(struct ether_arp) +
	    sizeof(struct ether_header)),
	BPF_STMT(BPF_RET+BPF_K, 0),
};
.Ed
.Pp
This filter accepts only IP packets between host 128.3.112.15 and
128.3.112.35.
.Bd -literal -offset indent
struct bpf_insn insns[] = {
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 12),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, ETHERTYPE_IP, 0, 8),
	BPF_STMT(BPF_LD+BPF_W+BPF_ABS, 26),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x8003700f, 0, 2),
	BPF_STMT(BPF_LD+BPF_W+BPF_ABS, 30),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x80037023, 3, 4),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x80037023, 0, 3),
	BPF_STMT(BPF_LD+BPF_W+BPF_ABS, 30),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x8003700f, 0, 1),
	BPF_STMT(BPF_RET+BPF_K, (u_int)-1),
	BPF_STMT(BPF_RET+BPF_K, 0),
};
.Ed
.Pp
Finally, this filter returns only TCP finger packets.
We must parse the IP header to reach the TCP header.
The
.Dv BPF_JSET
instruction checks that the IP fragment offset is 0 so we are sure that we
have a TCP header.
.Bd -literal -offset indent
struct bpf_insn insns[] = {
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 12),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, ETHERTYPE_IP, 0, 10),
	BPF_STMT(BPF_LD+BPF_B+BPF_ABS, 23),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, IPPROTO_TCP, 0, 8),
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 20),
	BPF_JUMP(BPF_JMP+BPF_JSET+BPF_K, 0x1fff, 6, 0),
	BPF_STMT(BPF_LDX+BPF_B+BPF_MSH, 14),
	BPF_STMT(BPF_LD+BPF_H+BPF_IND, 14),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 79, 2, 0),
	BPF_STMT(BPF_LD+BPF_H+BPF_IND, 16),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 79, 0, 1),
	BPF_STMT(BPF_RET+BPF_K, (u_int)-1),
	BPF_STMT(BPF_RET+BPF_K, 0),
};
.Ed
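.Pp
To make the jump offsets in the filters above easier to follow, here is a
minimal user-space sketch that interprets just the three opcodes used by
the Reverse ARP filter.
It is not the kernel's implementation.
Everything prefixed with
ex_
or
EX_
is a stand-in defined here rather than taken from
net/bpf.h,
and the numeric opcode values, the EtherType 0x8035, the request code 3 and
the length 42 are assumptions written out so that the sketch compiles on
its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for struct bpf_insn. */
struct ex_insn {
	uint16_t	code;
	uint8_t		jt;
	uint8_t		jf;
	uint32_t	k;
};

#define EX_LD_H_ABS	0x28	/* assumed BPF_LD+BPF_H+BPF_ABS */
#define EX_JEQ_K	0x15	/* assumed BPF_JMP+BPF_JEQ+BPF_K */
#define EX_RET_K	0x06	/* assumed BPF_RET+BPF_K */
#define EX_STMT(c, k)		{ (c), 0, 0, (k) }
#define EX_JUMP(c, k, t, f)	{ (c), (t), (f), (k) }

/*
 * Run the n instructions at prog over the len bytes at pkt and return
 * the number of bytes to accept; 0 means reject.
 * Out-of-range loads and unknown opcodes reject the packet.
 */
static uint32_t
ex_filter(const struct ex_insn *prog, size_t n, const unsigned char *pkt,
    size_t len)
{
	uint32_t A = 0;
	size_t pc = 0;

	assert(prog != NULL && pkt != NULL);
	while (pc < n) {
		const struct ex_insn *in = &prog[pc++];

		switch (in->code) {
		case EX_LD_H_ABS:	/* A <- P[k:2] */
			if (in->k > len || len - in->k < 2)
				return 0;
			A = (uint32_t)pkt[in->k] << 8 | pkt[in->k + 1];
			break;
		case EX_JEQ_K:		/* pc += (A == k) ? jt : jf */
			pc += (A == in->k) ? in->jt : in->jf;
			break;
		case EX_RET_K:		/* accept k bytes */
			return in->k;
		default:
			return 0;
		}
	}
	return 0;
}

/* The Reverse ARP filter with its constants written out. */
static const struct ex_insn ex_rarp[] = {
	EX_STMT(EX_LD_H_ABS, 12),
	EX_JUMP(EX_JEQ_K, 0x8035, 0, 3),	/* assumed ETHERTYPE_REVARP */
	EX_STMT(EX_LD_H_ABS, 20),
	EX_JUMP(EX_JEQ_K, 3, 0, 1),		/* assumed REVARP_REQUEST */
	EX_STMT(EX_RET_K, 42),			/* assumed 14 + 28 bytes */
	EX_STMT(EX_RET_K, 0),
};
```

Note that pc already points at the following instruction when a jump
offset is added, which is why an offset of 0 falls through to the next
instruction.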
.Sh SEE ALSO
.Xr ioctl 2 ,
.Xr read 2 ,
.Xr select 2 ,
.Xr signal 3 ,
.Xr MAKEDEV 8 ,
.Xr tcpdump 8
.Rs
.%A McCanne, S.
.%A Jacobson, V.
.%J "An efficient, extensible, and portable network monitor"
.Re
.Sh HISTORY
The Enet packet filter was created in 1980 by Mike Accetta and Rick Rashid
at Carnegie-Mellon University.
Jeffrey Mogul, at Stanford, ported the code to BSD and continued its
development from 1983 on.
Since then, it has evolved into the Ultrix Packet Filter at DEC, a STREAMS
NIT module under SunOS 4.1, and BPF.
.Sh AUTHORS
Steve McCanne of Lawrence Berkeley Laboratory implemented BPF in Summer 1990.
Much of the design is due to Van Jacobson.
.Sh BUGS
The read buffer must be of a fixed size (returned by the
.Dv BIOCGBLEN
ioctl).
.Pp
A file that does not request promiscuous mode may receive promiscuously
received packets as a side effect of another file requesting this mode on
the same hardware interface.
This could be fixed in the kernel with additional processing overhead.
However, we favor the model where all files must assume that the interface
is promiscuous, and if so desired, must utilize a filter to reject foreign
packets.
.Pp
Data link protocols with variable length headers are not currently supported.