.\" $NetBSD: bpf.4,v 1.45 2010/03/22 18:58:31 joerg Exp $
.\"
.\" -*- nroff -*-
.\"
.\"	$NetBSD: bpf.4,v 1.45 2010/03/22 18:58:31 joerg Exp $
.\"
.\" Copyright (c) 1990, 1991, 1992, 1993, 1994
.\"	The Regents of the University of California.  All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that: (1) source code distributions
.\" retain the above copyright notice and this paragraph in its entirety, (2)
.\" distributions including binary code include the above copyright notice and
.\" this paragraph in its entirety in the documentation or other materials
.\" provided with the distribution, and (3) all advertising materials mentioning
.\" features or use of this software display the following acknowledgement:
.\" ``This product includes software developed by the University of California,
.\" Lawrence Berkeley Laboratory and its contributors.'' Neither the name of
.\" the University nor the names of its contributors may be used to endorse
.\" or promote products derived from this software without specific prior
.\" written permission.
.\" THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR IMPLIED
.\" WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
.\" MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
.\"
.\" This document is derived in part from the enet man page (enet.4)
.\" distributed with 4.3BSD Unix.
.\"
.Dd March 13, 2010
.Dt BPF 4
.Os
.Sh NAME
.Nm bpf
.Nd Berkeley Packet Filter raw network interface
.Sh SYNOPSIS
.Cd "pseudo-device bpfilter"
.Sh DESCRIPTION
The Berkeley Packet Filter
provides a raw interface to data link layers in a protocol
independent fashion.
All packets on the network, even those destined for other hosts,
are accessible through this mechanism.
.Pp
The packet filter appears as a character special device,
.Pa /dev/bpf .
After opening the device, the file descriptor must be bound to a
specific network interface with the
.Dv BIOCSETIF
ioctl.
A given interface can be shared by multiple listeners, and the filter
underlying each descriptor will see an identical packet stream.
.Pp
Associated with each open instance of a
.Nm
file is a user-settable packet filter.
Whenever a packet is received by an interface,
all file descriptors listening on that interface apply their filter.
Each descriptor that accepts the packet receives its own copy.
.Pp
Reads from these files return the next group of packets
that have matched the filter.
To improve performance, the buffer passed to read must be
the same size as the buffers used internally by
.Nm .
This size is returned by the
.Dv BIOCGBLEN
ioctl (see below), and under
BSD, can be set with
.Dv BIOCSBLEN .
Note that an individual packet larger than this size is necessarily
truncated.
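.Pp
The open, bind, and buffer-size steps described above might be sketched as
follows.
This is a minimal illustration, not a reference implementation:
bpf_open_on is a hypothetical helper name, the extra include of sys/socket.h
is a precaution that some systems may not need, and the preprocessor guard
exists only so that the sketch still compiles (and simply fails with ENOSYS)
on systems that do not provide the bpf ioctls.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Pull in the bpf headers only where they exist (assumption: the
 * compiler supports __has_include, as GCC and Clang do). */
#if defined(__has_include)
# if __has_include(<net/bpf.h>)
#  include <sys/types.h>
#  include <sys/time.h>
#  include <sys/ioctl.h>
#  include <sys/socket.h>
#  include <net/if.h>
#  include <net/bpf.h>
# endif
#endif

/*
 * Hypothetical helper: open /dev/bpf, bind it to the interface named
 * ifname, and store the required read buffer length in *blen.
 * Returns the descriptor, or -1 with errno set.
 */
int
bpf_open_on(const char *ifname, unsigned int *blen)
{
#if defined(BIOCSETIF) && defined(BIOCGBLEN)
	struct ifreq ifr;
	int fd, saved;

	memset(&ifr, 0, sizeof(ifr));
	if (strlen(ifname) >= sizeof(ifr.ifr_name)) {
		errno = ENAMETOOLONG;
		return -1;
	}
	memcpy(ifr.ifr_name, ifname, strlen(ifname) + 1);

	fd = open("/dev/bpf", O_RDWR);
	if (fd == -1)
		return -1;
	/* Bind to the interface, then ask how large read buffers must be. */
	if (ioctl(fd, BIOCSETIF, &ifr) == -1 ||
	    ioctl(fd, BIOCGBLEN, blen) == -1) {
		saved = errno;
		close(fd);
		errno = saved;
		return -1;
	}
	return fd;
#else
	/* No bpf ioctls on this system: report that and fail. */
	(void)ifname;
	(void)blen;
	errno = ENOSYS;
	return -1;
#endif
}
```

A caller would allocate a buffer of the length stored in blen and pass a
buffer of exactly that size to every subsequent read.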
.Pp
The packet filter will support any link level protocol that has fixed length
headers.
Currently, only Ethernet, SLIP and PPP drivers have been
modified to interact with
.Nm .
.Pp
Since packet data is in network byte order, applications should use the
.Xr byteorder 3
macros to extract multi-byte values.
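.Pp
As an illustration of the byte-order advice above, the following sketch
reads 16-bit and 32-bit fields from raw packet bytes.
The helper names pkt_get16 and pkt_get32 are hypothetical; copying with
memcpy before converting is one way to avoid unaligned accesses, not the
only one.

```c
#include <arpa/inet.h>	/* ntohs(), ntohl() */
#include <stdint.h>
#include <string.h>

/* Read a 16-bit network-order field at byte offset off of pkt. */
uint16_t
pkt_get16(const unsigned char *pkt, size_t off)
{
	uint16_t v;

	memcpy(&v, pkt + off, sizeof(v));	/* avoids unaligned access */
	return ntohs(v);
}

/* Read a 32-bit network-order field at byte offset off of pkt. */
uint32_t
pkt_get32(const unsigned char *pkt, size_t off)
{
	uint32_t v;

	memcpy(&v, pkt + off, sizeof(v));
	return ntohl(v);
}
```

On an Ethernet frame, pkt_get16(pkt, 12) yields the type field in host
order, ready to compare against a constant such as ETHERTYPE_IP.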
.Pp
A packet can be sent out on the network by writing to a
.Nm
file descriptor.
The writes are unbuffered, meaning only one packet can be processed per write.
Currently, only writes to Ethernets and SLIP links are supported.
.Sh IOCTLS
The
.Xr ioctl 2
command codes below are defined in
.In net/bpf.h .
All commands require these includes:
.Bd -literal -offset indent
#include \*[Lt]sys/types.h\*[Gt]
#include \*[Lt]sys/time.h\*[Gt]
#include \*[Lt]sys/ioctl.h\*[Gt]
#include \*[Lt]net/bpf.h\*[Gt]
.Ed
.Pp
Additionally,
.Dv BIOCGETIF
and
.Dv BIOCSETIF
require
.In net/if.h .
.Pp
The third argument to
.Xr ioctl 2
should be a pointer to the type indicated.
.Bl -tag -width indent -offset indent
.It Dv "BIOCGBLEN (u_int)"
Returns the required buffer length for reads on
.Nm
files.
.It Dv "BIOCSBLEN (u_int)"
Sets the buffer length for reads on
.Nm
files.
The buffer length must be set before the file is attached to an interface with
.Dv BIOCSETIF .
If the requested buffer size cannot be accommodated, the closest
allowable size will be set and returned in the argument.
A read call will result in
.Er EINVAL
if it is passed a buffer that is not this size.
.It Dv BIOCGDLT (u_int)
Returns the type of the data link layer underlying the attached interface.
.Er EINVAL
is returned if no interface has been specified.
The device types, prefixed with
.Dq DLT_ ,
are defined in
.In net/bpf.h .
.It Dv BIOCGDLTLIST (struct bpf_dltlist)
Returns an array of the available data link layer types
for the attached interface:
.Bd -literal -offset indent
struct bpf_dltlist {
	u_int bfl_len;
	u_int *bfl_list;
};
.Ed
.Pp
The available types are returned in the array pointed to by the
.Va bfl_list
field, while its length in u_int is supplied in the
.Va bfl_len
field.
.Er ENOMEM
is returned if the buffer is not large enough.
The
.Va bfl_len
field is modified on return to indicate the actual length in u_int
of the array returned.
If
.Va bfl_list
is
.Dv NULL ,
the
.Va bfl_len
field is returned to indicate the required length of an array in u_int.
.It Dv BIOCSDLT (u_int)
Changes the type of the data link layer underlying the attached interface.
.Er EINVAL
is returned if no interface has been specified or the specified
type is not available for the interface.
.It Dv BIOCPROMISC
Forces the interface into promiscuous mode.
All packets, not just those destined for the local host, are processed.
Since more than one file can be listening on a given interface,
a listener that opened its interface non-promiscuously may receive
packets promiscuously.
This problem can be remedied with an appropriate filter.
.Pp
The interface remains in promiscuous mode until all files listening
promiscuously are closed.
.It Dv BIOCFLUSH
Flushes the buffer of incoming packets,
and resets the statistics that are returned by
.Dv BIOCGSTATS .
.It Dv BIOCGETIF (struct ifreq)
Returns the name of the hardware interface that the file is listening on.
The name is returned in the
.Va ifr_name
field of
.Fa ifr .
All other fields are undefined.
.It Dv BIOCSETIF (struct ifreq)
Sets the hardware interface associated with the file.
This command must be performed before any packets can be read.
The device is indicated by name using the
.Dv ifr_name
field of the
.Fa ifreq .
Additionally, performs the actions of
.Dv BIOCFLUSH .
.It Dv BIOCSRTIMEOUT, BIOCGRTIMEOUT (struct timeval)
Set or get the read timeout parameter.
The
.Fa timeval
specifies the length of time to wait before timing
out on a read request.
This parameter is initialized to zero by
.Xr open 2 ,
indicating no timeout.
.It Dv BIOCGSTATS (struct bpf_stat)
Returns the following structure of packet statistics:
.Bd -literal -offset indent
struct bpf_stat {
	uint64_t bs_recv;
	uint64_t bs_drop;
	uint64_t bs_capt;
	uint64_t bs_padding[13];
};
.Ed
.Pp
The fields are:
.Bl -tag -width bs_recv -offset indent
.It Va bs_recv
the number of packets received by the descriptor since opened or reset
(including any buffered since the last read call);
.It Va bs_drop
the number of packets which were accepted by the filter but dropped by the
kernel because of buffer overflows
(i.e., the application's reads aren't keeping up with the packet
traffic); and
.It Va bs_capt
the number of packets accepted by the filter.
.El
.It Dv BIOCIMMEDIATE (u_int)
Enable or disable
.Dq immediate mode ,
based on the truth value of the argument.
When immediate mode is enabled, reads return immediately upon packet
reception.
Otherwise, a read will block until either the kernel buffer
becomes full or a timeout occurs.
This is useful for programs like
.Xr rarpd 8 ,
which must respond to messages in real time.
The default for a new file is off.
.It Dv BIOCSETF (struct bpf_program)
Sets the filter program used by the kernel to discard uninteresting
packets.
An array of instructions and its length are passed in using the following
structure:
.Bd -literal -offset indent
struct bpf_program {
	u_int bf_len;
	struct bpf_insn *bf_insns;
};
.Ed
.Pp
The filter program is pointed to by the
.Va bf_insns
field while its length in units of
.Sq struct bpf_insn
is given by the
.Va bf_len
field.
Also, the actions of
.Dv BIOCFLUSH
are performed.
.Pp
See section
.Sy FILTER MACHINE
for an explanation of the filter language.
.It Dv BIOCVERSION (struct bpf_version)
Returns the major and minor version numbers of the filter language currently
recognized by the kernel.
Before installing a filter, applications must check
that the current version is compatible with the running kernel.
Version numbers are compatible if the major numbers match and the
application minor is less than or equal to the kernel minor.
The kernel version number is returned in the following structure:
.Bd -literal -offset indent
struct bpf_version {
	u_short bv_major;
	u_short bv_minor;
};
.Ed
.Pp
The current version numbers are given by
.Dv BPF_MAJOR_VERSION
and
.Dv BPF_MINOR_VERSION
from
.In net/bpf.h .
An incompatible filter
may result in undefined behavior (most likely, an error returned by
.Xr ioctl 2
or haphazard packet matching).
.It Dv BIOCGHDRCMPLT BIOCSHDRCMPLT (u_int)
Enable/disable or get the
.Dq header complete
flag status.
If enabled, packets written to the bpf file descriptor will not have
network layer headers rewritten in the interface output routine.
By default, the flag is disabled (value is 0).
.It Dv BIOCGSEESENT BIOCSSEESENT (u_int)
Enable/disable or get the
.Dq see sent
flag status.
If enabled, packets sent by the host (not from
.Nm )
will be passed to the filter.
By default, the flag is enabled (value is 1).
.It Dv BIOCFEEDBACK BIOCSFEEDBACK BIOCGFEEDBACK (u_int)
Set (or get)
.Dq packet feedback mode .
This allows injected packets to be fed back as input to the interface when
output via the interface is successful.
The first name is meant for
.Fx
compatibility, the two others follow the Get/Set convention.
.\"When
.\".Dv BPF_D_INOUT
.\"direction is set, injected
Injected
outgoing packets are not returned by BPF to avoid
duplication.
This flag is initialized to zero by default.
.El
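.Pp
The compatibility rule stated under BIOCVERSION can be written as a small
predicate.
The sketch below is an illustration only; bpf_version_ok is a hypothetical
name, and the caller is assumed to pass the values it was compiled with
alongside the values the kernel returned.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: app_* are the version numbers the application
 * was built against, kern_* are the numbers returned by BIOCVERSION.
 * Compatible means equal majors and app minor <= kernel minor.
 */
bool
bpf_version_ok(unsigned short app_major, unsigned short app_minor,
    unsigned short kern_major, unsigned short kern_minor)
{
	return app_major == kern_major && app_minor <= kern_minor;
}
```

An application would typically refuse to install its filter when this
predicate is false.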
.Sh STANDARD IOCTLS
.Nm
supports several standard
.Xr ioctl 2
commands that allow asynchronous and/or non-blocking I/O on an open
.Nm
file descriptor.
.Bl -tag -width indent -offset indent
.It Dv FIONREAD (int)
Returns the number of bytes that are immediately available for reading.
.It Dv SIOCGIFADDR (struct ifreq)
Returns the address associated with the interface.
.It Dv FIONBIO (int)
Set or clear non-blocking I/O.
If arg is non-zero, then doing a
.Xr read 2
when no data is available will return -1 and
.Va errno
will be set to
.Er EAGAIN .
If arg is zero, non-blocking I/O is disabled.
Note: setting this
overrides the timeout set by
.Dv BIOCSRTIMEOUT .
.It Dv FIOASYNC (int)
Enable or disable asynchronous I/O.
When enabled (arg is non-zero), the process or process group specified by
.Dv FIOSETOWN
will start receiving SIGIO signals when packets
arrive.
Note that you must do an
.Dv FIOSETOWN
in order for this to take effect, as
the system does not set a default owner.
The signal may be changed via
.Dv BIOCSRSIG .
.It Dv FIOSETOWN FIOGETOWN (int)
Set or get the process or process group (if negative) that should receive SIGIO
when packets are available.
The signal may be changed using
.Dv BIOCSRSIG
(see above).
.El
.Sh BPF HEADER
The following structure is prepended to each packet returned by
.Xr read 2 :
.Bd -literal -offset indent
struct bpf_hdr {
	struct bpf_timeval bh_tstamp;
	uint32_t bh_caplen;
	uint32_t bh_datalen;
	uint16_t bh_hdrlen;
};
.Ed
.Pp
The fields, whose values are stored in host order, are:
.Bl -tag -width bh_datalen -offset indent
.It Va bh_tstamp
The time at which the packet was processed by the packet filter.
This structure differs from the standard
.Vt struct timeval
in that both members are of type
.Vt long .
.It Va bh_caplen
The length of the captured portion of the packet.
This is the minimum of
the truncation amount specified by the filter and the length of the packet.
.It Va bh_datalen
The length of the packet off the wire.
This value is independent of the truncation amount specified by the filter.
.It Va bh_hdrlen
The length of the BPF header, which may not be equal to
.Em sizeof(struct bpf_hdr) .
.El
.Pp
The
.Va bh_hdrlen
field exists to account for
padding between the header and the link level protocol.
The purpose here is to guarantee proper alignment of the packet
data structures, which is required on alignment sensitive
architectures and improves performance on many other architectures.
The packet filter ensures that the
.Va bpf_hdr
and the
.Em network layer
header will be word aligned.
Suitable precautions must be taken when accessing the link layer
protocol fields on alignment restricted machines.
(This isn't a problem on an Ethernet, since
the type field is a short falling on an even offset,
and the addresses are probably accessed in a bytewise fashion).
.Pp
Additionally, individual packets are padded so that each starts
on a word boundary.
This requires that an application
has some knowledge of how to get from packet to packet.
The macro
.Dv BPF_WORDALIGN
is defined in
.In net/bpf.h
to facilitate this process.
It rounds up its argument
to the nearest word aligned value (where a word is
.Dv BPF_ALIGNMENT
bytes wide).
.Pp
For example, if
.Sq Va p
points to the start of a packet, this expression
will advance it to the next packet:
.Pp
.Dl p = (char *)p + BPF_WORDALIGN(p-\*[Gt]bh_hdrlen + p-\*[Gt]bh_caplen)
.Pp
For the alignment mechanisms to work properly, the
buffer passed to
.Xr read 2
must itself be word aligned.
.Xr malloc 3
will always return an aligned buffer.
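.Pp
The packet-to-packet stepping described above might be sketched as a loop
over one read buffer.
To stay self-contained the sketch does not include net/bpf.h: x_bpf_hdr,
X_ALIGNMENT and X_WORDALIGN are local stand-ins whose layout and alignment
value are assumptions, and walk_packets and pkt_cb are hypothetical names.
Real code would use struct bpf_hdr and BPF_WORDALIGN from the system header.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed alignment unit and rounding macro (stand-ins, see lead-in). */
#define X_ALIGNMENT	sizeof(long)
#define X_WORDALIGN(x) \
	(((size_t)(x) + (X_ALIGNMENT - 1)) & ~(X_ALIGNMENT - 1))

/* Stand-in header with the field names used in this section. */
struct x_bpf_hdr {
	struct { long tv_sec; long tv_usec; } bh_tstamp;
	uint32_t bh_caplen;
	uint32_t bh_datalen;
	uint16_t bh_hdrlen;
};

typedef void (*pkt_cb)(const unsigned char *data, uint32_t caplen, void *arg);

/*
 * Visit every complete packet in buf[0..nread) and call cb (if not
 * NULL) on its captured bytes.  Returns the number of packets visited.
 */
size_t
walk_packets(const unsigned char *buf, size_t nread, pkt_cb cb, void *arg)
{
	size_t off = 0, count = 0;
	struct x_bpf_hdr h;

	while (nread - off >= sizeof(h)) {
		memcpy(&h, buf + off, sizeof(h));
		/* Stop on a header that cannot be valid for this buffer. */
		if (h.bh_hdrlen == 0 ||
		    (size_t)h.bh_hdrlen + h.bh_caplen > nread - off)
			break;
		if (cb != NULL)
			cb(buf + off + h.bh_hdrlen, h.bh_caplen, arg);
		count++;
		/* Header plus captured data, rounded up to a word. */
		off += X_WORDALIGN((size_t)h.bh_hdrlen + h.bh_caplen);
		if (off > nread)
			break;
	}
	return count;
}
```

The packet data starts bh_hdrlen bytes after the header, not
sizeof(struct bpf_hdr) bytes after it, which is why the callback receives
buf + off + bh_hdrlen.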
.Sh FILTER MACHINE
A filter program is an array of instructions, with all branches forwardly
directed, terminated by a
.Sy return
instruction.
Each instruction performs some action on the pseudo-machine state,
which consists of an accumulator, index register, scratch memory store,
and implicit program counter.
.Pp
The following structure defines the instruction format:
.Bd -literal -offset indent
struct bpf_insn {
	uint16_t code;
	u_char jt;
	u_char jf;
	int32_t k;
};
.Ed
.Pp
The
.Va k
field is used in different ways by different instructions,
and the
.Va jt
and
.Va jf
fields are used as offsets
by the branch instructions.
The opcodes are encoded in a semi-hierarchical fashion.
There are eight classes of instructions: BPF_LD, BPF_LDX, BPF_ST, BPF_STX,
BPF_ALU, BPF_JMP, BPF_RET, and BPF_MISC.
Various other mode and
operator bits are or'd into the class to give the actual instructions.
The classes and modes are defined in
.In net/bpf.h .
.Pp
Below are the semantics for each defined BPF instruction.
We use the convention that A is the accumulator, X is the index register,
P[] packet data, and M[] scratch memory store.
P[i:n] gives the data at byte offset
.Dq i
in the packet,
interpreted as a word (n=4),
unsigned halfword (n=2), or unsigned byte (n=1).
M[i] gives the i'th word in the scratch memory store, which is only
addressed in word units.
The memory store is indexed from 0 to BPF_MEMWORDS-1.
.Va k ,
.Va jt ,
and
.Va jf
are the corresponding fields in the
instruction definition.
.Dq len
refers to the length of the packet.
.Bl -tag -width indent -offset indent
.It Sy BPF_LD
These instructions copy a value into the accumulator.
The type of the source operand is specified by an
.Dq addressing mode
and can be a constant
.Sy ( BPF_IMM ) ,
packet data at a fixed offset
.Sy ( BPF_ABS ) ,
packet data at a variable offset
.Sy ( BPF_IND ) ,
the packet length
.Sy ( BPF_LEN ) ,
or a word in the scratch memory store
.Sy ( BPF_MEM ) .
For
.Sy BPF_IND
and
.Sy BPF_ABS ,
the data size must be specified as a word
.Sy ( BPF_W ) ,
halfword
.Sy ( BPF_H ) ,
or byte
.Sy ( BPF_B ) .
The semantics of all the recognized BPF_LD instructions follow.
.Bl -column "BPF_LD_BPF_W_BPF_ABS" "A \*[Lt]- P[k:4]" -offset indent
.It Sy BPF_LD+BPF_W+BPF_ABS Ta A \*[Lt]- P[k:4]
.It Sy BPF_LD+BPF_H+BPF_ABS Ta A \*[Lt]- P[k:2]
.It Sy BPF_LD+BPF_B+BPF_ABS Ta A \*[Lt]- P[k:1]
.It Sy BPF_LD+BPF_W+BPF_IND Ta A \*[Lt]- P[X+k:4]
.It Sy BPF_LD+BPF_H+BPF_IND Ta A \*[Lt]- P[X+k:2]
.It Sy BPF_LD+BPF_B+BPF_IND Ta A \*[Lt]- P[X+k:1]
.It Sy BPF_LD+BPF_W+BPF_LEN Ta A \*[Lt]- len
.It Sy BPF_LD+BPF_IMM Ta A \*[Lt]- k
.It Sy BPF_LD+BPF_MEM Ta A \*[Lt]- M[k]
.El
.It Sy BPF_LDX
These instructions load a value into the index register.
Note that the addressing modes are more restricted than those of
the accumulator loads, but they include
.Sy BPF_MSH ,
a hack for efficiently loading the IP header length.
.Bl -column "BPF_LDX_BPF_W_BPF_IMM" "X \*[Lt]- k" -offset indent
.It Sy BPF_LDX+BPF_W+BPF_IMM Ta X \*[Lt]- k
.It Sy BPF_LDX+BPF_W+BPF_MEM Ta X \*[Lt]- M[k]
.It Sy BPF_LDX+BPF_W+BPF_LEN Ta X \*[Lt]- len
.It Sy BPF_LDX+BPF_B+BPF_MSH Ta X \*[Lt]- 4*(P[k:1]\*[Am]0xf)
.El
.It Sy BPF_ST
This instruction stores the accumulator into the scratch memory.
We do not need an addressing mode since there is only one possibility
for the destination.
.Bl -column "BPF_ST" "M[k] \*[Lt]- A" -offset indent
.It Sy BPF_ST Ta M[k] \*[Lt]- A
.El
.It Sy BPF_STX
This instruction stores the index register in the scratch memory store.
.Bl -column "BPF_STX" "M[k] \*[Lt]- X" -offset indent
.It Sy BPF_STX Ta M[k] \*[Lt]- X
.El
.It Sy BPF_ALU
The alu instructions perform operations between the accumulator and
index register or constant, and store the result back in the accumulator.
For binary operations, a source mode is required
.Sy ( BPF_K
or
.Sy BPF_X ) .
.Bl -column "BPF_ALU_BPF_ADD_BPF_K" "A \*[Lt]- A + k" -offset indent
.It Sy BPF_ALU+BPF_ADD+BPF_K Ta A \*[Lt]- A + k
.It Sy BPF_ALU+BPF_SUB+BPF_K Ta A \*[Lt]- A - k
.It Sy BPF_ALU+BPF_MUL+BPF_K Ta A \*[Lt]- A * k
.It Sy BPF_ALU+BPF_DIV+BPF_K Ta A \*[Lt]- A / k
.It Sy BPF_ALU+BPF_AND+BPF_K Ta A \*[Lt]- A \*[Am] k
.It Sy BPF_ALU+BPF_OR+BPF_K Ta A \*[Lt]- A | k
.It Sy BPF_ALU+BPF_LSH+BPF_K Ta A \*[Lt]- A \*[Lt]\*[Lt] k
.It Sy BPF_ALU+BPF_RSH+BPF_K Ta A \*[Lt]- A \*[Gt]\*[Gt] k
.It Sy BPF_ALU+BPF_ADD+BPF_X Ta A \*[Lt]- A + X
.It Sy BPF_ALU+BPF_SUB+BPF_X Ta A \*[Lt]- A - X
.It Sy BPF_ALU+BPF_MUL+BPF_X Ta A \*[Lt]- A * X
.It Sy BPF_ALU+BPF_DIV+BPF_X Ta A \*[Lt]- A / X
.It Sy BPF_ALU+BPF_AND+BPF_X Ta A \*[Lt]- A \*[Am] X
.It Sy BPF_ALU+BPF_OR+BPF_X Ta A \*[Lt]- A | X
.It Sy BPF_ALU+BPF_LSH+BPF_X Ta A \*[Lt]- A \*[Lt]\*[Lt] X
.It Sy BPF_ALU+BPF_RSH+BPF_X Ta A \*[Lt]- A \*[Gt]\*[Gt] X
.It Sy BPF_ALU+BPF_NEG Ta A \*[Lt]- -A
.El
.It Sy BPF_JMP
The jump instructions alter flow of control.
Conditional jumps compare the accumulator against a constant
.Sy ( BPF_K )
or the index register
.Sy ( BPF_X ) .
If the result is true (or non-zero),
the true branch is taken, otherwise the false branch is taken.
Jump offsets are encoded in 8 bits so the longest jump is 256 instructions.
However, the jump always
.Sy ( BPF_JA )
opcode uses the 32 bit
.Va k
field as the offset, allowing arbitrarily distant destinations.
All conditionals use unsigned comparison conventions.
.Bl -column "BPF_JMP+BPF_JGE+BPF_K" "pc += (A \*[Ge] k) ? jt : jf" -offset indent
.It Sy BPF_JMP+BPF_JA Ta pc += k
.It Sy BPF_JMP+BPF_JGT+BPF_K Ta "pc += (A \*[Gt] k) ? jt : jf"
.It Sy BPF_JMP+BPF_JGE+BPF_K Ta "pc += (A \*[Ge] k) ? jt : jf"
.It Sy BPF_JMP+BPF_JEQ+BPF_K Ta "pc += (A == k) ? jt : jf"
.It Sy BPF_JMP+BPF_JSET+BPF_K Ta "pc += (A \*[Am] k) ? jt : jf"
.It Sy BPF_JMP+BPF_JGT+BPF_X Ta "pc += (A \*[Gt] X) ? jt : jf"
.It Sy BPF_JMP+BPF_JGE+BPF_X Ta "pc += (A \*[Ge] X) ? jt : jf"
.It Sy BPF_JMP+BPF_JEQ+BPF_X Ta "pc += (A == X) ? jt : jf"
.It Sy BPF_JMP+BPF_JSET+BPF_X Ta "pc += (A \*[Am] X) ? jt : jf"
.El
.It Sy BPF_RET
The return instructions terminate the filter program and specify the amount
of packet to accept (i.e., they return the truncation amount).
A return value of zero indicates that the packet should be ignored.
The return value is either a constant
.Sy ( BPF_K )
or the accumulator
.Sy ( BPF_A ) .
.Bl -column "BPF_RET+BPF_A" "accept A bytes" -offset indent
.It Sy BPF_RET+BPF_A Ta accept A bytes
.It Sy BPF_RET+BPF_K Ta accept k bytes
.El
.It Sy BPF_MISC
The miscellaneous category was created for anything that doesn't
fit into the above classes, and for any new instructions that might need to
be added.
Currently, these are the register transfer instructions
that copy the index register to the accumulator or vice versa.
.Bl -column "BPF_MISC+BPF_TAX" "X \*[Lt]- A" -offset indent
.It Sy BPF_MISC+BPF_TAX Ta X \*[Lt]- A
.It Sy BPF_MISC+BPF_TXA Ta A \*[Lt]- X
.El
.El
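.Pp
The instruction semantics tabulated above can be made concrete with a tiny
interpreter.
The sketch below covers only a subset (absolute and indexed loads, the MSH
load, conditional jumps against a constant, jump always, and both returns).
It is not the kernel's implementation: every X_ name, the numeric opcode
values, and the choice to reject a packet on an out-of-range load or an
unknown opcode are assumptions made so the example stands alone.

```c
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins; the numeric values are assumptions of this sketch. */
enum { X_LD = 0x00, X_LDX = 0x01, X_JMP = 0x05, X_RET = 0x06 };
enum { X_W = 0x00, X_H = 0x08, X_B = 0x10 };
enum { X_ABS = 0x20, X_IND = 0x40, X_MSH = 0xa0 };
enum { X_JA = 0x00, X_JEQ = 0x10, X_JGT = 0x20, X_JGE = 0x30, X_JSET = 0x40 };
enum { X_K = 0x00, X_A = 0x10 };

struct x_insn { uint16_t code; unsigned char jt, jf; int32_t k; };

/* P[off:n]: n bytes at off, most significant first.  0 if out of range. */
static int
x_load(const unsigned char *p, size_t len, uint32_t off, unsigned n,
    uint32_t *out)
{
	uint32_t v = 0;
	unsigned i;

	if (off > len || n > len - off)
		return 0;
	for (i = 0; i < n; i++)
		v = (v << 8) | p[off + i];
	*out = v;
	return 1;
}

/* Run prog over packet p; returns the number of bytes to accept. */
uint32_t
x_filter(const struct x_insn *prog, size_t ninsn,
    const unsigned char *p, size_t len)
{
	uint32_t A = 0, X = 0, k, b;
	size_t pc = 0;
	int ok;

	while (pc < ninsn) {
		const struct x_insn *in = &prog[pc++];	/* pc now points past in */

		k = (uint32_t)in->k;
		ok = 1;
		switch (in->code) {
		case X_LD|X_W|X_ABS: ok = x_load(p, len, k, 4, &A); break;
		case X_LD|X_H|X_ABS: ok = x_load(p, len, k, 2, &A); break;
		case X_LD|X_B|X_ABS: ok = x_load(p, len, k, 1, &A); break;
		case X_LD|X_W|X_IND: ok = x_load(p, len, X + k, 4, &A); break;
		case X_LD|X_H|X_IND: ok = x_load(p, len, X + k, 2, &A); break;
		case X_LD|X_B|X_IND: ok = x_load(p, len, X + k, 1, &A); break;
		case X_LDX|X_B|X_MSH:
			ok = x_load(p, len, k, 1, &b);
			if (ok)
				X = 4 * (b & 0xf);
			break;
		case X_JMP|X_JA: pc += k; break;
		case X_JMP|X_JEQ|X_K: pc += (A == k) ? in->jt : in->jf; break;
		case X_JMP|X_JGT|X_K: pc += (A > k) ? in->jt : in->jf; break;
		case X_JMP|X_JGE|X_K: pc += (A >= k) ? in->jt : in->jf; break;
		case X_JMP|X_JSET|X_K: pc += (A & k) ? in->jt : in->jf; break;
		case X_RET|X_K: return k;
		case X_RET|X_A: return A;
		default: return 0;	/* not handled by this sketch */
		}
		if (!ok)
			return 0;	/* load outside the packet: reject */
	}
	return 0;			/* ran off the end: reject */
}
```

Because the program counter is advanced before the instruction executes, a
jump offset of 0 simply falls through to the next instruction, which matches
the way the offsets are used in the EXAMPLES section.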
.Pp
The BPF interface provides the following macros to facilitate
array initializers:
.Bd -unfilled -offset indent
.Sy BPF_STMT No (opcode, operand)
.Sy BPF_JUMP No (opcode, operand, true_offset, false_offset)
.Ed
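.Pp
To show how such initializer macros fit the instruction format, the sketch
below defines local look-alikes and a simple structural check.
The X_STMT and X_JUMP expansions are assumed, not copied from net/bpf.h, and
x_validate is a hypothetical helper that checks only the two properties
stated at the start of this section: branches go forward to an instruction
inside the program, and the program ends with a return.

```c
#include <stddef.h>
#include <stdint.h>

struct x_insn { uint16_t code; unsigned char jt, jf; int32_t k; };

/* Assumed expansions: each macro yields one brace-enclosed initializer. */
#define X_STMT(code, k)		{ (uint16_t)(code), 0, 0, (int32_t)(k) }
#define X_JUMP(code, k, jt, jf)	{ (uint16_t)(code), (jt), (jf), (int32_t)(k) }

/* Assumed encodings for the classes and fields this check looks at. */
#define X_JMP		0x05
#define X_RET		0x06
#define X_JA		0x00
#define X_JEQ		0x10
#define X_CLASS(code)	((code) & 0x07)
#define X_OP(code)	((code) & 0xf0)

/* Returns 1 if the program passes the structural check, else 0. */
int
x_validate(const struct x_insn *p, size_t n)
{
	size_t i, room;

	if (n == 0 || X_CLASS(p[n - 1].code) != X_RET)
		return 0;
	for (i = 0; i < n; i++) {
		if (X_CLASS(p[i].code) != X_JMP)
			continue;
		/* Offsets are relative to the instruction after the jump. */
		room = n - (i + 1);
		if (X_OP(p[i].code) == X_JA) {
			if (p[i].k < 0 || (size_t)p[i].k >= room)
				return 0;
		} else if ((size_t)p[i].jt >= room || (size_t)p[i].jf >= room) {
			return 0;
		}
	}
	return 1;
}
```

With these definitions, an array such as
{ X_JUMP(X_JMP|X_JEQ, 0x0800, 0, 1), X_STMT(X_RET, 64), X_STMT(X_RET, 0) }
passes the check, while changing the false offset to 2 makes it fail because
the branch would land past the last instruction.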
.Sh SYSCTLS
The following sysctls are available when
.Nm
is enabled:
.Pp
.Bl -tag -width "XnetXbpfXmaxbufsizeXX"
.It Li net.bpf.maxbufsize
Sets the maximum buffer size available for
.Nm
peers.
.It Li net.bpf.stats
Shows
.Nm
statistics.
They can be retrieved with the
.Xr netstat 1
utility.
.It Li net.bpf.peers
Shows the current
.Nm
peers.
This is only available to the super user and can also be retrieved with the
.Xr netstat 1
utility.
.El
.Sh FILES
.Pa /dev/bpf
.Sh EXAMPLES
The following filter is taken from the Reverse ARP Daemon.
It accepts only Reverse ARP requests.
.Bd -literal -offset indent
struct bpf_insn insns[] = {
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 12),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, ETHERTYPE_REVARP, 0, 3),
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 20),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, REVARP_REQUEST, 0, 1),
	BPF_STMT(BPF_RET+BPF_K, sizeof(struct ether_arp) +
	    sizeof(struct ether_header)),
	BPF_STMT(BPF_RET+BPF_K, 0),
};
.Ed
.Pp
This filter accepts only IP packets between hosts 128.3.112.15 and
128.3.112.35.
.Bd -literal -offset indent
struct bpf_insn insns[] = {
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 12),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, ETHERTYPE_IP, 0, 8),
	BPF_STMT(BPF_LD+BPF_W+BPF_ABS, 26),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x8003700f, 0, 2),
	BPF_STMT(BPF_LD+BPF_W+BPF_ABS, 30),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x80037023, 3, 4),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x80037023, 0, 3),
	BPF_STMT(BPF_LD+BPF_W+BPF_ABS, 30),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x8003700f, 0, 1),
	BPF_STMT(BPF_RET+BPF_K, (u_int)-1),
	BPF_STMT(BPF_RET+BPF_K, 0),
};
.Ed
.Pp
Finally, this filter returns only TCP finger packets.
We must parse the IP header to reach the TCP header.
The
.Sy BPF_JSET
instruction checks that the IP fragment offset is 0 so we are sure
that we have a TCP header.
.Bd -literal -offset indent
struct bpf_insn insns[] = {
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 12),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, ETHERTYPE_IP, 0, 10),
	BPF_STMT(BPF_LD+BPF_B+BPF_ABS, 23),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, IPPROTO_TCP, 0, 8),
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 20),
	BPF_JUMP(BPF_JMP+BPF_JSET+BPF_K, 0x1fff, 6, 0),
	BPF_STMT(BPF_LDX+BPF_B+BPF_MSH, 14),
	BPF_STMT(BPF_LD+BPF_H+BPF_IND, 14),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 79, 2, 0),
	BPF_STMT(BPF_LD+BPF_H+BPF_IND, 16),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 79, 0, 1),
	BPF_STMT(BPF_RET+BPF_K, (u_int)-1),
	BPF_STMT(BPF_RET+BPF_K, 0),
};
.Ed
.Sh SEE ALSO
.Xr ioctl 2 ,
.Xr read 2 ,
.Xr select 2 ,
.Xr signal 3 ,
.Xr tcpdump 8
.Rs
.%T "The BSD Packet Filter: A New Architecture for User-level Packet Capture"
.%A S. McCanne
.%A V. Jacobson
.%J Proceedings of the 1993 Winter USENIX
.%C Technical Conference, San Diego, CA
.Re
.Sh HISTORY
The Enet packet filter was created in 1980 by Mike Accetta and
Rick Rashid at Carnegie-Mellon University.
Jeffrey Mogul, at Stanford, ported the code to BSD and continued
its development from 1983 on.
Since then, it has evolved into the ULTRIX Packet Filter
at DEC, a STREAMS NIT module under SunOS 4.1, and BPF.
.Sh AUTHORS
Steven McCanne, of Lawrence Berkeley Laboratory, implemented BPF in
Summer 1990.
The design was in collaboration with Van Jacobson,
also of Lawrence Berkeley Laboratory.
.Sh BUGS
The read buffer must be of a fixed size (returned by the
.Dv BIOCGBLEN
ioctl).
.Pp
A file that does not request promiscuous mode may receive promiscuously
received packets as a side effect of another file requesting this
mode on the same hardware interface.
This could be fixed in the kernel with additional processing overhead.
However, we favor the model where
all files must assume that the interface is promiscuous, and if
so desired, must use a filter to reject foreign packets.
.Pp
Data link protocols with variable length headers are not currently supported.
.Pp
Under SunOS, if a BPF application reads more than 2^31 bytes of
data, read will fail with
.Er EINVAL .
You can either fix the bug in SunOS,
or lseek to 0 when read fails for this reason.
.Pp
.Dq Immediate mode
and the
.Dq read timeout
are misguided features.
This functionality can be emulated with non-blocking mode and
.Xr select 2 .
774