.\" -*- nroff -*-
.\"
.\"	$NetBSD: bpf.4,v 1.14 2001/05/19 17:23:39 jdolecek Exp $
.\"
.\" Copyright (c) 1990, 1991, 1992, 1993, 1994
.\"	The Regents of the University of California.  All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that: (1) source code distributions
.\" retain the above copyright notice and this paragraph in its entirety, (2)
.\" distributions including binary code include the above copyright notice and
.\" this paragraph in its entirety in the documentation or other materials
.\" provided with the distribution, and (3) all advertising materials mentioning
.\" features or use of this software display the following acknowledgement:
.\" ``This product includes software developed by the University of California,
.\" Lawrence Berkeley Laboratory and its contributors.'' Neither the name of
.\" the University nor the names of its contributors may be used to endorse
.\" or promote products derived from this software without specific prior
.\" written permission.
.\" THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR IMPLIED
.\" WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
.\" MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
.\"
.\" This document is derived in part from the enet man page (enet.4)
.\" distributed with 4.3BSD Unix.
.\"
.Dd June 28, 1994
.Dt BPF 4
.Os
.Sh NAME
.Nm bpf
.Nd Berkeley Packet Filter raw network interface
.Sh SYNOPSIS
.Cd "pseudo-device bpfilter 16"
.Sh DESCRIPTION
The Berkeley Packet Filter
provides a raw interface to data link layers in a protocol
independent fashion.
All packets on the network, even those destined for other hosts,
are accessible through this mechanism.
.Pp
The packet filter appears as a character special device,
.Pa /dev/bpf0 ,
.Pa /dev/bpf1 ,
etc.
After opening the device, the file descriptor must be bound to a
specific network interface with the
.Dv BIOCSETIF
ioctl.
A given interface can be shared by multiple listeners, and the filter
underlying each descriptor will see an identical packet stream.
The total number of open
files is limited to the value given in the kernel configuration; the
example given in the SYNOPSIS above sets the limit to 16.
.Pp
A separate device file is required for each minor device.
If a file is in use, the open will fail and
.Va errno
will be set to EBUSY.
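.Pp
Because a busy unit fails with EBUSY, a program typically probes the
device files in turn until one of them opens.
The following is a minimal sketch of such a loop, not code taken from any
existing BPF consumer.
The helper name sk_open_free and its parameters are hypothetical, and
only
.Xr open 2
is assumed:

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>

/*
 * Hypothetical helper: try <prefix>0, <prefix>1, ... and return the first
 * descriptor that opens, or -1 if none does (errno is left from open(2)).
 */
int
sk_open_free(const char *prefix, int max_units, int flags)
{
	char path[64];
	int i, fd;

	for (i = 0; i < max_units; i++) {
		snprintf(path, sizeof(path), "%s%d", prefix, i);
		fd = open(path, flags);
		if (fd != -1)
			return fd;
		if (errno != EBUSY)
			return -1;	/* missing node, permissions, ... */
	}
	return -1;			/* every unit tried was busy */
}
```

A caller might use sk_open_free("/dev/bpf", 16, O_RDWR), where 16 mirrors
the pseudo-device count shown in the SYNOPSIS.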
.Pp
Associated with each open instance of a
.Nm
file is a user-settable packet filter.
Whenever a packet is received by an interface,
all file descriptors listening on that interface apply their filter.
Each descriptor that accepts the packet receives its own copy.
.Pp
Reads from these files return the next group of packets
that have matched the filter.
To improve performance, the buffer passed to read must be
the same size as the buffers used internally by
.Nm "" .
This size is returned by the
.Dv BIOCGBLEN
ioctl (see below), and under
BSD, can be set with
.Dv BIOCSBLEN .
Note that an individual packet larger than this size is necessarily
truncated.
.Pp
The packet filter will support any link level protocol that has fixed length
headers.  Currently, only Ethernet, SLIP and PPP drivers have been
modified to interact with
.Nm "" .
.Pp
Since packet data is in network byte order, applications should use the
.Xr byteorder 3
macros to extract multi-byte values.
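.Pp
As an illustration of the byte order rule, the sketch below pulls the
16-bit type field out of a captured Ethernet frame with
.Xr ntohs 3 .
The names sk_ethertype and SK_ETHERTYPE_OFF are hypothetical; the offset
of 12 assumes a plain Ethernet header, and the caller is assumed to have
captured at least 14 bytes.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Offset of the type field in an Ethernet header (6 + 6 address bytes). */
#define SK_ETHERTYPE_OFF 12

/*
 * Return the Ethernet type field of a captured frame in host byte order.
 * memcpy() avoids an unaligned 16-bit load on strict-alignment machines.
 */
uint16_t
sk_ethertype(const unsigned char *frame)
{
	uint16_t type;

	memcpy(&type, frame + SK_ETHERTYPE_OFF, sizeof(type));
	return ntohs(type);
}
```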
.Pp
A packet can be sent out on the network by writing to a
.Nm
file descriptor.  The writes are unbuffered, meaning only one
packet can be processed per write.
Currently, only writes to Ethernets and SLIP links are supported.
.Sh IOCTLS
The
.Xr ioctl 2
command codes below are defined in <net/bpf.h>.  All commands require
these includes:
.Bd -literal -offset indent
.Fd #include <sys/types.h>
.Fd #include <sys/time.h>
.Fd #include <sys/ioctl.h>
.Fd #include <net/bpf.h>
.Ed
.Pp
Additionally, BIOCGETIF and BIOCSETIF require
.Pa <net/if.h> .
.Pp
The (third) argument to the
.Xr ioctl 2
should be a pointer to the type indicated.
.Bl -tag -width -offset indent
.It Dv "BIOCGBLEN (u_int)"
Returns the required buffer length for reads on
.Nm
files.
.It Dv "BIOCSBLEN (u_int)"
Sets the buffer length for reads on
.Nm
files.  The buffer must be set before the file is attached to an interface
with
.Dv BIOCSETIF .
If the requested buffer size cannot be accommodated, the closest
allowable size will be set and returned in the argument.
A read call will result in EIO if it is passed a buffer that is not this size.
.It Dv BIOCGDLT (u_int)
Returns the type of the data link layer underlying the attached interface.
EINVAL is returned if no interface has been specified.
The device types, prefixed with
.Dq DLT_ ,
are defined in <net/bpf.h>.
.It Dv BIOCPROMISC
Forces the interface into promiscuous mode.
All packets, not just those destined for the local host, are processed.
Since more than one file can be listening on a given interface,
a listener that opened its interface non-promiscuously may receive
packets promiscuously.  This problem can be remedied with an
appropriate filter.
.Pp
The interface remains in promiscuous mode until all files listening
promiscuously are closed.
.It Dv BIOCFLUSH
Flushes the buffer of incoming packets,
and resets the statistics that are returned by
.Dv BIOCGSTATS .
.It Dv BIOCGETIF (struct ifreq)
Returns the name of the hardware interface that the file is listening on.
The name is returned in the ifr_name field of
.Fa ifr .
All other fields are undefined.
.It Dv BIOCSETIF (struct ifreq)
Sets the hardware interface associated with the file.  This
command must be performed before any packets can be read.
The device is indicated by name using the
.Dv ifr_name
field of the
.Fa ifreq .
Additionally, performs the actions of
.Dv BIOCFLUSH .
.It Dv BIOCSRTIMEOUT, BIOCGRTIMEOUT (struct timeval)
Set or get the read timeout parameter.
The
.Fa timeval
specifies the length of time to wait before timing
out on a read request.
This parameter is initialized to zero by
.Xr open 2 ,
indicating no timeout.
.It Dv BIOCGSTATS (struct bpf_stat)
Returns the following structure of packet statistics:
.Bd -literal -offset indent
struct bpf_stat {
	u_int bs_recv;
	u_int bs_drop;
};
.Ed
.Pp
The fields are:
.Bl -tag -width bs_recv -offset indent
.It Va bs_recv
the number of packets received by the descriptor since opened or reset
(including any buffered since the last read call);
and
.It Va bs_drop
the number of packets which were accepted by the filter but dropped by the
kernel because of buffer overflows
(i.e., the application's reads aren't keeping up with the packet traffic).
.El
.It Dv BIOCIMMEDIATE (u_int)
Enable or disable
.Dq immediate mode ,
based on the truth value of the argument.
When immediate mode is enabled, reads return immediately upon packet
reception.  Otherwise, a read will block until either the kernel buffer
becomes full or a timeout occurs.
This is useful for programs like
.Xr rarpd 8 ,
which must respond to messages in real time.
The default for a new file is off.
.It Dv BIOCSETF (struct bpf_program)
Sets the filter program used by the kernel to discard uninteresting
packets.  An array of instructions and its length is passed in using
the following structure:
.Bd -literal -offset indent
struct bpf_program {
	int bf_len;
	struct bpf_insn *bf_insns;
};
.Ed
.Pp
The filter program is pointed to by the
.Va bf_insns
field while its length in units of
.Sq struct bpf_insn
is given by the
.Va bf_len
field.
Also, the actions of
.Dv BIOCFLUSH
are performed.
.Pp
See section
.Sy FILTER MACHINE
for an explanation of the filter language.
.It Dv BIOCVERSION (struct bpf_version)
Returns the major and minor version numbers of the filter language currently
recognized by the kernel.  Before installing a filter, applications must check
that the current version is compatible with the running kernel.  Version
numbers are compatible if the major numbers match and the application minor
is less than or equal to the kernel minor.  The kernel version number is
returned in the following structure:
.Bd -literal -offset indent
struct bpf_version {
	u_short bv_major;
	u_short bv_minor;
};
.Ed
.Pp
The current version numbers are given by
.Dv BPF_MAJOR_VERSION
and
.Dv BPF_MINOR_VERSION
from <net/bpf.h>.
An incompatible filter
may result in undefined behavior (most likely, an error returned by
.Xr ioctl 2
or haphazard packet matching).
.It Dv BIOCSRSIG BIOCGRSIG (u_int signal)
Set or get the receive signal.  This signal will be sent to the process or process group
specified by FIOSETOWN.  It defaults to SIGIO.
.El
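.Pp
The compatibility rule given under
.Dv BIOCVERSION
can be written as a single predicate.
This is a sketch only: the function name sk_version_ok is hypothetical,
and plain integers stand in for the fields of struct bpf_version and the
BPF_MAJOR_VERSION and BPF_MINOR_VERSION constants.

```c
/*
 * Return non-zero if a filter built for app_major.app_minor may be
 * installed on a kernel reporting k_major.k_minor: the major numbers
 * must match and the application minor must not exceed the kernel minor.
 */
int
sk_version_ok(unsigned app_major, unsigned app_minor,
    unsigned k_major, unsigned k_minor)
{
	return app_major == k_major && app_minor <= k_minor;
}
```

An application would pass its compile-time constants as the first pair
and the values returned by
.Dv BIOCVERSION
as the second, and refuse to issue
.Dv BIOCSETF
when the result is zero.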
.Sh STANDARD IOCTLS
.Nm
now supports several standard
.Xr ioctl 2
commands
which allow the user to do async and/or non-blocking I/O to an open
.Nm
file descriptor.
.Bl -tag -width -offset indent
.It Dv FIONREAD (int)
Returns the number of bytes that are immediately available for reading.
.It Dv SIOCGIFADDR (struct ifreq)
Returns the address associated with the interface.
.It Dv FIONBIO (int)
Set or clear non-blocking I/O.  If arg is non-zero, then doing a
.Xr read 2
when no data is available will return -1 and
.Va errno
will be set to EAGAIN.
If arg is zero, non-blocking I/O is disabled.  Note:  setting this
overrides the timeout set by
.Dv BIOCSRTIMEOUT .
.It Dv FIOASYNC (int)
Enable or disable async I/O.  When enabled (arg is non-zero), the process or
process group specified by FIOSETOWN will start receiving SIGIO signals when
packets arrive.
Note that you must do an FIOSETOWN in order for this to take effect, as
the system will not default this for you.
The signal may be changed via
.Dv BIOCSRSIG .
.It Dv FIOSETOWN FIOGETOWN (int)
Set or get the process or process group (if negative) that should receive SIGIO
when packets are available.
The signal may be changed using
.Dv BIOCSRSIG
(see above).
.El
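.Pp
.Dv FIONBIO
combines naturally with
.Xr select 2
to bound the time spent waiting for packets.
The sketch below shows one way to do that on any descriptor that
supports both calls; the helper name sk_read_timeout and its return
convention are hypothetical, and how promptly
.Xr select 2
reports a
.Nm
descriptor as readable is an assumption that depends on the kernel.

```c
#include <sys/types.h>
#include <sys/time.h>
#include <sys/ioctl.h>
#include <sys/select.h>
#include <unistd.h>

/*
 * Hypothetical helper: put fd into non-blocking mode, wait up to
 * timeout_ms for it to become readable, then read once.
 * Returns the byte count from read(2), 0 if the wait timed out,
 * or -1 on error with errno set.
 */
ssize_t
sk_read_timeout(int fd, void *buf, size_t len, long timeout_ms)
{
	struct timeval tv;
	fd_set rfds;
	int on = 1, n;

	if (ioctl(fd, FIONBIO, &on) == -1)
		return -1;

	FD_ZERO(&rfds);
	FD_SET(fd, &rfds);
	tv.tv_sec = timeout_ms / 1000;
	tv.tv_usec = (timeout_ms % 1000) * 1000;

	n = select(fd + 1, &rfds, NULL, NULL, &tv);
	if (n <= 0)
		return n;		/* 0: timed out, -1: error */
	return read(fd, buf, len);
}
```

For a
.Nm
descriptor, len would have to be the size reported by
.Dv BIOCGBLEN .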
.Sh BPF HEADER
The following structure is prepended to each packet returned by
.Xr read 2 :
.Bd -literal -offset indent
struct bpf_hdr {
	struct timeval bh_tstamp;
	u_long bh_caplen;
	u_long bh_datalen;
	u_short bh_hdrlen;
};
.Ed
.Pp
The fields, whose values are stored in host order, are:
.Bl -tag -width bh_datalen -offset indent
.It Va bh_tstamp
The time at which the packet was processed by the packet filter.
.It Va bh_caplen
The length of the captured portion of the packet.  This is the minimum of
the truncation amount specified by the filter and the length of the packet.
.It Va bh_datalen
The length of the packet off the wire.
This value is independent of the truncation amount specified by the filter.
.It Va bh_hdrlen
The length of the BPF header, which may not be equal to
.Em sizeof(struct bpf_hdr) .
.El
.Pp
The
.Va bh_hdrlen
field exists to account for
padding between the header and the link level protocol.
The purpose here is to guarantee proper alignment of the packet
data structures, which is required on alignment sensitive
architectures and improves performance on many other architectures.
The packet filter ensures that the
.Va bpf_hdr
and the
.Em network layer
header will be word aligned.  Suitable precautions
must be taken when accessing the link layer protocol fields on alignment
restricted machines.  (This isn't a problem on an Ethernet, since
the type field is a short falling on an even offset,
and the addresses are probably accessed in a bytewise fashion).
.Pp
Additionally, individual packets are padded so that each starts
on a word boundary.  This requires that an application
has some knowledge of how to get from packet to packet.
The macro
.Dv BPF_WORDALIGN
is defined in
.Pa <net/bpf.h>
to facilitate this process.
It rounds up its argument
to the nearest word aligned value (where a word is BPF_ALIGNMENT bytes wide).
.Pp
For example, if
.Sq Va p
points to the start of a packet, this expression
will advance it to the next packet:
.Pp
.Dl p = (char *)p + BPF_WORDALIGN(p->bh_hdrlen + p->bh_caplen)
.Pp
For the alignment mechanisms to work properly, the
buffer passed to
.Xr read 2
must itself be word aligned.
.Xr malloc 3
will always return an aligned buffer.
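.Pp
The packet-to-packet stepping described above can be sketched as a loop
over one read buffer.
To stay compilable without
.Pa <net/bpf.h> ,
the sketch uses hypothetical stand-ins: struct sk_hdr keeps only the
three length fields of struct bpf_hdr (the real header also begins with
a timestamp), and SK_WORDALIGN assumes that a word is sizeof(long) bytes.
A real program would use struct bpf_hdr and
.Dv BPF_WORDALIGN
instead.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-ins for BPF_ALIGNMENT and BPF_WORDALIGN (assumed word size). */
#define SK_ALIGNMENT	sizeof(long)
#define SK_WORDALIGN(x)	(((x) + (SK_ALIGNMENT - 1)) & ~(SK_ALIGNMENT - 1))

/* Stand-in for struct bpf_hdr, reduced to its length fields. */
struct sk_hdr {
	uint32_t caplen;	/* captured bytes that follow the header */
	uint32_t datalen;	/* original length on the wire */
	uint16_t hdrlen;	/* header length including padding */
};

typedef void (*sk_packet_fn)(const unsigned char *data, uint32_t caplen,
    void *arg);

/*
 * Visit every packet in buf[0..nread) and return how many were seen.
 * Stops early if a header or its captured data would run past nread.
 */
size_t
sk_walk(const unsigned char *buf, size_t nread, sk_packet_fn fn, void *arg)
{
	size_t off = 0, count = 0;
	struct sk_hdr h;

	while (off + sizeof(h) <= nread) {
		memcpy(&h, buf + off, sizeof(h));
		if (h.hdrlen < sizeof(h) ||
		    (size_t)h.hdrlen + h.caplen > nread - off)
			break;
		if (fn != NULL)
			fn(buf + off + h.hdrlen, h.caplen, arg);
		count++;
		/* Same step as the expression above, with the stand-ins. */
		off += SK_WORDALIGN((size_t)h.hdrlen + h.caplen);
	}
	return count;
}
```

The bounds checks are an addition of this sketch; the one-line
expression above trusts the kernel-supplied lengths.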
.Sh FILTER MACHINE
A filter program is an array of instructions, with all branches forwardly
directed, terminated by a
.Sy return
instruction.
Each instruction performs some action on the pseudo-machine state,
which consists of an accumulator, index register, scratch memory store,
and implicit program counter.
.Pp
The following structure defines the instruction format:
.Bd -literal -offset indent
struct bpf_insn {
	u_short	code;
	u_char 	jt;
	u_char 	jf;
	long k;
};
.Ed
.Pp
The
.Va k
field is used in different ways by different instructions,
and the
.Va jt
and
.Va jf
fields are used as offsets
by the branch instructions.
The opcodes are encoded in a semi-hierarchical fashion.
There are eight classes of instructions: BPF_LD, BPF_LDX, BPF_ST, BPF_STX,
BPF_ALU, BPF_JMP, BPF_RET, and BPF_MISC.  Various other mode and
operator bits are or'd into the class to give the actual instructions.
The classes and modes are defined in <net/bpf.h>.
.Pp
Below are the semantics for each defined BPF instruction.
We use the convention that A is the accumulator, X is the index register,
P[] packet data, and M[] scratch memory store.
P[i:n] gives the data at byte offset
.Dq i
in the packet,
interpreted as a word (n=4),
unsigned halfword (n=2), or unsigned byte (n=1).
M[i] gives the i'th word in the scratch memory store, which is only
addressed in word units.  The memory store is indexed from 0 to BPF_MEMWORDS-1.
.Va k ,
.Va jt ,
and
.Va jf
are the corresponding fields in the
instruction definition.
.Dq len
refers to the length of the packet.
.Bl -tag -width -offset indent
.It Sy BPF_LD
These instructions copy a value into the accumulator.  The type of the
source operand is specified by an
.Dq addressing mode
and can be a constant
.No ( Ns Sy BPF_IMM Ns ),
packet data at a fixed offset
.No ( Ns Sy BPF_ABS Ns ),
packet data at a variable offset
.No ( Ns Sy BPF_IND Ns ),
the packet length
.No ( Ns Sy BPF_LEN Ns ),
or a word in the scratch memory store
.No ( Ns Sy BPF_MEM Ns ).
For
.Sy BPF_IND
and
.Sy BPF_ABS ,
the data size must be specified as a word
.No ( Ns Sy BPF_W Ns ),
halfword
.No ( Ns Sy BPF_H Ns ),
or byte
.No ( Ns Sy BPF_B Ns ).
The semantics of all the recognized BPF_LD instructions follow.
.Bl -column "BPF_LD+BPF_W+BPF_ABS" "A <- P[k:4]" -width -offset indent
.It Sy BPF_LD+BPF_W+BPF_ABS Ta A <- P[k:4]
.It Li Sy BPF_LD+BPF_H+BPF_ABS Ta A <- P[k:2]
.It Li Sy BPF_LD+BPF_B+BPF_ABS Ta A <- P[k:1]
.It Li Sy BPF_LD+BPF_W+BPF_IND Ta A <- P[X+k:4]
.It Li Sy BPF_LD+BPF_H+BPF_IND Ta A <- P[X+k:2]
.It Li Sy BPF_LD+BPF_B+BPF_IND Ta A <- P[X+k:1]
.It Li Sy BPF_LD+BPF_W+BPF_LEN Ta A <- len
.It Li Sy BPF_LD+BPF_IMM Ta A <- k
.It Li Sy BPF_LD+BPF_MEM Ta A <- M[k]
.El
.It Sy BPF_LDX
These instructions load a value into the index register.  Note that
the addressing modes are more restricted than those of the accumulator loads,
but they include
.Sy BPF_MSH ,
a hack for efficiently loading the IP header length.
.Bl -column "BPF_LDX+BPF_W+BPF_IMM" "X <- k" -width -offset indent
.It Sy BPF_LDX+BPF_W+BPF_IMM Ta X <- k
.It Li Sy BPF_LDX+BPF_W+BPF_MEM Ta X <- M[k]
.It Li Sy BPF_LDX+BPF_W+BPF_LEN Ta X <- len
.It Li Sy BPF_LDX+BPF_B+BPF_MSH Ta X <- 4*(P[k:1]&0xf)
.El
.It Sy BPF_ST
This instruction stores the accumulator into the scratch memory.
We do not need an addressing mode since there is only one possibility
for the destination.
.Bl -column "BPF_ST" "M[k] <- A" -width -offset indent
.It Sy BPF_ST Ta M[k] <- A
.El
.It Sy BPF_STX
This instruction stores the index register in the scratch memory store.
.Bl -column "BPF_STX" "M[k] <- X" -width -offset indent
.It Sy BPF_STX Ta M[k] <- X
.El
.It Sy BPF_ALU
The alu instructions perform operations between the accumulator and
index register or constant, and store the result back in the accumulator.
For binary operations, a source mode is required
.No ( Ns Sy BPF_K
or
.Sy BPF_X Ns ).
.Bl -column "BPF_ALU+BPF_ADD+BPF_K" "A <- A + k" -width -offset indent
.It Sy BPF_ALU+BPF_ADD+BPF_K Ta A <- A + k
.It Li Sy BPF_ALU+BPF_SUB+BPF_K Ta A <- A - k
.It Li Sy BPF_ALU+BPF_MUL+BPF_K Ta A <- A * k
.It Li Sy BPF_ALU+BPF_DIV+BPF_K Ta A <- A / k
.It Li Sy BPF_ALU+BPF_AND+BPF_K Ta A <- A & k
.It Li Sy BPF_ALU+BPF_OR+BPF_K Ta A <- A | k
.It Li Sy BPF_ALU+BPF_LSH+BPF_K Ta A <- A << k
.It Li Sy BPF_ALU+BPF_RSH+BPF_K Ta A <- A >> k
.It Li Sy BPF_ALU+BPF_ADD+BPF_X Ta A <- A + X
.It Li Sy BPF_ALU+BPF_SUB+BPF_X Ta A <- A - X
.It Li Sy BPF_ALU+BPF_MUL+BPF_X Ta A <- A * X
.It Li Sy BPF_ALU+BPF_DIV+BPF_X Ta A <- A / X
.It Li Sy BPF_ALU+BPF_AND+BPF_X Ta A <- A & X
.It Li Sy BPF_ALU+BPF_OR+BPF_X Ta A <- A | X
.It Li Sy BPF_ALU+BPF_LSH+BPF_X Ta A <- A << X
.It Li Sy BPF_ALU+BPF_RSH+BPF_X Ta A <- A >> X
.It Li Sy BPF_ALU+BPF_NEG Ta A <- -A
.El
.It Sy BPF_JMP
The jump instructions alter flow of control.  Conditional jumps
compare the accumulator against a constant
.No ( Ns Sy BPF_K Ns )
or the index register
.No ( Ns Sy BPF_X Ns ).
If the result is true (or non-zero),
the true branch is taken, otherwise the false branch is taken.
Jump offsets are encoded in 8 bits so the longest jump is 256 instructions.
However, the jump always
.No ( Ns Sy BPF_JA Ns )
opcode uses the 32 bit
.Va k
field as the offset, allowing arbitrarily distant destinations.
All conditionals use unsigned comparison conventions.
.Bl -column "BPF_JMP+BPF_JGE+BPF_K" "pc += (A >= k) ? jt : jf" -width -offset indent
.It Sy BPF_JMP+BPF_JA Ta pc += k
.It Li Sy BPF_JMP+BPF_JGT+BPF_K Ta "pc += (A > k) ? jt : jf"
.It Li Sy BPF_JMP+BPF_JGE+BPF_K Ta "pc += (A >= k) ? jt : jf"
.It Li Sy BPF_JMP+BPF_JEQ+BPF_K Ta "pc += (A == k) ? jt : jf"
.It Li Sy BPF_JMP+BPF_JSET+BPF_K Ta "pc += (A & k) ? jt : jf"
.It Li Sy BPF_JMP+BPF_JGT+BPF_X Ta "pc += (A > X) ? jt : jf"
.It Li Sy BPF_JMP+BPF_JGE+BPF_X Ta "pc += (A >= X) ? jt : jf"
.It Li Sy BPF_JMP+BPF_JEQ+BPF_X Ta "pc += (A == X) ? jt : jf"
.It Li Sy BPF_JMP+BPF_JSET+BPF_X Ta "pc += (A & X) ? jt : jf"
.El
.It Sy BPF_RET
The return instructions terminate the filter program and specify the amount
of packet to accept (i.e., they return the truncation amount).  A return
value of zero indicates that the packet should be ignored.
The return value is either a constant
.No ( Ns Sy BPF_K Ns )
or the accumulator
.No ( Ns Sy BPF_A Ns ).
.Bl -column "BPF_RET+BPF_A" "accept A bytes" -width -offset indent
.It Sy BPF_RET+BPF_A Ta accept A bytes
.It Li Sy BPF_RET+BPF_K Ta accept k bytes
.El
.It Sy BPF_MISC
The miscellaneous category was created for anything that doesn't
fit into the above classes, and for any new instructions that might need to
be added.  Currently, these are the register transfer instructions
that copy the index register to the accumulator or vice versa.
.Bl -column "BPF_MISC+BPF_TAX" "X <- A" -width -offset indent
.It Sy BPF_MISC+BPF_TAX Ta X <- A
.It Li Sy BPF_MISC+BPF_TXA Ta A <- X
.El
.El
.Pp
The BPF interface provides the following macros to facilitate
array initializers:
.Bd -literal -offset indent
.Sy BPF_STMT Ns (opcode, operand)
.Sy BPF_JUMP Ns (opcode, operand, true_offset, false_offset)
.Ed
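.Pp
To make the instruction semantics concrete, the following sketch
interprets a small subset of the machine in user space: absolute and
indexed loads, BPF_MSH, the constant-operand jumps and the two returns.
It is an illustration only, not the kernel's filter routine.
Every SK_ name is a hypothetical local stand-in for the corresponding
BPF_ definition in
.Pa <net/bpf.h> ,
the opcode values are assumed to match that header, and rejecting a
packet when a load runs past its end is a choice made by this sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins; real programs take these from <net/bpf.h>. */
struct sk_insn { uint16_t code; uint8_t jt, jf; uint32_t k; };
#define SK_STMT(code, k)		{ (uint16_t)(code), 0, 0, (k) }
#define SK_JUMP(code, k, jt, jf)	{ (uint16_t)(code), (jt), (jf), (k) }

enum {
	SK_LD = 0x00, SK_LDX = 0x01, SK_JMP = 0x05, SK_RET = 0x06,
	SK_W = 0x00, SK_H = 0x08, SK_B = 0x10,
	SK_IMM = 0x00, SK_ABS = 0x20, SK_IND = 0x40, SK_MSH = 0xa0,
	SK_JA = 0x00, SK_JEQ = 0x10, SK_JGT = 0x20, SK_JGE = 0x30,
	SK_JSET = 0x40, SK_K = 0x00, SK_A = 0x10
};

/* P[i:n]: n bytes at offset i, most significant byte first. */
static int
sk_load(const uint8_t *p, size_t len, uint64_t i, unsigned n, uint32_t *out)
{
	uint32_t v = 0;
	unsigned b;

	if (i > len || len - i < n)
		return 0;		/* access runs past the packet */
	for (b = 0; b < n; b++)
		v = (v << 8) | p[i + b];
	*out = v;
	return 1;
}

/* Returns the number of bytes to accept; 0 means ignore the packet. */
uint32_t
sk_run(const struct sk_insn *prog, size_t n, const uint8_t *p, size_t len)
{
	uint32_t A = 0, X = 0;
	size_t pc;

	for (pc = 0; pc < n; pc++) {	/* pc++ steps to the next insn */
		const struct sk_insn *in = &prog[pc];
		uint64_t ix = (uint64_t)X + in->k;
		int ok = 1;

		switch (in->code) {
		case SK_RET + SK_K: return in->k;
		case SK_RET + SK_A: return A;
		case SK_LD + SK_IMM: A = in->k; break;
		case SK_LD + SK_W + SK_ABS: ok = sk_load(p, len, in->k, 4, &A); break;
		case SK_LD + SK_H + SK_ABS: ok = sk_load(p, len, in->k, 2, &A); break;
		case SK_LD + SK_B + SK_ABS: ok = sk_load(p, len, in->k, 1, &A); break;
		case SK_LD + SK_W + SK_IND: ok = sk_load(p, len, ix, 4, &A); break;
		case SK_LD + SK_H + SK_IND: ok = sk_load(p, len, ix, 2, &A); break;
		case SK_LD + SK_B + SK_IND: ok = sk_load(p, len, ix, 1, &A); break;
		case SK_LDX + SK_B + SK_MSH:
			ok = sk_load(p, len, in->k, 1, &X);
			X = 4 * (X & 0xf);
			break;
		case SK_JMP + SK_JA:
			if (in->k >= n - pc)
				return 0;	/* target outside the program */
			pc += in->k;
			break;
		case SK_JMP + SK_JEQ + SK_K: pc += (A == in->k) ? in->jt : in->jf; break;
		case SK_JMP + SK_JGT + SK_K: pc += (A > in->k) ? in->jt : in->jf; break;
		case SK_JMP + SK_JGE + SK_K: pc += (A >= in->k) ? in->jt : in->jf; break;
		case SK_JMP + SK_JSET + SK_K: pc += (A & in->k) ? in->jt : in->jf; break;
		default: return 0;	/* opcode not modelled in this sketch */
		}
		if (!ok)
			return 0;	/* sketch's choice: reject on a bad load */
	}
	return 0;			/* fell off the end without a return */
}
```

Jump offsets are relative to the instruction after the jump, which is
why the loop increment is applied on top of the offset.
An array built with SK_STMT and SK_JUMP is passed to sk_run together
with its element count and the packet bytes.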
.Sh EXAMPLES
The following filter is taken from the Reverse ARP Daemon.  It accepts
only Reverse ARP requests.
.Bd -literal -offset indent
struct bpf_insn insns[] = {
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 12),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, ETHERTYPE_REVARP, 0, 3),
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 20),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, REVARP_REQUEST, 0, 1),
	BPF_STMT(BPF_RET+BPF_K, sizeof(struct ether_arp) +
	    sizeof(struct ether_header)),
	BPF_STMT(BPF_RET+BPF_K, 0),
};
.Ed
.Pp
This filter accepts only IP packets between host 128.3.112.15 and
128.3.112.35.
.Bd -literal -offset indent
struct bpf_insn insns[] = {
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 12),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, ETHERTYPE_IP, 0, 8),
	BPF_STMT(BPF_LD+BPF_W+BPF_ABS, 26),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x8003700f, 0, 2),
	BPF_STMT(BPF_LD+BPF_W+BPF_ABS, 30),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x80037023, 3, 4),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x80037023, 0, 3),
	BPF_STMT(BPF_LD+BPF_W+BPF_ABS, 30),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x8003700f, 0, 1),
	BPF_STMT(BPF_RET+BPF_K, (u_int)-1),
	BPF_STMT(BPF_RET+BPF_K, 0),
};
.Ed
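.Pp
The constants 0x8003700f and 0x80037023 in the filter above are the two
host addresses written as 32-bit words, most significant octet first,
which is the form a BPF_W load leaves in the accumulator.
A hypothetical macro such as SK_IP4 below (not part of
.Pa <net/bpf.h> )
can derive them from the dotted quad:

```c
#include <stdint.h>

/* Pack a dotted quad a.b.c.d into the word compared by the filter. */
#define SK_IP4(a, b, c, d) \
	(((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
	 ((uint32_t)(c) << 8) | (uint32_t)(d))
```

With it, SK_IP4(128, 3, 112, 15) could replace the literal 0x8003700f in
the instruction array.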
.Pp
Finally, this filter returns only TCP finger packets.  We must parse
the IP header to reach the TCP header.  The
.Sy BPF_JSET
instruction checks that the IP fragment offset is 0 so we are sure
that we have a TCP header.
.Bd -literal -offset indent
struct bpf_insn insns[] = {
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 12),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, ETHERTYPE_IP, 0, 10),
	BPF_STMT(BPF_LD+BPF_B+BPF_ABS, 23),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, IPPROTO_TCP, 0, 8),
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 20),
	BPF_JUMP(BPF_JMP+BPF_JSET+BPF_K, 0x1fff, 6, 0),
	BPF_STMT(BPF_LDX+BPF_B+BPF_MSH, 14),
	BPF_STMT(BPF_LD+BPF_H+BPF_IND, 14),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 79, 2, 0),
	BPF_STMT(BPF_LD+BPF_H+BPF_IND, 16),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 79, 0, 1),
	BPF_STMT(BPF_RET+BPF_K, (u_int)-1),
	BPF_STMT(BPF_RET+BPF_K, 0),
};
.Ed
.Sh SEE ALSO
.Xr ioctl 2 ,
.Xr read 2 ,
.Xr select 2 ,
.Xr signal 3 ,
.Xr tcpdump 8
.Rs
.%A S. McCanne
.%A V. Jacobson
.%T "The BSD Packet Filter: A New Architecture for User-level Packet Capture"
.%J Proceedings of the 1993 Winter USENIX Technical Conference
.%C San Diego, CA
.Re
.Sh FILES
/dev/bpf0, /dev/bpf1, ...
.Sh BUGS
The read buffer must be of a fixed size (returned by the
.Dv BIOCGBLEN
ioctl).
.Pp
A file that does not request promiscuous mode may receive promiscuously
received packets as a side effect of another file requesting this
mode on the same hardware interface.  This could be fixed in the kernel
with additional processing overhead.  However, we favor the model where
all files must assume that the interface is promiscuous, and if
so desired, must utilize a filter to reject foreign packets.
.Pp
Data link protocols with variable length headers are not currently supported.
.Pp
Under SunOS, if a BPF application reads more than 2^31 bytes of
data, read will fail with EINVAL.  You can either fix the bug in SunOS,
or lseek to 0 when read fails for this reason.
.Pp
.Dq Immediate mode
and the
.Dq read timeout
are misguided features.
This functionality can be emulated with non-blocking mode and
.Xr select 2 .
.Sh HISTORY
The Enet packet filter was created in 1980 by Mike Accetta and
Rick Rashid at Carnegie-Mellon University.  Jeffrey Mogul, at
Stanford, ported the code to BSD and continued its development from
1983 on.  Since then, it has evolved into the Ultrix Packet Filter
at DEC, a STREAMS NIT module under SunOS 4.1, and BPF.
.Sh AUTHORS
Steven McCanne, of Lawrence Berkeley Laboratory, implemented BPF in
Summer 1990.  The design was in collaboration with Van Jacobson,
also of Lawrence Berkeley Laboratory.