.\" $NetBSD: bpf.4,v 1.46 2010/06/08 04:11:06 jruoho Exp $
.\"
.\" -*- nroff -*-
.\"
.\" $NetBSD: bpf.4,v 1.46 2010/06/08 04:11:06 jruoho Exp $
.\"
.\" Copyright (c) 1990, 1991, 1992, 1993, 1994
.\" The Regents of the University of California. All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that: (1) source code distributions
.\" retain the above copyright notice and this paragraph in its entirety, (2)
.\" distributions including binary code include the above copyright notice and
.\" this paragraph in its entirety in the documentation or other materials
.\" provided with the distribution, and (3) all advertising materials mentioning
.\" features or use of this software display the following acknowledgement:
.\" ``This product includes software developed by the University of California,
.\" Lawrence Berkeley Laboratory and its contributors.'' Neither the name of
.\" the University nor the names of its contributors may be used to endorse
.\" or promote products derived from this software without specific prior
.\" written permission.
.\" THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR IMPLIED
.\" WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
.\" MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
.\"
.\" This document is derived in part from the enet man page (enet.4)
.\" distributed with 4.3BSD Unix.
.\"
.Dd June 8, 2010
.Dt BPF 4
.Os
.Sh NAME
.Nm bpf
.Nd Berkeley Packet Filter raw network interface
.Sh SYNOPSIS
.Cd "pseudo-device bpfilter"
.Sh DESCRIPTION
The Berkeley Packet Filter
provides a raw interface to data link layers in a protocol
independent fashion.
All packets on the network, even those destined for other hosts,
are accessible through this mechanism.
.Pp
The packet filter appears as a character special device,
.Pa /dev/bpf .
After opening the device, the file descriptor must be bound to a
specific network interface with the
.Dv BIOCSETIF
ioctl.
A given interface can be shared by multiple listeners, and the filter
underlying each descriptor will see an identical packet stream.
.Pp
Associated with each open instance of a
.Nm
file is a user-settable packet filter.
Whenever a packet is received by an interface,
all file descriptors listening on that interface apply their filter.
Each descriptor that accepts the packet receives its own copy.
.Pp
Reads from these files return the next group of packets
that have matched the filter.
To improve performance, the buffer passed to read must be
the same size as the buffers used internally by
.Nm .
This size is returned by the
.Dv BIOCGBLEN
ioctl (see below), and under
BSD, can be set with
.Dv BIOCSBLEN .
Note that an individual packet larger than this size is necessarily
truncated.
.Pp
The packet filter will support any link level protocol that has fixed length
headers.
Currently, only Ethernet, SLIP and PPP drivers have been
modified to interact with
.Nm .
.Pp
Since packet data is in network byte order, applications should use the
.Xr byteorder 3
macros to extract multi-byte values.
.Pp
A packet can be sent out on the network by writing to a
.Nm
file descriptor.
The writes are unbuffered, meaning only one packet can be processed per write.
Currently, only writes to Ethernets and SLIP links are supported.
.Sh IOCTLS
The
.Xr ioctl 2
command codes below are defined in
.In net/bpf.h .
All commands require these includes:
.Bd -literal -offset indent
#include \*[Lt]sys/types.h\*[Gt]
#include \*[Lt]sys/time.h\*[Gt]
#include \*[Lt]sys/ioctl.h\*[Gt]
#include \*[Lt]net/bpf.h\*[Gt]
.Ed
.Pp
Additionally,
.Dv BIOCGETIF
and
.Dv BIOCSETIF
require
.Pa \*[Lt]net/if.h\*[Gt] .
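.Pp
As a minimal sketch of the byte order advice given in
.Sx DESCRIPTION ,
the following C fragment extracts the 16-bit type field of an Ethernet
frame and converts it to host order.
The helper name
.Fn demo_ethertype
is hypothetical and not part of
.Nm ;
the sketch assumes the caller has at least 14 bytes of link level data.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not part of bpf): return the EtherType of an
 * Ethernet frame in host byte order.
 * Assumption: frame points at 14 or more bytes of link level data.
 */
uint16_t
demo_ethertype(const unsigned char *frame)
{
	uint16_t type_be;

	assert(frame != NULL);
	/* The type field occupies bytes 12 and 13; memcpy avoids unaligned loads. */
	memcpy(&type_be, frame + 12, sizeof(type_be));
	/* Packet data is in network byte order, so convert before comparing. */
	return ntohs(type_be);
}
```

A caller would compare the result against constants such as ETHERTYPE_IP
without swapping those constants by hand.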
.Pp
The (third) argument to the
.Xr ioctl 2
should be a pointer to the type indicated.
.Bl -tag -width indent -offset indent
.It Dv "BIOCGBLEN (u_int)"
Returns the required buffer length for reads on
.Nm
files.
.It Dv "BIOCSBLEN (u_int)"
Sets the buffer length for reads on
.Nm
files.
The buffer length must be set before the file is attached to an interface with
.Dv BIOCSETIF .
If the requested buffer size cannot be accommodated, the closest
allowable size will be set and returned in the argument.
A read call will result in
.Er EINVAL
if it is passed a buffer that is not this size.
.It Dv BIOCGDLT (u_int)
Returns the type of the data link layer underlying the attached interface.
.Er EINVAL
is returned if no interface has been specified.
The device types, prefixed with
.Dq DLT_ ,
are defined in
.In net/bpf.h .
.It Dv BIOCGDLTLIST (struct bpf_dltlist)
Returns an array of the data link layer types available for
the attached interface:
.Bd -literal -offset indent
struct bpf_dltlist {
	u_int bfl_len;
	u_int *bfl_list;
};
.Ed
.Pp
The available types are returned in the array pointed to by the
.Va bfl_list
field, while the length of that array, in units of u_int, is supplied in the
.Va bfl_len
field.
.Er ENOMEM
is returned if the supplied array is too small.
The
.Va bfl_len
field is modified on return to indicate the actual length in u_int
of the array returned.
If
.Va bfl_list
is
.Dv NULL ,
the
.Va bfl_len
field is set on return to indicate the required length of an array in u_int.
.It Dv BIOCSDLT (u_int)
Changes the type of the data link layer underlying the attached interface.
.Er EINVAL
is returned if no interface has been specified or the specified
type is not available for the interface.
.It Dv BIOCPROMISC
Forces the interface into promiscuous mode.
All packets, not just those destined for the local host, are processed.
Since more than one file can be listening on a given interface,
a listener that opened its interface non-promiscuously may receive
packets promiscuously.
This problem can be remedied with an appropriate filter.
.Pp
The interface remains in promiscuous mode until all files listening
promiscuously are closed.
.It Dv BIOCFLUSH
Flushes the buffer of incoming packets,
and resets the statistics that are returned by
.Dv BIOCGSTATS .
.It Dv BIOCGETIF (struct ifreq)
Returns the name of the hardware interface that the file is listening on.
The name is returned in the ifr_name field of
.Fa ifr .
All other fields are undefined.
.It Dv BIOCSETIF (struct ifreq)
Sets the hardware interface associated with the file.
This command must be performed before any packets can be read.
The device is indicated by name using the
.Dv ifr_name
field of the
.Fa ifreq .
Additionally, performs the actions of
.Dv BIOCFLUSH .
.It Dv BIOCSRTIMEOUT, BIOCGRTIMEOUT (struct timeval)
Set or get the read timeout parameter.
The
.Fa timeval
specifies the length of time to wait before timing
out on a read request.
This parameter is initialized to zero by
.Xr open 2 ,
indicating no timeout.
.It Dv BIOCGSTATS (struct bpf_stat)
Returns the following structure of packet statistics:
.Bd -literal -offset indent
struct bpf_stat {
	uint64_t bs_recv;
	uint64_t bs_drop;
	uint64_t bs_capt;
	uint64_t bs_padding[13];
};
.Ed
.Pp
The fields are:
.Bl -tag -width bs_recv -offset indent
.It Va bs_recv
the number of packets received by the descriptor since opened or reset
(including any buffered since the last read call);
.It Va bs_drop
the number of packets which were accepted by the filter but dropped by the
kernel because of buffer overflows
(i.e., the application's reads aren't keeping up with the packet
traffic); and
.It Va bs_capt
the number of packets accepted by the filter.
.El
.It Dv BIOCIMMEDIATE (u_int)
Enable or disable
.Dq immediate mode ,
based on the truth value of the argument.
When immediate mode is enabled, reads return immediately upon packet
reception.
Otherwise, a read will block until either the kernel buffer
becomes full or a timeout occurs.
This is useful for programs like
.Xr rarpd 8 ,
which must respond to messages in real time.
The default for a new file is off.
.It Dv BIOCSETF (struct bpf_program)
Sets the filter program used by the kernel to discard uninteresting
packets.
An array of instructions and its length are passed in using the following structure:
.Bd -literal -offset indent
struct bpf_program {
	u_int bf_len;
	struct bpf_insn *bf_insns;
};
.Ed
.Pp
The filter program is pointed to by the
.Va bf_insns
field while its length in units of
.Sq struct bpf_insn
is given by the
.Va bf_len
field.
Also, the actions of
.Dv BIOCFLUSH
are performed.
.Pp
See section
.Sy FILTER MACHINE
for an explanation of the filter language.
.It Dv BIOCVERSION (struct bpf_version)
Returns the major and minor version numbers of the filter language currently
recognized by the kernel.
Before installing a filter, applications must check
that the current version is compatible with the running kernel.
Version numbers are compatible if the major numbers match and the
application minor is less than or equal to the kernel minor.
The kernel version number is returned in the following structure:
.Bd -literal -offset indent
struct bpf_version {
	u_short bv_major;
	u_short bv_minor;
};
.Ed
.Pp
The current version numbers are given by
.Dv BPF_MAJOR_VERSION
and
.Dv BPF_MINOR_VERSION
from
.In net/bpf.h .
An incompatible filter
may result in undefined behavior (most likely, an error returned by
.Xr ioctl 2
or haphazard packet matching).
.It Dv BIOCGHDRCMPLT BIOCSHDRCMPLT (u_int)
Enable/disable or get the
.Dq header complete
flag status.
If enabled, packets written to the bpf file descriptor will not have
network layer headers rewritten in the interface output routine.
By default, the flag is disabled (value is 0).
.It Dv BIOCGSEESENT BIOCSSEESENT (u_int)
Enable/disable or get the
.Dq see sent
flag status.
If enabled, packets sent by the host (not from
.Nm )
will be passed to the filter.
By default, the flag is enabled (value is 1).
.It Dv BIOCFEEDBACK BIOCSFEEDBACK BIOCGFEEDBACK (u_int)
Set (or get)
.Dq packet feedback mode .
This allows injected packets to be fed back as input to the interface when
output via the interface is successful.
The first name is meant for
.Fx
compatibility, the two others follow the Get/Set convention.
.\"When
.\".Dv BPF_D_INOUT
.\"direction is set, injected
Injected
outgoing packets are not returned by BPF to avoid
duplication.
This flag is initialized to zero by default.
.El
.Sh STANDARD IOCTLS
.Nm
now supports several standard
.Xr ioctl 2 Ns 's
which allow the user to do async and/or non-blocking I/O to an open
.Nm bpf
file descriptor.
.Bl -tag -width indent -offset indent
.It Dv FIONREAD (int)
Returns the number of bytes that are immediately available for reading.
.It Dv FIONBIO (int)
Set or clear non-blocking I/O.
If arg is non-zero, then doing a
.Xr read 2
when no data is available will return -1 and
.Va errno
will be set to
.Er EAGAIN .
If arg is zero, non-blocking I/O is disabled.
Note: setting this
overrides the timeout set by
.Dv BIOCSRTIMEOUT .
.It Dv FIOASYNC (int)
Enable or disable async I/O.
When enabled (arg is non-zero), the process or process group specified by
.Dv FIOSETOWN
will start receiving SIGIO's when packets
arrive.
Note that you must do an
.Dv FIOSETOWN
in order for this to take effect, as
the system will not default this for you.
The signal may be changed via
.Dv BIOCSRSIG .
.It Dv FIOSETOWN FIOGETOWN (int)
Set or get the process or process group (if negative) that should receive SIGIO
when packets are available.
The signal may be changed using
.Dv BIOCSRSIG
(see above).
.El
.Sh BPF HEADER
The following structure is prepended to each packet returned by
.Xr read 2 :
.Bd -literal -offset indent
struct bpf_hdr {
	struct bpf_timeval bh_tstamp;
	uint32_t bh_caplen;
	uint32_t bh_datalen;
	uint16_t bh_hdrlen;
};
.Ed
.Pp
The fields, whose values are stored in host order, are:
.Bl -tag -width bh_datalen -offset indent
.It Va bh_tstamp
The time at which the packet was processed by the packet filter.
This structure differs from the standard
.Vt struct timeval
in that both members are of type
.Vt long .
.It Va bh_caplen
The length of the captured portion of the packet.
This is the minimum of
the truncation amount specified by the filter and the length of the packet.
.It Va bh_datalen
The length of the packet off the wire.
This value is independent of the truncation amount specified by the filter.
.It Va bh_hdrlen
The length of the BPF header, which may not be equal to
.Em sizeof(struct bpf_hdr) .
.El
.Pp
The
.Va bh_hdrlen
field exists to account for
padding between the header and the link level protocol.
The purpose here is to guarantee proper alignment of the packet
data structures, which is required on alignment sensitive
architectures and improves performance on many other architectures.
The packet filter ensures that the
.Va bpf_hdr
and the
.Em network layer
header will be word aligned.
Suitable precautions must be taken when accessing the link layer
protocol fields on alignment restricted machines.
(This isn't a problem on an Ethernet, since
the type field is a short falling on an even offset,
and the addresses are probably accessed in a bytewise fashion).
.Pp
Additionally, individual packets are padded so that each starts
on a word boundary.
This requires that an application
has some knowledge of how to get from packet to packet.
The macro
.Dv BPF_WORDALIGN
is defined in
.In net/bpf.h
to facilitate this process.
It rounds up its argument
to the nearest word aligned value (where a word is
.Dv BPF_ALIGNMENT
bytes wide).
.Pp
For example, if
.Sq Va p
points to the start of a packet, this expression
will advance it to the next packet:
.Pp
.Dl p = (char *)p + BPF_WORDALIGN(p-\*[Gt]bh_hdrlen + p-\*[Gt]bh_caplen)
.Pp
For the alignment mechanisms to work properly, the
buffer passed to
.Xr read 2
must itself be word aligned.
.Xr malloc 3
will always return an aligned buffer.
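.Pp
The packet-to-packet stepping described above can be sketched as a loop over
the bytes returned by a single
.Xr read 2 .
This is an illustration only, not code from the kernel or any library.
So that it compiles without
.In net/bpf.h ,
it uses hypothetical stand-ins:
.Vt struct demo_bpf_hdr
mirrors the structure shown above, and
.Dv DEMO_ALIGNMENT
and
.Dv DEMO_WORDALIGN
assume a word of
.Em sizeof(long)
bytes.
Real code would use
.Vt struct bpf_hdr ,
.Dv BPF_ALIGNMENT
and
.Dv BPF_WORDALIGN
instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for BPF_ALIGNMENT and BPF_WORDALIGN (assumed values). */
#define DEMO_ALIGNMENT	sizeof(long)
#define DEMO_WORDALIGN(x) \
	(((x) + (DEMO_ALIGNMENT - 1)) & ~(DEMO_ALIGNMENT - 1))

/* Hypothetical mirror of struct bpf_timeval: both members are long. */
struct demo_bpf_timeval {
	long tv_sec;
	long tv_usec;
};

/* Hypothetical mirror of struct bpf_hdr as listed in this section. */
struct demo_bpf_hdr {
	struct demo_bpf_timeval bh_tstamp;
	uint32_t bh_caplen;
	uint32_t bh_datalen;
	uint16_t bh_hdrlen;
};

/*
 * Count the packets held in the first nread bytes of buf.
 * Each record is bh_hdrlen bytes of header followed by bh_caplen bytes
 * of packet data, padded so the next record starts on a word boundary.
 */
size_t
demo_count_packets(const unsigned char *buf, size_t nread)
{
	size_t off = 0, count = 0;

	assert(buf != NULL || nread == 0);
	while (off + sizeof(struct demo_bpf_hdr) <= nread) {
		struct demo_bpf_hdr h;
		size_t step;

		memcpy(&h, buf + off, sizeof(h));
		step = DEMO_WORDALIGN((size_t)h.bh_hdrlen + h.bh_caplen);
		/* Stop on a record that cannot advance or overruns the data. */
		if (step == 0 || step > nread - off)
			break;
		/* Captured bytes start at buf + off + h.bh_hdrlen. */
		count++;
		off += step;
	}
	return count;
}
```

The sketch copies each header with
.Xr memcpy 3
so that it does not depend on the alignment of the buffer; code that casts
the buffer to a header pointer relies on the word alignment described above.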
.Sh FILTER MACHINE
A filter program is an array of instructions, with all branches forwardly
directed, terminated by a
.Sy return
instruction.
Each instruction performs some action on the pseudo-machine state,
which consists of an accumulator, index register, scratch memory store,
and implicit program counter.
.Pp
The following structure defines the instruction format:
.Bd -literal -offset indent
struct bpf_insn {
	uint16_t code;
	u_char jt;
	u_char jf;
	int32_t k;
};
.Ed
.Pp
The
.Va k
field is used in different ways by different instructions,
and the
.Va jt
and
.Va jf
fields are used as offsets
by the branch instructions.
The opcodes are encoded in a semi-hierarchical fashion.
There are eight classes of instructions: BPF_LD, BPF_LDX, BPF_ST, BPF_STX,
BPF_ALU, BPF_JMP, BPF_RET, and BPF_MISC.
Various other mode and
operator bits are or'd into the class to give the actual instructions.
The classes and modes are defined in
.In net/bpf.h .
.Pp
Below are the semantics for each defined BPF instruction.
We use the convention that A is the accumulator, X is the index register,
P[] packet data, and M[] scratch memory store.
P[i:n] gives the data at byte offset
.Dq i
in the packet,
interpreted as a word (n=4),
unsigned halfword (n=2), or unsigned byte (n=1).
M[i] gives the i'th word in the scratch memory store, which is only
addressed in word units.
The memory store is indexed from 0 to BPF_MEMWORDS-1.
.Va k ,
.Va jt ,
and
.Va jf
are the corresponding fields in the
instruction definition.
.Dq len
refers to the length of the packet.
.Bl -tag -width indent -offset indent
.It Sy BPF_LD
These instructions copy a value into the accumulator.
The type of the source operand is specified by an
.Dq addressing mode
and can be a constant
.Sy ( BPF_IMM ) ,
packet data at a fixed offset
.Sy ( BPF_ABS ) ,
packet data at a variable offset
.Sy ( BPF_IND ) ,
the packet length
.Sy ( BPF_LEN ) ,
or a word in the scratch memory store
.Sy ( BPF_MEM ) .
For
.Sy BPF_IND
and
.Sy BPF_ABS ,
the data size must be specified as a word
.Sy ( BPF_W ) ,
halfword
.Sy ( BPF_H ) ,
or byte
.Sy ( BPF_B ) .
The semantics of all the recognized BPF_LD instructions follow.
.Bl -column "BPF_LD_BPF_W_BPF_ABS" "A \*[Lt]- P[k:4]" -offset indent
.It Sy BPF_LD+BPF_W+BPF_ABS Ta A \*[Lt]- P[k:4]
.It Sy BPF_LD+BPF_H+BPF_ABS Ta A \*[Lt]- P[k:2]
.It Sy BPF_LD+BPF_B+BPF_ABS Ta A \*[Lt]- P[k:1]
.It Sy BPF_LD+BPF_W+BPF_IND Ta A \*[Lt]- P[X+k:4]
.It Sy BPF_LD+BPF_H+BPF_IND Ta A \*[Lt]- P[X+k:2]
.It Sy BPF_LD+BPF_B+BPF_IND Ta A \*[Lt]- P[X+k:1]
.It Sy BPF_LD+BPF_W+BPF_LEN Ta A \*[Lt]- len
.It Sy BPF_LD+BPF_IMM Ta A \*[Lt]- k
.It Sy BPF_LD+BPF_MEM Ta A \*[Lt]- M[k]
.El
.It Sy BPF_LDX
These instructions load a value into the index register.
Note that the addressing modes are more restricted than those of
the accumulator loads, but they include
.Sy BPF_MSH ,
a hack for efficiently loading the IP header length.
.Bl -column "BPF_LDX_BPF_W_BPF_IMM" "X \*[Lt]- k" -offset indent
.It Sy BPF_LDX+BPF_W+BPF_IMM Ta X \*[Lt]- k
.It Sy BPF_LDX+BPF_W+BPF_MEM Ta X \*[Lt]- M[k]
.It Sy BPF_LDX+BPF_W+BPF_LEN Ta X \*[Lt]- len
.It Sy BPF_LDX+BPF_B+BPF_MSH Ta X \*[Lt]- 4*(P[k:1]\*[Am]0xf)
.El
.It Sy BPF_ST
This instruction stores the accumulator into the scratch memory.
We do not need an addressing mode since there is only one possibility
for the destination.
.Bl -column "BPF_ST" "M[k] \*[Lt]- A" -offset indent
.It Sy BPF_ST Ta M[k] \*[Lt]- A
.El
.It Sy BPF_STX
This instruction stores the index register in the scratch memory store.
.Bl -column "BPF_STX" "M[k] \*[Lt]- X" -offset indent
.It Sy BPF_STX Ta M[k] \*[Lt]- X
.El
.It Sy BPF_ALU
The alu instructions perform operations between the accumulator and
index register or constant, and store the result back in the accumulator.
For binary operations, a source mode is required
.Sy ( BPF_K
or
.Sy BPF_X ) .
.Bl -column "BPF_ALU_BPF_ADD_BPF_K" "A \*[Lt]- A + k" -offset indent
.It Sy BPF_ALU+BPF_ADD+BPF_K Ta A \*[Lt]- A + k
.It Sy BPF_ALU+BPF_SUB+BPF_K Ta A \*[Lt]- A - k
.It Sy BPF_ALU+BPF_MUL+BPF_K Ta A \*[Lt]- A * k
.It Sy BPF_ALU+BPF_DIV+BPF_K Ta A \*[Lt]- A / k
.It Sy BPF_ALU+BPF_AND+BPF_K Ta A \*[Lt]- A \*[Am] k
.It Sy BPF_ALU+BPF_OR+BPF_K Ta A \*[Lt]- A | k
.It Sy BPF_ALU+BPF_LSH+BPF_K Ta A \*[Lt]- A \*[Lt]\*[Lt] k
.It Sy BPF_ALU+BPF_RSH+BPF_K Ta A \*[Lt]- A \*[Gt]\*[Gt] k
.It Sy BPF_ALU+BPF_ADD+BPF_X Ta A \*[Lt]- A + X
.It Sy BPF_ALU+BPF_SUB+BPF_X Ta A \*[Lt]- A - X
.It Sy BPF_ALU+BPF_MUL+BPF_X Ta A \*[Lt]- A * X
.It Sy BPF_ALU+BPF_DIV+BPF_X Ta A \*[Lt]- A / X
.It Sy BPF_ALU+BPF_AND+BPF_X Ta A \*[Lt]- A \*[Am] X
.It Sy BPF_ALU+BPF_OR+BPF_X Ta A \*[Lt]- A | X
.It Sy BPF_ALU+BPF_LSH+BPF_X Ta A \*[Lt]- A \*[Lt]\*[Lt] X
.It Sy BPF_ALU+BPF_RSH+BPF_X Ta A \*[Lt]- A \*[Gt]\*[Gt] X
.It Sy BPF_ALU+BPF_NEG Ta A \*[Lt]- -A
.El
.It Sy BPF_JMP
The jump instructions alter flow of control.
Conditional jumps compare the accumulator against a constant
.Sy ( BPF_K )
or the index register
.Sy ( BPF_X ) .
If the result is true (or non-zero),
the true branch is taken, otherwise the false branch is taken.
Jump offsets are encoded in 8 bits so the longest jump is 256 instructions.
However, the jump always
.Sy ( BPF_JA )
opcode uses the 32 bit
.Va k
field as the offset, allowing arbitrarily distant destinations.
All conditionals use unsigned comparison conventions.
.Bl -column "BPF_JMP+BPF_JGE+BPF_K" "pc += (A \*[Ge] k) ? jt : jf" -offset indent
.It Sy BPF_JMP+BPF_JA Ta pc += k
.It Sy BPF_JMP+BPF_JGT+BPF_K Ta "pc += (A \*[Gt] k) ? jt : jf"
.It Sy BPF_JMP+BPF_JGE+BPF_K Ta "pc += (A \*[Ge] k) ? jt : jf"
.It Sy BPF_JMP+BPF_JEQ+BPF_K Ta "pc += (A == k) ? jt : jf"
.It Sy BPF_JMP+BPF_JSET+BPF_K Ta "pc += (A \*[Am] k) ? jt : jf"
.It Sy BPF_JMP+BPF_JGT+BPF_X Ta "pc += (A \*[Gt] X) ? jt : jf"
.It Sy BPF_JMP+BPF_JGE+BPF_X Ta "pc += (A \*[Ge] X) ? jt : jf"
.It Sy BPF_JMP+BPF_JEQ+BPF_X Ta "pc += (A == X) ? jt : jf"
.It Sy BPF_JMP+BPF_JSET+BPF_X Ta "pc += (A \*[Am] X) ? jt : jf"
.El
.It Sy BPF_RET
The return instructions terminate the filter program and specify the amount
of packet to accept (i.e., they return the truncation amount).
A return value of zero indicates that the packet should be ignored.
The return value is either a constant
.Sy ( BPF_K )
or the accumulator
.Sy ( BPF_A ) .
.Bl -column "BPF_RET+BPF_A" "accept A bytes" -offset indent
.It Sy BPF_RET+BPF_A Ta accept A bytes
.It Sy BPF_RET+BPF_K Ta accept k bytes
.El
.It Sy BPF_MISC
The miscellaneous category was created for anything that doesn't
fit into the above classes, and for any new instructions that might need to
be added.
Currently, these are the register transfer instructions
that copy the index register to the accumulator or vice versa.
.Bl -column "BPF_MISC+BPF_TAX" "X \*[Lt]- A" -offset indent
.It Sy BPF_MISC+BPF_TAX Ta X \*[Lt]- A
.It Sy BPF_MISC+BPF_TXA Ta A \*[Lt]- X
.El
.El
.Pp
The BPF interface provides the following macros to facilitate
array initializers:
.Bd -unfilled -offset indent
.Sy BPF_STMT No (opcode, operand)
.Sy BPF_JUMP No (opcode, operand, true_offset, false_offset)
.Ed
.Sh SYSCTLS
The following sysctls are available when
.Nm
is enabled:
.Pp
.Bl -tag -width "XnetXbpfXmaxbufsizeXX"
.It Li net.bpf.maxbufsize
Sets the maximum buffer size available for
.Nm
peers.
.It Li net.bpf.stats
Shows
.Nm
statistics.
They can be retrieved with the
.Xr netstat 1
utility.
.It Li net.bpf.peers
Shows the current
.Nm
peers.
This is only available to the super user and can also be retrieved with the
.Xr netstat 1
utility.
.El
.Sh FILES
.Pa /dev/bpf
.Sh EXAMPLES
The following filter is taken from the Reverse ARP Daemon.
It accepts only Reverse ARP requests.
.Bd -literal -offset indent
struct bpf_insn insns[] = {
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 12),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, ETHERTYPE_REVARP, 0, 3),
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 20),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, REVARP_REQUEST, 0, 1),
	BPF_STMT(BPF_RET+BPF_K, sizeof(struct ether_arp) +
	    sizeof(struct ether_header)),
	BPF_STMT(BPF_RET+BPF_K, 0),
};
.Ed
.Pp
This filter accepts only IP packets between host 128.3.112.15 and
128.3.112.35.
.Bd -literal -offset indent
struct bpf_insn insns[] = {
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 12),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, ETHERTYPE_IP, 0, 8),
	BPF_STMT(BPF_LD+BPF_W+BPF_ABS, 26),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x8003700f, 0, 2),
	BPF_STMT(BPF_LD+BPF_W+BPF_ABS, 30),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x80037023, 3, 4),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x80037023, 0, 3),
	BPF_STMT(BPF_LD+BPF_W+BPF_ABS, 30),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x8003700f, 0, 1),
	BPF_STMT(BPF_RET+BPF_K, (u_int)-1),
	BPF_STMT(BPF_RET+BPF_K, 0),
};
.Ed
.Pp
Finally, this filter returns only TCP finger packets.
We must parse the IP header to reach the TCP header.
The
.Sy BPF_JSET
instruction checks that the IP fragment offset is 0 so we are sure
that we have a TCP header.
.Bd -literal -offset indent
struct bpf_insn insns[] = {
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 12),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, ETHERTYPE_IP, 0, 10),
	BPF_STMT(BPF_LD+BPF_B+BPF_ABS, 23),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, IPPROTO_TCP, 0, 8),
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 20),
	BPF_JUMP(BPF_JMP+BPF_JSET+BPF_K, 0x1fff, 6, 0),
	BPF_STMT(BPF_LDX+BPF_B+BPF_MSH, 14),
	BPF_STMT(BPF_LD+BPF_H+BPF_IND, 14),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 79, 2, 0),
	BPF_STMT(BPF_LD+BPF_H+BPF_IND, 16),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 79, 0, 1),
	BPF_STMT(BPF_RET+BPF_K, (u_int)-1),
	BPF_STMT(BPF_RET+BPF_K, 0),
};
.Ed
.Sh SEE ALSO
.Xr ioctl 2 ,
.Xr read 2 ,
.Xr select 2 ,
.Xr signal 3 ,
.Xr tcpdump 8
.Rs
.%T "The BSD Packet Filter: A New Architecture for User-level Packet Capture"
.%A S. McCanne
.%A V. Jacobson
.%J Proceedings of the 1993 Winter USENIX
.%C Technical Conference, San Diego, CA
.Re
.Sh HISTORY
The Enet packet filter was created in 1980 by Mike Accetta and
Rick Rashid at Carnegie-Mellon University.
Jeffrey Mogul, at Stanford, ported the code to BSD and continued
its development from 1983 on.
Since then, it has evolved into the ULTRIX Packet Filter
at DEC, a STREAMS NIT module under SunOS 4.1, and BPF.
.Sh AUTHORS
Steven McCanne, of Lawrence Berkeley Laboratory, implemented BPF in
Summer 1990.
The design was in collaboration with Van Jacobson,
also of Lawrence Berkeley Laboratory.
.Sh BUGS
The read buffer must be of a fixed size (returned by the
.Dv BIOCGBLEN
ioctl).
.Pp
A file that does not request promiscuous mode may receive promiscuously
received packets as a side effect of another file requesting this
mode on the same hardware interface.
This could be fixed in the kernel with additional processing overhead.
However, we favor the model where
all files must assume that the interface is promiscuous, and if
so desired, must use a filter to reject foreign packets.
.Pp
Data link protocols with variable length headers are not currently supported.
.Pp
Under SunOS, if a BPF application reads more than 2^31 bytes of
data, read will fail with
.Er EINVAL .
You can either fix the bug in SunOS,
or lseek to 0 when read fails for this reason.
.Pp
.Dq Immediate mode
and the
.Dq read timeout
are misguided features.
This functionality can be emulated with non-blocking mode and
.Xr select 2 .
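.Pp
As a minimal sketch of that emulation, the helper below waits a bounded time
for a descriptor to become readable using
.Xr select 2 ;
the caller would then issue a non-blocking
.Xr read 2 .
The name
.Fn demo_wait_readable
and its millisecond argument are hypothetical and not part of
.Nm .
The sketch assumes the descriptor number is below
.Dv FD_SETSIZE .

```c
#include <sys/select.h>
#include <sys/time.h>
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of bpf): wait up to timeout_ms
 * milliseconds for fd to become readable.
 * Returns 1 if fd is readable, 0 on timeout, -1 on error
 * (errno is left as set by select).
 * Assumption: 0 <= fd < FD_SETSIZE and timeout_ms >= 0.
 */
int
demo_wait_readable(int fd, long timeout_ms)
{
	fd_set rfds;
	struct timeval tv;
	int n;

	FD_ZERO(&rfds);
	FD_SET(fd, &rfds);
	tv.tv_sec = timeout_ms / 1000;
	tv.tv_usec = (timeout_ms % 1000) * 1000;

	n = select(fd + 1, &rfds, NULL, NULL, &tv);
	if (n < 0)
		return -1;
	return n > 0 ? 1 : 0;
}
```

With the descriptor placed in non-blocking mode through
.Dv FIONBIO ,
a return value of 1 would be followed by a single
.Xr read 2
into a buffer of the size reported by
.Dv BIOCGBLEN .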