.\" $NetBSD: bpf.4,v 1.45 2010/03/22 18:58:31 joerg Exp $
.\"
.\" -*- nroff -*-
.\"
.\" Copyright (c) 1990, 1991, 1992, 1993, 1994
.\" The Regents of the University of California. All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that: (1) source code distributions
.\" retain the above copyright notice and this paragraph in its entirety, (2)
.\" distributions including binary code include the above copyright notice and
.\" this paragraph in its entirety in the documentation or other materials
.\" provided with the distribution, and (3) all advertising materials mentioning
.\" features or use of this software display the following acknowledgement:
.\" ``This product includes software developed by the University of California,
.\" Lawrence Berkeley Laboratory and its contributors.'' Neither the name of
.\" the University nor the names of its contributors may be used to endorse
.\" or promote products derived from this software without specific prior
.\" written permission.
.\" THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR IMPLIED
.\" WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
.\" MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
.\"
.\" This document is derived in part from the enet man page (enet.4)
.\" distributed with 4.3BSD Unix.
.\"
.Dd March 13, 2010
.Dt BPF 4
.Os
.Sh NAME
.Nm bpf
.Nd Berkeley Packet Filter raw network interface
.Sh SYNOPSIS
.Cd "pseudo-device bpfilter"
.Sh DESCRIPTION
The Berkeley Packet Filter
provides a raw interface to data link layers in a protocol
independent fashion.
All packets on the network, even those destined for other hosts,
are accessible through this mechanism.
.Pp
The packet filter appears as a character special device,
.Pa /dev/bpf .
After opening the device, the file descriptor must be bound to a
specific network interface with the
.Dv BIOCSETIF
ioctl.
A given interface can be shared by multiple listeners, and the filter
underlying each descriptor will see an identical packet stream.
.Pp
Associated with each open instance of a
.Nm
file is a user-settable packet filter.
Whenever a packet is received by an interface,
all file descriptors listening on that interface apply their filter.
Each descriptor that accepts the packet receives its own copy.
.Pp
Reads from these files return the next group of packets
that have matched the filter.
To improve performance, the buffer passed to read must be
the same size as the buffers used internally by
.Nm .
This size is returned by the
.Dv BIOCGBLEN
ioctl (see below), and under
BSD, can be set with
.Dv BIOCSBLEN .
Note that an individual packet larger than this size is necessarily
truncated.
.Pp
The packet filter will support any link level protocol that has fixed length
headers.
Currently, only Ethernet, SLIP and PPP drivers have been
modified to interact with
.Nm .
.Pp
Since packet data is in network byte order, applications should use the
.Xr byteorder 3
macros to extract multi-byte values.
.Pp
A packet can be sent out on the network by writing to a
.Nm
file descriptor.
The writes are unbuffered, meaning only one packet can be processed per write.
Currently, only writes to Ethernets and SLIP links are supported.
.Sh IOCTLS
The
.Xr ioctl 2
command codes below are defined in
.In net/bpf.h .
All commands require these includes:
.Bd -literal -offset indent
#include \*[Lt]sys/types.h\*[Gt]
#include \*[Lt]sys/time.h\*[Gt]
#include \*[Lt]sys/ioctl.h\*[Gt]
#include \*[Lt]net/bpf.h\*[Gt]
.Ed
.Pp
Additionally,
.Dv BIOCGETIF
and
.Dv BIOCSETIF
require
.Pa \*[Lt]net/if.h\*[Gt] .
.Pp
The (third) argument to the
.Xr ioctl 2
should be a pointer to the type indicated.
.Bl -tag -width indent -offset indent
.It Dv "BIOCGBLEN (u_int)"
Returns the required buffer length for reads on
.Nm
files.
.It Dv "BIOCSBLEN (u_int)"
Sets the buffer length for reads on
.Nm
files.
The buffer length must be set before the file is attached to an interface with
.Dv BIOCSETIF .
If the requested buffer size cannot be accommodated, the closest
allowable size will be set and returned in the argument.
A read call will result in
.Er EINVAL
if it is passed a buffer that is not this size.
.It Dv BIOCGDLT (u_int)
Returns the type of the data link layer underlying the attached interface.
.Er EINVAL
is returned if no interface has been specified.
The device types, prefixed with
.Dq DLT_ ,
are defined in
.In net/bpf.h .
.It Dv BIOCGDLTLIST (struct bpf_dltlist)
Returns an array of the data link layer types available
for the attached interface:
.Bd -literal -offset indent
struct bpf_dltlist {
	u_int bfl_len;
	u_int *bfl_list;
};
.Ed
.Pp
The available types are returned in the array pointed to by the
.Va bfl_list
field, while the length of that array in u_int units is supplied in the
.Va bfl_len
field.
.Er ENOMEM
is returned if the supplied array is too small.
The
.Va bfl_len
field is modified on return to indicate the actual length in u_int
of the array returned.
If
.Va bfl_list
is
.Dv NULL ,
the
.Va bfl_len
field is returned to indicate the required length of an array in u_int.
.It Dv BIOCSDLT (u_int)
Change the type of the data link layer underlying the attached interface.
.Er EINVAL
is returned if no interface has been specified or the specified
type is not available for the interface.
.It Dv BIOCPROMISC
Forces the interface into promiscuous mode.
All packets, not just those destined for the local host, are processed.
Since more than one file can be listening on a given interface,
a listener that opened its interface non-promiscuously may receive
packets promiscuously.
This problem can be remedied with an appropriate filter.
.Pp
The interface remains in promiscuous mode until all files listening
promiscuously are closed.
.It Dv BIOCFLUSH
Flushes the buffer of incoming packets,
and resets the statistics that are returned by
.Dv BIOCGSTATS .
.It Dv BIOCGETIF (struct ifreq)
Returns the name of the hardware interface that the file is listening on.
The name is returned in the ifr_name field of
.Fa ifr .
All other fields are undefined.
.It Dv BIOCSETIF (struct ifreq)
Sets the hardware interface associated with the file.
This command must be performed before any packets can be read.
The device is indicated by name using the
.Dv ifr_name
field of the
.Fa ifreq .
Additionally, performs the actions of
.Dv BIOCFLUSH .
.It Dv BIOCSRTIMEOUT, BIOCGRTIMEOUT (struct timeval)
Set or get the read timeout parameter.
The
.Fa timeval
specifies the length of time to wait before timing
out on a read request.
This parameter is initialized to zero by
.Xr open 2 ,
indicating no timeout.
.It Dv BIOCGSTATS (struct bpf_stat)
Returns the following structure of packet statistics:
.Bd -literal -offset indent
struct bpf_stat {
	uint64_t bs_recv;
	uint64_t bs_drop;
	uint64_t bs_capt;
	uint64_t bs_padding[13];
};
.Ed
.Pp
The fields are:
.Bl -tag -width bs_recv -offset indent
.It Va bs_recv
the number of packets received by the descriptor since opened or reset
(including any buffered since the last read call);
.It Va bs_drop
the number of packets which were accepted by the filter but dropped by the
kernel because of buffer overflows
(i.e., the application's reads aren't keeping up with the packet
traffic); and
.It Va bs_capt
the number of packets accepted by the filter.
.El
.It Dv BIOCIMMEDIATE (u_int)
Enable or disable
.Dq immediate mode ,
based on the truth value of the argument.
When immediate mode is enabled, reads return immediately upon packet
reception.
Otherwise, a read will block until either the kernel buffer
becomes full or a timeout occurs.
This is useful for programs like
.Xr rarpd 8 ,
which must respond to messages in real time.
The default for a new file is off.
.It Dv BIOCSETF (struct bpf_program)
Sets the filter program used by the kernel to discard uninteresting
packets.
An array of instructions and its length are passed in using the following structure:
.Bd -literal -offset indent
struct bpf_program {
	u_int bf_len;
	struct bpf_insn *bf_insns;
};
.Ed
.Pp
The filter program is pointed to by the
.Va bf_insns
field while its length in units of
.Sq struct bpf_insn
is given by the
.Va bf_len
field.
Also, the actions of
.Dv BIOCFLUSH
are performed.
.Pp
See section
.Sy FILTER MACHINE
for an explanation of the filter language.
.It Dv BIOCVERSION (struct bpf_version)
Returns the major and minor version numbers of the filter language currently
recognized by the kernel.
Before installing a filter, applications must check
that the current version is compatible with the running kernel.
Version numbers are compatible if the major numbers match and the
application minor is less than or equal to the kernel minor.
The kernel version number is returned in the following structure:
.Bd -literal -offset indent
struct bpf_version {
	u_short bv_major;
	u_short bv_minor;
};
.Ed
.Pp
The current version numbers are given by
.Dv BPF_MAJOR_VERSION
and
.Dv BPF_MINOR_VERSION
from
.In net/bpf.h .
An incompatible filter
may result in undefined behavior (most likely, an error returned by
.Xr ioctl 2
or haphazard packet matching).
.It Dv BIOCGHDRCMPLT BIOCSHDRCMPLT (u_int)
Enable/disable or get the
.Dq header complete
flag status.
If enabled, packets written to the bpf file descriptor will not have
network layer headers rewritten in the interface output routine.
By default, the flag is disabled (value is 0).
.It Dv BIOCGSEESENT BIOCSSEESENT (u_int)
Enable/disable or get the
.Dq see sent
flag status.
If enabled, packets sent by the host (not from
.Nm )
will be passed to the filter.
By default, the flag is enabled (value is 1).
.It Dv BIOCFEEDBACK BIOCSFEEDBACK BIOCGFEEDBACK (u_int)
Set (or get)
.Dq packet feedback mode .
This allows injected packets to be fed back as input to the interface when
output via the interface is successful.
The first name is meant for
.Fx
compatibility, the two others follow the Get/Set convention.
.\"When
.\".Dv BPF_D_INOUT
.\"direction is set, injected
Injected
outgoing packets are not returned by BPF to avoid
duplication.
This flag is initialized to zero by default.
.El
.Sh STANDARD IOCTLS
.Nm
now supports several standard
.Xr ioctl 2 Ns 's
which allow the user to do async and/or non-blocking I/O to an open
.Nm bpf
file descriptor.
.Bl -tag -width indent -offset indent
.It Dv FIONREAD (int)
Returns the number of bytes that are immediately available for reading.
.It Dv SIOCGIFADDR (struct ifreq)
Returns the address associated with the interface.
.It Dv FIONBIO (int)
Set or clear non-blocking I/O.
If arg is non-zero, then doing a
.Xr read 2
when no data is available will return -1 and
.Va errno
will be set to
.Er EAGAIN .
If arg is zero, non-blocking I/O is disabled.
Note: setting this
overrides the timeout set by
.Dv BIOCSRTIMEOUT .
.It Dv FIOASYNC (int)
Enable or disable async I/O.
When enabled (arg is non-zero), the process or process group specified by
.Dv FIOSETOWN
will start receiving SIGIO's when packets
arrive.
Note that you must do an
.Dv FIOSETOWN
in order for this to take effect, as
the system will not default this for you.
The signal may be changed via
.Dv BIOCSRSIG .
.It Dv FIOSETOWN FIOGETOWN (int)
Set or get the process or process group (if negative) that should receive SIGIO
when packets are available.
The signal may be changed using
.Dv BIOCSRSIG
(see above).
.El
.Sh BPF HEADER
The following structure is prepended to each packet returned by
.Xr read 2 :
.Bd -literal -offset indent
struct bpf_hdr {
	struct bpf_timeval bh_tstamp;
	uint32_t bh_caplen;
	uint32_t bh_datalen;
	uint16_t bh_hdrlen;
};
.Ed
.Pp
The fields, whose values are stored in host order, are:
.Bl -tag -width bh_datalen -offset indent
.It Va bh_tstamp
The time at which the packet was processed by the packet filter.
This structure differs from the standard
.Vt struct timeval
in that both members are of type
.Vt long .
.It Va bh_caplen
The length of the captured portion of the packet.
This is the minimum of
the truncation amount specified by the filter and the length of the packet.
.It Va bh_datalen
The length of the packet off the wire.
This value is independent of the truncation amount specified by the filter.
.It Va bh_hdrlen
The length of the BPF header, which may not be equal to
.Em sizeof(struct bpf_hdr) .
.El
.Pp
The
.Va bh_hdrlen
field exists to account for
padding between the header and the link level protocol.
The purpose here is to guarantee proper alignment of the packet
data structures, which is required on alignment sensitive
architectures and improves performance on many other architectures.
The packet filter ensures that the
.Va bpf_hdr
and the
.Em network layer
header will be word aligned.
Suitable precautions must be taken when accessing the link layer
protocol fields on alignment restricted machines.
(This isn't a problem on an Ethernet, since
the type field is a short falling on an even offset,
and the addresses are probably accessed in a bytewise fashion).
.Pp
Additionally, individual packets are padded so that each starts
on a word boundary.
This requires that an application
has some knowledge of how to get from packet to packet.
The macro
.Dv BPF_WORDALIGN
is defined in
.In net/bpf.h
to facilitate this process.
It rounds up its argument
to the nearest word aligned value (where a word is
.Dv BPF_ALIGNMENT
bytes wide).
.Pp
For example, if
.Sq Va p
points to the start of a packet, this expression
will advance it to the next packet:
.Pp
.Dl p = (char *)p + BPF_WORDALIGN(p-\*[Gt]bh_hdrlen + p-\*[Gt]bh_caplen)
.Pp
For the alignment mechanisms to work properly, the
buffer passed to
.Xr read 2
must itself be word aligned.
.Xr malloc 3
will always return an aligned buffer.
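.Pp
The packet-to-packet stepping described above can be sketched as a small loop.
This is only an illustration: demo_hdr, DEMO_ALIGNMENT, DEMO_WORDALIGN,
demo_put and demo_walk are hypothetical stand-ins defined locally so that the
sketch compiles without net/bpf.h, and real code would use struct bpf_hdr and
BPF_WORDALIGN instead.
The sketch copies each header out with memcpy, so it does not itself depend on
the buffer being word aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for BPF_ALIGNMENT and BPF_WORDALIGN. */
#define DEMO_ALIGNMENT sizeof(long)
#define DEMO_WORDALIGN(x) \
	(((size_t)(x) + (DEMO_ALIGNMENT - 1)) & ~(DEMO_ALIGNMENT - 1))

/* Hypothetical stand-in mirroring the layout of struct bpf_hdr. */
struct demo_hdr {
	long ts_sec, ts_usec;	/* plays the role of bh_tstamp */
	uint32_t bh_caplen;
	uint32_t bh_datalen;
	uint16_t bh_hdrlen;
};

/* Bytes needed to read every header field, excluding tail padding. */
#define DEMO_HDR_NEED (offsetof(struct demo_hdr, bh_hdrlen) + sizeof(uint16_t))

/* Test helper: append one record with n payload bytes, return next offset. */
static size_t
demo_put(unsigned char *buf, size_t off, uint32_t n)
{
	struct demo_hdr h;

	memset(&h, 0, sizeof(h));
	h.bh_caplen = h.bh_datalen = n;
	h.bh_hdrlen = (uint16_t)DEMO_WORDALIGN(sizeof(h));
	memcpy(buf + off, &h, sizeof(h));
	return off + DEMO_WORDALIGN((size_t)h.bh_hdrlen + n);
}

/* Count the records in buf[0..len) and sum their captured lengths. */
static size_t
demo_walk(const unsigned char *buf, size_t len, size_t *total)
{
	size_t off = 0, count = 0;

	assert(total != NULL);
	*total = 0;
	while (off + DEMO_HDR_NEED <= len) {
		struct demo_hdr h;

		memset(&h, 0, sizeof(h));
		memcpy(&h, buf + off, DEMO_HDR_NEED);
		/* Stop on a header that cannot be valid or overruns the data. */
		if (h.bh_hdrlen == 0 ||
		    off + h.bh_hdrlen + h.bh_caplen > len)
			break;
		*total += h.bh_caplen;
		count++;
		/* Same step as the BPF_WORDALIGN expression in the text. */
		off += DEMO_WORDALIGN((size_t)h.bh_hdrlen + h.bh_caplen);
	}
	return count;
}
```

In a real reader, len would be the return value of read(2) and the packet data
of each record would start bh_hdrlen bytes after its header.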
.Sh FILTER MACHINE
A filter program is an array of instructions, with all branches forwardly
directed, terminated by a
.Sy return
instruction.
Each instruction performs some action on the pseudo-machine state,
which consists of an accumulator, index register, scratch memory store,
and implicit program counter.
.Pp
The following structure defines the instruction format:
.Bd -literal -offset indent
struct bpf_insn {
	uint16_t code;
	u_char jt;
	u_char jf;
	int32_t k;
};
.Ed
.Pp
The
.Va k
field is used in different ways by different instructions,
and the
.Va jt
and
.Va jf
fields are used as offsets
by the branch instructions.
The opcodes are encoded in a semi-hierarchical fashion.
There are eight classes of instructions: BPF_LD, BPF_LDX, BPF_ST, BPF_STX,
BPF_ALU, BPF_JMP, BPF_RET, and BPF_MISC.
Various other mode and
operator bits are or'd into the class to give the actual instructions.
The classes and modes are defined in
.In net/bpf.h .
.Pp
Below are the semantics for each defined BPF instruction.
We use the convention that A is the accumulator, X is the index register,
P[] packet data, and M[] scratch memory store.
P[i:n] gives the data at byte offset
.Dq i
in the packet,
interpreted as a word (n=4),
unsigned halfword (n=2), or unsigned byte (n=1).
M[i] gives the i'th word in the scratch memory store, which is only
addressed in word units.
The memory store is indexed from 0 to BPF_MEMWORDS-1.
.Va k ,
.Va jt ,
and
.Va jf
are the corresponding fields in the
instruction definition.
.Dq len
refers to the length of the packet.
.Bl -tag -width indent -offset indent
.It Sy BPF_LD
These instructions copy a value into the accumulator.
The type of the source operand is specified by an
.Dq addressing mode
and can be a constant
.Sy ( BPF_IMM ) ,
packet data at a fixed offset
.Sy ( BPF_ABS ) ,
packet data at a variable offset
.Sy ( BPF_IND ) ,
the packet length
.Sy ( BPF_LEN ) ,
or a word in the scratch memory store
.Sy ( BPF_MEM ) .
For
.Sy BPF_IND
and
.Sy BPF_ABS ,
the data size must be specified as a word
.Sy ( BPF_W ) ,
halfword
.Sy ( BPF_H ) ,
or byte
.Sy ( BPF_B ) .
The semantics of all the recognized BPF_LD instructions follow.
.Bl -column "BPF_LD_BPF_W_BPF_ABS" "A \*[Lt]- P[k:4]" -offset indent
.It Sy BPF_LD+BPF_W+BPF_ABS Ta A \*[Lt]- P[k:4]
.It Sy BPF_LD+BPF_H+BPF_ABS Ta A \*[Lt]- P[k:2]
.It Sy BPF_LD+BPF_B+BPF_ABS Ta A \*[Lt]- P[k:1]
.It Sy BPF_LD+BPF_W+BPF_IND Ta A \*[Lt]- P[X+k:4]
.It Sy BPF_LD+BPF_H+BPF_IND Ta A \*[Lt]- P[X+k:2]
.It Sy BPF_LD+BPF_B+BPF_IND Ta A \*[Lt]- P[X+k:1]
.It Sy BPF_LD+BPF_W+BPF_LEN Ta A \*[Lt]- len
.It Sy BPF_LD+BPF_IMM Ta A \*[Lt]- k
.It Sy BPF_LD+BPF_MEM Ta A \*[Lt]- M[k]
.El
.It Sy BPF_LDX
These instructions load a value into the index register.
Note that the addressing modes are more restricted than those of
the accumulator loads, but they include
.Sy BPF_MSH ,
a hack for efficiently loading the IP header length.
.Bl -column "BPF_LDX_BPF_W_BPF_IMM" "X \*[Lt]- k" -offset indent
.It Sy BPF_LDX+BPF_W+BPF_IMM Ta X \*[Lt]- k
.It Sy BPF_LDX+BPF_W+BPF_MEM Ta X \*[Lt]- M[k]
.It Sy BPF_LDX+BPF_W+BPF_LEN Ta X \*[Lt]- len
.It Sy BPF_LDX+BPF_B+BPF_MSH Ta X \*[Lt]- 4*(P[k:1]\*[Am]0xf)
.El
.It Sy BPF_ST
This instruction stores the accumulator into the scratch memory.
We do not need an addressing mode since there is only one possibility
for the destination.
.Bl -column "BPF_ST" "M[k] \*[Lt]- A" -offset indent
.It Sy BPF_ST Ta M[k] \*[Lt]- A
.El
.It Sy BPF_STX
This instruction stores the index register in the scratch memory store.
.Bl -column "BPF_STX" "M[k] \*[Lt]- X" -offset indent
.It Sy BPF_STX Ta M[k] \*[Lt]- X
.El
.It Sy BPF_ALU
The alu instructions perform operations between the accumulator and
index register or constant, and store the result back in the accumulator.
For binary operations, a source mode is required
.Sy ( BPF_K
or
.Sy BPF_X ) .
.Bl -column "BPF_ALU_BPF_ADD_BPF_K" "A \*[Lt]- A + k" -offset indent
.It Sy BPF_ALU+BPF_ADD+BPF_K Ta A \*[Lt]- A + k
.It Sy BPF_ALU+BPF_SUB+BPF_K Ta A \*[Lt]- A - k
.It Sy BPF_ALU+BPF_MUL+BPF_K Ta A \*[Lt]- A * k
.It Sy BPF_ALU+BPF_DIV+BPF_K Ta A \*[Lt]- A / k
.It Sy BPF_ALU+BPF_AND+BPF_K Ta A \*[Lt]- A \*[Am] k
.It Sy BPF_ALU+BPF_OR+BPF_K Ta A \*[Lt]- A | k
.It Sy BPF_ALU+BPF_LSH+BPF_K Ta A \*[Lt]- A \*[Lt]\*[Lt] k
.It Sy BPF_ALU+BPF_RSH+BPF_K Ta A \*[Lt]- A \*[Gt]\*[Gt] k
.It Sy BPF_ALU+BPF_ADD+BPF_X Ta A \*[Lt]- A + X
.It Sy BPF_ALU+BPF_SUB+BPF_X Ta A \*[Lt]- A - X
.It Sy BPF_ALU+BPF_MUL+BPF_X Ta A \*[Lt]- A * X
.It Sy BPF_ALU+BPF_DIV+BPF_X Ta A \*[Lt]- A / X
.It Sy BPF_ALU+BPF_AND+BPF_X Ta A \*[Lt]- A \*[Am] X
.It Sy BPF_ALU+BPF_OR+BPF_X Ta A \*[Lt]- A | X
.It Sy BPF_ALU+BPF_LSH+BPF_X Ta A \*[Lt]- A \*[Lt]\*[Lt] X
.It Sy BPF_ALU+BPF_RSH+BPF_X Ta A \*[Lt]- A \*[Gt]\*[Gt] X
.It Sy BPF_ALU+BPF_NEG Ta A \*[Lt]- -A
.El
.It Sy BPF_JMP
The jump instructions alter flow of control.
Conditional jumps compare the accumulator against a constant
.Sy ( BPF_K )
or the index register
.Sy ( BPF_X ) .
If the result is true (or non-zero),
the true branch is taken, otherwise the false branch is taken.
Jump offsets are encoded in 8 bits so the longest jump is 256 instructions.
However, the jump always
.Sy ( BPF_JA )
opcode uses the 32 bit
.Va k
field as the offset, allowing arbitrarily distant destinations.
All conditionals use unsigned comparison conventions.
.Bl -column "BPF_JMP+BPF_JGE+BPF_K" "pc += (A \*[Ge] k) ? jt : jf" -offset indent
.It Sy BPF_JMP+BPF_JA Ta pc += k
.It Sy BPF_JMP+BPF_JGT+BPF_K Ta "pc += (A \*[Gt] k) ? jt : jf"
.It Sy BPF_JMP+BPF_JGE+BPF_K Ta "pc += (A \*[Ge] k) ? jt : jf"
.It Sy BPF_JMP+BPF_JEQ+BPF_K Ta "pc += (A == k) ? jt : jf"
.It Sy BPF_JMP+BPF_JSET+BPF_K Ta "pc += (A \*[Am] k) ? jt : jf"
.It Sy BPF_JMP+BPF_JGT+BPF_X Ta "pc += (A \*[Gt] X) ? jt : jf"
.It Sy BPF_JMP+BPF_JGE+BPF_X Ta "pc += (A \*[Ge] X) ? jt : jf"
.It Sy BPF_JMP+BPF_JEQ+BPF_X Ta "pc += (A == X) ? jt : jf"
.It Sy BPF_JMP+BPF_JSET+BPF_X Ta "pc += (A \*[Am] X) ? jt : jf"
.El
.It Sy BPF_RET
The return instructions terminate the filter program and specify the amount
of packet to accept (i.e., they return the truncation amount).
A return value of zero indicates that the packet should be ignored.
The return value is either a constant
.Sy ( BPF_K )
or the accumulator
.Sy ( BPF_A ) .
.Bl -column "BPF_RET+BPF_A" "accept A bytes" -offset indent
.It Sy BPF_RET+BPF_A Ta accept A bytes
.It Sy BPF_RET+BPF_K Ta accept k bytes
.El
.It Sy BPF_MISC
The miscellaneous category was created for anything that doesn't
fit into the above classes, and for any new instructions that might need to
be added.
Currently, these are the register transfer instructions
that copy the index register to the accumulator or vice versa.
.Bl -column "BPF_MISC+BPF_TAX" "X \*[Lt]- A" -offset indent
.It Sy BPF_MISC+BPF_TAX Ta X \*[Lt]- A
.It Sy BPF_MISC+BPF_TXA Ta A \*[Lt]- X
.El
.El
.Pp
The BPF interface provides the following macros to facilitate
array initializers:
.Bd -unfilled -offset indent
.Sy BPF_STMT No (opcode, operand)
.Sy BPF_JUMP No (opcode, operand, true_offset, false_offset)
.Ed
.Sh SYSCTLS
The following sysctls are available when
.Nm
is enabled:
.Pp
.Bl -tag -width "XnetXbpfXmaxbufsizeXX"
.It Li net.bpf.maxbufsize
Sets the maximum buffer size available for
.Nm
peers.
.It Li net.bpf.stats
Shows
.Nm
statistics.
They can be retrieved with the
.Xr netstat 1
utility.
.It Li net.bpf.peers
Shows the current
.Nm
peers.
This is only available to the super user and can also be retrieved with the
.Xr netstat 1
utility.
.El
.Sh FILES
.Pa /dev/bpf
.Sh EXAMPLES
The following filter is taken from the Reverse ARP Daemon.
It accepts only Reverse ARP requests.
.Bd -literal -offset indent
struct bpf_insn insns[] = {
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 12),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, ETHERTYPE_REVARP, 0, 3),
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 20),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, REVARP_REQUEST, 0, 1),
	BPF_STMT(BPF_RET+BPF_K, sizeof(struct ether_arp) +
	    sizeof(struct ether_header)),
	BPF_STMT(BPF_RET+BPF_K, 0),
};
.Ed
.Pp
This filter accepts only IP packets between host 128.3.112.15 and
128.3.112.35.
.Bd -literal -offset indent
struct bpf_insn insns[] = {
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 12),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, ETHERTYPE_IP, 0, 8),
	BPF_STMT(BPF_LD+BPF_W+BPF_ABS, 26),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x8003700f, 0, 2),
	BPF_STMT(BPF_LD+BPF_W+BPF_ABS, 30),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x80037023, 3, 4),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x80037023, 0, 3),
	BPF_STMT(BPF_LD+BPF_W+BPF_ABS, 30),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x8003700f, 0, 1),
	BPF_STMT(BPF_RET+BPF_K, (u_int)-1),
	BPF_STMT(BPF_RET+BPF_K, 0),
};
.Ed
.Pp
Finally, this filter returns only TCP finger packets.
We must parse the IP header to reach the TCP header.
The
.Sy BPF_JSET
instruction checks that the IP fragment offset is 0 so we are sure
that we have a TCP header.
.Bd -literal -offset indent
struct bpf_insn insns[] = {
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 12),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, ETHERTYPE_IP, 0, 10),
	BPF_STMT(BPF_LD+BPF_B+BPF_ABS, 23),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, IPPROTO_TCP, 0, 8),
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 20),
	BPF_JUMP(BPF_JMP+BPF_JSET+BPF_K, 0x1fff, 6, 0),
	BPF_STMT(BPF_LDX+BPF_B+BPF_MSH, 14),
	BPF_STMT(BPF_LD+BPF_H+BPF_IND, 14),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 79, 2, 0),
	BPF_STMT(BPF_LD+BPF_H+BPF_IND, 16),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 79, 0, 1),
	BPF_STMT(BPF_RET+BPF_K, (u_int)-1),
	BPF_STMT(BPF_RET+BPF_K, 0),
};
.Ed
.Sh SEE ALSO
.Xr ioctl 2 ,
.Xr read 2 ,
.Xr select 2 ,
.Xr signal 3 ,
.Xr tcpdump 8
.Rs
.%T "The BSD Packet Filter: A New Architecture for User-level Packet Capture"
.%A S. McCanne
.%A V. Jacobson
.%J Proceedings of the 1993 Winter USENIX
.%C Technical Conference, San Diego, CA
.Re
.Sh HISTORY
The Enet packet filter was created in 1980 by Mike Accetta and
Rick Rashid at Carnegie-Mellon University.
Jeffrey Mogul, at Stanford, ported the code to BSD and continued
its development from 1983 on.
Since then, it has evolved into the ULTRIX Packet Filter
at DEC, a STREAMS NIT module under SunOS 4.1, and BPF.
.Sh AUTHORS
Steven McCanne, of Lawrence Berkeley Laboratory, implemented BPF in
Summer 1990.
The design was in collaboration with Van Jacobson,
also of Lawrence Berkeley Laboratory.
.Sh BUGS
The read buffer must be of a fixed size (returned by the
.Dv BIOCGBLEN
ioctl).
.Pp
A file that does not request promiscuous mode may receive promiscuously
received packets as a side effect of another file requesting this
mode on the same hardware interface.
This could be fixed in the kernel with additional processing overhead.
However, we favor the model where
all files must assume that the interface is promiscuous, and if
so desired, must use a filter to reject foreign packets.
.Pp
Data link protocols with variable length headers are not currently supported.
.Pp
Under SunOS, if a BPF application reads more than 2^31 bytes of
data, read will fail with
.Er EINVAL .
You can either fix the bug in SunOS,
or lseek to 0 when read fails for this reason.
.Pp
.Dq Immediate mode
and the
.Dq read timeout
are misguided features.
This functionality can be emulated with non-blocking mode and
.Xr select 2 .
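.Pp
As a rough sketch of that emulation, the helper below waits for a descriptor to
become readable with select(2) and then performs a single non-blocking read.
The name demo_timed_read is hypothetical, and the sketch sets non-blocking mode
with the generic fcntl(2) O_NONBLOCK flag rather than the FIONBIO ioctl, which
is an assumption made so that it works on any descriptor.
On a bpf descriptor the buffer would still have to be the size returned by
BIOCGBLEN.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: wait up to timeout_ms milliseconds for fd to
 * become readable, then read once without blocking.
 * Returns the number of bytes read, 0 on timeout (or end of file),
 * or -1 on error with errno set.
 */
static ssize_t
demo_timed_read(int fd, void *buf, size_t len, int timeout_ms)
{
	fd_set rfds;
	struct timeval tv;
	ssize_t n;
	int flags, ready;

	/* Put the descriptor into non-blocking mode. */
	flags = fcntl(fd, F_GETFL, 0);
	if (flags == -1 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1)
		return -1;

	FD_ZERO(&rfds);
	FD_SET(fd, &rfds);
	tv.tv_sec = timeout_ms / 1000;
	tv.tv_usec = (timeout_ms % 1000) * 1000;

	/* select() plays the role of the read timeout. */
	ready = select(fd + 1, &rfds, NULL, NULL, &tv);
	if (ready <= 0)
		return ready;	/* 0: timed out, -1: error */

	n = read(fd, buf, len);
	if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
		return 0;	/* readiness was lost before the read */
	return n;
}
```

Calling the helper with a small timeout approximates immediate mode, while a
larger one approximates the behavior of BIOCSRTIMEOUT.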