.\" -*- nroff -*-
.\"
.\" $NetBSD: bpf.4,v 1.55 2014/07/24 21:21:55 wiz Exp $
.\"
.\" Copyright (c) 1990, 1991, 1992, 1993, 1994
.\" The Regents of the University of California. All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that: (1) source code distributions
.\" retain the above copyright notice and this paragraph in its entirety, (2)
.\" distributions including binary code include the above copyright notice and
.\" this paragraph in its entirety in the documentation or other materials
.\" provided with the distribution, and (3) all advertising materials mentioning
.\" features or use of this software display the following acknowledgement:
.\" ``This product includes software developed by the University of California,
.\" Lawrence Berkeley Laboratory and its contributors.'' Neither the name of
.\" the University nor the names of its contributors may be used to endorse
.\" or promote products derived from this software without specific prior
.\" written permission.
.\" THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR IMPLIED
.\" WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
.\" MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
.\"
.\" This document is derived in part from the enet man page (enet.4)
.\" distributed with 4.3BSD Unix.
.\"
.Dd July 24, 2014
.Dt BPF 4
.Os
.Sh NAME
.Nm bpf
.Nd Berkeley Packet Filter raw network interface
.Sh SYNOPSIS
.Cd "pseudo-device bpfilter"
.Sh DESCRIPTION
The Berkeley Packet Filter
provides a raw interface to data link layers in a protocol
independent fashion.
All packets on the network, even those destined for other hosts,
are accessible through this mechanism.
.Pp
The packet filter appears as a character special device,
.Pa /dev/bpf .
After opening the device, the file descriptor must be bound to a
specific network interface with the
.Dv BIOCSETIF
ioctl.
A given interface can be shared by multiple listeners, and the filter
underlying each descriptor will see an identical packet stream.
.Pp
Associated with each open instance of a
.Nm
file is a user-settable packet filter.
Whenever a packet is received by an interface,
all file descriptors listening on that interface apply their filter.
Each descriptor that accepts the packet receives its own copy.
.Pp
Reads from these files return the next group of packets
that have matched the filter.
To improve performance, the buffer passed to read must be
the same size as the buffers used internally by
.Nm .
This size is returned by the
.Dv BIOCGBLEN
ioctl (see below), and can be set with
.Dv BIOCSBLEN .
Note that an individual packet larger than this size is necessarily
truncated.
.Pp
Since packet data is in network byte order, applications should use the
.Xr byteorder 3
macros to extract multi-byte values.
.Pp
A packet can be sent out on the network by writing to a
.Nm
file descriptor.
The writes are unbuffered, meaning only one packet can be processed per write.
Currently, only writes to Ethernets and SLIP links are supported.
.Sh IOCTLS
The
.Xr ioctl 2
command codes below are defined in
.In net/bpf.h .
All commands require these includes:
.Bd -literal -offset indent
#include \*[Lt]sys/types.h\*[Gt]
#include \*[Lt]sys/time.h\*[Gt]
#include \*[Lt]sys/ioctl.h\*[Gt]
#include \*[Lt]net/bpf.h\*[Gt]
.Ed
.Pp
Additionally,
.Dv BIOCGETIF
and
.Dv BIOCSETIF
require
.Pa \*[Lt]net/if.h\*[Gt] .
.Pp
The (third) argument to the
.Xr ioctl 2
should be a pointer to the type indicated.
.Bl -tag -width indent -offset indent
.It Dv "BIOCGBLEN (u_int)"
Returns the required buffer length for reads on
.Nm
files.
.It Dv "BIOCSBLEN (u_int)"
Sets the buffer length for reads on
.Nm
files.
The buffer must be set before the file is attached to an interface with
.Dv BIOCSETIF .
If the requested buffer size cannot be accommodated, the closest
allowable size will be set and returned in the argument.
A read call will result in
.Er EINVAL
if it is passed a buffer that is not this size.
.It Dv BIOCGDLT (u_int)
Returns the type of the data link layer underlying the attached interface.
.Er EINVAL
is returned if no interface has been specified.
The device types, prefixed with
.Dq DLT_ ,
are defined in
.In net/bpf.h .
.It Dv BIOCGDLTLIST (struct bpf_dltlist)
Returns an array of the available types of the data link layer
underlying the attached interface:
.Bd -literal -offset indent
struct bpf_dltlist {
	u_int bfl_len;
	u_int *bfl_list;
};
.Ed
.Pp
The available types are returned in the array pointed to by the
.Va bfl_list
field while their length in u_int is supplied to the
.Va bfl_len
field.
.Er ENOMEM
is returned if there is not enough buffer space and
.Er EFAULT
is returned if a bad address is encountered.
The
.Va bfl_len
field is modified on return to indicate the actual length in u_int
of the array returned.
If
.Va bfl_list
is
.Dv NULL ,
the
.Va bfl_len
field is set to indicate the required length of an array in u_int.
.It Dv BIOCSDLT (u_int)
Changes the type of the data link layer underlying the attached interface.
.Er EINVAL
is returned if no interface has been specified or the specified
type is not available for the interface.
.It Dv BIOCPROMISC
Forces the interface into promiscuous mode.
All packets, not just those destined for the local host, are processed.
Since more than one file can be listening on a given interface,
a listener that opened its interface non-promiscuously may receive
packets promiscuously.
This problem can be remedied with an appropriate filter.
.Pp
The interface remains in promiscuous mode until all files listening
promiscuously are closed.
.It Dv BIOCFLUSH
Flushes the buffer of incoming packets,
and resets the statistics that are returned by
.Dv BIOCGSTATS .
.It Dv BIOCGETIF (struct ifreq)
Returns the name of the hardware interface that the file is listening on.
The name is returned in the ifr_name field of
.Fa ifr .
All other fields are undefined.
.It Dv BIOCSETIF (struct ifreq)
Sets the hardware interface associated with the file.
This command must be performed before any packets can be read.
The device is indicated by name using the
.Dv ifr_name
field of the
.Fa ifreq .
Additionally, performs the actions of
.Dv BIOCFLUSH .
.It Dv BIOCSRTIMEOUT, BIOCGRTIMEOUT (struct timeval)
Sets or gets the read timeout parameter.
The
.Fa timeval
specifies the length of time to wait before timing
out on a read request.
This parameter is initialized to zero by
.Xr open 2 ,
indicating no timeout.
.It Dv BIOCGSTATS (struct bpf_stat)
Returns the following structure of packet statistics:
.Bd -literal -offset indent
struct bpf_stat {
	uint64_t bs_recv;
	uint64_t bs_drop;
	uint64_t bs_capt;
	uint64_t bs_padding[13];
};
.Ed
.Pp
The fields are:
.Bl -tag -width bs_recv -offset indent
.It Va bs_recv
the number of packets received by the descriptor since opened or reset
(including any buffered since the last read call);
.It Va bs_drop
the number of packets which were accepted by the filter but dropped by the
kernel because of buffer overflows
(i.e., the application's reads aren't keeping up with the packet
traffic); and
.It Va bs_capt
the number of packets accepted by the filter.
.El
.It Dv BIOCIMMEDIATE (u_int)
Enables or disables
.Dq immediate mode ,
based on the truth value of the argument.
When immediate mode is enabled, reads return immediately upon packet
reception.
Otherwise, a read will block until either the kernel buffer
becomes full or a timeout occurs.
This is useful for programs like
.Xr rarpd 8 ,
which must respond to messages in real time.
The default for a new file is off.
.It Dv BIOCSETF (struct bpf_program)
Sets the filter program used by the kernel to discard uninteresting
packets.
An array of instructions and its length are passed in using the following structure:
.Bd -literal -offset indent
struct bpf_program {
	u_int bf_len;
	struct bpf_insn *bf_insns;
};
.Ed
.Pp
The filter program is pointed to by the
.Va bf_insns
field while its length in units of
.Sq struct bpf_insn
is given by the
.Va bf_len
field.
Also, the actions of
.Dv BIOCFLUSH
are performed.
.Pp
See section
.Sy FILTER MACHINE
for an explanation of the filter language.
.It Dv BIOCVERSION (struct bpf_version)
Returns the major and minor version numbers of the filter language currently
recognized by the kernel.
Before installing a filter, applications must check
that the current version is compatible with the running kernel.
Version numbers are compatible if the major numbers match and the
application minor is less than or equal to the kernel minor.
The kernel version number is returned in the following structure:
.Bd -literal -offset indent
struct bpf_version {
	u_short bv_major;
	u_short bv_minor;
};
.Ed
.Pp
The current version numbers are given by
.Dv BPF_MAJOR_VERSION
and
.Dv BPF_MINOR_VERSION
from
.In net/bpf.h .
An incompatible filter
may result in undefined behavior (most likely, an error returned by
.Xr ioctl 2
or haphazard packet matching).
.It Dv BIOCSRSIG BIOCGRSIG (u_int)
Sets or gets the receive signal.
This signal will be sent to the process or process group specified by
.Dv FIOSETOWN .
It defaults to
.Dv SIGIO .
.It Dv BIOCGHDRCMPLT BIOCSHDRCMPLT (u_int)
Sets or gets the status of the
.Dq header complete
flag.
Set to zero if the link level source address should be filled in
automatically by the interface output routine.
Set to one if the link level source address will be written,
as provided, to the wire.
This flag is initialized to zero by default.
.It Dv BIOCGSEESENT BIOCSSEESENT (u_int)
Enable/disable or get the
.Dq see sent
flag status.
If enabled, packets sent by the host (not from
.Nm )
will be passed to the filter.
By default, the flag is enabled (value is 1).
.It Dv BIOCFEEDBACK BIOCSFEEDBACK BIOCGFEEDBACK (u_int)
Set (or get)
.Dq packet feedback mode .
This allows injected packets to be fed back as input to the interface when
output via the interface is successful.
The first name is meant for
.Fx
compatibility, the two others follow the Get/Set convention.
.\"When
.\".Dv BPF_D_INOUT
.\"direction is set, injected
Injected
outgoing packets are not returned by BPF to avoid
duplication.
This flag is initialized to zero by default.
.El
.Sh STANDARD IOCTLS
.Nm
now supports several standard
.Xr ioctl 2 Ns 's
which allow the user to do async and/or non-blocking I/O to an open
.Nm bpf
file descriptor.
.Bl -tag -width indent -offset indent
.It Dv FIONREAD (int)
Returns the number of bytes that are immediately available for reading.
.It Dv FIONBIO (int)
Set or clear non-blocking I/O.
If arg is non-zero, then doing a
.Xr read 2
when no data is available will return -1 and
.Va errno
will be set to
.Er EAGAIN .
If arg is zero, non-blocking I/O is disabled.
Note: setting this
overrides the timeout set by
.Dv BIOCSRTIMEOUT .
.It Dv FIOASYNC (int)
Enable or disable async I/O.
When enabled (arg is non-zero), the process or process group specified by
.Dv FIOSETOWN
will start receiving SIGIO's when packets
arrive.
Note that you must do an
.Dv FIOSETOWN
in order for this to take effect, as
the system will not default this for you.
The signal may be changed via
.Dv BIOCSRSIG .
.It Dv FIOSETOWN FIOGETOWN (int)
Set or get the process or process group (if negative) that should receive SIGIO
when packets are available.
The signal may be changed using
.Dv BIOCSRSIG
(see above).
.El
.Sh BPF HEADER
The following structure is prepended to each packet returned by
.Xr read 2 :
.Bd -literal -offset indent
struct bpf_hdr {
	struct bpf_timeval bh_tstamp;
	uint32_t bh_caplen;
	uint32_t bh_datalen;
	uint16_t bh_hdrlen;
};
.Ed
.Pp
The fields, whose values are stored in host order, are:
.Bl -tag -width bh_datalen -offset indent
.It Va bh_tstamp
The time at which the packet was processed by the packet filter.
This structure differs from the standard
.Vt struct timeval
in that both members are of type
.Vt long .
.It Va bh_caplen
The length of the captured portion of the packet.
This is the minimum of
the truncation amount specified by the filter and the length of the packet.
.It Va bh_datalen
The length of the packet off the wire.
This value is independent of the truncation amount specified by the filter.
.It Va bh_hdrlen
The length of the BPF header, which may not be equal to
.Em sizeof(struct bpf_hdr) .
.El
.Pp
The
.Va bh_hdrlen
field exists to account for
padding between the header and the link level protocol.
The purpose here is to guarantee proper alignment of the packet
data structures, which is required on alignment sensitive
architectures and improves performance on many other architectures.
The packet filter ensures that the
.Va bpf_hdr
and the
.Em network layer
header will be word aligned.
Suitable precautions must be taken when accessing the link layer
protocol fields on alignment restricted machines.
(This isn't a problem on an Ethernet, since
the type field is a short falling on an even offset,
and the addresses are probably accessed in a bytewise fashion).
.Pp
Additionally, individual packets are padded so that each starts
on a word boundary.
This requires that an application
has some knowledge of how to get from packet to packet.
The macro
.Dv BPF_WORDALIGN
is defined in
.In net/bpf.h
to facilitate this process.
It rounds up its argument
to the nearest word aligned value (where a word is
.Dv BPF_ALIGNMENT
bytes wide).
.Pp
For example, if
.Sq Va p
points to the start of a packet, this expression
will advance it to the next packet:
.Pp
.Dl p = (char *)p + BPF_WORDALIGN(p-\*[Gt]bh_hdrlen + p-\*[Gt]bh_caplen)
.Pp
For the alignment mechanisms to work properly, the
buffer passed to
.Xr read 2
must itself be word aligned.
.Xr malloc 3
will always return an aligned buffer.
.Sh FILTER MACHINE
A filter program is an array of instructions, with all branches forwardly
directed, terminated by a
.Sy return
instruction.
Each instruction performs some action on the pseudo-machine state,
which consists of an accumulator, index register, scratch memory store,
and implicit program counter.
.Pp
The following structure defines the instruction format:
.Bd -literal -offset indent
struct bpf_insn {
	uint16_t code;
	u_char jt;
	u_char jf;
	uint32_t k;
};
.Ed
.Pp
The
.Va k
field is used in different ways by different instructions,
and the
.Va jt
and
.Va jf
fields are used as offsets
by the branch instructions.
The opcodes are encoded in a semi-hierarchical fashion.
There are eight classes of instructions: BPF_LD, BPF_LDX, BPF_ST, BPF_STX,
BPF_ALU, BPF_JMP, BPF_RET, and BPF_MISC.
Various other mode and
operator bits are or'd into the class to give the actual instructions.
The classes and modes are defined in
.In net/bpf.h .
.Pp
Below are the semantics for each defined BPF instruction.
We use the convention that A is the accumulator, X is the index register,
P[] packet data, and M[] scratch memory store.
P[i:n] gives the data at byte offset
.Dq i
in the packet,
interpreted as a word (n=4),
unsigned halfword (n=2), or unsigned byte (n=1).
M[i] gives the i'th word in the scratch memory store, which is only
addressed in word units.
The memory store is indexed from 0 to BPF_MEMWORDS-1.
.Va k ,
.Va jt ,
and
.Va jf
are the corresponding fields in the
instruction definition.
.Dq len
refers to the length of the packet.
.Bl -tag -width indent -offset indent
.It Sy BPF_LD
These instructions copy a value into the accumulator.
The type of the source operand is specified by an
.Dq addressing mode
and can be a constant
.Sy ( BPF_IMM ) ,
packet data at a fixed offset
.Sy ( BPF_ABS ) ,
packet data at a variable offset
.Sy ( BPF_IND ) ,
the packet length
.Sy ( BPF_LEN ) ,
or a word in the scratch memory store
.Sy ( BPF_MEM ) .
For
.Sy BPF_IND
and
.Sy BPF_ABS ,
the data size must be specified as a word
.Sy ( BPF_W ) ,
halfword
.Sy ( BPF_H ) ,
or byte
.Sy ( BPF_B ) .
Arithmetic overflow when calculating a variable offset terminates
the filter program and the packet is ignored.
The semantics of all the recognized BPF_LD instructions follow.
.Bl -column "BPF_LD_BPF_W_BPF_ABS" "A \*[Lt]- P[k:4]" -offset indent
.It Sy BPF_LD+BPF_W+BPF_ABS Ta A \*[Lt]- P[k:4]
.It Sy BPF_LD+BPF_H+BPF_ABS Ta A \*[Lt]- P[k:2]
.It Sy BPF_LD+BPF_B+BPF_ABS Ta A \*[Lt]- P[k:1]
.It Sy BPF_LD+BPF_W+BPF_IND Ta A \*[Lt]- P[X+k:4]
.It Sy BPF_LD+BPF_H+BPF_IND Ta A \*[Lt]- P[X+k:2]
.It Sy BPF_LD+BPF_B+BPF_IND Ta A \*[Lt]- P[X+k:1]
.It Sy BPF_LD+BPF_W+BPF_LEN Ta A \*[Lt]- len
.It Sy BPF_LD+BPF_IMM Ta A \*[Lt]- k
.It Sy BPF_LD+BPF_MEM Ta A \*[Lt]- M[k]
.El
.It Sy BPF_LDX
These instructions load a value into the index register.
Note that the addressing modes are more restricted than those of
the accumulator loads, but they include
.Sy BPF_MSH ,
a hack for efficiently loading the IP header length.
.Bl -column "BPF_LDX_BPF_W_BPF_IMM" "X \*[Lt]- k" -offset indent
.It Sy BPF_LDX+BPF_W+BPF_IMM Ta X \*[Lt]- k
.It Sy BPF_LDX+BPF_W+BPF_MEM Ta X \*[Lt]- M[k]
.It Sy BPF_LDX+BPF_W+BPF_LEN Ta X \*[Lt]- len
.It Sy BPF_LDX+BPF_B+BPF_MSH Ta X \*[Lt]- 4*(P[k:1]\*[Am]0xf)
.El
.It Sy BPF_ST
This instruction stores the accumulator into the scratch memory.
We do not need an addressing mode since there is only one possibility
for the destination.
.Bl -column "BPF_ST" "M[k] \*[Lt]- A" -offset indent
.It Sy BPF_ST Ta M[k] \*[Lt]- A
.El
.It Sy BPF_STX
This instruction stores the index register in the scratch memory store.
.Bl -column "BPF_STX" "M[k] \*[Lt]- X" -offset indent
.It Sy BPF_STX Ta M[k] \*[Lt]- X
.El
.It Sy BPF_ALU
The alu instructions perform operations between the accumulator and
index register or constant, and store the result back in the accumulator.
For binary operations, a source mode is required
.Sy ( BPF_K
or
.Sy BPF_X ) .
.Bl -column "BPF_ALU_BPF_ADD_BPF_K" "A \*[Lt]- A + k" -offset indent
.It Sy BPF_ALU+BPF_ADD+BPF_K Ta A \*[Lt]- A + k
.It Sy BPF_ALU+BPF_SUB+BPF_K Ta A \*[Lt]- A - k
.It Sy BPF_ALU+BPF_MUL+BPF_K Ta A \*[Lt]- A * k
.It Sy BPF_ALU+BPF_DIV+BPF_K Ta A \*[Lt]- A / k
.It Sy BPF_ALU+BPF_AND+BPF_K Ta A \*[Lt]- A \*[Am] k
.It Sy BPF_ALU+BPF_OR+BPF_K Ta A \*[Lt]- A | k
.It Sy BPF_ALU+BPF_LSH+BPF_K Ta A \*[Lt]- A \*[Lt]\*[Lt] k
.It Sy BPF_ALU+BPF_RSH+BPF_K Ta A \*[Lt]- A \*[Gt]\*[Gt] k
.It Sy BPF_ALU+BPF_ADD+BPF_X Ta A \*[Lt]- A + X
.It Sy BPF_ALU+BPF_SUB+BPF_X Ta A \*[Lt]- A - X
.It Sy BPF_ALU+BPF_MUL+BPF_X Ta A \*[Lt]- A * X
.It Sy BPF_ALU+BPF_DIV+BPF_X Ta A \*[Lt]- A / X
.It Sy BPF_ALU+BPF_AND+BPF_X Ta A \*[Lt]- A \*[Am] X
.It Sy BPF_ALU+BPF_OR+BPF_X Ta A \*[Lt]- A | X
.It Sy BPF_ALU+BPF_LSH+BPF_X Ta A \*[Lt]- A \*[Lt]\*[Lt] X
.It Sy BPF_ALU+BPF_RSH+BPF_X Ta A \*[Lt]- A \*[Gt]\*[Gt] X
.It Sy BPF_ALU+BPF_NEG Ta A \*[Lt]- -A
.El
.It Sy BPF_JMP
The jump instructions alter flow of control.
Conditional jumps compare the accumulator against a constant
.Sy ( BPF_K )
or the index register
.Sy ( BPF_X ) .
If the result is true (or non-zero),
the true branch is taken, otherwise the false branch is taken.
Jump offsets are encoded in 8 bits so the longest jump is 256 instructions.
However, the jump always
.Sy ( BPF_JA )
opcode uses the 32 bit
.Va k
field as the offset, allowing arbitrarily distant destinations.
All conditionals use unsigned comparison conventions.
.Bl -column "BPF_JMP+BPF_JGE+BPF_K" "pc += (A \*[Ge] k) ? jt : jf" -offset indent
.It Sy BPF_JMP+BPF_JA Ta pc += k
.It Sy BPF_JMP+BPF_JGT+BPF_K Ta "pc += (A \*[Gt] k) ? jt : jf"
.It Sy BPF_JMP+BPF_JGE+BPF_K Ta "pc += (A \*[Ge] k) ? jt : jf"
.It Sy BPF_JMP+BPF_JEQ+BPF_K Ta "pc += (A == k) ? jt : jf"
.It Sy BPF_JMP+BPF_JSET+BPF_K Ta "pc += (A \*[Am] k) ? jt : jf"
.It Sy BPF_JMP+BPF_JGT+BPF_X Ta "pc += (A \*[Gt] X) ? jt : jf"
.It Sy BPF_JMP+BPF_JGE+BPF_X Ta "pc += (A \*[Ge] X) ? jt : jf"
.It Sy BPF_JMP+BPF_JEQ+BPF_X Ta "pc += (A == X) ? jt : jf"
.It Sy BPF_JMP+BPF_JSET+BPF_X Ta "pc += (A \*[Am] X) ? jt : jf"
.El
.It Sy BPF_RET
The return instructions terminate the filter program and specify the amount
of packet to accept (i.e., they return the truncation amount).
A return value of zero indicates that the packet should be ignored.
The return value is either a constant
.Sy ( BPF_K )
or the accumulator
.Sy ( BPF_A ) .
.Bl -column "BPF_RET+BPF_A" "accept A bytes" -offset indent
.It Sy BPF_RET+BPF_A Ta accept A bytes
.It Sy BPF_RET+BPF_K Ta accept k bytes
.El
.It Sy BPF_MISC
The miscellaneous category was created for anything that doesn't
fit into the above classes, and for any new instructions that might need to
be added.
Currently, these are the register transfer instructions
that copy the index register to the accumulator or vice versa.
.Bl -column "BPF_MISC+BPF_TAX" "X \*[Lt]- A" -offset indent
.It Sy BPF_MISC+BPF_TAX Ta X \*[Lt]- A
.It Sy BPF_MISC+BPF_TXA Ta A \*[Lt]- X
.El
.Pp
There are also two instructions that call a "coprocessor",
if one has been initialized by the kernel component.
There is no coprocessor by default.
.Bl -column "BPF_MISC+BPF_COP" "A \*[Lt]- funcs[X](...)" -offset indent
.It Sy BPF_MISC+BPF_COP Ta A \*[Lt]- funcs[k](..)
.It Sy BPF_MISC+BPF_COPX Ta A \*[Lt]- funcs[X](..)
.El
.Pp
If the coprocessor is not set or the function index is out of range, these
instructions will abort the program and return zero.
.El
.Pp
The BPF interface provides the following macros to facilitate
array initializers:
.Bd -unfilled -offset indent
.Sy BPF_STMT No (opcode, operand)
.Sy BPF_JUMP No (opcode, operand, true_offset, false_offset)
.Ed
.Sh SYSCTLS
The following sysctls are available when
.Nm
is enabled:
.Pp
.Bl -tag -width "XnetXbpfXmaxbufsizeXX"
.It Li net.bpf.maxbufsize
Sets the maximum buffer size available for
.Nm
peers.
.It Li net.bpf.stats
Shows
.Nm
statistics.
They can be retrieved with the
.Xr netstat 1
utility.
.It Li net.bpf.peers
Shows the current
.Nm
peers.
This is only available to the super user and can also be retrieved with the
.Xr netstat 1
utility.
.El
.Pp
On architectures with
.Xr bpfjit 4
support, an additional sysctl is available:
.Pp
.Bl -tag -width "XnetXbpfXjitXX"
.It Li net.bpf.jit
Toggle
.Sy Just-In-Time
compilation of new filter programs.
In order to enable Just-In-Time compilation,
the bpfjit kernel module must be loaded.
Changing the value of this sysctl doesn't affect
existing filter programs.
.El
.Sh FILES
.Pa /dev/bpf
.Sh EXAMPLES
The following filter is taken from the Reverse ARP Daemon.
It accepts only Reverse ARP requests.
.Bd -literal -offset indent
struct bpf_insn insns[] = {
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 12),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, ETHERTYPE_REVARP, 0, 3),
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 20),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, REVARP_REQUEST, 0, 1),
	BPF_STMT(BPF_RET+BPF_K, sizeof(struct ether_arp) +
	    sizeof(struct ether_header)),
	BPF_STMT(BPF_RET+BPF_K, 0),
};
.Ed
.Pp
This filter accepts only IP packets between host 128.3.112.15 and
128.3.112.35.
.Bd -literal -offset indent
struct bpf_insn insns[] = {
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 12),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, ETHERTYPE_IP, 0, 8),
	BPF_STMT(BPF_LD+BPF_W+BPF_ABS, 26),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x8003700f, 0, 2),
	BPF_STMT(BPF_LD+BPF_W+BPF_ABS, 30),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x80037023, 3, 4),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x80037023, 0, 3),
	BPF_STMT(BPF_LD+BPF_W+BPF_ABS, 30),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x8003700f, 0, 1),
	BPF_STMT(BPF_RET+BPF_K, (u_int)-1),
	BPF_STMT(BPF_RET+BPF_K, 0),
};
.Ed
.Pp
Finally, this filter returns only TCP finger packets.
We must parse the IP header to reach the TCP header.
The
.Sy BPF_JSET
instruction checks that the IP fragment offset is 0 so we are sure
that we have a TCP header.
.Bd -literal -offset indent
struct bpf_insn insns[] = {
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 12),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, ETHERTYPE_IP, 0, 10),
	BPF_STMT(BPF_LD+BPF_B+BPF_ABS, 23),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, IPPROTO_TCP, 0, 8),
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 20),
	BPF_JUMP(BPF_JMP+BPF_JSET+BPF_K, 0x1fff, 6, 0),
	BPF_STMT(BPF_LDX+BPF_B+BPF_MSH, 14),
	BPF_STMT(BPF_LD+BPF_H+BPF_IND, 14),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 79, 2, 0),
	BPF_STMT(BPF_LD+BPF_H+BPF_IND, 16),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 79, 0, 1),
	BPF_STMT(BPF_RET+BPF_K, (u_int)-1),
	BPF_STMT(BPF_RET+BPF_K, 0),
};
.Ed
.Sh SEE ALSO
.Xr ioctl 2 ,
.Xr read 2 ,
.Xr select 2 ,
.Xr signal 3 ,
.Xr bpfjit 4 ,
.Xr tcpdump 8
.Rs
.%T "The BSD Packet Filter: A New Architecture for User-level Packet Capture"
.%A S. McCanne
.%A V. Jacobson
.%J Proceedings of the 1993 Winter USENIX
.%C Technical Conference, San Diego, CA
.Re
.Sh HISTORY
The Enet packet filter was created in 1980 by Mike Accetta and
Rick Rashid at Carnegie-Mellon University.
Jeffrey Mogul, at Stanford, ported the code to BSD and continued
its development from 1983 on.
Since then, it has evolved into the ULTRIX Packet Filter
at DEC, a STREAMS NIT module under SunOS 4.1, and BPF.
.Sh AUTHORS
.An -nosplit
.An Steven McCanne ,
of Lawrence Berkeley Laboratory, implemented BPF in Summer 1990.
The design was in collaboration with
.An Van Jacobson ,
also of Lawrence Berkeley Laboratory.
.Sh BUGS
The read buffer must be of a fixed size (returned by the
.Dv BIOCGBLEN
ioctl).
.Pp
A file that does not request promiscuous mode may receive promiscuously
received packets as a side effect of another file requesting this
mode on the same hardware interface.
This could be fixed in the kernel with additional processing overhead.
However, we favor the model where
all files must assume that the interface is promiscuous, and if
so desired, must use a filter to reject foreign packets.
.Pp
Under SunOS, if a BPF application reads more than 2^31 bytes of
data, read will fail with
.Er EINVAL .
You can either fix the bug in SunOS,
or lseek to 0 when read fails for this reason.
.Pp
.Dq Immediate mode
and the
.Dq read timeout
are misguided features.
This functionality can be emulated with non-blocking mode and
.Xr select 2 .
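.Pp
The emulation mentioned in the last paragraph might look roughly like the
following sketch.
The helper names
.Fn set_nonblocking
and
.Fn read_with_timeout
are hypothetical and are not part of the
.Nm
interface.
The sketch assumes that the descriptor was switched to non-blocking mode with
.Dv FIONBIO ,
and that a non-blocking read after the wait returns whatever has been
buffered so far; the latter is an assumption about the kernel, not
something this page states.
For a
.Nm
descriptor, the buffer length passed in would have to be the size reported by
.Dv BIOCGBLEN .

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: put fd into non-blocking mode using FIONBIO. */
int
set_nonblocking(int fd)
{
	int on = 1;

	return ioctl(fd, FIONBIO, &on);
}

/*
 * Hypothetical helper: wait up to timeout_ms for fd to become readable,
 * then attempt a single non-blocking read.
 * Returns the byte count from read(2), 0 if nothing was available,
 * or -1 with errno set on error (including EINTR from select(2)).
 * The caller is assumed to have called set_nonblocking() on fd first;
 * otherwise the read after a timeout could block.
 */
ssize_t
read_with_timeout(int fd, void *buf, size_t len, int timeout_ms)
{
	fd_set rfds;
	struct timeval tv;
	ssize_t r;

	assert(buf != NULL && timeout_ms >= 0);

	FD_ZERO(&rfds);
	FD_SET(fd, &rfds);
	tv.tv_sec = timeout_ms / 1000;
	tv.tv_usec = (timeout_ms % 1000) * 1000;

	/* Sleep until data is ready or the timeout expires. */
	if (select(fd + 1, &rfds, NULL, NULL, &tv) == -1)
		return -1;

	/* Read in either case, so data buffered at timeout is not lost. */
	r = read(fd, buf, len);
	if (r == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
		return 0;
	return r;
}
```

A caller would open the device, size the buffer, bind the interface, call
.Fn set_nonblocking
once, and then call
.Fn read_with_timeout
in a loop, treating a return value of 0 as an expired timeout.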