.\" -*- nroff -*-
.\"
.\" $NetBSD: bpf.4,v 1.64 2021/10/24 17:46:06 gutteridge Exp $
.\"
.\" Copyright (c) 1990, 1991, 1992, 1993, 1994
.\" The Regents of the University of California. All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that: (1) source code distributions
.\" retain the above copyright notice and this paragraph in its entirety, (2)
.\" distributions including binary code include the above copyright notice and
.\" this paragraph in its entirety in the documentation or other materials
.\" provided with the distribution, and (3) all advertising materials mentioning
.\" features or use of this software display the following acknowledgement:
.\" ``This product includes software developed by the University of California,
.\" Lawrence Berkeley Laboratory and its contributors.'' Neither the name of
.\" the University nor the names of its contributors may be used to endorse
.\" or promote products derived from this software without specific prior
.\" written permission.
.\" THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR IMPLIED
.\" WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
.\" MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
.\"
.\" This document is derived in part from the enet man page (enet.4)
.\" distributed with 4.3BSD Unix.
.\"
.Dd October 24, 2021
.Dt BPF 4
.Os
.Sh NAME
.Nm bpf
.Nd Berkeley Packet Filter raw network interface
.Sh SYNOPSIS
.Cd "pseudo-device bpfilter"
.Sh DESCRIPTION
The Berkeley Packet Filter
provides a raw interface to data link layers in a protocol
independent fashion.
All packets on the network, even those destined for other hosts,
are accessible through this mechanism.
.Pp
The packet filter appears as a character special device,
.Pa /dev/bpf .
After opening the device, the file descriptor must be bound to a
specific network interface with the
.Dv BIOCSETIF
ioctl.
A given interface can be shared by multiple listeners, and the filter
underlying each descriptor will see an identical packet stream.
.Pp
Associated with each open instance of a
.Nm
file is a user-settable packet filter.
Whenever a packet is received by an interface,
all file descriptors listening on that interface apply their filter.
Each descriptor that accepts the packet receives its own copy.
.Pp
Reads from these files return the next group of packets
that have matched the filter.
To improve performance, the buffer passed to read must be
the same size as the buffers used internally by
.Nm .
This size is returned by the
.Dv BIOCGBLEN
ioctl (see below), and can be set with
.Dv BIOCSBLEN .
Note that an individual packet larger than this size is necessarily
truncated.
.Pp
Since packet data is in network byte order, applications should use the
.Xr byteorder 3
macros to extract multi-byte values.
.Pp
A packet can be sent out on the network by writing to a
.Nm
file descriptor.
The writes are unbuffered, meaning only one packet can be processed per write.
Currently, only writes to Ethernet-based (including Wi-Fi) and SLIP
links are supported.
.Sh IOCTLS
The
.Xr ioctl 2
command codes below are defined in
.In net/bpf.h .
All commands require these includes:
.Bd -literal -offset indent
#include <sys/types.h>
#include <sys/time.h>
#include <sys/ioctl.h>
#include <net/bpf.h>
.Ed
.Pp
Additionally,
.Dv BIOCGETIF
and
.Dv BIOCSETIF
require
.Pa <net/if.h> .
.Pp
The (third) argument to the
.Xr ioctl 2
should be a pointer to the type indicated.
.Bl -tag -width indent -offset indent
.It Dv BIOCGBLEN ( u_int )
Returns the required buffer length for reads on
.Nm
files.
.It Dv BIOCSBLEN ( u_int )
Sets the buffer length for reads on
.Nm
files.
The buffer must be set before the file is attached to an interface with
.Dv BIOCSETIF .
If the requested buffer size cannot be accommodated, the closest
allowable size will be set and returned in the argument.
A read call will result in
.Er EINVAL
if it is passed a buffer that is not this size.
.It Dv BIOCGDLT ( u_int )
Returns the type of the data link layer underlying the attached interface.
.Er EINVAL
is returned if no interface has been specified.
The device types, prefixed with
.Dq DLT_ ,
are defined in
.In net/bpf.h .
.It Dv BIOCGDLTLIST ( struct bpf_dltlist )
Returns an array of the available types of the data link layer
underlying the attached interface:
.Bd -literal -offset indent
struct bpf_dltlist {
	u_int bfl_len;
	u_int *bfl_list;
};
.Ed
.Pp
The available types are returned in the array pointed to by the
.Va bfl_list
field while their length in u_int is supplied to the
.Va bfl_len
field.
.Er ENOMEM
is returned if there is not enough buffer space and
.Er EFAULT
is returned if a bad address is encountered.
The
.Va bfl_len
field is modified on return to indicate the actual length in u_int
of the array returned.
If
.Va bfl_list
is
.Dv NULL ,
the
.Va bfl_len
field is set to indicate the required length of an array in u_int.
.It Dv BIOCSDLT ( u_int )
Changes the type of the data link layer underlying the attached interface.
.Er EINVAL
is returned if no interface has been specified or the specified
type is not available for the interface.
.It Dv BIOCPROMISC
Forces the interface into promiscuous mode.
All packets, not just those destined for the local host, are processed.
Since more than one file can be listening on a given interface,
a listener that opened its interface non-promiscuously may receive
packets promiscuously.
This problem can be remedied with an appropriate filter.
.Pp
The interface remains in promiscuous mode until all files listening
promiscuously are closed.
.It Dv BIOCFLUSH
Flushes the buffer of incoming packets,
and resets the statistics that are returned by
.Dv BIOCGSTATS .
.It Dv BIOCGETIF ( struct ifreq )
Returns the name of the hardware interface that the file is listening on.
The name is returned in the ifr_name field of
.Fa ifr .
All other fields are undefined.
.It Dv BIOCSETIF ( struct ifreq )
Sets the hardware interface associated with the file.
This command must be performed before any packets can be read.
The device is indicated by name using the
.Dv ifr_name
field of the
.Fa ifreq .
Additionally, performs the actions of
.Dv BIOCFLUSH .
.It Dv BIOCSRTIMEOUT , BIOCGRTIMEOUT ( struct timeval )
Sets or gets the read timeout parameter.
The
.Fa timeval
specifies the length of time to wait before timing
out on a read request.
This parameter is initialized to zero by
.Xr open 2 ,
indicating no timeout.
.It Dv BIOCGSTATS ( struct bpf_stat )
Returns the following structure of packet statistics:
.Bd -literal -offset indent
struct bpf_stat {
	uint64_t bs_recv;
	uint64_t bs_drop;
	uint64_t bs_capt;
	uint64_t bs_padding[13];
};
.Ed
.Pp
The fields are:
.Bl -tag -width bs_recv -offset indent
.It Va bs_recv
the number of packets received by the descriptor since opened or reset
(including any buffered since the last read call);
.It Va bs_drop
the number of packets which were accepted by the filter but dropped by the
kernel because of buffer overflows
(i.e., the application's reads aren't keeping up with the packet
traffic); and
.It Va bs_capt
the number of packets accepted by the filter.
.El
.It Dv BIOCIMMEDIATE ( u_int )
Enables or disables
.Dq immediate mode ,
based on the truth value of the argument.
When immediate mode is enabled, reads return immediately upon packet
reception.
Otherwise, a read will block until either the kernel buffer
becomes full or a timeout occurs.
This is useful for programs like
.Xr rarpd 8 ,
which must respond to messages in real time.
The default for a new file is off.
.It Dv BIOCLOCK
Sets the locked flag on the bpf descriptor.
This prevents the execution of ioctl commands which could change the
underlying operating parameters of the device.
.It Dv BIOCSETF ( struct bpf_program )
Sets the filter program used by the kernel to discard uninteresting
packets.
An array of instructions and its length are passed in using the following structure:
.Bd -literal -offset indent
struct bpf_program {
	u_int bf_len;
	struct bpf_insn *bf_insns;
};
.Ed
.Pp
The filter program is pointed to by the
.Va bf_insns
field while its length in units of
.Sq struct bpf_insn
is given by the
.Va bf_len
field.
Also, the actions of
.Dv BIOCFLUSH
are performed.
.Pp
See section
.Sy FILTER MACHINE
for an explanation of the filter language.
.It Dv BIOCSETWF ( struct bpf_program )
Sets the write filter program used by the kernel to control what type
of packets can be written to the interface.
See the
.Dv BIOCSETF
command for more information on the bpf filter program.
.It Dv BIOCVERSION ( struct bpf_version )
Returns the major and minor version numbers of the filter language currently
recognized by the kernel.
Before installing a filter, applications must check
that the current version is compatible with the running kernel.
Version numbers are compatible if the major numbers match and the
application minor is less than or equal to the kernel minor.
The kernel version number is returned in the following structure:
.Bd -literal -offset indent
struct bpf_version {
	u_short bv_major;
	u_short bv_minor;
};
.Ed
.Pp
The current version numbers are given by
.Dv BPF_MAJOR_VERSION
and
.Dv BPF_MINOR_VERSION
from
.In net/bpf.h .
An incompatible filter
may result in undefined behavior (most likely, an error returned by
.Xr ioctl 2
or haphazard packet matching).
.It Dv BIOCSRSIG , BIOCGRSIG ( u_int )
Sets or gets the receive signal.
This signal will be sent to the process or process group specified by
.Dv FIOSETOWN .
It defaults to
.Dv SIGIO .
.It Dv BIOCGHDRCMPLT , BIOCSHDRCMPLT ( u_int )
Sets or gets the status of the
.Dq header complete
flag.
Set to zero if the link level source address should be filled in
automatically by the interface output routine.
Set to one if the link level source address will be written,
as provided, to the wire.
This flag is initialized to zero by default.
.It Dv BIOCGSEESENT , BIOCSSEESENT ( u_int )
These commands are obsolete but left for compatibility.
Use
.Dv BIOCSDIRECTION
and
.Dv BIOCGDIRECTION
instead.
Set or get the flag determining whether locally generated packets on the
interface should be returned by BPF.
Set to zero to see only incoming packets on the interface.
Set to one to see packets originating locally and remotely on the interface.
This flag is initialized to one by default.
.It Dv BIOCSDIRECTION
.It Dv BIOCGDIRECTION
.Pq Li u_int
Set or get the setting determining whether incoming, outgoing, or all packets
on the interface should be returned by BPF.
Set to
.Dv BPF_D_IN
to see only incoming packets on the interface.
Set to
.Dv BPF_D_INOUT
to see packets originating locally and remotely on the interface.
Set to
.Dv BPF_D_OUT
to see only outgoing packets on the interface.
This setting is initialized to
.Dv BPF_D_INOUT
by default.
.It Dv BIOCFEEDBACK , BIOCSFEEDBACK , BIOCGFEEDBACK ( u_int )
Set (or get)
.Dq packet feedback mode .
This allows injected packets to be fed back as input to the interface when
output via the interface is successful.
The first name is meant for
.Fx
compatibility, the two others follow the Get/Set convention.
.\"When
.\".Dv BPF_D_INOUT
.\"direction is set, injected
Injected
outgoing packets are not returned by BPF to avoid
duplication.
This flag is initialized to zero by default.
.El
.Sh STANDARD IOCTLS
.Nm
now supports several standard
.Xr ioctl 2 Ns 's
which allow the user to do async and/or non-blocking I/O to an open
.Nm bpf
file descriptor.
.Bl -tag -width indent -offset indent
.It Dv FIONREAD ( int )
Returns the number of bytes that are immediately available for reading.
.It Dv FIONBIO ( int )
Set or clear non-blocking I/O.
If arg is non-zero, then doing a
.Xr read 2
when no data is available will return -1 and
.Va errno
will be set to
.Er EAGAIN .
If arg is zero, non-blocking I/O is disabled.
Note: setting this
overrides the timeout set by
.Dv BIOCSRTIMEOUT .
.It Dv FIOASYNC ( int )
Enable or disable async I/O.
When enabled (arg is non-zero), the process or process group specified by
.Dv FIOSETOWN
will start receiving SIGIO's when packets
arrive.
Note that you must do an
.Dv FIOSETOWN
in order for this to take effect, as
the system will not default this for you.
The signal may be changed via
.Dv BIOCSRSIG .
.It Dv FIOSETOWN , FIOGETOWN ( int )
Set or get the process or process group (if negative) that should receive SIGIO
when packets are available.
The signal may be changed using
.Dv BIOCSRSIG
(see above).
.El
.Sh BPF HEADER
The following structure is prepended to each packet returned by
.Xr read 2 :
.Bd -literal -offset indent
struct bpf_hdr {
	struct bpf_timeval bh_tstamp;
	uint32_t bh_caplen;
	uint32_t bh_datalen;
	uint16_t bh_hdrlen;
};
.Ed
.Pp
The fields, whose values are stored in host order, are:
.Bl -tag -width bh_datalen -offset indent
.It Va bh_tstamp
The time at which the packet was processed by the packet filter.
This structure differs from the standard
.Vt struct timeval
in that both members are of type
.Vt long .
.It Va bh_caplen
The length of the captured portion of the packet.
This is the minimum of
the truncation amount specified by the filter and the length of the packet.
.It Va bh_datalen
The length of the packet off the wire.
This value is independent of the truncation amount specified by the filter.
.It Va bh_hdrlen
The length of the BPF header, which may not be equal to
.Em sizeof(struct bpf_hdr) .
.El
.Pp
The
.Va bh_hdrlen
field exists to account for
padding between the header and the link level protocol.
The purpose here is to guarantee proper alignment of the packet
data structures, which is required on alignment sensitive
architectures and improves performance on many other architectures.
The packet filter ensures that the
.Va bpf_hdr
and the
.Em network layer
header will be word aligned.
Suitable precautions must be taken when accessing the link layer
protocol fields on alignment restricted machines.
(This isn't a problem on an Ethernet, since
the type field is a short falling on an even offset,
and the addresses are probably accessed in a bytewise fashion).
.Pp
Additionally, individual packets are padded so that each starts
on a word boundary.
This requires that an application
has some knowledge of how to get from packet to packet.
The macro
.Dv BPF_WORDALIGN
is defined in
.In net/bpf.h
to facilitate this process.
It rounds up its argument
to the nearest word aligned value (where a word is
.Dv BPF_ALIGNMENT
bytes wide).
.Pp
For example, if
.Sq Va p
points to the start of a packet, this expression
will advance it to the next packet:
.Pp
.Dl p = (char *)p + BPF_WORDALIGN(p->bh_hdrlen + p->bh_caplen)
.Pp
For the alignment mechanisms to work properly, the
buffer passed to
.Xr read 2
must itself be word aligned.
.Xr malloc 3
will always return an aligned buffer.
.Sh FILTER MACHINE
A filter program is an array of instructions, with all branches forwardly
directed, terminated by a
.Sy return
instruction.
Each instruction performs some action on the pseudo-machine state,
which consists of an accumulator, index register, scratch memory store,
and implicit program counter.
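.Pp
As a rough illustration of this machine state, the following C sketch
interprets a three-instruction subset of the filter language.
It is not the kernel implementation: the
.Li sk_
names are hypothetical, the opcode values and the scratch store size of 16
words are assumed to match
.In net/bpf.h ,
and every instruction other than
.Sy BPF_LD+BPF_H+BPF_ABS ,
.Sy BPF_JMP+BPF_JEQ+BPF_K ,
and
.Sy BPF_RET+BPF_K
simply rejects the packet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names; the real definitions live in <net/bpf.h>. */
#define SK_MEMWORDS	16	/* assumed value of BPF_MEMWORDS */
#define SK_LD_H_ABS	0x28	/* assumed BPF_LD+BPF_H+BPF_ABS */
#define SK_JEQ_K	0x15	/* assumed BPF_JMP+BPF_JEQ+BPF_K */
#define SK_RET_K	0x06	/* assumed BPF_RET+BPF_K */

/* Mirrors the layout of struct bpf_insn. */
struct sk_insn {
	uint16_t code;
	uint8_t jt;
	uint8_t jf;
	uint32_t k;
};

/* The pseudo-machine state described above. */
struct sk_state {
	uint32_t A;			/* accumulator */
	uint32_t X;			/* index register (unused here) */
	uint32_t M[SK_MEMWORDS];	/* scratch memory (unused here) */
	size_t pc;			/* implicit program counter */
};

/* Returns the number of bytes to accept, or 0 to ignore the packet. */
static uint32_t
sk_run(const struct sk_insn *prog, size_t n, const uint8_t *pkt, size_t len)
{
	struct sk_state s = { 0 };

	assert(prog != NULL && pkt != NULL);
	while (s.pc < n) {
		const struct sk_insn *in = &prog[s.pc++];

		switch (in->code) {
		case SK_LD_H_ABS: {
			size_t off = in->k;

			if (off > len || len - off < 2)
				return 0;	/* load past end of packet */
			/* Packet data is in network byte order. */
			s.A = ((uint32_t)pkt[off] << 8) | pkt[off + 1];
			break;
		}
		case SK_JEQ_K:
			/* Offsets are relative to the next instruction. */
			s.pc += (s.A == in->k) ? in->jt : in->jf;
			break;
		case SK_RET_K:
			return in->k;
		default:
			return 0;	/* outside the sketched subset */
		}
	}
	return 0;	/* ran off the end without a return instruction */
}
```

Because the program counter is advanced before the jump offset is added,
an offset of zero falls through to the following instruction.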
.Pp
The following structure defines the instruction format:
.Bd -literal -offset indent
struct bpf_insn {
	uint16_t code;
	u_char jt;
	u_char jf;
	uint32_t k;
};
.Ed
.Pp
The
.Va k
field is used in different ways by different instructions,
and the
.Va jt
and
.Va jf
fields are used as offsets
by the branch instructions.
The opcodes are encoded in a semi-hierarchical fashion.
There are eight classes of instructions: BPF_LD, BPF_LDX, BPF_ST, BPF_STX,
BPF_ALU, BPF_JMP, BPF_RET, and BPF_MISC.
Various other mode and
operator bits are or'd into the class to give the actual instructions.
The classes and modes are defined in
.In net/bpf.h .
.Pp
Below are the semantics for each defined BPF instruction.
We use the convention that A is the accumulator, X is the index register,
P[] packet data, and M[] scratch memory store.
P[i:n] gives the data at byte offset
.Dq i
in the packet,
interpreted as a word (n=4),
unsigned halfword (n=2), or unsigned byte (n=1).
M[i] gives the i'th word in the scratch memory store, which is only
addressed in word units.
The memory store is indexed from 0 to BPF_MEMWORDS-1.
.Va k ,
.Va jt ,
and
.Va jf
are the corresponding fields in the
instruction definition.
.Dq len
refers to the length of the packet.
.Bl -tag -width indent -offset indent
.It Sy BPF_LD
These instructions copy a value into the accumulator.
The type of the source operand is specified by an
.Dq addressing mode
and can be a constant
.Sy ( BPF_IMM ) ,
packet data at a fixed offset
.Sy ( BPF_ABS ) ,
packet data at a variable offset
.Sy ( BPF_IND ) ,
the packet length
.Sy ( BPF_LEN ) ,
or a word in the scratch memory store
.Sy ( BPF_MEM ) .
For
.Sy BPF_IND
and
.Sy BPF_ABS ,
the data size must be specified as a word
.Sy ( BPF_W ) ,
halfword
.Sy ( BPF_H ) ,
or byte
.Sy ( BPF_B ) .
Arithmetic overflow when calculating a variable offset terminates
the filter program and the packet is ignored.
The semantics of all the recognized BPF_LD instructions follow.
.Bl -column "BPF_LD_BPF_W_BPF_ABS" "A <- P[k:4]" -offset indent
.It Sy BPF_LD+BPF_W+BPF_ABS Ta A <- P[k:4]
.It Sy BPF_LD+BPF_H+BPF_ABS Ta A <- P[k:2]
.It Sy BPF_LD+BPF_B+BPF_ABS Ta A <- P[k:1]
.It Sy BPF_LD+BPF_W+BPF_IND Ta A <- P[X+k:4]
.It Sy BPF_LD+BPF_H+BPF_IND Ta A <- P[X+k:2]
.It Sy BPF_LD+BPF_B+BPF_IND Ta A <- P[X+k:1]
.It Sy BPF_LD+BPF_W+BPF_LEN Ta A <- len
.It Sy BPF_LD+BPF_IMM Ta A <- k
.It Sy BPF_LD+BPF_MEM Ta A <- M[k]
.El
.It Sy BPF_LDX
These instructions load a value into the index register.
Note that the addressing modes are more restricted than those of
the accumulator loads, but they include
.Sy BPF_MSH ,
a hack for efficiently loading the IP header length.
.Bl -column "BPF_LDX_BPF_W_BPF_MEM" "X <- k" -offset indent
.It Sy BPF_LDX+BPF_W+BPF_IMM Ta X <- k
.It Sy BPF_LDX+BPF_W+BPF_MEM Ta X <- M[k]
.It Sy BPF_LDX+BPF_W+BPF_LEN Ta X <- len
.It Sy BPF_LDX+BPF_B+BPF_MSH Ta X <- 4*(P[k:1]&0xf)
.El
.It Sy BPF_ST
This instruction stores the accumulator into the scratch memory.
We do not need an addressing mode since there is only one possibility
for the destination.
.Bl -column "BPF_ST" "M[k] <- A" -offset indent
.It Sy BPF_ST Ta M[k] <- A
.El
.It Sy BPF_STX
This instruction stores the index register in the scratch memory store.
.Bl -column "BPF_STX" "M[k] <- X" -offset indent
.It Sy BPF_STX Ta M[k] <- X
.El
.It Sy BPF_ALU
The alu instructions perform operations between the accumulator and
index register or constant, and store the result back in the accumulator.
For binary operations, a source mode is required
.Sy ( BPF_K
or
.Sy BPF_X ) .
.Bl -column "BPF_ALU_BPF_ADD_BPF_K" "A <- A + k" -offset indent
.It Sy BPF_ALU+BPF_ADD+BPF_K Ta A <- A + k
.It Sy BPF_ALU+BPF_SUB+BPF_K Ta A <- A - k
.It Sy BPF_ALU+BPF_MUL+BPF_K Ta A <- A * k
.It Sy BPF_ALU+BPF_DIV+BPF_K Ta A <- A / k
.It Sy BPF_ALU+BPF_AND+BPF_K Ta A <- A & k
.It Sy BPF_ALU+BPF_OR+BPF_K Ta A <- A | k
.It Sy BPF_ALU+BPF_LSH+BPF_K Ta A <- A << k
.It Sy BPF_ALU+BPF_RSH+BPF_K Ta A <- A >> k
.It Sy BPF_ALU+BPF_ADD+BPF_X Ta A <- A + X
.It Sy BPF_ALU+BPF_SUB+BPF_X Ta A <- A - X
.It Sy BPF_ALU+BPF_MUL+BPF_X Ta A <- A * X
.It Sy BPF_ALU+BPF_DIV+BPF_X Ta A <- A / X
.It Sy BPF_ALU+BPF_AND+BPF_X Ta A <- A & X
.It Sy BPF_ALU+BPF_OR+BPF_X Ta A <- A | X
.It Sy BPF_ALU+BPF_LSH+BPF_X Ta A <- A << X
.It Sy BPF_ALU+BPF_RSH+BPF_X Ta A <- A >> X
.It Sy BPF_ALU+BPF_NEG Ta A <- -A
.El
.It Sy BPF_JMP
The jump instructions alter flow of control.
Conditional jumps compare the accumulator against a constant
.Sy ( BPF_K )
or the index register
.Sy ( BPF_X ) .
If the result is true (or non-zero),
the true branch is taken, otherwise the false branch is taken.
Jump offsets are encoded in 8 bits so the longest jump is 256 instructions.
However, the jump always
.Sy ( BPF_JA )
opcode uses the 32 bit
.Va k
field as the offset, allowing arbitrarily distant destinations.
All conditionals use unsigned comparison conventions.
.Bl -column "BPF_JMP+BPF_JSET+BPF_K" "pc += (A \*[Ge] k) ? jt : jf" -offset indent
.It Sy BPF_JMP+BPF_JA Ta pc += k
.It Sy BPF_JMP+BPF_JGT+BPF_K Ta "pc += (A > k) ? jt : jf"
.It Sy BPF_JMP+BPF_JGE+BPF_K Ta "pc += (A \*[Ge] k) ? jt : jf"
.It Sy BPF_JMP+BPF_JEQ+BPF_K Ta "pc += (A == k) ? jt : jf"
.It Sy BPF_JMP+BPF_JSET+BPF_K Ta "pc += (A & k) ? jt : jf"
.It Sy BPF_JMP+BPF_JGT+BPF_X Ta "pc += (A > X) ? jt : jf"
.It Sy BPF_JMP+BPF_JGE+BPF_X Ta "pc += (A \*[Ge] X) ? jt : jf"
.It Sy BPF_JMP+BPF_JEQ+BPF_X Ta "pc += (A == X) ? jt : jf"
.It Sy BPF_JMP+BPF_JSET+BPF_X Ta "pc += (A & X) ? jt : jf"
.El
.It Sy BPF_RET
The return instructions terminate the filter program and specify the amount
of packet to accept (i.e., they return the truncation amount).
A return value of zero indicates that the packet should be ignored.
The return value is either a constant
.Sy ( BPF_K )
or the accumulator
.Sy ( BPF_A ) .
.Bl -column "BPF_RET+BPF_A" "accept A bytes" -offset indent
.It Sy BPF_RET+BPF_A Ta accept A bytes
.It Sy BPF_RET+BPF_K Ta accept k bytes
.El
.It Sy BPF_MISC
The miscellaneous category was created for anything that doesn't
fit into the above classes, and for any new instructions that might need to
be added.
Currently, these are the register transfer instructions
that copy the index register to the accumulator or vice versa.
.Bl -column "BPF_MISC+BPF_TAX" "X <- A" -offset indent
.It Sy BPF_MISC+BPF_TAX Ta X <- A
.It Sy BPF_MISC+BPF_TXA Ta A <- X
.El
.Pp
Also, two instructions to call a "coprocessor" if initialized by the kernel
component.
There is no coprocessor by default.
.Bl -column "BPF_MISC+BPF_COPX" "A <- funcs[X](...)" -offset indent
.It Sy BPF_MISC+BPF_COP Ta A <- funcs[k](..)
.It Sy BPF_MISC+BPF_COPX Ta A <- funcs[X](..)
.El
.Pp
If the coprocessor is not set or the function index is out of range, these
instructions will abort the program and return zero.
.El
.Pp
The BPF interface provides the following macros to facilitate
array initializers:
.Bd -unfilled -offset indent
.Fn BPF_STMT opcode operand
.Fn BPF_JUMP opcode operand true_offset false_offset
.Ed
.Sh SYSCTLS
The following sysctls are available when
.Nm
is enabled:
.Bl -tag -width "XnetXbpfXmaxbufsizeXX"
.It Li net.bpf.maxbufsize
Sets the maximum buffer size available for
.Nm
peers.
.It Li net.bpf.stats
Shows
.Nm
statistics.
They can be retrieved with the
.Xr netstat 1
utility.
.It Li net.bpf.peers
Shows the current
.Nm
peers.
This is only available to the super user and can also be retrieved with the
.Xr netstat 1
utility.
.El
.Pp
On architectures with
.Xr bpfjit 4
support, the additional sysctl is available:
.Bl -tag -width "XnetXbpfXjitXX"
.It Li net.bpf.jit
Toggle
.Sy Just-In-Time
compilation of new filter programs.
In order to enable Just-In-Time compilation,
the bpfjit kernel module must be loaded.
Changing a value of this sysctl doesn't affect
existing filter programs.
.El
.Sh FILES
.Pa /dev/bpf
.Sh EXAMPLES
The following filter is taken from the Reverse ARP Daemon.
It accepts only Reverse ARP requests.
.Bd -literal -offset indent
struct bpf_insn insns[] = {
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 12),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, ETHERTYPE_REVARP, 0, 3),
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 20),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, REVARP_REQUEST, 0, 1),
	BPF_STMT(BPF_RET+BPF_K, sizeof(struct ether_arp) +
	    sizeof(struct ether_header)),
	BPF_STMT(BPF_RET+BPF_K, 0),
};
.Ed
.Pp
This filter accepts only IP packets between host 128.3.112.15 and
128.3.112.35.
.Bd -literal -offset indent
struct bpf_insn insns[] = {
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 12),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, ETHERTYPE_IP, 0, 8),
	BPF_STMT(BPF_LD+BPF_W+BPF_ABS, 26),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x8003700f, 0, 2),
	BPF_STMT(BPF_LD+BPF_W+BPF_ABS, 30),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x80037023, 3, 4),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x80037023, 0, 3),
	BPF_STMT(BPF_LD+BPF_W+BPF_ABS, 30),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x8003700f, 0, 1),
	BPF_STMT(BPF_RET+BPF_K, (u_int)-1),
	BPF_STMT(BPF_RET+BPF_K, 0),
};
.Ed
.Pp
Finally, this filter returns only TCP finger packets.
We must parse the IP header to reach the TCP header.
The
.Sy BPF_JSET
instruction checks that the IP fragment offset is 0 so we are sure
that we have a TCP header.
.Bd -literal -offset indent
struct bpf_insn insns[] = {
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 12),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, ETHERTYPE_IP, 0, 10),
	BPF_STMT(BPF_LD+BPF_B+BPF_ABS, 23),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, IPPROTO_TCP, 0, 8),
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 20),
	BPF_JUMP(BPF_JMP+BPF_JSET+BPF_K, 0x1fff, 6, 0),
	BPF_STMT(BPF_LDX+BPF_B+BPF_MSH, 14),
	BPF_STMT(BPF_LD+BPF_H+BPF_IND, 14),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 79, 2, 0),
	BPF_STMT(BPF_LD+BPF_H+BPF_IND, 16),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 79, 0, 1),
	BPF_STMT(BPF_RET+BPF_K, (u_int)-1),
	BPF_STMT(BPF_RET+BPF_K, 0),
};
.Ed
.Sh SEE ALSO
.Xr ioctl 2 ,
.Xr read 2 ,
.Xr select 2 ,
.Xr signal 3 ,
.Xr bpfjit 4 ,
.Xr tcpdump 8
.Rs
.%T "The BSD Packet Filter: A New Architecture for User-level Packet Capture"
.%A S. McCanne
.%A V. Jacobson
.%J Proceedings of the 1993 Winter USENIX
.%C Technical Conference, San Diego, CA
.Re
.Sh HISTORY
The Enet packet filter was created in 1980 by Mike Accetta and
Rick Rashid at Carnegie-Mellon University.
Jeffrey Mogul, at Stanford, ported the code to BSD and continued
its development from 1983 on.
Since then, it has evolved into the ULTRIX Packet Filter
at DEC, a STREAMS NIT module under SunOS 4.1, and BPF.
.Sh AUTHORS
.An -nosplit
.An Steven McCanne ,
of Lawrence Berkeley Laboratory, implemented BPF in Summer 1990.
The design was in collaboration with
.An Van Jacobson ,
also of Lawrence Berkeley Laboratory.
.Sh BUGS
The read buffer must be of a fixed size (returned by the
.Dv BIOCGBLEN
ioctl).
.Pp
A file that does not request promiscuous mode may receive promiscuously
received packets as a side effect of another file requesting this
mode on the same hardware interface.
This could be fixed in the kernel with additional processing overhead.
However, we favor the model where
all files must assume that the interface is promiscuous, and if
so desired, must use a filter to reject foreign packets.
.Pp
Under SunOS, if a BPF application reads more than 2^31 bytes of
data, read will fail with
.Er EINVAL .
You can either fix the bug in SunOS,
or lseek to 0 when read fails for this reason.
.Pp
.Dq Immediate mode
and the
.Dq read timeout
are misguided features.
This functionality can be emulated with non-blocking mode and
.Xr select 2 .
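.Pp
The emulation mentioned above might look like the following C sketch.
The helper name
.Fn read_with_timeout
is hypothetical; it assumes
.Fa fd
is an open descriptor that supports
.Dv FIONBIO
and
.Xr select 2 ,
and, for a
.Nm
descriptor, that
.Fa len
is the size reported by
.Dv BIOCGBLEN .

```c
#include <sys/types.h>
#include <sys/ioctl.h>
#include <sys/select.h>
#include <sys/time.h>

#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of bpf: wait up to *tv for fd to become
 * readable, then read whatever is available.
 * Returns the byte count from read(2), 0 if the timeout expired,
 * or -1 on error with errno set.
 */
static ssize_t
read_with_timeout(int fd, void *buf, size_t len, struct timeval *tv)
{
	fd_set rfds;
	int on = 1;
	int n;

	assert(buf != NULL && len > 0);

	/* Non-blocking mode, so read(2) cannot stall after select(2). */
	if (ioctl(fd, FIONBIO, &on) == -1)
		return -1;

	FD_ZERO(&rfds);
	FD_SET(fd, &rfds);
	n = select(fd + 1, &rfds, NULL, NULL, tv);
	if (n <= 0)
		return n;	/* 0: timed out; -1: select(2) failed */

	return read(fd, buf, len);
}
```

Some systems modify the
.Vt struct timeval
passed to
.Xr select 2 ,
so a caller should reinitialize it before each call.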