.\" $OpenBSD: bpf.4,v 1.45 2023/03/09 06:01:40 dlg Exp $
.\" $NetBSD: bpf.4,v 1.7 1995/09/27 18:31:50 thorpej Exp $
.\"
.\" Copyright (c) 1990 The Regents of the University of California.
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that: (1) source code distributions
.\" retain the above copyright notice and this paragraph in its entirety, (2)
.\" distributions including binary code include the above copyright notice and
.\" this paragraph in its entirety in the documentation or other materials
.\" provided with the distribution, and (3) all advertising materials mentioning
.\" features or use of this software display the following acknowledgement:
.\" ``This product includes software developed by the University of California,
.\" Lawrence Berkeley Laboratory and its contributors.'' Neither the name of
.\" the University nor the names of its contributors may be used to endorse
.\" or promote products derived from this software without specific prior
.\" written permission.
.\" THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR IMPLIED
.\" WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
.\" MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
.\"
.\" This document is derived in part from the enet man page (enet.4)
.\" distributed with 4.3BSD Unix.
.\"
.Dd $Mdocdate: March 9 2023 $
.Dt BPF 4
.Os
.Sh NAME
.Nm bpf
.Nd Berkeley Packet Filter
.Sh SYNOPSIS
.Cd "pseudo-device bpfilter"
.Sh DESCRIPTION
The Berkeley Packet Filter provides a raw interface to data link layers in
a protocol-independent fashion.
All packets on the network, even those destined for other hosts, are
accessible through this mechanism.
.Pp
The packet filter appears as a character special device,
.Pa /dev/bpf .
After opening the device, the file descriptor must be bound to a specific
network interface with the
.Dv BIOCSETIF
.Xr ioctl 2 .
A given interface can be shared between multiple listeners, and the filter
underlying each descriptor will see an identical packet stream.
.Pp
Associated with each open instance of a
.Nm
file is a user-settable
packet filter.
Whenever a packet is received by an interface, all file descriptors
listening on that interface apply their filter.
Each descriptor that accepts the packet receives its own copy.
.Pp
Reads from these files return the next group of packets that have matched
the filter.
To improve performance, the buffer passed to read must be the same size as
the buffers used internally by
.Nm bpf .
This size is returned by the
.Dv BIOCGBLEN
.Xr ioctl 2
and can be set with
.Dv BIOCSBLEN .
Note that an individual packet larger than this size is necessarily truncated.
.Pp
A packet can be sent out on the network by writing to a
.Nm
file descriptor.
Each descriptor can also have a user-settable filter
for controlling the writes.
Only packets matching the filter are sent out of the interface.
The writes are unbuffered, meaning only one packet can be processed per write.
.Pp
Once a descriptor is configured, further changes to the configuration
can be prevented using the
.Dv BIOCLOCK
.Xr ioctl 2 .
.Sh IOCTL INTERFACE
The
.Xr ioctl 2
command codes below are defined in
.In net/bpf.h .
All commands require these includes:
.Pp
.nr nS 1
.In sys/types.h
.In sys/time.h
.In sys/ioctl.h
.In net/bpf.h
.nr nS 0
.Pp
Additionally,
.Dv BIOCGETIF
and
.Dv BIOCSETIF
require
.In sys/socket.h
and
.In net/if.h .
.Pp
The (third) argument to the
.Xr ioctl 2
call should be a pointer to the type indicated.
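.Pp
The open, bind, and query sequence described in
.Sx DESCRIPTION
might be sketched as follows.
This is an illustration under stated assumptions, not a reference
implementation: the helper name
.Fn bpf_open_on
is hypothetical, and the preprocessor guard is there only so that the
fragment also compiles (as a stub that returns \-1) on systems lacking
.In net/bpf.h .

```c
#include <sys/types.h>
#include <sys/time.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <net/if.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Assumption: only pull in net/bpf.h where the compiler can find it. */
#if defined(__has_include)
#if __has_include(<net/bpf.h>)
#include <net/bpf.h>
#endif
#endif

/*
 * Hypothetical helper: open /dev/bpf read-only, bind it to ifname and
 * store the required read(2) buffer size in *blen.
 * Returns the descriptor, or -1 on any failure.
 */
int
bpf_open_on(const char *ifname, unsigned int *blen)
{
#if defined(BIOCSETIF) && defined(BIOCGBLEN)
	struct ifreq ifr;
	int fd;

	if ((fd = open("/dev/bpf", O_RDONLY)) == -1)
		return (-1);

	/* The interface is named through the ifr_name field. */
	memset(&ifr, 0, sizeof(ifr));
	strncpy(ifr.ifr_name, ifname, sizeof(ifr.ifr_name) - 1);

	if (ioctl(fd, BIOCSETIF, &ifr) == -1 ||
	    ioctl(fd, BIOCGBLEN, blen) == -1) {
		close(fd);
		return (-1);
	}
	return (fd);
#else
	/* No bpf ioctls on this system: behave as if open failed. */
	(void)ifname;
	(void)blen;
	return (-1);
#endif
}
```

A caller would then allocate a buffer of exactly the returned size and
pass it to
.Xr read 2 .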
.Pp
.Bl -tag -width Ds -compact
.It Dv BIOCGBLEN Fa "u_int *"
Returns the required buffer length for reads on
.Nm
files.
.Pp
.It Dv BIOCSBLEN Fa "u_int *"
Sets the buffer length for reads on
.Nm
files.
The buffer must be set before the file is attached to an interface with
.Dv BIOCSETIF .
If the requested buffer size cannot be accommodated, the closest allowable
size will be set and returned in the argument.
A read call will result in
.Er EINVAL
if it is passed a buffer that is not this size.
.Pp
.It Dv BIOCGDLT Fa "u_int *"
Returns the type of the data link layer underlying the attached interface.
.Er EINVAL
is returned if no interface has been specified.
The device types, prefixed with
.Dq DLT_ ,
are defined in
.In net/bpf.h .
.Pp
.It Dv BIOCGDLTLIST Fa "struct bpf_dltlist *"
Returns an array of the available types of the data link layer
underlying the attached interface:
.Bd -literal -offset indent
struct bpf_dltlist {
	u_int bfl_len;
	u_int *bfl_list;
};
.Ed
.Pp
The available types are returned in the array pointed to by the
.Va bfl_list
field while their length in
.Vt u_int
is supplied to the
.Va bfl_len
field.
.Er ENOMEM
is returned if there is not enough buffer space and
.Er EFAULT
is returned if a bad address is encountered.
The
.Va bfl_len
field is modified on return to indicate the actual length in
.Vt u_int
of the array returned.
If
.Va bfl_list
is
.Dv NULL ,
the
.Va bfl_len
field is set to indicate the required length of the array in
.Vt u_int .
.Pp
.It Dv BIOCSDLT Fa "u_int *"
Changes the type of the data link layer underlying the attached interface.
.Er EINVAL
is returned if no interface has been specified or the specified
type is not available for the interface.
.Pp
.It Dv BIOCPROMISC
Forces the interface into promiscuous mode.
All packets, not just those destined for the local host, are processed.
Since more than one file can be listening on a given interface, a listener
that opened its interface non-promiscuously may receive packets promiscuously.
This problem can be remedied with an appropriate filter.
.Pp
The interface remains in promiscuous mode until all files listening
promiscuously are closed.
.Pp
.It Dv BIOCFLUSH
Flushes the buffer of incoming packets and resets the statistics that are
returned by
.Dv BIOCGSTATS .
.Pp
.It Dv BIOCLOCK
This ioctl is designed to prevent the security issues associated
with an open
.Nm
descriptor in unprivileged programs.
Even with dropped privileges, an open
.Nm
descriptor can be abused by a rogue program to listen on any interface
on the system, send packets on these interfaces if the descriptor was
opened read-write and send signals to arbitrary processes using the
signaling mechanism of
.Nm bpf .
By allowing only
.Dq known safe
ioctls, the
.Dv BIOCLOCK
ioctl prevents this abuse.
The allowable ioctls are
.Dv BIOCFLUSH ,
.Dv BIOCGBLEN ,
.Dv BIOCGDIRFILT ,
.Dv BIOCGDLT ,
.Dv BIOCGDLTLIST ,
.Dv BIOCGETIF ,
.Dv BIOCGHDRCMPLT ,
.Dv BIOCGRSIG ,
.Dv BIOCGRTIMEOUT ,
.Dv BIOCGSTATS ,
.Dv BIOCIMMEDIATE ,
.Dv BIOCLOCK ,
.Dv BIOCSRTIMEOUT ,
.Dv BIOCSWTIMEOUT ,
.Dv BIOCDWTIMEOUT ,
.Dv BIOCVERSION ,
.Dv TIOCGPGRP ,
and
.Dv FIONREAD .
Use of any other ioctl is denied with error
.Er EPERM .
Once a descriptor is locked, it is not possible to unlock it.
A process with root privileges is not affected by the lock.
.Pp
A privileged program can open a
.Nm
device, drop privileges, set the interface, filters and modes on the
descriptor, and lock it.
Once the descriptor is locked, the system is safe
from further abuse through the descriptor.
Locking a descriptor does not prevent writes.
If the application does not need to send packets through
.Nm bpf ,
it can open the device read-only to prevent writing.
If sending packets is necessary, a write-filter can be set before locking the
descriptor to prevent arbitrary packets from being sent out.
.Pp
.It Dv BIOCGETIF Fa "struct ifreq *"
Returns the name of the hardware interface that the file is listening on.
The name is returned in the
.Fa ifr_name
field of the
.Vt struct ifreq .
All other fields are undefined.
.Pp
.It Dv BIOCSETIF Fa "struct ifreq *"
Sets the hardware interface associated with the file.
This command must be performed before any packets can be read.
The device is indicated by name using the
.Fa ifr_name
field of the
.Vt struct ifreq .
Additionally, performs the actions of
.Dv BIOCFLUSH .
.Pp
.It Dv BIOCSRTIMEOUT Fa "struct timeval *"
.It Dv BIOCGRTIMEOUT Fa "struct timeval *"
Sets or gets the read timeout parameter.
The
.Ar timeval
specifies the length of time to wait before timing out on a read request.
This parameter is initialized to zero by
.Xr open 2 ,
indicating no timeout.
.Pp
.It Dv BIOCGSTATS Fa "struct bpf_stat *"
Returns the following structure of packet statistics:
.Bd -literal -offset indent
struct bpf_stat {
	u_int bs_recv;
	u_int bs_drop;
};
.Ed
.Pp
The fields are:
.Bl -tag -width bs_recv
.It Fa bs_recv
Number of packets received by the descriptor since opened or reset (including
any buffered since the last read call).
.It Fa bs_drop
Number of packets which were accepted by the filter but dropped by the kernel
because of buffer overflows (i.e., the application's reads aren't keeping up
with the packet traffic).
.El
.Pp
.It Dv BIOCIMMEDIATE Fa "u_int *"
Enables or disables
.Dq immediate mode ,
based on the truth value of the argument.
When immediate mode is enabled, reads return immediately upon packet reception.
Otherwise, a read will block until either the kernel buffer becomes full or a
timeout occurs.
This is useful for programs like
.Xr rarpd 8 ,
which must respond to messages in real time.
The default for a new file is off.
.Pp
.It Dv BIOCSWTIMEOUT Fa "struct timeval *"
.It Dv BIOCGWTIMEOUT Fa "struct timeval *"
.It Dv BIOCDWTIMEOUT
Sets, gets, or deletes (resets) the wait timeout parameter.
The
.Ar timeval
specifies the length of time to wait between receiving a packet and
the kernel buffer becoming readable.
By default, or when reset, the wait timeout is infinite, meaning
the age of packets in the kernel buffer does not make the buffer
readable.
.Pp
.It Dv BIOCSETF Fa "struct bpf_program *"
Sets the filter program used by the kernel to discard uninteresting packets.
An array of instructions and its length are passed in using the following
structure:
.Bd -literal -offset indent
struct bpf_program {
	u_int bf_len;
	struct bpf_insn *bf_insns;
};
.Ed
.Pp
The filter program is pointed to by the
.Fa bf_insns
field, while its length in units of
.Vt struct bpf_insn
is given by the
.Fa bf_len
field.
Also, the actions of
.Dv BIOCFLUSH
are performed.
.Pp
See section
.Sx FILTER MACHINE
for an explanation of the filter language.
.Pp
.It Dv BIOCSETWF Fa "struct bpf_program *"
Sets the filter program used by the kernel to filter the packets
written to the descriptor before the packets are sent out on the
network.
See
.Dv BIOCSETF
for a description of the filter program.
This ioctl also acts as
.Dv BIOCFLUSH .
.Pp
Note that the filter operates on the packet data written to the descriptor.
If the
.Dq header complete
flag is not set, the kernel sets the link-layer source address
of the packet after filtering.
.Pp
.It Dv BIOCVERSION Fa "struct bpf_version *"
Returns the major and minor version numbers of the filter language currently
recognized by the kernel.
Before installing a filter, applications must check that the current version
is compatible with the running kernel.
Version numbers are compatible if the major numbers match and the application
minor is less than or equal to the kernel minor.
The kernel version number is returned in the following structure:
.Bd -literal -offset indent
struct bpf_version {
	u_short bv_major;
	u_short bv_minor;
};
.Ed
.Pp
The current version numbers are given by
.Dv BPF_MAJOR_VERSION
and
.Dv BPF_MINOR_VERSION
from
.In net/bpf.h .
An incompatible filter may result in undefined behavior (most likely, an
error returned by
.Xr ioctl 2
or haphazard packet matching).
.Pp
.It Dv BIOCSRSIG Fa "u_int *"
.It Dv BIOCGRSIG Fa "u_int *"
Sets or gets the receive signal.
This signal will be sent to the process or process group specified by
.Dv FIOSETOWN .
It defaults to
.Dv SIGIO .
.Pp
.It Dv BIOCSHDRCMPLT Fa "u_int *"
.It Dv BIOCGHDRCMPLT Fa "u_int *"
Sets or gets the status of the
.Dq header complete
flag.
Set to zero if the link level source address should be filled in
automatically by the interface output routine.
Set to one if the link level source address will be written,
as provided, to the wire.
This flag is initialized to zero by default.
.Pp
.It Dv BIOCSFILDROP Fa "u_int *"
.It Dv BIOCGFILDROP Fa "u_int *"
Sets or gets the
.Dq filter drop
action.
The supported actions for packets matching the filter are:
.Pp
.Bl -tag -width "BPF_FILDROP_CAPTURE" -compact
.It Dv BPF_FILDROP_PASS
Accept and capture
.It Dv BPF_FILDROP_CAPTURE
Drop and capture
.It Dv BPF_FILDROP_DROP
Drop and do not capture
.El
.Pp
Packets matching any filter configured to drop packets will be
reported to the associated interface so that they can be dropped.
The default action is
.Dv BPF_FILDROP_PASS .
.Pp
.It Dv BIOCSDIRFILT Fa "u_int *"
.It Dv BIOCGDIRFILT Fa "u_int *"
Sets or gets the status of the
.Dq direction filter
flag.
If non-zero, packets matching the specified direction (either
.Dv BPF_DIRECTION_IN
or
.Dv BPF_DIRECTION_OUT )
will be ignored.
.El
.Ss Standard ioctls
.Nm
now supports several standard ioctls which allow the user to do asynchronous
and/or non-blocking I/O to an open
.Nm
file descriptor.
.Pp
.Bl -tag -width Ds -compact
.It Dv FIONREAD Fa "int *"
Returns the number of bytes that are immediately available for reading.
.Pp
.It Dv FIONBIO Fa "int *"
Sets or clears non-blocking I/O.
If the argument is non-zero, enable non-blocking I/O.
If the argument is zero, disable non-blocking I/O.
If non-blocking I/O is enabled, the return value of a read while no data
is available will be 0.
The non-blocking read behavior is different from performing non-blocking
reads on other file descriptors, which will return \-1 and set
.Va errno
to
.Er EAGAIN
if no data is available.
Note: setting this overrides the timeout set by
.Dv BIOCSRTIMEOUT .
.Pp
.It Dv FIOASYNC Fa "int *"
Enables or disables asynchronous I/O.
When enabled (argument is non-zero), the process or process group specified
by
.Dv FIOSETOWN
will start receiving
.Dv SIGIO
signals when packets arrive.
Note that you must perform an
.Dv FIOSETOWN
command in order for this to take effect, as the system will not do it by
default.
The signal may be changed via
.Dv BIOCSRSIG .
.Pp
.It Dv FIOSETOWN Fa "int *"
.It Dv FIOGETOWN Fa "int *"
Sets or gets the process or process group (if negative) that should receive
.Dv SIGIO
when packets are available.
The signal may be changed using
.Dv BIOCSRSIG
(see above).
.El
.Ss BPF header
The following structure is prepended to each packet returned by
.Xr read 2 :
.Bd -literal -offset indent
struct bpf_hdr {
	struct bpf_timeval bh_tstamp;
	u_int32_t bh_caplen;
	u_int32_t bh_datalen;
	u_int16_t bh_hdrlen;
};
.Ed
.Pp
The fields, stored in host order, are as follows:
.Bl -tag -width Ds
.It Fa bh_tstamp
Time at which the packet was processed by the packet filter.
.It Fa bh_caplen
Length of the captured portion of the packet.
This is the minimum of the truncation amount specified by the filter and the
length of the packet.
.It Fa bh_datalen
Length of the packet off the wire.
This value is independent of the truncation amount specified by the filter.
.It Fa bh_hdrlen
Length of the BPF header, which may not be equal to
.Li sizeof(struct bpf_hdr) .
.El
.Pp
The
.Fa bh_hdrlen
field exists to account for padding between the header and the link level
protocol.
The purpose here is to guarantee proper alignment of the packet data
structures, which is required on alignment-sensitive architectures and
improves performance on many other architectures.
The packet filter ensures that the
.Fa bpf_hdr
and the network layer header will be word aligned.
Suitable precautions must be taken when accessing the link layer protocol
fields on alignment restricted machines.
(This isn't a problem on an Ethernet, since the type field is a
.Vt short
falling on an even offset, and the addresses are probably accessed in a
bytewise fashion).
.Pp
Additionally, individual packets are padded so that each starts on a
word boundary.
This requires that an application has some knowledge of how to get from packet
to packet.
The macro
.Dv BPF_WORDALIGN
is defined in
.In net/bpf.h
to facilitate this process.
It rounds up its argument to the nearest word aligned value (where a word is
.Dv BPF_ALIGNMENT
bytes wide).
For example, if
.Va p
points to the start of a packet, this expression will advance it to the
next packet:
.Pp
.Dl p = (char *)p + BPF_WORDALIGN(p->bh_hdrlen + p->bh_caplen);
.Pp
For the alignment mechanisms to work properly, the buffer passed to
.Xr read 2
must itself be word aligned.
.Xr malloc 3
will always return an aligned buffer.
.Ss Filter machine
A filter program is an array of instructions with all branches forwardly
directed, terminated by a
.Dq return
instruction.
Each instruction performs some action on the pseudo-machine state, which
consists of an accumulator, index register, scratch memory store, and
implicit program counter.
.Pp
The following structure defines the instruction format:
.Bd -literal -offset indent
struct bpf_insn {
	u_int16_t code;
	u_char jt;
	u_char jf;
	u_int32_t k;
};
.Ed
.Pp
The
.Fa k
field is used in different ways by different instructions, and the
.Fa jt
and
.Fa jf
fields are used as offsets by the branch instructions.
The opcodes are encoded in a semi-hierarchical fashion.
There are eight classes of instructions:
.Dv BPF_LD ,
.Dv BPF_LDX ,
.Dv BPF_ST ,
.Dv BPF_STX ,
.Dv BPF_ALU ,
.Dv BPF_JMP ,
.Dv BPF_RET ,
and
.Dv BPF_MISC .
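.Pp
Eight classes fit in three bits, and in the traditional encoding they
occupy the low three bits of the
.Fa code
field.
The sketch below recovers the class from an opcode.
The
.Li DEMO_
names are hypothetical stand-ins, and their values are an assumption taken
from the traditional BPF encoding; real programs should use the definitions
in
.In net/bpf.h
instead.

```c
#include <stdint.h>

/*
 * Stand-ins for constants normally provided by <net/bpf.h>.
 * The DEMO_ names are hypothetical; the values are assumed to match
 * the traditional BPF encoding.
 */
#define DEMO_BPF_CLASS(code)	((code) & 0x07)

enum demo_bpf_class {
	DEMO_BPF_LD	= 0x00,
	DEMO_BPF_LDX	= 0x01,
	DEMO_BPF_ST	= 0x02,
	DEMO_BPF_STX	= 0x03,
	DEMO_BPF_ALU	= 0x04,
	DEMO_BPF_JMP	= 0x05,
	DEMO_BPF_RET	= 0x06,
	DEMO_BPF_MISC	= 0x07
};

/* Map the code field of an instruction to the name of its class. */
const char *
demo_class_name(uint16_t code)
{
	static const char *const names[] = {
		[DEMO_BPF_LD]	= "BPF_LD",
		[DEMO_BPF_LDX]	= "BPF_LDX",
		[DEMO_BPF_ST]	= "BPF_ST",
		[DEMO_BPF_STX]	= "BPF_STX",
		[DEMO_BPF_ALU]	= "BPF_ALU",
		[DEMO_BPF_JMP]	= "BPF_JMP",
		[DEMO_BPF_RET]	= "BPF_RET",
		[DEMO_BPF_MISC]	= "BPF_MISC"
	};

	/* The mask keeps the index within the eight table entries. */
	return (names[DEMO_BPF_CLASS(code)]);
}
```

Under the same assumption, the opcode 0x15 (a conditional jump comparing
the accumulator with a constant) maps to
.Dv BPF_JMP .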
Various other mode and operator bits are logically OR'd into the class to
give the actual instructions.
The classes and modes are defined in
.In net/bpf.h .
Below are the semantics for each defined
.Nm
instruction.
We use the convention that A is the accumulator, X is the index register,
P[] packet data, and M[] scratch memory store.
P[i:n] gives the data at byte offset
.Dq i
in the packet, interpreted as a word (n=4), unsigned halfword (n=2), or
unsigned byte (n=1).
M[i] gives the i'th word in the scratch memory store, which is only addressed
in word units.
The memory store is indexed from 0 to
.Dv BPF_MEMWORDS Ns \-1 .
.Fa k ,
.Fa jt ,
and
.Fa jf
are the corresponding fields in the instruction definition.
.Dq len
refers to the length of the packet.
.Bl -tag -width Ds
.It Dv BPF_LD
These instructions copy a value into the accumulator.
The type of the source operand is specified by an
.Dq addressing mode
and can be a constant
.Pf ( Dv BPF_IMM ) ,
packet data at a fixed offset
.Pf ( Dv BPF_ABS ) ,
packet data at a variable offset
.Pf ( Dv BPF_IND ) ,
the packet length
.Pf ( Dv BPF_LEN ) ,
a random number
.Pf ( Dv BPF_RND ) ,
or a word in the scratch memory store
.Pf ( Dv BPF_MEM ) .
For
.Dv BPF_IND
and
.Dv BPF_ABS ,
the data size must be specified as a word
.Pf ( Dv BPF_W ) ,
halfword
.Pf ( Dv BPF_H ) ,
or byte
.Pf ( Dv BPF_B ) .
The semantics of all recognized
.Dv BPF_LD
instructions follow.
.Pp
.Bl -tag -width 32n -compact
.Sm off
.It Xo Dv BPF_LD No + Dv BPF_W No +
.Dv BPF_ABS
.Xc
.Sm on
A <- P[k:4]
.Sm off
.It Xo Dv BPF_LD No + Dv BPF_H No +
.Dv BPF_ABS
.Xc
.Sm on
A <- P[k:2]
.Sm off
.It Xo Dv BPF_LD No + Dv BPF_B No +
.Dv BPF_ABS
.Xc
.Sm on
A <- P[k:1]
.Sm off
.It Xo Dv BPF_LD No + Dv BPF_W No +
.Dv BPF_IND
.Xc
.Sm on
A <- P[X+k:4]
.Sm off
.It Xo Dv BPF_LD No + Dv BPF_H No +
.Dv BPF_IND
.Xc
.Sm on
A <- P[X+k:2]
.Sm off
.It Xo Dv BPF_LD No + Dv BPF_B No +
.Dv BPF_IND
.Xc
.Sm on
A <- P[X+k:1]
.Sm off
.It Xo Dv BPF_LD No + Dv BPF_W No +
.Dv BPF_LEN
.Xc
.Sm on
A <- len
.Sm off
.It Xo Dv BPF_LD No + Dv BPF_W No +
.Dv BPF_RND
.Xc
.Sm on
A <- arc4random()
.Sm off
.It Dv BPF_LD No + Dv BPF_IMM
.Sm on
A <- k
.Sm off
.It Dv BPF_LD No + Dv BPF_MEM
.Sm on
A <- M[k]
.El
.It Dv BPF_LDX
These instructions load a value into the index register.
Note that the addressing modes are more restricted than those of the
accumulator loads, but they include
.Dv BPF_MSH ,
a hack for efficiently loading the IP header length.
.Pp
.Bl -tag -width 32n -compact
.Sm off
.It Xo Dv BPF_LDX No + Dv BPF_W No +
.Dv BPF_IMM
.Xc
.Sm on
X <- k
.Sm off
.It Xo Dv BPF_LDX No + Dv BPF_W No +
.Dv BPF_MEM
.Xc
.Sm on
X <- M[k]
.Sm off
.It Xo Dv BPF_LDX No + Dv BPF_W No +
.Dv BPF_LEN
.Xc
.Sm on
X <- len
.Sm off
.It Xo Dv BPF_LDX No + Dv BPF_B No +
.Dv BPF_MSH
.Xc
.Sm on
X <- 4*(P[k:1]&0xf)
.El
.It Dv BPF_ST
This instruction stores the accumulator into the scratch memory.
We do not need an addressing mode since there is only one possibility for
the destination.
.Pp
.Bl -tag -width 32n -compact
.It Dv BPF_ST
M[k] <- A
.El
.It Dv BPF_STX
This instruction stores the index register in the scratch memory store.
.Pp
.Bl -tag -width 32n -compact
.It Dv BPF_STX
M[k] <- X
.El
.It Dv BPF_ALU
The ALU instructions perform operations between the accumulator and index
register or constant, and store the result back in the accumulator.
For binary operations, a source mode is required
.Pf ( Dv BPF_K
or
.Dv BPF_X ) .
.Pp
.Bl -tag -width 32n -compact
.Sm off
.It Xo Dv BPF_ALU No + BPF_ADD No +
.Dv BPF_K
.Xc
.Sm on
A <- A + k
.Sm off
.It Xo Dv BPF_ALU No + BPF_SUB No +
.Dv BPF_K
.Xc
.Sm on
A <- A - k
.Sm off
.It Xo Dv BPF_ALU No + BPF_MUL No +
.Dv BPF_K
.Xc
.Sm on
A <- A * k
.Sm off
.It Xo Dv BPF_ALU No + BPF_DIV No +
.Dv BPF_K
.Xc
.Sm on
A <- A / k
.Sm off
.It Xo Dv BPF_ALU No + BPF_AND No +
.Dv BPF_K
.Xc
.Sm on
A <- A & k
.Sm off
.It Xo Dv BPF_ALU No + BPF_OR No +
.Dv BPF_K
.Xc
.Sm on
A <- A | k
.Sm off
.It Xo Dv BPF_ALU No + BPF_LSH No +
.Dv BPF_K
.Xc
.Sm on
A <- A << k
.Sm off
.It Xo Dv BPF_ALU No + BPF_RSH No +
.Dv BPF_K
.Xc
.Sm on
A <- A >> k
.Sm off
.It Xo Dv BPF_ALU No + BPF_ADD No +
.Dv BPF_X
.Xc
.Sm on
A <- A + X
.Sm off
.It Xo Dv BPF_ALU No + BPF_SUB No +
.Dv BPF_X
.Xc
.Sm on
A <- A - X
.Sm off
.It Xo Dv BPF_ALU No + BPF_MUL No +
.Dv BPF_X
.Xc
.Sm on
A <- A * X
.Sm off
.It Xo Dv BPF_ALU No + BPF_DIV No +
.Dv BPF_X
.Xc
.Sm on
A <- A / X
.Sm off
.It Xo Dv BPF_ALU No + BPF_AND No +
.Dv BPF_X
.Xc
.Sm on
A <- A & X
.Sm off
.It Xo Dv BPF_ALU No + BPF_OR No +
.Dv BPF_X
.Xc
.Sm on
A <- A | X
.Sm off
.It Xo Dv BPF_ALU No + BPF_LSH No +
.Dv BPF_X
.Xc
.Sm on
A <- A << X
.Sm off
.It Xo Dv BPF_ALU No + BPF_RSH No +
.Dv BPF_X
.Xc
.Sm on
A <- A >> X
.Sm off
.It Dv BPF_ALU No + BPF_NEG
.Sm on
A <- -A
.El
.It Dv BPF_JMP
The jump instructions alter flow of control.
Conditional jumps compare the accumulator against a constant
.Pf ( Dv BPF_K )
or the index register
.Pf ( Dv BPF_X ) .
If the result is true (or non-zero), the true branch is taken, otherwise the
false branch is taken.
Jump offsets are encoded in 8 bits so the longest jump is 256 instructions.
However, the jump always
.Pf ( Dv BPF_JA )
opcode uses the 32-bit
.Fa k
field as the offset, allowing arbitrarily distant destinations.
All conditionals use unsigned comparison conventions.
.Pp
.Bl -tag -width 32n -compact
.Sm off
.It Dv BPF_JMP No + BPF_JA
.Sm on
pc += k
.Sm off
.It Xo Dv BPF_JMP No + BPF_JGT No +
.Dv BPF_K
.Xc
.Sm on
pc += (A > k) ? jt : jf
.Sm off
.It Xo Dv BPF_JMP No + BPF_JGE No +
.Dv BPF_K
.Xc
.Sm on
pc += (A >= k) ? jt : jf
.Sm off
.It Xo Dv BPF_JMP No + BPF_JEQ No +
.Dv BPF_K
.Xc
.Sm on
pc += (A == k) ? jt : jf
.Sm off
.It Xo Dv BPF_JMP No + BPF_JSET No +
.Dv BPF_K
.Xc
.Sm on
pc += (A & k) ? jt : jf
.Sm off
.It Xo Dv BPF_JMP No + BPF_JGT No +
.Dv BPF_X
.Xc
.Sm on
pc += (A > X) ? jt : jf
.Sm off
.It Xo Dv BPF_JMP No + BPF_JGE No +
.Dv BPF_X
.Xc
.Sm on
pc += (A >= X) ? jt : jf
.Sm off
.It Xo Dv BPF_JMP No + BPF_JEQ No +
.Dv BPF_X
.Xc
.Sm on
pc += (A == X) ? jt : jf
.Sm off
.It Xo Dv BPF_JMP No + BPF_JSET No +
.Dv BPF_X
.Xc
.Sm on
pc += (A & X) ? jt : jf
.El
.It Dv BPF_RET
The return instructions terminate the filter program and specify the
amount of packet to accept (i.e., they return the truncation amount)
or, for the write filter, the maximum acceptable size for the packet
(i.e., the packet is dropped if it is larger than the returned
amount).
A return value of zero indicates that the packet should be ignored/dropped.
The return value is either a constant
.Pf ( Dv BPF_K )
or the accumulator
.Pf ( Dv BPF_A ) .
.Pp
.Bl -tag -width 32n -compact
.It Dv BPF_RET No + Dv BPF_A
Accept A bytes.
.It Dv BPF_RET No + Dv BPF_K
Accept k bytes.
.El
.It Dv BPF_MISC
The miscellaneous category was created for anything that doesn't fit into
the above classes, and for any new instructions that might need to be added.
Currently, these are the register transfer instructions that copy the index
register to the accumulator or vice versa.
.Pp
.Bl -tag -width 32n -compact
.Sm off
.It Dv BPF_MISC No + Dv BPF_TAX
.Sm on
X <- A
.Sm off
.It Dv BPF_MISC No + Dv BPF_TXA
.Sm on
A <- X
.El
.El
.Pp
The
.Nm
interface provides the following macros to facilitate array initializers:
.Bd -filled -offset indent
.Dv BPF_STMT ( Ns Ar opcode ,
.Ar operand )
.Pp
.Dv BPF_JUMP ( Ns Ar opcode ,
.Ar operand ,
.Ar true_offset ,
.Ar false_offset )
.Ed
.Sh FILES
.Bl -tag -width /dev/bpf -compact
.It Pa /dev/bpf
.Nm
device
.El
.Sh EXAMPLES
The following filter is taken from the Reverse ARP daemon.
It accepts only Reverse ARP requests.
.Bd -literal -offset indent
struct bpf_insn insns[] = {
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 12),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, ETHERTYPE_REVARP, 0, 3),
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 20),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, REVARP_REQUEST, 0, 1),
	BPF_STMT(BPF_RET+BPF_K, sizeof(struct ether_arp) +
	    sizeof(struct ether_header)),
	BPF_STMT(BPF_RET+BPF_K, 0),
};
.Ed
.Pp
This filter accepts only IP packets between host 128.3.112.15 and
128.3.112.35.
.Bd -literal -offset indent
struct bpf_insn insns[] = {
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 12),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, ETHERTYPE_IP, 0, 8),
	BPF_STMT(BPF_LD+BPF_W+BPF_ABS, 26),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x8003700f, 0, 2),
	BPF_STMT(BPF_LD+BPF_W+BPF_ABS, 30),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x80037023, 3, 4),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x80037023, 0, 3),
	BPF_STMT(BPF_LD+BPF_W+BPF_ABS, 30),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x8003700f, 0, 1),
	BPF_STMT(BPF_RET+BPF_K, (u_int)-1),
	BPF_STMT(BPF_RET+BPF_K, 0),
};
.Ed
.Pp
Finally, this filter returns only TCP finger packets.
We must parse the IP header to reach the TCP header.
The
.Dv BPF_JSET
instruction checks that the IP fragment offset is 0 so we are sure that we
have a TCP header.
.Bd -literal -offset indent
struct bpf_insn insns[] = {
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 12),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, ETHERTYPE_IP, 0, 10),
	BPF_STMT(BPF_LD+BPF_B+BPF_ABS, 23),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, IPPROTO_TCP, 0, 8),
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 20),
	BPF_JUMP(BPF_JMP+BPF_JSET+BPF_K, 0x1fff, 6, 0),
	BPF_STMT(BPF_LDX+BPF_B+BPF_MSH, 14),
	BPF_STMT(BPF_LD+BPF_H+BPF_IND, 14),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 79, 2, 0),
	BPF_STMT(BPF_LD+BPF_H+BPF_IND, 16),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 79, 0, 1),
	BPF_STMT(BPF_RET+BPF_K, (u_int)-1),
	BPF_STMT(BPF_RET+BPF_K, 0),
};
.Ed
.Sh ERRORS
If the
.Xr ioctl 2
call fails,
.Xr errno 2
is set to one of the following values:
.Bl -tag -width Er
.It Bq Er EINVAL
The timeout used in a
.Dv BIOCSRTIMEOUT
request is negative.
.It Bq Er EINVAL
The timeout used in a
.Dv BIOCSRTIMEOUT
request specified a microsecond value less than zero or
greater than or equal to 1 million.
.It Bq Er EOVERFLOW
The timeout used in a
.Dv BIOCSRTIMEOUT
request is too large to be represented by an
.Vt int .
.El
.Sh SEE ALSO
.Xr ioctl 2 ,
.Xr read 2 ,
.Xr select 2 ,
.Xr signal 3 ,
.Xr MAKEDEV 8 ,
.Xr tcpdump 8 ,
.Xr arc4random 9
.Rs
.%A McCanne, S.
.%A Jacobson, V.
.%D January 1993
.%J 1993 Winter USENIX Conference
.%T The BSD Packet Filter: A New Architecture for User-level Packet Capture
.Re
.Sh HISTORY
The Enet packet filter was created in 1980 by Mike Accetta and Rick Rashid
at Carnegie-Mellon University.
Jeffrey Mogul, at Stanford, ported the code to
.Bx
and continued its
development from 1983 on.
Since then, it has evolved into the Ultrix Packet Filter at DEC, a STREAMS
NIT module under SunOS 4.1, and BPF.
.Sh AUTHORS
.An -nosplit
.An Steve McCanne
of Lawrence Berkeley Laboratory implemented BPF in Summer 1990.
Much of the design is due to
.An Van Jacobson .
.Sh BUGS
The read buffer must be of a fixed size (returned by the
.Dv BIOCGBLEN
ioctl).
.Pp
A file that does not request promiscuous mode may receive promiscuously
received packets as a side effect of another file requesting this mode on
the same hardware interface.
This could be fixed in the kernel with additional processing overhead.
However, we favor the model where all files must assume that the interface
is promiscuous, and if so desired, must utilize a filter to reject foreign
packets.
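.Pp
Because each read fills one fixed-size buffer with several packets, an
application has to step from header to header as described under
.Sx BPF header .
The following is a sketch of that loop, under stated assumptions.
.Vt struct demo_bpf_hdr ,
.Dv DEMO_WORDALIGN
and
.Fn demo_walk
are hypothetical stand-ins that mirror the four-field header shown in this
page, so that the fragment builds without
.In net/bpf.h ;
a real program should use
.Vt struct bpf_hdr
and
.Dv BPF_WORDALIGN
instead.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins mirroring the header layout shown above. */
struct demo_timeval {
	uint32_t tv_sec;
	uint32_t tv_usec;
};

struct demo_bpf_hdr {
	struct demo_timeval bh_tstamp;
	uint32_t bh_caplen;
	uint32_t bh_datalen;
	uint16_t bh_hdrlen;
};

/* Assumption: a word is four bytes, as with BPF_ALIGNMENT. */
#define DEMO_ALIGNMENT		sizeof(uint32_t)
#define DEMO_WORDALIGN(x) \
	(((x) + (DEMO_ALIGNMENT - 1)) & ~(DEMO_ALIGNMENT - 1))

/*
 * Walk the n bytes returned by one read(2) call.
 * Returns the number of packets found and stores the sum of their
 * captured lengths in *captured.
 */
size_t
demo_walk(const unsigned char *buf, size_t n, size_t *captured)
{
	struct demo_bpf_hdr h;
	size_t off = 0, count = 0;

	*captured = 0;
	while (off + sizeof(h) <= n) {
		/* Copy the header out to avoid unaligned access. */
		memcpy(&h, buf + off, sizeof(h));

		/* Stop on a header that does not fit the buffer. */
		if (h.bh_hdrlen < sizeof(h) ||
		    off + h.bh_hdrlen + h.bh_caplen > n)
			break;

		/* Packet data starts at buf + off + h.bh_hdrlen. */
		*captured += h.bh_caplen;
		count++;

		/* Each packet starts on a word boundary. */
		off += DEMO_WORDALIGN((size_t)h.bh_hdrlen + h.bh_caplen);
	}
	return (count);
}
```

The bounds checks are a design choice of this sketch: they keep the loop
inside the buffer even if a header is malformed.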