bpf.c - OpenGrok history log for /openbsd-src/sys/net/bpf.c

Revision	Date	Author	Comments
# 129cf7bd	19-Jan-2025	dlg <dlg@openbsd.org>	make BIOCSWTIMEOUT work with kq events. makes sense jmatthew@
# b9ae17a0	30-Dec-2024	guenther <guenther@openbsd.org>	All the device and file type ioctl routines just ignore FIONBIO, so stop calling down into those layers from fcntl(F_SETFL) or ioctl(FIONBIO) and delete the "do nothing for this" stubs in all the ioctl routines. ok dlg@
# c6b373c6	26-Nov-2024	dlg <dlg@openbsd.org>	let bpf pick the first attached dlt when attaching to an interface. this is instead of picking the lowest numbered dlt, which was done to make bpf more predictable with interfaces that attached multiple DLTs. i think the real problem was that bpf would keep the list in the reverse order of attachment and would prefer the last dlt. interfaces that attach multiple DLTs attach ethernet first, which is what you want the majority of the time anyway. but letting bpf pick the first one means drivers can control which dlt they want to default to, regardless of the numeric id behind a dlt. ok claudio@
# 1511e544	19-Nov-2024	dlg <dlg@openbsd.org>	use a tailq for the global list of bpf_if structs. this replaces a hand rolled list that's been here since 1.1. ok claudio@ kn@ tb@
# a921796a	17-Nov-2024	dlg <dlg@openbsd.org>	make sure bpfsdetach is holding a bpf_d ref when invalidating stuff. when bpfsdetach is called by an interface being destroyed, it iterates over the bpf descriptors using the interface and calls vdevgone and klist_invalidate against them. however, i'm not sure the reference the interface holds against the bpf_d is accounted for properly, so vdevgone might drop it to 0 and free it, which makes the klist_invalidate a use after free. avoid this by taking a bpf_d ref before calling vdevgone and klist_invalidate so the memory can't be freed out from under the feet of bpfsdetach. Reported-by: syzbot+b3927f8ad162452a2f39@syzkaller.appspotmail.com i wasn't able to reproduce whatever syzkaller did. it's possible this is a double free, but we'll wait and see if it pops up again. ok mpi@
# e88074f0	15-Aug-2024	dlg <dlg@openbsd.org>	add BIOCSETFNR, which is like BIOCSETF but doesnt reset the buffer or stats. from Matthew Luckie <mjl@luckie.org.nz> via tech@ deraadt@ likes it.
# 34cc435a	12-Aug-2024	mvs <mvs@openbsd.org>	Prepare bpf_sysctl() for upcoming net_sysctl() unlocking. Both NET_BPF_MAXBUFSIZE and NET_BPF_BUFSIZE (`bpf_maxbufsize' and `bpf_bufsize' respectively) are atomically accessed integers. No locks required to modify them. ok bluhm
# 2293e682	05-Aug-2024	dlg <dlg@openbsd.org>	restrict the maximum wait time you can set via BIOCSWTIMEOUT to 5 minutes. this avoids passing excessively large values to timeout_add_nsec. Reported-by: syzbot+f650785d4f2b3fe28284@syzkaller.appspotmail.com
# 2b86dc95	26-Jan-2024	jan <jan@openbsd.org>	Put checksum flags in bpf_hdr to use them in userland dhcpleased. Thus, dhcpleased accepts non-calculated checksums which were verified by hardware/hypervisor. With tweaks from dlg@ ok bluhm@ mkay tobhe@
# b9a6c834	09-Mar-2023	dlg <dlg@openbsd.org>	add a timeout between capturing a packet and making the buffer readable. before this, there were three reasons that a bpf read will finish. the first is the obvious one: the bpf packet buffer in the kernel fills up. by default this is about 32k, so if you're only capturing a small packet every few seconds, it can take a long time for the buffer to fill up before you can read them. the second is if bpf has been configured to enable immediate mode with ioctl(BIOCIMMEDIATE). this means that when any packet is written into the bpf buffer, the buffer is immediately readable. this is fine if the packet rate is low, but if the packet rate is high you don't get the benefit of buffering many packets that bpf is supposed to provide. the third mechanism is if bpf has been configured with the BIOCSRTIMEOUT ioctl, which sets a maximum wait time on a bpf read. BIOCSRTIMEOUT means that a clock starts ticking down when a program (eg pflogd) reads from bpf. when the clock reaches zero then the read returns with whatever is in the bpf packet buffer. however, there could be nothing in the buffer, and the read will still complete. deraadt@ noticed this behaviour with pflogd. it wants packets logged by pf to end up on disk in a timely fashion, but it's fine with tolerating a bit of delay so it can take advantage of buffering to amortise the cost of the reads per packet. it currently does this with BIOCSRTIMEOUT set to half a second, which means it's always waking up every half second even if there's nothing to log. this diff adds BIOCSWTIMEOUT, which specifies a timeout from when bpf first puts a packet in the capture buffer, and when the buffer becomes readable.
by default this wait timeout is infinite, meaning the buffer has to be filled before it becomes readable. BIOCSWTIMEOUT can be set to enable the new functionality. BIOCIMMEDIATE is turned into a variation of BIOCSWTIMEOUT with the wait time set to 0, ie, wait 0 seconds between when a packet is written to the buffer and when the buffer becomes readable. combining BIOCSWTIMEOUT and BIOCIMMEDIATE simplifies the code a lot. for pflogd, this means if there are no packets to capture, pflogd won't wake up every half second to do nothing. however, when a packet is logged by pf, bpf will wait another half second to see if any more packets arrive (or the buffer fills up) before the read fires. discussed a lot with deraadt@ and sashan@ ok sashan@
# c78098b6	10-Feb-2023	visa <visa@openbsd.org>	Adjust knote(9) API Make knote(9) lock the knote list internally, and add knote_locked(9) for the typical situation where the list is already locked. Remove the KNOTE(9) macro to simplify the API. Manual page OK jmc@ OK mpi@ mvs@
# a820167a	09-Jul-2022	visa <visa@openbsd.org>	Unwrap klist from struct selinfo as this code no longer uses selwakeup(). OK jsg@
# d164f4a1	05-Jul-2022	visa <visa@openbsd.org>	Remove old poll/select wakeup mechanism. Also remove unneeded seltrue() and selfalse(). OK mpi@ jsg@
# 1525749f	02-Jul-2022	visa <visa@openbsd.org>	Remove unused device poll functions. Also remove unneeded includes of <sys/poll.h> and <sys/select.h>. Some addenda from jsg@. OK miod@ mpi@
# 8f99bf68	17-Mar-2022	visa <visa@openbsd.org>	Use the refcnt API in bpf. OK sashan@ bluhm@
# 1a86186d	15-Feb-2022	visa <visa@openbsd.org>	Use knote_modify_fn() and knote_process_fn() in bpf. OK dlg@
# 4be097b8	13-Feb-2022	bluhm <bluhm@openbsd.org>	The length value in bpf_movein() is cast from size_t to u_int and then rounded before checking. Put the same check before the calculations to avoid overflow. Reported-by: syzbot+6f29d23eca959c5a9705@syzkaller.appspotmail.com OK claudio@
# a3a2b40e	13-Feb-2022	visa <visa@openbsd.org>	Rename knote_modify() to knote_assign() This avoids verb overlap with f_modify.
# df61dccf	11-Feb-2022	visa <visa@openbsd.org>	Replace manual !klist_empty()+knote() with KNOTE(). OK mpi@
# b807ad8b	05-Feb-2022	dlg <dlg@openbsd.org>	make bpf_movein align the packet payload. bluhm@ hit a problem while running a regress test where a packet generated and injected via bpf ends up being consumed by the network stack. the stack assumes that packets are aligned properly, but bpf was lazy and put whatever was written to it at the start of an mbuf. ethernet has a 14 byte header, so if you put that at the start the payload will be misaligned by 2 bytes. bpf already has handling for different link header types, so this handling is extended a bit to align the payload after the link header. while here we're fixing up a few error codes. short packets produce EINVAL instead of EPERM, and packets larger than the biggest mbuf the kernel supports generate EMSGSIZE. with tweaks and ok bluhm@
# c8b9beef	16-Jan-2022	dlg <dlg@openbsd.org>	activate/notify waiting kq kevents from bpf_wakeup directly. this builds on the mpsafe kq/kevent work visa has been doing. normally kevents are notified by calling selwakeup, but selwakeup needs the KERNEL_LOCK. because bpf runs from all sorts of contexts that may or may not have the kernel lock, the call to selwakeup is deferred to the systq which already has the kernel lock. while this avoids spinning in bpf for the kernel lock, it still adds latency between when the buffer is ready for a program and when that program gets notified about it. now that bpf kevents are mpsafe and bpf_wakeup is already holding the necessary locks, we can avoid that latency. bpf_wakeup now checks if there are waiting kevents and notifies them immediately. if there are no other things to wake up, bpf_wakeup avoids the task_add (and associated reference counting) to defer the selwakeup call. selwakeup can still try to notify waiting kevents, so this uses the hint passed to knote() to differentiate between the notification from bpf_wakeup and selwakeup and returns early from the latter. ok visa@
# 422efc16	13-Jan-2022	visa <visa@openbsd.org>	Make bpf event filter MP-safe Use bd_mtx to serialize bpf knote handling. This allows calling the event filter without the kernel lock. OK mpi@
# 1c4e6f78	13-Jan-2022	visa <visa@openbsd.org>	Return an error if bpfilter_lookup() fails in bpfkqfilter() The lookup should not fail because the kernel lock should prevent simultaneous detaching on the vnode layer. However, most other device kqfilter routines check the lookup's outcome anyway, which is maybe a bit more forgiving. OK mpi@
# 3c33e230	10-Nov-2021	dlg <dlg@openbsd.org>	whitespace tweaks, no functional change.
# 828d9ca1	23-Oct-2021	visa <visa@openbsd.org>	Fix double free after allocation failure in bpf(4). Reported by Peter J. Philipp. OK mpi@