.\"	$OpenBSD: kqueue.2,v 1.51 2023/08/20 19:52:40 jmc Exp $
.\"
.\" Copyright (c) 2000 Jonathan Lemon
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD: src/lib/libc/sys/kqueue.2,v 1.18 2001/02/14 08:48:35 guido Exp $
.\"
.Dd $Mdocdate: August 20 2023 $
.Dt KQUEUE 2
.Os
.Sh NAME
.Nm kqueue ,
.Nm kqueue1 ,
.Nm kevent ,
.Nm EV_SET
.Nd kernel event notification mechanism
.Sh SYNOPSIS
.In sys/types.h
.In sys/event.h
.In sys/time.h
.Ft int
.Fn kqueue "void"
.Ft int
.Fn kevent "int kq" "const struct kevent *changelist" "int nchanges" "struct kevent *eventlist" "int nevents" "const struct timespec *timeout"
.Fn EV_SET "&kev" ident filter flags fflags data udata
.In sys/types.h
.In sys/event.h
.In sys/time.h
.In fcntl.h
.Ft int
.Fn kqueue1 "int flags"
.Sh DESCRIPTION
.Fn kqueue
provides a generic method of notifying the user when an event
happens or a condition holds, based on the results of small
pieces of kernel code termed
.Dq filters .
A kevent is identified by the (ident, filter) pair; there may only
be one unique kevent per kqueue.
.Pp
The filter is executed upon the initial registration of a kevent
in order to detect whether a preexisting condition is present, and is also
executed whenever an event is passed to the filter for evaluation.
If the filter determines that the condition should be reported,
then the kevent is placed on the kqueue for the user to retrieve.
.Pp
The filter is also run when the user attempts to retrieve the kevent
from the kqueue.
If the filter indicates that the condition that triggered
the event no longer holds, the kevent is removed from the kqueue and
is not returned.
.Pp
Multiple events which trigger the filter do not result in multiple
kevents being placed on the kqueue; instead, the filter will aggregate
the events into a single
.Vt struct kevent .
Calling
.Xr close 2
on a file descriptor will remove any kevents that reference the descriptor.
.Pp
.Fn kqueue
creates a new kernel event queue and returns a descriptor.
The queue is not inherited by a child created with
.Xr fork 2 .
Similarly, kqueues cannot be passed across UNIX-domain sockets.
.Pp
The
.Fn kqueue1
function is identical to
.Fn kqueue
except that the close-on-exec flag on the new file descriptor
is determined by the
.Dv O_CLOEXEC
flag
in the
.Fa flags
argument.
.Pp
.Fn kevent
is used to register events with the queue, and return any pending
events to the user.
.Fa changelist
is a pointer to an array of
.Vt kevent
structures, as defined in
.In sys/event.h .
All changes contained in the
.Fa changelist
are applied before any pending events are read from the queue.
.Fa nchanges
gives the size of
.Fa changelist .
.Fa eventlist
is a pointer to an array of
.Vt kevent
structures.
.Fa nevents
determines the size of
.Fa eventlist .
When
.Fa nevents
is zero,
.Fn kevent
will return immediately even if there is a
.Fa timeout
specified, unlike
.Xr select 2 .
If
.Fa timeout
is not
.Dv NULL ,
it specifies a maximum interval to wait
for an event, which will be interpreted as a
.Vt struct timespec .
If
.Fa timeout
is
.Dv NULL ,
.Fn kevent
waits indefinitely.
To effect a poll, the
.Fa timeout
argument should not be
.Dv NULL ,
pointing to a zero-valued
.Vt struct timespec .
The same array may be used for the
.Fa changelist
and
.Fa eventlist .
.Pp
.Fn EV_SET
is a macro which is provided for ease of initializing a
.Vt kevent
structure.
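.Pp
As an illustration of the calling sequence described above,
the following sketch registers a single
.Dv EVFILT_READ
event with
.Fn EV_SET
and waits for it, reusing one
.Vt kevent
structure as both
.Fa changelist
and
.Fa eventlist .
The helper name
.Fn wait_readable
and its return convention are assumptions made for this example;
they are not part of the interface.
.Bd -literal
#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>

#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical helper: wait until fd is readable.
 * Returns 1 if readable, 0 on timeout, and -1 on error.
 */
int
wait_readable(int fd, const struct timespec *timeout)
{
	struct kevent kev;
	int kq, n;

	if ((kq = kqueue()) == -1)
		return -1;

	/* Describe the change: add a read filter for fd. */
	EV_SET(&kev, fd, EVFILT_READ, EV_ADD, 0, 0, NULL);

	/* Apply one change, then collect at most one event. */
	n = kevent(kq, &kev, 1, &kev, 1, timeout);
	if (n == 1 && (kev.flags & EV_ERROR)) {
		/* The change was rejected; data holds the error. */
		errno = (int)kev.data;
		n = -1;
	}

	close(kq);
	return n;
}
.Ed
.Pp
Passing a pointer to a zero-valued
.Vt struct timespec
as
.Fa timeout
turns the call into a poll, while
.Dv NULL
makes it wait indefinitely.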
.Pp
The
.Vt kevent
structure is defined as:
.Bd -literal
struct kevent {
	uintptr_t	ident;		/* identifier for this event */
	short		filter;		/* filter for event */
	u_short		flags;		/* action flags for kqueue */
	u_int		fflags;		/* filter flag value */
	int64_t		data;		/* filter data value */
	void		*udata;		/* opaque user data identifier */
};
.Ed
.Pp
The fields of
.Vt struct kevent
are:
.Bl -tag -width XXXfilter
.It Fa ident
Value used to identify this event.
The exact interpretation is determined by the attached filter,
but often is a file descriptor.
.It Fa filter
Identifies the kernel filter used to process this event.
The pre-defined system filters are described below.
.It Fa flags
Actions to perform on the event.
.It Fa fflags
Filter-specific flags.
.It Fa data
Filter-specific data value.
.It Fa udata
Opaque user-defined value passed through the kernel unchanged.
.El
.Pp
The
.Fa flags
field can contain the following values:
.Bl -tag -width XXXEV_ONESHOT
.It Dv EV_ADD
Adds the event to the kqueue.
Re-adding an existing event will modify the parameters of the original event,
and not result in a duplicate entry.
Adding an event automatically enables it, unless overridden by the
.Dv EV_DISABLE
flag.
.It Dv EV_ENABLE
Permit
.Fn kevent
to return the event if it is triggered.
.It Dv EV_DISABLE
Disable the event so
.Fn kevent
will not return it.
The filter itself is not disabled.
.It Dv EV_DISPATCH
Disable the event source immediately after delivery of an event.
See
.Dv EV_DISABLE
above.
.It Dv EV_DELETE
Removes the event from the kqueue.
Events which are attached to file descriptors are automatically deleted
on the last close of the descriptor.
.It Dv EV_RECEIPT
Causes
.Fn kevent
to return with
.Dv EV_ERROR
set without draining any pending events after updating events in the kqueue.
When a filter is successfully added, the
.Fa data
field will be zero.
This flag is useful for making bulk changes to a kqueue.
.It Dv EV_ONESHOT
Causes the event to return only the first occurrence of the filter
being triggered.
After the user retrieves the event from the kqueue, it is deleted.
.It Dv EV_CLEAR
After the event is retrieved by the user, its state is reset.
This is useful for filters which report state transitions
instead of the current state.
Note that some filters may automatically set this flag internally.
.It Dv EV_EOF
Filters may set this flag to indicate a filter-specific EOF condition.
.It Dv EV_ERROR
See
.Sx RETURN VALUES
below.
.El
.Pp
The predefined system filters are listed below.
Arguments may be passed to and from the filter via the
.Fa fflags
and
.Fa data
fields in the
.Vt kevent
structure.
.Bl -tag -width EVFILT_SIGNAL
.It Dv EVFILT_READ
Takes a descriptor as the identifier, and returns whenever
there is data available to read.
The behavior of the filter is slightly different depending
on the descriptor type.
.Bl -tag -width 2n
.It Sockets
Sockets which have previously been passed to
.Xr listen 2
return when there is an incoming connection pending.
.Fa data
contains the size of the listen backlog.
.Pp
Other socket descriptors return when there is data to be read,
subject to the
.Dv SO_RCVLOWAT
value of the socket buffer.
This may be overridden with a per-filter low water mark at the
time the filter is added by setting the
.Dv NOTE_LOWAT
flag in
.Fa fflags ,
and specifying the new low water mark in
.Fa data .
On return,
.Fa data
contains the number of bytes in the socket buffer.
.Pp
If the read direction of the socket has been shut down, then the filter
also sets
.Dv EV_EOF
in
.Fa flags ,
and returns the socket error (if any) in
.Fa fflags .
It is possible for EOF to be returned (indicating the connection is gone)
while there is still data pending in the socket buffer.
.It Vnodes
Returns when the file pointer is not at the end of file.
.Fa data
contains the offset from current position to end of file,
and may be negative.
If
.Dv NOTE_EOF
is set in
.Fa fflags ,
.Fn kevent
will also return when the file pointer is at the end of file.
The end of file condition is indicated by the presence of
.Dv NOTE_EOF
in
.Fa fflags
on return.
.It "FIFOs, Pipes"
Returns when there is data to read;
.Fa data
contains the number of bytes available.
.Pp
When the last writer disconnects, the filter will set
.Dv EV_EOF
in
.Fa flags .
This may be cleared by passing in
.Dv EV_CLEAR ,
at which point the filter will resume waiting for data to become
available before returning.
.It "BPF devices"
Returns when the BPF buffer is full, the BPF timeout has expired, or
when the BPF has
.Dq immediate mode
enabled and there is any data to read;
.Fa data
contains the number of bytes available.
.El
.It Dv EVFILT_EXCEPT
Takes a descriptor as the identifier, and returns whenever one of the
specified exceptional conditions has occurred on the descriptor.
Conditions are specified in
.Fa fflags .
Currently, a filter can monitor the reception of out-of-band data
on a socket or pseudo terminal with
.Dv NOTE_OOB .
.It Dv EVFILT_WRITE
Takes a descriptor as the identifier, and returns whenever
it is possible to write to the descriptor.
For sockets, pipes, and FIFOs,
.Fa data
will contain the amount of space remaining in the write buffer.
The filter will set
.Dv EV_EOF
when the reader disconnects, and for the FIFO case,
this may be cleared by use of
.Dv EV_CLEAR .
Note that this filter is not supported for vnodes or BPF devices.
.Pp
For sockets, the low water mark and socket error handling are
identical to the
.Dv EVFILT_READ
case.
.\".It Dv EVFILT_AIO
.\"The sigevent portion of the AIO request is filled in, with
.\".Va sigev_notify_kqueue
.\"containing the descriptor of the kqueue that the event should
.\"be attached to,
.\".Va sigev_value
.\"containing the udata value, and
.\".Va sigev_notify
.\"set to
.\".Dv SIGEV_KEVENT .
.\"When the aio_* function is called, the event will be registered
.\"with the specified kqueue, and the
.\".Va ident
.\"argument set to the
.\".Li struct aiocb
.\"returned by the aio_* function.
.\"The filter returns under the same conditions as aio_error.
.\".Pp
.\"Alternatively, a kevent structure may be initialized, with
.\".Va ident
.\"containing the descriptor of the kqueue, and the
.\"address of the kevent structure placed in the
.\".Va aio_lio_opcode
.\"field of the AIO request.
.\"However, this approach will not work on architectures with 64-bit pointers,
.\"and should be considered deprecated.
.It Dv EVFILT_VNODE
Takes a file descriptor as the identifier and the events to watch for in
.Fa fflags ,
and returns when one or more of the requested events occurs on the descriptor.
The events to monitor are:
.Bl -tag -width XXNOTE_RENAME
.It Dv NOTE_DELETE
.Xr unlink 2
was called on the file referenced by the descriptor.
.It Dv NOTE_WRITE
A write occurred on the file referenced by the descriptor.
.It Dv NOTE_EXTEND
The file referenced by the descriptor was extended.
.It Dv NOTE_TRUNCATE
The file referenced by the descriptor was truncated.
.It Dv NOTE_ATTRIB
The file referenced by the descriptor had its attributes changed.
.It Dv NOTE_LINK
The link count on the file changed.
.It Dv NOTE_RENAME
The file referenced by the descriptor was renamed.
.It Dv NOTE_REVOKE
Access to the file was revoked via
.Xr revoke 2
or the underlying file system was unmounted.
.El
.Pp
On return,
.Fa fflags
contains the events which triggered the filter.
.It Dv EVFILT_PROC
Takes the process ID to monitor as the identifier and the events to watch for
in
.Fa fflags ,
and returns when the process performs one or more of the requested events.
If a process can normally see another process, it can attach an event to it.
The events to monitor are:
.Bl -tag -width XXNOTE_TRACKERR
.It Dv NOTE_EXIT
The process has exited.
The exit status will be stored in
.Fa data
in the same format as the status set by
.Xr wait 2 .
.It Dv NOTE_FORK
The process has called
.Xr fork 2 .
.It Dv NOTE_EXEC
The process has executed a new process via
.Xr execve 2
or similar call.
.It Dv NOTE_TRACK
Follow a process across
.Xr fork 2
calls.
The parent process will return with
.Dv NOTE_FORK
set in the
.Fa fflags
field, while the child process will return with
.Dv NOTE_CHILD
set in
.Fa fflags
and the parent PID in
.Fa data .
.It Dv NOTE_TRACKERR
This flag is returned if the system was unable to attach an event to
the child process, usually due to resource limitations.
.El
.Pp
On return,
.Fa fflags
contains the events which triggered the filter.
.It Dv EVFILT_SIGNAL
Takes the signal number to monitor as the identifier and returns
when the given signal is delivered to the process.
This coexists with the
.Xr signal 3
and
.Xr sigaction 2
facilities, and has a lower precedence.
The filter will record all attempts to deliver a signal to a process,
even if the signal has been marked as
.Dv SIG_IGN .
Event notification happens after normal signal delivery processing.
.Fa data
returns the number of times the signal has occurred since the last call to
.Fn kevent .
This filter automatically sets the
.Dv EV_CLEAR
flag internally.
.It Dv EVFILT_TIMER
Establishes an arbitrary timer identified by
.Fa ident .
When adding a timer,
.Fa data
specifies the timeout period in units described below or, if
.Dv NOTE_ABSTIME
is set in
.Va fflags ,
the absolute time at which the timer should fire.
The timer will repeat unless
.Dv EV_ONESHOT
is set in
.Va flags
or
.Dv NOTE_ABSTIME
is set in
.Va fflags .
On return,
.Fa data
contains the number of times the timeout has expired since the last call to
.Fn kevent .
This filter automatically sets
.Dv EV_CLEAR
in
.Va flags
for periodic timers.
Timers created with
.Dv NOTE_ABSTIME
remain activated on the kqueue once the absolute time has passed unless
.Dv EV_CLEAR
or
.Dv EV_ONESHOT
is also specified.
.Pp
The filter accepts the following flags in the
.Va fflags
argument:
.Bl -tag -width NOTE_MSECONDS
.It Dv NOTE_SECONDS
The timer value in
.Va data
is expressed in seconds.
.It Dv NOTE_MSECONDS
The timer value in
.Va data
is expressed in milliseconds.
.It Dv NOTE_USECONDS
The timer value in
.Va data
is expressed in microseconds.
.It Dv NOTE_NSECONDS
The timer value in
.Va data
is expressed in nanoseconds.
.It Dv NOTE_ABSTIME
The timer value is an absolute time with
.Dv CLOCK_REALTIME
as the reference clock.
.El
.Pp
Note that
.Dv NOTE_SECONDS ,
.Dv NOTE_MSECONDS ,
.Dv NOTE_USECONDS ,
and
.Dv NOTE_NSECONDS
are mutually exclusive; behavior is undefined if more than one is specified.
If a timer value unit is not specified, the default is
.Dv NOTE_MSECONDS .
.Pp
If an existing timer is re-added, the existing timer and related pending events
will be cancelled.
The timer will be re-started using the timeout period
.Fa data .
.It Dv EVFILT_DEVICE
Takes a descriptor as the identifier and the events to watch for in
.Fa fflags ,
and returns when one or more of the requested events occur on the
descriptor.
The events to monitor are:
.Bl -tag -width XXNOTE_CHANGE
.It Dv NOTE_CHANGE
A device change event has occurred,
e.g. an HDMI cable has been plugged into a port.
.El
.Pp
On return,
.Fa fflags
contains the events which triggered the filter.
.El
.Sh RETURN VALUES
.Fn kqueue
and
.Fn kqueue1
create a new kernel event queue and return a file descriptor.
If there was an error creating the kernel event queue, a value of -1 is
returned and
.Va errno
is set.
.Pp
.Fn kevent
returns the number of events placed in the
.Fa eventlist ,
up to the value given by
.Fa nevents .
If an error occurs while processing an element of the
.Fa changelist
and there is enough room in the
.Fa eventlist ,
then the event will be placed in the
.Fa eventlist
with
.Dv EV_ERROR
set in
.Fa flags
and the system error in
.Fa data .
Otherwise, -1 will be returned, and
.Va errno
will be set to indicate the error condition.
If the time limit expires, then
.Fn kevent
returns 0.
.Sh ERRORS
The
.Fn kqueue
and
.Fn kqueue1
functions fail if:
.Bl -tag -width Er
.It Bq Er ENOMEM
The kernel failed to allocate enough memory for the kernel queue.
.It Bq Er EMFILE
The per-process descriptor table is full.
.It Bq Er ENFILE
The system file table is full.
.El
.Pp
In addition,
.Fn kqueue1
fails if:
.Bl -tag -width Er
.It Bq Er EINVAL
.Fa flags
is invalid.
.El
.Pp
The
.Fn kevent
function fails if:
.Bl -tag -width Er
.It Bq Er EACCES
The process does not have permission to register a filter.
.It Bq Er EFAULT
There was an error reading or writing the
.Vt kevent
structure.
.It Bq Er EBADF
The specified descriptor is invalid.
.It Bq Er EINTR
A signal was delivered before the timeout expired and before any
events were placed on the kqueue for return.
.It Bq Er EINVAL
The specified time limit or filter is invalid.
.It Bq Er ENOENT
The event could not be found to be modified or deleted.
.It Bq Er ENOMEM
No memory was available to register the event.
.It Bq Er ESRCH
The specified process to attach to does not exist.
.El
.Sh SEE ALSO
.Xr clock_gettime 2 ,
.Xr poll 2 ,
.Xr read 2 ,
.Xr select 2 ,
.Xr sigaction 2 ,
.Xr wait 2 ,
.Xr write 2 ,
.Xr signal 3
.Sh HISTORY
The
.Fn kqueue
and
.Fn kevent
functions first appeared in
.Fx 4.1
and have been available since
.Ox 2.9 .
.Sh AUTHORS
The
.Fn kqueue
system and this manual page were written by
.An Jonathan Lemon Aq Mt jlemon@FreeBSD.org .
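.Sh EXAMPLES
The following is a minimal sketch rather than a definitive implementation.
It arms a one-shot
.Dv EVFILT_TIMER
event using
.Dv NOTE_MSECONDS
and blocks until it fires.
The helper name
.Fn wait_ms
and the timer identifier 1 are arbitrary choices made for this example,
and the sketch assumes that no other events are registered on
.Fa kq .
.Bd -literal
#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>

#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: arm a one-shot timer of ms milliseconds
 * on kq and wait for it.  Returns the expiry count reported in
 * data, or -1 on error.  The timer identifier (1) is arbitrary.
 */
int64_t
wait_ms(int kq, int64_t ms)
{
	struct kevent kev;

	EV_SET(&kev, 1, EVFILT_TIMER, EV_ADD | EV_ONESHOT,
	    NOTE_MSECONDS, ms, NULL);

	/* Apply the change, then block until one event arrives. */
	if (kevent(kq, &kev, 1, &kev, 1, NULL) != 1)
		return -1;
	if (kev.flags & EV_ERROR) {
		/* The change was rejected; data holds the error. */
		errno = (int)kev.data;
		return -1;
	}
	return kev.data;
}
.Ed
.Pp
Because
.Dv EV_ONESHOT
is set, the timer is deleted once the event has been retrieved,
so the same identifier can be reused by a later call.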