.\"	$OpenBSD: kqueue.2,v 1.39 2019/07/01 16:52:02 cheloha Exp $
.\"
.\" Copyright (c) 2000 Jonathan Lemon
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD: src/lib/libc/sys/kqueue.2,v 1.18 2001/02/14 08:48:35 guido Exp $
.\"
.Dd $Mdocdate: July 1 2019 $
.Dt KQUEUE 2
.Os
.Sh NAME
.Nm kqueue ,
.Nm kevent ,
.Nm EV_SET
.Nd kernel event notification mechanism
.Sh SYNOPSIS
.In sys/types.h
.In sys/event.h
.In sys/time.h
.Ft int
.Fn kqueue "void"
.Ft int
.Fn kevent "int kq" "const struct kevent *changelist" "int nchanges" "struct kevent *eventlist" "int nevents" "const struct timespec *timeout"
.Fn EV_SET "&kev" ident filter flags fflags data udata
.Sh DESCRIPTION
.Fn kqueue
provides a generic method of notifying the user when an event
happens or a condition holds, based on the results of small
pieces of kernel code termed
.Dq filters .
A kevent is identified by the (ident, filter) pair; there may only
be one unique kevent per kqueue.
.Pp
The filter is executed upon the initial registration of a kevent
in order to detect whether a preexisting condition is present, and is also
executed whenever an event is passed to the filter for evaluation.
If the filter determines that the condition should be reported,
then the kevent is placed on the kqueue for the user to retrieve.
.Pp
The filter is also run when the user attempts to retrieve the kevent
from the kqueue.
If the filter indicates that the condition that triggered
the event no longer holds, the kevent is removed from the kqueue and
is not returned.
.Pp
Multiple events which trigger the filter do not result in multiple
kevents being placed on the kqueue; instead, the filter will aggregate
the events into a single
.Vt struct kevent .
Calling
.Xr close 2
on a file descriptor will remove any kevents that reference the descriptor.
.Pp
.Fn kqueue
creates a new kernel event queue and returns a descriptor.
The queue is not inherited by a child created with
.Xr fork 2 .
Similarly, kqueues cannot be passed across UNIX-domain sockets.
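.Pp
As an illustration of the statement that a queue is not inherited across
.Xr fork 2 ,
the following minimal sketch creates a queue, forks, and lets the child
poll the descriptor number it would otherwise have inherited.
The function name
.Fn child_can_use_kqueue
and the
.Dv HAVE_KQUEUE
guard are hypothetical and not part of this interface; the guard is an
assumption that lets the file compile on systems without
.In sys/event.h ,
where the function only reports an error.
The child is expected to see the poll fail, but the exact error is not
specified here, so the sketch checks only for failure.

```c
#include <sys/types.h>
#include <sys/time.h>
#include <sys/wait.h>
#include <errno.h>
#include <stddef.h>
#include <time.h>
#include <unistd.h>

/* Assumption: these systems provide kqueue; elsewhere the sketch is a stub. */
#if defined(__OpenBSD__) || defined(__FreeBSD__) || defined(__NetBSD__) || defined(__APPLE__)
#include <sys/event.h>
#define HAVE_KQUEUE 1
#endif

/*
 * Hypothetical helper: returns 1 if a forked child could poll the parent's
 * kqueue descriptor, 0 if the poll failed in the child, and -1 if the
 * experiment itself could not be set up (errno is set).
 */
int
child_can_use_kqueue(void)
{
#ifdef HAVE_KQUEUE
	struct timespec zero = { 0, 0 };	/* zero-valued timeout: poll */
	struct kevent ev;
	pid_t pid;
	int kq, status;

	if ((kq = kqueue()) == -1)
		return -1;
	if ((pid = fork()) == -1) {
		close(kq);
		return -1;
	}
	if (pid == 0) {
		/* Child: try to use the descriptor number from the parent. */
		int n = kevent(kq, NULL, 0, &ev, 1, &zero);
		_exit(n == -1 ? 0 : 1);
	}
	close(kq);
	if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status);
#else
	errno = ENOSYS;		/* no kqueue on this system */
	return -1;
#endif
}
```

A caller that relies on a queue in both parent and child would therefore
create a separate queue in the child after the fork rather than reuse the
parent's descriptor.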
.Pp
.Fn kevent
is used to register events with the queue, and return any pending
events to the user.
.Fa changelist
is a pointer to an array of
.Vt kevent
structures, as defined in
.In sys/event.h .
All changes contained in the
.Fa changelist
are applied before any pending events are read from the queue.
.Fa nchanges
gives the size of
.Fa changelist .
.Fa eventlist
is a pointer to an array of
.Vt kevent
structures.
.Fa nevents
determines the size of
.Fa eventlist .
When
.Fa nevents
is zero,
.Fn kevent
will return immediately even if there is a
.Fa timeout
specified, unlike
.Xr select 2 .
If
.Fa timeout
is not
.Dv NULL ,
it specifies a maximum interval to wait
for an event, which will be interpreted as a
.Vt struct timespec .
If
.Fa timeout
is
.Dv NULL ,
.Fn kevent
waits indefinitely.
To effect a poll, the
.Fa timeout
argument should not be
.Dv NULL ,
but should instead point to a zero-valued
.Vt struct timespec .
The same array may be used for the
.Fa changelist
and
.Fa eventlist .
.Pp
.Fn EV_SET
is a macro which is provided for ease of initializing a
.Vt kevent
structure.
.Pp
The
.Vt kevent
structure is defined as:
.Bd -literal
struct kevent {
	uintptr_t  ident;	/* identifier for this event */
	short	   filter;	/* filter for event */
	u_short	   flags;	/* action flags for kqueue */
	u_int	   fflags;	/* filter flag value */
	int64_t	   data;	/* filter data value */
	void	  *udata;	/* opaque user data identifier */
};
.Ed
.Pp
The fields of
.Vt struct kevent
are:
.Bl -tag -width XXXfilter
.It Fa ident
Value used to identify this event.
The exact interpretation is determined by the attached filter,
but often is a file descriptor.
.It Fa filter
Identifies the kernel filter used to process this event.
The pre-defined system filters are described below.
.It Fa flags
Actions to perform on the event.
.It Fa fflags
Filter-specific flags.
.It Fa data
Filter-specific data value.
.It Fa udata
Opaque user-defined value passed through the kernel unchanged.
.El
.Pp
The
.Fa flags
field can contain the following values:
.Bl -tag -width XXXEV_ONESHOT
.It Dv EV_ADD
Adds the event to the kqueue.
Re-adding an existing event will modify the parameters of the original event,
and not result in a duplicate entry.
Adding an event automatically enables it, unless overridden by the
.Dv EV_DISABLE
flag.
.It Dv EV_ENABLE
Permit
.Fn kevent
to return the event if it is triggered.
.It Dv EV_DISABLE
Disable the event so
.Fn kevent
will not return it.
The filter itself is not disabled.
.It Dv EV_DISPATCH
Disable the event source immediately after delivery of an event.
See
.Dv EV_DISABLE
above.
.It Dv EV_DELETE
Removes the event from the kqueue.
Events which are attached to file descriptors are automatically deleted
on the last close of the descriptor.
.It Dv EV_RECEIPT
Causes
.Fn kevent
to return with
.Dv EV_ERROR
set without draining any pending events after updating events in the kqueue.
When a filter is successfully added the
.Fa data
field will be zero.
This flag is useful for making bulk changes to a kqueue.
.It Dv EV_ONESHOT
Causes the event to return only the first occurrence of the filter
being triggered.
After the user retrieves the event from the kqueue, it is deleted.
.It Dv EV_CLEAR
After the event is retrieved by the user, its state is reset.
This is useful for filters which report state transitions
instead of the current state.
Note that some filters may automatically set this flag internally.
.It Dv EV_EOF
Filters may set this flag to indicate a filter-specific EOF condition.
.It Dv EV_ERROR
See
.Sx RETURN VALUES
below.
.El
.Pp
The predefined system filters are listed below.
Arguments may be passed to and from the filter via the
.Fa fflags
and
.Fa data
fields in the
.Vt kevent
structure.
.Bl -tag -width EVFILT_SIGNAL
.It Dv EVFILT_READ
Takes a descriptor as the identifier, and returns whenever
there is data available to read.
The behavior of the filter is slightly different depending
on the descriptor type.
.Bl -tag -width 2n
.It Sockets
Sockets which have previously been passed to
.Xr listen 2
return when there is an incoming connection pending.
.Fa data
contains the size of the listen backlog.
.Pp
Other socket descriptors return when there is data to be read,
subject to the
.Dv SO_RCVLOWAT
value of the socket buffer.
This may be overridden with a per-filter low water mark at the
time the filter is added by setting the
.Dv NOTE_LOWAT
flag in
.Fa fflags ,
and specifying the new low water mark in
.Fa data .
On return,
.Fa data
contains the number of bytes in the socket buffer.
.Pp
If the read direction of the socket has been shut down, then the filter
also sets
.Dv EV_EOF
in
.Fa flags ,
and returns the socket error (if any) in
.Fa fflags .
It is possible for EOF to be returned (indicating the connection is gone)
while there is still data pending in the socket buffer.
.It Vnodes
Returns when the file pointer is not at the end of file.
.Fa data
contains the offset from current position to end of file,
and may be negative.
If
.Dv NOTE_EOF
is set in
.Fa fflags ,
.Fn kevent
will also return when the file pointer is at the end of file.
The end of file condition is indicated by the presence of
.Dv NOTE_EOF
in
.Fa fflags
on return.
.It "FIFOs, Pipes"
Returns when there is data to read;
.Fa data
contains the number of bytes available.
.Pp
When the last writer disconnects, the filter will set
.Dv EV_EOF
in
.Fa flags .
This may be cleared by passing in
.Dv EV_CLEAR ,
at which point the filter will resume waiting for data to become
available before returning.
.It "BPF devices"
Returns when the BPF buffer is full, the BPF timeout has expired, or
when the BPF has
.Dq immediate mode
enabled and there is any data to read;
.Fa data
contains the number of bytes available.
.El
.It Dv EVFILT_WRITE
Takes a descriptor as the identifier, and returns whenever
it is possible to write to the descriptor.
For sockets, pipes, and FIFOs,
.Fa data
will contain the amount of space remaining in the write buffer.
The filter will set
.Dv EV_EOF
when the reader disconnects, and for the FIFO case,
this may be cleared by use of
.Dv EV_CLEAR .
Note that this filter is not supported for vnodes or BPF devices.
.Pp
For sockets, the low water mark and socket error handling are
identical to the
.Dv EVFILT_READ
case.
.\".It Dv EVFILT_AIO
.\"The sigevent portion of the AIO request is filled in, with
.\".Va sigev_notify_kqueue
.\"containing the descriptor of the kqueue that the event should
.\"be attached to,
.\".Va sigev_value
.\"containing the udata value, and
.\".Va sigev_notify
.\"set to
.\".Dv SIGEV_KEVENT .
.\"When the aio_* function is called, the event will be registered
.\"with the specified kqueue, and the
.\".Va ident
.\"argument set to the
.\".Li struct aiocb
.\"returned by the aio_* function.
.\"The filter returns under the same conditions as aio_error.
.\".Pp
.\"Alternatively, a kevent structure may be initialized, with
.\".Va ident
.\"containing the descriptor of the kqueue, and the
.\"address of the kevent structure placed in the
.\".Va aio_lio_opcode
.\"field of the AIO request.
.\"However, this approach will not work on architectures with 64-bit pointers,
.\"and should be considered deprecated.
.It Dv EVFILT_VNODE
Takes a file descriptor as the identifier and the events to watch for in
.Fa fflags ,
and returns when one or more of the requested events occurs on the descriptor.
The events to monitor are:
.Bl -tag -width XXNOTE_RENAME
.It Dv NOTE_DELETE
.Xr unlink 2
was called on the file referenced by the descriptor.
.It Dv NOTE_WRITE
A write occurred on the file referenced by the descriptor.
.It Dv NOTE_EXTEND
The file referenced by the descriptor was extended.
.It Dv NOTE_TRUNCATE
The file referenced by the descriptor was truncated.
.It Dv NOTE_ATTRIB
The file referenced by the descriptor had its attributes changed.
.It Dv NOTE_LINK
The link count on the file changed.
.It Dv NOTE_RENAME
The file referenced by the descriptor was renamed.
.It Dv NOTE_REVOKE
Access to the file was revoked via
.Xr revoke 2
or the underlying file system was unmounted.
.El
.Pp
On return,
.Fa fflags
contains the events which triggered the filter.
.It Dv EVFILT_PROC
Takes the process ID to monitor as the identifier and the events to watch for
in
.Fa fflags ,
and returns when the process performs one or more of the requested events.
If a process can normally see another process, it can attach an event to it.
The events to monitor are:
.Bl -tag -width XXNOTE_TRACKERR
.It Dv NOTE_EXIT
The process has exited.
The exit status will be stored in
.Fa data
in the same format as the status set by
.Xr wait 2 .
.It Dv NOTE_FORK
The process has called
.Xr fork 2 .
.It Dv NOTE_EXEC
The process has executed a new process via
.Xr execve 2
or similar call.
.It Dv NOTE_TRACK
Follow a process across
.Xr fork 2
calls.
The parent process will return with
.Dv NOTE_FORK
set in the
.Fa fflags
field, while the child process will return with
.Dv NOTE_CHILD
set in
.Fa fflags
and the parent PID in
.Fa data .
.It Dv NOTE_TRACKERR
This flag is returned if the system was unable to attach an event to
the child process, usually due to resource limitations.
.El
.Pp
On return,
.Fa fflags
contains the events which triggered the filter.
.It Dv EVFILT_SIGNAL
Takes the signal number to monitor as the identifier and returns
when the given signal is delivered to the process.
This coexists with the
.Xr signal 3
and
.Xr sigaction 2
facilities, and has a lower precedence.
The filter will record all attempts to deliver a signal to a process,
even if the signal has been marked as
.Dv SIG_IGN .
Event notification happens after normal signal delivery processing.
.Fa data
returns the number of times the signal has occurred since the last call to
.Fn kevent .
This filter automatically sets the
.Dv EV_CLEAR
flag internally.
.It Dv EVFILT_TIMER
Establishes an arbitrary timer identified by
.Fa ident .
When adding a timer,
.Fa data
specifies the timeout period in milliseconds.
The timer will be periodic unless
.Dv EV_ONESHOT
is specified.
On return,
.Fa data
contains the number of times the timeout has expired since the last call to
.Fn kevent .
This filter automatically sets the
.Dv EV_CLEAR
flag internally.
.It Dv EVFILT_DEVICE
Takes a descriptor as the identifier and the events to watch for in
.Fa fflags ,
and returns when one or more of the requested events occur on the
descriptor.
The events to monitor are:
.Bl -tag -width XXNOTE_CHANGE
.It Dv NOTE_CHANGE
A device change event has occurred,
e.g. an HDMI cable has been plugged in to a port.
.El
.Pp
On return,
.Fa fflags
contains the events which triggered the filter.
.El
.Sh RETURN VALUES
.Fn kqueue
creates a new kernel event queue and returns a file descriptor.
If there was an error creating the kernel event queue, a value of -1 is
returned and
.Va errno
set.
.Pp
.Fn kevent
returns the number of events placed in the
.Fa eventlist ,
up to the value given by
.Fa nevents .
If an error occurs while processing an element of the
.Fa changelist
and there is enough room in the
.Fa eventlist ,
then the event will be placed in the
.Fa eventlist
with
.Dv EV_ERROR
set in
.Fa flags
and the system error in
.Fa data .
Otherwise, -1 will be returned, and
.Va errno
will be set to indicate the error condition.
If the time limit expires, then
.Fn kevent
returns 0.
.Sh ERRORS
The
.Fn kqueue
function fails if:
.Bl -tag -width Er
.It Bq Er ENOMEM
The kernel failed to allocate enough memory for the kernel queue.
.It Bq Er EMFILE
The per-process descriptor table is full.
.It Bq Er ENFILE
The system file table is full.
.El
.Pp
The
.Fn kevent
function fails if:
.Bl -tag -width Er
.It Bq Er EACCES
The process does not have permission to register a filter.
.It Bq Er EFAULT
There was an error reading or writing the
.Vt kevent
structure.
.It Bq Er EBADF
The specified descriptor is invalid.
.It Bq Er EINTR
A signal was delivered before the timeout expired and before any
events were placed on the kqueue for return.
.It Bq Er EINVAL
The specified time limit or filter is invalid.
.It Bq Er ENOENT
The event could not be found to be modified or deleted.
.It Bq Er ENOMEM
No memory was available to register the event.
.It Bq Er ESRCH
The specified process to attach to does not exist.
.El
.Sh SEE ALSO
.Xr poll 2 ,
.Xr read 2 ,
.Xr select 2 ,
.Xr sigaction 2 ,
.Xr wait 2 ,
.Xr write 2 ,
.Xr signal 3
.Sh HISTORY
The
.Fn kqueue
and
.Fn kevent
functions first appeared in
.Fx 4.1
and have been available since
.Ox 2.9 .
.Sh AUTHORS
The
.Fn kqueue
system and this manual page were written by
.An Jonathan Lemon Aq Mt jlemon@FreeBSD.org .
.Sh BUGS
It is currently not possible to watch FIFOs or AIO that reside
on anything but a UFS file system.
Watching a vnode is possible on UFS, NFS and MS-DOS file systems.
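.Sh EXAMPLES
The following is a minimal sketch, not taken from the original page, of how
the
.Fa timeout
argument, the
.Fn EV_SET
macro, and the
.Dv EV_ERROR
convention from
.Sx RETURN VALUES
might be combined to wait for a descriptor to become readable.
The helper names
.Fn ms_to_timespec
and
.Fn wait_readable
are hypothetical, as is the
.Dv HAVE_KQUEUE
guard, which is an assumption that lets the file compile on systems without
.In sys/event.h .
The millisecond argument is assumed to be non-negative.

```c
#include <sys/types.h>
#include <sys/time.h>
#include <errno.h>
#include <stddef.h>
#include <time.h>
#include <unistd.h>

/* Assumption: these systems provide kqueue; elsewhere wait_readable is a stub. */
#if defined(__OpenBSD__) || defined(__FreeBSD__) || defined(__NetBSD__) || defined(__APPLE__)
#include <sys/event.h>
#define HAVE_KQUEUE 1
#endif

/* Hypothetical helper: convert non-negative milliseconds to a timespec. */
struct timespec
ms_to_timespec(long ms)
{
	struct timespec ts;

	ts.tv_sec = ms / 1000;
	ts.tv_nsec = (ms % 1000) * 1000000L;
	return ts;
}

/*
 * Hypothetical helper: wait up to ms milliseconds for fd to become readable.
 * Returns 1 if an event was returned, 0 if the time limit expired, and -1 on
 * error with errno set.
 */
int
wait_readable(int fd, long ms)
{
#ifdef HAVE_KQUEUE
	struct timespec ts = ms_to_timespec(ms);
	struct kevent kev;
	int kq, n, saved;

	if ((kq = kqueue()) == -1)
		return -1;
	EV_SET(&kev, fd, EVFILT_READ, EV_ADD, 0, 0, NULL);
	/* The same array serves as changelist and eventlist. */
	n = kevent(kq, &kev, 1, &kev, 1, &ts);
	if (n > 0 && (kev.flags & EV_ERROR)) {
		/* A changelist error is reported in data. */
		errno = (int)kev.data;
		n = -1;
	}
	saved = errno;		/* close() may overwrite errno */
	close(kq);
	errno = saved;
	return n > 0 ? 1 : n;
#else
	(void)fd;
	(void)ms;
	errno = ENOSYS;		/* no kqueue on this system */
	return -1;
#endif
}
```

Creating and closing a queue on every call keeps the sketch short; a
long-running program would more likely create one queue, register its
descriptors once, and call
.Fn kevent
in a loop.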