.\"	$OpenBSD: kqueue.2,v 1.32 2015/11/07 22:57:52 jmc Exp $
.\"
.\" Copyright (c) 2000 Jonathan Lemon
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD: src/lib/libc/sys/kqueue.2,v 1.18 2001/02/14 08:48:35 guido Exp $
.\"
.Dd $Mdocdate: November 7 2015 $
.Dt KQUEUE 2
.Os
.Sh NAME
.Nm kqueue ,
.Nm kevent ,
.Nm EV_SET
.Nd kernel event notification mechanism
.Sh SYNOPSIS
.In sys/types.h
.In sys/event.h
.In sys/time.h
.Ft int
.Fn kqueue "void"
.Ft int
.Fn kevent "int kq" "const struct kevent *changelist" "int nchanges" "struct kevent *eventlist" "int nevents" "const struct timespec *timeout"
.Fn EV_SET "&kev" ident filter flags fflags data udata
.Sh DESCRIPTION
.Fn kqueue
provides a generic method of notifying the user when an event
happens or a condition holds, based on the results of small
pieces of kernel code termed
.Dq filters .
A kevent is identified by the (ident, filter) pair; there may only
be one unique kevent per kqueue.
.Pp
The filter is executed upon the initial registration of a kevent
in order to detect whether a preexisting condition is present, and is also
executed whenever an event is passed to the filter for evaluation.
If the filter determines that the condition should be reported,
then the kevent is placed on the kqueue for the user to retrieve.
.Pp
The filter is also run when the user attempts to retrieve the kevent
from the kqueue.
If the filter indicates that the condition that triggered
the event no longer holds, the kevent is removed from the kqueue and
is not returned.
.Pp
Multiple events which trigger the filter do not result in multiple
kevents being placed on the kqueue; instead, the filter will aggregate
the events into a single
.Li struct kevent .
Calling
.Fn close
on a file descriptor will remove any kevents that reference the descriptor.
.Pp
.Fn kqueue
creates a new kernel event queue and returns a descriptor.
The queue is not inherited by a child created with
.Xr fork 2 .
Similarly, kqueues cannot be passed across UNIX-domain sockets.
.Pp
.Fn kevent
is used to register events with the queue, and return any pending
events to the user.
.Fa changelist
is a pointer to an array of
.Va kevent
structures, as defined in
.In sys/event.h .
All changes contained in the
.Fa changelist
are applied before any pending events are read from the queue.
.Fa nchanges
gives the size of
.Fa changelist .
.Fa eventlist
is a pointer to an array of kevent structures.
.Fa nevents
determines the size of
.Fa eventlist .
When
.Fa nevents
is zero,
.Fn kevent
will return immediately even if there is a
.Fa timeout
specified, unlike
.Xr select 2 .
If
.Fa timeout
is a non-null pointer, it specifies a maximum interval to wait
for an event, which will be interpreted as a
.Li struct timespec .
If
.Fa timeout
is a null pointer,
.Fn kevent
waits indefinitely.
To effect a poll, the
.Fa timeout
argument should be non-null, pointing to a zero-valued
.Va timespec
structure.
The same array may be used for the
.Fa changelist
and
.Fa eventlist .
.Pp
.Fn EV_SET
is a macro which is provided for ease of initializing a
kevent structure.
.Pp
The
.Va kevent
structure is defined as:
.Bd -literal
struct kevent {
	uintptr_t  ident;	/* identifier for this event */
	short	   filter;	/* filter for event */
	u_short	   flags;	/* action flags for kqueue */
	u_int	   fflags;	/* filter flag value */
	quad_t	   data;	/* filter data value */
	void	  *udata;	/* opaque user data identifier */
};
.Ed
.Pp
The fields of
.Li struct kevent
are:
.Bl -tag -width XXXfilter
.It ident
Value used to identify this event.
The exact interpretation is determined by the attached filter,
but often is a file descriptor.
.It filter
Identifies the kernel filter used to process this event.
The pre-defined system filters are described below.
.It flags
Actions to perform on the event.
.It fflags
Filter-specific flags.
.It data
Filter-specific data value.
.It udata
Opaque user-defined value passed through the kernel unchanged.
.El
.Pp
The
.Va flags
field can contain the following values:
.Bl -tag -width XXXEV_ONESHOT
.It Dv EV_ADD
Adds the event to the kqueue.
Re-adding an existing event will modify the parameters of the original event,
and not result in a duplicate entry.
Adding an event automatically enables it, unless overridden by the
.Dv EV_DISABLE
flag.
.It Dv EV_ENABLE
Permit
.Fn kevent
to return the event if it is triggered.
.It Dv EV_DISABLE
Disable the event so
.Fn kevent
will not return it.
The filter itself is not disabled.
.It Dv EV_DELETE
Removes the event from the kqueue.
Events which are attached to file descriptors are automatically deleted
on the last close of the descriptor.
.It Dv EV_ONESHOT
Causes the event to return only the first occurrence of the filter
being triggered.
After the user retrieves the event from the kqueue, it is deleted.
.It Dv EV_CLEAR
After the event is retrieved by the user, its state is reset.
This is useful for filters which report state transitions
instead of the current state.
Note that some filters may automatically set this flag internally.
.It Dv EV_EOF
Filters may set this flag to indicate filter-specific EOF condition.
.It Dv EV_ERROR
See
.Sx RETURN VALUES
below.
.El
.Pp
The predefined system filters are listed below.
Arguments may be passed to and from the filter via the
.Va fflags
and
.Va data
fields in the kevent structure.
.Bl -tag -width EVFILT_SIGNAL
.It Dv EVFILT_READ
Takes a descriptor as the identifier, and returns whenever
there is data available to read.
The behavior of the filter is slightly different depending
on the descriptor type.
.Bl -tag -width 2n
.It Sockets
Sockets which have previously been passed to
.Fn listen
return when there is an incoming connection pending.
.Va data
contains the size of the listen backlog.
.Pp
Other socket descriptors return when there is data to be read,
subject to the
.Dv SO_RCVLOWAT
value of the socket buffer.
This may be overridden with a per-filter low water mark at the
time the filter is added by setting the
.Dv NOTE_LOWAT
flag in
.Va fflags ,
and specifying the new low water mark in
.Va data .
On return,
.Va data
contains the number of bytes in the socket buffer.
.Pp
If the read direction of the socket has been shut down, then the filter
also sets
.Dv EV_EOF
in
.Va flags ,
and returns the socket error (if any) in
.Va fflags .
It is possible for EOF to be returned (indicating the connection is gone)
while there is still data pending in the socket buffer.
.It Vnodes
Returns when the file pointer is not at the end of file.
.Va data
contains the offset from current position to end of file,
and may be negative.
If
.Dv NOTE_EOF
is set in
.Va fflags ,
.Fn kevent
will also return when the file pointer is at the end of file.
The end of file condition is indicated by the presence of
.Dv NOTE_EOF
in
.Va fflags
on return.
.It "Fifos, Pipes"
Returns when there is data to read;
.Va data
contains the number of bytes available.
.Pp
When the last writer disconnects, the filter will set
.Dv EV_EOF
in
.Va flags .
This may be cleared by passing in
.Dv EV_CLEAR ,
at which point the filter will resume waiting for data to become
available before returning.
.It "BPF devices"
Returns when the BPF buffer is full, the BPF timeout has expired, or
when the BPF has
.Dq immediate mode
enabled and there is any data to read;
.Va data
contains the number of bytes available.
.El
.It Dv EVFILT_WRITE
Takes a descriptor as the identifier, and returns whenever
it is possible to write to the descriptor.
For sockets, pipes, and FIFOs,
.Va data
will contain the amount of space remaining in the write buffer.
The filter will set
.Dv EV_EOF
when the reader disconnects, and for the FIFO case,
this may be cleared by use of
.Dv EV_CLEAR .
Note that this filter is not supported for vnodes or BPF devices.
.Pp
For sockets, the low water mark and socket error handling are
identical to the
.Dv EVFILT_READ
case.
.\".It Dv EVFILT_AIO
.\"The sigevent portion of the AIO request is filled in, with
.\".Va sigev_notify_kqueue
.\"containing the descriptor of the kqueue that the event should
.\"be attached to,
.\".Va sigev_value
.\"containing the udata value, and
.\".Va sigev_notify
.\"set to
.\".Dv SIGEV_KEVENT .
.\"When the aio_* function is called, the event will be registered
.\"with the specified kqueue, and the
.\".Va ident
.\"argument set to the
.\".Li struct aiocb
.\"returned by the aio_* function.
.\"The filter returns under the same conditions as aio_error.
.\".Pp
.\"Alternatively, a kevent structure may be initialized, with
.\".Va ident
.\"containing the descriptor of the kqueue, and the
.\"address of the kevent structure placed in the
.\".Va aio_lio_opcode
.\"field of the AIO request.
.\"However, this approach will not work on architectures with 64-bit pointers,
.\"and should be considered deprecated.
.It Dv EVFILT_VNODE
Takes a file descriptor as the identifier and the events to watch for in
.Va fflags ,
and returns when one or more of the requested events occurs on the descriptor.
The events to monitor are:
.Bl -tag -width XXNOTE_RENAME
.It Dv NOTE_DELETE
.Fn unlink
was called on the file referenced by the descriptor.
.It Dv NOTE_WRITE
A write occurred on the file referenced by the descriptor.
.It Dv NOTE_EXTEND
The file referenced by the descriptor was extended.
.It Dv NOTE_TRUNCATE
The file referenced by the descriptor was truncated.
.It Dv NOTE_ATTRIB
The file referenced by the descriptor had its attributes changed.
.It Dv NOTE_LINK
The link count on the file changed.
.It Dv NOTE_RENAME
The file referenced by the descriptor was renamed.
.It Dv NOTE_REVOKE
Access to the file was revoked via
.Xr revoke 2
or the underlying file system was unmounted.
.El
.Pp
On return,
.Va fflags
contains the events which triggered the filter.
.It Dv EVFILT_PROC
Takes the process ID to monitor as the identifier and the events to watch for
in
.Va fflags ,
and returns when the process performs one or more of the requested events.
If a process can normally see another process, it can attach an event to it.
The events to monitor are:
.Bl -tag -width XXNOTE_TRACKERR
.It Dv NOTE_EXIT
The process has exited.
The exit status will be stored in
.Va data
in the same format as the status set by
.Xr wait 2 .
.It Dv NOTE_FORK
The process has called
.Fn fork .
.It Dv NOTE_EXEC
The process has executed a new process via
.Xr execve 2
or similar call.
.It Dv NOTE_TRACK
Follow a process across
.Fn fork
calls.
The parent process will return with
.Dv NOTE_FORK
set in the
.Va fflags
field, while the child process will return with
.Dv NOTE_CHILD
set in
.Va fflags
and the parent PID in
.Va data .
.It Dv NOTE_TRACKERR
This flag is returned if the system was unable to attach an event to
the child process, usually due to resource limitations.
.El
.Pp
On return,
.Va fflags
contains the events which triggered the filter.
.It Dv EVFILT_SIGNAL
Takes the signal number to monitor as the identifier and returns
when the given signal is delivered to the process.
This coexists with the
.Fn signal
and
.Fn sigaction
facilities, and has a lower precedence.
The filter will record all attempts to deliver a signal to a process,
even if the signal has been marked as
.Dv SIG_IGN .
Event notification happens after normal signal delivery processing.
.Va data
returns the number of times the signal has occurred since the last call to
.Fn kevent .
This filter automatically sets the
.Dv EV_CLEAR
flag internally.
.It Dv EVFILT_TIMER
Establishes an arbitrary timer identified by
.Va ident .
When adding a timer,
.Va data
specifies the timeout period in milliseconds.
The timer will be periodic unless
.Dv EV_ONESHOT
is specified.
On return,
.Va data
contains the number of times the timeout has expired since the last call to
.Fn kevent .
This filter automatically sets the
.Dv EV_CLEAR
flag internally.
.El
.Sh RETURN VALUES
.Fn kqueue
creates a new kernel event queue and returns a file descriptor.
If there was an error creating the kernel event queue, a value of -1 is
returned and errno is set.
.Pp
.Fn kevent
returns the number of events placed in the
.Fa eventlist ,
up to the value given by
.Fa nevents .
If an error occurs while processing an element of the
.Fa changelist
and there is enough room in the
.Fa eventlist ,
then the event will be placed in the
.Fa eventlist
with
.Dv EV_ERROR
set in
.Va flags
and the system error in
.Va data .
Otherwise,
.Dv -1
will be returned, and
.Dv errno
will be set to indicate the error condition.
If the time limit expires, then
.Fn kevent
returns 0.
.Sh ERRORS
The
.Fn kqueue
function fails if:
.Bl -tag -width Er
.It Bq Er ENOMEM
The kernel failed to allocate enough memory for the kernel queue.
.It Bq Er EMFILE
The per-process descriptor table is full.
.It Bq Er ENFILE
The system file table is full.
.El
.Pp
The
.Fn kevent
function fails if:
.Bl -tag -width Er
.It Bq Er EACCES
The process does not have permission to register a filter.
.It Bq Er EFAULT
There was an error reading or writing the
.Va kevent
structure.
.It Bq Er EBADF
The specified descriptor is invalid.
.It Bq Er EINTR
A signal was delivered before the timeout expired and before any
events were placed on the kqueue for return.
.It Bq Er EINVAL
The specified time limit or filter is invalid.
.It Bq Er ENOENT
The event could not be found to be modified or deleted.
.It Bq Er ENOMEM
No memory was available to register the event.
.It Bq Er ESRCH
The specified process to attach to does not exist.
.El
.Sh SEE ALSO
.Xr poll 2 ,
.Xr read 2 ,
.Xr select 2 ,
.Xr sigaction 2 ,
.Xr wait 2 ,
.Xr write 2 ,
.Xr signal 3
.Sh HISTORY
The
.Fn kqueue
and
.Fn kevent
functions first appeared in
.Fx 4.1 .
.Sh AUTHORS
The
.Fn kqueue
system and this manual page were written by
.An Jonathan Lemon Aq Mt jlemon@FreeBSD.org .
.Sh BUGS
It is currently not possible to watch FIFOs or AIO that reside
on anything but a UFS file system.
Watching a vnode is possible on UFS, NFS and MS-DOS file systems.
.Pp
The
.Fa timeout
value is limited to 24 hours; longer timeouts will be silently
reinterpreted as 24 hours.
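.Pp
Because of the 24-hour limit, a caller that needs to wait longer has to issue
.Fn kevent
more than once.
One possible approach, sketched here with a hypothetical helper named
.Fn next_timeout_chunk
that is not part of any system interface, is to split the remaining
interval into pieces of at most 24 hours:

```c
#include <time.h>

#define KEVENT_MAX_WAIT_SEC	(24 * 60 * 60)	/* the 24-hour limit */

/*
 * Hypothetical helper: copy at most 24 hours of *remaining into *chunk
 * and subtract that amount from *remaining.
 * Returns 1 if time is still left over after this chunk, otherwise 0.
 */
int
next_timeout_chunk(struct timespec *remaining, struct timespec *chunk)
{
	if (remaining->tv_sec < KEVENT_MAX_WAIT_SEC) {
		/* The rest fits in a single call. */
		*chunk = *remaining;
		remaining->tv_sec = 0;
		remaining->tv_nsec = 0;
		return (0);
	}

	/* Take a full 24-hour piece and leave the rest for later calls. */
	chunk->tv_sec = KEVENT_MAX_WAIT_SEC;
	chunk->tv_nsec = 0;
	remaining->tv_sec -= KEVENT_MAX_WAIT_SEC;
	return (remaining->tv_sec > 0 || remaining->tv_nsec > 0);
}
```

Each chunk can then be passed as the
.Fa timeout
argument, repeating for as long as the helper returns 1 and
.Fn kevent
returns 0.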