.\"	$OpenBSD: kqueue.2,v 1.37 2018/01/13 17:13:12 jmc Exp $
.\"
.\" Copyright (c) 2000 Jonathan Lemon
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD: src/lib/libc/sys/kqueue.2,v 1.18 2001/02/14 08:48:35 guido Exp $
.\"
.Dd $Mdocdate: January 13 2018 $
.Dt KQUEUE 2
.Os
.Sh NAME
.Nm kqueue ,
.Nm kevent ,
.Nm EV_SET
.Nd kernel event notification mechanism
.Sh SYNOPSIS
.In sys/types.h
.In sys/event.h
.In sys/time.h
.Ft int
.Fn kqueue "void"
.Ft int
.Fn kevent "int kq" "const struct kevent *changelist" "int nchanges" "struct kevent *eventlist" "int nevents" "const struct timespec *timeout"
.Fn EV_SET "&kev" ident filter flags fflags data udata
.Sh DESCRIPTION
.Fn kqueue
provides a generic method of notifying the user when an event
happens or a condition holds, based on the results of small
pieces of kernel code termed
.Dq filters .
A kevent is identified by the (ident, filter) pair;
a kqueue holds at most one kevent for any given pair.
.Pp
The filter is executed upon the initial registration of a kevent
in order to detect whether a preexisting condition is present, and is also
executed whenever an event is passed to the filter for evaluation.
If the filter determines that the condition should be reported,
then the kevent is placed on the kqueue for the user to retrieve.
.Pp
The filter is also run when the user attempts to retrieve the kevent
from the kqueue.
If the filter indicates that the condition that triggered
the event no longer holds, the kevent is removed from the kqueue and
is not returned.
.Pp
Multiple events which trigger the filter do not result in multiple
kevents being placed on the kqueue; instead, the filter will aggregate
the events into a single
.Li struct kevent .
Calling
.Fn close
on a file descriptor will remove any kevents that reference the descriptor.
.Pp
.Fn kqueue
creates a new kernel event queue and returns a descriptor.
The queue is not inherited by a child created with
.Xr fork 2 .
Similarly, kqueues cannot be passed across UNIX-domain sockets.
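.Pp
The aggregation described above can be illustrated with a short sketch;
it is an illustration under stated assumptions, not a definitive
implementation.
It registers
.Dv EVFILT_READ
on the read end of a pipe, writes to the pipe twice, and then polls the
queue once.
Both writes are expected to be reported through a single kevent whose
.Va data
field holds the combined byte count.
The function name
.Fn pipe_bytes_pending
and the
.Dv HAVE_KQUEUE
guard are hypothetical names introduced for this example; the guard only
lets the sketch compile on systems without
.Fn kqueue ,
where the function fails with
.Er ENOSYS .
.Bd -literal
```c
#include <sys/types.h>
#include <sys/time.h>
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Assumption: these systems provide kqueue(); elsewhere this is a stub. */
#if defined(__OpenBSD__) || defined(__FreeBSD__) || defined(__NetBSD__)
#define HAVE_KQUEUE 1
#elif defined(__DragonFly__) || defined(__APPLE__)
#define HAVE_KQUEUE 1
#endif

#ifdef HAVE_KQUEUE
#include <sys/event.h>
#endif

/*
 * Hypothetical helper: write to a pipe twice, then poll a kqueue that
 * watches the read end.  Returns the byte count carried by the single
 * aggregated kevent, or -1 on failure.
 */
int64_t
pipe_bytes_pending(void)
{
#ifdef HAVE_KQUEUE
	struct kevent kev;
	struct timespec poll_now = { 0, 0 };	/* zero-valued: poll */
	int64_t result = -1;
	int fds[2], kq, n, saved_errno;

	if (pipe(fds) == -1)
		return -1;
	if ((kq = kqueue()) == -1)
		goto done;

	/* Register interest in data arriving on the read end. */
	EV_SET(&kev, fds[0], EVFILT_READ, EV_ADD, 0, 0, NULL);
	if (kevent(kq, &kev, 1, NULL, 0, NULL) == -1)
		goto done;

	/* Two separate writes trigger the filter twice. */
	if (write(fds[1], "ab", 2) != 2 || write(fds[1], "cde", 3) != 3)
		goto done;

	/* Retrieve without blocking; room for one kevent is enough. */
	n = kevent(kq, NULL, 0, &kev, 1, &poll_now);
	assert(n <= 1);		/* never more than nevents */
	if (n == 1)
		result = (int64_t)kev.data;
done:
	saved_errno = errno;
	if (kq != -1)
		close(kq);
	close(fds[0]);
	close(fds[1]);
	errno = saved_errno;
	return result;
#else
	errno = ENOSYS;
	return -1;
#endif
}
```
.Ed
.Pp
A caller would typically compare the result against the number of bytes
written, five in this sketch, and consult
.Va errno
when -1 is returned.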
.Pp
.Fn kevent
is used to register events with the queue, and return any pending
events to the user.
.Fa changelist
is a pointer to an array of
.Va kevent
structures, as defined in
.In sys/event.h .
All changes contained in the
.Fa changelist
are applied before any pending events are read from the queue.
.Fa nchanges
gives the size of
.Fa changelist .
.Fa eventlist
is a pointer to an array of kevent structures.
.Fa nevents
determines the size of
.Fa eventlist .
When
.Fa nevents
is zero,
.Fn kevent
will return immediately even if there is a
.Fa timeout
specified, unlike
.Xr select 2 .
If
.Fa timeout
is a non-null pointer, it specifies a maximum interval to wait
for an event, which will be interpreted as a
.Li struct timespec .
If
.Fa timeout
is a null pointer,
.Fn kevent
waits indefinitely.
To effect a poll, the
.Fa timeout
argument should be non-null, pointing to a zero-valued
.Va timespec
structure.
The same array may be used for the
.Fa changelist
and
.Fa eventlist .
.Pp
.Fn EV_SET
is a macro which is provided for ease of initializing a
kevent structure.
.Pp
The
.Va kevent
structure is defined as:
.Bd -literal
struct kevent {
	uintptr_t  ident;	/* identifier for this event */
	short	   filter;	/* filter for event */
	u_short	   flags;	/* action flags for kqueue */
	u_int	   fflags;	/* filter flag value */
	int64_t	   data;	/* filter data value */
	void	  *udata;	/* opaque user data identifier */
};
.Ed
.Pp
The fields of
.Li struct kevent
are:
.Bl -tag -width XXXfilter
.It ident
Value used to identify this event.
The exact interpretation is determined by the attached filter,
but often is a file descriptor.
.It filter
Identifies the kernel filter used to process this event.
The pre-defined system filters are described below.
.It flags
Actions to perform on the event.
.It fflags
Filter-specific flags.
.It data
Filter-specific data value.
.It udata
Opaque user-defined value passed through the kernel unchanged.
.El
.Pp
The
.Va flags
field can contain the following values:
.Bl -tag -width XXXEV_ONESHOT
.It Dv EV_ADD
Adds the event to the kqueue.
Re-adding an existing event will modify the parameters of the original event,
and not result in a duplicate entry.
Adding an event automatically enables it, unless overridden by the
.Dv EV_DISABLE
flag.
.It Dv EV_ENABLE
Permit
.Fn kevent
to return the event if it is triggered.
.It Dv EV_DISABLE
Disable the event so
.Fn kevent
will not return it.
The filter itself is not disabled.
.It Dv EV_DISPATCH
Disable the event source immediately after delivery of an event.
See
.Dv EV_DISABLE
above.
.It Dv EV_DELETE
Removes the event from the kqueue.
Events which are attached to file descriptors are automatically deleted
on the last close of the descriptor.
.It Dv EV_RECEIPT
Causes
.Fn kevent
to return with
.Dv EV_ERROR
set without draining any pending events after updating events in the kqueue.
When a filter is successfully added, the
.Va data
field will be zero.
This flag is useful for making bulk changes to a kqueue.
.It Dv EV_ONESHOT
Causes the event to return only the first occurrence of the filter
being triggered.
After the user retrieves the event from the kqueue, it is deleted.
.It Dv EV_CLEAR
After the event is retrieved by the user, its state is reset.
This is useful for filters which report state transitions
instead of the current state.
Note that some filters may automatically set this flag internally.
.It Dv EV_EOF
Filters may set this flag to indicate a filter-specific EOF condition.
.It Dv EV_ERROR
See
.Sx RETURN VALUES
below.
.El
.Pp
The predefined system filters are listed below.
Arguments may be passed to and from the filter via the
.Va fflags
and
.Va data
fields in the kevent structure.
.Bl -tag -width EVFILT_SIGNAL
.It Dv EVFILT_READ
Takes a descriptor as the identifier, and returns whenever
there is data available to read.
The behavior of the filter is slightly different depending
on the descriptor type.
.Bl -tag -width 2n
.It Sockets
Sockets which have previously been passed to
.Fn listen
return when there is an incoming connection pending.
.Va data
contains the size of the listen backlog.
.Pp
Other socket descriptors return when there is data to be read,
subject to the
.Dv SO_RCVLOWAT
value of the socket buffer.
This may be overridden with a per-filter low water mark at the
time the filter is added by setting the
.Dv NOTE_LOWAT
flag in
.Va fflags ,
and specifying the new low water mark in
.Va data .
On return,
.Va data
contains the number of bytes in the socket buffer.
.Pp
If the read direction of the socket has been shut down, then the filter
also sets
.Dv EV_EOF
in
.Va flags ,
and returns the socket error (if any) in
.Va fflags .
It is possible for EOF to be returned (indicating the connection is gone)
while there is still data pending in the socket buffer.
.It Vnodes
Returns when the file pointer is not at the end of file.
.Va data
contains the offset from current position to end of file,
and may be negative.
If
.Dv NOTE_EOF
is set in
.Va fflags ,
.Fn kevent
will also return when the file pointer is at the end of file.
The end of file condition is indicated by the presence of
.Dv NOTE_EOF
in
.Va fflags
on return.
.It "FIFOs, Pipes"
Returns when there is data to read;
.Va data
contains the number of bytes available.
.Pp
When the last writer disconnects, the filter will set
.Dv EV_EOF
in
.Va flags .
This may be cleared by passing in
.Dv EV_CLEAR ,
at which point the filter will resume waiting for data to become
available before returning.
.It "BPF devices"
Returns when the BPF buffer is full, the BPF timeout has expired, or
when the BPF has
.Dq immediate mode
enabled and there is any data to read;
.Va data
contains the number of bytes available.
.El
.It Dv EVFILT_WRITE
Takes a descriptor as the identifier, and returns whenever
it is possible to write to the descriptor.
For sockets, pipes, and FIFOs,
.Va data
will contain the amount of space remaining in the write buffer.
The filter will set
.Dv EV_EOF
when the reader disconnects, and for the FIFO case,
this may be cleared by use of
.Dv EV_CLEAR .
Note that this filter is not supported for vnodes or BPF devices.
.Pp
For sockets, the low water mark and socket error handling are
identical to the
.Dv EVFILT_READ
case.
.\".It Dv EVFILT_AIO
.\"The sigevent portion of the AIO request is filled in, with
.\".Va sigev_notify_kqueue
.\"containing the descriptor of the kqueue that the event should
.\"be attached to,
.\".Va sigev_value
.\"containing the udata value, and
.\".Va sigev_notify
.\"set to
.\".Dv SIGEV_KEVENT .
.\"When the aio_* function is called, the event will be registered
.\"with the specified kqueue, and the
.\".Va ident
.\"argument set to the
.\".Li struct aiocb
.\"returned by the aio_* function.
.\"The filter returns under the same conditions as aio_error.
.\".Pp
.\"Alternatively, a kevent structure may be initialized, with
.\".Va ident
.\"containing the descriptor of the kqueue, and the
.\"address of the kevent structure placed in the
.\".Va aio_lio_opcode
.\"field of the AIO request.
.\"However, this approach will not work on architectures with 64-bit pointers,
.\"and should be considered deprecated.
.It Dv EVFILT_VNODE
Takes a file descriptor as the identifier and the events to watch for in
.Va fflags ,
and returns when one or more of the requested events occurs on the descriptor.
The events to monitor are:
.Bl -tag -width XXNOTE_RENAME
.It Dv NOTE_DELETE
.Fn unlink
was called on the file referenced by the descriptor.
.It Dv NOTE_WRITE
A write occurred on the file referenced by the descriptor.
.It Dv NOTE_EXTEND
The file referenced by the descriptor was extended.
.It Dv NOTE_TRUNCATE
The file referenced by the descriptor was truncated.
.It Dv NOTE_ATTRIB
The file referenced by the descriptor had its attributes changed.
.It Dv NOTE_LINK
The link count on the file changed.
.It Dv NOTE_RENAME
The file referenced by the descriptor was renamed.
.It Dv NOTE_REVOKE
Access to the file was revoked via
.Xr revoke 2
or the underlying file system was unmounted.
.El
.Pp
On return,
.Va fflags
contains the events which triggered the filter.
.It Dv EVFILT_PROC
Takes the process ID to monitor as the identifier and the events to watch for
in
.Va fflags ,
and returns when the process performs one or more of the requested events.
If a process can normally see another process, it can attach an event to it.
The events to monitor are:
.Bl -tag -width XXNOTE_TRACKERR
.It Dv NOTE_EXIT
The process has exited.
The exit status will be stored in
.Va data
in the same format as the status set by
.Xr wait 2 .
.It Dv NOTE_FORK
The process has called
.Fn fork .
.It Dv NOTE_EXEC
The process has executed a new process via
.Xr execve 2
or similar call.
.It Dv NOTE_TRACK
Follow a process across
.Fn fork
calls.
The parent process will return with
.Dv NOTE_FORK
set in the
.Va fflags
field, while the child process will return with
.Dv NOTE_CHILD
set in
.Va fflags
and the parent PID in
.Va data .
.It Dv NOTE_TRACKERR
This flag is returned if the system was unable to attach an event to
the child process, usually due to resource limitations.
.El
.Pp
On return,
.Va fflags
contains the events which triggered the filter.
.It Dv EVFILT_SIGNAL
Takes the signal number to monitor as the identifier and returns
when the given signal is delivered to the process.
This coexists with the
.Fn signal
and
.Fn sigaction
facilities, and has a lower precedence.
The filter will record all attempts to deliver a signal to a process,
even if the signal has been marked as
.Dv SIG_IGN .
Event notification happens after normal signal delivery processing.
.Va data
returns the number of times the signal has occurred since the last call to
.Fn kevent .
This filter automatically sets the
.Dv EV_CLEAR
flag internally.
.It Dv EVFILT_TIMER
Establishes an arbitrary timer identified by
.Va ident .
When adding a timer,
.Va data
specifies the timeout period in milliseconds.
The timer will be periodic unless
.Dv EV_ONESHOT
is specified.
On return,
.Va data
contains the number of times the timeout has expired since the last call to
.Fn kevent .
This filter automatically sets the
.Dv EV_CLEAR
flag internally.
.It Dv EVFILT_DEVICE
Takes a descriptor as the identifier and the events to watch for in
.Va fflags ,
and returns when one or more of the requested events occur on the
descriptor.
The events to monitor are:
.Bl -tag -width XXNOTE_CHANGE
.It Dv NOTE_CHANGE
A device change event has occurred, e.g. an HDMI cable has been plugged in to a port.
.El
.Pp
On return,
.Va fflags
contains the events which triggered the filter.
.El
.Sh RETURN VALUES
.Fn kqueue
creates a new kernel event queue and returns a file descriptor.
If there was an error creating the kernel event queue, a value of -1 is
returned and
.Va errno
is set.
.Pp
.Fn kevent
returns the number of events placed in the
.Fa eventlist ,
up to the value given by
.Fa nevents .
If an error occurs while processing an element of the
.Fa changelist
and there is enough room in the
.Fa eventlist ,
then the event will be placed in the
.Fa eventlist
with
.Dv EV_ERROR
set in
.Va flags
and the system error in
.Va data .
Otherwise,
.Dv -1
will be returned, and
.Dv errno
will be set to indicate the error condition.
If the time limit expires, then
.Fn kevent
returns 0.
.Sh ERRORS
The
.Fn kqueue
function fails if:
.Bl -tag -width Er
.It Bq Er ENOMEM
The kernel failed to allocate enough memory for the kernel queue.
.It Bq Er EMFILE
The per-process descriptor table is full.
.It Bq Er ENFILE
The system file table is full.
.El
.Pp
The
.Fn kevent
function fails if:
.Bl -tag -width Er
.It Bq Er EACCES
The process does not have permission to register a filter.
.It Bq Er EFAULT
There was an error reading or writing the
.Va kevent
structure.
.It Bq Er EBADF
The specified descriptor is invalid.
.It Bq Er EINTR
A signal was delivered before the timeout expired and before any
events were placed on the kqueue for return.
.It Bq Er EINVAL
The specified time limit or filter is invalid.
.It Bq Er ENOENT
The event could not be found to be modified or deleted.
.It Bq Er ENOMEM
No memory was available to register the event.
.It Bq Er ESRCH
The specified process to attach to does not exist.
.El
.Sh SEE ALSO
.Xr poll 2 ,
.Xr read 2 ,
.Xr select 2 ,
.Xr sigaction 2 ,
.Xr wait 2 ,
.Xr write 2 ,
.Xr signal 3
.Sh HISTORY
The
.Fn kqueue
and
.Fn kevent
functions first appeared in
.Fx 4.1 .
.Sh AUTHORS
The
.Fn kqueue
system and this manual page were written by
.An Jonathan Lemon Aq Mt jlemon@FreeBSD.org .
.Sh BUGS
It is currently not possible to watch FIFOs or AIO that reside
on anything but a UFS file system.
Watching a vnode is possible on UFS, NFS and MS-DOS file systems.
.Pp
The
.Fa timeout
value is limited to 24 hours; longer timeouts will be silently
reinterpreted as 24 hours.
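.Pp
One way to cope with the 24 hour limit is to clamp the timeout
explicitly and wait again for whatever is left over.
The following is a minimal sketch under stated assumptions, not part of
the system interface: the helper
.Fn timeout_from_ms
and the constant
.Dv KQ_MAX_TIMEOUT_SEC
are hypothetical names introduced for illustration, and the limit is
taken from the paragraph above.
.Bd -literal
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Assumption: the 24 hour limit stated in BUGS, in seconds. */
#define KQ_MAX_TIMEOUT_SEC	(24 * 60 * 60)

/*
 * Hypothetical helper: convert a timeout in milliseconds into *ts for
 * use as the kevent() timeout argument.  Negative input is treated as
 * zero, which yields a zero-valued timespec, i.e. a poll.  Returns the
 * milliseconds that did not fit under the limit, so the caller can
 * wait again for the remainder.
 */
int64_t
timeout_from_ms(int64_t ms, struct timespec *ts)
{
	const int64_t max_ms = (int64_t)KQ_MAX_TIMEOUT_SEC * 1000;
	int64_t left = 0;

	assert(ts != NULL);

	if (ms < 0)
		ms = 0;
	if (ms > max_ms) {
		left = ms - max_ms;
		ms = max_ms;
	}
	ts->tv_sec = (time_t)(ms / 1000);
	ts->tv_nsec = (long)(ms % 1000) * 1000000L;
	return left;
}
```
.Ed
.Pp
A caller might pass the resulting
.Va timespec
to
.Fn kevent
and, if
.Fn kevent
returns 0 while the helper reported a non-zero remainder, call the
helper again with that remainder before waiting once more.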