.\" $OpenBSD: kqueue.2,v 1.7 2001/07/22 00:46:29 deraadt Exp $
.\"
.\" Copyright (c) 2000 Jonathan Lemon
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD: src/lib/libc/sys/kqueue.2,v 1.18 2001/02/14 08:48:35 guido Exp $
.\"
.Dd April 14, 2000
.Dt KQUEUE 2
.Os
.Sh NAME
.Nm kqueue ,
.Nm kevent
.Nd kernel event notification mechanism
.Sh SYNOPSIS
.Fd #include <sys/types.h>
.Fd #include <sys/event.h>
.Fd #include <sys/time.h>
.Ft int
.Fn kqueue "void"
.Ft int
.Fn kevent "int kq" "const struct kevent *changelist" "int nchanges" "struct kevent *eventlist" "int nevents" "const struct timespec *timeout"
.Fn EV_SET "&kev" ident filter flags fflags data udata
.Sh DESCRIPTION
.Fn kqueue
provides a generic method of notifying the user when an event
happens or a condition holds, based on the results of small
pieces of kernel code termed
.Dq filters .
A kevent is identified by the (ident, filter) pair; there may only
be one unique kevent per kqueue.
.Pp
The filter is executed upon the initial registration of a kevent
in order to detect whether a preexisting condition is present, and is also
executed whenever an event is passed to the filter for evaluation.
If the filter determines that the condition should be reported,
then the kevent is placed on the kqueue for the user to retrieve.
.Pp
The filter is also run when the user attempts to retrieve the kevent
from the kqueue.
If the filter indicates that the condition that triggered
the event no longer holds, the kevent is removed from the kqueue and
is not returned.
.Pp
Multiple events which trigger the filter do not result in multiple
kevents being placed on the kqueue; instead, the filter will aggregate
the events into a single
.Li struct kevent .
Calling
.Fn close
on a file descriptor will remove any kevents that reference the descriptor.
.Pp
.Fn kqueue
creates a new kernel event queue and returns a descriptor.
The queue is not inherited by a child created with
.Xr fork 2 .
However, if
.Xr rfork 2
is called without the
.Dv RFFDG
flag, then the descriptor table is shared,
which will allow sharing of the kqueue between two processes.
.Pp
.Fn kevent
is used to register events with the queue, and return any pending
events to the user.
.Fa changelist
is a pointer to an array of
.Va kevent
structures, as defined in
.Aq Pa sys/event.h .
All changes contained in the
.Fa changelist
are applied before any pending events are read from the queue.
.Fa nchanges
gives the size of
.Fa changelist .
.Fa eventlist
is a pointer to an array of kevent structures.
.Fa nevents
determines the size of
.Fa eventlist .
If
.Fa timeout
is a non-null pointer, it specifies a maximum interval to wait
for an event, which will be interpreted as a
.Li struct timespec .
If
.Fa timeout
is a null pointer,
.Fn kevent
waits indefinitely.
To effect a poll, the
.Fa timeout
argument should be non-null, pointing to a zero-valued
.Va timespec
structure.
The same array may be used for the
.Fa changelist
and
.Fa eventlist .
.Pp
.Fn EV_SET
is a macro which is provided for ease of initializing a
kevent structure.
.Pp
The
.Va kevent
structure is defined as:
.Bd -literal
struct kevent {
	uintptr_t	ident;		/* identifier for this event */
	short		filter;		/* filter for event */
	u_short		flags;		/* action flags for kqueue */
	u_int		fflags;		/* filter flag value */
	intptr_t	data;		/* filter data value */
	void		*udata;		/* opaque user data identifier */
};
.Ed
.Pp
The fields of
.Li struct kevent
are:
.Bl -tag -width XXXfilter
.It ident
Value used to identify this event.
The exact interpretation is determined by the attached filter,
but often is a file descriptor.
.It filter
Identifies the kernel filter used to process this event.
The pre-defined system filters are described below.
.It flags
Actions to perform on the event.
.It fflags
Filter-specific flags.
.It data
Filter-specific data value.
.It udata
Opaque user-defined value passed through the kernel unchanged.
.El
.Pp
The
.Va flags
field can contain the following values:
.Bl -tag -width XXXEV_ONESHOT
.It Dv EV_ADD
Adds the event to the kqueue.
Re-adding an existing event will modify the parameters of the original event,
and not result in a duplicate entry.
Adding an event automatically enables it, unless overridden by the
.Dv EV_DISABLE
flag.
.It Dv EV_ENABLE
Permit
.Fn kevent
to return the event if it is triggered.
.It Dv EV_DISABLE
Disable the event so
.Fn kevent
will not return it.
The filter itself is not disabled.
.It Dv EV_DELETE
Removes the event from the kqueue.
Events which are attached to file descriptors are automatically deleted
on the last close of the descriptor.
.It Dv EV_ONESHOT
Causes the event to return only the first occurrence of the filter
being triggered.
After the user retrieves the event from the kqueue, it is deleted.
.It Dv EV_CLEAR
After the event is retrieved by the user, its state is reset.
This is useful for filters which report state transitions
instead of the current state.
Note that some filters may automatically set this flag internally.
.It Dv EV_EOF
Filters may set this flag to indicate filter-specific EOF condition.
.It Dv EV_ERROR
See
.Sx RETURN VALUES
below.
.El
.Pp
The predefined system filters are listed below.
Arguments may be passed to and from the filter via the
.Va fflags
and
.Va data
fields in the kevent structure.
.Bl -tag -width EVFILT_SIGNAL
.It Dv EVFILT_READ
Takes a descriptor as the identifier, and returns whenever
there is data available to read.
The behavior of the filter is slightly different depending
on the descriptor type.
.Pp
.Bl -tag -width 2n
.It Sockets
Sockets which have previously been passed to
.Fn listen
return when there is an incoming connection pending.
.Va data
contains the size of the listen backlog.
.Pp
Other socket descriptors return when there is data to be read,
subject to the
.Dv SO_RCVLOWAT
value of the socket buffer.
This may be overridden with a per-filter low water mark at the
time the filter is added by setting the
.Dv NOTE_LOWAT
flag in
.Va fflags ,
and specifying the new low water mark in
.Va data .
On return,
.Va data
contains the number of bytes in the socket buffer.
.Pp
If the read direction of the socket has shut down, then the filter
also sets
.Dv EV_EOF
in
.Va flags ,
and returns the socket error (if any) in
.Va fflags .
It is possible for EOF to be returned (indicating the connection is gone)
while there is still data pending in the socket buffer.
.It Vnodes
Returns when the file pointer is not at the end of file.
.Va data
contains the offset from current position to end of file,
and may be negative.
.It "Fifos, Pipes"
Returns when there is data to read;
.Va data
contains the number of bytes available.
.Pp
When the last writer disconnects, the filter will set
.Dv EV_EOF
in
.Va flags .
This may be cleared by passing in
.Dv EV_CLEAR ,
at which point the filter will resume waiting for data to become
available before returning.
.El
.It Dv EVFILT_WRITE
Takes a descriptor as the identifier, and returns whenever
it is possible to write to the descriptor.
For sockets, pipes, and fifos,
.Va data
will contain the amount of space remaining in the write buffer.
The filter will set
.Dv EV_EOF
when the reader disconnects, and for the fifo case,
this may be cleared by use of
.Dv EV_CLEAR .
Note that this filter is not supported for vnodes.
.Pp
For sockets, the low water mark and socket error handling is
identical to the
.Dv EVFILT_READ
case.
.It Dv EVFILT_AIO
The sigevent portion of the AIO request is filled in, with
.Va sigev_notify_kqueue
containing the descriptor of the kqueue that the event should
be attached to,
.Va sigev_value
containing the udata value, and
.Va sigev_notify
set to
.Dv SIGEV_EVENT .
When the aio_* function is called, the event will be registered
with the specified kqueue, and the
.Va ident
argument set to the
.Li struct aiocb
returned by the aio_* function.
The filter returns under the same conditions as aio_error.
.Pp
Alternatively, a kevent structure may be initialized, with
.Va ident
containing the descriptor of the kqueue, and the
address of the kevent structure placed in the
.Va aio_lio_opcode
field of the AIO request.
However, this approach will not work on architectures with 64-bit pointers,
and should be considered deprecated.
.It Dv EVFILT_VNODE
Takes a file descriptor as the identifier and the events to watch for in
.Va fflags ,
and returns when one or more of the requested events occurs on the descriptor.
The events to monitor are:
.Bl -tag -width XXNOTE_RENAME
.It Dv NOTE_DELETE
.Fn unlink
was called on the file referenced by the descriptor.
.It Dv NOTE_WRITE
A write occurred on the file referenced by the descriptor.
.It Dv NOTE_EXTEND
The file referenced by the descriptor was extended.
.It Dv NOTE_ATTRIB
The file referenced by the descriptor had its attributes changed.
.It Dv NOTE_LINK
The link count on the file changed.
.It Dv NOTE_RENAME
The file referenced by the descriptor was renamed.
.It Dv NOTE_REVOKE
Access to the file was revoked via
.Xr revoke 2
or the underlying filesystem was unmounted.
.El
.Pp
On return,
.Va fflags
contains the events which triggered the filter.
.It Dv EVFILT_PROC
Takes the process ID to monitor as the identifier and the events to watch for
in
.Va fflags ,
and returns when the process performs one or more of the requested events.
If a process can normally see another process, it can attach an event to it.
The events to monitor are:
.Bl -tag -width XXNOTE_TRACKERR
.It Dv NOTE_EXIT
The process has exited.
.It Dv NOTE_FORK
The process has called
.Fn fork .
.It Dv NOTE_EXEC
The process has executed a new process via
.Xr execve 2
or similar call.
.It Dv NOTE_TRACK
Follow a process across
.Fn fork
calls.
The parent process will return with
.Dv NOTE_TRACK
set in the
.Va fflags
field, while the child process will return with
.Dv NOTE_CHILD
set in
.Va fflags
and the parent PID in
.Va data .
.It Dv NOTE_TRACKERR
This flag is returned if the system was unable to attach an event to
the child process, usually due to resource limitations.
.El
.Pp
On return,
.Va fflags
contains the events which triggered the filter.
.It Dv EVFILT_SIGNAL
Takes the signal number to monitor as the identifier and returns
when the given signal is delivered to the process.
This coexists with the
.Fn signal
and
.Fn sigaction
facilities, and has a lower precedence.
The filter will record all attempts to deliver a signal to a process,
even if the signal has been marked as
.Dv SIG_IGN .
Event notification happens after normal signal delivery processing.
.Va data
returns the number of times the signal has occurred since the last call to
.Fn kevent .
This filter automatically sets the
.Dv EV_CLEAR
flag internally.
.El
.Sh RETURN VALUES
.Fn kqueue
creates a new kernel event queue and returns a file descriptor.
If there was an error creating the kernel event queue, a value of -1 is
returned and errno set.
.Pp
.Fn kevent
returns the number of events placed in the
.Fa eventlist ,
up to the value given by
.Fa nevents .
If an error occurs while processing an element of the
.Fa changelist
and there is enough room in the
.Fa eventlist ,
then the event will be placed in the
.Fa eventlist
with
.Dv EV_ERROR
set in
.Va flags
and the system error in
.Va data .
Otherwise,
.Dv -1
will be returned, and
.Dv errno
will be set to indicate the error condition.
If the time limit expires, then
.Fn kevent
returns 0.
.Sh ERRORS
The
.Fn kqueue
function fails if:
.Bl -tag -width Er
.It Bq Er ENOMEM
The kernel failed to allocate enough memory for the kernel queue.
.It Bq Er EMFILE
The per-process descriptor table is full.
.It Bq Er ENFILE
The system file table is full.
.El
.Pp
The
.Fn kevent
function fails if:
.Bl -tag -width Er
.It Bq Er EACCES
The process does not have permission to register a filter.
.It Bq Er EFAULT
There was an error reading or writing the
.Va kevent
structure.
.It Bq Er EBADF
The specified descriptor is invalid.
.It Bq Er EINTR
A signal was delivered before the timeout expired and before any
events were placed on the kqueue for return.
.It Bq Er EINVAL
The specified time limit or filter is invalid.
.It Bq Er ENOENT
The event could not be found to be modified or deleted.
.It Bq Er ENOMEM
No memory was available to register the event.
.It Bq Er ESRCH
The specified process to attach to does not exist.
.El
.Sh SEE ALSO
.Xr poll 2 ,
.Xr read 2 ,
.Xr select 2 ,
.Xr sigaction 2 ,
.Xr write 2 ,
.Xr signal 3
.Sh HISTORY
The
.Fn kqueue
and
.Fn kevent
functions first appeared in
.Fx 4.1 .
.Sh AUTHORS
The
.Fn kqueue
system and this manual page were written by
.An Jonathan Lemon Aq jlemon@FreeBSD.org .
.Sh BUGS
It is currently not possible to watch fifos, AIO, or a vnode that
resides on anything but a UFS file system.