.\"	$OpenBSD: kqueue.2,v 1.29 2014/01/21 03:15:45 schwarze Exp $
.\"
.\" Copyright (c) 2000 Jonathan Lemon
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD: src/lib/libc/sys/kqueue.2,v 1.18 2001/02/14 08:48:35 guido Exp $
.\"
.Dd $Mdocdate: January 21 2014 $
.Dt KQUEUE 2
.Os
.Sh NAME
.Nm kqueue ,
.Nm kevent
.Nd kernel event notification mechanism
.Sh SYNOPSIS
.Fd #include <sys/types.h>
.Fd #include <sys/event.h>
.Fd #include <sys/time.h>
.Ft int
.Fn kqueue "void"
.Ft int
.Fn kevent "int kq" "const struct kevent *changelist" "int nchanges" "struct kevent *eventlist" "int nevents" "const struct timespec *timeout"
.Fn EV_SET "&kev" ident filter flags fflags data udata
.Sh DESCRIPTION
.Fn kqueue
provides a generic method of notifying the user when an event
happens or a condition holds, based on the results of small
pieces of kernel code termed
.Dq filters .
A kevent is identified by the (ident, filter) pair; there may only
be one unique kevent per kqueue.
.Pp
The filter is executed upon the initial registration of a kevent
in order to detect whether a preexisting condition is present, and is also
executed whenever an event is passed to the filter for evaluation.
If the filter determines that the condition should be reported,
then the kevent is placed on the kqueue for the user to retrieve.
.Pp
The filter is also run when the user attempts to retrieve the kevent
from the kqueue.
If the filter indicates that the condition that triggered
the event no longer holds, the kevent is removed from the kqueue and
is not returned.
.Pp
Multiple events which trigger the filter do not result in multiple
kevents being placed on the kqueue; instead, the filter will aggregate
the events into a single
.Li struct kevent .
Calling
.Fn close
on a file descriptor will remove any kevents that reference the descriptor.
.Pp
.Fn kqueue
creates a new kernel event queue and returns a descriptor.
The queue is not inherited by a child created with
.Xr fork 2 .
Similarly, kqueues cannot be passed across UNIX-domain sockets.
.Pp
.Fn kevent
is used to register events with the queue, and return any pending
events to the user.
.Fa changelist
is a pointer to an array of
.Va kevent
structures, as defined in
.In sys/event.h .
All changes contained in the
.Fa changelist
are applied before any pending events are read from the queue.
.Fa nchanges
gives the size of
.Fa changelist .
.Fa eventlist
is a pointer to an array of kevent structures.
.Fa nevents
determines the size of
.Fa eventlist .
When
.Fa nevents
is zero,
.Fn kevent
will return immediately even if there is a
.Fa timeout
specified, unlike
.Xr select 2 .
If
.Fa timeout
is a non-null pointer, it specifies a maximum interval to wait
for an event, which will be interpreted as a
.Li struct timespec .
If
.Fa timeout
is a null pointer,
.Fn kevent
waits indefinitely.
To effect a poll, the
.Fa timeout
argument should be non-null, pointing to a zero-valued
.Va timespec
structure.
The same array may be used for the
.Fa changelist
and
.Fa eventlist .
.Pp
.Fn EV_SET
is a macro which is provided for ease of initializing a
kevent structure.
.Pp
The
.Va kevent
structure is defined as:
.Bd -literal
struct kevent {
	uintptr_t	ident;		/* identifier for this event */
	short		filter;		/* filter for event */
	u_short		flags;		/* action flags for kqueue */
	u_int		fflags;		/* filter flag value */
	quad_t		data;		/* filter data value */
	void		*udata;		/* opaque user data identifier */
};
.Ed
.Pp
The fields of
.Li struct kevent
are:
.Bl -tag -width XXXfilter
.It ident
Value used to identify this event.
The exact interpretation is determined by the attached filter,
but often is a file descriptor.
.It filter
Identifies the kernel filter used to process this event.
The pre-defined system filters are described below.
.It flags
Actions to perform on the event.
.It fflags
Filter-specific flags.
.It data
Filter-specific data value.
.It udata
Opaque user-defined value passed through the kernel unchanged.
.El
.Pp
The
.Va flags
field can contain the following values:
.Bl -tag -width XXXEV_ONESHOT
.It Dv EV_ADD
Adds the event to the kqueue.
Re-adding an existing event will modify the parameters of the original event,
and not result in a duplicate entry.
Adding an event automatically enables it, unless overridden by the
.Dv EV_DISABLE
flag.
.It Dv EV_ENABLE
Permit
.Fn kevent
to return the event if it is triggered.
.It Dv EV_DISABLE
Disable the event so
.Fn kevent
will not return it.
The filter itself is not disabled.
.It Dv EV_DELETE
Removes the event from the kqueue.
Events which are attached to file descriptors are automatically deleted
on the last close of the descriptor.
.It Dv EV_ONESHOT
Causes the event to return only the first occurrence of the filter
being triggered.
After the user retrieves the event from the kqueue, it is deleted.
.It Dv EV_CLEAR
After the event is retrieved by the user, its state is reset.
This is useful for filters which report state transitions
instead of the current state.
Note that some filters may automatically set this flag internally.
.It Dv EV_EOF
Filters may set this flag to indicate filter-specific EOF condition.
.It Dv EV_ERROR
See
.Sx RETURN VALUES
below.
.El
.Pp
The predefined system filters are listed below.
Arguments may be passed to and from the filter via the
.Va fflags
and
.Va data
fields in the kevent structure.
.Bl -tag -width EVFILT_SIGNAL
.It Dv EVFILT_READ
Takes a descriptor as the identifier, and returns whenever
there is data available to read.
The behavior of the filter is slightly different depending
on the descriptor type.
.Bl -tag -width 2n
.It Sockets
Sockets which have previously been passed to
.Fn listen
return when there is an incoming connection pending.
.Va data
contains the size of the listen backlog.
.Pp
Other socket descriptors return when there is data to be read,
subject to the
.Dv SO_RCVLOWAT
value of the socket buffer.
This may be overridden with a per-filter low water mark at the
time the filter is added by setting the
.Dv NOTE_LOWAT
flag in
.Va fflags ,
and specifying the new low water mark in
.Va data .
On return,
.Va data
contains the number of bytes in the socket buffer.
.Pp
If the read direction of the socket has been shut down, then the filter
also sets
.Dv EV_EOF
in
.Va flags ,
and returns the socket error (if any) in
.Va fflags .
It is possible for EOF to be returned (indicating the connection is gone)
while there is still data pending in the socket buffer.
.It Vnodes
Returns when the file pointer is not at the end of file.
.Va data
contains the offset from current position to end of file,
and may be negative.
If
.Dv NOTE_EOF
is set in
.Va fflags ,
.Fn kevent
will also return when the file pointer is at the end of file.
The end of file condition is indicated by the presence of
.Dv NOTE_EOF
in
.Va fflags
on return.
.It "Fifos, Pipes"
Returns when there is data to read;
.Va data
contains the number of bytes available.
.Pp
When the last writer disconnects, the filter will set
.Dv EV_EOF
in
.Va flags .
This may be cleared by passing in
.Dv EV_CLEAR ,
at which point the filter will resume waiting for data to become
available before returning.
.It "BPF devices"
Returns when the BPF buffer is full, the BPF timeout has expired, or
when the BPF has
.Dq immediate mode
enabled and there is any data to read;
.Va data
contains the number of bytes available.
.El
.It Dv EVFILT_WRITE
Takes a descriptor as the identifier, and returns whenever
it is possible to write to the descriptor.
For sockets, pipes, and FIFOs,
.Va data
will contain the amount of space remaining in the write buffer.
The filter will set
.Dv EV_EOF
when the reader disconnects, and for the FIFO case,
this may be cleared by use of
.Dv EV_CLEAR .
Note that this filter is not supported for vnodes or BPF devices.
.Pp
For sockets, the low water mark and socket error handling are
identical to the
.Dv EVFILT_READ
case.
.It Dv EVFILT_AIO
The sigevent portion of the AIO request is filled in, with
.Va sigev_notify_kqueue
containing the descriptor of the kqueue that the event should
be attached to,
.Va sigev_value
containing the udata value, and
.Va sigev_notify
set to
.Dv SIGEV_KEVENT .
When the aio_* function is called, the event will be registered
with the specified kqueue, and the
.Va ident
argument set to the
.Li struct aiocb
returned by the aio_* function.
The filter returns under the same conditions as aio_error.
.Pp
Alternatively, a kevent structure may be initialized, with
.Va ident
containing the descriptor of the kqueue, and the
address of the kevent structure placed in the
.Va aio_lio_opcode
field of the AIO request.
However, this approach will not work on architectures with 64-bit pointers,
and should be considered deprecated.
.It Dv EVFILT_VNODE
Takes a file descriptor as the identifier and the events to watch for in
.Va fflags ,
and returns when one or more of the requested events occurs on the descriptor.
The events to monitor are:
.Bl -tag -width XXNOTE_RENAME
.It Dv NOTE_DELETE
.Fn unlink
was called on the file referenced by the descriptor.
.It Dv NOTE_WRITE
A write occurred on the file referenced by the descriptor.
.It Dv NOTE_EXTEND
The file referenced by the descriptor was extended.
.It Dv NOTE_TRUNCATE
The file referenced by the descriptor was truncated.
.It Dv NOTE_ATTRIB
The file referenced by the descriptor had its attributes changed.
.It Dv NOTE_LINK
The link count on the file changed.
.It Dv NOTE_RENAME
The file referenced by the descriptor was renamed.
.It Dv NOTE_REVOKE
Access to the file was revoked via
.Xr revoke 2
or the underlying file system was unmounted.
.El
.Pp
On return,
.Va fflags
contains the events which triggered the filter.
.It Dv EVFILT_PROC
Takes the process ID to monitor as the identifier and the events to watch for
in
.Va fflags ,
and returns when the process performs one or more of the requested events.
If a process can normally see another process, it can attach an event to it.
The events to monitor are:
.Bl -tag -width XXNOTE_TRACKERR
.It Dv NOTE_EXIT
The process has exited.
The exit status will be stored in
.Va data
in the same format as the status set by
.Xr wait 2 .
.It Dv NOTE_FORK
The process has called
.Fn fork .
.It Dv NOTE_EXEC
The process has executed a new process via
.Xr execve 2
or similar call.
.It Dv NOTE_TRACK
Follow a process across
.Fn fork
calls.
The parent process will return with
.Dv NOTE_FORK
set in the
.Va fflags
field, while the child process will return with
.Dv NOTE_CHILD
set in
.Va fflags
and the parent PID in
.Va data .
.It Dv NOTE_TRACKERR
This flag is returned if the system was unable to attach an event to
the child process, usually due to resource limitations.
.El
.Pp
On return,
.Va fflags
contains the events which triggered the filter.
.It Dv EVFILT_SIGNAL
Takes the signal number to monitor as the identifier and returns
when the given signal is delivered to the process.
This coexists with the
.Fn signal
and
.Fn sigaction
facilities, and has a lower precedence.
The filter will record all attempts to deliver a signal to a process,
even if the signal has been marked as
.Dv SIG_IGN .
Event notification happens after normal signal delivery processing.
.Va data
returns the number of times the signal has occurred since the last call to
.Fn kevent .
This filter automatically sets the
.Dv EV_CLEAR
flag internally.
.It Dv EVFILT_TIMER
Establishes an arbitrary timer identified by
.Va ident .
When adding a timer,
.Va data
specifies the timeout period in milliseconds.
The timer will be periodic unless
.Dv EV_ONESHOT
is specified.
On return,
.Va data
contains the number of times the timeout has expired since the last call to
.Fn kevent .
This filter automatically sets the
.Dv EV_CLEAR
flag internally.
.El
.Sh RETURN VALUES
.Fn kqueue
creates a new kernel event queue and returns a file descriptor.
If there was an error creating the kernel event queue, a value of -1 is
returned and errno set.
.Pp
.Fn kevent
returns the number of events placed in the
.Fa eventlist ,
up to the value given by
.Fa nevents .
If an error occurs while processing an element of the
.Fa changelist
and there is enough room in the
.Fa eventlist ,
then the event will be placed in the
.Fa eventlist
with
.Dv EV_ERROR
set in
.Va flags
and the system error in
.Va data .
Otherwise,
.Dv -1
will be returned, and
.Dv errno
will be set to indicate the error condition.
If the time limit expires, then
.Fn kevent
returns 0.
.Sh ERRORS
The
.Fn kqueue
function fails if:
.Bl -tag -width Er
.It Bq Er ENOMEM
The kernel failed to allocate enough memory for the kernel queue.
.It Bq Er EMFILE
The per-process descriptor table is full.
.It Bq Er ENFILE
The system file table is full.
.El
.Pp
The
.Fn kevent
function fails if:
.Bl -tag -width Er
.It Bq Er EACCES
The process does not have permission to register a filter.
.It Bq Er EFAULT
There was an error reading or writing the
.Va kevent
structure.
.It Bq Er EBADF
The specified descriptor is invalid.
.It Bq Er EINTR
A signal was delivered before the timeout expired and before any
events were placed on the kqueue for return.
.It Bq Er EINVAL
The specified time limit or filter is invalid.
.It Bq Er ENOENT
The event could not be found to be modified or deleted.
.It Bq Er ENOMEM
No memory was available to register the event.
.It Bq Er ESRCH
The specified process to attach to does not exist.
.El
.Sh SEE ALSO
.Xr poll 2 ,
.Xr read 2 ,
.Xr select 2 ,
.Xr sigaction 2 ,
.Xr wait 2 ,
.Xr write 2 ,
.Xr signal 3
.Sh HISTORY
The
.Fn kqueue
and
.Fn kevent
functions first appeared in
.Fx 4.1 .
.Sh AUTHORS
The
.Fn kqueue
system and this manual page were written by
.An Jonathan Lemon Aq Mt jlemon@FreeBSD.org .
.Sh BUGS
It is currently not possible to watch FIFOs or AIO that reside
on anything but a UFS file system.
Watching a vnode is possible on UFS, NFS and MS-DOS file systems.
.Pp
The
.Fa timeout
value is limited to 24 hours; longer timeouts will be silently
reinterpreted as 24 hours.
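.Sh EXAMPLES
The following is a minimal sketch, not part of the system interface, of
one way a caller might build the
.Fa timeout
argument from a millisecond interval while making the 24-hour ceiling
described in
.Sx BUGS
explicit instead of relying on silent reinterpretation.
The names
.Fn kq_timeout_from_ms
and
.Dv KQ_MAX_TIMEOUT_SEC
are hypothetical and exist only for this illustration; treating a
non-positive interval as a poll is likewise an assumption of the sketch.
.Bd -literal
```c
#include <time.h>

/* Hypothetical constant: the 24-hour ceiling described under BUGS. */
#define KQ_MAX_TIMEOUT_SEC (24L * 60L * 60L)

/*
 * Hypothetical helper: convert a millisecond interval into a timespec
 * suitable for the timeout argument of kevent().
 * A non-positive interval yields a zero-valued timespec (a poll).
 * An interval of 24 hours or more is clamped to exactly 24 hours.
 */
static struct timespec
kq_timeout_from_ms(long long ms)
{
	struct timespec ts = { 0, 0 };

	if (ms <= 0)
		return ts;
	ts.tv_sec = (time_t)(ms / 1000);
	ts.tv_nsec = (long)(ms % 1000) * 1000000L;
	if (ts.tv_sec >= KQ_MAX_TIMEOUT_SEC) {
		ts.tv_sec = KQ_MAX_TIMEOUT_SEC;
		ts.tv_nsec = 0;
	}
	return ts;
}
```
.Ed
.Pp
A caller would store the result in a local
.Li struct timespec
and pass its address as the
.Fa timeout
argument; passing a null pointer instead still requests an indefinite wait.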