.\"	$OpenBSD: kqueue.2,v 1.29 2014/01/21 03:15:45 schwarze Exp $
.\"
.\" Copyright (c) 2000 Jonathan Lemon
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD: src/lib/libc/sys/kqueue.2,v 1.18 2001/02/14 08:48:35 guido Exp $
.\"
.Dd $Mdocdate: January 21 2014 $
.Dt KQUEUE 2
.Os
.Sh NAME
.Nm kqueue ,
.Nm kevent
.Nd kernel event notification mechanism
.Sh SYNOPSIS
.Fd #include <sys/types.h>
.Fd #include <sys/event.h>
.Fd #include <sys/time.h>
.Ft int
.Fn kqueue "void"
.Ft int
.Fn kevent "int kq" "const struct kevent *changelist" "int nchanges" "struct kevent *eventlist" "int nevents" "const struct timespec *timeout"
.Fn EV_SET "&kev" ident filter flags fflags data udata
.Sh DESCRIPTION
.Fn kqueue
provides a generic method of notifying the user when an event
happens or a condition holds, based on the results of small
pieces of kernel code termed
.Dq filters .
A kevent is identified by the (ident, filter) pair; a kqueue may hold
only one kevent for any given pair.
.Pp
The filter is executed upon the initial registration of a kevent
in order to detect whether a preexisting condition is present, and is also
executed whenever an event is passed to the filter for evaluation.
If the filter determines that the condition should be reported,
then the kevent is placed on the kqueue for the user to retrieve.
.Pp
The filter is also run when the user attempts to retrieve the kevent
from the kqueue.
If the filter indicates that the condition that triggered
the event no longer holds, the kevent is removed from the kqueue and
is not returned.
.Pp
Multiple events which trigger the filter do not result in multiple
kevents being placed on the kqueue; instead, the filter will aggregate
the events into a single
.Li struct kevent .
Calling
.Fn close
on a file descriptor will remove any kevents that reference the descriptor.
.Pp
.Fn kqueue
creates a new kernel event queue and returns a descriptor.
The queue is not inherited by a child created with
.Xr fork 2 .
Similarly, kqueues cannot be passed across UNIX-domain sockets.
.Pp
.Fn kevent
is used to register events with the queue, and return any pending
events to the user.
.Fa changelist
is a pointer to an array of
.Va kevent
structures, as defined in
.In sys/event.h .
All changes contained in the
.Fa changelist
are applied before any pending events are read from the queue.
.Fa nchanges
gives the size of
.Fa changelist .
.Fa eventlist
is a pointer to an array of kevent structures.
.Fa nevents
determines the size of
.Fa eventlist .
When
.Fa nevents
is zero,
.Fn kevent
will return immediately even if there is a
.Fa timeout
specified, unlike
.Xr select 2 .
If
.Fa timeout
is a non-null pointer, it specifies a maximum interval to wait
for an event, which will be interpreted as a
.Li struct timespec .
If
.Fa timeout
is a null pointer,
.Fn kevent
waits indefinitely.
To effect a poll, the
.Fa timeout
argument should be non-null, pointing to a zero-valued
.Va timespec
structure.
The same array may be used for the
.Fa changelist
and
.Fa eventlist .
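.Pp
As a minimal sketch of preparing the
.Fa timeout
argument, the helper below converts a millisecond count into a
.Li struct timespec .
The name
.Fn ms_to_timespec
is hypothetical and not part of any system header.
A result of zero seconds and zero nanoseconds requests a poll, and
clamping negative input to zero is an assumption of this sketch.

```c
#include <time.h>

/*
 * Hypothetical helper (not a system function): convert a millisecond
 * count into the struct timespec form taken by kevent()'s timeout
 * argument.
 */
static struct timespec
ms_to_timespec(long ms)
{
	struct timespec ts;

	if (ms < 0)
		ms = 0;		/* assumption: negative input means "poll" */
	ts.tv_sec = ms / 1000;			/* whole seconds */
	ts.tv_nsec = (ms % 1000) * 1000000L;	/* remainder in nanoseconds */
	return ts;
}
```

The address of the returned structure can be passed as
.Fa timeout ;
passing a null pointer instead still means an indefinite wait.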
.Pp
.Fn EV_SET
is a macro which is provided for ease of initializing a
kevent structure.
.Pp
The
.Va kevent
structure is defined as:
.Bd -literal
struct kevent {
	uintptr_t  ident;	/* identifier for this event */
	short	   filter;	/* filter for event */
	u_short	   flags;	/* action flags for kqueue */
	u_int	   fflags;	/* filter flag value */
	quad_t	   data;	/* filter data value */
	void	   *udata;	/* opaque user data identifier */
};
.Ed
.Pp
The fields of
.Li struct kevent
are:
.Bl -tag -width XXXfilter
.It ident
Value used to identify this event.
The exact interpretation is determined by the attached filter,
but often is a file descriptor.
.It filter
Identifies the kernel filter used to process this event.
The pre-defined system filters are described below.
.It flags
Actions to perform on the event.
.It fflags
Filter-specific flags.
.It data
Filter-specific data value.
.It udata
Opaque user-defined value passed through the kernel unchanged.
.El
.Pp
The
.Va flags
field can contain the following values:
.Bl -tag -width XXXEV_ONESHOT
.It Dv EV_ADD
Adds the event to the kqueue.
Re-adding an existing event will modify the parameters of the original event,
and not result in a duplicate entry.
Adding an event automatically enables it, unless overridden by the
.Dv EV_DISABLE
flag.
.It Dv EV_ENABLE
Permit
.Fn kevent
to return the event if it is triggered.
.It Dv EV_DISABLE
Disable the event so
.Fn kevent
will not return it.
The filter itself is not disabled.
.It Dv EV_DELETE
Removes the event from the kqueue.
Events which are attached to file descriptors are automatically deleted
on the last close of the descriptor.
.It Dv EV_ONESHOT
Causes the event to return only the first occurrence of the filter
being triggered.
After the user retrieves the event from the kqueue, it is deleted.
.It Dv EV_CLEAR
After the event is retrieved by the user, its state is reset.
This is useful for filters which report state transitions
instead of the current state.
Note that some filters may automatically set this flag internally.
.It Dv EV_EOF
Filters may set this flag to indicate filter-specific EOF condition.
.It Dv EV_ERROR
See
.Sx RETURN VALUES
below.
.El
.Pp
The predefined system filters are listed below.
Arguments may be passed to and from the filter via the
.Va fflags
and
.Va data
fields in the kevent structure.
.Bl -tag -width EVFILT_SIGNAL
.It Dv EVFILT_READ
Takes a descriptor as the identifier, and returns whenever
there is data available to read.
The behavior of the filter is slightly different depending
on the descriptor type.
.Bl -tag -width 2n
.It Sockets
Sockets which have previously been passed to
.Fn listen
return when there is an incoming connection pending.
.Va data
contains the size of the listen backlog.
.Pp
Other socket descriptors return when there is data to be read,
subject to the
.Dv SO_RCVLOWAT
value of the socket buffer.
This may be overridden with a per-filter low water mark at the
time the filter is added by setting the
.Dv NOTE_LOWAT
flag in
.Va fflags ,
and specifying the new low water mark in
.Va data .
On return,
.Va data
contains the number of bytes in the socket buffer.
.Pp
If the read direction of the socket has shut down, then the filter
also sets
.Dv EV_EOF
in
.Va flags ,
and returns the socket error (if any) in
.Va fflags .
It is possible for EOF to be returned (indicating the connection is gone)
while there is still data pending in the socket buffer.
.It Vnodes
Returns when the file pointer is not at the end of file.
.Va data
contains the offset from current position to end of file,
and may be negative.
If
.Dv NOTE_EOF
is set in
.Va fflags ,
.Fn kevent
will also return when the file pointer is at the end of file.
The end of file condition is indicated by the presence of
.Dv NOTE_EOF
in
.Va fflags
on return.
.It "Fifos, Pipes"
Returns when there is data to read;
.Va data
contains the number of bytes available.
.Pp
When the last writer disconnects, the filter will set
.Dv EV_EOF
in
.Va flags .
This may be cleared by passing in
.Dv EV_CLEAR ,
at which point the filter will resume waiting for data to become
available before returning.
.It "BPF devices"
Returns when the BPF buffer is full, the BPF timeout has expired, or
when the BPF has
.Dq immediate mode
enabled and there is any data to read;
.Va data
contains the number of bytes available.
.El
.It Dv EVFILT_WRITE
Takes a descriptor as the identifier, and returns whenever
it is possible to write to the descriptor.
For sockets, pipes, and FIFOs,
.Va data
will contain the amount of space remaining in the write buffer.
The filter will set
.Dv EV_EOF
when the reader disconnects, and for the FIFO case,
this may be cleared by use of
.Dv EV_CLEAR .
Note that this filter is not supported for vnodes or BPF devices.
.Pp
For sockets, the low water mark and socket error handling are
identical to the
.Dv EVFILT_READ
case.
.It Dv EVFILT_AIO
The sigevent portion of the AIO request is filled in, with
.Va sigev_notify_kqueue
containing the descriptor of the kqueue that the event should
be attached to,
.Va sigev_value
containing the udata value, and
.Va sigev_notify
set to
.Dv SIGEV_KEVENT .
When the aio_* function is called, the event will be registered
with the specified kqueue, and the
.Va ident
argument set to the
.Li struct aiocb
returned by the aio_* function.
The filter returns under the same conditions as aio_error.
.Pp
Alternatively, a kevent structure may be initialized, with
.Va ident
containing the descriptor of the kqueue, and the
address of the kevent structure placed in the
.Va aio_lio_opcode
field of the AIO request.
However, this approach will not work on architectures with 64-bit pointers,
and should be considered deprecated.
.It Dv EVFILT_VNODE
Takes a file descriptor as the identifier and the events to watch for in
.Va fflags ,
and returns when one or more of the requested events occurs on the descriptor.
The events to monitor are:
.Bl -tag -width XXNOTE_RENAME
.It Dv NOTE_DELETE
.Fn unlink
was called on the file referenced by the descriptor.
.It Dv NOTE_WRITE
A write occurred on the file referenced by the descriptor.
.It Dv NOTE_EXTEND
The file referenced by the descriptor was extended.
.It Dv NOTE_TRUNCATE
The file referenced by the descriptor was truncated.
.It Dv NOTE_ATTRIB
The file referenced by the descriptor had its attributes changed.
.It Dv NOTE_LINK
The link count on the file changed.
.It Dv NOTE_RENAME
The file referenced by the descriptor was renamed.
.It Dv NOTE_REVOKE
Access to the file was revoked via
.Xr revoke 2
or the underlying file system was unmounted.
.El
.Pp
On return,
.Va fflags
contains the events which triggered the filter.
.It Dv EVFILT_PROC
Takes the process ID to monitor as the identifier and the events to watch for
in
.Va fflags ,
and returns when the process performs one or more of the requested events.
If a process can normally see another process, it can attach an event to it.
The events to monitor are:
.Bl -tag -width XXNOTE_TRACKERR
.It Dv NOTE_EXIT
The process has exited.
The exit status will be stored in
.Va data
in the same format as the status set by
.Xr wait 2 .
.It Dv NOTE_FORK
The process has called
.Fn fork .
.It Dv NOTE_EXEC
The process has executed a new process via
.Xr execve 2
or similar call.
.It Dv NOTE_TRACK
Follow a process across
.Fn fork
calls.
The parent process will return with
.Dv NOTE_FORK
set in the
.Va fflags
field, while the child process will return with
.Dv NOTE_CHILD
set in
.Va fflags
and the parent PID in
.Va data .
.It Dv NOTE_TRACKERR
This flag is returned if the system was unable to attach an event to
the child process, usually due to resource limitations.
.El
.Pp
On return,
.Va fflags
contains the events which triggered the filter.
.It Dv EVFILT_SIGNAL
Takes the signal number to monitor as the identifier and returns
when the given signal is delivered to the process.
This coexists with the
.Fn signal
and
.Fn sigaction
facilities, and has a lower precedence.
The filter will record all attempts to deliver a signal to a process,
even if the signal has been marked as
.Dv SIG_IGN .
Event notification happens after normal signal delivery processing.
.Va data
returns the number of times the signal has occurred since the last call to
.Fn kevent .
This filter automatically sets the
.Dv EV_CLEAR
flag internally.
.It Dv EVFILT_TIMER
Establishes an arbitrary timer identified by
.Va ident .
When adding a timer,
.Va data
specifies the timeout period in milliseconds.
The timer will be periodic unless
.Dv EV_ONESHOT
is specified.
On return,
.Va data
contains the number of times the timeout has expired since the last call to
.Fn kevent .
This filter automatically sets the
.Dv EV_CLEAR
flag internally.
.El
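.Pp
Because a
.Dv NOTE_EXIT
event stores the exit status in
.Va data
using the
.Xr wait 2
encoding, the usual status macros can be applied to it.
The sketch below illustrates this with a hypothetical helper,
.Fn exit_disposition ;
its return convention (exit code, negated signal number, or
.Dv INT_MIN )
is an assumption made for the example.

```c
#include <sys/wait.h>
#include <limits.h>

/*
 * Hypothetical helper: interpret the data value of a NOTE_EXIT event.
 * Returns the exit code for a normal exit, the negated signal number if
 * a signal terminated the process, or INT_MIN for any other status.
 */
static int
exit_disposition(long long data)
{
	int status = (int)data;	/* data carries a wait(2)-style status */

	if (WIFEXITED(status))
		return WEXITSTATUS(status);
	if (WIFSIGNALED(status))
		return -WTERMSIG(status);
	return INT_MIN;
}
```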
.Sh RETURN VALUES
.Fn kqueue
creates a new kernel event queue and returns a file descriptor.
If there was an error creating the kernel event queue, a value of -1 is
returned and
.Va errno
is set.
.Pp
.Fn kevent
returns the number of events placed in the
.Fa eventlist ,
up to the value given by
.Fa nevents .
If an error occurs while processing an element of the
.Fa changelist
and there is enough room in the
.Fa eventlist ,
then the event will be placed in the
.Fa eventlist
with
.Dv EV_ERROR
set in
.Va flags
and the system error in
.Va data .
Otherwise,
.Dv -1
will be returned, and
.Dv errno
will be set to indicate the error condition.
If the time limit expires, then
.Fn kevent
returns 0.
.Sh ERRORS
The
.Fn kqueue
function fails if:
.Bl -tag -width Er
.It Bq Er ENOMEM
The kernel failed to allocate enough memory for the kernel queue.
.It Bq Er EMFILE
The per-process descriptor table is full.
.It Bq Er ENFILE
The system file table is full.
.El
.Pp
The
.Fn kevent
function fails if:
.Bl -tag -width Er
.It Bq Er EACCES
The process does not have permission to register a filter.
.It Bq Er EFAULT
There was an error reading or writing the
.Va kevent
structure.
.It Bq Er EBADF
The specified descriptor is invalid.
.It Bq Er EINTR
A signal was delivered before the timeout expired and before any
events were placed on the kqueue for return.
.It Bq Er EINVAL
The specified time limit or filter is invalid.
.It Bq Er ENOENT
The event could not be found to be modified or deleted.
.It Bq Er ENOMEM
No memory was available to register the event.
.It Bq Er ESRCH
The specified process to attach to does not exist.
.El
.Sh SEE ALSO
.Xr poll 2 ,
.Xr read 2 ,
.Xr select 2 ,
.Xr sigaction 2 ,
.Xr wait 2 ,
.Xr write 2 ,
.Xr signal 3
.Sh HISTORY
The
.Fn kqueue
and
.Fn kevent
functions first appeared in
.Fx 4.1 .
.Sh AUTHORS
The
.Fn kqueue
system and this manual page were written by
.An Jonathan Lemon Aq Mt jlemon@FreeBSD.org .
.Sh BUGS
It is currently not possible to watch FIFOs or AIO that reside
on anything but a UFS file system.
Watching a vnode is possible on UFS, NFS and MS-DOS file systems.
.Pp
The
.Fa timeout
value is limited to 24 hours; longer timeouts will be silently
reinterpreted as 24 hours.
535reinterpreted as 24 hours.
536