.\"	$OpenBSD: kqueue.2,v 1.7 2001/07/22 00:46:29 deraadt Exp $
.\"
.\" Copyright (c) 2000 Jonathan Lemon
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD: src/lib/libc/sys/kqueue.2,v 1.18 2001/02/14 08:48:35 guido Exp $
.\"
.Dd April 14, 2000
.Dt KQUEUE 2
.Os
.Sh NAME
.Nm kqueue ,
.Nm kevent
.Nd kernel event notification mechanism
.Sh SYNOPSIS
.Fd #include <sys/types.h>
.Fd #include <sys/event.h>
.Fd #include <sys/time.h>
.Ft int
.Fn kqueue "void"
.Ft int
.Fn kevent "int kq" "const struct kevent *changelist" "int nchanges" "struct kevent *eventlist" "int nevents" "const struct timespec *timeout"
.Fn EV_SET "&kev" ident filter flags fflags data udata
.Sh DESCRIPTION
.Fn kqueue
provides a generic method of notifying the user when an event
happens or a condition holds, based on the results of small
pieces of kernel code termed
.Dq filters .
A kevent is identified by the (ident, filter) pair; a kqueue may hold
at most one kevent for any given pair.
.Pp
The filter is executed upon the initial registration of a kevent
in order to detect whether a preexisting condition is present, and is also
executed whenever an event is passed to the filter for evaluation.
If the filter determines that the condition should be reported,
then the kevent is placed on the kqueue for the user to retrieve.
.Pp
The filter is also run when the user attempts to retrieve the kevent
from the kqueue.
If the filter indicates that the condition that triggered
the event no longer holds, the kevent is removed from the kqueue and
is not returned.
.Pp
Multiple events which trigger the filter do not result in multiple
kevents being placed on the kqueue; instead, the filter will aggregate
the events into a single
.Li struct kevent .
Calling
.Fn close
on a file descriptor will remove any kevents that reference the descriptor.
.Pp
.Fn kqueue
creates a new kernel event queue and returns a descriptor.
The queue is not inherited by a child created with
.Xr fork 2 .
However, if
.Xr rfork 2
is called without the
.Dv RFFDG
flag, then the descriptor table is shared,
which will allow sharing of the kqueue between two processes.
.Pp
.Fn kevent
is used to register events with the queue, and return any pending
events to the user.
.Fa changelist
is a pointer to an array of
.Va kevent
structures, as defined in
.Aq Pa sys/event.h .
All changes contained in the
.Fa changelist
are applied before any pending events are read from the queue.
.Fa nchanges
gives the size of
.Fa changelist .
.Fa eventlist
is a pointer to an array of kevent structures.
.Fa nevents
determines the size of
.Fa eventlist .
If
.Fa timeout
is a non-null pointer, it specifies a maximum interval to wait
for an event, which will be interpreted as a
.Li struct timespec .
If
.Fa timeout
is a null pointer,
.Fn kevent
waits indefinitely.
To effect a poll, the
.Fa timeout
argument should be non-null, pointing to a zero-valued
.Va timespec
structure.
The same array may be used for the
.Fa changelist
and
.Fa eventlist .
.Pp
.Fn EV_SET
is a macro which is provided for ease of initializing a
kevent structure.
.Pp
The
.Va kevent
structure is defined as:
.Bd -literal
struct kevent {
	uintptr_t ident;	/* identifier for this event */
	short	  filter;	/* filter for event */
	u_short	  flags;	/* action flags for kqueue */
	u_int	  fflags;	/* filter flag value */
	intptr_t  data;		/* filter data value */
	void	  *udata;	/* opaque user data identifier */
};
.Ed
.Pp
The fields of
.Li struct kevent
are:
.Bl -tag -width XXXfilter
.It ident
Value used to identify this event.
The exact interpretation is determined by the attached filter,
but often is a file descriptor.
.It filter
Identifies the kernel filter used to process this event.
The pre-defined system filters are described below.
.It flags
Actions to perform on the event.
.It fflags
Filter-specific flags.
.It data
Filter-specific data value.
.It udata
Opaque user-defined value passed through the kernel unchanged.
.El
.Pp
The
.Va flags
field can contain the following values:
.Bl -tag -width XXXEV_ONESHOT
.It Dv EV_ADD
Adds the event to the kqueue.
Re-adding an existing event will modify the parameters of the original event,
and not result in a duplicate entry.
Adding an event automatically enables it, unless overridden by the
.Dv EV_DISABLE
flag.
.It Dv EV_ENABLE
Permit
.Fn kevent
to return the event if it is triggered.
.It Dv EV_DISABLE
Disable the event so
.Fn kevent
will not return it.
The filter itself is not disabled.
.It Dv EV_DELETE
Removes the event from the kqueue.
Events which are attached to file descriptors are automatically deleted
on the last close of the descriptor.
.It Dv EV_ONESHOT
Causes the event to return only the first occurrence of the filter
being triggered.
After the user retrieves the event from the kqueue, it is deleted.
.It Dv EV_CLEAR
After the event is retrieved by the user, its state is reset.
This is useful for filters which report state transitions
instead of the current state.
Note that some filters may automatically set this flag internally.
.It Dv EV_EOF
Filters may set this flag to indicate filter-specific EOF condition.
.It Dv EV_ERROR
See
.Sx RETURN VALUES
below.
.El
.Pp
The predefined system filters are listed below.
Arguments may be passed to and from the filter via the
.Va fflags
and
.Va data
fields in the kevent structure.
.Bl -tag -width EVFILT_SIGNAL
.It Dv EVFILT_READ
Takes a descriptor as the identifier, and returns whenever
there is data available to read.
The behavior of the filter is slightly different depending
on the descriptor type.
.Pp
.Bl -tag -width 2n
.It Sockets
Sockets which have previously been passed to
.Fn listen
return when there is an incoming connection pending.
.Va data
contains the size of the listen backlog.
.Pp
Other socket descriptors return when there is data to be read,
subject to the
.Dv SO_RCVLOWAT
value of the socket buffer.
This may be overridden with a per-filter low water mark at the
time the filter is added by setting the
.Dv NOTE_LOWAT
flag in
.Va fflags ,
and specifying the new low water mark in
.Va data .
On return,
.Va data
contains the number of bytes in the socket buffer.
.Pp
If the read direction of the socket has been shut down, then the filter
also sets
.Dv EV_EOF
in
.Va flags ,
and returns the socket error (if any) in
.Va fflags .
It is possible for EOF to be returned (indicating the connection is gone)
while there is still data pending in the socket buffer.
.It Vnodes
Returns when the file pointer is not at the end of file.
.Va data
contains the offset from current position to end of file,
and may be negative.
.It "Fifos, Pipes"
Returns when there is data to read;
.Va data
contains the number of bytes available.
.Pp
When the last writer disconnects, the filter will set
.Dv EV_EOF
in
.Va flags .
This may be cleared by passing in
.Dv EV_CLEAR ,
at which point the filter will resume waiting for data to become
available before returning.
.El
.It Dv EVFILT_WRITE
Takes a descriptor as the identifier, and returns whenever
it is possible to write to the descriptor.
For sockets, pipes, and fifos,
.Va data
will contain the amount of space remaining in the write buffer.
The filter will set
.Dv EV_EOF
when the reader disconnects, and for the fifo case,
this may be cleared by use of
.Dv EV_CLEAR .
Note that this filter is not supported for vnodes.
.Pp
For sockets, the low water mark and socket error handling are
identical to the
.Dv EVFILT_READ
case.
.It Dv EVFILT_AIO
The sigevent portion of the AIO request is filled in, with
.Va sigev_notify_kqueue
containing the descriptor of the kqueue that the event should
be attached to,
.Va sigev_value
containing the udata value, and
.Va sigev_notify
set to
.Dv SIGEV_EVENT .
When the aio_* function is called, the event will be registered
with the specified kqueue, and the
.Va ident
argument set to the
.Li struct aiocb
returned by the aio_* function.
The filter returns under the same conditions as aio_error.
.Pp
Alternatively, a kevent structure may be initialized, with
.Va ident
containing the descriptor of the kqueue, and the
address of the kevent structure placed in the
.Va aio_lio_opcode
field of the AIO request.
However, this approach will not work on architectures with 64-bit pointers,
and should be considered deprecated.
.It Dv EVFILT_VNODE
Takes a file descriptor as the identifier and the events to watch for in
.Va fflags ,
and returns when one or more of the requested events occurs on the descriptor.
The events to monitor are:
.Bl -tag -width XXNOTE_RENAME
.It Dv NOTE_DELETE
.Fn unlink
was called on the file referenced by the descriptor.
.It Dv NOTE_WRITE
A write occurred on the file referenced by the descriptor.
.It Dv NOTE_EXTEND
The file referenced by the descriptor was extended.
.It Dv NOTE_ATTRIB
The file referenced by the descriptor had its attributes changed.
.It Dv NOTE_LINK
The link count on the file changed.
.It Dv NOTE_RENAME
The file referenced by the descriptor was renamed.
.It Dv NOTE_REVOKE
Access to the file was revoked via
.Xr revoke 2
or the underlying file system was unmounted.
.El
.Pp
On return,
.Va fflags
contains the events which triggered the filter.
.It Dv EVFILT_PROC
Takes the process ID to monitor as the identifier and the events to watch for
in
.Va fflags ,
and returns when the process performs one or more of the requested events.
If a process can normally see another process, it can attach an event to it.
The events to monitor are:
.Bl -tag -width XXNOTE_TRACKERR
.It Dv NOTE_EXIT
The process has exited.
.It Dv NOTE_FORK
The process has called
.Fn fork .
.It Dv NOTE_EXEC
The process has executed a new process via
.Xr execve 2
or similar call.
.It Dv NOTE_TRACK
Follow a process across
.Fn fork
calls.
The parent process will return with
.Dv NOTE_TRACK
set in the
.Va fflags
field, while the child process will return with
.Dv NOTE_CHILD
set in
.Va fflags
and the parent PID in
.Va data .
.It Dv NOTE_TRACKERR
This flag is returned if the system was unable to attach an event to
the child process, usually due to resource limitations.
.El
.Pp
On return,
.Va fflags
contains the events which triggered the filter.
.It Dv EVFILT_SIGNAL
Takes the signal number to monitor as the identifier and returns
when the given signal is delivered to the process.
This coexists with the
.Fn signal
and
.Fn sigaction
facilities, and has a lower precedence.
The filter will record all attempts to deliver a signal to a process,
even if the signal has been marked as
.Dv SIG_IGN .
Event notification happens after normal signal delivery processing.
.Va data
returns the number of times the signal has occurred since the last call to
.Fn kevent .
This filter automatically sets the
.Dv EV_CLEAR
flag internally.
.El
.Sh RETURN VALUES
.Fn kqueue
creates a new kernel event queue and returns a file descriptor.
If there was an error creating the kernel event queue, a value of -1 is
returned and
.Va errno
is set to indicate the error.
.Pp
.Fn kevent
returns the number of events placed in the
.Fa eventlist ,
up to the value given by
.Fa nevents .
If an error occurs while processing an element of the
.Fa changelist
and there is enough room in the
.Fa eventlist ,
then the event will be placed in the
.Fa eventlist
with
.Dv EV_ERROR
set in
.Va flags
and the system error in
.Va data .
Otherwise,
.Dv -1
will be returned, and
.Dv errno
will be set to indicate the error condition.
If the time limit expires, then
.Fn kevent
returns 0.
.Sh ERRORS
The
.Fn kqueue
function fails if:
.Bl -tag -width Er
.It Bq Er ENOMEM
The kernel failed to allocate enough memory for the kernel event queue.
.It Bq Er EMFILE
The per-process descriptor table is full.
.It Bq Er ENFILE
The system file table is full.
.El
.Pp
The
.Fn kevent
function fails if:
.Bl -tag -width Er
.It Bq Er EACCES
The process does not have permission to register a filter.
.It Bq Er EFAULT
There was an error reading or writing the
.Va kevent
structure.
.It Bq Er EBADF
The specified descriptor is invalid.
.It Bq Er EINTR
A signal was delivered before the timeout expired and before any
events were placed on the kqueue for return.
.It Bq Er EINVAL
The specified time limit or filter is invalid.
.It Bq Er ENOENT
The event could not be found to be modified or deleted.
.It Bq Er ENOMEM
No memory was available to register the event.
.It Bq Er ESRCH
The specified process to attach to does not exist.
.El
.Sh SEE ALSO
.Xr poll 2 ,
.Xr read 2 ,
.Xr select 2 ,
.Xr sigaction 2 ,
.Xr write 2 ,
.Xr signal 3
.Sh HISTORY
The
.Fn kqueue
and
.Fn kevent
functions first appeared in
.Fx 4.1 .
.Sh AUTHORS
The
.Fn kqueue
system and this manual page were written by
.An Jonathan Lemon Aq jlemon@FreeBSD.org .
.Sh BUGS
It is currently not possible to watch fifos, AIO, or a vnode that
resides on anything but a UFS file system.
486resides on anything but a UFS file system.
487