.\"	$OpenBSD: kqueue.2,v 1.32 2015/11/07 22:57:52 jmc Exp $
.\"
.\" Copyright (c) 2000 Jonathan Lemon
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD: src/lib/libc/sys/kqueue.2,v 1.18 2001/02/14 08:48:35 guido Exp $
.\"
.Dd $Mdocdate: November 7 2015 $
.Dt KQUEUE 2
.Os
.Sh NAME
.Nm kqueue ,
.Nm kevent ,
.Nm EV_SET
.Nd kernel event notification mechanism
.Sh SYNOPSIS
.In sys/types.h
.In sys/event.h
.In sys/time.h
.Ft int
.Fn kqueue "void"
.Ft int
.Fn kevent "int kq" "const struct kevent *changelist" "int nchanges" "struct kevent *eventlist" "int nevents" "const struct timespec *timeout"
.Fn EV_SET "&kev" ident filter flags fflags data udata
.Sh DESCRIPTION
.Fn kqueue
provides a generic method of notifying the user when an event
happens or a condition holds, based on the results of small
pieces of kernel code termed
.Dq filters .
A kevent is identified by the (ident, filter) pair; there may be only
one kevent with a given pair per kqueue.
.Pp
The filter is executed upon the initial registration of a kevent
in order to detect whether a preexisting condition is present, and is also
executed whenever an event is passed to the filter for evaluation.
If the filter determines that the condition should be reported,
then the kevent is placed on the kqueue for the user to retrieve.
.Pp
The filter is also run when the user attempts to retrieve the kevent
from the kqueue.
If the filter indicates that the condition that triggered
the event no longer holds, the kevent is removed from the kqueue and
is not returned.
.Pp
Multiple events which trigger the filter do not result in multiple
kevents being placed on the kqueue; instead, the filter will aggregate
the events into a single
.Li struct kevent .
Calling
.Fn close
on a file descriptor will remove any kevents that reference the descriptor.
.Pp
.Fn kqueue
creates a new kernel event queue and returns a descriptor.
The queue is not inherited by a child created with
.Xr fork 2 .
Similarly, kqueues cannot be passed across UNIX-domain sockets.
.Pp
.Fn kevent
is used to register events with the queue, and return any pending
events to the user.
.Fa changelist
is a pointer to an array of
.Va kevent
structures, as defined in
.In sys/event.h .
All changes contained in the
.Fa changelist
are applied before any pending events are read from the queue.
.Fa nchanges
gives the size of
.Fa changelist .
.Fa eventlist
is a pointer to an array of kevent structures.
.Fa nevents
determines the size of
.Fa eventlist .
When
.Fa nevents
is zero,
.Fn kevent
will return immediately even if there is a
.Fa timeout
specified, unlike
.Xr select 2 .
If
.Fa timeout
is a non-null pointer, it specifies a maximum interval to wait
for an event, which will be interpreted as a
.Li struct timespec .
If
.Fa timeout
is a null pointer,
.Fn kevent
waits indefinitely.
To effect a poll, the
.Fa timeout
argument should be non-null, pointing to a zero-valued
.Va timespec
structure.
The same array may be used for the
.Fa changelist
and
.Fa eventlist .
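.Pp
The three timeout cases described above can be sketched with a small helper.
This is a minimal illustration and not part of the kqueue interface:
the name kq_timeout and the convention that a negative millisecond
count means "wait indefinitely" are assumptions made for this example.

```c
#include <stddef.h>
#include <time.h>

/*
 * Hypothetical helper: build the timeout argument for kevent().
 *   ms < 0   returns NULL (kevent waits indefinitely)
 *   ms == 0  returns a zero-valued timespec (kevent polls)
 *   ms > 0   returns the interval converted to a timespec
 * The caller supplies the storage in *ts.
 */
const struct timespec *
kq_timeout(long ms, struct timespec *ts)
{
	if (ms < 0)
		return NULL;
	ts->tv_sec = ms / 1000;
	ts->tv_nsec = (ms % 1000) * 1000000L;
	return ts;
}
```

The returned pointer would be passed as the last argument of kevent;
keeping the storage in the caller avoids returning a pointer to a local.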
.Pp
.Fn EV_SET
is a macro which is provided for ease of initializing a
kevent structure.
.Pp
The
.Va kevent
structure is defined as:
.Bd -literal
struct kevent {
	uintptr_t  ident;	/* identifier for this event */
	short	   filter;	/* filter for event */
	u_short	   flags;	/* action flags for kqueue */
	u_int	   fflags;	/* filter flag value */
	quad_t	   data;	/* filter data value */
	void	   *udata;	/* opaque user data identifier */
};
.Ed
.Pp
The fields of
.Li struct kevent
are:
.Bl -tag -width XXXfilter
.It ident
Value used to identify this event.
The exact interpretation is determined by the attached filter,
but often is a file descriptor.
.It filter
Identifies the kernel filter used to process this event.
The pre-defined system filters are described below.
.It flags
Actions to perform on the event.
.It fflags
Filter-specific flags.
.It data
Filter-specific data value.
.It udata
Opaque user-defined value passed through the kernel unchanged.
.El
.Pp
The
.Va flags
field can contain the following values:
.Bl -tag -width XXXEV_ONESHOT
.It Dv EV_ADD
Adds the event to the kqueue.
Re-adding an existing event will modify the parameters of the original event,
and not result in a duplicate entry.
Adding an event automatically enables it, unless overridden by the
.Dv EV_DISABLE
flag.
.It Dv EV_ENABLE
Permit
.Fn kevent
to return the event if it is triggered.
.It Dv EV_DISABLE
Disable the event so
.Fn kevent
will not return it.
The filter itself is not disabled.
.It Dv EV_DELETE
Removes the event from the kqueue.
Events which are attached to file descriptors are automatically deleted
on the last close of the descriptor.
.It Dv EV_ONESHOT
Causes the event to return only the first occurrence of the filter
being triggered.
After the user retrieves the event from the kqueue, it is deleted.
.It Dv EV_CLEAR
After the event is retrieved by the user, its state is reset.
This is useful for filters which report state transitions
instead of the current state.
Note that some filters may automatically set this flag internally.
.It Dv EV_EOF
Filters may set this flag to indicate filter-specific EOF condition.
.It Dv EV_ERROR
See
.Sx RETURN VALUES
below.
.El
.Pp
The predefined system filters are listed below.
Arguments may be passed to and from the filter via the
.Va fflags
and
.Va data
fields in the kevent structure.
.Bl -tag -width EVFILT_SIGNAL
.It Dv EVFILT_READ
Takes a descriptor as the identifier, and returns whenever
there is data available to read.
The behavior of the filter is slightly different depending
on the descriptor type.
.Bl -tag -width 2n
.It Sockets
Sockets which have previously been passed to
.Fn listen
return when there is an incoming connection pending.
.Va data
contains the size of the listen backlog.
.Pp
Other socket descriptors return when there is data to be read,
subject to the
.Dv SO_RCVLOWAT
value of the socket buffer.
This may be overridden with a per-filter low water mark at the
time the filter is added by setting the
.Dv NOTE_LOWAT
flag in
.Va fflags ,
and specifying the new low water mark in
.Va data .
On return,
.Va data
contains the number of bytes in the socket buffer.
.Pp
If the read direction of the socket has shut down, then the filter
also sets
.Dv EV_EOF
in
.Va flags ,
and returns the socket error (if any) in
.Va fflags .
It is possible for EOF to be returned (indicating the connection is gone)
while there is still data pending in the socket buffer.
.It Vnodes
Returns when the file pointer is not at the end of file.
.Va data
contains the offset from current position to end of file,
and may be negative.
If
.Dv NOTE_EOF
is set in
.Va fflags ,
.Fn kevent
will also return when the file pointer is at the end of file.
The end of file condition is indicated by the presence of
.Dv NOTE_EOF
in
.Va fflags
on return.
.It "Fifos, Pipes"
Returns when there is data to read;
.Va data
contains the number of bytes available.
.Pp
When the last writer disconnects, the filter will set
.Dv EV_EOF
in
.Va flags .
This may be cleared by passing in
.Dv EV_CLEAR ,
at which point the filter will resume waiting for data to become
available before returning.
.It "BPF devices"
Returns when the BPF buffer is full, the BPF timeout has expired, or
when the BPF has
.Dq immediate mode
enabled and there is any data to read;
.Va data
contains the number of bytes available.
.El
.It Dv EVFILT_WRITE
Takes a descriptor as the identifier, and returns whenever
it is possible to write to the descriptor.
For sockets, pipes, and FIFOs,
.Va data
will contain the amount of space remaining in the write buffer.
The filter will set
.Dv EV_EOF
when the reader disconnects, and for the FIFO case,
this may be cleared by use of
.Dv EV_CLEAR .
Note that this filter is not supported for vnodes or BPF devices.
.Pp
For sockets, the low water mark and socket error handling are
identical to the
.Dv EVFILT_READ
case.
.\".It Dv EVFILT_AIO
.\"The sigevent portion of the AIO request is filled in, with
.\".Va sigev_notify_kqueue
.\"containing the descriptor of the kqueue that the event should
.\"be attached to,
.\".Va sigev_value
.\"containing the udata value, and
.\".Va sigev_notify
.\"set to
.\".Dv SIGEV_KEVENT .
.\"When the aio_* function is called, the event will be registered
.\"with the specified kqueue, and the
.\".Va ident
.\"argument set to the
.\".Li struct aiocb
.\"returned by the aio_* function.
.\"The filter returns under the same conditions as aio_error.
.\".Pp
.\"Alternatively, a kevent structure may be initialized, with
.\".Va ident
.\"containing the descriptor of the kqueue, and the
.\"address of the kevent structure placed in the
.\".Va aio_lio_opcode
.\"field of the AIO request.
.\"However, this approach will not work on architectures with 64-bit pointers,
.\"and should be considered deprecated.
.It Dv EVFILT_VNODE
Takes a file descriptor as the identifier and the events to watch for in
.Va fflags ,
and returns when one or more of the requested events occurs on the descriptor.
The events to monitor are:
.Bl -tag -width XXNOTE_RENAME
.It Dv NOTE_DELETE
.Fn unlink
was called on the file referenced by the descriptor.
.It Dv NOTE_WRITE
A write occurred on the file referenced by the descriptor.
.It Dv NOTE_EXTEND
The file referenced by the descriptor was extended.
.It Dv NOTE_TRUNCATE
The file referenced by the descriptor was truncated.
.It Dv NOTE_ATTRIB
The file referenced by the descriptor had its attributes changed.
.It Dv NOTE_LINK
The link count on the file changed.
.It Dv NOTE_RENAME
The file referenced by the descriptor was renamed.
.It Dv NOTE_REVOKE
Access to the file was revoked via
.Xr revoke 2
or the underlying file system was unmounted.
.El
.Pp
On return,
.Va fflags
contains the events which triggered the filter.
.It Dv EVFILT_PROC
Takes the process ID to monitor as the identifier and the events to watch for
in
.Va fflags ,
and returns when the process performs one or more of the requested events.
If a process can normally see another process, it can attach an event to it.
The events to monitor are:
.Bl -tag -width XXNOTE_TRACKERR
.It Dv NOTE_EXIT
The process has exited.
The exit status will be stored in
.Va data
in the same format as the status set by
.Xr wait 2 .
.It Dv NOTE_FORK
The process has called
.Fn fork .
.It Dv NOTE_EXEC
The process has executed a new process via
.Xr execve 2
or similar call.
.It Dv NOTE_TRACK
Follow a process across
.Fn fork
calls.
The parent process will return with
.Dv NOTE_FORK
set in the
.Va fflags
field, while the child process will return with
.Dv NOTE_CHILD
set in
.Va fflags
and the parent PID in
.Va data .
.It Dv NOTE_TRACKERR
This flag is returned if the system was unable to attach an event to
the child process, usually due to resource limitations.
.El
.Pp
On return,
.Va fflags
contains the events which triggered the filter.
.It Dv EVFILT_SIGNAL
Takes the signal number to monitor as the identifier and returns
when the given signal is delivered to the process.
This coexists with the
.Fn signal
and
.Fn sigaction
facilities, and has a lower precedence.
The filter will record all attempts to deliver a signal to a process,
even if the signal has been marked as
.Dv SIG_IGN .
Event notification happens after normal signal delivery processing.
.Va data
returns the number of times the signal has occurred since the last call to
.Fn kevent .
This filter automatically sets the
.Dv EV_CLEAR
flag internally.
.It Dv EVFILT_TIMER
Establishes an arbitrary timer identified by
.Va ident .
When adding a timer,
.Va data
specifies the timeout period in milliseconds.
The timer will be periodic unless
.Dv EV_ONESHOT
is specified.
On return,
.Va data
contains the number of times the timeout has expired since the last call to
.Fn kevent .
This filter automatically sets the
.Dv EV_CLEAR
flag internally.
.El
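.Pp
Since the NOTE_EXIT event of the EVFILT_PROC filter stores the exit
status in the same format as the status set by wait, the usual wait
macros can be applied to the data field.
A minimal sketch, in which the helper name exit_code_from_data is
hypothetical:

```c
#include <sys/types.h>
#include <sys/wait.h>

/*
 * Hypothetical helper: interpret the data field of a NOTE_EXIT event.
 * Returns the exit code for a normal exit, or -1 if the process did
 * not exit normally (for example, it was terminated by a signal).
 */
int
exit_code_from_data(long long data)
{
	int status = (int)data;	/* wait(2)-style status word */

	if (WIFEXITED(status))
		return WEXITSTATUS(status);
	return -1;
}
```

A caller would pass the data member of the returned kevent once it has
checked that NOTE_EXIT is set in fflags.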
.Sh RETURN VALUES
.Fn kqueue
creates a new kernel event queue and returns a file descriptor.
If there was an error creating the kernel event queue, a value of -1 is
returned and errno is set.
.Pp
.Fn kevent
returns the number of events placed in the
.Fa eventlist ,
up to the value given by
.Fa nevents .
If an error occurs while processing an element of the
.Fa changelist
and there is enough room in the
.Fa eventlist ,
then the event will be placed in the
.Fa eventlist
with
.Dv EV_ERROR
set in
.Va flags
and the system error in
.Va data .
Otherwise,
.Dv -1
will be returned, and
.Dv errno
will be set to indicate the error condition.
If the time limit expires, then
.Fn kevent
returns 0.
.Sh ERRORS
The
.Fn kqueue
function fails if:
.Bl -tag -width Er
.It Bq Er ENOMEM
The kernel failed to allocate enough memory for the kernel queue.
.It Bq Er EMFILE
The per-process descriptor table is full.
.It Bq Er ENFILE
The system file table is full.
.El
.Pp
The
.Fn kevent
function fails if:
.Bl -tag -width Er
.It Bq Er EACCES
The process does not have permission to register a filter.
.It Bq Er EFAULT
There was an error reading or writing the
.Va kevent
structure.
.It Bq Er EBADF
The specified descriptor is invalid.
.It Bq Er EINTR
A signal was delivered before the timeout expired and before any
events were placed on the kqueue for return.
.It Bq Er EINVAL
The specified time limit or filter is invalid.
.It Bq Er ENOENT
The event could not be found to be modified or deleted.
.It Bq Er ENOMEM
No memory was available to register the event.
.It Bq Er ESRCH
The specified process to attach to does not exist.
.El
.Sh SEE ALSO
.Xr poll 2 ,
.Xr read 2 ,
.Xr select 2 ,
.Xr sigaction 2 ,
.Xr wait 2 ,
.Xr write 2 ,
.Xr signal 3
.Sh HISTORY
The
.Fn kqueue
and
.Fn kevent
functions first appeared in
.Fx 4.1 .
.Sh AUTHORS
The
.Fn kqueue
system and this manual page were written by
.An Jonathan Lemon Aq Mt jlemon@FreeBSD.org .
.Sh BUGS
It is currently not possible to watch FIFOs or AIO that reside
on anything but a UFS file system.
Watching a vnode is possible on UFS, NFS and MS-DOS file systems.
.Pp
The
.Fa timeout
value is limited to 24 hours; longer timeouts will be silently
reinterpreted as 24 hours.
536reinterpreted as 24 hours.
537