.\"	$OpenBSD: kqueue.2,v 1.37 2018/01/13 17:13:12 jmc Exp $
.\"
.\" Copyright (c) 2000 Jonathan Lemon
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD: src/lib/libc/sys/kqueue.2,v 1.18 2001/02/14 08:48:35 guido Exp $
.\"
.Dd $Mdocdate: January 13 2018 $
.Dt KQUEUE 2
.Os
.Sh NAME
.Nm kqueue ,
.Nm kevent ,
.Nm EV_SET
.Nd kernel event notification mechanism
.Sh SYNOPSIS
.In sys/types.h
.In sys/event.h
.In sys/time.h
.Ft int
.Fn kqueue "void"
.Ft int
.Fn kevent "int kq" "const struct kevent *changelist" "int nchanges" "struct kevent *eventlist" "int nevents" "const struct timespec *timeout"
.Fn EV_SET "&kev" ident filter flags fflags data udata
.Sh DESCRIPTION
.Fn kqueue
provides a generic method of notifying the user when an event
happens or a condition holds, based on the results of small
pieces of kernel code termed
.Dq filters .
A kevent is identified by the (ident, filter) pair; there may only
be one unique kevent per kqueue.
.Pp
The filter is executed upon the initial registration of a kevent
in order to detect whether a preexisting condition is present, and is also
executed whenever an event is passed to the filter for evaluation.
If the filter determines that the condition should be reported,
then the kevent is placed on the kqueue for the user to retrieve.
.Pp
The filter is also run when the user attempts to retrieve the kevent
from the kqueue.
If the filter indicates that the condition that triggered
the event no longer holds, the kevent is removed from the kqueue and
is not returned.
.Pp
Multiple events which trigger the filter do not result in multiple
kevents being placed on the kqueue; instead, the filter will aggregate
the events into a single
.Li struct kevent .
Calling
.Fn close
on a file descriptor will remove any kevents that reference the descriptor.
.Pp
.Fn kqueue
creates a new kernel event queue and returns a descriptor.
The queue is not inherited by a child created with
.Xr fork 2 .
Similarly, kqueues cannot be passed across UNIX-domain sockets.
.Pp
.Fn kevent
is used to register events with the queue, and return any pending
events to the user.
.Fa changelist
is a pointer to an array of
.Va kevent
structures, as defined in
.In sys/event.h .
All changes contained in the
.Fa changelist
are applied before any pending events are read from the queue.
.Fa nchanges
gives the size of
.Fa changelist .
.Fa eventlist
is a pointer to an array of kevent structures.
.Fa nevents
determines the size of
.Fa eventlist .
When
.Fa nevents
is zero,
.Fn kevent
will return immediately even if there is a
.Fa timeout
specified, unlike
.Xr select 2 .
If
.Fa timeout
is a non-null pointer, it specifies a maximum interval to wait
for an event, which will be interpreted as a
.Li struct timespec .
If
.Fa timeout
is a null pointer,
.Fn kevent
waits indefinitely.
To effect a poll, the
.Fa timeout
argument should be non-null, pointing to a zero-valued
.Va timespec
structure.
The same array may be used for the
.Fa changelist
and
.Fa eventlist .
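.Pp
As a minimal sketch of the
.Fa timeout
rules described above, the helper below maps a millisecond count onto the
three cases: wait indefinitely, poll, or wait for a bounded interval.
The name
.Fn kq_timeout
and its millisecond convention are assumptions made for illustration;
they are not part of the kqueue API.
.Bd -literal
```c
#include <stddef.h>
#include <time.h>

/*
 * Hypothetical helper (not part of the kqueue API): translate a
 * millisecond count into the timeout argument expected by kevent().
 *   ms <  0  returns NULL, i.e. wait indefinitely
 *   ms == 0  returns a zero-valued timespec, i.e. poll
 *   ms >  0  returns a timespec of ms milliseconds
 * The caller supplies the storage in *ts.
 */
const struct timespec *
kq_timeout(long long ms, struct timespec *ts)
{
	if (ms < 0)
		return NULL;
	ts->tv_sec = (time_t)(ms / 1000);
	ts->tv_nsec = (long)(ms % 1000) * 1000000L;
	return ts;
}
```
.Ed
.Pp
With this helper, the following call would wait at most 250 milliseconds:
.Dl kevent(kq, NULL, 0, events, nevents, kq_timeout(250, &ts));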
.Pp
.Fn EV_SET
is a macro which is provided for ease of initializing a
kevent structure.
.Pp
The
.Va kevent
structure is defined as:
.Bd -literal
struct kevent {
	uintptr_t  ident;	/* identifier for this event */
	short	   filter;	/* filter for event */
	u_short	   flags;	/* action flags for kqueue */
	u_int	   fflags;	/* filter flag value */
	int64_t	   data;	/* filter data value */
	void	   *udata;	/* opaque user data identifier */
};
.Ed
.Pp
The fields of
.Li struct kevent
are:
.Bl -tag -width XXXfilter
.It ident
Value used to identify this event.
The exact interpretation is determined by the attached filter,
but often is a file descriptor.
.It filter
Identifies the kernel filter used to process this event.
The pre-defined system filters are described below.
.It flags
Actions to perform on the event.
.It fflags
Filter-specific flags.
.It data
Filter-specific data value.
.It udata
Opaque user-defined value passed through the kernel unchanged.
.El
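.Pp
To illustrate how these fields fit together, the following sketch fills in a
.Va kevent
with
.Fn EV_SET ,
registers
.Dv EVFILT_READ
on the read end of a pipe, and reads the byte count back from
.Va data .
The helper name
.Fn pipe_bytes_pending ,
the
.Dv HAVE_KQUEUE
guard and the
.Er ENOSYS
fallback are assumptions of this sketch, added only so that it also
compiles on systems without
.In sys/event.h .
.Bd -literal
```c
#include <errno.h>
#include <stddef.h>
#include <time.h>
#include <unistd.h>

/* Assumption: build the kqueue path only where <sys/event.h> exists. */
#if defined(__has_include)
#if __has_include(<sys/event.h>)
#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>
#define HAVE_KQUEUE 1
#endif
#endif

/*
 * Hypothetical helper: write len (> 0) bytes into a fresh pipe, watch
 * the read end with EVFILT_READ and return the count reported in data.
 * Returns -1 on failure; errno is ENOSYS when kqueue is unavailable.
 */
long long
pipe_bytes_pending(const char *buf, size_t len)
{
#ifdef HAVE_KQUEUE
	struct kevent change, event;
	struct timespec poll_now = { 0, 0 };
	long long pending = -1;
	int fds[2], kq;

	if (pipe(fds) == -1)
		return -1;
	if ((kq = kqueue()) == -1) {
		close(fds[0]);
		close(fds[1]);
		return -1;
	}
	/* ident = read end, filter = EVFILT_READ, flags = EV_ADD. */
	EV_SET(&change, fds[0], EVFILT_READ, EV_ADD, 0, 0, NULL);
	/* The data is written first, so the condition already holds
	 * when the filter runs at registration time. */
	if (write(fds[1], buf, len) == (ssize_t)len &&
	    kevent(kq, &change, 1, &event, 1, &poll_now) == 1 &&
	    !(event.flags & EV_ERROR))
		pending = event.data;
	close(kq);
	close(fds[0]);
	close(fds[1]);
	return pending;
#else
	(void)buf;
	(void)len;
	errno = ENOSYS;
	return -1;
#endif
}
```
.Ed
.Pp
Registration and retrieval share a single
.Fn kevent
call here, which relies on the changes being applied before pending
events are read.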
.Pp
The
.Va flags
field can contain the following values:
.Bl -tag -width XXXEV_ONESHOT
.It Dv EV_ADD
Adds the event to the kqueue.
Re-adding an existing event will modify the parameters of the original event,
and not result in a duplicate entry.
Adding an event automatically enables it, unless overridden by the
.Dv EV_DISABLE
flag.
.It Dv EV_ENABLE
Permit
.Fn kevent
to return the event if it is triggered.
.It Dv EV_DISABLE
Disable the event so
.Fn kevent
will not return it.
The filter itself is not disabled.
.It Dv EV_DISPATCH
Disable the event source immediately after delivery of an event.
See
.Dv EV_DISABLE
above.
.It Dv EV_DELETE
Removes the event from the kqueue.
Events which are attached to file descriptors are automatically deleted
on the last close of the descriptor.
.It Dv EV_RECEIPT
Causes
.Fn kevent
to return with
.Dv EV_ERROR
set without draining any pending events after updating events in the kqueue.
When a filter is successfully added the
.Va data
field will be zero.
This flag is useful for making bulk changes to a kqueue.
.It Dv EV_ONESHOT
Causes the event to return only the first occurrence of the filter
being triggered.
After the user retrieves the event from the kqueue, it is deleted.
.It Dv EV_CLEAR
After the event is retrieved by the user, its state is reset.
This is useful for filters which report state transitions
instead of the current state.
Note that some filters may automatically set this flag internally.
.It Dv EV_EOF
Filters may set this flag to indicate filter-specific EOF condition.
.It Dv EV_ERROR
See
.Sx RETURN VALUES
below.
.El
.Pp
The predefined system filters are listed below.
Arguments may be passed to and from the filter via the
.Va fflags
and
.Va data
fields in the kevent structure.
.Bl -tag -width EVFILT_SIGNAL
.It Dv EVFILT_READ
Takes a descriptor as the identifier, and returns whenever
there is data available to read.
The behavior of the filter is slightly different depending
on the descriptor type.
.Bl -tag -width 2n
.It Sockets
Sockets which have previously been passed to
.Fn listen
return when there is an incoming connection pending.
.Va data
contains the size of the listen backlog.
.Pp
Other socket descriptors return when there is data to be read,
subject to the
.Dv SO_RCVLOWAT
value of the socket buffer.
This may be overridden with a per-filter low water mark at the
time the filter is added by setting the
.Dv NOTE_LOWAT
flag in
.Va fflags ,
and specifying the new low water mark in
.Va data .
On return,
.Va data
contains the number of bytes in the socket buffer.
.Pp
If the read direction of the socket has been shut down, then the filter
also sets
.Dv EV_EOF
in
.Va flags ,
and returns the socket error (if any) in
.Va fflags .
It is possible for EOF to be returned (indicating the connection is gone)
while there is still data pending in the socket buffer.
.It Vnodes
Returns when the file pointer is not at the end of file.
.Va data
contains the offset from current position to end of file,
and may be negative.
If
.Dv NOTE_EOF
is set in
.Va fflags ,
.Fn kevent
will also return when the file pointer is at the end of file.
The end of file condition is indicated by the presence of
.Dv NOTE_EOF
in
.Va fflags
on return.
.It "FIFOs, Pipes"
Returns when there is data to read;
.Va data
contains the number of bytes available.
.Pp
When the last writer disconnects, the filter will set
.Dv EV_EOF
in
.Va flags .
This may be cleared by passing in
.Dv EV_CLEAR ,
at which point the filter will resume waiting for data to become
available before returning.
.It "BPF devices"
Returns when the BPF buffer is full, the BPF timeout has expired, or
when the BPF has
.Dq immediate mode
enabled and there is any data to read;
.Va data
contains the number of bytes available.
.El
.It Dv EVFILT_WRITE
Takes a descriptor as the identifier, and returns whenever
it is possible to write to the descriptor.
For sockets, pipes, and FIFOs,
.Va data
will contain the amount of space remaining in the write buffer.
The filter will set
.Dv EV_EOF
when the reader disconnects, and for the FIFO case,
this may be cleared by use of
.Dv EV_CLEAR .
Note that this filter is not supported for vnodes or BPF devices.
.Pp
For sockets, the low water mark and socket error handling are
identical to the
.Dv EVFILT_READ
case.
.\".It Dv EVFILT_AIO
.\"The sigevent portion of the AIO request is filled in, with
.\".Va sigev_notify_kqueue
.\"containing the descriptor of the kqueue that the event should
.\"be attached to,
.\".Va sigev_value
.\"containing the udata value, and
.\".Va sigev_notify
.\"set to
.\".Dv SIGEV_KEVENT .
.\"When the aio_* function is called, the event will be registered
.\"with the specified kqueue, and the
.\".Va ident
.\"argument set to the
.\".Li struct aiocb
.\"returned by the aio_* function.
.\"The filter returns under the same conditions as aio_error.
.\".Pp
.\"Alternatively, a kevent structure may be initialized, with
.\".Va ident
.\"containing the descriptor of the kqueue, and the
.\"address of the kevent structure placed in the
.\".Va aio_lio_opcode
.\"field of the AIO request.
.\"However, this approach will not work on architectures with 64-bit pointers,
.\"and should be considered deprecated.
.It Dv EVFILT_VNODE
Takes a file descriptor as the identifier and the events to watch for in
.Va fflags ,
and returns when one or more of the requested events occurs on the descriptor.
The events to monitor are:
.Bl -tag -width XXNOTE_RENAME
.It Dv NOTE_DELETE
.Fn unlink
was called on the file referenced by the descriptor.
.It Dv NOTE_WRITE
A write occurred on the file referenced by the descriptor.
.It Dv NOTE_EXTEND
The file referenced by the descriptor was extended.
.It Dv NOTE_TRUNCATE
The file referenced by the descriptor was truncated.
.It Dv NOTE_ATTRIB
The file referenced by the descriptor had its attributes changed.
.It Dv NOTE_LINK
The link count on the file changed.
.It Dv NOTE_RENAME
The file referenced by the descriptor was renamed.
.It Dv NOTE_REVOKE
Access to the file was revoked via
.Xr revoke 2
or the underlying file system was unmounted.
.El
.Pp
On return,
.Va fflags
contains the events which triggered the filter.
.It Dv EVFILT_PROC
Takes the process ID to monitor as the identifier and the events to watch for
in
.Va fflags ,
and returns when the process performs one or more of the requested events.
If a process can normally see another process, it can attach an event to it.
The events to monitor are:
.Bl -tag -width XXNOTE_TRACKERR
.It Dv NOTE_EXIT
The process has exited.
The exit status will be stored in
.Va data
in the same format as the status set by
.Xr wait 2 .
.It Dv NOTE_FORK
The process has called
.Fn fork .
.It Dv NOTE_EXEC
The process has executed a new program via
.Xr execve 2
or similar call.
.It Dv NOTE_TRACK
Follow a process across
.Fn fork
calls.
The parent process will return with
.Dv NOTE_FORK
set in the
.Va fflags
field, while the child process will return with
.Dv NOTE_CHILD
set in
.Va fflags
and the parent PID in
.Va data .
.It Dv NOTE_TRACKERR
This flag is returned if the system was unable to attach an event to
the child process, usually due to resource limitations.
.El
.Pp
On return,
.Va fflags
contains the events which triggered the filter.
.It Dv EVFILT_SIGNAL
Takes the signal number to monitor as the identifier and returns
when the given signal is delivered to the process.
This coexists with the
.Fn signal
and
.Fn sigaction
facilities, and has a lower precedence.
The filter will record all attempts to deliver a signal to a process,
even if the signal has been marked as
.Dv SIG_IGN .
Event notification happens after normal signal delivery processing.
.Va data
returns the number of times the signal has occurred since the last call to
.Fn kevent .
This filter automatically sets the
.Dv EV_CLEAR
flag internally.
.It Dv EVFILT_TIMER
Establishes an arbitrary timer identified by
.Va ident .
When adding a timer,
.Va data
specifies the timeout period in milliseconds.
The timer will be periodic unless
.Dv EV_ONESHOT
is specified.
On return,
.Va data
contains the number of times the timeout has expired since the last call to
.Fn kevent .
This filter automatically sets the
.Dv EV_CLEAR
flag internally.
.It Dv EVFILT_DEVICE
Takes a descriptor as the identifier and the events to watch for in
.Va fflags ,
and returns when one or more of the requested events occur on the
descriptor.
The events to monitor are:
.Bl -tag -width XXNOTE_CHANGE
.It Dv NOTE_CHANGE
A device change event has occurred, e.g. an HDMI cable has been plugged in to a port.
.El
.Pp
On return,
.Va fflags
contains the events which triggered the filter.
.El
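.Pp
As one worked instance of the filters listed above, the sketch below arms a
one-shot
.Dv EVFILT_TIMER
and blocks until it fires, returning the expiration count found in
.Va data .
The helper name
.Fn timer_expirations ,
the timer identifier 1, the
.Dv HAVE_KQUEUE
guard and the
.Er ENOSYS
fallback are assumptions chosen for illustration.
.Bd -literal
```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Assumption: build the kqueue path only where <sys/event.h> exists. */
#if defined(__has_include)
#if __has_include(<sys/event.h>)
#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>
#define HAVE_KQUEUE 1
#endif
#endif

/*
 * Hypothetical helper: wait for a one-shot timer of period_ms
 * milliseconds and return the expiration count reported in data.
 * Returns -1 on failure; errno is ENOSYS when kqueue is unavailable.
 */
long long
timer_expirations(int period_ms)
{
#ifdef HAVE_KQUEUE
	struct kevent change, event;
	long long fired = -1;
	int kq;

	if ((kq = kqueue()) == -1)
		return -1;
	/* ident 1 is an arbitrary timer identifier for this sketch. */
	EV_SET(&change, 1, EVFILT_TIMER, EV_ADD | EV_ONESHOT,
	    0, period_ms, NULL);
	/* A null timeout makes kevent() block until the timer fires. */
	if (kevent(kq, &change, 1, &event, 1, NULL) == 1 &&
	    !(event.flags & EV_ERROR))
		fired = event.data;
	close(kq);
	return fired;
#else
	(void)period_ms;
	errno = ENOSYS;
	return -1;
#endif
}
```
.Ed
.Pp
Dropping
.Dv EV_ONESHOT
from the flags would make the timer periodic, in which case
.Va data
may exceed one if several periods elapse between calls to
.Fn kevent .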
.Sh RETURN VALUES
.Fn kqueue
creates a new kernel event queue and returns a file descriptor.
If there was an error creating the kernel event queue, a value of -1 is
returned and
.Va errno
is set.
.Pp
.Fn kevent
returns the number of events placed in the
.Fa eventlist ,
up to the value given by
.Fa nevents .
If an error occurs while processing an element of the
.Fa changelist
and there is enough room in the
.Fa eventlist ,
then the event will be placed in the
.Fa eventlist
with
.Dv EV_ERROR
set in
.Va flags
and the system error in
.Va data .
Otherwise,
.Dv -1
will be returned, and
.Dv errno
will be set to indicate the error condition.
If the time limit expires, then
.Fn kevent
returns 0.
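.Pp
The two error paths described above can be told apart as in the following
sketch, which tries to delete a timer that was never added and returns the
error however it is reported.
The helper name
.Fn delete_missing_event_error ,
the timer identifier 42, the
.Dv HAVE_KQUEUE
guard and the
.Er ENOSYS
fallback are assumptions made for illustration.
.Bd -literal
```c
#include <errno.h>
#include <stddef.h>
#include <time.h>
#include <unistd.h>

/* Assumption: build the kqueue path only where <sys/event.h> exists. */
#if defined(__has_include)
#if __has_include(<sys/event.h>)
#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>
#define HAVE_KQUEUE 1
#endif
#endif

/*
 * Hypothetical helper: delete an event that does not exist and return
 * the resulting error number, or -1 if no error was reported.
 * Returns ENOSYS when kqueue is unavailable.
 */
int
delete_missing_event_error(void)
{
#ifdef HAVE_KQUEUE
	struct kevent change, event;
	struct timespec poll_now = { 0, 0 };
	int kq, n, err = -1;

	if ((kq = kqueue()) == -1)
		return errno;
	/* Timer 42 was never added, so EV_DELETE is expected to fail. */
	EV_SET(&change, 42, EVFILT_TIMER, EV_DELETE, 0, 0, NULL);
	n = kevent(kq, &change, 1, &event, 1, &poll_now);
	if (n == 1 && (event.flags & EV_ERROR))
		err = (int)event.data;	/* reported in the eventlist */
	else if (n == -1)
		err = errno;		/* reported for the whole call */
	close(kq);
	return err;
#else
	return ENOSYS;
#endif
}
```
.Ed
.Pp
Because the
.Fa eventlist
has room for one entry, the failure is expected to arrive as an
.Dv EV_ERROR
event carrying
.Er ENOENT
in
.Va data
rather than as a return value of -1.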
.Sh ERRORS
The
.Fn kqueue
function fails if:
.Bl -tag -width Er
.It Bq Er ENOMEM
The kernel failed to allocate enough memory for the kernel queue.
.It Bq Er EMFILE
The per-process descriptor table is full.
.It Bq Er ENFILE
The system file table is full.
.El
.Pp
The
.Fn kevent
function fails if:
.Bl -tag -width Er
.It Bq Er EACCES
The process does not have permission to register a filter.
.It Bq Er EFAULT
There was an error reading or writing the
.Va kevent
structure.
.It Bq Er EBADF
The specified descriptor is invalid.
.It Bq Er EINTR
A signal was delivered before the timeout expired and before any
events were placed on the kqueue for return.
.It Bq Er EINVAL
The specified time limit or filter is invalid.
.It Bq Er ENOENT
The event could not be found to be modified or deleted.
.It Bq Er ENOMEM
No memory was available to register the event.
.It Bq Er ESRCH
The specified process to attach to does not exist.
.El
.Sh SEE ALSO
.Xr poll 2 ,
.Xr read 2 ,
.Xr select 2 ,
.Xr sigaction 2 ,
.Xr wait 2 ,
.Xr write 2 ,
.Xr signal 3
.Sh HISTORY
The
.Fn kqueue
and
.Fn kevent
functions first appeared in
.Fx 4.1 .
.Sh AUTHORS
The
.Fn kqueue
system and this manual page were written by
.An Jonathan Lemon Aq Mt jlemon@FreeBSD.org .
.Sh BUGS
It is currently not possible to watch FIFOs or AIO that reside
on anything but a UFS file system.
Watching a vnode is possible on UFS, NFS and MS-DOS file systems.
.Pp
The
.Fa timeout
value is limited to 24 hours; longer timeouts will be silently
reinterpreted as 24 hours.
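.Pp
A caller that needs a longer wait could split it into slices of at most
24 hours, as in the sketch below.
The names
.Dv KQ_MAX_WAIT_SEC
and
.Fn next_wait_slice
are hypothetical and exist only for this illustration.
.Bd -literal
```c
#include <time.h>

/* Hypothetical name for the 24-hour limit described above. */
#define KQ_MAX_WAIT_SEC (24 * 60 * 60)

/*
 * Hypothetical helper: take the next slice of at most 24 hours from
 * *remaining_sec and return it as a timespec suitable for kevent().
 * *remaining_sec is reduced by the length of the slice.
 */
struct timespec
next_wait_slice(long long *remaining_sec)
{
	struct timespec ts = { 0, 0 };
	long long slice = *remaining_sec;

	if (slice > KQ_MAX_WAIT_SEC)
		slice = KQ_MAX_WAIT_SEC;
	if (slice < 0)
		slice = 0;
	ts.tv_sec = (time_t)slice;
	*remaining_sec -= slice;
	return ts;
}
```
.Ed
.Pp
The caller would repeat
.Fn kevent
with successive slices for as long as it returns 0 and time remains.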
566