.\" Copyright (c) 2011-2013 Matteo Landi, Luigi Rizzo, Universita` di Pisa
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" This document is derived in part from the enet man page (enet.4)
.\" distributed with 4.3BSD Unix.
.\"
.\" $FreeBSD: head/share/man/man4/netmap.4 228017 2011-11-27 06:55:57Z gjb $
.\"
.Dd May 25, 2019
.Dt NETMAP 4
.Os
.Sh NAME
.Nm netmap
.Nd a framework for fast packet I/O
.Sh SYNOPSIS
.Cd device netmap
.Sh DESCRIPTION
.Nm
is a framework for extremely fast and efficient packet I/O
(reaching 14.88 Mpps with a single core at less than 1 GHz)
for both userspace and kernel clients.
Userspace clients can use the
.Nm
API
to send and receive raw packets through physical interfaces
or ports of the
.Xr vale 4
switch.
.Pp
.Xr vale 4
is a very fast (reaching 20 Mpps per port)
and modular software switch,
implemented within the kernel, which can interconnect
virtual ports, physical devices, and the native host stack.
.Pp
.Nm
uses a memory mapped region to share packet buffers,
descriptors and queues with the kernel.
.Xr ioctl 2
is used to bind interfaces/ports to file descriptors and
implement non-blocking I/O, whereas blocking I/O uses
.Xr select 2
and
.Xr poll 2 .
.Nm
can exploit the parallelism in multiqueue devices and
multicore systems.
.Pp
For the best performance,
.Nm
requires explicit support in device drivers;
a generic emulation layer is available to implement the
.Nm
API on top of unmodified device drivers,
at the price of reduced performance
(but still better than what can be achieved with
.Xr socket 2 ,
.Xr bpf 4 ,
or
.Xr pcap 3 ) .
.Pp
For a list of devices with native
.Nm
support, see section
.Sx SUPPORTED INTERFACES
at the end of this manual page.
.Sh OPERATING THE API
.Nm
clients must first execute the following code to open the device
node and to bind the file descriptor to a specific interface or port:
.Bd -literal -offset indent
fd = open("/dev/netmap", O_RDWR);
ioctl(fd, NIOCREGIF, (struct nmreq *)arg);
.Ed
.Pp
.Nm
has multiple modes of operation controlled by the
content of the
.Vt struct nmreq
passed to
.Xr ioctl 2 .
In particular, the
.Va nr_name
field specifies whether the client operates on a physical network
interface or on a port of a
.Xr vale 4
switch, as indicated below.
Additional fields in the
.Vt struct nmreq
control the details of operation.
.Bl -tag -width XXXX
.It Sy Interface name (e.g. 'em0', 'eth1', ...)
The data path of the interface is disconnected from the host stack.
Depending on additional arguments,
the file descriptor is bound to the NIC (one or all queues),
or to the host stack.
.It Sy valeXXX:YYY (arbitrary XXX and YYY)
The file descriptor is bound to port YYY of a
.Xr vale 4
switch called XXX,
where XXX and YYY are arbitrary alphanumeric strings.
The string cannot exceed IFNAMSIZ characters, and YYY cannot
match the name of any existing interface.
.Pp
The switch and the port are created if they do not exist.
.It Sy valeXXX:ifname (ifname is an existing interface)
Flags in the argument control whether the physical interface
(and optionally the corresponding host stack endpoint)
are connected to or disconnected from the
.Xr vale 4
switch named XXX.
.Pp
In this case
.Xr ioctl 2
is used only for configuring the
.Xr vale 4
switch, typically through the
.Cm vale-ctl
command.
The file descriptor cannot be used for I/O, and should be passed to
.Xr close 2
after issuing
.Xr ioctl 2 .
.El
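.Pp
The naming rules above can be sketched as a small classifier.
This is an illustration only: the
.Fn toy_classify
helper, the
.Dv TOY_IFNAMSIZ
constant, and the choice to reject a
.Ql vale
name that has no port part are assumptions of this sketch, not part of the
.Nm
API.
Telling a new port name from the name of an existing interface needs a
lookup that is omitted here.

```c
#include <assert.h>
#include <string.h>

/* Stand-in for IFNAMSIZ so the sketch builds without system headers. */
#define TOY_IFNAMSIZ 16

enum toy_bind { TOY_BIND_INVALID, TOY_BIND_INTERFACE, TOY_BIND_VALE };

/*
 * Classify a candidate nr_name.  For "valeXXX:YYY" the switch name
 * ("valeXXX") and the port name ("YYY") are copied into sw and port,
 * which must each hold TOY_IFNAMSIZ bytes.
 */
static enum toy_bind
toy_classify(const char *name, char *sw, char *port)
{
	const char *colon;
	size_t len, swlen;

	assert(name != NULL && sw != NULL && port != NULL);
	len = strlen(name);
	if (len == 0 || len >= TOY_IFNAMSIZ)
		return TOY_BIND_INVALID;	/* must fit with its NUL */
	if (strncmp(name, "vale", 4) != 0)
		return TOY_BIND_INTERFACE;	/* e.g. "em0" */
	colon = strchr(name, ':');
	if (colon == NULL || colon[1] == '\0')
		return TOY_BIND_INVALID;	/* port part is missing */
	swlen = (size_t)(colon - name);
	memcpy(sw, name, swlen);
	sw[swlen] = '\0';
	strcpy(port, colon + 1);	/* fits because len < TOY_IFNAMSIZ */
	return TOY_BIND_VALE;
}
```

.Pp
A caller would run such a check before copying the name into
.Va nr_name ;
the kernel remains the final authority on whether the name is accepted.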
.Pp
The binding can be removed (and the interface returns to
regular operation, or the virtual port is destroyed) with a
.Xr close 2
on the file descriptor.
.Pp
The processes owning the file descriptor can then
.Xr mmap 2
the memory region that contains pre-allocated
buffers, descriptors and queues, and use them to
read/write raw packets.
Non-blocking I/O is done with special
.Xr ioctl 2
commands, whereas the file descriptor can be passed to
.Xr select 2
and
.Xr poll 2
to be notified about incoming packets or available transmit buffers.
.Ss DATA STRUCTURES
The data structures in the mmapped memory are described below
(see
.In net/netmap/netmap.h
for reference).
All physical devices operating in
.Nm
mode use the same memory region,
shared by the kernel and all processes that own
.Pa /dev/netmap
descriptors bound to those devices
(NOTE: visibility may be restricted in future implementations).
Virtual ports instead use separate memory regions,
shared only with the kernel.
.Pp
All references between the shared data structures
are relative (offsets or indexes).
Some macros help convert
them into actual pointers.
.Bl -tag -width XXXX
.It Sy struct netmap_if (one per interface)
indicates the number of rings supported by an interface, their
sizes, and the offsets of the
.Nm
rings associated with the interface.
.Pp
.Vt struct netmap_if
is located in the shared memory region at the offset given by the
.Va nr_offset
field of the structure returned by
.Dv NIOCREGIF .
.Bd -literal
struct netmap_if {
    char          ni_name[IFNAMSIZ]; /* name of the interface.    */
    const u_int   ni_version;        /* API version               */
    const u_int   ni_rx_rings;       /* number of rx ring pairs   */
    const u_int   ni_tx_rings;       /* if 0, same as ni_rx_rings */
    const ssize_t ring_ofs[];        /* offset of tx and rx rings */
};
.Ed
.It Sy struct netmap_ring (one per ring)
Contains the positions in the transmit and receive rings to
synchronize the kernel and the application,
and an array of
.Nm
slots describing the buffers.
.Va reserved
is used in receive rings to tell the kernel the number of slots before
.Va cur
that are still in use.
.Pp
Each physical interface has one
.Vt struct netmap_ring
for each hardware transmit and receive ring,
plus one extra transmit and one receive structure
that connect to the host stack.
.Bd -literal
struct netmap_ring {
    const ssize_t  buf_ofs;   /* see details                 */
    const uint32_t num_slots; /* number of slots in the ring */
    uint32_t       avail;     /* number of usable slots      */
    uint32_t       cur;       /* 'current' read/write index  */
    uint32_t       reserved;  /* not refilled before current */

    const uint16_t nr_buf_size;
    uint16_t       flags;
#define NR_TIMESTAMP 0x0002   /* set timestamp on *sync()    */
#define NR_FORWARD   0x0004   /* enable NS_FORWARD for ring  */
#define NR_RX_TSTMP  0x0008   /* set rx timestamp in slots   */
    struct timeval ts;
    struct netmap_slot slot[0]; /* array of slots            */
};
.Ed
.Pp
In transmit rings, after a system call
.Va cur
indicates the first slot that can be used for transmissions, and
.Va avail
reports how many of them are available.
Before the next
.Nm Ns -related
system call on the file
descriptor, the application should fill buffers and
slots with data, and update
.Va cur
and
.Va avail
accordingly, as shown in the figure below:
.Bd -literal
              cur
               |----- avail ---|   (after syscall)
               v
     TX  [*****aaaaaaaaaaaaaaaaa**]
     TX  [*****TTTTTaaaaaaaaaaaa**]
                    ^
                    |-- avail --|   (before syscall)
                   cur
.Ed
.Pp
In receive rings, after a system call
.Va cur
indicates the first slot that contains a valid packet, and
.Va avail
reports how many of them are available.
Before the next
.Nm Ns -related
system call on the file
descriptor, the application can process buffers and
release them to the kernel by updating
.Va cur
and
.Va avail
accordingly, as shown in the figure below.
Receive rings have an additional field called
.Va reserved
to indicate how many buffers before
.Va cur
cannot be released because they are still being processed.
.Bd -literal
                 cur
            |-res-|-- avail --|   (after syscall)
                  v
     RX  [**rrrrrrRRRRRRRRRRRR******]
     RX  [**...........rrrrRRR******]
                       |res|--|<avail (before syscall)
                           ^
                          cur
.Ed
.It Sy struct netmap_slot (one per packet)
contains the metadata for a packet:
.Bd -literal
struct netmap_slot {
    uint32_t buf_idx; /* buffer index */
    uint16_t len;   /* packet length */
    uint16_t flags; /* buf changed, etc. */
#define NS_BUF_CHANGED  0x0001  /* must resync, buffer changed */
#define NS_REPORT       0x0002  /* tell hw to report results,
                                 * e.g. by generating an interrupt
                                 */
#define NS_FORWARD      0x0004  /* pass packet to the other endpoint
                                 * (host stack or device)
                                 */
#define NS_NO_LEARN     0x0008
#define NS_INDIRECT     0x0010
#define NS_MOREFRAG     0x0020
#define NS_PORT_SHIFT   8
#define NS_PORT_MASK    (0xff << NS_PORT_SHIFT)
#define NS_RFRAGS(_slot)        (((_slot)->flags >> 8) & 0xff)
    uint64_t ptr;   /* buffer address (indirect buffers) */
};
.Ed
.Pp
The flags control how the buffer associated with the slot
should be managed.
.It Sy packet buffers
are normally fixed size (2 Kbyte) buffers allocated by the kernel
that contain packet data.
.El
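.Pp
The
.Va cur ,
.Va avail
and
.Va reserved
bookkeeping described above can be sketched with a toy ring.
The
.Vt struct toy_ring
type and the
.Fn toy_tx_claim ,
.Fn toy_rx_consume
and
.Fn toy_rx_release
helpers are hypothetical names introduced for this illustration; they only
model the index arithmetic and do not touch a real
.Nm
ring or issue any system call.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for the index fields of struct netmap_ring. */
struct toy_ring {
	uint32_t num_slots;	/* slots in the ring */
	uint32_t avail;		/* usable slots starting at cur */
	uint32_t cur;		/* current read/write index */
	uint32_t reserved;	/* rx only: slots before cur still held */
};

/* Advance an index by one slot, wrapping at the end of the ring. */
static uint32_t
toy_next(const struct toy_ring *r, uint32_t i)
{
	return (i + 1 == r->num_slots) ? 0 : i + 1;
}

/* Transmit side: claim up to n slots at cur; returns how many were claimed. */
static uint32_t
toy_tx_claim(struct toy_ring *r, uint32_t n)
{
	uint32_t done = 0;

	while (done < n && r->avail > 0) {
		/* a real client would fill the slot and its buffer here */
		r->cur = toy_next(r, r->cur);
		r->avail--;
		done++;
	}
	return done;
}

/*
 * Receive side: consume up to n slots at cur and mark the last `keep`
 * of them as still being processed.  Returns how many were consumed.
 */
static uint32_t
toy_rx_consume(struct toy_ring *r, uint32_t n, uint32_t keep)
{
	uint32_t done = 0;

	while (done < n && r->avail > 0) {
		r->cur = toy_next(r, r->cur);
		r->avail--;
		done++;
	}
	assert(keep <= done);
	r->reserved += keep;
	return done;
}

/* Receive side: hand k previously held slots back to the kernel. */
static void
toy_rx_release(struct toy_ring *r, uint32_t k)
{
	assert(k <= r->reserved);
	r->reserved -= k;
}
```

.Pp
With a 24-slot transmit ring where
.Va cur
is 5 and
.Va avail
is 17, claiming five slots leaves
.Va cur
at 10 and
.Va avail
at 12, which is the situation drawn in the transmit figure.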
.Pp
Addresses are computed through macros in order to
support access to objects in the shared memory region, e.g.:
.Bl -tag -width ".Fn NETMAP_BUF ring buf_idx"
.It Fn NETMAP_TXRING nifp i
Returns the address of the
.Va i Ns -th
transmit ring.
.It Fn NETMAP_RXRING nifp i
Returns the address of the
.Va i Ns -th
receive ring.
.It Fn NETMAP_BUF ring buf_idx
Returns the address of the buffer with index
.Va buf_idx
(which can be part of any ring for the given interface).
.El
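.Pp
The idea behind these macros, turning a stored offset or index into a
pointer, can be sketched with toy structures.
Everything named
.Li toy_
below is hypothetical, and the layout built by
.Fn toy_build
(transmit offsets first, then receive offsets, one shared buffer area) is an
assumption of the sketch; the real definitions live in the
.Nm
headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Toy stand-ins for struct netmap_if and struct netmap_ring. */
struct toy_if {
	uint32_t  tx_rings;
	uint32_t  rx_rings;
	ptrdiff_t ring_ofs[2];	/* ring offsets relative to this header */
};

struct toy_ring {
	ptrdiff_t buf_ofs;	/* offset from this ring to the buffer area */
	uint32_t  buf_size;	/* size of each buffer */
};

#define TOY_OFFSET(type, ptr, ofs) ((type)(void *)((char *)(ptr) + (ofs)))

static struct toy_ring *
toy_txring(struct toy_if *ifp, uint32_t i)
{
	assert(i < ifp->tx_rings);
	return TOY_OFFSET(struct toy_ring *, ifp, ifp->ring_ofs[i]);
}

static struct toy_ring *
toy_rxring(struct toy_if *ifp, uint32_t i)
{
	assert(i < ifp->rx_rings);
	return TOY_OFFSET(struct toy_ring *, ifp,
	    ifp->ring_ofs[ifp->tx_rings + i]);
}

static char *
toy_buf(struct toy_ring *ring, uint32_t idx)
{
	return (char *)ring + ring->buf_ofs + (size_t)idx * ring->buf_size;
}

/* Lay out one header, one tx ring, one rx ring and a buffer area. */
static struct toy_if *
toy_build(void **region_out)
{
	char *base = calloc(1, 4096);
	struct toy_if *ifp;
	struct toy_ring *tx, *rx;

	if (base == NULL)
		return NULL;
	ifp = (struct toy_if *)(void *)(base + 64);
	tx = (struct toy_ring *)(void *)(base + 256);
	rx = (struct toy_ring *)(void *)(base + 512);
	ifp->tx_rings = 1;
	ifp->rx_rings = 1;
	ifp->ring_ofs[0] = (char *)tx - (char *)ifp;
	ifp->ring_ofs[1] = (char *)rx - (char *)ifp;
	tx->buf_ofs = (base + 1024) - (char *)tx;
	rx->buf_ofs = (base + 1024) - (char *)rx;
	tx->buf_size = rx->buf_size = 256;
	*region_out = base;
	return ifp;
}
```

.Pp
Because each ring stores its own offset to the common buffer area, the same
buffer index resolves to the same address from either ring in this sketch.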
.Ss FLAGS
Normally, buffers are associated with slots when interfaces are bound,
and one packet is fully contained in a single buffer.
Clients can, however, modify the mapping using the
following flags:
.Bl -tag -width ".Fn NS_RFRAGS slot"
.It Dv NS_BUF_CHANGED
indicates that the
.Va buf_idx
in the slot has changed.
This can be useful if the client wants to implement
some form of zero-copy forwarding (e.g. by passing buffers
from an input interface to an output interface), or
needs to process packets out of order.
.Pp
The flag MUST be used whenever the buffer index is changed.
.It Dv NS_REPORT
indicates that we want to be woken up when this buffer
has been transmitted.
This reduces performance but ensures
a prompt notification when a buffer has been sent.
Normally,
.Nm
notifies transmit completions in batches, hence signals
may be delayed indefinitely.
However, we need such notifications
before closing a descriptor.
.It Dv NS_FORWARD
When the device is opened in
.Sq transparent
mode, the client can mark slots in receive rings with this flag.
Packets in marked slots are forwarded to
the other endpoint at the next system call, thus restoring
(in a selective way) the connection between the NIC and the
host stack.
.It Dv NS_NO_LEARN
tells the forwarding code that the SRC MAC address for this
packet should not be used in the learning bridge.
.It Dv NS_INDIRECT
indicates that the packet's payload is not in the
.Nm Ns -supplied
buffer, but in a user-supplied buffer whose
user virtual address is in the
.Va ptr
field of the slot.
The size can reach 65535 bytes.
This is only supported on the transmit ring of virtual ports.
.It Dv NS_MOREFRAG
indicates that the packet continues with subsequent buffers;
the last buffer in a packet must have the flag cleared.
The maximum length of a chain is 64 buffers.
This is only supported on virtual ports.
.It Fn NS_RFRAGS slot
on receive rings, returns the number of remaining buffers
in a packet, including this one.
Slots with a value greater than 1 also have
.Dv NS_MOREFRAG
set.
The length refers to the individual buffer;
there is no field for the total length.
.Pp
On transmit rings, if
.Dv NS_DST
is set, it is passed to the lookup
function, which can use it e.g. as the index of the destination
port instead of doing an address lookup.
.El
.Sh SYSTEM CALLS
.Nm
supports
.Xr ioctl 2
commands to synchronize the state of the rings
between the kernel and the user processes, as well as
to query and configure the interface.
The former do not require any argument, whereas the latter use a
.Vt struct nmreq
defined as follows:
.Bd -literal
struct nmreq {
        char      nr_name[IFNAMSIZ];
        uint32_t  nr_version;     /* API version */
#define NETMAP_API      4         /* current version */
        uint32_t  nr_offset;      /* nifp offset in the shared region */
        uint32_t  nr_memsize;     /* size of the shared region */
        uint32_t  nr_tx_slots;    /* slots in tx rings */
        uint32_t  nr_rx_slots;    /* slots in rx rings */
        uint16_t  nr_tx_rings;    /* number of tx rings */
        uint16_t  nr_rx_rings;    /* number of rx rings */
        uint16_t  nr_ringid;      /* ring(s) we care about */
#define NETMAP_HW_RING    0x4000  /* low bits indicate one hw ring */
#define NETMAP_SW_RING    0x2000  /* we process the sw ring */
#define NETMAP_NO_TX_POLL 0x1000  /* no gratuitous txsync on poll */
#define NETMAP_RING_MASK  0xfff   /* the actual ring number */
        uint16_t  nr_cmd;
#define NETMAP_BDG_ATTACH       1 /* attach the NIC */
#define NETMAP_BDG_DETACH       2 /* detach the NIC */
#define NETMAP_BDG_LOOKUP_REG   3 /* register lookup function */
#define NETMAP_BDG_LIST         4 /* get bridge's info */
        uint16_t  nr_arg1;
        uint16_t  nr_arg2;
        uint32_t  spare2[3];
};
.Ed
.Pp
A device descriptor obtained through
.Pa /dev/netmap
supports the
.Xr ioctl 2
command codes supported by network devices, as well as
specific command codes defined in
.In net/netmap/netmap.h .
These specific command codes are as follows:
.Bl -tag -width ".Dv NIOCTXSYNC"
.It Dv NIOCGINFO
returns
.Dv EINVAL
if the named device does not support
.Nm .
Otherwise, it returns zero and advisory information
about the interface.
Note that all the information below can change before the
interface is actually put into
.Nm
mode.
.Pp
.Va nr_memsize
indicates the size of the
.Nm
memory region.
Physical devices all share the same memory region, whereas
.Xr vale 4
ports may have independent regions for each port.
These sizes can be set through system-wide
.Xr sysctl 8
variables.
.Va nr_tx_slots
and
.Va nr_rx_slots
indicate the size of transmit and receive rings, respectively.
.Va nr_tx_rings
and
.Va nr_rx_rings
indicate the number of transmit and receive rings, respectively.
Both ring number and size may be configured at runtime
using interface-specific functions (e.g.\&
.Xr sysctl 8
on BSD, or
.Xr ethtool 8
on Linux).
492fb578518SFranco Fichtner.It Dv NIOCREGIF
4937c417b37SFranco Fichtnerputs the interface specified via
4947c417b37SFranco Fichtner.Va nr_name
4957c417b37SFranco Fichtnerinto
4967c417b37SFranco Fichtner.Nm
4977c417b37SFranco Fichtnermode, disconnecting it from the host stack, and/or defines which
4987c417b37SFranco Fichtnerrings are controlled through this file descriptor.
4997c417b37SFranco FichtnerOn return, it gives the same info as
5007c417b37SFranco Fichtner.Dv NIOCGINFO ,
5017c417b37SFranco Fichtnerand
5027c417b37SFranco Fichtner.Va nr_ringid
503fb578518SFranco Fichtnerindicates the identity of the rings controlled through the file
504fb578518SFranco Fichtnerdescriptor.
505fb578518SFranco Fichtner.Pp
5067c417b37SFranco FichtnerPossible values for
5077c417b37SFranco Fichtner.Va nr_ringid
5087c417b37SFranco Fichtnerare as follows:
5097c417b37SFranco Fichtner.Bl -tag -width "Dv NETMAP_HW_RING + i"
510fb578518SFranco Fichtner.It 0
5117c417b37SFranco Fichtnerdefault; all hardware rings
5127c417b37SFranco Fichtner.It Dv NETMAP_SW_RING
5137c417b37SFranco Fichtner.Dq host rings
5147c417b37SFranco Fichtnerconnecting to the host stack
5157c417b37SFranco Fichtner.It Dv NETMAP_HW_RING + i
5167c417b37SFranco Fichtneri-th hardware ring
517fb578518SFranco Fichtner.El
518fb578518SFranco Fichtner.Pp
5197c417b37SFranco FichtnerBy default, a
5207c417b37SFranco Fichtner.Xr poll 2
5217c417b37SFranco Fichtneror
5227c417b37SFranco Fichtner.Xr select 2
5237c417b37SFranco Fichtnercall pushes out any pending packets on the transmit ring, even if
5247c417b37SFranco Fichtnerno write events were specified.
5257c417b37SFranco FichtnerThe feature can be disabled by OR-ing the flag
5267c417b37SFranco Fichtner.Dv NETMAP_NO_TX_SYNC
5277c417b37SFranco Fichtnerinto
5287c417b37SFranco Fichtner.Va nr_ringid .
5297c417b37SFranco FichtnerNormally, you should keep this feature unless you are using
5307c417b37SFranco Fichtnerseparate file descriptors for the send and receive rings, because
5317c417b37SFranco Fichtnerotherwise packets are pushed out only if
5327c417b37SFranco Fichtner.Dv NETMAP_TXSYNC
5337c417b37SFranco Fichtneris called, or the send queue is full.
5347c417b37SFranco Fichtner.Pp
5357c417b37SFranco Fichtner.Dv NIOCREGIF
536fb578518SFranco Fichtnercan be used multiple times to change the association of a
537fb578518SFranco Fichtnerfile descriptor to a ring pair, always within the same device.
538fb578518SFranco Fichtner.Pp
539fb578518SFranco FichtnerWhen registering a virtual interface that is dynamically created to a
540fb578518SFranco Fichtner.Xr vale 4
541fb578518SFranco Fichtnerswitch, we can specify the desired number of rings (1 by default,
5427c417b37SFranco Fichtnerand currently up to 16) by setting the
5437c417b37SFranco Fichtner.Va nr_tx_rings
5447c417b37SFranco Fichtnerand
5457c417b37SFranco Fichtner.Va nr_rx_rings
5467c417b37SFranco Fichtnerfields accordingly.
.It Dv NIOCTXSYNC
tells the hardware about new packets to transmit, and updates the
number of slots available for transmission.
.It Dv NIOCRXSYNC
tells the hardware about consumed packets, and asks for newly available
packets.
.El
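.Pp
As a minimal sketch of how the
.Va nr_ringid
values described above combine, the hypothetical helper
.Fn ringid_for_hw_ring
below selects the i-th hardware ring and optionally disables the
implicit transmit sync.
The fallback constant values are placeholders (assumptions, not
authoritative definitions) so that the fragment compiles without the
netmap headers; real code should take them from the netmap headers.
.Bd -literal
```c
#include <stdint.h>

/*
 * Placeholder values, used only when the netmap headers are not
 * included; they are assumptions, not authoritative definitions.
 */
#ifndef NETMAP_HW_RING
#define NETMAP_HW_RING		0x4000
#endif
#ifndef NETMAP_NO_TX_SYNC
#define NETMAP_NO_TX_SYNC	0x1000
#endif

/*
 * Hypothetical helper: build an nr_ringid value that binds the file
 * descriptor to hardware ring i, optionally without implicit tx sync.
 */
uint16_t
ringid_for_hw_ring(unsigned int i, int no_tx_sync)
{
	uint16_t id = (uint16_t)(NETMAP_HW_RING + i);

	if (no_tx_sync)
		id |= NETMAP_NO_TX_SYNC;
	return (id);
}
```
.Ed
.Pp
The result would be stored in
.Va nr_ringid
before issuing
.Dv NIOCREGIF .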
.Pp
.Nm
uses
.Xr select 2
and
.Xr poll 2
to wake up processes when significant events occur, and
.Xr mmap 2
to map memory.
.Pp
Applications may need to create threads and bind them to
specific cores to improve performance, using standard
OS primitives; see
.Xr pthread 3 .
In particular,
.Xr pthread_setaffinity_np 3
may be of use.
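.Pp
The following sketch shows one way a thread might bind itself to a
single core.
It assumes the glibc-style
.Vt cpu_set_t
interface; header names and the CPU set type differ between systems,
so treat the includes as assumptions to check against the local
.Xr pthread_setaffinity_np 3
page.
The helper name
.Fn pin_self_to_first_cpu
is hypothetical.
.Bd -literal
```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* Linux/glibc: expose the _np calls */
#endif
#include <pthread.h>
#include <sched.h>
#if !defined(__linux__)
#include <pthread_np.h>		/* assumed BSD location of the _np calls */
#endif

/*
 * Hypothetical helper: restrict the calling thread to the first CPU
 * it is currently allowed to run on.
 * Returns the CPU number on success, -1 on failure.
 */
int
pin_self_to_first_cpu(void)
{
	cpu_set_t set;
	int cpu;

	if (pthread_getaffinity_np(pthread_self(), sizeof(set), &set) != 0)
		return (-1);
	/* find the lowest-numbered CPU in the current affinity set */
	for (cpu = 0; cpu < CPU_SETSIZE; cpu++) {
		if (CPU_ISSET(cpu, &set))
			break;
	}
	if (cpu == CPU_SETSIZE)
		return (-1);
	/* shrink the set to that single CPU and apply it */
	CPU_ZERO(&set);
	CPU_SET(cpu, &set);
	if (pthread_setaffinity_np(pthread_self(), sizeof(set), &set) != 0)
		return (-1);
	return (cpu);
}
```
.Ed
.Pp
A worker thread would typically call such a helper once, before
entering its packet processing loop.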
.Sh EXAMPLES
The following code implements a traffic generator:
.Bd -literal
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <net/if.h>		/* IFNAMSIZ */
#include <net/netmap/netmap_user.h>

#include <err.h>
#include <fcntl.h>
#include <poll.h>
#include <stdint.h>
#include <string.h>

int
main(void)
{
	struct netmap_if *nifp;
	struct netmap_ring *ring;
	struct pollfd fds;
	struct nmreq nmr;
	void *p;
	int fd;

	fd = open("/dev/netmap", O_RDWR);
	if (fd < 0)
		err(1, "open");
	/* bind the descriptor to ix0 (nr_ringid 0: all hardware rings) */
	memset(&nmr, 0, sizeof(nmr));
	strlcpy(nmr.nr_name, "ix0", sizeof(nmr.nr_name));
	nmr.nr_version = NETMAP_API;
	if (ioctl(fd, NIOCREGIF, &nmr) < 0)
		err(1, "NIOCREGIF");
	/* map the shared region that holds the rings and buffers */
	p = mmap(0, nmr.nr_memsize, PROT_WRITE | PROT_READ,
	    MAP_SHARED, fd, 0);
	if (p == MAP_FAILED)
		err(1, "mmap");
	nifp = NETMAP_IF(p, nmr.nr_offset);
	ring = NETMAP_TXRING(nifp, 0);
	fds.fd = fd;
	fds.events = POLLOUT;

	for (;;) {
		/* wait for free slots; also pushes out queued packets */
		poll(&fds, 1, -1);
		for (; ring->avail > 0; ring->avail--) {
			uint32_t i;
			void *buf;

			i = ring->cur;
			buf = NETMAP_BUF(ring, ring->slot[i].buf_idx);
			/* prepare packet in buf */
			ring->slot[i].len = 0; /* replace 0 with the packet length */
			ring->cur = NETMAP_RING_NEXT(ring, i);
		}
	}
}
.Ed
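.Pp
The inner loop above advances
.Va cur
with
.Dv NETMAP_RING_NEXT ,
which is assumed here to wrap to slot 0 at the end of the ring.
The following sketch reproduces that index arithmetic on a stand-in
structure so that it can be exercised without a network interface.
.Vt struct sketch_ring
and both functions are hypothetical and much simpler than the real
.Vt struct netmap_ring .
.Bd -literal
```c
#include <stdint.h>

#define SKETCH_SLOTS	4	/* arbitrary ring size for the sketch */

/* Hypothetical stand-in for the few ring fields the loop uses. */
struct sketch_ring {
	uint32_t num_slots;	/* total slots in the ring */
	uint32_t avail;		/* slots available for transmission */
	uint32_t cur;		/* next slot to fill */
	uint16_t len[SKETCH_SLOTS];	/* per-slot packet length */
};

/* Assumed wrap rule: the slot after the last one is slot 0. */
uint32_t
sketch_ring_next(const struct sketch_ring *ring, uint32_t i)
{
	return (i + 1 == ring->num_slots ? 0 : i + 1);
}

/*
 * Fill every available slot with a packet of pktlen bytes, as the
 * traffic generator loop does; returns the number of slots used.
 */
uint32_t
sketch_fill(struct sketch_ring *ring, uint16_t pktlen)
{
	uint32_t used = 0;

	for (; ring->avail > 0; ring->avail--) {
		uint32_t i = ring->cur;

		ring->len[i] = pktlen;
		ring->cur = sketch_ring_next(ring, i);
		used++;
	}
	return (used);
}
```
.Ed
.Pp
Starting from
.Va cur
= 2 with three available slots in a four-slot ring, the sketch fills
slots 2, 3 and 0 and leaves
.Va cur
at 1.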
.Sh SUPPORTED INTERFACES
.Nm
supports the following interfaces:
.Xr em 4 ,
.Xr igb 4 ,
.Xr ixgbe 4 ,
.Xr lem 4 ,
and
.Xr re 4 .
.Sh SEE ALSO
.Xr vale 4
.Rs
.%A Luigi Rizzo
.%T Revisiting network I/O APIs: the netmap framework
.%J Communications of the ACM
.%V 55 (3)
.%P 45-51
.%D March 2012
.Re
.Rs
.%A Luigi Rizzo
.%T netmap: a novel framework for fast packet I/O
.%D June 2012
.%O USENIX ATC '12, Boston
.Re
.Pp
.Lk http://info.iet.unipi.it/~luigi/netmap/
.Sh AUTHORS
.An -nosplit
The
.Nm
framework was originally designed and implemented at the
Universita` di Pisa in 2011 by
.An Luigi Rizzo ,
and further extended with help from
.An Matteo Landi ,
.An Gaetano Catalli ,
.An Giuseppe Lettieri ,
and
.An Vincenzo Maffione .
.Pp
.Nm
and
.Xr vale 4
have been funded by the European Commission within the FP7 Projects
CHANGE (257422) and OPENLAB (287581).