..  SPDX-License-Identifier: BSD-3-Clause
    Copyright(c) 2018-2019 Cisco Systems, Inc.

======================
Memif Poll Mode Driver
======================

The shared memory packet interface (memif) PMD allows DPDK and any other client
using memif (DPDK, VPP, libmemif) to communicate over shared memory. Memif is
Linux only.

The created device transmits packets in a raw format. Memif can be used in
Ethernet mode, IP mode, or Punt/Inject mode. At the moment, the DPDK memif
implementation supports only Ethernet mode.

Memif works in two roles: server and client. The client connects to the server
over an existing socket. The client is also the producer of the shared memory
file and initializes the shared memory. Each interface can be connected to only
one peer interface at a time. The peer interface is identified by the ``id``
parameter. The server creates the socket and listens for client connection
requests. The socket may already exist on the system. If you are creating a
server interface, be sure to remove any such socket first, or you will see an
"Address already in use" error. The function ``rte_pmd_memif_remove()``, which
removes a memif interface, also removes the listener socket if it is not being
used by any other interface.

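The socket clean-up described above can be sketched in C as follows. This is a
minimal illustration under stated assumptions, not code from the driver:
``remove_stale_socket()`` is a hypothetical helper, it only makes sense for a
socket that exists as a filesystem path, and it does not check whether another
process is still listening on that socket.

.. code-block:: c

    #include <errno.h>
    #include <sys/stat.h>
    #include <unistd.h>

    /*
     * Hypothetical helper: remove a leftover socket file before a server
     * interface binds to the same path.
     * Returns 0 if the path is free afterwards, -1 otherwise.
     */
    static int
    remove_stale_socket(const char *path)
    {
        struct stat st;

        if (lstat(path, &st) != 0)
            return errno == ENOENT ? 0 : -1; /* nothing to remove */

        if (!S_ISSOCK(st.st_mode))
            return -1; /* refuse to delete something that is not a socket */

        return unlink(path) == 0 ? 0 : -1;
    }

A server application could call ``remove_stale_socket("/tmp/memif.sock")``
before creating the interface and stop if the call returns -1.
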
To enable one or more interfaces, use the ``--vdev=net_memif0`` option on the
DPDK application command line. Each ``--vdev=net_memifN`` option creates one
interface, named net_memif0, net_memif1, and so on. Memif uses a Unix domain
socket to transmit control messages. Each memif interface has an id that is
unique per socket. This id is used to identify the peer interface. If you are
connecting multiple interfaces using the same socket, be sure to specify unique
ids: ``id=0``, ``id=1``, etc. Note that if you assign a socket to a server
interface, it becomes a listener socket. A listener socket cannot be used by a
client interface in the same process.

.. csv-table:: **Memif configuration options**
   :header: "Option", "Description", "Default", "Valid value"

   "id=0", "Used to identify peer interface", "0", "uint32_t"
   "role=server", "Set memif role", "client", "server|client"
   "bsize=1024", "Size of single packet buffer", "2048", "uint16_t"
   "rsize=11", "Log2 of ring size. If rsize is 10, actual ring size is 1024", "10", "1-14"
   "socket=/tmp/memif.sock", "Socket filename", "/tmp/memif.sock", "string len 108"
   "socket-abstract=no", "Set usage of abstract socket address", "yes", "yes|no"
   "owner-uid=1000", "Set socket listener owner uid. Only relevant to server with socket-abstract=no", "unchanged", "uid_t"
   "owner-gid=1000", "Set socket listener owner gid. Only relevant to server with socket-abstract=no", "unchanged", "gid_t"
   "mac=01:23:45:ab:cd:ef", "MAC address", "01:ab:23:cd:45:ef", ""
   "secret=abc123", "Optional security option which, if specified, must be matched by the peer", "", "string len 24"
   "zero-copy=yes", "Enable/disable zero-copy client mode. Only relevant to client, requires '--single-file-segments' EAL argument", "no", "yes|no"

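As a rough illustration of how these options combine, the sketch below
validates a few of them and formats a ``--vdev`` argument. The ``memif_opts``
structure and both helper functions are hypothetical names introduced for this
example. The limits checked are the ones listed in the table, and the socket
check assumes that the 108 bytes include the terminating NUL.

.. code-block:: c

    #include <stdio.h>
    #include <string.h>

    /* Hypothetical container for a few of the options in the table. */
    struct memif_opts {
        const char *role;   /* "server" or "client" */
        unsigned int id;    /* peer identifier, unique per socket */
        unsigned int rsize; /* log2 of the ring size, valid range 1-14 */
        const char *socket; /* socket filename */
    };

    /*
     * Format "--vdev=net_memif<idx>,role=...,id=...,rsize=...,socket=...".
     * Returns the string length, or -1 on invalid input or truncation.
     */
    static int
    memif_format_vdev(char *buf, size_t len, unsigned int idx,
                      const struct memif_opts *o)
    {
        int n;

        if (strcmp(o->role, "server") != 0 && strcmp(o->role, "client") != 0)
            return -1;
        if (o->rsize < 1 || o->rsize > 14)
            return -1;
        if (strlen(o->socket) >= 108) /* assumption: limit includes the NUL */
            return -1;

        n = snprintf(buf, len,
                     "--vdev=net_memif%u,role=%s,id=%u,rsize=%u,socket=%s",
                     idx, o->role, o->id, o->rsize, o->socket);
        return (n < 0 || (size_t)n >= len) ? -1 : n;
    }

    /* Number of descriptors in a ring: rsize=10 gives 1024. */
    static unsigned int
    memif_ring_entries(unsigned int rsize)
    {
        return 1u << rsize;
    }

With ``role=server``, ``id=0``, ``rsize=11`` and the default socket, the helper
produces ``--vdev=net_memif0,role=server,id=0,rsize=11,socket=/tmp/memif.sock``,
and ``memif_ring_entries(11)`` gives a ring of 2048 descriptors.
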
**Connection establishment**

To create a memif connection, two memif interfaces are needed, each in a
separate process: one in the ``server`` role and the other in the ``client``
role. It is not possible to connect two interfaces in a single process. Each
interface can be connected to only one interface at a time, identified by a
matching ``id`` parameter.

The memif driver uses a Unix domain socket to exchange the required information
between memif interfaces. The socket file path is specified at interface
creation; see the *Memif configuration options* table above. If a socket is
used by a ``server`` interface, it is marked as a listener socket (in the scope
of the current process) and listens for connection requests from other
processes. One socket can be used by multiple interfaces. One process can have
``client`` and ``server`` interfaces at the same time, provided each role is
assigned a unique socket.

For detailed information on memif control messages, see: net/memif/memif.h.

The client interface attempts to make a connection on the assigned socket. The
process listening on this socket accepts the connection request and creates a
new connected socket (the control channel). It then sends a 'hello' message
(``MEMIF_MSG_TYPE_HELLO``) containing the configuration boundaries. The client
interface adjusts its configuration accordingly and sends an 'init' message
(``MEMIF_MSG_TYPE_INIT``). Among other things, this message contains the
interface id. The driver uses this id to find the server interface and assigns
the control channel to it. If such an interface is found, an 'ack' message
(``MEMIF_MSG_TYPE_ACK``) is sent. The client interface then sends an
'add region' message (``MEMIF_MSG_TYPE_ADD_REGION``) for every region
allocated, and the server responds to each of these messages with an 'ack'
message. The same applies to rings: the client sends an 'add ring' message
(``MEMIF_MSG_TYPE_ADD_RING``) for every initialized ring, and the server again
responds to each with an 'ack' message. To finalize the connection, the client
interface sends a 'connect' message (``MEMIF_MSG_TYPE_CONNECT``). Upon
receiving this message, the server maps the regions into its address space,
initializes the rings, and responds with a 'connected' message
(``MEMIF_MSG_TYPE_CONNECTED``). A 'disconnect' message
(``MEMIF_MSG_TYPE_DISCONNECT``) can be sent by either the server or the client
interface at any time, due to a driver error or because the interface is being
deleted.

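The message order described above can be summarized in a short sketch. The
enum and function names below are hypothetical and exist only for this
illustration (the driver's own definitions are in memif.h). The code simply
lists who sends what for a client that registers a given number of regions and
rings; it does not model disconnects or errors.

.. code-block:: c

    #include <stddef.h>

    /* Hypothetical names, not the driver's message type definitions. */
    enum hs_msg { HS_HELLO, HS_INIT, HS_ACK, HS_ADD_REGION, HS_ADD_RING,
                  HS_CONNECT, HS_CONNECTED };
    enum hs_side { HS_SERVER, HS_CLIENT };

    struct hs_step {
        enum hs_side sender;
        enum hs_msg msg;
    };

    /* Store one step if there is room; always count it. */
    static size_t
    hs_push(struct hs_step *out, size_t cap, size_t n,
            enum hs_side sender, enum hs_msg msg)
    {
        if (n < cap) {
            out[n].sender = sender;
            out[n].msg = msg;
        }
        return n + 1;
    }

    /*
     * Fill 'out' with the handshake order for a client that registers
     * num_regions regions and num_rings rings.
     * Returns the total number of steps, even if 'out' is too small.
     */
    static size_t
    hs_trace(struct hs_step *out, size_t cap,
             unsigned int num_regions, unsigned int num_rings)
    {
        size_t n = 0;
        unsigned int i;

        n = hs_push(out, cap, n, HS_SERVER, HS_HELLO);
        n = hs_push(out, cap, n, HS_CLIENT, HS_INIT);
        n = hs_push(out, cap, n, HS_SERVER, HS_ACK);
        for (i = 0; i < num_regions; i++) {
            n = hs_push(out, cap, n, HS_CLIENT, HS_ADD_REGION);
            n = hs_push(out, cap, n, HS_SERVER, HS_ACK);
        }
        for (i = 0; i < num_rings; i++) {
            n = hs_push(out, cap, n, HS_CLIENT, HS_ADD_RING);
            n = hs_push(out, cap, n, HS_SERVER, HS_ACK);
        }
        n = hs_push(out, cap, n, HS_CLIENT, HS_CONNECT);
        n = hs_push(out, cap, n, HS_SERVER, HS_CONNECTED);
        return n;
    }

For one region and two rings, ``hs_trace()`` reports 11 steps: hello, init and
its ack, one acknowledged 'add region', two acknowledged 'add ring' messages,
then connect and connected.
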
Files

- net/memif/memif.h *- control messages definitions*
- net/memif/memif_socket.h
- net/memif/memif_socket.c

Shared memory
~~~~~~~~~~~~~

**Shared memory format**

The client is the producer and the server is the consumer. Memory regions are
mapped shared memory files, created by the memif client and provided to the
server at connection establishment. Regions contain rings and buffers. Rings
and buffers can also be separated into multiple regions. With zero-copy
disabled, rings and buffers are stored inside a single memory region to reduce
the number of open files.

region n (no-zero-copy):

+-----------------------+-------------------------------------------------------------------------+
| Rings                 | Buffers                                                                 |
+-----------+-----------+-----------------+---+---------------------------------------------------+
| C2S rings | S2C rings | packet buffer 0 | . | pb ((1 << pmd->run.log2_ring_size)*(c2s + s2c))-1 |
+-----------+-----------+-----------------+---+---------------------------------------------------+

C2S OR S2C Rings:

+--------+--------+-----------------------+
| ring 0 | ring 1 | ring num_c2s_rings - 1|
+--------+--------+-----------------------+

ring 0:

+-------------+---------------------------------------+
| ring header | (1 << pmd->run.log2_ring_size) * desc |
+-------------+---------------------------------------+

Descriptors are assigned packet buffers in the order of ring creation. With one
ring in each direction and a ring size of 1024, the first 1024 buffers belong
to the C2S ring and the last 1024 belong to the S2C ring. With zero-copy,
buffers are dequeued and enqueued as needed.

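The layout above translates into simple offset arithmetic. The sketch below
assumes that zero-copy is disabled and that all rings and buffers live in one
region. ``region_layout`` and the helper names are hypothetical, and the ring
header size is left as an input because this document does not state it.

.. code-block:: c

    #include <stddef.h>

    /* Hypothetical description of a region holding rings and buffers. */
    struct region_layout {
        unsigned int log2_ring_size; /* rsize */
        unsigned int num_c2s_rings;
        unsigned int num_s2c_rings;
        size_t ring_hdr_size;        /* size of one ring header (input) */
        size_t desc_size;            /* size of one descriptor */
        size_t buf_size;             /* bsize */
    };

    enum ring_dir { RING_C2S, RING_S2C };

    /* Bytes used by one ring: header plus 2^log2_ring_size descriptors. */
    static size_t
    ring_bytes(const struct region_layout *l)
    {
        return l->ring_hdr_size +
               ((size_t)1 << l->log2_ring_size) * l->desc_size;
    }

    /* Position of a ring among all rings: C2S rings first, then S2C. */
    static size_t
    ring_position(const struct region_layout *l, enum ring_dir dir,
                  unsigned int idx)
    {
        return (dir == RING_S2C ? l->num_c2s_rings : 0) + (size_t)idx;
    }

    /* Byte offset of a ring from the start of the region. */
    static size_t
    ring_offset(const struct region_layout *l, enum ring_dir dir,
                unsigned int idx)
    {
        return ring_position(l, dir, idx) * ring_bytes(l);
    }

    /* Index of the packet buffer initially assigned to descriptor 'slot'. */
    static size_t
    buffer_index(const struct region_layout *l, enum ring_dir dir,
                 unsigned int idx, unsigned int slot)
    {
        return (ring_position(l, dir, idx) << l->log2_ring_size) + slot;
    }

    /* Byte offset of that buffer: buffers start after all rings. */
    static size_t
    buffer_offset(const struct region_layout *l, enum ring_dir dir,
                  unsigned int idx, unsigned int slot)
    {
        size_t num_rings = (size_t)l->num_c2s_rings + l->num_s2c_rings;

        return num_rings * ring_bytes(l) +
               buffer_index(l, dir, idx, slot) * l->buf_size;
    }

With one ring per direction and a ``log2_ring_size`` of 10, ``buffer_index()``
returns 1024 for the first descriptor of the S2C ring, which matches the split
described above.
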
**Descriptor format**

+----+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Quad|6| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |3|3| | | | | | | | | | | | | | |1|1| | | | | | | | | | | | | | | |
|    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Word|3| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |2|1| | | | | | | | | | | | | | |6|5| | | | | | | | | | | | | | |0|
+----+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|0   |length                                                         |region                         |flags                          |
+----+---------------------------------------------------------------+-------------------------------+-------------------------------+
|1   |metadata                                                       |offset                                                         |
+----+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    |6| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |3|3| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
|    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    |3| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |2|1| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |0|
+----+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

**Flags field - flags (Quad Word 0, bits 0:15)**

+-----+--------------------+------------------------------------------------------------------------------------------------+
|Bits |Name                |Functionality                                                                                   |
+=====+====================+================================================================================================+
|0    |MEMIF_DESC_FLAG_NEXT|Is chained buffer. When set, the packet is divided into multiple buffers. May not be contiguous.|
+-----+--------------------+------------------------------------------------------------------------------------------------+

**Region index - region (Quad Word 0, 16:31)**

Index of the memory region in which the buffer is located.

**Data length - length (Quad Word 0, 32:63)**

Length of transmitted/received data.

**Data Offset - offset (Quad Word 1, 0:31)**

Data start offset from the memory region address: *.regions[desc->region].addr + desc->offset*

**Metadata - metadata (Quad Word 1, 32:63)**

Buffer metadata.

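For illustration, the descriptor fields can be mirrored by a C structure. This
is a sketch, assuming a little-endian host so that the lowest bits of each quad
word come first in memory. ``sketch_desc``, ``sketch_region`` and the helpers
are hypothetical names; the driver's actual definitions are in memif.h. The
chain walk also assumes that a chained packet continues in the next ring slot.

.. code-block:: c

    #include <stddef.h>
    #include <stdint.h>

    #define SKETCH_DESC_FLAG_NEXT 1 /* bit 0 of flags: another buffer follows */

    /* Illustrative mirror of the 16-byte descriptor (hypothetical type). */
    struct sketch_desc {
        uint16_t flags;    /* quad word 0, bits 0:15 */
        uint16_t region;   /* quad word 0, bits 16:31 */
        uint32_t length;   /* quad word 0, bits 32:63 */
        uint32_t offset;   /* quad word 1, bits 0:31 */
        uint32_t metadata; /* quad word 1, bits 32:63 */
    };

    struct sketch_region {
        void *addr; /* where the region is mapped in this process */
    };

    /* Start of the packet data: regions[desc->region].addr + desc->offset. */
    static void *
    desc_data(const struct sketch_region *regions, const struct sketch_desc *d)
    {
        return (uint8_t *)regions[d->region].addr + d->offset;
    }

    /*
     * Sum the lengths of a possibly chained packet starting at slot 'head'.
     * 'mask' is the ring size minus one; slots wrap around the ring.
     */
    static uint32_t
    chain_length(const struct sketch_desc *ring, uint16_t mask, uint16_t head)
    {
        uint32_t total = 0;
        uint16_t slot = head;
        uint32_t n;

        /* Bound the walk by the ring size so a corrupt ring cannot loop. */
        for (n = 0; n <= mask; n++) {
            const struct sketch_desc *d = &ring[slot & mask];

            total += d->length;
            if (!(d->flags & SKETCH_DESC_FLAG_NEXT))
                break;
            slot++;
        }
        return total;
    }

``desc_data()`` implements the address expression given for the offset field,
and ``chain_length()`` adds up the ``length`` fields of a packet that spans
several buffers.
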
Files

- net/memif/memif.h *- descriptor and ring definitions*
- net/memif/rte_eth_memif.c *- eth_memif_rx() eth_memif_tx()*

Zero-copy client
~~~~~~~~~~~~~~~~

179
180Zero-copy client can be enabled with memif configuration option 'zero-copy=yes'. This option
181is only relevant to client and requires eal argument '--single-file-segments'.
182This limitation is in place, because it is too expensive to identify memseg
183for each packet buffer, resulting in worse performance than with zero-copy disabled.
184With single file segments we can calculate offset from the beginning of the file
185for each packet buffer.
186
**Shared memory format**

Region 0 is created by the memif driver and contains the rings. The client
interface exposes DPDK memory (memseg). Instead of using memfd_create() to
create a new shared file, existing memsegs are used. The server interface
functions the same as with zero-copy disabled.

region 0:

+-----------------------+
| Rings                 |
+-----------+-----------+
| C2S rings | S2C rings |
+-----------+-----------+

region n:

+-----------------+
| Buffers         |
+-----------------+
|memseg           |
+-----------------+

Buffers are dequeued and enqueued as needed. The offset descriptor field is
calculated at tx. Only single file segments mode (EAL option
--single-file-segments) is supported, as calculating the offset from multiple
segments is too expensive.

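The offset calculation can be sketched as a pointer subtraction. The example
assumes a single file segment mapped contiguously at a known base address;
``zc_region`` and ``zc_desc_offset()`` are hypothetical names, not driver API.

.. code-block:: c

    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical view of one exposed memory region (one file segment). */
    struct zc_region {
        const uint8_t *base; /* address where the file is mapped */
        size_t len;          /* length of the mapping in bytes */
    };

    /*
     * Compute the descriptor offset for a packet buffer inside the region.
     * Returns 0 on success, -1 if the buffer is not fully inside the region
     * or the offset does not fit in the 32-bit descriptor field.
     */
    static int
    zc_desc_offset(const struct zc_region *r, const void *buf, size_t buf_len,
                   uint32_t *offset)
    {
        uintptr_t base = (uintptr_t)r->base;
        uintptr_t p = (uintptr_t)buf;

        if (p < base || buf_len > r->len || p - base > r->len - buf_len)
            return -1;
        if (p - base > UINT32_MAX)
            return -1;

        *offset = (uint32_t)(p - base);
        return 0;
    }

With a single mapping, the offset costs one subtraction per packet; with
multiple segments, the segment containing the buffer would have to be found
first.
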
Example: testpmd
----------------
In this example we run two instances of the testpmd application and transmit packets over memif.

First create the ``server`` interface::

    # ./<build_dir>/app/dpdk-testpmd -l 0-1 --proc-type=primary --file-prefix=pmd1 --vdev=net_memif,role=server -- -i

Now create the ``client`` interface (the server must already be running so that the client can connect)::

    # ./<build_dir>/app/dpdk-testpmd -l 2-3 --proc-type=primary --file-prefix=pmd2 --vdev=net_memif -- -i

You can also enable ``zero-copy`` on the ``client`` interface::

    # ./<build_dir>/app/dpdk-testpmd -l 2-3 --proc-type=primary --file-prefix=pmd2 --vdev=net_memif,zero-copy=yes --single-file-segments -- -i

Start forwarding packets::

    Client:
        testpmd> start

    Server:
        testpmd> start tx_first

Show status::

    testpmd> show port stats 0

For more details on testpmd please refer to :doc:`../testpmd_app_ug/index`.

Example: testpmd and VPP
------------------------
For information on how to get and run VPP please see `<https://wiki.fd.io/view/VPP>`_.

Start VPP in interactive mode (this should be the default). Create a memif server interface in VPP::

    vpp# create interface memif id 0 server no-zero-copy
    vpp# set interface state memif0/0 up
    vpp# set interface ip address memif0/0 192.168.1.1/24

To see the socket filename, use the ``show memif`` command::

    vpp# show memif
    sockets
     id  listener    filename
      0   yes (1)     /run/vpp/memif.sock
    ...

Now create a memif interface by running testpmd with these command line options::

    # ./dpdk-testpmd --vdev=net_memif,socket=/run/vpp/memif.sock -- -i

Testpmd should now create a memif client interface and try to connect to the server.
In testpmd, set the forward option to icmpecho and start forwarding::

    testpmd> set fwd icmpecho
    testpmd> start

Send ping from VPP::

    vpp# ping 192.168.1.2
    64 bytes from 192.168.1.2: icmp_seq=2 ttl=254 time=36.2918 ms
    64 bytes from 192.168.1.2: icmp_seq=3 ttl=254 time=23.3927 ms
    64 bytes from 192.168.1.2: icmp_seq=4 ttl=254 time=24.2975 ms
    64 bytes from 192.168.1.2: icmp_seq=5 ttl=254 time=17.7049 ms

Example: testpmd memif loopback
-------------------------------
In this example we create two memif ports connected in loopback.
The situation is analogous to cross-connecting two ports of a NIC with a cable.

To set up the loopback, use the same socket and id with different roles::

    # ./dpdk-testpmd --vdev=net_memif0,role=server,id=0 --vdev=net_memif1,role=client,id=0 -- -i

Then start the communication::

    testpmd> start tx_first

Finally we can check port stats to see the traffic::

    testpmd> show port stats all
