..  SPDX-License-Identifier: BSD-3-Clause
    Copyright(c) 2018-2019 Cisco Systems, Inc.

======================
Memif Poll Mode Driver
======================

The shared memory packet interface (memif) PMD allows DPDK and any other client
using memif (DPDK, VPP, libmemif) to communicate over shared memory. Memif is
Linux only.

The created device transmits packets in a raw format. It can be used with
Ethernet mode, IP mode, or Punt/Inject. At this moment, only Ethernet mode is
supported in the DPDK memif implementation.

Memif works in two roles: server and client. The client connects to the server
over an existing socket. The client is also the producer of the shared memory
file and initializes the shared memory. Each interface can be connected to only
one peer interface at a time. The peer interface is identified by the ``id``
parameter. The server creates the socket and listens for client connection
requests. The socket may already exist on the system. Be sure to remove any
such socket before creating a server interface, or you will see an
"Address already in use" error. The function ``rte_pmd_memif_remove()``, which
removes a memif interface, also removes the listener socket if it is not being
used by any other interface.
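
The advice about removing an existing socket can be sketched in C as follows.
This is only an illustration using POSIX calls: ``sketch_remove_stale_socket()``
is a hypothetical helper, not part of the memif PMD. It only applies to
path-based sockets, since abstract socket addresses have no file system entry.

.. code-block:: c

    #include <errno.h>
    #include <sys/stat.h>
    #include <unistd.h>

    /*
     * Hypothetical helper, not a DPDK API: remove a stale path-based socket
     * file before a server interface binds to it.
     * Returns 0 if the path is free afterwards, -1 otherwise.
     */
    int
    sketch_remove_stale_socket(const char *path)
    {
        struct stat st;

        if (lstat(path, &st) != 0)
            return errno == ENOENT ? 0 : -1; /* nothing to remove */
        if (!S_ISSOCK(st.st_mode))
            return -1; /* refuse to delete anything that is not a socket */
        return unlink(path);
    }

The helper refuses to unlink regular files or directories, so a mistyped path
cannot delete unrelated data.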

To enable one or more interfaces, use the ``--vdev=net_memif0`` option on the
DPDK application command line. Each additional ``--vdev`` option, such as
``--vdev=net_memif1``, creates another interface; the interfaces are named
net_memif0, net_memif1, and so on. Memif uses a unix domain socket to transmit
control messages. Each memif has a unique id per socket. This id is used to
identify the peer interface. If you are connecting multiple interfaces using
the same socket, be sure to specify unique ids ``id=0``, ``id=1``, etc. Note
that if you assign a socket to a server interface, it becomes a listener
socket. A listener socket cannot be used by a client interface in the same
process.
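
As a rough illustration of how such arguments are put together, the sketch
below formats one ``--vdev`` string per interface. The option names (``role``,
``id``, ``socket``) are the ones from the configuration options table;
``sketch_memif_vdev_arg()`` itself is a hypothetical helper and not a DPDK API.

.. code-block:: c

    #include <inttypes.h>
    #include <stddef.h>
    #include <stdint.h>
    #include <stdio.h>

    /*
     * Hypothetical helper: format one --vdev argument for a memif interface.
     * Returns 0 on success, -1 if buf is too small.
     */
    int
    sketch_memif_vdev_arg(char *buf, size_t len, unsigned int index,
                          const char *role, uint32_t id, const char *socket)
    {
        int n = snprintf(buf, len,
                         "--vdev=net_memif%u,role=%s,id=%" PRIu32 ",socket=%s",
                         index, role, id, socket);

        return (n < 0 || (size_t)n >= len) ? -1 : 0;
    }

Calling it with index 0 and 1 and ids 0 and 1 on the same socket path yields
two arguments that satisfy the unique-id rule described above.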

.. csv-table:: **Memif configuration options**
   :header: "Option", "Description", "Default", "Valid value"

   "id=0", "Used to identify peer interface", "0", "uint32_t"
   "role=server", "Set memif role", "client", "server|client"
   "bsize=1024", "Size of single packet buffer", "2048", "uint16_t"
   "rsize=11", "Log2 of ring size. If rsize is 10, actual ring size is 1024", "10", "1-14"
   "socket=/tmp/memif.sock", "Socket filename", "/tmp/memif.sock", "string len 108"
   "socket-abstract=no", "Set usage of abstract socket address", "yes", "yes|no"
   "mac=01:23:45:ab:cd:ef", "Mac address", "01:ab:23:cd:45:ef", ""
   "secret=abc123", "Secret is an optional security option, which if specified, must be matched by peer", "", "string len 24"
   "zero-copy=yes", "Enable/disable zero-copy client mode. Only relevant to client, requires '--single-file-segments' eal argument", "no", "yes|no"
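
Because ``rsize`` is a base-2 logarithm, the number of descriptors in a ring is
``1 << rsize``. A minimal sketch of that conversion is shown below. The 1-14
range is taken from the table; ``sketch_ring_size()`` is a hypothetical helper,
not a function of the PMD.

.. code-block:: c

    #include <stdint.h>

    /*
     * Hypothetical helper: convert the rsize option (log2 of the ring size)
     * into a number of descriptors. Returns 0 for an out-of-range rsize.
     */
    uint32_t
    sketch_ring_size(unsigned int rsize)
    {
        if (rsize < 1 || rsize > 14)
            return 0;
        return UINT32_C(1) << rsize;
    }

With the default ``rsize=10`` this gives 1024 descriptors per ring, and
``rsize=11`` gives 2048.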

**Connection establishment**

To create a memif connection, two memif interfaces, each in a separate
process, are needed: one interface in the ``server`` role and the other in the
``client`` role. It is not possible to connect two interfaces in a single
process. Each interface can be connected to only one interface at a time,
identified by a matching ``id`` parameter.

The memif driver uses a unix domain socket to exchange the required information
between memif interfaces. The socket file path is specified at interface
creation; see the *Memif configuration options* table above. If a socket is
used by a ``server`` interface, it is marked as a listener socket (in the scope
of the current process) and listens for connection requests from other
processes. One socket can be used by multiple interfaces. One process can have
``client`` and ``server`` interfaces at the same time, provided each role is
assigned a unique socket.

For detailed information on memif control messages, see: net/memif/memif.h.

The client interface attempts to make a connection on its assigned socket. The
process listening on this socket extracts the connection request and creates a
new connected socket (control channel). It then sends the 'hello' message
(``MEMIF_MSG_TYPE_HELLO``), containing configuration boundaries. The client
interface adjusts its configuration accordingly and sends the 'init' message
(``MEMIF_MSG_TYPE_INIT``). This message contains, among other things, the
interface id. The driver uses this id to find the server interface and assigns
the control channel to it. If such an interface is found, the 'ack' message
(``MEMIF_MSG_TYPE_ACK``) is sent. The client interface sends an 'add region'
message (``MEMIF_MSG_TYPE_ADD_REGION``) for every region allocated. The server
responds to each of these messages with an 'ack' message. The same applies to
rings: the client sends an 'add ring' message (``MEMIF_MSG_TYPE_ADD_RING``) for
every initialized ring, and the server again responds to each with an 'ack'
message. To finalize the connection, the client interface sends the 'connect'
message (``MEMIF_MSG_TYPE_CONNECT``). Upon receiving this message, the server
maps the regions into its address space, initializes the rings and responds
with the 'connected' message (``MEMIF_MSG_TYPE_CONNECTED``). A disconnect
message (``MEMIF_MSG_TYPE_DISCONNECT``) can be sent by both server and client
interfaces at any time, due to a driver error or if the interface is being
deleted.
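
The message order of a successful handshake can be written down as a small
sketch. It only enumerates who sends what and in which order, as described
above; it does not talk to a socket. The ``sketch_*`` names and enum values are
local stand-ins and are not copied from net/memif/memif.h.

.. code-block:: c

    #include <stddef.h>

    /* Local stand-ins for the MEMIF_MSG_TYPE_* values. */
    enum sketch_msg {
        SKETCH_HELLO, SKETCH_INIT, SKETCH_ACK, SKETCH_ADD_REGION,
        SKETCH_ADD_RING, SKETCH_CONNECT, SKETCH_CONNECTED
    };

    struct sketch_step {
        char sender;            /* 'S' = server, 'C' = client */
        enum sketch_msg msg;
    };

    /* Append one step; *n always advances so the caller learns the full length. */
    static void
    sketch_push(struct sketch_step *out, size_t cap, size_t *n,
                char sender, enum sketch_msg msg)
    {
        if (*n < cap) {
            out[*n].sender = sender;
            out[*n].msg = msg;
        }
        (*n)++;
    }

    /*
     * Write the message order of a successful handshake into out and return
     * the number of steps. If the return value exceeds cap, out was truncated.
     */
    size_t
    sketch_handshake(struct sketch_step *out, size_t cap,
                     unsigned int regions, unsigned int rings)
    {
        size_t n = 0;
        unsigned int i;

        sketch_push(out, cap, &n, 'S', SKETCH_HELLO);
        sketch_push(out, cap, &n, 'C', SKETCH_INIT);
        sketch_push(out, cap, &n, 'S', SKETCH_ACK);
        for (i = 0; i < regions; i++) {
            sketch_push(out, cap, &n, 'C', SKETCH_ADD_REGION);
            sketch_push(out, cap, &n, 'S', SKETCH_ACK);
        }
        for (i = 0; i < rings; i++) {
            sketch_push(out, cap, &n, 'C', SKETCH_ADD_RING);
            sketch_push(out, cap, &n, 'S', SKETCH_ACK);
        }
        sketch_push(out, cap, &n, 'C', SKETCH_CONNECT);
        sketch_push(out, cap, &n, 'S', SKETCH_CONNECTED);
        return n;
    }

For one region and two rings the sequence has 11 steps, starting with the
server's 'hello' and ending with the server's 'connected'.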

Files

- net/memif/memif.h *- control messages definitions*
- net/memif/memif_socket.h
- net/memif/memif_socket.c

Shared memory
~~~~~~~~~~~~~

**Shared memory format**

The client is the producer and the server is the consumer. Memory regions are
mapped shared memory files, created by the memif client and provided to the
server at connection establishment. Regions contain rings and buffers. Rings
and buffers can also be separated into multiple regions. With zero-copy
disabled, rings and buffers are stored inside a single memory region to reduce
the number of open files.

region n (no-zero-copy):

+-----------------------+-------------------------------------------------------------------------+
| Rings                 | Buffers                                                                 |
+-----------+-----------+-----------------+---+---------------------------------------------------+
| S2M rings | M2S rings | packet buffer 0 | . | pb ((1 << pmd->run.log2_ring_size)*(s2m + m2s))-1 |
+-----------+-----------+-----------------+---+---------------------------------------------------+

S2M OR M2S Rings:

+--------+--------+-----------------------+
| ring 0 | ring 1 | ring num_s2m_rings - 1|
+--------+--------+-----------------------+

ring 0:

+-------------+---------------------------------------+
| ring header | (1 << pmd->run.log2_ring_size) * desc |
+-------------+---------------------------------------+

Descriptors are assigned packet buffers in the order of ring creation. If there
is one ring in each direction and the ring size is 1024, then the first 1024
buffers belong to the S2M ring and the last 1024 belong to the M2S ring. In
the case of zero-copy, buffers are dequeued and enqueued as needed.
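
The initial buffer assignment with zero-copy disabled can be expressed as
simple index arithmetic. The sketch below assumes rings are numbered in
creation order, S2M rings first and then M2S rings, as in the example above.
The function names are hypothetical and do not appear in the driver.

.. code-block:: c

    #include <stdint.h>

    /*
     * Hypothetical helper: index of the packet buffer initially assigned to
     * descriptor `desc` of ring `ring`. The descriptor index is wrapped to
     * the ring size.
     */
    uint32_t
    sketch_buffer_index(unsigned int log2_ring_size, uint32_t ring, uint32_t desc)
    {
        uint32_t ring_size = UINT32_C(1) << log2_ring_size;

        return ring * ring_size + (desc & (ring_size - 1));
    }

    /* Total number of packet buffers: (1 << log2_ring_size) * (s2m + m2s). */
    uint32_t
    sketch_buffer_count(unsigned int log2_ring_size, uint32_t s2m, uint32_t m2s)
    {
        return (UINT32_C(1) << log2_ring_size) * (s2m + m2s);
    }

With ``log2_ring_size`` 10 and one ring per direction, ring 0 (S2M) owns
buffers 0 to 1023 and ring 1 (M2S) owns buffers 1024 to 2047.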

**Descriptor format**

+----+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Quad|6| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |3|3| | | | | | | | | | | | | | |1|1| | | | | | | | | | | | | | | |
|    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Word|3| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |2|1| | | | | | | | | | | | | | |6|5| | | | | | | | | | | | | | |0|
+----+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|0   |length                                                         |region                         |flags                          |
+----+---------------------------------------------------------------+-------------------------------+-------------------------------+
|1   |metadata                                                       |offset                                                         |
+----+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    |6| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |3|3| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
|    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    |3| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |2|1| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |0|
+----+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

**Flags field - flags (Quad Word 0, bits 0:15)**

+-----+--------------------+------------------------------------------------------------------------------------------------+
|Bits |Name                |Functionality                                                                                   |
+=====+====================+================================================================================================+
|0    |MEMIF_DESC_FLAG_NEXT|Is chained buffer. When set, the packet is divided into multiple buffers. May not be contiguous.|
+-----+--------------------+------------------------------------------------------------------------------------------------+
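
To illustrate how a chained packet might be read, the sketch below sums the
data length of one packet by following descriptors while the NEXT flag (bit 0)
is set. It assumes that the descriptors of a chain occupy consecutive ring
slots, which the table above does not state. The struct and function names are
hypothetical and only carry the fields needed here.

.. code-block:: c

    #include <stdint.h>

    #define SKETCH_DESC_FLAG_NEXT (1u << 0) /* bit 0 of the flags field */

    /* Minimal stand-in for a descriptor; only the fields used here. */
    struct sketch_chain_desc {
        uint16_t flags;
        uint32_t length;
    };

    /*
     * Sum the data length of one packet that starts at slot `first` of a ring
     * with `ring_size` descriptors (a power of two). Stores the number of
     * descriptors used in *n_desc.
     */
    uint32_t
    sketch_chain_length(const struct sketch_chain_desc *ring, uint32_t ring_size,
                        uint32_t first, uint32_t *n_desc)
    {
        uint32_t mask = ring_size - 1;
        uint32_t total = 0;
        uint32_t used = 0;
        uint32_t slot = first;

        do {
            const struct sketch_chain_desc *d = &ring[slot & mask];

            total += d->length;
            used++;
            slot++;
            if (!(d->flags & SKETCH_DESC_FLAG_NEXT))
                break;
        } while (used < ring_size); /* guard against a malformed chain */

        *n_desc = used;
        return total;
    }

The loop bound keeps the walk finite even if every descriptor in the ring has
the NEXT flag set.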

**Region index - region (Quad Word 0, 16:31)**

Index of the memory region in which the buffer is located.

**Data length - length (Quad Word 0, 32:63)**

Length of transmitted/received data.

**Data Offset - offset (Quad Word 1, 0:31)**

Data start offset from the memory region address: *.regions[desc->region].addr + desc->offset*

**Metadata - metadata (Quad Word 1, 32:63)**

Buffer metadata.
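
The field layout and the address computation above can be summarized in a short
sketch. The struct mirrors the bit ranges listed in this section; its name, the
region struct and the helper functions are local to this example and are not
the definitions from net/memif/memif.h.

.. code-block:: c

    #include <stdint.h>

    /* Field order follows the bit ranges above; names are local to this sketch. */
    struct sketch_desc {
        uint16_t flags;     /* quad word 0, bits 0:15 */
        uint16_t region;    /* quad word 0, bits 16:31 */
        uint32_t length;    /* quad word 0, bits 32:63 */
        uint32_t offset;    /* quad word 1, bits 0:31 */
        uint32_t metadata;  /* quad word 1, bits 32:63 */
    };

    struct sketch_region {
        uint8_t *addr;      /* start of the mapped region */
    };

    /* Pack the fields into the two quad words shown in the descriptor table. */
    void
    sketch_desc_pack(const struct sketch_desc *d, uint64_t qw[2])
    {
        qw[0] = (uint64_t)d->flags | ((uint64_t)d->region << 16) |
                ((uint64_t)d->length << 32);
        qw[1] = (uint64_t)d->offset | ((uint64_t)d->metadata << 32);
    }

    /* Data address: regions[desc->region].addr + desc->offset */
    uint8_t *
    sketch_desc_data(const struct sketch_region *regions,
                     const struct sketch_desc *d)
    {
        return regions[d->region].addr + d->offset;
    }

Packing explicitly with shifts keeps the example independent of host byte
order and struct padding.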

Files

- net/memif/memif.h *- descriptor and ring definitions*
- net/memif/rte_eth_memif.c *- eth_memif_rx() eth_memif_tx()*

Zero-copy client
~~~~~~~~~~~~~~~~

The zero-copy client can be enabled with the memif configuration option
``zero-copy=yes``. This option is only relevant to the client and requires the
EAL argument ``--single-file-segments``. This limitation is in place because it
is too expensive to identify the memseg for each packet buffer, which would
result in worse performance than with zero-copy disabled. With single file
segments, the offset from the beginning of the file can be calculated for each
packet buffer.
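
The offset calculation mentioned above amounts to a pointer subtraction once
the buffer is known to live in a single mapped file. A minimal sketch, assuming
the caller already knows the base address and length of that mapping
(``seg_base`` and ``seg_len`` are parameters of this example, not PMD fields):

.. code-block:: c

    #include <stddef.h>
    #include <stdint.h>

    /*
     * Hypothetical helper: offset of a packet buffer from the start of the
     * single file segment that contains it.
     * Returns 0 on success, -1 if the buffer lies outside the segment.
     */
    int
    sketch_buffer_offset(const void *seg_base, size_t seg_len, const void *buf,
                         uint32_t *offset)
    {
        uintptr_t base = (uintptr_t)seg_base;
        uintptr_t addr = (uintptr_t)buf;

        if (addr < base || addr - base >= seg_len)
            return -1;
        if (addr - base > UINT32_MAX)
            return -1; /* the descriptor offset field is 32 bits wide */
        *offset = (uint32_t)(addr - base);
        return 0;
    }

Without single file segments, a lookup of the owning segment would be needed
for every buffer before this subtraction could be done.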

**Shared memory format**

Region 0 is created by the memif driver and contains the rings. The client
interface exposes DPDK memory (memseg). Instead of using memfd_create() to
create a new shared file, existing memsegs are used. The server interface
functions the same as with zero-copy disabled.

region 0:

+-----------------------+
| Rings                 |
+-----------+-----------+
| S2M rings | M2S rings |
+-----------+-----------+

region n:

+-----------------+
| Buffers         |
+-----------------+
|memseg           |
+-----------------+

Buffers are dequeued and enqueued as needed. The offset descriptor field is
calculated at tx. Only single file segments mode (EAL option
``--single-file-segments``) is supported, as calculating the offset from
multiple segments is too expensive.

Example: testpmd
----------------
In this example we run two instances of the testpmd application and transmit packets over memif.

First create the ``server`` interface::

    #./<build_dir>/app/dpdk-testpmd -l 0-1 --proc-type=primary --file-prefix=pmd1 --vdev=net_memif,role=server -- -i

Now create the ``client`` interface (the server must already be running so that the client can connect)::

    #./<build_dir>/app/dpdk-testpmd -l 2-3 --proc-type=primary --file-prefix=pmd2 --vdev=net_memif -- -i

You can also enable ``zero-copy`` on the ``client`` interface::

    #./<build_dir>/app/dpdk-testpmd -l 2-3 --proc-type=primary --file-prefix=pmd2 --vdev=net_memif,zero-copy=yes --single-file-segments -- -i

Start forwarding packets::

    Client:
        testpmd> start

    Server:
        testpmd> start tx_first

Show status::

    testpmd> show port stats 0

For more details on testpmd please refer to :doc:`../testpmd_app_ug/index`.

Example: testpmd and VPP
------------------------
For information on how to get and run VPP please see `<https://wiki.fd.io/view/VPP>`_.

Start VPP in interactive mode (this should be the default). Create a memif server interface in VPP::

    vpp# create interface memif id 0 server no-zero-copy
    vpp# set interface state memif0/0 up
    vpp# set interface ip address memif0/0 192.168.1.1/24

To see the socket filename, use the ``show memif`` command::

    vpp# show memif
    sockets
     id  listener    filename
      0   yes (1)     /run/vpp/memif.sock
    ...

Now create a memif interface by running testpmd with these command line options::

    #./dpdk-testpmd --vdev=net_memif,socket=/run/vpp/memif.sock -- -i

Testpmd should now create a memif client interface and try to connect to the server.
In testpmd, set the forward option to icmpecho and start forwarding::

    testpmd> set fwd icmpecho
    testpmd> start

Send ping from VPP::

    vpp# ping 192.168.1.2
    64 bytes from 192.168.1.2: icmp_seq=2 ttl=254 time=36.2918 ms
    64 bytes from 192.168.1.2: icmp_seq=3 ttl=254 time=23.3927 ms
    64 bytes from 192.168.1.2: icmp_seq=4 ttl=254 time=24.2975 ms
    64 bytes from 192.168.1.2: icmp_seq=5 ttl=254 time=17.7049 ms

Example: testpmd memif loopback
-------------------------------
In this example we create two memif ports connected in a loopback.
The situation is analogous to cross-connecting two ports of a NIC with a cable.

To set up the loopback, just use the same socket and id with different roles::

    #./dpdk-testpmd --vdev=net_memif0,role=server,id=0 --vdev=net_memif1,role=client,id=0 -- -i

Then start the communication::

    testpmd> start tx_first

Finally, check the port stats to see the traffic::

    testpmd> show port stats all