..  SPDX-License-Identifier: BSD-3-Clause
    Copyright(c) 2016 Intel Corporation.

TAP Poll Mode Driver
====================

The TAP Poll Mode Driver (PMD) is a virtual device for injecting packets
to be processed by the Linux kernel.
This PMD is useful when writing a DPDK application
for offloading network functionality (such as tunneling) from the kernel.

From the kernel point of view, the TAP device looks like a regular network interface.
The network device can be managed by standard tools such as the ``ip`` and ``ethtool`` commands.
It is also possible to use existing packet tools such as ``wireshark`` or ``tcpdump``.

From the DPDK application, the TAP device looks like a DPDK ethdev.
Packets are sent and received in L2 (Ethernet) format.
The standard DPDK APIs to query for information, statistics and send/receive packets
work as expected.


Requirements
------------

The TAP PMD requires kernel support for multiple queues in the TAP device
as well as the multi-queue ``multiq`` and incoming ``ingress`` queue disciplines.
These are standard kernel features in most Linux distributions.


Arguments
---------

TAP devices are created with the command line ``--vdev=net_tap0`` option.
This option may be specified more than once by repeating it with a different ``net_tapX`` device.

By default, the Linux interfaces are named ``dtap0``, ``dtap1``, etc.
The interface name can be specified by adding the ``iface`` argument, for example::

   --vdev=net_tap0,iface=foo0 --vdev=net_tap1,iface=foo1, ...

Normally the PMD will generate a random MAC address.
If a static address is desired instead, the ``mac=fixed`` argument can be used::

   --vdev=net_tap0,mac=fixed

With the fixed option, the first five octets of the MAC address are
02:'d':'t':'a':'p' and the last octet ([00-FF]) is the interface number.

To specify a particular MAC address, use the conventional colon-separated representation.
The string is converted byte by byte to hex, and the result is the MAC address,
for example ``02:64:74:61:70:11``.

It is possible to specify a remote netdevice to capture packets from by adding
``remote=foo1``, for example::

   --vdev=net_tap,iface=tap0,remote=foo1

If a ``remote`` is set, the tap MAC address will be set to match the remote one
just after netdevice creation.
Using TC rules, traffic from the remote netdevice
will be redirected to the tap. If the tap is in promiscuous mode, then all
packets will be redirected. In allmulti mode, all multicast packets will be
redirected.

Using the remote feature is especially useful for capturing traffic from a
netdevice that has no support in the DPDK. It is possible to add explicit
rte_flow rules on the tap PMD to capture specific traffic (see the
"Flow API support" section for examples).

Normally, when the DPDK application exits,
the TAP device is marked down and is removed.
This behavior can be overridden with the ``persist`` flag, for example::

   --vdev=net_tap0,iface=tap0,persist ...


TUN devices
-----------

The TAP device can be used as an L3 tunnel only device (TUN).
This type of device does not include the Ethernet (L2) header;
all packets are sent and received as IP packets.

TUN devices are created with the command line arguments ``--vdev=net_tunX``,
where X is a unique ID, for example::

   --vdev=net_tun0 --vdev=net_tun1,iface=foo1, ...

Unlike the TAP PMD, the TUN PMD does not support the ``mac`` or ``remote``
arguments. The default interface name is ``dtunX``, where X is a unique ID.

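
Because a TUN device carries bare IP packets, there is no EtherType to tell
IPv4 from IPv6; an application has to inspect the IP version field instead.
The following is a minimal sketch of such a check in plain C.
The helper name ``tun_ip_version`` is hypothetical and is not part of the PMD
or of the DPDK API:

.. code-block:: c

   #include <stddef.h>
   #include <stdint.h>

   /*
    * Hypothetical helper (not a DPDK API): return the IP version (4 or 6)
    * of a packet received from a TUN device, or 0 if it cannot be determined.
    * TUN packets start directly with the IP header, so the version is
    * held in the upper four bits of the first byte.
    */
   static int
   tun_ip_version(const uint8_t *pkt, size_t len)
   {
       if (pkt == NULL || len == 0)
           return 0;

       switch (pkt[0] >> 4) {
       case 4:
           return 4;
       case 6:
           return 6;
       default:
           return 0;
       }
   }

In a DPDK application the ``pkt`` pointer would typically be the start of the
mbuf data; that mapping is left out here to keep the sketch self-contained.
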

Flow API support
----------------

The TAP PMD supports major flow API pattern items and actions.

Requirements
~~~~~~~~~~~~

Flow support in the TAP driver requires Linux kernel support for the
flow-based traffic control filter ``flower``.
This was added in the Linux 4.3 kernel.

The implementation of the RSS action uses an eBPF module
that requires additional libraries and tools.
Building the RSS support requires the ``clang`` compiler
to compile the C code to the BPF target;
``bpftool`` to convert the compiled BPF object to a header file;
and ``libbpf`` to load the eBPF action into the kernel.

Supported match items:

- eth: src and dst (with variable masks), and eth_type (0xffff mask).
- vlan: vid, pcp, but not eid. (requires kernel 4.9)
- ipv4/6: src and dst (with variable masks), and ip_proto (0xffff mask).
- udp/tcp: src and dst port (0xffff mask).

Supported actions:

- DROP
- QUEUE
- PASSTHRU
- RSS

It is generally not possible to provide a "last" item.
However, if the "last"
item, once masked, is identical to the masked spec, then it is supported.

Only IPv4/6 and MAC addresses can use a variable mask. All other items need a
full mask (exact match).

As rules are translated to TC, it is possible to show them with something like::

   tc -s filter show dev dtap1 parent 1:

Examples of testpmd flow rules
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Drop packets for destination IP 192.0.2.1::

   testpmd> flow create 0 priority 1 ingress pattern eth / ipv4 dst is 192.0.2.1 \
            / end actions drop / end

Ensure packets from a given MAC address are received on queue 2::

   testpmd> flow create 0 priority 2 ingress pattern eth src is 06:05:04:03:02:01 \
            / end actions queue index 2 / end

Drop UDP packets in vlan 3::

   testpmd> flow create 0 priority 3 ingress pattern eth / vlan vid is 3 / \
            ipv4 proto is 17 / end actions drop / end

Distribute IPv4 TCP packets destined to a given MAC address over queues 0-3 using RSS::

   testpmd> flow create 0 priority 4 ingress pattern eth dst is 0a:0b:0c:0d:0e:0f \
            / ipv4 / tcp / end actions rss queues 0 1 2 3 end / end


Multi-process sharing
---------------------

It is possible to attach an existing TAP device in a secondary process,
by declaring it as a vdev with the same name as in the primary process,
and without any parameter.

The port attached in a secondary process will give access to the
statistics and the queues.
Therefore it can be used for monitoring or Rx/Tx processing.

The IPC synchronization of Rx/Tx queues is currently limited:

- Maximum 8 queues shared
- Synchronized on probing, but not on later port update


RSS specifics
-------------

The default packet distribution in TAP without flow rules
is done by the kernel, which has a default flow-based distribution.
When flow rules are used to distribute packets across a set of queues,
an eBPF program is used to calculate the RSS hash based on the Toeplitz algorithm
with the given key.

The hash is calculated for IPv4 and IPv6,
over src/dst addresses (8 or 32 bytes for IPv4 or IPv6 respectively)
and optionally the src/dst TCP/UDP ports (4 bytes).

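
To make the hash input described above concrete, the following is a minimal
sketch of a textbook Toeplitz hash in plain C.
It is not the eBPF program used by the PMD; the function name
``toeplitz_hash`` and its signature are assumptions made for illustration.
The input is the tuple in network byte order: source address, destination
address, then optionally source port and destination port
(12 bytes for IPv4 with ports):

.. code-block:: c

   #include <assert.h>
   #include <stddef.h>
   #include <stdint.h>

   /*
    * Illustrative Toeplitz hash (hypothetical helper, not the PMD code).
    * For every set bit of the input, XOR in the 32-bit window of the key
    * that starts at the same bit offset.
    * The key must be at least data_len + 4 bytes long.
    */
   static uint32_t
   toeplitz_hash(const uint8_t *key, size_t key_len,
                 const uint8_t *data, size_t data_len)
   {
       uint32_t hash = 0;
       uint32_t window;
       size_t i;
       int bit;

       assert(key_len >= data_len + 4);

       /* The window starts as the first 32 bits of the key. */
       window = ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16) |
                ((uint32_t)key[2] << 8) | (uint32_t)key[3];

       for (i = 0; i < data_len; i++) {
           for (bit = 7; bit >= 0; bit--) {
               if (data[i] & (1u << bit))
                   hash ^= window;

               /* Slide the window by one bit, pulling in the next key bit. */
               window <<= 1;
               if (key[i + 4] & (1u << bit))
                   window |= 1;
           }
       }

       return hash;
   }

For an IPv4 flow hashed on addresses only, ``data_len`` would be 8;
with TCP or UDP ports it would be 12, and for IPv6 it would be 32 or 36.
The resulting value is typically reduced to a queue index,
for example by taking it modulo the number of queues in the RSS action.
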

Limitations
-----------

- Since the TAP device uses a file descriptor to talk to the kernel,
  the same number of queues must be specified for receive and transmit.

- The RSS algorithm only supports L3 or L4 functions.
  It does not support finer-grained selections
  (for example: only IPv6 packets with extension headers).