xref: /dpdk/doc/guides/prog_guide/lpm_lib.rst (revision a45b288ef21a5ab1da6bbe95a33404006c7cb051)
1..  BSD LICENSE
2    Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
3    All rights reserved.
4
5    Redistribution and use in source and binary forms, with or without
6    modification, are permitted provided that the following conditions
7    are met:
8
9    * Redistributions of source code must retain the above copyright
10    notice, this list of conditions and the following disclaimer.
11    * Redistributions in binary form must reproduce the above copyright
12    notice, this list of conditions and the following disclaimer in
13    the documentation and/or other materials provided with the
14    distribution.
15    * Neither the name of Intel Corporation nor the names of its
16    contributors may be used to endorse or promote products derived
17    from this software without specific prior written permission.
18
19    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
20    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
21    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
22    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
23    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
24    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
25    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
26    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
27    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
28    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
29    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
30
31.. _LPM_Library:
32
33LPM Library
34===========
35
36The Intel® DPDK LPM library component implements the Longest Prefix Match (LPM) table search method for 32-bit keys
37that is typically used to find the best route match in IP forwarding applications.
38
39LPM API Overview
40----------------
41
42The main configuration parameter for LPM component instances is the maximum number of rules to support.
43An LPM prefix is represented by a pair of parameters (32- bit key, depth), with depth in the range of 1 to 32.
44An LPM rule is represented by an LPM prefix and some user data associated with the prefix.
45The prefix serves as the unique identifier of the LPM rule.
46In this implementation, the user data is 1-byte long and is called next hop,
47in correlation with its main use of storing the ID of the next hop in a routing table entry.
48
49The main methods exported by the LPM component are:
50
51*   Add LPM rule: The LPM rule is provided as input.
52    If there is no rule with the same prefix present in the table, then the new rule is added to the LPM table.
53    If a rule with the same prefix is already present in the table, the next hop of the rule is updated.
54    An error is returned when there is no available rule space left.
55
56*   Delete LPM rule: The prefix of the LPM rule is provided as input.
57    If a rule with the specified prefix is present in the LPM table, then it is removed.
58
59*   Lookup LPM key: The 32-bit key is provided as input.
60    The algorithm selects the rule that represents the best match for the given key and returns the next hop of that rule.
61    In the case that there are multiple rules present in the LPM table that have the same 32-bit key,
62    the algorithm picks the rule with the highest depth as the best match rule,
63    which means that the rule has the highest number of most significant bits matching between the input key and the rule key.
64
65Implementation Details
66----------------------
67
68The current implementation uses a variation of the DIR-24-8 algorithm that trades memory usage for improved LPM lookup speed.
69The algorithm allows the lookup operation to be performed with typically a single memory read access.
70In the statistically rare case when the best match rule is having a depth bigger than 24,
71the lookup operation requires two memory read accesses.
72Therefore, the performance of the LPM lookup operation is greatly influenced by
73whether the specific memory location is present in the processor cache or not.
74
75The main data structure is built using the following elements:
76
77*   A table with 2^24 entries.
78
79*   A number of tables (RTE_LPM_TBL8_NUM_GROUPS) with 2^8 entries.
80
81The first table, called tbl24, is indexed using the first 24 bits of the IP address to be looked up,
82while the second table(s), called tbl8, is indexed using the last 8 bits of the IP address.
83This means that depending on the outcome of trying to match the IP address of an incoming packet to the rule stored in the tbl24
84we might need to continue the lookup process in the second level.
85
86Since every entry of the tbl24 can potentially point to a tbl8, ideally, we would have 2^24 tbl8s,
87which would be the same as having a single table with 2^32 entries.
88This is not feasible due to resource restrictions.
89Instead, this approach takes advantage of the fact that rules longer than 24 bits are very rare.
90By splitting the process in two different tables/levels and limiting the number of tbl8s,
91we can greatly reduce memory consumption while maintaining a very good lookup speed (one memory access, most of the times).
92
93.. image39 has been renamed
94
95|tbl24_tbl8|
96
97An entry in tbl24 contains the following fields:
98
99*   next hop / index to the tbl8
100
101*   valid flag
102
103*   external entry flag
104
105*   depth of the rule (length)
106
107The first field can either contain a number indicating the tbl8 in which the lookup process should continue
108or the next hop itself if the longest prefix match has already been found.
109The two flags are used to determine whether the entry is valid or not and
110whether the search process have finished or not respectively.
111The depth or length of the rule is the number of bits of the rule that is stored in a specific entry.
112
113An entry in a tbl8 contains the following fields:
114
115*   next hop
116
117*   valid
118
119*   valid group
120
121*   depth
122
123Next hop and depth contain the same information as in the tbl24.
124The two flags show whether the entry and the table are valid respectively.
125
126The other main data structure is a table containing the main information about the rules (IP and next hop).
127This is a higher level table, used for different things:
128
129*   Check whether a rule already exists or not, prior to addition or deletion,
130    without having to actually perform a lookup.
131
132*   When deleting, to check whether there is a rule containing the one that is to be deleted.
133    This is important, since the main data structure will have to be updated accordingly.
134
135Addition
136~~~~~~~~
137
138When adding a rule, there are different possibilities.
139If the rule's depth is exactly 24 bits, then:
140
141*   Use the rule (IP address) as an index to the tbl24.
142
143*   If the entry is invalid (i.e. it doesn't already contain a rule) then set its next hop to its value,
144    the valid flag to 1 (meaning this entry is in use),
145    and the external entry flag to 0
146    (meaning the lookup process ends at this point, since this is the longest prefix that matches).
147
148If the rule's depth is exactly 32 bits, then:
149
150*   Use the first 24 bits of the rule as an index to the tbl24.
151
152*   If the entry is invalid (i.e. it doesn't already contain a rule) then look for a free tbl8,
153    set the index to the tbl8 to this value,
154    the valid flag to 1 (meaning this entry is in use), and the external entry flag to 1
155    (meaning the lookup process must continue since the rule hasn't been explored completely).
156
157If the rule's depth is any other value, prefix expansion must be performed.
158This means the rule is copied to all the entries (as long as they are not in use) which would also cause a match.
159
160As a simple example, let's assume the depth is 20 bits.
161This means that there are 2^(24 - 20) = 16 different combinations of the first 24 bits of an IP address that
162would cause a match.
163Hence, in this case, we copy the exact same entry to every position indexed by one of these combinations.
164
165By doing this we ensure that during the lookup process, if a rule matching the IP address exists,
166it is found in either one or two memory accesses,
167depending on whether we need to move to the next table or not.
168Prefix expansion is one of the keys of this algorithm,
169since it improves the speed dramatically by adding redundancy.
170
171Lookup
172~~~~~~
173
174The lookup process is much simpler and quicker. In this case:
175
176*   Use the first 24 bits of the IP address as an index to the tbl24.
177    If the entry is not in use, then it means we don't have a rule matching this IP.
178    If it is valid and the external entry flag is set to 0, then the next hop is returned.
179
180*   If it is valid and the external entry flag is set to 1,
181    then we use the tbl8 index to find out the tbl8 to be checked,
182    and the last 8 bits of the IP address as an index to this table.
183    Similarly, if the entry is not in use, then we don't have a rule matching this IP address.
184    If it is valid then the next hop is returned.
185
186Limitations in the Number of Rules
187~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
188
189There are different things that limit the number of rules that can be added.
190The first one is the maximum number of rules, which is a parameter passed through the API.
191Once this number is reached,
192it is not possible to add any more rules to the routing table unless one or more are removed.
193
194The second reason is an intrinsic limitation of the algorithm.
195As explained before, to avoid high memory consumption, the number of tbl8s is limited in compilation time
196(this value is by default 256).
197If we exhaust tbl8s, we won't be able to add any more rules.
198How many of them are necessary for a specific routing table is hard to determine in advance.
199
200A tbl8 is consumed whenever we have a new rule with depth bigger than 24,
201and the first 24 bits of this rule are not the same as the first 24 bits of a rule previously added.
202If they are, then the new rule will share the same tbl8 than the previous one,
203since the only difference between the two rules is within the last byte.
204
205With the default value of 256, we can have up to 256 rules longer than 24 bits that differ on their first three bytes.
206Since routes longer than 24 bits are unlikely, this shouldn't be a problem in most setups.
207Even if it is, however, the number of tbl8s can be modified.
208
209Use Case: IPv4 Forwarding
210~~~~~~~~~~~~~~~~~~~~~~~~~
211
212The LPM algorithm is used to implement Classless Inter-Domain Routing (CIDR) strategy used by routers implementing IPv4 forwarding.
213
214References
215~~~~~~~~~~
216
217*   RFC1519 Classless Inter-Domain Routing (CIDR): an Address Assignment and Aggregation Strategy,
218    `http://www.ietf.org/rfc/rfc1519 <http://www.ietf.org/rfc/rfc1519>`_
219
220*   Pankaj Gupta, Algorithms for Routing Lookups and Packet Classification, PhD Thesis, Stanford University,
221    2000  (`http://klamath.stanford.edu/~pankaj/thesis/ thesis_1sided.pdf <http://klamath.stanford.edu/~pankaj/thesis/%20thesis_1sided.pdf>`_ )
222
223.. |tbl24_tbl8| image:: img/tbl24_tbl8.png
224