..  BSD LICENSE
    Copyright(c) 2016-2017 Intel Corporation. All rights reserved.

    Redistribution and use in source and binary forms, with or without
    modification, are permitted provided that the following conditions
    are met:

    * Redistributions of source code must retain the above copyright
      notice, this list of conditions and the following disclaimer.
    * Redistributions in binary form must reproduce the above copyright
      notice, this list of conditions and the following disclaimer in
      the documentation and/or other materials provided with the
      distribution.
    * Neither the name of Intel Corporation nor the names of its
      contributors may be used to endorse or promote products derived
      from this software without specific prior written permission.

    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.


Cryptography Device Library
===========================

The cryptodev library provides a Crypto device framework for management and
provisioning of hardware and software Crypto poll mode drivers, defining generic
APIs which support a number of different Crypto operations. The framework
currently only supports cipher, authentication, chained cipher/authentication
and AEAD symmetric Crypto operations.


Design Principles
-----------------

The cryptodev library follows the same basic principles as those used in DPDK's
Ethernet Device framework. The Crypto framework provides a generic Crypto device
framework which supports both physical (hardware) and virtual (software) Crypto
devices, as well as a generic Crypto API which allows Crypto devices to be
managed and configured and allows Crypto operations to be provisioned on a
Crypto poll mode driver.


Device Management
-----------------

Device Creation
~~~~~~~~~~~~~~~

Physical Crypto devices are discovered during the PCI probe/enumeration that
the EAL performs at DPDK initialization, based on their PCI device identifier,
with one device per unique PCI BDF (bus/bridge, device, function). Specific
physical Crypto devices, like other physical devices in DPDK, can be
white-listed or black-listed using the EAL command line options.

Virtual devices can be created by two mechanisms, either using the EAL command
line options or from within the application using an EAL API directly.

From the command line using the ``--vdev`` EAL option:

.. code-block:: console

   --vdev 'cryptodev_aesni_mb_pmd0,max_nb_queue_pairs=2,max_nb_sessions=1024,socket_id=0'

Or using the ``rte_vdev_init`` API within the application code:

.. code-block:: c

   rte_vdev_init("cryptodev_aesni_mb_pmd",
                 "max_nb_queue_pairs=2,max_nb_sessions=1024,socket_id=0");

All virtual Crypto devices support the following initialization parameters:

* ``max_nb_queue_pairs`` - maximum number of queue pairs supported by the device.
* ``max_nb_sessions`` - maximum number of sessions supported by the device.
* ``socket_id`` - socket on which to allocate the device resources.
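
The parameter string has the same comma-separated ``key=value`` form in both
cases, so an application that derives the values at run time can compose it
with standard C before handing it to ``rte_vdev_init``. The following is a
minimal sketch only: ``format_vdev_args`` is a hypothetical helper, not part of
DPDK, and it covers just the three parameters listed above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not a DPDK API): compose the initialization
 * parameter string for a virtual Crypto device.
 *
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int
format_vdev_args(char *buf, size_t len, unsigned int max_nb_queue_pairs,
                 unsigned int max_nb_sessions, int socket_id)
{
    int n;

    assert(buf != NULL);

    n = snprintf(buf, len,
                 "max_nb_queue_pairs=%u,max_nb_sessions=%u,socket_id=%d",
                 max_nb_queue_pairs, max_nb_sessions, socket_id);

    /* snprintf reports the length it needed; treat truncation as failure. */
    if (n < 0 || (size_t)n >= len)
        return -1;

    return 0;
}
```

The resulting string would then be passed as the second argument of
``rte_vdev_init``, in place of the literal used in the call shown earlier.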


Device Identification
~~~~~~~~~~~~~~~~~~~~~

Each device, whether virtual or physical, is uniquely designated by two
identifiers:

- A unique device index used to designate the Crypto device in all functions
  exported by the cryptodev API.

- A device name used to designate the Crypto device in console messages, for
  administration or debugging purposes. For ease of use, the device name
  includes the device index.


Device Configuration
~~~~~~~~~~~~~~~~~~~~

The configuration of each Crypto device includes the following operations:

- Allocation of resources, including hardware resources if a physical device.
- Resetting the device into a well-known default state.
- Initialization of statistics counters.

The ``rte_cryptodev_configure`` API is used to configure a Crypto device.

.. code-block:: c

   int rte_cryptodev_configure(uint8_t dev_id,
                               struct rte_cryptodev_config *config);

The ``rte_cryptodev_config`` structure is used to pass the configuration
parameters for socket selection and number of queue pairs.

.. code-block:: c

   struct rte_cryptodev_config {
       int socket_id;
       /**< Socket to allocate resources on */
       uint16_t nb_queue_pairs;
       /**< Number of queue pairs to configure on device */
   };


Configuration of Queue Pairs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Each queue pair of a Crypto device is individually configured through the
``rte_cryptodev_queue_pair_setup`` API.
Each queue pair's resources may be allocated on a specified socket.


.. code-block:: c

   int rte_cryptodev_queue_pair_setup(uint8_t dev_id, uint16_t queue_pair_id,
                                      const struct rte_cryptodev_qp_conf *qp_conf,
                                      int socket_id);

   struct rte_cryptodev_qp_conf {
       uint32_t nb_descriptors; /**< Number of descriptors per queue pair */
   };


Logical Cores, Memory and Queue Pair Relationships
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The Crypto device library, like the Poll Mode Driver library, supports NUMA,
where a processor's logical cores and interfaces utilize its local memory.
Therefore Crypto operations, and in the case of symmetric Crypto operations,
the session and the mbuf being operated on, should be allocated from memory
pools created in the local memory. The buffers should, if possible, remain on
the local processor to obtain the best performance results, and buffer
descriptors should be populated with mbufs allocated from a mempool allocated
from local memory.

The run-to-completion model also performs better, especially in the case of
virtual Crypto devices, if the Crypto operation, session and data buffer are
in local memory instead of a remote processor's memory. This is also true for
the pipeline model, provided all logical cores used are located on the same
processor.

Multiple logical cores should never share the same queue pair for enqueuing
operations or dequeuing operations on the same Crypto device, since this would
require global locks and hinder performance. It is however possible to dequeue
an operation on a queue pair from a different logical core than the one on
which it was enqueued. This means that the crypto burst enqueue/dequeue APIs
are a logical place to transition from one logical core to another in a
packet processing pipeline.
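
One simple way to honour the rule that logical cores never share a queue pair
is to give each worker core its own queue pair index. The sketch below is plain
C with no DPDK calls: ``assign_queue_pairs`` and its parameters are
hypothetical names used for illustration, and the device's queue pair limit is
assumed to have been obtained from the device beforehand.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a DPDK API): give each worker core a
 * dedicated queue pair so that no two cores enqueue or dequeue on the
 * same queue pair of a device.
 *
 * qp_of_worker[i] receives the queue pair index for worker i.
 * Returns 0 on success, -1 if the device has too few queue pairs.
 */
static int
assign_queue_pairs(uint16_t nb_workers, uint16_t max_nb_queue_pairs,
                   uint16_t *qp_of_worker)
{
    uint16_t i;

    assert(qp_of_worker != NULL);

    /* Sharing would need locking, so refuse rather than wrap around. */
    if (nb_workers > max_nb_queue_pairs)
        return -1;

    for (i = 0; i < nb_workers; i++)
        qp_of_worker[i] = i;

    return 0;
}
```

Each worker would then use only its own index as the queue pair identifier in
its enqueue and dequeue calls on that device.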


Device Features and Capabilities
--------------------------------

Crypto devices define their functionality through two mechanisms, global device
features and algorithm capabilities. Global device features identify
device-wide features which are applicable to the whole device, such as
the device having hardware acceleration or supporting symmetric Crypto
operations.

The capabilities mechanism defines the individual algorithms/functions which
the device supports, such as a specific symmetric Crypto cipher,
authentication operation or Authenticated Encryption with Associated Data
(AEAD) operation.


Device Features
~~~~~~~~~~~~~~~

Currently the following Crypto device features are defined:

* Symmetric Crypto operations
* Asymmetric Crypto operations
* Chaining of symmetric Crypto operations
* SSE accelerated SIMD vector operations
* AVX accelerated SIMD vector operations
* AVX2 accelerated SIMD vector operations
* AESNI accelerated instructions
* Hardware off-load processing


Device Operation Capabilities
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Crypto capabilities, which identify the particular algorithms that the Crypto
PMD supports, are defined by the operation type, the operation transform, the
transform identifier and then the particulars of the transform. For the full
scope of the Crypto capability see the definition of the structure in the
*DPDK API Reference*.

.. code-block:: c

   struct rte_cryptodev_capabilities;

Each Crypto poll mode driver defines its own private array of capabilities
for the operations it supports. Below is an example of the capabilities for a
PMD which supports the authentication algorithm SHA1_HMAC and the cipher
algorithm AES_CBC.


.. code-block:: c

   static const struct rte_cryptodev_capabilities pmd_capabilities[] = {
       {    /* SHA1 HMAC */
           .op = RTE_CRYPTO_OP_TYPE_SYMMETRIC,
           .sym = {
               .xform_type = RTE_CRYPTO_SYM_XFORM_AUTH,
               .auth = {
                   .algo = RTE_CRYPTO_AUTH_SHA1_HMAC,
                   .block_size = 64,
                   .key_size = {
                       .min = 64,
                       .max = 64,
                       .increment = 0
                   },
                   .digest_size = {
                       .min = 12,
                       .max = 12,
                       .increment = 0
                   },
                   .aad_size = { 0 },
                   .iv_size = { 0 }
               }
           }
       },
       {    /* AES CBC */
           .op = RTE_CRYPTO_OP_TYPE_SYMMETRIC,
           .sym = {
               .xform_type = RTE_CRYPTO_SYM_XFORM_CIPHER,
               .cipher = {
                   .algo = RTE_CRYPTO_CIPHER_AES_CBC,
                   .block_size = 16,
                   .key_size = {
                       .min = 16,
                       .max = 32,
                       .increment = 8
                   },
                   .iv_size = {
                       .min = 16,
                       .max = 16,
                       .increment = 0
                   }
               }
           }
       }
   };


Capabilities Discovery
~~~~~~~~~~~~~~~~~~~~~~

Discovering the features and capabilities of a Crypto device poll mode driver
is achieved through the ``rte_cryptodev_info_get`` function.

.. code-block:: c

   void rte_cryptodev_info_get(uint8_t dev_id,
                               struct rte_cryptodev_info *dev_info);

This allows the user to query a specific Crypto PMD and get all the device
features and capabilities. The ``rte_cryptodev_info`` structure contains all the
relevant information for the device.

.. code-block:: c

   struct rte_cryptodev_info {
       const char *driver_name;
       uint8_t driver_id;
       struct rte_pci_device *pci_dev;

       uint64_t feature_flags;

       const struct rte_cryptodev_capabilities *capabilities;

       unsigned max_nb_queue_pairs;

       struct {
           unsigned max_nb_sessions;
       } sym;
   };


Operation Processing
--------------------

Scheduling of Crypto operations on DPDK's application data path is
performed using a burst oriented asynchronous API set.
A queue pair on a Crypto device accepts a burst of Crypto operations using the
enqueue burst API. On physical Crypto devices the enqueue burst API will place
the operations to be processed on the device's hardware input queue; for
virtual devices the processing of the Crypto operations is usually completed
during the enqueue call to the Crypto device. The dequeue burst API will
retrieve any processed operations available from the queue pair on the Crypto
device; for physical devices this is usually directly from the device's
processed queue, and for virtual devices from an ``rte_ring`` where processed
operations are placed after being processed on the enqueue call.


Enqueue / Dequeue Burst APIs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The burst enqueue API uses a Crypto device identifier and a queue pair
identifier to specify the Crypto device queue pair to schedule the processing on.
The ``nb_ops`` parameter is the number of operations to process which are
supplied in the ``ops`` array of ``rte_crypto_op`` structures.
The enqueue function returns the number of operations it actually enqueued for
processing; a return value equal to ``nb_ops`` means that all operations have
been enqueued.

.. code-block:: c

   uint16_t rte_cryptodev_enqueue_burst(uint8_t dev_id, uint16_t qp_id,
                                        struct rte_crypto_op **ops, uint16_t nb_ops);

The dequeue API uses the same format as the enqueue API, but
the ``nb_ops`` and ``ops`` parameters are now used to specify the maximum number
of processed operations the user wishes to retrieve and the location in which
to store them. The API call returns the actual number of processed operations
returned; this can never be larger than ``nb_ops``.


.. code-block:: c

   uint16_t rte_cryptodev_dequeue_burst(uint8_t dev_id, uint16_t qp_id,
                                        struct rte_crypto_op **ops, uint16_t nb_ops);


Operation Representation
~~~~~~~~~~~~~~~~~~~~~~~~

A Crypto operation is represented by an ``rte_crypto_op`` structure, which is a
generic metadata container for all necessary information required for the
Crypto operation to be processed on a particular Crypto device poll mode driver.

.. figure:: img/crypto_op.*

The operation structure includes the operation type, the operation status
and the session type (session-based/less), and a reference to the operation
specific data, which can vary in size and content depending on the operation
being provisioned. It also contains the source mempool for the operation,
if it was allocated from a mempool.

If Crypto operations are allocated from a Crypto operation mempool (see the
next section), there is also the ability to allocate private memory with the
operation for the application's purposes.

Application software is responsible for specifying all the operation specific
fields in the ``rte_crypto_op`` structure which are then used by the Crypto PMD
to process the requested operation.


Operation Management and Allocation
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The cryptodev library provides an API set for managing Crypto operations which
utilizes the Mempool Library to allocate operation buffers. Therefore, it
ensures that Crypto operations are interleaved optimally across the channels
and ranks for optimal processing.
An ``rte_crypto_op`` contains a field indicating the pool that it originated from.
When calling ``rte_crypto_op_free(op)``, the operation returns to its original pool.


.. code-block:: c

   extern struct rte_mempool *
   rte_crypto_op_pool_create(const char *name, enum rte_crypto_op_type type,
                             unsigned nb_elts, unsigned cache_size, uint16_t priv_size,
                             int socket_id);

During pool creation ``rte_crypto_op_init()`` is called as a constructor to
initialize each Crypto operation, which subsequently calls
``__rte_crypto_op_reset()`` to configure any operation type specific fields based
on the type parameter.


``rte_crypto_op_alloc()`` and ``rte_crypto_op_bulk_alloc()`` are used to allocate
Crypto operations of a specific type from a given Crypto operation mempool.
``__rte_crypto_op_reset()`` is called on each operation before it is returned
to the user, so the operation is always in a known good state before use
by the application.

.. code-block:: c

   struct rte_crypto_op *rte_crypto_op_alloc(struct rte_mempool *mempool,
                                             enum rte_crypto_op_type type);

   unsigned rte_crypto_op_bulk_alloc(struct rte_mempool *mempool,
                                     enum rte_crypto_op_type type,
                                     struct rte_crypto_op **ops, uint16_t nb_ops);

``rte_crypto_op_free()`` is called by the application to return an operation to
its allocating pool.

.. code-block:: c

   void rte_crypto_op_free(struct rte_crypto_op *op);


Symmetric Cryptography Support
------------------------------

The cryptodev library currently provides support for the following symmetric
Crypto operations: cipher, authentication, including chaining of these
operations, as well as AEAD operations.


Session and Session Management
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Sessions are used in symmetric cryptographic processing to store the immutable
data defined in a cryptographic transform which is used in the operation
processing of a packet flow.
Sessions are used to manage information such as
expanded cipher keys and HMAC IPADs and OPADs, which need to be calculated for a
particular Crypto operation, but are immutable on a packet to packet basis for
a flow. Crypto sessions cache this immutable data in an optimal way for the
underlying PMD and this allows further acceleration of the offload of
Crypto workloads.

.. figure:: img/cryptodev_sym_sess.*

The Crypto device framework provides APIs to allocate and initialize sessions
for crypto devices, where sessions are mempool objects.
It is the application's responsibility to create and manage the session mempools.
This approach allows for different scenarios such as having a single session
mempool for all crypto devices (where the mempool object size is big
enough to hold the private session of any crypto device), as well as having
multiple session mempools of different sizes for better memory usage.

An application can use ``rte_cryptodev_get_private_session_size()`` to
get the private session size of a given crypto device. This function allows
an application to calculate the maximum device session size across all crypto
devices in order to create a single session mempool.
If instead an application creates multiple session mempools, the Crypto device
framework also provides ``rte_cryptodev_get_header_session_size()`` to get
the size of an uninitialized session.

Once the session mempools have been created, ``rte_cryptodev_sym_session_create()``
is used to allocate an uninitialized session from the given mempool.
The session then must be initialized using ``rte_cryptodev_sym_session_init()``
for each of the required crypto devices. A symmetric transform chain
is used to specify the operation and its parameters. See the section below for
details on transforms.
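
For the single-mempool scenario described above, the mempool element size is
driven by the largest private session size reported by any device. The
following sketch shows only that calculation in plain C:
``max_session_size`` is a hypothetical helper, not a cryptodev API, and the
``sizes`` array is assumed to have been filled by calling
``rte_cryptodev_get_private_session_size()`` once per device.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not a DPDK API): return the largest private
 * session size among the devices that will share one session mempool.
 *
 * sizes[i] is assumed to hold the value reported for device i.
 * Returns 0 when nb_devs is 0.
 */
static unsigned int
max_session_size(const unsigned int *sizes, size_t nb_devs)
{
    unsigned int max = 0;
    size_t i;

    assert(sizes != NULL || nb_devs == 0);

    for (i = 0; i < nb_devs; i++) {
        if (sizes[i] > max)
            max = sizes[i];
    }

    return max;
}
```

A mempool whose objects are at least this large can hold the private session
data of any of the devices, at the cost of some unused space for devices with
smaller sessions.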

When a session is no longer used, the user must call
``rte_cryptodev_sym_session_clear()`` for each of the crypto devices that are
using the session, to free all driver private session data. Once this is done,
the session should be freed using ``rte_cryptodev_sym_session_free()``, which
returns it to its mempool.


Transforms and Transform Chaining
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Symmetric Crypto transforms (``rte_crypto_sym_xform``) are the mechanism used
to specify the details of the Crypto operation. For chaining of symmetric
operations such as cipher encrypt and authentication generate, the ``next``
pointer allows transforms to be chained together. Crypto devices which support
chaining must publish the chaining of symmetric Crypto operations feature flag.

Currently there are three transform types: cipher, authentication and AEAD.
It is also important to note that the order in which the
transforms are passed indicates the order of the chaining.

.. code-block:: c

   struct rte_crypto_sym_xform {
       struct rte_crypto_sym_xform *next;
       /**< next xform in chain */
       enum rte_crypto_sym_xform_type type;
       /**< xform type */
       union {
           struct rte_crypto_auth_xform auth;
           /**< Authentication / hash xform */
           struct rte_crypto_cipher_xform cipher;
           /**< Cipher xform */
           struct rte_crypto_aead_xform aead;
           /**< AEAD xform */
       };
   };

The API does not place a limit on the number of transforms that can be chained
together, but this will be limited by the underlying Crypto device poll mode
driver which is processing the operation.

.. figure:: img/crypto_xform_chain.*


Symmetric Operations
~~~~~~~~~~~~~~~~~~~~

The symmetric Crypto operation structure contains all the mutable data relating
to performing symmetric cryptographic processing on a referenced mbuf data
buffer.
It is used for cipher, authentication, AEAD and chained
operations.

As a minimum the symmetric operation must have a source data buffer (``m_src``),
a valid session (or transform chain if in session-less mode) and the minimum
authentication/cipher/AEAD parameters required depending on the type of
operation specified in the session or the transform chain.

.. code-block:: c

   struct rte_crypto_sym_op {
       struct rte_mbuf *m_src;
       struct rte_mbuf *m_dst;

       union {
           struct rte_cryptodev_sym_session *session;
           /**< Handle for the initialised session context */
           struct rte_crypto_sym_xform *xform;
           /**< Session-less API Crypto operation parameters */
       };

       union {
           struct {
               struct {
                   uint32_t offset;
                   uint32_t length;
               } data; /**< Data offsets and length for AEAD */

               struct {
                   uint8_t *data;
                   phys_addr_t phys_addr;
               } digest; /**< Digest parameters */

               struct {
                   uint8_t *data;
                   phys_addr_t phys_addr;
               } aad;
               /**< Additional authentication parameters */
           } aead;

           struct {
               struct {
                   struct {
                       uint32_t offset;
                       uint32_t length;
                   } data; /**< Data offsets and length for ciphering */
               } cipher;

               struct {
                   struct {
                       uint32_t offset;
                       uint32_t length;
                   } data;
                   /**< Data offsets and length for authentication */

                   struct {
                       uint8_t *data;
                       phys_addr_t phys_addr;
                   } digest; /**< Digest parameters */
               } auth;
           };
       };
   };


Asymmetric Cryptography
-----------------------

Asymmetric functionality is currently not supported by the cryptodev API.


Crypto Device API
~~~~~~~~~~~~~~~~~

The cryptodev Library API is described in the *DPDK API Reference* document.