..  SPDX-License-Identifier: BSD-3-Clause
    Copyright (c) 2022 Marvell.

dpdk-test-mldev Application
===========================

The ``dpdk-test-mldev`` tool is a Data Plane Development Kit (DPDK) application
that allows testing various mldev use cases.
This application has a generic framework to add new mldev based test cases
to verify functionality
and measure the performance of inference execution on DPDK ML devices.


Application and Options
-----------------------

The application has a number of command line options:

.. code-block:: console

   dpdk-test-mldev [EAL Options] -- [application options]


EAL Options
~~~~~~~~~~~

The following are the EAL command-line options that can be used
with the ``dpdk-test-mldev`` application.
See the DPDK Getting Started Guides for more information on these options.

``-c <COREMASK>`` or ``-l <CORELIST>``
  Set the hexadecimal bitmask of the cores to run on.
  The corelist is a list of cores to use.

``-a <PCI_ID>``
  Attach a PCI based ML device.
  Specific to drivers using a PCI based ML device.

``--vdev <driver>``
  Add a virtual mldev device.
  Specific to drivers using a ML virtual device.


Application Options
~~~~~~~~~~~~~~~~~~~

The following are the command-line options supported by the test application.

``--test <name>``
  Name of the test to execute.
  ML tests are divided into three groups: Device, Model and Inference tests.
  Test name should be one of the following supported tests.

  **ML Device Tests** ::

    device_ops

  **ML Model Tests** ::

    model_ops

  **ML Inference Tests** ::

    inference_ordered
    inference_interleave

``--dev_id <n>``
  Set the device ID of the ML device to be used for the test.
  Default value is ``0``.

``--socket_id <n>``
  Set the socket ID of the application resources.
  Default value is ``SOCKET_ID_ANY``.

``--models <model_list>``
  Set the list of model files to be used for the tests.
  Application expects the ``model_list`` in comma separated form
  (e.g. ``--models model_A.bin,model_B.bin``).
  Maximum number of models supported by the test is ``8``.

``--filelist <file_list>``
  Set the list of model, input, output and reference files to be used for the tests.
  Application expects the ``file_list`` to be in comma separated form
  (i.e. ``--filelist <model,input,output>[,reference]``).

  Multiple filelist entries can be specified when running the tests with multiple models.
  Both quantized and dequantized outputs are written to the disk.
  The dequantized output file has the name specified by the user through the ``--filelist`` option.
  A suffix ``.q`` is appended to the quantized output filename.
  Maximum number of filelist entries supported by the test is ``8``.

``--quantized_io``
  Disable IO quantization and dequantization.

``--repetitions <n>``
  Set the number of inference repetitions to be executed in the test for each model.
  Default value is ``1``.

``--burst_size <n>``
  Set the burst size to be used when enqueuing / dequeuing inferences.
  Default value is ``1``.

``--queue_pairs <n>``
  Set the number of queue-pairs to be used for inference enqueue and dequeue operations.
  Default value is ``1``.

``--queue_size <n>``
  Set the size of queue-pair to be created for inference enqueue / dequeue operations.
  The queue size translates into the ``rte_ml_dev_qp_conf::nb_desc`` field during queue-pair creation.
  Default value is ``1``.

``--tolerance <n>``
  Set the tolerance value in percentage to be used for output validation.
  Default value is ``0``.

``--stats``
  Enable reporting device extended stats.

``--debug``
  Enable the tests to run in debug mode.

``--help``
  Print help message.
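
The ``--filelist`` format described above can be illustrated with a small Python sketch.
It is not part of the application: the helper ``parse_filelists``
and the ``FileList`` type are hypothetical names,
and the checks only mirror the rules stated in this guide
(three mandatory fields, an optional reference file, at most ``8`` entries).

.. code-block:: python

   from typing import List, NamedTuple, Optional

   MAX_FILELISTS = 8  # limit stated in this guide


   class FileList(NamedTuple):
       """One --filelist entry (hypothetical type, for illustration only)."""
       model: str
       input: str
       output: str
       reference: Optional[str] = None  # output validation needs this file


   def parse_filelists(entries: List[str]) -> List[FileList]:
       """Split each --filelist value into model,input,output[,reference]."""
       if len(entries) > MAX_FILELISTS:
           raise ValueError(f"at most {MAX_FILELISTS} filelist entries are supported")
       parsed = []
       for entry in entries:
           fields = entry.split(",")
           # Three or four non-empty, comma separated fields are expected.
           if len(fields) not in (3, 4) or any(not f for f in fields):
               raise ValueError(f"expected model,input,output[,reference]: {entry!r}")
           parsed.append(FileList(*fields))
       return parsed


   if __name__ == "__main__":
       entries = ["model_A.bin,input_A.bin,output_A.bin,reference_A.bin",
                  "model_B.bin,input_B.bin,output_B.bin"]
       for fl in parse_filelists(entries):
           # The quantized output uses the same name with a ".q" suffix.
           print(fl.model, fl.output, fl.output + ".q",
                 "validated" if fl.reference else "not validated")

An entry without a reference file is still accepted;
it only means that no output validation is done for that model.
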


ML Device Tests
---------------

ML device tests are functional tests to validate the ML device API.
Device tests validate the ML device configure, close, start and stop APIs.


Application Options
~~~~~~~~~~~~~~~~~~~

Supported command line options for the ``device_ops`` test are the following::

   --debug
   --test
   --dev_id
   --socket_id
   --queue_pairs
   --queue_size


DEVICE_OPS Test
~~~~~~~~~~~~~~~

Device ops test validates the device configuration and reconfiguration support.
The test configures the ML device based on the options
``--queue_pairs`` and ``--queue_size`` specified by the user,
and later reconfigures the ML device with the maximum number of queue pairs
and the maximum queue size reported through the device info.


Example
^^^^^^^

Command to run ``device_ops`` test:

.. code-block:: console

   sudo <build_dir>/app/dpdk-test-mldev -c 0xf -a <PCI_ID> -- \
           --test=device_ops

Command to run ``device_ops`` test with user options:

.. code-block:: console

   sudo <build_dir>/app/dpdk-test-mldev -c 0xf -a <PCI_ID> -- \
           --test=device_ops --queue_pairs <M> --queue_size <N>


ML Model Tests
--------------

Model tests are functional tests to validate the ML model API.
Model tests validate the load, start, stop and unload operations on ML models.


Application Options
~~~~~~~~~~~~~~~~~~~

Supported command line options for the ``model_ops`` test are the following::

   --debug
   --test
   --dev_id
   --socket_id
   --models

List of model files to be used for the ``model_ops`` test can be specified
through the option ``--models <model_list>`` as a comma separated list.
Maximum number of models supported in the test is ``8``.

.. note::

   * The ``--models <model_list>`` is a mandatory option for running this test.
   * Options not supported by the test are ignored if specified.


MODEL_OPS Test
~~~~~~~~~~~~~~

The test is a collection of multiple sub-tests,
each with a different order of slow-path operations
when handling `N` models.

**Sub-test A:**
executes the sequence of load / start / stop / unload for a model in order,
followed by the next model.

.. _figure_mldev_model_ops_subtest_a:

.. figure:: img/mldev_model_ops_subtest_a.*

   Execution sequence of model_ops subtest A.

**Sub-test B:**
executes load for all models, followed by a start for all models.
Upon successful start of all models, stop is invoked for all models followed by unload.

.. _figure_mldev_model_ops_subtest_b:

.. figure:: img/mldev_model_ops_subtest_b.*

   Execution sequence of model_ops subtest B.

**Sub-test C:**
loads all models, followed by a start and stop of all models in order.
Upon completion of stop, unload is invoked for all models.

.. _figure_mldev_model_ops_subtest_c:

.. figure:: img/mldev_model_ops_subtest_c.*

   Execution sequence of model_ops subtest C.

**Sub-test D:**
executes load and start for all models available.
Upon successful start of all models, stop is executed for the models.

.. _figure_mldev_model_ops_subtest_d:

.. figure:: img/mldev_model_ops_subtest_d.*

   Execution sequence of model_ops subtest D.


Example
^^^^^^^

Command to run ``model_ops`` test:

.. code-block:: console

   sudo <build_dir>/app/dpdk-test-mldev -c 0xf -a <PCI_ID> -- \
           --test=model_ops --models model_1.bin,model_2.bin,model_3.bin,model_4.bin


ML Inference Tests
------------------

Inference tests are a set of tests to validate end-to-end inference execution on ML device.
These tests execute the full sequence of operations required to run inferences
with one or multiple models.


Application Options
~~~~~~~~~~~~~~~~~~~

Supported command line options for inference tests are the following::

   --debug
   --test
   --dev_id
   --socket_id
   --filelist
   --repetitions
   --burst_size
   --queue_pairs
   --queue_size
   --tolerance
   --stats

List of files to be used for the inference tests can be specified
through the option ``--filelist <file_list>`` as a comma separated list.
A filelist entry has the format
``--filelist <model_file,input_file,output_file>[,reference_file]``
and specifies the list of files required to test with a single model.
Multiple filelist entries are supported by the test, one entry per model.
Maximum number of file entries supported by the test is ``8``.

When the ``--burst_size <num>`` option is specified for the test,
each enqueue or dequeue burst call tries to enqueue or dequeue
``num`` inferences respectively.

In the inference test, a pair of lcores is mapped to each queue pair.
Minimum number of lcores required for the tests is equal to ``(queue_pairs * 2 + 1)``.

Output validation of inference is enabled only
when a reference file is specified through the ``--filelist`` option.
The application additionally considers the tolerance value
provided through the ``--tolerance`` option during validation.
When the tolerance value is 0, the CRC32 hashes of the inference output
and the reference output are compared.
When the tolerance is non-zero, element wise comparison of the output is performed.
Validation is considered successful only
when all the elements of the output tensor are within the tolerance range specified.

Enabling ``--stats`` prints the extended stats supported by the driver.

.. note::

   * The ``--filelist <file_list>`` is a mandatory option for running inference tests.
   * Options not supported by the tests are ignored if specified.
   * Element wise comparison is not supported when
     the output dtype is either fp8, fp16 or bfloat16.
     This is applicable only when the tolerance is greater than zero
     and for pre-quantized models only.


INFERENCE_ORDERED Test
~~~~~~~~~~~~~~~~~~~~~~

This is a functional test for validating the end-to-end inference execution on ML device.
This test configures the ML device and queue pairs
as per the queue-pair related options (``--queue_pairs`` and ``--queue_size``) specified by the user.
Upon successful configuration of the device and queue pairs,
the first model specified through the filelist is loaded to the device
and inferences are enqueued by a pool of worker threads to the ML device.
The total number of inferences enqueued for the model is equal to the repetitions specified.
A dedicated pool of worker threads dequeues the inferences from the device.
The model is unloaded upon completion of all inferences for the model.
The test continues loading and executing inference requests for all models
specified through the ``--filelist`` option in an ordered manner.

.. _figure_mldev_inference_ordered:

.. figure:: img/mldev_inference_ordered.*

   Execution of inference_ordered on single model.


Example
^^^^^^^

Example command to run ``inference_ordered`` test:

.. code-block:: console

   sudo <build_dir>/app/dpdk-test-mldev -c 0xf -a <PCI_ID> -- \
           --test=inference_ordered --filelist model.bin,input.bin,output.bin

Example command to run ``inference_ordered`` test with a specific burst size:

.. code-block:: console

   sudo <build_dir>/app/dpdk-test-mldev -c 0xf -a <PCI_ID> -- \
           --test=inference_ordered --filelist model.bin,input.bin,output.bin \
           --burst_size 12

Example command to run ``inference_ordered`` test with multiple queue-pairs and queue size
(the coremask provides the ``4 * 2 + 1 = 9`` lcores required for four queue pairs):

.. code-block:: console

   sudo <build_dir>/app/dpdk-test-mldev -c 0x1ff -a <PCI_ID> -- \
           --test=inference_ordered --filelist model.bin,input.bin,output.bin \
           --queue_pairs 4 --queue_size 16

Example command to run ``inference_ordered`` with output validation using tolerance of ``1%``:

.. code-block:: console

   sudo <build_dir>/app/dpdk-test-mldev -c 0xf -a <PCI_ID> -- \
           --test=inference_ordered --filelist model.bin,input.bin,output.bin,reference.bin \
           --tolerance 1.0


INFERENCE_INTERLEAVE Test
~~~~~~~~~~~~~~~~~~~~~~~~~

This is a stress test for validating the end-to-end inference execution on ML device.
The test configures the ML device and queue pairs
as per the queue-pair related options (``--queue_pairs`` and ``--queue_size``) specified by the user.
Upon successful configuration of the device and queue pairs,
all models specified through the filelist are loaded to the device.
Inferences for multiple models are enqueued by a pool of worker threads in parallel.
Inference execution by the device is interleaved between multiple models.
The total number of inferences enqueued for a model is equal to the repetitions specified.
An additional pool of threads dequeues the inferences from the device.
Models are unloaded upon completion of inferences for all models loaded.

.. _figure_mldev_inference_interleave:

.. figure:: img/mldev_inference_interleave.*

   Execution of inference_interleave on single model.


Example
^^^^^^^

Example command to run ``inference_interleave`` test:

.. code-block:: console

   sudo <build_dir>/app/dpdk-test-mldev -c 0xf -a <PCI_ID> -- \
           --test=inference_interleave --filelist model.bin,input.bin,output.bin

Example command to run ``inference_interleave`` test with multiple models:

.. code-block:: console

   sudo <build_dir>/app/dpdk-test-mldev -c 0xf -a <PCI_ID> -- \
           --test=inference_interleave --filelist model_A.bin,input_A.bin,output_A.bin \
           --filelist model_B.bin,input_B.bin,output_B.bin

Example command to run ``inference_interleave`` test
with a specific burst size, multiple queue-pairs and queue size
(the coremask provides the ``8 * 2 + 1 = 17`` lcores required for eight queue pairs):

.. code-block:: console

   sudo <build_dir>/app/dpdk-test-mldev -c 0x1ffff -a <PCI_ID> -- \
           --test=inference_interleave --filelist model.bin,input.bin,output.bin \
           --queue_pairs 8 --queue_size 12 --burst_size 16

Example command to run ``inference_interleave`` test
with multiple models and output validation using tolerance of ``2.0%``:

.. code-block:: console

   sudo <build_dir>/app/dpdk-test-mldev -c 0xf -a <PCI_ID> -- \
           --test=inference_interleave \
           --filelist model_A.bin,input_A.bin,output_A.bin,reference_A.bin \
           --filelist model_B.bin,input_B.bin,output_B.bin,reference_B.bin \
           --tolerance 2.0


Debug mode
----------

ML tests can be executed in debug mode by enabling the option ``--debug``.
Execution of tests in debug mode enables additional prints.

When a validation failure is observed, the output from that buffer is written to the disk,
using the same filename convention as when the test passes.
Additionally, the index of the buffer is appended to the filenames.
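
The validation whose failure triggers this dump follows the rules given for the inference tests:
a CRC32 comparison when the tolerance is ``0``,
and an element wise comparison otherwise.
The Python sketch below only illustrates those two rules and is not the application code.
The function name ``validate_output`` is hypothetical,
the buffers are assumed to hold little-endian 32-bit floats,
and the deviation is assumed to be measured relative to the reference element;
the application may define the tolerance range differently.

.. code-block:: python

   import struct
   import zlib


   def validate_output(output: bytes, reference: bytes, tolerance: float) -> bool:
       """Return True when output matches reference; tolerance is a percentage."""
       if tolerance == 0:
           # Zero tolerance: compare the CRC32 hashes of the raw buffers.
           return zlib.crc32(output) == zlib.crc32(reference)
       if len(output) != len(reference) or len(output) % 4:
           return False
       count = len(output) // 4
       # ASSUMPTION: both buffers contain little-endian float32 elements.
       out = struct.unpack(f"<{count}f", output)
       ref = struct.unpack(f"<{count}f", reference)
       # ASSUMPTION: the allowed deviation is relative to the reference value.
       # Every element must be within range for the validation to pass.
       return all(abs(o - r) <= abs(r) * tolerance / 100.0
                  for o, r in zip(out, ref))


   if __name__ == "__main__":
       ref = struct.pack("<3f", 1.0, 2.0, 4.0)
       out = struct.pack("<3f", 1.005, 2.0, 4.0)
       print(validate_output(out, ref, 0))    # hash comparison
       print(validate_output(out, ref, 1.0))  # element wise, 1% tolerance

With a zero tolerance any difference in the buffers fails the check,
which is why a non-zero ``--tolerance`` is the usual choice
when small numerical differences from the reference are expected.
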