Configure Napatech SmartNICs for optimal performance of TRex.
About this task
Procedure
-
Modify the /opt/napatech3/config/ntservice.ini file. The Napatech
driver uses host buffers to receive and transmit data. The Napatech driver is configured
using the ntservice.ini file. The following shows the default host
buffer configuration.
NumaNode = -1
HostBuffersRx = [4,16,-1]
HostBuffersTx = [4,16,-1]
Four 16 MB host buffers are configured for both RX and TX on the NUMA node taken from the NumaNode value. NumaNode = -1 means that the driver attempts to determine the NUMA node where the SmartNIC is located. The number of host buffers must be equal to or larger than the number of CPU cores/threads used by TRex. In the following example, the number of host buffers for RX and TX is set to 32.
HostBuffersRx = [32,16,-1]
HostBuffersTx = [32,16,-1]
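The sizing rule above can be checked with a short script. The following is a minimal sketch and not part of the Napatech tooling: the function names are hypothetical, and it assumes each setting holds a single [count, size in MB, NUMA node] triplet as in the examples above.

```python
import re


def parse_host_buffers(value):
    """Parse one '[count,size_mb,numa_node]' triplet, e.g. '[32,16,-1]'."""
    pattern = r"\s*\[\s*(-?\d+)\s*,\s*(-?\d+)\s*,\s*(-?\d+)\s*\]\s*"
    match = re.fullmatch(pattern, value)
    if match is None:
        raise ValueError(f"unexpected host buffer setting: {value!r}")
    count, size_mb, numa_node = (int(group) for group in match.groups())
    return count, size_mb, numa_node


def check_host_buffers(value, trex_threads):
    """Return the total buffer memory in MB for one direction.

    Raises ValueError if there are fewer host buffers than TRex threads.
    """
    count, size_mb, _numa_node = parse_host_buffers(value)
    if count < trex_threads:
        raise ValueError(
            f"{count} host buffers configured, but TRex uses {trex_threads} threads"
        )
    return count * size_mb


# 32 buffers of 16 MB each, checked against 12 TRex threads.
print(check_host_buffers("[32,16,-1]", trex_threads=12), "MB per direction")
```

With the values from the example, 32 host buffers of 16 MB each reserve 512 MB for RX and another 512 MB for TX.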
In addition, apply the following changes to the /opt/napatech3/config/ntservice.ini file.
[System]
...
# Set statistics interval 1 milliseconds.
StatInterval=1
...
[Adapter0]
...
# All pending TX packets are canceled when TX streams are closed.
# The port mask value is a bit mask.
# 1 for port 0, 3 for port 0 1, F for port 0 1 2 3.
CancelTxOnCloseMask = 3
...
# Set host buffer refresh interval to 50 microseconds.
HostBufferRefreshIntervalTx = 50
...
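The CancelTxOnCloseMask value is a bit mask in which bit n selects port n. The following minimal sketch derives the mask from a list of port numbers. The helper name is hypothetical, and it assumes the value is written as hexadecimal digits without a prefix, as in the comment above.

```python
def port_mask(ports):
    """Return the hexadecimal bit mask with bit n set for each port n."""
    mask = 0
    for port in ports:
        if port < 0:
            raise ValueError("port numbers must be non-negative")
        mask |= 1 << port  # set the bit that corresponds to this port
    return format(mask, "X")


print(port_mask([0]))           # 1
print(port_mask([0, 1]))        # 3
print(port_mask([0, 1, 2, 3]))  # F
```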
See DN-0449 for more information about the configuration parameters in the /opt/napatech3/config/ntservice.ini file.
-
Stop and restart ntservice after changes are made in the
ntservice.ini file as shown in the following example.
/opt/napatech3/bin/ntstop.sh
/opt/napatech3/bin/ntstart.sh
-
Adjust the NUMA node and the CPU cores in the /etc/trex_cfg.yaml
file according to the target system setup. In the following configuration example, NUMA
node is set to 0, and 18 CPU cores on NUMA node 0 are specified.
### Config file generated by dpdk_setup_ports.py ###

- version: 2
  interfaces: ['03:00.0/0', '03:00.0/1']
  port_info:
      - dest_mac: 00:0d:e9:07:9d:18 # MAC OF LOOPBACK TO IT'S DUAL INTERFACE
        src_mac:  00:0d:e9:07:9d:17
      - dest_mac: 00:0d:e9:07:9d:17 # MAC OF LOOPBACK TO IT'S DUAL INTERFACE
        src_mac:  00:0d:e9:07:9d:18

  platform:
      master_thread_id: 0
      latency_thread_id: 38
      dual_if:
        - socket: 0
          threads: [2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36]
Note: Use the lscpu command to determine the mapping of CPUs to NUMA nodes.
# lscpu | grep NUMA
NUMA node(s):          2
NUMA node0 CPU(s):     0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38
NUMA node1 CPU(s):     1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39
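The thread assignment in the configuration example can be derived from the lscpu output. The following minimal sketch parses the NUMA lines and splits the CPUs of one node the same way the example does: first CPU as master thread, last CPU as latency thread, and the rest as worker threads. The function names are hypothetical, and this split only mirrors the example above; it is not a TRex requirement.

```python
import re


def parse_cpu_list(text):
    """Expand an lscpu CPU list such as '0,2,4' or '0-3,8' into a list of ints."""
    cpus = []
    for part in text.split(","):
        part = part.strip()
        if "-" in part:
            first, last = part.split("-")
            cpus.extend(range(int(first), int(last) + 1))
        elif part:
            cpus.append(int(part))
    return cpus


def numa_map(lscpu_output):
    """Map each NUMA node number to its list of CPU ids."""
    nodes = {}
    for line in lscpu_output.splitlines():
        match = re.match(r"\s*NUMA node(\d+) CPU\(s\):\s*(.+)", line)
        if match:
            nodes[int(match.group(1))] = parse_cpu_list(match.group(2))
    return nodes


def split_threads(cpus):
    """Return (master, latency, workers) following the example layout."""
    if len(cpus) < 3:
        raise ValueError("need at least three CPUs on the NUMA node")
    return cpus[0], cpus[-1], cpus[1:-1]


SAMPLE = """\
NUMA node(s):          2
NUMA node0 CPU(s):     0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38
NUMA node1 CPU(s):     1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39
"""

master, latency, workers = split_threads(numa_map(SAMPLE)[0])
print("master_thread_id:", master)
print("latency_thread_id:", latency)
print("threads:", workers)
```

For the sample output, this yields master thread 0, latency thread 38, and the 18 worker threads listed in the configuration example.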
-
Run TRex. The following command example runs TRex in stateless mode using 12 CPU
cores/threads. Adjust the number of CPU cores to improve performance.
./t-rex-64 -i -c 12
-
Start trex-console in a new terminal.
./trex-console
An output example:
Using 'python' as Python interpreter

Connecting to RPC server on localhost:4501                   [SUCCESS]
Connecting to publisher server on localhost:4500             [SUCCESS]
Acquiring ports [0, 1]:                                      [SUCCESS]

Server Info:

Server version:   v2.71 @ STL
Server mode:      Stateless
Server CPU:       11 x Intel(R) Xeon(R) CPU E5-2660 v3 @ 2.60GHz
Ports count:      2 x 100Gbps @ NT200A02 Network Adapter

-=TRex Console v3.0=-

Type 'help' or '?' for supported actions

trex>
-
Start generating traffic as shown in the following trex-console
command example.
trex>start -p 0 1 -m 99.9% -f stl/bench.py -t size=1514
An output example:
Removing all streams from port(s) [0._]:                     [SUCCESS]
Attaching 1 streams to port(s) [0._]:                        [SUCCESS]
Starting traffic on port(s) [0._]:                           [SUCCESS]
31.18 [ms]
-
The first terminal shows statistics for the traffic that is currently
being generated.
An output example:
-Per port stats table
      ports |               0 |  (link DOWN) 1
 -----------------------------------------------------------------------------------------
   opackets |      2671171951 |               0
     obytes |   4044154333814 |               0
   ipackets |      2671173459 |               0
     ibytes |   4044156660414 |               0
    ierrors |               0 |               0
    oerrors |               0 |               0
      Tx Bw |      98.51 Gbps |        0.00 bps

-Global stats enabled
 Cpu Utilization : 17.3 %  103.4 Gb/core
 Platform_factor : 1.0
 Total-Tx        : 98.51 Gbps
 Total-Rx        : 98.51 Gbps
 Total-PPS       : 8.13 Mpps
 Total-CPS       : 0.00 cps

 Expected-PPS    : 0.00 pps
 Expected-CPS    : 0.00 cps
 Expected-BPS    : 0.00 bps

 Active-flows    : 0  Clients : 0  Socket-util : 0.0000 %
 Open-flows      : 0  Servers : 0  Socket : 0  Socket/Clients : -nan
 Total_queue_full : 1668795616
 drop-rate       : 0.00 bps
 current time    : 817.6 sec
 test duration   : 0.0 sec
-
More statistics can be displayed on trex-console using the
tui command.
trex>tui
An output example:
Global Statistics

connection   : localhost, Port 4501                    total_tx_L2  : 98.53 Gb/sec
version      : STL @ v2.71                             total_tx_L1  : 99.83 Gb/sec
cpu_util.    : 15.29% @ 12 cores (12 per dual port)    total_rx     : 98.53 Gb/sec
rx_cpu_util. : 0.0% / 0 pkt/sec                        total_pps    : 8.11 Mpkt/sec
async_util.  : 0.06% / 1.24 KB/sec                     drop_rate    : 0 b/sec
total_cps.   : 0 cps/sec                               queue_full   : 0 pkts

Port Statistics

   port    |         0         |         1         |      total
-----------+-------------------+-------------------+------------------
owner      |              root |              root |
link       |                UP |              DOWN |
state      |      TRANSMITTING |              IDLE |
speed      |          100 Gb/s |            1 Gb/s |
CPU util.  |            15.29% |              0.0% |
--         |                   |                   |
Tx bps L2  |        98.53 Gbps |             0 bps |        98.53 Gbps
Tx bps L1  |        99.83 Gbps |             0 bps |        99.83 Gbps
Tx pps     |         8.11 Mpps |             0 pps |         8.11 Mpps
Line Util. |           99.83 % |               0 % |
---        |                   |                   |
Rx bps     |        98.53 Gbps |             0 bps |        98.53 Gbps
Rx pps     |         8.11 Mpps |             0 pps |         8.11 Mpps
----       |                   |                   |
opackets   |        4173272792 |                 0 |        4173272792
ipackets   |        4173273453 |                 0 |        4173273453
obytes     |      500186876910 |                 0 |      500186876910
ibytes     |      500187922358 |                 0 |      500187922358
tx-pkts    |        4.17 Gpkts |            0 pkts |        4.17 Gpkts
rx-pkts    |        4.17 Gpkts |            0 pkts |        4.17 Gpkts
tx-bytes   |         500.19 GB |               0 B |         500.19 GB
rx-bytes   |         500.19 GB |               0 B |         500.19 GB
-----      |                   |                   |
oerrors    |                 0 |                 0 |                 0
ierrors    |                 0 |                 0 |                 0

status: -

Press 'ESC' for navigation panel...
status:
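The L1 rate, L2 rate, and packet rate in the tui output are related through the per-frame overhead on the wire. The following minimal sketch reproduces the figures above under two assumptions that are not stated in this document: the size=1514 value excludes the 4-byte frame check sequence (FCS), and the L2 rate includes it. The function name is hypothetical. The 20 bytes of preamble and inter-frame gap are the standard Ethernet values.

```python
PREAMBLE_IFG_BYTES = 20  # 8-byte preamble/SFD plus 12-byte inter-frame gap
FCS_BYTES = 4            # frame check sequence (assumed not part of size=1514)


def rates_from_l1(l1_bps, size_no_fcs):
    """Return (packets per second, L2 bits per second) for a given L1 bit rate."""
    wire_bytes = size_no_fcs + FCS_BYTES + PREAMBLE_IFG_BYTES
    pps = l1_bps / (wire_bytes * 8)
    l2_bps = pps * (size_no_fcs + FCS_BYTES) * 8
    return pps, l2_bps


# Tx bps L1 from the example output: 99.83 Gbps with 1514-byte packets.
pps, l2_bps = rates_from_l1(99.83e9, 1514)
print(f"{pps / 1e6:.2f} Mpps, {l2_bps / 1e9:.2f} Gbps L2")
```

Under these assumptions, the script prints 8.11 Mpps and 98.53 Gbps, which matches the Tx pps and Tx bps L2 values in the example output.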
-
Start the monitoring tool for performance monitoring. See SmartNIC Performance Monitoring.