SmartNICs running images with the socket load balancing feature can be bonded in a master-slave configuration so that all traffic received on the master SmartNIC is replicated to the slave SmartNIC, giving each of the two NUMA nodes in a dual CPU socket server local cache access to the received traffic.
Purpose
The CPU socket load balancing feature can be used to optimize performance on a dual CPU socket server for demanding data processing.
Socket load balancing example
To avoid the large performance penalty of moving data between applications running on different NUMA nodes via QPI, the CPU socket load balancing feature allows traffic received on one SmartNIC (master) to be transferred at full line rate to another SmartNIC (slave) through an expansion bus.
This illustration shows a setup with two NT40A01 SmartNICs running on the 4 × 10/1 Gbit/s SLB image.
- 4×10 Gbit/s traffic received on the master card is replicated to the slave card. This allows load distribution of traffic in up to 128 streams per CPU socket (NUMA node), for a total of 256 streams in a dual CPU socket server.
- Time-stamping and port statistics are done by the master SmartNIC, while frame processing, filtering and stream distribution are done by each SmartNIC. In effect, this works as if an optical splitter had been inserted before each pair of ports, with the advantage that the time stamps of the received frames are guaranteed to be identical on both SmartNICs.
The ports on the slave SmartNIC are disabled and cannot be used.
Configuration
The following settings and constraints apply to the SmartNICs of a bonded pair in ntservice.ini:
- Master or slave.
- NUMA node locality.
- SmartNICs in a pair must be consecutively numbered, with the slave number = master number + 1.
Filtering, distribution of traffic to streams, and setting the affinity of streams to NUMA nodes are done as usual using NTPL.
The traffic received on port 0 on the master SmartNIC is replicated and available as if received on port 0 on the slave SmartNIC. From the host's point of view, since ports are enumerated consecutively in SmartNIC order, port n and port n+4 receive the same traffic.
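The port numbering described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, assuming four ports per SmartNIC as in the example setup; the function names and the PORTS_PER_SMARTNIC constant are hypothetical and not part of any Napatech API.

```python
PORTS_PER_SMARTNIC = 4  # assumption: the 4-port image used in the example setup


def locate_port(host_port: int, ports_per_nic: int = PORTS_PER_SMARTNIC) -> tuple[int, int]:
    """Return (SmartNIC index, local port index) for a host-level port number."""
    if host_port < 0:
        raise ValueError("port numbers are non-negative")
    # Ports are enumerated consecutively in SmartNIC order.
    return divmod(host_port, ports_per_nic)


def mirrored_port(host_port: int, master_nic: int = 0,
                  ports_per_nic: int = PORTS_PER_SMARTNIC) -> int:
    """Return the host port on the slave that sees the same traffic as host_port on the master."""
    nic, local = locate_port(host_port, ports_per_nic)
    if nic != master_nic:
        raise ValueError(f"port {host_port} is not on master SmartNIC {master_nic}")
    # The slave is always numbered master + 1, so its ports follow the master's ports.
    return (master_nic + 1) * ports_per_nic + local


if __name__ == "__main__":
    for port in range(PORTS_PER_SMARTNIC):
        print(f"master port {port} <-> slave port {mirrored_port(port)}")
```

With the default values, host port n on the master maps to host port n+4 on the slave.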
ntservice.ini
[Adapter0]
BondingType = Master
NumaNode = 0 # Local NUMA node for this PCIe slot

[Adapter1]
BondingType = Slave
NumaNode = 1 # Local NUMA node for this PCIe slot
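The numbering rule (slave number = master number + 1) can be checked before the configuration is put into use. The following is a minimal sketch, assuming the relevant part of ntservice.ini can be read by Python's standard configparser module; a real ntservice.ini may contain syntax that configparser does not accept. The check_bonding function and the SAMPLE text are hypothetical helpers for illustration, not Napatech tooling.

```python
import configparser
import re

# Sample text mirroring the ntservice.ini fragment above.
SAMPLE = """\
[Adapter0]
BondingType = Master
NumaNode = 0 # Local NUMA node for this PCIe slot

[Adapter1]
BondingType = Slave
NumaNode = 1 # Local NUMA node for this PCIe slot
"""


def check_bonding(ini_text: str) -> list[str]:
    """Return a list of problems found in the master/slave pairing (empty if none)."""
    parser = configparser.ConfigParser(inline_comment_prefixes=("#",))
    parser.read_string(ini_text)

    # Collect the bonding role of every [AdapterN] section that declares one.
    roles = {}
    for section in parser.sections():
        match = re.fullmatch(r"Adapter(\d+)", section)
        if match and parser.has_option(section, "BondingType"):
            roles[int(match.group(1))] = parser.get(section, "BondingType").strip().lower()

    problems = []
    for index, role in sorted(roles.items()):
        # A master must be followed directly by its slave, and vice versa.
        if role == "master" and roles.get(index + 1) != "slave":
            problems.append(f"Adapter{index} is a master but Adapter{index + 1} is not a slave")
        if role == "slave" and roles.get(index - 1) != "master":
            problems.append(f"Adapter{index} is a slave but Adapter{index - 1} is not a master")
    return problems


if __name__ == "__main__":
    for problem in check_bonding(SAMPLE) or ["pairing looks consistent"]:
        print(problem)
```

The sketch only checks the pairing order; it does not verify that the NumaNode values match the actual PCIe slot locality.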