Two NT100E3-1-PTP SmartNICs can be bonded through an expansion bus that allows streams to be redirected to the peer SmartNIC, ensuring that each stream is delivered to the local cache of the NUMA node that handles it.
QPI bypass example
The process is described in this table:
Stage | Description |
---|---|
1 | The bonding is a static configuration set up in ntservice.ini (see below). |
2 | Using NTPL commands (see below), the application sets up affinity between streams and NUMA nodes, sets up distribution and creates the streams. |
3 | The application creates a thread for each stream on the NUMA nodes and sets the appropriate affinity. |
4 | Each thread opens its stream using the packet interface. |
5 | When a packet is received, all time-stamping, packet processing and statistics are performed by the receiving SmartNIC. |
6 | If the packet belongs to a stream that should be handled by the remote NUMA node, the packet is transferred over the expansion bus to the other SmartNIC. |
7 | The packet is DMAed over the PCIe interface to the local cache of the NUMA node, just as if it had been received on this SmartNIC. |
8 | The packet is received by the thread. |
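
Stages 3 and 4 can be sketched roughly as follows. This is a minimal illustration in Python, not the SmartNIC API. The split of streams 0..15 to NUMA node 0 and 16..31 to NUMA node 1 mirrors the NTPL example in this section. The helper names (`parse_cpulist`, `numa_node_for_stream`, `cpus_of_node`, `worker`, `start_workers`) are hypothetical, and the Linux sysfs file `/sys/devices/system/node/node<N>/cpulist` is assumed as the source of the per-node CPU list. Opening the stream through the packet interface is left as a placeholder comment.

```python
import os
import threading

STREAMS_PER_NODE = 16  # assumption: 32 streams split evenly over 2 NUMA nodes


def parse_cpulist(text):
    """Parse a Linux cpulist string such as '0-3,8-11' into sorted CPU ids."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def numa_node_for_stream(stream_id):
    """Streams 0..15 map to NUMA node 0, streams 16..31 to NUMA node 1."""
    if not 0 <= stream_id < 2 * STREAMS_PER_NODE:
        raise ValueError(f"stream id out of range: {stream_id}")
    return stream_id // STREAMS_PER_NODE


def cpus_of_node(node):
    """Read the CPUs that belong to a NUMA node from sysfs (Linux only)."""
    with open(f"/sys/devices/system/node/node{node}/cpulist") as f:
        return parse_cpulist(f.read())


def worker(stream_id, cpus):
    """Pin the calling thread to node-local CPUs, then service one stream."""
    if hasattr(os, "sched_setaffinity"):
        os.sched_setaffinity(0, cpus)  # 0 = the calling thread on Linux
    # Placeholder: open stream `stream_id` with the packet interface
    # and process its packets here.


def start_workers():
    """Start one pinned thread per stream (stage 3) and return the threads."""
    threads = []
    for stream_id in range(2 * STREAMS_PER_NODE):
        cpus = cpus_of_node(numa_node_for_stream(stream_id))
        thread = threading.Thread(target=worker, args=(stream_id, cpus))
        thread.start()
        threads.append(thread)
    return threads
```

Setting the affinity inside the thread itself, before the stream is opened, is a deliberate choice in this sketch: any memory the thread touches afterwards is then likely to be allocated on its own NUMA node.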
ntservice.ini

    [Adapter0]
    AdapterType = NT100E3_1_PTP
    BondingType = Peer
    RemoteAdapter = 1                     # The bonded peer
    NumaNode = 0                          # Local NUMA node for this PCIe slot
    BusId = 0000:06:00.0
    TimeSyncReferencePriority = OSTime    # Or another reference clock source
    TimeSyncConnectorInt1 = NttsOut

    [Adapter1]
    AdapterType = NT100E3_1_PTP
    BondingType = Peer
    RemoteAdapter = 0                     # The bonded peer
    NumaNode = 1                          # Local NUMA node for this PCIe slot
    BusId = 0000:84:00.0
    TimeSyncReferencePriority = Int1      # Synced to Adapter0
    TimeSyncConnectorInt1 = NttsIn

    [NT100E3_1_PTP]
    HostBufferSegmentSizeRx = dynamic     # Peer bonding requires dynamic segment size
    HostBuffersRx = [32,16,$Local$]       # Host buffers allocated on the local NUMA node

- NumaNode = ...
  Specifies the NUMA node local to each PCIe slot. It can be identified from hardware documentation or by using a utility such as lstopo from the Portable Hardware Locality (hwloc) package.
- RemoteAdapter = ...
  Specifies the bonded peer SmartNIC.
- TimeSyncReferencePriority = ... and TimeSyncConnectorInt1 = ...
  Set up time synchronization between the SmartNICs to ensure proper merging of traffic.
- HostBuffersRx = [32,16,$Local$]
  Allocates 32 host buffers of 16 MB each on the NUMA node local to each SmartNIC. A small host buffer size helps avoid L3 cache eviction.
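
The bonding settings above must agree with each other: each adapter has to name the other as its peer, and the two adapters should sit on different NUMA nodes. The following sketch checks this, under the assumption that the file can be read as plain INI by Python's `configparser` (the real ntservice.ini may contain constructs this does not handle). The functions `load`, `check_peer_bonding` and `rx_buffer_megabytes` are hypothetical helpers, and the second `HostBuffersRx` field is assumed to be the buffer size in MB.

```python
import configparser

# Trimmed copy of the example configuration, used as test input.
NTSERVICE_INI = """
[Adapter0]
AdapterType = NT100E3_1_PTP
BondingType = Peer
RemoteAdapter = 1   # The bonded peer
NumaNode = 0

[Adapter1]
AdapterType = NT100E3_1_PTP
BondingType = Peer
RemoteAdapter = 0   # The bonded peer
NumaNode = 1

[NT100E3_1_PTP]
HostBufferSegmentSizeRx = dynamic
HostBuffersRx = [32,16,$Local$]
"""


def load(text):
    """Parse INI text, stripping '#' inline comments and keeping key case."""
    parser = configparser.ConfigParser(
        interpolation=None, inline_comment_prefixes=("#",)
    )
    parser.optionxform = str  # do not lowercase option names
    parser.read_string(text)
    return parser


def check_peer_bonding(cfg):
    """Return a list of problems; the list is empty if bonding is consistent."""
    prefix = "Adapter"
    adapters = {
        int(name[len(prefix):]): cfg[name]
        for name in cfg.sections()
        if name.startswith(prefix) and name[len(prefix):].isdigit()
    }
    problems = []
    for index, section in adapters.items():
        if section.get("BondingType") != "Peer":
            continue
        peer = section.getint("RemoteAdapter")
        if peer not in adapters:
            problems.append(f"Adapter{index}: peer {peer} is not defined")
        elif adapters[peer].getint("RemoteAdapter") != index:
            problems.append(f"Adapter{index}: peer {peer} does not point back")
        elif adapters[peer].get("NumaNode") == section.get("NumaNode"):
            problems.append(f"Adapter{index}: peer {peer} is on the same NUMA node")
    return problems


def rx_buffer_megabytes(cfg, adapter_type):
    """Total RX host buffer memory per SmartNIC: buffer count times size in MB."""
    count, size_mb, _node = cfg[adapter_type]["HostBuffersRx"].strip("[]").split(",")
    return int(count) * int(size_mb)
```

For the example configuration, `check_peer_bonding(load(NTSERVICE_INI))` reports no problems, and `rx_buffer_megabytes` gives 32 × 16 = 512 MB of RX host buffer memory per SmartNIC.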
NTPL

    Setup[NUMANode=0] = StreamId==(0..15)
    Setup[NUMANode=1] = StreamId==(16..31)
    HashMode = Hash5TupleSorted
    Assign[StreamId=(0..31)] = All

- Setup[NUMANode=...] = ...
  Sets up the affinity between streams and NUMA nodes: streams 0 to 15 are bound to NUMA node 0 and streams 16 to 31 to NUMA node 1.
- HashMode = Hash5TupleSorted
  The hash mode Hash5TupleSorted ensures that upstream and downstream traffic belonging to the same flow is merged into the same stream.
- Assign[StreamId=(0..31)] = All
  Assigns all traffic to streams 0 to 31, creating 32 streams.
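
The effect of a sorted 5-tuple hash can be illustrated with a small sketch. This is not the hash algorithm used by the SmartNIC: CRC32 is a stand-in, and `sorted_5tuple_stream` is a hypothetical function. It only shows the principle that sorting the two endpoints before hashing makes both directions of a flow land in the same stream, and therefore on the same NUMA node.

```python
import zlib

NUM_STREAMS = 32        # streams 0..31, as in the Assign statement
STREAMS_PER_NODE = 16   # streams 0..15 on node 0, 16..31 on node 1


def sorted_5tuple_stream(src_ip, src_port, dst_ip, dst_port, protocol):
    """Map a 5-tuple to a stream id, independent of packet direction."""
    # Order the two endpoints so that A->B and B->A build the same key.
    low, high = sorted([(src_ip, src_port), (dst_ip, dst_port)])
    key = f"{low[0]}:{low[1]}|{high[0]}:{high[1]}|{protocol}".encode()
    # CRC32 is only an illustrative hash, not the hardware's algorithm.
    return zlib.crc32(key) % NUM_STREAMS


# Both directions of one TCP connection end up in the same stream.
upstream = sorted_5tuple_stream("10.0.0.1", 40000, "192.0.2.7", 443, 6)
downstream = sorted_5tuple_stream("192.0.2.7", 443, "10.0.0.1", 40000, 6)
assert upstream == downstream

# The stream id in turn determines the NUMA node that handles the flow.
numa_node = upstream // STREAMS_PER_NODE
```

Without the sorting step, the two directions would generally hash to different streams and could be processed on different NUMA nodes, which is what the bonded setup is meant to avoid.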