QPI Bypass

Napatech Link™ Software Features

Feature Description

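Two NT100E3-1-PTP SmartNICs can be bonded through an expansion bus. The bond allows streams to be redirected to the peer SmartNIC, so that each packet is delivered to the cache local to the NUMA node that processes it instead of crossing the QPI link between CPU sockets.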

QPI bypass example

In this example, two bonded NT100E3-1-PTPs are used to merge and distribute up- and downstream traffic in a 100G network.
Note: This example only applies to the NT200C01 solution consisting of two NT100E3-1-PTP SmartNICs.
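
As a rough illustration of what merging up- and downstream traffic means, the following Python sketch combines two timestamp-ordered packet lists into one ordered sequence. The Packet type, its fields and the sample values are hypothetical. The sketch only models the idea in host software and does not describe how the SmartNICs implement merging.

```python
import heapq
from typing import Iterable, Iterator, NamedTuple


class Packet(NamedTuple):
    timestamp_ns: int  # hypothetical timestamp from a clock shared by both SmartNICs
    direction: str     # "up" or "down"
    payload: str


def merge_by_timestamp(upstream: Iterable[Packet],
                       downstream: Iterable[Packet]) -> Iterator[Packet]:
    """Merge two timestamp-ordered feeds into one timestamp-ordered sequence."""
    return heapq.merge(upstream, downstream, key=lambda p: p.timestamp_ns)


# Sample data: each list is already ordered by timestamp.
up = [Packet(100, "up", "SYN"), Packet(300, "up", "ACK")]
down = [Packet(200, "down", "SYN-ACK"), Packet(400, "down", "DATA")]

merged = list(merge_by_timestamp(up, down))
print([p.payload for p in merged])
```

The merge is only meaningful if both feeds are stamped from synchronized clocks, which is why the configuration in this section also sets up time synchronization between the two SmartNICs.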

This table describes the QPI bypass process:

Stage  Description
1      The bonding is set up as a static configuration in ntservice.ini (see ntservice.ini).
2      Using NTPL commands (see NTPL), the application sets up affinity between streams and NUMA nodes, sets up distribution and creates the streams.
3      The application creates a thread for each stream on the NUMA nodes and sets the appropriate affinity.
4      Each thread opens its stream using the packet interface.
5      When a packet is received, all time-stamping, packet processing and statistics are performed by the receiving SmartNIC.
6      If the packet belongs to a stream that should be handled by the remote NUMA node, the packet is transferred over the expansion bus to the other SmartNIC.
7      The SmartNIC local to the target NUMA node transfers the packet by DMA over the PCIe interface to the local cache of that NUMA node, just as if the packet had been received on that SmartNIC.
8      The packet is received by the thread.
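
Stages 5 to 7 can be pictured with a small Python model of the steering decision. This is a sketch under stated assumptions, not Napatech code: the SmartNic class, the delivery_path function and the hop strings are hypothetical, and the stream-to-node split (streams 0..15 on NUMA node 0, streams 16..31 on NUMA node 1) is chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List

# Assumed layout: streams 0..15 belong to NUMA node 0, streams 16..31 to NUMA node 1.
STREAM_TO_NUMA = {sid: (0 if sid < 16 else 1) for sid in range(32)}


@dataclass(frozen=True)
class SmartNic:
    adapter: int    # adapter number
    numa_node: int  # NUMA node local to this SmartNIC's PCIe slot
    peer: int       # adapter number of the bonded peer


def delivery_path(receiver: SmartNic, stream_id: int) -> List[str]:
    """Return the hypothetical hops from the receiving SmartNIC to host memory."""
    # Stage 5: the receiving SmartNIC always does the per-packet work.
    hops = [f"adapter{receiver.adapter}: timestamp, process, count"]
    target = STREAM_TO_NUMA[stream_id]
    if target == receiver.numa_node:
        dma_adapter = receiver.adapter
    else:
        # Stage 6: the stream lives on the remote node, so use the expansion bus.
        hops.append(f"expansion bus -> adapter{receiver.peer}")
        dma_adapter = receiver.peer
    # Stage 7: DMA always happens on the SmartNIC local to the target node.
    hops.append(f"adapter{dma_adapter}: PCIe DMA -> NUMA node {target}")
    return hops


nic0 = SmartNic(adapter=0, numa_node=0, peer=1)
print(delivery_path(nic0, 3))   # local stream: no expansion-bus hop
print(delivery_path(nic0, 20))  # remote stream: forwarded to the peer first
```

In both cases the final DMA is issued by the SmartNIC attached to the node that owns the stream, which is the point of the bypass.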


This ntservice.ini example sets up the bonding for the two SmartNICs:

[Adapter0]
AdapterType = NT100E3_1_PTP
BondingType = Peer
RemoteAdapter = 1                     # The bonded peer
NumaNode = 0                          # Local NUMA node for this PCIe slot
BusId = 0000:06:00.0
TimeSyncReferencePriority = OSTime    # Or another reference clock source
TimeSyncConnectorInt1 = NttsOut

[Adapter1]
AdapterType = NT100E3_1_PTP
BondingType = Peer
RemoteAdapter = 0                     # The bonded peer
NumaNode = 1                          # Local NUMA node for this PCIe slot
BusId = 0000:84:00.0
TimeSyncReferencePriority = Int1      # Synced to Adapter0
TimeSyncConnectorInt1 = NttsIn

# Set the following for each bonded SmartNIC
HostBufferSegmentSizeRx = dynamic     # Peer bonding requires dynamic segment size
HostBuffersRx = [32,16,$Local$]       # Host buffers allocated on the local NUMA node

This table explains the parameters relevant for QPI bypass.

Parameter                  Description
NumaNode                   Specifies the NUMA node local to each PCIe slot. It can be identified from hardware documentation or by using a utility such as lstopo from the Portable Hardware Locality (hwloc) package.
RemoteAdapter              Specifies the bonded peer SmartNIC.
TimeSyncReferencePriority  Sets up time synchronization of the SmartNICs to ensure proper merging of traffic.
HostBuffersRx              Allocates RX host buffers. A small host buffer size helps avoid L3 cache eviction. Example: [32,16,$Local$] allocates 32 host buffers of 16 MB each on the NUMA node local to each SmartNIC.
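
To make the host buffer arithmetic concrete, the following Python sketch splits a single HostBuffersRx entry into its three fields and computes the total memory it requests. The parse_host_buffers helper is hypothetical and handles only one entry of the form shown in this section; it is not part of any Napatech tool.

```python
from typing import Tuple


def parse_host_buffers(spec: str) -> Tuple[int, int, str]:
    """Parse one '[count,size_mb,numa]' entry into (count, size_mb, numa)."""
    count, size_mb, numa = spec.strip().strip("[]").split(",")
    return int(count), int(size_mb), numa.strip()


count, size_mb, numa = parse_host_buffers("[32,16,$Local$]")
total_mb = count * size_mb  # 32 buffers x 16 MB each
print(f"{count} buffers x {size_mb} MB = {total_mb} MB on NUMA node {numa}")
```

With the values from the example, one entry requests 512 MB of RX host buffer memory on the NUMA node it names.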


This NTPL example:
  • Sets up the affinity between streams and NUMA nodes 0 and 1.
  • Selects hash mode Hash5TupleSorted to ensure that up- and downstream traffic for each service is merged into the same stream.
  • Creates 32 streams.
Setup[NUMANode=0] = StreamId==(0..15)
Setup[NUMANode=1] = StreamId==(16..31)
HashMode = Hash5TupleSorted
Assign[StreamId=(0..31)] = All
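
The effect of a sorted 5-tuple hash can be sketched in Python: if the two endpoints are put in a fixed order before hashing, both directions of a connection produce the same key and therefore land in the same stream. This is an illustration only. The sorted_5tuple_stream function is hypothetical, and CRC32 is used as a stand-in hash; the actual hash function used by the SmartNIC is not described here.

```python
import zlib
from ipaddress import ip_address

NUM_STREAMS = 32  # matches the 32 streams created in the NTPL example


def sorted_5tuple_stream(src_ip: str, src_port: int,
                         dst_ip: str, dst_port: int, protocol: int) -> int:
    """Map a 5-tuple to a stream ID, independent of packet direction."""
    a = (int(ip_address(src_ip)), src_port)
    b = (int(ip_address(dst_ip)), dst_port)
    lo, hi = sorted((a, b))  # fixed endpoint order removes the direction
    key = f"{lo[0]}:{lo[1]}|{hi[0]}:{hi[1]}|{protocol}".encode()
    return zlib.crc32(key) % NUM_STREAMS


# The same TCP connection seen in both directions (protocol 6 = TCP).
up = sorted_5tuple_stream("10.0.0.1", 40000, "192.168.1.5", 443, 6)
down = sorted_5tuple_stream("192.168.1.5", 443, "10.0.0.1", 40000, 6)
print(up == down)
```

Without the sorting step, swapping source and destination would generally change the hash, and the two directions of one service could end up in different streams on different NUMA nodes.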