TX Architecture

Feature Set: N-ANL9
Napatech SmartNIC
Content Type: Feature Description
Capture Software Version

Transmission features

The transmission functionality encompasses these features:

  • Full line rate host-based transmission
  • Up to 128 host buffers/streams supported per accelerator
  • Traffic merge from TX buffers done in the FPGA

TX architecture overview

This figure illustrates a setup with a TX application.

[Figure: TX application setup. An application thread in user space writes to a network stream through NTAPI (ntservice + libntos); the 3GD driver in kernel space manages the TX host buffer in DDR memory; the Napatech SmartNIC reads the host buffer over PCIe3 via the memory controller and IIO.]

The application writes frames to network streams, which map to TX host buffers. Data is transferred to the NT accelerator via PCIe using DMA for transmission on the specified port.

Transmission scenarios

Frames can be transmitted in different ways:

  • On a single port from a single TX host buffer
  • On a single port from multiple TX host buffers
  • On multiple ports from a single TX host buffer

Typical TX application process

This is a high-level overview of a typical TX application process:

  1. The application opens a TX stream using the NT_NetTxOpen function.
  2. The application obtains a buffer handle using the NT_NetTxGet function.
  3. The application writes packet data to the buffer as required.
  4. The application releases the buffer to the stream for transmission using the NT_NetTxRelease function.
  5. The application repeats Step 2 to Step 4 as required.
  6. When the transmission is complete, the application closes the stream using the NT_NetTxClose function.

Transmission performance

The high-speed transmission functionality supports full line rate transmission with low CPU load for all frame sizes from 64 to 10000 bytes.
Note: Sliced and hard-sliced frames, as well as frames for which the wire length does not match the stored length in the standard packet descriptor, are not transmitted.

Frames can be transmitted at the speed supported by the ports, the PCI interface and the transmission pipe, typically:

  • 200 Gbit/s for NT200C01-2×100
  • 100 Gbit/s for NT100E3-1-PTP
  • 80 Gbit/s for NT200A01-2×100/40 running at 2 × 40 Gbit/s, NT200A01-2×40 and NT80E3-2-PTP
  • 40/4 Gbit/s for NT40E3-4-PTP
  • 20/2 Gbit/s for NT20E3-2-PTP
  • 4 Gbit/s for NT40A01-4×100

TX performance in general (both throughput and latency) strongly depends on:
  • The actual user application
  • The number of host buffers mapped to the individual ports
  • The PCI bus utilization level and hence the PCI bus latency
  • General server performance
Note: The TX host buffer size has an impact on the overall TX throughput. Using host buffers of the same size improves TX performance substantially.
Note: Frames being released are transmitted immediately using a best-effort approach. Transmission based on time stamps is not supported.

Transmission configuration

The port on which a frame is to be transmitted, and whether it is to be transmitted at all, can be configured.

Ethernet CRC

The accelerator generates a new Ethernet CRC for transmitted frames.