TX Architecture

Feature Set: N-ANL9
Napatech SmartNIC
Content Type: Feature Description
Capture Software Version

Transmission features

The transmission functionality encompasses these features:

  • Full line rate host-based transmission
  • Up to 128 host buffers/streams supported per accelerator
  • Traffic merge from TX buffers done in the FPGA

TX architecture overview

This figure illustrates a setup with a TX application.

[Figure: TX application setup. An application thread in user space writes to a network stream through NTAPI (ntservice + libntos); the 3GD driver in kernel space manages the TX host buffer in DDR memory; the Napatech SmartNIC reads the host buffer over PCIe3 via the memory controller and IIO.]

The application writes frames to network streams, which map to TX host buffers. Data is transferred to the NT accelerator via PCIe using DMA for transmission on the specified port.

Transmission scenarios

Frames can be transmitted in different ways:

  • On a single port from a single TX host buffer
  • On a single port from multiple TX host buffers
  • On multiple ports from a single TX host buffer

Typical TX application process

This is a high-level overview of a typical TX application process:

  1. The application opens a TX stream using the NT_NetTxOpen function.
  2. The application obtains a buffer handle using the NT_NetTxGet function.
  3. The application writes packet data to the buffer as required.
  4. The application releases the buffer to the stream for transmission using the NT_NetTxRelease function.
  5. The application repeats Step 2 to Step 4 as required.
  6. When the transmission is complete, the application closes the stream using the NT_NetTxClose function.

Transmission performance

The high-speed transmission functionality supports full line rate transmission with low CPU load for all frame sizes from 64 to 10000 bytes.
Note: Sliced and hard-sliced frames, as well as frames for which the wire length does not match the stored length in the standard packet descriptor, are not transmitted.

Frames can be transmitted at the speed supported by the ports, the PCI interface and the transmission pipe, typically:

  • 200 Gbit/s for NT200C01-2×100
  • 100 Gbit/s for NT100E3-1-PTP
  • 80 Gbit/s for NT200A01-2×100/40 running at 2 × 40 Gbit/s, NT200A01-2×40 and NT80E3-2-PTP
  • 40/4 Gbit/s for NT40E3-4-PTP
  • 20/2 Gbit/s for NT20E3-2-PTP
  • 4 Gbit/s for NT40A01-4×100

TX performance in general (both throughput and latency) strongly depends on:
  • The actual user application
  • The number of host buffers mapped to the individual ports
  • The PCI bus utilization level and hence the PCI bus latency
  • General server performance
Note: The TX host buffer size has an impact on the overall TX throughput. Using host buffers of the same size improves TX performance substantially.
Note: Frames being released are transmitted immediately using a best-effort approach. Transmission based on time stamps is not supported.

Transmission configuration

The port on which a frame is to be transmitted, and whether it is to be transmitted at all, can be configured.

Ethernet CRC

The accelerator generates a new Ethernet CRC for transmitted frames.