In-Line Functionality

This section does not apply to the Intel® PAC with Intel® Arria® 10 GX FPGA.

In-line features

The in-line functionality encompasses these features:

  • Zero copy transfer from RX to TX
  • Single bit flip to discard or forward frames
  • Low latency from RX to TX fiber

In-line applications

Full throughput in-line functionality with low latency enables applications such as:

  • Intrusion prevention systems (IPS)
  • Boundary advanced threat detection (ATD)
  • Policy enforcement systems
  • Firewalls

In-line overview

For in-line transmission to work, an application must be attached to the declared in-line streams, and this application must release the individual frames for transmission. Reception and transmission can take place on different SmartNICs.

This figure illustrates a setup with two different applications running in in-line mode.

In-line transmission with two applications

Zero-copy transfer from RX to TX

When a stream is configured for in-line operation, the underlying RX host buffer is mapped one-to-one to a TX host buffer. This removes the need for memory copy operations when forwarding frames. Once the RX host buffer has been updated and forwarded by the application via a call to NT_NetRxRelease, the SmartNIC automatically ensures that data is transferred using DMA from the host memory to the SmartNIC for transmission.

In-line configuration

The Setup NTPL command is used for configuring in-line operation. These Setup parameters control the in-line transmission behavior:

  • TxDescriptor: Specifies the descriptor type the incoming frames are prepended with. Three values are available: PCAP, NT and DYN. This field is mandatory for in-line transmission.
  • TxPorts: Specifies the ports that frames can be transmitted on. This field is mandatory for in-line transmission.
  • RxPorts: Specifies the ports of the receiving SmartNIC(s). By default all SmartNIC ports are selected.
  • TxPortPos: Specifies the position of the dynamic transmission bit(s) selecting the port that the frame is transmitted on. This field is mandatory if TxPorts specifies a range.
  • TxIgnorePos: Specifies the position of the dynamic transmission bit that determines whether a frame is to be discarded or transmitted. By default no dynamic transmission bit is specified.
  • RxCRC: Specifies whether incoming frames include FCS. The default value is TRUE.
  • TxMetaData: Specifies whether the RX time stamp in native UNIX nanosecond format is appended to the frame along with a new FCS including the time stamp before in-line transmission. By default no time stamp is appended.
  • UseWL: Specifies whether wire length is used instead of capture length for length calculations in connection with transmission. This applies to descriptor types containing a wire length field, for instance the standard packet descriptor and dynamic packet descriptor 3. Specifying the use of wire length enables the application to reduce the length of the transmitted frames. The default value is FALSE.
    Note: For descriptor types such as dynamic descriptors 1, 2 and 4 that do not contain a wire length field, UseWL must be FALSE.

There are two in-line transmission scenarios: static and dynamic. In a static scenario, frames are transmitted on one specific port only. This scenario is configured by specifying a single transmission port, for instance TxPorts=1. In a dynamic scenario, a number of possible transmission ports are specified (TxPorts), together with the position of one or more bits in each frame (TxPortPos) that select the actual transmission port. Similarly, the position of a single bit in the frame (TxIgnorePos) can be specified to indicate whether the frame is to be transmitted or discarded.

Note: The specified RX packet descriptor must match the TX packet descriptor specified in the Setup command. When TxDescriptor = Dyn is specified in the Setup command, Descriptor = Dyn<x> must be specified in the Assign command. When TxDescriptor = NT or TxDescriptor = PCAP is specified in the Setup command, Descriptor = NT or Descriptor = PCAP, respectively, must be specified in the Assign command, or, if no descriptor is defined here, PacketDescriptor = NT or PacketDescriptor = PCAP, respectively, must be specified in the ntservice.ini file.

Dynamic descriptor 3

Dynamic descriptor 3 is designed for transmission and in-line use cases.

These fields in dynamic descriptor 3 can be used for TxPortPos and TxIgnorePos:

  • uint64_t color_lo:14; // 28
  • uint64_t rxPort:6; // 42
  • uint64_t tsColor:1; // 62
  • uint64_t timestamp; // 64
  • uint64_t color_hi:28; // 128
  • uint16_t offset0:10; // 156
  • uint16_t offset1:10; // 166

The default settings for the dynamic offsets correspond to:

  • offset0: Outer layer 3 header
  • offset1: Outer layer 4 header
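
When choosing a value for TxPortPos or TxIgnorePos, it can help to check which descriptor field a given bit position falls into. The following is a minimal sketch, not part of the Napatech API: the table simply restates the bit offsets and widths from the field list above, and the names dyn3_field_t and dyn3_field_for_bit are hypothetical.

```c
#include <stddef.h>

/* Hypothetical lookup table; offsets and widths are copied from the
 * dynamic descriptor 3 field list in this section. */
typedef struct {
    const char *name;
    unsigned    first_bit;
    unsigned    width;
} dyn3_field_t;

static const dyn3_field_t dyn3_fields[] = {
    { "color_lo",   28, 14 },
    { "rxPort",     42,  6 },
    { "tsColor",    62,  1 },
    { "timestamp",  64, 64 },
    { "color_hi",  128, 28 },
    { "offset0",   156, 10 },
    { "offset1",   166, 10 },
};

/* Returns the name of the listed field that contains bit `pos`,
 * or NULL if the bit is not inside any field in the table. */
static const char *dyn3_field_for_bit(unsigned pos)
{
    size_t n = sizeof dyn3_fields / sizeof dyn3_fields[0];
    for (size_t i = 0; i < n; i++) {
        const dyn3_field_t *f = &dyn3_fields[i];
        if (pos >= f->first_bit && pos < f->first_bit + f->width)
            return f->name;
    }
    return NULL;
}
```

For instance, dyn3_field_for_bit(137) reports color_hi, which is the field used by the dynamic in-line transmission example.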

Static in-line transmission example

This NTPL example shows how to set up in-line transmission to port 1 for all frames received on port 0.

Setup[TxDescriptor = Dyn; TxPorts = 1] = StreamId == 42
Assign[StreamId = 42; Descriptor = Dyn3] = Port == 0

Dynamic in-line transmission example

This NTPL example shows how to set up transmission of frames to ports 0-3. Bits 137 and 138, which are part of the color_hi field in dynamic descriptor 3, select the actual TX port. The application must set these bits when it iterates over the incoming frames and decides on which port each frame is to be transmitted.

Setup[TxDescriptor = Dyn; TxPorts = (0..3); TxPortPos = 137] = StreamId == 42
Assign[StreamId = 42; Descriptor = Dyn3] = Port == (0..3)
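
On the application side, selecting the TX port means writing a 2-bit value at bit position 137 of the descriptor. The sketch below shows one way this could be done on the raw descriptor bytes. It assumes that bit positions count from the least significant bit of the first descriptor byte (little-endian bit numbering); the helper names descr_write_bits, descr_read_bits and select_tx_port are hypothetical and not part of the Napatech API.

```c
#include <stdint.h>

/* Writes the low `width` bits (width <= 32) of `value` into the descriptor,
 * starting at bit position `pos`.
 * Assumption: bit 0 is the least significant bit of descr[0]. */
static void descr_write_bits(uint8_t *descr, unsigned pos,
                             unsigned width, uint32_t value)
{
    for (unsigned i = 0; i < width; i++) {
        unsigned bit  = pos + i;
        uint8_t  mask = (uint8_t)(1u << (bit % 8));
        if ((value >> i) & 1u)
            descr[bit / 8] |= mask;
        else
            descr[bit / 8] &= (uint8_t)~mask;
    }
}

/* Reads `width` bits (width <= 32) starting at bit position `pos`. */
static uint32_t descr_read_bits(const uint8_t *descr, unsigned pos,
                                unsigned width)
{
    uint32_t value = 0;
    for (unsigned i = 0; i < width; i++) {
        unsigned bit = pos + i;
        if (descr[bit / 8] & (1u << (bit % 8)))
            value |= 1u << i;
    }
    return value;
}

/* Hypothetical helper for the example above: TxPortPos = 137 and
 * four possible ports (0..3), so two bits are written. */
static void select_tx_port(uint8_t *descr, unsigned port)
{
    descr_write_bits(descr, 137, 2, port & 3u);
}
```

In a real application the descriptor pointer would come from NT_NET_GET_PKT_DESCR_PTR, and the port would be selected before the frame is released with NT_NetRxRelease.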

Inserting user data in transmitted frames

Dynamic descriptor 3 (see Packet Descriptors) can be used for inserting user data into transmitted frames. This can be used, for instance, for inserting VLAN IDs into the frames to accelerate redirection and load-balancing. This is illustrated in the net/vlandemo application example (see vlandemo application example), which is included in the package.

This figure shows how to enable redirection of frames to other appliances via VLAN switching.

Redirection of frames to various appliances using VLAN switching

The user data feature can also be used for accelerating multistage application processing, where one application applies metadata, indexing or similar information to the packet for accelerated processing in subsequent stages. Back-to-back alignment of application data and packet data results in efficient CPU cache utilization. This figure shows a situation where the first processing stage runs DPI and applies metadata/indexing for further processing in the second stage, which performs flow record generation.

SmartNIC receives packet, runs DPI / metadata generation at processing stage 1 and flow record generation at processing stage 2

vlandemo application example

This is a code snippet from the net/vlandemo application example, which illustrates the user data feature.

// Add VLAN tag and forward packet
while(running) { // start of packet processing loop

  // 1: Add VLAN tag
  pDyn3 = (NtDyn3Descr_t*)NT_NET_GET_PKT_DESCR_PTR(hNetBuf);
  pb    = (uint8_t*)pDyn3;
  pDyn3->descrLength = (uint64_t)((pDyn3->descrLength - 4)&0x3f);   // start the packet 4 bytes earlier to make room for the 4-byte VLAN tag
  memmove( (pb+pDyn3->descrLength), (pb+pDyn3->descrLength+4), 12); // move the MAC addresses to the new start of the packet (regions overlap, so memmove is used)
  memcpy( (pb+pDyn3->descrLength+12), &vlan_tag, 4);                // insert the VLAN tag after the MAC addresses
  pDyn3->wireLength  = (uint64_t)((pDyn3->wireLength  + 4)&0x3fff); // set the new wire length to include the 4-byte VLAN tag

  // 2: Forward the VLAN-tagged packet
  if((status = NT_NetRxRelease(hNetRx, hNetBuf)) != NT_SUCCESS) {
    // Get the status code as text (error reporting is elided in this excerpt)
    break; // assumption: stop processing on a release error
  }

  // 3: Get the next packet
  while(running) {
    if((status = NT_NetRxGet(hNetRx, &hNetBuf, 1000)) != NT_SUCCESS) {
      // Get the status code as text (error reporting is elided in this excerpt)
      continue; // assumption: try again, for instance after a timeout
    }
    break; // We got a packet
  }

} // end of packet processing loop
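
The pointer arithmetic in the snippet above can be tried out without a SmartNIC. The following is a minimal, self-contained sketch of the same idea on a plain byte buffer: the frame start is moved 4 bytes towards the descriptor, the 12 bytes of MAC addresses are shifted to the new start, and an IEEE 802.1Q tag (TPID 0x8100 followed by the VLAN ID) is written into the gap. The type demo_descr_t and the function insert_vlan_tag are hypothetical stand-ins for the descrLength and wireLength handling and are not part of the Napatech API. memmove is used here because the source and destination ranges overlap.

```c
#include <stdint.h>
#include <string.h>

#define VLAN_TAG_LEN  4u   /* TPID (2 bytes) + TCI (2 bytes) */
#define MAC_ADDRS_LEN 12u  /* destination MAC + source MAC   */

/* Hypothetical stand-in for the two descriptor fields used by the snippet. */
typedef struct {
    unsigned descr_length; /* offset from buffer start to first frame byte */
    unsigned wire_length;  /* length of the frame                          */
} demo_descr_t;

/* Inserts an 802.1Q tag after the MAC addresses by growing the frame
 * 4 bytes towards the start of the buffer.
 * Returns 0 on success, -1 if there is no room or the frame is too short. */
static int insert_vlan_tag(uint8_t *buf, demo_descr_t *d, uint16_t vlan_id)
{
    if (d->descr_length < VLAN_TAG_LEN || d->wire_length < MAC_ADDRS_LEN)
        return -1;

    /* Priority and DEI bits are left at 0; only the 12-bit VLAN ID is set. */
    const uint8_t tag[VLAN_TAG_LEN] = {
        0x81, 0x00,
        (uint8_t)((vlan_id >> 8) & 0x0f),
        (uint8_t)(vlan_id & 0xff)
    };

    d->descr_length -= VLAN_TAG_LEN;               /* new frame start     */
    uint8_t *frame = buf + d->descr_length;

    memmove(frame, frame + VLAN_TAG_LEN, MAC_ADDRS_LEN); /* shift MACs    */
    memcpy(frame + MAC_ADDRS_LEN, tag, VLAN_TAG_LEN);    /* write the tag */
    d->wire_length += VLAN_TAG_LEN;                /* frame grew 4 bytes  */
    return 0;
}
```

With a 26-byte descriptor area and VLAN ID 100, the frame then starts at offset 22 and the tag bytes 0x81 0x00 0x00 0x64 sit directly in front of the original EtherType.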

VLAN ID example

This figure illustrates a situation where a 4-byte VLAN ID is added after the original content of dynamic descriptor 3 (22 bytes).

4-byte VLAN ID is added after the original content of dynamic descriptor 3 (22 bytes)

This NTPL example specifies the descriptor:

Assign[StreamId = 42; Descriptor = Dyn3, Length = 26] = Port == 0

For transmission, the descrLength field in dynamic descriptor 3 must be set to 22 by the application in this example.

Note: It is also possible to shorten the descriptor length in the NTPL command to, for instance, 16, and in that case let the application place the user data inside the original timestamp field.

Discarding frames

Frames can be discarded (based on the analysis performed by the user application) in two different ways:
  • For the standard descriptor and for dynamic descriptor 3, a frame is discarded if the wireLength field is set to 0 as shown in the API example:
    pDyn3->wireLength = 0;
    Note: This requires that the UseWL parameter in the Setup NTPL command is set to TRUE.
  • For all descriptors, a bit in the descriptor can be selected as a discard bit by using the Setup NTPL command as shown in this NTPL example:
    Setup[TxDescriptor=NT; TxIgnorePos=28; TxPorts=1] = StreamId==42
    Assign[StreamId=42] = Port == 0
    In this example the application must set bit 28, which is part of the color_lo field, to 1 to discard the frame. An advantage of using one of the color fields is that the color can be set automatically based on a filter set up by an Assign NTPL command. This method can also be used for backwards compatibility with the standard descriptor as explained in Legacy example.
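
The second method comes down to flipping a single bit in the descriptor. The sketch below illustrates this on the raw descriptor bytes, assuming that the position given in TxIgnorePos counts from the least significant bit of the first descriptor byte. The function name set_tx_ignore is hypothetical and not part of the Napatech API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Sets (discard) or clears (forward) the bit selected by TxIgnorePos.
 * Assumption: bit 0 is the least significant bit of descr[0]. */
static void set_tx_ignore(uint8_t *descr, unsigned tx_ignore_pos, bool discard)
{
    uint8_t mask = (uint8_t)(1u << (tx_ignore_pos % 8));
    if (discard)
        descr[tx_ignore_pos / 8] |= mask;
    else
        descr[tx_ignore_pos / 8] &= (uint8_t)~mask;
}
```

With TxIgnorePos=28 as in the NTPL example, a call such as set_tx_ignore(descr, 28, true) would mark the frame for discarding before it is released, and passing false would let it be forwarded.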

Legacy example

Legacy applications using the standard descriptor and the NT_NET_SET_PKT_TXIGNORE macro can still be used, provided the TxIgnorePos parameter points to the txIgnore field in the standard descriptor (bit 95), as shown in this NTPL example:
Setup[TxDescriptor=NT; TxIgnorePos=95; TxPorts=1] = StreamId == 42
Assign[StreamId=42] = Port == 0
The application can then set the txIgnore field as before, as shown in this API example: