In-Line Functionality

Feature Set N-ANL9

Napatech SmartNIC
Content Type
Feature Description
Capture Software Version

In-line features

The in-line functionality encompasses these features:

  • Zero-copy transfer from RX to TX
  • Single bit flip to discard or forward frames
  • Low latency from RX to TX fiber

In-line applications

Full throughput in-line functionality with low latency enables applications such as:

  • Intrusion prevention systems (IPS)
  • Boundary advanced threat detection (ATD)
  • Policy enforcement systems
  • Firewalls

In-line overview

In order for in-line transmission to work, an application needs to be attached to the declared in-line streams. This application must release the individual frames for transmission. Reception and transmission can take place on different accelerators.

This figure illustrates a setup with two different applications.

[Figure: two applications, A and B, attached to NUMA nodes A and B. Each NUMA node comprises CPU cores (CORE0 to COREN-1) with cache, IIO and DDR memory, and runs the 3GD core (kernel), 3GD (user) and application threads (user).]

Zero-copy transfer from RX to TX

When a stream is configured for in-line operation, the underlying RX host buffer is mapped one-to-one to a TX host buffer. This removes the need for memory copy operations when forwarding frames: the NT_NetRxRelease NTAPI function only has to perform simple pointer operations. Once the application has updated the RX host buffer and forwarded it by calling NT_NetRxRelease, the accelerator automatically transfers the data using DMA from the host memory to the accelerator for transmission.

In-line configuration

The Setup NTPL command is used for configuring in-line operation. These Setup parameters control the in-line transmission behavior:

  • TxDescriptor: Specifies the descriptor type the incoming frames are prepended with. Three values are available: PCAP, NT and DYN.
  • TxPorts: Specifies the ports that frames can be transmitted on.
  • RxPorts: Specifies the ports of the receiving accelerator(s). The default setting selects all accelerator ports.
  • TxPortPos: Specifies the position of the dynamic transmission bit(s) selecting the port that the frame is transmitted on.
  • TxIgnorePos: Specifies the position of the dynamic transmission bit that determines whether a frame is to be discarded or transmitted.
  • RxCRC: Specifies whether incoming frames include FCS. The default value is True.

There are two in-line transmission scenarios: static and dynamic. In a static scenario, frames are transmitted on one specific port only. This scenario is configured by specifying a single transmission port, for instance TxPorts=1. In a dynamic scenario, a number of possible transmission ports are specified, together with the position of one or more bits in each frame that select the actual transmission port. Similarly, the position of a single bit in the frame can be specified that indicates whether or not the frame is to be transmitted.

Note: When TxDescriptor = Dyn is specified in the Setup command, the RX packet descriptor must be specified in the Assign command. When TxDescriptor = NT is specified in the Setup command, the RX packet descriptor must be specified in the ntservice.ini file.

Static in-line transmission example

This NTPL example shows how to set up in-line transmission to port 1 for all frames received on port 0.

Setup[TxDescriptor = Dyn; TxPorts = 1] = StreamId == 42
Assign[StreamId = 42] = Port == 0

Dynamic in-line transmission example

This NTPL example demonstrates how to set up transmission of frames to ports 0-3. In this example, bits 137 and 138 in the color_hi field in dynamic descriptor 3 select the actual TX port. The application must set these bits when iterating over the incoming frames and deciding which port each frame is to be transmitted on.

Setup[TxDescriptor = Dyn; TxPorts = (0..3); TxPortPos = 137] = StreamId == 42
Assign[StreamId = 42; Descriptor = Dyn3] = Port == (0..3)
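
As a rough illustration of what "setting bits 137 and 138" could look like in application code, the sketch below writes a 2-bit port number into a raw 22-byte descriptor buffer. The helper names set_bits and set_tx_port are hypothetical and not part of NTAPI, and the sketch assumes that bit n of the descriptor is stored in byte n / 8 at bit n % 8 (LSB-first), as for bit-fields on a little-endian host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of NTAPI): write the low `width` bits of
 * `value` into `buf`, starting at bit position `pos`.
 * Assumption: bit n lives in byte n / 8 at bit n % 8 (LSB-first). */
static void set_bits(uint8_t *buf, size_t buf_len, unsigned pos,
                     unsigned width, uint32_t value)
{
    assert(width <= 32 && (pos + width + 7) / 8 <= buf_len);
    for (unsigned i = 0; i < width; i++) {
        unsigned bit = pos + i;
        uint8_t mask = (uint8_t)(1u << (bit % 8));
        if ((value >> i) & 1u)
            buf[bit / 8] |= mask;
        else
            buf[bit / 8] &= (uint8_t)~mask;
    }
}

/* Select TX port 0..3 through bits 137 and 138 (TxPortPos = 137). */
static void set_tx_port(uint8_t descr[22], unsigned port)
{
    assert(port <= 3);
    set_bits(descr, 22, 137, 2, port);
}
```

Since color_hi starts at bit 128, bits 137 and 138 are bits 9 and 10 of that field, so an application working with the descriptor struct could equally update color_hi directly.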

Dynamic descriptor 3

Dynamic descriptor 3 is designed for transmission and in-line use cases.

These fields in dynamic descriptor 3 can be used for TxPortPos and TxIgnorePos:

  • uint64_t color_lo:14; // 28
  • uint64_t rxPort:6; // 42
  • uint64_t tsColor:1; // 62
  • uint64_t timestamp; // 64
  • uint64_t color_hi:28; // 128
  • uint16_t offset0:10; // 156
  • uint16_t offset1:10; // 166

The default settings for the dynamic offsets correspond to:

  • offset0: outer layer 3 header
  • offset1: outer layer 4 header
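
When choosing values for TxPortPos and TxIgnorePos, it can help to check that the selected bits fall entirely inside one of the fields listed above. The following is a minimal sketch of such a check. The table and the function dyn3_field_for_bits are illustrative constructs, not NTAPI types; only the bit positions and widths are taken from the field list.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative table: first bit and width of each field listed above. */
struct dyn3_field { const char *name; unsigned first_bit; unsigned width; };

static const struct dyn3_field dyn3_fields[] = {
    { "color_lo",   28, 14 },
    { "rxPort",     42,  6 },
    { "tsColor",    62,  1 },
    { "timestamp",  64, 64 },
    { "color_hi",  128, 28 },
    { "offset0",   156, 10 },
    { "offset1",   166, 10 },
};

/* Return the listed field that contains all `width` bits starting at `pos`,
 * or NULL if the range lies outside the listed fields or straddles two. */
static const struct dyn3_field *dyn3_field_for_bits(unsigned pos, unsigned width)
{
    assert(width >= 1);
    for (size_t i = 0; i < sizeof dyn3_fields / sizeof dyn3_fields[0]; i++) {
        const struct dyn3_field *f = &dyn3_fields[i];
        if (pos >= f->first_bit && pos + width <= f->first_bit + f->width)
            return f;
    }
    return NULL;
}
```

For the dynamic example, dyn3_field_for_bits(137, 2) resolves to the color_hi entry, while a 2-bit range starting at bit 155 is rejected because it would cross from color_hi into offset0.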

Inserting user data in transmitted frames

Dynamic descriptor 3 (see Packet Descriptors) can be used for inserting user data into transmitted frames. This can be used, for instance, for inserting VLAN IDs into the frames to accelerate redirection and load-balancing. This is illustrated in the net/vlandemo application example (see vlandemo application example), which is included in the package.

This figure shows how to enable redirection of frames to other appliances via VLAN switching.

[Figure: VLAN switching between the monitored network, a monitoring and distribution appliance, and the monitoring appliances.]

The user data feature can also be used for accelerating multistage application processing, where one application applies metadata, indexing or similar information to the packet for accelerated processing in subsequent stages. Back-to-back alignment of application/packet data results in an efficient CPU cache utilization. This figure shows a situation where the first processing stage runs DPI and applies metadata/indexing for further processing in the second stage, which performs flow record generation.

[Figure: packet descriptor with user data. After packet reception on the Napatech SmartNIC, the buffer holds packet information, unused user data and the Ethernet frame. Processing stage 1 (DPI / metadata generation) writes metadata into the user data; processing stage 2 (flow record generation) then reads the packet information, the metadata and the Ethernet frame.]

vlandemo application example

This is a code snippet from the net/vlandemo application example, which illustrates the user data feature.

// Add VLAN tag and forward packet
while(running) { // start of packet processing loop

  // 1: Add VLAN tag
  pDyn3 = (NtDyn3Descr_t*)NT_NET_GET_PKT_DESCR_PTR(hNetBuf);
  pb    = (uint8_t*)pDyn3;
  pDyn3->descrLength = (uint64_t)((pDyn3->descrLength - 4)&0x3f);   // shrink the descriptor by 4 bytes to make room in front of the packet for the VLAN tag
  memmove((pb+pDyn3->descrLength), (pb+pDyn3->descrLength+4), 12);  // move the 12 MAC address bytes to the new start of the packet (the regions overlap, hence memmove)
  memcpy((pb+pDyn3->descrLength+12), &vlan_tag, 4);                 // insert the VLAN tag after the MAC addresses
  pDyn3->wireLength  = (uint64_t)((pDyn3->wireLength  + 4)&0x3fff); // set the new wire length to include the 4-byte VLAN tag

  // 2: Forward the VLAN-tagged packet
  if((status = NT_NetRxRelease(hNetRx, hNetBuf)) != NT_SUCCESS) {
    // Get the status code as text
    // ... (error handling elided in this excerpt)
  }

  // 3: Get the next packet
  while(running) {
    if((status = NT_NetRxGet(hNetRx, &hNetBuf, 1000)) != NT_SUCCESS) {
      // Get the status code as text
      // ... (handling elided in this excerpt; the break below is reached only on success)
    }
    break; // We got a packet
  }

} // end of packet processing loop

VLAN ID example

This figure illustrates a situation where a 4-byte VLAN ID is added after the original content of descriptor 3 (22 bytes).

[Figure: inserting a VLAN ID. The received packet consists of a 26-byte Dyn3 packet descriptor (packet information followed by 4 bytes of user data) and the Ethernet frame (Ethernet header and payload). In the packet for transmission, the Ethernet header has been moved (copied) into the 4-byte user data area and the VLAN ID inserted after it; the payload is not copied.]

This NTPL example specifies the descriptor:

Assign[StreamId = 42; Descriptor = Dyn3, Length = 26] = Port == 0

For transmission, the descrLength field in dynamic descriptor 3 must be set to 22 by the application in this example.

Note: It is also possible to shorten the descriptor length in the NTPL command to, for instance, 16, and in that case let the application place the user data inside the original timestamp field.
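
The byte movements in this example can be sketched on a plain buffer, independently of NTAPI. The functions make_vlan_tag and insert_vlan_tag below are hypothetical names used for illustration only. The sketch assumes the buffer holds the descriptor immediately followed by the Ethernet frame, with the last 4 descriptor bytes reserved as user data, and it builds a standard IEEE 802.1Q tag (TPID 0x8100).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Build a 4-byte IEEE 802.1Q tag: TPID 0x8100, then PCP, DEI = 0 and VID. */
static void make_vlan_tag(uint8_t tag[4], unsigned pcp, unsigned vid)
{
    assert(pcp <= 7 && vid <= 0x0fff);
    tag[0] = 0x81;
    tag[1] = 0x00;
    tag[2] = (uint8_t)((pcp << 5) | (vid >> 8));
    tag[3] = (uint8_t)(vid & 0xff);
}

/* `buf` holds `descr_len` descriptor bytes followed by the Ethernet frame.
 * Returns the new descriptor length (descr_len - 4); the frame now starts
 * 4 bytes earlier and is 4 bytes longer. */
static size_t insert_vlan_tag(uint8_t *buf, size_t descr_len,
                              const uint8_t tag[4])
{
    assert(descr_len >= 4);
    size_t new_len = descr_len - 4;
    /* Move the 12 MAC address bytes into the user data area.
     * memmove is used because source and destination overlap. */
    memmove(buf + new_len, buf + descr_len, 12);
    /* The tag goes right after the MAC addresses, before the EtherType. */
    memcpy(buf + new_len + 12, tag, 4);
    return new_len;
}
```

With descr_len = 26 the function returns 22, which matches the value the application must write to descrLength. A real application would also have to increase wireLength by 4, as the vlandemo snippet does.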

Discarding frames

Frames can be discarded (based on the analysis performed by the user application) in two different ways:
  • For the standard descriptor and for dynamic descriptor 3, a frame is discarded if the wireLength field is set to 0, as shown in this API example:
    pDyn3->wireLength = 0;
  • For all descriptors, a bit in the descriptor can be selected as a discard bit by using the Setup NTPL command as shown in this NTPL example:
    Setup[TxDescriptor=NT; TxIgnorePos=28; TxPorts=1] = StreamId==42
    Assign[StreamId=42] = Port == 0
    In this example the application must set bit 28, which is part of the color_lo field, to 1 to discard the frame. An advantage of using one of the color fields is that the color can be set automatically based on a filter set up by an Assign NTPL command. This method can also be used for backwards compatibility with the standard descriptor as explained in Legacy example.
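
The second method amounts to setting a single bit in the descriptor. A minimal sketch on a raw descriptor buffer is shown below. The helper set_discard_bit is hypothetical and not part of NTAPI, and it assumes that bit n of the descriptor is stored in byte n / 8 at bit n % 8 (LSB-first), as on a little-endian host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of NTAPI): set or clear the discard bit at
 * bit position `ignore_pos`, the value configured as TxIgnorePos.
 * Assumption: bit n lives in byte n / 8 at bit n % 8 (LSB-first). */
static void set_discard_bit(uint8_t *descr, size_t descr_len,
                            unsigned ignore_pos, int discard)
{
    assert(ignore_pos / 8 < descr_len);
    uint8_t mask = (uint8_t)(1u << (ignore_pos % 8));
    if (discard)
        descr[ignore_pos / 8] |= mask;   /* frame will be discarded */
    else
        descr[ignore_pos / 8] &= (uint8_t)~mask;   /* frame will be transmitted */
}
```

With TxIgnorePos=28, calling set_discard_bit(descr, len, 28, 1) marks the frame for discarding. Leaving the bit cleared lets the frame be transmitted.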

Legacy example

Legacy applications that use the standard descriptor and the NT_NET_SET_PKT_TXIGNORE macro can still be used, provided the TxIgnorePos parameter points to the txIgnore field in the standard descriptor (bit 95), as shown in this NTPL example:
Setup[TxDescriptor=NT; TxIgnorePos=95; TxPorts=1] = StreamId == 42
Assign[StreamId=42] = Port == 0
The application can then set the txIgnore field as before, as shown in this API example: