Receive Side Scaling (RSS)

Link-Inline™ Software Features

Platform: Napatech SmartNIC
Content Type: Feature Description
Capture Software Version: Link-Inline™ Software 3.2

When frames are forwarded to the host, the SmartNIC can distribute them across multiple RX queues using receive side scaling (RSS), which enables traffic to be processed on multiple CPU cores.

Hash algorithms

The SmartNIC can distribute traffic to a maximum of 128 queues based on calculated hash values using a hash algorithm. The NTH10 (Napatech hasher) and Toeplitz algorithms are supported. By default, the NTH10 algorithm is used. The Toeplitz algorithm can be configured with a 40-byte user-defined key. The NTH10 algorithm allows a 4-byte user-defined key. The default key value is set to 0. It is important to configure the key appropriately as the key determines how the hash values are calculated for traffic distribution.
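
To illustrate how the key influences the hash value, the following is a minimal software sketch of the standard Toeplitz hash as commonly used for RSS. It is not a description of the SmartNIC's internal implementation, and NTH10 is not shown. The names ex_toeplitz_hash and ex_sample_key are illustrative, the key is a widely published 40-byte verification key, and the order in which the SmartNIC feeds the selected header fields into the hash is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Widely published 40-byte RSS verification key, used here only as sample data. */
static const uint8_t ex_sample_key[40] = {
    0x6d, 0x5a, 0x56, 0xda, 0x25, 0x5b, 0x0e, 0xc2,
    0x41, 0x67, 0x25, 0x3d, 0x43, 0xa3, 0x8f, 0xb0,
    0xd0, 0xca, 0x2b, 0xcb, 0xae, 0x7b, 0x30, 0xb4,
    0x77, 0xcb, 0x2d, 0xa3, 0x80, 0x30, 0xf2, 0x0c,
    0x6a, 0x42, 0xb7, 0x3b, 0xbe, 0xac, 0x01, 0xfa,
};

/*
 * Standard Toeplitz hash: for every set bit of the input (most significant
 * bit first), XOR the current left-most 32 bits of the key into the result,
 * then shift the key left by one bit. Key bits beyond key_len read as zero.
 */
static uint32_t ex_toeplitz_hash(const uint8_t *key, size_t key_len,
                                 const uint8_t *data, size_t data_len)
{
    uint32_t hash = 0;
    uint32_t window = 0;   /* left-most 32 bits of the shifted key */
    size_t next_bit = 32;  /* index of the next key bit to shift in */
    size_t i;
    int b;

    assert(key != NULL);
    assert(data != NULL || data_len == 0);

    for (i = 0; i < 4; i++)
        window = (window << 8) | (i < key_len ? key[i] : 0u);

    for (i = 0; i < data_len; i++) {
        for (b = 7; b >= 0; b--) {
            uint32_t in = 0;

            if (data[i] & (1u << b))
                hash ^= window;
            if (next_bit < key_len * 8)
                in = (key[next_bit / 8] >> (7 - (next_bit % 8))) & 1u;
            window = (window << 1) | in;
            next_bit++;
        }
    }
    return hash;
}
```

In this sketch the input is the concatenation of the selected fields in network byte order, for example source address, destination address, source port and destination port. Changing any key byte that overlaps the input changes the resulting distribution, which is why an all-zero key yields a hash of 0 for every frame.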

Header fields

A maximum of 16 different RSS configurations is supported. This means that 16 different combinations of header fields can be configured, including the one RSS configuration used on ports (physical and virtual). By default, the 5-tuple hash mode on the outer layer is configured on all ports, which uses the contents of the following header fields:
  • 32-bit IPv4 / 128-bit IPv6 source address
  • 32-bit IPv4 / 128-bit IPv6 destination address
  • 16-bit UDP, TCP or SCTP source port number
  • 16-bit UDP, TCP or SCTP destination port number
  • 8-bit IPv4/IPv6 protocol number / next header
The following table shows the supported header fields.
DPDK RSS flag                   Header fields
RTE_ETH_RSS_IPV4                IPv4 source/destination addresses.
RTE_ETH_RSS_FRAG_IPV4           IPv4 source/destination addresses and identification of the group of fragments.
RTE_ETH_RSS_NONFRAG_IPV4_TCP    IPv4 source/destination addresses and IP protocol.
RTE_ETH_RSS_NONFRAG_IPV4_UDP    IPv4 source/destination addresses and IP protocol.
RTE_ETH_RSS_NONFRAG_IPV4_SCTP   IPv4 source/destination addresses and IP protocol.
RTE_ETH_RSS_NONFRAG_IPV4_OTHER  IPv4 source/destination addresses and IP protocol.
RTE_ETH_RSS_IPV6                IPv6 source/destination addresses.
RTE_ETH_RSS_FRAG_IPV6           IPv6 source/destination addresses and identification of the group of fragments.
RTE_ETH_RSS_NONFRAG_IPV6_TCP    IPv6 source/destination addresses and IP protocol.
RTE_ETH_RSS_NONFRAG_IPV6_UDP    IPv6 source/destination addresses and IP protocol.
RTE_ETH_RSS_NONFRAG_IPV6_SCTP   IPv6 source/destination addresses and IP protocol.
RTE_ETH_RSS_NONFRAG_IPV6_OTHER  IPv6 source/destination addresses and IP protocol.
RTE_ETH_RSS_IPV6_EX             Same as RTE_ETH_RSS_IPV6.
RTE_ETH_RSS_IPV6_TCP_EX         Same as RTE_ETH_RSS_NONFRAG_IPV6_TCP.
RTE_ETH_RSS_IPV6_UDP_EX         Same as RTE_ETH_RSS_NONFRAG_IPV6_UDP.
RTE_ETH_RSS_PORT                Layer 4 source/destination ports and IP protocol. Used with one of the TCP, UDP or SCTP protocols.
RTE_ETH_RSS_GTPU                GTP TEID taken from the GTPU header on the outermost layer.
RTE_ETH_RSS_ETH                 Source/destination MAC addresses.
RTE_ETH_RSS_S_VLAN              Outermost 802.1Q tag and IP identification.
RTE_ETH_RSS_C_VLAN              Innermost 802.1Q tag and IP identification.
RTE_ETH_RSS_IPV4_CHKSUM         IPv4 checksum.
RTE_ETH_RSS_L4_CHKSUM           Layer 4 checksum. Used with one of the TCP, UDP or SCTP protocols.
RTE_ETH_RSS_L2_SRC_ONLY         Source MAC address.
RTE_ETH_RSS_L2_DST_ONLY         Destination MAC address.
RTE_ETH_RSS_L3_SRC_ONLY         IPv4 or IPv6 source address.
RTE_ETH_RSS_L3_DST_ONLY         IPv4 or IPv6 destination address.
RTE_ETH_RSS_L4_SRC_ONLY         Layer 4 source port and IP protocol.
RTE_ETH_RSS_L4_DST_ONLY         Layer 4 destination port and IP protocol.
RTE_ETH_RSS_LEVEL_OUTERMOST     Outermost layer. Used with other RSS flags.
RTE_ETH_RSS_LEVEL_INNERMOST     Innermost layer. Used with other RSS flags.

Note:
  • By default, RTE_ETH_RSS_LEVEL_OUTERMOST is set.
  • If both RTE_ETH_RSS_LEVEL_OUTERMOST and RTE_ETH_RSS_LEVEL_INNERMOST are configured, RTE_ETH_RSS_LEVEL_INNERMOST is used.
  • If both RTE_ETH_RSS_Lx_SRC_ONLY and RTE_ETH_RSS_Lx_DST_ONLY on the same layer are set, then both flags are ignored.
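
The three rules in the note can be expressed as a small flag-resolution helper. The following is a sketch only: the EX_* constants are illustrative stand-ins with arbitrary bit positions, not the DPDK definitions, and ex_resolve_rss_flags is a hypothetical function. An application would use the RTE_ETH_RSS_* macros from the DPDK headers instead.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative stand-ins for the corresponding RTE_ETH_RSS_* flags. */
#define EX_RSS_L2_SRC_ONLY      (UINT64_C(1) << 0)
#define EX_RSS_L2_DST_ONLY      (UINT64_C(1) << 1)
#define EX_RSS_L3_SRC_ONLY      (UINT64_C(1) << 2)
#define EX_RSS_L3_DST_ONLY      (UINT64_C(1) << 3)
#define EX_RSS_L4_SRC_ONLY      (UINT64_C(1) << 4)
#define EX_RSS_L4_DST_ONLY      (UINT64_C(1) << 5)
#define EX_RSS_LEVEL_OUTERMOST  (UINT64_C(1) << 6)
#define EX_RSS_LEVEL_INNERMOST  (UINT64_C(1) << 7)

/* If both SRC_ONLY and DST_ONLY of one layer are set, both are ignored. */
static uint64_t ex_drop_pair_if_both(uint64_t flags, uint64_t src, uint64_t dst)
{
    const uint64_t pair = src | dst;

    return ((flags & pair) == pair) ? (flags & ~pair) : flags;
}

/* Applies the three rules from the note and returns the effective flags. */
static uint64_t ex_resolve_rss_flags(uint64_t flags)
{
    const uint64_t level_mask = EX_RSS_LEVEL_OUTERMOST | EX_RSS_LEVEL_INNERMOST;
    uint64_t level;

    if ((flags & level_mask) == 0)
        flags |= EX_RSS_LEVEL_OUTERMOST;    /* default level */
    else if (flags & EX_RSS_LEVEL_INNERMOST)
        flags &= ~EX_RSS_LEVEL_OUTERMOST;   /* innermost wins if both are set */

    flags = ex_drop_pair_if_both(flags, EX_RSS_L2_SRC_ONLY, EX_RSS_L2_DST_ONLY);
    flags = ex_drop_pair_if_both(flags, EX_RSS_L3_SRC_ONLY, EX_RSS_L3_DST_ONLY);
    flags = ex_drop_pair_if_both(flags, EX_RSS_L4_SRC_ONLY, EX_RSS_L4_DST_ONLY);

    /* Exactly one level flag remains after resolution. */
    level = flags & level_mask;
    assert(level == EX_RSS_LEVEL_OUTERMOST || level == EX_RSS_LEVEL_INNERMOST);
    return flags;
}
```

For example, passing the layer 3 source-only and destination-only flags together with the layer 4 source-only flag leaves only the layer 4 flag and the default outermost level.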

Optimizing header field selection for hash calculation

Selected header fields for hash calculation are stored in a 320-bit memory (plus an 8-bit memory for the IP protocol) in the SmartNIC. The 320-bit memory consists of two blocks of four words each (128 bits per block) and two blocks of one word each (32 bits per block). If the SmartNIC cannot store the selected fields in these four blocks, the hash configuration fails and an error message is generated.
Note: If the Toeplitz algorithm is used, one word (32 bits) of the 320 bits is not available. Consequently, 288 bits are available for the selected header fields.
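
As a rough pre-check, an application can add up the widths of the fields it intends to select and compare the sum with the available capacity. The sketch below does only that: it checks the total bit count (320 bits, or 288 bits with Toeplitz) and does not model how the SmartNIC packs fields into its four blocks, so a combination that passes here may still be rejected. The function names are hypothetical, and the 8-bit IP protocol is excluded because it is stored separately.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define EX_HASH_STORE_BITS         320u  /* 2 x 128 bits + 2 x 32 bits */
#define EX_TOEPLITZ_RESERVED_BITS   32u  /* one word unavailable with Toeplitz */

/* Number of bits available for selected header fields. */
static unsigned ex_hash_store_capacity(bool toeplitz)
{
    return toeplitz ? EX_HASH_STORE_BITS - EX_TOEPLITZ_RESERVED_BITS
                    : EX_HASH_STORE_BITS;
}

/*
 * Necessary (not sufficient) condition: the summed field widths must not
 * exceed the capacity. Block packing is not modeled in this sketch.
 */
static bool ex_fields_fit(const unsigned *field_bits, size_t n_fields, bool toeplitz)
{
    unsigned total = 0;
    size_t i;

    assert(field_bits != NULL || n_fields == 0);
    for (i = 0; i < n_fields; i++)
        total += field_bits[i];
    return total <= ex_hash_store_capacity(toeplitz);
}
```

For example, the IPv6 addresses and layer 4 ports of a 5-tuple occupy 128 + 128 + 16 + 16 = 288 bits, which matches the capacity available with Toeplitz exactly. Adding a further 32-bit field brings the total to 320 bits, which exceeds the Toeplitz capacity but not the NTH10 capacity.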

Frames must contain the required protocol fields to be considered valid for hash calculation. It is therefore recommended to add a filter that verifies the presence of the fields selected in the RSS configuration. Otherwise, frames may be distributed in unexpected ways.

Port RSS configuration

It is possible to configure the RSS feature on all physical and virtual ports using the DPDK API function rte_eth_dev_rss_hash_update with the rte_eth_rss_conf structure. The rte_eth_rss_conf structure is defined as follows.
struct rte_eth_rss_conf {
    uint8_t                         *rss_key;
    uint8_t                         rss_key_len;
    uint64_t                        rss_hf;
    enum rte_eth_hash_function      algorithm;
};
where:
  • rss_key: Set to either NULL or a pointer to a 4-byte key for NTH10 or a 40-byte key for Toeplitz. A key with a length shorter than 40 bytes is right-padded with zeros.
  • rss_key_len: Specifies the length of rss_key in bytes.
  • rss_hf: Combination of RTE_ETH_RSS_* flags to select header fields for hash calculation. See Supported header fields.
  • algorithm: Set to either RTE_ETH_HASH_FUNCTION_DEFAULT for NTH10 or RTE_ETH_HASH_FUNCTION_TOEPLITZ for Toeplitz.
If the RSS configuration is changed on one port, the change applies to all physical and virtual ports, as a single RSS configuration is shared across all ports.
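
The right-padding of a short Toeplitz key described for rss_key can be pictured as follows. This is a sketch of the stated padding rule only, not the driver's code. The helper name ex_pad_toeplitz_key is hypothetical, and rejecting keys longer than 40 bytes is an assumption, as the behavior for longer keys is not described here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EX_TOEPLITZ_KEY_LEN 40

/*
 * Copies a user key of up to 40 bytes into out and fills the remaining
 * bytes on the right with zeros. Returns 0 on success, or -1 if the key
 * is longer than 40 bytes (an assumption of this sketch).
 */
static int ex_pad_toeplitz_key(const uint8_t *key, size_t key_len,
                               uint8_t out[EX_TOEPLITZ_KEY_LEN])
{
    assert(out != NULL);
    assert(key != NULL || key_len == 0);

    if (key_len > EX_TOEPLITZ_KEY_LEN)
        return -1;

    memset(out, 0, EX_TOEPLITZ_KEY_LEN);
    if (key_len > 0)
        memcpy(out, key, key_len);
    return 0;
}
```

A 4-byte key passed to this helper therefore occupies bytes 0 to 3 of the 40-byte key, and bytes 4 to 39 are zero.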

Flow RSS configuration

The RSS feature can be configured per flow using the DPDK API functions rte_flow_create and rte_flow_actions_update with the rte_flow_action_rss structure.

The rte_flow_action_rss structure is defined as follows.
struct rte_flow_action_rss {
    enum rte_eth_hash_function      func; 
    uint32_t                        level;
    uint64_t                        types; 
    uint32_t                        key_len; 
    uint32_t                        queue_num; 
    const uint8_t                   *key; 
    const uint16_t                  *queue; 
};
where:
  • func: Set to either RTE_ETH_HASH_FUNCTION_DEFAULT for NTH10 or RTE_ETH_HASH_FUNCTION_TOEPLITZ for Toeplitz.
  • level: Specifies the inner layer or the outer layer. Set to 1 for the outermost layer or 2 for the innermost layer. It overrides RTE_ETH_RSS_LEVEL_INNERMOST and RTE_ETH_RSS_LEVEL_OUTERMOST flags from types.
  • types: Combination of RTE_ETH_RSS_* flags to select header fields for hash calculation. See Supported header fields.
  • key_len: Specifies the length of key in bytes.
  • queue_num: Specifies the number of entries in queue.
  • key: Set to either NULL or a pointer to a 4-byte key for NTH10 or a 40-byte key for Toeplitz. A key with a length shorter than 40 bytes is right-padded with zeros.
  • queue: Specifies queues for traffic distribution.

Maximum number of memory maps in DPDK

Even though the SmartNIC supports a maximum of 256 queues (128 RX queues + 128 TX queues), the number of supported queues can be limited by the maximum number of memory maps in DPDK, which is set to 256 by default. Out of 256 memory maps available, 2 (or more) are used by internal processes. This implies that DPDK can create 254 or fewer RX and TX queues using the default configuration. If an application attempts to create more memory maps than allowed, it may generate an error. It is possible to modify the maximum number of memory maps in the lib/eal/linux/eal_vfio.c file of the DPDK package. The following line in the file defines the value:
#define VFIO_MAX_USER_MEM_MAPS 256
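
The effect of this limit on the number of queues can be worked through as simple arithmetic. The sketch below assumes, as the text above implies, that each queue consumes one memory map. The function name is hypothetical, and the number of maps used by internal processes may be higher than 2.

```c
#include <assert.h>

/*
 * Returns how many of the requested queues can be created when
 * internal_maps of the max_mem_maps memory maps are already in use,
 * assuming one memory map per queue.
 */
static unsigned ex_usable_queues(unsigned max_mem_maps, unsigned internal_maps,
                                 unsigned requested)
{
    unsigned budget;

    assert(internal_maps <= max_mem_maps);
    budget = max_mem_maps - internal_maps;
    return requested < budget ? requested : budget;
}
```

With the default of 256 memory maps and 2 maps used internally, a request for all 256 queues is limited to 254. Raising the limit to 512 under the same assumptions would allow all 256 queues.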

RX queues for physical ports

If RX queues are configured only for physical ports, the same number of queues is assigned to each port. For a SmartNIC with two physical ports, a maximum of 64 RX queues per port can be used. The number of queues for physical ports can be configured using the rxqs device argument when starting the DPDK application. See the following DPDK testpmd command example.
./dpdk-testpmd --iova-mode=pa --vfio-vf-token=14d63f20-8445-11ea-8900-1f9ce7d5650d \
-a 0000:65:00.0,rxqs=64 -- -i
Using this command, 64 RX queues are created per port for a SmartNIC with two physical ports.

RX queues for virtual functions

A maximum of 124 RX queues (out of 128 RX queues) are available for virtual functions as 2 queues are used for physical ports, and 2 queues are used for internal processes.

In addition, the maximum number of queues for virtual functions is limited by the maximum number of virtual queue pairs, which is defined in the DPDK drivers/net/virtio/virtio.h file as follows.
#define VIRTIO_MAX_VIRTQUEUE_PAIRS 8
By default, it is set to 8, meaning that DPDK supports a maximum of 8 virtual queue pairs for each virtio device. If you need to change this limit, you can modify the value in the header file to suit your specific requirements. If the limit is adjusted, the new value must be evaluated and tested, as it can impact system performance and resource utilization.

The maximum number of virtual functions is also limited by DPDK RTE_MAX_ETHPORTS, which is set to 32 by default. This means that a maximum of 30 virtual functions can be created, as 2 out of 32 are used for physical ports. For example, if queues are evenly distributed to 30 virtual functions, a maximum of 4 queues can be assigned to each virtual function. See Running a user application with the DPDK virtio PMD in DN-1354.
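
The limits in this section combine as shown in the following sketch, which uses only the default values quoted above. The names are hypothetical, and the calculation assumes an even distribution of RX queues across virtual functions.

```c
#include <assert.h>

#define EX_TOTAL_RX_QUEUES   128u
#define EX_PHYS_PORT_QUEUES    2u  /* RX queues used for physical ports */
#define EX_INTERNAL_QUEUES     2u  /* RX queues used for internal processes */
#define EX_MAX_ETHPORTS       32u  /* default RTE_MAX_ETHPORTS */
#define EX_PHYS_PORTS          2u
#define EX_MAX_VIRTQ_PAIRS     8u  /* default VIRTIO_MAX_VIRTQUEUE_PAIRS */

/* Maximum number of virtual functions with the default port limit. */
static unsigned ex_max_vfs(void)
{
    return EX_MAX_ETHPORTS - EX_PHYS_PORTS;
}

/*
 * RX queues per virtual function for an even distribution, capped by the
 * default number of virtio queue pairs.
 */
static unsigned ex_rx_queues_per_vf(unsigned n_vfs)
{
    const unsigned available =
        EX_TOTAL_RX_QUEUES - EX_PHYS_PORT_QUEUES - EX_INTERNAL_QUEUES;
    unsigned per_vf;

    assert(n_vfs > 0 && n_vfs <= ex_max_vfs());
    per_vf = available / n_vfs;
    return per_vf < EX_MAX_VIRTQ_PAIRS ? per_vf : EX_MAX_VIRTQ_PAIRS;
}
```

With 30 virtual functions, 124 / 30 gives 4 queues per virtual function. With 8 virtual functions, 124 / 8 would give 15, but the default virtio limit caps the result at 8.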