Monitoring Capabilities

Feature Set N-ANL9

Product line: Napatech SmartNIC
Category: Feature Description

In this chapter

This chapter describes the monitoring capabilities of the accelerators. The parameters described in this chapter can be monitored from an application through the API or with the monitoring tool (see DN-0449).

Monitoring parameters can be divided into optical module parameters and accelerator parameters. They consist of both sensor parameters and state parameters.

Note: The sensor parameter values are updated every 5 seconds. This means that a value received by a user application can be up to 5 seconds old and might therefore not reflect the current situation.
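The effect of the 5-second update interval on an application can be sketched as follows. This is a minimal illustration only: the SensorSample type and the helper functions are hypothetical names introduced here and are not part of the Napatech API. The sketch assumes that the application records a monotonic timestamp each time it reads a value.

```python
from dataclasses import dataclass

# Update interval of the sensor parameter values, as stated in the note above.
SENSOR_UPDATE_PERIOD_S = 5.0


@dataclass
class SensorSample:
    """Hypothetical container: a sensor value plus the time the application read it."""
    name: str
    value: float
    read_at: float  # monotonic timestamp (seconds) taken when the value was read


def worst_case_age(sample: SensorSample, now: float) -> float:
    """Return an upper bound on the age of the underlying measurement at `now`.

    The value may already have been up to one update period old when it was
    read, so that period is added to the time elapsed since the read.
    """
    return (now - sample.read_at) + SENSOR_UPDATE_PERIOD_S


def is_fresh_enough(sample: SensorSample, now: float, max_age_s: float) -> bool:
    """True if the measurement cannot be older than `max_age_s` seconds."""
    return worst_case_age(sample, now) <= max_age_s
```

An application that needs to react within a given time can use such a bound to decide whether to read the value again before acting on it.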

Optical module parameters

This table shows the sensor parameters available for monitoring for each optical module type. The mandatory parameters are available in all the relevant optical modules, whereas the optional parameters are module-dependent.
Note: QSFP28 modules, CFP4 modules and QSFP+ modules (except for bidirectional modules) have four sensors for some of the parameters, one for each laser. These sensors measure TX bias current, TX power and RX power. Therefore, four values are available for each of these parameters.
Note: Bidirectional QSFP+ modules are not required to have sensors.
Module Accelerators Mandatory Parameters Optional Parameters
QSFP28 NT200A01-2×100/40, NT200A01-2×100
  • QSFP28 temp.
  • Supply voltage
  • TX bias current (x4)
  • TX power (x4)
  • RX power (x4)
CFP4 NT200C01-2×100, NT100E3-1-PTP
  • CFP4 temp.
  • Supply voltage
  • TX bias current (x4)
  • TX power (x4)
  • RX power (x4)
QSFP+ NT200A01-2×100/40, NT200A01-2×40, NT80E3-2-PTP
  • QSFP+ temp.
  • Supply voltage
  • TX bias current (x4)
  • TX power (x4)
  • RX power (x4)
SFP+ NT40E3-4-PTP, NT20E3-2-PTP  
  • SFP+ temp.
  • Supply voltage
  • TX bias current
  • TX power
  • RX power
SFP NT40A01-4×1, NT40E3-4-PTP, NT20E3-2-PTP  
  • SFP temp.
  • Supply voltage
  • TX bias current
  • TX power
  • RX power
For each optical module parameter, the current value and the lowest and highest registered values can be retrieved. For some parameters, the high and low alarm levels and the alarm state are also available.
Note: One QSFP28/CFP4/QSFP+/SFP+/SFP temperature level is significant for the NT accelerators: at 75 °C, a message is created in the driver log and a high-temperature event is written to the hardware log. The event can be monitored from the application through the API.
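The relationship between the current value, the lowest and highest registered values, the alarm levels and the alarm state can be modeled as in the sketch below. It is an illustration under stated assumptions, not the accelerator's implementation: ParameterTracker and its fields are hypothetical names, and treating a value equal to an alarm level as being in alarm is an assumption. On a real system these values are retrieved through the API rather than computed by the application.

```python
from dataclasses import dataclass
from typing import Optional

# Module temperature level named in the note above.
MODULE_HIGH_TEMP_C = 75.0


@dataclass
class ParameterTracker:
    """Hypothetical model of one monitored optical module parameter."""
    low_alarm: Optional[float] = None   # None if the parameter has no low alarm level
    high_alarm: Optional[float] = None  # None if the parameter has no high alarm level
    current: Optional[float] = None
    lowest: Optional[float] = None
    highest: Optional[float] = None

    def update(self, value: float) -> None:
        """Record a new reading and maintain the lowest and highest registered values."""
        self.current = value
        self.lowest = value if self.lowest is None else min(self.lowest, value)
        self.highest = value if self.highest is None else max(self.highest, value)

    def alarm_state(self) -> str:
        """Return 'high', 'low', 'normal', or 'unknown' if nothing was read yet.

        Assumption: reaching an alarm level exactly counts as an alarm.
        """
        if self.current is None:
            return "unknown"
        if self.high_alarm is not None and self.current >= self.high_alarm:
            return "high"
        if self.low_alarm is not None and self.current <= self.low_alarm:
            return "low"
        return "normal"
```

For example, a tracker created with high_alarm=MODULE_HIGH_TEMP_C and fed the readings 60.0, 58.5 and 76.0 reports a lowest value of 58.5, a highest value of 76.0 and the alarm state "high".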

Accelerator parameters

This table shows the most important accelerator parameters that the NT accelerators make available for monitoring.

Parameter | Mode of Operation
Link status | Ethernet link status per port
PCB temperature | PCB temperature level and alarm
FPGA temperature | FPGA temperature level and alarm; automatic over-temperature shutdown

Note: Three FPGA temperature levels are significant:
  • 90 °C – a message is created in the driver log, and a high-temperature event is written to the hardware log. The event can be monitored from the application through the API.
  • 95 °C – the driver triggers the FPGA to shut down. A message is created in the driver log, the event is written to the hardware log, and the system LED flashes red.

    The accelerator remains in this state until the server is powered off and then on again (a hard reset). Any applications that were running must be restarted. The event can be monitored from the application through the API when the server is powered on again.

  • 125 °C – as a safety measure, the FPGA performs a built-in, automatic fail-safe shutdown. The event is written to the hardware log and can be monitored from the application through the API.
Note: When the LogToSystem parameter in the ntservice.ini file (see DN-0449) is set to 1, the warning messages are also logged in the system log.
Time synchronization status | Status of accelerator time synchronization and loss of synchronization
Fan speed | Speed of the cooling fan
Note: An out-of-range fan speed event is written to the hardware log if the fan rotates at a speed outside a specific range. The event can be monitored from the application through the API.

For NT200A01-SCC the low limit is 5264 RPM and the high limit is 7450 RPM.

For NT200C01-SCC, NT100E3-1-PTP and NT80E3-2-PTP the low limit is 4738 RPM and the high limit is 6566 RPM.

For NT40A01-4×1, NT40E3-4-PTP and NT20E3-2-PTP the low limit is 4000 RPM and the high limit is 6000 RPM.

Critical temperature | Temperature of critical components
Power | Total power consumption
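The FPGA temperature levels and the fan speed limits listed in the table can be captured in a small lookup, for example in an application that interprets values it has read through the API. The sketch below is illustrative only: the function names and the level labels are hypothetical, and treating the thresholds and limits as inclusive is an assumption. The numeric values are the ones given in this chapter.

```python
# FPGA temperature levels from this chapter, highest first.
# The labels are descriptive names chosen for this sketch.
FPGA_LEVELS_C = [
    (125.0, "fail-safe shutdown by FPGA"),
    (95.0, "driver-triggered shutdown"),
    (90.0, "high-temperature warning"),
]

# Fan speed limits (low, high) in RPM per accelerator, as listed in this chapter.
FAN_LIMITS_RPM = {
    "NT200A01-SCC": (5264, 7450),
    "NT200C01-SCC": (4738, 6566),
    "NT100E3-1-PTP": (4738, 6566),
    "NT80E3-2-PTP": (4738, 6566),
    "NT40A01-4×1": (4000, 6000),
    "NT40E3-4-PTP": (4000, 6000),
    "NT20E3-2-PTP": (4000, 6000),
}


def fpga_level(temp_c: float) -> str:
    """Map an FPGA temperature to the highest level it has reached.

    Assumption: a temperature equal to a threshold counts as reaching it.
    """
    for threshold, label in FPGA_LEVELS_C:
        if temp_c >= threshold:
            return label
    return "normal"


def fan_speed_in_range(model: str, rpm: int) -> bool:
    """True if `rpm` lies within the listed limits for `model` (limits inclusive)."""
    low, high = FAN_LIMITS_RPM[model]
    return low <= rpm <= high
```

With these definitions, fpga_level(96.0) yields "driver-triggered shutdown", and fan_speed_in_range("NT200A01-SCC", 5000) is False because 5000 RPM is below the low limit of 5264 RPM.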