This chapter describes the monitoring capabilities of the SmartNICs.
The parameters described in this chapter can be monitored from the application through the API or by using the monitoring tool (see monitoring in DN-0449).
Monitoring parameters can be divided into optical module parameters and SmartNIC parameters. They consist of both sensor parameters and state parameters.
Optical module parameters
| Module | SmartNICs | Mandatory Parameters | Optional Parameters |
|---|---|---|---|
| QSFP56 | NT400D13 running at 2 × 200 Gbit/s | | |
| QSFP28 | NT400D13 running at 2 × 100 Gbit/s, NT400D13 running at 8 × 25/10 Gbit/s, NT400D11 running at 2 × 100 Gbit/s, NT400D11 running at 8 × 25/10 Gbit/s, NT200A02 running at 2 × 100 Gbit/s, NT200A02 running at 4 × 25/10 Gbit/s, NT200A02 running at 2 × 25/10 Gbit/s | | |
| QSFP+ | NT400D13 running at 2 × 40 Gbit/s, NT400D11 running at 2 × 40 Gbit/s, NT400D11 running at 8 × 25/10 Gbit/s, NT200A02 running at 2 × 40 Gbit/s, NT200A02 running at 4 × 25/10 Gbit/s, NT200A02 running at 2 × 25/10 Gbit/s, NT200A02 running at 8 × 10 Gbit/s | | |
| SFP28 | NT200A02 running at 2 × 25/10 Gbit/s (with a pluggable module adapter (PMA; see https://accessories.napatech.com/) that enables SFP28, SFP+ and SFP modules to be plugged into a QSFP28 port), NT100A01 running at 4 × 25/10 Gbit/s, NT50B01 running at 2 × 25/10 Gbit/s | | |
| SFP+ | NT200A02 running at 2 × 10/1 Gbit/s (with a pluggable module adapter (PMA; see https://accessories.napatech.com/) that enables SFP28, SFP+ and SFP modules to be plugged into a QSFP28 port), NT100A01 running at 4 × 10/1 Gbit/s, NT50B01 running at 2 × 10/1 Gbit/s, NT40A11 running at 4 × 10/1 Gbit/s, NT40A01 running at 4 × 10/1 Gbit/s, NT40E3-4-PTP running at 4 × 10/1 Gbit/s, NT20E3-2-PTP running at 2 × 10/1 Gbit/s | | |
| SFP | NT200A02 running at 2 × 1 Gbit/s (with a pluggable module adapter (PMA; see https://accessories.napatech.com/) that enables SFP28, SFP+ and SFP modules to be plugged into a QSFP28 port), NT100A01 running at 4 × 1 Gbit/s, NT50B01 running at 2 × 1 Gbit/s, NT40A11 running at 4 × 1 Gbit/s, NT40A01 running at 4 × 1 Gbit/s, NT40E3-4-PTP running at 4 × 1 Gbit/s, NT20E3-2-PTP running at 2 × 1 Gbit/s | | |
Monitoring parameters for NT SmartNICs
This table shows the most important SmartNIC parameters that the NT SmartNICs provide monitoring for.
| Parameter | Mode of Operation |
|---|---|
| Link status | Ethernet link status per port |
| PCB temperature | PCB temperature level and alarm |
| FPGA temperature | FPGA temperature level and alarm. Automatic over-temperature shutdown. Note: Three FPGA temperature levels are significant. See FPGA temperature thresholds. Note: When the LogToSystem parameter in the ntservice.ini file (see DN-0449) is set to 1, the warning messages are also logged in the system log. |
| Time synchronization status | Status on SmartNIC time synchronization and loss of synchronization |
| Fan speed | Speed of the cooling fan. Note: An out-of-range fan speed event is written to the hardware log if the fan rotates at a speed outside a specific range. The event can be monitored from the application through the API. |
| Critical temperature | Temperature of critical components |
| Power | Total power consumption |
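
The LogToSystem note in the table above can be checked with a short script. The following is a minimal Python sketch, not part of the Napatech tooling: the function name `log_to_system_enabled`, the `[Logging]` section name in the sample text, and the handling of `#` and `;` inline comments are all assumptions made for illustration. The function scans every section instead of relying on a particular section name. See DN-0449 for the authoritative layout of ntservice.ini.

```python
import configparser


def log_to_system_enabled(ini_text: str) -> bool:
    """Return True if any section of the INI text sets LogToSystem to 1.

    Hypothetical helper: it scans all sections because the section that
    holds LogToSystem is not stated in this chapter.
    """
    # Assumption: trailing "#" or ";" comments may follow a value.
    parser = configparser.ConfigParser(
        inline_comment_prefixes=("#", ";"), strict=False
    )
    parser.read_string(ini_text)
    for section in parser.sections():
        # Option lookup is case-insensitive in configparser by default.
        if parser.has_option(section, "LogToSystem"):
            return parser.get(section, "LogToSystem").strip() == "1"
    return False


# Sample text; the [Logging] section name is an assumption.
SAMPLE = """
[Logging]
LogToSystem = 1
"""

if __name__ == "__main__":
    print(log_to_system_enabled(SAMPLE))
```

To check a real installation, read the contents of ntservice.ini from wherever it is installed and pass the text to the function.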
The following FPGA temperature thresholds apply:
| FPGA Temperature Level (NT400Dxx SmartNICs) | FPGA Temperature Level (Other SmartNICs) | Standard Action |
|---|---|---|
| 95 °C | 90 °C | Message is created in the driver log. A high-temperature event is written to the hardware log and can be monitored through the API. |
| 105 °C | 95 °C | Driver triggers FPGA shutdown. A message is created in the driver log, the event is written to the hardware log, and the system LED flashes red. The SmartNIC remains in this state until the server is power-cycled (a hard reset). Applications must be restarted. The event can be monitored again through the API after reboot. |
| 120 °C | 125 °C | Automatic fail-safe shutdown performed by the FPGA (built-in). The event is written to the hardware log and can be monitored through the API. |
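
A monitoring application may want to map a temperature reading to the level it falls in. The following is a minimal Python sketch of that mapping, using the threshold values from the table above. The names `FpgaThresholds`, `NT400DXX`, `OTHER` and `fpga_action` are hypothetical and not part of the Napatech API. Treating each threshold as inclusive (`>=`) is also an assumption. The temperature reading itself would have to be obtained through the API, which is not shown here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FpgaThresholds:
    """FPGA temperature thresholds in degrees Celsius (from the table)."""
    warning_c: float            # driver log message, hardware log event
    driver_shutdown_c: float    # driver triggers FPGA shutdown
    failsafe_shutdown_c: float  # built-in FPGA fail-safe shutdown


# Values copied from the threshold table; the names are illustrative.
NT400DXX = FpgaThresholds(95.0, 105.0, 120.0)
OTHER = FpgaThresholds(90.0, 95.0, 125.0)


def fpga_action(temp_c: float, thresholds: FpgaThresholds) -> str:
    """Return the most severe level reached by the given temperature.

    Assumption: a level is reached when the temperature is equal to or
    above its threshold.
    """
    if temp_c >= thresholds.failsafe_shutdown_c:
        return "failsafe_shutdown"
    if temp_c >= thresholds.driver_shutdown_c:
        return "driver_shutdown"
    if temp_c >= thresholds.warning_c:
        return "warning"
    return "normal"


if __name__ == "__main__":
    for temp in (80.0, 96.0, 110.0, 126.0):
        print(temp, fpga_action(temp, NT400DXX), fpga_action(temp, OTHER))
```

The same reading can fall in different levels depending on the SmartNIC family. For example, 96 °C is only a warning on an NT400Dxx SmartNIC but reaches the driver shutdown level on the other SmartNICs.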