Standard Platform Alarms

The monitoring system in the Nexus platform includes a predefined set of alarms that detect abnormal operating conditions across edge devices, system modules, and data connectors. These alarms serve as early indicators of potential failures, performance degradation, or configuration errors, enabling operators to identify and address issues proactively.

Overview of Alarms

| Alarm Name | Description |
| --- | --- |
| Data Connector State | Raised when a data connector enters an error state or loses its connection. |
| Data Connector Endpoint State | Indicates that the external system or endpoint associated with the connector is unreachable. |
| Module Disconnected | Raised when a module fails to send a heartbeat or telemetry for more than 30 minutes. |
| Module Configuration Alarm | Raised when configuration settings cannot be validated or applied. |
| Blob Storage Alarm | Indicates issues with local blob storage, such as failed write operations or space limits. |
| Edge Agent Runtime Alarm | Raised when the Edge Agent reports unhealthy status or runtime issues. |
| Edge Hub Runtime Alarm | Raised when the Edge Hub is non-operational or reports errors in message routing. |
| Module Low Memory *) | Raised when a module's available memory falls below a defined threshold. |
| Module High CPU *) | Raised when a module consistently consumes high CPU resources. |
| Device Low Disk Space Available *) | Indicates that the device's available disk space is critically low. |
| Module Communication Queue Size *) | Warns of growing message queues within a module, indicating backpressure or downstream issues. |
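
As an illustration of how a time-based alarm such as Module Disconnected can be evaluated, the following is a minimal sketch, not the platform's actual implementation. Only the 30-minute limit comes from the description above; the function name `is_module_disconnected` and its parameters are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# The 30-minute limit comes from the Module Disconnected description.
# Everything else in this sketch is a hypothetical illustration.
DISCONNECT_TIMEOUT = timedelta(minutes=30)


def is_module_disconnected(last_seen, now, timeout=DISCONNECT_TIMEOUT):
    """Return True when the last heartbeat or telemetry message
    is older than the timeout."""
    return now - last_seen > timeout


# Fixed timestamps keep the example deterministic.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
recent = now - timedelta(minutes=10)  # module reported 10 minutes ago
stale = now - timedelta(minutes=31)   # module silent for 31 minutes

print(is_module_disconnected(recent, now))
print(is_module_disconnected(stale, now))
```

The comparison is strict, so a module that last reported exactly 30 minutes ago is not yet treated as disconnected; this matches the "more than 30 minutes" wording but is otherwise an assumption.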

Some alarms depend on Device Monitoring being active on the device. To enable it, deploy the DeviceMonitoring module, which periodically collects health and performance metrics from both the host system and the deployed modules.

*) Requires that the DeviceMonitoring module is deployed and running on the device.
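
The dependency on the DeviceMonitoring module can be expressed as a simple lookup. The sketch below is illustrative only: the alarm names and the module name are taken from this page, while the function `available_alarms` and the idea of passing in a set of deployed module names are assumptions, not a Nexus API.

```python
# Alarm names as listed on this page, in table order.
ALL_ALARMS = [
    "Data Connector State",
    "Data Connector Endpoint State",
    "Module Disconnected",
    "Module Configuration Alarm",
    "Blob Storage Alarm",
    "Edge Agent Runtime Alarm",
    "Edge Hub Runtime Alarm",
    "Module Low Memory",
    "Module High CPU",
    "Device Low Disk Space Available",
    "Module Communication Queue Size",
]

# Alarms marked with *) on this page.
REQUIRES_DEVICE_MONITORING = {
    "Module Low Memory",
    "Module High CPU",
    "Device Low Disk Space Available",
    "Module Communication Queue Size",
}


def available_alarms(deployed_modules):
    """Hypothetical helper: list the alarms that can be raised on a
    device, given the names of the modules deployed on it."""
    monitoring = "DeviceMonitoring" in deployed_modules
    return [
        alarm
        for alarm in ALL_ALARMS
        if monitoring or alarm not in REQUIRES_DEVICE_MONITORING
    ]


print(len(available_alarms({"MyConnector"})))
print(len(available_alarms({"MyConnector", "DeviceMonitoring"})))
```

A check like this could be useful when reviewing a deployment: if resource alarms are expected but DeviceMonitoring is missing from the module list, those alarms will never fire.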
