Metrics & Monitoring

SmartReader exposes system, application, and service metrics through a Prometheus-compatible endpoint. You can use it to monitor system health, performance, and operational status in real time, and to feed monitoring dashboards and alerting systems.

Overview

Metrics & Monitoring provides:

  • System Metrics: CPU usage, memory consumption, disk usage, and network statistics

  • Application Metrics: Application uptime, process information, and performance data

  • Service Metrics: MQTT connection status, TCP socket clients, HTTP POST status, and more

  • Prometheus Format: Standard Prometheus text format for easy integration

  • Real-time Updates: System metrics are refreshed every 10 seconds; service metrics are collected each time the endpoint is called

Accessing Metrics

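The metrics endpoint is available at /metrics over HTTPS (port 8443 in the example below) and returns data in Prometheus text format. Requests must include Basic Auth credentials:

# The Authorization value below is base64("admin:admin"); substitute your own credentials.
curl -H "Authorization: Basic YWRtaW46YWRtaW4=" \
  https://<reader-ip>:8443/metrics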
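
The same request can be made from a script. The following is a minimal sketch using only the Python standard library; the function names, the `verify_tls` switch, and the default timeout are assumptions made for illustration and are not part of SmartReader.

```python
import base64
import ssl
import urllib.request


def basic_auth_header(username: str, password: str) -> str:
    """Build the value of an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def fetch_metrics(reader_ip: str, username: str, password: str,
                  port: int = 8443, verify_tls: bool = True,
                  timeout: float = 10.0) -> str:
    """Return the raw Prometheus text served at /metrics."""
    url = f"https://{reader_ip}:{port}/metrics"
    request = urllib.request.Request(
        url, headers={"Authorization": basic_auth_header(username, password)}
    )
    context = ssl.create_default_context()
    if not verify_tls:
        # Assumption: the reader may present a self-signed certificate.
        # Only skip verification on a network you trust.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(request, timeout=timeout, context=context) as response:
        return response.read().decode("utf-8")
```

For example, `fetch_metrics("192.0.2.10", "admin", "admin")` would return the metrics text for a reader at that (placeholder) address.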

Response Format:

# HELP metric_name Auto-generated metric
# TYPE metric_name gauge
metric_name value
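
Because every sample follows the same `name value` shape, the response is easy to parse. The sketch below assumes the samples carry no labels, as in the format shown above; `parse_metrics` is a hypothetical helper name.

```python
from typing import Dict


def parse_metrics(text: str) -> Dict[str, float]:
    """Parse label-free Prometheus text into a {name: value} mapping."""
    samples: Dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        if len(parts) < 2:
            continue
        try:
            samples[parts[0]] = float(parts[1])
        except ValueError:
            # Ignore lines whose value is not numeric.
            continue
    return samples
```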

Available Metrics

System Metrics

These metrics are provided by the MetricsMonitoringService:

| Metric | Description | Unit |
|---|---|---|
| metricsmonitoringservice_system_cpu_usage____ | Current CPU usage percentage | Percentage (0-100) |
| metricsmonitoringservice_system_memory_usage__mb__ | Current memory usage | Megabytes |
| metricsmonitoringservice_system_disk_usage____ | Disk usage percentage | Percentage (0-100) |
| metricsmonitoringservice_system_os_uptime__seconds__ | Operating system uptime | Seconds |
| metricsmonitoringservice_system_application_uptime__seconds__ | Application uptime | Seconds |
| metricsmonitoringservice_system_cpu_temperature____c__ | CPU temperature | Celsius |
| metricsmonitoringservice_system_cpu_max_allowed_temp____c__ | Maximum allowed CPU temperature | Celsius |
| metricsmonitoringservice_system_cpu_max_recorded_temp____c__ | Maximum recorded CPU temperature | Celsius |
| metricsmonitoringservice_system_min_memory_used__mb__ | Minimum memory used since startup | Megabytes |
| metricsmonitoringservice_system_max_memory_used__mb__ | Maximum memory used since startup | Megabytes |
| metricsmonitoringservice_system_min_cpu_used____ | Minimum CPU usage since startup | Percentage |
| metricsmonitoringservice_system_max_cpu_used____ | Maximum CPU usage since startup | Percentage |
| metricsmonitoringservice_system_network_rx_bytes | Network bytes received | Bytes |
| metricsmonitoringservice_system_network_tx_bytes | Network bytes transmitted | Bytes |

Service Metrics

Additional metrics are provided by various services:

MQTT Service Metrics

| Metric | Description | Unit |
|---|---|---|
| mqttservice_mqtt_connected | MQTT connection status | Boolean (0 or 1) |
| mqttservice_messages_pending | Number of pending MQTT messages | Count |

TCP Socket Service Metrics

| Metric | Description | Unit |
|---|---|---|
| tcpsocketservice_socket_server_healthy | TCP socket server health status | Boolean (0 or 1) |
| tcpsocketservice_connected_clients | Number of connected TCP clients | Count |

WebSocket Service Metrics

| Metric | Description | Unit |
|---|---|---|
| websocketservice_websocket_active_clients | Number of active WebSocket clients | Count |

HTTP Event Publisher Metrics

| Metric | Description | Unit |
|---|---|---|
| httpeventpublisherservice_last_http_post_status | Last HTTP POST response status code | HTTP status code |
| httpeventpublisherservice_http_post_enabled | HTTP POST enabled status | Boolean (0 or 1) |

gRPC Service Metrics

| Metric | Description | Unit |
|---|---|---|
| grpcservice_grpc_enabled | gRPC enabled status | Boolean (0 or 1) |

Metric Naming Convention

Metric names are generated automatically as <provider name>_<metric key>, with both parts transformed as follows:

  1. Provider Name: The .NET type name of the metric provider (e.g., MetricsMonitoringService)

  2. Original Key: The original metric key from the provider (e.g., System.CPU Usage (%))

  3. Transformation:

    • Converted to lowercase

    • Non-alphanumeric characters replaced with underscores

    • Runs of underscores are kept, not collapsed (e.g., System.CPU Usage (%) becomes system_cpu_usage____)

Example:

  • Provider: MetricsMonitoringService

  • Original Key: System.CPU Usage (%)

  • Final Metric Name: metricsmonitoringservice_system_cpu_usage____
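
The rules above can be sketched in a few lines. This is an illustration that reproduces the documented example, not the actual SmartReader implementation (which is .NET); the function name and the exact regular expression are assumptions.

```python
import re


def prometheus_name(provider: str, key: str) -> str:
    """Derive a metric name from a provider type name and a metric key."""

    def sanitize(part: str) -> str:
        # Lowercase first, then replace every character outside a-z / 0-9
        # with an underscore. Consecutive underscores are left in place.
        return re.sub(r"[^a-z0-9]", "_", part.lower())

    return f"{sanitize(provider)}_{sanitize(key)}"


print(prometheus_name("MetricsMonitoringService", "System.CPU Usage (%)"))
# prints: metricsmonitoringservice_system_cpu_usage____
```

The four trailing underscores come from the space, the opening parenthesis, the percent sign, and the closing parenthesis in the original key.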

Metric Types

All metrics are exported with the Prometheus gauge type:

  • Gauge: A metric that represents a single numerical value that can go up and down

  • Boolean Values: Converted to 0 (false) or 1 (true) for compatibility

Example Metrics Output
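
As a rough illustration of what the endpoint's output looks like, the sketch below renders a few gauges in the response format described earlier. The metric names come from the tables above, but the values are placeholders chosen for illustration, not readings from a device, and `render_gauges` is a hypothetical helper.

```python
from typing import Dict, Union

Number = Union[bool, int, float]


def render_gauges(metrics: Dict[str, Number]) -> str:
    """Render metrics as Prometheus text, one gauge per entry."""
    lines = []
    for name, value in metrics.items():
        # Booleans are exported as 1 (true) or 0 (false).
        if isinstance(value, bool):
            value = int(value)
        lines.append(f"# HELP {name} Auto-generated metric")
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


# Placeholder values for illustration only.
print(render_gauges({
    "metricsmonitoringservice_system_cpu_usage____": 12.5,
    "mqttservice_mqtt_connected": True,
    "tcpsocketservice_connected_clients": 2,
}))
```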

Integration with Prometheus

Prometheus Configuration

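Add a scrape job for the SmartReader metrics endpoint to your prometheus.yml. The job should target <reader-ip>:8443 with scheme https, metrics_path /metrics, and basic_auth credentials.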
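
If you manage several readers, the scrape job can be generated rather than written by hand. The helper below is a sketch: the job name `smartreader`, the function name, and the `skip_tls_verify` option are assumptions, while the emitted keys (`scheme`, `metrics_path`, `scrape_interval`, `basic_auth`, `tls_config`, `static_configs`) are standard Prometheus scrape settings.

```python
import json
from typing import Iterable


def render_scrape_config(targets: Iterable[str], username: str, password: str,
                         scrape_interval: str = "30s",
                         skip_tls_verify: bool = False) -> str:
    """Render a prometheus.yml scrape_configs section for SmartReader targets."""
    quote = json.dumps  # JSON strings are valid double-quoted YAML scalars
    lines = [
        "scrape_configs:",
        "  - job_name: smartreader",
        "    scheme: https",
        "    metrics_path: /metrics",
        f"    scrape_interval: {scrape_interval}",
        "    basic_auth:",
        f"      username: {quote(username)}",
        f"      password: {quote(password)}",
    ]
    if skip_tls_verify:
        # Assumption: only needed when the reader uses a self-signed certificate.
        lines += ["    tls_config:", "      insecure_skip_verify: true"]
    lines.append("    static_configs:")
    lines.append("      - targets:")
    lines += [f"          - {quote(target)}" for target in targets]
    return "\n".join(lines) + "\n"


# 192.0.2.10 is a placeholder address.
print(render_scrape_config(["192.0.2.10:8443"], "admin", "admin"))
```

If prometheus.yml already has a scrape_configs section, merge the generated job entry into it instead of adding a second section.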

Grafana Dashboard

Create a Grafana dashboard to visualize the metrics:

  1. Add Prometheus Data Source: Configure Prometheus as a data source in Grafana

  2. Create Dashboard: Create a new dashboard with panels for:

    • CPU Usage over time

    • Memory Usage over time

    • Network traffic (RX/TX)

    • Application uptime

    • Service health status (MQTT, TCP, HTTP, etc.)

    • Connected clients

Example Query:
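
Since every metric is a gauge, the simplest panel query is the metric name itself, for example metricsmonitoringservice_system_cpu_usage____. As a sketch, the same expression can be checked outside Grafana through the Prometheus HTTP API (`/api/v1/query`); the Prometheus address and the helper names below are assumptions.

```python
import json
import urllib.parse
from typing import List, Tuple


def instant_query_url(prometheus_base: str, expression: str) -> str:
    """Build the URL for an instant query against the Prometheus HTTP API."""
    query = urllib.parse.urlencode({"query": expression})
    return f"{prometheus_base.rstrip('/')}/api/v1/query?{query}"


def extract_values(response_body: str) -> List[Tuple[dict, float]]:
    """Return (labels, value) pairs from an instant-vector query response."""
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    # Each sample's "value" is a [timestamp, "value-as-string"] pair.
    return [(sample["metric"], float(sample["value"][1]))
            for sample in payload["data"]["result"]]


print(instant_query_url("http://localhost:9090",
                        "metricsmonitoringservice_system_cpu_usage____"))
```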

Monitoring Best Practices

  1. Scrape Interval: Set an appropriate scrape interval (e.g., 30 seconds) based on your needs

  2. Alerting Rules: Create Prometheus alerting rules for:

    • High CPU usage (> 80%)

    • High memory usage (> 90% of available memory; the memory gauge is reported in MB, so express the threshold in megabytes)

    • Service disconnections (MQTT, TCP, etc.)

    • Application crashes (uptime resets)

  3. Dashboard Organization: Organize dashboards by:

    • System health (CPU, memory, disk)

    • Service status (MQTT, TCP, HTTP)

    • Network performance

    • Application metrics

  4. Retention: Configure appropriate metric retention in Prometheus

  5. Authentication: Always use authentication when exposing metrics endpoints
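
The alert conditions listed above reduce to simple comparisons between two consecutive scrapes. In production these would normally be written as Prometheus alerting rules; the sketch below only illustrates the logic, and the alert labels and function name are assumptions.

```python
from typing import Dict, List

CPU = "metricsmonitoringservice_system_cpu_usage____"
UPTIME = "metricsmonitoringservice_system_application_uptime__seconds__"
MQTT = "mqttservice_mqtt_connected"


def evaluate_alerts(previous: Dict[str, float], current: Dict[str, float],
                    cpu_threshold: float = 80.0) -> List[str]:
    """Compare two scrapes and return the names of the alerts that fire."""
    alerts = []
    if current.get(CPU, 0.0) > cpu_threshold:
        alerts.append("high_cpu")
    # The connection gauge is 1 when connected and 0 when disconnected.
    if current.get(MQTT) == 0:
        alerts.append("mqtt_disconnected")
    # Uptime only grows while the process lives, so a drop means a restart.
    if UPTIME in previous and UPTIME in current and current[UPTIME] < previous[UPTIME]:
        alerts.append("application_restarted")
    return alerts
```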

Metric Collection

Metrics are collected:

  • Continuously: System metrics are updated every 10 seconds by the MetricsMonitoringService

  • On Demand: Service metrics are collected when the /metrics endpoint is called

  • Freshness: Service metrics reflect the state at request time; system metrics can be up to 10 seconds old

Error Handling

If metric collection fails:

  • The endpoint still returns 200 OK, with the body # Metrics unavailable, so clients should check the body rather than rely on the status code

  • Errors are logged but don't affect application operation

  • Individual metric providers may fail independently without affecting others
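
Because a failed collection still produces a 200 OK response, a client has to look at the body to tell the two cases apart. A minimal sketch, assuming a failed response contains only comment lines; `metrics_available` is a hypothetical helper name.

```python
def metrics_available(body: str) -> bool:
    """Return True if the response contains at least one metric sample."""
    for line in body.splitlines():
        line = line.strip()
        # Comment lines (including "# Metrics unavailable") are not samples.
        if line and not line.startswith("#"):
            return True
    return False
```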

Configuration

Metrics collection is configured in appsettings.json:

  • MetricsChannelCapacity: Maximum number of metrics that can be queued (default: 1000)
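
To check which value is in effect, you can read the setting back from the file. The position of MetricsChannelCapacity inside appsettings.json is not shown here, so this sketch searches the whole document for the key instead of assuming a section; the helper names are hypothetical.

```python
import json
from typing import Any, Optional

DEFAULT_CAPACITY = 1000  # documented default


def find_setting(node: Any, key: str) -> Optional[Any]:
    """Depth-first search for the first occurrence of key in parsed JSON."""
    if isinstance(node, dict):
        if key in node:
            return node[key]
        children = list(node.values())
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = find_setting(child, key)
        if found is not None:
            return found
    return None


def metrics_channel_capacity(appsettings_text: str) -> int:
    """Return the configured capacity, or the default if the key is absent."""
    # Assumption: the file is plain JSON without comments.
    value = find_setting(json.loads(appsettings_text), "MetricsChannelCapacity")
    return DEFAULT_CAPACITY if value is None else int(value)
```

Typical use would be `metrics_channel_capacity(open("appsettings.json").read())`.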

Troubleshooting

No Metrics Available

  1. Check Authentication: Ensure you're providing valid Basic Auth credentials

  2. Check Endpoint: Verify the /metrics endpoint is accessible

  3. Review Logs: Check application logs for metric collection errors

  4. Verify Services: Ensure metric providers are registered and running

Missing Service Metrics

  1. Service Status: Verify the service is enabled and running

  2. Provider Registration: Check that the service implements IMetricProvider

  3. Service Configuration: Ensure the service is properly configured
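
A quick way to see which service metrics are absent is to compare the endpoint output against the names you expect. This is a sketch; `missing_metrics` is a hypothetical helper, and the expected names should be taken from the tables in this document.

```python
from typing import Iterable, List


def missing_metrics(body: str, expected: Iterable[str]) -> List[str]:
    """Return the expected metric names that do not appear in the output."""
    present = set()
    for line in body.splitlines():
        line = line.strip()
        # Only sample lines count; "# HELP" / "# TYPE" lines are skipped.
        if line and not line.startswith("#"):
            present.add(line.split()[0])
    return [name for name in expected if name not in present]
```

For example, if `mqttservice_mqtt_connected` is reported as missing, check that the MQTT service is enabled and running.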

Inconsistent Metric Names

  • Metric names are auto-generated from provider type names and metric keys

  • Names may change if provider types are renamed

  • To keep queries working across renames, match on the metric name with a regular expression, e.g. {__name__=~".*_system_cpu_usage.*"}

API Reference

For complete API documentation, see the REST API documentation.
