GSMonitor

Introduction

GSMonitor is an open‑source system monitoring daemon designed to collect, aggregate, and expose operational metrics from Linux and Unix‑like operating systems. Written primarily in the Go programming language, it targets environments ranging from small edge devices to large cloud‑native deployments. By providing a lightweight, efficient, and modular architecture, GSMonitor aims to reduce the overhead associated with traditional monitoring agents while maintaining compatibility with widely used observability backends such as Prometheus, InfluxDB, and Graphite.

Unlike monolithic monitoring solutions, GSMonitor follows a service‑centric model. It exposes a RESTful HTTP API, a Prometheus metrics endpoint, and an optional WebSocket interface for real‑time streaming of events. Its modular plug‑in system allows operators to enable or disable collectors based on the specific instrumentation required, thereby minimizing the attack surface and resource consumption.

History and Development

Origins

The genesis of GSMonitor can be traced to a 2016 internal project within a mid‑size cloud services provider that required a custom monitoring solution to handle the peculiarities of their virtual machine fleet. The existing off‑the‑shelf agents were deemed too heavy and lacked the flexibility to instrument custom application metrics without substantial code changes.

In 2018, the core developer team open‑sourced the project under the MIT license, inviting contributions from the broader community. Early releases focused on core system metrics (CPU, memory, disk I/O, and network statistics), leveraging the Go standard library’s syscall package and the kernel’s /proc and /sys interfaces to obtain high‑resolution data.

Major Milestones

  • Version 0.1 (2018) – First public release; basic system metrics and HTTP metrics endpoint.
  • Version 1.0 (2019) – Introduction of plug‑in architecture, Prometheus support, and built‑in authentication.
  • Version 1.5 (2020) – WebSocket event stream, Docker container introspection, and TLS termination.
  • Version 2.0 (2022) – Native support for Kubernetes via the Kubernetes API, integration with OpenTelemetry collector, and a new configuration schema using HCL.
  • Version 2.5 (2023) – Advanced alerting framework, plugin registry, and community‑contributed plugins for cloud provider metrics.

Architecture and Design

Core Components

GSMonitor is structured around four primary components: the Collector Engine, the Plug‑in System, the HTTP API Layer, and the Persistence Layer.

The Collector Engine is responsible for scheduling metric collection jobs. It runs each collector in a lightweight goroutine, honoring the user‑defined interval and thresholds for that collector. The Plug‑in System loads collectors dynamically at runtime, so developers can write collectors either as compiled Go plug‑ins or as Lua scripts that interact with the core API.
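The goroutine‑per‑collector scheduling described above can be sketched in a few lines of Go. This is a minimal illustration, not GSMonitor's actual engine: the Collector interface, the Sample type, and the function names are assumptions made for the example.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// Collector is a hypothetical interface; GSMonitor's real plug-in API may differ.
type Collector interface {
	Name() string
	Collect() (map[string]float64, error)
}

// Sample is one set of readings produced by a single collector run.
type Sample struct {
	Collector string
	Values    map[string]float64
}

// runCollector calls c.Collect once per interval until ctx is cancelled.
func runCollector(ctx context.Context, c Collector, interval time.Duration, out chan<- Sample) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			values, err := c.Collect()
			if err != nil {
				continue // a real engine would log the failure
			}
			select {
			case out <- Sample{Collector: c.Name(), Values: values}:
			case <-ctx.Done():
				return
			}
		}
	}
}

// collectFor runs every collector in its own goroutine for duration d
// and returns all samples gathered in that window.
func collectFor(d, interval time.Duration, collectors ...Collector) []Sample {
	ctx, cancel := context.WithTimeout(context.Background(), d)
	defer cancel()

	out := make(chan Sample)
	var wg sync.WaitGroup
	for _, c := range collectors {
		wg.Add(1)
		go func(c Collector) {
			defer wg.Done()
			runCollector(ctx, c, interval, out)
		}(c)
	}
	go func() { wg.Wait(); close(out) }()

	var samples []Sample
	for s := range out {
		samples = append(samples, s)
	}
	return samples
}

// constCollector is a stand-in that always reports the same value.
type constCollector struct {
	name  string
	value float64
}

func (c constCollector) Name() string { return c.name }
func (c constCollector) Collect() (map[string]float64, error) {
	return map[string]float64{"value": c.value}, nil
}

// demoSamples runs two stand-in collectors briefly with a short interval.
func demoSamples() []Sample {
	return collectFor(200*time.Millisecond, 10*time.Millisecond,
		constCollector{"cpu", 12.5}, constCollector{"memory", 64})
}

func main() {
	fmt.Printf("collected %d samples\n", len(demoSamples()))
}
```

Giving each collector its own ticker keeps a slow collector from delaying the others, at the cost of one goroutine per collector.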

The HTTP API Layer serves both human‑readable and machine‑friendly endpoints. The /metrics endpoint conforms to the Prometheus exposition format; the /api/v1/metrics endpoint returns JSON payloads. The WebSocket endpoint (/ws/events) streams real‑time events such as process start/stop or file system changes.
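As a rough sketch of how the two metric endpoints could sit side by side, the following Go program renders the same snapshot once in the Prometheus text exposition format and once as JSON. Only the paths /metrics and /api/v1/metrics come from the description above; the Metric type, the metric name, and the JSON field names are assumptions, and every metric is treated as a gauge for simplicity.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"sort"
	"strings"
)

// Metric is a hypothetical in-memory representation of one gauge.
type Metric struct {
	Name  string  `json:"name"`
	Help  string  `json:"help"`
	Value float64 `json:"value"`
}

// renderPrometheus writes metrics in the Prometheus text exposition format,
// sorted by name so that the output is stable between scrapes.
func renderPrometheus(metrics []Metric) string {
	sorted := append([]Metric(nil), metrics...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].Name < sorted[j].Name })

	var b strings.Builder
	for _, m := range sorted {
		fmt.Fprintf(&b, "# HELP %s %s\n", m.Name, m.Help)
		fmt.Fprintf(&b, "# TYPE %s gauge\n", m.Name)
		fmt.Fprintf(&b, "%s %g\n", m.Name, m.Value)
	}
	return b.String()
}

// newMux wires both endpoints to the same snapshot function.
func newMux(snapshot func() []Metric) *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/metrics", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
		fmt.Fprint(w, renderPrometheus(snapshot()))
	})
	mux.HandleFunc("/api/v1/metrics", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		_ = json.NewEncoder(w).Encode(snapshot())
	})
	return mux
}

func main() {
	snapshot := func() []Metric {
		return []Metric{{Name: "gsmonitor_cpu_idle_percent", Help: "Idle CPU percentage.", Value: 42.5}}
	}
	fmt.Print(renderPrometheus(snapshot()))
	// To serve the endpoints instead of printing:
	//   http.ListenAndServe("127.0.0.1:9100", newMux(snapshot))
	_ = newMux(snapshot)
}
```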

The Persistence Layer handles optional storage of historical data in local time‑series databases like InfluxDB, or forwards metrics to external backends via HTTP or gRPC.

Plug‑in Architecture

Plug‑ins are compiled as shared libraries (.so files) that export a predefined interface. Each plug‑in registers with the Collector Engine at startup, providing metadata such as name, version, and supported OS. The plug‑in API is intentionally minimal, exposing only the necessary hooks for metric emission and configuration parsing.

Developers can also create script‑based plug‑ins in Lua, which are interpreted by an embedded Lua VM. This facilitates rapid prototyping and deployment of custom collectors without recompiling the main binary.
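To make the registration flow more concrete, here is a hedged sketch of what a minimal plug‑in contract and registry might look like in Go. The interface, its method names, and the Registry type are illustrative assumptions rather than GSMonitor's published API; loading real shared libraries would additionally involve Go's plugin package, which is omitted here.

```go
package main

import (
	"errors"
	"fmt"
	"runtime"
)

// Metadata mirrors the fields the text says a plug-in provides at startup.
type Metadata struct {
	Name        string
	Version     string
	SupportedOS []string // compared against values such as runtime.GOOS
}

// Plugin is a hypothetical minimal contract: metadata, configuration parsing,
// and metric emission.
type Plugin interface {
	Metadata() Metadata
	Configure(settings map[string]string) error
	Collect(emit func(name string, value float64)) error
}

// Registry keeps track of plug-ins accepted by the engine.
type Registry struct{ plugins map[string]Plugin }

func NewRegistry() *Registry { return &Registry{plugins: make(map[string]Plugin)} }

// Register rejects unnamed, duplicate, or OS-incompatible plug-ins.
func (r *Registry) Register(p Plugin, goos string) error {
	md := p.Metadata()
	if md.Name == "" {
		return errors.New("plug-in has no name")
	}
	if _, exists := r.plugins[md.Name]; exists {
		return fmt.Errorf("plug-in %q is already registered", md.Name)
	}
	for _, supported := range md.SupportedOS {
		if supported == goos {
			r.plugins[md.Name] = p
			return nil
		}
	}
	return fmt.Errorf("plug-in %q does not support %s", md.Name, goos)
}

// demoPlugin is a stand-in collector used only for this example.
type demoPlugin struct{}

func (demoPlugin) Metadata() Metadata {
	return Metadata{Name: "demo", Version: "0.1.0", SupportedOS: []string{"linux", "darwin"}}
}
func (demoPlugin) Configure(settings map[string]string) error { return nil }
func (demoPlugin) Collect(emit func(name string, value float64)) error {
	emit("demo_value", 1)
	return nil
}

func main() {
	reg := NewRegistry()
	if err := reg.Register(demoPlugin{}, runtime.GOOS); err != nil {
		fmt.Println("not registered:", err)
		return
	}
	fmt.Println("registered demo plug-in")
}
```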

Configuration Model

GSMonitor’s configuration is expressed in HashiCorp Configuration Language (HCL), a human‑readable format similar to JSON but with a more succinct syntax. The configuration file (gsmonitor.hcl) defines global settings, plug‑in registrations, collection schedules, and alert rules.

Example fragment:

global {
  listen_addr = "0.0.0.0:9100"
  enable_tls  = true
  tls_cert    = "/etc/gsmonitor/cert.pem"
  tls_key     = "/etc/gsmonitor/key.pem"
}

plugin "cpu" {
  interval = 15
  enabled  = true
}

plugin "docker" {
  interval = 30
  enabled  = true
}

Security Model

GSMonitor implements role‑based access control (RBAC) at the HTTP API layer. Tokens are issued by a separate authentication service (e.g., OAuth2 provider) and presented via Bearer headers. Only users with the "monitor:read" scope can access metric endpoints; users with "monitor:write" can configure plug‑ins or modify alert rules.
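The scope check can be pictured as HTTP middleware. The sketch below is an assumption about how such a check might be structured, not GSMonitor's implementation: token validation is reduced to a hypothetical lookup function, whereas a real deployment would verify the token against the authentication service.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// ScopeLookup resolves a bearer token to its scopes. Hypothetical: in practice
// this would verify a signed token or call the authentication service.
type ScopeLookup func(token string) (scopes []string, ok bool)

// requireScope rejects requests whose token is missing (401), unknown (401),
// or lacks the required scope (403).
func requireScope(scope string, lookup ScopeLookup, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		header := r.Header.Get("Authorization")
		if !strings.HasPrefix(header, "Bearer ") {
			http.Error(w, "missing bearer token", http.StatusUnauthorized)
			return
		}
		scopes, ok := lookup(strings.TrimPrefix(header, "Bearer "))
		if !ok {
			http.Error(w, "invalid token", http.StatusUnauthorized)
			return
		}
		for _, s := range scopes {
			if s == scope {
				next.ServeHTTP(w, r)
				return
			}
		}
		http.Error(w, "insufficient scope", http.StatusForbidden)
	})
}

// demoHandler protects a stub metrics endpoint with the "monitor:read" scope.
func demoHandler() http.Handler {
	tokens := map[string][]string{
		"reader-token": {"monitor:read"},
		"writer-token": {"monitor:write"},
	}
	lookup := func(token string) ([]string, bool) {
		scopes, ok := tokens[token]
		return scopes, ok
	}
	metrics := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "gsmonitor_up 1")
	})
	return requireScope("monitor:read", lookup, metrics)
}

// statusFor issues an in-process GET with the given token and returns the status code.
func statusFor(h http.Handler, token string) int {
	req := httptest.NewRequest(http.MethodGet, "/metrics", nil)
	if token != "" {
		req.Header.Set("Authorization", "Bearer "+token)
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	h := demoHandler()
	for _, token := range []string{"reader-token", "writer-token", ""} {
		fmt.Printf("token %q -> HTTP %d\n", token, statusFor(h, token))
	}
}
```

In this sketch a token carrying only "monitor:write" is refused on the read endpoint; whether write access should imply read access is a policy decision the source does not specify.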

All internal communication is TLS‑encrypted when the enable_tls flag is true. The plug‑in system performs code‑signing verification, ensuring that only trusted libraries are loaded.

Core Features

System Metrics Collection

  • CPU: user, system, idle, iowait percentages; per‑core usage.
  • Memory: total, used, free, buffers/cache, swap usage.
  • Disk I/O: read/write operations per device, throughput, latency.
  • Network: per‑interface byte counts, packet drops, error rates.
  • Process Accounting: top CPU/memory consuming processes, orphan processes.
  • File System Events: inotify‑based monitoring of critical directories.
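As an illustration of the CPU item in the list above, the percentages can be derived on Linux from two readings of the aggregate "cpu" line in /proc/stat. The sketch below assumes the standard field order (user, nice, system, idle, iowait, irq, softirq, steal); the type and function names are made up for the example and say nothing about how GSMonitor itself is written.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// CPUTimes holds the first eight counters of a /proc/stat cpu line, in ticks.
type CPUTimes struct {
	User, Nice, System, Idle, IOWait, IRQ, SoftIRQ, Steal uint64
}

// parseCPULine parses a line such as "cpu  100 0 50 800 50 0 0 0 0 0".
func parseCPULine(line string) (CPUTimes, error) {
	fields := strings.Fields(line)
	if len(fields) < 9 || !strings.HasPrefix(fields[0], "cpu") {
		return CPUTimes{}, fmt.Errorf("not a cpu line: %q", line)
	}
	var v [8]uint64
	for i := range v {
		n, err := strconv.ParseUint(fields[i+1], 10, 64)
		if err != nil {
			return CPUTimes{}, fmt.Errorf("field %d: %w", i+1, err)
		}
		v[i] = n
	}
	return CPUTimes{v[0], v[1], v[2], v[3], v[4], v[5], v[6], v[7]}, nil
}

func (t CPUTimes) total() uint64 {
	return t.User + t.Nice + t.System + t.Idle + t.IOWait + t.IRQ + t.SoftIRQ + t.Steal
}

// CPUUsage is the share of time spent in each state between two readings.
type CPUUsage struct {
	User, System, Idle, IOWait float64 // percentages; User includes nice time
}

// usageBetween converts two cumulative readings into percentages.
// It reports false when no time has elapsed or the counters went backwards.
func usageBetween(prev, cur CPUTimes) (CPUUsage, bool) {
	if cur.total() <= prev.total() {
		return CPUUsage{}, false
	}
	elapsed := float64(cur.total() - prev.total())
	pct := func(before, after uint64) float64 {
		if after < before {
			return 0 // guard against a counter that decreased
		}
		return 100 * float64(after-before) / elapsed
	}
	return CPUUsage{
		User:   pct(prev.User+prev.Nice, cur.User+cur.Nice),
		System: pct(prev.System, cur.System),
		Idle:   pct(prev.Idle, cur.Idle),
		IOWait: pct(prev.IOWait, cur.IOWait),
	}, true
}

func main() {
	// Hard-coded snapshots stand in for two reads of /proc/stat.
	prev, _ := parseCPULine("cpu  100 0 50 800 50 0 0 0 0 0")
	cur, _ := parseCPULine("cpu  150 0 70 1500 80 0 0 0 0 0")
	if usage, ok := usageBetween(prev, cur); ok {
		fmt.Printf("%+v\n", usage)
	}
}
```

For the two hard‑coded snapshots, 800 ticks elapse in total, of which 50 are user, 20 system, 700 idle, and 30 iowait, giving 6.25 %, 2.5 %, 87.5 %, and 3.75 % respectively.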

Container and Orchestration Awareness

When deployed inside containerized environments, GSMonitor can introspect Docker, containerd, and CRI‑O runtimes. It collects container resource usage, restart counts, and image metadata. In Kubernetes clusters, GSMonitor accesses the Kubernetes API to gather pod status, node resource quotas, and namespace utilization.

Prometheus Exporter

By default, GSMonitor exposes metrics in the Prometheus exposition format at /metrics. The exporter follows Prometheus client library conventions, using histogram buckets for latency metrics and gauges for instantaneous measurements.

WebSocket Event Stream

Operators can subscribe to real‑time events such as process lifecycle changes, network connection drops, or file system modifications. The WebSocket stream uses JSON framing and includes event timestamps, event types, and payload details.
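A JSON frame carrying a timestamp, an event type, and a payload might be modeled as follows. The field names and the "process.start" type string are assumptions for illustration; the source does not document the actual schema, and the WebSocket transport itself is left out because it is not part of the Go standard library.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"time"
)

// Event is a hypothetical shape for one frame on the /ws/events stream.
type Event struct {
	Timestamp time.Time         `json:"timestamp"`
	Type      string            `json:"type"`
	Payload   map[string]string `json:"payload"`
}

// encodeEvent serializes an event into a single JSON frame.
func encodeEvent(e Event) ([]byte, error) {
	return json.Marshal(e)
}

// decodeEvent parses a frame and rejects events without a type.
func decodeEvent(frame []byte) (Event, error) {
	var e Event
	if err := json.Unmarshal(frame, &e); err != nil {
		return Event{}, err
	}
	if e.Type == "" {
		return Event{}, errors.New("event frame has no type")
	}
	return e, nil
}

// demoEvent returns a fixed process-start event for the example.
func demoEvent() Event {
	return Event{
		Timestamp: time.Date(2024, 1, 2, 3, 4, 5, 0, time.UTC),
		Type:      "process.start",
		Payload:   map[string]string{"pid": "4242", "comm": "nginx"},
	}
}

func main() {
	frame, err := encodeEvent(demoEvent())
	if err != nil {
		panic(err)
	}
	fmt.Println(string(frame))
}
```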

Alerting and Rules Engine

GSMonitor incorporates an embedded rules engine whose rule language is modeled on Drools, enabling users to define threshold‑based alerts. Alert rules can reference any exported metric and are evaluated at every collection interval.

Alerts are surfaced via the REST API, stored in a local log, and can optionally be forwarded to third‑party systems (e.g., PagerDuty, Slack) via webhooks.
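The core of threshold‑based evaluation is small enough to sketch directly. The example below expresses rules as Go structs instead of the engine's own rule language, whose syntax is not shown in the source; the Rule and Alert types and the metric names are hypothetical.

```go
package main

import "fmt"

// Rule is a hypothetical threshold rule over one exported metric.
type Rule struct {
	Name      string
	Metric    string
	Op        string // ">" or "<"
	Threshold float64
}

// Alert records a rule that fired and the value that triggered it.
type Alert struct {
	Rule   string
	Metric string
	Value  float64
}

// evaluate checks every rule against the latest metric values. It is meant to
// be called once per collection interval. Rules that reference a metric which
// was not collected are skipped.
func evaluate(rules []Rule, metrics map[string]float64) ([]Alert, error) {
	var alerts []Alert
	for _, r := range rules {
		value, ok := metrics[r.Metric]
		if !ok {
			continue
		}
		var fired bool
		switch r.Op {
		case ">":
			fired = value > r.Threshold
		case "<":
			fired = value < r.Threshold
		default:
			return nil, fmt.Errorf("rule %q: unsupported operator %q", r.Name, r.Op)
		}
		if fired {
			alerts = append(alerts, Alert{Rule: r.Name, Metric: r.Metric, Value: value})
		}
	}
	return alerts, nil
}

func main() {
	rules := []Rule{
		{Name: "high-cpu", Metric: "cpu_user_percent", Op: ">", Threshold: 90},
		{Name: "low-disk", Metric: "disk_free_percent", Op: "<", Threshold: 10},
	}
	metrics := map[string]float64{"cpu_user_percent": 95, "disk_free_percent": 40}

	alerts, err := evaluate(rules, metrics)
	if err != nil {
		panic(err)
	}
	for _, a := range alerts {
		fmt.Printf("ALERT %s: %s = %g\n", a.Rule, a.Metric, a.Value)
	}
}
```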

Extensibility

The plug‑in API allows developers to expose custom metrics, such as application‑specific counters or latency measurements, without modifying the core daemon. A plugin registry can be used to discover available community‑contributed collectors.

Usage and Deployment

Installation

GSMonitor binaries are available for Linux, macOS, and Windows. Users may download precompiled binaries from the project’s releases page or build from source using the Go toolchain.

  1. Install Go 1.18 or later.
  2. Clone the repository: git clone https://github.com/gsmonitor/gsmonitor.git
  3. Build: go build -o gsmonitor ./cmd/gsmonitor
  4. Place the binary in /usr/local/bin or a suitable directory.
  5. Create the configuration file at /etc/gsmonitor/gsmonitor.hcl.
  6. Start the service: systemctl start gsmonitor (on systems with systemd).

Configuration Best Practices

  • Define separate configurations for production and development environments, adjusting collection intervals to balance accuracy and resource usage.
  • Enable TLS in production to protect metric traffic, especially when exposing metrics on public networks.
  • Use the plug‑in system to disable unused collectors, reducing memory footprint.
  • Periodically rotate TLS certificates and update the configuration accordingly.
  • Leverage RBAC tokens to enforce least‑privilege access to monitoring endpoints.

Containerized Deployment

GSMonitor can run as a sidecar container within Kubernetes pods. To observe host‑level metrics, the pod is typically configured to use the host network namespace and to mount the host’s /proc and /sys filesystems into the sidecar. A typical Dockerfile might look like:

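# Build stage: compile the daemon with the Go toolchain.
FROM golang:1.22-alpine AS builder
WORKDIR /app
COPY . .
RUN go build -o gsmonitor ./cmd/gsmonitor

# Runtime stage: copy only the compiled binary into a minimal image.
FROM alpine:latest
COPY --from=builder /app/gsmonitor /usr/local/bin/gsmonitor
CMD ["gsmonitor"]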

Integration with Prometheus

To scrape GSMonitor’s metrics endpoint, add a job to the Prometheus scrape configuration:

scrape_configs:
- job_name: gsmonitor
  static_configs:
  - targets: ['localhost:9100']

Alert Routing

Alerts defined within GSMonitor can be forwarded to Alertmanager by configuring a webhook endpoint that points at Alertmanager’s HTTP API. GSMonitor sends an HTTP POST containing the alert details as JSON, and Alertmanager then groups, deduplicates, and routes the alerts to its configured receivers.
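For orientation, the sketch below shows what posting an alert to Alertmanager looks like from Go. Alertmanager's v2 API accepts a JSON array of alerts at /api/v2/alerts, each with labels and optional annotations and timestamps. The payload GSMonitor actually sends is not documented in the source, so the label names here are assumptions, and an in‑process test server stands in for a real Alertmanager.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// amAlert follows the shape Alertmanager's v2 API expects for a posted alert.
type amAlert struct {
	Labels      map[string]string `json:"labels"`
	Annotations map[string]string `json:"annotations,omitempty"`
	StartsAt    time.Time         `json:"startsAt"`
}

// postAlerts sends the alerts to <baseURL>/api/v2/alerts as a JSON array.
func postAlerts(client *http.Client, baseURL string, alerts []amAlert) error {
	body, err := json.Marshal(alerts)
	if err != nil {
		return err
	}
	resp, err := client.Post(baseURL+"/api/v2/alerts", "application/json", bytes.NewReader(body))
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("alertmanager returned %s", resp.Status)
	}
	return nil
}

// demoRoundTrip posts one alert to a stand-in server and reports the request
// path and the number of alerts the server decoded.
func demoRoundTrip() (path string, received int, err error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		var alerts []amAlert
		if json.NewDecoder(r.Body).Decode(&alerts) != nil {
			http.Error(w, "bad payload", http.StatusBadRequest)
			return
		}
		path, received = r.URL.Path, len(alerts)
	}))
	defer srv.Close()

	err = postAlerts(srv.Client(), srv.URL, []amAlert{{
		Labels:      map[string]string{"alertname": "HighCPU", "instance": "node-1"}, // hypothetical labels
		Annotations: map[string]string{"summary": "CPU usage above threshold"},
		StartsAt:    time.Now().UTC(),
	}})
	return path, received, err
}

func main() {
	path, received, err := demoRoundTrip()
	if err != nil {
		panic(err)
	}
	fmt.Printf("server received %d alert(s) on %s\n", received, path)
}
```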

Integration with Monitoring Ecosystems

Prometheus and Grafana

GSMonitor’s native Prometheus exporter makes it straightforward to visualise system metrics in Grafana dashboards. Community dashboards exist for CPU, memory, disk, and network metrics, as well as for Docker and Kubernetes resource usage.

OpenTelemetry Collector

GSMonitor can export metrics via the OpenTelemetry protocol to the OpenTelemetry Collector. This facilitates unified telemetry ingestion across heterogeneous systems and supports exporters to cloud provider monitoring services such as Google Cloud Operations, AWS CloudWatch, and Azure Monitor.

Elastic Stack

By configuring the persistence layer to forward metrics to Elasticsearch, GSMonitor integrates with the Elastic Stack. Combined with Beats agents, this yields an observability platform covering logs and metrics, with traces handled by a separate tracing component such as Elastic APM.

Grafana Loki and Tempo

Although GSMonitor is not a log shipper, events from its WebSocket stream can be captured and forwarded to Loki for log aggregation, and its metrics can be correlated with traces in Tempo when both carry a shared request ID.

Security Considerations

Runtime Hardening

GSMonitor runs as a non‑root user by default, restricting file system access to the required /proc and /sys paths and container runtime sockets. The configuration file is set to mode 640, and the plug‑in directory is likewise writable only by its owner, to prevent unauthorized modifications.

Network Exposure

When metrics are exposed over the network, TLS should be treated as mandatory (enable_tls = true) to prevent eavesdropping on sensitive performance data. Additionally, firewall rules should restrict access to the metrics port to trusted IP ranges.

Plug‑in Verification

Plug‑ins are required to be signed by the project’s maintainers or by a trusted third‑party. The daemon verifies the signature before loading a plug‑in, preventing the execution of malicious code.
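The verify‑before‑load step can be sketched with a detached signature over the plug‑in file's bytes. The source does not say which signature scheme GSMonitor uses, so Ed25519 is an assumption chosen because it is available in the Go standard library; the key is generated on the fly purely for the demonstration.

```go
package main

import (
	"crypto/ed25519"
	"crypto/rand"
	"errors"
	"fmt"
)

// verifyPlugin checks a detached Ed25519 signature over the plug-in bytes.
// A loader would call this before handing the file to the plug-in system.
func verifyPlugin(pub ed25519.PublicKey, pluginBytes, sig []byte) error {
	if len(pub) != ed25519.PublicKeySize {
		return errors.New("invalid public key length")
	}
	if !ed25519.Verify(pub, pluginBytes, sig) {
		return errors.New("plug-in signature does not match")
	}
	return nil
}

// demoVerify signs some stand-in plug-in bytes, then verifies both the
// original bytes and a tampered copy against the same signature.
func demoVerify() (intact, tampered error) {
	pub, priv, err := ed25519.GenerateKey(rand.Reader)
	if err != nil {
		return err, err
	}
	plugin := []byte("pretend these are the bytes of cpu.so")
	sig := ed25519.Sign(priv, plugin)

	intact = verifyPlugin(pub, plugin, sig)

	modified := append([]byte(nil), plugin...)
	modified[0] ^= 0xff // flip bits in the first byte
	tampered = verifyPlugin(pub, modified, sig)
	return intact, tampered
}

func main() {
	intact, tampered := demoVerify()
	fmt.Println("original accepted:", intact == nil)
	fmt.Println("tampered rejected:", tampered != nil)
}
```

In a real deployment the trusted public keys would be distributed with the daemon or its configuration, and the private key would never be present on monitored hosts.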

Audit Logging

All configuration changes, plug‑in load/unload events, and alert firings are recorded in an audit log. The audit log is stored in a write‑once, append‑only format and can be rotated based on size or time policies.
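One simple way to approximate an append‑only audit log is to write one JSON object per line to a file opened with O_APPEND. The sketch below assumes that layout; the AuditEntry fields and the action string are hypothetical. Note that O_APPEND only constrains how this process writes, so true write‑once guarantees would need support from the file system or the operating system.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// AuditEntry is a hypothetical record of one auditable action.
type AuditEntry struct {
	Time   time.Time `json:"time"`
	Action string    `json:"action"`
	Detail string    `json:"detail"`
}

// appendAudit appends one JSON line to the log, creating it with mode 0640.
func appendAudit(path string, e AuditEntry) error {
	line, err := json.Marshal(e)
	if err != nil {
		return err
	}
	f, err := os.OpenFile(path, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0o640)
	if err != nil {
		return err
	}
	if _, err := f.Write(append(line, '\n')); err != nil {
		f.Close()
		return err
	}
	return f.Close()
}

// readAudit loads all entries back, one per line.
func readAudit(path string) ([]AuditEntry, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()

	var entries []AuditEntry
	sc := bufio.NewScanner(f)
	for sc.Scan() {
		var e AuditEntry
		if err := json.Unmarshal(sc.Bytes(), &e); err != nil {
			return nil, err
		}
		entries = append(entries, e)
	}
	return entries, sc.Err()
}

// demoAudit writes two entries to a temporary log and reads them back.
func demoAudit() ([]AuditEntry, error) {
	dir, err := os.MkdirTemp("", "gsmonitor-audit")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "audit.log")
	for _, action := range []string{"plugin.load", "alert.fired"} {
		entry := AuditEntry{Time: time.Now().UTC(), Action: action, Detail: "example"}
		if err := appendAudit(path, entry); err != nil {
			return nil, err
		}
	}
	return readAudit(path)
}

func main() {
	entries, err := demoAudit()
	if err != nil {
		panic(err)
	}
	for _, e := range entries {
		fmt.Println(e.Action)
	}
}
```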

Community and Ecosystem

Governance

GSMonitor follows a meritocratic governance model. Core maintainers manage issue triage, pull request reviews, and release cycles. Contributors can submit feature requests, bug reports, and pull requests through the project's issue tracker.

Documentation

The project provides comprehensive documentation covering installation, configuration, plug‑in development, and troubleshooting. Documentation is hosted on GitHub Pages and is updated with each release.

Contributors

Since its open‑source release, GSMonitor has attracted contributions from over 120 developers worldwide. Notable contributors include the authors of the Prometheus client library, the OpenTelemetry Go SDK, and the Docker Go client.

Third‑Party Plug‑ins

Community plug‑ins extend GSMonitor’s capabilities. Examples include:

  • aws-cloudwatch-collector: Exports AWS CloudWatch metrics for EC2 instances.
  • prometheus-query-collector: Executes arbitrary PromQL queries and publishes the results as custom metrics.
  • kubernetes-event-collector: Captures Kubernetes events and emits them as JSON metrics.

Future Directions

Adaptive Sampling

Research is underway to enable GSMonitor to adjust collection frequencies based on system load. By reducing sampling rates during low activity periods, the agent can further minimize its resource footprint.
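Since this feature is described only as ongoing research, the following is a speculative sketch of one possible policy, not a description of GSMonitor's design: the collection interval is interpolated linearly between a shortest and a longest value according to a normalized load figure. The constants and the function name are assumptions.

```go
package main

import (
	"fmt"
	"time"
)

// Hypothetical bounds for the collection interval.
const (
	minInterval = 5 * time.Second
	maxInterval = 60 * time.Second
)

// nextInterval maps a normalized load in [0, 1] to a collection interval.
// An idle system (load 0) is sampled at the longest interval, and a saturated
// system (load 1) at the shortest, with linear interpolation in between.
func nextInterval(load float64, shortest, longest time.Duration) time.Duration {
	if load < 0 {
		load = 0
	}
	if load > 1 {
		load = 1
	}
	return longest - time.Duration(load*float64(longest-shortest))
}

func main() {
	for _, load := range []float64{0, 0.25, 0.5, 1} {
		fmt.Printf("load %.2f -> collect every %s\n", load, nextInterval(load, minInterval, maxInterval))
	}
}
```

A production policy would likely smooth the load signal first so that the interval does not oscillate with every short spike.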

AI‑Based Anomaly Detection

Integrating lightweight machine‑learning models for anomaly detection could allow GSMonitor to autonomously identify unusual patterns in resource usage and surface them as alerts.

Cross‑Platform Expansion

While GSMonitor currently supports Linux, macOS, and Windows, plans exist to port the core engine to FreeBSD and OpenBSD, broadening its applicability in enterprise environments.

Enhanced Observability APIs

Future releases will expose a gRPC API for programmatic metric ingestion and control, aligning with Kubernetes operator patterns and enabling declarative management of monitoring agents.
