Introduction
GSMonitor is an open‑source system monitoring daemon designed to collect, aggregate, and expose operational metrics from Linux and Unix‑like operating systems. Written primarily in the Go programming language, it targets environments ranging from small edge devices to large cloud‑native deployments. By providing a lightweight, efficient, and modular architecture, GSMonitor aims to reduce the overhead associated with traditional monitoring agents while maintaining compatibility with widely used observability backends such as Prometheus, InfluxDB, and Graphite.
Unlike monolithic monitoring solutions, GSMonitor follows a service‑centric model. It exposes a RESTful HTTP API, a Prometheus metrics endpoint, and an optional WebSocket interface for real‑time streaming of events. Its modular plug‑in system allows operators to enable or disable collectors based on the specific instrumentation required, thereby minimizing the attack surface and resource consumption.
History and Development
Origins
The genesis of GSMonitor can be traced to a 2016 internal project within a mid‑size cloud services provider that required a custom monitoring solution to handle the peculiarities of their virtual machine fleet. The existing off‑the‑shelf agents were deemed too heavy and lacked the flexibility to instrument custom application metrics without substantial code changes.
In 2018, the core developer team open‑sourced the project under the MIT license, inviting contributions from the broader community. Early releases focused on core system metrics (CPU, memory, disk I/O, and network statistics), leveraging the Go standard library’s syscall and syscall‑fs packages to obtain high‑resolution data.
Major Milestones
- Version 0.1 (2018) – First public release; basic system metrics and HTTP metrics endpoint.
- Version 1.0 (2019) – Introduction of plug‑in architecture, Prometheus support, and built‑in authentication.
- Version 1.5 (2020) – WebSocket event stream, Docker container introspection, and TLS termination.
- Version 2.0 (2022) – Native support for Kubernetes via the Kubernetes API, integration with the OpenTelemetry Collector, and a new configuration schema using HCL.
- Version 2.5 (2023) – Advanced alerting framework, plug‑in registry, and community‑contributed plug‑ins for cloud provider metrics.
Architecture and Design
Core Components
GSMonitor is structured around four primary components: the Collector Engine, the Plug‑in System, the HTTP API Layer, and the Persistence Layer.
The Collector Engine is responsible for scheduling metric collection jobs. It runs each collector as a lightweight goroutine, respecting user‑defined intervals and thresholds. The Plug‑in System allows dynamic loading of collectors at runtime, enabling developers to write collectors in Go or as Lua scripts that interact with the core API.
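The scheduling behaviour described above can be sketched in Go roughly as follows. This is a minimal illustration, not GSMonitor's actual engine: the Collector interface, the Job and Sample types, and the runEngine and collectN helpers are all hypothetical names introduced for the example, and every interval is assumed to be positive.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// Collector is a hypothetical interface, not GSMonitor's actual plug-in API.
type Collector interface {
	Name() string
	Collect() (map[string]float64, error)
}

// Job pairs a collector with its user-defined collection interval (must be > 0).
type Job struct {
	Collector Collector
	Every     time.Duration
}

// Sample is one collection result tagged with the collector that produced it.
type Sample struct {
	Collector string
	Values    map[string]float64
}

// constCollector is a stand-in that always reports the same value.
type constCollector struct {
	name  string
	value float64
}

func (c constCollector) Name() string { return c.name }

func (c constCollector) Collect() (map[string]float64, error) {
	return map[string]float64{"value": c.value}, nil
}

// runEngine runs each job in its own goroutine and closes out after ctx is
// cancelled and every goroutine has returned.
func runEngine(ctx context.Context, jobs []Job, out chan<- Sample) {
	var wg sync.WaitGroup
	for _, job := range jobs {
		wg.Add(1)
		go func(job Job) {
			defer wg.Done()
			ticker := time.NewTicker(job.Every)
			defer ticker.Stop()
			for {
				select {
				case <-ctx.Done():
					return
				case <-ticker.C:
				}
				values, err := job.Collector.Collect()
				if err != nil {
					continue // one failing collector must not stop the others
				}
				select {
				case out <- Sample{Collector: job.Collector.Name(), Values: values}:
				case <-ctx.Done():
					return
				}
			}
		}(job)
	}
	wg.Wait()
	close(out)
}

// collectN runs the engine until n samples (n > 0) have arrived, then stops it.
func collectN(jobs []Job, n int) []Sample {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	out := make(chan Sample)
	go runEngine(ctx, jobs, out)
	var got []Sample
	for s := range out {
		if len(got) < n {
			got = append(got, s)
		}
		if len(got) >= n {
			cancel() // safe to call repeatedly; late samples are drained and dropped
		}
	}
	return got
}

func main() {
	jobs := []Job{
		{Collector: constCollector{name: "cpu", value: 12.5}, Every: 10 * time.Millisecond},
		{Collector: constCollector{name: "memory", value: 48}, Every: 25 * time.Millisecond},
	}
	for _, s := range collectN(jobs, 5) {
		fmt.Println(s.Collector, s.Values)
	}
}
```

Giving each collector its own ticker keeps a slow collector from delaying the others, at the cost of one goroutine per collector.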
The HTTP API Layer serves both human‑readable and machine‑friendly endpoints. The /metrics endpoint conforms to the Prometheus exposition format; the /api/v1/metrics endpoint returns JSON payloads. The WebSocket endpoint (/ws/events) streams real‑time events such as process start/stop or file system changes.
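To make the two metric endpoints concrete, the sketch below serves the same gauge as Prometheus text on /metrics and as JSON on /api/v1/metrics. The Metric type, its JSON field names, and the metric name are assumptions made for illustration; the real JSON payload schema is not specified here, and HELP text escaping is omitted for brevity.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// Metric is a hypothetical gauge; the field names are illustrative only.
type Metric struct {
	Name  string  `json:"name"`
	Help  string  `json:"help"`
	Value float64 `json:"value"`
}

// renderPrometheus formats gauges in the Prometheus text exposition format.
func renderPrometheus(metrics []Metric) string {
	var b strings.Builder
	for _, m := range metrics {
		fmt.Fprintf(&b, "# HELP %s %s\n", m.Name, m.Help)
		fmt.Fprintf(&b, "# TYPE %s gauge\n", m.Name)
		fmt.Fprintf(&b, "%s %g\n", m.Name, m.Value)
	}
	return b.String()
}

// newMux serves the same metrics as text on /metrics and JSON on /api/v1/metrics.
func newMux(metrics []Metric) *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/metrics", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
		fmt.Fprint(w, renderPrometheus(metrics))
	})
	mux.HandleFunc("/api/v1/metrics", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		if err := json.NewEncoder(w).Encode(metrics); err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
		}
	})
	return mux
}

// get issues an in-memory GET request, so the sketch needs no open port.
func get(h http.Handler, path string) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code, rec.Body.String()
}

func main() {
	mux := newMux([]Metric{{Name: "gsmonitor_cpu_idle_percent", Help: "Idle CPU time.", Value: 42.5}})
	for _, path := range []string{"/metrics", "/api/v1/metrics"} {
		code, body := get(mux, path)
		fmt.Printf("GET %s -> %d\n%s\n", path, code, body)
	}
}
```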
The Persistence Layer handles optional storage of historical data in local time‑series databases like InfluxDB, or forwards metrics to external backends via HTTP or gRPC.
Plug‑in Architecture
Plug‑ins are compiled as shared libraries (.so files) that export a predefined interface. Each plug‑in registers with the Collector Engine at startup, providing metadata such as name, version, and supported OS. The plug‑in API is intentionally minimal, exposing only the necessary hooks for metric emission and configuration parsing.
Developers can also create script‑based plug‑ins in Lua, which are interpreted by an embedded Lua VM. This facilitates rapid prototyping and deployment of custom collectors without recompiling the main binary.
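A compiled plug‑in's registration step might look something like the following sketch. The Plugin interface, the Metadata fields, and the Registry type are hypothetical and only mirror the hooks named above (metadata, configuration parsing, metric emission); the interface GSMonitor actually exports may differ, and loading the .so file itself is not shown.

```go
package main

import (
	"errors"
	"fmt"
	"runtime"
	"sort"
)

// Metadata mirrors the registration details named in the text.
type Metadata struct {
	Name        string
	Version     string
	SupportedOS []string
}

// Plugin is a hypothetical minimal interface: metadata, configuration parsing,
// and metric emission.
type Plugin interface {
	Metadata() Metadata
	Configure(settings map[string]string) error
	Emit() map[string]float64
}

// Registry stands in for the Collector Engine's view of loaded plug-ins.
type Registry struct {
	plugins map[string]Plugin
}

func NewRegistry() *Registry { return &Registry{plugins: make(map[string]Plugin)} }

// Register accepts a plug-in only if it is named, unique, and supports goos.
func (r *Registry) Register(p Plugin, goos string) error {
	md := p.Metadata()
	if md.Name == "" {
		return errors.New("plug-in has no name")
	}
	if _, exists := r.plugins[md.Name]; exists {
		return fmt.Errorf("plug-in %q is already registered", md.Name)
	}
	for _, name := range md.SupportedOS {
		if name == goos {
			r.plugins[md.Name] = p
			return nil
		}
	}
	return fmt.Errorf("plug-in %q does not support %s", md.Name, goos)
}

// Names lists registered plug-ins in a stable order.
func (r *Registry) Names() []string {
	names := make([]string, 0, len(r.plugins))
	for name := range r.plugins {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

// staticPlugin is a toy plug-in used only for demonstration.
type staticPlugin struct{ md Metadata }

func (s staticPlugin) Metadata() Metadata                { return s.md }
func (s staticPlugin) Configure(map[string]string) error { return nil }
func (s staticPlugin) Emit() map[string]float64          { return map[string]float64{"up": 1} }

func main() {
	reg := NewRegistry()
	p := staticPlugin{md: Metadata{
		Name:        "uptime",
		Version:     "0.1.0",
		SupportedOS: []string{"linux", "darwin", "windows"},
	}}
	if err := reg.Register(p, runtime.GOOS); err != nil {
		fmt.Println("skipped:", err)
	}
	fmt.Println("registered:", reg.Names())
}
```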
Configuration Model
GSMonitor’s configuration is expressed in HashiCorp Configuration Language (HCL), a human‑readable format similar to JSON but with a more succinct syntax. The configuration file (gsmonitor.hcl) defines global settings, plug‑in registrations, collection schedules, and alert rules.
Example fragment:
    global {
      listen_addr = "0.0.0.0:9100"
      enable_tls  = true
      tls_cert    = "/etc/gsmonitor/cert.pem"
      tls_key     = "/etc/gsmonitor/key.pem"
    }

    plugin "cpu" {
      interval = 15
      enabled  = true
    }

    plugin "docker" {
      interval = 30
      enabled  = true
    }
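On the Go side, a fragment like this could be mapped onto structs and validated before use. The sketch below is an assumption about how that might be done: the struct tags follow the conventions of the hashicorp/hcl/v2 gohcl decoder (decoding itself is not shown, to keep the example free of external dependencies), the interval is assumed to be in seconds, and the Validate rules are illustrative rather than GSMonitor's documented behaviour.

```go
package main

import (
	"errors"
	"fmt"
)

// Global mirrors the global block. The hcl tags follow the gohcl convention;
// whether GSMonitor decodes its file this way is an assumption.
type Global struct {
	ListenAddr string `hcl:"listen_addr"`
	EnableTLS  bool   `hcl:"enable_tls,optional"`
	TLSCert    string `hcl:"tls_cert,optional"`
	TLSKey     string `hcl:"tls_key,optional"`
}

// PluginBlock mirrors one labelled plugin block.
type PluginBlock struct {
	Name     string `hcl:"name,label"`
	Interval int    `hcl:"interval"` // assumed to be seconds
	Enabled  bool   `hcl:"enabled,optional"`
}

// Config is the whole file.
type Config struct {
	Global  Global        `hcl:"global,block"`
	Plugins []PluginBlock `hcl:"plugin,block"`
}

// Validate applies a few illustrative consistency checks.
func (c Config) Validate() error {
	if c.Global.ListenAddr == "" {
		return errors.New("global.listen_addr is required")
	}
	if c.Global.EnableTLS && (c.Global.TLSCert == "" || c.Global.TLSKey == "") {
		return errors.New("enable_tls requires both tls_cert and tls_key")
	}
	seen := make(map[string]bool)
	for _, p := range c.Plugins {
		if seen[p.Name] {
			return fmt.Errorf("plugin %q is configured twice", p.Name)
		}
		seen[p.Name] = true
		if p.Interval <= 0 {
			return fmt.Errorf("plugin %q: interval must be positive", p.Name)
		}
	}
	return nil
}

// exampleConfig reproduces the fragment shown above as Go values.
func exampleConfig() Config {
	return Config{
		Global: Global{
			ListenAddr: "0.0.0.0:9100",
			EnableTLS:  true,
			TLSCert:    "/etc/gsmonitor/cert.pem",
			TLSKey:     "/etc/gsmonitor/key.pem",
		},
		Plugins: []PluginBlock{
			{Name: "cpu", Interval: 15, Enabled: true},
			{Name: "docker", Interval: 30, Enabled: true},
		},
	}
}

func main() {
	fmt.Println("validation error:", exampleConfig().Validate())
}
```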
Security Model
GSMonitor implements role‑based access control (RBAC) at the HTTP API layer. Tokens are issued by a separate authentication service (e.g., OAuth2 provider) and presented via Bearer headers. Only users with the "monitor:read" scope can access metric endpoints; users with "monitor:write" can configure plug‑ins or modify alert rules.
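The scope check can be pictured as HTTP middleware along the following lines. This is a sketch under stated assumptions: requireScope, scopeLookup, and the in-memory token table are hypothetical, and a real deployment would validate tokens against the external authentication service rather than a map.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// scopeLookup resolves a bearer token to its scopes. It is a hypothetical
// stand-in for validation against the external authentication service.
type scopeLookup func(token string) (scopes []string, ok bool)

// requireScope rejects requests whose token is missing, unknown, or lacks scope.
func requireScope(scope string, lookup scopeLookup, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		const prefix = "Bearer "
		header := r.Header.Get("Authorization")
		if !strings.HasPrefix(header, prefix) {
			http.Error(w, "missing bearer token", http.StatusUnauthorized)
			return
		}
		scopes, ok := lookup(strings.TrimPrefix(header, prefix))
		if !ok {
			http.Error(w, "unknown token", http.StatusUnauthorized)
			return
		}
		for _, s := range scopes {
			if s == scope {
				next.ServeHTTP(w, r)
				return
			}
		}
		http.Error(w, "insufficient scope", http.StatusForbidden)
	})
}

// demoTokens is an in-memory token table used only for this example.
var demoTokens = map[string][]string{
	"reader-token": {"monitor:read"},
	"admin-token":  {"monitor:read", "monitor:write"},
}

func demoLookup(token string) ([]string, bool) {
	scopes, ok := demoTokens[token]
	return scopes, ok
}

// statusFor returns the HTTP status an endpoint guarded by scope would send.
func statusFor(scope, authHeader string) int {
	okHandler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	req := httptest.NewRequest(http.MethodGet, "/metrics", nil)
	if authHeader != "" {
		req.Header.Set("Authorization", authHeader)
	}
	rec := httptest.NewRecorder()
	requireScope(scope, demoLookup, okHandler).ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(statusFor("monitor:read", "Bearer reader-token"))  // 200
	fmt.Println(statusFor("monitor:write", "Bearer reader-token")) // 403
	fmt.Println(statusFor("monitor:read", ""))                     // 401
}
```

Distinguishing 401 (no usable token) from 403 (valid token, wrong scope) lets clients tell whether to re-authenticate or to request broader access.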
All internal communication is TLS‑encrypted when the enable_tls flag is true. The plug‑in system performs code‑signing verification, ensuring that only trusted libraries are loaded.
Core Features
System Metrics Collection
- CPU: user, system, idle, iowait percentages; per‑core usage.
- Memory: total, used, free, buffers/cache, swap usage.
- Disk I/O: read/write operations per device, throughput, latency.
- Network: per‑interface byte counts, packet drops, error rates.
- Process Accounting: top CPU/memory consuming processes, orphan processes.
- File System Events: inotify‑based monitoring of critical directories.
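As an illustration of the CPU figures in the list above, the sketch below derives user, system, idle, and iowait percentages from two samples of the aggregate cpu line in Linux's /proc/stat. It is not GSMonitor's collector: the cpuTimes and cpuPercent types and both functions are hypothetical, the sample lines are made up, and nice time is folded into the user figure for brevity.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// cpuTimes holds cumulative jiffies from the aggregate "cpu" line of /proc/stat.
// User includes nice time.
type cpuTimes struct {
	User, System, Idle, IOWait, Total uint64
}

// cpuPercent is the share of elapsed CPU time spent in each state.
type cpuPercent struct {
	User, System, Idle, IOWait float64
}

// parseCPULine parses a line such as "cpu  100 0 50 800 50 0 0 0 0 0".
// Field order: user nice system idle iowait irq softirq steal guest guest_nice.
func parseCPULine(line string) (cpuTimes, error) {
	fields := strings.Fields(line)
	if len(fields) < 6 || fields[0] != "cpu" {
		return cpuTimes{}, errors.New("not an aggregate cpu line")
	}
	var vals []uint64
	for _, f := range fields[1:] {
		v, err := strconv.ParseUint(f, 10, 64)
		if err != nil {
			return cpuTimes{}, fmt.Errorf("bad counter %q: %w", f, err)
		}
		vals = append(vals, v)
	}
	t := cpuTimes{User: vals[0] + vals[1], System: vals[2], Idle: vals[3], IOWait: vals[4]}
	// Sum user..steal only; guest time is already included in user and nice.
	for i, v := range vals {
		if i < 8 {
			t.Total += v
		}
	}
	return t, nil
}

// usage converts the difference between two samples into percentages.
func usage(prev, cur cpuTimes) (cpuPercent, error) {
	if cur.Total <= prev.Total {
		return cpuPercent{}, errors.New("no elapsed CPU time between samples")
	}
	elapsed := float64(cur.Total - prev.Total)
	pct := func(before, after uint64) float64 {
		if after < before {
			return 0 // a counter that went backwards is treated as no progress
		}
		return 100 * float64(after-before) / elapsed
	}
	return cpuPercent{
		User:   pct(prev.User, cur.User),
		System: pct(prev.System, cur.System),
		Idle:   pct(prev.Idle, cur.Idle),
		IOWait: pct(prev.IOWait, cur.IOWait),
	}, nil
}

func main() {
	// A real collector would read these lines from /proc/stat one interval apart.
	prev, _ := parseCPULine("cpu  100 0 50 800 50 0 0 0 0 0")
	cur, _ := parseCPULine("cpu  160 0 70 900 70 0 0 0 0 0")
	p, err := usage(prev, cur)
	fmt.Printf("%+v %v\n", p, err)
}
```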
Container and Orchestration Awareness
When deployed inside containerized environments, GSMonitor can introspect Docker, containerd, and CRI‑O runtimes. It collects container resource usage, restart counts, and image metadata. In Kubernetes clusters, GSMonitor accesses the Kubernetes API to gather pod status, node resource quotas, and namespace utilization.
Prometheus Exporter
By default, GSMonitor exposes metrics in the Prometheus format at /metrics. The exporter implements the Prometheus client library’s conventions, including histogram buckets for latency metrics and gauge values for instantaneous measurements.
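Histogram buckets in the Prometheus format are cumulative: each bucket counts every observation less than or equal to its upper bound, and a final +Inf bucket equals the total count. The hand-rolled sketch below shows that bookkeeping; it is an illustration only (a real exporter would more likely use the Prometheus client library), the Histogram type is hypothetical, and it is not safe for concurrent use.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// Histogram is a hypothetical, non-thread-safe latency histogram.
type Histogram struct {
	bounds []float64 // sorted upper bounds, excluding +Inf
	counts []uint64  // per-bucket counts; the last slot is the +Inf bucket
	sum    float64
}

func NewHistogram(bounds ...float64) *Histogram {
	sorted := append([]float64(nil), bounds...)
	sort.Float64s(sorted)
	return &Histogram{bounds: sorted, counts: make([]uint64, len(sorted)+1)}
}

// Observe records one value in the first bucket whose bound is >= v.
func (h *Histogram) Observe(v float64) {
	// SearchFloat64s returns the first index with bound >= v ("le" semantics),
	// or len(bounds) when v exceeds every bound, which is the +Inf slot.
	h.counts[sort.SearchFloat64s(h.bounds, v)]++
	h.sum += v
}

// Cumulative returns running totals; the last entry is the overall count.
func (h *Histogram) Cumulative() []uint64 {
	out := make([]uint64, len(h.counts))
	var running uint64
	for i, c := range h.counts {
		running += c
		out[i] = running
	}
	return out
}

// Render emits the bucket, sum, and count series in text exposition format.
func (h *Histogram) Render(name string) string {
	var b strings.Builder
	cum := h.Cumulative()
	for i, bound := range h.bounds {
		le := strconv.FormatFloat(bound, 'g', -1, 64)
		fmt.Fprintf(&b, "%s_bucket{le=%q} %d\n", name, le, cum[i])
	}
	total := cum[len(cum)-1]
	fmt.Fprintf(&b, "%s_bucket{le=\"+Inf\"} %d\n", name, total)
	fmt.Fprintf(&b, "%s_sum %s\n", name, strconv.FormatFloat(h.sum, 'g', -1, 64))
	fmt.Fprintf(&b, "%s_count %d\n", name, total)
	return b.String()
}

// demoHistogram records four made-up latencies.
func demoHistogram() *Histogram {
	h := NewHistogram(0.125, 0.5, 1)
	for _, v := range []float64{0.0625, 0.25, 0.25, 4} {
		h.Observe(v)
	}
	return h
}

func main() {
	fmt.Print(demoHistogram().Render("latency_seconds"))
}
```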
WebSocket Event Stream
Operators can subscribe to real‑time events such as process lifecycle changes, network connection drops, or file system modifications. The WebSocket stream uses JSON framing and includes event timestamps, event types, and payload details.
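A single JSON frame on the event stream might be modelled as shown below. The field names (timestamp, type, payload) follow the description above, but the exact wire schema and the event type string "process.start" are assumptions; the WebSocket transport is left out because it needs a third-party library.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// Event is a hypothetical shape for one frame on the event stream.
type Event struct {
	Timestamp time.Time      `json:"timestamp"`
	Type      string         `json:"type"`
	Payload   map[string]any `json:"payload"`
}

// encodeEvent serialises an event into one JSON text frame.
func encodeEvent(e Event) (string, error) {
	data, err := json.Marshal(e)
	return string(data), err
}

// decodeEvent is what a subscriber would run on each received frame.
func decodeEvent(frame string) (Event, error) {
	var e Event
	err := json.Unmarshal([]byte(frame), &e)
	return e, err
}

// sampleEvent builds a made-up process start event.
func sampleEvent() Event {
	return Event{
		Timestamp: time.Date(2023, time.May, 1, 12, 0, 0, 0, time.UTC),
		Type:      "process.start",
		Payload:   map[string]any{"pid": 4242, "name": "nginx"},
	}
}

func main() {
	frame, err := encodeEvent(sampleEvent())
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	fmt.Println(frame)

	event, err := decodeEvent(frame)
	fmt.Println(event.Type, event.Payload["name"], err)
}
```

Note that numbers inside the generic payload map decode as float64, so a subscriber that needs integer fields such as a PID would convert them or decode into a typed struct.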
Alerting and Rules Engine
GSMonitor incorporates an embedded rules engine based on the Drools rule language, enabling users to define threshold‑based alerts. Alert rules can reference any exported metric and are evaluated at every collection interval.
Alerts are surfaced via the REST API, stored in a local log, and can optionally be forwarded to third‑party systems (e.g., PagerDuty, Slack) via webhooks.
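Threshold evaluation of this kind can be sketched in plain Go as follows. This deliberately does not use the Drools rule language mentioned above; the Rule and Alert types, the operator set, and the metric names are hypothetical and only illustrate comparing exported metrics against thresholds once per collection interval.

```go
package main

import "fmt"

// Rule is a hypothetical threshold rule over one exported metric.
type Rule struct {
	Name      string
	Metric    string
	Op        string // one of ">", ">=", "<", "<="
	Threshold float64
}

// Alert records a rule that fired and the value that triggered it.
type Alert struct {
	Rule   string
	Metric string
	Value  float64
}

// fires reports whether value violates the rule's threshold.
func (r Rule) fires(value float64) (bool, error) {
	switch r.Op {
	case ">":
		return value > r.Threshold, nil
	case ">=":
		return value >= r.Threshold, nil
	case "<":
		return value < r.Threshold, nil
	case "<=":
		return value <= r.Threshold, nil
	default:
		return false, fmt.Errorf("rule %q: unsupported operator %q", r.Name, r.Op)
	}
}

// evaluate checks every rule against the latest metric values.
func evaluate(rules []Rule, metrics map[string]float64) ([]Alert, error) {
	var alerts []Alert
	for _, r := range rules {
		value, ok := metrics[r.Metric]
		if !ok {
			continue // metric was not collected in this interval
		}
		fired, err := r.fires(value)
		if err != nil {
			return nil, err
		}
		if fired {
			alerts = append(alerts, Alert{Rule: r.Name, Metric: r.Metric, Value: value})
		}
	}
	return alerts, nil
}

func main() {
	rules := []Rule{
		{Name: "HighCPU", Metric: "cpu_usage_percent", Op: ">", Threshold: 90},
		{Name: "LowDisk", Metric: "disk_free_percent", Op: "<", Threshold: 10},
	}
	metrics := map[string]float64{"cpu_usage_percent": 95.5, "disk_free_percent": 40}
	alerts, err := evaluate(rules, metrics)
	fmt.Println(alerts, err)
}
```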
Extensibility
The plug‑in API allows developers to expose custom metrics, such as application‑specific counters or latency measurements, without modifying the core daemon. A plug‑in registry can be used to discover available community‑contributed collectors.
Usage and Deployment
Installation
GSMonitor binaries are available for Linux, macOS, and Windows. Users may download precompiled binaries from the project’s releases page or build from source using the Go toolchain.
- Install Go 1.18 or later.
- Clone the repository:
  git clone https://github.com/gsmonitor/gsmonitor.git
- Build:
  go build -o gsmonitor ./cmd/gsmonitor
- Place the binary in /usr/local/bin or a suitable directory.
- Create the configuration file at /etc/gsmonitor/gsmonitor.hcl.
- Start the service (on systems with systemd):
  systemctl start gsmonitor
Configuration Best Practices
- Define separate configurations for production and development environments, adjusting collection intervals to balance accuracy and resource usage.
- Enable TLS in production to protect metric traffic, especially when exposing metrics on public networks.
- Use the plug‑in system to disable unused collectors, reducing memory footprint.
- Periodically rotate TLS certificates and update the configuration accordingly.
- Leverage RBAC tokens to enforce least‑privilege access to monitoring endpoints.
Containerized Deployment
GSMonitor can run as a sidecar container within Kubernetes pods. To access kernel metrics, the pod is configured to use the host network namespace, and the host’s /proc and /sys filesystems are mounted into the sidecar. A typical Dockerfile might look like:
FROM golang:1.22-alpine AS builder
WORKDIR /app
COPY . .
RUN go build -o gsmonitor ./cmd/gsmonitor
FROM alpine:latest
COPY --from=builder /app/gsmonitor /usr/local/bin/gsmonitor
CMD ["gsmonitor"]
Integration with Prometheus
To scrape GSMonitor’s metrics endpoint, add a job to the Prometheus scrape configuration:
scrape_configs:
  - job_name: gsmonitor
    static_configs:
      - targets: ['localhost:9100']
Alert Routing
Alerts defined within GSMonitor can be forwarded to Alertmanager by configuring a webhook endpoint. GSMonitor sends an HTTP POST containing alert details in JSON format, which Alertmanager processes and routes to configured receivers.
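The forwarding step amounts to an HTTP POST with a JSON body. The sketch below shows one way that could look; the Alert payload shape (labels and annotations) is an assumption loosely modelled on the alert objects Alertmanager works with, and postAlerts and captureWebhook are hypothetical helpers. A local test server stands in for the receiving endpoint so the example runs without any external service.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// Alert is a hypothetical webhook payload entry.
type Alert struct {
	Labels      map[string]string `json:"labels"`
	Annotations map[string]string `json:"annotations,omitempty"`
}

// postAlerts sends the alerts as a JSON array and treats non-2xx as failure.
func postAlerts(client *http.Client, url string, alerts []Alert) error {
	body, err := json.Marshal(alerts)
	if err != nil {
		return err
	}
	resp, err := client.Post(url, "application/json", bytes.NewReader(body))
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode < 200 || resp.StatusCode > 299 {
		return fmt.Errorf("webhook returned %s", resp.Status)
	}
	return nil
}

// captureWebhook posts to a throwaway local server that answers with status
// and returns the request body that server received.
func captureWebhook(status int, alerts []Alert) (string, error) {
	var received string
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		data, _ := io.ReadAll(r.Body)
		received = string(data)
		w.WriteHeader(status)
	}))
	err := postAlerts(srv.Client(), srv.URL, alerts)
	srv.Close() // waits for the handler to finish before received is read
	return received, err
}

func main() {
	alerts := []Alert{{Labels: map[string]string{"alertname": "HighCPU", "severity": "warning"}}}
	body, err := captureWebhook(http.StatusOK, alerts)
	fmt.Println(body, err)
}
```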
Integration with Monitoring Ecosystems
Prometheus and Grafana
GSMonitor’s native Prometheus exporter makes it straightforward to visualise system metrics in Grafana dashboards. Community dashboards exist for CPU, memory, disk, and network metrics, as well as for Docker and Kubernetes resource usage.
OpenTelemetry Collector
GSMonitor can export metrics via the OpenTelemetry protocol to the OpenTelemetry Collector. This facilitates unified telemetry ingestion across heterogeneous systems and supports exporters to cloud provider monitoring services such as Google Cloud Operations, AWS CloudWatch, and Azure Monitor.
Elastic Stack
By configuring the persistence layer to forward metrics to Elasticsearch, GSMonitor integrates with the Elastic Stack. Combined with Beats agents, this creates a comprehensive observability platform covering logs, metrics, and traces.
Grafana Loki and Tempo
Although GSMonitor does not generate logs, its WebSocket event stream can be captured and forwarded to Loki for log aggregation, and its metrics can be correlated with traces in Tempo by associating request IDs.
Security Considerations
Runtime Hardening
GSMonitor runs as a non‑root user by default, restricting file system access to necessary /proc, /sys, and container runtime sockets. File permissions on the configuration file and plug‑in directory are set to 640 to prevent unauthorized modifications.
Network Exposure
When metrics are exposed over the network, TLS should be enabled (enable_tls = true) to prevent eavesdropping on sensitive performance data. Additionally, firewall rules should restrict access to the metrics port to trusted IP ranges.
Plug‑in Verification
Plug‑ins are required to be signed by the project’s maintainers or by a trusted third‑party. The daemon verifies the signature before loading a plug‑in, preventing the execution of malicious code.
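The verify-before-load step can be illustrated with a detached signature check. The signature scheme GSMonitor uses is not stated here, so Ed25519 is an assumption chosen because it is available in the Go standard library; verifyPlugin and demo are hypothetical helpers, and the byte slice stands in for the contents of a plug‑in file.

```go
package main

import (
	"crypto/ed25519"
	"crypto/rand"
	"errors"
	"fmt"
)

// verifyPlugin checks a detached signature over the library bytes against a
// trusted public key. Ed25519 is an assumed scheme, not a documented one.
func verifyPlugin(pub ed25519.PublicKey, library, signature []byte) error {
	if len(pub) != ed25519.PublicKeySize {
		return errors.New("trusted key has the wrong length")
	}
	if !ed25519.Verify(pub, library, signature) {
		return errors.New("plug-in signature does not match the trusted key")
	}
	return nil
}

// demo signs a stand-in library with a fresh key pair, optionally tampers
// with the library afterwards, and returns the verification result.
func demo(tamper bool) error {
	pub, priv, err := ed25519.GenerateKey(rand.Reader)
	if err != nil {
		return err
	}
	library := []byte("contents of cpu.so")
	signature := ed25519.Sign(priv, library)
	if tamper {
		library = append(library, 0x90) // simulate a modified binary
	}
	return verifyPlugin(pub, library, signature)
}

func main() {
	fmt.Println("untouched:", demo(false))
	fmt.Println("tampered: ", demo(true))
}
```

Only the public key needs to ship with the daemon; the private key stays with whoever signs releases.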
Audit Logging
All configuration changes, plug‑in load/unload events, and alert firings are recorded in an audit log. The audit log is stored in a write‑once, append‑only format and can be rotated based on size or time policies.
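One simple way to approximate an append-only log is to open the file in append mode and write one JSON object per line, as in the sketch below. The AuditEntry fields, the action names, and the 0640 file mode are assumptions for illustration; append mode alone does not stop a privileged user from rewriting the file, so stronger write-once guarantees would need support from the filesystem or storage layer.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// AuditEntry is a hypothetical audit record.
type AuditEntry struct {
	Time   time.Time `json:"time"`
	Action string    `json:"action"`
	Detail string    `json:"detail"`
}

// appendAudit writes one JSON line; O_APPEND makes every write land at the
// current end of the file.
func appendAudit(path string, e AuditEntry) error {
	line, err := json.Marshal(e)
	if err != nil {
		return err
	}
	f, err := os.OpenFile(path, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0o640)
	if err != nil {
		return err
	}
	if _, err := f.Write(append(line, '\n')); err != nil {
		f.Close()
		return err
	}
	return f.Close()
}

// demoLines appends two entries to a temporary log and counts its lines.
func demoLines() (int, error) {
	dir, err := os.MkdirTemp("", "gsmonitor-audit")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "audit.log")
	for _, action := range []string{"config.change", "plugin.load"} {
		entry := AuditEntry{Time: time.Now().UTC(), Action: action, Detail: "example"}
		if err := appendAudit(path, entry); err != nil {
			return 0, err
		}
	}
	data, err := os.ReadFile(path)
	if err != nil {
		return 0, err
	}
	return bytes.Count(data, []byte("\n")), nil
}

func main() {
	n, err := demoLines()
	fmt.Println("audit lines written:", n, err)
}
```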
Community and Ecosystem
Governance
GSMonitor follows a meritocratic governance model. Core maintainers manage issue triage, pull request reviews, and release cycles. Contributors can submit feature requests, bug reports, and pull requests through the project's issue tracker.
Documentation
The project provides comprehensive documentation covering installation, configuration, plug‑in development, and troubleshooting. Documentation is hosted on GitHub Pages and is updated with each release.
Contributors
Since its open‑source release, GSMonitor has attracted contributions from over 120 developers worldwide. Notable contributors include the authors of the Prometheus client library, the OpenTelemetry Go SDK, and the Docker Go client.
Third‑Party Plug‑ins
Community plug‑ins extend GSMonitor’s capabilities. Examples include:
- aws-cloudwatch-collector: Exports AWS CloudWatch metrics for EC2 instances.
- prometheus-query-collector: Executes arbitrary PromQL queries and publishes the results as custom metrics.
- kubernetes-event-collector: Captures Kubernetes events and emits them as JSON metrics.
Future Directions
Adaptive Sampling
Research is underway to enable GSMonitor to adjust collection frequencies based on system load. By reducing sampling rates during low activity periods, the agent can further minimize its resource footprint.
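Since this feature is still at the research stage, the following is purely a speculative sketch of one possible policy: keep the base interval while load is at or above a "busy" threshold, and stretch it linearly toward a maximum as load falls to zero. The function name, its parameters, and the linear rule are all assumptions.

```go
package main

import (
	"fmt"
	"time"
)

// nextInterval stretches the collection interval as load drops. load and busy
// are fractions in [0, 1]: at or above busy the base interval is used, and at
// zero load the longest interval is used.
func nextInterval(base, longest time.Duration, load, busy float64) time.Duration {
	if longest < base {
		longest = base
	}
	if busy <= 0 || load >= busy {
		return base
	}
	if load < 0 {
		load = 0
	}
	idle := 1 - load/busy // 1 when fully idle, approaching 0 near the busy threshold
	return base + time.Duration(float64(longest-base)*idle)
}

func main() {
	for _, load := range []float64{0, 0.4, 0.8, 1} {
		fmt.Printf("load %.1f -> %v\n", load, nextInterval(15*time.Second, time.Minute, load, 0.8))
	}
}
```

A production policy would probably smooth the load signal first, so that a brief spike or lull does not make the interval oscillate.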
AI‑Based Anomaly Detection
Integrating lightweight machine‑learning models for anomaly detection could allow GSMonitor to autonomously identify unusual patterns in resource usage and surface them as alerts.
Cross‑Platform Expansion
While GSMonitor currently supports Linux, macOS, and Windows, plans exist to port the core engine to FreeBSD and OpenBSD, broadening its applicability in enterprise environments.
Enhanced Observability APIs
Future releases will expose a gRPC API for programmatic metric ingestion and control, aligning with Kubernetes operator patterns and enabling declarative management of monitoring agents.