Documentation ¶
Overview ¶
Package agent implements the Beszel monitoring agent that collects and serves system metrics.
The agent runs on monitored systems and communicates collected data to the Beszel hub for centralized monitoring and alerting.
Index ¶
- func DeleteFingerprint(dataDir string) error
- func GetAddress(addr string) string
- func GetDataDir(dataDirs ...string) (string, error)
- func GetEnv(key string) (value string, exists bool)
- func GetFingerprint(dataDir, hostname, cpuModel string) string
- func GetNetwork(addr string) string
- func NewSystemDataCache() *systemDataCache
- func ParseKeys(input string) ([]gossh.PublicKey, error)
- func SaveFingerprint(dataDir, fingerprint string) error
- func Update(useMirror bool) error
- type Agent
- type CheckFingerprintHandler
- type ConnectionEvent
- type ConnectionManager
- type ConnectionState
- type CpuMetrics
- type DeviceInfo
- type GPUManager
- type GetContainerInfoHandler
- type GetContainerLogsHandler
- type GetDataHandler
- type GetSmartDataHandler
- type GetSystemdInfoHandler
- type HandlerContext
- type HandlerRegistry
- type NicConfig
- type RequestHandler
- type Responder
- type RocmSmiJson
- type SensorConfig
- type ServerOptions
- type SmartManager
- type WebSocketClient
- func (client *WebSocketClient) Close()
- func (client *WebSocketClient) Connect() (err error)
- func (client *WebSocketClient) OnClose(conn *gws.Conn, err error)
- func (client *WebSocketClient) OnMessage(conn *gws.Conn, message *gws.Message)
- func (client *WebSocketClient) OnOpen(conn *gws.Conn)
- func (client *WebSocketClient) OnPing(conn *gws.Conn, message []byte)
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func DeleteFingerprint ¶ added in v0.18.4
DeleteFingerprint removes the saved fingerprint file from the data directory. Returns nil if the file does not exist (idempotent).
func GetAddress ¶
GetAddress determines the network address to listen on from various sources. It checks the provided address, then environment variables (LISTEN, PORT), and finally defaults to ":45876".
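The lookup order described above can be sketched as follows. This is a minimal illustration, not the package's actual code: `resolveAddress` and the injected `lookup` function are hypothetical names, and the step that turns a bare port such as "8080" into ":8080" is an assumption the description does not confirm.

```go
package main

import (
	"fmt"
	"strings"
)

// resolveAddress is a hypothetical stand-in for GetAddress. It prefers the
// explicit address, then the LISTEN variable, then PORT, then ":45876".
// lookup has the same shape as os.LookupEnv so it can be faked in tests.
func resolveAddress(addr string, lookup func(string) (string, bool)) string {
	if addr == "" {
		addr, _ = lookup("LISTEN")
	}
	if addr == "" {
		addr, _ = lookup("PORT")
	}
	if addr == "" {
		return ":45876"
	}
	// Assumption: a bare port number is given a leading colon.
	if !strings.HasPrefix(addr, "/") && !strings.Contains(addr, ":") {
		addr = ":" + addr
	}
	return addr
}

func main() {
	env := map[string]string{"PORT": "8080"}
	lookup := func(k string) (string, bool) { v, ok := env[k]; return v, ok }

	fmt.Println(resolveAddress("", lookup))               // :8080
	fmt.Println(resolveAddress("127.0.0.1:9000", lookup)) // 127.0.0.1:9000
}
```

Passing the lookup function in, rather than reading the environment directly, keeps the sketch testable without mutating process state.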
func GetDataDir ¶ added in v0.18.4
GetDataDir returns the path to the agent's data directory, or an error if the directory is not valid. If no data directories are provided, it attempts to find the optimal one.
func GetEnv ¶
GetEnv retrieves an environment variable with a "BESZEL_AGENT_" prefix, or falls back to the unprefixed key.
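A rough sketch of the prefix-then-fallback lookup, assuming the prefixed name is simply "BESZEL_AGENT_" followed by the key. `getEnvWith` and `lookupFunc` are hypothetical names introduced for illustration; the real function reads the process environment directly.

```go
package main

import (
	"fmt"
	"os"
)

const envPrefix = "BESZEL_AGENT_"

// lookupFunc matches the signature of os.LookupEnv so it can be replaced
// with a fake in tests.
type lookupFunc func(key string) (string, bool)

// getEnvWith is a hypothetical stand-in for GetEnv: it tries the prefixed
// name first and falls back to the unprefixed key.
func getEnvWith(lookup lookupFunc, key string) (value string, exists bool) {
	if v, ok := lookup(envPrefix + key); ok {
		return v, true
	}
	return lookup(key)
}

func main() {
	os.Setenv("BESZEL_AGENT_LISTEN", ":9000")
	os.Setenv("LISTEN", ":8000")

	v, ok := getEnvWith(os.LookupEnv, "LISTEN")
	fmt.Println(v, ok) // :9000 true
}
```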
func GetFingerprint ¶ added in v0.18.4
GetFingerprint returns the agent fingerprint. It first tries to read a saved fingerprint from the data directory. If none is found (or dataDir is empty), it generates one from system properties. The hostname and cpuModel parameters are used as fallback material if host.HostID() fails. If either is empty, it is fetched from the system automatically.
If a new fingerprint is generated and a dataDir is provided, it is saved.
func GetNetwork ¶
GetNetwork determines the network type based on the address format. It checks the NETWORK environment variable first, then infers from the address format: addresses starting with "/" are "unix", others are "tcp".
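The inference rule can be illustrated with a short sketch. `resolveNetwork` and the injected `lookup` function are hypothetical; treating an empty NETWORK value as unset is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// resolveNetwork is a hypothetical stand-in for GetNetwork. The NETWORK
// variable wins when set; otherwise a leading "/" means a unix socket path.
func resolveNetwork(addr string, lookup func(string) (string, bool)) string {
	// Assumption: an empty NETWORK value is treated the same as unset.
	if network, ok := lookup("NETWORK"); ok && network != "" {
		return network
	}
	if strings.HasPrefix(addr, "/") {
		return "unix"
	}
	return "tcp"
}

func main() {
	unset := func(string) (string, bool) { return "", false }

	fmt.Println(resolveNetwork(":45876", unset))          // tcp
	fmt.Println(resolveNetwork("/run/agent.sock", unset)) // unix
}
```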
func NewSystemDataCache ¶ added in v0.13.0
func NewSystemDataCache() *systemDataCache
NewSystemDataCache creates a cache keyed by the polling interval in milliseconds.
func ParseKeys ¶
ParseKeys parses a string containing SSH public keys in authorized_keys format. It returns the parsed keys as a slice of ssh.PublicKey, or an error if any key fails to parse.
func SaveFingerprint ¶ added in v0.18.4
SaveFingerprint writes the fingerprint to the data directory.
Types ¶
type Agent ¶
type Agent struct {
sync.Mutex // Used to lock agent while collecting data
// contains filtered or unexported fields
}
func NewAgent ¶
NewAgent creates a new agent with the given data directory for persisting data. If the data directory is not set, it will attempt to find the optimal directory.
func (*Agent) Start ¶
func (a *Agent) Start(serverOptions ServerOptions) error
Start initializes and starts the agent with optional WebSocket connection
func (*Agent) StartServer ¶
func (a *Agent) StartServer(opts ServerOptions) error
StartServer starts the SSH server with the provided options. It configures the server with secure defaults, sets up authentication, and begins listening for connections. Returns an error if the server is already running or if there's an issue starting the server.
func (*Agent) StopServer ¶
StopServer stops the SSH server if it's running. It returns an error if the server is not running or if there's an error stopping it.
type CheckFingerprintHandler ¶ added in v0.13.0
type CheckFingerprintHandler struct{}
CheckFingerprintHandler handles authentication challenges
func (*CheckFingerprintHandler) Handle ¶ added in v0.13.0
func (h *CheckFingerprintHandler) Handle(hctx *HandlerContext) error
type ConnectionEvent ¶
type ConnectionEvent uint8
ConnectionEvent represents connection-related events that can occur.
const (
WebSocketConnect ConnectionEvent = iota // WebSocket connection established
WebSocketDisconnect // WebSocket connection lost
SSHConnect // SSH connection established
SSHDisconnect // SSH connection lost
)
Connection events
type ConnectionManager ¶
type ConnectionManager struct {
State ConnectionState // Current connection state
ConnectionType system.ConnectionType
// contains filtered or unexported fields
}
ConnectionManager manages the connection state and events for the agent. It handles both WebSocket and SSH connections, automatically switching between them based on availability and managing reconnection attempts.
func (*ConnectionManager) Start ¶
func (c *ConnectionManager) Start(serverOptions ServerOptions) error
Start begins connection attempts and enters the main event loop. It handles connection events, periodic health updates, and graceful shutdown.
type ConnectionState ¶
type ConnectionState uint8
ConnectionState represents the current connection state of the agent.
const (
Disconnected ConnectionState = iota // No active connection
WebSocketConnected // Connected via WebSocket
SSHConnected // Connected via SSH
)
Connection states
type CpuMetrics ¶ added in v0.15.3
type CpuMetrics struct {
Total float64
User float64
System float64
Iowait float64
Steal float64
Idle float64
}
CpuMetrics contains detailed CPU usage breakdown
type DeviceInfo ¶ added in v0.15.0
type GPUManager ¶
type GPUManager struct {
sync.Mutex
GpuDataMap map[string]*system.GPUData
// contains filtered or unexported fields
}
GPUManager manages data collection for GPUs (either Nvidia or AMD)
func NewGPUManager ¶
func NewGPUManager() (*GPUManager, error)
NewGPUManager creates and initializes a new GPUManager
func (*GPUManager) GetCurrentData ¶
func (gm *GPUManager) GetCurrentData(cacheKey uint16) map[string]system.GPUData
GetCurrentData returns GPU utilization data averaged since the last call with this cacheKey
type GetContainerInfoHandler ¶ added in v0.14.0
type GetContainerInfoHandler struct{}
GetContainerInfoHandler handles container info requests
func (*GetContainerInfoHandler) Handle ¶ added in v0.14.0
func (h *GetContainerInfoHandler) Handle(hctx *HandlerContext) error
type GetContainerLogsHandler ¶ added in v0.14.0
type GetContainerLogsHandler struct{}
GetContainerLogsHandler handles container log requests
func (*GetContainerLogsHandler) Handle ¶ added in v0.14.0
func (h *GetContainerLogsHandler) Handle(hctx *HandlerContext) error
type GetDataHandler ¶ added in v0.13.0
type GetDataHandler struct{}
GetDataHandler handles system data requests
func (*GetDataHandler) Handle ¶ added in v0.13.0
func (h *GetDataHandler) Handle(hctx *HandlerContext) error
type GetSmartDataHandler ¶ added in v0.15.0
type GetSmartDataHandler struct{}
GetSmartDataHandler handles SMART data requests
func (*GetSmartDataHandler) Handle ¶ added in v0.15.0
func (h *GetSmartDataHandler) Handle(hctx *HandlerContext) error
type GetSystemdInfoHandler ¶ added in v0.16.0
type GetSystemdInfoHandler struct{}
GetSystemdInfoHandler handles detailed systemd service info requests
func (*GetSystemdInfoHandler) Handle ¶ added in v0.16.0
func (h *GetSystemdInfoHandler) Handle(hctx *HandlerContext) error
type HandlerContext ¶ added in v0.13.0
type HandlerContext struct {
Client *WebSocketClient
Agent *Agent
Request *common.HubRequest[cbor.RawMessage]
RequestID *uint32
HubVerified bool
// SendResponse abstracts how a handler sends responses (WS or SSH)
SendResponse func(data any, requestID *uint32) error
}
HandlerContext provides context for request handlers
type HandlerRegistry ¶ added in v0.13.0
type HandlerRegistry struct {
// contains filtered or unexported fields
}
HandlerRegistry manages the mapping between actions and their handlers
func NewHandlerRegistry ¶ added in v0.13.0
func NewHandlerRegistry() *HandlerRegistry
NewHandlerRegistry creates a new handler registry with default handlers
func (*HandlerRegistry) GetHandler ¶ added in v0.13.0
func (hr *HandlerRegistry) GetHandler(action common.WebSocketAction) (RequestHandler, bool)
GetHandler returns the handler for a specific action
func (*HandlerRegistry) Handle ¶ added in v0.13.0
func (hr *HandlerRegistry) Handle(hctx *HandlerContext) error
Handle routes the request to the appropriate handler
func (*HandlerRegistry) Register ¶ added in v0.13.0
func (hr *HandlerRegistry) Register(action common.WebSocketAction, handler RequestHandler)
Register registers a handler for a specific action type
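The registry pattern described by Register, GetHandler, and Handle can be sketched with stand-in types. Everything below is hypothetical: `action` stands in for common.WebSocketAction, `handlerContext` for HandlerContext, and the error returned for an unregistered action is an assumption about behavior the documentation does not specify.

```go
package main

import "fmt"

// action and handlerContext are simplified, hypothetical stand-ins.
type action uint8

type handlerContext struct {
	action  action
	payload string
}

type requestHandler interface {
	Handle(hctx *handlerContext) error
}

// handlerFunc lets a plain function satisfy requestHandler.
type handlerFunc func(hctx *handlerContext) error

func (f handlerFunc) Handle(hctx *handlerContext) error { return f(hctx) }

type registry struct {
	handlers map[action]requestHandler
}

func newRegistry() *registry {
	return &registry{handlers: make(map[action]requestHandler)}
}

// Register associates a handler with an action, replacing any earlier one.
func (r *registry) Register(a action, h requestHandler) { r.handlers[a] = h }

// GetHandler reports whether a handler exists for the action.
func (r *registry) GetHandler(a action) (requestHandler, bool) {
	h, ok := r.handlers[a]
	return h, ok
}

// Handle routes the request to the registered handler.
// Assumption: an unknown action yields an error.
func (r *registry) Handle(hctx *handlerContext) error {
	h, ok := r.GetHandler(hctx.action)
	if !ok {
		return fmt.Errorf("no handler for action %d", hctx.action)
	}
	return h.Handle(hctx)
}

func main() {
	const getData action = 0
	reg := newRegistry()
	reg.Register(getData, handlerFunc(func(hctx *handlerContext) error {
		fmt.Println("handling:", hctx.payload)
		return nil
	}))

	fmt.Println(reg.Handle(&handlerContext{action: getData, payload: "ping"}))
	fmt.Println(reg.Handle(&handlerContext{action: 9}))
}
```

Keeping dispatch in a map means a new request type only needs a Register call rather than another branch in a switch statement.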
type NicConfig ¶ added in v0.12.11
type NicConfig struct {
// contains filtered or unexported fields
}
NicConfig controls inclusion/exclusion of network interfaces via the NICS env var
Behavior mirrors SensorConfig's matching logic:
- Leading '-' means blacklist mode; otherwise whitelist mode
- Supports '*' wildcards using path.Match
- In whitelist mode with an empty list, no NICs are selected
- In blacklist mode with an empty list, all NICs are selected
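The matching rules above can be sketched as a small filter. The names `nicFilter`, `parseNicFilter`, and `selected` are hypothetical, and the comma-separated format of the NICS value is an assumption; only the leading '-' rule, the path.Match wildcards, and the empty-list behavior come from the description.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// nicFilter is a hypothetical stand-in for NicConfig.
type nicFilter struct {
	patterns  []string
	blacklist bool
}

// parseNicFilter reads a NICS-style value.
// Assumption: patterns are separated by commas.
func parseNicFilter(value string) nicFilter {
	var f nicFilter
	if strings.HasPrefix(value, "-") {
		f.blacklist = true
		value = value[1:]
	}
	for _, p := range strings.Split(value, ",") {
		if p = strings.TrimSpace(p); p != "" {
			f.patterns = append(f.patterns, p)
		}
	}
	return f
}

// selected reports whether the named interface passes the filter.
func (f nicFilter) selected(name string) bool {
	matched := false
	for _, p := range f.patterns {
		// A malformed pattern is treated as a non-match.
		if ok, err := path.Match(p, name); err == nil && ok {
			matched = true
			break
		}
	}
	if f.blacklist {
		return !matched
	}
	return matched
}

func main() {
	f := parseNicFilter("-docker*,lo")
	for _, name := range []string{"eth0", "docker0", "lo"} {
		fmt.Println(name, f.selected(name))
	}
	// Prints:
	// eth0 true
	// docker0 false
	// lo false
}
```

With an empty pattern list nothing ever matches, so whitelist mode selects no interfaces and blacklist mode selects all of them, which agrees with the last two rules.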
type RequestHandler ¶ added in v0.13.0
type RequestHandler interface {
// Handle processes the request and returns an error if unsuccessful
Handle(hctx *HandlerContext) error
}
RequestHandler defines the interface for handling specific websocket request types
type Responder ¶ added in v0.13.0
Responder sends handler responses back to the hub (over WS or SSH)
type RocmSmiJson ¶
type RocmSmiJson struct {
ID string `json:"GUID"`
Name string `json:"Card series"`
Temperature string `json:"Temperature (Sensor edge) (C)"`
MemoryUsed string `json:"VRAM Total Used Memory (B)"`
MemoryTotal string `json:"VRAM Total Memory (B)"`
Usage string `json:"GPU use (%)"`
PowerPackage string `json:"Average Graphics Package Power (W)"`
PowerSocket string `json:"Current Socket Graphics Package Power (W)"`
}
RocmSmiJson represents the JSON structure of rocm-smi output
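As a hedged illustration of how struct tags like these map onto JSON keys, the sketch below decodes an invented sample into a trimmed-down copy of the struct. The sample values are made up, `parseRocmSmi` is a hypothetical helper, and the top-level shape (one object per card, keyed by a name such as "card0") is an assumption about rocm-smi's output rather than something this page states.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rocmSmiJson copies a subset of the RocmSmiJson fields and tags.
type rocmSmiJson struct {
	ID          string `json:"GUID"`
	Name        string `json:"Card series"`
	Temperature string `json:"Temperature (Sensor edge) (C)"`
	Usage       string `json:"GPU use (%)"`
}

// Invented sample data; real output will differ.
const sample = `{
  "card0": {
    "GUID": "1234",
    "Card series": "Example GPU",
    "Temperature (Sensor edge) (C)": "45.0",
    "GPU use (%)": "12"
  }
}`

// parseRocmSmi decodes the assumed card-keyed layout.
func parseRocmSmi(data []byte) (map[string]rocmSmiJson, error) {
	var cards map[string]rocmSmiJson
	if err := json.Unmarshal(data, &cards); err != nil {
		return nil, err
	}
	return cards, nil
}

func main() {
	cards, err := parseRocmSmi([]byte(sample))
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	c := cards["card0"]
	fmt.Println(c.Name, c.Temperature, c.Usage) // Example GPU 45.0 12
}
```

All fields are strings, so numeric values such as temperature or usage would still need converting (for example with strconv.ParseFloat) before being stored as metrics.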
type SensorConfig ¶
type SensorConfig struct {
// contains filtered or unexported fields
}
type ServerOptions ¶
type ServerOptions struct {
Addr string // Network address to listen on (e.g., ":45876" or "/path/to/socket")
Network string // Network type ("tcp" or "unix")
Keys []gossh.PublicKey // SSH public keys for authentication
}
ServerOptions contains configuration options for starting the SSH server.
type SmartManager ¶ added in v0.15.0
type SmartManager struct {
sync.Mutex
SmartDataMap map[string]*smart.SmartData
SmartDevices []*DeviceInfo
// contains filtered or unexported fields
}
SmartManager manages data collection for SMART devices
func NewSmartManager ¶ added in v0.15.0
func NewSmartManager() (*SmartManager, error)
NewSmartManager creates and initializes a new SmartManager
func (*SmartManager) CollectSmart ¶ added in v0.15.0
func (sm *SmartManager) CollectSmart(deviceInfo *DeviceInfo) error
CollectSmart collects SMART data for a device.
- Collects data using `smartctl -d <type> -aj /dev/<device>` when the device type is known.
- Always attempts to parse the output even if the command fails, as some data may still be available.
- If collection fails, it returns an error.
- If collection succeeds, it parses the output and updates the SmartDataMap.
- Uses -n standby to avoid waking up sleeping disks, but bypasses standby mode for initial data collection when no cached data exists.
func (*SmartManager) GetCurrentData ¶ added in v0.15.0
func (sm *SmartManager) GetCurrentData() map[string]smart.SmartData
GetCurrentData returns the current SMART data
func (*SmartManager) Refresh ¶ added in v0.15.0
func (sm *SmartManager) Refresh(forceScan bool) error
Refresh updates SMART data for all known devices
func (*SmartManager) ScanDevices ¶ added in v0.15.0
func (sm *SmartManager) ScanDevices(force bool) error
ScanDevices scans for SMART devices using `smartctl --scan -j`. If the scan fails, it returns an error. If the scan succeeds, it parses the output and updates the SmartDevices slice.
type WebSocketClient ¶
type WebSocketClient struct {
gws.BuiltinEventHandler
Conn *gws.Conn // Active WebSocket connection
// contains filtered or unexported fields
}
WebSocketClient manages the WebSocket connection between the agent and hub. It handles authentication, message routing, and connection lifecycle management.
func (*WebSocketClient) Close ¶
func (client *WebSocketClient) Close()
Close closes the WebSocket connection gracefully. This method is safe to call multiple times.
func (*WebSocketClient) Connect ¶
func (client *WebSocketClient) Connect() (err error)
Connect establishes a WebSocket connection to the hub. It closes any existing connection before attempting to reconnect.
func (*WebSocketClient) OnClose ¶
func (client *WebSocketClient) OnClose(conn *gws.Conn, err error)
OnClose handles WebSocket connection closure. It logs the closure reason and notifies the connection manager.
func (*WebSocketClient) OnMessage ¶
func (client *WebSocketClient) OnMessage(conn *gws.Conn, message *gws.Message)
OnMessage handles incoming WebSocket messages from the hub. It decodes CBOR messages and routes them to appropriate handlers.
func (*WebSocketClient) OnOpen ¶
func (client *WebSocketClient) OnOpen(conn *gws.Conn)
OnOpen handles WebSocket connection establishment. It sets a deadline for the connection to prevent hanging.
Source Files ¶
- agent.go
- agent_cache.go
- client.go
- connection_manager.go
- cpu.go
- data_dir.go
- disk.go
- docker.go
- emmc_common.go
- emmc_linux.go
- fingerprint.go
- gpu.go
- gpu_amd_linux.go
- gpu_darwin_unsupported.go
- gpu_intel.go
- gpu_nvml_unsupported.go
- gpu_nvtop.go
- handlers.go
- network.go
- response.go
- sensors.go
- sensors_default.go
- server.go
- smart.go
- smart_nonwindows.go
- system.go
- systemd.go
- update.go
- utils.go
Directories ¶
| Path | Synopsis |
|---|---|
| battery | Package battery provides functions to check if the system has a battery and to get the battery stats. |
| deltatracker | Package deltatracker provides a tracker for calculating differences in numeric values over time. |
| health | Package health provides functions to check and update the health of the agent. |
| tools | |
| tools/fetchsmartctl (command) | |