Observability
This document describes the observability features implemented in the telecom module. These features enable users and administrators to monitor system performance, track usage, and maintain visibility into the health of the telecom infrastructure.
User Observability Features
The telecom module provides several APIs that allow users to monitor their usage and the status of their assigned nodes.
Traffic Usage Tracking
The AnalysisService exposes each user's recent traffic history through the GetRecentTrafficUsage API.
API Endpoint
- gRPC Service: Telecom.GetRecentTrafficUsage
- Request: GetRecentTrafficUsageRequest
- Response: RecentTrafficUsageResponse
Implementation Details
The service provides traffic data in different time ranges:
- Day: Hourly bucketing for the last 24 hours
- Week: Daily bucketing for the last 7 days
- Month: Daily bucketing for the last 30 days
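The bucketing rules above can be sketched as a small lookup. This is only an illustration: the `RecentRange` defined here is a hypothetical stand-in for the module's own type, and the method names `bucket_seconds`, `bucket_count`, and `window_seconds` are assumptions, not part of the telecom API.

```rust
/// Hypothetical stand-in for the module's `RecentRange`; the real enum may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RecentRange {
    Day,
    Week,
    Month,
}

impl RecentRange {
    /// Width of one bucket in seconds: hourly for Day, daily for Week and Month.
    pub fn bucket_seconds(self) -> u64 {
        match self {
            RecentRange::Day => 3_600,
            RecentRange::Week | RecentRange::Month => 86_400,
        }
    }

    /// Number of buckets that cover the whole range.
    pub fn bucket_count(self) -> u64 {
        match self {
            RecentRange::Day => 24,
            RecentRange::Week => 7,
            RecentRange::Month => 30,
        }
    }

    /// Total length of the queried window in seconds.
    pub fn window_seconds(self) -> u64 {
        self.bucket_seconds() * self.bucket_count()
    }
}
```

Under these assumptions a Day query covers 24 hourly buckets, while Week and Month queries cover 7 and 30 daily buckets respectively.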
Key Components:
- Raw Traffic: Actual traffic consumed by the user
- Billed Traffic: Traffic that was actually charged to the user’s quota (may differ due to traffic multipliers)
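To make the difference between raw and billed traffic concrete, the sketch below applies a multiplier to a raw byte count. It assumes the multiplier is a plain floating-point factor and that results are rounded to the nearest byte; neither detail is stated in this document, and `billed_traffic` is a hypothetical helper name.

```rust
/// Billed bytes for `raw_bytes` under a traffic multiplier.
/// Assumption: billed = raw * factor, rounded to the nearest byte.
pub fn billed_traffic(raw_bytes: u64, traffic_factor: f64) -> u64 {
    // The real rounding rule is not documented here; `round` is an illustrative choice.
    (raw_bytes as f64 * traffic_factor).round() as u64
}
```

With a factor of 1.0 the two series are identical; any other factor makes the billed series diverge from the raw one, which is why the response carries both.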
// Usage example in service layer
use crate::services::analysis::{AnalysisService, GetRecentTrafficUsage, RecentRange};

let usage_response = analysis_service.process(GetRecentTrafficUsage {
    user_id,
    range: RecentRange::Day, // or Week, Month
}).await?;

// Response contains two data sets:
// - usage_response.raw: actual traffic consumed
// - usage_response.actually_billed: traffic charged to quota
Database Queries Used:
- GetUserHourlyUsage: For day-range queries
- GetUserDailyUsage: For week/month-range queries
Node Status History
Users can monitor the historical status of their assigned proxy nodes through the ListNodeStatusHistory API.
API Endpoint
- gRPC Service: Telecom.ListNodeStatusHistory
- Request: ListNodeStatusHistoryRequest
- Response: ListNodeStatusHistoryReply
Implementation Details
The API provides hourly aggregated node status information:
- Online Nodes: Count of nodes that were online in each hour
- Offline Nodes: Count of nodes that were offline in each hour
- Maintenance Nodes: Count of nodes under maintenance in each hour
// Usage example
use crate::services::analysis::{AnalysisService, ListUserNodeStatusHistory};

let history = analysis_service.process(ListUserNodeStatusHistory {
    start: start_time,
    end: end_time,
    user_id,
}).await?;

// Each history entry contains:
// - bucket_start: timestamp of the hour
// - online_nodes, offline_nodes, maintenance_nodes: counts for that hour
Data Source: Uses materialized view node_status_hourly_mv for efficient querying.
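The hourly aggregation can be pictured with an in-memory sketch. It is not the materialized view's actual definition, which this document does not show. It assumes each sample is a `(unix_seconds, status)` pair with one sample per node per hour, and the type and function names (`NodeStatus`, `HourlyCounts`, `aggregate_hourly`) are hypothetical.

```rust
use std::collections::BTreeMap;

/// Hypothetical status enum mirroring the three states listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum NodeStatus {
    Online,
    Offline,
    Maintenance,
}

/// Per-hour counters, named after the fields of a history entry.
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
pub struct HourlyCounts {
    pub online_nodes: u32,
    pub offline_nodes: u32,
    pub maintenance_nodes: u32,
}

/// Groups `(unix_seconds, status)` samples by the start of their hour.
pub fn aggregate_hourly(samples: &[(u64, NodeStatus)]) -> BTreeMap<u64, HourlyCounts> {
    let mut buckets: BTreeMap<u64, HourlyCounts> = BTreeMap::new();
    for &(timestamp, status) in samples {
        // Truncate the timestamp to the beginning of its hour.
        let bucket_start = timestamp - timestamp % 3_600;
        let counts = buckets.entry(bucket_start).or_default();
        match status {
            NodeStatus::Online => counts.online_nodes += 1,
            NodeStatus::Offline => counts.offline_nodes += 1,
            NodeStatus::Maintenance => counts.maintenance_nodes += 1,
        }
    }
    buckets
}
```

A `BTreeMap` keeps the buckets ordered by `bucket_start`, which matches how a time series is usually returned to a client.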
Node Usage Analytics
The system tracks which nodes users utilize most frequently through the ListUsuallyUsedNodes API.
API Endpoint
- gRPC Service: Telecom.ListUsuallyUsedNodes
- Request: ListUsuallyUsedNodesRequest
- Response: ListUsuallyUsedNodesResponse
Implementation Details
The API reports the user’s node usage patterns:
- Node Information: ID, name of frequently used nodes
- Traffic Statistics: Upload, download, and billed traffic per node
// Usage example
use crate::services::analysis::{AnalysisService, ListUserUsuallyUsedNodes};

let nodes = analysis_service.process(ListUserUsuallyUsedNodes {
    user_id,
}).await?;

// Each node entry contains:
// - node_client_id, node_name: identification
// - upload, download, billed_traffic: usage statistics
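One way a caller might rank the returned entries is sketched below. The `NodeUsage` struct only mirrors the field names listed above; its field types, the `top_nodes` helper, and the choice of billed traffic as the ranking key are all assumptions made for illustration.

```rust
/// Hypothetical mirror of one entry in the response; field types are assumed.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct NodeUsage {
    pub node_client_id: i32,
    pub node_name: String,
    pub upload: i64,
    pub download: i64,
    pub billed_traffic: i64,
}

/// Returns the `limit` nodes with the highest billed traffic, highest first.
pub fn top_nodes(mut nodes: Vec<NodeUsage>, limit: usize) -> Vec<NodeUsage> {
    // Sort in descending order of billed traffic.
    nodes.sort_by(|a, b| b.billed_traffic.cmp(&a.billed_traffic));
    nodes.truncate(limit);
    nodes
}
```

Ranking by `upload + download` instead would surface the nodes with the most raw traffic rather than the ones that cost the most quota.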
Node List and Status
Users can view their available nodes and their current status through the ListNodes API.
API Endpoint
- gRPC Service: Telecom.ListNodes
- Request: ListNodesRequest
- Response: ListNodesReply
Implementation Details
The API returns current information about the user’s assigned nodes:
- Node Details: ID, name, traffic factor, display order
- Performance Info: Speed limits, current status
- Metadata: Country, location, route class
Admin Observability Features
Administrators have access to comprehensive monitoring and management capabilities for the entire telecom infrastructure.
Server Monitoring
List Node Servers
Admins can monitor all proxy servers in the system.
- gRPC Service: NodeServerManage.ListNodeServers
- Features:
  - Filter by server status (Online/Offline/Maintenance)
  - Pagination support (limit/offset)
  - Shows server compatibility, status, last online time, and client count
// Usage example in manage service
// NodeServerStatus is assumed to be in scope; its module path is not shown in this document.
use crate::services::manage::{AdminListServers, ManageService};

let servers = manage_service.process(AdminListServers {
    limit: 50,
    offset: 0,
    filter_status: Some(NodeServerStatus::Offline), // Optional filter
}).await?;
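The interaction between the optional status filter and limit/offset pagination can be illustrated in memory. This sketch assumes the filter is applied before pagination, which is the usual convention but is not confirmed here; `ServerRow` and `list_servers` are hypothetical names, and the enum below is a stand-in for the module's `NodeServerStatus`.

```rust
/// Stand-in for the module's `NodeServerStatus`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum NodeServerStatus {
    Online,
    Offline,
    Maintenance,
}

/// Hypothetical, heavily reduced server record.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ServerRow {
    pub id: i32,
    pub status: NodeServerStatus,
}

/// Applies the optional status filter first, then offset and limit.
pub fn list_servers(
    rows: &[ServerRow],
    filter_status: Option<NodeServerStatus>,
    limit: usize,
    offset: usize,
) -> Vec<ServerRow> {
    rows.iter()
        // `None` means "no filter": every row passes.
        .filter(|row| filter_status.map_or(true, |wanted| row.status == wanted))
        .skip(offset)
        .take(limit)
        .cloned()
        .collect()
}
```

Because the offset counts filtered rows, paging through offline servers with `offset: 50` skips the first 50 offline servers, not the first 50 servers overall.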
Show Individual Server Details
- gRPC Service: NodeServerManage.ShowNodeServer
- Features:
  - Complete server configuration
  - Current status and performance metrics
  - Last online timestamp
Node Client Management
List All Node Clients
Admins can list every proxy node client in the system.
- gRPC Service: NodeClientManage.ListNodeClients
- Features:
  - Complete client information including server relationships
  - Traffic factors and routing configurations
  - Status monitoring and metadata
Individual Client Details
- gRPC Service: NodeClientManage.ShowNodeClient
- Features:
  - Detailed client configuration
  - Associated server information
  - Performance and status metrics
Package Queue Monitoring
Queue Statistics
Monitor package queue health and performance.
- gRPC Service: PackageQueueManage.CountQueuedPackages
- Features:
  - Count of packages by series
  - Queue status overview
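Counting packages by series amounts to a group-and-count. The sketch below shows the idea in memory. It assumes a series is identified by a string, which this document does not specify, and `QueuedPackage` and `count_by_series` are hypothetical names.

```rust
use std::collections::BTreeMap;

/// Hypothetical, reduced view of a queued package.
pub struct QueuedPackage {
    pub id: i64,
    pub series: String,
}

/// Counts queued packages per series.
pub fn count_by_series(packages: &[QueuedPackage]) -> BTreeMap<String, usize> {
    let mut counts = BTreeMap::new();
    for package in packages {
        // Insert a zero counter on first sight of a series, then increment.
        *counts.entry(package.series.clone()).or_insert(0) += 1;
    }
    counts
}
```

In a real deployment the same result would more likely come from a `GROUP BY` in the database than from application code.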
Package List Management
- gRPC Service: PackageQueueManage.ListQueuedPackages
- Features:
  - Filter by user, order, package, or status
  - Pagination support
  - Complete package lifecycle visibility
Background Job Monitoring
The telecom module runs several scheduled jobs for system maintenance and monitoring:
Node Health Monitoring (RefreshServerStatus)
- Purpose: Automatically mark servers as online/offline based on heartbeat
- Frequency: Configurable via TelecomConfig.node_health_check.offline_timeout
- Implementation: TelecomCronExecutor in cron.rs
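A heartbeat-based check of this kind can be sketched as a pure function. The sketch assumes timestamps are Unix seconds and that a server is considered offline once its silence exceeds the configured timeout. `HealthStatus` and `status_from_heartbeat` are hypothetical names, and how the real job treats servers in maintenance is not covered here.

```rust
use std::time::Duration;

/// Hypothetical two-state result; maintenance handling is out of scope for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum HealthStatus {
    Online,
    Offline,
}

/// Classifies a server from its last heartbeat. All times are Unix seconds.
pub fn status_from_heartbeat(
    last_heartbeat: u64,
    now: u64,
    offline_timeout: Duration,
) -> HealthStatus {
    // saturating_sub guards against clock skew where the heartbeat appears to be in the future.
    let silence = Duration::from_secs(now.saturating_sub(last_heartbeat));
    if silence > offline_timeout {
        HealthStatus::Offline
    } else {
        HealthStatus::Online
    }
}
```

Keeping the decision free of I/O makes the timeout behaviour easy to unit test separately from the cron executor.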
Package Expiration Management (PackageExpiringJob)
- Purpose: Automatically expire packages based on time limits
- Frequency: Regular scanning for expired packages
- Events: Publishes PackageExpiringEvent to message queue
Traffic Billing Processing (BillTrafficJob)
- Purpose: Process unbilled traffic usage and publish billing events
- Frequency: Regular processing of accumulated traffic data
- Events: Publishes UserUsageBillingEvent for each user
Node Status History Recording (RecordNodeStatusHistoryJob)
- Purpose: Record current status of all node servers for historical analysis
- Frequency: Hourly status snapshots
- Storage: Populates node_status_history table
Status View Refresh (RefreshNodeStatusViewJob)
- Purpose: Refresh the materialized view for efficient status queries
- Frequency: Regular refresh of node_status_hourly_mv
- Optimization: Includes data cleanup and analysis for performance
Event-Driven Observability
Usage Billing Events
The system processes usage data through asynchronous events:
UserUsageBillingEvent
- Publisher: External systems (cron jobs, usage collectors)
- Consumer: TelecomBillingHook
- Route: telecom.user_usage_billing
- Purpose: Record user traffic consumption and trigger package expiration
// Event structure
pub struct UserUsageBillingEvent {
    pub server_id: i32,
    pub user: Uuid,
    pub billed_download: i64,
    pub billed_upload: i64,
    pub time: u64,
}
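To show how a consumer might fold a batch of these events into per-user totals, here is a minimal sketch. It is not the `TelecomBillingHook` implementation. `UsageEvent` is a stand-in that replaces `Uuid` with `u128` so the example needs no external crate, and `total_billed_per_user` is a hypothetical helper.

```rust
use std::collections::HashMap;

/// Stand-in for the event above; `user` is a `u128` instead of a `Uuid`.
#[derive(Debug, Clone)]
pub struct UsageEvent {
    pub server_id: i32,
    pub user: u128,
    pub billed_download: i64,
    pub billed_upload: i64,
    pub time: u64,
}

/// Sums billed download and upload bytes per user across a batch of events.
pub fn total_billed_per_user(events: &[UsageEvent]) -> HashMap<u128, i64> {
    let mut totals = HashMap::new();
    for event in events {
        // Both directions count toward the user's total in this sketch.
        *totals.entry(event.user).or_insert(0) += event.billed_download + event.billed_upload;
    }
    totals
}
```

A consumer could compare such totals against the remaining quota of a package to decide when to trigger expiration, though the exact rule is not described in this document.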
PackageExpiringEvent
- Publisher: Telecom billing system
- Route: Package expiration processing
- Purpose: Handle package lifecycle events
Tracing and Instrumentation
All RPC endpoints and critical services are instrumented for tracing:
- Instrumentation: Uses tracing::instrument for observability
- Error Logging: Structured error reporting with context
- Performance Tracking: Request/response times and error rates
Database Schema for Observability
Core Tables
node_status_history
Stores historical node status data:
- id: Primary key
- node_server_id: Reference to node server
- status: Online/Offline/Maintenance
- created_at: Timestamp
user_package_usage
Tracks user traffic consumption:
- Hourly and daily aggregations
- Raw and billed traffic separation
- User and server associations
Materialized Views
node_status_hourly_mv
Optimized view for status history queries:
- Hourly aggregations of node status
- Efficient querying for analytics
- Automatic refresh via cron jobs
Usage Guidelines
For Users
- Use GetRecentTrafficUsage to monitor bandwidth consumption
- Check ListNodeStatusHistory for node availability patterns
- Analyze ListUsuallyUsedNodes to optimize node selection
- Monitor ListNodes for real-time node status
For Administrators
- Use server management APIs to monitor infrastructure health
- Monitor package queues for system performance
- Review cron job logs for automated maintenance status
- Analyze event streams for system-wide observability
Development Considerations
- All APIs follow the Processor pattern
- Database connections use owned types, not static lifetimes
- Comprehensive error handling with structured logging
- Event-driven architecture for scalable monitoring
- Materialized views for performance-critical queries
Together, these APIs, background jobs, and events give both users and administrators the data they need to monitor, analyze, and optimize the telecom service.