Helium
Helium is a modern commercial VPN SaaS system built with Rust, focusing on scalability, security, and user-friendliness.
Features
- Kubernetes/Docker native: stateless, horizontally scalable, and easy to deploy.
- High security: no shell execution, no deserialization vulnerabilities, and no SQL injection.
- Pluggable frontend: a fully featured gRPC API makes it easy to build your own frontend.
- Lightweight: as little as 40 MB of memory per service, handling 1000+ requests per second on a single 1-core CPU server.
- Advanced selling system: handles complex business strategies and is designed to scale to user bases in the billions.
Tech Stack
- Rust: Memory-safe systems programming with C-level performance
- gRPC + Tonic: High-performance API with type-safe contracts
- PostgreSQL + SQLx: Reliable database with compile-time query validation
- Redis: Fast in-memory caching and session storage
- AMQP: Reliable message queuing for microservices
- Tokio: Async runtime for handling thousands of concurrent connections
Key Advantages:
- Microservices architecture with independent scaling
- Container-native design for Kubernetes deployment
- Memory safety eliminates entire classes of security vulnerabilities
- Exceptional performance: 1000+ RPS on single-core CPU with 40MB memory usage
Microservices Architecture
Helium is built as a collection of focused microservices that cooperate through a shared set of contracts, messaging patterns, and observability tooling. This section introduces the high-level layout of the system, explains how the services interact, and highlights the infrastructure choices that enable the platform to scale for large commercial VPN deployments.
Architectural Goals
- Independent scaling – Each service can be deployed and scaled based on its workload characteristics (API traffic, background jobs, email throughput, etc.).
- Clear boundaries – Services expose well-defined APIs (gRPC, REST, AMQP events) and depend on shared libraries for cross-cutting concerns, ensuring that business logic remains isolated inside its module.
- Operational resiliency – Stateless services, database connection pooling, and message queues allow resilient deployments with graceful failure handling.
- Security by design – Rust, strict processor patterns, and zero shared mutable state within processes prevent memory safety issues and accidental privilege escalations.
Service Topology
The helium-server crate can run in multiple worker modes. Each mode is
packaged into its own container image or deployment unit, providing a natural
microservice boundary while reusing the same codebase and shared libraries.
| Worker | Entry Point | Responsibilities |
|---|---|---|
| grpc | GrpcWorker | Exposes gRPC APIs for all business domains (Auth, Manage, Telecom, Market, Shop, Support, etc.). Performs request validation, invokes the corresponding module services, and emits events. |
| subscribe_api | SubscribeApiWorker | Provides REST endpoints optimized for subscription clients. Primarily a read-heavy facade backed by Redis caching and the service layer. |
| webhook_api | WebHookApiWorker | Receives payment gateway callbacks and external partner webhooks, normalizes payloads, and dispatches workflow events. |
| consumer | ConsumerWorker | Listens on AMQP queues for asynchronous jobs (billing, node updates, provisioning) emitted by other services. Orchestrates long-running tasks that should not block API responses. |
| mailer | MailerWorker | Specialized consumer responsible for templated email delivery, retry management, and transactional messaging. |
| cron_executor | CronWorker | Periodically scans for scheduled work (subscription renewals, quota resets, health checks) and dispatches jobs via the same service layer used by the API workers. |
These workers are deployed independently and scaled according to throughput requirements. For example, a busy billing period can scale the consumer and cron workers without affecting the gRPC API footprint.
Domain-Oriented Modules
Each domain (Auth, Manage, Telecom, Shop, Market, Notification, Support, Shield,
Mailer) is implemented as an independent module under modules/. Modules follow
a common layout (entities, services, rpc, hooks, events) as described in the
Project Structure Guide. Within the microservices
architecture:
- Modules provide service layer processors that encapsulate business logic.
- RPC layers expose the processors through gRPC servers. The `GrpcWorker` aggregates these services and mounts them behind a single TLS termination point, while keeping module ownership intact.
- Hooks and events enable cross-module interactions without tight coupling, allowing, for instance, the Telecom module to emit usage events consumed by the billing logic in the Manage module.
Communication Patterns
Helium combines synchronous APIs with asynchronous messaging to balance latency and resiliency.
gRPC Contract
- Tonic-generated servers provide strongly typed interfaces for customer-facing and operator APIs.
- A uniform Processor trait ensures every RPC delegate is testable in isolation and can be reused by background workers.
- Service discovery is handled at the infrastructure layer (Kubernetes or Docker Compose) because workers are stateless; clients load-balance using standard mechanisms (Envoy, NGINX, etc.).
REST Facades
- Subscription and webhook workers expose lightweight REST routes via Axum.
- REST APIs reuse the same service processors, ensuring identical business behavior across protocols and simplifying versioning.
Asynchronous Messaging
- RabbitMQ (AMQP) is used to propagate domain events and dispatch background jobs.
- Producers append metadata (correlation IDs, tenant identifiers) to support observability and reliable retries.
- Consumers acknowledge messages only after successful processing, preventing data loss during failures.
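To make the acknowledge-after-processing rule concrete, here is a minimal consumer sketch. It assumes the `lapin` and `futures-lite` crates and a hypothetical `billing.jobs` queue and `handle_job` function; the real consumer workers route jobs through the shared service layer instead.

```rust
// Minimal AMQP consumer sketch (assumes the `lapin` and `futures-lite` crates).
// The queue name, consumer tag, and `handle_job` are hypothetical.
use futures_lite::StreamExt;
use lapin::{options::*, types::FieldTable, Connection, ConnectionProperties};

async fn run_consumer(amqp_url: &str) -> Result<(), Box<dyn std::error::Error>> {
    let conn = Connection::connect(amqp_url, ConnectionProperties::default()).await?;
    let channel = conn.create_channel().await?;
    let mut consumer = channel
        .basic_consume(
            "billing.jobs",
            "helium-consumer",
            BasicConsumeOptions::default(),
            FieldTable::default(),
        )
        .await?;

    while let Some(delivery) = consumer.next().await {
        let delivery = delivery?;
        // Process first, acknowledge only on success: a crash mid-processing
        // leaves the message unacknowledged so the broker can redeliver it.
        match handle_job(&delivery.data).await {
            Ok(()) => delivery.ack(BasicAckOptions::default()).await?,
            Err(_) => {
                delivery
                    .nack(BasicNackOptions { requeue: true, ..Default::default() })
                    .await?
            }
        }
    }
    Ok(())
}

async fn handle_job(_payload: &[u8]) -> Result<(), Box<dyn std::error::Error>> {
    Ok(()) // placeholder for the real job handler
}
```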
Data Management
- PostgreSQL is the system of record. SQLx is used through the `DatabaseProcessor` abstraction to keep SQL isolated inside `entities/` modules and to support compile-time query checking.
- Redis provides ephemeral caches, session storage, and rate limiting. The `RedisConnection` wrapper from `helium-framework` manages pooled connections shared by API and worker processes.
- Consistent migrations live in the top-level `migrations/` directory and are applied during deployment. Services run with zero shared mutable state; all coordination happens through the database or message queues.
Observability & Operations
- Tracing is initialized in every worker with structured logs and span annotations. This enables distributed tracing across API and background workloads when combined with log collectors.
- Metrics exporters (e.g., Prometheus integration) can be attached at the deployment layer because each worker exposes a predictable Axum/Tonic server endpoint.
- Health probes: gRPC and REST workers perform dependency checks on startup (database, Redis, AMQP). Container orchestrators can use readiness/liveness probes to restart unhealthy instances.
Deployment Model
- Workers are packaged as lightweight containers (<50MB RSS) and designed to be horizontally scalable. Scaling policies are set per worker depending on CPU or queue length metrics.
- Configuration is provided through environment variables (`DATABASE_URL`, `MQ_URL`, `REDIS_URL`, `WORK_MODE`, etc.), making the platform 12-factor compliant; a configuration-loading sketch follows this list.
- Infrastructure typically consists of:
- Kubernetes/Docker orchestrating the worker deployments
- Managed PostgreSQL and Redis services
- RabbitMQ cluster for messaging
- Optional CDN or reverse proxy terminating TLS before forwarding requests to gRPC/REST workers.
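As a reference for the environment-driven configuration mentioned above, a minimal loading sketch; the `WorkerConfig` struct is hypothetical, only the variable names come from this document.

```rust
// Minimal 12-factor style configuration loading; `WorkerConfig` is illustrative.
use std::env;

struct WorkerConfig {
    database_url: String,
    mq_url: String,
    redis_url: String,
    work_mode: String,
}

fn load_config() -> Result<WorkerConfig, env::VarError> {
    Ok(WorkerConfig {
        database_url: env::var("DATABASE_URL")?,
        mq_url: env::var("MQ_URL")?,
        redis_url: env::var("REDIS_URL")?,
        work_mode: env::var("WORK_MODE")?, // e.g. "grpc", "consumer", "mailer"
    })
}
```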
Extensibility
Adding a new capability follows a repeatable pattern:
- Create or extend a module under `modules/` with the Processor-based service implementation.
- Expose the functionality via RPC/REST by wiring the service into the relevant worker.
- Emit domain events or enqueue background jobs when work must be processed asynchronously.
- Deploy the updated worker image; other workers continue functioning without redeployment because contracts are versioned explicitly.
This approach keeps Helium maintainable while providing the flexibility to grow with complex VPN SaaS requirements.
Helium Project Structure Guide
This document describes the modular architecture and organization of the Helium VPN SaaS system.
Project Overview
Helium is a modern VPN SaaS system built with Rust, organized as a workspace with multiple modules. The system follows a modular architecture where each module represents a specific business domain.
Module Architecture
Each module follows a consistent internal structure with standardized components:
1. Entity Layer (entities/)
Purpose: Data models and database access patterns
Structure:
entities/
├── mod.rs # Module exports
├── db/ # Database entity processors
│ ├── mod.rs
│ ├── user_account.rs # User account queries/commands
│ └── ...
└── redis/ # Redis entity processors
├── mod.rs
├── session.rs # Session cache operations
└── ...
Key Patterns:
- Implements `Processor<Input, Result<Output, sqlx::Error>>` for `DatabaseProcessor` (see the sketch after this list)
- Contains all SQL queries and database operations
- Separated by storage backend (db/ for PostgreSQL, redis/ for Redis)
- No business logic - pure data access
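The sketch below illustrates the entity-processor shape described above. The `Processor`/`DatabaseProcessor` signatures, the `pool()` accessor, and the table layout are assumptions for illustration; the real kanau trait and Helium entities may differ.

```rust
// Entity-processor sketch; types, signatures, and schema are illustrative only.
use uuid::Uuid;

/// Query input: look up one user account by ID.
pub struct FindUserAccountById {
    pub id: Uuid,
}

#[derive(Debug, sqlx::FromRow)]
pub struct UserAccountRow {
    pub id: Uuid,
    pub is_banned: bool,
}

impl Processor<FindUserAccountById, Result<Option<UserAccountRow>, sqlx::Error>> for DatabaseProcessor {
    async fn process(&self, input: FindUserAccountById) -> Result<Option<UserAccountRow>, sqlx::Error> {
        // Pure data access: a single query, no business rules.
        sqlx::query_as::<_, UserAccountRow>(
            r#"SELECT id, is_banned FROM auth."user" WHERE id = $1"#,
        )
        .bind(input.id)
        .fetch_optional(self.pool()) // `pool()` accessor is hypothetical
        .await
    }
}
```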
2. Service Layer (services/)
Purpose: Business logic orchestration and workflows
Structure:
services/
├── mod.rs # Service exports
├── manage.rs # Management operations
├── user_profile.rs # User profile management
└── ...
Key Patterns:
- Implements `Processor<Input, Result<Output, Error>>` for service operations
- Orchestrates multiple entity operations
- Handles validation, transformation, and business rules
- No direct SQL - delegates to entity processors
- Uses `DatabaseProcessor` for data access
Example:
#[derive(Clone)]
pub struct UserManageService {
pub db: sqlx::PgPool,
}
impl Processor<ListUsersRequest, Result<ListUsersResponse, Error>> for UserManageService {
async fn process(&self, input: ListUsersRequest) -> Result<ListUsersResponse, Error> {
let db = DatabaseProcessor::from_pool(self.db.clone());
let users = db.process(ListUsers { ... }).await?;
Ok(ListUsersResponse { users })
}
}
3. gRPC Layer (rpc/)
Purpose: gRPC service implementations and external API
Structure:
rpc/
├── mod.rs # RPC exports
├── auth_service.rs # Authentication gRPC service
├── manage_service.rs # Management gRPC service
├── middleware.rs # gRPC middleware
└── ...
Key Patterns:
- Implements generated gRPC trait definitions
- Converts protobuf messages to service DTOs
- Delegates to the service layer via `Processor::process`
- Handles authentication and authorization
4. Hook System (hooks/)
Purpose: Event-driven side effects and integrations
Structure:
hooks/
├── mod.rs # Hook exports
├── billing.rs # Billing event hooks
├── register.rs # Registration hooks
└── ...
Key Patterns:
- Responds to domain events
- Handles cross-module integrations
- Implements side effects (notifications, external API calls)
- Decoupled from main business flows
5. Event System (events/)
Purpose: Domain event definitions and publishing
Structure:
events/
├── mod.rs # Event exports
├── user.rs # User-related events
├── order.rs # Order events
└── ...
Key Patterns:
- Defines domain events using message queue integration
- Publishes events for cross-module communication
- Enables audit trails and analytics
- Supports eventual consistency patterns
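For illustration, a hedged sketch of defining and publishing a domain event over AMQP. The event type, exchange, and routing key are hypothetical, and the real modules publish through the shared message-queue helpers rather than a raw channel.

```rust
// Hypothetical domain event published over AMQP (assumes `lapin`, `serde`,
// `serde_json`, and `uuid`); the exchange and routing key are illustrative.
use lapin::{options::BasicPublishOptions, BasicProperties, Channel};
use serde::Serialize;
use uuid::Uuid;

#[derive(Debug, Serialize)]
pub struct UserRegisteredEvent {
    pub user_id: Uuid,
    pub registered_at: i64, // unix timestamp
}

pub async fn publish_user_registered(
    channel: &Channel,
    event: &UserRegisteredEvent,
) -> Result<(), Box<dyn std::error::Error>> {
    let payload = serde_json::to_vec(event)?;
    channel
        .basic_publish(
            "helium.events",        // hypothetical exchange
            "auth.user.registered", // hypothetical routing key
            BasicPublishOptions::default(),
            &payload,
            BasicProperties::default(),
        )
        .await? // publish accepted by the client
        .await?; // wait for the broker's publisher confirm (if enabled)
    Ok(())
}
```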
6. API Layer (api/)
Purpose: REST API endpoints and HTTP handlers
Structure:
api/
├── mod.rs # API exports
├── subscribe.rs # Subscription endpoints
└── xrayr/ # XrayR integration APIs
├── mod.rs
└── ...
Key Patterns:
- Implements REST endpoints using Axum
- Handles HTTP-specific concerns (parsing, serialization)
- Delegates to service layer
- Provides alternative to gRPC for specific use cases
7. Cron Jobs (cron.rs)
Purpose: Scheduled tasks and background jobs
Key Patterns:
- Implements periodic maintenance tasks
- Handles cleanup operations
- Manages recurring billing cycles
- Executes system health checks
8. Testing (tests/)
Purpose: Integration and unit tests
Structure:
tests/
├── common/ # Test utilities
│ └── mod.rs # Common test setup
├── service_name_test.rs # Service integration tests
└── ...
Key Patterns:
- Integration tests for complete workflows
- Uses testcontainers for database testing
- Isolated test environments
- Comprehensive service testing
Module Configuration
Dependencies (Cargo.toml)
Each module declares:
- Workspace dependencies (shared versions)
- Inter-module dependencies
- Module-specific dependencies
- Dev dependencies for testing
- Build dependencies (typically `tonic-prost-build` for gRPC)
Build Configuration (build.rs)
Most modules include build scripts for:
- gRPC code generation from proto files
- Custom compilation steps
- Environment-specific builds
Module Entry Point (lib.rs)
Standard module structure:
#![forbid(clippy::unwrap_used)]
#![forbid(unsafe_code)]
#![deny(clippy::expect_used)]
#![deny(clippy::panic)]
pub mod config;
pub mod cron;
pub mod entities;
pub mod events;
pub mod hooks; // Optional
pub mod api; // Optional
pub mod rpc;
pub mod services;
Protocol Buffers (proto/)
Organization: Organized by module with consistent naming:
proto/
├── auth/
│ ├── auth.proto # Core auth services
│ ├── account.proto # Account management
│ └── manage.proto # Admin operations
├── telecom/
│ ├── telecom.proto # VPN services
│ └── manage.proto # Telecom management
└── ...
Patterns:
- Service definitions mirror module structure
- Consistent message naming conventions
- Shared types in common proto files
Key Architectural Principles
1. Processor Pattern
All APIs use the kanau::processor::Processor trait for consistent interfaces and composability.
2. Separation of Concerns
- Entities: Data access only
- Services: Business logic only
- RPC/API: Protocol handling only
- Events/Hooks: Side effects only
3. Database Abstraction
Services never contain raw SQL - all database access goes through entity processors.
4. Event-Driven Architecture
Modules communicate via events to maintain loose coupling.
5. RBAC and Audit
Administrative operations implement consistent role-based access control and audit logging.
Development Guidelines
Adding a New Module
- Create module directory under `modules/`
- Add basic `Cargo.toml` with workspace dependencies
- Create `src/lib.rs` with standard module structure
- Add module to workspace `Cargo.toml`
- Create proto definitions if gRPC services needed
- Implement entities → services → rpc layers in order
Testing Strategy
- Unit tests for complex business logic in services
- Integration tests in the `tests/` directory
- Use `testcontainer-helium-modules` for database tests
- Mock external dependencies
- Test error handling paths
Documentation Standards
- Document all public APIs
- Include examples for complex workflows
- Maintain this guide as modules evolve
- Document breaking changes in module changelogs
This modular architecture enables independent development, testing, and deployment of features while maintaining system coherence through standardized patterns and interfaces.
Admin Manage Module
The Admin Manage module provides the operational control plane for Helium. It brings together the tooling that internal staff need to configure commercial VPN offerings, supervise subscriber lifecycle tasks, and monitor the reliability of the distributed network. While customer-facing applications interact with the public APIs, the Admin Manage module focuses on privileged workflows such as policy curation, partner management, and sensitive account intervention.
At a glance, the module enables administrators to:
- Onboard new business units, partners, and reseller organizations.
- Provision and maintain privileged operator accounts with fine-grained access scopes.
- Configure catalog data (plans, bundles, promotions) that the market and shop domains surface to end customers.
- Oversee subscriber management, including suspension, KYC verification, and support escalations.
- Inspect operational telemetry generated by other Helium modules to triage issues quickly.
The Manage module is intentionally integrated with the platform’s observability, billing, and identity services. By housing these capabilities in one place, Helium ensures that administrative actions respect the same audit and security guarantees enforced across the rest of the microservices architecture.
Admin Account System
The Manage service implements a purpose-built administrator directory that is separate from customer identities. It stores control-plane operators, delegated tenant managers, and read-only auditors together with the access credentials required to call privileged APIs.
Account Personas and Records
Every administrator entry stores an immutable id, display name, granted
role, and optional contact metadata. Platform operators run the Helium cloud
and may assume any tenant context. Tenant administrators belong to a single
customer tenant and can invite peers. Auditors observe configuration and
compliance state without mutation rights. Each record retains lifecycle
timestamps (creation, last login, invitation usage) so governance reports can be
produced without touching runtime logs.
Registration Workflow
Administrator onboarding is handled by invitation and implemented inside
AdminAuthService:
- Invitation lookup – The service receives a `RegisterAdmin` command and uses `FindAdminInvitationByToken` to retrieve the invitation that seeded the registration. Missing invitations resolve to `RegisterAdminResult::InvalidInvitation`.
- Usage guard – If the invitation was already consumed (`invitation.used`), the service short-circuits with `RegisterAdminResult::InvitationUsed` to stop replayed activation links.
- Account creation – A new UUID is minted and passed to `CreateAdminAccount` together with the invite’s role and the operator-supplied profile fields. This persists the administrator row and binds it to the permission set determined during invitation issuance.
- API key provisioning – `generate_admin_token` produces an opaque API key (passkey). The token plus key label (`key_name`) are stored through `CreateAdminToken`, ensuring future logins can authenticate against the database record.
- Invitation finalization – `UseAdminInvitation` marks the invitation as used so the one-time link cannot be replayed.
- Response – The service returns `RegisterAdminResult::Success` with the freshly created `admin_id` and the plaintext API token. The caller is responsible for presenting that key securely to the new administrator; it is never persisted in plaintext elsewhere.
This flow guarantees that registration can only occur with a valid invitation and that each administrator starts with at least one API credential for future logins.
Login Workflow
Subsequent logins exchange the stored API token for a short-lived JWT access credential:
- API key lookup – `AdminAuthService` receives an `AdminLogin` command containing the submitted API token that was minted during registration (or from a later `CreateAdminToken` issuance). It invokes `FindAdminPasskeyByToken` to resolve the underlying admin key row stored in `admin.admin_key`. Absent or revoked tokens yield `AdminLoginResult::KeyNotFound`.
- Account hydration – With the key resolved, the service loads the administrator profile through `FindAdminById`. Missing accounts are treated as a failed login to avoid leaking information about deleted users.
- JWT configuration – The service clones its Redis connection and calls `find_config_from_redis::<AdminJwtConfig>` to load the current signing material, issuer, audience, and expiration settings.
- Claim assembly – Using `AdminJwtClaims`, the service sets `sub` to the admin ID, `name` and `role` to the profile details, and timestamps (`iat`, `exp`) based on `OffsetDateTime::now_utc()` plus the configured TTL.
- Token issuance – The encoded claims are signed using the JWT encoder from `AdminJwtConfig::encode()`. On success the service returns `AdminLoginResult::Success(AdminAccessToken)`; failures bubble up as framework errors.
The caller receives only the access token string. Subsequent Manage API calls
must include it (for example, as a Bearer token) so workers can authorize the
request.
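A condensed sketch of the claim assembly and signing step, assuming the `jsonwebtoken` crate. The real `AdminJwtClaims`/`AdminJwtConfig` types live in the Manage module and may differ in field names and algorithm.

```rust
// Illustrative claim assembly and signing (assumes `jsonwebtoken`, `time`, `uuid`).
use jsonwebtoken::{encode, EncodingKey, Header};
use serde::Serialize;
use time::{Duration, OffsetDateTime};
use uuid::Uuid;

#[derive(Serialize)]
struct AdminJwtClaims {
    sub: Uuid,    // admin ID
    name: String, // display name
    role: String, // lower-cased role, e.g. "super_admin"
    iss: String,
    aud: String,
    iat: i64,
    exp: i64,
}

fn issue_admin_token(
    admin_id: Uuid,
    name: &str,
    role: &str,
    issuer: &str,
    audience: &str,
    ttl: Duration,
    secret: &[u8],
) -> Result<String, jsonwebtoken::errors::Error> {
    let now = OffsetDateTime::now_utc();
    let claims = AdminJwtClaims {
        sub: admin_id,
        name: name.to_owned(),
        role: role.to_lowercase(),
        iss: issuer.to_owned(),
        aud: audience.to_owned(),
        iat: now.unix_timestamp(),
        exp: (now + ttl).unix_timestamp(),
    };
    // Default header means HS256; the real config may select a different algorithm.
    encode(&Header::default(), &claims, &EncodingKey::from_secret(secret))
}
```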
Access Token Semantics
The issued JWT embeds the administrator’s role in lower case (via
role.to_jwt_string().to_lowercase()), the issuer/audience pair, and the expiry
chosen by configuration. Manage workers validate tokens on every request using
the same Redis-backed configuration, checking signature validity and time-based
claims before resolving RBAC permissions. Because API tokens act as a second
factor, operators are encouraged to rotate them regularly; new keys can be
created through the same CreateAdminToken processor while old keys are revoked
out-of-band.
Account Lifecycle
Beyond registration and login, Manage provides tools for invitation management, account suspension, and archival. Suspended accounts retain their historical record but cannot authenticate until reinstated. Archival removes active credentials yet preserves audit context, ensuring the administrator directory remains authoritative for compliance reporting.
Role-based Access Control
The Manage module implements role-based access control (RBAC) with a compact, code-first model. Every administrative API call flows through an authorization check that compares the caller’s stored role with a whitelist embedded in the operation being executed. This section documents the concrete design so other modules can plug into the same pattern.
Admin roles
Administrator accounts persist a strongly-typed role in the
admin.admin_account table. The role enum is defined in Rust as
AdminRole with four variants: SuperAdmin, Moderator,
CustomerSupport, and SupportBot. Each variant describes the maximum
authority an operator can have, and the enum provides helpers for serialising
the value into JWT claims:
#[derive(Debug, Clone, Copy, PartialEq, Eq, sqlx::Type, Serialize)]
#[serde(rename_all = "snake_case")]
#[sqlx(type_name = "admin.admin_role", rename_all = "snake_case")]
pub enum AdminRole {
/// The super admin is the highest level of admin. Super admin can use all manage APIs.
SuperAdmin,
/// The moderator is a lower level of admin. Moderator can use all non-sensitive APIs.
Moderator,
/// The customer support is a lower level of admin. Customer support can only access user management APIs.
CustomerSupport,
/// The support bot can access most of non-sensitive read APIs, but cannot access any write APIs.
SupportBot,
}
impl AdminRole {
pub fn to_jwt_string(&self) -> String {
match self {
AdminRole::SuperAdmin => "super_admin".to_string(),
AdminRole::Moderator => "moderator".to_string(),
AdminRole::CustomerSupport => "customer_support".to_string(),
AdminRole::SupportBot => "support_bot".to_string(),
}
}
}
The current Manage APIs are conservative: every write path and most read paths
are restricted to SuperAdmin. That choice is encoded directly in the service
layer and can be relaxed by expanding the allowed-role lists as more granular
policies are introduced. For example, here are some typical operation implementations:
#[derive(Debug, Clone, PartialEq, Eq, Serialize)]
pub struct CreateInvite {
pub inviter_id: Uuid,
pub role: AdminRole,
}
impl AdminOperation for CreateInvite {
const ALLOWED_ROLES: &'static [AdminRole] = &[AdminRole::SuperAdmin];
const OPERATION_NAME: &'static str = "create_invite";
const OPERATION_TARGET: &'static str = "admin";
fn to_audit_log(&self) -> Result<String, serde_json::Error> {
// ...
}
}
#[derive(Debug, Clone, PartialEq, Eq, Serialize)]
pub struct ChangeRole {
pub admin_id: Uuid,
pub role: AdminRole,
}
impl AdminOperation for ChangeRole {
const ALLOWED_ROLES: &'static [AdminRole] = &[AdminRole::SuperAdmin];
const OPERATION_NAME: &'static str = "change_role";
const OPERATION_TARGET: &'static str = "admin";
fn to_audit_log(&self) -> Result<String, serde_json::Error> {
// ...
}
}
#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize)]
pub struct ListAdmins {
pub limit: i64,
pub offset: i64,
}
impl AdminOperation for ListAdmins {
const ALLOWED_ROLES: &'static [AdminRole] = &[AdminRole::SuperAdmin];
const OPERATION_NAME: &'static str = "list_admins";
const OPERATION_TARGET: &'static str = "admin";
fn to_audit_log(&self) -> Result<String, serde_json::Error> {
// ...
}
}
Operation gating
Each RPC-facing action is modelled as a struct (for example CreateInvite or
ListAdmins) that implements the AdminOperation trait. The trait requires the
operation to publish three constants—ALLOWED_ROLES, OPERATION_NAME, and
OPERATION_TARGET—and a method for serialising audit metadata. The
ALLOWED_ROLES array is the crucial RBAC rule: it declares which AdminRole
values may invoke the action:
pub trait AdminOperation {
const ALLOWED_ROLES: &'static [AdminRole];
const OPERATION_NAME: &'static str;
const OPERATION_TARGET: &'static str;
fn check_permission(rule: AdminRole) -> bool {
// implemented function
}
fn with_admin_id(self, admin_id: Uuid) -> RecordedAdminOperation<Self>
where
Self: Sized,
{
// implemented function
}
// required function
fn to_audit_log(&self) -> Result<String, serde_json::Error>;
}
Requests are wrapped in an AuditLayer before they reach the underlying
service. The layer looks up the caller’s account, verifies that the stored role
is present in ALLOWED_ROLES, records an audit entry when appropriate, and only
then dispatches the call to the service implementation. Any mismatch results in
an immediate PermissionsDenied error without touching the business logic.
This keeps enforcement centralised and guarantees that logging and RBAC stay in
sync:
#[derive(Debug, Clone)]
pub struct AuditLayer {
// private fields
}
impl AuditLayer {
pub fn new(database_processor: DatabaseProcessor) -> Self {
// implemented function
}
// Main audited wrapper
async fn wrap<Oper, Output, Proc>(
&self,
processor: &Proc,
input: RecordedAdminOperation<Oper>,
) -> Result<Output, Error>
where
Oper: AdminOperation + Send,
Proc: Processor<RecordedAdminOperation<Oper>, Result<Output, Error>> + Send + Sync,
{
// implemented function
}
}
Reusing the RBAC pattern
Other Manage submodules—or even external crates—should follow the same contract when adding administrative features:
- Define a command object for the new action and implement `AdminOperation` on it. Select the minimal role set necessary for the task and provide a concise audit payload via `to_audit_log`.
- Process the command through `AuditLayer::wrap` (or `wrap_without_record` if the action should skip audit logging) so that permission checks and audit persistence always run.
- When exposing the command through gRPC or HTTP, ensure the request handler obtains the caller’s ID from the authentication middleware and forwards it by calling `.with_admin_id(...)` on the operation before handing it to the service.
By embedding role checks inside the operation type instead of scattering them through handler logic, the Manage module keeps RBAC auditable, testable, and easy to extend. Future modules can adopt finer-grained policies simply by expanding the enum or splitting operations with different allowed-role sets without having to rework middleware.
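To make the checklist concrete, a sketch of a hypothetical new operation wired through the audit layer. The operation, target, and `TelecomManageService` are illustrative; only the `AdminOperation`/`AuditLayer` shapes come from the code shown above.

```rust
// Hypothetical operation following the RBAC pattern; names are illustrative.
#[derive(Debug, Clone, PartialEq, Eq, Serialize)]
pub struct RotateNodeKey {
    pub node_id: i64,
}

impl AdminOperation for RotateNodeKey {
    const ALLOWED_ROLES: &'static [AdminRole] = &[AdminRole::SuperAdmin];
    const OPERATION_NAME: &'static str = "rotate_node_key";
    const OPERATION_TARGET: &'static str = "telecom";

    fn to_audit_log(&self) -> Result<String, serde_json::Error> {
        serde_json::to_string(self)
    }
}

// In the RPC handler: attach the caller, then run through the audit layer so the
// permission check and the audit record cannot be skipped.
async fn rotate_node_key(
    audit: &AuditLayer,
    service: &TelecomManageService, // hypothetical processor for this operation
    admin_id: Uuid,
    node_id: i64,
) -> Result<(), Error> {
    let op = RotateNodeKey { node_id }.with_admin_id(admin_id);
    audit.wrap(service, op).await
}
```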
Authentication & Audit
The Manage service authenticates every administrative RPC call and records the privileged work that survives RBAC checks. This document captures the moving pieces so existing contributors remember how the plumbing fits together and new contributors can reuse it correctly.
Access token issuance
AdminAuthService is responsible for exchanging an API key for a signed access
JWT. The AdminLogin processor performs the following steps:
- Look up the submitted API key with `FindAdminPasskeyByToken`. Unknown keys immediately return `AdminLoginResult::KeyNotFound`.
- Resolve the owning administrator via `FindAdminById`. Keys referencing a removed account are treated as unknown.
- Pull `AdminJwtConfig` from Redis using `find_config_from_redis`. The config bundle provides the HMAC/EdDSA encoder plus default issuer, audience, and expiry settings.
- Build `AdminJwtClaims` from the admin profile (ID, display name, role) and the timestamps computed from `OffsetDateTime::now_utc()`.
- Sign the JWT with the encoder returned by the config and wrap the token in `AdminAccessToken` for the RPC response.
Every Manage client must attach the returned JWT to subsequent requests using
x-admin-authorization. No refresh token logic exists—clients repeat the login
flow when the token expires.
Authentication middleware
The RPC server mounts AdminAuthLayer (see rpc::middleware) on every handler
that requires an authenticated admin. The middleware:
- Loads `AdminJwtConfig` from Redis (identical to the login flow) so it can reuse the configured decoder.
- Extracts `x-admin-authorization` from the incoming headers and validates the JWT. Invalid or missing tokens result in `Status::unauthenticated` once the gRPC method executes.
- Stores the authenticated administrator ID in the request extensions as `AdminId` so downstream handlers can fetch it with `AdminId::from_request(&Request)`.
When you add a new RPC surface, ensure the router is wrapped with
AdminAuthLayer::new(redis.clone()) and read the admin identifier from the
extensions instead of parsing the header manually. This keeps every handler in
sync with the central decoding logic and JWT configuration source.
Auditing and RBAC enforcement
AuditLayer wires authentication into authorization and durable audit trails.
Operations that mutate state are modelled as RecordedAdminOperation<T> where
T: AdminOperation describes the action being performed. Wrapping a processor
with the layer performs the following steps:
- Reload the administrator from the database (`FindAdminById`). Requests referencing a deleted admin fail with `Error::PermissionsDenied`.
- Emit tracing fields (`admin_name`, `admin_role`) for observability.
- Check whether the admin role satisfies the static allow-list declared on the operation type via `AdminOperation::check_permission`. Permission failures short-circuit the call with `Error::PermissionsDenied`.
- If permitted, transform the operation into an audit payload with `to_audit_log()` and persist it using `AddAuditLog`.
- Execute the wrapped processor and log success or failure.
Use AuditLayer::wrap_without_record when you only need the permission check
(for example, read-only operations). Otherwise prefer wrap so the audit table
stays authoritative.
Checklist for new handlers
- Accept the admin context by calling `AdminId::from_request` in the RPC entry point.
- Build the corresponding operation type that implements `AdminOperation`.
- Wrap the service processor with `AuditLayer::wrap` (or `wrap_without_record` for read endpoints).
- Propagate `Error::PermissionsDenied` back to the client untouched so callers see a clear 403-style error.
Following these conventions ensures every manage RPC reuses the same JWT validation, role checks, and audit recording logic.
Auth Module
The auth module owns every touch point around user identity: registration, login, and session lifecycle. When you need to change how accounts are created, how tokens are minted or validated, or how third‑party logins work, this is the place to start.
Overview
- Config (`config.rs`) – Centralizes runtime configuration for email rules, JWT token parameters, and OAuth providers. Look here when wiring new environment variables or tuning token TTLs.
- Entities (`entities/`) – Typed models for Redis and database records used during authentication, such as session IDs and magic links. Extending persistence schemas happens here.
- Services (`services/`) – Stateless services that implement registration flows, session issuance, and password utilities. Application code should depend on these instead of hand-rolling auth logic.
- RPC (`rpc/`) – Public gRPC/HTTP endpoints that expose authentication capabilities to other services. If you are adding a new feature, start by defining or updating the RPCs.
- OAuth (`oauth/`) – Provider integrations, OpenID helpers, and challenge storage. Any new provider or OpenID tweak belongs here.
- Cron (`cron.rs`) – Housekeeping jobs that prune expired challenges, OTPs, and sessions. Whenever you add a new time-bound artifact, ensure a cleanup job exists.
- Hooks & Events (`hooks/`, `events/`) – Event emission and subscriber glue for cross-module reactions (e.g., notifying other modules about new signups).
- Password (`password.rs`) – Hashing strategy and password policy helpers. Adjust this when requirements change.
Typical Extension Workflow
- Start with configuration – Introduce config structs or fields in `config.rs`, then surface them through `ConfigProvider` so deployments can set them.
- Update domain logic – Modify the relevant service in `services/` to implement the new behaviour. Use existing entity types or create new ones inside `entities/`.
- Expose interfaces – Adjust RPC handlers under `rpc/` (and optionally `events/` or `hooks/`) so callers can access the new functionality.
- Keep maintenance in mind – Schedule cleanups in `cron.rs` and emit events where downstream consumers expect them.
Usage Notes
- Always reuse the JWT helpers in `config::JwtConfig` when issuing tokens so validation rules stay consistent.
- When introducing a new OAuth provider, add its configuration to `config.rs`, implement the client under `oauth/`, and register it through the provider registry.
- Prefer high-level service APIs for authentication operations inside other modules; they encapsulate hashing, validation, and side effects.
- Tests live under `modules/auth/tests/`. Mirror the high-level flows there to keep regressions visible.
Keep this document in sync with structural changes so future maintainers know where to find the pieces they need.
User Account System
Core concepts
- User authentication record – `UserAuthAccount` is the canonical row in `auth.user`. It tracks the account UUID, ban flag, registration timestamp, and whether two-factor is enabled, while letting the same identity be accessed through multiple login surfaces:
#[derive(Debug, Clone, Copy, PartialEq, Eq, sqlx::FromRow)]
/// The core entity of user authentication.
///
/// This is the top-level entity that represents a user's authentication account.
/// It contains the user's ID, whether they are banned, and the date they registered.
///
/// The user can have multiple way to login, such as email, OAuth.
pub struct UserAuthAccount {
pub id: Uuid,
pub is_banned: bool,
pub registered_at: time::PrimitiveDateTime,
pub two_factor_enabled: bool,
}
- Profile vs. login surface – `UserProfile` keeps mutable presentation data (name, picture, marketing email, group membership, MFA flag) separate from credentials; the email stored here is not automatically a login method:
#[derive(Debug, Clone, PartialEq, Eq, sqlx::FromRow)]
/// The profile of a user.
pub struct UserProfile {
pub id: Uuid,
pub name: Option<String>,
pub picture: Option<String>,
/// The email address will be used for notification and marketing.
///
/// For the address used for authentication or security, refer to the EmailAccount entity.
pub email: Option<String>,
pub created_at: time::PrimitiveDateTime,
pub updated_at: time::PrimitiveDateTime,
/// User's group determines what production can be shown to the user.
pub user_group: i32,
/// User's extra groups are used to determine what production can be shown to the user.
/// Extra group is for private production.
pub user_extra_groups: Vec<i32>,
/// Whether MFA is enabled for the user
pub mfa_enabled: bool,
}
Dedicated tables hold actual login credentials: password-backed EmailAccount rows link an address and password hash to the user:
#[derive(Clone, PartialEq, Eq, sqlx::FromRow, Zeroize, ZeroizeOnDrop)]
pub struct EmailAccount {
pub id: i64,
pub email: String,
pub password_hash: CompactString,
pub user_id: Uuid,
}
Each OAuth connection lives in OAuthAccount with provider metadata and timestamps:
#[derive(Debug, Clone, PartialEq, Eq, sqlx::FromRow)]
/// The OAuth account of a user.
///
/// One user can have multiple OAuth accounts.
pub struct OAuthAccount {
pub id: i64,
pub user_id: Uuid,
pub provider_name: OAuthProviderName,
pub provider_user_id: String,
pub registered_at: PrimitiveDateTime,
pub token_updated_at: PrimitiveDateTime,
}
- Events and audits – Account binding/unbinding and security changes emit AMQP events (see `AccountBindEvent`, `AccountUnbindEvent`, `PasswordResetEvent`, etc.) so downstream systems can react or log activity. When you add new surfaces, remember to publish the appropriate events.
Account data layout
When a user registers through email, RegisterEmailAccount creates the auth.user row, seeds a profile, and stores the hashed password in auth.email_account:
#[derive(Clone, PartialEq, Eq, Zeroize, ZeroizeOnDrop)]
pub struct RegisterEmailAccount {
pub email: String,
pub password_hash: CompactString,
pub user_group: i32,
}
pub enum RegisterEmailAccountResult {
Success {
user_id: Uuid,
email_account_id: i64,
},
EmailAlreadyExists,
}
OAuth registrations follow the same pattern via RegisterOAuthAccount, inserting the profile with provider metadata before creating the first OAuth credential row:
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct RegisterOAuthAccount {
pub provider_name: OAuthProviderName,
pub provider_user_id: String,
pub email: Option<String>,
pub name: Option<String>,
pub picture: Option<String>,
pub user_group: i32,
}
The account service layers aggregate these shards on demand. UserManageService::process(ShowUserDetail) joins profile, auth flags, email login (if any), OAuth logins, and whether TOTP exists so dashboards can render a full state snapshot.
Whenever you extend the schema, double-check that:
- `CountUserLoginMethods` continues to report the real number of usable login paths (it currently sums email + OAuth rows).
- Removal flows (user-facing and admin) still guard against deleting the last login method.
- Admin-facing DTOs (`UserDetailResponse`, `UserSummary`) expose whatever additional surface you add for operations tooling.
Login flows and user self-service
Email/password login
EmailProviderService::process(EmailLogin) performs credential lookup, constant-time password verification (dummy hash fallback), and MFA evaluation before minting access/refresh JWTs through SessionService::CreateSession and emitting a UserLoginEvent for analytics:
#[derive(Clone, PartialEq, Eq)]
pub struct EmailLogin {
pub email: String,
pub password: String,
pub mfa: Option<MfaMethod>,
pub ip: Option<IpAddr>,
pub user_agent: Option<String>,
}
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum EmailLoginResult {
Success(AccessToken, RefreshToken),
WrongCredential,
RequireMfa,
MfaFailed,
NotFound,
}
The session creation process stores refresh tokens in Redis and generates JWT access tokens:
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct CreateSession {
pub user_id: Uuid,
pub login_method: LoginMethod,
pub ip: Option<std::net::IpAddr>,
pub user_agent: Option<String>,
}
Expect a RequireMfa or MfaFailed result when MFA is toggled on and the client omits or fails verification.
OAuth login
OAuthProviderService::process(OAuthLogin) validates the state challenge stored in Redis, exchanges the provider code for tokens, fetches user info, and either looks up or registers an OAuthAccount. New registrations populate profile defaults and raise UserRegisterEvent. Successful logins create a session tagged with the provider for downstream attribution:
#[derive(Debug, Clone)]
pub struct OAuthLogin {
pub provider_name: OAuthProviderName,
pub code: String,
pub state: Uuid,
pub ip: Option<std::net::IpAddr>,
pub user_agent: Option<String>,
}
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum OAuthLoginResult {
LoggedIn(AccessToken, RefreshToken),
InvalidState,
ProviderMismatch,
ProviderError(String),
UserRegistered(AccessToken, RefreshToken),
}
Managing login methods
Users can bind extra surfaces only after entering sudo mode. Email-based sudo tokens are issued via MfaService and verified by both EmailProviderService and OAuthProviderService before binding, changing passwords, or unlinking methods.
For removal, EmailProviderService::RemoveEmailAccount and OAuthProviderService::RemoveOAuthAccount ensure at least one login method remains, delete the credential row, and fire an AccountUnbindEvent so audit logs stay complete.
Those flows back the gRPC UserAccountService endpoints that power user settings; when adding a new method wire it through the same guardrails.
Security hardening
- MFA & sudo mode – `MfaService` supports TOTP and email OTP verification, toggles MFA on the profile, and issues short-lived sudo tokens cached in Redis. Any destructive credential change validates a sudo token first:
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum MfaMethod {
Totp { code: u32 },
Email { email: String, code: String },
}
- Password lifecycle – Password resets validate email links, hash the new password with the configured algorithm, terminate every active session, and broadcast a `PasswordResetEvent`. See the Password Reset Flow guide for detailed information about the reset process and APIs. Explicit password changes follow the same hashing path and sudo check.
- Session management – The session service stores refresh tokens in Redis keyed by `SessionId`, issues JWTs from config, writes login events, and can terminate an entire user’s session set on demand (used after password reset or by security tooling).
- Constant-time credential checks – Email login performs dummy verifications when the address is missing to minimize timing leaks (a sketch of the technique follows this list), and password hashes are never logged thanks to explicit redaction in debug output (as shown in the `EmailAccount` debug implementation above).
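A minimal sketch of the dummy-verification technique, assuming the `argon2` crate; the real password module may use a different algorithm, parameters, or structure.

```rust
// Dummy-hash fallback sketch (assumes the `argon2` crate); illustrative only.
use argon2::{
    password_hash::{rand_core::OsRng, PasswordHash, PasswordHasher, PasswordVerifier, SaltString},
    Argon2,
};
use std::sync::OnceLock;

/// Hash of a throwaway password, computed once, so the "account not found" path
/// still performs a realistic verification.
fn dummy_hash() -> &'static str {
    static DUMMY: OnceLock<String> = OnceLock::new();
    DUMMY
        .get_or_init(|| {
            let salt = SaltString::generate(&mut OsRng);
            Argon2::default()
                .hash_password(b"dummy-password-for-timing", &salt)
                .map(|h| h.to_string())
                .unwrap_or_default()
        })
        .as_str()
}

/// Verify a candidate password while doing comparable work whether or not the
/// account exists, so response timing does not reveal registered addresses.
fn verify_with_dummy(stored_hash: Option<&str>, candidate: &str) -> bool {
    let account_exists = stored_hash.is_some();
    let hash_str = stored_hash.unwrap_or_else(dummy_hash);
    let verified = PasswordHash::new(hash_str)
        .map(|parsed| {
            Argon2::default()
                .verify_password(candidate.as_bytes(), &parsed)
                .is_ok()
        })
        .unwrap_or(false);
    verified && account_exists
}
```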
Administrative operations
UserManageService exposes RBAC-protected processors for customer support and moderation:
- Counting users in configurable time windows, optionally excluding banned accounts
- Listing users with filters on email, group, ban status, and registration time, plus aggregate OAuth provider names for quick scanning
- Showing detailed account state (profile, auth flags, login methods, MFA) for a specific user (as shown in the `ShowUserDetail` implementation above)
- Removing login methods while keeping at least one usable path, editing profile basics, banning/unbanning, and forcibly removing TOTP when users are locked out
All administrative operations follow the RBAC pattern established in the Manage module, where each operation implements AdminOperation with role restrictions and audit logging:
impl AdminOperation for RemoveLoginMethodRequest {
const ALLOWED_ROLES: &'static [AdminRole] = &[AdminRole::SuperAdmin, AdminRole::Moderator];
const OPERATION_NAME: &'static str = "remove_login_method";
const OPERATION_TARGET: &'static str = "user_account";
fn to_audit_log(&self) -> Result<String, serde_json::Error> {
// ...
}
}
Whenever you add a new credential type or security lever, update these processors plus the protobufs exposed by UserAccountService so both admins and end users can inspect and manage the new surface.
Session Management
What counts as a session?
A session is the Redis record keyed by a SessionId UUID that maps a logged-in user to the metadata needed to mint tokens and audit activity:
pub struct Session {
pub id: SessionId,
pub user_id: Uuid,
pub terminated: bool,
pub last_refreshed: u64,
}
The record stores the owning user, whether it has been terminated, and the last time it was refreshed so we can expire idle sessions without touching the database. Bulk operations rely on the companion UserSessions index, which keeps the list of session IDs per user so ListUserSessions and TerminateAllSessions can enumerate them without scanning SQL tables.
Dual-token model
Each session issues two JWTs with different audiences and lifetimes:
pub struct JwtConfig {
pub secret: CompactString,
pub refresh_token_expiration: time::Duration,
pub access_token_expiration: time::Duration,
pub issuer: CompactString,
pub access_audience: CompactString,
pub refresh_audience: CompactString,
}
- Access token – Short-lived bearer token for regular APIs guarded by `UserAuthLayer`. It embeds the user ID (`sub`) and session ID (`sid`) and expires per `access_token_expiration` in the configuration.
- Refresh token – Long-lived token scoped to refreshing or terminating the session. Its audience differs from the access token and it carries a longer `exp`, typically thirty days by default.
Both tokens are minted when SessionService::process(CreateSession) is called. The refresh expiration is also used as the TTL for the session key in Redis, ensuring Redis evicts the record once the refresh token is no longer valid:
pub async fn verify_refresh_token(
&self,
refresh_token: &str,
) -> Result<Option<SessionId>, Error> {
let config = self.load_config().await?;
let decode = config.jwt.refresh_token_decoder();
let session_id = decode(refresh_token).ok().map(|c| c.claims.sid);
Ok(session_id.map(SessionId))
}
The refresh token ID is verified via SessionService::verify_refresh_token before any privileged session action proceeds, keeping refresh operations isolated from the access token path.
Lifecycle
Creation
Login and registration flows call SessionService::CreateSession after authentication succeeds. The service allocates a fresh session UUID, stores the Redis record with a TTL derived from the refresh lifetime, emits a UserLoginEvent for downstream consumers, and returns both JWTs:
async fn process(&self, input: CreateSession) -> Result<(AccessToken, RefreshToken), Error> {
let session_id = Uuid::new_v4();
let now = time::OffsetDateTime::now_utc().unix_timestamp() as u64;
let config = self.load_config().await?;
let session = Session {
id: SessionId(session_id),
user_id: input.user_id,
terminated: false,
last_refreshed: now,
};
// Store with TTL equal to refresh token expiration
Session::write_kv_with_ttl(&mut redis, SessionId(session_id), session, refresh_token_expiration).await?;
let access = config.jwt.generate_access_token(input.user_id, SessionId(session_id))?;
let refresh = config.jwt.generate_refresh_token(input.user_id, SessionId(session_id))?;
// Emit login event for downstream consumers
UserLoginEvent { ... }.send(&self.mq).await?;
Ok((access, refresh))
}
OAuth and email providers share this path so every login surface behaves consistently.
Refresh
Clients invoke the RefreshSession RPC with the refresh token in the x-refresh-token header. The server decodes the session ID from the token, reloads the Redis record, rejects terminated or missing sessions, enforces the inactivity timeout, and then rewrites the record with an updated timestamp before minting a new access/refresh pair:
pub const REFRESH_TOKEN_HEADER: &str = "x-refresh-token";
async fn process(&self, input: RefreshSession) -> Result<SessionRefreshResult, Error> {
let Some(mut session) = Session::read(&mut redis, SessionId(input.session_id)).await? else {
return Ok(SessionRefreshResult::NotFound);
};
if session.terminated {
return Ok(SessionRefreshResult::Terminated);
}
let last_refreshed = time::OffsetDateTime::from_unix_timestamp(session.last_refreshed as i64);
let now = time::OffsetDateTime::now_utc();
if now - last_refreshed > refresh_expiration {
return Ok(SessionRefreshResult::Expired);
}
session.last_refreshed = now.unix_timestamp() as u64;
Session::write_kv(&mut redis, SessionId(input.session_id), session).await?;
let access = config.jwt.generate_access_token(user_id, SessionId(input.session_id))?;
let refresh = config.jwt.generate_refresh_token(user_id, SessionId(input.session_id))?;
Ok(SessionRefreshResult::Refreshed(access, refresh))
}
Refreshes never accept the access token, and the refresh token is not honored by UserAuthLayer, so each token stays in its intended lane.
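A sketch of how the audience split keeps each token in its lane, assuming the `jsonwebtoken` crate; the claim struct and config values are illustrative, not the module's actual types.

```rust
// Audience separation sketch (assumes `jsonwebtoken`, `serde`, `uuid`).
use jsonwebtoken::{decode, Algorithm, DecodingKey, Validation};
use serde::Deserialize;
use uuid::Uuid;

#[derive(Deserialize)]
struct UserJwtClaims {
    sub: Uuid,
    sid: Uuid,
    aud: String,
    exp: i64,
}

/// A refresh token presented to the access-token decoder fails validation
/// because its `aud` claim does not match the access audience.
fn decode_access_token(token: &str, secret: &[u8], access_audience: &str) -> Option<UserJwtClaims> {
    let mut validation = Validation::new(Algorithm::HS256);
    validation.set_audience(&[access_audience]); // rejects refresh-audience tokens
    decode::<UserJwtClaims>(token, &DecodingKey::from_secret(secret), &validation)
        .ok()
        .map(|data| data.claims)
}
```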
Expiration and revocation
Sessions can disappear through several channels:
- Access token expiration – The JWT validator in `UserAuthLayer` simply rejects expired access tokens, forcing the client to refresh with a valid refresh token:
pub const ACCESS_TOKEN_HEADER: &str = "x-user-authorization";
async fn user_auth(metadata: &HeaderMap, mut redis: RedisConnection) -> Result<UserId, Status> {
let header = metadata
.get(ACCESS_TOKEN_HEADER)
.and_then(|h| h.to_str().ok())
.ok_or(Status::unauthenticated("Missing authorization header"))?;
let config = find_config_from_redis::<AuthConfig>(&mut redis).await?;
let decode = config.jwt.decoder();
let jwt_claims = decode(header)
.map_err(|_| Status::unauthenticated("Invalid authorization header"))?
.claims;
Ok(UserId(jwt_claims.sub))
}
- Refresh inactivity timeout – `SessionService::process(RefreshSession)` returns `Expired` once the elapsed time since `last_refreshed` exceeds `refresh_token_expiration`, preventing resurrection of idle sessions.
- Redis TTL – Because the session key is stored with a TTL equal to the refresh lifetime, Redis will evict it automatically even if the refresh path never runs again.
- Scheduled cleanup – The `SessionCleanupJob` cron scans remaining session keys, deleting any whose `last_refreshed` predates the configured expiration window to catch edge cases where TTLs were extended or missing.
- Manual termination – Users can call `UserAccount::TerminateSession` with a refresh token to flag a session as terminated; future refresh attempts see the `Terminated` status and refuse to mint tokens. Password resets also invoke `TerminateAllSessions` so compromised credentials cannot keep a foothold.
Administrative levers
Operations tooling can directly reuse SessionService::TerminateAllSessions to purge a user’s active logins, and the password-reset flow demonstrates how to hook that processor after security-sensitive events. Beyond the cron cleanup job, there is currently no dedicated admin RPC that lists or manages sessions; support dashboards should wire into the Redis-backed processors (ListUserSessions, TerminateSession, TerminateAllSessions) when that capability is required.
Token-free user APIs
Only the gRPC services mounted without UserAuthLayer skip access-token checks. These “entry” endpoints live on UserAuth and cover registration, login, password resets, OAuth challenges, and session refresh; everything else, including the UserAccount service, is wrapped by UserAuthLayer and requires the access token in the x-user-authorization header. The refresh token is still required via metadata when calling RefreshSession or TerminateSession, but it never unlocks general-purpose APIs.
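For reference, a small client-side sketch that attaches the access token header to an outgoing request using `tonic`; the header name comes from this document, everything else is illustrative.

```rust
// Attach the user access token to an outgoing gRPC request (assumes `tonic`).
use tonic::metadata::{errors::InvalidMetadataValue, MetadataValue};
use tonic::Request;

fn with_user_token<T>(
    mut request: Request<T>,
    access_token: &str,
) -> Result<Request<T>, InvalidMetadataValue> {
    let value: MetadataValue<_> = access_token.parse()?;
    // Same header name that UserAuthLayer reads on the server side.
    request.metadata_mut().insert("x-user-authorization", value);
    Ok(request)
}
```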
Register Flow
This document explains how the email-based registration pipeline in the auth module is implemented and what a frontend client must do to integrate with it. The flow is deliberately split into two RPCs: one to issue a magic link by email and another to finalize the account creation. The sections below describe the data contracts, validation rules, and expected UX behavior at each step so that UI engineers can wire the screens without re-reading the Rust implementation.
Step 1. Request a registration email
Use the UserAuth.SendRegisterEmail RPC with the prospective user’s email address and optional referral code.
| Field | Notes |
|---|---|
| email | Raw email string entered by the user. |
| referral_code | Optional referral/invitation code. Will be passed to Step 2 via URL query. |
Backend behavior
- The service first validates the domain against the configurable whitelist/blacklist. Requests to disallowed domains return `INVALID_EMAIL` immediately:
#[derive(Debug, Clone, Serialize, Deserialize, Default)]
pub struct EmailDomainConfig {
pub enable_white_list: bool,
pub white_list: Box<[CompactString]>,
pub enable_black_list: bool,
pub black_list: Box<[CompactString]>,
}
impl EmailDomainConfig {
pub fn check_addr(&self, addr: impl AsRef<str>) -> bool {
// ...
}
}
- If the email already maps to an existing login, the call returns `EMAIL_EXISTS` so the UI can direct the user to the login or password reset flow.
- When rate limiting is triggered (same address requested again before `resend_interval` elapses), the server still returns `SENT` but quietly suppresses a duplicate email. Surface a neutral “email sent” toast to avoid leaking account existence. The default `resend_interval` is 30 seconds.
- For a fresh request, the service creates a magic-link record containing a 32-character `auth_key`, queues an email via RabbitMQ, and responds with `SENT`:
#[derive(Debug, Clone, PartialEq, Eq, sqlx::FromRow)]
pub struct EmailVerifyLink {
pub id: i64,
pub email: String,
pub auth_key: String,
pub send_at: PrimitiveDateTime,
pub reason: EmailVerifyReason,
pub user_id: Option<Uuid>,
pub is_unused: bool,
}
const AUTH_KEY_LENGTH: usize = 32;
pub fn generate_auth_key() -> String {
// ...
}
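One possible implementation of the elided `generate_auth_key` helper, assuming the `rand` crate; the real generator may choose a different alphabet or RNG, so treat this as illustrative only.

```rust
// Illustrative only: a 32-character alphanumeric key using the `rand` crate.
use rand::{distributions::Alphanumeric, Rng};

const AUTH_KEY_LENGTH: usize = 32;

pub fn generate_auth_key() -> String {
    rand::thread_rng()
        .sample_iter(&Alphanumeric)
        .take(AUTH_KEY_LENGTH)
        .map(char::from)
        .collect()
}
```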
Frontend expectations
- Present an email field, an optional referral code field, and an action button. On submission, call `SendRegisterEmail` and branch on the enum:
  - `SENT`: Show success UI (“Check your inbox for a magic link”) and inform the user to click the link in their email. The magic link will include the `auth_key` and the `referral_code` (if provided) as URL query parameters, automatically directing them to Step 2.
  - `INVALID_EMAIL`: Highlight the field with a validation error.
  - `EMAIL_EXISTS`: Offer links to sign in or reset password.
- Because the backend may skip resending, add a visible countdown using the configured `resend_interval` (default 30 seconds) before enabling a “Resend” button.
- The email template contains a clickable magic link that embeds the `auth_key` in the URL. When clicked, it should route the user directly to the registration completion page (Step 2) with the token and referral code automatically populated from URL query parameters.
Step 2. Complete registration via magic link
When the user clicks the magic link from the email, they are directed to the registration completion page. The URL will contain:
- `auth_key` – extracted from the URL query parameter; automatically validates the user’s email.
- `referral_code` – passed through from Step 1 via URL query parameter (if provided).
The magic link is time-limited. Backend enforcement uses `link_expire_after` (default 5 minutes). After that, registration attempts fail with `INVALID_LINK`.
On the frontend, the registration completion page should:
- Extract the `auth_key` from the URL query parameters (this proves the user has access to the email).
- Extract the `referral_code` from the URL query parameters (if present, this was provided in Step 1).
- Display a form to collect:
- A password that satisfies the platform’s policy (policy enforcement happens upstream in the password module).
- Display the referral code if present (read-only or hidden field).
- A checkbox “Keep me signed in” that toggles auto-login.
Step 3. Finalize registration
Call UserAuth.RegisterUser with the collected data:
pub struct RegisterUser {
pub auth_key: String,
pub password: String,
pub referral_code: Option<String>,
pub auto_login: bool,
pub ip: Option<IpAddr>,
pub user_agent: Option<String>,
}
| Field | Required | Notes |
|---|---|---|
| auth_key | ✓ | Magic link token from URL query parameter. Each token is single-use and tied to the email. |
| password | ✓ | Plain password; hashing happens server-side. |
| referral_code | | Optional marketing code from URL query parameter (passed through from Step 1), forwarded unchanged. |
| auto_login | ✓ | When true, the backend issues access & refresh tokens on success. |
| ip | | Optional. If the frontend can detect the client’s public IP (e.g., via an API gateway), pass it for session metadata. |
| user_agent | | Optional string captured from the browser; used for session/device history. |
Response handling
RegisterUserReply can return four shapes based on the service result:
pub enum RegisterUserResult {
Registered(Uuid),
RegisteredWithSession(Uuid, AccessToken, RefreshToken),
EmailAlreadyExists,
InvalidLink,
}
- `REGISTERED_WITH_SESSION`: Registration succeeded and a new session was created. The reply contains the `user_id` plus `access_token` and `refresh_token`. Store them immediately using the same rules as a login response, then route to the signed-in area.
- `REGISTERED`: Registration succeeded but no session was created (because `auto_login` was false). Route to the login screen and preload the email for convenience.
- gRPC error `ALREADY_EXISTS`: The email was registered after the magic link was issued. Show an “already registered” message and link to sign-in.
- `INVALID_LINK`: The magic link was unknown, already used, or expired. Inform the user that the link is invalid/expired and offer to send a new registration email.
When auto-login is enabled, the backend records the session with the supplied IP and user agent before returning tokens, so capturing accurate metadata is important for the session list and security analytics.
Retrying after failures
If the magic link expires or has already been used, require the user to restart from Step 1 and request a new registration email. Once a link has been consumed, it cannot be retried.
UI checklist
- Provide separate screens for Step 1 (email + referral code submission) and Step 2 (password entry after clicking magic link).
- In Step 1, include both an email field and an optional referral code field.
- Show a visible countdown for magic link expiration and resend availability (based on config defaults; make them configurable via environment/UI constants).
- In Step 2, extract both `auth_key` and `referral_code` from URL query parameters automatically; no manual token entry is needed.
- Display the referral code (if present) on the Step 2 page for user confirmation, either as a read-only field or hidden input.
- Ensure all success paths funnel to analytics hooks alongside the `user_id` returned from the RPC for downstream tracking.
- When auto-login is disabled, clearly direct the user to the login screen after successful registration.
This flow mirrors the backend implementation and should keep the frontend in sync with the server-side invariants without re-reading the Rust code every time changes are made.
Password Reset Flow
This document explains how the password reset pipeline in the auth module is implemented and what a frontend client must do to integrate with it. The flow is split into three RPCs: one to request a password reset email with a magic link, one to validate the reset token from the magic link, and one to finalize the password reset. The sections below describe the data contracts, validation rules, and expected UX behavior at each step so that UI engineers can wire the screens without re-reading the Rust implementation.
Step 1. Request a password reset email
Use the UserAuth.SendPasswordResetEmail RPC with the user’s email address.
| Field | Notes |
|---|---|
| email | Raw email string entered by the user. |
Backend behavior
- The service first validates the email format. Requests with invalid email addresses return `INVALID_EMAIL` immediately.
- If the email doesn't map to any existing account with email login, the call returns `NOT_FOUND` so the UI can inform the user appropriately.
- When rate limiting is triggered (same address requested again before `resend_interval` elapses, default 30 seconds), the server returns `TOO_FREQUENT`. The frontend should display a message asking the user to wait before requesting another reset email.
- For a valid request, the service creates a password reset link record containing a 32-character `auth_key`, queues an email via RabbitMQ, and responds with `SENT`:
#[derive(Debug, Clone, PartialEq, Eq, sqlx::FromRow)]
pub struct EmailVerifyLink {
pub id: i64,
pub email: String,
pub auth_key: String,
pub send_at: PrimitiveDateTime,
pub reason: EmailVerifyReason,
pub user_id: Option<Uuid>,
pub is_unused: bool,
}
const AUTH_KEY_LENGTH: usize = 32;
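The key generation itself is internal to the auth module; as an illustrative sketch only, a 32-character alphanumeric key could be produced with the `rand` crate like this (not necessarily the module's exact implementation):

```rust
use rand::{distributions::Alphanumeric, Rng};

/// Illustrative only: produce a 32-character alphanumeric key, matching the
/// documented AUTH_KEY_LENGTH. The auth module's real generator may differ.
fn generate_auth_key() -> String {
    rand::thread_rng()
        .sample_iter(&Alphanumeric)
        .take(32) // AUTH_KEY_LENGTH
        .map(char::from)
        .collect()
}
```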
Frontend expectations
- Present a single email field and an action button. On submission, call `SendPasswordResetEmail` and branch on the enum:
  - `PWD_RESET_EMAIL_RESULT_SENT`: Show success UI ("Check your inbox for a password reset link") and inform the user to click the magic link in their email.
  - `PWD_RESET_EMAIL_RESULT_INVALID_EMAIL`: Highlight the field with a validation error.
  - `PWD_RESET_EMAIL_RESULT_NOT_FOUND`: Inform the user that no account exists with this email address.
  - `PWD_RESET_EMAIL_RESULT_TOO_FREQUENT`: Display a message asking the user to wait before requesting another reset email (default 30 seconds).
- Add a visible countdown using the configured `resend_interval` (default 30 seconds) before enabling a "Resend" button.
- The email template contains a clickable magic link that embeds the `auth_key` in the URL. When clicked, it should route the user directly to the password reset page (Step 2/3) with the token automatically populated from URL query parameters.
Step 2. Validate the reset token (optional but recommended)
Use the UserAuth.CheckPasswordResetToken RPC to validate the token before allowing the user to enter a new password. This step is optional but provides better user experience by catching expired or invalid tokens early.
| Field | Notes |
|---|---|
| token | The 32-character auth_key extracted from the URL query parameter |
Backend behavior
- The service looks up the token in the database and validates:
  - The token exists
  - The token is for password reset (not registration or other purposes)
  - The token hasn't been used yet
  - The token hasn't expired (default expiry is 5 minutes, controlled by `link_expire_after`)
- If all validations pass, it returns `valid: true` and the `expire_at` timestamp (Unix timestamp in seconds).
- If any validation fails, it returns `valid: false` with no expiration time.
pub struct CheckPasswordResetToken {
pub token: String,
}
pub enum CheckPasswordResetTokenResult {
Valid(PrimitiveDateTime),
Invalid,
}
Frontend expectations
- Extract the `auth_key` from the URL query parameters when the user lands on the password reset page via the magic link.
- Call this RPC with the extracted token to validate it before showing the password reset form.
- If `valid` is `true`:
  - Display the password reset form.
  - Optionally show a countdown timer based on the `expire_at` timestamp to inform the user how much time they have left.
- If `valid` is `false`:
  - Display an error message that the reset link is invalid or has expired.
  - Offer a link to request a new password reset email.
- This validation step helps prevent the user from filling out a new password only to discover the token is invalid when they submit.
Step 3. Collect the new password
The magic link is time-limited (default 5 minutes). After that, reset attempts fail with INVALID_LINK.
On the frontend, the password reset page should:
- Use the token (`auth_key`) extracted from the URL query parameters (Step 2).
- Collect a new password that satisfies the platform's password policy.
Step 4. Finalize password reset
Call UserAuth.ResetPassword with the collected data:
pub struct ResetPassword {
pub auth_key: String,
pub new_password: String,
}
| Field | Required | Notes |
|---|---|---|
| auth_key | ✓ | Token from Step 1. Each token is single-use. |
| new_password | ✓ | Plain password; hashing happens server-side. |
Response handling
ResetPasswordReply can return three results based on the service outcome:
pub enum ResetPasswordResult {
Success,
InvalidLink,
AccountNotFound,
}
- `RESET_PASSWORD_RESULT_SUCCESS`: Password was successfully reset. All active sessions for this user have been terminated for security. Direct the user to the login page with a success message.
- `RESET_PASSWORD_RESULT_INVALID_LINK`: Token was unknown, already used, expired, or not for password reset. Show an error and offer to resend a reset email.
- `RESET_PASSWORD_RESULT_ACCOUNT_NOT_FOUND`: The account associated with this reset token no longer exists. This is rare but can happen if an account was deleted between steps. Display an appropriate error message.
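A minimal client-side sketch of branching on these results; the enum is redeclared locally with the same shape as above, and the `println!` bodies stand in for the frontend's own routing and messaging.

```rust
// Same shape as the ResetPasswordResult enum above.
enum ResetPasswordResult {
    Success,
    InvalidLink,
    AccountNotFound,
}

fn handle_reset_reply(result: ResetPasswordResult) {
    match result {
        // All sessions were terminated server-side; send the user to login.
        ResetPasswordResult::Success => println!("password reset, redirect to login"),
        // Offer to resend the reset email.
        ResetPasswordResult::InvalidLink => println!("link invalid or expired"),
        // Rare: the account was deleted between steps.
        ResetPasswordResult::AccountNotFound => println!("account no longer exists"),
    }
}
```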
Security considerations
When the password reset succeeds, the backend automatically:
- Hashes the new password using the configured password hashing algorithm
- Updates the password in the database
- Terminates all active sessions for this user account to prevent any potentially compromised sessions from remaining active
- Emits a `PasswordResetEvent` for audit logging and downstream processing
Users will need to log in again with their new password after a successful reset.
Retrying after failures
If the user encounters an INVALID_LINK error, require them to restart from Step 1. Once a token has been consumed or expired, it cannot be retried. The single-use nature of tokens prevents replay attacks.
UI checklist
- Provide separate screens for email submission and password reset completion.
- When the user lands on the password reset page via magic link, automatically extract the `auth_key` from the URL query parameters; no manual token entry is needed.
- Show a visible countdown for link expiration (5 minutes by default, based on the `link_expire_after` config).
- Use `CheckPasswordResetToken` before showing the password reset form to provide early feedback about token validity and show an expiration countdown.
- After successful password reset, clearly direct the user to the login screen and inform them that all their sessions have been terminated for security.
- Consider implementing the token validation (Step 2) to improve user experience by catching invalid links before the user enters a new password.
Typical UX flow
A recommended user experience flow:
- Forgot Password Page: User enters email → call `SendPasswordResetEmail`
- Check Email Page: Show success message and instructions to click the magic link in their email
- Reset Password Page (accessed by clicking the magic link in the email):
  - Extract `auth_key` from URL query parameters
  - On page load: call `CheckPasswordResetToken` with the extracted token
  - If valid: show password form with expiration timer
  - If invalid: show error and link back to Step 1
- Submit New Password: call `ResetPassword` with the token from URL and new password
- Success Page: Inform user their password was reset and all sessions were terminated → redirect to login
This flow mirrors the backend implementation and should keep the frontend in sync with the server-side invariants without re-reading the Rust code every time changes are made.
Authentication for Other Modules
This guide explains how gRPC services outside of the auth module reuse the authentication middleware so they can trust the UserId that reaches their handlers. It focuses on the shared layer, required request metadata, and the pattern every user-facing RPC follows when extracting identity.
Middleware overview
The UserAuthLayer type wraps gRPC routers with a Tower middleware that runs before your service logic. When the layer sees an incoming request it:
- Looks for the `x-user-authorization` header (a raw JWT access token).
- Loads the current `AuthConfig` from Redis via `find_config_from_redis` so it can decode the token with the same secret and issuer values used during minting.
- Decodes and validates the JWT using the access-token audience/issuer rules, producing a `UserId` newtype when everything checks out.
- Stores that `UserId` in the request extensions so downstream handlers can pull it without re-validating the token (see `modules/auth/src/rpc/middleware.rs`, lines 12–110).
`GrpcWorker::server_ready` installs this layer globally before registering user-scoped services. Any service added after `.layer(self.user_auth_middleware)` automatically receives authenticated requests and does not need to declare the middleware explicitly (see `server/src/worker/grpc.rs`, lines 302–349).
Required request metadata
Clients must send the access token in the `x-user-authorization` header on every RPC guarded by the middleware. Tokens are the opaque strings returned from the login/registration flows; do not prefix them with `Bearer`. If the header is missing or the token fails validation, the internal `user_auth` check produces a `Status::unauthenticated` error; the layer logs the failure and still forwards the request, and when your handler subsequently calls `UserId::from_request` it sees that error and propagates the unauthenticated status back to the caller (see `modules/auth/src/rpc/middleware.rs`, lines 74–110).
The middleware never accepts refresh tokens; those belong in the `x-refresh-token` header and are handled exclusively by the session RPC (`RefreshSession`), as implemented in `modules/auth/src/rpc/middleware.rs` (lines 82–92) and `modules/auth/src/rpc/auth_service.rs` (lines 288–331). Keep access and refresh tokens in their respective lanes to avoid confusing downstream services.
Reading the authenticated user
Inside a gRPC handler, retrieve the caller identity by importing `UserId` from `auth::rpc::middleware` and calling `UserId::from_request(&request)` at the top of the method. That helper pulls the `UserId` extension set by the middleware and converts it back into a plain `Uuid`. Every user-facing module (market, telecom, shop, notification, support, etc.) follows this pattern, so new services should mirror it (see `modules/auth/src/rpc/middleware.rs` L94–L110, `modules/market/src/rpc/market.rs` L70–L132, `modules/telecom/src/rpc/telecom.rs` L69–L265, and `modules/shop/src/rpc/order.rs` L149–L321).
Avoid re-parsing JWTs or threading user IDs through request payloads; the middleware ensures a single source of truth. If UserId::from_request returns Status::unauthenticated, simply propagate that error to the caller so clients know to refresh their session.
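A minimal handler sketch following this pattern is shown below. `PingRequest`/`PingReply` are placeholder message types (real services use their prost-generated types), and the sketch assumes `UserId::from_request` returns a `Result<Uuid, Status>` as described above.

```rust
use auth::rpc::middleware::UserId;
use tonic::{Request, Response, Status};

// Placeholder message types for the sketch only.
pub struct PingRequest {}
pub struct PingReply {
    pub user_id: String,
}

pub async fn ping(request: Request<PingRequest>) -> Result<Response<PingReply>, Status> {
    // Pull the authenticated principal installed by UserAuthLayer; if it is
    // missing or invalid, propagate the unauthenticated status untouched.
    let user_id = UserId::from_request(&request)?;

    // From here on, treat `user_id` as the trusted caller identity.
    Ok(Response::new(PingReply {
        user_id: user_id.to_string(),
    }))
}
```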
Adding a new authenticated service
When you introduce a new gRPC server that should require user authentication:
- Register it after the `.layer(self.user_auth_middleware)` call in `GrpcWorker::server_ready`.
- In every handler, call `UserId::from_request` before executing business logic.
- Treat the returned `Uuid` as the authenticated principal and authorize against your domain resources accordingly.
If you need unauthenticated entry points (e.g., public lookups), mount that service before the middleware layer or expose the RPC through the UserAuth service instead. Mixing authenticated and unauthenticated handlers in the same service leads to confusing guarantees, so prefer splitting them.
Testing tips
Integration tests that hit authenticated RPCs should obtain a valid access token through the login helpers and attach it to the request metadata under `x-user-authorization`. When writing unit tests that call handlers directly, construct a `tonic::Request` and insert a `UserId` into its extensions to simulate the middleware path. This mirrors what production infrastructure does and keeps your tests aligned with the runtime behavior (see `modules/auth/src/rpc/middleware.rs`, lines 30–110).
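For illustration, a direct-handler test might inject the principal like this. It assumes `UserId` can be built from a `Uuid` (e.g., via a `From` impl); adjust to the newtype's actual constructor.

```rust
#[cfg(test)]
mod tests {
    use auth::rpc::middleware::UserId;
    use tonic::Request;
    use uuid::Uuid;

    #[tokio::test]
    async fn handler_sees_injected_user_id() {
        // Simulate what UserAuthLayer does in production: place the
        // authenticated principal into the request extensions.
        let mut request = Request::new(());
        request.extensions_mut().insert(UserId::from(Uuid::new_v4()));

        // ... call the handler under test with `request` and assert on the reply ...
    }
}
```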
Telecom Module
Overview
The Telecom Module is the core networking and proxy management component of the Helium system. It provides comprehensive functionality for managing VPN/proxy networks, user subscriptions, traffic monitoring, and billing operations.
This module implements a sophisticated proxy network infrastructure that supports multiple protocols and backends, including XRayR and SSP compatibility, making it suitable for various deployment scenarios.
Key Features
Node Management
- Node Servers: Physical proxy servers that handle user connections
- Node Clients: Proxy endpoints that users connect to
- Multi-protocol Support: Compatible with various proxy protocols
- Geographic Distribution: Node location and route classification
- Status Monitoring: Real-time node health and availability tracking
Package System
- User Packages: Subscription-based service packages with traffic limits
- Package Queue: Automated package activation system
- Flexible Billing: Traffic-based and time-based billing models
- Traffic Factor: Custom billing multipliers per node
Traffic Analysis
- Real-time Monitoring: Track upload/download usage per user
- Historical Data: Traffic usage history and trends
- Billing Analytics: Calculate actual billed traffic vs raw usage
- Node Usage Statistics: Identify popular nodes and usage patterns
Subscription Management
- Dynamic Links: Generate subscription URLs for client applications
- Multiple Formats: Support various client configurations
- Token Management: Secure subscription tokens per user
- Auto-configuration: Client-side configuration generation
Architecture
Core Services
| Service | Purpose |
|---|---|
| NodeClientService | Manages proxy client endpoints |
| NodeServerService | Handles proxy server infrastructure |
| PackageQueueService | Processes user subscription packages |
| SubscribeLinkService | Generates subscription links |
| AnalysisService | Analyzes traffic and usage patterns |
| ManageService | Administrative operations |
gRPC API Structure
The module exposes two main gRPC services:
- Telecom Service (`helium.telecom`) – user-facing operations:
  - Node listing and information
  - Package management
  - Subscription links
  - Traffic usage queries
- Management Service (`helium.telecom_manage`) – admin operations:
  - Node server/client CRUD
  - Package queue management
  - Configuration validation
Data Models
Node Hierarchy
NodeServer (Physical Server)
├── NodeClient (Proxy Endpoint 1)
├── NodeClient (Proxy Endpoint 2)
└── NodeClient (Proxy Endpoint N)
Package Lifecycle
Created → In Queue → Active → Consumed/Cancelled
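As a hedged illustration only, this lifecycle could be modeled as a state enum; the names below are assumptions based on the diagram, not the telecom module's actual schema.

```rust
/// Illustrative lifecycle states for a queued package.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PackageState {
    Created,   // purchased but not yet queued
    InQueue,   // waiting for activation
    Active,    // currently providing service
    Consumed,  // traffic or duration exhausted
    Cancelled, // terminated before consumption
}
```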
Integration Points
Dependencies
- Database: PostgreSQL for persistent data
- Redis: Caching and session management
- Message Queue: AMQP for event processing
- Authentication: Integration with auth module
- Management: Admin interface integration
External Libraries
- libsubconv: Subscription format conversion
- xrayr_feeder: XRayR backend integration
- subscribe_client_config: Client configuration generation
Usage Examples
Creating a Node Client
use telecom::services::manage::ManageService;
let manage_service = ManageService::new(db_pool, redis_conn);
// Create node client via processor pattern
let request = CreateNodeClientRequest {
server_id: 1,
name: "US-West-1".to_string(),
traffic_factor: "1.0".to_string(),
display_order: 100,
// ... other fields
};
let result = manage_service.process(request).await?;
Checking User Package
use telecom::services::package_queue::PackageQueueService;
let package_service = PackageQueueService::new(db_pool, redis_conn);
// Get user's current active package
let request = GetCurrentPackage { user_id };
let package_info = package_service.process(request).await?;
Generating Subscription Links
use telecom::services::subscribe_link::SubscribeLinkService;
let subscribe_service = SubscribeLinkService::new(db_pool, redis_conn);
// Generate subscription links for user
let request = GetSubscribeLinks { user_id };
let links = subscribe_service.process(request).await?;
Database Schema
Key database entities:
- `node_servers`: Physical proxy servers
- `node_clients`: Proxy client endpoints
- `packages`: Available service packages
- `package_queue`: User package subscriptions
- `user_package_usage`: Traffic usage tracking
- `node_status_history`: Node availability history
Configuration
Environment Variables
The module uses configuration from telecom::config which includes:
- Database connection settings
- Redis connection parameters
- External service endpoints
- Billing calculation parameters
Node Configuration
Both node servers and clients support flexible JSON configuration for different proxy protocols and backends.
Event System
The module implements an event-driven architecture with:
Events
- Package Activation: When packages become active
- Usage Recording: Traffic usage events
- Node Status Changes: Server/client status updates
Hooks
- Billing Hook: Calculate traffic charges
- Registration Hook: New user package setup
- Package Queue Hook: Automated package processing
Automated Tasks (Cron)
The module includes scheduled tasks for:
- Package queue processing
- Traffic usage aggregation
- Node status monitoring
- Billing calculations
Testing
Comprehensive test coverage includes:
- Unit tests for each service
- Integration tests with database
- Management service tests
- Package queue processing tests
- Subscribe link generation tests
Test infrastructure uses testcontainers for isolated testing environments.
Development Guidelines
Service Pattern
All business logic is implemented using the Processor pattern (not object-oriented patterns). Each service exposes its functionality through the Processor trait.
Error Handling
- Use `anyhow::Result` for error propagation
- Add proper error context, with logging via `tracing`
- No `unwrap()` or `expect()` calls (forbidden by clippy); see the sketch below
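A short sketch of these conventions together; the query, table name, and function name are illustrative, not the module's actual SQL.

```rust
use anyhow::{Context, Result};

/// Illustrative only: propagate errors with context instead of unwrap/expect,
/// and emit a structured tracing event on success.
async fn load_node_client_name(pool: &sqlx::PgPool, id: i32) -> Result<String> {
    let name: String = sqlx::query_scalar("SELECT name FROM node_clients WHERE id = $1")
        .bind(id)
        .fetch_one(pool)
        .await
        .with_context(|| format!("loading node client {id}"))?;

    tracing::debug!(node_client_id = id, "loaded node client name");
    Ok(name)
}
```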
Database Access
- Use the owned `RedisConnection` type (not static lifetimes)
- Proper connection pooling
- Transaction management for consistency
API Documentation
For detailed API documentation, refer to the generated protobuf documentation from:
- `proto/telecom/telecom.proto`
- `proto/telecom/manage.proto`
- `proto/telecom/common.proto`
Troubleshooting
Common Issues
- Node Offline: Check server connectivity and configuration
- Package Not Activating: Verify queue processing and billing status
- Subscription Links Invalid: Check token generation and node availability
- Traffic Not Recording: Verify event processing and database connections
Logging
The module uses structured logging with tracing. Key log points:
- Package activation/deactivation
- Node status changes
- Traffic usage recording
- API request/response patterns
Migration Notes
When migrating from other proxy management systems:
- Import existing node configurations
- Migrate user packages and quotas
- Set up traffic factor mappings
- Configure billing parameters
- Test subscription link generation
For specific migration procedures, refer to the migration documentation in doc/src/migration/.
Node Server
Concept
Node Server represents the physical proxy server infrastructure in the Helium telecom system. It is the actual backend server that handles proxy connections and processes user traffic.
Key Concepts
- Physical Infrastructure: Node Server is the actual server hardware/software that performs proxy operations
- Backend Component: Operates behind the scenes, not directly visible to end users
- Traffic Handler: Processes all user proxy connections and traffic routing
- Configuration Target: Holds server-side configuration that determines how the proxy server operates
Architecture Position
User Client → Node Client (User-facing) → Node Server (Infrastructure) → Internet
Node Server sits at the infrastructure layer, receiving connections from multiple Node Clients and handling the actual proxy work.
Relationship with Node Client
| Aspect | Node Server | Node Client |
|---|---|---|
| Purpose | Physical proxy infrastructure | User-facing proxy endpoint |
| Visibility | Backend/Admin only | Visible to end users |
| Configuration | Server-side proxy config | Client-side connection config |
| Relationship | 1 server : N clients | N clients : 1 server |
| Responsibility | Traffic processing | User interface |
Configuration
Node Server supports two main configuration types through the NodeServerConfig enum:
Configuration Types
1. NewV2b (UniProxy)
Modern proxy configuration using the UniProxy protocol:
NodeServerConfig::NewV2b(Box<UniProxyProtocolConfig>)
Features:
- High-performance proxy protocol
- Built-in traffic reporting
- Advanced user management
- Speed limiting per user
- Device limiting
2. SSP (SSPanel Compatible)
Legacy SSPanel-compatible configuration:
NodeServerConfig::Ssp(Box<CustomConfig>)
Features:
- SSPanel API compatibility
- Traditional proxy methods
- Custom host configuration
- Legacy traffic reporting
Core Configuration Fields
pub struct NodeServer {
pub id: i32,
pub server_side_config: Json<NodeServerConfig>, // Protocol configuration
pub speed_limit: i64, // Per-user speed limit in Byte/s
pub status: NodeServerStatus, // Online/Offline/Maintenance
pub last_online_time: PrimitiveDateTime, // Last heartbeat timestamp
}
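As a hedged illustration of telling the two config variants apart in code (the enum and its payload types live in the telecom module; the helper name here is hypothetical):

```rust
// Hypothetical helper; NodeServerConfig, UniProxyProtocolConfig, and
// CustomConfig are the telecom module types referenced above.
fn describe_backend(config: &NodeServerConfig) -> &'static str {
    match config {
        // Modern UniProxy ("newv2b") backend
        NodeServerConfig::NewV2b(_uni_proxy_config) => "UniProxy (newv2b) backend",
        // Legacy SSPanel-compatible ("ssp") backend
        NodeServerConfig::Ssp(_custom_config) => "SSPanel-compatible (ssp) backend",
    }
}
```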
Configuration Examples
These JSON examples show the API request format for creating Node Servers with different backend configurations.
Creating NewV2b Node Server
{
"server_side_config": {
"compatibility": "newv2b",
"api_host": "127.0.0.1",
"api_port": 8080,
"node_id": 1,
"cert_mode": "none",
"cert_domain": "example.com",
"cert_file": "",
"key_file": "",
"ca_file": "",
"timeout": 30,
"listen_ip": "0.0.0.0",
"send_ip": "0.0.0.0",
"device_limit": 0,
"speed_limit": 0,
"rule_list_path": "",
"dns_type": "AsIs",
"enable_dns": false,
"disable_upload_traffic": false,
"disable_get_rule": false,
"disable_ivpn_check": false,
"disable_memory_optimizations": false,
"enable_reality_show": false,
"enable_brutal": false,
"brutal_debug": false,
"enable_ip_sync": false,
"ip_sync_interval": 60
},
"speed_limit": 1000000000
}
Key Configuration Points:
- `compatibility`: "newv2b" specifies the UniProxy protocol backend
- `speed_limit`: 1,000,000,000 Byte/s (1 GB/s) per-user speed limit on this server
- `node_id`: Must match the Node Server ID in the database
- `api_host` / `api_port`: Backend API connection settings
- `cert_mode`: Certificate handling ("none", "file", "http", "dns")
Creating SSP Node Server
{
"server_side_config": {
"compatibility": "ssp",
"host": "proxy.example.com",
"port": 80,
"node_id": 1,
"key": "your-api-key",
"speed_limit": 0,
"device_limit": 0,
"rule_list_path": "",
"custom_config": {
"offset_port_user": 0,
"offset_port_node": 0,
"server_key": "",
"host": "proxy.example.com",
"server_port": 443
}
},
"speed_limit": 500000000
}
Key Configuration Points:
- `compatibility`: "ssp" specifies the SSPanel-compatible backend
- `speed_limit`: 500,000,000 Byte/s (500 MB/s) per-user speed limit on this server
- `host`: SSPanel API host
- `key`: Authentication key for the SSPanel API
- `custom_config`: SSPanel-specific configuration options
Configuration Validation
The system provides configuration validation through the gRPC management API:
gRPC Service: helium.telecom_manage.NodeServerManage
Method: VerifyNodeServerConfig
Request:
message VerifyNodeServerConfigRequest {
string config = 1;
}
Response:
message VerifyReply {
bool valid = 1;
}
Example gRPC call:
grpcurl -plaintext \
-d '{"config": "{\"compatibility\":\"newv2b\",\"api_host\":\"127.0.0.1\",\"api_port\":8080}"}' \
localhost:50051 \
helium.telecom_manage.NodeServerManage/VerifyNodeServerConfig
Frontend applications should validate server configurations before creating Node Servers to ensure proper backend protocol settings and prevent deployment failures.
JSON Schema Reference
For frontend developers, here are the key JSON structures:
Node Server Creation Request
{
"server_side_config": {
// Required: Backend protocol configuration
"compatibility": "<string>" // Required: "newv2b" or "ssp"
// Configuration fields vary by compatibility type
},
"speed_limit": "<integer>" // Required: Per-user speed limit in Byte/s
}
Compatibility Types
- “newv2b”: Modern UniProxy protocol backend
- “ssp”: Legacy SSPanel-compatible backend
NewV2b Configuration Schema
{
"compatibility": "newv2b",
"api_host": "<string>", // Backend API host
"api_port": "<integer>", // Backend API port
"node_id": "<integer>", // Must match database Node Server ID
"cert_mode": "<string>", // Certificate mode: "none", "file", "http", "dns"
"cert_domain": "<string>", // Domain for certificate
"cert_file": "<string>", // Certificate file path
"key_file": "<string>", // Private key file path
"ca_file": "<string>", // CA certificate file path
"timeout": "<integer>", // Request timeout in seconds
"listen_ip": "<string>", // IP to bind for incoming connections
"send_ip": "<string>", // IP to use for outgoing connections
"device_limit": "<integer>", // Device limit per user (0 = no limit)
"speed_limit": "<integer>", // Speed limit per user (0 = no limit)
"rule_list_path": "<string>", // Path to routing rules file
"dns_type": "<string>", // DNS resolution type
"enable_dns": "<boolean>", // Enable DNS server
"disable_upload_traffic": "<boolean>",
"disable_get_rule": "<boolean>",
"disable_ivpn_check": "<boolean>",
"disable_memory_optimizations": "<boolean>",
"enable_reality_show": "<boolean>",
"enable_brutal": "<boolean>",
"brutal_debug": "<boolean>",
"enable_ip_sync": "<boolean>",
"ip_sync_interval": "<integer>" // IP sync interval in seconds
}
SSP Configuration Schema
{
"compatibility": "ssp",
"host": "<string>", // SSPanel API host
"port": "<integer>", // SSPanel API port
"node_id": "<integer>", // Must match database Node Server ID
"key": "<string>", // API authentication key
"speed_limit": "<integer>", // Speed limit per user (0 = no limit)
"device_limit": "<integer>", // Device limit per user (0 = no limit)
"rule_list_path": "<string>", // Path to routing rules file
"custom_config": {
// SSPanel-specific settings
"offset_port_user": "<integer>",
"offset_port_node": "<integer>",
"server_key": "<string>",
"host": "<string>",
"server_port": "<integer>"
}
}
Management and Observability
Server Status Management
Node Servers have three possible states:
Status Values:
["online", "offline", "maintenance"]
Status Descriptions:
- “online”: Server is healthy and processing requests
- “offline”: Server missed heartbeat threshold
- “maintenance”: Server manually marked for maintenance
Status Transitions
- Online → Offline: Automatic when the time since `last_online_time` exceeds `offline_timeout`
- Offline → Online: Automatic when the server sends a heartbeat
- Any → Maintenance: Manual admin action
- Maintenance → Online: Manual admin action + heartbeat
Heartbeat System
Node Servers maintain connectivity through heartbeat reporting via traffic upload APIs. The specific API endpoint depends on the server compatibility type:
- NewV2b servers: Use `/api/v1/server/UniProxy/push` (see the NewV2b Traffic Reporting section)
- SSP servers: Use `/mod_mu/users/traffic` (see the SSP Traffic Reporting section)
Heartbeat Behavior:
- Heartbeat is automatically updated when servers report traffic data
- `last_online_time` is set to the current timestamp on each successful API call
- Failed API calls do not update the heartbeat timestamp
Health Monitoring Configuration
Configure health check timeouts through system configuration:
{
"node_health_check_config": {
"offline_timeout": 600
}
}
Configuration Fields:
- `offline_timeout`: Seconds after which a server is marked offline (default: 600 = 10 minutes)
Health Check Logic:
- If `current_time - last_online_time > offline_timeout`, the server status becomes "offline"
- If `current_time - last_online_time <= offline_timeout`, the server status becomes "online" (see the sketch below)
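A minimal sketch of that comparison with the `time` crate (the function name is illustrative; the real check runs inside the telecom module's background task):

```rust
use time::{Duration, PrimitiveDateTime};

/// Illustrative check: a server is considered offline when its last heartbeat
/// is older than `offline_timeout_secs` seconds.
fn is_offline(
    now: PrimitiveDateTime,
    last_online_time: PrimitiveDateTime,
    offline_timeout_secs: i64,
) -> bool {
    now - last_online_time > Duration::seconds(offline_timeout_secs)
}
```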
Automated Status Updates
The system runs automated background tasks for status management:
Automated Operations:
- Periodic Status Refresh: Runs every 5 minutes to update server statuses
- Offline Detection: Marks servers offline when the time since `last_online_time` exceeds the threshold
- Online Recovery: Automatically marks servers online when they resume heartbeats
- Status History: Records status change events for monitoring and analytics
Status Update API for Monitoring:
Status updates are handled automatically by the system background tasks. Administrative monitoring uses the gRPC management API:
gRPC Service: helium.telecom_manage.NodeServerManage
Method: ListNodeServers
Request:
message ListNodeServersRequest {
int64 limit = 1;
int64 offset = 2;
optional NodeServerStatus filter_status = 3;
}
Response:
message ListNodeServersReply {
repeated NodeServerSummary servers = 1;
}
message NodeServerSummary {
int32 id = 1;
NodeServerCompatibility compatibility = 2;
NodeServerStatus status = 3;
int64 last_online_time = 4;
int64 client_number = 5;
}
enum NodeServerCompatibility {
NODE_SERVER_COMPATIBILITY_NEW_V2B = 0;
NODE_SERVER_COMPATIBILITY_SSP = 1;
}
Example gRPC call:
grpcurl -plaintext \
-d '{"limit": 100, "offset": 0}' \
localhost:50051 \
helium.telecom_manage.NodeServerManage/ListNodeServers
Administrative Operations
Node Server management is restricted to system administrators through the management gRPC service. The system implements role-based access control (RBAC) with four distinct admin levels, each with specific capabilities.
Admin Levels and Permissions
| Admin Level | Node Server Capabilities | Access Level |
|---|---|---|
| SuperAdmin | • List all servers • View detailed server information • Create new servers • Delete servers • Modify server configurations • Validate server configurations • Manual status overrides | Full access to all operations |
| Moderator | • List all servers • View detailed server information • Create new servers • Delete servers • Modify server configurations • Validate server configurations • Manual status overrides | Full access except super-admin exclusive operations |
| CustomerSupport | • List all servers (read-only) • View basic server status • Monitor server health | Read-only access for support purposes |
| SupportBot | • No direct server access | Automated systems only, no server management |
Detailed Permission Matrix
Server Management Operations:
- List Servers (`list_servers`): ✅ SuperAdmin, ✅ Moderator, ✅ CustomerSupport
- Show Server Details (`show_server`): ✅ SuperAdmin, ✅ Moderator
- Create Server (`create_server`): ✅ SuperAdmin, ✅ Moderator
- Delete Server (`delete_server`): ✅ SuperAdmin, ✅ Moderator
- Verify Configuration (`verify_server_config`): ✅ SuperAdmin, ✅ Moderator
Access Control Features:
- All operations require valid admin authentication tokens
- Each administrative action is logged and auditable
- Server deletion is protected by dependency checking (cannot delete servers with active node clients)
- Configuration changes are validated before application
- Role-based restrictions prevent unauthorized access
Safety Mechanisms:
- Dependency validation prevents accidental service disruption
- Configuration validation ensures server stability
- Comprehensive audit logging tracks all administrative changes
- Graceful handling of server state transitions
- Permission checks occur before any operation execution
Traffic Monitoring
Node Servers automatically collect traffic statistics through API endpoints:
NewV2b Traffic Reporting
Servers push traffic data periodically using the UniProxy traffic upload API:
POST /api/v1/server/UniProxy/push?node_id=1&node_type=V2ray&token=your-server-token
Content-Type: application/json
{
"123": [5000000, 1000000],
"456": [8000000, 2000000]
}
Request Details:
- Path: `/api/v1/server/UniProxy/push`
- Authentication: Query parameters (`node_id`, `node_type`, `token`)
- Body Format: `{"user_id": [download_bytes, upload_bytes]}`
- Content-Type: `application/json`
Response:
HTTP 200 OK
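For reference, a small sketch of assembling that body shape in Rust with `serde_json`; sending the request, the query parameters, and token handling are out of scope here, and the helper name is illustrative.

```rust
use std::collections::HashMap;

/// Build the documented `{"user_id": [download_bytes, upload_bytes]}` payload
/// from (user_id, download, upload) triples.
fn build_push_body(usage: &[(u64, u64, u64)]) -> serde_json::Value {
    let map: HashMap<String, [u64; 2]> = usage
        .iter()
        .map(|&(user_id, download, upload)| (user_id.to_string(), [download, upload]))
        .collect();
    serde_json::json!(map)
}
```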
SSP Traffic Reporting
SSPanel-compatible servers use the legacy traffic reporting API:
POST /mod_mu/users/traffic?node_id=1&key=your-api-key
Content-Type: application/json
[
{
"u": 123,
"d": 1000000,
"upload": 5000000
},
{
"u": 456,
"d": 2000000,
"upload": 8000000
}
]
Request Details:
- Path: `/mod_mu/users/traffic`
- Authentication: Query parameters (`node_id`, `key`)
- Body Format: Array of `{"u": user_id, "d": download_bytes, "upload": upload_bytes}`
- Content-Type: `application/json`
Response:
HTTP 200 OK
Observability Features
Status History Tracking
The system automatically records server status changes for monitoring and analytics:
Tracked Events:
- Status transitions (online ↔ offline ↔ maintenance)
- Uptime statistics and availability metrics
- Heartbeat intervals and response times
- Configuration change events
Status History API:
Status history is tracked automatically by the system. For administrative queries, use the gRPC management API:
gRPC Service: helium.telecom_manage.NodeServerManage
Method: ShowNodeServer
Request:
message ShowNodeServerRequest {
int32 id = 1;
}
Response:
message NodeServerReply {
int32 id = 1;
int64 speed_limit = 2;
string config = 3;
NodeServerStatus status = 4;
int64 last_online_time = 5;
}
enum NodeServerStatus {
NODE_SERVER_STATUS_ONLINE = 0;
NODE_SERVER_STATUS_OFFLINE = 1;
NODE_SERVER_STATUS_MAINTENANCE = 2;
}
Example gRPC call:
grpcurl -plaintext \
-d '{"id": 1}' \
localhost:50051 \
helium.telecom_manage.NodeServerManage/ShowNodeServer
Metrics and Logging
- Structured Logging: All operations use `tracing` for detailed logs
- Performance Metrics: Traffic throughput and response times
- Health Metrics: Heartbeat intervals and status transitions
- Error Tracking: Failed authentications and connection issues
When to Use Node Server vs Node Client
Use Node Server When:
- Setting up new physical infrastructure

  gRPC Service: `helium.telecom_manage.NodeServerManage`
  Method: `CreateNodeServer`

  Request:

  message CreateNodeServerRequest {
    string config = 1;
    int64 speed_limit = 2;
  }

  Response:

  message AdminEditReply {
    AdminEditResult result = 1;
  }

  Example gRPC call:

  grpcurl -plaintext \
    -d '{"config": "{\"compatibility\":\"newv2b\",\"api_host\":\"127.0.0.1\",\"api_port\":8080}", "speed_limit": 10000000000}' \
    localhost:50051 \
    helium.telecom_manage.NodeServerManage/CreateNodeServer

- Configuring proxy backend behavior
  - Protocol selection (UniProxy vs SSPanel)
  - Server-side performance settings
  - Traffic processing configuration
- Managing physical resources
  - Server capacity planning
  - Geographic deployment
  - Infrastructure monitoring
- Backend administration
  - Server health monitoring
  - Traffic aggregation
  - System maintenance
Use Node Client When:
- Creating user-facing proxy endpoints

  gRPC Service: `helium.telecom_manage.NodeClientManage`
  Method: `CreateNodeClient`

  Request:

  message CreateNodeClientRequest {
    int32 server_id = 1;
    string name = 2;
    string traffic_factor = 3;
    int32 display_order = 4;
    string client_side_config = 5;
    repeated int32 available_groups = 6;
    optional helium.telecom.NodeMetadata metadata = 7;
  }

  Response:

  message AdminEditReply {
    AdminEditResult result = 1;
  }

  Example gRPC call:

  grpcurl -plaintext \
    -d '{"server_id": 1, "name": "US West Coast", "traffic_factor": "1.0", "display_order": 100, "available_groups": [2, 3, 4], "client_side_config": "{\"protocol\":\"Vmess\"}"}' \
    localhost:50051 \
    helium.telecom_manage.NodeClientManage/CreateNodeClient

- Organizing user access
  - Different service tiers (Premium, Basic)
  - Geographic regions for users
  - Access control by user groups
- Billing and traffic management
  - Different traffic factors per endpoint
  - User-specific speed limits
  - Package-based access control
- User experience customization
  - Display names and ordering
  - Regional preferences
  - Service level differentiation
Typical Workflow
- Infrastructure Setup: Create Node Servers for physical infrastructure
- Service Configuration: Create multiple Node Clients pointing to each server
- User Management: Assign users to appropriate Node Clients based on packages
- Monitoring: Monitor Node Server health while tracking Node Client usage
Example Architecture
Physical Infrastructure Layer (Node Servers):
├── US-West-Server (NewV2b, 10GB/s capacity)
├── EU-Central-Server (NewV2b, 5GB/s capacity)
└── Asia-Pacific-Server (SSP, 3GB/s capacity)
User-Facing Layer (Node Clients):
├── US-West-Premium (→ US-West-Server, 2x traffic factor)
├── US-West-Standard (→ US-West-Server, 1x traffic factor)
├── EU-Premium (→ EU-Central-Server, 2x traffic factor)
├── EU-Standard (→ EU-Central-Server, 1x traffic factor)
├── Asia-Premium (→ Asia-Pacific-Server, 2x traffic factor)
└── Asia-Budget (→ Asia-Pacific-Server, 0.5x traffic factor)
This separation allows for:
- Flexible service offerings without infrastructure changes
- Independent scaling of physical and logical resources
- Simplified user management through logical groupings
- Cost-effective resource utilization across multiple service tiers
Security Considerations
Authentication
Node Servers authenticate using different methods depending on compatibility type:
NewV2b Authentication (Query Parameters)
POST /api/v1/server/UniProxy/push?node_id=1&node_type=V2ray&token=your-server-token
Authentication Process:
- The server includes `node_id`, `node_type`, and `token` as query parameters
- The backend validates the token and node_id combination
- If valid, request is processed and heartbeat updated
- If invalid, request is rejected with 401 Unauthorized
SSP Authentication (Query Parameters)
POST /mod_mu/users/traffic?node_id=1&key=your-api-key
Authentication Process:
- The server includes `node_id` and `key` as query parameters
- The backend validates the key and node_id combination
- If valid, request is processed and heartbeat updated
- If invalid, request is rejected with 401 Unauthorized
Authentication Configuration:
{
"telecom_config": {
"vpn_server_token": "your-secure-server-token-here"
}
}
Note: The authentication tokens are configured per node server and validated against the vpn_server_token configuration. Both NewV2b and SSP servers use the same token validation mechanism, but with different request formats.
Access Control
- Server-side configuration is admin-only
- API endpoints require proper authentication
- Traffic data is validated before processing
- Heartbeat verification prevents spoofing
Data Protection
- All traffic statistics are aggregated and anonymized
- Configuration data is encrypted at rest
- API communications use secure channels
- User identification uses secure tokens
Troubleshooting
Common Issues
-
Server Shows Offline
- Check heartbeat timing configuration
- Verify server can reach the API endpoints
- Confirm authentication tokens are correct
- Review network connectivity
-
Traffic Not Reporting
- Verify server configuration type matches API calls
- Check traffic threshold filtering (>10KB)
- Confirm database connectivity
- Review authentication tokens
-
Configuration Validation Failures
- Validate JSON syntax in server config
- Check protocol-specific requirements
- Verify all required fields are present
- Test with minimal configuration first
-
Performance Issues
- Monitor server resource utilization
- Check speed_limit configuration
- Review traffic patterns and peaks
- Consider load balancing across servers
Debugging Tools
- Health Check API: Monitor server status programmatically
- Traffic Reports: Analyze throughput and usage patterns
- Status History: Review historical availability data
- Configuration Validation: Test configs before deployment
- Structured Logging: Detailed operation traces with tracing
Best Practices
- Monitor heartbeat intervals regularly
- Use automation for status management
- Implement proper alerting for offline servers
- Regular configuration backups
- Capacity planning based on traffic trends
- Geographic distribution for reliability
Node Client
Purpose of Node Client
Node Client serves as the user-facing proxy endpoint in the Helium telecom system. It provides a crucial abstraction layer that enables flexible service delivery while maintaining efficient resource utilization.
Service Tier Differentiation
The primary purpose of Node Client is to enable different service tiers to utilize the same physical infrastructure while providing distinct user experiences. Consider this scenario:
Physical Infrastructure (Node Server):
├── High-performance server in US-West datacenter
│ ├── 10Gbps bandwidth capacity
│ └── Premium network routing
User-Facing Services (Node Clients):
├── "US Premium" (traffic_factor: 2.0, high-priority routing)
├── "US Standard" (traffic_factor: 1.0, standard routing)
└── "US Budget" (traffic_factor: 0.5, economy routing)
All three service tiers connect to the same physical density server, but users access them through different entry servers that provide different characteristics:
- Budget Plan: Uses cheaper entry server with higher latency to user, but same backend density server
- Premium Plan: Uses premium entry server with optimized routing, same backend density server
- Standard Plan: Balanced entry server performance, same backend density server
Business Value
This architecture enables:
- Cost-Effective Infrastructure: One physical server supports multiple service tiers
- Flexible Pricing Models: Different billing rates (`traffic_factor`) for the same infrastructure
- Access Control: User package groups control which nodes are accessible
- Geographic Organization: Logical grouping by region while optimizing server placement
- Service Quality Differentiation: Different route classes and metadata per service tier
Concept of Node Client
Architecture Overview
Node Client operates as the entry point of the proxy line - it’s what users see and configure in their proxy client applications.
User's Proxy Client → Node Client (User-facing) → Node Server (Infrastructure) → Internet
Key Relationships
| Component | Role | Visibility | Configuration Focus |
|---|---|---|---|
| Node Server | Physical infrastructure | Admin-only | Server-side proxy protocols, capacity |
| Node Client | User-facing endpoint | User-visible | Client-side connection settings, billing |
Core Concepts
1. Server Relationship
pub struct NodeClient {
pub server_id: i32, // Points to the physical Node Server
// ... other fields
}
Each Node Client must reference an existing Node Server. This creates a 1:N relationship where one physical server can support multiple user-facing endpoints.
2. Traffic Factor System
// Billing calculation:
// Billed Traffic = Actually Used Traffic × Traffic Factor
pub traffic_factor: Decimal,
Traffic factor enables flexible billing models:
- `0.5`: Budget tier (user is billed for half of actual usage)
- `1.0`: Standard tier (user is billed for actual usage)
- `2.0`: Premium tier (user is billed double, typically in exchange for better service)
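A minimal sketch of that calculation with the `rust_decimal` crate, matching the `Decimal` field shown above; the function name is illustrative.

```rust
use rust_decimal::Decimal;

/// Billed Traffic = Actually Used Traffic × Traffic Factor
fn billed_traffic(used_bytes: u64, traffic_factor: Decimal) -> Decimal {
    Decimal::from(used_bytes) * traffic_factor
}
```

For example, 10 GB of raw usage on a 0.5× budget node is billed as 5 GB, while the same usage on a 2.0× premium node is billed as 20 GB.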
3. Access Control System
pub available_groups: Vec<i32>, // Package groups that can access this node
Users can only access Node Clients if their active package belongs to one of the available_groups. This enables:
- Package-based access control: Different subscription tiers access different nodes
- Geographic restrictions: Certain packages only access specific regions
- Service level enforcement: Premium packages get access to premium nodes
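The rule boils down to a simple membership check; a hedged sketch follows (real enforcement happens server-side in the telecom module, and the function name is illustrative):

```rust
/// A user may access a Node Client only if their active package's group
/// appears in the client's `available_groups`.
fn can_access(available_groups: &[i32], user_package_group: i32) -> bool {
    available_groups.contains(&user_package_group)
}
```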
4. Protocol Configuration
pub client_side_config: Json<NodeClientConfig>,
Node Client stores the client-side protocol configuration that determines how users connect. This includes protocol-specific settings for:
- VMess, VLess, Trojan, Shadowsocks, Hysteria2, WireGuard, etc.
- Connection parameters (hostname, port, encryption methods)
- Transport settings (WebSocket, gRPC, TCP, etc.)
5. Metadata System
pub struct NodeClientMetadata {
pub country: Option<CountryCode>, // Geographic identification
pub location: Option<Locations>, // Regional classification
pub route_class: Option<RouteClass>, // Service quality indicator
}
Route Classes define service quality expectations:
- `SpecialCustom`: Enterprise-grade infrastructure
- `Premium`: High-end infrastructure (IPLC, dedicated lines)
- `Backbone`: Standard backbone infrastructure (most common)
- `GlobalAccess`: International access nodes
- `Budget`: Cost-optimized infrastructure
- `Experimental`: Testing and development nodes
Configuration of Node Client
Core Configuration Fields
pub struct NodeClient {
pub id: i32, // Unique identifier
pub server_id: i32, // Physical server reference
pub name: String, // Display name for users
pub traffic_factor: Decimal, // Billing multiplier
pub display_order: i32, // Sort order in client apps
pub client_side_config: Json<NodeClientConfig>, // Protocol configuration
pub available_groups: Vec<i32>, // Access control groups
pub node_metadata: Json<NodeClientMetadata>, // Geographic/quality metadata
pub created_at: PrimitiveDateTime, // Creation timestamp
pub updated_at: PrimitiveDateTime, // Last modification
}
Configuration Examples
These JSON examples show the API request format that frontend applications should use when creating Node Clients.
Creating a Premium VMess Node Client
{
"server_id": 1,
"name": "🇺🇸 US Premium West",
"traffic_factor": "2.0",
"display_order": 100,
"available_groups": [1, 2],
"client_side_config": {
"protocol": "Vmess",
"v": 2,
"hostname": "premium-us.example.com",
"port": 443,
"alter_id": 0,
"encrypt_method": "auto",
"network": "ws",
"fake_type": "none",
"host": "premium-us.example.com",
"path": "/premium-path",
"tls": "tls",
"sni": "premium-us.example.com",
"alpn": null,
"fingerprint": null
},
"metadata": {
"country": "US",
"location": "north_america",
"route_class": "premium"
}
}
Key Configuration Points:
- `server_id`: References the physical Node Server (ID: 1)
- `traffic_factor`: "2.0" means users pay 2x actual usage (premium pricing)
- `available_groups`: [1, 2] restricts access to Premium and Enterprise packages
- `protocol`: "Vmess" specifies the VMess protocol with WebSocket transport
- `route_class`: "premium" indicates high-end infrastructure
Creating a Budget Shadowsocks Node Client
{
"server_id": 3,
"name": "🇸🇬 Singapore Budget",
"traffic_factor": "0.5",
"display_order": 900,
"available_groups": [3, 4],
"client_side_config": {
"protocol": "Ss",
"server": "budget-asia.example.com",
"port": 8080,
"cipher": "aes-256-gcm",
"server_key": null,
"obfs": null,
"plugin": null
},
"metadata": {
"country": "SG",
"location": "southeast_asia",
"route_class": "budget"
}
}
Key Configuration Points:
- `server_id`: References a different physical server (ID: 3)
- `traffic_factor`: "0.5" means users pay half of actual usage (budget pricing)
- `available_groups`: [3, 4] restricts access to Basic and Student packages
- `protocol`: "Ss" specifies Shadowsocks with AES-256-GCM encryption
- `display_order`: 900 (lower priority in client app display)
Creating a Hysteria2 High-Performance Node
{
"server_id": 2,
"name": "🇩🇪 Germany Hysteria2",
"traffic_factor": "1.5",
"display_order": 200,
"available_groups": [1, 2, 5],
"client_side_config": {
"protocol": "Hy2",
"server": "hy2-eu.example.com",
"port": 443,
"ports": "20000-55000",
"obfs": "salamander",
"obfs_password": "secret123",
"alpn": ["h3"],
"up": "100 Mbps",
"down": "500 Mbps",
"sni": "hy2-eu.example.com",
"skip_cert_verify": false,
"ca": null,
"ca_str": null,
"fingerprint": null,
"cwnd": 32
},
"metadata": {
"country": "DE",
"location": "europe",
"route_class": "backbone"
}
}
Key Configuration Points:
- `traffic_factor`: "1.5" for premium protocol pricing
- `available_groups`: [1, 2, 5] for Premium, Enterprise, and Gaming packages
- `protocol`: "Hy2" specifies Hysteria2 with QUIC transport
- `ports`: Port range for UDP multiplexing
- `obfs`: "salamander" obfuscation method
Protocol Support
Node Client supports a comprehensive range of proxy protocols through the NodeClientConfig enum:
| Protocol | Use Case | Authentication Method |
|---|---|---|
| VMess | General purpose, good compatibility | UUID-based |
| VLess | Modern, lower overhead than VMess | UUID-based |
| Trojan | Designed to bypass DPI | Password-based |
| Shadowsocks | Lightweight, good performance | Password-based |
| ShadowsocksR | Enhanced Shadowsocks with obfuscation | Password-based |
| Hysteria2 | High-performance UDP-based protocol | Password-based |
| Tuic | QUIC-based, low latency | UUID + Password |
| WireGuard | VPN protocol, excellent performance | Private key-based |
| Trojan-Go | Enhanced Trojan implementation | Password-based |
| HTTP/HTTPS/SOCKS | Basic proxy protocols | Username + Password |
Configuration Validation
The system provides configuration validation through the gRPC management API:
gRPC Service: helium.telecom_manage.NodeClientManage
Method: VerifyNodeClientConfig
Request:
message VerifyNodeClientConfigRequest {
string config = 1;
}
Response:
message VerifyReply {
bool valid = 1;
}
Example gRPC call:
grpcurl -plaintext \
-d '{"config": "{\"protocol\":\"Vmess\",\"hostname\":\"test.example.com\",\"port\":443}"}' \
localhost:50051 \
helium.telecom_manage.NodeClientManage/VerifyNodeClientConfig
Frontend applications should validate configurations before creating Node Clients to ensure proper protocol settings and prevent runtime errors.
JSON Schema Reference
For frontend developers, here are the key JSON structures:
Node Client Creation Request
{
"server_id": "<integer>", // Required: Physical server ID
"name": "<string>", // Required: Display name
"traffic_factor": "<decimal_string>", // Required: Billing multiplier (e.g., "1.0", "2.5")
"display_order": "<integer>", // Required: Sort order (higher = lower priority)
"available_groups": ["<integer>"], // Required: Package group IDs
"client_side_config": {
// Required: Protocol configuration
"protocol": "<protocol_name>" // Required: See Protocol Support table
// Protocol-specific fields vary
},
"metadata": {
// Optional: Geographic/quality metadata
"country": "<iso_country_code>", // Optional: Two-letter country code
"location": "<location_enum>", // Optional: Geographic region
"route_class": "<route_class_enum>" // Optional: Service quality tier
}
}
Location Enum Values
[
"north_america",
"south_america",
"europe",
"east_asia",
"southeast_asia",
"south_asia",
"middle_east",
"africa",
"oceania",
"arctic",
"antarctic"
]
Route Class Enum Values
[
"special_custom",
"premium",
"backbone",
"global_access",
"budget",
"experimental"
]
Common Protocol Configurations
VMess Protocol:
{
"protocol": "Vmess",
"v": 2,
"hostname": "<hostname>",
"port": "<integer>",
"alter_id": "<integer>",
"encrypt_method": "<string|null>",
"network": "<string|null>",
"fake_type": "<string|null>",
"host": "<string|null>",
"path": "<string|null>",
"tls": "<string|null>",
"sni": "<string|null>",
"alpn": "<string[]|null>",
"fingerprint": "<string|null>"
}
Shadowsocks Protocol:
{
"protocol": "Ss",
"server": "<hostname>",
"port": "<integer>",
"cipher": "<string>",
"server_key": "<string|null>",
"obfs": "<object|null>",
"plugin": "<object|null>"
}
Hysteria2 Protocol:
{
"protocol": "Hy2",
"server": "<hostname>",
"port": "<integer>",
"ports": "<string|null>",
"obfs": "<string|null>",
"obfs_password": "<string|null>",
"alpn": "<string[]|null>",
"up": "<string|null>",
"down": "<string|null>",
"sni": "<string|null>",
"skip_cert_verify": "<boolean>",
"ca": "<string|null>",
"ca_str": "<string|null>",
"fingerprint": "<string|null>",
"cwnd": "<integer|null>"
}
Administrative Management
Node Client management requires administrator privileges with role-based access control:
| Admin Level | Permissions |
|---|---|
| SuperAdmin | Full CRUD operations, configuration validation |
| Moderator | Full CRUD operations, configuration validation |
| CustomerSupport | Read-only access for support purposes |
| SupportBot | No direct access to node management |
When to Use Node Client vs Node Server
For a comprehensive comparison of when to use Node Client versus Node Server, including decision matrices, workflow examples, and use case scenarios, see the Node Server vs Node Client Usage Guide.
Quick Decision Guide for Node Client
Use Node Client when you need:
- User-facing service tiers (Premium, Standard, Budget)
- Service differentiation with same physical infrastructure
- Flexible billing models via traffic factors
- Package-based access control
- Geographic service organization
- Protocol-specific client configurations
Node Client Specific Examples
The following examples demonstrate Node Client’s unique capabilities:
1. Creating User-Facing Service Tiers
Scenario: Same physical server (ID: 1), multiple service levels
Premium Tier:
{
"server_id": 1,
"name": "🏆 US Premium Plus",
"traffic_factor": "2.0",
"available_groups": [1],
"client_side_config": {
"protocol": "Vmess",
"hostname": "premium.example.com",
"port": 443,
"network": "ws",
"tls": "tls"
},
"metadata": {
"route_class": "premium"
}
}
Standard Tier:
{
"server_id": 1,
"name": "⚡ US Standard",
"traffic_factor": "1.0",
"available_groups": [2, 3],
"client_side_config": {
"protocol": "Vmess",
"hostname": "standard.example.com",
"port": 443,
"network": "tcp"
},
"metadata": {
"route_class": "backbone"
}
}
Budget Tier:
{
"server_id": 1,
"name": "💰 US Economy",
"traffic_factor": "0.5",
"available_groups": [4],
"client_side_config": {
"protocol": "Ss",
"server": "budget.example.com",
"port": 8080,
"cipher": "aes-256-gcm"
},
"metadata": {
"route_class": "budget"
}
}
2. Implementing Geographic Service Organization
Physical servers in strategic locations:
- EU Server (ID: 2): Frankfurt datacenter
- Asia Server (ID: 3): Singapore datacenter
UK Node Client (uses Frankfurt server):
{
"server_id": 2,
"name": "🇬🇧 United Kingdom",
"traffic_factor": "1.0",
"available_groups": [1, 2, 3],
"client_side_config": {
"protocol": "Vless",
"hostname": "uk.example.com",
"port": 443,
"encrypt_method": "none",
"network": "ws",
"tls": "tls"
},
"metadata": {
"country": "GB",
"location": "europe",
"route_class": "premium"
}
}
Singapore Node Client (uses Singapore server):
{
"server_id": 3,
"name": "🇸🇬 Singapore",
"traffic_factor": "1.0",
"available_groups": [1, 2, 3],
"client_side_config": {
"protocol": "Vless",
"hostname": "sg.example.com",
"port": 443,
"encrypt_method": "none",
"network": "tcp"
},
"metadata": {
"country": "SG",
"location": "southeast_asia",
"route_class": "backbone"
}
}
3. Protocol Differentiation for Same Server
Multi-protocol capable server (ID: 4):
VMess Endpoint:
{
"server_id": 4,
"name": "VMess - US West",
"traffic_factor": "1.0",
"display_order": 100,
"available_groups": [2, 3, 4],
"client_side_config": {
"protocol": "Vmess",
"v": 2,
"hostname": "vmess.example.com",
"port": 443,
"alter_id": 0,
"network": "ws",
"tls": "tls"
}
}
Hysteria2 Endpoint (same server, faster protocol):
{
"server_id": 4,
"name": "Hysteria2 - US West",
"traffic_factor": "1.2",
"display_order": 150,
"available_groups": [1, 2],
"client_side_config": {
"protocol": "Hy2",
"server": "hy2.example.com",
"port": 443,
"obfs": "salamander",
"up": "50 Mbps",
"down": "200 Mbps"
}
}
4. Access Control and Package Management
Enterprise Dedicated Node (restricted access):
{
"server_id": 1,
"name": "🏢 Enterprise Dedicated",
"traffic_factor": "3.0",
"available_groups": [1],
"client_side_config": {
"protocol": "Vless",
"hostname": "enterprise.example.com",
"port": 443,
"encrypt_method": "none",
"network": "grpc",
"tls": "tls"
},
"metadata": {
"route_class": "special_custom"
}
}
Consumer Standard Node (broader access):
{
"server_id": 1,
"name": "👤 Consumer Standard",
"traffic_factor": "1.0",
"available_groups": [2, 3, 4],
"client_side_config": {
"protocol": "Vmess",
"v": 2,
"hostname": "consumer.example.com",
"port": 443,
"alter_id": 0,
"network": "ws"
},
"metadata": {
"route_class": "backbone"
}
}
Node Server Use Cases
For complete Node Server use cases and detailed infrastructure examples, see the Node Server documentation.
1. Setting Up Physical Infrastructure
Node Server setup focuses on backend infrastructure configuration:
{
"server_side_config": {
"compatibility": "newv2b"
// UniProxy protocol configuration for actual proxy server
// Backend-specific settings not visible to end users
},
"speed_limit": 10000000000
}
Key Focus:
- Server-side proxy protocol configuration
- Physical infrastructure capacity (10GB/s total)
- Backend performance tuning
- Not visible to end users
2. Configuring Backend Proxy Behavior
- Protocol selection (UniProxy vs SSPanel compatibility)
- Server-side performance settings
- Traffic processing configuration
- Capacity and resource limits
3. Physical Resource Management
- Server capacity planning
- Geographic server deployment
- Infrastructure health monitoring
- Hardware resource allocation
4. Backend System Administration
- Server heartbeat monitoring
- Traffic aggregation processing
- System maintenance operations
- Infrastructure-level configuration
Decision Matrix
| Requirement | Use Node Client | Use Node Server |
|---|---|---|
| User-facing configuration | ✅ Yes | ❌ No |
| Service tier differentiation | ✅ Yes | ❌ No |
| Billing rate control | ✅ Yes | ❌ No |
| Access control by package | ✅ Yes | ❌ No |
| Protocol-specific client config | ✅ Yes | ❌ No |
| Physical infrastructure setup | ❌ No | ✅ Yes |
| Server capacity management | ❌ No | ✅ Yes |
| Backend protocol configuration | ❌ No | ✅ Yes |
| System resource monitoring | ❌ No | ✅ Yes |
Development Workflow
For the complete development workflow including infrastructure setup, service configuration, and monitoring phases, see the Development Workflow Guide in the Node Server documentation.
Node Client Management APIs
For Node Client specific operations, use these gRPC endpoints:
Create Node Client:
// helium.telecom_manage.NodeClientManage/CreateNodeClient
message CreateNodeClientRequest {
int32 server_id = 1;
string name = 2;
string traffic_factor = 3;
int32 display_order = 4;
string client_side_config = 5;
repeated int32 available_groups = 6;
optional helium.telecom.NodeMetadata metadata = 7;
}
Edit Access Groups:
// helium.telecom_manage.NodeClientManage/EditNodeClientGroups
message EditNodeClientGroupsRequest {
int32 id = 1;
repeated int32 available_groups = 2;
}
List Node Clients:
// helium.telecom_manage.NodeClientManage/ListNodeClients
message ListNodeClientsRequest {}
message ListNodeClientsReply {
repeated NodeClientAdminGlance nodes = 1;
}
See Also: Node Server gRPC APIs for server management operations.
Node Client Best Practices
- Service Tier Strategy: Use Node Client to create multiple service tiers (Premium, Standard, Budget) on the same physical infrastructure
- Traffic Factor Planning: Set appropriate billing multipliers: 0.5 for budget tiers, 1.0 for standard service, 2.0+ for premium tiers
- Access Control: Always configure available_groups to enforce package-based restrictions
- User Experience: Use meaningful names, emojis, and proper display_order for client applications
- Geographic Metadata: Include country codes and location metadata to help users choose optimal endpoints
- Protocol Selection: Choose appropriate protocols based on target user base and technical requirements
For comprehensive best practices covering both Node Client and Node Server management, see the Best Practices Guide in the Node Server documentation.
Version Control of Packages
This document explains how the package version control system works in Helium, ensuring purchased packages remain consistent while allowing marketing teams to update offerings. This is essential for maintaining service reliability and customer trust.
The Concept of Packages
A package is the fundamental service offering unit that defines what a user receives when they purchase a production. Each package contains:
- Traffic Limit: Maximum data transfer allowed (in bytes)
- Max Client Number: Maximum simultaneous client connections
- Expire Duration: How long the package remains valid after activation
- Available Group: Access control group determining which proxy nodes are accessible
- Version: Version number for tracking changes
Packages are organized into Package Series - logical groupings identified by UUID that contain multiple versions of the same offering. When a user purchases a production, they receive a package from the associated series.
pub struct Package {
pub id: i64,
pub series: Uuid, // Groups related packages
pub version: i32, // Version within the series
pub is_master: bool, // Currently delivered version
pub available_group: i32, // Access control group
pub max_client_number: i32, // Connection limit
pub expire_duration: PgInterval, // Validity period
pub traffic_limit: i64, // Data transfer limit
}
Version Control System
The version control system ensures package stability for users while enabling content updates for marketing. The core principle is:
Production always delivers the master version of a package series
Master Package Concept
Within each package series, exactly one package is marked as is_master = true. This is the version that:
- Gets delivered to users when they purchase a production
- Appears in production listings and user interfaces
- Represents the current “live” offering
-- Finding the master package for a series
SELECT * FROM "telecom"."packages"
WHERE series = $1 AND is_master = TRUE
LIMIT 1
Version Control Benefits
- User Protection: Once a user purchases a package, their service parameters never change unexpectedly
- Marketing Flexibility: Marketing can update package contents by creating new versions
- Rollback Capability: Previous versions remain available for troubleshooting or rollbacks
- Audit Trail: Complete history of package changes through version tracking
Two Types of Editing Operations
The system supports two distinct editing patterns depending on the impact and intent:
Create New Version (Version-Changing Edit)
When to use: When making changes that affect the core service offering or user experience.
Use cases:
- Changing traffic limits or client connection limits
- Modifying expire duration
- Updating available proxy groups
- Any change that affects what users receive
Process:
- Create a new package in the same series with incremented version
- Set the new package as is_master = true
- Set the previous master as is_master = false
- New purchases will receive the new version
- Existing users keep their original package version
Example:
-- Old master: series=uuid-123, version=1, is_master=true, traffic_limit=100GB
-- New master: series=uuid-123, version=2, is_master=true, traffic_limit=200GB
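As a hedged illustration of how such a version-changing edit could be performed at the database level (the admin APIs are not yet implemented), the following sqlx-based helper demotes the current master and inserts the incremented version in one transaction. The promote_new_version helper and the copy-from-latest strategy are assumptions; table and column names follow the Package struct above.
// Illustrative only: promote a new package version with a larger traffic limit.
// The real workflow may differ from this sketch.
use uuid::Uuid;

async fn promote_new_version(
    pool: &sqlx::PgPool,
    series: Uuid,
    new_traffic_limit: i64,
) -> Result<(), sqlx::Error> {
    let mut tx = pool.begin().await?;

    // Demote the current master of the series.
    sqlx::query(
        r#"UPDATE "telecom"."packages" SET is_master = FALSE WHERE series = $1 AND is_master = TRUE"#,
    )
    .bind(series)
    .execute(&mut *tx)
    .await?;

    // Insert the next version as the new master, copying the remaining
    // service parameters from the latest existing version.
    sqlx::query(
        r#"
        INSERT INTO "telecom"."packages"
            (series, version, is_master, available_group, max_client_number, expire_duration, traffic_limit)
        SELECT series, version + 1, TRUE, available_group, max_client_number, expire_duration, $2
        FROM "telecom"."packages"
        WHERE series = $1
        ORDER BY version DESC
        LIMIT 1
        "#,
    )
    .bind(series)
    .bind(new_traffic_limit)
    .execute(&mut *tx)
    .await?;

    tx.commit().await
}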
Update Without Version Change (Non-Version Edit)
When to use: When making changes that don’t affect core functionality or user experience.
Use cases:
- Internal metadata updates
- Performance optimizations that don’t change user-visible behavior
- Bug fixes that don’t alter service parameters
- Administrative flags or internal tracking data
Process:
- Directly update the existing package record
- Version number remains unchanged
- is_master status remains unchanged
- Changes may affect both new and existing users (use carefully)
Example:
-- Update internal flags without affecting service delivery
UPDATE "telecom"."packages"
SET internal_metadata = $1
WHERE id = $2
Decision Matrix
| Change Type | Version Impact | User Impact | Edit Type |
|---|---|---|---|
| Traffic limit increase | High | Positive | New Version |
| Client limit change | High | Variable | New Version |
| Proxy group modification | High | Variable | New Version |
| Performance optimization | Low | None | No Version Change |
| Internal metadata | None | None | No Version Change |
| Bug fix (no behavior change) | Low | Positive | No Version Change |
Admin Capabilities
Administrators have several tools for managing packages and the version control system:
Package Queue Management
Admins can directly manage user package assignments:
- Add Queued Package: Assign specific packages to users
- Cancel Queued Package: Remove packages from user queues
- List/Count Queued Packages: Monitor package distribution
pub struct AdminAddQueuedPackage {
pub user_id: Uuid,
pub package_id: i64, // Direct package ID, not series
pub by_order: Option<Uuid>, // Optional order reference
}
Production Management
Admins control the production catalog that references package series:
- Create Production: Define new offerings linked to package series
- Delete Production: Remove offerings from the market
- View Production Details: See master package information
pub struct AdminCreateProduction {
pub package_series: Uuid, // References the series
pub package_amount: i32, // Number of packages to deliver
// ... other production fields
}
Version Control Operations
Current Limitations: Direct package creation/editing APIs are not yet implemented in the admin interface; package management currently happens at the database level.
Typical Admin Workflow:
- Create new package versions via database operations
- Update is_master flags to promote new versions
- Create/update productions to reference package series
- Monitor package queues and user assignments
Admin Roles and Permissions
Different admin roles have different package management capabilities:
- SuperAdmin: Full access to all package operations
- Moderator: Can manage package queues and productions
- CustomerSupport: Can add/cancel user packages for support purposes
Best Practices
- Always use version changes for user-facing modifications
- Test new package versions before setting as master
- Maintain clear version history with meaningful version increments
- Monitor user impact when promoting new package versions
- Keep rollback capability by preserving previous versions
- Document version changes for team coordination
Technical Integration
Database Schema
-- Package series grouping
CREATE TABLE "telecom"."package_series" (
id UUID PRIMARY KEY
);
-- Individual packages with version control
CREATE TABLE "telecom"."packages" (
id BIGSERIAL PRIMARY KEY,
series UUID REFERENCES "telecom"."package_series"(id),
version INTEGER NOT NULL,
is_master BOOLEAN NOT NULL DEFAULT FALSE,
-- ... service parameters
UNIQUE(series, version)
);
Key Queries
-- Get current master package for a series
SELECT * FROM packages WHERE series = ? AND is_master = TRUE;
-- Promote a package to master (transaction required)
BEGIN;
UPDATE packages SET is_master = FALSE WHERE series = ?;
UPDATE packages SET is_master = TRUE WHERE id = ?;
COMMIT;
This version control system ensures that the platform can evolve its offerings while maintaining service consistency for existing users, providing both stability and flexibility for business operations.
Package Queue
Overview
The Package Queue is a core component of the Telecom module that manages user packages in a queue-based system. It ensures that users can have multiple packages but only one active package at a time, with automatic activation of the next package when the current one expires.
Concept
What is Package Queue?
The Package Queue is a system that manages the lifecycle of telecom packages for users. Think of it as a “playlist” for packages - users can have multiple packages in their queue, but only one plays (is active) at a time. When the current package expires or is consumed, the system automatically activates the next package in line.
Key Characteristics
- Single Active Package Rule: Each user can only have one active package at any given time
- FIFO Queue: Packages are activated in First-In-First-Out order based on created_at timestamp
- Automatic Activation: When an active package expires, the next queued package is automatically activated
- Traffic Tracking: Each package tracks upload/download usage with quota adjustments
- Event-Driven: Package lifecycle changes trigger events for other system components
Package States
pub enum LivePackageStatus {
/// The package is in the queue, but not active
InQueue,
/// The package that the user is using
Active,
/// The package that has expired due to time or traffic limits
Consumed,
/// The package that was cancelled (e.g., refunded)
Cancelled,
}
How it Works
Data Structure
The core data structure is PackageQueueItem:
pub struct PackageQueueItem {
pub id: i64,
pub user_id: Uuid,
pub package_id: i64, // Reference to package definition
pub by_order: Option<Uuid>, // Optional order ID that created this item
pub status: LivePackageStatus,
pub created_at: PrimitiveDateTime,
pub activated_at: Option<PrimitiveDateTime>,
// Traffic usage tracking
pub upload: i64, // Billed upload traffic in bytes
pub download: i64, // Billed download traffic in bytes
pub adjust_quota: i64, // Quota adjustment (can be negative or positive)
}
Queue Processing Workflow
1. Package Creation
When packages are purchased, they are added to the queue with InQueue status:
// Single package
CreateQueueItem { user_id, package_id, by_order }
// Multiple identical packages
CreateQueueItems { user_id, package_id, by_order, amount }
2. Automatic Activation
The system automatically activates packages through the process_package_queue_push function:
pub async fn process_package_queue_push(
transaction: &mut sqlx::Transaction<'_, sqlx::Postgres>,
user_id: Uuid,
) -> Result<PackageQueuePushResult, sqlx::Error>
Activation Logic:
- Check if user has an active package
- If active package exists, do nothing
- If no active package, find the oldest queued package (ORDER BY created_at)
- Activate the found package by setting status = 'active' and activated_at = NOW()
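A minimal sketch of this activation step with sqlx, assuming the telecom.package_queue table and status labels shown elsewhere in this document (the actual query may differ):
// Illustrative activation query: promote the oldest queued item, if any.
// The 'in_queue' label is an assumption about the Postgres enum spelling.
async fn activate_next_package(
    tx: &mut sqlx::Transaction<'_, sqlx::Postgres>,
    user_id: uuid::Uuid,
) -> Result<Option<i64>, sqlx::Error> {
    let activated: Option<(i64,)> = sqlx::query_as(
        r#"
        UPDATE "telecom"."package_queue"
        SET status = 'active'::telecom.live_package_status, activated_at = NOW()
        WHERE id = (
            SELECT id FROM "telecom"."package_queue"
            WHERE user_id = $1 AND status = 'in_queue'::telecom.live_package_status
            ORDER BY created_at
            LIMIT 1
        )
        RETURNING id
        "#,
    )
    .bind(user_id)
    .fetch_optional(&mut **tx)
    .await?;
    Ok(activated.map(|(id,)| id))
}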
3. Traffic Usage Recording
Traffic usage is recorded through the billing system:
pub struct RecordPackageUsage {
pub user_id: Uuid,
pub upload: i64, // Additional upload traffic to bill
pub download: i64, // Additional download traffic to bill
}
Billing Logic:
- Find the user’s active package
- Add the new traffic to existing usage counters
- Check if total usage exceeds limit: (upload + download) >= (traffic_limit + adjust_quota)
- If limit exceeded, automatically set status to Consumed
4. Package Expiration
Packages can expire due to two reasons:
- Time expiration: Handled by cron jobs that check expire_at timestamps
- Usage expiration: Triggered automatically when traffic limits are exceeded during billing
When a package expires, a PackageExpiringEvent is published.
5. Queue Advancement
When a package expires, the system automatically activates the next package:
- PackageExpiringEvent is consumed by TelecomPackageQueueHook
- Expired package status is updated to Consumed
- System looks for the next InQueue package for the user
- If found, activates it and publishes PackageActivateEvent
- If no more packages, publishes AllPackageExpiredEvent
Concurrency Control
The system uses Redis-based distributed locks to prevent race conditions:
pub struct PackageQueueLock;
Lock Usage:
- Lock ID: User ID (LockId(user_id))
- TTL: 30 seconds default
- Retry Logic: Up to 10 retries for lock acquisition
- Operations Protected:
- Package queue push processing
- Package expiration handling
- Package activation
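A rough sketch of the acquire-with-retry pattern behind this lock, using a plain Redis SET NX EX command; the key name, value, and helper are hypothetical, not the actual PackageQueueLock implementation:
// Illustrative lock acquisition: SET key NX EX 30, retried up to 10 times.
// Cleanup relies on TTL expiry (or an explicit DEL once the work is done).
async fn try_acquire_package_queue_lock(
    conn: &mut redis::aio::MultiplexedConnection,
    user_id: uuid::Uuid,
) -> redis::RedisResult<bool> {
    let key = format!("lock:package_queue:{user_id}");
    for _ in 0..10 {
        let acquired: bool = redis::cmd("SET")
            .arg(&key)
            .arg("1")
            .arg("NX")
            .arg("EX")
            .arg(30)
            .query_async(conn)
            .await?;
        if acquired {
            return Ok(true);
        }
        tokio::time::sleep(std::time::Duration::from_millis(200)).await;
    }
    Ok(false)
}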
Event System
The Package Queue publishes several events for system integration:
PackageQueuePushEvent
pub struct PackageQueuePushEvent {
pub item_ids: Vec<i64>,
pub user_id: Uuid,
pub package_id: i64,
pub pushed_at: u64,
}
- Purpose: Internal event when packages are added to queue
- Route: telecom.package_queuing
PackageActivateEvent
pub struct PackageActivateEvent {
pub item_id: i64,
pub user_id: Uuid,
pub package_id: i64,
pub activated_at: u64,
}
- Purpose: Internal event when a package becomes active
- Route: telecom.package_activate
PackageExpiringEvent
pub struct PackageExpiringEvent {
pub item_id: i64,
pub user_id: Uuid,
pub package_id: i64,
pub expired_at: u64,
pub reason: PackageExpiredReason, // Time or Usage
}
- Purpose: Internal event when a package expires
- Route: telecom.package_expiring
Service Layer
The PackageQueueService provides high-level operations:
pub struct PackageQueueService {
pub db: DatabaseProcessor,
pub redis: RedisConnection,
}
Available Operations:
- GetCurrentPackage: Get user's active package info
- GetAllMyPackages: List all packages for a user
Database Schema
The package queue is stored in the telecom.package_queue table with the following key indexes:
- User-based queries: (user_id, status)
- Queue ordering: (user_id, status, created_at)
- Package lookup: (package_id)
Usage Examples
For Developers
Adding Packages to Queue
// Add single package
let item = db.process(CreateQueueItem {
user_id: user.id,
package_id: package.id,
by_order: Some(order.id),
}).await?;
// Add multiple identical packages
let items = db.process(CreateQueueItems {
user_id: user.id,
package_id: package.id,
by_order: Some(order.id),
amount: 5,
}).await?;
Getting Active Package
let service = PackageQueueService { db, redis };
let active_package = service.process(GetCurrentPackage {
user_id: user.id,
}).await?;
Recording Traffic Usage
// This is typically done by the billing system
let record = db.process(RecordPackageUsage {
user_id: user.id,
upload: 1024 * 1024, // 1MB upload
download: 10 * 1024 * 1024, // 10MB download
}).await?;
if record.map(|r| r.expired).unwrap_or(false) {
// Package expired due to usage, will trigger queue advancement
}
Integration Points
- Shop Module: Creates queue items when users purchase packages
- Billing System: Records traffic usage and triggers expiration
- Node Management: Checks active packages for user access control
- Admin Interface: Views and manages user package queues
Best Practices
- Always use transactions when modifying package queue state
- Handle lock acquisition failures gracefully with retries
- Listen to package events for system integration
- Consider quota adjustments when calculating available traffic
- Test concurrent scenarios due to multi-user nature of the system
Common Pitfalls
- Race Conditions: Always use Redis locks when modifying queue state
- Transaction Boundaries: Ensure event publishing happens after database commits
- Zero-Duration Packages: Handle edge cases where packages expire immediately
- Quota Calculations: Remember that adjust_quota can be negative (see the sketch below)
- Event Ordering: Package events may arrive out of order in distributed systems
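For the quota calculation pitfall, a tiny sketch of the effective-limit arithmetic (the function name is illustrative):
// adjust_quota can be negative, so compute the effective limit first
// and clamp the remaining allowance at zero.
fn remaining_traffic(traffic_limit: i64, adjust_quota: i64, upload: i64, download: i64) -> i64 {
    let effective_limit = traffic_limit + adjust_quota;
    (effective_limit - (upload + download)).max(0)
}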
Usage Recording Flow
Overview
The Usage Recording Flow is a comprehensive traffic monitoring and billing system in the Telecom Module that tracks user bandwidth consumption, applies billing multipliers, and processes package usage. This system ensures accurate billing while providing detailed analytics for both users and administrators.
The flow consists of three main phases:
- Data Collection: Node servers report raw traffic data
- Aggregation: Raw usage is collected, multiplied by traffic factors, and prepared for billing
- Billing: Aggregated usage is applied to user packages, potentially expiring them when limits are exceeded
System Architecture
┌─────────────────┐ ┌──────────────────────┐ ┌─────────────────────┐
│ Node Servers │───▶│ Traffic Report APIs │───▶│ Raw Usage Storage │
└─────────────────┘ └──────────────────────┘ └─────────────────────┘
│
▼
┌─────────────────────┐ ┌──────────────────┐ ┌──────────────────────┐
│ Package Expiry │◀───│ Billing Hook │◀───│ Cron Jobs - Traffic │
│ Events │ │ │ │ Billing │
└─────────────────────┘ └──────────────────┘ └──────────────────────┘
▲ │ │
│ ▼ ▼
┌─────────────────────┐ ┌──────────────────┐ ┌──────────────────────┐
│ Package Expiration │ │ Package Usage │ │ Usage Aggregation │
│ Check │ │ Update │ │ │
└─────────────────────┘ └──────────────────┘ └──────────────────────┘
│
▼
┌──────────────────────┐
│ Message Queue │
│ (UserUsageBilling │
│ Event) │
└──────────────────────┘
Data Flow:
- Node Servers → Report traffic via APIs (ReportTraffic, UploadUniProxyTraffic)
- Traffic Report APIs → Store raw usage in user_traffic_usage table
- Cron Jobs → Periodically gather unbilled usage and aggregate with traffic factors
- Usage Aggregation → Create UserUsageBillingEvent messages
- Message Queue → Distribute billing events to processing hooks
- Billing Hook → Apply usage to user packages and check limits
- Package Usage Update → Update package usage counters
- Package Expiration Check → Determine if package limits exceeded
- Package Expiry Events → Trigger package expiration workflows
Core Components
1. Data Collection Layer
Node Traffic Reporting APIs
- ReportTraffic: Legacy SSP-compatible traffic reporting
- UploadUniProxyTraffic: Modern UniProxy traffic reporting
Both APIs collect traffic reports from node servers and store them as raw usage records.
2. Data Storage
user_traffic_usage Table
Stores individual traffic records with the following key fields:
- user_id: User UUID
- upload/download: Raw traffic in bytes
- node_client_id: Source node information
- timestamp: When traffic occurred
- has_been_billed: Billing status flag
3. Aggregation & Billing System
Components:
- TelecomCronExecutor: Scheduled jobs for billing
- GatherUnbilledUsage: Aggregates and marks unbilled traffic
- UserUsageBillingEvent: Message queue events for billing
- TelecomBillingHook: Processes billing events and updates packages
Detailed Flow Walkthrough
Phase 1: Data Collection
Node servers periodically report traffic usage to the system:
// Example: UniProxy traffic reporting
let report = UploadUniProxyTraffic {
node_id: 1,
records: vec![
TrafficReportRecord {
user_id: 12345, // number_id from token system
upload: 1048576, // 1MB uploaded
download: 5242880, // 5MB downloaded
}
]
};
// Process the report
let result = node_server_service.process(report).await?;
Key Processing Steps:
- Validation: Only records with total traffic > 10KB are processed
- User Resolution: The number_id is resolved to the actual user_id via tokens
- Node Matching: System finds the appropriate node_client based on user's active package
- Storage: Raw usage is stored in user_traffic_usage with has_been_billed = FALSE
// Internal process: InsertTrafficReportBatch
impl Processor<InsertTrafficReportBatch, Result<(), sqlx::Error>> for DatabaseProcessor {
async fn process(&self, input: InsertTrafficReportBatch) -> Result<(), sqlx::Error> {
// Complex SQL that resolves number_id -> user_id -> node_client
// and inserts traffic records with proper node association
}
}
Phase 2: Aggregation & Billing Preparation
The system runs periodic cron jobs (typically every few minutes) to process unbilled usage:
// Cron job execution
impl Processor<PrimitiveDateTime, Result<Box<[BillTrafficJob]>, Error>> for TelecomCronExecutor {
async fn process(&self, _input: PrimitiveDateTime) -> Result<Box<[BillTrafficJob]>, Error> {
// Gather all unbilled usage and create billing jobs
let find_result = self
.db
.process(GatherUnbilledUsage) // Critical aggregation step
.await?
.into_iter()
.map(BillTrafficJob)
.collect::<Box<[_]>>();
Ok(find_result)
}
}
Critical Aggregation Process (GatherUnbilledUsage):
-- This query does several important things:
WITH updated AS (
UPDATE "telecom"."user_traffic_usage" AS u
SET has_been_billed = TRUE -- Mark as billed to prevent double-billing
FROM "telecom"."node_client" AS nc
WHERE u.node_client_id = nc.id AND u.has_been_billed = FALSE
RETURNING
nc.server_id,
u.user_id,
CEIL(u.download::numeric * nc.traffic_factor) AS billed_download, -- Apply traffic multiplier
CEIL(u.upload::numeric * nc.traffic_factor) AS billed_upload,
u.timestamp
)
SELECT
server_id,
user_id,
SUM(billed_download)::BIGINT AS billed_download,
SUM(billed_upload)::BIGINT AS billed_upload,
MAX(timestamp) AS time
FROM updated
GROUP BY server_id, user_id -- Aggregate by server and user
Key Operations:
- Atomic Marking: Marks records as billed to prevent race conditions
- Traffic Factor Application: Applies node-specific billing multipliers (nc.traffic_factor)
- Aggregation: Groups usage by server and user for efficient processing
- Ceiling Function: Ensures no fractional billing (always rounds up)
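The factor-and-ceiling arithmetic can be sketched in Rust with rust_decimal, as the rest of the codebase uses; the helper itself is illustrative, not the production code path:
use rust_decimal::prelude::*;

// Apply the node's traffic_factor and round up, so billing never undercounts.
fn billed_bytes(raw: i64, traffic_factor: Decimal) -> i64 {
    (Decimal::from(raw) * traffic_factor)
        .ceil()
        .to_i64()
        .unwrap_or(i64::MAX)
}

// Example: billed_bytes(1_000_000, Decimal::new(15, 1)) == 1_500_000 (factor 1.5).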
Phase 3: Package Billing & Expiration
For each aggregated usage record, the system publishes a billing event:
// Create and send billing event
let event = UserUsageBillingEvent {
server_id: item.server_id,
user: item.user_id,
billed_download: item.billed_download,
billed_upload: item.billed_upload,
time: start_time.assume_utc().unix_timestamp() as u64,
};
event.send(&mq).await?; // Send to message queue
The TelecomBillingHook consumes these events and applies usage to user packages:
impl Processor<UserUsageBillingEvent, Result<(), Error>> for TelecomBillingHook {
async fn process(&self, event: UserUsageBillingEvent) -> Result<(), Error> {
// Apply usage to user's active package
let Some(record) = self
.db
.process(RecordPackageUsage {
user_id: event.user,
upload: event.billed_upload,
download: event.billed_download,
})
.await?
else {
error!(user_id = %event.user, "Cannot find package for user");
return Err(Error::NotFound);
};
// Check if package exceeded limits
if record.expired {
// Send package expiration event
let ev = PackageExpiringEvent {
item_id: record.item_id,
user_id: record.user_id,
package_id: record.package_id,
expired_at: event.time,
reason: PackageExpiredReason::Usage, // Expired due to traffic usage
};
ev.send(&self.mq).await?;
}
Ok(())
}
}
Package Usage Recording (RecordPackageUsage):
This is the most critical operation that:
- Finds user’s currently active package
- Adds billed traffic to package usage counters
- Checks if total usage exceeds package limits
- Automatically transitions package status to ‘consumed’ if limits exceeded
-- Simplified version of the package usage update query
UPDATE "telecom"."package_queue" AS pq
SET upload = pq.upload + $2,
download = pq.download + $3,
status = CASE
WHEN pq.upload + $2 + pq.download + $3 >= p.traffic_limit + pq.adjust_quota
THEN 'consumed'::telecom.live_package_status
ELSE 'active'::telecom.live_package_status
END
FROM "telecom"."packages" AS p
WHERE pq.user_id = $1 AND pq.package_id = p.id
Developer Usage Guide
Adding New Traffic Sources
To add support for new node types or reporting formats:
- Create Traffic Report Structure:
#[derive(Debug, Clone, PartialEq)]
pub struct MyCustomTrafficReport {
pub node_id: i32,
pub usage_records: Vec<MyCustomRecord>,
}
- Implement Processor:
impl Processor<MyCustomTrafficReport, Result<ReportResult, Error>> for NodeServerService {
async fn process(&self, input: MyCustomTrafficReport) -> Result<ReportResult, Error> {
// Convert to standard TrafficReportRecord format
let records: Vec<TrafficReportRecord> = input
.usage_records
.into_iter()
.map(|r| TrafficReportRecord {
number_id: r.user_number,
upload: r.sent_bytes,
download: r.received_bytes,
})
.filter(|r| r.download + r.upload > 10_000) // Filter minimum threshold
.collect();
// Use existing batch insertion
self.db.process(InsertTrafficReportBatch {
server_id: input.node_id,
timestamp: now(),
records,
}).await?;
Ok(ReportResult::Ok)
}
}
Querying Usage Data
Get User’s Recent Usage
// Get hourly usage for the last 24 hours
let usage_data = db.process(GetUserHourlyUsage {
user: user_id,
begin: now - Duration::hours(24),
end: now,
}).await?;
for record in usage_data {
println!("Hour: {}, Raw: {}MB, Billed: {}MB",
record.time,
(record.upload + record.download) / 1_048_576,
(record.billed_upload + record.billed_download) / 1_048_576
);
}
Monitor Unbilled Usage
// Check for pending billing (useful for monitoring)
let unbilled = db.process(GatherUnbilledUsage).await?;
println!("Found {} users with unbilled usage", unbilled.len());
Analytics & Reporting
Traffic Factor Impact Analysis
// Compare raw vs billed traffic to understand traffic factor impact
let usage = db.process(GetUserDailyUsage {
user_id,
begin: start_date,
end: end_date,
}).await?;
for day in usage {
let raw_total = day.upload + day.download;
let billed_total = day.billed_upload + day.billed_download;
let factor = billed_total as f64 / raw_total as f64;
println!("Date: {}, Factor: {:.2}x", day.time, factor);
}
Important Developer Considerations
1. Race Conditions & Data Integrity
- Billing Flag: The has_been_billed flag prevents double-billing during concurrent cron jobs
- Atomic Updates: Use database transactions for critical operations
- Event Ordering: Message queue ensures billing events are processed in order
2. Traffic Factor System
- Node-Specific Multipliers: Each node_client has a traffic_factor (e.g., 1.0, 1.5, 2.0)
- Ceiling Rounding: Always rounds up to prevent under-billing
- Factor Changes: Changing factors only affects new traffic, not historical data
3. Performance Optimization
- Minimum Threshold: Only records with >10KB total traffic are processed
- Batch Processing: Traffic reports are processed in batches for efficiency
- Index Usage: Ensure proper indexing on user_id, has_been_billed, and timestamp
4. Error Handling
// Always handle missing package scenarios
if let Some(record) = db.process(RecordPackageUsage { ... }).await? {
// Process successful usage recording
} else {
// User has no active package - this is expected for expired/inactive users
warn!("User {} has no active package for billing", user_id);
}
5. Monitoring & Alerting
Key metrics to monitor:
- Unbilled Records: Should trend toward zero between cron runs
- Failed Billing Events: Errors in TelecomBillingHook processing
- Package Expiration Rate: Monitor PackageExpiringEvent frequency
- Traffic Factor Distribution: Ensure factors are applied correctly
6. Testing Considerations
When writing tests:
// Always test with realistic traffic factors
let test_factor = 1.5;
let raw_usage = 1_000_000; // 1MB
let expected_billed = (raw_usage as f64 * test_factor).ceil() as i64; // 1,500,000
// Test edge cases around package limits
let package_limit = 1_000_000_000; // 1GB
let usage_just_under = package_limit - 1;
let usage_just_over = package_limit + 1;
This usage recording system provides robust traffic monitoring and accurate billing that scale to high-traffic proxy networks while maintaining data integrity and providing detailed analytics.
Observability
This document describes the observability features implemented in the telecom module. These features enable users and administrators to monitor system performance, track usage, and maintain visibility into the health of the telecom infrastructure.
User Observability Features
The telecom module provides several APIs that allow users to monitor their usage and the status of their assigned nodes.
Traffic Usage Tracking
The AnalysisService provides comprehensive traffic monitoring capabilities for users through the GetRecentTrafficUsage API.
API Endpoint
- gRPC Service: Telecom.GetRecentTrafficUsage
- Request: GetRecentTrafficUsageRequest
- Response: RecentTrafficUsageResponse
Implementation Details
The service provides traffic data in different time ranges:
- Day: Hourly bucketing for the last 24 hours
- Week: Daily bucketing for the last 7 days
- Month: Daily bucketing for the last 30 days
Key Components:
- Raw Traffic: Actual traffic consumed by the user
- Billed Traffic: Traffic that was actually charged to the user’s quota (may differ due to traffic multipliers)
// Usage example in service layer
use crate::services::analysis::{AnalysisService, GetRecentTrafficUsage, RecentRange};
let usage_response = analysis_service.process(GetRecentTrafficUsage {
user_id: user_id,
range: RecentRange::Day, // or Week, Month
}).await?;
// Response contains two data sets:
// - usage_response.raw: actual traffic consumed
// - usage_response.actually_billed: traffic charged to quota
Database Queries Used:
- GetUserHourlyUsage: For day-range queries
- GetUserDailyUsage: For week/month-range queries
Node Status History
Users can monitor the historical status of their assigned proxy nodes through the ListNodeStatusHistory API.
API Endpoint
- gRPC Service: Telecom.ListNodeStatusHistory
- Request: ListNodeStatusHistoryRequest
- Response: ListNodeStatusHistoryReply
Implementation Details
The API provides hourly aggregated node status information:
- Online Nodes: Count of nodes that were online in each hour
- Offline Nodes: Count of nodes that were offline in each hour
- Maintenance Nodes: Count of nodes under maintenance in each hour
// Usage example
use crate::services::analysis::{AnalysisService, ListUserNodeStatusHistory};
let history = analysis_service.process(ListUserNodeStatusHistory {
start: start_time,
end: end_time,
user_id: user_id,
}).await?;
// Each history entry contains:
// - bucket_start: timestamp of the hour
// - online_nodes, offline_nodes, maintenance_nodes: counts for that hour
Data Source: Uses materialized view node_status_hourly_mv for efficient querying.
Node Usage Analytics
The system tracks which nodes users utilize most frequently through the ListUsuallyUsedNodes API.
API Endpoint
- gRPC Service: Telecom.ListUsuallyUsedNodes
- Request: ListUsuallyUsedNodesRequest
- Response: ListUsuallyUsedNodesResponse
Implementation Details
Provides analytics on user’s node usage patterns:
- Node Information: ID, name of frequently used nodes
- Traffic Statistics: Upload, download, and billed traffic per node
// Usage example
use crate::services::analysis::{AnalysisService, ListUserUsuallyUsedNodes};
let nodes = analysis_service.process(ListUserUsuallyUsedNodes {
user_id: user_id,
}).await?;
// Each node entry contains:
// - node_client_id, node_name: identification
// - upload, download, billed_traffic: usage statistics
Node List and Status
Users can view their available nodes and their current status through the ListNodes API.
API Endpoint
- gRPC Service: Telecom.ListNodes
- Request: ListNodesRequest
- Response: ListNodesReply
Implementation Details
Provides real-time information about user’s assigned nodes:
- Node Details: ID, name, traffic factor, display order
- Performance Info: Speed limits, current status
- Metadata: Country, location, route class
Admin Observability Features
Administrators have access to comprehensive monitoring and management capabilities for the entire telecom infrastructure.
Server Monitoring
List Node Servers
Admins can monitor all proxy servers in the system.
- gRPC Service: NodeServerManage.ListNodeServers
- Features:
- Filter by server status (Online/Offline/Maintenance)
- Pagination support (limit/offset)
- Shows server compatibility, status, last online time, and client count
// Usage example in manage service
use crate::services::manage::{AdminListServers, ManageService};
let servers = manage_service.process(AdminListServers {
limit: 50,
offset: 0,
filter_status: Some(NodeServerStatus::Offline), // Optional filter
}).await?;
Show Individual Server Details
- gRPC Service: NodeServerManage.ShowNodeServer
- Features:
- Complete server configuration
- Current status and performance metrics
- Last online timestamp
Node Client Management
List All Node Clients
Comprehensive view of all proxy node clients.
- gRPC Service: NodeClientManage.ListNodeClients
- Features:
- Complete client information including server relationships
- Traffic factors and routing configurations
- Status monitoring and metadata
Individual Client Details
- gRPC Service: NodeClientManage.ShowNodeClient
- Features:
- Detailed client configuration
- Associated server information
- Performance and status metrics
Package Queue Monitoring
Queue Statistics
Monitor package queue health and performance.
- gRPC Service: PackageQueueManage.CountQueuedPackages
- Features:
- Count of packages by series
- Queue status overview
Package List Management
- gRPC Service: PackageQueueManage.ListQueuedPackages
- Features:
- Filter by user, order, package, or status
- Pagination support
- Complete package lifecycle visibility
Background Job Monitoring
The telecom module runs several scheduled jobs for system maintenance and monitoring:
Node Health Monitoring (RefreshServerStatus)
- Purpose: Automatically mark servers as online/offline based on heartbeat
- Frequency: Configurable via TelecomConfig.node_health_check.offline_timeout
- Implementation: TelecomCronExecutor in cron.rs
Package Expiration Management (PackageExpiringJob)
- Purpose: Automatically expire packages based on time limits
- Frequency: Regular scanning for expired packages
- Events: Publishes PackageExpiringEvent to message queue
Traffic Billing Processing (BillTrafficJob)
- Purpose: Process unbilled traffic usage and publish billing events
- Frequency: Regular processing of accumulated traffic data
- Events: Publishes UserUsageBillingEvent for each user
Node Status History Recording (RecordNodeStatusHistoryJob)
- Purpose: Record current status of all node servers for historical analysis
- Frequency: Hourly status snapshots
- Storage: Populates the node_status_history table
Status View Refresh (RefreshNodeStatusViewJob)
- Purpose: Refresh the materialized view for efficient status queries
- Frequency: Regular refresh of node_status_hourly_mv
Event-Driven Observability
Usage Billing Events
The system processes usage data through asynchronous events:
UserUsageBillingEvent
- Publisher: External systems (cron jobs, usage collectors)
- Consumer: TelecomBillingHook
- Route: telecom.user_usage_billing
// Event structure
pub struct UserUsageBillingEvent {
pub server_id: i32,
pub user: Uuid,
pub billed_download: i64,
pub billed_upload: i64,
pub time: u64,
}
PackageExpiringEvent
- Publisher: Telecom billing system
- Route: Package expiration processing
- Purpose: Handle package lifecycle events
Tracing and Instrumentation
All RPC endpoints and critical services include comprehensive tracing:
- Instrumentation: Uses tracing::instrument for observability
- Performance Tracking: Request/response times and error rates
Database Schema for Observability
Core Tables
node_status_history
Stores historical node status data:
- id: Primary key
- node_server_id: Reference to node server
- status: Online/Offline/Maintenance
- created_at: Timestamp
user_package_usage
Tracks user traffic consumption:
- Hourly and daily aggregations
- Raw and billed traffic separation
- User and server associations
Materialized Views
node_status_hourly_mv
Optimized view for status history queries:
- Hourly aggregations of node status
- Efficient querying for analytics
- Automatic refresh via cron jobs
Usage Guidelines
For Users
- Use GetRecentTrafficUsage to monitor bandwidth consumption
- Check ListNodeStatusHistory for node availability patterns
- Analyze ListUsuallyUsedNodes to optimize node selection
- Monitor ListNodes for real-time node status
For Administrators
- Use server management APIs to monitor infrastructure health
- Monitor package queues for system performance
- Review cron job logs for automated maintenance status
- Analyze event streams for system-wide observability
Development Considerations
- All APIs follow the Processor pattern
- Database connections use owned types, not static lifetimes
- Comprehensive error handling with structured logging
- Event-driven architecture for scalable monitoring
- Materialized views for performance-critical queries
This observability framework provides complete visibility into the telecom system’s operation, enabling both users and administrators to monitor, analyze, and optimize the service effectively.
Fetching Config
Overview
The Fetching Config system provides a subscription-based mechanism for users to dynamically retrieve proxy client configurations. This system allows users to get up-to-date proxy server configurations compatible with their preferred proxy clients without manual intervention.
The Concept of Subscribe Link
What is a Subscribe Link?
A subscribe link is a unique URL that allows users to fetch their personalized proxy configuration from the telecom service. Each user has a unique subscription token that grants access to their allowed proxy nodes based on their active package and permissions.
How Subscribe Links Work
- Token Generation: Each user gets a unique subscription_link_key (UUID) stored in the nodes_token table
- Template URLs: The system supports multiple subscribe endpoints configured via SubscribeLinkConfig
- Dynamic Generation: The final subscribe URL is generated by replacing the {SUBSCRIBE_TOKEN} placeholder with the user's actual token
- Access Control: Users can only access nodes their active package group allows
Subscribe Link Architecture
pub struct SubscribeLinkTemplate {
pub url_template: String, // e.g., "https://subscribe.congyu.moe/subscribe/{SUBSCRIBE_TOKEN}"
pub endpoint_name: String, // e.g., "default"
}
The default configuration provides a template like:
https://subscribe.congyu.moe/subscribe/{SUBSCRIBE_TOKEN}
Which becomes a working URL like:
https://subscribe.congyu.moe/subscribe/550e8400-e29b-41d4-a716-446655440000
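Rendering the concrete link is a straightforward placeholder substitution; the helper below is a hypothetical sketch of that step, not the actual implementation:
// Substitute the user's token into a configured template.
fn render_subscribe_link(url_template: &str, subscribe_token: uuid::Uuid) -> String {
    url_template.replace("{SUBSCRIBE_TOKEN}", &subscribe_token.to_string())
}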
Supported Proxy Clients
The system supports the following proxy clients through the libsubconv library:
Supported Client Types
| Client Name | Description | Detection Keywords |
|---|---|---|
| Clash | Popular cross-platform proxy client | clash, stash, shadowrocket, meta |
| V2Ray | Core V2Ray client | v2ray |
| SingBox | Next-generation universal proxy platform | singbox |
| QuantumultX | Advanced proxy client for iOS | quantumult |
| Loon | Network proxy client for iOS | loon |
| Surfboard | Network proxy client for iOS | surfboard |
| Surge4 | Advanced network toolbox | surge |
| Trojan | Uses V2Ray format | - |
Client Detection
The system automatically detects the client type using two methods:
- Query Parameter: ?client=clash (explicit specification)
- User-Agent Header: Automatic detection based on HTTP User-Agent string
The detection logic prioritizes explicit query parameters over User-Agent detection.
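A self-contained sketch of that priority order; the enum and keyword lists below are illustrative stand-ins for the real ClientName type and libsubconv detection logic:
// Illustrative detection only: the real system uses ClientName and libsubconv.
#[derive(Debug, PartialEq)]
enum DetectedClient {
    Clash,
    SingBox,
    QuantumultX,
    V2Ray,
}

fn detect_client(query_client: Option<&str>, user_agent: &str) -> DetectedClient {
    // 1. An explicit ?client= parameter takes priority.
    if let Some(c) = query_client {
        match c.to_ascii_lowercase().as_str() {
            "clash" => return DetectedClient::Clash,
            "singbox" => return DetectedClient::SingBox,
            "quantumult" | "quantumultx" => return DetectedClient::QuantumultX,
            "v2ray" => return DetectedClient::V2Ray,
            _ => {} // unknown value: fall back to User-Agent detection
        }
    }
    // 2. Otherwise match keywords in the User-Agent header.
    let ua = user_agent.to_ascii_lowercase();
    if ["clash", "stash", "shadowrocket", "meta"].iter().any(|k| ua.contains(*k)) {
        DetectedClient::Clash
    } else if ua.contains("singbox") {
        DetectedClient::SingBox
    } else if ua.contains("quantumult") {
        DetectedClient::QuantumultX
    } else {
        DetectedClient::V2Ray // assumed default when nothing matches
    }
}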
How Proxy Client Config is Generated
Generation Process Flow
- Authentication: Validate the subscription token and retrieve user information
- Node Filtering: Apply user’s package permissions and filter options
- Node Retrieval: Fetch available nodes based on user’s active package group
- Format Conversion: Convert node configurations to client-specific format
- Response Generation: Generate final configuration with proper headers
Configuration Generation Steps
// 1. Validate subscription token
let token = FindNodesTokenBySubscribeId { subscribe_id };
// 2. Get user's active package and available nodes
let nodes = ListUserNodeClientConfigs {
user_id: token.user_id,
filter_option: NodeFilterOption { ... }
};
// 3. Convert to client-specific format
match client_name {
ClientName::Clash => {
let nodes = cores.into_iter()
.filter_map(|c| c.to_clash_node())
.collect();
Clash::generate(nodes).stringify()
},
ClientName::SingBox => {
let nodes = cores.into_iter()
.filter_map(|c| c.to_singbox_node())
.collect();
SingBox::generate(nodes).stringify()
},
// ... other clients
}
Node Filtering Options
The system supports sophisticated filtering based on:
- Country Exclusion: exclude_country - List of country codes to exclude
- Location Filtering: only_locations - Only include specific locations; exclude_locations - Exclude specific locations
- Route Class Filtering: only_route_classes - Only include specific route classes; exclude_route_classes - Exclude specific route classes
Available Locations
pub enum Locations {
NorthAmerica,
SouthAmerica,
Europe,
EastAsia,
SoutheastAsia,
MiddleEast,
Africa,
Oceania,
Antarctica,
}
Available Route Classes
pub enum RouteClass {
/// Highest class, this kind of node is for enterprise customers.
SpecialCustom,
/// This class means the node is using high-end infrastructure like IPLC.
Premium,
/// This class means the node is using backbone infrastructure.
Backbone,
/// This class means the node is out of major countries and regions, provided for global access.
GlobalAccess,
/// This class means the node is a budget node, provided for budget sensitive customers.
Budget,
}
RESTful API Reference
Get Subscribe Links
gRPC Endpoint: TelecomService.GetSubscribeLinks
Purpose: Retrieve available subscribe link templates for a user
Request:
message GetSubscribeLinksRequest {}
Response:
message GetSubscribeLinksReply {
repeated SubscribeLink links = 1;
string subscribe_token = 2;
}
message SubscribeLink {
string url_template = 1;
string endpoint_name = 2;
}
Fetch Subscribe Content
HTTP Endpoint: GET /subscribe/{token}
Purpose: Retrieve proxy configuration for a specific subscription token
Path Parameters
| Parameter | Type | Description |
|---|---|---|
| token | UUID | User’s subscription token |
Query Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| client | ClientName | No | Force specific client type (overrides User-Agent detection) |
| exclude_country | CountryCode[] | No | List of country codes to exclude |
| only_locations | Locations[] | No | Only include nodes from these locations |
| exclude_locations | Locations[] | No | Exclude nodes from these locations |
| only_route_classes | RouteClass[] | No | Only include nodes of these route classes |
| exclude_route_classes | RouteClass[] | No | Exclude nodes of these route classes |
Example Requests
Basic Request:
GET /subscribe/550e8400-e29b-41d4-a716-446655440000
User-Agent: Clash/1.0
With Filtering:
GET /subscribe/550e8400-e29b-41d4-a716-446655440000?client=clash&exclude_country=CN,RU&only_route_classes=premium,backbone
Force Client Type:
GET /subscribe/550e8400-e29b-41d4-a716-446655440000?client=singbox
Response Format
The response format varies by client type:
Headers:
- Content-Type: Varies by client (e.g., application/yaml for Clash, application/json for SingBox)
- Custom headers with subscription information (traffic limits, expiration, etc.)
Response Body: Client-specific configuration format
Error Responses
| HTTP Status | Description |
|---|---|
| 404 Not Found | Invalid subscription token or no available nodes |
| 500 Internal Server Error | Server processing error |
Configuration Management
Subscribe link endpoints are configured via the TelecomConfig:
pub struct SubscribeLinkConfig {
pub endpoints: Vec<SubscribeLinkEndpoint>,
}
pub struct SubscribeLinkEndpoint {
pub url_template: String, // Template with {SUBSCRIBE_TOKEN} placeholder
pub endpoint_name: String, // Human-readable endpoint name
}
Default Configuration:
{
"subscribe_link": {
"endpoints": [
{
"url_template": "https://subscribe.congyu.moe/subscribe/{SUBSCRIBE_TOKEN}",
"endpoint_name": "default"
}
]
}
}
Implementation Notes
Security Considerations
- Token Validation: All requests must include a valid subscription token
- Package Verification: Users can only access nodes their package allows
- Rate Limiting: Consider implementing rate limiting for subscribe endpoints
- Token Rotation: Subscription tokens should be rotatable for security
Performance Considerations
- Caching: Node configurations can be cached to reduce database load
- Filtering: Client-side filtering is applied efficiently using database indexes
- Conversion: Node format conversion is optimized per client type
Maintenance Tasks
- Monitor Usage: Track subscribe link usage patterns
- Update Templates: Manage subscribe link templates through configuration
- Clean Up: Remove expired or unused subscription tokens
- Client Support: Add support for new proxy clients as needed
Market Module
The Market Module is responsible for managing affiliate marketing systems within the Helium platform. It provides comprehensive affiliate functionality including invite codes, referral tracking, reward calculation, and revenue distribution.
Overview
The market module implements a complete affiliate marketing system that allows users to invite new customers and earn commissions from their purchases. The system is built with the following core components:
- Affiliate Policy Management: Configurable commission rates and invitation rules per user
- Invite Code System: Generation and management of unique invitation codes
- Referral Tracking: Automatic tracking of user registration through invite codes
- Reward Calculation: Dynamic commission calculation based on order amounts and rates
- Revenue Management: Accumulated reward tracking and withdrawal functionality
Core Features
User Invitation System
- Generate unique 8-character alphanumeric invite codes
- Maximum configurable number of active codes per user
- Automatic code deactivation and cleanup
- Referral relationship establishment during user registration
Commission & Rewards
- Configurable commission rates per user
- Trigger limits per referred user to prevent abuse
- Automatic reward calculation on order payments
- Real-time affiliate statistics tracking
- Secure withdrawal system with balance verification
Administrative Controls
- Centralized affiliate policy management
- Global configuration for default rates and limits
- Comprehensive audit trail for all affiliate activities
- Integration with user balance and order systems
Module Structure
The market module follows the standard Helium module structure:
- entities/: Database models and operations for affiliate data
- services/: Business logic processors implementing affiliate functionality
- rpc/: gRPC service implementations for client communication
- hooks/: Event listeners for user registration and order processing
- events/: Internal event definitions for reward processing
- config.rs: Module configuration structure
Integration Points
The market module integrates with several other Helium modules:
- Auth Module: Listens to user registration events to initialize affiliate policies
- Shop Module: Processes order payment events to trigger reward calculations
- Manage Module: Uses configuration system for affiliate settings
- Redis: Stores module configuration for fast access
- RabbitMQ: Event-driven communication with other modules
API Endpoints
The module exposes the following gRPC APIs:
- GetAffiliateStats: Retrieve user’s affiliate statistics and invite codes
- ListInviteCodes: List active invite codes for a user
- CreateInviteCode: Generate new invite codes
- DeleteInviteCode: Deactivate invite codes
- WithdrawAffiliateReward: Transfer earned rewards to user balance
All APIs follow the Processor pattern for consistent business logic processing.
Database Schema
The module uses a dedicated market schema with the following tables:
- affiliate_policy: Per-user commission rates and invitation settings
- affiliate_stats: Aggregated revenue and referral statistics
- invite_code: User-generated invitation codes
- affiliate_relation: Historical record of reward transactions
Event Flow
The affiliate system operates through an event-driven architecture:
- User Registration: When users register with an invite code, the system creates affiliate policies and establishes referral relationships
- Order Payment: When referred users make purchases, the system calculates and awards commissions to inviters
- Reward Processing: Affiliate rewards are processed asynchronously through internal events
- Balance Updates: Successful withdrawals update user account balances atomically
This documentation provides developers with the necessary understanding to maintain, extend, and integrate with the market module’s affiliate functionality.
Affiliate System
The Affiliate System is the core feature of the Market Module, providing a comprehensive referral and commission management system. This document details the implementation, data flow, and usage patterns for developers working with the affiliate functionality.
System Architecture
The affiliate system is built on an event-driven architecture that processes user interactions across multiple modules:
User Registration → Affiliate Policy Creation
Order Payment → Reward Calculation → Balance Update
Invite Code Generation → Code Validation → Referral Tracking
Core Components
1. Affiliate Policy (affiliate_policy)
Each user has an affiliate policy that defines their participation in the referral system:
- Reward Rate: Commission percentage (e.g., 0.1 = 10%)
- Trigger Time Per User: Maximum times a single referred user can trigger rewards
- Invitation Rights: Whether the user can create invite codes
- Referral Chain: Who invited this user (if anyone)
2. Invite Codes (invite_code)
Users generate unique codes to invite new customers:
- 8-character alphanumeric codes: Generated using secure randomization
- Active status: Codes can be deactivated without deletion
- User limits: Configurable maximum codes per user
- Collision handling: Automatic retry on code conflicts
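A sketch of the generate-and-retry approach described above, assuming the rand crate; the code_exists closure stands in for the actual database uniqueness check:
use rand::{distributions::Alphanumeric, Rng};

// Generate one candidate 8-character alphanumeric code.
fn generate_code() -> String {
    rand::thread_rng()
        .sample_iter(&Alphanumeric)
        .take(8)
        .map(char::from)
        .collect()
}

// Retry a bounded number of times if the candidate collides with an existing code.
fn create_unique_code(code_exists: impl Fn(&str) -> bool) -> Option<String> {
    for _ in 0..5 {
        let code = generate_code();
        if !code_exists(&code) {
            return Some(code);
        }
    }
    None // caller maps this to a "too many collisions" error
}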
3. Affiliate Statistics (affiliate_stats)
Real-time tracking of affiliate performance:
- Total Revenue: Cumulative commission earned
- Withdrawn Revenue: Amount already transferred to user balance
- Referral Count: Number of users successfully referred
- Available Balance: total_revenue - withdrawn_revenue
4. Affiliate Relations (affiliate_relation)
Historical record of all reward transactions for audit and analytics.
Data Flow
1. User Registration Flow
When a new user registers with an invite code:
// Hook: RegisterHook in hooks/register.rs
impl Processor<UserRegisterEvent, Result<(), Error>> for RegisterHook {
async fn process(&self, ev: UserRegisterEvent) -> Result<(), Error> {
// 1. Load affiliate configuration
let cfg = self.load_config().await?;
// 2. Validate and find invite code
let mut invited_by = None;
if let Some(code) = ev.referral_code.clone() {
if let Some(inv) = self.db.process(FindInviteCodeByCode { code }).await? {
invited_by = Some(inv.user_id);
}
}
// 3. Create affiliate policy for new user
self.db.process(CreateAffiliatePolicy {
user_id: ev.user_id,
trigger_time_per_user: cfg.default_trigger_time_per_user,
cannot_invite: false,
rate: cfg.default_reward_rate,
invited_by,
}).await?;
Ok(())
}
}
2. Order Payment Flow
When a referred user makes a purchase:
// Hook: OrderHook in hooks/orders.rs
impl Processor<OrderPaidEvent, Result<(), Error>> for OrderHook {
async fn process(&self, event: OrderPaidEvent) -> Result<(), Error> {
// 1. Skip account balance payments (no commission)
if matches!(order.payment_method, Some(PaymentMethod::AccountBalance)) {
return Ok(());
}
// 2. Find invitee's affiliate policy
let invitee_policy = self.db.process(FindAffiliatePolicyByUser {
user_id: event.user_id,
}).await?;
// 3. Check if user was referred
if let Some(inviter) = invitee_policy.invited_by {
// 4. Verify trigger limits
let count = self.db.process(CountPaidOrders {
user_id: event.user_id,
}).await?;
if count <= inviter_policy.trigger_time_per_user as i64 {
// 5. Publish reward event
let reward = AffiliateReward {
inviter,
invitee: event.user_id,
order_id: event.order_id,
};
reward.send(&self.mq).await?;
}
}
Ok(())
}
}
3. Reward Processing Flow
The system processes affiliate rewards asynchronously:
// Hook: RewardHook in hooks/reward.rs
impl Processor<AffiliateReward, Result<(), Error>> for RewardHook {
async fn process(&self, ev: AffiliateReward) -> Result<(), Error> {
// 1. Load inviter's policy for commission rate
let policy = self.db.process(FindAffiliatePolicyByUser {
user_id: ev.inviter,
}).await?;
// 2. Calculate reward amount
let order = self.db.process(FindOrderByIdOnly {
order_id: ev.order_id,
}).await?;
let reward = order.total_amount * policy.rate;
// 3. Create affiliate relation and update stats
self.db.process(CreateAffiliateRelationAndReward {
from: ev.invitee,
to: ev.inviter,
order_id: ev.order_id,
rate: policy.rate,
reward,
}).await?;
Ok(())
}
}
Service Implementation
The AffiliateService provides the main business logic using the Processor pattern:
Core Operations
Get Affiliate Statistics
pub struct GetMyAffiliateStats {
pub user_id: Uuid,
}
// Returns: MyAffiliateStats with policy, stats, and invite codes
Invite Code Management
pub struct CreateMyInviteCode { pub user_id: Uuid }
pub struct DeleteMyInviteCode { pub user_id: Uuid, pub code_id: i64 }
pub struct ListMyInviteCodes { pub user_id: Uuid }
Reward Withdrawal
pub struct WithdrawAffiliateReward {
pub user_id: Uuid,
pub amount: Decimal,
}
The withdrawal operation is atomic and includes:
- Balance verification (sufficient available rewards)
- Affiliate stats update (increase withdrawn amount)
- User balance credit
- Transaction logging
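A hedged sketch of that atomic flow as a single sqlx transaction; the market.affiliate_stats and shop.user_balance table and column names are assumptions based on the schema overview, the error value is a stand-in, and the transaction-logging step is elided:
use rust_decimal::Decimal;
use uuid::Uuid;

async fn withdraw_reward(
    pool: &sqlx::PgPool,
    user_id: Uuid,
    amount: Decimal,
) -> Result<(), sqlx::Error> {
    let mut tx = pool.begin().await?;

    // 1. Increase withdrawn_revenue only if enough reward balance is available.
    let updated = sqlx::query(
        r#"
        UPDATE "market"."affiliate_stats"
        SET withdrawn_revenue = withdrawn_revenue + $2
        WHERE user_id = $1
          AND total_revenue - withdrawn_revenue >= $2
        "#,
    )
    .bind(user_id)
    .bind(amount)
    .execute(&mut *tx)
    .await?;
    if updated.rows_affected() == 0 {
        tx.rollback().await?;
        // Stand-in error: insufficient available balance.
        return Err(sqlx::Error::RowNotFound);
    }

    // 2. Credit the user's account balance (transaction logging elided).
    sqlx::query(r#"UPDATE "shop"."user_balance" SET balance = balance + $2 WHERE user_id = $1"#)
        .bind(user_id)
        .bind(amount)
        .execute(&mut *tx)
        .await?;

    tx.commit().await
}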
Configuration
The affiliate system is configured through AffConfig:
pub struct AffConfig {
pub max_invite_code_per_user: i32, // Default: 10
pub default_reward_rate: Decimal, // Default: 0.1 (10%)
pub default_trigger_time_per_user: i32, // Default: 3
}
Configuration is stored in Redis under the key affiliate and loaded dynamically by services.
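A sketch of loading that configuration, assuming the value is serialized as JSON under the affiliate key and that AffConfig derives serde::Deserialize (both assumptions):
use redis::AsyncCommands;

// Illustrative loader: fall back to the documented defaults if the key is absent.
async fn load_aff_config(
    conn: &mut redis::aio::MultiplexedConnection,
) -> anyhow::Result<AffConfig> {
    let raw: Option<String> = conn.get("affiliate").await?;
    Ok(match raw {
        Some(json) => serde_json::from_str(&json)?,
        None => AffConfig {
            max_invite_code_per_user: 10,
            default_reward_rate: rust_decimal::Decimal::new(1, 1), // 0.1 = 10%
            default_trigger_time_per_user: 3,
        },
    })
}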
Database Operations
All database operations use the Processor pattern with strongly-typed inputs and outputs:
Affiliate Policy Operations
- FindAffiliatePolicyByUser: Retrieve user’s policy
- CreateAffiliatePolicy: Initialize policy for new users
Invite Code Operations
- CreateInviteCode: Generate new code with collision handling
- ListInviteCodesByUser: Get user’s active codes
- CountActiveInviteCodesByUser: Check against limits
- SoftDeleteInviteCode: Deactivate code
- FindInviteCodeByCode: Validate codes during registration
Statistics Operations
- FindAffiliateStatsByUser: Get performance metrics
- AddAffiliateReward: Credit new rewards
- WithdrawAffiliateRewardAtomic: Transfer to balance
gRPC API
The affiliate system exposes user-facing APIs through the Market service:
service Market {
rpc GetAffiliateStats (GetAffiliateStatsRequest) returns (GetAffiliateStatsReply);
rpc ListInviteCodes (ListInviteCodesRequest) returns (ListInviteCodesReply);
rpc CreateInviteCode (CreateInviteCodeRequest) returns (CreateInviteCodeReply);
rpc DeleteInviteCode (DeleteInviteCodeRequest) returns (DeleteInviteCodeReply);
rpc WithdrawAffiliateReward (WithdrawAffiliateRewardRequest) returns (WithdrawAffiliateRewardReply);
}
All APIs authenticate users and operate on their data only.
Event Integration
The system integrates with other modules through events:
Consumed Events
- UserRegisterEvent (from auth module): Initialize affiliate policies
- OrderPaidEvent (from shop module): Trigger reward calculations
Published Events
- AffiliateReward (internal): Process commission calculations
Message Queues
- helium_auth_user_register_market: User registration processing
- helium_shop_order_paid: Order payment processing
- helium_market_affiliate_reward: Internal reward processing
Business Rules
Reward Eligibility
- Payment Method: Only real payments trigger rewards (no account balance)
- Trigger Limits: Each referred user can only trigger rewards N times
- Active Codes: Only active invite codes establish referral relationships
- Valid Orders: Rewards only process for successfully paid orders
Security Considerations
- Atomic Withdrawals: Balance checks and updates are transactional
- Code Uniqueness: Invite codes are globally unique with retry logic
- Rate Validation: Commission rates are validated and stored as decimals
- Audit Trail: All reward transactions are recorded permanently
Error Handling
The system uses comprehensive error handling:
- Invalid Input: Invalid amounts, missing data
- Business Logic: Insufficient balance, code limits exceeded
- System Errors: Database failures, message queue issues
- Not Found: Missing policies, orders, or codes
Monitoring & Observability
Key metrics and logs for monitoring:
- Successful referral registrations
- Reward calculation events
- Withdrawal success/failure rates
- Invite code generation patterns
- Commission distribution analytics
Development Guidelines
When working with the affiliate system:
- Use Processors: All business logic must use the Processor pattern
- Avoid Static Lifetimes: Use owned connection types like RedisConnection
- Handle Decimals Carefully: Use rust_decimal::Decimal for all monetary calculations
- Test Event Flows: Verify end-to-end event processing in integration tests
- Monitor Performance: Track database query performance for statistics operations
Common Integration Patterns
Adding New Reward Triggers
- Create event structure with routing information
- Implement event processor with business logic
- Register message queue and routing
- Add appropriate error handling and logging
Extending Statistics
- Update AffiliateStats entity
- Modify aggregation queries
- Update API response structures
- Implement migration for existing data
This comprehensive documentation provides developers with the knowledge needed to maintain, extend, and troubleshoot the affiliate system effectively.
Shop Module
The shop module handles all e-commerce functionality: product listings, order management, coupon systems, user balance accounting, and gift card redemption. When you need to add payment integrations, modify pricing logic, or extend the checkout flow, this is where you start.
Overview
- Config (config.rs) – Runtime configuration for order limits, auto-cancellation timing, and ePay integration URLs. Adjust these when tuning business rules or payment gateway endpoints.
- Entities (entities/) – Database models for orders, productions (products), coupons, user balances, gift cards, and ePay providers. Extend these when adding new persistent data structures.
- Services (services/) – Business logic services implementing order processing, coupon validation, balance management, and production queries. All APIs must be exposed through the Processor trait pattern, not object-oriented methods.
- RPC (rpc/) – gRPC endpoints exposing shop capabilities to clients and other modules. Organized into user-facing services (order, production, account) and admin management services.
- API (api/) – REST/HTTP API endpoints for payment gateway callbacks and integrations that require non-gRPC interfaces.
- Events (events/) – Event definitions for the order lifecycle (created, paid, cancelled, delivered) using RabbitMQ for inter-module communication.
- Hooks (hooks/) – Event consumers that react to order events, handle ePay callbacks, log balance changes, and initialize user balances on registration.
- Cron (cron.rs) – Background jobs for automatic order cancellation when unpaid orders exceed the configured timeout period.
Core Services
User-Facing Services
OrderService (services/order.rs)
Handles the complete order lifecycle from creation to payment. Key operations:
- List user orders with production details and status
- Create orders with optional coupon application
- Generate ePay payment URLs for third-party payment gateways
- Process payments via account balance or external providers
- Cancel unpaid orders
- List available ePay providers and channels
ProductionService (services/production.rs)
Manages product catalog visibility and access control:
- List productions filtered by user group permissions
- Retrieve individual production details with access validation
- Products can be restricted to specific user groups or extra permission groups
CouponService (services/coupon.rs)
Validates promotional codes and discount rules:
- Verify coupon validity considering time windows, usage limits, and user eligibility
- Support for both percentage-based and amount-based discounts
- Per-account and global usage tracking
BalanceService (services/user_balance.rs)
User account balance operations:
- Query available and frozen balance
- List balance change history with pagination
- Redeem gift cards for balance credits
GiftCardService (services/gift_card.rs)
Gift card redemption and validation
Management Services
ManageService (services/manage.rs)
Admin operations for all shop entities with RBAC enforcement:
- Orders: Mark paid, change amounts, list with filters, view details
- Coupons: Create, update, delete, list with full metadata
- Productions: Create, delete, list with package relationships
- Balances: Adjust user balances, view change logs
- Gift Cards: Generate in batches, create special codes, delete, list with filters
All management operations are integrated with the admin audit log system and require appropriate AdminRole permissions.
Payment Integration
The module integrates with the ePay (易支付) third-party payment gateway:
- Payment Flow:
  - User creates order → receives order ID
  - User requests payment URL with provider and channel (AliPay, WeChat, USDT)
  - Module generates signed URL using provider credentials
  - User completes payment on gateway
  - Gateway sends callback to the api/epay.rs webhook
  - Hook verifies signature and updates order status
  - OrderPaidEvent emitted for downstream processing
- Provider Management:
  - Multiple ePay providers supported with different channels
  - Providers configured with merchant URLs, PIDs, and keys
  - Each provider can be enabled/disabled via database switch
Events & Hooks
Published Events (via RabbitMQ):
- OrderCreatedEvent (exchange: shop, routing: order_created) – Internal tracking
- OrderPaidEvent (exchange: shop, routing: order_paid) – Consumed by market module for affiliate rewards
- OrderCancelledEvent (exchange: shop, routing: order_cancelled) – Internal tracking
- OrderDeliveredEvent (exchange: shop, routing: order_delivered) – Currently unused, candidate for removal
Event Consumers:
- ePay Hook (hooks/epay.rs) – Listens to payment callbacks and updates order status
- Log Hook (hooks/log.rs) – Tracks balance changes in audit logs
- Register Hook (hooks/register.rs) – Initializes zero balance for new user accounts
Typical Extension Workflow
- Add Configuration – Define new config fields in config.rs and surface through the Redis configuration provider
- Extend Entities – Add or modify database models in entities/ for new persistent data
- Implement Service Logic – Create Processor implementations in services/ following the pattern:
  impl Processor<MyInput, Result<MyOutput, Error>> for MyService {
      async fn process(&self, input: MyInput) -> Result<MyOutput, Error> {
          // Business logic here
      }
  }
- Expose via RPC – Add gRPC methods in rpc/ that delegate to service processors
- Define Proto Messages – Update .proto files in proto/shop/ and rebuild
- Emit Events – Publish events through AmqpPool when state changes need cross-module notification
- Add Hooks – Create event consumers in hooks/ if other modules need to react
- Schedule Maintenance – Add cron jobs in cron.rs for cleanup or periodic tasks
Architecture Notes
- Processor Pattern: All service APIs use the Processor<Input, Result<Output, Error>> trait from the kanau crate. Never expose business logic through object methods.
- Owned Resources: Services hold owned RedisConnection, DatabaseProcessor, and AmqpPool instead of static lifetimes or references.
- Coupon Discount Types: Two variants – RateDiscount (percentage) and AmountDiscount (fixed amount with minimum threshold)
- Order Status Flow: Unpaid → Paid → Delivered (or Cancelled/Refunding/Refunded)
- Balance Types: Available balance (spendable) and frozen balance (temporarily locked during transactions)
- Decimal Handling: All monetary values use rust_decimal::Decimal internally and serialize as strings in proto/JSON
Database Schema
Key tables (see migrations/20250815133800_create_shop_entites.sql):
- productions – Product catalog with pricing and package references
- orders – Order records with status tracking and payment details
- coupons – Promotional codes with discount rules and usage tracking
- user_balance – Account balance per user
- user_balance_change_log – Audit trail for all balance modifications
- gift_cards – Redeemable codes with amounts and expiration
- epay_providers – Configured payment gateway providers
Usage Examples
Creating an Order:
let result = order_service.process(CreateOrder {
user_id: user.id,
production_id: prod_uuid,
coupon_id: Some(123),
}).await?;
Payment with Balance:
let result = order_service.process(PayOrderWithBalance {
user_id: user.id,
order_id: order_uuid,
}).await?;
Redeeming Gift Card:
let result = balance_service.process(RedeemGiftCard {
user_id: user.id,
secret: "GIFT-CODE-123",
}).await?;
Configuration
Shop module configuration is stored in Redis under key shop:
{
"max_unpaid_orders": 5,
"auto_cancel_after": "30m",
"epay_notify_url": "https://api.example.com/shop/epay/callback",
"epay_return_url": "https://example.com/order/complete"
}
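For reference, the JSON above corresponds roughly to a config struct along these lines. This is a sketch only: field names mirror the JSON, while the duration format for auto_cancel_after is shown as a plain string, and the real config.rs may parse it into a typed duration.
// Hedged sketch of the shop config shape implied by the JSON above.
use serde::Deserialize;

#[derive(Deserialize)]
pub struct ShopConfig {
    pub max_unpaid_orders: i32,
    pub auto_cancel_after: String, // e.g. "30m"; real code likely parses this into a duration
    pub epay_notify_url: String,
    pub epay_return_url: String,
}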
Keep this document synchronized with structural changes so future developers can quickly navigate the codebase.
Production
Production (also referred to as “product” in the codebase) is the user-facing catalog item that customers can purchase in the Helium shop system. It represents a complete service offering that, when purchased, grants users access to telecom packages.
Core Concept
A Production defines:
- What users see: Display information (title, description, price)
- What users receive: Reference to a package series and the quantity of packages
- Who can access it: Access control through user groups and extra permission groups
pub struct ServiceProduction {
pub id: Uuid, // Unique identifier
pub title: String, // Display name
pub description: String, // Marketing description
pub price: Decimal, // Purchase price
pub package_series: Uuid, // References a package series
pub package_amount: i32, // Number of packages to deliver
pub visible_to: i32, // User group visibility
pub is_private: bool, // Private production flag
pub limit_to_extra_group: i32, // Extra group requirement if private
pub on_sale: bool, // Whether currently available for purchase
}
Key Characteristics
- Package Series Linkage: Production always references a package series, not individual packages. This ensures version control flexibility - the actual package delivered is the master package of the series at purchase time.
- Quantity Control: package_amount specifies how many packages from the series the user receives. For example:
  - package_amount = 1: Single subscription period
  - package_amount = 3: Three subscription periods (stacked in package queue)
  - package_amount = 12: Annual subscription with 12 monthly packages
- Access Control Layers:
  - Basic visibility: visible_to determines which user group can see the production
  - Private productions: When is_private = true, only users with limit_to_extra_group in their user_extra_groups can access it
  - On-sale status: on_sale controls whether the production is currently purchasable
- Price Stability: The production price is independent of package contents. Even if the underlying master package changes (through version control), the production price remains stable unless explicitly updated.
Production Views
The system provides two different views of productions for different audiences:
User View (ProductionUserView)
pub struct ProductionUserView {
pub id: Uuid,
pub title: String,
pub description: String,
pub price: Decimal,
pub package_amount: i32,
pub traffic_limit: i64, // From master package
pub max_client_number: i32, // From master package
pub expire_duration: PgInterval, // From master package
}
Purpose: Shows end users what they’re buying with key package details (traffic, connections, duration) pulled from the current master package.
Admin View (ProductionAdminView)
pub struct ProductionAdminView {
pub id: Uuid,
pub title: String,
pub description: String,
pub price: Decimal,
pub package_series: Uuid,
pub package_amount: i32,
pub visible_to: i32,
pub is_private: bool,
pub limit_to_extra_group: i32,
pub package_id: i64, // Master package ID
pub package_version: i32, // Master package version
pub package_available_group: i32, // Access control from package
// ... additional package details
}
Purpose: Provides administrators with complete production metadata including internal IDs, version information, and access control settings.
Admin Management of Production
Administrators manage productions through the ManageService in the shop module. All operations require appropriate AdminRole permissions and are logged in the audit system.
Available Operations
1. Create Production
Operation: CreateProduction
pub struct CreateProduction {
pub title: String,
pub description: String,
pub price: Decimal,
pub package_series: Uuid, // Must reference existing series
pub package_amount: i32, // Quantity to deliver
pub visible_to: i32, // User group visibility
pub is_private: bool, // Private production flag
pub limit_to_extra_group: i32, // Extra group requirement
}
Requirements:
- Package series must exist in the telecom module
- Package series must have a master package (is_master = true)
- Admin must have Moderator or SuperAdmin role
- All fields are mandatory, except that the private/extra_group fields may be zero when the production is not private
Workflow:
- Validate package series exists and has master package
- Create production record with UUID
- Production defaults to on_sale = true
- Logged in the admin audit system
2. Delete Production
Operation: DeleteProduction
pub struct DeleteProduction {
pub id: Uuid, // Production ID to delete
}
Requirements:
- Production must exist
- Admin must have Moderator or SuperAdmin role
Important Notes:
- Soft delete behavior: Production is removed from catalog but existing orders referencing it remain valid
- No cascade deletion: Deleting a production does NOT delete its associated package series
- Impact: Users can no longer purchase this production, but already-purchased orders are unaffected
3. List Productions
Operation: ListProductions
// No input parameters - returns all productions
Returns: Vec<ProductionAdminView> with complete production details including master package information
Use Cases:
- Catalog management dashboards
- Production inventory audits
- Package version tracking
- Access control verification
Reference to Version Control of Packages
Productions are tightly integrated with the Version Control of Packages system. Understanding this relationship is crucial for managing service offerings.
How Productions Use Package Versions
When a user purchases a production:
- At Purchase Time: The system looks up the master package of the referenced package series
- Version Snapshot: The specific package version (master at that moment) is recorded in the order
- Queue Insertion: Package queue items reference the specific package ID, not the series
- Version Isolation: If the master package changes later, existing purchases are unaffected
Example: Package Version Evolution
Timeline:
Day 1: Create Production "Monthly Premium"
└─> References package_series: abc-123
└─> Master package: version 1 (100GB traffic, $10)
Day 15: User A purchases "Monthly Premium"
└─> Receives: package version 1 (100GB)
Day 30: Marketing updates package series
└─> New master package: version 2 (200GB traffic, same $10 price)
└─> Old version 1: is_master = false (preserved for existing users)
Day 45: User B purchases "Monthly Premium"
└─> Receives: package version 2 (200GB)
Day 60: Both users' services:
└─> User A: Still has 100GB (version 1) - not affected by update
└─> User B: Has 200GB (version 2) - received new version
Version Control Best Practices for Productions
- Price Adjustments: Update production price separately from package content
- Content Updates: Create new package versions to change service parameters
- Catalog Refresh: Existing productions automatically deliver new master versions
- Rollback Strategy: Keep old package versions available; delete and recreate production if needed
- Testing: Verify master package is correct before major version changes
See Version Control of Packages for detailed information about:
- Creating new package versions
- Promoting packages to master
- Version change vs. non-version change edits
- Admin package management operations
Relationship Flow: Production → Package → Package Queue → Order → Node Client
Understanding how these components interconnect is essential for system comprehension and troubleshooting.
Component Relationships
┌──────────────┐ references ┌──────────────┐ has master ┌──────────────┐
│ │─────────────────────>│ Package │─────────────────────>│ Package │
│ Production │ │ Series │ │ (Master) │
│ │ │ │ │ │
└──────────────┘ └──────────────┘ └──────────────┘
│ │
│ │ defines
│ purchase creates │ service
│ │ parameters
▼ ▼
┌──────────────┐ delivers ┌──────────────┐ references ┌──────────────┐
│ │─────────────────────>│ Package │─────────────────────>│ Package │
│ Order │ │ Queue Item │ │ │
│ │ │ │ │ │
└──────────────┘ └──────────────┘ └──────────────┘
│ │ │
│ │ │ controls
│ │ activates │ access
│ │ for user │
│ ▼ ▼
┌──────────────┐ ┌──────────────┐ filtered by ┌──────────────┐
│ Payment │ │ Active │─────────────────────>│ Node Client │
│ Status │ │ Package │ │ Access │
│ │ │ │ │ │
└──────────────┘ └──────────────┘ └──────────────┘
Detailed Relationship Flow
1. Production → Order
When: User initiates purchase
Process:
// User browses available productions
ListVisibleProductions { user_group, extra_groups }
└─> Returns productions matching access control
// User creates order
CreateOrder { user_id, production_id, coupon_id }
└─> Creates UserOrder with status: Unpaid
└─> Records production reference
└─> Applies coupon discount if provided
└─> Emits OrderCreatedEvent
Data Flow:
- Production ID stored in order.production
- Production price becomes order.total_amount (after coupon)
- Order status starts as Unpaid
2. Order → Payment → Package Queue
When: User completes payment
Process:
// Payment via ePay gateway
PayOrderWithEpay { callback }
└─> Validates payment signature
└─> Updates order: status = Paid, paid_at = NOW()
└─> Emits OrderPaidEvent
// OR payment via account balance
PayOrderWithBalance { user_id, order_id }
└─> Deducts user balance
└─> Updates order: status = Paid, paid_at = NOW()
└─> Emits OrderPaidEvent
Data Flow:
- OrderPaidEvent published to RabbitMQ exchange shop with routing key order_paid
- Market module consumes this event for affiliate rewards
- Note: Current codebase shows OrderPaidEvent is published but the actual delivery hook that creates package queue items is not yet visible in the telecom module hooks. This delivery mechanism needs to be implemented or is handled through a separate service/cron job.
Expected Delivery Flow (to be implemented):
// Expected hook (not found in current codebase)
consume OrderPaidEvent:
1. Query order details (production_id, user_id)
2. Query production (package_series, package_amount)
3. Find master package in package_series
4. Create package queue items:
CreateQueueItems {
user_id: order.user,
package_id: master_package.id, // Specific version
by_order: Some(order.id),
amount: production.package_amount
}
5. Trigger package queue push event
6. Update order: status = Delivered
3. Package Queue → Active Package
When: Package queue items are created
Process:
// Package queue push event triggers activation
PackageQueuePushEvent emitted
└─> TelecomPackageQueueHook consumes event
└─> process_package_queue_push(transaction, user_id)
└─> Check if user has active package
└─> If no active package:
- Find oldest queued package (ORDER BY created_at)
- Activate: status = Active, activated_at = NOW()
- Emit PackageActivateEvent
Data Flow:
- Package queue item status transitions: InQueue → Active
- Only ONE package per user can be Active at a time
- Remaining packages wait in queue (FIFO order)
Queue States:
- InQueue: Waiting to be activated
- Active: Currently providing service to the user
- Consumed: Expired due to time or traffic limit
- Cancelled: Refunded or cancelled by admin
4. Active Package → Node Client Access
When: User requests node list or subscription link
Process:
// User lists available nodes
ListMyNodes { user_id }
└─> FindActiveAvailableGroup { user_id }
└─> Query active package
└─> Extract package.available_group
└─> ListUserNodeClients { group }
└─> SELECT * FROM node_client
WHERE available_groups @> [group]
└─> Returns accessible node clients
Access Control Logic:
-- Node client access check
SELECT * FROM "telecom"."node_client"
WHERE available_groups && ARRAY[user_active_package.available_group]
Data Flow:
- Active package defines the user’s available_group (e.g., group 1 = Premium)
- Node clients have an available_groups array (e.g., [1, 2] = Premium and Standard users)
- User can access a node client if their package group is in the node’s group array
- No active package = empty node list (user cannot connect)
5. Package Expiration → Queue Advancement
When: Package expires (time or traffic limit)
Process:
// Time-based expiration (cron job)
FindExpirePackageByTimeBefore { time: NOW() }
└─> For each expired package:
└─> Emit PackageExpiringEvent { reason: Time }
// Usage-based expiration (billing)
RecordPackageUsage { user_id, upload, download }
└─> If usage >= traffic_limit + adjust_quota:
└─> Update status: Consumed
└─> Emit PackageExpiringEvent { reason: Usage }
// Queue advancement (hook)
PackageExpiringEvent consumed
└─> TelecomPackageQueueHook processes:
└─> Find next queued package
└─> If found: Activate next package
└─> If none: Emit AllPackageExpiredEvent
Data Flow:
- Current package: Active → Consumed
- Next package: InQueue → Active
- User’s node access switches to the new package’s available_group
- If no packages remain: User loses node access
Complete Purchase-to-Access Example
Scenario: User purchases “Monthly Premium” production
Step 1: Purchase
User: "Buy Monthly Premium ($30, 3 months)"
└─> Production {
title: "Monthly Premium",
price: $30,
package_series: series-uuid-123,
package_amount: 3
}
└─> Order created (Unpaid)
Step 2: Payment
User: "Pay with AliPay"
└─> Payment gateway callback
└─> Order status: Unpaid → Paid
└─> OrderPaidEvent emitted
Step 3: Delivery (Expected Implementation)
OrderPaidEvent consumed
└─> Find master package of series-uuid-123
└─> Package {
id: 12345,
version: 2,
available_group: 1 (Premium),
traffic_limit: 100GB,
expire_duration: 30 days
}
└─> Create 3 package queue items:
└─> Item 1: status = Active (immediately activated)
└─> Item 2: status = InQueue
└─> Item 3: status = InQueue
└─> Order status: Paid → Delivered
Step 4: Service Access
User: "Show my nodes"
└─> Query active package: Item 1 (package 12345)
└─> Extract available_group: 1
└─> Query node clients: WHERE available_groups @> [1]
└─> Returns: Premium tier nodes
User: "Generate subscription link"
└─> Generates config for premium nodes
└─> User can now connect
Step 5: Package Lifecycle (Day 30)
System: "Package 1 expired (30 days)"
└─> Item 1: Active → Consumed
└─> Item 2: InQueue → Active (auto-activated)
└─> User continues service with Item 2
└─> Still has Item 3 waiting in queue
Step 6: All Packages Consumed (Day 90)
System: "Package 3 expired"
└─> Item 3: Active → Consumed
└─> No more items in queue
└─> AllPackageExpiredEvent emitted
└─> User loses access to nodes (need to repurchase)
Node Client Relationship
Node clients use the active package’s available_group for access control. This creates a seamless flow from purchase to service access:
Access Control Chain:
- User purchases Production → receives Packages
- Package activation → defines available_group
- Node Client filtering → matches available_groups
- User connection → accesses filtered nodes
Example Group Mapping:
Package Groups:
- Group 1: Premium ($30/month, 200GB, premium nodes)
- Group 2: Standard ($15/month, 100GB, standard nodes)
- Group 3: Budget ($8/month, 50GB, budget nodes)
Node Client Configuration:
- "🇺🇸 US Premium": available_groups = [1]
- "🇺🇸 US Standard": available_groups = [1, 2]
- "🇸🇬 SG Budget": available_groups = [1, 2, 3]
User with Group 1 (Premium) package:
✅ Can access: US Premium, US Standard, SG Budget
User with Group 2 (Standard) package:
❌ Cannot access: US Premium
✅ Can access: US Standard, SG Budget
User with Group 3 (Budget) package:
❌ Cannot access: US Premium, US Standard
✅ Can access: SG Budget only
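In code terms, the check behind this mapping is just array membership; a minimal sketch, assuming groups are plain i32 values as in the SQL shown earlier:
// Hedged sketch: a user may connect to a node when the active package's group
// appears in the node's available_groups array.
pub fn can_access_node(node_available_groups: &[i32], package_group: i32) -> bool {
    node_available_groups.contains(&package_group)
}

// Example from the mapping above: a Standard (group 2) package against
// "US Premium" ([1]) and "US Standard" ([1, 2]).
// can_access_node(&[1], 2) == false; can_access_node(&[1, 2], 2) == true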
See Node Client for detailed information about node access control and configuration.
Implementation Notes for Developers
Current Implementation Status
Implemented:
- ✅ Production CRUD operations (Create, Delete, List)
- ✅ Production access control (user groups, private productions)
- ✅ Order creation with production reference
- ✅ Payment processing with OrderPaidEvent emission
- ✅ Package queue system with activation logic
- ✅ Node client access control based on active package
Needs Implementation/Verification:
- ⚠️ Order delivery hook: The mechanism that consumes OrderPaidEvent and creates package queue items is not visible in the current codebase. This critical component needs to be implemented or located.
- ⚠️ Order status update: Automatic transition from Paid to Delivered after package delivery
- ⚠️ Error handling: What happens if package delivery fails after payment?
- ⚠️ Refund workflow: How to handle package queue items when orders are refunded?
Expected Delivery Hook Implementation
Location: Should be in either:
- modules/shop/src/hooks/delivery.rs (new file)
- modules/telecom/src/hooks/order.rs (new file)
- Integration service in server/src/
Pseudocode:
// File: modules/shop/src/hooks/delivery.rs (suggested)
pub struct ShopDeliveryHook {
pub telecom_db: DatabaseProcessor, // Access to telecom DB
pub shop_db: DatabaseProcessor, // Access to shop DB
pub mq: AmqpPool,
}
impl AmqpMessageProcessor<OrderPaidEvent> for ShopDeliveryHook {
const QUEUE: &'static str = "helium_shop_order_delivery";
}
impl Processor<OrderPaidEvent, Result<(), Error>> for ShopDeliveryHook {
async fn process(&self, event: OrderPaidEvent) -> Result<(), Error> {
// 1. Fetch order and production details
let order = self.shop_db.process(FindOrderByIdOnly {
order_id: event.order_id
}).await?.ok_or(Error::NotFound)?;
let production = self.shop_db.process(FindProductionById {
id: order.production
}).await?.ok_or(Error::NotFound)?;
// 2. Find master package of the series
let package = self.telecom_db.process(FindMasterPackageBySeries {
series: production.package_series
}).await?.ok_or(Error::NotFound)?;
// 3. Create package queue items
let items = self.telecom_db.process(CreateQueueItems {
user_id: event.user_id,
package_id: package.id,
by_order: Some(event.order_id),
amount: production.package_amount,
}).await?;
// 4. Emit package queue push event
PackageQueuePushEvent {
item_ids: items.iter().map(|i| i.id).collect(),
user_id: event.user_id,
package_id: package.id,
pushed_at: OffsetDateTime::now_utc().unix_timestamp() as u64,
}.send(&self.mq).await?;
// 5. Update order status to delivered
self.shop_db.process(UpdateOrderDelivered {
order_id: event.order_id
}).await?;
Ok(())
}
}
Testing Checklist
When implementing or verifying the delivery mechanism:
- Happy Path:
  - User purchases production → order created
  - User pays order → OrderPaidEvent emitted
  - Delivery hook triggered → package queue items created
  - First package auto-activated → user can access nodes
  - Order status updated to Delivered
- Edge Cases:
  - Package series has no master package → error handling
  - Production deleted after order created but before payment
  - Duplicate payment callbacks (idempotency)
  - User already has active package → new packages queue correctly
- Error Recovery:
  - Delivery fails → order remains Paid (manual intervention needed)
  - Partial delivery → some items created, transaction rollback
  - Event replay → idempotent delivery (check the by_order reference; see the sketch below)
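As a rough illustration of that by_order idempotency guard (the table and column names are assumptions based on this document, and the real implementation would go through a telecom Processor rather than raw SQL):
// Hedged sketch: skip delivery when queue items already exist for this order.
use sqlx::PgPool;
use uuid::Uuid;

pub async fn already_delivered(pool: &PgPool, order_id: Uuid) -> sqlx::Result<bool> {
    sqlx::query_scalar("SELECT EXISTS(SELECT 1 FROM package_queue WHERE by_order = $1)")
        .bind(order_id)
        .fetch_one(pool)
        .await
}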
Best Practices
- Production Naming: Use clear, descriptive titles that indicate:
  - Service tier (Premium, Standard, Budget)
  - Duration/quantity (Monthly, Quarterly, Annual)
  - Special features (High-speed, Unlimited, etc.)
- Price Management:
  - Keep production prices stable
  - Use coupons for temporary discounts
  - Create new productions for permanent price changes
- Access Control:
  - Use visible_to for tier-based catalog (free users, paid users, VIP)
  - Use is_private + limit_to_extra_group for special promotions or corporate accounts
  - Set on_sale = false to temporarily hide productions without deletion
- Package Amount Strategy:
  - package_amount = 1: Pay-per-period (most common)
  - package_amount = 3: Quarterly bundles (slight discount)
  - package_amount = 12: Annual subscriptions (significant discount)
  - Higher amounts = fewer transactions, better user retention
- Version Control Integration:
  - Review current master package before creating production
  - Coordinate with telecom team when updating master packages
  - Document package version changes that affect existing productions
  - Consider creating new production for major service upgrades
- Audit Trail:
  - All production operations are logged via admin audit system
  - Track which admin created/deleted productions
  - Monitor order counts per production for popularity metrics
  - Review production-to-package-version history for customer support
See Also:
- Version Control of Packages - Package versioning system
- Package Queue - Package lifecycle management
- Node Client - Access control and node configuration
- Order System - Complete order lifecycle
- Ordering Flow - Purchase workflow diagrams
Order System
The Order System manages the complete e-commerce transaction lifecycle in Helium, from order creation through payment processing to product delivery.
Overview
An order represents a user’s purchase of a production. Key capabilities:
- Create orders with optional coupon discounts
- Process payments via ePay gateways or account balance
- Track order status through lifecycle states
- Automatically cancel unpaid orders after timeout
- Publish events to notify other modules
- Admin order management and intervention
Order Data Model
Each order tracks:
- User and production reference
- Final amount after discounts
- Applied coupon (if any)
- Current status and timestamps
- Payment method and provider
- Soft deletion flag
Order Status Lifecycle
Orders transition through the following states:
┌─────────┐ payment ┌──────┐ delivery ┌───────────┐
│ Unpaid │──────────────────>│ Paid │─────────────────>│ Delivered │
└─────────┘ └──────┘ └───────────┘
│ │
│ timeout/ │ admin/
│ manual │ user
│ │
v v
┌───────────┐ ┌───────────┐
│ Cancelled │ │ Refunding │
└───────────┘ └───────────┘
│
│ processed
v
┌──────────┐
│ Refunded │
└──────────┘
Status Definitions
- Unpaid: Order created, payment pending. Auto-cancelled after timeout (default: 30 minutes)
- Paid: Payment confirmed. Awaiting product delivery
- Delivered: Packages delivered to user’s account
- Cancelled: Order cancelled before payment
- Refunding: Refund in progress (not fully implemented)
- Refunded: Refund completed (not fully implemented)
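These states map naturally onto a Rust enum; the sketch below only shows the shape, and the actual enum name, derives, and database representation in the codebase may differ.
// Hedged sketch of the order status lifecycle as an enum.
pub enum OrderStatus {
    Unpaid,
    Paid,
    Delivered,
    Cancelled,
    Refunding, // not fully implemented
    Refunded,  // not fully implemented
}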
Payment Methods
- Alipay: AliPay via ePay gateway
- WeChatPay: WeChat Pay via ePay gateway
- Usdt: USDT cryptocurrency via ePay gateway
- AccountBalance: Direct payment from user’s account balance
- AdminChange: Admin manually marked as paid
Order Creation
User Purchase Flow
- Browse available productions
- Select production and apply coupon (optional)
- Create order → receives order ID
- Choose payment method (ePay or balance)
- Complete payment → order marked as paid
- System delivers packages automatically
Validation Rules
When creating an order, the system validates:
- Unpaid Order Limit: Users cannot have more than max_unpaid_orders unpaid orders (default: 5)
- Production Existence: Production must exist and be available for purchase
- Coupon Validation (if a coupon is applied; see the sketch after this list):
  - Coupon code must be valid
  - Current time within coupon’s validity window
  - Global usage limit not exceeded
  - Per-user usage limit not exceeded
  - Production price meets coupon’s minimum amount requirement
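A minimal sketch of those coupon checks as a pure function. Field names such as global_limit, per_account_limit, and min_amount are assumptions for illustration, not the real entity fields.
// Hedged sketch: returns true only when every documented coupon condition holds.
use rust_decimal::Decimal;
use time::OffsetDateTime;

pub struct CouponSketch {
    pub start_time: OffsetDateTime,
    pub end_time: OffsetDateTime,
    pub global_limit: i64,      // maximum total uses across all users
    pub per_account_limit: i64, // maximum uses per individual user
    pub min_amount: Decimal,    // minimum production price for amount discounts
}

pub fn coupon_applicable(
    c: &CouponSketch,
    now: OffsetDateTime,
    price: Decimal,
    global_uses: i64,
    user_uses: i64,
) -> bool {
    now >= c.start_time
        && now <= c.end_time
        && global_uses < c.global_limit
        && user_uses < c.per_account_limit
        && price >= c.min_amount
}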
Possible Results
- Created: Order created successfully, returns order ID
- ProductionNotFound: Selected production doesn’t exist
- CouponInvalid: Coupon is invalid or not applicable
- TooManyUnpaid: User has reached the unpaid order limit
Payment Processing
Payment Methods
1. ePay Gateway Payment
ePay (易支付) is a third-party payment aggregator supporting:
- AliPay
- WeChat Pay
- USDT cryptocurrency
Payment Flow:
User → Request Payment URL → Get Signed URL → Redirect to ePay
↓
User ← Return to App ← OrderPaidEvent ← Callback ← Payment Complete
Process:
- User requests payment URL with order ID, provider, and channel
- System generates signed payment URL
- User redirected to ePay gateway
- User completes payment on external site
- ePay sends callback to server with payment result
- System verifies signature and updates order status
- OrderPaidEvent published to trigger delivery
- User redirected back to the app
Security: All callbacks are signature-verified using provider’s secret key to prevent fraud.
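For illustration, ePay-style gateways typically sign callbacks with an MD5 digest over the sorted parameters plus the merchant key. The exact scheme used by Helium's providers is an assumption here, so treat this as a shape sketch rather than the project's actual verification code.
// Hedged sketch of an ePay-style signature check: sorted non-empty params
// (excluding sign/sign_type) joined as key=value&..., merchant key appended, MD5.
use std::collections::BTreeMap;

pub fn verify_epay_sign(params: &BTreeMap<String, String>, merchant_key: &str) -> bool {
    let provided = match params.get("sign") {
        Some(s) => s.to_lowercase(),
        None => return false,
    };
    let joined: Vec<String> = params
        .iter()
        .filter(|(k, v)| k.as_str() != "sign" && k.as_str() != "sign_type" && !v.is_empty())
        .map(|(k, v)| format!("{k}={v}"))
        .collect();
    let base = format!("{}{}", joined.join("&"), merchant_key);
    // The md5 crate's Digest implements LowerHex, so format it as a hex string.
    format!("{:x}", md5::compute(base.as_bytes())) == provided
}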
2. Account Balance Payment
Users can pay directly from their account balance.
Process:
- User requests to pay with balance
- System checks sufficient balance available
- Balance deducted atomically with order update
- Balance change logged for audit trail
- OrderPaidEvent published to trigger delivery
Transaction Safety: Balance deduction and order update occur in a single atomic transaction with pessimistic locking to prevent race conditions.
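A simplified sketch of that pattern using sqlx directly. The real code goes through the Processor pattern and also writes the balance change log; table and column names here are assumptions based on the schema section above.
// Hedged sketch: pessimistic lock on the balance row, then deduct and mark paid
// inside one transaction. Returns false when the balance is insufficient.
use rust_decimal::Decimal;
use sqlx::PgPool;
use uuid::Uuid;

pub async fn pay_with_balance_sketch(
    pool: &PgPool,
    user_id: Uuid,
    order_id: Uuid,
    amount: Decimal,
) -> sqlx::Result<bool> {
    let mut tx = pool.begin().await?;

    // Lock the user's balance row for the duration of the transaction.
    let available: Decimal = sqlx::query_scalar(
        "SELECT available_balance FROM user_balance WHERE user_id = $1 FOR UPDATE",
    )
    .bind(user_id)
    .fetch_one(&mut *tx)
    .await?;

    if available < amount {
        tx.rollback().await?;
        return Ok(false);
    }

    // Deduct the balance and mark the order as paid in the same transaction.
    sqlx::query("UPDATE user_balance SET available_balance = available_balance - $2 WHERE user_id = $1")
        .bind(user_id)
        .bind(amount)
        .execute(&mut *tx)
        .await?;
    sqlx::query("UPDATE orders SET status = 'Paid', paid_at = NOW() WHERE id = $1 AND status = 'Unpaid'")
        .bind(order_id)
        .execute(&mut *tx)
        .await?;

    tx.commit().await?;
    Ok(true)
}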
Order Cancellation
Cancellation Methods
1. User Cancellation
Users can cancel their own unpaid orders at any time. Once cancelled, the order cannot be restored.
2. Automatic Cancellation
A cron job automatically cancels unpaid orders after a timeout period (default: 30 minutes). This prevents abandoned orders from cluttering the system.
Configuration: auto_cancel_after in shop config (default: 30 minutes)
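A rough sketch of such a periodic job. The real implementation lives in cron.rs and uses the module's own processors; the direct SQL, the table name, and the fixed one-minute tick here are assumptions for illustration.
// Hedged sketch: periodically cancel unpaid orders older than the timeout.
use std::time::Duration;
use sqlx::PgPool;

pub async fn auto_cancel_loop(pool: PgPool, timeout: Duration) {
    let mut ticker = tokio::time::interval(Duration::from_secs(60));
    loop {
        ticker.tick().await;
        // Cancel any order that has stayed Unpaid longer than the configured timeout.
        let _ = sqlx::query(
            "UPDATE orders SET status = 'Cancelled' WHERE status = 'Unpaid' AND created_at < NOW() - $1::interval",
        )
        .bind(format!("{} seconds", timeout.as_secs()))
        .execute(&pool)
        .await;
    }
}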
3. Admin Cancellation
Administrators can manually cancel orders through the management interface.
Order Events
The order system publishes events via RabbitMQ for inter-module communication.
Event Types
| Event | When Published | Consumers |
|---|---|---|
OrderCreatedEvent | Order created | Internal tracking |
OrderPaidEvent | Payment confirmed | Market module (affiliate rewards), Delivery system |
OrderCancelledEvent | Order cancelled | Internal tracking |
OrderDeliveredEvent | Products delivered | ⚠️ Not yet implemented |
OrderPaidEvent → Product Delivery
When OrderPaidEvent is published, the system should:
- Retrieve order and production details
- Find current master package of the package series
- Create package queue items for the user
- Trigger package activation
- Update order status to Delivered
Status: ⚠️ The delivery hook that consumes OrderPaidEvent needs to be implemented or located in the codebase.
Product Delivery
When an order is paid, the system delivers products by creating package queue items.
Delivery Flow
OrderPaidEvent → Delivery Hook → Create Package Queue Items → Activate First Package
What Happens
- Delivery hook receives OrderPaidEvent
- Looks up order and production details
- Finds current master package of the package series
- Creates package_amount queue items (e.g., 3 items for a quarterly plan)
- Links items to order for refund tracking
- First package automatically activates for the user
- Order status updates to Delivered
Important: The delivery hook snapshots the master package version at payment time, ensuring users receive the package version that was advertised when they purchased.
⚠️ Implementation Status: The delivery hook needs to be implemented or located in the codebase.
Admin Management
Administrators can manage orders through the management interface.
Admin Operations
| Operation | Permissions | Description |
|---|---|---|
| List Orders | Moderator, SuperAdmin, CustomerSupport | View all orders with filters (user, production, status) |
| Show Order Detail | Moderator, SuperAdmin, CustomerSupport | View complete order information |
| Mark as Paid | Moderator, SuperAdmin, CustomerSupport | Manually mark order as paid (triggers delivery) |
| Change Amount | Moderator, SuperAdmin, CustomerSupport | Adjust order amount for corrections or refunds |
Common Use Cases
- Manual Payment Processing: Mark orders as paid after offline payments
- Customer Support: View order details and history for troubleshooting
- Price Adjustments: Correct pricing errors or apply custom discounts
- Partial Refunds: Adjust order amount for partial refunds
Audit Trail
All admin operations are automatically logged with:
- Admin user ID and role
- Operation performed
- Target order ID
- Parameters and changes
- Timestamp and result
API Overview
The order system exposes a gRPC service: ShopOrderService (see proto/shop/order.proto)
Main Operations
| Operation | Purpose |
|---|---|
VerifyCoupon | Check if a coupon code is valid |
CreateOrder | Create new order with optional coupon |
ListOrders | Get user’s order history |
GetOrderDetail | View specific order details |
GetEpayUrl | Generate payment gateway URL |
PayOrderWithBalance | Pay with account balance |
CancelOrder | Cancel unpaid order |
DeleteOrder | Hide order from user’s list |
ListEpayProviders | Get available payment providers |
Configuration
Stored in Redis under key shop:
| Setting | Default | Description |
|---|---|---|
max_unpaid_orders | 5 | Maximum unpaid orders per user |
auto_cancel_after | 30 minutes | Timeout before auto-cancellation |
epay_notify_url | - | Server callback URL for payment notifications |
epay_return_url | - | User redirect URL after payment |
Usage Examples
User Purchase Flow
// 1. Verify coupon (optional)
const couponResponse = await orderService.verifyCoupon({ code: "DISCOUNT10" });
const couponId = couponResponse.isValid ? couponResponse.coupon.id : null;
// 2. Create order
const createResponse = await orderService.createOrder({
productionId: selectedProductionId,
couponId: couponId,
});
if (createResponse.result === "TOO_MANY_UNPAID") {
showError("Please pay or cancel existing orders first");
return;
}
const orderId = createResponse.orderId;
// 3. Choose payment method
if (useAccountBalance) {
// Pay with balance
const payResponse = await orderService.payOrderWithBalance({ orderId });
if (payResponse.result === "NOT_ENOUGH_BALANCE") {
showError("Insufficient balance");
return;
}
showSuccess("Payment successful!");
} else {
// Pay with ePay
const providers = await orderService.listEpayProviders({});
const urlResponse = await orderService.getEpayUrl({
orderId: orderId,
providerId: providers.providers[0].id,
channel: "EPAY_CHANNEL_ALI_PAY",
});
// Redirect to payment gateway
window.location.href = urlResponse.url;
}
// 4. Check order status
const detailResponse = await orderService.getOrderDetail({ orderId });
if (detailResponse.detail) {
console.log("Order status:", detailResponse.detail.order.orderStatus);
console.log("Production:", detailResponse.detail.production.title);
} else {
console.error("Order not found or not accessible");
}
Implementation Status
Completed Features
- ✅ Order creation with coupon validation
- ✅ ePay payment gateway integration
- ✅ Account balance payment
- ✅ Order cancellation (user, automatic, admin)
- ✅ Order tracking and status management
- ✅ Event publishing for inter-module communication
- ✅ Admin management operations
- ✅ Complete gRPC API
Pending Implementation
- ⚠️ Product delivery hook: Needs to consume OrderPaidEvent and create package queue items
- ⚠️ OrderDeliveredEvent: Not currently published (implement or remove)
- ⚠️ Refund workflow: Status states exist but full refund process not implemented
Important Notes
For Backend Developers
- Transaction Safety: Balance payments use atomic transactions with pessimistic locking
- Idempotency: Payment callbacks can be replayed safely without duplicate charges
- Event-Driven Architecture: Use RabbitMQ events for cross-module communication
- Signature Verification: Always verify ePay callback signatures to prevent fraud
- Processor Pattern: All APIs exposed via the Processor trait, not object-oriented methods
For Frontend Developers
- Order Status Polling: After payment, poll order status until Delivered
- ePay Redirect: Handle user redirect to the external payment gateway
- Error Handling: Handle all result enums (TooManyUnpaid, CouponInvalid, etc.)
- Coupon Verification: Always verify coupon before order creation to show discount preview
- Balance Check: Check user balance before offering balance payment option
See Also:
- Production - Product catalog and version control
- Shop Module Introduction - Shop module overview
- Package Queue - Package delivery mechanism
- Balance System - User balance management (if documented)
Ordering Flow
This document explains the complete purchase flow in the Helium shop system - how a user goes from browsing products to receiving service access. It covers the conceptual flow, cross-module interactions, and important implementation notes that developers should remember.
Flow Overview
User Journey:
Browse Products → Apply Coupon → Create Order → Pay → Receive Packages → Access Nodes
System Flow:
┌─────────────┐ ┌──────────────┐ ┌─────────────┐ ┌──────────────┐
│ Production │ ───> │ Order │ ───> │ Payment │ ───> │ Package │
│ Service │ │ Service │ │ Process │ │ Queue │
└─────────────┘ └──────────────┘ └─────────────┘ └──────────────┘
│ │ │ │
▼ ▼ ▼ ▼
Catalog with Order Creation Payment Methods Service
Access Control + Pricing Logic (ePay / Balance) Activation
Key Concepts
1. Production Visibility
Productions (products) are filtered by access control before users can see them:
- User Group: Basic tier matching (visible_to must equal the user’s user_group)
- Private Productions: If is_private = true, the user must have limit_to_extra_group in their extra_groups
- On-Sale Status: Only on_sale = true productions are visible
This means the same codebase can show different catalogs to different user tiers.
2. Coupon Validation
Coupons are validated TWICE in the flow:
- Pre-order verification: VerifyCoupon RPC for UI preview
- Order creation validation: Re-validated when creating the order (security)
Why twice? The coupon state can change between preview and order creation (e.g., usage limit reached). Always validate at order creation to prevent abuse.
Discount Types:
- Rate Discount: Percentage off (e.g., 10% → 0.1 rate)
- Amount Discount: Fixed amount off with minimum threshold
Validation Rules:
- Time window (start_time ≤ now ≤ end_time)
- Global usage limit (total uses across all users)
- Per-account usage limit (uses per individual user)
- Minimum amount requirement (for amount-based discounts)
3. Order Creation
When creating an order, the system:
- Checks unpaid order limit (default: 5 per user)
- Validates production exists and is available
- Applies coupon discount if provided
- Creates order with final calculated price
- Publishes
OrderCreatedEvent(for tracking)
Important: The final price is locked at order creation time. Even if production price changes later, the order amount remains unchanged.
4. Payment Methods
Two payment paths with different characteristics:
ePay Gateway (AliPay, WeChat, USDT)
- User Flow: Redirect to external gateway → Complete payment → Redirect back
- Callback Flow: Gateway sends async callback to server → Signature verification → Update order
- Security: All callbacks MUST verify signature using provider’s key
- Idempotency: Callbacks can be replayed; check order status before processing
Architecture:
User Payment on Gateway
↓
Gateway POST /api/shop/epay/callback
↓
Publish EpayCallback event to RabbitMQ (immediate return)
↓
EpayHook consumer processes callback
↓
Verify signature → Update order → Publish OrderPaidEvent
Why async? The HTTP callback must return immediately to the gateway (within 2-3 seconds). Processing happens async via message queue.
Account Balance
- Transaction Safety: Atomic operation with pessimistic locking (FOR UPDATE)
- Balance Types: Only available_balance can be used (not frozen_balance)
- Audit Trail: Every balance change logged to user_balance_change_log
Why pessimistic lock? Prevents race conditions if user attempts multiple simultaneous payments.
5. Package Delivery
Trigger: OrderPaidEvent published after payment confirmation
Expected Flow (delivery hook needs implementation):
OrderPaidEvent → DeliveryHook → Create Package Queue Items → Update Order Status
What happens:
- Look up order and production details
- Find master package of the package series (current version at purchase time)
- Create N package queue items (N = production’s package_amount)
- Link items to order via the by_order field (for refund tracking)
- Publish PackageQueuePushEvent
- Update order status to Delivered
Critical: The package version is snapshot at purchase time. If the master package changes later, existing orders deliver the old version (version isolation).
Implementation Status: ⚠️ The delivery hook that consumes OrderPaidEvent is not yet visible in the codebase. This is the missing link between payment and package delivery.
6. Service Activation
Trigger: PackageQueuePushEvent published after package queue creation
Activation Logic:
- Only ONE package per user can be Active at a time
- If active package exists: New packages remain in queue
- When active package expires: Next queued package auto-activates
Package States:
- InQueue: Waiting for activation
- Active: Currently providing service
- Consumed: Expired (time or traffic limit)
- Cancelled: Refunded or cancelled by admin
7. Node Access
With an active package, users gain access to nodes filtered by available_group:
Active Package (available_group = 1)
↓
Query: WHERE node.available_groups && ARRAY[1]
↓
Returns: All nodes that include group 1 in their available_groups array
Access Control Chain:
Purchase Production → Receive Package → Package Activates → Defines available_group → Filters Nodes
No active package = no node access (empty list).
Cross-Module Interactions
Shop → Telecom (Package Delivery)
Event: OrderPaidEvent
- Exchange: shop
- Routing Key: order_paid
- Consumer: Delivery hook (needs implementation in shop or telecom module)
- Purpose: Trigger package queue creation after payment
Telecom → Telecom (Package Activation)
Event: PackageQueuePushEvent
- Exchange: telecom
- Routing Key: package_queue_push
- Consumer: TelecomPackageQueueHook
- Purpose: Auto-activate the first package if the user has none active
Shop → Market (Affiliate Rewards)
Event: OrderPaidEvent
- Consumer: Market module
- Purpose: Calculate and distribute affiliate commissions
Important Implementation Notes
For Backend Developers
-
Transaction Boundaries
- Balance payment: Single transaction includes balance deduction + order update
- Use FOR UPDATE to lock the balance row during payment
- Order delivery: May need a transaction spanning shop and telecom databases
-
Event-Driven Architecture
- Always publish events AFTER database commit (not before)
- Events must be idempotent (can be replayed safely)
- Use separate consumers for cross-module communication
-
Processor Pattern
- All service APIs exposed via Processor<Input, Result<Output, Error>>
- No object-oriented methods for business logic
- See existing services for reference
-
Security Considerations
- ePay callbacks: MUST verify signature before processing
- Balance operations: Use pessimistic locks to prevent race conditions
- Coupon validation: Re-validate at order creation (not just preview)
-
Missing Implementation
- Order delivery hook (consumes OrderPaidEvent) is not yet implemented
- This is why orders remain stuck in Paid status instead of moving to Delivered
- Implementation location: Either modules/shop/src/hooks/delivery.rs or modules/telecom/src/hooks/order.rs
For Frontend Developers
-
Order Status Polling
- After payment, poll GetOrderDetail until status becomes Delivered
- Recommended interval: 2-3 seconds
- Timeout: ~60 seconds (suggest manual refresh after)
- After payment, poll
-
ePay Redirect Handling
- Save order ID before redirecting to gateway
- User redirects back via the epay_return_url configured in shop config
- On return: Check order status (payment may take a few seconds to process)
-
Error Handling
- All operations return result enums (not exceptions)
- Check result type before accessing response data
- Common errors: TooManyUnpaid, CouponInvalid, NotEnoughBalance, OrderNotFound
-
Balance vs ePay Decision
- Check user balance before showing payment options
- ePay: Redirect flow, user leaves app temporarily
- Balance: Instant payment, better UX if sufficient balance
-
Coupon UI/UX
- Verify coupon before order creation (show discount preview)
- Display applicable conditions (time window, usage limit, min amount)
- Show final price after discount in order summary
Configuration
Shop module config stored in Redis under key shop:
| Field | Default | Purpose |
|---|---|---|
max_unpaid_orders | 5 | Maximum unpaid orders per user |
auto_cancel_after | 30m | Timeout for automatic order cancellation |
epay_notify_url | - | Server callback URL for payment notifications |
epay_return_url | - | User redirect URL after payment |
Common Issues and Solutions
Order Creation Fails with TooManyUnpaid
Cause: User has >= max_unpaid_orders unpaid orders
Solution: Cancel old unpaid orders or complete payment
Coupon Shows as Invalid
Causes:
- Outside time window (check start_time and end_time)
- Usage limit exceeded (global or per-account)
- Production price below minimum amount (for amount-based discounts)
Solution: Check coupon conditions and inform user why it’s invalid
Balance Payment Fails
Causes:
- Insufficient available_balance (frozen balance cannot be used)
- Order already paid or cancelled
- Concurrent payment attempt (transaction conflict)
Solution: Refresh balance, check order status, retry if conflict
ePay Callback Not Received
Causes:
- epay_notify_url not publicly accessible
- Firewall blocking gateway IPs
- Signature verification failed
Solution: Check server logs, verify network config, confirm provider credentials
Order Stuck in Paid Status
Cause: Delivery hook not running or not implemented
Solution: Check RabbitMQ consumer status, verify OrderPaidEvent is being consumed
Packages Not Activating
Cause: PackageQueuePushEvent not triggering or hook not running
Solution: Check telecom module hooks, verify event publishing
Workflow Diagram
┌────────────────────────────────────────────────────────────────┐
│ User Purchase Flow │
└────────────────────────────────────────────────────────────────┘
1. Browse Productions (filtered by user group + extra groups)
└─> ProductionService.ListUserProduction
2. [Optional] Verify Coupon
└─> CouponService.VerifyCoupon
3. Create Order (with optional coupon)
└─> OrderService.CreateOrder
└─> Validates: unpaid limit, production exists, coupon valid
└─> Calculates final price with discount
└─> Publishes: OrderCreatedEvent
4a. Pay with ePay
└─> OrderService.GetEpayUrl (generate payment URL)
└─> User redirects to gateway
└─> Gateway calls back: POST /api/shop/epay/callback
└─> EpayHook consumes EpayCallback event
└─> OrderService.PayOrderWithEpay
└─> Verifies signature, updates order
└─> Publishes: OrderPaidEvent
4b. Pay with Balance
└─> OrderService.PayOrderWithBalance
└─> Atomic transaction: lock balance + deduct + update order
└─> Publishes: OrderPaidEvent
5. Deliver Packages [⚠️ Needs Implementation]
└─> DeliveryHook consumes OrderPaidEvent
└─> Finds master package of production's package series
└─> Creates N package queue items (N = package_amount)
└─> Updates order status to Delivered
└─> Publishes: PackageQueuePushEvent
6. Activate Service
└─> TelecomPackageQueueHook consumes PackageQueuePushEvent
└─> If no active package: activates oldest queued package
└─> Publishes: PackageActivateEvent
7. User Accesses Nodes
└─> Active package defines available_group
└─> Nodes filtered by: available_groups && ARRAY[user_group]
└─> User can generate subscription links and connect
Testing Considerations
Happy Path Testing
- User with valid permissions can see productions
- Coupon applies correct discount
- Order creation succeeds with valid inputs
- Payment updates order status to Paid
- First package activates immediately
- User can access nodes matching package group
Edge Cases to Test
- User at unpaid order limit (should reject new orders)
- Coupon usage limit reached between verification and order creation
- Production deleted after order created but before payment
- Duplicate payment callbacks (idempotency check)
- Concurrent balance payments (transaction locking)
- User already has active package (new packages should queue)
- Package series has no master package (should fail gracefully)
Error Recovery
- Payment succeeds but delivery fails (manual intervention needed)
- Partial delivery (transaction rollback)
- Event replay (idempotent processing)
- Network timeout during ePay redirect (order remains unpaid, can retry)
See Also
- Production - Product catalog, version control, and package series
- Order System - Order entity, status lifecycle, and admin operations
- Shop Module Introduction - Module overview and architecture
- Package Queue - Package delivery and activation mechanics
- Coupon System - Coupon types, validation rules, and management (if documented)
- Account Balance - Balance operations and gift cards (if documented)
Account Balance
Account Balance is the user’s internal wallet system in Helium. It holds monetary value that users can use to pay for orders without external payment gateways. The system tracks two types of balance (available and frozen) and maintains a complete audit trail of all balance changes.
Core Concept
Each user has a single balance record with two components:
pub struct UserBalance {
pub id: i64,
pub user_id: Uuid,
pub available_balance: Decimal, // Spendable balance
pub frozen_balance: Decimal, // Temporarily locked
}
Available Balance: The amount user can spend on orders or withdraw. This is what users see as their “wallet balance”.
Frozen Balance: Temporarily locked funds that cannot be spent. Used for scenarios where balance needs to be reserved but not immediately consumed (e.g., pending transactions, dispute holds).
Balance Change Types
All balance modifications are categorized into four types:
- Deposit: Adds to available balance (gift card redemption, admin top-up, refunds)
- Consume: Deducts from available balance (order payment, admin deduction)
- Freeze: Moves available balance to frozen balance (hold funds)
- Unfreeze: Moves frozen balance back to available balance (release hold)
pub enum UserBalanceChangeType {
Deposit, // available_balance + amount
Consume, // available_balance - amount
Freeze, // available_balance - amount, frozen_balance + amount
Unfreeze, // frozen_balance - amount, available_balance + amount
}
Every balance change is automatically logged in user_balance_change_log with timestamp, amount, reason, and change type.
User Operations
Get Balance
Users query their current balance status:
Service: UserBalanceService::GetMyBalance
Returns the user’s UserBalance with both available and frozen amounts, or None if balance has not been initialized (should never happen post-registration).
List Balance Changes
Users can view their transaction history with pagination:
Service: UserBalanceService::ListMyBalanceChanges
Parameters:
- limit, offset: Pagination controls
- asc: Sort order (ascending/descending by created_at)
Returns a list of UserBalanceChangeLog entries showing:
- Change amount (positive for deposits/unfreezes, negative for consumes/freezes)
- Reason string (human-readable explanation)
- Change type
- Timestamp
Redeem Gift Card
Users can redeem gift cards to add balance:
Service: GiftCardService::RedeemGiftCardRequest
Flow:
- Validate gift card exists and is not used/expired
- Verify user exists
- Add card amount to user’s available balance (transaction)
- Log balance change with reason “Redeem Gift Card”
- Mark gift card as redeemed with user ID and timestamp
Result Types:
- Success: Balance credited, card redeemed
- CardNotFound: Invalid secret code
- AlreadyUsed: Card already redeemed by someone
- Expired: Card past valid_until date
- UserNotFound: User account doesn’t exist
Important: Gift card redemption is transactional. If any step fails, the entire operation rolls back.
Payment with Balance
Users can pay for orders using their available balance:
Service: OrderService::PayOrderWithBalance
Flow:
- Verify order exists, belongs to user, and is unpaid
- Check user has sufficient available balance
- Transaction begins:
- Deduct order amount from available balance
- Log balance change with order reference
- Update order status to Paid
- Record paid_at timestamp
- Emit OrderPaidEvent for downstream processing
Result Types:
- Success: Order paid, balance deducted
- OrderNotFound: Invalid order or already paid
- NotEnoughBalance: Insufficient funds
Transaction Safety: The entire payment operation (balance deduction, log creation, order update) happens in a single database transaction. If any step fails, no changes are persisted.
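To make the guarantee concrete, here is a hedged sketch of the flow using SQLx. The balance tables follow the schema section below; the order table layout, the change-type enum label, and the helper signature are assumptions made for illustration (the real code goes through the entity layer rather than raw queries):
use rust_decimal::Decimal;
use sqlx::PgPool;
use uuid::Uuid;

async fn pay_order_with_balance(
    pool: &PgPool,
    user_id: Uuid,
    order_id: i64, // assumed order key type
    amount: Decimal,
) -> Result<(), sqlx::Error> {
    let mut tx = pool.begin().await?;

    // 1. Deduct the order amount from the available balance.
    sqlx::query(
        "UPDATE user_balance SET available_balance = available_balance - $1 WHERE user_id = $2",
    )
    .bind(amount)
    .bind(user_id)
    .execute(&mut *tx)
    .await?;

    // 2. Record the change log entry (normally handled by UpdateUserBalance).
    //    'consume' is an assumed label for the database enum.
    sqlx::query(
        "INSERT INTO user_balance_change_log (user_id, amount, reason, change_type) \
         VALUES ($1, $2, $3, 'consume')",
    )
    .bind(user_id)
    .bind(-amount)
    .bind(format!("Pay order #{order_id}"))
    .execute(&mut *tx)
    .await?;

    // 3. Mark the order as paid (table and column names assumed).
    sqlx::query("UPDATE \"order\" SET order_status = 'paid', paid_at = NOW() WHERE id = $1")
        .bind(order_id)
        .execute(&mut *tx)
        .await?;

    // If any step above returned Err, `tx` is dropped here and everything
    // rolls back; no partial state is persisted.
    tx.commit().await
}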
Balance Initialization
User balances are automatically initialized when a new user registers:
Hook: RegisterHook consumes UserRegisterEvent from the auth module
Process:
- Creates balance record with available_balance = 0 and frozen_balance = 0
- Uses UpdateUserBalance with zero diffs (upsert behavior)
- No change log entry created (zero change, reason is empty string)
Note: The UpdateUserBalance entity operation has built-in upsert logic:
INSERT INTO user_balance (user_id) VALUES ($1)
ON CONFLICT (user_id) DO NOTHING
RETURNING *
This means calling UpdateUserBalance on a non-existent user will initialize their balance first, then apply the change.
Admin Operations
Administrators can manage user balances through ManageService. All operations require appropriate AdminRole permissions and are logged in the audit system.
Change User Balance
Operation: AdminChangeUserBalance
Permissions: Moderator, SuperAdmin, CustomerSupport
Parameters:
- user_id: Target user
- amount: Change amount (always positive, type determines operation)
- reason: Human-readable explanation (required for audit)
- change_type: One of Deposit/Consume/Freeze/Unfreeze
Examples:
- Manual top-up: { amount: 100, change_type: Deposit, reason: "Promotional credit" }
- Correction: { amount: 50, change_type: Consume, reason: "Duplicate refund correction" }
- Hold funds: { amount: 200, change_type: Freeze, reason: "Dispute investigation" }
- Release hold: { amount: 200, change_type: Unfreeze, reason: "Dispute resolved" }
Important: The amount parameter is always a positive number. The operation type determines whether it’s added or subtracted:
- Deposit: available += amount
- Consume: available -= amount
- Freeze: available -= amount, frozen += amount
- Unfreeze: frozen -= amount, available += amount
List Balance Change Logs
Operation: AdminListUserBalanceLogs
Permissions: All admin roles
Returns paginated balance change history for a specific user, useful for customer support investigations.
Integration Points
Gift Cards
Gift cards are a primary source of balance deposits. When redeemed:
- Card’s amount field is added to available balance
- Card marked as used with used_by = user_id, redeem_at = NOW()
- Balance change log created with change_type = Deposit
See Gift Card System for card management and generation.
Orders
Balance is consumed when users pay for orders via PayOrderWithBalance:
- Order’s total_amount is deducted from available balance
- Order transitions: Unpaid → Paid
- OrderPaidEvent emitted for package delivery
- Balance change log references the order
See Order System for complete payment flows.
Refunds
When orders are refunded (status: Refunding → Refunded), the original payment amount should be returned to the user’s balance. This is handled by admin operations manually or through automated refund processing.
Current Status: Manual refund workflow requires admin to use AdminChangeUserBalance with change_type: Deposit.
Database Schema
Key tables (see migrations/20250815133800_create_shop_entites.sql):
user_balance:
user_id UUID PRIMARY KEY
available_balance DECIMAL NOT NULL DEFAULT 0
frozen_balance DECIMAL NOT NULL DEFAULT 0
user_balance_change_log:
id BIGSERIAL PRIMARY KEY
user_id UUID NOT NULL
amount DECIMAL NOT NULL
reason TEXT NOT NULL
change_type user_balance_change_type NOT NULL
created_at TIMESTAMP NOT NULL DEFAULT NOW()
Indexes:
- user_balance_change_log(user_id, created_at DESC): Efficient pagination of user transaction history
- user_balance(user_id): Fast balance lookups (primary key)
Architecture Notes
Transaction Safety
All balance-modifying operations use database transactions:
- Payment with balance: Locks order and balance rows with SELECT ... FOR UPDATE
- Gift card redemption: Transaction ensures card can’t be double-redeemed
- Admin changes: Atomic balance update + log insertion
Change Log Automation
The UpdateUserBalance entity operation automatically:
- Creates or updates the balance record
- Determines change type from diff signs
- Inserts change log entry with correct amount/type
- All in a single transaction
Developer Note: You should never manually insert into user_balance_change_log. Always use UpdateUserBalance to modify balance, which handles logging automatically.
Decimal Precision
All monetary values use rust_decimal::Decimal for precise arithmetic. This avoids floating-point errors in financial calculations. Decimal serializes as string in protobuf/JSON to preserve precision.
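A small illustration of why Decimal is preferred over floating point for money (the dec! macro comes from the rust_decimal_macros crate):
use rust_decimal_macros::dec;

fn main() {
    // Exact decimal arithmetic: no 0.30000000000000004-style drift.
    let total = dec!(19.99) + dec!(0.01);
    assert_eq!(total, dec!(20.00));

    // With rust_decimal's serde feature enabled, this value serializes as the
    // string "20.00", which is why API clients receive money as strings.
    println!("{total}"); // prints 20.00
}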
Frozen Balance Use Cases
Currently, frozen balance is supported in the data model but not actively used in the order flow. Potential future use cases:
- Escrow for dispute resolution
- Pre-authorization holds
- Subscription renewals
- Withdrawal processing delays
Best Practices
- Always Provide Reason: When modifying balance via admin operations, provide clear, descriptive reasons. These appear in user transaction history and audit logs.
- Check Balance Before Deduction: Always verify sufficient available balance before attempting payment operations to avoid transaction rollbacks.
- Use Transactions: Any operation involving balance changes and other state updates (orders, gift cards) must be wrapped in a database transaction.
- Don’t Bypass Change Logs: Never directly update the user_balance table. Always use UpdateUserBalance to ensure change logs are created.
- Validate Amounts: All balance operations should validate that amounts are positive and reasonable (not excessively large).
Frontend Integration
Balance Display:
- Show available_balance as the user’s wallet balance
- Optionally show frozen_balance if non-zero (with explanation)
- Format decimals appropriately for currency display
Transaction History:
- Display ListMyBalanceChanges with infinite scroll or pagination
- Color-code change types: green for Deposit/Unfreeze, red for Consume/Freeze
- Show reason string as transaction description
- Format timestamps in user’s local timezone
Payment Method Selection:
- When available_balance ≥ order total, enable “Pay with Balance” option
- Show remaining balance after payment preview
- Handle NotEnoughBalance error gracefully with top-up prompt
Gift Card Redemption:
- Provide input field for gift card secret
- Handle all RedeemGiftCardResult variants with appropriate messages
- Refresh balance display after successful redemption
See Also:
- Gift Card System - Card generation and management
- Order System - Payment flows and order lifecycle
- Shop Module Introduction - Overall shop architecture
Coupon
Coupon is the discount system that allows users to reduce their order total when purchasing productions. The system supports flexible discount strategies with comprehensive validation rules and usage limits.
Core Concept
A Coupon defines:
- Discount strategy: How the discount is calculated (rate or amount)
- Validity period: When the coupon can be used
- Usage limits: How many times the coupon can be used globally and per user
- Activation status: Whether the coupon is currently active
Data Model
pub struct Coupon {
pub id: i32,
pub code: String, // Unique code users enter
pub is_active: bool, // Administrative on/off switch
pub discount: Json<Discount>, // Discount strategy
pub start_time: Option<PrimitiveDateTime>, // When coupon becomes valid
pub end_time: Option<PrimitiveDateTime>, // When coupon expires
pub time_limit_per_account: Option<i32>, // Max uses per user
pub time_limit_global: Option<i32>, // Max total uses
pub used_count: i32, // Current usage count
}
Discount Types
The system supports two discount strategies through the Discount enum:
1. Rate Discount
Applies a percentage discount to the order total, regardless of the order amount.
Discount::Rate(RateDiscount {
rate: Decimal // e.g., 0.20 for 20% off
})
- Calculation: final_price = original_price × (1 - rate)
- Use case: General promotions (e.g., “20% off any purchase”)
- No minimum order requirement
2. Amount Discount
Subtracts a fixed amount from the order total, with a minimum order requirement.
Discount::Amount(AmountDiscount {
min_amount: Decimal, // Minimum order required
discount: Decimal // Amount to subtract
})
- Calculation: final_price = original_price - discount (if original_price >= min_amount)
- Use case: Threshold promotions (e.g., “$10 off orders over $50”)
- Validation: Coupon is invalid if order doesn’t meet min_amount
The discount data is stored as JSON in the database, allowing flexible extension of discount strategies in the future.
Validation Rules
Coupon validation occurs in two places:
1. Pre-Purchase Verification (VerifyCoupon)
Allows users to check if a coupon is valid before creating an order. This provides immediate feedback in the UI.
Validation checks (in order):
- Coupon exists (by code lookup)
- Current time is after start_time (if set)
- Current time is before end_time (if set)
- Global usage hasn’t exceeded time_limit_global (if set)
- User’s usage count hasn’t exceeded time_limit_per_account (if set)
Returns Option<Coupon> - None if any validation fails.
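A condensed sketch of those checks is shown below. The actual processor loads the per-user usage count from the database and returns Option<Coupon> rather than a bool; the function name and signature here are illustrative only:
use time::PrimitiveDateTime;

// `user_usage` is assumed to come from the per-user usage count query;
// `now` is the current UTC time.
fn coupon_usable(coupon: &Coupon, user_usage: i32, now: PrimitiveDateTime) -> bool {
    coupon.is_active
        && coupon.start_time.map_or(true, |t| now >= t)
        && coupon.end_time.map_or(true, |t| now <= t)
        && coupon
            .time_limit_global
            .map_or(true, |limit| coupon.used_count < limit)
        && coupon
            .time_limit_per_account
            .map_or(true, |limit| user_usage < limit)
}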
2. Order Creation Validation
When a user creates an order with a coupon, the system performs additional validation through coupon_applicable():
Additional checks:
- All time-based validations (as above)
- For Amount discounts: Production price must meet min_amount
- Per-account usage limit is re-checked at the database level (race condition protection)
If validation fails during order creation, returns CreateOrderResult::CouponInvalid.
Integration with Orders
Order Creation Flow
When a user creates an order with a coupon:
- Coupon lookup: Fetch coupon by ID
- Applicability check: Validate using coupon_applicable()
- Per-account limit check: Query database for user’s usage count
- Price calculation: Apply discount to production price
- Order creation: Store order with the coupon_used field
- Usage tracking: The order record links to the coupon for usage counting
Price Calculation
let mut amount = prod.price;
if let Some(coupon) = coupon {
amount = match *coupon.discount {
Discount::Rate(r) => amount * (Decimal::ONE - r.rate),
Discount::Amount(a) => {
if amount >= a.min_amount {
amount - a.discount
} else {
amount // Discount not applied if below minimum
}
}
};
}
Usage Counting
The system tracks coupon usage through the orders table:
- Orders store the coupon_used field (coupon ID)
- used_count on the coupon is derived by counting orders that reference it
- Per-user usage is counted via the CountCouponUsageByUser query
Important: Usage counting is based on order creation, not order payment status. An unpaid order still counts toward usage limits.
Active Status and Code Uniqueness
Active Status (is_active)
The is_active flag allows administrators to enable/disable coupons without deletion:
- Active: Coupon can be found and used
- Inactive: Coupon is invisible to users but preserved in database
This is useful for:
- Temporarily pausing a promotion
- Testing coupons before public release
- Historical record keeping
Code Uniqueness
The database enforces unique active codes through a partial index:
CREATE UNIQUE INDEX "idx_coupon_code"
ON "shop"."coupon" ("code")
WHERE is_active = TRUE;
Implications:
- Multiple inactive coupons can share the same code
- Only one active coupon can have a specific code at any time
- This allows code reuse across different promotion periods
Management Operations
The system provides admin APIs for coupon lifecycle management:
CRUD Operations
- Create: Generate new coupons with all configuration options
- Update: Modify existing coupons (code, discount, limits, times)
- List: Retrieve all coupons (no pagination - suitable for admin dashboard)
- Get: Fetch individual coupon by ID or code
- Delete: Permanently remove coupon from database
Note: Deleting a coupon does not cascade to orders. Orders retain the coupon_used ID even if the coupon is deleted.
Time Management
All timestamps use Unix epoch format in the API but are stored as TIMESTAMP WITHOUT TIME ZONE in the database:
- API layer converts between Unix timestamps and PrimitiveDateTime
- All time comparisons use UTC
- start_time and end_time are optional - omitting them means no time restriction
Design Decisions
Why JSON for Discount?
Storing discount as JSON enables:
- Easy addition of new discount strategies without schema changes
- Type-safe handling through Rust’s serde deserialization
- Database-level storage of complex discount rules
Why Count Orders, Not Payments?
Usage limits count order creation, not successful payments, because:
- Prevents abuse through repeated unpaid orders
- Simplifies usage tracking (no need to track order status changes)
- Protects limited-use coupons from reservation attacks
Why Separate Verification API?
The VerifyCoupon endpoint exists separately from order creation to:
- Provide immediate UI feedback without creating an order
- Allow frontend to show applicable discounts before purchase
- Reduce unnecessary order creation for invalid coupons
Frontend Integration Points
When implementing the coupon UI:
- Code Entry: Call VerifyCoupon as user types/submits coupon code
- Visual Feedback: Display discount type and amount from returned coupon
- Price Preview: Calculate and show discounted price before order creation
- Order Creation: Pass coupon_id (not code) in CreateOrderRequest
- Error Handling: Handle COUPON_INVALID result with user-friendly message
Key point: The verification step returns a Coupon object with an id field. Use this ID when creating the order, not the code string.
EPay Support
EPay (易支付) is the payment gateway integration that enables third-party payment processing through aggregator services. The system supports multiple payment providers with different channels, handles async payment callbacks, and ensures payment security through signature verification.
Core Concept
EPay acts as an abstraction layer over payment aggregators that support:
- AliPay (alipay)
- WeChat Pay (wxpay)
- USDT cryptocurrency (usdt)
The system is designed to support multiple providers simultaneously, each with their own credentials and enabled payment channels. This allows failover capability and regional/method-specific provider selection.
Data Model
pub struct EpayProviderCredential {
pub id: i32,
pub display_name: String, // User-facing provider name
pub enabled_channels: Vec<EpaySupportedChannels>,
pub enabled: bool, // Admin on/off switch
pub key: String, // Merchant secret key
pub pid: i32, // Merchant ID
pub merchant_url: String, // Gateway endpoint
}
Provider Library
The libs/epay crate provides:
- Signature generation: MD5-based signing for payment requests
- Signature verification: Validate callbacks to prevent fraud
- Request/Response types: Type-safe payment gateway communication
- Channel enumeration: Standardized payment method identifiers
Payment Flow Architecture
The EPay payment flow involves multiple stages with async processing:
1. Payment URL Generation
User Journey: User creates order → selects provider and channel → receives payment URL
Process:
- User calls GetPaymentUrl RPC with order ID, provider ID, and channel
- System loads provider credentials from database
- System generates signed payment request using provider’s key
- Returns redirect URL: {merchant_url}?{signed_parameters}
The signed parameters include order details, callback URLs, and an MD5 signature. The signature ensures the gateway can verify the request came from an authorized merchant.
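The exact signing rules live in the libs/epay crate. As a rough, non-authoritative illustration, EPay-style gateways conventionally sort the parameters by key, join them as k=v pairs (skipping empty values and the signature fields), append the merchant key, and take the lowercase hex MD5 of the result:
use std::collections::BTreeMap;

// `params` holds the request fields (merchant id, order number, amount,
// callback URLs, ...); exact field names are gateway-specific.
fn sign(params: &BTreeMap<String, String>, merchant_key: &str) -> String {
    let query = params
        .iter()
        .filter(|(k, v)| !v.is_empty() && k.as_str() != "sign" && k.as_str() != "sign_type")
        .map(|(k, v)| format!("{k}={v}"))
        .collect::<Vec<_>>()
        .join("&");
    // MD5 over "k1=v1&k2=v2...{merchant_key}", hex-encoded lowercase.
    format!("{:x}", md5::compute(format!("{query}{merchant_key}")))
}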
2. External Payment
User is redirected to the EPay gateway (external site) where they complete payment through their chosen method (AliPay, WeChat, USDT). This happens entirely outside the system.
3. Async Callback Processing
Critical Architecture: The callback must return immediately to the gateway (within 2-3 seconds), so processing happens asynchronously through RabbitMQ.
Flow:
EPay Gateway → POST /api/shop/epay/callback → Publish to RabbitMQ → Return 200 OK
↓
EpayHook Consumer
↓
Verify Signature
↓
Update Order Status
↓
Publish OrderPaidEvent
Components:
- api/epay.rs: HTTP endpoint that receives the gateway callback
- events/epay.rs: EpayCallback event definition
- hooks/epay.rs: EpayHook consumer that processes the callback
Why Async?: Payment gateways expect immediate HTTP responses. If the server takes too long, the gateway may retry the callback multiple times, potentially causing duplicate processing.
4. Callback Verification
Security Model: All callbacks MUST verify the signature before processing.
Verification Process:
- Extract callback parameters (order ID, amount, status, etc.)
- Load provider credentials from database using the pid from the callback
- Reconstruct signature using provider’s secret key
- Compare computed signature with received signature
- Reject if signatures don’t match
Protection Against:
- Forged callbacks from malicious actors
- Man-in-the-middle attacks
- Replay attacks with modified amounts
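Verification is the mirror image of signing: recompute the signature from the callback parameters with the stored provider key and compare it to the received one. A sketch reusing the sign helper from the payment-URL section above (field names follow the data model; everything else is illustrative):
use std::collections::BTreeMap;

fn verify_callback(params: &BTreeMap<String, String>, provider: &EpayProviderCredential) -> bool {
    match params.get("sign") {
        // Case-insensitive comparison of the recomputed and received hex digests.
        Some(received) => sign(params, &provider.key).eq_ignore_ascii_case(received),
        None => false,
    }
}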
5. Idempotency Handling
Callbacks can be received multiple times due to network retries. The system handles this by:
- Checking order status before processing
- Only updating unpaid orders
- Returning success for already-paid orders
Multi-Provider System
Provider Discovery
Frontend clients discover available providers via ListEpayProviders RPC:
pub struct EpayProviderSummary {
pub id: i32,
pub display_name: String,
pub enabled_channels: Vec<EpaySupportedChannels>,
}
Query Filters:
- Only returns providers where enabled = TRUE
- Excludes providers with empty enabled_channels
- Filters out channels not in the enabled list
This allows dynamic provider selection in the UI based on current availability.
Provider Selection
When requesting a payment URL, the user specifies:
- Provider ID: Which payment aggregator to use
- Channel: Which payment method (alipay, wxpay, usdt)
The system validates:
- Provider exists and is enabled
- Requested channel is in provider’s enabled_channels
- Order is unpaid and belongs to the requesting user
Provider Management
The enabled flag (added via migration 20250929232831) allows administrators to:
- Temporarily disable problematic providers without deletion
- Switch between providers during incidents
- A/B test different payment gateways
- Phase in new providers gradually
Database Operations:
- Providers are managed via admin interface or direct database access
- No gRPC APIs exist for provider CRUD (admin-only operation)
- Credentials are redacted in logs for security
Configuration
EPay requires configuration in the shop module config:
{
"shop": {
"epay_notify_url": "https://your-domain.com/api/shop/epay/callback",
"epay_return_url": "https://your-domain.com/payment/success",
...
}
}
Configuration Fields
- epay_notify_url: Server-to-server callback endpoint (async notification)
- epay_return_url: User redirect URL after payment (browser redirect)
Important Distinctions:
- notify_url: Backend webhook for payment processing (reliable)
- return_url: Frontend redirect for user experience (unreliable)
Never rely on return_url for order processing. Users may close the browser before redirecting. Always use the notify_url callback for payment confirmation.
Provider Credentials
Providers are stored in the shop.epay_provider_credential table:
INSERT INTO shop.epay_provider_credential (
display_name,
enabled_channels,
enabled,
key,
pid,
merchant_url
) VALUES (
'My Payment Provider',
ARRAY['alipay', 'wxpay']::text[],
true,
'your-merchant-secret-key',
1234,
'https://pay.provider.com/submit.php'
);
Obtaining Credentials: Register with an EPay-compatible payment aggregator to receive merchant credentials (PID, Key, Gateway URL).
Integration with Order System
Order Fields
Orders track EPay payment through:
- paid_with_epay_provider: Stores provider ID when payment URL is generated
- payment_method: Set to channel (AliPay, WeChat, USDT) after payment
- order_status: Updated from Unpaid to Paid on successful callback
Payment Method Mapping
The system maps EPay channels to internal payment methods:
EpaySupportedChannels::AliPay => PaymentMethod::AliPay
EpaySupportedChannels::WeChatPay => PaymentMethod::WeChat
EpaySupportedChannels::Usdt => PaymentMethod::Usdt
Event Publishing
When a callback successfully processes:
- Order status updated to Paid
- OrderPaidEvent published to RabbitMQ (shop.order_paid)
- Downstream consumers (e.g., market module) react to the event
Error Handling
Callback Validation Failures
If signature verification fails:
- Log warning (may indicate exposed webhook or malicious request)
- Return error to gateway (gateway may retry with correct signature)
- Do NOT update order status
Order State Errors
If order is not found or already paid:
- Return success to gateway (prevent infinite retries)
- Log the incident for monitoring
Provider Not Found
If the callback references an unknown provider:
- Cannot verify signature (no key available)
- Log error and return failure
Frontend Integration Points
When implementing EPay payment UI:
- List Providers: Call ListEpayProviders to get available providers and channels
- Display Options: Show provider names and channel icons (AliPay, WeChat, USDT)
- Request Payment: Call GetPaymentUrl with selected provider ID and channel
- Redirect User: Open payment URL in browser or webview
- Handle Return: When user returns via return_url, poll order status to confirm payment
- Status Polling: Use GetOrderById to check if payment completed
Key Points:
- Payment confirmation happens via backend callback, not frontend redirect
- Frontend should poll order status after user returns
- Don’t assume payment succeeded just because user returned to app
- Handle timeout scenarios (user abandons payment gateway)
Design Decisions
Why Multi-Provider Support?
Supporting multiple providers enables:
- Failover: Switch to backup provider if primary has issues
- Regional Optimization: Use different providers for different regions
- Rate Shopping: Select providers with better fees for specific channels
- Risk Distribution: Avoid single point of failure
Why Async Callback Processing?
Payment gateways expect fast responses (< 3 seconds). Database queries, signature verification, and event publishing can exceed this threshold. Async processing via RabbitMQ ensures:
- Immediate HTTP response to gateway
- Reliable processing with automatic retries
- Decoupled webhook handling from business logic
Why Store Provider ID on Order?
When generating a payment URL, the system stores the provider ID in paid_with_epay_provider. This enables:
- Signature verification (need provider’s key)
- Callback validation (ensure callback matches expected provider)
- Analytics and reporting (which provider processed the payment)
Why MD5 Signatures?
MD5 is cryptographically weak but widely used by Chinese payment aggregators. The EPay library uses MD5 for compatibility with existing gateway implementations. The signature prevents tampering but should not be considered cryptographically secure.
Gift Card
Gift Card is a prepaid credit system that allows users to redeem balance into their account. Gift cards have a secret code, a fixed monetary value, an expiration date, and can only be used once. Administrators can generate gift cards in bulk or create special cards with custom secrets.
Core Concept
A Gift Card represents:
- Secret code: Unique 64-character alphanumeric string for redemption
- Amount: Fixed value credited to user’s balance upon redemption
- Expiration: Time limit after which the card becomes invalid
- Single-use: Once redeemed, the card is permanently marked as used
Data Model
pub struct GiftCard {
pub id: i32,
pub secret: String, // 64-char alphanumeric code
pub amount: Decimal, // Value to credit
pub used_by: Option<Uuid>, // User who redeemed (if any)
pub created_at: PrimitiveDateTime, // Creation timestamp
pub redeem_at: Option<PrimitiveDateTime>, // When it was redeemed
pub valid_until: PrimitiveDateTime, // Expiration date
}
Gift Card Lifecycle
1. Generation
Gift cards are created by administrators through bulk generation or special creation:
Bulk Generation (AdminGenerateGiftCard):
- Generates N cards with identical amount and expiration
- Secrets are randomly generated (64-character alphanumeric)
- Returns list of secret codes for distribution
- Uses hash set to ensure uniqueness before database insertion
Special Creation (AdminCreateSpecialGiftCard):
- Creates a single card with custom secret (e.g., promotional codes like “WELCOME2025”)
- Useful for marketing campaigns or personalized gifts
- Secret must be unique (database constraint)
Permissions: Moderator, SuperAdmin, CustomerSupport
2. Distribution
Gift cards exist as secret codes. Administrators must distribute these codes to users through external channels (email, physical cards, promotional materials). The system does not handle distribution automatically.
3. Redemption
Users redeem gift cards through GiftCardService::RedeemGiftCard:
Flow:
- User submits secret code
- System looks up valid gift card (not used, not expired)
- Validates user exists
- Transaction begins:
- Add card amount to user’s available balance
- Log balance change (type: Deposit, reason: “Redeem Gift Card”)
- Mark card as used with used_by and redeem_at
- Transaction commits
Validation Order:
- Card exists by secret → otherwise CardNotFound
- Card not already used → otherwise AlreadyUsed
- Card not expired (valid_until > NOW()) → otherwise Expired
- User exists → otherwise UserNotFound
- Transaction succeeds → Success
Important: The validation provides specific error reasons. If a card is found but invalid, the system distinguishes between “already used” and “expired” to provide clear feedback.
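The sketch below shows how that validation order could map onto result variants; it is illustrative only, since the real processor performs these checks inside a transaction with the gift-card row locked:
use time::PrimitiveDateTime;

enum RedeemOutcome {
    Success,
    CardNotFound,
    AlreadyUsed,
    Expired,
    UserNotFound,
}

fn classify(card: Option<&GiftCard>, user_exists: bool, now: PrimitiveDateTime) -> RedeemOutcome {
    let Some(card) = card else {
        return RedeemOutcome::CardNotFound;
    };
    if card.used_by.is_some() {
        return RedeemOutcome::AlreadyUsed;
    }
    if card.valid_until <= now {
        return RedeemOutcome::Expired;
    }
    if !user_exists {
        return RedeemOutcome::UserNotFound;
    }
    RedeemOutcome::Success
}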
4. Expiration
Expired cards remain in the database but cannot be redeemed:
- Query filter: valid_until > NOW() AND used_by IS NULL
- Redemption attempt returns the Expired result
- No automatic cleanup of expired cards (historical record)
User Operations
Redeem Gift Card
Service: GiftCardService::RedeemGiftCard
Users provide a secret code to add balance to their account.
Result Types:
- Success: Balance credited successfully
- CardNotFound: Secret doesn’t exist in database
- AlreadyUsed: Card was previously redeemed by any user
- Expired: Card’s valid_until date has passed
- UserNotFound: Requesting user account doesn’t exist (edge case)
Transaction Safety: The redemption process is fully transactional. If the balance update fails, the card remains unused. This prevents double-redemption and ensures balance consistency.
Admin Operations
All gift card admin operations are exposed through ManageService and require appropriate admin roles.
Generate Gift Cards
Operation: AdminGenerateGiftCard
Bulk-creates gift cards with identical configuration.
Parameters:
- number: How many cards to generate (batch size)
- amount: Value of each card
- valid_until: Expiration timestamp for all cards
Returns: List of secret codes (strings)
Use Cases:
- Promotional campaigns (e.g., 1000 cards worth $10 each)
- Customer rewards programs
- Event giveaways
Note: The operation returns the secret codes in the response. These should be securely stored or distributed immediately, as they cannot be retrieved later (secrets are logged as [REDACTED] in debug output for security).
Create Special Gift Card
Operation: AdminCreateSpecialGiftCard
Creates a single card with a custom secret.
Parameters:
- secret: Custom code (e.g., “NEWYEAR2025”)
- amount: Card value
- valid_until: Expiration timestamp
Use Cases:
- Marketing promotions with memorable codes
- Influencer partnerships with branded codes
- VIP customer gifts
Database Constraint: The secret must be unique across all gift cards (used or unused). Attempting to create a duplicate secret will fail.
List Gift Cards
Operation: AdminListGiftCards
Retrieves gift cards with optional filtering and pagination.
Filters:
- filter_id: Specific card ID
- filter_secret: Partial or exact secret match
- filter_is_used: Show only used or unused cards
- filter_used_by: Cards redeemed by specific user
- limit, page: Pagination controls
Use Cases:
- Finding cards redeemed by a user (customer support)
- Checking if a secret exists before creation
- Monitoring unused expired cards
- Audit trail of card usage
Delete Gift Cards
Operation: AdminDeleteGiftCards
Permanently removes gift cards from the database.
Parameters: List of card IDs to delete
Permissions: Moderator, SuperAdmin (more restricted than other operations)
Use Cases:
- Removing expired promotional campaigns
- Cleaning up unused cards after campaign ends
Warning: Deleting a redeemed card does not affect user balance. The balance change log remains intact with the reason “Redeem Gift Card”, but the reference to the card is lost.
Integration with Balance System
Gift card redemption is the primary way users add funds to their account (other than admin top-ups).
Balance Credit Flow
When a gift card is redeemed:
- User’s available_balance increases by the card amount
- A UserBalanceChangeLog entry is created:
  - change_type: Deposit
  - amount: Card value (positive)
  - reason: “Redeem Gift Card”
- User can immediately use the balance to pay for orders
See Account Balance for complete balance system documentation.
Transaction Integrity
The redemption uses database transactions with row-level locking:
- SELECT ... FOR UPDATE on the gift card prevents concurrent redemption
- Balance update and card marking happen atomically
- If balance update fails (user not found), card remains unused
Security Considerations
Secret Generation
Secrets are 64-character random alphanumeric strings (A-Z, a-z, 0-9):
- Entropy: ~380 bits (62^64 combinations)
- Collision probability: Negligible even for millions of cards
- Not cryptographically signed or verifiable offline
Uniqueness: The system generates secrets in memory using a HashSet before database insertion, ensuring no duplicates in a batch. Database constraint provides final uniqueness guarantee.
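A sketch of that batch generation step, assuming a rand 0.8-style API (the Alphanumeric distribution); the real generator may differ in detail:
use std::collections::HashSet;

use rand::distributions::Alphanumeric;
use rand::Rng;

fn generate_secrets(count: usize) -> HashSet<String> {
    let mut secrets = HashSet::with_capacity(count);
    while secrets.len() < count {
        // 64 random characters drawn from A-Z, a-z, 0-9.
        let secret: String = rand::thread_rng()
            .sample_iter(&Alphanumeric)
            .take(64)
            .map(char::from)
            .collect();
        secrets.insert(secret); // duplicates are silently dropped and regenerated
    }
    secrets
}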
Debug Output Redaction
The GiftCard struct implements custom Debug to redact secrets:
.field("secret", &"[REDACTED]")
This prevents accidental secret leakage in logs, error messages, or debug traces.
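A minimal version of such a redacting Debug implementation might look roughly like this (the exact field selection is assumed):
use std::fmt;

impl fmt::Debug for GiftCard {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.debug_struct("GiftCard")
            .field("id", &self.id)
            .field("secret", &"[REDACTED]") // never print the real code
            .field("amount", &self.amount)
            .field("used_by", &self.used_by)
            .field("valid_until", &self.valid_until)
            .finish()
    }
}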
No Secret Recovery
Once generated, secrets cannot be retrieved by ID. Admins must save the secret codes from the generation response. This is intentional—secrets are meant to be distributed, not stored centrally.
Database Schema
Key schema details (from migrations/20250922063531_gift_card_system.sql):
Table: shop.gift_card
id SERIAL PRIMARY KEY
secret VARCHAR(255) NOT NULL UNIQUE
amount NUMERIC NOT NULL
used_by UUID REFERENCES auth.user_profile(id)
created_at TIMESTAMP NOT NULL DEFAULT NOW()
redeem_at TIMESTAMP
valid_until TIMESTAMP NOT NULL
Indexes:
- (secret): Fast lookup by secret (primary redemption path)
- (secret) WHERE used_by IS NULL: Fast lookup for valid cards
- (valid_until) WHERE used_by IS NULL: Expiration queries
- (used_by): Find cards redeemed by a user
Foreign Key: used_by references user profile with ON DELETE RESTRICT, preventing user deletion if they’ve redeemed cards.
Frontend Integration Points
User Redemption Flow
- Input Field: Provide text input for secret code (64 characters)
- Validation: Optional client-side format check (alphanumeric, length)
- Submission: Call RedeemGiftCard with the secret
- Result Handling:
  - Success: Show success message, update balance display
  - CardNotFound: “Invalid gift card code”
  - AlreadyUsed: “This gift card has already been redeemed”
  - Expired: “This gift card has expired”
  - UserNotFound: Generic error (should never happen)
- Balance Refresh: Fetch updated balance after successful redemption
Admin Management UI
Generate Cards:
- Form with number, amount, expiration date
- Display generated secrets in a list (with copy buttons)
- Warn user to save secrets before leaving page
List Cards:
- Table with columns: ID, Secret (masked/copyable), Amount, Status (Used/Unused), Used By, Created, Redeemed, Expires
- Filters for used/unused, user, date ranges
- Search by secret or ID
Create Special Card:
- Form with custom secret input, amount, expiration
- Validate secret format before submission
- Handle duplicate secret error clearly
Design Decisions
Why Single-Use Only?
Gift cards are one-time redeemable by design:
- Simplifies balance tracking (single deposit event)
- Prevents confusion about remaining card balance
- Matches traditional physical gift card behavior
- Users can check their balance history for redemptions
For recurring credits or subscriptions, use coupon system or scheduled balance deposits instead.
Why No Secret Retrieval?
Secrets are treated as bearer tokens:
- Admin generates and distributes them
- System validates but doesn’t need to recall them
- Reduces risk of centralized secret exposure
- Aligns with physical gift card model (code is on the card)
Admins can search by secret if a user provides it (customer support), but cannot list all secrets.
Why Soft Delete Not Supported?
Unlike coupons (which have is_active), gift cards are either used or deleted:
- Once distributed, cards shouldn’t be “disabled” (already in user’s hands)
- Unused cards can be deleted if campaign is cancelled
- Used cards should be kept for audit trail (don’t delete redeemed cards)
Why Expiration is Required?
All gift cards must have a valid_until date:
- Prevents indefinite liability on the system
- Aligns with legal/financial regulations for prepaid instruments
- Encourages timely redemption
- Allows cleanup of old campaigns
Set far-future dates (e.g., 10 years) for effectively non-expiring cards if needed.
See Also:
- Account Balance - Balance system and transaction history
- Order System - Using balance to pay for orders
- Shop Module Introduction - Overall shop architecture
Notification Module
The Notification module provides system-wide announcements and user notification preference management. It enables administrators to broadcast messages to users with different priorities and targeting options, while allowing users to control what types of notifications they wish to receive.
Core Features
Announcements
Announcements are system-wide messages that can be displayed to users. Key characteristics:
- Priority Levels: Four priority levels (Journal, Info, Warning, Urgent) to indicate message importance
- User Targeting: Announcements can be targeted to specific user groups and extra groups
- Pinning: Important announcements can be pinned to stay at the top
- Persistence: All announcements are stored in the database for historical reference
Notification Settings
Each user has personalized notification preferences that control:
- Login Notifications: Whether to receive email notifications on login
- Marketing Communications: Opt-in/out for promotional emails
- Service Alerts: Notifications for package expiration and other service events
These settings are stored per-user and can be modified through the user-facing API.
Architecture
Services
The module exposes two gRPC services:
- NotificationService: User-facing API for viewing announcements and managing personal notification settings
- NotificationManageService: Admin-facing API for CRUD operations on announcements
Events & Hooks
The module uses an internal event system for announcement lifecycle management:
- AnnouncementCreatedEvent: Published when a new announcement is created, consumed internally for potential side effects (e.g., cache invalidation, push notifications)
- Event routing uses RabbitMQ with the notification exchange
Database Schema
The module uses its own notification PostgreSQL schema with two main tables:
- announcement: Stores announcement data with GIN indexes on user group arrays for efficient targeting
- settings: Stores per-user notification preferences, linked to user profiles via foreign key
Integration Points
With Auth Module
- Notification settings are tied to user profiles through foreign key relationships
- User group information is used for announcement targeting
With Mailer Module
While not directly coupled, the notification settings (especially send_login_email and receive_marketing_email) are designed to be consumed by the mailer module when sending emails.
Development Notes
- All APIs follow the Processor pattern (not OOP)
- The module uses RabbitMQ for internal event handling
- Announcement targeting uses PostgreSQL array types with GIN indexes for performance
- User groups are stored as integer arrays for flexible targeting
Announcement
Announcement is a system-wide messaging feature that allows administrators to broadcast important information to targeted user groups. Announcements support priority levels, user targeting, and pinning capabilities to ensure critical messages reach the right users.
Core Concept
An Announcement represents a broadcast message with:
- Title and Content: Message headline and body text
- Priority Level: Visual importance indicator (Journal, Info, Warning, Urgent)
- User Targeting: Specify which user groups should see the announcement
- Pinning: Pin important announcements to the top of the list
- Persistence: All announcements are stored permanently for historical reference
Data Model
pub struct Announcement {
pub id: i64,
pub title: String,
pub content: String,
pub user_group: Vec<i32>, // Target user groups
pub user_extra_groups: Vec<i32>, // Target extra user groups
pub is_pinned: bool, // Whether pinned to top
pub priority: AnnouncementPriority, // Visual importance
pub created_at: PrimitiveDateTime,
pub updated_at: PrimitiveDateTime,
}
Priority Levels
Announcements have four priority levels that indicate message importance:
- Journal: Routine informational messages (default priority)
- Info: General information that users should be aware of
- Warning: Important messages requiring user attention
- Urgent: Critical messages that need immediate attention
Priority levels are purely visual indicators—they don’t affect targeting or delivery. Frontend implementations should style these differently to draw appropriate attention.
User Targeting
Announcements use a flexible targeting system based on user groups:
Targeting Logic
An announcement is visible to a user if either condition is true:
- User’s primary user_group matches any group in the announcement’s user_group array
- User’s user_extra_groups has any overlap with the announcement’s user_extra_groups array
This OR-based logic allows administrators to target announcements broadly or narrowly:
- Broadcast to all: Include all possible user groups
- Target specific roles: Specify only relevant user groups
- Mixed targeting: Combine primary and extra groups for fine-grained control
Empty Targeting
If both user_group and user_extra_groups are empty arrays, the announcement won’t be visible to any users. This can be useful for drafting announcements before activating them.
Announcement Lifecycle
1. Creation
Administrators create announcements through AdminCreateAnnouncement:
Process:
- Admin specifies title, content, targeting, priority, and pin status
- Announcement is inserted into database
- AnnouncementCreatedEvent is published to RabbitMQ
- Event hook can trigger side effects (cache invalidation, push notifications)
Permissions: SuperAdmin, Moderator
Event System: Creation triggers an internal event for extensibility. Currently, the event hook is a placeholder for future features like real-time push notifications or cache updates.
2. Display
Users retrieve announcements through NotificationService::ListAnnouncements:
Behavior:
- Returns announcements targeted to the requesting user
- Sorted by pinned status first (pinned on top), then by creation date (newest first)
- Limited to 20 announcements per request
- No pagination—shows the 20 most relevant announcements
Targeting Query: Uses PostgreSQL array operators (= ANY() for primary group, && for array overlap) with GIN indexes for performance.
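As a hedged sketch, the query could look roughly like the following (assuming Announcement derives sqlx::FromRow and AnnouncementPriority maps to the database enum; the concrete query in the codebase may differ):
use sqlx::PgPool;

async fn list_for_user(
    pool: &PgPool,
    user_group: i32,
    user_extra_groups: &[i32],
) -> Result<Vec<Announcement>, sqlx::Error> {
    sqlx::query_as::<_, Announcement>(
        r#"
        SELECT * FROM notification.announcement
        WHERE $1 = ANY(user_group)        -- primary group match
           OR user_extra_groups && $2     -- extra-group overlap
        ORDER BY is_pinned DESC, created_at DESC
        LIMIT 20
        "#,
    )
    .bind(user_group)
    .bind(user_extra_groups)
    .fetch_all(pool)
    .await
}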
Single Announcement: Users can also fetch a specific announcement by ID via GetAnnouncement, useful for detail pages or deep links.
3. Updates
Administrators can edit existing announcements through AdminEditAnnouncement:
Editable Fields: All fields (title, content, targeting, priority, pinning) can be modified
Effect: Changes are immediate—users will see updated content on next fetch
No History: The system doesn’t track edit history. If audit trail is needed, it’s handled through the admin operation logging system.
4. Deletion
Administrators can permanently remove announcements through AdminDeleteAnnouncement:
Permissions: SuperAdmin, Moderator
Effect: Hard delete from database—no soft delete or archiving
Use Cases:
- Removing outdated announcements
- Cleaning up test announcements
- Deleting announcements posted in error
User Operations
List Announcements
Service: NotificationService::ListAnnouncements
Retrieves all announcements targeted to the current user, ordered by relevance (pinned first, then newest).
Response: Returns up to 20 announcements. No pagination—users see the most important/recent messages.
Targeting: Automatically filtered based on user’s group membership—no need to specify groups in request.
Get Single Announcement
Service: NotificationService::GetAnnouncement
Fetches a specific announcement by ID.
Use Case: Detail pages, direct links, or refreshing a single announcement without fetching the entire list.
Access Control: No additional access control beyond targeting—if an announcement exists and targets the user, they can retrieve it by ID.
Admin Operations
All admin operations require SuperAdmin or Moderator roles and are exposed through NotificationManageService.
Create Announcement
Operation: AdminCreateAnnouncement
Broadcasts a new message to users.
Parameters:
- title: Message headline
- content: Full message body (supports long text, formatting handled by frontend)
- user_group: Array of primary user group IDs to target
- user_extra_groups: Array of extra user group IDs to target
- is_pinned: Whether to pin to top of list
- priority: Visual importance indicator
Returns: New announcement ID
Event: Triggers AnnouncementCreatedEvent for extensibility (future push notifications, cache invalidation)
List Announcements
Operation: AdminListAnnouncements
Retrieves all announcements (not filtered by user targeting) for management purposes.
Pagination: Uses limit and offset for pagination
Sorting: Returns announcements in creation order (newest first)
Use Case: Admin dashboard showing all system announcements regardless of targeting
Edit Announcement
Operation: AdminEditAnnouncement
Updates an existing announcement.
Parameters: All fields can be modified (same as creation)
Returns: Updated announcement object
Effect: Changes are immediate—users see updated content on next fetch
Delete Announcement
Operation: AdminDeleteAnnouncement
Permanently removes an announcement from the database.
Parameters: Announcement ID
Returns: Empty response on success
Warning: Hard delete with no recovery mechanism. Ensure announcement should be permanently removed.
Pinning Behavior
Pinned announcements always appear at the top of the list, regardless of creation date:
Sorting Order:
- Pinned announcements (ordered by creation date, newest first)
- Unpinned announcements (ordered by creation date, newest first)
Use Cases:
- Pin urgent system maintenance notices
- Keep important policy changes visible
- Highlight time-sensitive information
Frontend Consideration: Pinned announcements should be visually distinct (e.g., pin icon, different background) to indicate their importance.
Database Schema
Key schema details (from 20250814171450_create_entities_for_notification_module.sql):
Table: notification.announcement
id BIGSERIAL PRIMARY KEY
title TEXT NOT NULL
content TEXT NOT NULL
user_group INTEGER[] NOT NULL
user_extra_groups INTEGER[] NOT NULL
is_pinned BOOLEAN NOT NULL DEFAULT FALSE
priority announcement_priority NOT NULL DEFAULT 'journal'
created_at TIMESTAMP NOT NULL DEFAULT NOW()
updated_at TIMESTAMP NOT NULL DEFAULT NOW()
Indexes:
- GIN index on user_group: Fast targeting queries
- GIN index on user_extra_groups: Fast targeting queries
- B-tree index on is_pinned: Efficient pinned-first sorting
- B-tree index on priority: Potential future priority-based queries
No Foreign Key: User groups are stored as integer arrays with no foreign key constraint, providing flexibility for group management.
Event System
The announcement system uses internal events for lifecycle hooks:
Event: AnnouncementCreatedEvent
- Published by: ManageService::AdminCreateAnnouncement
- Consumed by: Internal AnnouncementEventHook (extensibility placeholder)
- Route: notification.announcement_created on the notification exchange
- Payload: announcement_id, created_at timestamp
Current Implementation: The event hook is a placeholder. Future implementations might:
- Invalidate frontend caches
- Send push notifications to targeted users
- Trigger webhooks for external integrations
- Log analytics events
Frontend Integration Points
User Announcement Display
List View:
- Call ListAnnouncements on page load
- Display announcements with priority-based styling:
- Urgent: Red/high-contrast styling
- Warning: Yellow/amber styling
- Info: Blue/neutral styling
- Journal: Default/subtle styling
- Show pinned announcements with visual indicator (pin icon)
- Limit display to 20 announcements (no pagination needed)
Detail View:
- Provide “Read More” links using announcement ID
- Call GetAnnouncement to fetch full content
- Display full message with timestamp and priority indicator
Real-time Updates: Consider polling ListAnnouncements periodically (e.g., every 5 minutes) or implementing WebSocket for real-time announcement delivery.
Admin Management UI
Create Form:
- Title input (required)
- Content textarea (supports long text)
- User group multi-select (checkboxes or dropdown)
- Extra user groups multi-select
- Priority dropdown (Journal, Info, Warning, Urgent)
- Pin checkbox
List View:
- Table showing all announcements
- Columns: ID, Title, Priority, Pinned, Target Groups, Created, Updated
- Edit and delete actions
- Pagination controls (limit/offset)
Edit Form:
- Pre-populate with existing values
- Allow modification of all fields
- Show last updated timestamp
Design Decisions
Why No Read Tracking?
Announcements don’t track which users have read them:
- Simplicity: Avoids per-user state management
- Broadcast Nature: Announcements are informational, not actionable
- Scalability: No need to store read state for thousands of users
- Use Case: For messages requiring acknowledgment, use notification systems with explicit confirmation flows
Why Hard Delete Instead of Soft Delete?
Announcements are permanently deleted rather than archived:
- Clean Data: Old announcements clutter management UI
- No Recovery Need: If an announcement needs to persist, don’t delete it
- Admin Control: Admins have full control over lifecycle
- Audit Trail: Admin operation logging provides deletion records if needed
Why Limit to 20 Announcements?
User-facing list is capped at 20 without pagination:
- Relevance: Users don’t need to see dozens of old announcements
- Performance: Keeps queries fast with simple sorting
- UX: Encourages admins to keep announcements current
- Workaround: For historical access, provide search/filter in admin panel
Why OR-Based Targeting?
Users see announcements if they match either primary group or extra groups:
- Flexibility: Allows broad or narrow targeting
- Ease of Use: Simpler than complex boolean logic
- Common Use Case: “Show to all premium users OR all beta testers”
- Empty Arrays: Provide draft mode (not visible to anyone)
See Also:
- Notification Module Introduction - Module overview and architecture
- Auth Module - User group management (user_group and user_extra_groups)
Notification Settings
Notification Settings provide per-user preferences that control which types of email notifications and alerts a user receives. This enables users to opt-in or opt-out of different notification categories according to their preferences.
Core Concept
Each user has a personalized notification settings record that controls three types of notifications:
- Login Notifications (send_login_email): Email alerts when the user logs in
- Marketing Communications (receive_marketing_email): Promotional and marketing emails
- Service Alerts (notify_package_expired): Notifications about package/subscription expiration
These settings are stored per-user and are designed to be consumed by other modules (primarily the Mailer module) when deciding whether to send notifications.
Data Model
pub struct NotificationSettings {
pub id: Uuid, // User ID (foreign key to auth.user_profile)
pub send_login_email: bool,
pub receive_marketing_email: bool,
pub notify_package_expired: bool,
pub created_at: PrimitiveDateTime,
pub updated_at: PrimitiveDateTime,
}
Default Behavior
Notification settings use a lazy creation approach:
- Settings are not automatically created when a user signs up
- When fetching settings for a user without a record, the API returns default values
- Settings are only created in the database when the user first modifies them
Default Values:
- send_login_email: false (login notifications disabled by default)
- receive_marketing_email: false (marketing emails disabled by default)
- notify_package_expired: true (service alerts enabled by default)
This design reduces database writes for users who never change their preferences while still providing predictable defaults.
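Illustrative only: resolving a possibly missing settings row into the three effective flags, using the documented defaults when no record exists (NotificationSettings is the struct shown above):
fn effective_flags(stored: Option<&NotificationSettings>) -> (bool, bool, bool) {
    match stored {
        Some(s) => (
            s.send_login_email,
            s.receive_marketing_email,
            s.notify_package_expired,
        ),
        // Defaults: login emails off, marketing emails off, expiration alerts on.
        None => (false, false, true),
    }
}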
Notification Types
Login Email Notifications
Purpose: Alert users when someone logs into their account
Use Case: Security monitoring—users can detect unauthorized access attempts
Default: Disabled (false)
Consumer: Mailer module checks this setting when the Auth module publishes login events
Frontend Consideration: Present as “Email me when I log in” checkbox in user settings
Marketing Email Notifications
Purpose: Promotional emails, feature announcements, and marketing campaigns
Use Case: Allow users to opt-out of non-essential marketing communications while still receiving critical service updates
Default: Disabled (false)
Consumer: Mailer module or external marketing systems check this before sending promotional content
Compliance: Important for GDPR and email marketing regulations—users must explicitly opt-in
Package Expiration Notifications
Purpose: Alert users when their service package/subscription is about to expire or has expired
Use Case: Ensure users are aware of expiring services and can take action (renew, upgrade, etc.)
Default: Enabled (true)
Consumer: Telecom module or background jobs check this before sending expiration alerts
Why Enabled by Default: Service continuity—users generally want to know when their service is expiring
User Operations
All operations are exposed through the NotificationService gRPC service and automatically use the authenticated user’s ID from the request middleware.
Get Notification Settings
Operation: GetMyNotificationSettings
Retrieves the current user’s notification preferences.
Authentication: User ID extracted from request middleware
Behavior:
- If settings exist: Returns stored preferences
- If settings don’t exist: Returns default values (without creating a database record)
Use Case: Display current preferences in user settings page
Set Notification Settings
Operation: SetNotificationSettings
Updates the user’s notification preferences.
Authentication: User ID extracted from request middleware
Parameters: All three boolean flags (must provide all values, not partial updates)
Behavior: Uses UPSERT operation—creates record if none exists, updates if it does
Effect: Changes are immediate—other modules will see updated preferences on their next check
Design Note: Requires all three flags to prevent partial updates. Frontend should fetch current settings first, then send all values with modifications.
Database Schema
Table: notification.settings
id UUID PRIMARY KEY
REFERENCES auth.user_profile(id) ON DELETE CASCADE
send_login_email BOOLEAN NOT NULL DEFAULT FALSE
receive_marketing_email BOOLEAN NOT NULL DEFAULT FALSE
notify_package_expired BOOLEAN NOT NULL DEFAULT TRUE
created_at TIMESTAMP NOT NULL DEFAULT NOW()
updated_at TIMESTAMP NOT NULL DEFAULT NOW()
Key Characteristics:
- Primary key is user ID (foreign key to auth.user_profile)
- Cascade delete: Settings are removed when user account is deleted
- No additional indexes needed (lookups by primary key are already fast)
- Automatic timestamps via database triggers
Integration Points
With Auth Module
- Settings table references auth.user_profile via foreign key
- User ID is the primary key—one settings record per user
- Cascade delete ensures data consistency when users are removed
With Mailer Module
While not directly coupled (no foreign keys or direct queries), the notification settings are designed to be consumed by the mailer module:
Integration Pattern:
- Mailer receives event (e.g., “user logged in”, “package expiring”)
- Mailer queries notification settings for target user(s)
- Mailer respects user preferences before sending email
Batch Operations: The FindNotificationSettingsByIds processor supports fetching settings for multiple users efficiently, useful for bulk email campaigns.
With Telecom Module
Service expiration notifications are checked against notify_package_expired before sending alerts about package/subscription expirations.
Frontend Integration Points
User Settings Page
Display Current Settings:
- Call GetMyNotificationSettings on page load
- Pre-populate checkboxes with returned values
- Handle both existing settings and default values transparently
Save Changes:
- Gather all three checkbox states (even unchanged ones)
- Call SetNotificationSettings with all three values
- Show confirmation message on success
- No need to re-fetch—changes are immediate
UI Recommendations:
- Group notifications by category (Security, Marketing, Service)
- Provide clear descriptions of what each setting controls
- Consider warning users before disabling critical notifications (package expiration)
- Show last updated timestamp if available
Example Layout:
Notification Preferences
━━━━━━━━━━━━━━━━━━━━━━━
Security Notifications
☐ Email me when I log in
Marketing & Promotions
☐ Receive promotional emails and feature announcements
Service Alerts
☑ Notify me when my package is about to expire
Account Registration Flow
No Action Needed: Settings don’t need to be created during registration. Users will see default values until they modify preferences.
Design Decisions
Why Lazy Creation?
Settings are only created when users first modify them:
Pros:
- Reduces database writes for users who never change settings
- Simpler registration flow (one less insert operation)
- No storage cost for default preferences
Cons:
- API must handle missing records gracefully
- Queries joining against settings must use a LEFT JOIN to handle missing rows
Decision: The storage savings and reduced write load outweigh the minor complexity of handling missing records.
Why Return Defaults Instead of 404?
When settings don’t exist, the API returns default values rather than an error:
Reasoning:
- Better UX—frontend doesn’t need special handling for new users
- Predictable behavior—defaults are consistent
- Simpler frontend code—no need to distinguish “not set” from “default”
Why Require All Three Flags on Update?
The SetNotificationSettings API requires all three boolean values (no partial updates):
Reasoning:
- Prevents accidental resets—frontend must be intentional about all values
- Simpler implementation—no need to handle partial updates
- Clear semantics—“set these preferences” not “update some preferences”
- Atomic updates—all settings change together
Frontend Impact: Must fetch current settings first, then send all values. This is the standard pattern for settings pages anyway.
Why No Granular Notification Types?
Currently only three notification types exist:
Trade-off:
- Fewer types = simpler UX, less overwhelming for users
- Limited flexibility—can’t opt-in to some marketing emails but not others
Extensibility: If more granular control is needed in the future, additional boolean fields can be added to the schema without breaking existing APIs (protobuf supports field addition).
Why No Notification Channels?
Settings don’t distinguish between email, SMS, push notifications:
Current Design: All settings assume email notifications
Future Expansion: If other channels (SMS, push, in-app) are added:
- Could add separate fields (e.g., send_login_email_sms, send_login_email_push)
- Or restructure to nested channel preferences
- Current design doesn’t block future expansion
See Also:
- Notification Module Introduction - Module overview and architecture
- Announcement - System-wide messaging feature
- Mailer Module - Email sending and template management
Support Module
The Support module provides a ticket-based customer support system enabling two-way communication between users and administrators. It handles ticket creation, message threading, status management, and access control for both user-facing and admin-facing operations.
Core Features
Ticket Management
Tickets are conversation threads with lifecycle management:
- Priority Levels: Three priority levels (Low, Medium, High) to help admins triage support requests
- Status Transitions: Automatic status updates based on who sends messages (Open → Pending → Resolved → Closed)
- Ownership Validation: Each ticket is tied to a specific user; access control ensures users can only see their own tickets
- Title-based Organization: Each ticket has a title for quick identification in lists
Messaging System
Real-time threaded messaging within tickets:
- Dual-author Model: Messages can be authored by either users or admins (never both on same message)
- Message Editing: Both users and admins can edit their own messages (with edit timestamp tracking)
- Admin-only Deletion: Admins can delete their own messages only (not for general moderation)
- Status-aware Sending: Messages cannot be sent to Resolved or Closed tickets
Message Lists
Messages are returned with a simple limit-based approach:
- User-facing API: Returns messages with enriched author metadata (name, avatar, email) via the FrontendTicketMessage type
- Admin-facing API: Returns plain TicketMessage objects without enriched metadata
- Default Limit: Both APIs use a 100-message limit per request
Architecture
Services
The module exposes two gRPC services:
- SupportService: User-facing API for creating tickets, viewing own tickets, sending messages, and closing own tickets
- SupportAdminService: Admin-facing API with capabilities including viewing all tickets, sending responses, editing/deleting own messages, and manually resolving/closing any ticket
All service methods follow the Processor pattern for consistency with the rest of the codebase.
RBAC Integration
Admin operations integrate with the manage module’s role-based access control system:
- Allowed Roles: SuperAdmin, Moderator, CustomerSupport, and SupportBot can access all admin operations
- Audit Logging: All admin operations are automatically logged via the RecordedAdminOperation wrapper
- Operation Validation: Each operation validates the admin's role before execution
Status Transition Logic
Ticket statuses automatically transition based on message activity:
- User sends message: Pending/Resolved tickets become Open (indicating new user activity requiring attention)
- Admin replies: Open tickets become Pending (indicating admin has responded)
- Admin closes ticket: Any non-Closed ticket can be closed manually
- Closed/Resolved tickets: Cannot receive new messages (enforced by validation)
This state machine ensures tickets naturally reflect their support workflow status without manual updates.
Input Validation
The module includes comprehensive input validation:
- Content validation: Message length limits, empty content checks (enforced at gRPC layer)
- Ownership checks: Users can only access their own tickets and messages
- Status checks: Operations respect ticket status (e.g., no messages on closed tickets)
Integration Points
With Auth and Manage Modules
- Ticket ownership is tied to user UUIDs from the auth system
- User authentication context is passed through auth::rpc::middleware::UserId to determine ticket access
- Admin authentication is passed through manage::rpc::middleware::AdminId for admin operations
- Admin authorization is verified through the manage module's RBAC system
Database Schema
The module uses its own support PostgreSQL schema with:
- ticket table: Core ticket data with status enum and priority enum
- ticket_message table: Threaded messages with dual-author columns (user_author_id OR admin_author_id)
- Foreign key relationships to auth module's user tables
Development Notes
- All APIs follow the Processor pattern (not OOP)
- The module does NOT use static lifetimes for database connections
- Ticket IDs are UUIDs, but message IDs are auto-incrementing integers for simpler database management
- Message ownership validation happens at the service layer, not database layer
- The FrontendTicketMessage type includes enriched author metadata (name, avatar, email) joined from auth module tables
- Admin operations use the impl_recorded_admin_processor! macro to automatically wrap operations with audit logging
Ticket System
Ticket System provides structured two-way communication between users and administrators for customer support. Each ticket is a conversation thread with lifecycle management, priority levels, and automatic status transitions based on message activity.
Core Concept
A Ticket represents:
- Conversation thread: Series of messages between a user and admin team
- Priority: Low, Medium, or High (set by user at creation, helps admins triage)
- Status: Open, Pending, Resolved, or Closed (automatically transitions based on activity)
- Ownership: Tied to a specific user; only that user and admins can access it
- Title: Brief description for quick identification in ticket lists
Ticket Lifecycle
Status States
- Open: User has sent a message, awaiting admin response
- Pending: Admin has replied, awaiting user feedback
- Resolved: Admin has marked the issue as resolved
- Closed: Ticket is archived, no further messages allowed
Automatic Transitions
Status changes automatically based on who sends messages:
- User sends message → Open (even if previously Pending or Resolved)
- Admin replies to Open ticket → Pending
- Admin manually resolves → Resolved
- Admin manually closes → Closed
This state machine ensures tickets naturally reflect their support workflow without manual status management for most interactions.
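A minimal sketch of this state machine is shown below, using the on_user_message/on_admin_reply method names mentioned later in the Development Notes; the actual TicketStatus implementation may differ in detail.

```rust
// Sketch of the automatic transitions described above; illustrative only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TicketStatus {
    Open,
    Pending,
    Resolved,
    Closed,
}

impl TicketStatus {
    /// A new user message reopens the ticket (Closed tickets cannot receive messages).
    fn on_user_message(self) -> Self {
        match self {
            TicketStatus::Closed => TicketStatus::Closed,
            _ => TicketStatus::Open,
        }
    }

    /// An admin reply moves an Open ticket to Pending; other states are left unchanged.
    fn on_admin_reply(self) -> Self {
        match self {
            TicketStatus::Open => TicketStatus::Pending,
            other => other,
        }
    }
}
```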
Message Restrictions
- Resolved/Closed tickets: Cannot receive new messages
- Open/Pending tickets: Both users and admins can send messages
- Users can only edit their own messages; admins can only edit their own messages
- Admins can only delete their own messages (not for general moderation)
Architecture
Dual Service Design
The module exposes two separate gRPC services:
SupportService (User-facing):
- Create tickets
- List own tickets
- View ticket details (ownership validated)
- Send and edit own messages
- Close own tickets
SupportAdminService (Admin-facing):
- List all tickets (with pagination and optional unreplied filter)
- View any ticket detail
- Send admin responses
- Edit own messages
- Delete own messages
- Manually close or resolve tickets
This separation ensures clear permission boundaries and prevents accidental exposure of admin capabilities.
RBAC and Audit Logging
Admin operations integrate with the manage module’s RBAC system:
- Allowed Roles: SuperAdmin, Moderator, CustomerSupport, and SupportBot
- Audit Trail: All admin operations are logged with operation name, target, and input parameters
- Authorization: Role verification happens before operation execution via
AdminOperationtrait
Message Threading
Messages within a ticket follow these rules:
- Dual-author model: Each message has either user_author_id OR admin_author_id (never both)
- Edit tracking: The edited_at timestamp records when a message was last modified
- Author metadata:
  - User-facing API returns FrontendTicketMessage with enriched author data (name, avatar, email) joined from auth tables
  - Admin-facing API returns plain TicketMessage objects without enriched metadata
- Sequential IDs: Messages use auto-incrementing integers for efficient database operations
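For illustration, a plausible shape of a message record under this model (types and optional columns are assumptions consistent with the description above):

```rust
// Illustrative shape of a ticket message under the dual-author model.
use chrono::{DateTime, Utc};
use uuid::Uuid;

struct TicketMessage {
    id: i64,                          // auto-incrementing integer ID
    ticket_id: Uuid,                  // tickets use UUIDs
    user_author_id: Option<Uuid>,     // set when a user wrote the message
    admin_author_id: Option<Uuid>,    // set when an admin wrote the message (never both)
    content: String,
    sent_at: DateTime<Utc>,
    edited_at: Option<DateTime<Utc>>, // populated when the message is edited
}

impl TicketMessage {
    /// Exactly one author column must be set.
    fn has_valid_author(&self) -> bool {
        self.user_author_id.is_some() ^ self.admin_author_id.is_some()
    }
}
```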
Message Retrieval
Messages are returned with a simple limit-based approach:
- Default Limit: 100 messages per request
- Ordering: Messages are ordered by the sent_at timestamp in descending order (newest first)
- No Pagination: Currently there is no pagination support; all messages up to the limit are returned at once
Access Control
User Permissions
- Users can only view tickets they created (validated by user_id match)
- Users can only edit messages they authored
- Ticket ownership is checked on every operation (list, view, send message)
- Failed ownership checks return empty results or permission denied errors
Admin Permissions
- Admins can access all tickets through SupportAdminService
- Admin permissions are verified through the manage module's RBAC system
- Admins can perform operations like viewing all tickets, sending responses, and manually changing ticket status
- Admins can only edit or delete their own messages (ownership is checked)
Validation Layer
Input validation includes:
- Title validation: Non-empty, length limits (enforced at gRPC layer)
- Content validation: Non-empty, length limits for messages (enforced at gRPC layer)
- Ownership checks: Service-level validation verifies user owns the ticket/message before operations
Integration Points
Auth Module
- Ticket user_id references users from the auth module
- User authentication context flows through auth::rpc::middleware::UserId for user operations
- Admin authentication flows through manage::rpc::middleware::AdminId for admin operations
Database Schema
Lives in support PostgreSQL schema:
- ticket table: Core ticket data with custom enums for priority and status
- ticket_message table: Messages with dual-author columns
- Foreign keys to auth module's user tables for referential integrity
Development Notes
- Ticket IDs are UUIDs for global uniqueness
- Message IDs are integers for simpler database management
- All service operations use the Processor pattern
- Status transition logic is implemented in TicketStatus enum methods (on_user_message(), on_admin_reply())
- Status transitions happen automatically within the CreateTicketMessage database transaction
- The user-facing API gets enriched message data (FrontendTicketMessage) with author metadata joined from user/admin tables
- Admin operations use the impl_recorded_admin_processor! macro for automatic audit logging
- Admin message edit/delete operations enforce ownership validation at the service layer
Email Sender
The Email Sender (mailer) module provides email delivery capabilities for the entire system. It operates as an asynchronous event-driven service that consumes email send requests from other modules via RabbitMQ and delivers them through a configured SMTP server.
Core Features
SMTP Integration
The module uses standard SMTP protocol for email delivery:
- STARTTLS Support: Configurable TLS encryption for secure email transmission
- Authentication: Username/password-based SMTP authentication
- Plain Connection Option: Support for unencrypted connections (for testing or trusted networks)
- Dynamic Configuration: SMTP settings can be reloaded without service restart
Event-Driven Architecture
All email sending is triggered through RabbitMQ events:
- Asynchronous Processing: Email sending doesn’t block the calling service
- Reliable Delivery: RabbitMQ ensures events are not lost if the mailer is temporarily unavailable
- Decoupled Design: Other modules don’t need direct dependencies on mailer code
Template System
Pre-defined email templates with embedded assets:
- HTML Templates: Rich HTML email templates with inline styling
- Embedded Assets: Logo and other assets are compiled into the binary (no external files needed)
- Template Types: Register verification, OTP codes, password reset, package expiration notifications
Architecture
Configuration Storage
SMTP configuration is stored in Redis using the Manage module’s config provider system:
- Config Key: mailer
- Hot Reload: The module periodically checks Redis for config changes and rebuilds the SMTP connection if needed
- Sensitive Data Protection: Passwords are redacted in debug logs
Configuration fields:
- host: SMTP server hostname
- port: SMTP server port (typically 587 for STARTTLS, 25 for plain)
- username: SMTP authentication username
- password: SMTP authentication password
- sender: Default sender email address (used when from is not specified)
- starttls: Whether to use STARTTLS encryption
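For illustration, the sketch below turns these fields into an SMTP transport using the lettre crate; whether the mailer module actually uses lettre, and how it handles reconnects and errors, is not specified here.

```rust
// Sketch only: build an SMTP transport from the configuration fields above.
use lettre::transport::smtp::authentication::Credentials;
use lettre::SmtpTransport;

struct MailerConfig {
    host: String,
    port: u16,
    username: String,
    password: String,
    sender: String, // default From address; unused in this sketch
    starttls: bool,
}

fn build_transport(cfg: &MailerConfig) -> Result<SmtpTransport, lettre::transport::smtp::Error> {
    let creds = Credentials::new(cfg.username.clone(), cfg.password.clone());
    let builder = if cfg.starttls {
        // STARTTLS upgrade on the configured port (typically 587)
        SmtpTransport::starttls_relay(&cfg.host)?
    } else {
        // Plain, unencrypted connection (testing or trusted networks only)
        SmtpTransport::builder_dangerous(&cfg.host)
    };
    Ok(builder.port(cfg.port).credentials(creds).build())
}
```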
Event Consumption
The module consumes two types of events:
- Generic Email Events (MailSendCall): Direct email sending with full control over content
  - Queue: helium_mailer_send
  - Exchange: mailer (Direct)
  - Routing Key: send
- Template-based Email Events: Events from other modules (primarily Auth and Telecom) that trigger template rendering
- Register verification emails
- OTP email codes
- Password reset emails
- Package expiration notifications
Hooks (Event Processors)
Each hook implements the Processor pattern to handle specific event types:
- MailerHook: Core processor for MailSendCall events, handles actual SMTP transmission
- RegisterEmailHook: Processes user registration verification emails
- OtpEmailHook: Processes one-time password emails for 2FA
- ForgotPasswordEmailHook: Processes password reset emails
- AllPackageExpiredHook: Processes telecom package expiration notifications
All hooks follow the standard pattern: receive event → render template (if needed) → publish MailSendCall → SMTP delivery.
Integration Points
With Auth Module
The mailer consumes three types of email events from Auth:
- RegisterEmailSendCall: When a new user registers
- OtpEmailSendCall: When OTP authentication is required
- ForgotPasswordEmailSendCall: When a user requests a password reset
With Telecom Module
- AllPackageExpiredEvent: Sent when all of a user's telecom packages expire
With Manage Module
- Uses Manage's config_provider to fetch SMTP configuration from Redis
- SMTP settings can be updated via Manage's admin configuration APIs
Email Sending Flow
When another module needs to send an email:
- The calling module publishes an event (either a specific template event or a generic MailSendCall)
- If it's a template event, the corresponding hook receives it and:
  - Fetches necessary data from the database (e.g., user email)
  - Renders the HTML template
  - Publishes a MailSendCall event
- MailerHook receives the MailSendCall event
- The SMTP connection is used to transmit the email
- Errors are logged but don’t crash the worker (events can be retried by RabbitMQ)
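For illustration, publishing a generic email event with the lapin AMQP client could look roughly like this, using the exchange and routing key listed under Event Consumption; the real MailSendCall type and its wire format are assumptions.

```rust
// Sketch only: the MailSendCall fields and JSON serialization are assumptions.
use lapin::{options::BasicPublishOptions, BasicProperties, Channel};
use serde::Serialize;

#[derive(Serialize)]
struct MailSendCall {
    to: String,           // assumed field
    subject: String,      // assumed field
    html_body: String,    // assumed field
    from: Option<String>, // optional sender override (see Development Notes)
}

async fn publish_mail_send(channel: &Channel, call: &MailSendCall) -> anyhow::Result<()> {
    let payload = serde_json::to_vec(call)?; // the actual wire format may differ
    channel
        .basic_publish(
            "mailer",                        // exchange (Direct)
            "send",                          // routing key
            BasicPublishOptions::default(),
            &payload,
            BasicProperties::default(),
        )
        .await?; // publisher confirm ignored in this sketch
    Ok(())
}
```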
Development Notes
- All email content is sent as HTML with Content-Type: text/html
- The SMTP sender can be overridden per-email via the from field in MailSendCall
- Email templates use the askama templating engine
- The module doesn't expose any gRPC services (it's worker-only)
- All processors follow the Processor pattern (not OOP)
- Email addresses are validated before sending; invalid addresses result in Error::InvalidInput
- The module doesn't track email delivery status or bounces (delegated to the SMTP server)
Worker Setup
The mailer runs as a dedicated worker process that needs:
- PostgreSQL connection (for template data like user emails)
- Redis connection (for SMTP configuration)
- RabbitMQ connection (for event consumption)
Refer to the helium-server deployment documentation for worker configuration details.
Built-in WAF
Shield implements three main security mechanisms:
- Hashcash-based CAPTCHA - Proof-of-work challenges to prevent automated attacks
- Rate Limiting - Token bucket algorithm to control API request rates
- Anti-XSS Utilities - Input sanitization functions to prevent cross-site scripting
Components
Hashcash CAPTCHA
A proof-of-work based CAPTCHA system that requires clients to solve computational puzzles before performing sensitive operations. The frontend requests a challenge with specified difficulty and TTL, solves it locally, and submits the solution for verification.
Use cases:
- Protecting registration/login endpoints
- Preventing automated form submissions
- Rate limiting expensive operations
The challenge-response cycle is stateful and stored in Redis with configurable expiration.
Rate Limiting Middleware
Per-user rate limiting using the token bucket algorithm. The middleware can be applied to any gRPC endpoint to automatically enforce rate limits based on user identity.
Configuration parameters:
- api_name: Identifier for the rate limit bucket
- capacity: Maximum tokens available
- refill: Token refill rate (tokens per second)
- cost: Tokens consumed per request
- ttl: Bucket expiration time
Rate limit state is maintained in Redis using atomic Lua scripts. The middleware extracts user identity from request extensions (requires authentication middleware to run first).
Anti-XSS Utilities
Three sanitization functions for different content types:
- anti_xss_text(): Basic HTML entity encoding for plain text
- anti_xss_markdown(): Sanitizes markdown while preserving safe formatting, with allowlist-based filtering for images and links
- anti_xss_enhanced(): Advanced protection that detects and removes script injections, event handlers, and dangerous schemes
When to use:
- Sanitize user-generated content before storage or display
- Clean markdown in announcements, tickets, or comments
- Validate URLs and embedded content
Architecture Notes
Redis Dependency
Both hashcash and rate limiting rely on Redis for state management. The hashcash service stores active challenges, while rate limiting maintains token bucket state with atomic updates.
Middleware Integration
RateLimitLayer is a Tower middleware that integrates with the gRPC server stack. It must be placed after authentication middleware to access user identity from request extensions.
Performance Considerations
- Hashcash difficulty should be tuned based on client capabilities (mobile vs desktop)
- Rate limit refill rates should balance user experience with system protection
- Anti-XSS functions are synchronous and relatively lightweight, safe to use in request paths
Rate Limiter
The Rate Limiter provides per-user request throttling using the token bucket algorithm. It’s designed to protect API endpoints from abuse while maintaining good user experience through gradual token refill.
Purpose
Rate limiting prevents users from overwhelming the system with too many requests. Unlike simple fixed-window counters, the token bucket algorithm allows for burst traffic while maintaining average rate limits over time.
How It Works
Token Bucket Algorithm
Each user gets a virtual “bucket” of tokens stored in Redis:
- Capacity: Maximum tokens the bucket can hold
- Refill Rate: Tokens added per second
- Cost: Tokens consumed per request
- TTL: Bucket expiration time (for cleanup of inactive users)
When a request arrives, the system checks if enough tokens are available. If yes, tokens are consumed and the request proceeds. If no, the request is rejected with RESOURCE_EXHAUSTED status.
Tokens gradually refill over time based on the refill rate, allowing users to make requests again after waiting.
Per-User Enforcement
Rate limits are enforced per-user, identified by their user ID from the authentication middleware. Each API endpoint can have its own rate limit configuration with a unique identifier.
Bucket keys in Redis follow the pattern: rate_limit:API[{api_name}]-{user_id}
Integration
As Middleware
RateLimitLayer is a Tower middleware that wraps individual gRPC service methods. Apply it to specific endpoints that need rate limiting:
use shield::rpc::middleware::{RateLimitLayer, RateLimitConfig};
use std::time::Duration;
let rate_limit = RateLimitLayer::new(
redis_connection,
RateLimitConfig {
api_name: "CreateOrder",
capacity: 10.0,
refill: 1.0, // 1 token per second
cost: 1.0,
ttl: Duration::from_secs(3600),
}
);
let service = OrderServiceServer::new(order_service)
.layer(rate_limit);
Configuration Parameters
- api_name: Unique identifier for the rate limit bucket (used in Redis key)
- capacity: Maximum burst size (how many requests can be made immediately)
- refill: Token recovery rate in tokens per second
- cost: Tokens consumed per request (usually 1.0, but can be higher for expensive operations)
- ttl: How long to keep inactive buckets in Redis
Tuning Guidelines
High-frequency, lightweight endpoints:
- Capacity: 30-100
- Refill: 10-30 tokens/second
- Cost: 1.0
Moderate-frequency endpoints:
- Capacity: 10-20
- Refill: 1-5 tokens/second
- Cost: 1.0
Expensive operations (e.g., report generation):
- Capacity: 3-5
- Refill: 0.1-0.5 tokens/second
- Cost: 1.0
Architecture
Redis-Backed State
Rate limit state is stored in Redis hash structures with two fields:
- tokens: Current token count (float)
- last_refill: Last refill timestamp (Unix timestamp)
A Lua script executes atomically on Redis to:
- Calculate elapsed time since last refill
- Add refilled tokens (capped at capacity)
- Check if enough tokens available
- Consume tokens if allowed
- Update state and TTL
This ensures thread-safe, distributed rate limiting across multiple server instances.
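For illustration, the refill-and-consume step could be implemented as a Lua script invoked through the redis crate, roughly as below; the script that actually ships with shield may differ in key layout and argument order.

```rust
// Sketch of an atomic token-bucket check; illustrative, not the shipped script.
use redis::{aio::MultiplexedConnection, Script};

async fn try_consume(
    conn: &mut MultiplexedConnection,
    bucket_key: &str, // e.g. "rate_limit:API[CreateOrder]-<user_id>"
    capacity: f64,
    refill_per_sec: f64,
    cost: f64,
    now_secs: f64,
    ttl_secs: i64,
) -> redis::RedisResult<bool> {
    let script = Script::new(
        r#"
        local tokens = tonumber(redis.call('HGET', KEYS[1], 'tokens'))
        local last = tonumber(redis.call('HGET', KEYS[1], 'last_refill'))
        local capacity = tonumber(ARGV[1])
        local refill = tonumber(ARGV[2])
        local cost = tonumber(ARGV[3])
        local now = tonumber(ARGV[4])
        if tokens == nil then tokens = capacity last = now end
        tokens = math.min(capacity, tokens + (now - last) * refill)
        local allowed = 0
        if tokens >= cost then tokens = tokens - cost allowed = 1 end
        redis.call('HSET', KEYS[1], 'tokens', tokens, 'last_refill', now)
        redis.call('EXPIRE', KEYS[1], ARGV[5])
        return allowed
        "#,
    );
    let allowed: i64 = script
        .key(bucket_key)
        .arg(capacity)
        .arg(refill_per_sec)
        .arg(cost)
        .arg(now_secs)
        .arg(ttl_secs)
        .invoke_async(conn)
        .await?;
    Ok(allowed == 1)
}
```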
Authentication Dependency
The middleware requires UserId to be present in the request extensions, which is set by the authentication middleware. Therefore:
- Rate limiting middleware must be applied after authentication middleware in the layer stack
- Unauthenticated requests will be rejected before rate limiting is evaluated
- Anonymous/public endpoints cannot use this middleware (use hashcash CAPTCHA instead)
Error Handling
When rate limit is exceeded, the middleware returns:
- Status: RESOURCE_EXHAUSTED
- Message: "Rate limit exceeded"
Frontend should handle this gracefully by:
- Displaying user-friendly messages
- Implementing exponential backoff for retries
- Showing remaining quota if applicable (requires separate API)
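A small client-side sketch of that backoff pattern (the retry count and delays are illustrative, not prescribed by the API):

```rust
// Retry a call with exponential backoff when the server answers RESOURCE_EXHAUSTED.
use std::time::Duration;
use tonic::{Code, Status};

async fn with_backoff<T, F, Fut>(mut call: F) -> Result<T, Status>
where
    F: FnMut() -> Fut,
    Fut: std::future::Future<Output = Result<T, Status>>,
{
    let mut delay = Duration::from_millis(500);
    for _ in 0..4 {
        match call().await {
            Err(status) if status.code() == Code::ResourceExhausted => {
                tokio::time::sleep(delay).await;
                delay *= 2; // exponential backoff
            }
            other => return other,
        }
    }
    call().await // final attempt
}
```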
When to Use Rate Limiting
Good use cases:
- Order creation and payment endpoints
- Data export and report generation
- Account modification operations
- Support ticket creation
When NOT to use:
- Read-only queries (unless very expensive)
- Authentication endpoints (use hashcash instead)
- Static content delivery
- Public announcement viewing
Implementation Notes
- Uses the Processor pattern for the core rate limiting logic
- Middleware integrates with Tower/tonic service stack
- All Redis operations are atomic via Lua scripts
- Float precision is used for tokens to support fractional costs and refill rates
- TTL prevents memory leaks from abandoned user sessions
Hashcash Captcha
The Hashcash Captcha provides proof-of-work based challenge-response authentication to prevent automated attacks. Unlike traditional image-based CAPTCHAs, it requires clients to perform computational work, making it accessible and bot-resistant.
Purpose
Hashcash CAPTCHA protects sensitive operations from automated abuse by requiring clients to solve a cryptographic puzzle. The difficulty can be tuned to balance security and user experience - higher difficulty means more computation time required.
How It Works
Proof-of-Work Challenge
The system uses a challenge-response protocol:
- Client requests a challenge - Specifies desired difficulty (19-35) and time-to-live
- Server generates a random 32-byte challenge - Stores it in Redis with the specified TTL
- Client solves the puzzle - Finds a nonce value that when hashed with the challenge produces a hash with the required number of leading zero bits
- Client submits the solution - Sends back the challenge ID and nonce
- Server verifies - Checks the solution and deletes the challenge if valid (one-time use)
The difficulty parameter controls how many leading zero bits are required in the SHA256 hash output. Each increment doubles the expected computation time.
Challenge Lifecycle
Challenges are stateful and stored in Redis:
- Each challenge has a unique 16-byte ID
- TTL ensures challenges expire and don’t accumulate
- Successfully verified challenges are deleted immediately (prevents replay attacks)
- Expired or non-existent challenges return a NotFound result
Frontend Integration
Basic Workflow
1. Before submitting sensitive operation (login, registration, etc.)
2. Call RequestCaptcha with difficulty and TTL
3. Receive challenge_id, challenge bytes, and difficulty
4. Solve: Find nonce where SHA256(challenge + nonce) has required leading zero bits
5. Call VerifyCaptcha with challenge_id and nonce
6. Check result: Pass/Fail/NotFound
7. If Pass, proceed with the protected operation
Solving the Challenge
Frontend must implement the proof-of-work algorithm:
- Hash function: SHA256
- Input: challenge_bytes + nonce_bytes (nonce as an 8-byte big-endian u64)
- Goal: Find a nonce where the hash has difficulty leading zero bits
- Method: Brute force, incrementing the nonce from 0 until a valid hash is found
JavaScript/TypeScript developers should use the Web Crypto API or a library like crypto-js for SHA256 hashing. For performance-critical cases, consider using WebAssembly.
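A minimal sketch of the solving loop, written in Rust for illustration (a web client would implement the same logic in JavaScript or WebAssembly), assuming the input layout described above:

```rust
// Brute-force proof-of-work solver for SHA256(challenge || nonce_be_u64).
use sha2::{Digest, Sha256};

fn leading_zero_bits(hash: &[u8]) -> u32 {
    let mut bits = 0;
    for byte in hash {
        if *byte == 0 {
            bits += 8;
        } else {
            bits += byte.leading_zeros(); // zeros within this byte
            break;
        }
    }
    bits
}

fn solve(challenge: &[u8], difficulty: u32) -> u64 {
    let mut nonce: u64 = 0;
    loop {
        let mut hasher = Sha256::new();
        hasher.update(challenge);
        hasher.update(nonce.to_be_bytes());
        if leading_zero_bits(&hasher.finalize()) >= difficulty {
            return nonce;
        }
        nonce += 1;
    }
}
```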
Difficulty Guidelines
Difficulty 19-22 (Very Easy):
- Solves in <100ms on modern devices
- Use for non-critical operations or mobile clients
Difficulty 23-26 (Easy to Medium):
- Solves in 100ms - 1 second
- Good balance for most use cases (login, registration)
Difficulty 27-30 (Medium to Hard):
- Solves in 1-10 seconds
- Use for high-value operations (password reset, account deletion)
Difficulty 31-35 (Very Hard):
- Solves in 10+ seconds
- Use sparingly, mainly for administrative or highly sensitive operations
Note: Difficulty is exponential - each +1 doubles the average solve time.
TTL Recommendations
Set TTL based on expected solve time plus user think time:
- Easy challenges: 30-60 seconds
- Medium challenges: 60-120 seconds
- Hard challenges: 120-300 seconds
- Too short: Users may time out while solving
- Too long: Increases Redis memory usage and the replay attack window
Verification Results
Three possible outcomes from VerifyCaptcha:
- Pass: Solution is correct, challenge consumed and deleted
- Fail: Solution is incorrect, challenge remains valid (can retry)
- NotFound: Challenge expired, already used, or never existed
Frontend should handle NotFound by requesting a new challenge, and Fail by continuing to solve or requesting a new one.
Architecture
Redis-Backed Storage
Challenges are stored in Redis with keys: hashcash:{hex(challenge_id)}
Each challenge contains:
- 16-byte unique ID (random)
- 32-byte random challenge data
- Difficulty value (19-35)
- TTL expiration
Redis automatically cleans up expired challenges, and successful verifications delete them immediately to prevent replay attacks.
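For illustration, generating and storing a challenge could look roughly like the sketch below; the key prefix follows the pattern above, while the value encoding and RNG choices are assumptions.

```rust
// Sketch only: store a one-time challenge under hashcash:{hex(id)} with a TTL.
use rand::RngCore;
use redis::aio::MultiplexedConnection;

async fn store_challenge(
    conn: &mut MultiplexedConnection,
    difficulty: u8,
    ttl_secs: u64,
) -> redis::RedisResult<([u8; 16], [u8; 32])> {
    let mut id = [0u8; 16];
    let mut challenge = [0u8; 32];
    rand::thread_rng().fill_bytes(&mut id);
    rand::thread_rng().fill_bytes(&mut challenge);

    // Value encoding is illustrative: difficulty byte followed by challenge bytes.
    let mut value = Vec::with_capacity(33);
    value.push(difficulty);
    value.extend_from_slice(&challenge);

    let _: () = redis::cmd("SET")
        .arg(format!("hashcash:{}", hex::encode(id)))
        .arg(value)
        .arg("EX")
        .arg(ttl_secs)
        .query_async(conn)
        .await?;
    Ok((id, challenge))
}
```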
Stateless and Distributed
The system is stateless from the application perspective - all state lives in Redis. This allows multiple server instances to handle challenge requests and verifications without coordination.
When to Use Hashcash
Good use cases:
- Registration and login endpoints
- Password reset requests
- Anonymous or pre-authentication operations
- Expensive public endpoints (email sending, report generation)
- Rate limiting supplement for unauthenticated users
When NOT to use:
- Authenticated user operations (use rate limiting instead)
- High-frequency read operations
- Mobile-first applications (difficulty must be tuned lower)
- Accessibility-critical features (consider alternatives)
Implementation Notes
- Uses the Processor pattern for challenge creation and verification
- Challenge IDs must be exactly 16 bytes
- Difficulty validation: must be in range [19, 35] inclusive
- TTL validation: must be at least 1 second
- One challenge = one verification (single-use)
- Verification is atomic: check and delete happen together
- All cryptographic operations use SHA256
Anti-XSS
The Anti-XSS utilities provide input sanitization functions to protect against cross-site scripting (XSS) attacks. These are synchronous, lightweight functions designed to sanitize user-generated content before storage or display.
Purpose
XSS attacks occur when malicious scripts are injected into web pages through user input. The anti-XSS utilities help prevent these attacks by removing or encoding potentially dangerous content while preserving legitimate formatting and functionality.
Sanitization Functions
anti_xss_text
Basic HTML entity encoding for plain text content. Converts special characters to their HTML entity equivalents to prevent HTML/script injection.
Encoded characters:
<, >, &, ", ', /
When to use:
- Plain text user inputs (usernames, comments, descriptions)
- Content that should contain no HTML or markdown
- Simple text fields where formatting is not needed
anti_xss_markdown
Sanitizes markdown while preserving safe formatting. Uses allowlist-based filtering to keep legitimate markdown features while blocking dangerous content.
Features:
- Strips all HTML tags completely
- Validates image URLs against domain allowlist (configurable in code)
- Validates link schemes (allows http, https, and mailto only)
- Preserves safe markdown formatting (headers, lists, emphasis, etc.)
- Removes images from disallowed domains
- Strips URLs from links with dangerous schemes (but keeps link text)
When to use:
- Markdown content in announcements
- Support ticket descriptions
- User comments that support formatting
- Any user-generated markdown that will be rendered
Domain Allowlist:
The function checks image URLs against ALLOWED_IMAGE_DOMAINS constant. Modify this list in the source code to add trusted image hosting services.
anti_xss_enhanced
Advanced protection that actively detects and removes script injections before encoding. Combines pattern matching with entity encoding for multi-layered defense.
Protected against:
- <script> tags (including multiline)
- Dangerous URL schemes (javascript:, vbscript:, data:)
- Event handler attributes (onclick, onload, etc.)
- Embedded objects (<iframe>, <object>, <embed>)
Process:
- Detects and removes script patterns using regex
- Applies HTML entity encoding to remaining content
When to use:
- Rich text content from untrusted sources
- Content that might contain complex HTML
- Extra protection layer for high-risk inputs
- Any content where XSS risk is elevated
Integration
Backend Usage
These are utility functions exported from the shield::utils::anti_xss module. Call them directly in your service logic before storing or processing user input:
use shield::utils::anti_xss::{anti_xss_text, anti_xss_markdown, anti_xss_enhanced};
// Sanitize before storage
let clean_username = anti_xss_text(&user_input);
let clean_announcement = anti_xss_markdown(&announcement_text);
let clean_content = anti_xss_enhanced(&rich_text);
Frontend Considerations
Backend sanitization is the primary defense, but frontend should also:
- Validate input before submission
- Use appropriate HTML escaping when rendering
- Leverage framework-level XSS protection (React’s JSX escaping, Vue’s template binding, etc.)
- Never use innerHTML or equivalents with user content
- Use markdown renderers with built-in sanitization
When to Apply Sanitization
Always sanitize:
- User-submitted text, markdown, or HTML
- Content from external APIs or third-party sources
- Any data that will be displayed to other users
- Content stored in databases that renders on frontend
Timing:
- Before storage (recommended): Sanitize once when data enters the system
- Before display: Sanitize when rendering if storage is raw
Defense in depth: For high-risk scenarios, apply multiple layers:
- Frontend validation
- Backend sanitization (these functions)
- Frontend rendering protection (framework escaping)
Architecture Notes
Synchronous and Lightweight
All three functions are synchronous and suitable for use in request handlers. They don’t perform I/O or heavy computation, making them safe to call inline during request processing.
Regex-Based Detection
The sanitization uses regex patterns for detection. The patterns are compiled once using lazy_static for performance. Complex patterns (multiline, case-insensitive) ensure robust detection of script injection attempts.
No External Dependencies
The utilities rely only on standard Rust regex and URL parsing libraries. No external sanitization services or heavyweight parsers are required.
Limitations
These utilities provide solid protection for common XSS vectors, but they are not a complete security solution:
- Not a WAF: These functions handle input sanitization, not request-level filtering
- Not HTML parsing: They use regex-based pattern matching, not full HTML/markdown parsers
- Allowlist maintenance: Image domain allowlist must be manually maintained in code
- Markdown edge cases: Complex or malformed markdown might not be handled perfectly
Additional security measures:
- Use Content Security Policy (CSP) headers in frontend
- Apply proper output encoding in templates
- Enable framework-level XSS protection
- Regular security audits of user-facing features
- Keep dependencies updated
Implementation Notes
- Functions are pure and stateless (no side effects)
- All functions return new strings (input is not modified)
- Empty strings are handled safely
- Regex compilation errors have fallback patterns
- Domain checking for images uses full URL parsing
- Link text is preserved even when URL is removed
Basic Design
At a high level, two frontends, the Admin Dashboard and the User Web Application, consume the APIs exposed by the helium-server crate described below.
helium-server Crate
The Helium server is designed as a multi-mode worker system that can run different components independently or together, enabling flexible deployment strategies. Each worker mode serves a specific purpose in the overall system architecture.
Architecture
Worker Modes
The server supports six distinct worker modes:
| Worker Mode | Port | Description | Use Case |
|---|---|---|---|
grpc | 50051 | gRPC API server | Main API for client applications and admin panels |
subscribe_api | 8080 | RESTful subscription API | Public subscription endpoints |
webhook_api | 8081 | RESTful webhook handler | Payment provider callbacks, third-party integrations |
consumer | - | Background message consumer | Processing async tasks from message queue |
mailer | - | Email service worker | Sending emails and notifications |
cron_executor | - | Scheduled task executor | Running periodic maintenance tasks |
Dependencies
The server requires three core infrastructure components:
- PostgreSQL: Primary database for persistent data
- Redis: Caching, session storage, and temporary data
- RabbitMQ (AMQP): Message queuing for async processing
Module Integration
The server integrates all business logic modules:
- auth: Authentication and authorization
- shop: E-commerce and billing
- telecom: VPN node management and traffic handling
- market: Affiliate and marketing systems
- notification: Announcements and messaging
- support: Customer support tickets
- manage: Administrative functions
- shield: Security and anti-abuse measures
Deployment Guide
Prerequisites
- PostgreSQL, Redis, and RabbitMQ servers accessible
- SQLx CLI: cargo install sqlx-cli --no-default-features --features postgres
- Environment variables configured (see below)
Environment Configuration
The server is configured entirely through environment variables:
Required Variables
# Worker mode selection
WORK_MODE="grpc" # or subscribe_api, webhook_api, consumer, mailer, cron_executor
# Database connection
DATABASE_URL="postgres://user:password@localhost/helium_db"
# Message queue connection
MQ_URL="amqp://user:password@localhost:5672/"
# Redis connection
REDIS_URL="redis://localhost:6379"
Optional Variables
# Server listen address (for API workers)
LISTEN_ADDR="0.0.0.0:50051" # grpc mode default
LISTEN_ADDR="0.0.0.0:8080" # subscribe_api mode default
LISTEN_ADDR="0.0.0.0:8081" # webhook_api mode default
# Cron executor scan interval (seconds)
SCAN_INTERVAL="60" # cron_executor mode only
# OpenTelemetry Collector endpoint (optional, for observability)
OTEL_COLLECTOR="http://otel-collector:4317" # See Observability guide
Note: For comprehensive observability with distributed tracing and metrics, see the Observability with OpenTelemetry guide.
Database Migration
⚠️ CRITICAL: Database migrations must be run before starting the application.
# Install SQLx CLI
cargo install sqlx-cli --no-default-features --features postgres
# Apply all pending migrations
sqlx migrate run --database-url $DATABASE_URL
# Verify migration status
sqlx migrate info --database-url $DATABASE_URL
Basic Deployment
Running the Server
# Apply database migrations first
sqlx migrate run --database-url $DATABASE_URL
# Start the server with desired worker mode
WORK_MODE=grpc ./helium-server
Multiple Worker Deployment
For production, run different worker modes as separate processes:
# Terminal 1: Main gRPC API
WORK_MODE=grpc ./helium-server
# Terminal 2: Background consumer
WORK_MODE=consumer ./helium-server
# Terminal 3: Email worker
WORK_MODE=mailer ./helium-server
# Terminal 4: Cron jobs
WORK_MODE=cron_executor ./helium-server
Logging
The server uses structured logging:
# Enable debug logging
RUST_LOG=debug ./helium-server
# Production logging (default)
RUST_LOG=info ./helium-server
Developer Guide
Project Structure
server/
├── Cargo.toml # Dependencies and metadata
├── src/
│ ├── main.rs # Entry point and startup logic
│ ├── worker/ # Worker mode implementations
│ │ ├── mod.rs # Worker configuration and dispatch
│ │ ├── grpc.rs # gRPC server implementation
│ │ ├── consumer.rs # Background message consumer
│ │ ├── mailer.rs # Email service worker
│ │ ├── cron_executor.rs # Scheduled task executor
│ │ ├── subscribe_api.rs # Subscription REST API
│ │ └── webhook_api.rs # Webhook REST API
│ └── hooks/ # Extension points (currently unused)
│ └── mod.rs
Building from Source
# Development build
cd server
cargo build
# Release build (optimized)
cargo build --release
# Run with specific worker mode
WORK_MODE=grpc cargo run
Adding New Worker Modes
- Create worker implementation:
// src/worker/new_worker.rs
pub struct NewWorker {
// worker fields
}
impl NewWorker {
pub async fn initialize(args: YourArgs) -> anyhow::Result<Self> {
// initialization logic
}
pub async fn run(&self) -> anyhow::Result<()> {
// worker main loop
}
}
- Add to worker configuration:
// src/worker/mod.rs
pub enum WorkerArgs {
// existing variants...
NewWorker(YourArgs),
}
impl WorkerArgs {
pub fn load_from_env() -> anyhow::Result<Self> {
match work_mode.as_str() {
// existing modes...
"new_worker" => {
// parse environment variables
Ok(WorkerArgs::NewWorker(args))
}
}
}
pub async fn execute_worker(self) -> anyhow::Result<()> {
match self {
// existing modes...
WorkerArgs::NewWorker(args) => {
let worker = NewWorker::initialize(args).await?;
worker.run().await
}
}
}
}
gRPC Service Development
The gRPC worker automatically integrates all modules. To add new services:
-
Implement your service in the appropriate module (e.g.,
modules/your_module/) -
Add to gRPC worker:
// src/worker/grpc.rs
impl GrpcWorker {
pub async fn initialize(args: GrpcWorkModeArgs) -> Result<Self, anyhow::Error> {
// ... existing initialization ...
let your_service = YourService::new(database_processor.clone());
Ok(Self {
// ... existing fields ...
your_service,
})
}
pub fn server_ready(self) -> Router<...> {
tonic::transport::server::Server::builder()
// ... existing services ...
.add_service(YourServiceServer::new(self.your_service))
}
}
Database Migrations
Database schema is managed through SQLx migrations in the migrations/ directory. When adding new features:
- Create migration files:
# Create new migration
sqlx migrate add your_feature_name
# This creates:
# migrations/TIMESTAMP_your_feature_name.up.sql
# migrations/TIMESTAMP_your_feature_name.down.sql
- Run migrations:
# Apply migrations
sqlx migrate run --database-url $DATABASE_URL
# Revert last migration
sqlx migrate revert --database-url $DATABASE_URL
Testing
# Run all tests
cargo test
# Run specific module tests
cargo test --package helium-server
# Integration tests with database
DATABASE_URL=postgres://test_db cargo test
Performance Considerations
- Memory Usage: Each worker typically uses 40-200MB RAM
- CPU Efficiency: Single-core performance optimized, can handle 1000+ RPS
- Connection Pooling: Database connections are shared across services
- Async Processing: All I/O operations are non-blocking
Troubleshooting
Common Issues
Service won’t start:
# Check environment variables
env | grep -E "(DATABASE_URL|MQ_URL|REDIS_URL|WORK_MODE)"
# Verify database migrations are applied
sqlx migrate info --database-url $DATABASE_URL
Database migration issues:
# Check migration status
sqlx migrate info --database-url $DATABASE_URL
# Force apply migrations (if stuck)
sqlx migrate run --database-url $DATABASE_URL
# Revert last migration if needed
sqlx migrate revert --database-url $DATABASE_URL
# Reset database (CAUTION: destroys all data)
sqlx database reset --database-url $DATABASE_URL
Performance issues:
# Enable request tracing
RUST_LOG=helium_server=trace ./helium-server
# Profile with flamegraph
cargo flamegraph --bin helium-server
Logs and Debugging
# Debug logging
RUST_LOG=debug ./helium-server
# Trace specific modules
RUST_LOG=helium_server::worker::grpc=trace,info ./helium-server
Configuration Validation
Ensure all required environment variables are properly set:
# Validate configuration script
#!/bin/bash
set -e
echo "Validating Helium server configuration..."
# Check required variables
: "${WORK_MODE:?WORK_MODE not set}"
: "${DATABASE_URL:?DATABASE_URL not set}"
: "${MQ_URL:?MQ_URL not set}"
: "${REDIS_URL:?REDIS_URL not set}"
# Validate work mode
case "$WORK_MODE" in
grpc|subscribe_api|webhook_api|consumer|mailer|cron_executor)
echo "✓ Valid WORK_MODE: $WORK_MODE"
;;
*)
echo "✗ Invalid WORK_MODE: $WORK_MODE"
exit 1
;;
esac
# Check if migrations are applied
if command -v sqlx >/dev/null 2>&1; then
if sqlx migrate info --database-url "$DATABASE_URL" | grep -q "pending"; then
echo "⚠ Warning: Pending database migrations found"
echo "Run: sqlx migrate run --database-url $DATABASE_URL"
else
echo "✓ Database migrations are up to date"
fi
else
echo "⚠ Warning: sqlx CLI not found - cannot verify migrations"
echo "Install with: cargo install sqlx-cli --no-default-features --features postgres"
fi
echo "Configuration validation complete!"
External Dependencies
The Helium system requires several external services to function properly. The Helium application itself runs in Docker containers, but the core infrastructure dependencies (PostgreSQL, Redis, RabbitMQ) should be provisioned as external managed services for production deployments.
While some dependencies are core infrastructure requirements, others are module-specific and may be optional depending on your deployment configuration.
Core Infrastructure Dependencies
These dependencies are required for all Helium deployments:
1. PostgreSQL Database
Purpose: Primary data store for all application data
Version: PostgreSQL 12+ recommended
Configuration:
- Environment variable: DATABASE_URL
- Format: postgres://user:password@host:port/database
- Example: postgres://helium:password@localhost:5432/helium_db
Database Schema:
- ⚠️ CRITICAL: SQLx migrations must be run before starting the application
- All database schema changes are managed through SQLx migrations in the /migrations directory
- Use sqlx migrate run --database-url $DATABASE_URL to apply migrations
External Service Requirements:
- NOT containerized - PostgreSQL should run as an external managed service
- Recommended: Use cloud-managed PostgreSQL (AWS RDS, Google Cloud SQL, Azure Database, etc.)
- Alternative: Dedicated PostgreSQL server with proper backup and high availability setup
2. Redis
Purpose: Caching, session storage, and configuration store
Version: Redis 6+ recommended
Configuration:
- Environment variable: REDIS_URL
- Format: redis://host:port or redis://user:password@host:port
- Example: redis://localhost:6379
Usage:
- Session management and authentication tokens
- Configuration caching across modules
- Temporary data storage (OAuth challenges, etc.)
External Service Requirements:
- NOT containerized - Redis should run as an external managed service
- Recommended: Use cloud-managed Redis (AWS ElastiCache, Google Memorystore, Azure Cache, etc.)
- Alternative: Dedicated Redis server with persistence and clustering for production
3. RabbitMQ
Purpose: Message queue for asynchronous processing between modules
Version: RabbitMQ 3.8+ recommended
Configuration:
- Environment variable: MQ_URL
- Format: amqp://user:password@host:port/
- Example: amqp://helium:password@localhost:5672/
Usage:
- Inter-module communication
- Background job processing
- Event-driven architecture support
External Service Requirements:
- NOT containerized - RabbitMQ should run as an external managed service
- Recommended: Use cloud-managed message queues (AWS MQ, Google Cloud Pub/Sub, Azure Service Bus)
- Alternative: Dedicated RabbitMQ cluster with proper clustering and high availability
Module-Specific Dependencies
These dependencies are required only when using specific modules:
Auth Module - OAuth Providers (Optional)
Purpose: Social authentication (Google, Microsoft, GitHub, Discord)
Required: Only if OAuth authentication is enabled
Configuration: Stored in database/Redis configuration
Supported Providers:
- Google OAuth 2.0
- Microsoft Azure AD
- GitHub OAuth
- Discord OAuth
Setup Requirements:
- Create OAuth applications with each provider
- Configure redirect URIs to your Helium deployment
- Store client ID and secret in the system configuration
- Configure OAuth provider settings via the management interface
Configuration Structure:
{
"auth": {
"oauth_providers": {
"providers": [
{
"name": "google",
"client_id": "your-client-id",
"client_secret": "your-client-secret",
"redirect_uri": "https://your-domain.com/auth/oauth/callback"
}
],
"challenge_expiration": "5m"
}
}
}
Mailer Module - SMTP Server (Required for Email)
Purpose: Email delivery for user notifications, verification, etc.
Required: When email functionality is needed
Configuration: Stored in database/Redis configuration
SMTP Configuration:
{
"mailer": {
"host": "smtp.gmail.com",
"port": 587,
"username": "your-email@gmail.com",
"password": "your-app-password",
"sender": "noreply@your-domain.com",
"starttls": true
}
}
Supported SMTP Features:
- STARTTLS encryption
- Plain authentication
- Custom sender addresses
- HTML email templates
Common SMTP Providers:
- Gmail: smtp.gmail.com:587 (requires app passwords)
- Outlook/Hotmail: smtp-mail.outlook.com:587
- SendGrid: smtp.sendgrid.net:587
- Mailgun: smtp.mailgun.org:587
- Amazon SES: email-smtp.region.amazonaws.com:587
Shop Module - Epay Payment Provider (Required for Payments)
Purpose: Payment processing for e-commerce functionality
Required: When payment processing is needed
Configuration: Stored in database as epay provider credentials
Epay Provider Setup:
- Register with an Epay-compatible payment provider
- Obtain merchant credentials (PID, Key, Merchant URL)
- Configure webhook endpoints for payment notifications
- Add provider credentials via the management interface
Supported Payment Methods:
- Alipay (alipay)
- WeChat Pay (wxpay)
- USDT cryptocurrency (usdt)
Configuration Requirements:
{
"shop": {
"epay_notify_url": "https://your-domain.com/api/shop/epay/callback",
"epay_return_url": "https://your-domain.com/payment/success",
"max_unpaid_orders": 5,
"auto_cancel_after": "30m"
}
}
Epay Provider Database Entry:
INSERT INTO epay_provider_credentials (
display_name,
enabled_channels,
key,
pid,
merchant_url
) VALUES (
'My Payment Provider',
ARRAY['alipay', 'wxpay'],
'your-merchant-key',
1234,
'https://pay.provider.com/submit.php'
);
Development Dependencies
These are required for building and developing the project:
Protocol Buffers Compiler
Purpose: Compiling .proto files for gRPC services
Installation:
- Ubuntu/Debian: apt-get install protobuf-compiler
- macOS: brew install protobuf
- Already included in the Docker build process
SQLx CLI
Purpose: Database migration management
Installation: cargo install sqlx-cli --no-default-features --features postgres
Usage:
- Apply migrations: sqlx migrate run
- Create new migration: sqlx migrate add <name>
Docker/Kubernetes Deployment Considerations
What Should Be Containerized
✅ Containerize:
- Helium server application (helium-server)
- Application-specific components and workers
❌ Do NOT Containerize:
- PostgreSQL - Use external managed database services
- Redis - Use external managed cache services
- RabbitMQ - Use external managed message queue services
Infrastructure Handled by Platform
When deploying with Docker and Kubernetes, these infrastructure concerns are handled by the orchestration platform:
- Load Balancers: Kubernetes ingress controllers handle load balancing
- TLS Certificates: cert-manager or similar tools handle SSL/TLS
- Service Discovery: Kubernetes DNS handles service discovery
- Health Checks: Kubernetes probes handle application health monitoring
- Logging: Container runtime and logging drivers handle log aggregation
Recommended Managed Services by Cloud Provider
AWS:
- PostgreSQL: Amazon RDS for PostgreSQL
- Redis: Amazon ElastiCache for Redis
- RabbitMQ: Amazon MQ for RabbitMQ
Google Cloud:
- PostgreSQL: Cloud SQL for PostgreSQL
- Redis: Memorystore for Redis
- RabbitMQ: Cloud Pub/Sub (alternative) or third-party RabbitMQ
Azure:
- PostgreSQL: Azure Database for PostgreSQL
- Redis: Azure Cache for Redis
- RabbitMQ: Azure Service Bus (alternative) or third-party RabbitMQ
Environment Variables for Containers
apiVersion: apps/v1
kind: Deployment
metadata:
name: helium-server
spec:
template:
spec:
containers:
- name: helium-server
image: helium-server:latest
env:
- name: WORK_MODE
value: "grpc"
- name: DATABASE_URL
valueFrom:
secretKeyRef:
name: helium-secrets
key: database-url
- name: REDIS_URL
valueFrom:
secretKeyRef:
name: helium-secrets
key: redis-url
- name: MQ_URL
valueFrom:
secretKeyRef:
name: helium-secrets
key: rabbitmq-url
Security Considerations
Credentials Management
- Never store credentials in plain text
- Use Kubernetes secrets or similar secure storage
- Rotate credentials regularly
- Use dedicated service accounts with minimal permissions
Network Security
- Database: Restrict access to application subnets only
- Redis: Enable authentication and restrict network access
- RabbitMQ: Use strong passwords and enable TLS
- SMTP: Use app passwords or OAuth tokens when available
OAuth Security
- Use HTTPS for all OAuth redirect URIs
- Validate redirect URI domains strictly
- Use state parameter for CSRF protection (handled automatically)
Troubleshooting
Database Connection Issues
# Test database connectivity
psql $DATABASE_URL -c "SELECT version();"
# Check migration status
sqlx migrate info --database-url $DATABASE_URL
Redis Connection Issues
# Test Redis connectivity
redis-cli -u $REDIS_URL ping
# Check Redis memory usage
redis-cli -u $REDIS_URL info memory
RabbitMQ Connection Issues
# Check queue status
rabbitmqctl list_queues
# Check connection status
rabbitmqctl list_connections
SMTP Testing
The mailer module provides test endpoints and logging to help diagnose SMTP issues. Check application logs for detailed SMTP connection and authentication errors.
Epay Integration Issues
- Verify webhook URLs are accessible from the internet
- Check payment provider’s callback logs
- Ensure merchant credentials are correctly configured
- Validate signature verification in callback processing
Optional Observability Stack
OpenTelemetry & Grafana Stack (Optional)
Purpose: Comprehensive observability with distributed tracing, metrics, and log aggregation
Required: No - completely optional enhancement
Configuration: OTEL_COLLECTOR environment variable
Components:
- OpenTelemetry Collector: Telemetry data collection and routing
- Grafana Tempo: Distributed tracing backend
- Prometheus: Metrics storage and querying
- Grafana Loki: Log aggregation
- Grafana: Unified visualization dashboard
When to Use:
- Production deployments requiring detailed performance analysis
- Multi-instance deployments needing distributed tracing
- Teams requiring centralized observability dashboards
- Troubleshooting complex performance issues
Deployment:
- NOT containerized with application - Deploy as separate Kubernetes workloads or use Grafana Cloud
- Recommended: Deploy Grafana stack in dedicated observability namespace
- Alternative: Use managed services (Grafana Cloud, Datadog, New Relic)
Note: Helium automatically falls back to basic structured logging if OpenTelemetry is not configured. See the comprehensive Observability with OpenTelemetry guide for full setup instructions.
Summary
| Dependency | Required | Purpose | Configuration | Deployment |
|---|---|---|---|---|
| PostgreSQL | Yes | Primary database | DATABASE_URL | External managed service |
| Redis | Yes | Caching/sessions | REDIS_URL | External managed service |
| RabbitMQ | Yes | Message queuing | MQ_URL | External managed service |
| SMTP Server | Conditional | Email delivery | Database config | External service |
| OAuth Providers | Optional | Social auth | Database config | External providers |
| Epay Provider | Conditional | Payment processing | Database config | External service |
| Observability | Optional | Tracing & metrics | OTEL_COLLECTOR | External stack/cloud |
Next Steps: After setting up these dependencies, proceed to the Helium Server Deployment Guide for detailed deployment instructions.
Observability with OpenTelemetry
Helium server includes optional OpenTelemetry (OTel) integration for comprehensive observability. This integration is completely optional — the server will work perfectly fine without it using basic structured logging.
What is OpenTelemetry?
OpenTelemetry provides distributed tracing, metrics collection, and contextual logging for production systems. Use it when:
- Running multiple worker instances requiring distributed tracing
- Need detailed performance analysis and troubleshooting
- Want centralized observability dashboards
Skip it for simple deployments, development environments, or when basic logging is sufficient.
Configuration
Enable OpenTelemetry by setting the OTEL_COLLECTOR environment variable:
export OTEL_COLLECTOR="http://otel-collector:4317"
./helium-server
If not set or initialization fails, the server automatically falls back to basic logging.
Service Names
Each worker mode reports with a distinct service name:
| Worker Mode | Service Name |
|---|---|
grpc | Helium.grpc |
subscribe_api | Helium.subscribe-api |
webhook_api | Helium.webhook-api |
consumer | Helium.consumer |
mailer | Helium.mailer |
cron_executor | Helium.cron-executor |
Recommended Stack: Grafana Observability
For production deployments, we recommend the Grafana observability stack — an open-source, Kubernetes-native solution with unified dashboards for traces, metrics, and logs.
Components
- OpenTelemetry Collector: Receives and routes telemetry
- Grafana Tempo: Distributed tracing storage
- Prometheus: Metrics collection
- Grafana Loki: Log aggregation
- Grafana: Unified visualization
Deployment
Deploy the Grafana stack alongside your Kubernetes cluster:
1. Add Helm Repositories
helm repo add grafana https://grafana.github.io/helm-charts
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo add open-telemetry https://open-telemetry.github.io/opentelemetry-helm-charts
helm repo update
2. Create Namespace
kubectl create namespace observability
3. Deploy OpenTelemetry Collector
Create otel-collector-values.yaml:
mode: deployment
config:
receivers:
otlp:
protocols:
grpc:
endpoint: 0.0.0.0:4317
http:
endpoint: 0.0.0.0:4318
processors:
batch:
timeout: 10s
send_batch_size: 1024
exporters:
# Traces to Tempo
otlp/tempo:
endpoint: tempo.observability.svc.cluster.local:4317
tls:
insecure: true
# Metrics to Prometheus
prometheus:
endpoint: 0.0.0.0:8889
namespace: helium
# Logs to Loki
loki:
endpoint: http://loki.observability.svc.cluster.local:3100/loki/api/v1/push
service:
pipelines:
traces:
receivers: [otlp]
processors: [batch]
exporters: [otlp/tempo]
metrics:
receivers: [otlp]
processors: [batch]
exporters: [prometheus]
logs:
receivers: [otlp]
processors: [batch]
exporters: [loki]
ports:
otlp-grpc:
enabled: true
containerPort: 4317
servicePort: 4317
protocol: TCP
otlp-http:
enabled: true
containerPort: 4318
servicePort: 4318
protocol: TCP
metrics:
enabled: true
containerPort: 8889
servicePort: 8889
protocol: TCP
helm install otel-collector open-telemetry/opentelemetry-collector \
--namespace observability \
--values otel-collector-values.yaml
4. Deploy Tempo, Loki, and Prometheus
# Tempo for traces
helm install tempo grafana/tempo \
--namespace observability \
--set tempo.receivers.otlp.protocols.grpc.endpoint=0.0.0.0:4317
# Loki for logs
helm install loki grafana/loki-stack \
--namespace observability \
--set loki.enabled=true \
--set promtail.enabled=false
5. Deploy Prometheus
helm install prometheus prometheus-community/kube-prometheus-stack \
--namespace observability \
--set grafana.enabled=false
6. Deploy Grafana
helm install grafana grafana/grafana \
--namespace observability \
--set adminPassword=changeme
Configure data sources in Grafana to connect Tempo, Prometheus, and Loki.
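To reach the Grafana UI, a port-forward is usually enough; a sketch assuming the chart's default service name (grafana) and service port (80):
# Forward the Grafana service to localhost
kubectl port-forward -n observability svc/grafana 3000:80
# Log in at http://localhost:3000 as "admin" with the adminPassword set above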
Troubleshooting
Server logs show “Failed to initialize OpenTelemetry”
Check that the OTel Collector is reachable at the configured endpoint. The server will automatically fall back to basic logging.
Missing traces in Grafana
Verify the data pipeline: Helium → OTel Collector → Tempo. Check logs at each stage.
Performance impact
OpenTelemetry adds minimal overhead: < 2% CPU, ~10-20MB memory, < 1ms latency per request.
Disabling OpenTelemetry
Simply unset the OTEL_COLLECTOR variable — the server automatically falls back to basic logging.
Summary
OpenTelemetry in Helium is completely optional:
- Set OTEL_COLLECTOR to enable, leave unset to use basic logging
- Automatic fallback if initialization fails
- Recommended for production with multiple instances
- Grafana stack provides open-source, Kubernetes-native observability
For detailed Helm deployment configurations, refer to the official Grafana Helm charts documentation.
Health Checks for Kubernetes
Helium server provides HTTP health check endpoints designed for Kubernetes liveness and readiness probes. These endpoints run on a separate internal port (default: 9090) and are enabled for all worker modes.
Overview
Health checks help Kubernetes determine:
- Liveness: Is the container alive and should it be restarted if it becomes unresponsive?
- Readiness: Is the container ready to handle requests?
Helium implements both probe types on a dedicated HTTP server that runs alongside each worker mode.
Endpoints
Liveness Probe: /healthz
Returns 200 OK with a JSON response if the server is running:
{
"status": "ok"
}
This endpoint always returns success if the health check server is responding. Kubernetes uses this to determine if the container should be restarted.
Readiness Probe: /readyz
Checks connectivity to all dependencies before returning status:
Success Response (200 OK):
{
"status": "ok",
"database": "ok",
"redis": "ok",
"rabbitmq": "ok"
}
Failure Response (503 Service Unavailable):
{
"status": "error",
"database": "ok",
"redis": "error",
"rabbitmq": "ok",
"error": "Redis error: Connection refused"
}
The readiness probe checks:
- PostgreSQL: Executes a simple query (SELECT 1)
- Redis: Sends a PING command
- RabbitMQ: Validates connection pool status
All worker modes check the same three dependencies.
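For a quick manual check (for example after a kubectl port-forward to the health port), the response can be inspected with curl; a sketch assuming jq is available:
# Show the per-dependency status
curl -s http://localhost:9090/readyz | jq .
# HTTP 200 means ready; HTTP 503 is returned when any dependency check fails
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:9090/readyz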
Configuration
Health Check Port
Set the HEALTH_CHECK_PORT environment variable to customize the port (default: 9090):
export HEALTH_CHECK_PORT=9090
This port should be:
- Internal only: Not exposed to external traffic
- Accessible by Kubernetes: For probe requests
- Different from main service ports: To avoid conflicts
Worker Modes
Health checks are available in all worker modes:
| Worker Mode | Main Port | Health Check Port | Dependencies Checked |
|---|---|---|---|
grpc | 50051 | 9090 | Database, Redis, RabbitMQ |
subscribe_api | 8080 | 9090 | Database, Redis, RabbitMQ |
webhook_api | 8081 | 9090 | Database, Redis, RabbitMQ |
consumer | N/A | 9090 | Database, Redis, RabbitMQ |
mailer | N/A | 9090 | Database, Redis, RabbitMQ |
cron_executor | N/A | 9090 | Database, Redis, RabbitMQ |
Kubernetes Deployment
Example Pod Configuration
Here’s how to configure health checks in your Kubernetes deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
name: helium-grpc
spec:
replicas: 3
selector:
matchLabels:
app: helium-grpc
template:
metadata:
labels:
app: helium-grpc
spec:
containers:
- name: helium-grpc
image: helium-server:latest
env:
- name: WORK_MODE
value: "grpc"
- name: LISTEN_ADDR
value: "0.0.0.0:50051"
- name: HEALTH_CHECK_PORT
value: "9090"
- name: DATABASE_URL
valueFrom:
secretKeyRef:
name: helium-secrets
key: database-url
- name: REDIS_URL
valueFrom:
secretKeyRef:
name: helium-secrets
key: redis-url
- name: MQ_URL
valueFrom:
secretKeyRef:
name: helium-secrets
key: mq-url
ports:
- name: grpc
containerPort: 50051
protocol: TCP
- name: health
containerPort: 9090
protocol: TCP
livenessProbe:
httpGet:
path: /healthz
port: health
initialDelaySeconds: 10
periodSeconds: 10
timeoutSeconds: 5
failureThreshold: 3
readinessProbe:
httpGet:
path: /readyz
port: health
initialDelaySeconds: 5
periodSeconds: 5
timeoutSeconds: 5
failureThreshold: 3
Probe Configuration Guidelines
Liveness Probe:
- initialDelaySeconds: 10-30 seconds (allow time for startup)
- periodSeconds: 10-30 seconds (check periodically)
- timeoutSeconds: 5 seconds
- failureThreshold: 3 (restart after 3 consecutive failures)
Readiness Probe:
- initialDelaySeconds: 5-10 seconds (faster than liveness)
- periodSeconds: 5-10 seconds (check more frequently)
- timeoutSeconds: 5 seconds
- failureThreshold: 3 (mark unready after 3 failures)
Service Configuration
For API worker modes (grpc, subscribe_api, webhook_api), configure a Service:
apiVersion: v1
kind: Service
metadata:
name: helium-grpc
spec:
type: ClusterIP
ports:
- name: grpc
port: 50051
targetPort: grpc
protocol: TCP
selector:
app: helium-grpc
Note: The health check port (9090) is not exposed in the Service. It’s only for Kubernetes probes.
Worker Mode Behavior
API Modes (grpc, subscribe_api, webhook_api)
For API modes, the health check server runs alongside the main API server:
- When the main server exits, the health check server is immediately terminated
- Process exits when either server fails
- Ensures no “zombie” containers serving health checks without handling requests
Background Worker Modes (consumer, mailer, cron_executor)
For background workers, the health check server runs continuously:
- Liveness probe confirms the worker process is alive
- Readiness probe ensures dependencies are accessible
- Worker loops indefinitely alongside health check server
Troubleshooting
Health Check Server Not Starting
Symptom: Probes fail immediately with connection errors
Solutions:
- Check logs for health check server errors
- Verify HEALTH_CHECK_PORT is not already in use
- Ensure the port is accessible within the pod
Readiness Probe Failing
Symptom: Pod remains in “Not Ready” state
Solutions:
- Check which dependency is failing in the /readyz response
- Verify connection strings (DATABASE_URL, REDIS_URL, MQ_URL)
- Ensure network policies allow pod access to dependencies
- Check if dependencies are healthy
Example debugging:
# Forward health check port to local machine
kubectl port-forward pod/helium-grpc-xyz 9090:9090
# Check readiness endpoint
curl http://localhost:9090/readyz
Liveness Probe Causing Restart Loop
Symptom: Pod repeatedly restarts with liveness probe failures
Solutions:
- Increase initialDelaySeconds (worker may need more startup time)
- Increase failureThreshold (allow more failures before restart)
- Check if the worker is deadlocked or stuck (examine logs before restart)
Worker Exits But Pod Stays Running
Symptom: Container appears healthy but doesn’t process requests
This should not happen with the current implementation:
- API workers: Health check is aborted when main server exits
- Background workers: Return from execute_worker() causes process exit
If this occurs, file a bug report.
Security Considerations
Port Exposure
The health check port (9090) should never be exposed externally:
- Don’t create Ingress rules for health check endpoints
- Don’t expose the health check port in the Service definition
- Use network policies so that only in-cluster probe traffic (the node kubelet) can reach it
Sensitive Information
Health check responses contain minimal information:
- No version numbers
- No internal IPs or hostnames
- No authentication tokens
- Only dependency status (ok/error)
Error messages may contain connection details. Ensure logs are secured appropriately.
Best Practices
- Use separate ports: Never combine health checks with main service endpoints
- Set appropriate timeouts: Balance between quick detection and false positives
- Monitor probe metrics: Track probe success rates in your observability stack
- Test locally: Use port-forwarding to verify health checks before deployment
- Align with dependencies: If using a sidecar proxy (Istio, Linkerd), configure startup probes
Summary
Helium’s health check endpoints provide robust Kubernetes integration:
- Liveness probe (/healthz): Detects unresponsive containers
- Readiness probe (/readyz): Ensures dependencies are healthy
- Separate port (default 9090): Isolated from main services
- All worker modes: Consistent behavior across deployment types
- Process lifecycle: Ensures clean exits, no zombie containers
Configure these probes in your Kubernetes deployments to enable automatic recovery and load balancing.
Docker-based Deployment
The Helium system is designed with a multi-worker architecture that can be deployed using containers. Each worker type serves a specific purpose and has different scaling requirements. This deployment approach provides:
- Scalability: Independent scaling of different worker types based on load
- Reliability: Fault isolation between different services
- Flexibility: Easy deployment across different environments
- Maintainability: Simplified updates and rollbacks
Prerequisites
Before proceeding with this guide, ensure you have:
- External dependencies configured (see External Dependencies)
- Docker or container runtime installed
- Kubernetes cluster (for Kubernetes deployment)
- Basic understanding of containerization concepts
Container Architecture
Worker Types and Scaling Patterns
The Helium server supports six distinct worker modes, each with specific scaling characteristics:
| Worker Mode | Port | Scaling | Description |
|---|---|---|---|
grpc | 50051 | ✅ Horizontal | Main gRPC API server - can be load balanced |
subscribe_api | 8080 | ✅ Horizontal | RESTful subscription API - can be load balanced |
webhook_api | 8081 | ✅ Horizontal | Webhook handler for payments - can be load balanced |
consumer | - | ✅ Horizontal | Background message consumer - multiple instances supported |
mailer | - | ⚠️ Single preferred | Email service - not recommended >1 instance |
cron_executor | - | 🚫 Single only | Scheduled tasks - MUST be exactly 1 instance |
Scaling Constraints
⚠️ Critical Scaling Limitations
mailer Worker:
- Recommendation: Deploy as single instance only
- Reason: Relies on SMTP server connections and may cause email delivery issues with multiple instances
- Impact: Multiple mailer instances can lead to duplicate emails or SMTP rate limiting
cron_executor Worker:
- Requirement: MUST have exactly one instance
- Reason: Scans the database to check for scheduled tasks in the queue
- Impact: Multiple instances will cause duplicate task execution and potential data corruption
✅ Scalable Workers
API Workers (grpc, subscribe_api, webhook_api):
- Can be horizontally scaled based on traffic demands
- Support standard load balancing techniques
- Share state through external Redis and PostgreSQL
consumer Worker:
- Can run multiple instances for processing message queues
- Automatically distributes work through RabbitMQ
Docker Image
Building the Docker Image
The project includes a multi-stage Dockerfile optimized for production:
# Build the Docker image
docker build -t helium-server:latest .
# Tag for registry
docker tag helium-server:latest your-registry/helium-server:v1.0.0
# Push to registry
docker push your-registry/helium-server:v1.0.0
Image Characteristics
- Base Image: gcr.io/distroless/cc for minimal attack surface
- Size: ~50MB final image
- Architecture: Multi-arch support (amd64, arm64); see the buildx sketch below
- Security: Non-root user, minimal dependencies
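For the multi-arch support noted above, one way to build both architectures in a single step is docker buildx. A sketch, assuming a buildx builder is already configured and your-registry is a placeholder:
# Build and push amd64 + arm64 images in one invocation (requires docker buildx)
docker buildx build \
  --platform linux/amd64,linux/arm64 \
  -t your-registry/helium-server:v1.0.0 \
  --push .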
Environment Variables
Configure containers using these environment variables:
# Required - Worker mode selection
WORK_MODE=grpc # grpc, subscribe_api, webhook_api, consumer, mailer, cron_executor
# Required - Database connections
DATABASE_URL=postgres://user:password@postgres-host:5432/helium_db
REDIS_URL=redis://redis-host:6379
MQ_URL=amqp://user:password@rabbitmq-host:5672/
# Optional - Server configuration
LISTEN_ADDR=0.0.0.0:50051 # For API workers
SCAN_INTERVAL=60 # For cron_executor only
RUST_LOG=info # Logging level
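For a quick smoke test of a single worker outside an orchestrator, the same variables can be passed directly to docker run. A minimal sketch, assuming the dependency hosts are reachable from the container network:
# Run a single gRPC worker with explicit configuration
docker run --rm -p 50051:50051 \
  -e WORK_MODE=grpc \
  -e DATABASE_URL=postgres://user:password@postgres-host:5432/helium_db \
  -e REDIS_URL=redis://redis-host:6379 \
  -e MQ_URL=amqp://user:password@rabbitmq-host:5672/ \
  -e LISTEN_ADDR=0.0.0.0:50051 \
  helium-server:latest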
Docker Compose Deployment
For development or simple production setups:
version: "3.8"
services:
# Main gRPC API (scalable)
helium-grpc:
image: helium-server:latest
ports:
- "50051:50051"
environment:
WORK_MODE: grpc
DATABASE_URL: postgres://helium:password@postgres:5432/helium_db
REDIS_URL: redis://redis:6379
MQ_URL: amqp://helium:password@rabbitmq:5672/
LISTEN_ADDR: 0.0.0.0:50051
depends_on:
- postgres
- redis
- rabbitmq
restart: unless-stopped
deploy:
replicas: 2 # Can be scaled horizontally
# Subscription API (scalable)
helium-subscribe-api:
image: helium-server:latest
ports:
- "8080:8080"
environment:
WORK_MODE: subscribe_api
DATABASE_URL: postgres://helium:password@postgres:5432/helium_db
REDIS_URL: redis://redis:6379
MQ_URL: amqp://helium:password@rabbitmq:5672/
LISTEN_ADDR: 0.0.0.0:8080
depends_on:
- postgres
- redis
- rabbitmq
restart: unless-stopped
deploy:
replicas: 2 # Can be scaled horizontally
# Webhook API (scalable)
helium-webhook-api:
image: helium-server:latest
ports:
- "8081:8081"
environment:
WORK_MODE: webhook_api
DATABASE_URL: postgres://helium:password@postgres:5432/helium_db
REDIS_URL: redis://redis:6379
MQ_URL: amqp://helium:password@rabbitmq:5672/
LISTEN_ADDR: 0.0.0.0:8081
depends_on:
- postgres
- redis
- rabbitmq
restart: unless-stopped
deploy:
replicas: 2 # Can be scaled horizontally
# Background consumer (scalable)
helium-consumer:
image: helium-server:latest
environment:
WORK_MODE: consumer
DATABASE_URL: postgres://helium:password@postgres:5432/helium_db
REDIS_URL: redis://redis:6379
MQ_URL: amqp://helium:password@rabbitmq:5672/
depends_on:
- postgres
- redis
- rabbitmq
restart: unless-stopped
deploy:
replicas: 3 # Can run multiple instances
# Mailer service (single instance recommended)
helium-mailer:
image: helium-server:latest
environment:
WORK_MODE: mailer
DATABASE_URL: postgres://helium:password@postgres:5432/helium_db
REDIS_URL: redis://redis:6379
MQ_URL: amqp://helium:password@rabbitmq:5672/
depends_on:
- postgres
- redis
- rabbitmq
restart: unless-stopped
deploy:
replicas: 1 # SINGLE INSTANCE ONLY
# Cron executor (must be single instance)
helium-cron:
image: helium-server:latest
environment:
WORK_MODE: cron_executor
DATABASE_URL: postgres://helium:password@postgres:5432/helium_db
REDIS_URL: redis://redis:6379
MQ_URL: amqp://helium:password@rabbitmq:5672/
SCAN_INTERVAL: 60
depends_on:
- postgres
- redis
- rabbitmq
restart: unless-stopped
deploy:
replicas: 1 # MUST BE EXACTLY 1
# External dependencies (for development only)
postgres:
image: postgres:15
environment:
POSTGRES_USER: helium
POSTGRES_PASSWORD: password
POSTGRES_DB: helium_db
volumes:
- postgres_data:/var/lib/postgresql/data
ports:
- "5432:5432"
redis:
image: redis:7
ports:
- "6379:6379"
volumes:
- redis_data:/data
rabbitmq:
image: rabbitmq:3-management
environment:
RABBITMQ_DEFAULT_USER: helium
RABBITMQ_DEFAULT_PASS: password
ports:
- "5672:5672"
- "15672:15672"
volumes:
- rabbitmq_data:/var/lib/rabbitmq
volumes:
postgres_data:
redis_data:
rabbitmq_data:
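To start the stack, bring everything up in the background and scale the stateless workers as needed. Depending on your Docker Compose version, the deploy.replicas values above may only take effect in Swarm mode; the --scale flag is a portable alternative for services without published host ports (replicas cannot share a fixed host port mapping):
# Start everything in the background
docker compose up -d
# Scale the message consumer to three instances
docker compose up -d --scale helium-consumer=3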
Kubernetes Deployment
For production Kubernetes deployments:
Namespace and ConfigMap
apiVersion: v1
kind: Namespace
metadata:
name: helium-system
---
apiVersion: v1
kind: ConfigMap
metadata:
name: helium-config
namespace: helium-system
data:
RUST_LOG: "info"
SCAN_INTERVAL: "60"
Secrets
apiVersion: v1
kind: Secret
metadata:
name: helium-secrets
namespace: helium-system
type: Opaque
stringData:
database-url: "postgres://helium:password@postgres-service:5432/helium_db"
redis-url: "redis://redis-service:6379"
rabbitmq-url: "amqp://helium:password@rabbitmq-service:5672/"
gRPC API Deployment (Scalable)
apiVersion: apps/v1
kind: Deployment
metadata:
name: helium-grpc
namespace: helium-system
spec:
replicas: 3 # Can be scaled horizontally
selector:
matchLabels:
app: helium-grpc
template:
metadata:
labels:
app: helium-grpc
spec:
containers:
- name: helium-server
image: your-registry/helium-server:v1.0.0
ports:
- containerPort: 50051
env:
- name: WORK_MODE
value: "grpc"
- name: LISTEN_ADDR
value: "0.0.0.0:50051"
- name: DATABASE_URL
valueFrom:
secretKeyRef:
name: helium-secrets
key: database-url
- name: REDIS_URL
valueFrom:
secretKeyRef:
name: helium-secrets
key: redis-url
- name: MQ_URL
valueFrom:
secretKeyRef:
name: helium-secrets
key: rabbitmq-url
envFrom:
- configMapRef:
name: helium-config
resources:
requests:
memory: "128Mi"
cpu: "100m"
limits:
memory: "512Mi"
cpu: "500m"
livenessProbe:
tcpSocket:
port: 50051
initialDelaySeconds: 30
periodSeconds: 10
readinessProbe:
tcpSocket:
port: 50051
initialDelaySeconds: 5
periodSeconds: 5
---
apiVersion: v1
kind: Service
metadata:
name: helium-grpc-service
namespace: helium-system
spec:
selector:
app: helium-grpc
ports:
- port: 50051
targetPort: 50051
type: ClusterIP
Consumer Deployment (Scalable)
apiVersion: apps/v1
kind: Deployment
metadata:
name: helium-consumer
namespace: helium-system
spec:
replicas: 3 # Can run multiple instances
selector:
matchLabels:
app: helium-consumer
template:
metadata:
labels:
app: helium-consumer
spec:
containers:
- name: helium-server
image: your-registry/helium-server:v1.0.0
env:
- name: WORK_MODE
value: "consumer"
- name: DATABASE_URL
valueFrom:
secretKeyRef:
name: helium-secrets
key: database-url
- name: REDIS_URL
valueFrom:
secretKeyRef:
name: helium-secrets
key: redis-url
- name: MQ_URL
valueFrom:
secretKeyRef:
name: helium-secrets
key: rabbitmq-url
envFrom:
- configMapRef:
name: helium-config
resources:
requests:
memory: "256Mi"
cpu: "200m"
limits:
memory: "1Gi"
cpu: "1000m"
livenessProbe:
exec:
command:
- /bin/sh
- -c
- "ps aux | grep helium-server | grep -v grep"
initialDelaySeconds: 30
periodSeconds: 30
Mailer Deployment (Single Instance)
apiVersion: apps/v1
kind: Deployment
metadata:
name: helium-mailer
namespace: helium-system
spec:
replicas: 1 # SINGLE INSTANCE ONLY
strategy:
type: Recreate # Prevent multiple instances during updates
selector:
matchLabels:
app: helium-mailer
template:
metadata:
labels:
app: helium-mailer
spec:
containers:
- name: helium-server
image: your-registry/helium-server:v1.0.0
env:
- name: WORK_MODE
value: "mailer"
- name: DATABASE_URL
valueFrom:
secretKeyRef:
name: helium-secrets
key: database-url
- name: REDIS_URL
valueFrom:
secretKeyRef:
name: helium-secrets
key: redis-url
- name: MQ_URL
valueFrom:
secretKeyRef:
name: helium-secrets
key: rabbitmq-url
envFrom:
- configMapRef:
name: helium-config
resources:
requests:
memory: "128Mi"
cpu: "100m"
limits:
memory: "512Mi"
cpu: "500m"
Cron Executor Deployment (Singleton)
apiVersion: apps/v1
kind: Deployment
metadata:
name: helium-cron
namespace: helium-system
spec:
replicas: 1 # MUST BE EXACTLY 1
strategy:
type: Recreate # Ensure no overlap during updates
selector:
matchLabels:
app: helium-cron
template:
metadata:
labels:
app: helium-cron
spec:
containers:
- name: helium-server
image: your-registry/helium-server:v1.0.0
env:
- name: WORK_MODE
value: "cron_executor"
- name: DATABASE_URL
valueFrom:
secretKeyRef:
name: helium-secrets
key: database-url
- name: REDIS_URL
valueFrom:
secretKeyRef:
name: helium-secrets
key: redis-url
- name: MQ_URL
valueFrom:
secretKeyRef:
name: helium-secrets
key: rabbitmq-url
- name: SCAN_INTERVAL
value: "60"
envFrom:
- configMapRef:
name: helium-config
resources:
requests:
memory: "128Mi"
cpu: "50m"
limits:
memory: "256Mi"
cpu: "200m"
livenessProbe:
exec:
command:
- /bin/sh
- -c
- "ps aux | grep helium-server | grep -v grep"
initialDelaySeconds: 60
periodSeconds: 30
Horizontal Pod Autoscaler (HPA)
For scalable workers, configure automatic scaling:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: helium-grpc-hpa
namespace: helium-system
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: helium-grpc
minReplicas: 2
maxReplicas: 10
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 70
- type: Resource
resource:
name: memory
target:
type: Utilization
averageUtilization: 80
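After applying the autoscaler you can watch its observed utilization and replica count; a sketch, assuming the manifest above is saved as helium-grpc-hpa.yaml:
# Apply the HPA and watch scaling decisions
kubectl apply -f helium-grpc-hpa.yaml
kubectl get hpa helium-grpc-hpa -n helium-system --watch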
Load Balancer Configuration
Ingress for API Services
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: helium-ingress
namespace: helium-system
annotations:
nginx.ingress.kubernetes.io/backend-protocol: "GRPC"
cert-manager.io/cluster-issuer: "letsencrypt-prod"
spec:
tls:
- hosts:
- api.your-domain.com
secretName: helium-tls
rules:
- host: api.your-domain.com
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: helium-grpc-service
port:
number: 50051
Service Mesh Configuration
For advanced deployments with service mesh (Istio):
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
name: helium-grpc-vs
namespace: helium-system
spec:
hosts:
- api.your-domain.com
gateways:
- helium-gateway
http:
- match:
- uri:
prefix: /
route:
- destination:
host: helium-grpc-service
port:
number: 50051
weight: 100
fault:
delay:
percentage:
value: 0.1
fixedDelay: 5s
Database Migration
Database migrations must be run before starting any workers:
Migration Job
apiVersion: batch/v1
kind: Job
metadata:
name: helium-migration
namespace: helium-system
spec:
template:
spec:
containers:
- name: migration
image: your-registry/helium-server:v1.0.0
command: ["sqlx", "migrate", "run"]
env:
- name: DATABASE_URL
valueFrom:
secretKeyRef:
name: helium-secrets
key: database-url
restartPolicy: Never
backoffLimit: 3
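A typical rollout applies the migration Job first and blocks until it completes before rolling out the workers. A sketch, assuming the manifest above is saved as migration-job.yaml:
# Run migrations and wait for the Job to finish before deploying workers
kubectl apply -f migration-job.yaml
kubectl wait --for=condition=complete job/helium-migration \
  -n helium-system --timeout=300s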
Init Container for Workers
Add to all worker deployments:
spec:
template:
spec:
initContainers:
- name: wait-for-migration
image: postgres:15
command:
[
"sh",
"-c",
"until pg_isready -h postgres-service -p 5432; do echo waiting for database; sleep 2; done;",
]
env:
- name: DATABASE_URL
valueFrom:
secretKeyRef:
name: helium-secrets
key: database-url
Monitoring and Observability
Health Checks
Configure appropriate health checks for each worker type. The HTTP health endpoints on port 9090 (see Health Checks for Kubernetes) are the preferred probes for every worker mode; the simpler examples below use TCP and exec probes, and the exec variants require a shell in the container image, which the distroless base image does not include:
# For API workers (gRPC, REST)
livenessProbe:
tcpSocket:
port: 50051
initialDelaySeconds: 30
periodSeconds: 10
# For background workers (consumer, mailer, cron)
livenessProbe:
exec:
command:
- /bin/sh
- -c
- "ps aux | grep helium-server | grep -v grep"
initialDelaySeconds: 30
periodSeconds: 30
Logging Configuration
env:
- name: RUST_LOG
value: "info,helium_server=debug" # Adjust as needed
Metrics Collection
Use Prometheus for metrics collection:
apiVersion: v1
kind: Service
metadata:
name: helium-metrics
namespace: helium-system
annotations:
prometheus.io/scrape: "true"
prometheus.io/port: "8080"
prometheus.io/path: "/metrics"
spec:
selector:
app: helium-grpc
ports:
- port: 8080
name: metrics
Troubleshooting
Common Issues
Pod Crash Loop:
# Check logs
kubectl logs -n helium-system deployment/helium-grpc
# Check events
kubectl get events -n helium-system --sort-by='.metadata.creationTimestamp'
# Verify environment variables
kubectl exec -n helium-system deployment/helium-grpc -- env | grep -E "(DATABASE_URL|REDIS_URL|MQ_URL)"
Multiple Cron Executors:
# Check for multiple cron instances (should show only 1)
kubectl get pods -n helium-system -l app=helium-cron
# Check cron logs for conflicts
kubectl logs -n helium-system -l app=helium-cron --tail=100
Database Connection Issues:
# Test database connectivity
kubectl run -i --tty --rm debug --image=postgres:15 --restart=Never -- \
psql postgresql://user:password@postgres-service:5432/helium_db -c "SELECT version();"
# Check migration status
kubectl exec -n helium-system deployment/helium-grpc -- \
sqlx migrate info --database-url $DATABASE_URL
Performance Tuning
Resource Limits:
- API workers: 200-500m CPU, 256Mi-1Gi RAM per pod
- Consumer workers: 500m-1000m CPU, 512Mi-2Gi RAM per pod
- Mailer/Cron: 100-200m CPU, 128Mi-512Mi RAM per pod
Scaling Guidelines:
- Start with 2-3 replicas for API workers
- Scale consumers based on message queue depth (see the scaling example below)
- Monitor CPU/memory usage and adjust limits accordingly
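Workers without an HPA can also be scaled manually, for example while a message backlog is being drained. A sketch:
# Scale consumers up while a backlog is being drained
kubectl scale deployment helium-consumer -n helium-system --replicas=5
# Check current replica counts
kubectl get deployments -n helium-system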
External Dependencies
Refer to the External Dependencies Guide for detailed information about:
- PostgreSQL setup and configuration
- Redis configuration and clustering
- RabbitMQ setup and management
- SMTP server configuration
- OAuth provider setup
- Payment provider integration
Configuration Management
Refer to the Configuration Guide for:
- Environment variable reference
- Configuration file formats
- Runtime configuration updates
- Security best practices
Next Steps
After successful deployment:
- Configure monitoring and alerting
- Set up backup procedures for stateful data
- Implement CI/CD pipelines for automated deployments
- Configure log aggregation and analysis
- Plan disaster recovery procedures
For specific configuration details, see the Helium Server Configuration guide.
Configuration Guide
This document provides comprehensive configuration information for operators deploying the Helium project. The system uses a combination of environment variables for server configuration and JSON configurations stored in the database for module-specific settings.
Environment Variables
The Helium server is configured entirely through environment variables. These control the server behavior and connectivity to external services.
Required Environment Variables
All worker modes require these variables:
# Worker mode selection (REQUIRED)
WORK_MODE="grpc" # Options: grpc, subscribe_api, webhook_api, consumer, mailer, cron_executor
# Database connection (REQUIRED)
DATABASE_URL="postgres://user:password@localhost:5432/helium_db"
# Redis connection (REQUIRED)
REDIS_URL="redis://localhost:6379"
# RabbitMQ connection (REQUIRED)
MQ_URL="amqp://user:password@localhost:5672/"
Worker Mode Options
| Worker Mode | Port | Description | Use Case |
|---|---|---|---|
grpc | 50051 | gRPC API server | Main API for client applications and admin panels |
subscribe_api | 8080 | RESTful subscription API | Public subscription endpoints |
webhook_api | 8081 | RESTful webhook handler | Payment provider callbacks, third-party integrations |
consumer | - | Background message consumer | Processing async tasks from message queue |
mailer | - | Email service worker | Sending emails and notifications |
cron_executor | - | Scheduled task executor | Running periodic maintenance tasks |
Optional Environment Variables
# Server listen addresses (for API workers)
LISTEN_ADDR="0.0.0.0:50051" # Default for grpc mode
LISTEN_ADDR="0.0.0.0:8080" # Default for subscribe_api mode
LISTEN_ADDR="0.0.0.0:8081" # Default for webhook_api mode
# Cron executor configuration
SCAN_INTERVAL="60" # Scan interval in seconds (cron_executor mode only)
# Logging configuration
RUST_LOG="info" # Options: error, warn, info, debug, trace
Module Configurations
Module configurations are stored as JSON in the PostgreSQL database in the application__config table. Each module has its own configuration key and JSON structure.
Note: All duration values are represented as strings containing the number of seconds (e.g., "300" for 5 minutes, "1800" for 30 minutes).
Auth Module (auth)
Key: "auth"
The authentication module handles user registration, login, JWT tokens, and OAuth providers.
{
"email_provider": {
"register_domain": {
"enable_white_list": false,
"white_list": [],
"enable_black_list": false,
"black_list": []
},
"otp_expire_after": "300",
"delete_otp_before": "7200",
"magic_link_expire_after": "1800",
"magic_link_delete_before": "14400",
"resend_interval": "30"
},
"jwt": {
"secret": "your-jwt-secret-key-32-characters-long",
"refresh_token_expiration": "2592000",
"access_token_expiration": "900",
"issuer": "https://your-domain.com",
"access_audience": "helium_cloud",
"refresh_audience": "helium_cloud_auth"
},
"oauth_providers": {
"providers": [
{
"name": "Google",
"client_id": "your-google-client-id",
"client_secret": "your-google-client-secret",
"redirect_uri": "https://your-domain.com/auth/oauth/google/callback"
},
{
"name": "GitHub",
"client_id": "your-github-client-id",
"client_secret": "your-github-client-secret",
"redirect_uri": "https://your-domain.com/auth/oauth/github/callback"
}
],
"challenge_expiration": "300"
},
"default_user_group": 1
}
Configuration Details:
- email_provider.register_domain: Controls which email domains are allowed for registration
- email_provider.otp_expire_after: How long OTP codes remain valid (in seconds, default: "300" = 5 minutes)
- email_provider.resend_interval: Minimum time between resend attempts (in seconds, default: "30" = 30 seconds)
- jwt.secret: CRITICAL: Must be a secure random string for production
- jwt.*_expiration: Token lifetime settings (in seconds, default: "2592000" = 30 days for refresh, "900" = 15 minutes for access)
- oauth_providers.providers: List of OAuth providers with their credentials
- default_user_group: Default group ID assigned to new users
Telecom Module (telecom)
Key: "telecom"
The telecom module manages VPN nodes, subscription links, and proxy synchronization.
{
"node_health_check": {
"offline_timeout": "600"
},
"subscribe_link": {
"endpoints": [
{
"url_template": "https://subscribe.your-domain.com/subscribe/{SUBSCRIBE_TOKEN}",
"endpoint_name": "primary"
},
{
"url_template": "https://backup.your-domain.com/subscribe/{SUBSCRIBE_TOKEN}",
"endpoint_name": "backup"
}
]
},
"uni_proxy_sync": {
"push_interval": "30",
"pull_interval": "60"
},
"vpn_server_token": "secure-random-token-for-vpn-servers"
}
Configuration Details:
- node_health_check.offline_timeout: Time before marking nodes as offline (in seconds, default: "600" = 10 minutes)
- subscribe_link.endpoints: List of subscription endpoints for client configuration
- uni_proxy_sync.push_interval: How often to push traffic data (in seconds, default: "30" = 30 seconds)
- uni_proxy_sync.pull_interval: How often to pull user info (in seconds, default: "60" = 1 minute)
- vpn_server_token: CRITICAL: Secure token for VPN server authentication
Shop Module (shop)
Key: "shop"
The shop module handles e-commerce functionality, orders, and payment processing.
{
"max_unpaid_orders": 5,
"auto_cancel_after": "1800",
"epay_notify_url": "https://your-domain.com/api/webhook/epay/notify",
"epay_return_url": "https://your-domain.com/payment/success"
}
Configuration Details:
- max_unpaid_orders: Maximum unpaid orders per user (default: 5)
- auto_cancel_after: Time before auto-canceling unpaid orders (in seconds, default: "1800" = 30 minutes)
- epay_notify_url: REQUIRED: Server-to-server notification endpoint for payment providers
- epay_return_url: REQUIRED: User return URL after payment completion
Mailer Module (mailer)
Key: "mailer"
The mailer module handles email delivery through SMTP.
{
"host": "smtp.gmail.com",
"port": 587,
"username": "your-smtp-username",
"password": "your-smtp-password",
"sender": "noreply@your-domain.com",
"starttls": true
}
Configuration Details:
- host: SMTP server hostname
- port: SMTP server port (typically 587 for STARTTLS, 465 for SSL)
- username / password: SMTP authentication credentials
- sender: Email address used as sender
- starttls: Enable STARTTLS encryption (recommended: true)
Admin Management Module (admin-jwt)
Key: "admin-jwt"
Controls JWT tokens for administrative access.
{
"secret": "admin-jwt-secret-key-32-characters-long",
"token_expiration": "864000",
"issuer": "https://admin.your-domain.com",
"audience": "HeliumAdmin"
}
Configuration Details:
- secret: CRITICAL: Secure secret for admin JWT signing
- token_expiration: Admin token lifetime (in seconds, default: "864000" = 10 days)
- issuer: JWT issuer for admin tokens
- audience: JWT audience for admin tokens
Market Module (affiliate)
Key: "affiliate"
Controls the affiliate marketing system.
{
"max_invite_code_per_user": 10,
"default_reward_rate": "0.1",
"default_trigger_time_per_user": 3
}
Configuration Details:
- max_invite_code_per_user: Maximum invite codes per user (default: 10)
- default_reward_rate: Default affiliate commission rate (default: "0.1" = 10%)
- default_trigger_time_per_user: Required referrals before earning (default: 3)
Infrastructure Dependencies
PostgreSQL Database
Required Version: PostgreSQL 12+
Configuration:
- Environment variable: DATABASE_URL
- Format: postgres://user:password@host:port/database
Important Notes:
- ⚠️ CRITICAL: Run migrations before starting: sqlx migrate run --database-url $DATABASE_URL
- Use an external managed PostgreSQL service for production (AWS RDS, Google Cloud SQL, etc.)
- Ensure proper backup and high availability configuration
Redis
Required Version: Redis 6+
Configuration:
- Environment variable: REDIS_URL
- Format: redis://host:port or redis://user:password@host:port
Usage:
- Session storage and authentication tokens
- Module configuration caching
- Temporary data (OAuth challenges, OTP codes)
RabbitMQ (AMQP)
Configuration:
- Environment variable: MQ_URL
- Format: amqp://user:password@host:port/
Usage:
- Asynchronous task processing
- Email sending queue
- Inter-module communication
Configuration Templates
Development Environment
# .env file for development
WORK_MODE=grpc
DATABASE_URL=postgres://helium:password@localhost:5432/helium_dev
REDIS_URL=redis://localhost:6379
MQ_URL=amqp://guest:guest@localhost:5672/
LISTEN_ADDR=0.0.0.0:50051
RUST_LOG=debug
Production Environment
# Production environment variables
WORK_MODE=grpc
DATABASE_URL=postgres://helium_user:secure_password@db.example.com:5432/helium_prod
REDIS_URL=redis://redis.example.com:6379
MQ_URL=amqp://helium_user:secure_password@mq.example.com:5672/
LISTEN_ADDR=0.0.0.0:50051
RUST_LOG=info
Multi-Worker Deployment
For production, run multiple worker processes:
# API Server (can be scaled horizontally)
WORK_MODE=grpc ./helium-server &
# Background Tasks (can be scaled)
WORK_MODE=consumer ./helium-server &
# Email Processing (single instance recommended)
WORK_MODE=mailer ./helium-server &
# Scheduled Tasks (MUST be single instance)
WORK_MODE=cron_executor ./helium-server &
Configuration Management
To update module configurations:
- Via Database: Insert/update records in the application__config table
- Via Admin API: Use the management gRPC API to update configurations
- Configuration Sync: The system automatically syncs configurations from PostgreSQL to Redis cache
Example SQL for updating auth configuration:
INSERT INTO application__config (key, content)
VALUES ('auth', '{"jwt": {"secret": "new-secret"}, ...}')
ON CONFLICT (key) DO UPDATE SET
content = EXCLUDED.content,
updated_at = NOW();
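After updating a configuration row, you can read it back to confirm the stored JSON; a minimal sketch, assuming psql is installed and DATABASE_URL points at the Helium database:
# Show the stored auth configuration
psql "$DATABASE_URL" -c "SELECT key, content FROM application__config WHERE key = 'auth';"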
Security Considerations
⚠️ Critical Configuration Security:
- JWT Secrets: Use cryptographically secure random strings (32+ characters)
- VPN Server Token: Generate secure random tokens for server authentication
- Database Credentials: Use strong passwords and restrict database access
- SMTP Credentials: Use application-specific passwords, not primary account passwords
- OAuth Secrets: Keep OAuth client secrets secure and rotate them regularly
Troubleshooting
Common Configuration Issues
- Database Connection: Verify PostgreSQL accessibility and credentials
- Redis Connection: Check Redis server status and network connectivity
- RabbitMQ Connection: Ensure RabbitMQ server is running and accessible
- Email Delivery: Test SMTP configuration with your email provider
- OAuth Issues: Verify client IDs, secrets, and redirect URIs match provider settings
Validation Commands
# Test database connection
sqlx migrate info --database-url $DATABASE_URL
# Test Redis connection
redis-cli -u $REDIS_URL ping
# Test RabbitMQ connection
rabbitmqctl status # on RabbitMQ server
Helium CLI
The Helium CLI (helium-cli) is a comprehensive administrative tool that allows operators to:
- Initialize system configurations with default values
- Manage admin accounts (create, list, view, delete)
- Validate configuration files before deployment
- Interact with both PostgreSQL database and Redis cache
Installation
The CLI is built as part of the main Helium project. After building the project:
cargo build --release --bin helium-cli
The binary will be available at target/release/helium-cli.
Global Configuration
The CLI requires database and Redis connections to function. These can be configured via:
Environment Variables
export DATABASE_URL="postgresql://user:password@localhost/helium"
export REDIS_URL="redis://localhost:6379"
Command Line Arguments
helium-cli --database-url "postgresql://user:password@localhost/helium" \
--redis-url "redis://localhost:6379" \
<command>
Verbose Logging
Enable detailed logging for troubleshooting:
helium-cli --verbose <command>
Commands
Configuration Management
Initialize All Configurations
helium-cli init-config
This command initializes all system configurations with their default values. It:
- Creates default configurations for all modules in the database
- Updates Redis cache with the configurations
- Handles the following configuration types:
- Auth: Authentication and authorization settings
- Admin JWT: JWT configuration for admin authentication
- Telecom: Telecom service configurations
- Shop: E-commerce and shop settings
- Market: Affiliate and marketing configurations
- Mailer: SMTP and email service settings
Example Output:
Initializing 6 configuration types...
Initializing Auth config... ✓ Success
Initializing Admin JWT config... ✓ Success
Initializing Telecom config... ✓ Success
Initializing Shop config... ✓ Success
Initializing Market/Affiliate config... ✓ Success
Initializing Mailer config... ✓ Success
Configuration initialization completed:
✓ Successful: 6
Use Cases:
- Initial deployment setup
- Resetting configurations to defaults
- Disaster recovery scenarios
Validate Configuration Files
helium-cli validate-config --config-type <TYPE> <config-file.json>
Validates a JSON configuration file against the specified configuration schema.
Supported Configuration Types:
- auth - Authentication configuration
- admin-jwt / admin_jwt - Admin JWT configuration
- telecom - Telecom service configuration
- shop - Shop/e-commerce configuration
- market / affiliate - Marketing/affiliate configuration
- mailer - Email service configuration
Examples:
# Validate auth configuration
helium-cli validate-config --config-type auth auth-config.json
# Validate mailer configuration
helium-cli validate-config --config-type mailer smtp-config.json
Example Output:
✓ Configuration file is valid!
File: auth-config.json
Type: Auth
Key: auth
Admin Account Management
List Admin Accounts
helium-cli admin list [--limit <N>] [--offset <N>]
Lists all admin accounts with pagination support.
Options:
- --limit <N> - Number of results to return (default: 50)
- --offset <N> - Number of results to skip (default: 0)
Example:
# List first 10 admin accounts
helium-cli admin list --limit 10
# List admin accounts with pagination
helium-cli admin list --limit 25 --offset 50
Example Output:
Found 3 admin account(s):
ID Role Name Email Created At
------------------------------------ -------------------- ------------------------------ ------------------------------ --------------------
123e4567-e89b-12d3-a456-426614174000 Super Admin System Administrator admin@example.com 2024-01-15T10:30:00Z
234e5678-e89b-12d3-a456-426614174001 Customer Support Support Team Lead support@example.com 2024-01-16T14:20:00Z
345e6789-e89b-12d3-a456-426614174002 Moderator Content Moderator moderator@example.com 2024-01-17T09:45:00Z
Show Admin Account Details
helium-cli admin show <ADMIN_ID>
Displays detailed information about a specific admin account.
Example:
helium-cli admin show 123e4567-e89b-12d3-a456-426614174000
Example Output:
Admin Account Details:
ID: 123e4567-e89b-12d3-a456-426614174000
Name: System Administrator
Role: Super Admin
Email: admin@example.com
Avatar: https://example.com/avatar.jpg
Created At: 2024-01-15T10:30:00Z
Create Admin Account
helium-cli admin create --name <NAME> --role <ROLE> [--email <EMAIL>] [--avatar <AVATAR_URL>]
Creates a new admin account with the specified details.
Required Options:
- --name <NAME> - Display name for the admin
- --role <ROLE> - Admin role (see roles below)
Optional Options:
- --email <EMAIL> - Admin email address
- --avatar <AVATAR_URL> - URL to admin avatar image
Available Roles:
- super_admin / superadmin / super-admin - Full system access
- moderator - Content moderation privileges
- customer_support / customersupport / customer-support - Customer service access
- support_bot / supportbot / support-bot - Automated support system access
Examples:
# Create super admin
helium-cli admin create \
--name "System Administrator" \
--role super_admin \
--email "admin@example.com"
# Create customer support account
helium-cli admin create \
--name "Support Agent" \
--role customer_support \
--email "support@example.com" \
--avatar "https://example.com/avatars/support.jpg"
# Create moderator (minimal info)
helium-cli admin create \
--name "Content Moderator" \
--role moderator
Example Output:
Successfully created admin account:
ID: 456e7890-e89b-12d3-a456-426614174003
Name: System Administrator
Role: Super Admin
Email: admin@example.com
Avatar: N/A
Delete Admin Account
helium-cli admin delete <ADMIN_ID> [--yes]
Deletes an admin account after confirmation.
Options:
- --yes - Skip confirmation prompt (use with caution)
Examples:
# Delete with confirmation prompt
helium-cli admin delete 123e4567-e89b-12d3-a456-426614174000
# Delete without confirmation (automated scripts)
helium-cli admin delete 123e4567-e89b-12d3-a456-426614174000 --yes
Example Interactive Flow:
Admin account to delete:
ID: 123e4567-e89b-12d3-a456-426614174000
Name: Old Administrator
Role: Super Admin
Email: old-admin@example.com
Are you sure you want to delete this admin account? [y/N]: y
Successfully deleted admin account: 123e4567-e89b-12d3-a456-426614174000
Common Use Cases
Initial Deployment
1. Set up environment variables:
export DATABASE_URL="postgresql://helium:password@localhost/helium"
export REDIS_URL="redis://localhost:6379"
2. Initialize system configurations:
helium-cli init-config
3. Create initial super admin:
helium-cli admin create \
  --name "System Administrator" \
  --role super_admin \
  --email "admin@yourcompany.com"
Configuration Management Workflow
1. Prepare configuration file: Create a JSON file with your custom configuration.
2. Validate before deployment:
helium-cli validate-config --config-type auth ./configs/auth-config.json
3. Deploy configuration: Use the web interface or API to upload the validated configuration.
Admin Account Maintenance
1. Regular audit of admin accounts:
helium-cli admin list --limit 100
2. Create specialized support accounts:
# Customer support team
helium-cli admin create --name "Support Team A" --role customer_support
# Content moderation team
helium-cli admin create --name "Moderator Team B" --role moderator
3. Remove inactive accounts:
helium-cli admin delete <inactive-admin-id>
Error Handling
The CLI provides comprehensive error messages and logging:
- Database Connection Issues: Check DATABASE_URL and database availability
- Redis Connection Issues: Verify REDIS_URL and Redis service status
- Configuration Validation Errors: Review JSON syntax and required fields
- Admin Role Errors: Ensure role names match supported values exactly
Security Considerations
- Environment Variables: Use secure methods to set database credentials
- Admin Creation: Be selective with super_admin role assignments
- Account Deletion: Always verify admin identity before deletion
- Logging: Be aware that verbose mode may log sensitive information
Troubleshooting
Common Issues
“DATABASE_URL must be provided”
- Set the DATABASE_URL environment variable or use the --database-url flag
“Failed to connect to database”
- Verify PostgreSQL is running and accessible
- Check connection string format and credentials
- Ensure the database exists
“Invalid admin role”
- Use exact role names: super_admin, moderator, customer_support, support_bot
- Role names are case-insensitive but must match supported variants
“Configuration validation failed”
- Check JSON syntax with a JSON validator
- Ensure all required fields are present
- Verify field types match expected schema
Getting Help
Use the built-in help system:
# General help
helium-cli --help
# Command-specific help
helium-cli admin --help
helium-cli admin create --help
Integration with Deployment Scripts
The CLI is designed to work well in automated deployment scenarios:
#!/bin/bash
set -e
# Set environment
export DATABASE_URL="$HELIUM_DB_URL"
export REDIS_URL="$HELIUM_REDIS_URL"
# Initialize configurations
echo "Initializing Helium configurations..."
helium-cli init-config
# Create admin account if it doesn't exist
echo "Creating admin account..."
helium-cli admin create \
--name "$ADMIN_NAME" \
--role super_admin \
--email "$ADMIN_EMAIL" || true
echo "Deployment initialization complete!"
This CLI tool is essential for proper Helium deployment and ongoing operational management. Use it as part of your deployment automation and regular maintenance procedures.
Migrate From SS-Panel UIM
This guide walks Helium operators through migrating an existing SS-Panel UIM deployment. The migration intentionally happens in two isolated passes so you can export data from the legacy MariaDB instance without touching the new Helium PostgreSQL database until you are ready.
At a high level:
1. mariadb-pass reads all data from the SS-Panel MariaDB schema and saves it to a local rkyv archive.
2. postgre-pass consumes that rkyv archive and writes normalized data into Helium's PostgreSQL schema.
Because Helium normally targets PostgreSQL, the first pass uses a dedicated crate that bundles the MySQL client driver and builds separately from the rest of the project.
What Gets Migrated
The migration transfers the following SS-Panel data into Helium’s schema:
User Accounts
- Email and password hashes (preserved as-is for seamless login)
- User names and registration timestamps
- Last active timestamps
- Account balances (available balance for purchasing)
- Referral relationships (affiliate ref_by links)
- Traffic usage (upload/download totals)
- VMess UUIDs (for node authentication)
- Subscribe tokens (subscription links)
- Invite codes (user-specific invite codes)
Helium creates corresponding entries in:
- auth.user_account (login credentials)
- auth.user_profile (profile metadata)
- shop.user_balance (financial data)
- market.affiliate_user_policy (referral relationships)
- telecom.user_nodes_token (node authentication tokens)
Products → Packages
SS-Panel products are converted to Helium packages with:
- Package name
- Price
- Duration (time allowance in days)
- Bandwidth quota
These populate the telecom.package table.
Orders → Package Queues
Historical purchase orders are replayed into Helium’s package queue system:
- Order status (activated vs. pending)
- Creation and update timestamps
- Associated product/package
Orders are inserted into telecom.package_queue to preserve user entitlements and purchase history.
Nodes → Node Servers & Clients
SS-Panel nodes are split into two Helium entities:
- Node servers (telecom.node_server): server address, rate, class
- Node clients (telecom.node_client): protocol configurations (VMess, WebSocket, gRPC)
Each node’s custom configuration (ports, security, network transport) is normalized to Helium’s node client schema.
Data Not Migrated
The following SS-Panel data is not migrated:
- Invoices (read but not written to Helium)
- Payback records (read but not written)
- Admin accounts (must be created manually via helium-cli)
- System configurations (initialize via helium-cli init-config)
- Announcements and tickets (start fresh in Helium)
Prerequisites
- SS-Panel UIM running on MariaDB (or MySQL-compatible) that you can access in read-only mode during export.
- A ready Helium PostgreSQL database with migrations applied and no production users yet. Run sqlx migrate run before importing.
- Adequate disk space wherever you write the rkyv archive. Expect several hundred megabytes for large installs.
- Rust toolchain (same as Helium) and network access to both databases from the machine performing the migration.
- Optional: a safe location (e.g., object storage) to back up the generated rkyv file.
Pass 1 – Export From SS-Panel (MariaDB)
The exporter lives in ssp-migrator/mariadb-pass and is compiled with SQLx’s MySQL feature set. Build and run it separately from the main server binaries.
Build the exporter
mariadb-pass uses SQLx’s compile-time query checking. The workspace ships with .sqlx caches for PostgreSQL only, so generic commands such as cargo build --release -p mariadb-pass will fail. You must compile from the crate directory with access to a live SS-Panel database (or export SQLx metadata for MariaDB manually).
cd ssp-migrator/mariadb-pass
SQLX_OFFLINE=false DATABASE_URL="$SSP_DATABASE_URL" cargo build --release
The DATABASE_URL environment variable is required during compilation so SQLx can introspect the MariaDB schema. If you cannot open a direct connection from the build host, generate SQLx data offline with sqlx prepare against MariaDB and commit it alongside the crate before building.
Prepare connection settings
You can pass the database URL directly on the command line or export it as an environment variable. A typical MariaDB connection string looks like:
export SSP_DATABASE_URL="mysql://user:password@legacy-host:3306/sspanel"
Run the exporter
cd ssp-migrator/mariadb-pass
SQLX_OFFLINE=false DATABASE_URL="$SSP_DATABASE_URL" cargo run --release -- \
--database-url "$SSP_DATABASE_URL" \
--output-file /tmp/helium-migration.rkyv
The command performs several steps internally:
- Streams each SS-Panel entity (users, products, orders, nodes, etc.) in batches.
- Normalizes relationships to Helium’s intermediate structs.
- Serializes the result to an rkyv archive (default name migration_data.rkyv).
Monitor the logs for warnings about rows that cannot be converted. The exporter skips invalid records but continues processing.
When the run finishes you should have an archive file similar to /tmp/helium-migration.rkyv. Back it up before moving on.
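One way to keep a safe copy is to upload the archive to object storage before running the import. A sketch, assuming the AWS CLI is installed; the bucket name is purely illustrative:
# Copy the export archive to a backup bucket (bucket name is a placeholder)
aws s3 cp /tmp/helium-migration.rkyv s3://your-backup-bucket/helium-migration.rkyv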
Pass 2 – Import Into Helium (PostgreSQL)
The importer lives in ssp-migrator/postgre-pass and understands Helium’s canonical schema. Ensure the target PostgreSQL database is empty or freshly provisioned to avoid collisions.
Build the importer
cargo build --release -p postgre-pass
This binary only links the PostgreSQL driver, so it compiles with the same workspace settings as other Helium components.
Prepare connection settings
export HELIUM_DATABASE_URL="postgres://helium:password@new-host:5432/helium_db"
Run the importer
cargo run --release -p postgre-pass -- \
--rkyv-file /tmp/helium-migration.rkyv \
--database-url "$HELIUM_DATABASE_URL"
The importer performs conversions aligned with Helium’s modules:
- Inserts node servers and clients in the correct dependency order.
- Creates packages, affiliate policies, balances, and user accounts.
- Replays historical purchases into the package queue so users retain entitlements.
If anything fails, no partial state is left behind—each insert group is committed in dependency order. Fix the reported data issue, rebuild the rkyv archive if necessary, and rerun the importer.
Post-migration Checklist
- Confirm the importer logs Migration completed successfully.
- Inspect a handful of migrated users in Helium's admin tools (profiles, balances, active packages); see the spot-check queries after this list.
- Verify node configurations in telecom match the expected SS-Panel node inventory.
- Schedule DNS cutover and client config updates after validating the new deployment.
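For a quick sanity check, row counts in a few of the target tables can be compared against the SS-Panel source. A sketch using psql and the table names listed earlier in this guide:
# Spot-check migrated row counts in the Helium database
psql "$HELIUM_DATABASE_URL" -c "SELECT count(*) FROM auth.user_account;"
psql "$HELIUM_DATABASE_URL" -c "SELECT count(*) FROM telecom.package;"
psql "$HELIUM_DATABASE_URL" -c "SELECT count(*) FROM telecom.package_queue;"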
Troubleshooting
- MariaDB TLS or authentication errors: confirm the MariaDB driver accepts your certificates or append parameters (e.g., ?ssl-mode=REQUIRED).
- Missing subscribe links or invite codes: the exporter requires these tables to be populated for each user. Reconcile data in SS-Panel before exporting.
- Importer stops on unique constraint violations: verify the PostgreSQL database is clean. Drop and recreate the schema, then rerun the importer.
- Large datasets: run the exporter on a machine close to the database to reduce latency. You can copy the resulting rkyv file to the environment where the importer runs.
With both passes complete, Helium now has a faithful copy of the SS-Panel data and you can proceed with normal deployment and cutover activities.