Helium
Helium is a modern commercial VPN SaaS system built with Rust, focusing on scalability, security, and user-friendliness.
Features
- Kubernetes/Docker native: stateless, horizontally scalable, and easy to deploy.
- High security: no shell execution, no deserialization vulnerabilities, and no SQL injection.
- Pluggable frontend: a full-featured gRPC API makes it easy to build your own frontend.
- Lightweight: as little as 40 MB of memory per service, handling 1000+ requests per second on a single 1-core CPU server.
- Advanced selling system: easily handles complex business strategies, even at very large user scale.
Tech Stack
- Rust: Memory-safe systems programming with C-level performance
- gRPC + Tonic: High-performance API with type-safe contracts
- PostgreSQL + SQLx: Reliable database with compile-time query validation
- Redis: Fast in-memory caching and session storage
- AMQP: Reliable message queuing for microservices
- Tokio: Async runtime for handling thousands of concurrent connections
Key Advantages:
- Microservices architecture with independent scaling
- Container-native design for Kubernetes deployment
- Memory safety eliminates entire classes of security vulnerabilities
- Exceptional performance: 1000+ RPS on single-core CPU with 40MB memory usage
Enterprise User Guide
This guide is for enterprises that run customer-facing VPN services on Helium. It is written for technical operators and support teams who understand coding, networking, and business operations, but do not need to read Helium source code.
Who should read this
- Product and operations owners
- Customer support and finance teams
- Platform administrators
- Technical onboarding staff
What this guide covers
The guide is organized by business flow, not by backend module:
- Onboard users and secure accounts
- Sell plans and process payments
- Deliver VPN access and manage package lifecycle
- Handle wallet, gift card, and referral operations
- Run customer support and announcements
- Operate admin workflows and infrastructure
How to use this guide
- Start with user flows if you are launching customer operations
- Use admin flows for internal team enablement
- Share specific pages with the teams that own each workflow
What you can do with Helium
- Run a full VPN SaaS business with user accounts, plans, payments, and support
- Offer multiple payment methods and internal wallet usage
- Control package access, node visibility, and policy by customer segment
- Give different permissions to operations, support, and admin teams
What you cannot do with this guide
- It does not replace security policy or compliance review for your organization
- It does not document source-level implementation details
- It does not include custom integration code for your internal systems
User Onboarding and Account Security
This flow explains how end users create accounts, sign in, recover access, and keep accounts secure.
What customers can do
- Register with email and password
- Register or sign in with supported OAuth providers
- Enable multi-factor authentication (MFA)
- Reset password by email link
- Keep sessions active across devices, then log out when needed
Recommended onboarding flow
```mermaid
flowchart TD
    start[UserStartsRegistration] --> sendEmail[RequestVerificationEmail]
    sendEmail --> emailLink[UserClicksMagicLink]
    emailLink --> setPassword[SetPasswordAndCreateAccount]
    setPassword --> autoLogin{AutoLoginEnabled}
    autoLogin -->|Yes| activeSession[SignedInSession]
    autoLogin -->|No| loginPage[GoToLogin]
    activeSession --> mfaSetup[OptionalMfaSetup]
    loginPage --> manualLogin[EmailOrOAuthLogin]
```
Registration
- User submits email (and optional referral code).
- System sends a time-limited link to that email.
- User opens the link, sets password, and completes account creation.
- If auto-login is enabled, the user is signed in immediately with a live session.
Sign-in options
- Email + password
- OAuth provider login
- MFA challenge when enabled
For enterprise UX, provide clear fallbacks: “Try another login method” and “Reset password.”
Password reset
- User requests reset email.
- User opens time-limited link.
- User sets new password.
- Existing sessions are invalidated for security.
This is expected behavior and should be explained on the reset success screen.
Session behavior
- Access remains active through access and refresh token lifecycle.
- Users can refresh sessions and continue without full re-login until refresh lifetime ends.
- Security-sensitive changes can require re-verification.
What customers cannot do
- Use expired or already-used email links
- Keep old sessions alive after a successful password reset
- Remove their last remaining login method (at least one method must remain)
- Bypass MFA once it is required for their account actions
Purchasing a Plan
This flow describes how customers browse plans, create orders, pay, and receive service.
What customers can do
- View plans available to their account tier
- Apply valid coupons before checkout
- Pay by supported gateway methods or account wallet
- Cancel unpaid orders
- Track order status until service delivery
End-to-end purchase flow
```mermaid
flowchart TD
    browse[BrowsePlans] --> coupon[OptionalApplyCoupon]
    coupon --> create[CreateOrder]
    create --> payMethod{ChoosePaymentMethod}
    payMethod -->|Gateway| gatewayPay[ExternalGatewayPayment]
    payMethod -->|Wallet| walletPay[WalletPayment]
    gatewayPay --> paid[OrderMarkedPaid]
    walletPay --> paid
    paid --> delivery[PackageDelivery]
    delivery --> active[ServiceBecomesActive]
```
Plan visibility
Customers only see plans that match their current eligibility (group access and sale status).
This allows the same platform to run multiple customer segments.
Coupon usage
- Coupon is checked when user previews it.
- Coupon is checked again at order creation.
- Final payable amount is locked at order creation.
If a coupon becomes invalid before order creation completes, checkout should fail with a clear message.
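These checks can be expressed as a small validation routine. The sketch below is illustrative only; the struct fields and function names are assumptions, not Helium's actual types:

```rust
// Illustrative coupon check mirroring the rules above: time window,
// usage limit, and minimum order amount. Field names are assumptions.
struct Coupon {
    valid_from: u64, // unix seconds
    valid_until: u64,
    uses: u32,
    max_uses: u32,
    min_amount_cents: u64,
}

fn coupon_valid(c: &Coupon, now: u64, order_amount_cents: u64) -> Result<(), &'static str> {
    if now < c.valid_from || now > c.valid_until {
        return Err("coupon outside its validity window");
    }
    if c.uses >= c.max_uses {
        return Err("coupon usage limit reached");
    }
    if order_amount_cents < c.min_amount_cents {
        return Err("order below coupon minimum amount");
    }
    Ok(())
}

fn main() {
    let c = Coupon { valid_from: 100, valid_until: 200, uses: 0, max_uses: 1, min_amount_cents: 500 };
    assert!(coupon_valid(&c, 150, 1_000).is_ok());  // inside window, above minimum
    assert!(coupon_valid(&c, 250, 1_000).is_err()); // expired
}
```

Running the same check at preview time and again at order creation, as described above, avoids surprising the customer at the last step.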
Payment paths
Gateway payment
- Customer is redirected to a payment page.
- System confirms callback from provider.
- Order becomes paid after verification.
Wallet payment
- Uses available wallet balance only.
- Payment and order update happen atomically.
- Best UX for repeat customers with balance.
Order lifecycle
- Unpaid: waiting for payment
- Paid: payment confirmed, waiting for service delivery
- Delivered: service package assigned
- Cancelled: unpaid order cancelled by user, admin, or timeout
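This lifecycle amounts to a small state machine. A sketch in Rust, with illustrative names rather than Helium's actual API:

```rust
// Hypothetical sketch of the order lifecycle described above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum OrderStatus {
    Unpaid,
    Paid,
    Delivered,
    Cancelled,
}

impl OrderStatus {
    /// Returns true if an order may move from `self` to `next`.
    fn can_transition_to(self, next: OrderStatus) -> bool {
        use OrderStatus::*;
        matches!(
            (self, next),
            (Unpaid, Paid)            // payment confirmed
                | (Unpaid, Cancelled) // user/admin cancel or timeout
                | (Paid, Delivered)   // package assigned
        )
    }
}

fn main() {
    assert!(OrderStatus::Unpaid.can_transition_to(OrderStatus::Paid));
    // Cancelled is terminal: a cancelled order can never be revived.
    assert!(!OrderStatus::Cancelled.can_transition_to(OrderStatus::Unpaid));
}
```

Note that Cancelled and Delivered are terminal states, which matches the rule below that a cancelled order cannot be recovered.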
Operational notes
- Unpaid orders are auto-cancelled after configured timeout.
- There is a limit on how many unpaid orders one user can hold.
- After successful payment, frontend should poll order status until delivered.
What customers cannot do
- Use coupons outside validity rules (time window, limits, minimum amount)
- Pay with wallet if available balance is insufficient
- Recover a cancelled order (must create a new order)
- Force immediate delivery if backend delivery queue is delayed
Using the VPN Service
This flow explains how customers receive and use VPN access after purchase.
What customers can do
- Get personal subscription links
- Import configs into supported proxy clients
- Use nodes allowed by their active package
- Queue future packages for automatic activation
- Monitor usage and package consumption status
Access flow
```mermaid
flowchart TD
    paidOrder[OrderPaid] --> queue[PackageAddedToQueue]
    queue --> activate{HasActivePackage}
    activate -->|No| firstActive[ActivateFirstPackage]
    activate -->|Yes| stayQueued[RemainInQueue]
    firstActive --> nodeAccess[NodeAccessEnabled]
    stayQueued --> laterActivate[AutoActivateWhenCurrentConsumed]
    nodeAccess --> subscribe[GenerateSubscribeConfig]
```
Subscription links
Each user has a unique subscription token used to fetch client configuration from a subscribe URL.
- URL can be opened directly in compatible clients
- Client type can be auto-detected or user-selected
- Generated config is filtered by user package permissions
Supported client families
- Clash ecosystem
- V2Ray-compatible clients
- Sing-box ecosystem
- Rule-based proxy clients such as Quantumult X, Loon, Surfboard, and Surge
Package lifecycle
- InQueue: purchased, waiting
- Active: currently providing service
- Consumed: completed by time or traffic usage
- Cancelled: removed due to cancellation/refund action
Only one package is active per user at a time. Additional packages wait and activate automatically in order.
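The one-active-package rule can be sketched as follows; the state names mirror the lifecycle above, but the function itself is illustrative, not Helium's implementation:

```rust
// Illustrative sketch: only one package is active at a time; queued
// packages activate in order once the current one is consumed.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PackageState {
    InQueue,
    Active,
    Consumed,
    Cancelled,
}

/// Activate the next queued package if nothing is currently active.
/// Returns the index of the newly activated package, if any.
fn activate_next(packages: &mut [PackageState]) -> Option<usize> {
    if packages.iter().any(|p| *p == PackageState::Active) {
        return None; // one-active-package invariant holds
    }
    let idx = packages.iter().position(|p| *p == PackageState::InQueue)?;
    packages[idx] = PackageState::Active;
    Some(idx)
}

fn main() {
    let mut q = vec![PackageState::Consumed, PackageState::InQueue, PackageState::InQueue];
    assert_eq!(activate_next(&mut q), Some(1)); // first queued package activates
    assert_eq!(activate_next(&mut q), None);    // one is already active
}
```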
Node access rules
Active package policy decides which nodes are visible to the customer.
If no package is active, node list and effective service access are empty.
What customers cannot do
- Use nodes outside their active package group permissions
- Activate multiple packages at the same time on one account
- Use subscription token from another account
- Keep access after package is fully consumed without another queued package
Wallet and Payments
This flow explains how customers use internal wallet balance, gift cards, and payment history.
What customers can do
- View available and frozen balances
- Redeem valid gift cards into wallet
- Pay eligible orders from available balance
- View balance change history for audit and support
Balance model
- Available balance: spendable for order payment
- Frozen balance: temporarily locked and not spendable
Both are shown as part of one wallet account per user.
Gift card redemption
- User enters gift card code.
- System validates card status (exists, unused, unexpired).
- Value is deposited into available balance.
- Card is marked redeemed and cannot be reused.
Paying orders with wallet
- User selects wallet payment.
- System checks available balance.
- Amount is deducted and order moves to paid status in one transaction.
- Payment record appears in balance history.
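The available/frozen split behaves roughly like this sketch (the struct and method names are illustrative assumptions, not Helium's API):

```rust
// Hypothetical wallet model mirroring the available/frozen split above.
#[derive(Debug, Default)]
struct Wallet {
    available: u64, // spendable, in cents
    frozen: u64,    // held, never spendable
}

impl Wallet {
    /// Pay an order from available balance only; frozen funds never spend.
    fn pay(&mut self, amount: u64) -> Result<(), &'static str> {
        if amount > self.available {
            return Err("insufficient available balance");
        }
        self.available -= amount;
        Ok(())
    }

    /// Move spendable funds into hold (an admin freeze operation).
    fn freeze(&mut self, amount: u64) -> Result<(), &'static str> {
        if amount > self.available {
            return Err("insufficient available balance");
        }
        self.available -= amount;
        self.frozen += amount;
        Ok(())
    }
}

fn main() {
    let mut w = Wallet { available: 1_000, frozen: 500 };
    assert!(w.pay(1_200).is_err()); // frozen funds cannot cover the gap
    assert!(w.pay(800).is_ok());
    assert_eq!(w.available, 200);
}
```

In the real system the deduction and the order status change happen in one database transaction, as noted above; this sketch only shows the balance arithmetic.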
Balance history
History is used for:
- customer transparency
- support troubleshooting
- finance reconciliation
Each record contains amount change, change type, reason, and timestamp.
What customers cannot do
- Spend frozen balance
- Redeem an expired or already-used gift card
- Partially bypass payment rules with insufficient available balance
- Directly edit or delete wallet history records
Referral Program
This flow explains how customers invite others and earn referral rewards.
What customers can do
- Create invite codes (within configured limits)
- Share invite codes with prospective users
- Earn rewards when referred users complete qualifying purchases
- Track referral performance and withdraw available rewards to wallet
Referral flow
```mermaid
flowchart TD
    code[CreateInviteCode] --> share[ShareCode]
    share --> signup[NewUserRegistersWithCode]
    signup --> purchase[ReferredUserCompletesQualifyingPayment]
    purchase --> reward[CommissionCalculated]
    reward --> stats[AffiliateStatsUpdated]
    stats --> withdraw[WithdrawToWallet]
```
Reward principles
- Reward is based on a configured commission rate.
- Each referred user can trigger rewards only up to a configured number of times.
- Rewards are tracked as available amount until withdrawn.
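Under these principles, the commission math might look like the following sketch; the basis-point rate and trigger limit are illustrative parameters, not Helium's configuration keys:

```rust
// Illustrative commission calculation: rate expressed in basis points
// (1/100 of a percent), capped by a per-referee trigger limit.
fn commission(payment_cents: u64, rate_bps: u64) -> u64 {
    payment_cents * rate_bps / 10_000
}

/// Reward for one qualifying payment, respecting the configured
/// maximum number of reward triggers per referred user.
fn reward_for(payment_cents: u64, rate_bps: u64, triggers_so_far: u32, max_triggers: u32) -> u64 {
    if triggers_so_far >= max_triggers {
        0 // referred user already hit the configured trigger limit
    } else {
        commission(payment_cents, rate_bps)
    }
}

fn main() {
    assert_eq!(commission(10_000, 1_000), 1_000); // 10% of $100.00
    assert_eq!(reward_for(10_000, 1_000, 3, 3), 0); // limit reached
}
```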
Withdrawal
When user withdraws referral rewards:
- System checks withdrawable amount.
- Referral stats are updated.
- Wallet available balance is credited.
This should appear in both referral records and wallet history.
What customers cannot do
- Create unlimited invite codes
- Earn commission from non-qualifying payments (for example, restricted methods per policy)
- Withdraw more than currently available reward
- Change historical referral relation records
Support and Notifications
This flow explains how customers communicate with support teams and receive platform announcements.
What customers can do
- Open support tickets with priority and description
- Continue two-way conversations with support staff
- Track ticket status changes
- Read targeted announcements
- Configure notification preferences (such as email categories)
Ticket lifecycle
```mermaid
flowchart TD
    open[UserCreatesTicket] --> waitingAdmin[StatusOpen]
    waitingAdmin --> adminReply[AdminReplies]
    adminReply --> waitingUser[StatusPending]
    waitingUser --> userReply[UserReplies]
    userReply --> waitingAdmin
    waitingAdmin --> resolved[AdminMarksResolved]
    resolved --> closed[AdminOrUserClosesTicket]
```
Ticket handling rules
- Ticket belongs to the user who opened it.
- Both sides can communicate while ticket is open or pending.
- Closed tickets are archived for workflow completion.
Announcements
Announcements are broadcast messages targeted by user segment and priority.
- Pinned announcements are shown first.
- Priority helps users identify urgency.
- Users only see announcements for groups they belong to.
Notification preferences
Users can control notification settings for available categories (for example: security, marketing, service reminders) based on your enterprise policy.
What customers cannot do
- Access tickets that belong to another user
- Continue sending messages on closed tickets
- See announcements outside their targeting scope
- Edit admin-authored support messages
Admin Getting Started
This flow explains how enterprises set up administrative access and role boundaries.
What admins can do
- Onboard administrators through invitation flow
- Authenticate with admin credentials and access tokens
- Assign role-based permissions
- Rotate and manage access credentials
- Audit administrative actions
Admin onboarding flow
```mermaid
flowchart TD
    invite[CreateAdminInvitation] --> accept[AdminAcceptsInvitation]
    accept --> register[AdminCompletesRegistration]
    register --> issueKey[CredentialIssued]
    issueKey --> login[AdminLogin]
    login --> access[UseRoleBasedAdminFunctions]
```
Role model (recommended baseline)
| Role | Typical responsibility | Write capability |
|---|---|---|
| SuperAdmin | Platform ownership and high-risk changes | Full |
| Moderator | Daily operations and catalog management | Broad |
| CustomerSupport | Customer issue handling | Limited |
| SupportBot | Automated read-oriented workflows | Minimal/None |
Enterprises should map these roles to internal SOPs before production launch.
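One way to encode the baseline table is an ordered write-capability check. The role names come from the table above; the enum and functions are illustrative, not Helium's actual RBAC types:

```rust
// Sketch of the baseline role model as a write-capability check.
// Deriving Ord gives the tier ordering: None < Limited < Broad < Full.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum WriteCapability {
    None,
    Limited,
    Broad,
    Full,
}

#[derive(Debug, Clone, Copy)]
enum AdminRole {
    SuperAdmin,
    Moderator,
    CustomerSupport,
    SupportBot,
}

fn write_capability(role: AdminRole) -> WriteCapability {
    match role {
        AdminRole::SuperAdmin => WriteCapability::Full,
        AdminRole::Moderator => WriteCapability::Broad,
        AdminRole::CustomerSupport => WriteCapability::Limited,
        AdminRole::SupportBot => WriteCapability::None,
    }
}

/// True if the role's capability tier covers the required tier.
fn can_perform(role: AdminRole, required: WriteCapability) -> bool {
    write_capability(role) >= required
}

fn main() {
    assert!(can_perform(AdminRole::Moderator, WriteCapability::Broad));
    assert!(!can_perform(AdminRole::SupportBot, WriteCapability::Limited));
}
```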
Good operational practices
- Use separate admin accounts per human operator.
- Rotate credentials on a fixed schedule.
- Keep support and platform ownership roles separate.
- Review audit records regularly.
What admins cannot do
- Use invitation links after they are consumed or expired
- Exceed role permissions assigned by policy
- Safely share one admin credential across multiple operators
- Skip audit and governance requirements for sensitive actions
Admin Product Management
This flow explains how enterprise teams manage plans, coupons, and gift-card operations.
What admins can do
- Build and maintain saleable plan catalog
- Update package offerings through controlled versioning
- Manage coupon campaigns and constraints
- Generate and distribute gift cards
- Keep existing customer entitlements stable during catalog updates
Package and product principles
- Products are what customers buy.
- Packages define service entitlements (traffic, duration, access scope).
- Package series allow new versions while preserving old purchased terms.
This protects existing subscribers from unexpected entitlement changes.
Product operation flow
```mermaid
flowchart TD
    design[DefineOffer] --> packageVersion[CreateOrUpdatePackageVersion]
    packageVersion --> bindProduct[BindProductToPackageSeries]
    bindProduct --> publish[PublishForSale]
    publish --> monitor[MonitorSalesAndUsage]
    monitor --> iterate[IterateNextVersion]
```
Coupon campaign management
Admins can configure:
- discount type (percentage or fixed amount)
- valid time window
- per-user and global usage limits
- active/inactive lifecycle
Use preview messaging in the frontend so customers understand why a coupon fails.
Gift card operations
- Bulk generation for campaigns
- Special-code generation for branded promotions
- Expiration control for liability management
- Support lookup for troubleshooting redemption issues
What admins cannot do
- Change already-delivered customer package terms retroactively
- Reuse a code that conflicts with a currently active coupon without first deactivating it
- Re-redeem single-use gift cards
- Treat product updates as immediate entitlement changes for historical orders
Admin Customer Operations
This flow explains daily customer-facing operations for support and operations teams.
What admins can do
- Search and inspect customer account state
- Ban or unban users according to policy
- Help recover account access (including MFA recovery operations)
- Review order and payment state for troubleshooting
- Apply controlled balance adjustments with documented reasons
- Perform manual order interventions where policy allows
Typical support workflow
```mermaid
flowchart TD
    issue[CustomerIssueReported] --> identify[FindUserAndValidateIdentity]
    identify --> diagnose[CheckAccountOrderWalletState]
    diagnose --> action{NeedAdminAction}
    action -->|No| guidance[ProvideCustomerGuidance]
    action -->|Yes| execute[ExecuteAuthorizedAdminOperation]
    execute --> audit[RecordAuditAndSupportNote]
    guidance --> close[CloseSupportCase]
    audit --> close
```
Core operations
Account controls
- Ban/unban user based on abuse, compliance, or security policy
- Remove blocked authentication factors during verified recovery
Wallet controls
- Deposit: credit balance
- Consume: deduct balance
- Freeze: move spendable funds into hold
- Unfreeze: release held funds
Every balance adjustment should include clear human-readable reason text.
Order controls
- Check unpaid/paid/delivered status
- Assist with payment confirmation disputes
- Use manual paid marking only under approved internal process
- Handle partial compensation using approved balance adjustment process
What admins cannot do
- Perform actions outside assigned role permissions
- Adjust funds without reason and audit trace
- Access or modify customer credentials directly
- Use manual interventions as a substitute for payment controls and reconciliation policy
Admin VPN Infrastructure Operations
This flow explains how infrastructure teams run node capacity, routing quality, and delivery readiness.
What admins can do
- Register and maintain node servers
- Configure node clients and protocol offerings
- Control node availability and maintenance state
- Monitor node health, usage, and quality trends
- Observe package queue delivery and activation behavior
Infrastructure operation flow
```mermaid
flowchart TD
    onboard[OnboardNodeServer] --> config[ConfigureNodeClientProfiles]
    config --> publish[ExposeEligibleNodesByGroup]
    publish --> monitor[MonitorHealthAndTraffic]
    monitor --> maintain[MaintenanceOrCapacityAdjustment]
    maintain --> monitor
```
Node operations
- Keep node metadata accurate (location, route class, capacity intent).
- Validate protocol configuration before publishing.
- Use maintenance states to protect customer experience during changes.
Access and delivery relationship
- Customer package policy controls which nodes users can access.
- Package queue health affects when customers become active on service.
- Infrastructure and commercial operations must coordinate release windows.
Observability checklist
- node online/offline trends
- abnormal traffic spikes
- package activation lag
- failed delivery or queue backlog
- regional quality degradation
What admins cannot do
- Expose nodes to customers without matching package access scope
- Ignore maintenance signaling during disruptive changes
- Treat node-client configuration and package policy as independent concerns
- Assume delivery is healthy without queue and activation monitoring
Microservices Architecture
Helium is built as a collection of focused microservices that cooperate through a shared set of contracts, messaging patterns, and observability tooling. This section introduces the high-level layout of the system, explains how the services interact, and highlights the infrastructure choices that enable the platform to scale for large commercial VPN deployments.
Architectural Goals
- Independent scaling – Each service can be deployed and scaled based on its workload characteristics (API traffic, background jobs, email throughput, etc.).
- Clear boundaries – Services expose well-defined APIs (gRPC, REST, AMQP events) and depend on shared libraries for cross-cutting concerns, ensuring that business logic remains isolated inside its module.
- Operational resiliency – Stateless services, database connection pooling, and message queues allow resilient deployments with graceful failure handling.
- Security by design – Rust, strict processor patterns, and zero shared mutable state within processes prevent memory safety issues and accidental privilege escalations.
Service Topology
The `helium-server` crate can run in multiple worker modes. Each mode is packaged into its own container image or deployment unit, providing a natural microservice boundary while reusing the same codebase and shared libraries.
| Worker | Entry Point | Responsibilities |
|---|---|---|
| `grpc` | `GrpcWorker` | Exposes gRPC APIs for all business domains (Auth, Manage, Telecom, Market, Shop, Support, etc.). Performs request validation, invokes the corresponding module services, and emits events. |
| `subscribe_api` | `SubscribeApiWorker` | Provides REST endpoints optimized for subscription clients. Primarily a read-heavy facade backed by Redis caching and the service layer. |
| `webhook_api` | `WebHookApiWorker` | Receives payment gateway callbacks and external partner webhooks, normalizes payloads, and dispatches workflow events. |
| `consumer` | `ConsumerWorker` | Listens on AMQP queues for asynchronous jobs (billing, node updates, provisioning) emitted by other services. Orchestrates long-running tasks that should not block API responses. |
| `mailer` | `MailerWorker` | Specialized consumer responsible for templated email delivery, retry management, and transactional messaging. |
| `cron_executor` | `CronWorker` | Periodically scans for scheduled work (subscription renewals, quota resets, health checks) and dispatches jobs via the same service layer used by the API workers. |
These workers are deployed independently and scaled according to throughput requirements. For example, a busy billing period can scale the consumer and cron workers without affecting the gRPC API footprint.
Domain-Oriented Modules
Each domain (Auth, Manage, Telecom, Shop, Market, Notification, Support, Shield, Mailer) is implemented as an independent module under `modules/`. Modules follow a common layout (entities, services, rpc, hooks, events) as described in the Project Structure Guide. Within the microservices architecture:
- Modules provide service layer processors that encapsulate business logic.
- RPC layers expose the processors through gRPC servers. The `GrpcWorker` aggregates these services and mounts them behind a single TLS termination point, while keeping module ownership intact.
- Hooks and events enable cross-module interactions without tight coupling, allowing, for instance, the Telecom module to emit usage events consumed by the billing logic in the Manage module.
Communication Patterns
Helium combines synchronous APIs with asynchronous messaging to balance latency and resiliency.
gRPC Contract
- Tonic-generated servers provide strongly typed interfaces for customer-facing and operator APIs.
- A uniform Processor trait ensures every RPC delegate is testable in isolation and can be reused by background workers.
- Service discovery is handled at the infrastructure layer (Kubernetes or Docker Compose) because workers are stateless; clients load-balance using standard mechanisms (Envoy, NGINX, etc.).
REST Facades
- Subscription and webhook workers expose lightweight REST routes via Axum.
- REST APIs reuse the same service processors, ensuring identical business behavior across protocols and simplifying versioning.
Asynchronous Messaging
- RabbitMQ (AMQP) is used to propagate domain events and dispatch background jobs.
- Producers append metadata (correlation IDs, tenant identifiers) to support observability and reliable retries.
- Consumers acknowledge messages only after successful processing, preventing data loss during failures.
Data Management
- PostgreSQL is the system of record. SQLx is used through the `DatabaseProcessor` abstraction to keep SQL isolated inside `entities/` modules and to support compile-time query checking.
- Redis provides ephemeral caches, session storage, and rate limiting. The `RedisConnection` wrapper from `helium-framework` manages pooled connections shared by API and worker processes.
- Consistent migrations live in the top-level `migrations/` directory and are applied during deployment. Services run with zero shared mutable state; all coordination happens through the database or message queues.
Observability & Operations
- Tracing is initialized in every worker with structured logs and span annotations. This enables distributed tracing across API and background workloads when combined with log collectors.
- Metrics exporters (e.g., Prometheus integration) can be attached at the deployment layer because each worker exposes a predictable Axum/Tonic server endpoint.
- Health probes: gRPC and REST workers perform dependency checks on startup (database, Redis, AMQP). Container orchestrators can use readiness/liveness probes to restart unhealthy instances.
Deployment Model
- Workers are packaged as lightweight containers (<50MB RSS) and designed to be horizontally scalable. Scaling policies are set per worker depending on CPU or queue length metrics.
- Configuration is provided through environment variables (`DATABASE_URL`, `MQ_URL`, `REDIS_URL`, `WORK_MODE`, etc.), making the platform 12-factor compliant.
- Infrastructure typically consists of:
- Kubernetes/Docker orchestrating the worker deployments
- Managed PostgreSQL and Redis services
- RabbitMQ cluster for messaging
- Optional CDN or reverse proxy terminating TLS before forwarding requests to gRPC/REST workers.
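As a concrete illustration, one worker's environment might be configured like this; all values are placeholders, and only the variable names mentioned above are taken from the document:

```shell
# Illustrative environment for a single gRPC worker; values are placeholders.
export WORK_MODE=grpc
export DATABASE_URL=postgres://helium:secret@db:5432/helium
export REDIS_URL=redis://redis:6379
export MQ_URL=amqp://guest:guest@rabbitmq:5672
```

Because each worker reads the same variables, switching a deployment unit from API serving to background consumption is a matter of changing `WORK_MODE`.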
Extensibility
Adding a new capability follows a repeatable pattern:
- Create or extend a module under `modules/` with the Processor-based service implementation.
- Expose the functionality via RPC/REST by wiring the service into the relevant worker.
- Emit domain events or enqueue background jobs when work must be processed asynchronously.
- Deploy the updated worker image; other workers continue functioning without redeployment because contracts are versioned explicitly.
This approach keeps Helium maintainable while providing the flexibility to grow with complex VPN SaaS requirements.
Helium Project Structure Guide
This document describes the modular architecture and organization of the Helium VPN SaaS system.
Project Overview
Helium is a modern VPN SaaS system built with Rust, organized as a workspace with multiple modules. The system follows a modular architecture where each module represents a specific business domain.
Module Architecture
Each module follows a consistent internal structure with standardized components:
1. Entity Layer (entities/)
Purpose: Data models and database access patterns
Structure:
```
entities/
├── mod.rs                # Module exports
├── db/                   # Database entity processors
│   ├── mod.rs
│   ├── user_account.rs   # User account queries/commands
│   └── ...
└── redis/                # Redis entity processors
    ├── mod.rs
    ├── session.rs        # Session cache operations
    └── ...
```
Key Patterns:
- Implements `Processor<Input, Result<Output, sqlx::Error>>` for `DatabaseProcessor`
- Contains all SQL queries and database operations
- Separated by storage backend (`db/` for PostgreSQL, `redis/` for Redis)
- No business logic: pure data access
2. Service Layer (services/)
Purpose: Business logic orchestration and workflows
Structure:
```
services/
├── mod.rs            # Service exports
├── manage.rs         # Management operations
├── user_account.rs   # User profile management
└── ...
```
Key Patterns:
- Implements `Processor<Input, Result<Output, Error>>` for service operations
- Orchestrates multiple entity operations
- Handles validation, transformation, and business rules
- No direct SQL: delegates to entity processors
- Uses `DatabaseProcessor` for data access
Example:
```rust
#[derive(Clone)]
pub struct UserManageService {
    pub db: sqlx::PgPool,
}

impl Processor<ListUsersRequest, Result<ListUsersResponse, Error>> for UserManageService {
    async fn process(&self, input: ListUsersRequest) -> Result<ListUsersResponse, Error> {
        let db = DatabaseProcessor::from_pool(self.db.clone());
        let users = db.process(ListUsers { /* ... */ }).await?;
        Ok(ListUsersResponse { users })
    }
}
```
3. gRPC Layer (rpc/)
Purpose: gRPC service implementations and external API
Structure:
```
rpc/
├── mod.rs              # RPC exports
├── auth_service.rs     # Authentication gRPC service
├── manage_service.rs   # Management gRPC service
├── middleware.rs       # gRPC middleware
└── ...
```
Key Patterns:
- Implements generated gRPC trait definitions
- Converts protobuf messages to service DTOs
- Delegates to the service layer via `Processor::process`
- Handles authentication and authorization
4. Hook System (hooks/)
Purpose: Event-driven side effects and integrations
Structure:
```
hooks/
├── mod.rs        # Hook exports
├── billing.rs    # Billing event hooks
├── register.rs   # Registration hooks
└── ...
```
Key Patterns:
- Responds to domain events
- Handles cross-module integrations
- Implements side effects (notifications, external API calls)
- Decoupled from main business flows
5. Event System (events/)
Purpose: Domain event definitions and publishing
Structure:
```
events/
├── mod.rs     # Event exports
├── user.rs    # User-related events
├── order.rs   # Order events
└── ...
```
Key Patterns:
- Defines domain events using message queue integration
- Publishes events for cross-module communication
- Enables audit trails and analytics
- Supports eventual consistency patterns
6. API Layer (api/)
Purpose: REST API endpoints and HTTP handlers
Structure:
```
api/
├── mod.rs         # API exports
├── subscribe.rs   # Subscription endpoints
└── xrayr/         # XrayR integration APIs
    ├── mod.rs
    └── ...
```
Key Patterns:
- Implements REST endpoints using Axum
- Handles HTTP-specific concerns (parsing, serialization)
- Delegates to service layer
- Provides alternative to gRPC for specific use cases
7. Cron Jobs (cron.rs)
Purpose: Scheduled tasks and background jobs
Key Patterns:
- Implements periodic maintenance tasks
- Handles cleanup operations
- Manages recurring billing cycles
- Executes system health checks
8. Testing (tests/)
Purpose: Integration and unit tests
Structure:
```
tests/
├── common/                # Test utilities
│   └── mod.rs             # Common test setup
├── service_name_test.rs   # Service integration tests
└── ...
```
Key Patterns:
- Integration tests for complete workflows
- Uses testcontainers for database testing
- Isolated test environments
- Comprehensive service testing
Module Configuration
Dependencies (Cargo.toml)
Each module declares:
- Workspace dependencies (shared versions)
- Inter-module dependencies
- Module-specific dependencies
- Dev dependencies for testing
- Build dependencies (typically `tonic-prost-build` for gRPC)
Build Configuration (build.rs)
Most modules include build scripts for:
- gRPC code generation from proto files
- Custom compilation steps
- Environment-specific builds
Module Entry Point (lib.rs)
Standard module structure:
```rust
#![forbid(clippy::unwrap_used)]
#![forbid(unsafe_code)]
#![deny(clippy::expect_used)]
#![deny(clippy::panic)]

pub mod config;
pub mod cron;
pub mod entities;
pub mod events;
pub mod hooks; // Optional
pub mod api;   // Optional
pub mod rpc;
pub mod services;
```
Protocol Buffers (proto/)
Organized by module with consistent naming:

```
proto/
├── auth/
│   ├── auth.proto      # Core auth services
│   ├── account.proto   # Account management
│   └── manage.proto    # Admin operations
├── telecom/
│   ├── telecom.proto   # VPN services
│   └── manage.proto    # Telecom management
└── ...
```
Patterns:
- Service definitions mirror module structure
- Consistent message naming conventions
- Shared types in common proto files
Key Architectural Principles
1. Processor Pattern
All APIs use the `kanau::processor::Processor` trait for consistent interfaces and composability.
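As a rough illustration, such a trait boils down to one generic entry point per input/output pair. This sketch is simplified and synchronous; the real `kanau` trait is async and may differ in signature:

```rust
// Simplified stand-in for a Processor-style trait: one input type,
// one output type, one entry point. The real kanau trait is async.
trait Processor<I, O> {
    fn process(&self, input: I) -> O;
}

// A toy processor demonstrating the pattern: infallible input,
// fallible output, no shared mutable state.
struct Doubler;

impl Processor<i32, Result<i32, String>> for Doubler {
    fn process(&self, input: i32) -> Result<i32, String> {
        input.checked_mul(2).ok_or_else(|| "overflow".to_string())
    }
}

fn main() {
    let d = Doubler;
    assert_eq!(d.process(21), Ok(42));
}
```

Because every entity, service, and RPC layer speaks the same `process` shape, layers compose by delegation, which is what makes the "entities → services → rpc" stacking described in this guide work.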
2. Separation of Concerns
- Entities: Data access only
- Services: Business logic only
- RPC/API: Protocol handling only
- Events/Hooks: Side effects only
3. Database Abstraction
Services never contain raw SQL - all database access goes through entity processors.
4. Event-Driven Architecture
Modules communicate via events to maintain loose coupling.
5. RBAC and Audit
Administrative operations implement consistent role-based access control and audit logging.
Development Guidelines
Adding a New Module
- Create the module directory under `modules/`
- Add a basic `Cargo.toml` with workspace dependencies
- Create `src/lib.rs` with the standard module structure
- Add the module to the workspace `Cargo.toml`
- Create proto definitions if gRPC services are needed
- Implement entities → services → rpc layers, in that order
Testing Strategy
- Unit tests for complex business logic in services
- Integration tests in the `tests/` directory
- Use `testcontainer-helium-modules` for database tests
- Mock external dependencies
- Test error handling paths
Documentation Standards
- Document all public APIs
- Include examples for complex workflows
- Maintain this guide as modules evolve
- Document breaking changes in module changelogs
This modular architecture enables independent development, testing, and deployment of features while maintaining system coherence through standardized patterns and interfaces.
helium-server Crate
The Helium server is designed as a multi-mode worker system that can run different components independently or together, enabling flexible deployment strategies. Each worker mode serves a specific purpose in the overall system architecture.
Architecture
Worker Modes
The server supports six distinct worker modes:
| Worker Mode | Port | Description | Use Case |
|---|---|---|---|
| grpc | 50051 | gRPC API server | Main API for client applications and admin panels |
| subscribe_api | 8080 | RESTful subscription API | Public subscription endpoints |
| webhook_api | 8081 | RESTful webhook handler | Payment provider callbacks, third-party integrations |
| consumer | - | Background message consumer | Processing async tasks from the message queue |
| mailer | - | Email service worker | Sending emails and notifications |
| cron_executor | - | Scheduled task executor | Running periodic maintenance tasks |
Dependencies
The server requires three core infrastructure components:
- PostgreSQL: Primary database for persistent data
- Redis: Caching, session storage, and temporary data
- RabbitMQ (AMQP): Message queuing for async processing
Module Integration
The server integrates all business logic modules:
- auth: Authentication and authorization
- shop: E-commerce and billing
- telecom: VPN node management and traffic handling
- market: Affiliate and marketing systems
- notification: Announcements and messaging
- support: Customer support tickets
- manage: Administrative functions
- shield: Security and anti-abuse measures
Deployment Guide
Prerequisites
- PostgreSQL, Redis, and RabbitMQ servers accessible
- SQLx CLI: `cargo install sqlx-cli --no-default-features --features postgres`
- Environment variables configured (see below)
Environment Configuration
The server is configured entirely through environment variables:
Required Variables
# Worker mode selection
WORK_MODE="grpc" # or subscribe_api, webhook_api, consumer, mailer, cron_executor
# Database connection
DATABASE_URL="postgres://user:password@localhost/helium_db"
# Message queue connection
MQ_URL="amqp://user:password@localhost:5672/"
# Redis connection
REDIS_URL="redis://localhost:6379"
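For local runs, the four required variables can be kept in an env file and exported in one step before launching a worker. This is a sketch: `helium.env` is an arbitrary filename, and the server itself only reads the process environment.

```shell
# Sketch: keep required variables in an env file and export them in one step.
# helium.env is an arbitrary filename chosen for this example.
cat > helium.env <<'EOF'
WORK_MODE=grpc
DATABASE_URL=postgres://helium:password@localhost/helium_db
MQ_URL=amqp://helium:password@localhost:5672/
REDIS_URL=redis://localhost:6379
EOF

set -a          # auto-export every variable assigned while sourcing
. ./helium.env
set +a

echo "$WORK_MODE"   # grpc
```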
Optional Variables
# Server listen address (for API workers)
LISTEN_ADDR="0.0.0.0:50051" # grpc mode default
LISTEN_ADDR="0.0.0.0:8080" # subscribe_api mode default
LISTEN_ADDR="0.0.0.0:8081" # webhook_api mode default
# Cron executor scan interval (seconds)
SCAN_INTERVAL="60" # cron_executor mode only
# OpenTelemetry Collector endpoint (optional, for observability)
OTEL_COLLECTOR="http://otel-collector:4317" # See Observability guide
Note: For comprehensive observability with distributed tracing and metrics, see the Observability with OpenTelemetry guide.
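The per-mode `LISTEN_ADDR` defaults above can be captured in a small entrypoint helper. This is purely a scripting convenience sketch; the server applies these defaults internally when the variable is unset.

```shell
# Sketch: derive the documented default LISTEN_ADDR for a given WORK_MODE.
default_listen_addr() {
  case "$1" in
    grpc)          echo "0.0.0.0:50051" ;;
    subscribe_api) echo "0.0.0.0:8080" ;;
    webhook_api)   echo "0.0.0.0:8081" ;;
    consumer|mailer|cron_executor) echo "" ;;  # background modes: no listener
    *) echo "unknown WORK_MODE: $1" >&2; return 1 ;;
  esac
}

default_listen_addr grpc   # 0.0.0.0:50051
```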
Database Migration
⚠️ CRITICAL: Database migrations must be run before starting the application.
# Install SQLx CLI
cargo install sqlx-cli --no-default-features --features postgres
# Apply all pending migrations
sqlx migrate run --database-url $DATABASE_URL
# Verify migration status
sqlx migrate info --database-url $DATABASE_URL
Basic Deployment
Running the Server
# Apply database migrations first
sqlx migrate run --database-url $DATABASE_URL
# Start the server with desired worker mode
WORK_MODE=grpc ./helium-server
Multiple Worker Deployment
For production, run different worker modes as separate processes:
# Terminal 1: Main gRPC API
WORK_MODE=grpc ./helium-server
# Terminal 2: Background consumer
WORK_MODE=consumer ./helium-server
# Terminal 3: Email worker
WORK_MODE=mailer ./helium-server
# Terminal 4: Cron jobs
WORK_MODE=cron_executor ./helium-server
Logging
The server uses structured logging:
# Enable debug logging
RUST_LOG=debug ./helium-server
# Production logging (default)
RUST_LOG=info ./helium-server
Developer Guide
Project Structure
server/
├── Cargo.toml # Dependencies and metadata
├── src/
│ ├── main.rs # Entry point and startup logic
│ ├── worker/ # Worker mode implementations
│ │ ├── mod.rs # Worker configuration and dispatch
│ │ ├── grpc.rs # gRPC server implementation
│ │ ├── consumer.rs # Background message consumer
│ │ ├── mailer.rs # Email service worker
│ │ ├── cron_executor.rs # Scheduled task executor
│ │ ├── subscribe_api.rs # Subscription REST API
│ │ └── webhook_api.rs # Webhook REST API
│ └── hooks/ # Extension points (currently unused)
│ └── mod.rs
Building from Source
# Development build
cd server
cargo build
# Release build (optimized)
cargo build --release
# Run with specific worker mode
WORK_MODE=grpc cargo run
Adding New Worker Modes
- Create worker implementation:
// src/worker/new_worker.rs
pub struct NewWorker {
// worker fields
}
impl NewWorker {
pub async fn initialize(args: YourArgs) -> anyhow::Result<Self> {
// initialization logic
}
pub async fn run(&self) -> anyhow::Result<()> {
// worker main loop
}
}
- Add to worker configuration:
// src/worker/mod.rs
pub enum WorkerArgs {
// existing variants...
NewWorker(YourArgs),
}
impl WorkerArgs {
pub fn load_from_env() -> anyhow::Result<Self> {
match work_mode.as_str() {
// existing modes...
"new_worker" => {
// parse environment variables
Ok(WorkerArgs::NewWorker(args))
}
}
}
pub async fn execute_worker(self) -> anyhow::Result<()> {
match self {
// existing modes...
WorkerArgs::NewWorker(args) => {
let worker = NewWorker::initialize(args).await?;
worker.run().await
}
}
}
}
gRPC Service Development
The gRPC worker automatically integrates all modules. To add new services:
- Implement your service in the appropriate module (e.g., `modules/your_module/`)
- Add it to the gRPC worker:
// src/worker/grpc.rs
impl GrpcWorker {
pub async fn initialize(args: GrpcWorkModeArgs) -> Result<Self, anyhow::Error> {
// ... existing initialization ...
let your_service = YourService::new(database_processor.clone());
Ok(Self {
// ... existing fields ...
your_service,
})
}
pub fn server_ready(self) -> Router<...> {
tonic::transport::server::Server::builder()
// ... existing services ...
.add_service(YourServiceServer::new(self.your_service))
}
}
Database Migrations
Database schema is managed through SQLx migrations in the migrations/ directory. When adding new features:
- Create migration files:
# Create new migration
sqlx migrate add your_feature_name
# This creates:
# migrations/TIMESTAMP_your_feature_name.up.sql
# migrations/TIMESTAMP_your_feature_name.down.sql
- Run migrations:
# Apply migrations
sqlx migrate run --database-url $DATABASE_URL
# Revert last migration
sqlx migrate revert --database-url $DATABASE_URL
Testing
# Run all tests
cargo test
# Run specific module tests
cargo test --package helium-server
# Integration tests with database
DATABASE_URL=postgres://user:password@localhost/test_db cargo test
Performance Considerations
- Memory Usage: Each worker typically uses 40-200MB RAM
- CPU Efficiency: Single-core performance optimized, can handle 1000+ RPS
- Connection Pooling: Database connections are shared across services
- Async Processing: All I/O operations are non-blocking
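Because database connections are pooled per worker process, it is worth checking that the combined pools of all workers stay below PostgreSQL's `max_connections`. A rough sketch follows; the per-worker pool size here is an assumed figure, so check your actual SQLx pool configuration for the real value.

```shell
# Sketch: estimate aggregate Postgres connections at peak.
replicas=6          # total worker processes across all modes
pool_size=10        # assumed connections per worker pool (verify in config)
max_connections=100 # Postgres server setting

total=$((replicas * pool_size))
if [ "$total" -ge "$max_connections" ]; then
  echo "WARNING: $total pooled connections may exhaust max_connections=$max_connections"
else
  echo "OK: peak $total of $max_connections connections"
fi
```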
Troubleshooting
Common Issues
Service won’t start:
# Check environment variables
env | grep -E "(DATABASE_URL|MQ_URL|REDIS_URL|WORK_MODE)"
# Verify database migrations are applied
sqlx migrate info --database-url $DATABASE_URL
Database migration issues:
# Check migration status
sqlx migrate info --database-url $DATABASE_URL
# Force apply migrations (if stuck)
sqlx migrate run --database-url $DATABASE_URL
# Revert last migration if needed
sqlx migrate revert --database-url $DATABASE_URL
# Reset database (CAUTION: destroys all data)
sqlx database reset --database-url $DATABASE_URL
Performance issues:
# Enable request tracing
RUST_LOG=helium_server=trace ./helium-server
# Profile with flamegraph
cargo flamegraph --bin helium-server
Logs and Debugging
# Debug logging
RUST_LOG=debug ./helium-server
# Trace specific modules
RUST_LOG=helium_server::worker::grpc=trace,info ./helium-server
Configuration Validation
Ensure all required environment variables are properly set:
# Validate configuration script
#!/bin/bash
set -e
echo "Validating Helium server configuration..."
# Check required variables
: "${WORK_MODE:?WORK_MODE not set}"
: "${DATABASE_URL:?DATABASE_URL not set}"
: "${MQ_URL:?MQ_URL not set}"
: "${REDIS_URL:?REDIS_URL not set}"
# Validate work mode
case "$WORK_MODE" in
grpc|subscribe_api|webhook_api|consumer|mailer|cron_executor)
echo "✓ Valid WORK_MODE: $WORK_MODE"
;;
*)
echo "✗ Invalid WORK_MODE: $WORK_MODE"
exit 1
;;
esac
# Check if migrations are applied
if command -v sqlx >/dev/null 2>&1; then
if sqlx migrate info --database-url "$DATABASE_URL" | grep -q "pending"; then
echo "⚠ Warning: Pending database migrations found"
echo "Run: sqlx migrate run --database-url $DATABASE_URL"
else
echo "✓ Database migrations are up to date"
fi
else
echo "⚠ Warning: sqlx CLI not found - cannot verify migrations"
echo "Install with: cargo install sqlx-cli --no-default-features --features postgres"
fi
echo "Configuration validation complete!"
External Dependencies
The Helium system requires several external services to function properly. The Helium application itself runs in Docker containers, but the core infrastructure dependencies (PostgreSQL, Redis, RabbitMQ) should be provisioned as external managed services for production deployments.
While some dependencies are core infrastructure requirements, others are module-specific and may be optional depending on your deployment configuration.
Core Infrastructure Dependencies
These dependencies are required for all Helium deployments:
1. PostgreSQL Database
Purpose: Primary data store for all application data Version: PostgreSQL 12+ recommended Configuration:
- Environment variable: `DATABASE_URL`
- Format: `postgres://user:password@host:port/database`
- Example: `postgres://helium:password@localhost:5432/helium_db`
Database Schema:
- ⚠️ CRITICAL: SQLx migrations must be run before starting the application
- All database schema changes are managed through SQLx migrations in the `/migrations` directory
- Use `sqlx migrate run --database-url $DATABASE_URL` to apply migrations
External Service Requirements:
- NOT containerized - PostgreSQL should run as an external managed service
- Recommended: Use cloud-managed PostgreSQL (AWS RDS, Google Cloud SQL, Azure Database, etc.)
- Alternative: Dedicated PostgreSQL server with proper backup and high availability setup
2. Redis
Purpose: Caching, session storage, and configuration store Version: Redis 6+ recommended Configuration:
- Environment variable: `REDIS_URL`
- Format: `redis://host:port` or `redis://user:password@host:port`
- Example: `redis://localhost:6379`
Usage:
- Session management and authentication tokens
- Configuration caching across modules
- Temporary data storage (OAuth challenges, etc.)
External Service Requirements:
- NOT containerized - Redis should run as an external managed service
- Recommended: Use cloud-managed Redis (AWS ElastiCache, Google Memorystore, Azure Cache, etc.)
- Alternative: Dedicated Redis server with persistence and clustering for production
3. RabbitMQ
Purpose: Message queue for asynchronous processing between modules Version: RabbitMQ 3.8+ recommended Configuration:
- Environment variable: `MQ_URL`
- Format: `amqp://user:password@host:port/`
- Example: `amqp://helium:password@localhost:5672/`
Usage:
- Inter-module communication
- Background job processing
- Event-driven architecture support
External Service Requirements:
- NOT containerized - RabbitMQ should run as an external managed service
- Recommended: Use cloud-managed message queues (AWS MQ, Google Cloud Pub/Sub, Azure Service Bus)
- Alternative: Dedicated RabbitMQ cluster with proper clustering and high availability
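Before starting any worker, the three connection strings can be sanity-checked against the documented URL schemes with a small script. This is a sketch, not part of Helium, and the example values are placeholders.

```shell
# Sketch: check each connection string uses the documented URL scheme.
check_scheme() { # usage: check_scheme NAME VALUE SCHEME
  case "$2" in
    "$3"://*) echo "ok: $1" ;;
    *) echo "bad: $1 ($2)" >&2; return 1 ;;
  esac
}

check_scheme DATABASE_URL "postgres://helium:password@localhost:5432/helium_db" postgres
check_scheme REDIS_URL    "redis://localhost:6379"                              redis
check_scheme MQ_URL       "amqp://helium:password@localhost:5672/"              amqp
```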
Module-Specific Dependencies
These dependencies are required only when using specific modules:
Auth Module - OAuth Providers (Optional)
Purpose: Social authentication (Google, Microsoft, GitHub, Discord) Required: Only if OAuth authentication is enabled Configuration: Stored in database/Redis configuration
Supported Providers:
- Google OAuth 2.0
- Microsoft Azure AD
- GitHub OAuth
- Discord OAuth
Setup Requirements:
- Create OAuth applications with each provider
- Configure redirect URIs to your Helium deployment
- Store client ID and secret in the system configuration
- Configure OAuth provider settings via the management interface
Configuration Structure:
{
"auth": {
"oauth_providers": {
"providers": [
{
"name": "google",
"client_id": "your-client-id",
"client_secret": "your-client-secret",
"redirect_uri": "https://your-domain.com/auth/oauth/callback"
}
],
"challenge_expiration": "5m"
}
}
}
Mailer Module - SMTP Server (Required for Email)
Purpose: Email delivery for user notifications, verification, etc. Required: When email functionality is needed Configuration: Stored in database/Redis configuration
SMTP Configuration:
{
"mailer": {
"host": "smtp.gmail.com",
"port": 587,
"username": "your-email@gmail.com",
"password": "your-app-password",
"sender": "noreply@your-domain.com",
"starttls": true
}
}
Supported SMTP Features:
- STARTTLS encryption
- Plain authentication
- Custom sender addresses
- HTML email templates
Common SMTP Providers:
- Gmail: `smtp.gmail.com:587` (requires app passwords)
- Outlook/Hotmail: `smtp-mail.outlook.com:587`
- SendGrid: `smtp.sendgrid.net:587`
- Mailgun: `smtp.mailgun.org:587`
- Amazon SES: `email-smtp.region.amazonaws.com:587`
Shop Module - Epay Payment Provider (Required for Payments)
Purpose: Payment processing for e-commerce functionality Required: When payment processing is needed Configuration: Stored in database as epay provider credentials
Epay Provider Setup:
- Register with an Epay-compatible payment provider
- Obtain merchant credentials (PID, Key, Merchant URL)
- Configure webhook endpoints for payment notifications
- Add provider credentials via the management interface
Supported Payment Methods:
- Alipay (`alipay`)
- WeChat Pay (`wxpay`)
- USDT cryptocurrency (`usdt`)
Configuration Requirements:
{
"shop": {
"epay_notify_url": "https://your-domain.com/api/shop/epay/callback",
"epay_return_url": "https://your-domain.com/payment/success",
"max_unpaid_orders": 5,
"auto_cancel_after": "30m"
}
}
Epay Provider Database Entry:
INSERT INTO epay_provider_credentials (
display_name,
enabled_channels,
key,
pid,
merchant_url
) VALUES (
'My Payment Provider',
ARRAY['alipay', 'wxpay'],
'your-merchant-key',
1234,
'https://pay.provider.com/submit.php'
);
Development Dependencies
These are required for building and developing the project:
Protocol Buffers Compiler
Purpose: Compiling .proto files for gRPC services
Installation:
- Ubuntu/Debian: `apt-get install protobuf-compiler`
- macOS: `brew install protobuf`
- Already included in the Docker build process
SQLx CLI
Purpose: Database migration management
Installation: cargo install sqlx-cli --no-default-features --features postgres
Usage:
- Apply migrations: `sqlx migrate run`
- Create a new migration: `sqlx migrate add <name>`
Docker/Kubernetes Deployment Considerations
What Should Be Containerized
✅ Containerize:
- Helium server application (`helium-server`)
- Application-specific components and workers
❌ Do NOT Containerize:
- PostgreSQL - Use external managed database services
- Redis - Use external managed cache services
- RabbitMQ - Use external managed message queue services
Infrastructure Handled by Platform
When deploying with Docker and Kubernetes, these infrastructure concerns are handled by the orchestration platform:
- Load Balancers: Kubernetes ingress controllers handle load balancing
- TLS Certificates: cert-manager or similar tools handle SSL/TLS
- Service Discovery: Kubernetes DNS handles service discovery
- Health Checks: Kubernetes probes handle application health monitoring
- Logging: Container runtime and logging drivers handle log aggregation
Recommended Managed Services by Cloud Provider
AWS:
- PostgreSQL: Amazon RDS for PostgreSQL
- Redis: Amazon ElastiCache for Redis
- RabbitMQ: Amazon MQ for RabbitMQ
Google Cloud:
- PostgreSQL: Cloud SQL for PostgreSQL
- Redis: Memorystore for Redis
- RabbitMQ: Cloud Pub/Sub (alternative) or third-party RabbitMQ
Azure:
- PostgreSQL: Azure Database for PostgreSQL
- Redis: Azure Cache for Redis
- RabbitMQ: Azure Service Bus (alternative) or third-party RabbitMQ
Environment Variables for Containers
apiVersion: apps/v1
kind: Deployment
metadata:
name: helium-server
spec:
template:
spec:
containers:
- name: helium-server
image: helium-server:latest
env:
- name: WORK_MODE
value: "grpc"
- name: DATABASE_URL
valueFrom:
secretKeyRef:
name: helium-secrets
key: database-url
- name: REDIS_URL
valueFrom:
secretKeyRef:
name: helium-secrets
key: redis-url
- name: MQ_URL
valueFrom:
secretKeyRef:
name: helium-secrets
key: rabbitmq-url
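The `helium-secrets` Secret referenced above is typically created with `kubectl create secret generic helium-secrets --from-literal=...`. The sketch below generates an equivalent manifest by hand, which makes one point visible: Secret values are only base64-encoded, not encrypted. Hostnames and credentials are placeholders.

```shell
# Sketch: generate the helium-secrets manifest referenced in the Deployment.
# Values are base64-encoded only; protect the manifest like a credential.
b64() { printf '%s' "$1" | base64; }

cat <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: helium-secrets
data:
  database-url: $(b64 "postgres://helium:password@db.internal:5432/helium_db")
  redis-url: $(b64 "redis://redis.internal:6379")
  rabbitmq-url: $(b64 "amqp://helium:password@mq.internal:5672/")
EOF
```

Using `kubectl create secret generic` directly avoids writing plaintext values into a file at all.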
Security Considerations
Credentials Management
- Never store credentials in plain text
- Use Kubernetes secrets or similar secure storage
- Rotate credentials regularly
- Use dedicated service accounts with minimal permissions
Network Security
- Database: Restrict access to application subnets only
- Redis: Enable authentication and restrict network access
- RabbitMQ: Use strong passwords and enable TLS
- SMTP: Use app passwords or OAuth tokens when available
OAuth Security
- Use HTTPS for all OAuth redirect URIs
- Validate redirect URI domains strictly
- Use state parameter for CSRF protection (handled automatically)
Troubleshooting
Database Connection Issues
# Test database connectivity
psql $DATABASE_URL -c "SELECT version();"
# Check migration status
sqlx migrate info --database-url $DATABASE_URL
Redis Connection Issues
# Test Redis connectivity
redis-cli -u $REDIS_URL ping
# Check Redis memory usage
redis-cli -u $REDIS_URL info memory
RabbitMQ Connection Issues
# Check queue status
rabbitmqctl list_queues
# Check connection status
rabbitmqctl list_connections
SMTP Testing
The mailer module provides test endpoints and logging to help diagnose SMTP issues. Check application logs for detailed SMTP connection and authentication errors.
Epay Integration Issues
- Verify webhook URLs are accessible from the internet
- Check payment provider’s callback logs
- Ensure merchant credentials are correctly configured
- Validate signature verification in callback processing
Optional Observability Stack
OpenTelemetry & Grafana Stack (Optional)
Purpose: Comprehensive observability with distributed tracing, metrics, and log aggregation
Required: No - completely optional enhancement
Configuration: OTEL_COLLECTOR environment variable
Components:
- OpenTelemetry Collector: Telemetry data collection and routing
- Grafana Tempo: Distributed tracing backend
- Prometheus: Metrics storage and querying
- Grafana Loki: Log aggregation
- Grafana: Unified visualization dashboard
When to Use:
- Production deployments requiring detailed performance analysis
- Multi-instance deployments needing distributed tracing
- Teams requiring centralized observability dashboards
- Troubleshooting complex performance issues
Deployment:
- NOT containerized with application - Deploy as separate Kubernetes workloads or use Grafana Cloud
- Recommended: Deploy Grafana stack in dedicated observability namespace
- Alternative: Use managed services (Grafana Cloud, Datadog, New Relic)
Note: Helium automatically falls back to basic structured logging if OpenTelemetry is not configured. See the comprehensive Observability with OpenTelemetry guide for full setup instructions.
Summary
| Dependency | Required | Purpose | Configuration | Deployment |
|---|---|---|---|---|
| PostgreSQL | Yes | Primary database | DATABASE_URL | External managed service |
| Redis | Yes | Caching/sessions | REDIS_URL | External managed service |
| RabbitMQ | Yes | Message queuing | MQ_URL | External managed service |
| SMTP Server | Conditional | Email delivery | Database config | External service |
| OAuth Providers | Optional | Social auth | Database config | External providers |
| Epay Provider | Conditional | Payment processing | Database config | External service |
| Observability | Optional | Tracing & metrics | OTEL_COLLECTOR | External stack/cloud |
Next Steps: After setting up these dependencies, proceed to the Helium Server Deployment Guide for detailed deployment instructions.
Observability with OpenTelemetry
Helium server includes optional OpenTelemetry (OTel) integration for comprehensive observability. This integration is completely optional — the server will work perfectly fine without it using basic structured logging.
What is OpenTelemetry?
OpenTelemetry provides distributed tracing, metrics collection, and contextual logging for production systems. Use it when:
- Running multiple worker instances requiring distributed tracing
- Need detailed performance analysis and troubleshooting
- Want centralized observability dashboards
Skip it for simple deployments, development environments, or when basic logging is sufficient.
Configuration
Enable OpenTelemetry by setting the OTEL_COLLECTOR environment variable:
export OTEL_COLLECTOR="http://otel-collector:4317"
./helium-server
If not set or initialization fails, the server automatically falls back to basic logging.
Service Names
Each worker mode reports with a distinct service name:
| Worker Mode | Service Name |
|---|---|
| grpc | Helium.grpc |
| subscribe_api | Helium.subscribe-api |
| webhook_api | Helium.webhook-api |
| consumer | Helium.consumer |
| mailer | Helium.mailer |
| cron_executor | Helium.cron-executor |
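If you label dashboards or alerts by worker, the mapping above can be reproduced in a small helper for your own tooling. This is a sketch; the server derives these names itself.

```shell
# Sketch: reproduce the worker-mode to OTel service-name mapping, e.g. for
# generating consistent dashboard or alert labels.
otel_service_name() {
  case "$1" in
    grpc|consumer|mailer) echo "Helium.$1" ;;
    subscribe_api)        echo "Helium.subscribe-api" ;;
    webhook_api)          echo "Helium.webhook-api" ;;
    cron_executor)        echo "Helium.cron-executor" ;;
    *) echo "unknown worker mode: $1" >&2; return 1 ;;
  esac
}

otel_service_name webhook_api   # Helium.webhook-api
```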
Recommended Stack: Grafana Observability
For production deployments, we recommend the Grafana observability stack — an open-source, Kubernetes-native solution with unified dashboards for traces, metrics, and logs.
Components
- OpenTelemetry Collector: Receives and routes telemetry
- Grafana Tempo: Distributed tracing storage
- Prometheus: Metrics collection
- Grafana Loki: Log aggregation
- Grafana: Unified visualization
Deployment
Deploy the Grafana stack alongside your Kubernetes cluster:
1. Add Helm Repositories
helm repo add grafana https://grafana.github.io/helm-charts
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo add open-telemetry https://open-telemetry.github.io/opentelemetry-helm-charts
helm repo update
2. Create Namespace
kubectl create namespace observability
3. Deploy OpenTelemetry Collector
Create otel-collector-values.yaml:
mode: deployment
config:
receivers:
otlp:
protocols:
grpc:
endpoint: 0.0.0.0:4317
http:
endpoint: 0.0.0.0:4318
processors:
batch:
timeout: 10s
send_batch_size: 1024
exporters:
# Traces to Tempo
otlp/tempo:
endpoint: tempo.observability.svc.cluster.local:4317
tls:
insecure: true
# Metrics to Prometheus
prometheus:
endpoint: 0.0.0.0:8889
namespace: helium
# Logs to Loki
loki:
endpoint: http://loki.observability.svc.cluster.local:3100/loki/api/v1/push
service:
pipelines:
traces:
receivers: [otlp]
processors: [batch]
exporters: [otlp/tempo]
metrics:
receivers: [otlp]
processors: [batch]
exporters: [prometheus]
logs:
receivers: [otlp]
processors: [batch]
exporters: [loki]
ports:
otlp-grpc:
enabled: true
containerPort: 4317
servicePort: 4317
protocol: TCP
otlp-http:
enabled: true
containerPort: 4318
servicePort: 4318
protocol: TCP
metrics:
enabled: true
containerPort: 8889
servicePort: 8889
protocol: TCP
helm install otel-collector open-telemetry/opentelemetry-collector \
  --namespace observability \
  --values otel-collector-values.yaml
4. Deploy Tempo, Loki, and Prometheus
# Tempo for traces
helm install tempo grafana/tempo \
--namespace observability \
--set tempo.receivers.otlp.protocols.grpc.endpoint=0.0.0.0:4317
# Loki for logs
helm install loki grafana/loki-stack \
--namespace observability \
--set loki.enabled=true \
--set promtail.enabled=false
5. Deploy Prometheus
helm install prometheus prometheus-community/kube-prometheus-stack \
--namespace observability \
--set grafana.enabled=false
6. Deploy Grafana
helm install grafana grafana/grafana \
--namespace observability \
--set adminPassword=changeme
Configure data sources in Grafana to connect Tempo, Prometheus, and Loki.
Troubleshooting
Server logs show “Failed to initialize OpenTelemetry”
Check that the OTel Collector is reachable at the configured endpoint. The server will automatically fall back to basic logging.
Missing traces in Grafana
Verify the data pipeline: Helium → OTel Collector → Tempo. Check logs at each stage.
Performance impact
OpenTelemetry adds minimal overhead: < 2% CPU, ~10-20MB memory, < 1ms latency per request.
Disabling OpenTelemetry
Simply unset the OTEL_COLLECTOR variable — the server automatically falls back to basic logging.
Summary
OpenTelemetry in Helium is completely optional:
- Set `OTEL_COLLECTOR` to enable; leave it unset to use basic logging
- Automatic fallback if initialization fails
- Recommended for production with multiple instances
- Grafana stack provides open-source, Kubernetes-native observability
For detailed Helm deployment configurations, refer to the official Grafana Helm charts documentation.
Health Checks for Kubernetes
Helium server provides HTTP health check endpoints designed for Kubernetes liveness and readiness probes. These endpoints run on a separate internal port (default: 9090) and are enabled for all worker modes.
Overview
Health checks help Kubernetes determine:
- Liveness: Is the container alive and should it be restarted if it becomes unresponsive?
- Readiness: Is the container ready to handle requests?
Helium implements both probe types on a dedicated HTTP server that runs alongside each worker mode.
Endpoints
Liveness Probe: /healthz
Returns 200 OK with a JSON response if the server is running:
{
"status": "ok"
}
This endpoint always returns success if the health check server is responding. Kubernetes uses this to determine if the container should be restarted.
Readiness Probe: /readyz
Checks connectivity to all dependencies before returning status:
Success Response (200 OK):
{
"status": "ok",
"database": "ok",
"redis": "ok",
"rabbitmq": "ok"
}
Failure Response (503 Service Unavailable):
{
"status": "error",
"database": "ok",
"redis": "error",
"rabbitmq": "ok",
"error": "Redis error: Connection refused"
}
The readiness probe checks:
- PostgreSQL: executes a simple query (`SELECT 1`)
- Redis: sends a `PING` command
- RabbitMQ: validates connection pool status
All worker modes check the same three dependencies.
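A minimal shell check against this endpoint might look like the following sketch, which greps the JSON body instead of requiring jq. In practice you would feed it the output of `curl -fsS http://localhost:9090/readyz`.

```shell
# Sketch: decide readiness from a /readyz JSON body without needing jq.
readyz_ok() { # usage: readyz_ok JSON_BODY
  printf '%s' "$1" | grep -q '"status": *"ok"'
}

body='{"status":"ok","database":"ok","redis":"ok","rabbitmq":"ok"}'
if readyz_ok "$body"; then echo ready; else echo not-ready; fi
```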
Configuration
Health Check Port
Set the HEALTH_CHECK_PORT environment variable to customize the port (default: 9090):
export HEALTH_CHECK_PORT=9090
This port should be:
- Internal only: Not exposed to external traffic
- Accessible by Kubernetes: For probe requests
- Different from main service ports: To avoid conflicts
Worker Modes
Health checks are available in all worker modes:
| Worker Mode | Main Port | Health Check Port | Dependencies Checked |
|---|---|---|---|
| grpc | 50051 | 9090 | Database, Redis, RabbitMQ |
| subscribe_api | 8080 | 9090 | Database, Redis, RabbitMQ |
| webhook_api | 8081 | 9090 | Database, Redis, RabbitMQ |
| consumer | N/A | 9090 | Database, Redis, RabbitMQ |
| mailer | N/A | 9090 | Database, Redis, RabbitMQ |
| cron_executor | N/A | 9090 | Database, Redis, RabbitMQ |
Kubernetes Deployment
Example Pod Configuration
Here’s how to configure health checks in your Kubernetes deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
name: helium-grpc
spec:
replicas: 3
selector:
matchLabels:
app: helium-grpc
template:
metadata:
labels:
app: helium-grpc
spec:
containers:
- name: helium-grpc
image: helium-server:latest
env:
- name: WORK_MODE
value: "grpc"
- name: LISTEN_ADDR
value: "0.0.0.0:50051"
- name: HEALTH_CHECK_PORT
value: "9090"
- name: DATABASE_URL
valueFrom:
secretKeyRef:
name: helium-secrets
key: database-url
- name: REDIS_URL
valueFrom:
secretKeyRef:
name: helium-secrets
key: redis-url
- name: MQ_URL
valueFrom:
secretKeyRef:
name: helium-secrets
key: mq-url
ports:
- name: grpc
containerPort: 50051
protocol: TCP
- name: health
containerPort: 9090
protocol: TCP
livenessProbe:
httpGet:
path: /healthz
port: health
initialDelaySeconds: 10
periodSeconds: 10
timeoutSeconds: 5
failureThreshold: 3
readinessProbe:
httpGet:
path: /readyz
port: health
initialDelaySeconds: 5
periodSeconds: 5
timeoutSeconds: 5
failureThreshold: 3
Probe Configuration Guidelines
Liveness Probe:
- `initialDelaySeconds`: 10-30 seconds (allow time for startup)
- `periodSeconds`: 10-30 seconds (check periodically)
- `timeoutSeconds`: 5 seconds
- `failureThreshold`: 3 (restart after 3 consecutive failures)
Readiness Probe:
- `initialDelaySeconds`: 5-10 seconds (faster than liveness)
- `periodSeconds`: 5-10 seconds (check more frequently)
- `timeoutSeconds`: 5 seconds
- `failureThreshold`: 3 (mark unready after 3 failures)
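When tuning these values, it helps to estimate the worst-case window before Kubernetes reacts: roughly periodSeconds times failureThreshold, plus one probe timeout. This is a rough sketch; exact timing depends on kubelet scheduling.

```shell
# Sketch: worst-case window before Kubernetes acts on a dead pod,
# roughly periodSeconds * failureThreshold + timeoutSeconds.
period=10; failures=3; timeout=5
worst_case=$((period * failures + timeout))
echo "liveness: up to ${worst_case}s before a restart is triggered"
```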
Service Configuration
For API worker modes (grpc, subscribe_api, webhook_api), configure a Service:
apiVersion: v1
kind: Service
metadata:
name: helium-grpc
spec:
type: ClusterIP
ports:
- name: grpc
port: 50051
targetPort: grpc
protocol: TCP
selector:
app: helium-grpc
Note: The health check port (9090) is not exposed in the Service. It’s only for Kubernetes probes.
Worker Mode Behavior
API Modes (grpc, subscribe_api, webhook_api)
For API modes, the health check server runs alongside the main API server:
- When the main server exits, the health check server is immediately terminated
- Process exits when either server fails
- Ensures no “zombie” containers serving health checks without handling requests
Background Worker Modes (consumer, mailer, cron_executor)
For background workers, the health check server runs continuously:
- Liveness probe confirms the worker process is alive
- Readiness probe ensures dependencies are accessible
- Worker loops indefinitely alongside health check server
Troubleshooting
Health Check Server Not Starting
Symptom: Probes fail immediately with connection errors
Solutions:
- Check logs for health check server errors
- Verify `HEALTH_CHECK_PORT` is not already in use
- Ensure the port is accessible within the pod
Readiness Probe Failing
Symptom: Pod remains in “Not Ready” state
Solutions:
- Check which dependency is failing in the `/readyz` response
- Verify connection strings (`DATABASE_URL`, `REDIS_URL`, `MQ_URL`)
- Ensure network policies allow pod access to dependencies
- Check if dependencies are healthy
Example debugging:
# Forward health check port to local machine
kubectl port-forward pod/helium-grpc-xyz 9090:9090
# Check readiness endpoint
curl http://localhost:9090/readyz
Liveness Probe Causing Restart Loop
Symptom: Pod repeatedly restarts with liveness probe failures
Solutions:
- Increase `initialDelaySeconds` (the worker may need more startup time)
- Increase `failureThreshold` (allow more failures before restart)
- Check whether the worker is deadlocked or stuck (examine logs before the restart)
Worker Exits But Pod Stays Running
Symptom: Container appears healthy but doesn’t process requests
This should not happen with the current implementation:
- API workers: Health check is aborted when main server exits
- Background workers: a return from `execute_worker()` causes process exit
If this occurs, file a bug report.
Security Considerations
Port Exposure
The health check port (9090) should never be exposed externally:
- Don’t create Ingress rules for health check endpoints
- Don’t expose the health check port in the Service definition
- Use network policies to restrict access to Kubernetes control plane only
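The network-policy restriction above can be sketched as a Kubernetes NetworkPolicy. This is an illustrative sketch, not a tested policy; label selectors vary by cluster, and kubelet-originated probes are typically exempt from NetworkPolicy, but verify that for your CNI before relying on it:

```yaml
# Sketch: allow only gRPC traffic to reach the pod, leaving the
# health check port (9090) unexposed to other workloads.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: helium-grpc-restrict
  namespace: helium-system
spec:
  podSelector:
    matchLabels:
      app: helium-grpc
  policyTypes:
    - Ingress
  ingress:
    - ports:
        - port: 50051
          protocol: TCP
```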
Sensitive Information
Health check responses contain minimal information:
- No version numbers
- No internal IPs or hostnames
- No authentication tokens
- Only dependency status (ok/error)
Error messages may contain connection details. Ensure logs are secured appropriately.
Best Practices
- Use separate ports: Never combine health checks with main service endpoints
- Set appropriate timeouts: Balance between quick detection and false positives
- Monitor probe metrics: Track probe success rates in your observability stack
- Test locally: Use port-forwarding to verify health checks before deployment
- Align with dependencies: If using a sidecar proxy (Istio, Linkerd), configure startup probes
Summary
Helium’s health check endpoints provide robust Kubernetes integration:
- Liveness probe (/healthz): Detects unresponsive containers
- Readiness probe (/readyz): Ensures dependencies are healthy
- Separate port (default 9090): Isolated from main services
- All worker modes: Consistent behavior across deployment types
- Process lifecycle: Ensures clean exits, no zombie containers
Configure these probes in your Kubernetes deployments to enable automatic recovery and load balancing.
Docker-based Deployment
The Helium system is designed with a multi-worker architecture that can be deployed using containers. Each worker type serves a specific purpose and has different scaling requirements. This deployment approach provides:
- Scalability: Independent scaling of different worker types based on load
- Reliability: Fault isolation between different services
- Flexibility: Easy deployment across different environments
- Maintainability: Simplified updates and rollbacks
Prerequisites
Before proceeding with this guide, ensure you have:
- External dependencies configured (see External Dependencies)
- Docker or container runtime installed
- Kubernetes cluster (for Kubernetes deployment)
- Basic understanding of containerization concepts
Container Architecture
Worker Types and Scaling Patterns
The Helium server supports six distinct worker modes, each with specific scaling characteristics:
| Worker Mode | Port | Scaling | Description |
|---|---|---|---|
| grpc | 50051 | ✅ Horizontal | Main gRPC API server - can be load balanced |
| subscribe_api | 8080 | ✅ Horizontal | RESTful subscription API - can be load balanced |
| webhook_api | 8081 | ✅ Horizontal | Webhook handler for payments - can be load balanced |
| consumer | - | ✅ Horizontal | Background message consumer - multiple instances supported |
| mailer | - | ⚠️ Single preferred | Email service - not recommended >1 instance |
| cron_executor | - | 🚫 Single only | Scheduled tasks - MUST be exactly 1 instance |
Scaling Constraints
⚠️ Critical Scaling Limitations
mailer Worker:
- Recommendation: Deploy as single instance only
- Reason: Relies on SMTP server connections and may cause email delivery issues with multiple instances
- Impact: Multiple mailer instances can lead to duplicate emails or SMTP rate limiting
cron_executor Worker:
- Requirement: MUST have exactly one instance
- Reason: Scans the database to check for scheduled tasks in the queue
- Impact: Multiple instances will cause duplicate task execution and potential data corruption
✅ Scalable Workers
API Workers (grpc, subscribe_api, webhook_api):
- Can be horizontally scaled based on traffic demands
- Support standard load balancing techniques
- Share state through external Redis and PostgreSQL
consumer Worker:
- Can run multiple instances for processing message queues
- Automatically distributes work through RabbitMQ
Docker Image
Building the Docker Image
The project includes a multi-stage Dockerfile optimized for production:
# Build the Docker image
docker build -t helium-server:latest .
# Tag for registry
docker tag helium-server:latest your-registry/helium-server:v1.0.0
# Push to registry
docker push your-registry/helium-server:v1.0.0
Image Characteristics
- Base Image:
gcr.io/distroless/ccfor minimal attack surface - Size: ~50MB final image
- Architecture: Multi-arch support (amd64, arm64)
- Security: Non-root user, minimal dependencies
Environment Variables
Configure containers using these environment variables:
# Required - Worker mode selection
WORK_MODE=grpc # grpc, subscribe_api, webhook_api, consumer, mailer, cron_executor
# Required - Database connections
DATABASE_URL=postgres://user:password@postgres-host:5432/helium_db
REDIS_URL=redis://redis-host:6379
MQ_URL=amqp://user:password@rabbitmq-host:5672/
# Optional - Server configuration
LISTEN_ADDR=0.0.0.0:50051 # For API workers
SCAN_INTERVAL=60 # For cron_executor only
RUST_LOG=info # Logging level
Docker Compose Deployment
For development or simple production setups:
version: "3.8"
services:
# Main gRPC API (scalable)
helium-grpc:
image: helium-server:latest
ports:
- "50051:50051"
environment:
WORK_MODE: grpc
DATABASE_URL: postgres://helium:password@postgres:5432/helium_db
REDIS_URL: redis://redis:6379
MQ_URL: amqp://helium:password@rabbitmq:5672/
LISTEN_ADDR: 0.0.0.0:50051
depends_on:
- postgres
- redis
- rabbitmq
restart: unless-stopped
deploy:
replicas: 2 # Can be scaled horizontally
# Subscription API (scalable)
helium-subscribe-api:
image: helium-server:latest
ports:
- "8080:8080"
environment:
WORK_MODE: subscribe_api
DATABASE_URL: postgres://helium:password@postgres:5432/helium_db
REDIS_URL: redis://redis:6379
MQ_URL: amqp://helium:password@rabbitmq:5672/
LISTEN_ADDR: 0.0.0.0:8080
depends_on:
- postgres
- redis
- rabbitmq
restart: unless-stopped
deploy:
replicas: 2 # Can be scaled horizontally
# Webhook API (scalable)
helium-webhook-api:
image: helium-server:latest
ports:
- "8081:8081"
environment:
WORK_MODE: webhook_api
DATABASE_URL: postgres://helium:password@postgres:5432/helium_db
REDIS_URL: redis://redis:6379
MQ_URL: amqp://helium:password@rabbitmq:5672/
LISTEN_ADDR: 0.0.0.0:8081
depends_on:
- postgres
- redis
- rabbitmq
restart: unless-stopped
deploy:
replicas: 2 # Can be scaled horizontally
# Background consumer (scalable)
helium-consumer:
image: helium-server:latest
environment:
WORK_MODE: consumer
DATABASE_URL: postgres://helium:password@postgres:5432/helium_db
REDIS_URL: redis://redis:6379
MQ_URL: amqp://helium:password@rabbitmq:5672/
depends_on:
- postgres
- redis
- rabbitmq
restart: unless-stopped
deploy:
replicas: 3 # Can run multiple instances
# Mailer service (single instance recommended)
helium-mailer:
image: helium-server:latest
environment:
WORK_MODE: mailer
DATABASE_URL: postgres://helium:password@postgres:5432/helium_db
REDIS_URL: redis://redis:6379
MQ_URL: amqp://helium:password@rabbitmq:5672/
depends_on:
- postgres
- redis
- rabbitmq
restart: unless-stopped
deploy:
replicas: 1 # SINGLE INSTANCE ONLY
# Cron executor (must be single instance)
helium-cron:
image: helium-server:latest
environment:
WORK_MODE: cron_executor
DATABASE_URL: postgres://helium:password@postgres:5432/helium_db
REDIS_URL: redis://redis:6379
MQ_URL: amqp://helium:password@rabbitmq:5672/
SCAN_INTERVAL: 60
depends_on:
- postgres
- redis
- rabbitmq
restart: unless-stopped
deploy:
replicas: 1 # MUST BE EXACTLY 1
# External dependencies (for development only)
postgres:
image: postgres:15
environment:
POSTGRES_USER: helium
POSTGRES_PASSWORD: password
POSTGRES_DB: helium_db
volumes:
- postgres_data:/var/lib/postgresql/data
ports:
- "5432:5432"
redis:
image: redis:7
ports:
- "6379:6379"
volumes:
- redis_data:/data
rabbitmq:
image: rabbitmq:3-management
environment:
RABBITMQ_DEFAULT_USER: helium
RABBITMQ_DEFAULT_PASS: password
ports:
- "5672:5672"
- "15672:15672"
volumes:
- rabbitmq_data:/var/lib/rabbitmq
volumes:
postgres_data:
redis_data:
rabbitmq_data:
Kubernetes Deployment
For production Kubernetes deployments:
Namespace and ConfigMap
apiVersion: v1
kind: Namespace
metadata:
name: helium-system
---
apiVersion: v1
kind: ConfigMap
metadata:
name: helium-config
namespace: helium-system
data:
RUST_LOG: "info"
SCAN_INTERVAL: "60"
Secrets
apiVersion: v1
kind: Secret
metadata:
name: helium-secrets
namespace: helium-system
type: Opaque
stringData:
database-url: "postgres://helium:password@postgres-service:5432/helium_db"
redis-url: "redis://redis-service:6379"
rabbitmq-url: "amqp://helium:password@rabbitmq-service:5672/"
gRPC API Deployment (Scalable)
apiVersion: apps/v1
kind: Deployment
metadata:
name: helium-grpc
namespace: helium-system
spec:
replicas: 3 # Can be scaled horizontally
selector:
matchLabels:
app: helium-grpc
template:
metadata:
labels:
app: helium-grpc
spec:
containers:
- name: helium-server
image: your-registry/helium-server:v1.0.0
ports:
- containerPort: 50051
env:
- name: WORK_MODE
value: "grpc"
- name: LISTEN_ADDR
value: "0.0.0.0:50051"
- name: DATABASE_URL
valueFrom:
secretKeyRef:
name: helium-secrets
key: database-url
- name: REDIS_URL
valueFrom:
secretKeyRef:
name: helium-secrets
key: redis-url
- name: MQ_URL
valueFrom:
secretKeyRef:
name: helium-secrets
key: rabbitmq-url
envFrom:
- configMapRef:
name: helium-config
resources:
requests:
memory: "128Mi"
cpu: "100m"
limits:
memory: "512Mi"
cpu: "500m"
livenessProbe:
tcpSocket:
port: 50051
initialDelaySeconds: 30
periodSeconds: 10
readinessProbe:
tcpSocket:
port: 50051
initialDelaySeconds: 5
periodSeconds: 5
---
apiVersion: v1
kind: Service
metadata:
name: helium-grpc-service
namespace: helium-system
spec:
selector:
app: helium-grpc
ports:
- port: 50051
targetPort: 50051
type: ClusterIP
Consumer Deployment (Scalable)
apiVersion: apps/v1
kind: Deployment
metadata:
name: helium-consumer
namespace: helium-system
spec:
replicas: 3 # Can run multiple instances
selector:
matchLabels:
app: helium-consumer
template:
metadata:
labels:
app: helium-consumer
spec:
containers:
- name: helium-server
image: your-registry/helium-server:v1.0.0
env:
- name: WORK_MODE
value: "consumer"
- name: DATABASE_URL
valueFrom:
secretKeyRef:
name: helium-secrets
key: database-url
- name: REDIS_URL
valueFrom:
secretKeyRef:
name: helium-secrets
key: redis-url
- name: MQ_URL
valueFrom:
secretKeyRef:
name: helium-secrets
key: rabbitmq-url
envFrom:
- configMapRef:
name: helium-config
resources:
requests:
memory: "256Mi"
cpu: "200m"
limits:
memory: "1Gi"
cpu: "1000m"
livenessProbe:
exec:
command:
- /bin/sh
- -c
- "ps aux | grep helium-server | grep -v grep"
initialDelaySeconds: 30
periodSeconds: 30
Mailer Deployment (Single Instance)
apiVersion: apps/v1
kind: Deployment
metadata:
name: helium-mailer
namespace: helium-system
spec:
replicas: 1 # SINGLE INSTANCE ONLY
strategy:
type: Recreate # Prevent multiple instances during updates
selector:
matchLabels:
app: helium-mailer
template:
metadata:
labels:
app: helium-mailer
spec:
containers:
- name: helium-server
image: your-registry/helium-server:v1.0.0
env:
- name: WORK_MODE
value: "mailer"
- name: DATABASE_URL
valueFrom:
secretKeyRef:
name: helium-secrets
key: database-url
- name: REDIS_URL
valueFrom:
secretKeyRef:
name: helium-secrets
key: redis-url
- name: MQ_URL
valueFrom:
secretKeyRef:
name: helium-secrets
key: rabbitmq-url
envFrom:
- configMapRef:
name: helium-config
resources:
requests:
memory: "128Mi"
cpu: "100m"
limits:
memory: "512Mi"
cpu: "500m"
Cron Executor Deployment (Singleton)
apiVersion: apps/v1
kind: Deployment
metadata:
name: helium-cron
namespace: helium-system
spec:
replicas: 1 # MUST BE EXACTLY 1
strategy:
type: Recreate # Ensure no overlap during updates
selector:
matchLabels:
app: helium-cron
template:
metadata:
labels:
app: helium-cron
spec:
containers:
- name: helium-server
image: your-registry/helium-server:v1.0.0
env:
- name: WORK_MODE
value: "cron_executor"
- name: DATABASE_URL
valueFrom:
secretKeyRef:
name: helium-secrets
key: database-url
- name: REDIS_URL
valueFrom:
secretKeyRef:
name: helium-secrets
key: redis-url
- name: MQ_URL
valueFrom:
secretKeyRef:
name: helium-secrets
key: rabbitmq-url
- name: SCAN_INTERVAL
value: "60"
envFrom:
- configMapRef:
name: helium-config
resources:
requests:
memory: "128Mi"
cpu: "50m"
limits:
memory: "256Mi"
cpu: "200m"
livenessProbe:
exec:
command:
- /bin/sh
- -c
- "ps aux | grep helium-server | grep -v grep"
initialDelaySeconds: 60
periodSeconds: 30
Horizontal Pod Autoscaler (HPA)
For scalable workers, configure automatic scaling:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: helium-grpc-hpa
namespace: helium-system
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: helium-grpc
minReplicas: 2
maxReplicas: 10
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 70
- type: Resource
resource:
name: memory
target:
type: Utilization
averageUtilization: 80
Load Balancer Configuration
Ingress for API Services
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: helium-ingress
namespace: helium-system
annotations:
nginx.ingress.kubernetes.io/backend-protocol: "GRPC"
cert-manager.io/cluster-issuer: "letsencrypt-prod"
spec:
tls:
- hosts:
- api.your-domain.com
secretName: helium-tls
rules:
- host: api.your-domain.com
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: helium-grpc-service
port:
number: 50051
Service Mesh Configuration
For advanced deployments with service mesh (Istio):
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
name: helium-grpc-vs
namespace: helium-system
spec:
hosts:
- api.your-domain.com
gateways:
- helium-gateway
http:
- match:
- uri:
prefix: /
route:
- destination:
host: helium-grpc-service
port:
number: 50051
weight: 100
fault:
delay:
percentage:
value: 0.1
fixedDelay: 5s
Database Migration
Database migrations must be run before starting any workers:
Migration Job
apiVersion: batch/v1
kind: Job
metadata:
name: helium-migration
namespace: helium-system
spec:
template:
spec:
containers:
- name: migration
image: your-registry/helium-server:v1.0.0
command: ["sqlx", "migrate", "run"]
env:
- name: DATABASE_URL
valueFrom:
secretKeyRef:
name: helium-secrets
key: database-url
restartPolicy: Never
backoffLimit: 3
Init Container for Workers
Add to all worker deployments:
spec:
template:
spec:
initContainers:
- name: wait-for-migration
image: postgres:15
command:
[
"sh",
"-c",
"until pg_isready -h postgres-service -p 5432; do echo waiting for database; sleep 2; done;",
]
env:
- name: DATABASE_URL
valueFrom:
secretKeyRef:
name: helium-secrets
key: database-url
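An alternative to the wait loop above is an init container that runs the migrations itself, mirroring the Migration Job's command. A sketch, assuming the sqlx CLI is available in the server image as the Migration Job above implies:

```yaml
# Sketch: run migrations from an init container instead of a separate Job.
# Assumes the image ships the sqlx CLI, as the Migration Job above does.
initContainers:
  - name: run-migrations
    image: your-registry/helium-server:v1.0.0
    command: ["sqlx", "migrate", "run"]
    env:
      - name: DATABASE_URL
        valueFrom:
          secretKeyRef:
            name: helium-secrets
            key: database-url
```

With multiple replicas, several init containers may attempt the migration concurrently; sqlx serializes migration runs with a database lock, but the single Job approach keeps that responsibility in one place.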
Monitoring and Observability
Health Checks
Configure appropriate health checks for each worker type:
# For API workers (gRPC, REST)
livenessProbe:
tcpSocket:
port: 50051
initialDelaySeconds: 30
periodSeconds: 10
# For background workers (consumer, mailer, cron)
livenessProbe:
exec:
command:
- /bin/sh
- -c
- "ps aux | grep helium-server | grep -v grep"
initialDelaySeconds: 30
periodSeconds: 30
Logging Configuration
env:
- name: RUST_LOG
value: "info,helium_server=debug" # Adjust as needed
Metrics Collection
Use Prometheus for metrics collection:
apiVersion: v1
kind: Service
metadata:
name: helium-metrics
namespace: helium-system
annotations:
prometheus.io/scrape: "true"
prometheus.io/port: "8080"
prometheus.io/path: "/metrics"
spec:
selector:
app: helium-grpc
ports:
- port: 8080
name: metrics
Troubleshooting
Common Issues
Pod Crash Loop:
# Check logs
kubectl logs -n helium-system deployment/helium-grpc
# Check events
kubectl get events -n helium-system --sort-by='.metadata.creationTimestamp'
# Verify environment variables
kubectl exec -n helium-system deployment/helium-grpc -- env | grep -E "(DATABASE_URL|REDIS_URL|MQ_URL)"
Multiple Cron Executors:
# Check for multiple cron instances (should show only 1)
kubectl get pods -n helium-system -l app=helium-cron
# Check cron logs for conflicts
kubectl logs -n helium-system -l app=helium-cron --tail=100
Database Connection Issues:
# Test database connectivity
kubectl run -i --tty --rm debug --image=postgres:15 --restart=Never -- \
psql postgresql://user:password@postgres-service:5432/helium_db -c "SELECT version();"
# Check migration status
kubectl exec -n helium-system deployment/helium-grpc -- \
sqlx migrate info --database-url $DATABASE_URL
Performance Tuning
Resource Limits:
- API workers: 200-500m CPU, 256Mi-1Gi RAM per pod
- Consumer workers: 500m-1 CPU, 512Mi-2Gi RAM per pod
- Mailer/Cron: 100-200m CPU, 128-512Mi RAM per pod
Scaling Guidelines:
- Start with 2-3 replicas for API workers
- Scale consumers based on message queue depth
- Monitor CPU/memory usage and adjust limits accordingly
External Dependencies
Refer to the External Dependencies Guide for detailed information about:
- PostgreSQL setup and configuration
- Redis configuration and clustering
- RabbitMQ setup and management
- SMTP server configuration
- OAuth provider setup
- Payment provider integration
Configuration Management
Refer to the Configuration Guide for:
- Environment variable reference
- Configuration file formats
- Runtime configuration updates
- Security best practices
Next Steps
After successful deployment:
- Configure monitoring and alerting
- Set up backup procedures for stateful data
- Implement CI/CD pipelines for automated deployments
- Configure log aggregation and analysis
- Plan disaster recovery procedures
For specific configuration details, see the Helium Server Configuration guide.
Configuration Guide
This document provides comprehensive configuration information for operators deploying the Helium project. The system uses a combination of environment variables for server configuration and JSON configurations stored in the database for module-specific settings.
Environment Variables
The Helium server is configured entirely through environment variables. These control the server behavior and connectivity to external services.
Required Environment Variables
All worker modes require these variables:
# Worker mode selection (REQUIRED)
WORK_MODE="grpc" # Options: grpc, subscribe_api, webhook_api, consumer, mailer, cron_executor
# Database connection (REQUIRED)
DATABASE_URL="postgres://user:password@localhost:5432/helium_db"
# Redis connection (REQUIRED)
REDIS_URL="redis://localhost:6379"
# RabbitMQ connection (REQUIRED)
MQ_URL="amqp://user:password@localhost:5672/"
Worker Mode Options
| Worker Mode | Port | Description | Use Case |
|---|---|---|---|
| grpc | 50051 | gRPC API server | Main API for client applications and admin panels |
| subscribe_api | 8080 | RESTful subscription API | Public subscription endpoints |
| webhook_api | 8081 | RESTful webhook handler | Payment provider callbacks, third-party integrations |
| consumer | - | Background message consumer | Processing async tasks from message queue |
| mailer | - | Email service worker | Sending emails and notifications |
| cron_executor | - | Scheduled task executor | Running periodic maintenance tasks |
Optional Environment Variables
# Server listen addresses (for API workers)
LISTEN_ADDR="0.0.0.0:50051" # Default for grpc mode
LISTEN_ADDR="0.0.0.0:8080" # Default for subscribe_api mode
LISTEN_ADDR="0.0.0.0:8081" # Default for webhook_api mode
# Cron executor configuration
SCAN_INTERVAL="60" # Scan interval in seconds (cron_executor mode only)
# Logging configuration
RUST_LOG="info" # Options: error, warn, info, debug, trace
Module Configurations
Module configurations are stored as JSON in the PostgreSQL database in the application__config table. Each module has its own configuration key and JSON structure.
Note: All duration values are represented as strings containing the number of seconds (e.g., "300" for 5 minutes, "1800" for 30 minutes).
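Because these durations arrive as strings rather than numbers, any tooling you write around the configs needs a small conversion step. A hypothetical helper (not part of Helium's API) in Rust:

```rust
use std::time::Duration;

// Hypothetical helper: convert a second-count string such as "300"
// into a Duration. Returns None for non-numeric input.
fn parse_seconds(s: &str) -> Option<Duration> {
    s.parse::<u64>().ok().map(Duration::from_secs)
}

fn main() {
    // "1800" in a config means 30 minutes.
    assert_eq!(parse_seconds("1800"), Some(Duration::from_secs(1800)));
    assert_eq!(parse_seconds("not-a-number"), None);
    println!("ok");
}
```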
Auth Module (auth)
Key: "auth"
The authentication module handles user registration, login, JWT tokens, and OAuth providers.
{
"email_provider": {
"register_domain": {
"enable_white_list": false,
"white_list": [],
"enable_black_list": false,
"black_list": []
},
"otp_expire_after": "300",
"delete_otp_before": "7200",
"magic_link_expire_after": "1800",
"magic_link_delete_before": "14400",
"resend_interval": "30"
},
"jwt": {
"secret": "your-jwt-secret-key-32-characters-long",
"refresh_token_expiration": "2592000",
"access_token_expiration": "900",
"issuer": "https://your-domain.com",
"access_audience": "helium_cloud",
"refresh_audience": "helium_cloud_auth"
},
"oauth_providers": {
"providers": [
{
"name": "Google",
"client_id": "your-google-client-id",
"client_secret": "your-google-client-secret",
"redirect_uri": "https://your-domain.com/auth/oauth/google/callback"
},
{
"name": "GitHub",
"client_id": "your-github-client-id",
"client_secret": "your-github-client-secret",
"redirect_uri": "https://your-domain.com/auth/oauth/github/callback"
}
],
"challenge_expiration": "300"
}
}
Configuration Details:
- email_provider.register_domain: Controls which email domains are allowed for registration
- email_provider.otp_expire_after: How long OTP codes remain valid (in seconds, default: “300” = 5 minutes)
- email_provider.resend_interval: Minimum time between resend attempts (in seconds, default: “30” = 30 seconds)
- jwt.secret: CRITICAL: Must be a secure random string for production
- jwt.*_expiration: Token lifetime settings (in seconds, default: “2592000” = 30 days for refresh, “900” = 15 minutes for access)
- oauth_providers.providers: List of OAuth providers with their credentials
Telecom Module (telecom)
Key: "telecom"
The telecom module manages VPN nodes, subscription links, and proxy synchronization.
{
"node_health_check": {
"offline_timeout": "600"
},
"subscribe_link": {
"endpoints": [
{
"url_template": "https://subscribe.your-domain.com/subscribe/{SUBSCRIBE_TOKEN}",
"endpoint_name": "primary"
},
{
"url_template": "https://backup.your-domain.com/subscribe/{SUBSCRIBE_TOKEN}",
"endpoint_name": "backup"
}
]
},
"uni_proxy_sync": {
"push_interval": "30",
"pull_interval": "60"
},
"vpn_server_token": "secure-random-token-for-vpn-servers"
}
Configuration Details:
- node_health_check.offline_timeout: Time before marking nodes as offline (in seconds, default: “600” = 10 minutes)
- subscribe_link.endpoints: List of subscription endpoints for client configuration
- uni_proxy_sync.push_interval: How often to push traffic data (in seconds, default: “30” = 30 seconds)
- uni_proxy_sync.pull_interval: How often to pull user info (in seconds, default: “60” = 1 minute)
- vpn_server_token: CRITICAL: Secure token for VPN server authentication
Shop Module (shop)
Key: "shop"
The shop module handles e-commerce functionality, orders, and payment processing.
{
"max_unpaid_orders": 5,
"auto_cancel_after": "1800",
"epay_notify_url": "https://your-domain.com/api/webhook/epay/notify",
"epay_return_url": "https://your-domain.com/payment/success"
}
Configuration Details:
- max_unpaid_orders: Maximum unpaid orders per user (default: 5)
- auto_cancel_after: Time before auto-canceling unpaid orders (in seconds, default: “1800” = 30 minutes)
- epay_notify_url: REQUIRED: Server-to-server notification endpoint for payment providers
- epay_return_url: REQUIRED: User return URL after payment completion
Mailer Module (mailer)
Key: "mailer"
The mailer module handles email delivery through SMTP.
{
"host": "smtp.gmail.com",
"port": 587,
"username": "your-smtp-username",
"password": "your-smtp-password",
"sender": "noreply@your-domain.com",
"starttls": true
}
Configuration Details:
- host: SMTP server hostname
- port: SMTP server port (typically 587 for STARTTLS, 465 for SSL)
- username/password: SMTP authentication credentials
- sender: Email address used as sender
- starttls: Enable STARTTLS encryption (recommended: true)
Admin Management Module (admin-jwt)
Key: "admin-jwt"
Controls JWT tokens for administrative access.
{
"secret": "admin-jwt-secret-key-32-characters-long",
"token_expiration": "864000",
"issuer": "https://admin.your-domain.com",
"audience": "HeliumAdmin"
}
Configuration Details:
- secret: CRITICAL: Secure secret for admin JWT signing
- token_expiration: Admin token lifetime (in seconds, default: “864000” = 10 days)
- issuer: JWT issuer for admin tokens
- audience: JWT audience for admin tokens
Market Module (affiliate)
Key: "affiliate"
Controls the affiliate marketing system.
{
"max_invite_code_per_user": 10,
"default_reward_rate": "0.1",
"default_trigger_time_per_user": 3
}
Configuration Details:
- max_invite_code_per_user: Maximum invite codes per user (default: 10)
- default_reward_rate: Default affiliate commission rate (default: 10%)
- default_trigger_time_per_user: Required referrals before earning (default: 3)
Infrastructure Dependencies
PostgreSQL Database
Required Version: PostgreSQL 12+
Configuration:
- Environment variable: DATABASE_URL
- Format: postgres://user:password@host:port/database
Important Notes:
- ⚠️ CRITICAL: Run migrations before starting: sqlx migrate run --database-url $DATABASE_URL
- Use an external managed PostgreSQL service for production (AWS RDS, Google Cloud SQL, etc.)
- Ensure proper backup and high availability configuration
Redis
Required Version: Redis 6+
Configuration:
- Environment variable: REDIS_URL
- Format: redis://host:port or redis://user:password@host:port
Usage:
- Session storage and authentication tokens
- Module configuration caching
- Temporary data (OAuth challenges, OTP codes)
RabbitMQ (AMQP)
Configuration:
- Environment variable: MQ_URL
- Format: amqp://user:password@host:port/
Usage:
- Asynchronous task processing
- Email sending queue
- Inter-module communication
Configuration Templates
Development Environment
# .env file for development
WORK_MODE=grpc
DATABASE_URL=postgres://helium:password@localhost:5432/helium_dev
REDIS_URL=redis://localhost:6379
MQ_URL=amqp://guest:guest@localhost:5672/
LISTEN_ADDR=0.0.0.0:50051
RUST_LOG=debug
Production Environment
# Production environment variables
WORK_MODE=grpc
DATABASE_URL=postgres://helium_user:secure_password@db.example.com:5432/helium_prod
REDIS_URL=redis://redis.example.com:6379
MQ_URL=amqp://helium_user:secure_password@mq.example.com:5672/
LISTEN_ADDR=0.0.0.0:50051
RUST_LOG=info
Multi-Worker Deployment
For production, run multiple worker processes:
# API Server (can be scaled horizontally)
WORK_MODE=grpc ./helium-server &
# Background Tasks (can be scaled)
WORK_MODE=consumer ./helium-server &
# Email Processing (single instance recommended)
WORK_MODE=mailer ./helium-server &
# Scheduled Tasks (MUST be single instance)
WORK_MODE=cron_executor ./helium-server &
Configuration Management
To update module configurations:
- Via Database: Insert/update records in the application__config table
- Via Admin API: Use the management gRPC API to update configurations
- Configuration Sync: The system automatically syncs configurations from PostgreSQL to Redis cache
Example SQL for updating auth configuration:
INSERT INTO application__config (key, content)
VALUES ('auth', '{"jwt": {"secret": "new-secret"}, ...}')
ON CONFLICT (key) DO UPDATE SET
content = EXCLUDED.content,
updated_at = NOW();
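To inspect the configuration currently in effect, the same table can be queried directly. A read-only sketch against the application__config table referenced above:

```sql
-- Inspect the stored auth configuration (read-only).
SELECT key, content, updated_at
FROM application__config
WHERE key = 'auth';
```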
Security Considerations
⚠️ Critical Configuration Security:
- JWT Secrets: Use cryptographically secure random strings (32+ characters)
- VPN Server Token: Generate secure random tokens for server authentication
- Database Credentials: Use strong passwords and restrict database access
- SMTP Credentials: Use application-specific passwords, not primary account passwords
- OAuth Secrets: Keep OAuth client secrets secure and rotate them regularly
Troubleshooting
Common Configuration Issues
- Database Connection: Verify PostgreSQL accessibility and credentials
- Redis Connection: Check Redis server status and network connectivity
- RabbitMQ Connection: Ensure RabbitMQ server is running and accessible
- Email Delivery: Test SMTP configuration with your email provider
- OAuth Issues: Verify client IDs, secrets, and redirect URIs match provider settings
Validation Commands
# Test database connection
sqlx migrate info --database-url $DATABASE_URL
# Test Redis connection
redis-cli -u $REDIS_URL ping
# Test RabbitMQ connection
rabbitmqctl status # on RabbitMQ server
Helium CLI
The Helium CLI (helium-cli) is a comprehensive administrative tool that allows operators to:
- Initialize system configurations with default values
- Manage admin accounts (create, list, view, delete)
- Validate configuration files before deployment
- Interact with both PostgreSQL database and Redis cache
Installation
The CLI is built as part of the main Helium project. After building the project:
cargo build --release --bin helium-cli
The binary will be available at target/release/helium-cli.
Global Configuration
The CLI requires database and Redis connections to function. These can be configured via:
Environment Variables
export DATABASE_URL="postgresql://user:password@localhost/helium"
export REDIS_URL="redis://localhost:6379"
Command Line Arguments
helium-cli --database-url "postgresql://user:password@localhost/helium" \
--redis-url "redis://localhost:6379" \
<command>
Verbose Logging
Enable detailed logging for troubleshooting:
helium-cli --verbose <command>
Skip Database Migrations
Skip automatic database migrations when connecting:
helium-cli --skip-migration <command>
This is useful when:
- Migrations have already been run by another process
- Running in a container where migrations are handled separately
- Debugging database connection issues
Commands
Configuration Management
Initialize All Configurations
helium-cli init-config
This command initializes all system configurations with their default values. It:
- Creates default configurations for all modules in the database
- Updates Redis cache with the configurations
- Handles the following configuration types:
- Auth: Authentication and authorization settings
- Admin JWT: JWT configuration for admin authentication
- Telecom: Telecom service configurations
- Shop: E-commerce and shop settings
- Market: Affiliate and marketing configurations
- Mailer: SMTP and email service settings
Example Output:
Initializing 6 configuration types...
Initializing Auth config... ✓ Success
Initializing Admin JWT config... ✓ Success
Initializing Telecom config... ✓ Success
Initializing Shop config... ✓ Success
Initializing Market/Affiliate config... ✓ Success
Initializing Mailer config... ✓ Success
Configuration initialization completed:
✓ Successful: 6
Use Cases:
- Initial deployment setup
- Resetting configurations to defaults
- Disaster recovery scenarios
Validate Configuration Files
helium-cli validate-config --config-type <TYPE> <config-file.json>
Validates a JSON configuration file against the specified configuration schema.
Supported Configuration Types:
- auth - Authentication configuration
- admin-jwt / admin_jwt - Admin JWT configuration
- telecom - Telecom service configuration
- shop - Shop/e-commerce configuration
- market / affiliate - Marketing/affiliate configuration
- mailer - Email service configuration
Examples:
# Validate auth configuration
helium-cli validate-config --config-type auth auth-config.json
# Validate mailer configuration
helium-cli validate-config --config-type mailer smtp-config.json
Example Output:
✓ Configuration file is valid!
File: auth-config.json
Type: Auth
Key: auth
Admin Account Management
List Admin Accounts
helium-cli admin list [--limit <N>] [--offset <N>]
Lists all admin accounts with pagination support.
Options:
- --limit <N> - Number of results to return (default: 50)
- --offset <N> - Number of results to skip (default: 0)
Example:
# List first 10 admin accounts
helium-cli admin list --limit 10
# List admin accounts with pagination
helium-cli admin list --limit 25 --offset 50
Example Output:
Found 3 admin account(s):
ID Role Name Email Created At
------------------------------------ -------------------- ------------------------------ ------------------------------ --------------------
123e4567-e89b-12d3-a456-426614174000 Super Admin System Administrator admin@example.com 2024-01-15T10:30:00Z
234e5678-e89b-12d3-a456-426614174001 Customer Support Support Team Lead support@example.com 2024-01-16T14:20:00Z
345e6789-e89b-12d3-a456-426614174002 Moderator Content Moderator moderator@example.com 2024-01-17T09:45:00Z
Show Admin Account Details
helium-cli admin show <ADMIN_ID>
Displays detailed information about a specific admin account.
Example:
helium-cli admin show 123e4567-e89b-12d3-a456-426614174000
Example Output:
Admin Account Details:
ID: 123e4567-e89b-12d3-a456-426614174000
Name: System Administrator
Role: Super Admin
Email: admin@example.com
Avatar: https://example.com/avatar.jpg
Created At: 2024-01-15T10:30:00Z
Create Admin Account
helium-cli admin create --name <NAME> --role <ROLE> [--email <EMAIL>] [--avatar <AVATAR_URL>]
Creates a new admin account with the specified details.
Required Options:
- `--name <NAME>` - Display name for the admin
- `--role <ROLE>` - Admin role (see roles below)
Optional Options:
- `--email <EMAIL>` - Admin email address
- `--avatar <AVATAR_URL>` - URL to admin avatar image
Available Roles:
- `super_admin` / `superadmin` / `super-admin` - Full system access
- `moderator` - Content moderation privileges
- `customer_support` / `customersupport` / `customer-support` - Customer service access
- `support_bot` / `supportbot` / `support-bot` - Automated support system access
Examples:
# Create super admin
helium-cli admin create \
--name "System Administrator" \
--role super_admin \
--email "admin@example.com"
# Create customer support account
helium-cli admin create \
--name "Support Agent" \
--role customer_support \
--email "support@example.com" \
--avatar "https://example.com/avatars/support.jpg"
# Create moderator (minimal info)
helium-cli admin create \
--name "Content Moderator" \
--role moderator
Example Output:
Successfully created admin account:
ID: 456e7890-e89b-12d3-a456-426614174003
Name: System Administrator
Role: Super Admin
Email: admin@example.com
Avatar: N/A
Delete Admin Account
helium-cli admin delete <ADMIN_ID> [--yes]
Deletes an admin account after confirmation.
Options:
- `--yes` - Skip confirmation prompt (use with caution)
Examples:
# Delete with confirmation prompt
helium-cli admin delete 123e4567-e89b-12d3-a456-426614174000
# Delete without confirmation (automated scripts)
helium-cli admin delete 123e4567-e89b-12d3-a456-426614174000 --yes
Example Interactive Flow:
Admin account to delete:
ID: 123e4567-e89b-12d3-a456-426614174000
Name: Old Administrator
Role: Super Admin
Email: old-admin@example.com
Are you sure you want to delete this admin account? [y/N]: y
Successfully deleted admin account: 123e4567-e89b-12d3-a456-426614174000
Common Use Cases
Initial Deployment
1. Set up environment variables:

   export DATABASE_URL="postgresql://helium:password@localhost/helium"
   export REDIS_URL="redis://localhost:6379"

2. Initialize system configurations:

   helium-cli init-config

3. Create initial super admin:

   helium-cli admin create \
     --name "System Administrator" \
     --role super_admin \
     --email "admin@yourcompany.com"
Configuration Management Workflow
1. Prepare configuration file: Create a JSON file with your custom configuration.

2. Validate before deployment:

   helium-cli validate-config --config-type auth ./configs/auth-config.json

3. Deploy configuration: Use the web interface or API to upload the validated configuration.
Admin Account Maintenance
1. Regular audit of admin accounts:

   helium-cli admin list --limit 100

2. Create specialized support accounts:

   # Customer support team
   helium-cli admin create --name "Support Team A" --role customer_support

   # Content moderation team
   helium-cli admin create --name "Moderator Team B" --role moderator

3. Remove inactive accounts:

   helium-cli admin delete <inactive-admin-id>
Error Handling
The CLI provides comprehensive error messages and logging:
- Database Connection Issues: Check `DATABASE_URL` and database availability
- Redis Connection Issues: Verify `REDIS_URL` and Redis service status
- Configuration Validation Errors: Review JSON syntax and required fields
- Admin Role Errors: Ensure role names match supported values exactly
Security Considerations
- Environment Variables: Use secure methods to set database credentials
- Admin Creation: Be selective with `super_admin` role assignments
- Account Deletion: Always verify admin identity before deletion
- Logging: Be aware that verbose mode may log sensitive information
Troubleshooting
Common Issues
“DATABASE_URL must be provided”
- Set the `DATABASE_URL` environment variable or use the `--database-url` flag
“Failed to connect to database”
- Verify PostgreSQL is running and accessible
- Check connection string format and credentials
- Ensure the database exists
“Invalid admin role”
- Use exact role names: `super_admin`, `moderator`, `customer_support`, `support_bot`
- Role names are case-insensitive but must match supported variants
“Configuration validation failed”
- Check JSON syntax with a JSON validator
- Ensure all required fields are present
- Verify field types match expected schema
Getting Help
Use the built-in help system:
# General help
helium-cli --help
# Command-specific help
helium-cli admin --help
helium-cli admin create --help
Integration with Deployment Scripts
The CLI is designed to work well in automated deployment scenarios:
#!/bin/bash
set -e
# Set environment
export DATABASE_URL="$HELIUM_DB_URL"
export REDIS_URL="$HELIUM_REDIS_URL"
# Initialize configurations
echo "Initializing Helium configurations..."
helium-cli init-config
# Create admin account if it doesn't exist
echo "Creating admin account..."
helium-cli admin create \
--name "$ADMIN_NAME" \
--role super_admin \
--email "$ADMIN_EMAIL" || true
echo "Deployment initialization complete!"
This CLI tool is essential for proper Helium deployment and ongoing operational management. Use it as part of your deployment automation and regular maintenance procedures.
Migrate From SS-Panel UIM
This guide walks Helium operators through migrating an existing SS-Panel UIM deployment. The migration intentionally happens in two isolated passes so you can export data from the legacy MariaDB instance without touching the new Helium PostgreSQL database until you are ready.
At a high level:
- `mariadb-pass` reads all data from the SS-Panel MariaDB schema and saves it to a local `rkyv` archive.
- `postgre-pass` consumes that `rkyv` archive and writes normalized data into Helium's PostgreSQL schema.
Because Helium normally targets PostgreSQL, the first pass uses a dedicated crate that bundles the MySQL client driver and builds separately from the rest of the project.
What Gets Migrated
The migration transfers the following SS-Panel data into Helium’s schema:
User Accounts
- Email and password hashes (preserved as-is for seamless login)
- User names and registration timestamps
- Last active timestamps
- Account balances (available balance for purchasing)
- Referral relationships (affiliate ref_by links)
- Traffic usage (upload/download totals)
- VMess UUIDs (for node authentication)
- Subscribe tokens (subscription links)
- Invite codes (user-specific invite codes)
Helium creates corresponding entries in:
- `auth.user_account` (login credentials)
- `auth.user_account` (profile metadata)
- `shop.user_balance` (financial data)
- `market.affiliate_user_policy` (referral relationships)
- `telecom.user_nodes_token` (node authentication tokens)
Products → Packages
SS-Panel products are converted to Helium packages with:
- Package name
- Price
- Duration (time allowance in days)
- Bandwidth quota
These populate the telecom.package table.
Orders → Package Queues
Historical purchase orders are replayed into Helium’s package queue system:
- Order status (activated vs. pending)
- Creation and update timestamps
- Associated product/package
Orders are inserted into telecom.package_queue to preserve user entitlements and purchase history.
Nodes → Node Servers & Clients
SS-Panel nodes are split into two Helium entities:
- Node servers (`telecom.node_server`): server address, rate, class
- Node clients (`telecom.node_client`): protocol configurations (VMess, WebSocket, gRPC)
Each node’s custom configuration (ports, security, network transport) is normalized to Helium’s node client schema.
Data Not Migrated
The following SS-Panel data is not migrated:
- Invoices (read but not written to Helium)
- Payback records (read but not written)
- Admin accounts (must be created manually via `helium-cli`)
- System configurations (initialize via `helium-cli init-config`)
- Announcements and tickets (start fresh in Helium)
Prerequisites
- SS-Panel UIM running on MariaDB (or MySQL-compatible) that you can access in read-only mode during export.
- A ready Helium PostgreSQL database with migrations applied and no production users yet. Run `sqlx migrate run` before importing.
- Adequate disk space wherever you write the `rkyv` archive. Expect several hundred megabytes for large installs.
- Rust toolchain (same as Helium) and network access to both databases from the machine performing the migration.
- Optional: a safe location (e.g., object storage) to back up the generated `rkyv` file.
Pass 1 – Export From SS-Panel (MariaDB)
The exporter lives in ssp-migrator/mariadb-pass and is compiled with SQLx’s MySQL feature set. Build and run it separately from the main server binaries.
Build the exporter
mariadb-pass uses SQLx’s compile-time query checking. The workspace ships with .sqlx caches for PostgreSQL only, so generic commands such as cargo build --release -p mariadb-pass will fail. You must compile from the crate directory with access to a live SS-Panel database (or export SQLx metadata for MariaDB manually).
cd ssp-migrator/mariadb-pass
SQLX_OFFLINE=false DATABASE_URL="$SSP_DATABASE_URL" cargo build --release
The `DATABASE_URL` environment variable is required during compilation so SQLx can introspect the MariaDB schema. If you cannot open a direct connection from the build host, generate SQLx data offline with `sqlx prepare` against MariaDB and commit it alongside the crate before building.
Prepare connection settings
You can pass the database URL directly on the command line or export it as an environment variable. A typical MariaDB connection string looks like:
export SSP_DATABASE_URL="mysql://user:password@legacy-host:3306/sspanel"
Run the exporter
cd ssp-migrator/mariadb-pass
SQLX_OFFLINE=false DATABASE_URL="$SSP_DATABASE_URL" cargo run --release -- \
--database-url "$SSP_DATABASE_URL" \
--output-file /tmp/helium-migration.rkyv
The command performs several steps internally:
- Streams each SS-Panel entity (users, products, orders, nodes, etc.) in batches.
- Normalizes relationships to Helium’s intermediate structs.
- Serializes the result to an `rkyv` archive (default name `migration_data.rkyv`).
Monitor the logs for warnings about rows that cannot be converted. The exporter skips invalid records but continues processing.
When the run finishes you should have an archive file similar to /tmp/helium-migration.rkyv. Back it up before moving on.
Pass 2 – Import Into Helium (PostgreSQL)
The importer lives in ssp-migrator/postgre-pass and understands Helium’s canonical schema. Ensure the target PostgreSQL database is empty or freshly provisioned to avoid collisions.
Build the importer
cargo build --release -p postgre-pass
This binary only links the PostgreSQL driver, so it compiles with the same workspace settings as other Helium components.
Prepare connection settings
export HELIUM_DATABASE_URL="postgres://helium:password@new-host:5432/helium_db"
Run the importer
cargo run --release -p postgre-pass -- \
--rkyv-file /tmp/helium-migration.rkyv \
--database-url "$HELIUM_DATABASE_URL"
The importer performs conversions aligned with Helium’s modules:
- Inserts node servers and clients in the correct dependency order.
- Creates packages, affiliate policies, balances, and user accounts.
- Replays historical purchases into the package queue so users retain entitlements.
If anything fails, no partial state is left behind—each insert group is committed in dependency order. Fix the reported data issue, rebuild the rkyv archive if necessary, and rerun the importer.
Post-migration Checklist
- Confirm the importer logs `Migration completed successfully`.
- Inspect a handful of migrated users in Helium's admin tools (profiles, balances, active packages).
- Verify node configurations in `telecom` match the expected SS-Panel node inventory.
- Rotate user credentials if required by your migration policy (password hashes are imported as-is).
- Schedule DNS cutover and client config updates after validating the new deployment.
Troubleshooting
- MariaDB TLS or authentication errors: confirm the MariaDB driver accepts your certificates or append parameters (e.g., `?ssl-mode=REQUIRED`).
- Missing subscribe links or invite codes: the exporter requires these tables to be populated for each user. Reconcile data in SS-Panel before exporting.
- Importer stops on unique constraint violations: verify the PostgreSQL database is clean. Drop and recreate the schema, then rerun the importer.
- Large datasets: run the exporter on a machine close to the database to reduce latency. You can copy the resulting `rkyv` file to the environment where the importer runs.
With both passes complete, Helium now has a faithful copy of the SS-Panel data and you can proceed with normal deployment and cutover activities.