
Helium

Helium is a modern commercial VPN SaaS system built with Rust, focusing on scalability, security, and user-friendliness.

Features

  • Kubernetes/Docker native: stateless, horizontally scalable, and easy to deploy.
  • High security: no shell execution, no deserialization vulnerabilities, and no SQL injection.
  • Pluggable frontend: a fully featured gRPC API makes it easy to build your own frontend.
  • Lightweight: as little as 40 MB of memory per service; handles 1000+ requests per second on a single 1-core CPU server.
  • Advanced selling system: handles complex business strategies and scales to very large user bases.

Tech Stack

  • Rust: Memory-safe systems programming with C-level performance
  • gRPC + Tonic: High-performance API with type-safe contracts
  • PostgreSQL + SQLx: Reliable database with compile-time query validation
  • Redis: Fast in-memory caching and session storage
  • AMQP: Reliable message queuing for microservices
  • Tokio: Async runtime for handling thousands of concurrent connections

Key Advantages:

  • Microservices architecture with independent scaling
  • Container-native design for Kubernetes deployment
  • Memory safety eliminates entire classes of security vulnerabilities
  • Exceptional performance: 1000+ RPS on single-core CPU with 40MB memory usage

Enterprise User Guide

This guide is for enterprises that run customer-facing VPN services on Helium. It is written for technical operators and support teams who understand coding, networking, and business operations, but do not need to read Helium source code.

Who should read this

  • Product and operations owners
  • Customer support and finance teams
  • Platform administrators
  • Technical onboarding staff

What this guide covers

The guide is organized by business flow, not by backend module:

  1. Onboard users and secure accounts
  2. Sell plans and process payments
  3. Deliver VPN access and manage package lifecycle
  4. Handle wallet, gift card, and referral operations
  5. Run customer support and announcements
  6. Operate admin workflows and infrastructure

How to use this guide

  • Start with user flows if you are launching customer operations
  • Use admin flows for internal team enablement
  • Share specific pages with the teams that own each workflow

What you can do with Helium

  • Run a full VPN SaaS business with user accounts, plans, payments, and support
  • Offer multiple payment methods and internal wallet usage
  • Control package access, node visibility, and policy by customer segment
  • Give different permissions to operations, support, and admin teams

What you cannot do with this guide

  • It does not replace security policy or compliance review for your organization
  • It does not document source-level implementation details
  • It does not include custom integration code for your internal systems

User Onboarding and Account Security

This flow explains how end users create accounts, sign in, recover access, and keep accounts secure.

What customers can do

  • Register with email and password
  • Register or sign in with supported OAuth providers
  • Enable multi-factor authentication (MFA)
  • Reset password by email link
  • Keep sessions active across devices, then log out when needed

Registration flow

flowchart TD
    start[UserStartsRegistration] --> sendEmail[RequestVerificationEmail]
    sendEmail --> emailLink[UserClicksMagicLink]
    emailLink --> setPassword[SetPasswordAndCreateAccount]
    setPassword --> autoLogin{AutoLoginEnabled}
    autoLogin -->|Yes| activeSession[SignedInSession]
    autoLogin -->|No| loginPage[GoToLogin]
    activeSession --> mfaSetup[OptionalMfaSetup]
    loginPage --> manualLogin[EmailOrOAuthLogin]

Registration

  1. User submits email (and optional referral code).
  2. System sends a time-limited link to that email.
  3. User opens the link, sets password, and completes account creation.
  4. If auto-login is enabled, the user is signed in immediately with a live session.

Sign-in options

  • Email + password
  • OAuth provider login
  • MFA challenge when enabled

For enterprise UX, provide clear fallbacks: “Try another login method” and “Reset password.”

Password reset

  1. User requests reset email.
  2. User opens time-limited link.
  3. User sets new password.
  4. Existing sessions are invalidated for security.

This is expected behavior and should be explained on the reset success screen.
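The invalidate-on-reset rule can be sketched with a simple in-memory session store. This is illustrative only: Helium's real session storage is Redis-backed, and these type and method names are hypothetical.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory session store, keyed by user id.
/// Illustrates the rule: a successful password reset revokes every live session.
#[derive(Default)]
struct SessionStore {
    sessions: HashMap<u64, Vec<String>>, // user id -> session tokens
}

impl SessionStore {
    fn add(&mut self, user: u64, token: &str) {
        self.sessions.entry(user).or_default().push(token.to_string());
    }

    fn active_count(&self, user: u64) -> usize {
        self.sessions.get(&user).map_or(0, |s| s.len())
    }

    /// Called after a successful password reset.
    fn invalidate_all(&mut self, user: u64) {
        self.sessions.remove(&user);
    }
}

fn main() {
    let mut store = SessionStore::default();
    store.add(42, "phone-session");
    store.add(42, "laptop-session");
    assert_eq!(store.active_count(42), 2);

    // Password reset completes: all existing sessions are revoked.
    store.invalidate_all(42);
    assert_eq!(store.active_count(42), 0);
}
```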

Session behavior

  • Access remains active through access and refresh token lifecycle.
  • Users can refresh sessions and continue without full re-login until refresh lifetime ends.
  • Security-sensitive changes can require re-verification.

What customers cannot do

  • Use expired or already-used email links
  • Keep old sessions alive after a successful password reset
  • Remove their last remaining login method (at least one method must remain)
  • Bypass MFA once it is required for their account actions

Purchasing a Plan

This flow describes how customers browse plans, create orders, pay, and receive service.

What customers can do

  • View plans available to their account tier
  • Apply valid coupons before checkout
  • Pay by supported gateway methods or account wallet
  • Cancel unpaid orders
  • Track order status until service delivery

End-to-end purchase flow

flowchart TD
    browse[BrowsePlans] --> coupon[OptionalApplyCoupon]
    coupon --> create[CreateOrder]
    create --> payMethod{ChoosePaymentMethod}
    payMethod -->|Gateway| gatewayPay[ExternalGatewayPayment]
    payMethod -->|Wallet| walletPay[WalletPayment]
    gatewayPay --> paid[OrderMarkedPaid]
    walletPay --> paid
    paid --> delivery[PackageDelivery]
    delivery --> active[ServiceBecomesActive]

Plan visibility

Customers only see plans that match their current eligibility (group access and sale status).
This allows the same platform to run multiple customer segments.

Coupon usage

  • Coupon is checked when user previews it.
  • Coupon is checked again at order creation.
  • Final payable amount is locked at order creation.

If a coupon becomes invalid before order creation completes, checkout should fail with a clear message.
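The same validation runs at preview and again at order creation, which the following sketch makes concrete. The field names and error variants are assumptions for illustration, not Helium's actual schema.

```rust
/// Hypothetical coupon model; fields mirror the constraints described above.
struct Coupon {
    valid_from: u64,       // unix seconds
    valid_until: u64,
    uses_left: u32,
    min_amount_cents: u64,
    percent_off: u8,
}

#[derive(Debug, PartialEq)]
enum CouponError { Expired, Exhausted, BelowMinimum }

/// The same check runs twice: once at preview, once at order creation.
/// On success it returns the payable amount that gets locked into the order.
fn check_coupon(c: &Coupon, now: u64, amount_cents: u64) -> Result<u64, CouponError> {
    if now < c.valid_from || now > c.valid_until {
        return Err(CouponError::Expired);
    }
    if c.uses_left == 0 {
        return Err(CouponError::Exhausted);
    }
    if amount_cents < c.min_amount_cents {
        return Err(CouponError::BelowMinimum);
    }
    Ok(amount_cents - amount_cents * c.percent_off as u64 / 100)
}

fn main() {
    let c = Coupon { valid_from: 0, valid_until: 100, uses_left: 1,
                     min_amount_cents: 500, percent_off: 20 };
    assert_eq!(check_coupon(&c, 50, 1000), Ok(800));          // 20% off 10.00
    assert_eq!(check_coupon(&c, 200, 1000), Err(CouponError::Expired));
    assert_eq!(check_coupon(&c, 50, 100), Err(CouponError::BelowMinimum));
}
```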

Payment paths

Gateway payment

  • Customer is redirected to a payment page.
  • System confirms callback from provider.
  • Order becomes paid after verification.

Wallet payment

  • Uses available wallet balance only.
  • Payment and order update happen atomically.
  • Best UX for repeat customers with balance.

Order lifecycle

  • Unpaid: waiting for payment
  • Paid: payment confirmed, waiting for service delivery
  • Delivered: service package assigned
  • Cancelled: unpaid order cancelled by user, admin, or timeout

Operational notes

  • Unpaid orders are auto-cancelled after configured timeout.
  • There is a limit on how many unpaid orders one user can hold.
  • After successful payment, frontend should poll order status until delivered.
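The post-payment polling note above can be sketched as a small retry loop. The function and backoff values are illustrative; a real frontend would call the order-status API and wait seconds between attempts.

```rust
use std::{thread, time::Duration};

#[derive(Clone, Copy, PartialEq, Debug)]
enum OrderStatus { Unpaid, Paid, Delivered, Cancelled }

/// Poll an order-status source until it reaches a terminal state, up to
/// `max_tries`. `fetch` stands in for a real API call.
fn poll_until_delivered<F>(mut fetch: F, max_tries: u32) -> Option<OrderStatus>
where F: FnMut() -> OrderStatus {
    for _ in 0..max_tries {
        let status = fetch();
        match status {
            OrderStatus::Delivered | OrderStatus::Cancelled => return Some(status),
            _ => thread::sleep(Duration::from_millis(10)), // real clients wait seconds
        }
    }
    None // still pending: surface a "delivery in progress" message to the customer
}

fn main() {
    // Simulate a backend that finishes delivery on the third poll.
    let mut calls = 0;
    let result = poll_until_delivered(|| {
        calls += 1;
        if calls >= 3 { OrderStatus::Delivered } else { OrderStatus::Paid }
    }, 10);
    assert_eq!(result, Some(OrderStatus::Delivered));
}
```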

What customers cannot do

  • Use coupons outside validity rules (time window, limits, minimum amount)
  • Pay with wallet if available balance is insufficient
  • Recover a cancelled order (must create a new order)
  • Force immediate delivery if backend delivery queue is delayed

Using the VPN Service

This flow explains how customers receive and use VPN access after purchase.

What customers can do

  • Get personal subscription links
  • Import configs into supported proxy clients
  • Use nodes allowed by their active package
  • Queue future packages for automatic activation
  • Monitor usage and package consumption status

Access flow

flowchart TD
    paidOrder[OrderPaid] --> queue[PackageAddedToQueue]
    queue --> activate{HasActivePackage}
    activate -->|No| firstActive[ActivateFirstPackage]
    activate -->|Yes| stayQueued[RemainInQueue]
    firstActive --> nodeAccess[NodeAccessEnabled]
    stayQueued --> laterActivate[AutoActivateWhenCurrentConsumed]
    nodeAccess --> subscribe[GenerateSubscribeConfig]

Each user has a unique subscription token used to fetch client configuration from a subscribe URL.

  • URL can be opened directly in compatible clients
  • Client type can be auto-detected or user-selected
  • Generated config is filtered by user package permissions
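A subscribe URL built from the per-user token might look like the sketch below. The path layout and `client` query parameter are assumptions for illustration, not Helium's actual route.

```rust
/// Hypothetical subscribe-URL builder. When a client kind is given it
/// overrides server-side auto-detection; otherwise the server detects the
/// client from the request's User-Agent.
fn subscribe_url(base: &str, token: &str, client: Option<&str>) -> String {
    match client {
        Some(kind) => format!("{base}/subscribe/{token}?client={kind}"),
        None => format!("{base}/subscribe/{token}"),
    }
}

fn main() {
    assert_eq!(
        subscribe_url("https://vpn.example.com", "tok123", Some("clash")),
        "https://vpn.example.com/subscribe/tok123?client=clash"
    );
    assert_eq!(
        subscribe_url("https://vpn.example.com", "tok123", None),
        "https://vpn.example.com/subscribe/tok123"
    );
}
```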

Supported client families

  • Clash ecosystem
  • V2Ray-compatible clients
  • Sing-box ecosystem
  • Rule-based clients such as QuantumultX, Loon, Surfboard, and Surge

Package lifecycle

  • InQueue: purchased, waiting
  • Active: currently providing service
  • Consumed: completed by time or traffic usage
  • Cancelled: removed due to cancellation/refund action

Only one package is active per user at a time. Additional packages wait and activate automatically in order.
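The one-active-package rule can be expressed as a small state transition. This is a sketch of the rule only; Helium enforces it in the backend, and the names here are illustrative.

```rust
#[derive(Clone, Copy, PartialEq, Debug)]
enum PackageState { InQueue, Active, Consumed, Cancelled }

/// Activate the next queued package, enforcing "one active package per user".
fn activate_next(packages: &mut [PackageState]) {
    if packages.iter().any(|p| *p == PackageState::Active) {
        return; // something is already active; queued packages keep waiting
    }
    if let Some(next) = packages.iter_mut().find(|p| **p == PackageState::InQueue) {
        *next = PackageState::Active; // first queued package becomes active, in order
    }
}

fn main() {
    let mut q = vec![PackageState::Consumed, PackageState::InQueue, PackageState::InQueue];
    activate_next(&mut q); // first queued package becomes active
    assert_eq!(q[1], PackageState::Active);
    activate_next(&mut q); // no-op: one package is already active
    assert_eq!(q[2], PackageState::InQueue);
}
```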

Node access rules

Active package policy decides which nodes are visible to the customer.
If no package is active, node list and effective service access are empty.

What customers cannot do

  • Use nodes outside their active package group permissions
  • Activate multiple packages at the same time on one account
  • Use subscription token from another account
  • Keep access after package is fully consumed without another queued package

Wallet and Payments

This flow explains how customers use internal wallet balance, gift cards, and payment history.

What customers can do

  • View available and frozen balances
  • Redeem valid gift cards into wallet
  • Pay eligible orders from available balance
  • View balance change history for audit and support

Balance model

  • Available balance: spendable for order payment
  • Frozen balance: temporarily locked and not spendable

Both are shown as part of one wallet account per user.
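The two-balance model can be sketched as follows. The struct and methods are hypothetical; in Helium these operations run transactionally against the database.

```rust
/// Hypothetical wallet with the two balances described above (amounts in cents).
struct Wallet { available: u64, frozen: u64 }

impl Wallet {
    /// Spending draws from available balance only; frozen funds never qualify.
    fn pay(&mut self, amount: u64) -> Result<(), &'static str> {
        if amount > self.available { return Err("insufficient available balance"); }
        self.available -= amount;
        Ok(())
    }

    /// Move spendable funds into hold (e.g. pending a dispute).
    fn freeze(&mut self, amount: u64) -> Result<(), &'static str> {
        if amount > self.available { return Err("insufficient available balance"); }
        self.available -= amount;
        self.frozen += amount;
        Ok(())
    }
}

fn main() {
    let mut w = Wallet { available: 1000, frozen: 0 };
    w.freeze(600).unwrap();
    assert_eq!(w.frozen, 600);
    assert!(w.pay(500).is_err()); // only 400 is spendable; frozen funds don't count
    assert!(w.pay(400).is_ok());
}
```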

Gift card redemption

  1. User enters gift card code.
  2. System validates card status (exists, unused, unexpired).
  3. Value is deposited into available balance.
  4. Card is marked redeemed and cannot be reused.
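The four redemption steps above reduce to a small state machine. This is a sketch under assumed names; Helium performs the deposit and state change atomically.

```rust
#[derive(PartialEq, Debug)]
enum CardState { Unused, Redeemed, Expired }

/// Redeem a gift card into an available balance.
fn redeem(state: &mut CardState, value: u64, balance: &mut u64) -> Result<(), &'static str> {
    match state {
        CardState::Redeemed => Err("card already used"),
        CardState::Expired => Err("card expired"),
        CardState::Unused => {
            *balance += value;            // step 3: deposit into available balance
            *state = CardState::Redeemed; // step 4: single-use, cannot be reused
            Ok(())
        }
    }
}

fn main() {
    let mut state = CardState::Unused;
    let mut balance = 0;
    assert!(redeem(&mut state, 500, &mut balance).is_ok());
    assert_eq!(balance, 500);
    assert!(redeem(&mut state, 500, &mut balance).is_err()); // second redemption fails
}
```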

Paying orders with wallet

  1. User selects wallet payment.
  2. System checks available balance.
  3. Amount is deducted and order moves to paid status in one transaction.
  4. Payment record appears in balance history.

Balance history

History is used for:

  • customer transparency
  • support troubleshooting
  • finance reconciliation

Each record contains amount change, change type, reason, and timestamp.

What customers cannot do

  • Spend frozen balance
  • Redeem an expired or already-used gift card
  • Complete a wallet payment when available balance does not cover the full amount (no partial payment)
  • Directly edit or delete wallet history records

Referral Program

This flow explains how customers invite others and earn referral rewards.

What customers can do

  • Create invite codes (within configured limits)
  • Share invite codes with prospective users
  • Earn rewards when referred users complete qualifying purchases
  • Track referral performance and withdraw available rewards to wallet

Referral flow

flowchart TD
    code[CreateInviteCode] --> share[ShareCode]
    share --> signup[NewUserRegistersWithCode]
    signup --> purchase[ReferredUserCompletesQualifyingPayment]
    purchase --> reward[CommissionCalculated]
    reward --> stats[AffiliateStatsUpdated]
    stats --> withdraw[WithdrawToWallet]

Reward principles

  • Reward is based on a configured commission rate.
  • Each referred user can trigger rewards only up to a configured number of times.
  • Rewards are tracked as available amount until withdrawn.
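The reward principles above amount to a rate applied to the payment, gated by a per-referee trigger cap. The configuration names below are illustrative assumptions.

```rust
/// Commission sketch: rate and per-referee cap are configuration values.
struct AffiliateConfig { rate_percent: u64, max_rewards_per_referee: u32 }

/// Compute the commission (in cents) for a qualifying payment, or 0 if this
/// referred user has already triggered the maximum number of rewards.
fn commission(cfg: &AffiliateConfig, payment_cents: u64, prior_rewards: u32) -> u64 {
    if prior_rewards >= cfg.max_rewards_per_referee {
        return 0;
    }
    payment_cents * cfg.rate_percent / 100
}

fn main() {
    let cfg = AffiliateConfig { rate_percent: 10, max_rewards_per_referee: 3 };
    assert_eq!(commission(&cfg, 2000, 0), 200); // 10% of a 20.00 payment
    assert_eq!(commission(&cfg, 2000, 3), 0);   // cap reached: no further reward
}
```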

Withdrawal

When user withdraws referral rewards:

  1. System checks withdrawable amount.
  2. Referral stats are updated.
  3. Wallet available balance is credited.

This should appear in both referral records and wallet history.

What customers cannot do

  • Create unlimited invite codes
  • Earn commission from non-qualifying payments (for example, restricted methods per policy)
  • Withdraw more than currently available reward
  • Change historical referral relation records

Support and Notifications

This flow explains how customers communicate with support teams and receive platform announcements.

What customers can do

  • Open support tickets with priority and description
  • Continue two-way conversations with support staff
  • Track ticket status changes
  • Read targeted announcements
  • Configure notification preferences (such as email categories)

Ticket lifecycle

flowchart TD
    open[UserCreatesTicket] --> waitingAdmin[StatusOpen]
    waitingAdmin --> adminReply[AdminReplies]
    adminReply --> waitingUser[StatusPending]
    waitingUser --> userReply[UserReplies]
    userReply --> waitingAdmin
    waitingAdmin --> resolved[AdminMarksResolved]
    resolved --> closed[AdminOrUserClosesTicket]

Ticket handling rules

  • Ticket belongs to the user who opened it.
  • Both sides can communicate while ticket is open or pending.
  • Closed tickets are archived for workflow completion.

Announcements

Announcements are broadcast messages targeted by user segment and priority.

  • Pinned announcements are shown first.
  • Priority helps users identify urgency.
  • Users only see announcements for groups they belong to.

Notification preferences

Users can control notification settings for available categories (for example: security, marketing, service reminders) based on your enterprise policy.

What customers cannot do

  • Access tickets that belong to another user
  • Continue sending messages on closed tickets
  • See announcements outside their targeting scope
  • Edit admin-authored support messages

Admin Getting Started

This flow explains how enterprises set up administrative access and role boundaries.

What admins can do

  • Onboard administrators through invitation flow
  • Authenticate with admin credentials and access tokens
  • Assign role-based permissions
  • Rotate and manage access credentials
  • Audit administrative actions

Admin onboarding flow

flowchart TD
    invite[CreateAdminInvitation] --> accept[AdminAcceptsInvitation]
    accept --> register[AdminCompletesRegistration]
    register --> issueKey[CredentialIssued]
    issueKey --> login[AdminLogin]
    login --> access[UseRoleBasedAdminFunctions]
Role | Typical responsibility | Write capability
SuperAdmin | Platform ownership and high-risk changes | Full
Moderator | Daily operations and catalog management | Broad
CustomerSupport | Customer issue handling | Limited
SupportBot | Automated read-oriented workflows | Minimal/None

Enterprises should map these roles to internal SOPs before production launch.
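One way to encode these role tiers when building an internal tool is an ordered capability level. The enums and gate below are a sketch using the names from the table, not Helium's actual permission model.

```rust
/// Write capability tiers, ordered weakest to strongest.
/// `PartialOrd` on an enum compares by declaration order.
#[derive(PartialEq, PartialOrd)]
enum WriteLevel { None, Minimal, Limited, Broad, Full }

enum Role { SuperAdmin, Moderator, CustomerSupport, SupportBot }

fn write_level(role: &Role) -> WriteLevel {
    match role {
        Role::SuperAdmin => WriteLevel::Full,
        Role::Moderator => WriteLevel::Broad,
        Role::CustomerSupport => WriteLevel::Limited,
        Role::SupportBot => WriteLevel::Minimal,
    }
}

/// Gate a high-risk operation on the caller's write capability.
fn can_change_platform_settings(role: &Role) -> bool {
    write_level(role) >= WriteLevel::Full
}

fn main() {
    assert!(can_change_platform_settings(&Role::SuperAdmin));
    assert!(!can_change_platform_settings(&Role::CustomerSupport));
}
```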

Good operational practices

  • Use separate admin accounts per human operator.
  • Rotate credentials on a fixed schedule.
  • Keep support and platform ownership roles separate.
  • Review audit records regularly.

What admins cannot do

  • Use invitation links after they are consumed or expired
  • Exceed role permissions assigned by policy
  • Safely share one admin credential across multiple operators
  • Skip audit and governance requirements for sensitive actions

Admin Product Management

This flow explains how enterprise teams manage plans, coupons, and gift-card operations.

What admins can do

  • Build and maintain saleable plan catalog
  • Update package offerings through controlled versioning
  • Manage coupon campaigns and constraints
  • Generate and distribute gift cards
  • Keep existing customer entitlements stable during catalog updates

Package and product principles

  • Products are what customers buy.
  • Packages define service entitlements (traffic, duration, access scope).
  • Package series allow new versions while preserving old purchased terms.

This protects existing subscribers from unexpected entitlement changes.

Product operation flow

flowchart TD
    design[DefineOffer] --> packageVersion[CreateOrUpdatePackageVersion]
    packageVersion --> bindProduct[BindProductToPackageSeries]
    bindProduct --> publish[PublishForSale]
    publish --> monitor[MonitorSalesAndUsage]
    monitor --> iterate[IterateNextVersion]

Coupon campaign management

Admins can configure:

  • discount type (percentage or fixed amount)
  • valid time window
  • per-user and global usage limits
  • active/inactive lifecycle

Use preview messaging in frontend so customers understand why a coupon fails.

Gift card operations

  • Bulk generation for campaigns
  • Special-code generation for branded promotions
  • Expiration control for liability management
  • Support lookup for troubleshooting redemption issues

What admins cannot do

  • Change already-delivered customer package terms retroactively
  • Reuse a currently active coupon code without first deactivating it
  • Re-redeem single-use gift cards
  • Treat product updates as immediate entitlement changes for historical orders

Admin Customer Operations

This flow explains daily customer-facing operations for support and operations teams.

What admins can do

  • Search and inspect customer account state
  • Ban or unban users according to policy
  • Help recover account access (including MFA recovery operations)
  • Review order and payment state for troubleshooting
  • Apply controlled balance adjustments with documented reasons
  • Perform manual order interventions where policy allows

Typical support workflow

flowchart TD
    issue[CustomerIssueReported] --> identify[FindUserAndValidateIdentity]
    identify --> diagnose[CheckAccountOrderWalletState]
    diagnose --> action{NeedAdminAction}
    action -->|No| guidance[ProvideCustomerGuidance]
    action -->|Yes| execute[ExecuteAuthorizedAdminOperation]
    execute --> audit[RecordAuditAndSupportNote]
    guidance --> close[CloseSupportCase]
    audit --> close

Core operations

Account controls

  • Ban/unban user based on abuse, compliance, or security policy
  • Remove blocked authentication factors during verified recovery

Wallet controls

  • Deposit: credit balance
  • Consume: deduct balance
  • Freeze: move spendable funds into hold
  • Unfreeze: release held funds

Every balance adjustment should include clear human-readable reason text.

Order controls

  • Check unpaid/paid/delivered status
  • Assist with payment confirmation disputes
  • Use manual paid marking only under approved internal process
  • Handle partial compensation using approved balance adjustment process

What admins cannot do

  • Perform actions outside assigned role permissions
  • Adjust funds without reason and audit trace
  • Access or modify customer credentials directly
  • Use manual interventions as a substitute for payment controls and reconciliation policy

Admin VPN Infrastructure Operations

This flow explains how infrastructure teams run node capacity, routing quality, and delivery readiness.

What admins can do

  • Register and maintain node servers
  • Configure node clients and protocol offerings
  • Control node availability and maintenance state
  • Monitor node health, usage, and quality trends
  • Observe package queue delivery and activation behavior

Infrastructure operation flow

flowchart TD
    onboard[OnboardNodeServer] --> config[ConfigureNodeClientProfiles]
    config --> publish[ExposeEligibleNodesByGroup]
    publish --> monitor[MonitorHealthAndTraffic]
    monitor --> maintain[MaintenanceOrCapacityAdjustment]
    maintain --> monitor

Node operations

  • Keep node metadata accurate (location, route class, capacity intent).
  • Validate protocol configuration before publishing.
  • Use maintenance states to protect customer experience during changes.

Access and delivery relationship

  • Customer package policy controls which nodes users can access.
  • Package queue health affects when customers become active on service.
  • Infrastructure and commercial operations must coordinate release windows.

Observability checklist

  • node online/offline trends
  • abnormal traffic spikes
  • package activation lag
  • failed delivery or queue backlog
  • regional quality degradation

What admins cannot do

  • Expose nodes to customers without matching package access scope
  • Ignore maintenance signaling during disruptive changes
  • Treat node-client configuration and package policy as independent concerns
  • Assume delivery is healthy without queue and activation monitoring

Microservices Architecture

Helium is built as a collection of focused microservices that cooperate through a shared set of contracts, messaging patterns, and observability tooling. This section introduces the high-level layout of the system, explains how the services interact, and highlights the infrastructure choices that enable the platform to scale for large commercial VPN deployments.

Architectural Goals

  • Independent scaling – Each service can be deployed and scaled based on its workload characteristics (API traffic, background jobs, email throughput, etc.).
  • Clear boundaries – Services expose well-defined APIs (gRPC, REST, AMQP events) and depend on shared libraries for cross-cutting concerns, ensuring that business logic remains isolated inside its module.
  • Operational resiliency – Stateless services, database connection pooling, and message queues allow resilient deployments with graceful failure handling.
  • Security by design – Rust, strict processor patterns, and zero shared mutable state within processes prevent memory safety issues and accidental privilege escalations.

Service Topology

The helium-server crate can run in multiple worker modes. Each mode is packaged into its own container image or deployment unit, providing a natural microservice boundary while reusing the same codebase and shared libraries.

Worker | Entry point | Responsibilities
grpc | GrpcWorker | Exposes gRPC APIs for all business domains (Auth, Manage, Telecom, Market, Shop, Support, etc.). Performs request validation, invokes the corresponding module services, and emits events.
subscribe_api | SubscribeApiWorker | Provides REST endpoints optimized for subscription clients. Primarily a read-heavy facade backed by Redis caching and the service layer.
webhook_api | WebHookApiWorker | Receives payment gateway callbacks and external partner webhooks, normalizes payloads, and dispatches workflow events.
consumer | ConsumerWorker | Listens on AMQP queues for asynchronous jobs (billing, node updates, provisioning) emitted by other services. Orchestrates long-running tasks that should not block API responses.
mailer | MailerWorker | Specialized consumer responsible for templated email delivery, retry management, and transactional messaging.
cron_executor | CronWorker | Periodically scans for scheduled work (subscription renewals, quota resets, health checks) and dispatches jobs via the same service layer used by the API workers.

These workers are deployed independently and scaled according to throughput requirements. For example, a busy billing period can scale the consumer and cron workers without affecting the gRPC API footprint.

Domain-Oriented Modules

Each domain (Auth, Manage, Telecom, Shop, Market, Notification, Support, Shield, Mailer) is implemented as an independent module under modules/. Modules follow a common layout (entities, services, rpc, hooks, events) as described in the Project Structure Guide. Within the microservices architecture:

  • Modules provide service layer processors that encapsulate business logic.
  • RPC layers expose the processors through gRPC servers. The GrpcWorker aggregates these services and mounts them behind a single TLS termination point, while keeping module ownership intact.
  • Hooks and events enable cross-module interactions without tight coupling, allowing, for instance, the Telecom module to emit usage events consumed by the billing logic in the Manage module.

Communication Patterns

Helium combines synchronous APIs with asynchronous messaging to balance latency and resiliency.

gRPC Contract

  • Tonic-generated servers provide strongly typed interfaces for customer-facing and operator APIs.
  • A uniform Processor trait ensures every RPC delegate is testable in isolation and can be reused by background workers.
  • Service discovery is handled at the infrastructure layer (Kubernetes or Docker Compose) because workers are stateless; clients load-balance using standard mechanisms (Envoy, NGINX, etc.).

REST Facades

  • Subscription and webhook workers expose lightweight REST routes via Axum.
  • REST APIs reuse the same service processors, ensuring identical business behavior across protocols and simplifying versioning.

Asynchronous Messaging

  • RabbitMQ (AMQP) is used to propagate domain events and dispatch background jobs.
  • Producers append metadata (correlation IDs, tenant identifiers) to support observability and reliable retries.
  • Consumers acknowledge messages only after successful processing, preventing data loss during failures.

Data Management

  • PostgreSQL is the system of record. SQLx is used through the DatabaseProcessor abstraction to keep SQL isolated inside entities/ modules and to support compile-time query checking.
  • Redis provides ephemeral caches, session storage, and rate limiting. The RedisConnection wrapper from helium-framework manages pooled connections shared by API and worker processes.
  • Consistent migrations live in the top-level migrations/ directory and are applied during deployment. Services run with zero shared mutable state; all coordination happens through the database or message queues.

Observability & Operations

  • Tracing is initialized in every worker with structured logs and span annotations. This enables distributed tracing across API and background workloads when combined with log collectors.
  • Metrics exporters (e.g., Prometheus integration) can be attached at the deployment layer because each worker exposes a predictable Axum/Tonic server endpoint.
  • Health probes: gRPC and REST workers perform dependency checks on startup (database, Redis, AMQP). Container orchestrators can use readiness/liveness probes to restart unhealthy instances.

Deployment Model

  • Workers are packaged as lightweight containers (<50MB RSS) and designed to be horizontally scalable. Scaling policies are set per worker depending on CPU or queue length metrics.
  • Configuration is provided through environment variables (DATABASE_URL, MQ_URL, REDIS_URL, WORK_MODE, etc.), making the platform 12-factor compliant.
  • Infrastructure typically consists of:
    • Kubernetes/Docker orchestrating the worker deployments
    • Managed PostgreSQL and Redis services
    • RabbitMQ cluster for messaging
    • Optional CDN or reverse proxy terminating TLS before forwarding requests to gRPC/REST workers.
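The environment-driven configuration above could be loaded as in the sketch below. The `Config` struct, the lookup indirection, and the default work mode are assumptions for illustration; the variable names come from the text.

```rust
/// Configuration sketch for the 12-factor variables named above.
struct Config { database_url: String, redis_url: String, work_mode: String }

/// Taking a lookup closure keeps the loader testable without touching the
/// real process environment. In production: `load_config(|k| std::env::var(k).ok())`.
fn load_config<F: Fn(&str) -> Option<String>>(get: F) -> Option<Config> {
    Some(Config {
        database_url: get("DATABASE_URL")?,
        redis_url: get("REDIS_URL")?,
        // WORK_MODE selects which worker this container runs as (grpc, consumer, ...).
        work_mode: get("WORK_MODE").unwrap_or_else(|| "grpc".to_string()),
    })
}

fn main() {
    let cfg = load_config(|k| match k {
        "DATABASE_URL" => Some("postgres://localhost/helium".to_string()),
        "REDIS_URL" => Some("redis://localhost".to_string()),
        _ => None,
    }).expect("required variables present");
    assert_eq!(cfg.work_mode, "grpc"); // falls back to the default when WORK_MODE is unset
    assert!(cfg.database_url.starts_with("postgres://"));
}
```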

Extensibility

Adding a new capability follows a repeatable pattern:

  1. Create or extend a module under modules/ with the Processor-based service implementation.
  2. Expose the functionality via RPC/REST by wiring the service into the relevant worker.
  3. Emit domain events or enqueue background jobs when work must be processed asynchronously.
  4. Deploy the updated worker image; other workers continue functioning without redeployment because contracts are versioned explicitly.

This approach keeps Helium maintainable while providing the flexibility to grow with complex VPN SaaS requirements.

Helium Project Structure Guide

This document describes the modular architecture and organization of the Helium VPN SaaS system.

Project Overview

Helium is a modern VPN SaaS system built with Rust, organized as a workspace with multiple modules. The system follows a modular architecture where each module represents a specific business domain.

Module Architecture

Each module follows a consistent internal structure with standardized components:

1. Entity Layer (entities/)

Purpose: Data models and database access patterns

Structure:

entities/
├── mod.rs                          # Module exports
├── db/                             # Database entity processors
│   ├── mod.rs
│   ├── user_account.rs             # User account queries/commands
│   └── ...
└── redis/                          # Redis entity processors
    ├── mod.rs
    ├── session.rs                  # Session cache operations
    └── ...

Key Patterns:

  • Implements Processor<Input, Result<Output, sqlx::Error>> for DatabaseProcessor
  • Contains all SQL queries and database operations
  • Separated by storage backend (db/ for PostgreSQL, redis/ for Redis)
  • No business logic - pure data access

2. Service Layer (services/)

Purpose: Business logic orchestration and workflows

Structure:

services/
├── mod.rs                          # Service exports
├── manage.rs                       # Management operations
├── user_account.rs                 # User profile management
└── ...

Key Patterns:

  • Implements Processor<Input, Result<Output, Error>> for service operations
  • Orchestrates multiple entity operations
  • Handles validation, transformation, and business rules
  • No direct SQL - delegates to entity processors
  • Uses DatabaseProcessor for data access

Example:

use kanau::processor::Processor;

#[derive(Clone)]
pub struct UserManageService {
    pub db: sqlx::PgPool,
}

impl Processor<ListUsersRequest, Result<ListUsersResponse, Error>> for UserManageService {
    async fn process(&self, input: ListUsersRequest) -> Result<ListUsersResponse, Error> {
        // No raw SQL here: the service delegates to an entity processor.
        let db = DatabaseProcessor::from_pool(self.db.clone());
        let users = db.process(ListUsers { ... }).await?;
        Ok(ListUsersResponse { users })
    }
}

3. gRPC Layer (rpc/)

Purpose: gRPC service implementations and external API

Structure:

rpc/
├── mod.rs                          # RPC exports
├── auth_service.rs                 # Authentication gRPC service
├── manage_service.rs               # Management gRPC service
├── middleware.rs                   # gRPC middleware
└── ...

Key Patterns:

  • Implements generated gRPC trait definitions
  • Converts protobuf messages to service DTOs
  • Delegates to service layer via Processor::process
  • Handles authentication and authorization

4. Hook System (hooks/)

Purpose: Event-driven side effects and integrations

Structure:

hooks/
├── mod.rs                          # Hook exports
├── billing.rs                      # Billing event hooks
├── register.rs                     # Registration hooks
└── ...

Key Patterns:

  • Responds to domain events
  • Handles cross-module integrations
  • Implements side effects (notifications, external API calls)
  • Decoupled from main business flows

5. Event System (events/)

Purpose: Domain event definitions and publishing

Structure:

events/
├── mod.rs                          # Event exports
├── user.rs                         # User-related events
├── order.rs                        # Order events
└── ...

Key Patterns:

  • Defines domain events using message queue integration
  • Publishes events for cross-module communication
  • Enables audit trails and analytics
  • Supports eventual consistency patterns

6. API Layer (api/)

Purpose: REST API endpoints and HTTP handlers

Structure:

api/
├── mod.rs                          # API exports
├── subscribe.rs                    # Subscription endpoints
└── xrayr/                          # XrayR integration APIs
    ├── mod.rs
    └── ...

Key Patterns:

  • Implements REST endpoints using Axum
  • Handles HTTP-specific concerns (parsing, serialization)
  • Delegates to service layer
  • Provides alternative to gRPC for specific use cases

7. Cron Jobs (cron.rs)

Purpose: Scheduled tasks and background jobs

Key Patterns:

  • Implements periodic maintenance tasks
  • Handles cleanup operations
  • Manages recurring billing cycles
  • Executes system health checks

8. Testing (tests/)

Purpose: Integration and unit tests

Structure:

tests/
├── common/                         # Test utilities
│   └── mod.rs                      # Common test setup
├── service_name_test.rs            # Service integration tests
└── ...

Key Patterns:

  • Integration tests for complete workflows
  • Uses testcontainers for database testing
  • Isolated test environments
  • Comprehensive service testing

Module Configuration

Dependencies (Cargo.toml)

Each module declares:

  • Workspace dependencies (shared versions)
  • Inter-module dependencies
  • Module-specific dependencies
  • Dev dependencies for testing
  • Build dependencies (typically tonic-prost-build for gRPC)

Build Configuration (build.rs)

Most modules include build scripts for:

  • gRPC code generation from proto files
  • Custom compilation steps
  • Environment-specific builds

Module Entry Point (lib.rs)

Standard module structure:

#![forbid(clippy::unwrap_used)]
#![forbid(unsafe_code)]
#![deny(clippy::expect_used)]
#![deny(clippy::panic)]

pub mod config;
pub mod cron;
pub mod entities;
pub mod events;
pub mod hooks;      // Optional
pub mod api;        // Optional
pub mod rpc;
pub mod services;

Protocol Buffers (proto/)

Organization: Protos are grouped by module with consistent naming:

proto/
├── auth/
│   ├── auth.proto                  # Core auth services
│   ├── account.proto               # Account management
│   └── manage.proto                # Admin operations
├── telecom/
│   ├── telecom.proto               # VPN services
│   └── manage.proto                # Telecom management
└── ...

Patterns:

  • Service definitions mirror module structure
  • Consistent message naming conventions
  • Shared types in common proto files

Key Architectural Principles

1. Processor Pattern

All APIs use the kanau::processor::Processor trait for consistent interfaces and composability.
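kanau's actual trait definition is not reproduced here; as a rough illustration of the idea, a processor maps one request type to one output type behind a uniform interface, so handlers at every layer compose the same way:

```rust
// Rough shape of a request/response processor (the real
// kanau::processor::Processor trait may differ in detail).
pub trait Processor<Req> {
    type Output;
    fn process(&self, req: Req) -> Self::Output;
}

// Any handler implementing the same interface can be composed,
// wrapped, or swapped in tests. EchoService is a toy example.
pub struct EchoService;

impl Processor<String> for EchoService {
    type Output = String;
    fn process(&self, req: String) -> String {
        req
    }
}
```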

2. Separation of Concerns

  • Entities: Data access only
  • Services: Business logic only
  • RPC/API: Protocol handling only
  • Events/Hooks: Side effects only

3. Database Abstraction

Services never contain raw SQL - all database access goes through entity processors.
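A minimal sketch of this layering (all names hypothetical; the entity stubs out the query that a real sqlx call would perform):

```rust
// Entity layer: the only place that would contain SQL.
pub struct FindUserById;

impl FindUserById {
    pub fn execute(&self, id: u64) -> Option<String> {
        // Real code would run something like:
        //   sqlx::query!("SELECT name FROM users WHERE id = $1", id)
        if id == 1 { Some("alice".to_string()) } else { None }
    }
}

// Service layer: business rules only, delegating all data access.
pub struct GreetUser {
    pub finder: FindUserById,
}

impl GreetUser {
    pub fn greet(&self, id: u64) -> String {
        match self.finder.execute(id) {
            Some(name) => format!("Hello, {name}"),
            None => "unknown user".to_string(),
        }
    }
}
```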

4. Event-Driven Architecture

Modules communicate via events to maintain loose coupling.

5. RBAC and Audit

Administrative operations implement consistent role-based access control and audit logging.

Development Guidelines

Adding a New Module

  1. Create module directory under modules/
  2. Add basic Cargo.toml with workspace dependencies
  3. Create src/lib.rs with standard module structure
  4. Add module to workspace Cargo.toml
  5. Create proto definitions if gRPC services needed
  6. Implement entities → services → rpc layers in order

Testing Strategy

  1. Unit tests for complex business logic in services
  2. Integration tests in tests/ directory
  3. Use testcontainer-helium-modules for database tests
  4. Mock external dependencies
  5. Test error handling paths

Documentation Standards

  • Document all public APIs
  • Include examples for complex workflows
  • Maintain this guide as modules evolve
  • Document breaking changes in module changelogs

This modular architecture enables independent development, testing, and deployment of features while maintaining system coherence through standardized patterns and interfaces.

helium-server Crate

The Helium server is designed as a multi-mode worker system that can run different components independently or together, enabling flexible deployment strategies. Each worker mode serves a specific purpose in the overall system architecture.

Architecture

Worker Modes

The server supports six distinct worker modes:

Worker Mode     Port    Description                  Use Case
grpc            50051   gRPC API server              Main API for client applications and admin panels
subscribe_api   8080    RESTful subscription API     Public subscription endpoints
webhook_api     8081    RESTful webhook handler      Payment provider callbacks, third-party integrations
consumer        -       Background message consumer  Processing async tasks from the message queue
mailer          -       Email service worker         Sending emails and notifications
cron_executor   -       Scheduled task executor      Running periodic maintenance tasks

Dependencies

The server requires three core infrastructure components:

  • PostgreSQL: Primary database for persistent data
  • Redis: Caching, session storage, and temporary data
  • RabbitMQ (AMQP): Message queuing for async processing

Module Integration

The server integrates all business logic modules:

  • auth: Authentication and authorization
  • shop: E-commerce and billing
  • telecom: VPN node management and traffic handling
  • market: Affiliate and marketing systems
  • notification: Announcements and messaging
  • support: Customer support tickets
  • manage: Administrative functions
  • shield: Security and anti-abuse measures

Deployment Guide

Prerequisites

  • PostgreSQL, Redis, and RabbitMQ servers accessible
  • SQLx CLI: cargo install sqlx-cli --no-default-features --features postgres
  • Environment variables configured (see below)

Environment Configuration

The server is configured entirely through environment variables:

Required Variables

# Worker mode selection
WORK_MODE="grpc"  # or subscribe_api, webhook_api, consumer, mailer, cron_executor

# Database connection
DATABASE_URL="postgres://user:password@localhost/helium_db"

# Message queue connection
MQ_URL="amqp://user:password@localhost:5672/"

# Redis connection
REDIS_URL="redis://localhost:6379"

Optional Variables

# Server listen address (for API workers)
LISTEN_ADDR="0.0.0.0:50051"  # grpc mode default
LISTEN_ADDR="0.0.0.0:8080"   # subscribe_api mode default
LISTEN_ADDR="0.0.0.0:8081"   # webhook_api mode default

# Cron executor scan interval (seconds)
SCAN_INTERVAL="60"  # cron_executor mode only

# OpenTelemetry Collector endpoint (optional, for observability)
OTEL_COLLECTOR="http://otel-collector:4317"  # See Observability guide

Note: For comprehensive observability with distributed tracing and metrics, see the Observability with OpenTelemetry guide.
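The required-variable handling can be sketched like this (illustrative only; the server's real parsing lives in its worker configuration code and may differ):

```rust
use std::env;

// Check a WORK_MODE value against the six supported modes.
pub fn validate_work_mode(mode: &str) -> Result<(), String> {
    const MODES: [&str; 6] = [
        "grpc", "subscribe_api", "webhook_api",
        "consumer", "mailer", "cron_executor",
    ];
    if MODES.contains(&mode) {
        Ok(())
    } else {
        Err(format!("unknown WORK_MODE: {mode}"))
    }
}

// Fetch a required environment variable with a readable error.
pub fn require(var: &str) -> Result<String, String> {
    env::var(var).map_err(|_| format!("{var} not set"))
}
```

Failing fast on a missing DATABASE_URL, MQ_URL, or REDIS_URL is preferable to a partially started worker.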

Database Migration

⚠️ CRITICAL: Database migrations must be run before starting the application.

# Install SQLx CLI
cargo install sqlx-cli --no-default-features --features postgres

# Apply all pending migrations
sqlx migrate run --database-url $DATABASE_URL

# Verify migration status
sqlx migrate info --database-url $DATABASE_URL

Basic Deployment

Running the Server

# Apply database migrations first
sqlx migrate run --database-url $DATABASE_URL

# Start the server with desired worker mode
WORK_MODE=grpc ./helium-server

Multiple Worker Deployment

For production, run different worker modes as separate processes:

# Terminal 1: Main gRPC API
WORK_MODE=grpc ./helium-server

# Terminal 2: Background consumer
WORK_MODE=consumer ./helium-server

# Terminal 3: Email worker
WORK_MODE=mailer ./helium-server

# Terminal 4: Cron jobs
WORK_MODE=cron_executor ./helium-server

Logging

The server uses structured logging:

# Enable debug logging
RUST_LOG=debug ./helium-server

# Production logging (default)
RUST_LOG=info ./helium-server

Developer Guide

Project Structure

server/
├── Cargo.toml              # Dependencies and metadata
├── src/
│   ├── main.rs             # Entry point and startup logic
│   ├── worker/             # Worker mode implementations
│   │   ├── mod.rs          # Worker configuration and dispatch
│   │   ├── grpc.rs         # gRPC server implementation
│   │   ├── consumer.rs     # Background message consumer
│   │   ├── mailer.rs       # Email service worker
│   │   ├── cron_executor.rs # Scheduled task executor
│   │   ├── subscribe_api.rs # Subscription REST API
│   │   └── webhook_api.rs  # Webhook REST API
│   └── hooks/              # Extension points (currently unused)
│       └── mod.rs

Building from Source

# Development build
cd server
cargo build

# Release build (optimized)
cargo build --release

# Run with specific worker mode
WORK_MODE=grpc cargo run

Adding New Worker Modes

  1. Create worker implementation:
// src/worker/new_worker.rs
pub struct NewWorker {
    // worker fields
}

impl NewWorker {
    pub async fn initialize(args: YourArgs) -> anyhow::Result<Self> {
        // initialization logic
    }

    pub async fn run(&self) -> anyhow::Result<()> {
        // worker main loop
    }
}
  2. Add to worker configuration:
// src/worker/mod.rs
pub enum WorkerArgs {
    // existing variants...
    NewWorker(YourArgs),
}

impl WorkerArgs {
    pub fn load_from_env() -> anyhow::Result<Self> {
        let work_mode = std::env::var("WORK_MODE")?;
        match work_mode.as_str() {
            // existing modes...
            "new_worker" => {
                // parse environment variables
                Ok(WorkerArgs::NewWorker(args))
            }
        }
    }

    pub async fn execute_worker(self) -> anyhow::Result<()> {
        match self {
            // existing modes...
            WorkerArgs::NewWorker(args) => {
                let worker = NewWorker::initialize(args).await?;
                worker.run().await
            }
        }
    }
}

gRPC Service Development

The gRPC worker automatically integrates all modules. To add new services:

  1. Implement your service in the appropriate module (e.g., modules/your_module/)

  2. Add to gRPC worker:

// src/worker/grpc.rs
impl GrpcWorker {
    pub async fn initialize(args: GrpcWorkModeArgs) -> Result<Self, anyhow::Error> {
        // ... existing initialization ...

        let your_service = YourService::new(database_processor.clone());

        Ok(Self {
            // ... existing fields ...
            your_service,
        })
    }

    pub fn server_ready(self) -> Router<...> {
        tonic::transport::server::Server::builder()
            // ... existing services ...
            .add_service(YourServiceServer::new(self.your_service))
    }
}

Database Migrations

Database schema is managed through SQLx migrations in the migrations/ directory. When adding new features:

  1. Create migration files:
# Create new migration
sqlx migrate add your_feature_name

# This creates:
# migrations/TIMESTAMP_your_feature_name.up.sql
# migrations/TIMESTAMP_your_feature_name.down.sql
  2. Run migrations:
# Apply migrations
sqlx migrate run --database-url $DATABASE_URL

# Revert last migration
sqlx migrate revert --database-url $DATABASE_URL

Testing

# Run all tests
cargo test

# Run specific module tests
cargo test --package helium-server

# Integration tests with database
DATABASE_URL=postgres://user:password@localhost/helium_test cargo test

Performance Considerations

  • Memory Usage: Each worker typically uses 40-200MB RAM
  • CPU Efficiency: Single-core performance optimized, can handle 1000+ RPS
  • Connection Pooling: Database connections are shared across services
  • Async Processing: All I/O operations are non-blocking

Troubleshooting

Common Issues

Service won’t start:

# Check environment variables
env | grep -E "(DATABASE_URL|MQ_URL|REDIS_URL|WORK_MODE)"

# Verify database migrations are applied
sqlx migrate info --database-url $DATABASE_URL

Database migration issues:

# Check migration status
sqlx migrate info --database-url $DATABASE_URL

# Force apply migrations (if stuck)
sqlx migrate run --database-url $DATABASE_URL

# Revert last migration if needed
sqlx migrate revert --database-url $DATABASE_URL

# Reset database (CAUTION: destroys all data)
sqlx database reset --database-url $DATABASE_URL

Performance issues:

# Enable request tracing
RUST_LOG=helium_server=trace ./helium-server

# Profile with flamegraph
cargo flamegraph --bin helium-server

Logs and Debugging

# Debug logging
RUST_LOG=debug ./helium-server

# Trace specific modules
RUST_LOG=helium_server::worker::grpc=trace,info ./helium-server

Configuration Validation

Ensure all required environment variables are properly set:

# Validate configuration script
#!/bin/bash
set -e

echo "Validating Helium server configuration..."

# Check required variables
: "${WORK_MODE:?WORK_MODE not set}"
: "${DATABASE_URL:?DATABASE_URL not set}"
: "${MQ_URL:?MQ_URL not set}"
: "${REDIS_URL:?REDIS_URL not set}"

# Validate work mode
case "$WORK_MODE" in
  grpc|subscribe_api|webhook_api|consumer|mailer|cron_executor)
    echo "✓ Valid WORK_MODE: $WORK_MODE"
    ;;
  *)
    echo "✗ Invalid WORK_MODE: $WORK_MODE"
    exit 1
    ;;
esac

# Check if migrations are applied
if command -v sqlx >/dev/null 2>&1; then
  if sqlx migrate info --database-url "$DATABASE_URL" | grep -q "pending"; then
    echo "⚠ Warning: Pending database migrations found"
    echo "Run: sqlx migrate run --database-url $DATABASE_URL"
  else
    echo "✓ Database migrations are up to date"
  fi
else
  echo "⚠ Warning: sqlx CLI not found - cannot verify migrations"
  echo "Install with: cargo install sqlx-cli --no-default-features --features postgres"
fi

echo "Configuration validation complete!"

External Dependencies

The Helium system requires several external services to function properly. The Helium application itself runs in Docker containers, but the core infrastructure dependencies (PostgreSQL, Redis, RabbitMQ) should be provisioned as external managed services for production deployments.

While some dependencies are core infrastructure requirements, others are module-specific and may be optional depending on your deployment configuration.

Core Infrastructure Dependencies

These dependencies are required for all Helium deployments:

1. PostgreSQL Database

Purpose: Primary data store for all application data
Version: PostgreSQL 12+ recommended
Configuration:

  • Environment variable: DATABASE_URL
  • Format: postgres://user:password@host:port/database
  • Example: postgres://helium:password@localhost:5432/helium_db

Database Schema:

  • ⚠️ CRITICAL: SQLx migrations must be run before starting the application
  • All database schema changes are managed through SQLx migrations in the /migrations directory
  • Use sqlx migrate run --database-url $DATABASE_URL to apply migrations

External Service Requirements:

  • NOT containerized - PostgreSQL should run as an external managed service
  • Recommended: Use cloud-managed PostgreSQL (AWS RDS, Google Cloud SQL, Azure Database, etc.)
  • Alternative: Dedicated PostgreSQL server with proper backup and high availability setup

2. Redis

Purpose: Caching, session storage, and configuration store
Version: Redis 6+ recommended
Configuration:

  • Environment variable: REDIS_URL
  • Format: redis://host:port or redis://user:password@host:port
  • Example: redis://localhost:6379

Usage:

  • Session management and authentication tokens
  • Configuration caching across modules
  • Temporary data storage (OAuth challenges, etc.)

External Service Requirements:

  • NOT containerized - Redis should run as an external managed service
  • Recommended: Use cloud-managed Redis (AWS ElastiCache, Google Memorystore, Azure Cache, etc.)
  • Alternative: Dedicated Redis server with persistence and clustering for production

3. RabbitMQ

Purpose: Message queue for asynchronous processing between modules
Version: RabbitMQ 3.8+ recommended
Configuration:

  • Environment variable: MQ_URL
  • Format: amqp://user:password@host:port/
  • Example: amqp://helium:password@localhost:5672/

Usage:

  • Inter-module communication
  • Background job processing
  • Event-driven architecture support

External Service Requirements:

  • NOT containerized - RabbitMQ should run as an external managed service
  • Recommended: Use cloud-managed message queues (AWS MQ, Google Cloud Pub/Sub, Azure Service Bus)
  • Alternative: Dedicated RabbitMQ cluster with proper clustering and high availability

Module-Specific Dependencies

These dependencies are required only when using specific modules:

Auth Module - OAuth Providers (Optional)

Purpose: Social authentication (Google, Microsoft, GitHub, Discord)
Required: Only if OAuth authentication is enabled
Configuration: Stored in database/Redis configuration

Supported Providers:

  • Google OAuth 2.0
  • Microsoft Azure AD
  • GitHub OAuth
  • Discord OAuth

Setup Requirements:

  1. Create OAuth applications with each provider
  2. Configure redirect URIs to your Helium deployment
  3. Store client ID and secret in the system configuration
  4. Configure OAuth provider settings via the management interface

Configuration Structure:

{
  "auth": {
    "oauth_providers": {
      "providers": [
        {
          "name": "google",
          "client_id": "your-client-id",
          "client_secret": "your-client-secret",
          "redirect_uri": "https://your-domain.com/auth/oauth/callback"
        }
      ],
      "challenge_expiration": "5m"
    }
  }
}

Mailer Module - SMTP Server (Required for Email)

Purpose: Email delivery for user notifications, verification, etc.
Required: When email functionality is needed
Configuration: Stored in database/Redis configuration

SMTP Configuration:

{
  "mailer": {
    "host": "smtp.gmail.com",
    "port": 587,
    "username": "your-email@gmail.com",
    "password": "your-app-password",
    "sender": "noreply@your-domain.com",
    "starttls": true
  }
}

Supported SMTP Features:

  • STARTTLS encryption
  • Plain authentication
  • Custom sender addresses
  • HTML email templates

Common SMTP Providers:

  • Gmail: smtp.gmail.com:587 (requires app passwords)
  • Outlook/Hotmail: smtp-mail.outlook.com:587
  • SendGrid: smtp.sendgrid.net:587
  • Mailgun: smtp.mailgun.org:587
  • Amazon SES: email-smtp.region.amazonaws.com:587

Shop Module - Epay Payment Provider (Required for Payments)

Purpose: Payment processing for e-commerce functionality
Required: When payment processing is needed
Configuration: Stored in database as epay provider credentials

Epay Provider Setup:

  1. Register with an Epay-compatible payment provider
  2. Obtain merchant credentials (PID, Key, Merchant URL)
  3. Configure webhook endpoints for payment notifications
  4. Add provider credentials via the management interface

Supported Payment Methods:

  • Alipay (alipay)
  • WeChat Pay (wxpay)
  • USDT cryptocurrency (usdt)

Configuration Requirements:

{
  "shop": {
    "epay_notify_url": "https://your-domain.com/api/shop/epay/callback",
    "epay_return_url": "https://your-domain.com/payment/success",
    "max_unpaid_orders": 5,
    "auto_cancel_after": "30m"
  }
}

Epay Provider Database Entry:

INSERT INTO epay_provider_credentials (
  display_name,
  enabled_channels,
  key,
  pid,
  merchant_url
) VALUES (
  'My Payment Provider',
  ARRAY['alipay', 'wxpay'],
  'your-merchant-key',
  1234,
  'https://pay.provider.com/submit.php'
);

Development Dependencies

These are required for building and developing the project:

Protocol Buffers Compiler

Purpose: Compiling .proto files for gRPC services
Installation:

  • Ubuntu/Debian: apt-get install protobuf-compiler
  • macOS: brew install protobuf
  • Already included in Docker build process

SQLx CLI

Purpose: Database migration management
Installation: cargo install sqlx-cli --no-default-features --features postgres
Usage:

  • Apply migrations: sqlx migrate run
  • Create new migration: sqlx migrate add <name>

Docker/Kubernetes Deployment Considerations

What Should Be Containerized

✅ Containerize:

  • Helium server application (helium-server)
  • Application-specific components and workers

❌ Do NOT Containerize:

  • PostgreSQL - Use external managed database services
  • Redis - Use external managed cache services
  • RabbitMQ - Use external managed message queue services

Infrastructure Handled by Platform

When deploying with Docker and Kubernetes, these infrastructure concerns are handled by the orchestration platform:

  • Load Balancers: Kubernetes ingress controllers handle load balancing
  • TLS Certificates: cert-manager or similar tools handle SSL/TLS
  • Service Discovery: Kubernetes DNS handles service discovery
  • Health Checks: Kubernetes probes handle application health monitoring
  • Logging: Container runtime and logging drivers handle log aggregation

AWS:

  • PostgreSQL: Amazon RDS for PostgreSQL
  • Redis: Amazon ElastiCache for Redis
  • RabbitMQ: Amazon MQ for RabbitMQ

Google Cloud:

  • PostgreSQL: Cloud SQL for PostgreSQL
  • Redis: Memorystore for Redis
  • RabbitMQ: Cloud Pub/Sub (alternative) or third-party RabbitMQ

Azure:

  • PostgreSQL: Azure Database for PostgreSQL
  • Redis: Azure Cache for Redis
  • RabbitMQ: Azure Service Bus (alternative) or third-party RabbitMQ

Environment Variables for Containers

apiVersion: apps/v1
kind: Deployment
metadata:
  name: helium-server
spec:
  template:
    spec:
      containers:
        - name: helium-server
          image: helium-server:latest
          env:
            - name: WORK_MODE
              value: "grpc"
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: helium-secrets
                  key: database-url
            - name: REDIS_URL
              valueFrom:
                secretKeyRef:
                  name: helium-secrets
                  key: redis-url
            - name: MQ_URL
              valueFrom:
                secretKeyRef:
                  name: helium-secrets
                  key: rabbitmq-url

Security Considerations

Credentials Management

  1. Never store credentials in plain text
  2. Use Kubernetes secrets or similar secure storage
  3. Rotate credentials regularly
  4. Use dedicated service accounts with minimal permissions

Network Security

  1. Database: Restrict access to application subnets only
  2. Redis: Enable authentication and restrict network access
  3. RabbitMQ: Use strong passwords and enable TLS
  4. SMTP: Use app passwords or OAuth tokens when available

OAuth Security

  1. Use HTTPS for all OAuth redirect URIs
  2. Validate redirect URI domains strictly
  3. Use state parameter for CSRF protection (handled automatically)

Troubleshooting

Database Connection Issues

# Test database connectivity
psql $DATABASE_URL -c "SELECT version();"

# Check migration status
sqlx migrate info --database-url $DATABASE_URL

Redis Connection Issues

# Test Redis connectivity
redis-cli -u $REDIS_URL ping

# Check Redis memory usage
redis-cli -u $REDIS_URL info memory

RabbitMQ Connection Issues

# Check queue status
rabbitmqctl list_queues

# Check connection status
rabbitmqctl list_connections

SMTP Testing

The mailer module provides test endpoints and logging to help diagnose SMTP issues. Check application logs for detailed SMTP connection and authentication errors.

Epay Integration Issues

  1. Verify webhook URLs are accessible from the internet
  2. Check payment provider’s callback logs
  3. Ensure merchant credentials are correctly configured
  4. Validate signature verification in callback processing

Optional Observability Stack

OpenTelemetry & Grafana Stack (Optional)

Purpose: Comprehensive observability with distributed tracing, metrics, and log aggregation
Required: No - completely optional enhancement
Configuration: OTEL_COLLECTOR environment variable

Components:

  • OpenTelemetry Collector: Telemetry data collection and routing
  • Grafana Tempo: Distributed tracing backend
  • Prometheus: Metrics storage and querying
  • Grafana Loki: Log aggregation
  • Grafana: Unified visualization dashboard

When to Use:

  • Production deployments requiring detailed performance analysis
  • Multi-instance deployments needing distributed tracing
  • Teams requiring centralized observability dashboards
  • Troubleshooting complex performance issues

Deployment:

  • NOT containerized with application - Deploy as separate Kubernetes workloads or use Grafana Cloud
  • Recommended: Deploy Grafana stack in dedicated observability namespace
  • Alternative: Use managed services (Grafana Cloud, Datadog, New Relic)

Note: Helium automatically falls back to basic structured logging if OpenTelemetry is not configured. See the comprehensive Observability with OpenTelemetry guide for full setup instructions.

Summary

Dependency       Required     Purpose             Configuration    Deployment
PostgreSQL       Yes          Primary database    DATABASE_URL     External managed service
Redis            Yes          Caching/sessions    REDIS_URL        External managed service
RabbitMQ         Yes          Message queuing     MQ_URL           External managed service
SMTP Server      Conditional  Email delivery      Database config  External service
OAuth Providers  Optional     Social auth         Database config  External providers
Epay Provider    Conditional  Payment processing  Database config  External service
Observability    Optional     Tracing & metrics   OTEL_COLLECTOR   External stack/cloud

Next Steps: After setting up these dependencies, proceed to the Helium Server Deployment Guide for detailed deployment instructions.

Observability with OpenTelemetry

Helium server includes optional OpenTelemetry (OTel) integration for comprehensive observability. This integration is completely optional — the server will work perfectly fine without it using basic structured logging.

What is OpenTelemetry?

OpenTelemetry provides distributed tracing, metrics collection, and contextual logging for production systems. Use it when:

  • Running multiple worker instances requiring distributed tracing
  • Need detailed performance analysis and troubleshooting
  • Want centralized observability dashboards

Skip it for simple deployments, development environments, or when basic logging is sufficient.

Configuration

Enable OpenTelemetry by setting the OTEL_COLLECTOR environment variable:

export OTEL_COLLECTOR="http://otel-collector:4317"
./helium-server

If not set or initialization fails, the server automatically falls back to basic logging.

Service Names

Each worker mode reports with a distinct service name:

Worker Mode     Service Name
grpc            Helium.grpc
subscribe_api   Helium.subscribe-api
webhook_api     Helium.webhook-api
consumer        Helium.consumer
mailer          Helium.mailer
cron_executor   Helium.cron-executor
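The mapping above can be sketched as a simple function (illustrative; the real implementation inside the server may differ):

```rust
// Map a WORK_MODE value to the OpenTelemetry service name it reports.
pub fn otel_service_name(mode: &str) -> Option<String> {
    let suffix = match mode {
        "grpc" => "grpc",
        "subscribe_api" => "subscribe-api",
        "webhook_api" => "webhook-api",
        "consumer" => "consumer",
        "mailer" => "mailer",
        "cron_executor" => "cron-executor",
        _ => return None, // unknown mode: no telemetry identity
    };
    Some(format!("Helium.{suffix}"))
}
```

Distinct per-mode names let Grafana separate traces from, say, the gRPC API and the mailer worker.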

For production deployments, we recommend the Grafana observability stack — an open-source, Kubernetes-native solution with unified dashboards for traces, metrics, and logs.

Components

  1. OpenTelemetry Collector: Receives and routes telemetry
  2. Grafana Tempo: Distributed tracing storage
  3. Prometheus: Metrics collection
  4. Grafana Loki: Log aggregation
  5. Grafana: Unified visualization

Deployment

Deploy the Grafana stack alongside your Kubernetes cluster:

1. Add Helm Repositories

helm repo add grafana https://grafana.github.io/helm-charts
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo add open-telemetry https://open-telemetry.github.io/opentelemetry-helm-charts
helm repo update

2. Create Namespace

kubectl create namespace observability

3. Deploy OpenTelemetry Collector

Create otel-collector-values.yaml:

mode: deployment

config:
  receivers:
    otlp:
      protocols:
        grpc:
          endpoint: 0.0.0.0:4317
        http:
          endpoint: 0.0.0.0:4318

  processors:
    batch:
      timeout: 10s
      send_batch_size: 1024

  exporters:
    # Traces to Tempo
    otlp/tempo:
      endpoint: tempo.observability.svc.cluster.local:4317
      tls:
        insecure: true

    # Metrics to Prometheus
    prometheus:
      endpoint: 0.0.0.0:8889
      namespace: helium

    # Logs to Loki
    loki:
      endpoint: http://loki.observability.svc.cluster.local:3100/loki/api/v1/push

  service:
    pipelines:
      traces:
        receivers: [otlp]
        processors: [batch]
        exporters: [otlp/tempo]

      metrics:
        receivers: [otlp]
        processors: [batch]
        exporters: [prometheus]

      logs:
        receivers: [otlp]
        processors: [batch]
        exporters: [loki]

ports:
  otlp-grpc:
    enabled: true
    containerPort: 4317
    servicePort: 4317
    protocol: TCP
  otlp-http:
    enabled: true
    containerPort: 4318
    servicePort: 4318
    protocol: TCP
  metrics:
    enabled: true
    containerPort: 8889
    servicePort: 8889
    protocol: TCP

Then install the collector with these values (the chart lives in the open-telemetry Helm repository):

helm install otel-collector open-telemetry/opentelemetry-collector \
  --namespace observability \
  --values otel-collector-values.yaml

4. Deploy Tempo, Loki, and Prometheus

# Tempo for traces
helm install tempo grafana/tempo \
  --namespace observability \
  --set tempo.receivers.otlp.protocols.grpc.endpoint=0.0.0.0:4317

# Loki for logs
helm install loki grafana/loki-stack \
  --namespace observability \
  --set loki.enabled=true \
  --set promtail.enabled=false

5. Deploy Prometheus

helm install prometheus prometheus-community/kube-prometheus-stack \
  --namespace observability \
  --set grafana.enabled=false

6. Deploy Grafana

helm install grafana grafana/grafana \
  --namespace observability \
  --set adminPassword=changeme

Configure data sources in Grafana to connect Tempo, Prometheus, and Loki.

Troubleshooting

Server logs show “Failed to initialize OpenTelemetry”

Check that the OTel Collector is reachable at the configured endpoint. The server will automatically fall back to basic logging.

Missing traces in Grafana

Verify the data pipeline: Helium → OTel Collector → Tempo. Check logs at each stage.

Performance impact

OpenTelemetry adds minimal overhead: < 2% CPU, ~10-20MB memory, < 1ms latency per request.

Disabling OpenTelemetry

Simply unset the OTEL_COLLECTOR variable — the server automatically falls back to basic logging.

Summary

OpenTelemetry in Helium is completely optional:

  • Set OTEL_COLLECTOR to enable, leave unset to use basic logging
  • Automatic fallback if initialization fails
  • Recommended for production with multiple instances
  • Grafana stack provides open-source, Kubernetes-native observability

For detailed Helm deployment configurations, refer to the official Grafana Helm charts documentation.

Health Checks for Kubernetes

Helium server provides HTTP health check endpoints designed for Kubernetes liveness and readiness probes. These endpoints run on a separate internal port (default: 9090) and are enabled for all worker modes.

Overview

Health checks help Kubernetes determine:

  • Liveness: Is the container alive and should it be restarted if it becomes unresponsive?
  • Readiness: Is the container ready to handle requests?

Helium implements both probe types on a dedicated HTTP server that runs alongside each worker mode.

Endpoints

Liveness Probe: /healthz

Returns 200 OK with a JSON response if the server is running:

{
  "status": "ok"
}

This endpoint always returns success if the health check server is responding. Kubernetes uses this to determine if the container should be restarted.

Readiness Probe: /readyz

Checks connectivity to all dependencies before returning status:

Success Response (200 OK):

{
  "status": "ok",
  "database": "ok",
  "redis": "ok",
  "rabbitmq": "ok"
}

Failure Response (503 Service Unavailable):

{
  "status": "error",
  "database": "ok",
  "redis": "error",
  "rabbitmq": "ok",
  "error": "Redis error: Connection refused"
}

The readiness probe checks:

  • PostgreSQL: Executes a simple query (SELECT 1)
  • Redis: Sends a PING command
  • RabbitMQ: Validates connection pool status

All worker modes check the same three dependencies.
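A simplified sketch of how the three checks aggregate into the status code and body shown above (the real handler also includes an error message field on failure, omitted here):

```rust
// Combine the three dependency checks into an HTTP status code and a
// JSON body matching the documented /readyz responses.
pub fn readiness(db_ok: bool, redis_ok: bool, mq_ok: bool) -> (u16, String) {
    let all_ok = db_ok && redis_ok && mq_ok;
    let s = |ok: bool| if ok { "ok" } else { "error" };
    let body = format!(
        "{{\"status\":\"{}\",\"database\":\"{}\",\"redis\":\"{}\",\"rabbitmq\":\"{}\"}}",
        s(all_ok), s(db_ok), s(redis_ok), s(mq_ok)
    );
    (if all_ok { 200 } else { 503 }, body)
}
```

Any single failing dependency marks the pod unready, so Kubernetes stops routing traffic to it without restarting it.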

Configuration

Health Check Port

Set the HEALTH_CHECK_PORT environment variable to customize the port (default: 9090):

export HEALTH_CHECK_PORT=9090

This port should be:

  • Internal only: Not exposed to external traffic
  • Accessible by Kubernetes: For probe requests
  • Different from main service ports: To avoid conflicts

Worker Modes

Health checks are available in all worker modes:

Worker Mode     Main Port  Health Check Port  Dependencies Checked
grpc            50051      9090               Database, Redis, RabbitMQ
subscribe_api   8080       9090               Database, Redis, RabbitMQ
webhook_api     8081       9090               Database, Redis, RabbitMQ
consumer        N/A        9090               Database, Redis, RabbitMQ
mailer          N/A        9090               Database, Redis, RabbitMQ
cron_executor   N/A        9090               Database, Redis, RabbitMQ

Kubernetes Deployment

Example Pod Configuration

Here’s how to configure health checks in your Kubernetes deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: helium-grpc
spec:
  replicas: 3
  selector:
    matchLabels:
      app: helium-grpc
  template:
    metadata:
      labels:
        app: helium-grpc
    spec:
      containers:
      - name: helium-grpc
        image: helium-server:latest
        env:
        - name: WORK_MODE
          value: "grpc"
        - name: LISTEN_ADDR
          value: "0.0.0.0:50051"
        - name: HEALTH_CHECK_PORT
          value: "9090"
        - name: DATABASE_URL
          valueFrom:
            secretKeyRef:
              name: helium-secrets
              key: database-url
        - name: REDIS_URL
          valueFrom:
            secretKeyRef:
              name: helium-secrets
              key: redis-url
        - name: MQ_URL
          valueFrom:
            secretKeyRef:
              name: helium-secrets
              key: mq-url
        ports:
        - name: grpc
          containerPort: 50051
          protocol: TCP
        - name: health
          containerPort: 9090
          protocol: TCP
        livenessProbe:
          httpGet:
            path: /healthz
            port: health
          initialDelaySeconds: 10
          periodSeconds: 10
          timeoutSeconds: 5
          failureThreshold: 3
        readinessProbe:
          httpGet:
            path: /readyz
            port: health
          initialDelaySeconds: 5
          periodSeconds: 5
          timeoutSeconds: 5
          failureThreshold: 3

Probe Configuration Guidelines

Liveness Probe:

  • initialDelaySeconds: 10-30 seconds (allow time for startup)
  • periodSeconds: 10-30 seconds (check periodically)
  • timeoutSeconds: 5 seconds
  • failureThreshold: 3 (restart after 3 consecutive failures)

Readiness Probe:

  • initialDelaySeconds: 5-10 seconds (faster than liveness)
  • periodSeconds: 5-10 seconds (check more frequently)
  • timeoutSeconds: 5 seconds
  • failureThreshold: 3 (mark unready after 3 failures)
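A quick sanity check when tuning these numbers: the worst-case time before Kubernetes acts on a dead container is roughly periodSeconds * failureThreshold, plus up to one timeoutSeconds. For the liveness values above:

```shell
# Worst-case liveness detection latency for the values used above.
period=10
threshold=3
timeout=5
echo "restart triggered after up to $(( period * threshold + timeout ))s"
```

If startup under load can exceed this window, raise initialDelaySeconds rather than loosening the thresholds.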

Service Configuration

For API worker modes (grpc, subscribe_api, webhook_api), configure a Service:

apiVersion: v1
kind: Service
metadata:
  name: helium-grpc
spec:
  type: ClusterIP
  ports:
  - name: grpc
    port: 50051
    targetPort: grpc
    protocol: TCP
  selector:
    app: helium-grpc

Note: The health check port (9090) is not exposed in the Service. It’s only for Kubernetes probes.

Worker Mode Behavior

API Modes (grpc, subscribe_api, webhook_api)

For API modes, the health check server runs alongside the main API server:

  • When the main server exits, the health check server is immediately terminated
  • Process exits when either server fails
  • Ensures no “zombie” containers serving health checks without handling requests

Background Worker Modes (consumer, mailer, cron_executor)

For background workers, the health check server runs continuously:

  • Liveness probe confirms the worker process is alive
  • Readiness probe ensures dependencies are accessible
  • Worker loops indefinitely alongside health check server

Troubleshooting

Health Check Server Not Starting

Symptom: Probes fail immediately with connection errors

Solutions:

  1. Check logs for health check server errors
  2. Verify HEALTH_CHECK_PORT is not already in use
  3. Ensure the port is accessible within the pod

Readiness Probe Failing

Symptom: Pod remains in “Not Ready” state

Solutions:

  1. Check which dependency is failing in the /readyz response
  2. Verify connection strings (DATABASE_URL, REDIS_URL, MQ_URL)
  3. Ensure network policies allow pod access to dependencies
  4. Check if dependencies are healthy

Example debugging:

# Forward health check port to local machine
kubectl port-forward pod/helium-grpc-xyz 9090:9090

# Check readiness endpoint
curl http://localhost:9090/readyz

Liveness Probe Causing Restart Loop

Symptom: Pod repeatedly restarts with liveness probe failures

Solutions:

  1. Increase initialDelaySeconds (worker may need more startup time)
  2. Increase failureThreshold (allow more failures before restart)
  3. Check if worker is deadlocked or stuck (examine logs before restart)

Worker Exits But Pod Stays Running

Symptom: Container appears healthy but doesn’t process requests

This should not happen with the current implementation:

  • API workers: Health check is aborted when main server exits
  • Background workers: Return from execute_worker() causes process exit

If this occurs, file a bug report.

Security Considerations

Port Exposure

The health check port (9090) should never be exposed externally:

  • Don’t create Ingress rules for health check endpoints
  • Don’t expose the health check port in the Service definition
  • Use network policies to restrict access to Kubernetes control plane only

Sensitive Information

Health check responses contain minimal information:

  • No version numbers
  • No internal IPs or hostnames
  • No authentication tokens
  • Only dependency status (ok/error)

Error messages may contain connection details. Ensure logs are secured appropriately.

Best Practices

  1. Use separate ports: Never combine health checks with main service endpoints
  2. Set appropriate timeouts: Balance between quick detection and false positives
  3. Monitor probe metrics: Track probe success rates in your observability stack
  4. Test locally: Use port-forwarding to verify health checks before deployment
  5. Account for sidecars: If using a service mesh proxy (Istio, Linkerd), add startup probes so pods are not marked unhealthy before the proxy is ready

Summary

Helium’s health check endpoints provide robust Kubernetes integration:

  • Liveness probe (/healthz): Detects unresponsive containers
  • Readiness probe (/readyz): Ensures dependencies are healthy
  • Separate port (default 9090): Isolated from main services
  • All worker modes: Consistent behavior across deployment types
  • Process lifecycle: Ensures clean exits, no zombie containers

Configure these probes in your Kubernetes deployments to enable automatic recovery and load balancing.

Docker-based Deployment

The Helium system is designed with a multi-worker architecture that can be deployed using containers. Each worker type serves a specific purpose and has different scaling requirements. This deployment approach provides:

  • Scalability: Independent scaling of different worker types based on load
  • Reliability: Fault isolation between different services
  • Flexibility: Easy deployment across different environments
  • Maintainability: Simplified updates and rollbacks

Prerequisites

Before proceeding with this guide, ensure you have:

  • External dependencies configured (see External Dependencies)
  • Docker or container runtime installed
  • Kubernetes cluster (for Kubernetes deployment)
  • Basic understanding of containerization concepts

Container Architecture

Worker Types and Scaling Patterns

The Helium server supports six distinct worker modes, each with specific scaling characteristics:

Worker Mode      Port    Scaling               Description
grpc             50051   ✅ Horizontal         Main gRPC API server - can be load balanced
subscribe_api    8080    ✅ Horizontal         RESTful subscription API - can be load balanced
webhook_api      8081    ✅ Horizontal         Webhook handler for payments - can be load balanced
consumer         -       ✅ Horizontal         Background message consumer - multiple instances supported
mailer           -       ⚠️ Single preferred   Email service - more than one instance not recommended
cron_executor    -       🚫 Single only        Scheduled tasks - MUST be exactly 1 instance

Scaling Constraints

⚠️ Critical Scaling Limitations

mailer Worker:

  • Recommendation: Deploy as single instance only
  • Reason: Relies on SMTP server connections and may cause email delivery issues with multiple instances
  • Impact: Multiple mailer instances can lead to duplicate emails or SMTP rate limiting

cron_executor Worker:

  • Requirement: MUST have exactly one instance
  • Reason: Scans the database to check for scheduled tasks in the queue
  • Impact: Multiple instances will cause duplicate task execution and potential data corruption

✅ Scalable Workers

API Workers (grpc, subscribe_api, webhook_api):

  • Can be horizontally scaled based on traffic demands
  • Support standard load balancing techniques
  • Share state through external Redis and PostgreSQL

consumer Worker:

  • Can run multiple instances for processing message queues
  • Automatically distributes work through RabbitMQ

Docker Image

Building the Docker Image

The project includes a multi-stage Dockerfile optimized for production:

# Build the Docker image
docker build -t helium-server:latest .

# Tag for registry
docker tag helium-server:latest your-registry/helium-server:v1.0.0

# Push to registry
docker push your-registry/helium-server:v1.0.0

Image Characteristics

  • Base Image: gcr.io/distroless/cc for minimal attack surface
  • Size: ~50MB final image
  • Architecture: Multi-arch support (amd64, arm64)
  • Security: Non-root user, minimal dependencies

Environment Variables

Configure containers using these environment variables:

# Required - Worker mode selection
WORK_MODE=grpc  # grpc, subscribe_api, webhook_api, consumer, mailer, cron_executor

# Required - Database connections
DATABASE_URL=postgres://user:password@postgres-host:5432/helium_db
REDIS_URL=redis://redis-host:6379
MQ_URL=amqp://user:password@rabbitmq-host:5672/

# Optional - Server configuration
LISTEN_ADDR=0.0.0.0:50051  # For API workers
SCAN_INTERVAL=60           # For cron_executor only
RUST_LOG=info             # Logging level
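Every worker mode fails fast when one of the required variables is missing, which surfaces as a crash loop. A small pre-flight check in a wrapper or entrypoint script (a sketch, not part of the Helium image; assumes a POSIX shell is available in your container) gives a clearer error:

```shell
# Fail early with a readable message when a required variable is unset or empty.
require_env() {
  for var in WORK_MODE DATABASE_URL REDIS_URL MQ_URL; do
    eval "val=\${$var:-}"
    if [ -z "$val" ]; then
      echo "missing required env var: $var" >&2
      return 1
    fi
  done
  echo "all required variables present"
}

WORK_MODE=grpc
DATABASE_URL=postgres://user:password@postgres-host:5432/helium_db
REDIS_URL=redis://redis-host:6379
MQ_URL=amqp://user:password@rabbitmq-host:5672/
require_env
```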

Docker Compose Deployment

For development or simple production setups:

version: "3.8"

services:
  # Main gRPC API (scalable)
  helium-grpc:
    image: helium-server:latest
    ports:
      - "50051:50051"
    environment:
      WORK_MODE: grpc
      DATABASE_URL: postgres://helium:password@postgres:5432/helium_db
      REDIS_URL: redis://redis:6379
      MQ_URL: amqp://helium:password@rabbitmq:5672/
      LISTEN_ADDR: 0.0.0.0:50051
    depends_on:
      - postgres
      - redis
      - rabbitmq
    restart: unless-stopped
    deploy:
      replicas: 2 # Can be scaled horizontally

  # Subscription API (scalable)
  helium-subscribe-api:
    image: helium-server:latest
    ports:
      - "8080:8080"
    environment:
      WORK_MODE: subscribe_api
      DATABASE_URL: postgres://helium:password@postgres:5432/helium_db
      REDIS_URL: redis://redis:6379
      MQ_URL: amqp://helium:password@rabbitmq:5672/
      LISTEN_ADDR: 0.0.0.0:8080
    depends_on:
      - postgres
      - redis
      - rabbitmq
    restart: unless-stopped
    deploy:
      replicas: 2 # Can be scaled horizontally

  # Webhook API (scalable)
  helium-webhook-api:
    image: helium-server:latest
    ports:
      - "8081:8081"
    environment:
      WORK_MODE: webhook_api
      DATABASE_URL: postgres://helium:password@postgres:5432/helium_db
      REDIS_URL: redis://redis:6379
      MQ_URL: amqp://helium:password@rabbitmq:5672/
      LISTEN_ADDR: 0.0.0.0:8081
    depends_on:
      - postgres
      - redis
      - rabbitmq
    restart: unless-stopped
    deploy:
      replicas: 2 # Can be scaled horizontally

  # Background consumer (scalable)
  helium-consumer:
    image: helium-server:latest
    environment:
      WORK_MODE: consumer
      DATABASE_URL: postgres://helium:password@postgres:5432/helium_db
      REDIS_URL: redis://redis:6379
      MQ_URL: amqp://helium:password@rabbitmq:5672/
    depends_on:
      - postgres
      - redis
      - rabbitmq
    restart: unless-stopped
    deploy:
      replicas: 3 # Can run multiple instances

  # Mailer service (single instance recommended)
  helium-mailer:
    image: helium-server:latest
    environment:
      WORK_MODE: mailer
      DATABASE_URL: postgres://helium:password@postgres:5432/helium_db
      REDIS_URL: redis://redis:6379
      MQ_URL: amqp://helium:password@rabbitmq:5672/
    depends_on:
      - postgres
      - redis
      - rabbitmq
    restart: unless-stopped
    deploy:
      replicas: 1 # SINGLE INSTANCE ONLY

  # Cron executor (must be single instance)
  helium-cron:
    image: helium-server:latest
    environment:
      WORK_MODE: cron_executor
      DATABASE_URL: postgres://helium:password@postgres:5432/helium_db
      REDIS_URL: redis://redis:6379
      MQ_URL: amqp://helium:password@rabbitmq:5672/
      SCAN_INTERVAL: 60
    depends_on:
      - postgres
      - redis
      - rabbitmq
    restart: unless-stopped
    deploy:
      replicas: 1 # MUST BE EXACTLY 1

  # External dependencies (for development only)
  postgres:
    image: postgres:15
    environment:
      POSTGRES_USER: helium
      POSTGRES_PASSWORD: password
      POSTGRES_DB: helium_db
    volumes:
      - postgres_data:/var/lib/postgresql/data
    ports:
      - "5432:5432"

  redis:
    image: redis:7
    ports:
      - "6379:6379"
    volumes:
      - redis_data:/data

  rabbitmq:
    image: rabbitmq:3-management
    environment:
      RABBITMQ_DEFAULT_USER: helium
      RABBITMQ_DEFAULT_PASS: password
    ports:
      - "5672:5672"
      - "15672:15672"
    volumes:
      - rabbitmq_data:/var/lib/rabbitmq

volumes:
  postgres_data:
  redis_data:
  rabbitmq_data:

Kubernetes Deployment

For production Kubernetes deployments:

Namespace and ConfigMap

apiVersion: v1
kind: Namespace
metadata:
  name: helium-system
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: helium-config
  namespace: helium-system
data:
  RUST_LOG: "info"
  SCAN_INTERVAL: "60"

Secrets

apiVersion: v1
kind: Secret
metadata:
  name: helium-secrets
  namespace: helium-system
type: Opaque
stringData:
  database-url: "postgres://helium:password@postgres-service:5432/helium_db"
  redis-url: "redis://redis-service:6379"
  rabbitmq-url: "amqp://helium:password@rabbitmq-service:5672/"

gRPC API Deployment (Scalable)

apiVersion: apps/v1
kind: Deployment
metadata:
  name: helium-grpc
  namespace: helium-system
spec:
  replicas: 3 # Can be scaled horizontally
  selector:
    matchLabels:
      app: helium-grpc
  template:
    metadata:
      labels:
        app: helium-grpc
    spec:
      containers:
        - name: helium-server
          image: your-registry/helium-server:v1.0.0
          ports:
            - containerPort: 50051
          env:
            - name: WORK_MODE
              value: "grpc"
            - name: LISTEN_ADDR
              value: "0.0.0.0:50051"
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: helium-secrets
                  key: database-url
            - name: REDIS_URL
              valueFrom:
                secretKeyRef:
                  name: helium-secrets
                  key: redis-url
            - name: MQ_URL
              valueFrom:
                secretKeyRef:
                  name: helium-secrets
                  key: rabbitmq-url
          envFrom:
            - configMapRef:
                name: helium-config
          resources:
            requests:
              memory: "128Mi"
              cpu: "100m"
            limits:
              memory: "512Mi"
              cpu: "500m"
          livenessProbe:
            tcpSocket:
              port: 50051
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            tcpSocket:
              port: 50051
            initialDelaySeconds: 5
            periodSeconds: 5
---
apiVersion: v1
kind: Service
metadata:
  name: helium-grpc-service
  namespace: helium-system
spec:
  selector:
    app: helium-grpc
  ports:
    - port: 50051
      targetPort: 50051
  type: ClusterIP

Consumer Deployment (Scalable)

apiVersion: apps/v1
kind: Deployment
metadata:
  name: helium-consumer
  namespace: helium-system
spec:
  replicas: 3 # Can run multiple instances
  selector:
    matchLabels:
      app: helium-consumer
  template:
    metadata:
      labels:
        app: helium-consumer
    spec:
      containers:
        - name: helium-server
          image: your-registry/helium-server:v1.0.0
          env:
            - name: WORK_MODE
              value: "consumer"
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: helium-secrets
                  key: database-url
            - name: REDIS_URL
              valueFrom:
                secretKeyRef:
                  name: helium-secrets
                  key: redis-url
            - name: MQ_URL
              valueFrom:
                secretKeyRef:
                  name: helium-secrets
                  key: rabbitmq-url
          envFrom:
            - configMapRef:
                name: helium-config
          resources:
            requests:
              memory: "256Mi"
              cpu: "200m"
            limits:
              memory: "1Gi"
              cpu: "1000m"
          livenessProbe:
            httpGet:
              path: /healthz
              port: 9090 # built-in health check server (HEALTH_CHECK_PORT default)
            initialDelaySeconds: 30
            periodSeconds: 30

Mailer Deployment (Single Instance)

apiVersion: apps/v1
kind: Deployment
metadata:
  name: helium-mailer
  namespace: helium-system
spec:
  replicas: 1 # SINGLE INSTANCE ONLY
  strategy:
    type: Recreate # Prevent multiple instances during updates
  selector:
    matchLabels:
      app: helium-mailer
  template:
    metadata:
      labels:
        app: helium-mailer
    spec:
      containers:
        - name: helium-server
          image: your-registry/helium-server:v1.0.0
          env:
            - name: WORK_MODE
              value: "mailer"
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: helium-secrets
                  key: database-url
            - name: REDIS_URL
              valueFrom:
                secretKeyRef:
                  name: helium-secrets
                  key: redis-url
            - name: MQ_URL
              valueFrom:
                secretKeyRef:
                  name: helium-secrets
                  key: rabbitmq-url
          envFrom:
            - configMapRef:
                name: helium-config
          resources:
            requests:
              memory: "128Mi"
              cpu: "100m"
            limits:
              memory: "512Mi"
              cpu: "500m"

Cron Executor Deployment (Singleton)

apiVersion: apps/v1
kind: Deployment
metadata:
  name: helium-cron
  namespace: helium-system
spec:
  replicas: 1 # MUST BE EXACTLY 1
  strategy:
    type: Recreate # Ensure no overlap during updates
  selector:
    matchLabels:
      app: helium-cron
  template:
    metadata:
      labels:
        app: helium-cron
    spec:
      containers:
        - name: helium-server
          image: your-registry/helium-server:v1.0.0
          env:
            - name: WORK_MODE
              value: "cron_executor"
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: helium-secrets
                  key: database-url
            - name: REDIS_URL
              valueFrom:
                secretKeyRef:
                  name: helium-secrets
                  key: redis-url
            - name: MQ_URL
              valueFrom:
                secretKeyRef:
                  name: helium-secrets
                  key: rabbitmq-url
            - name: SCAN_INTERVAL
              value: "60"
          envFrom:
            - configMapRef:
                name: helium-config
          resources:
            requests:
              memory: "128Mi"
              cpu: "50m"
            limits:
              memory: "256Mi"
              cpu: "200m"
          livenessProbe:
            httpGet:
              path: /healthz
              port: 9090 # built-in health check server (HEALTH_CHECK_PORT default)
            initialDelaySeconds: 60
            periodSeconds: 30

Horizontal Pod Autoscaler (HPA)

For scalable workers, configure automatic scaling:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: helium-grpc-hpa
  namespace: helium-system
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: helium-grpc
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80

Load Balancer Configuration

Ingress for API Services

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: helium-ingress
  namespace: helium-system
  annotations:
    nginx.ingress.kubernetes.io/backend-protocol: "GRPC"
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
spec:
  tls:
    - hosts:
        - api.your-domain.com
      secretName: helium-tls
  rules:
    - host: api.your-domain.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: helium-grpc-service
                port:
                  number: 50051

Service Mesh Configuration

For advanced deployments with service mesh (Istio):

apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: helium-grpc-vs
  namespace: helium-system
spec:
  hosts:
    - api.your-domain.com
  gateways:
    - helium-gateway
  http:
    - match:
        - uri:
            prefix: /
      route:
        - destination:
            host: helium-grpc-service
            port:
              number: 50051
          weight: 100
      fault:
        delay:
          percentage:
            value: 0.1
          fixedDelay: 5s

Database Migration

Database migrations must be run before starting any workers:

Migration Job

apiVersion: batch/v1
kind: Job
metadata:
  name: helium-migration
  namespace: helium-system
spec:
  template:
    spec:
      containers:
        - name: migration
          image: your-registry/helium-server:v1.0.0
          command: ["sqlx", "migrate", "run"]
          env:
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: helium-secrets
                  key: database-url
      restartPolicy: Never
  backoffLimit: 3

Init Container for Workers

Add to all worker deployments:

spec:
  template:
    spec:
      initContainers:
        - name: wait-for-database
          image: postgres:15
          command:
            [
              "sh",
              "-c",
              "until pg_isready -h postgres-service -p 5432; do echo waiting for database; sleep 2; done;",
            ]
          env:
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: helium-secrets
                  key: database-url

Monitoring and Observability

Health Checks

Configure appropriate health checks for each worker type:

# For API workers (gRPC, REST)
livenessProbe:
  tcpSocket:
    port: 50051
  initialDelaySeconds: 30
  periodSeconds: 10

# For background workers (consumer, mailer, cron): use the built-in health
# check server (default port 9090). Exec-based probes will not work in the
# distroless image, which ships no shell.
livenessProbe:
  httpGet:
    path: /healthz
    port: 9090
  initialDelaySeconds: 30
  periodSeconds: 30

Logging Configuration

env:
  - name: RUST_LOG
    value: "info,helium_server=debug" # Adjust as needed

Metrics Collection

Use Prometheus for metrics collection:

apiVersion: v1
kind: Service
metadata:
  name: helium-metrics
  namespace: helium-system
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "8080"
    prometheus.io/path: "/metrics"
spec:
  selector:
    app: helium-grpc
  ports:
    - port: 8080
      name: metrics

Troubleshooting

Common Issues

Pod Crash Loop:

# Check logs
kubectl logs -n helium-system deployment/helium-grpc

# Check events
kubectl get events -n helium-system --sort-by='.metadata.creationTimestamp'

# Verify environment variables
kubectl exec -n helium-system deployment/helium-grpc -- env | grep -E "(DATABASE_URL|REDIS_URL|MQ_URL)"

Multiple Cron Executors:

# Check for multiple cron instances (should show only 1)
kubectl get pods -n helium-system -l app=helium-cron

# Check cron logs for conflicts
kubectl logs -n helium-system -l app=helium-cron --tail=100

Database Connection Issues:

# Test database connectivity
kubectl run -i --tty --rm debug --image=postgres:15 --restart=Never -- \
  psql postgresql://user:password@postgres-service:5432/helium_db -c "SELECT version();"

# Check migration status (sh -c makes $DATABASE_URL expand inside the pod,
# not on your local machine)
kubectl exec -n helium-system deployment/helium-grpc -- \
  sh -c 'sqlx migrate info --database-url "$DATABASE_URL"'

Performance Tuning

Resource Limits:

  • API workers: 200m-500m CPU, 256Mi-1Gi RAM per pod
  • Consumer workers: 500m-1000m CPU, 512Mi-2Gi RAM per pod
  • Mailer/Cron: 100m-200m CPU, 128Mi-512Mi RAM per pod

Scaling Guidelines:

  • Start with 2-3 replicas for API workers
  • Scale consumers based on message queue depth
  • Monitor CPU/memory usage and adjust limits accordingly

External Dependencies

Refer to the External Dependencies Guide for detailed information about:

  • PostgreSQL setup and configuration
  • Redis configuration and clustering
  • RabbitMQ setup and management
  • SMTP server configuration
  • OAuth provider setup
  • Payment provider integration

Configuration Management

Refer to the Configuration Guide for:

  • Environment variable reference
  • Configuration file formats
  • Runtime configuration updates
  • Security best practices

Next Steps

After successful deployment:

  1. Configure monitoring and alerting
  2. Set up backup procedures for stateful data
  3. Implement CI/CD pipelines for automated deployments
  4. Configure log aggregation and analysis
  5. Plan disaster recovery procedures

For specific configuration details, see the Helium Server Configuration guide.

Configuration Guide

This document provides comprehensive configuration information for operators deploying the Helium project. The system uses a combination of environment variables for server configuration and JSON configurations stored in the database for module-specific settings.

Environment Variables

Server behavior and connectivity to external services are controlled entirely through environment variables; module-specific settings are stored as JSON in the database, as described later in this guide.

Required Environment Variables

All worker modes require these variables:

# Worker mode selection (REQUIRED)
WORK_MODE="grpc"  # Options: grpc, subscribe_api, webhook_api, consumer, mailer, cron_executor

# Database connection (REQUIRED)
DATABASE_URL="postgres://user:password@localhost:5432/helium_db"

# Redis connection (REQUIRED)
REDIS_URL="redis://localhost:6379"

# RabbitMQ connection (REQUIRED)
MQ_URL="amqp://user:password@localhost:5672/"

Worker Mode Options

Worker Mode      Port    Description                   Use Case
grpc             50051   gRPC API server               Main API for client applications and admin panels
subscribe_api    8080    RESTful subscription API      Public subscription endpoints
webhook_api      8081    RESTful webhook handler       Payment provider callbacks, third-party integrations
consumer         -       Background message consumer   Processing async tasks from message queue
mailer           -       Email service worker          Sending emails and notifications
cron_executor    -       Scheduled task executor       Running periodic maintenance tasks

Optional Environment Variables

# Server listen addresses (for API workers)
LISTEN_ADDR="0.0.0.0:50051"  # Default for grpc mode
LISTEN_ADDR="0.0.0.0:8080"   # Default for subscribe_api mode
LISTEN_ADDR="0.0.0.0:8081"   # Default for webhook_api mode

# Cron executor configuration
SCAN_INTERVAL="60"  # Scan interval in seconds (cron_executor mode only)

# Logging configuration
RUST_LOG="info"  # Options: error, warn, info, debug, trace

Module Configurations

Module configurations are stored as JSON in the PostgreSQL database in the application__config table. Each module has its own configuration key and JSON structure.

Note: All duration values are represented as strings containing the number of seconds (e.g., "300" for 5 minutes, "1800" for 30 minutes).
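For example, the "1800" used by several defaults works out as:

```shell
# Duration values are strings of whole seconds; a quick conversion.
secs="1800"
echo "${secs}s = $(( secs / 60 )) minutes"
```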

Auth Module (auth)

Key: "auth"

The authentication module handles user registration, login, JWT tokens, and OAuth providers.

{
  "email_provider": {
    "register_domain": {
      "enable_white_list": false,
      "white_list": [],
      "enable_black_list": false,
      "black_list": []
    },
    "otp_expire_after": "300",
    "delete_otp_before": "7200",
    "magic_link_expire_after": "1800",
    "magic_link_delete_before": "14400",
    "resend_interval": "30"
  },
  "jwt": {
    "secret": "your-jwt-secret-key-32-characters-long",
    "refresh_token_expiration": "2592000",
    "access_token_expiration": "900",
    "issuer": "https://your-domain.com",
    "access_audience": "helium_cloud",
    "refresh_audience": "helium_cloud_auth"
  },
  "oauth_providers": {
    "providers": [
      {
        "name": "Google",
        "client_id": "your-google-client-id",
        "client_secret": "your-google-client-secret",
        "redirect_uri": "https://your-domain.com/auth/oauth/google/callback"
      },
      {
        "name": "GitHub",
        "client_id": "your-github-client-id",
        "client_secret": "your-github-client-secret",
        "redirect_uri": "https://your-domain.com/auth/oauth/github/callback"
      }
    ],
    "challenge_expiration": "300"
  }
}

Configuration Details:

  • email_provider.register_domain: Controls which email domains are allowed for registration
  • email_provider.otp_expire_after: How long OTP codes remain valid (in seconds, default: “300” = 5 minutes)
  • email_provider.resend_interval: Minimum time between resend attempts (in seconds, default: “30” = 30 seconds)
  • jwt.secret: CRITICAL: Must be a secure random string for production
  • jwt.*_expiration: Token lifetime settings (in seconds, default: “2592000” = 30 days for refresh, “900” = 15 minutes for access)
  • oauth_providers.providers: List of OAuth providers with their credentials
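Both jwt.secret here and the admin-jwt secret later should come from a CSPRNG rather than being typed by hand. One common way to generate one, assuming the openssl CLI is available:

```shell
# 32 random bytes, base64-encoded: a 44-character string, comfortably
# above the 32-character length shown in the examples.
openssl rand -base64 32
```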

Telecom Module (telecom)

Key: "telecom"

The telecom module manages VPN nodes, subscription links, and proxy synchronization.

{
  "node_health_check": {
    "offline_timeout": "600"
  },
  "subscribe_link": {
    "endpoints": [
      {
        "url_template": "https://subscribe.your-domain.com/subscribe/{SUBSCRIBE_TOKEN}",
        "endpoint_name": "primary"
      },
      {
        "url_template": "https://backup.your-domain.com/subscribe/{SUBSCRIBE_TOKEN}",
        "endpoint_name": "backup"
      }
    ]
  },
  "uni_proxy_sync": {
    "push_interval": "30",
    "pull_interval": "60"
  },
  "vpn_server_token": "secure-random-token-for-vpn-servers"
}

Configuration Details:

  • node_health_check.offline_timeout: Time before marking nodes as offline (in seconds, default: “600” = 10 minutes)
  • subscribe_link.endpoints: List of subscription endpoints for client configuration
  • uni_proxy_sync.push_interval: How often to push traffic data (in seconds, default: “30” = 30 seconds)
  • uni_proxy_sync.pull_interval: How often to pull user info (in seconds, default: “60” = 1 minute)
  • vpn_server_token: CRITICAL: Secure token for VPN server authentication
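When Helium generates a subscription link, the {SUBSCRIBE_TOKEN} placeholder in url_template is replaced with the user's token. The expansion is equivalent to the following (a sketch; the substitution happens server-side, and the token value is made up):

```shell
# Expand a subscribe_link url_template for one user.
template="https://subscribe.your-domain.com/subscribe/{SUBSCRIBE_TOKEN}"
token="abc123"
echo "$template" | sed "s/{SUBSCRIBE_TOKEN}/$token/"
```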

Shop Module (shop)

Key: "shop"

The shop module handles e-commerce functionality, orders, and payment processing.

{
  "max_unpaid_orders": 5,
  "auto_cancel_after": "1800",
  "epay_notify_url": "https://your-domain.com/api/webhook/epay/notify",
  "epay_return_url": "https://your-domain.com/payment/success"
}

Configuration Details:

  • max_unpaid_orders: Maximum unpaid orders per user (default: 5)
  • auto_cancel_after: Time before auto-canceling unpaid orders (in seconds, default: “1800” = 30 minutes)
  • epay_notify_url: REQUIRED: Server-to-server notification endpoint for payment providers
  • epay_return_url: REQUIRED: User return URL after payment completion

Mailer Module (mailer)

Key: "mailer"

The mailer module handles email delivery through SMTP.

{
  "host": "smtp.gmail.com",
  "port": 587,
  "username": "your-smtp-username",
  "password": "your-smtp-password",
  "sender": "noreply@your-domain.com",
  "starttls": true
}

Configuration Details:

  • host: SMTP server hostname
  • port: SMTP server port (typically 587 for STARTTLS, 465 for SSL)
  • username/password: SMTP authentication credentials
  • sender: Email address used as sender
  • starttls: Enable STARTTLS encryption (recommended: true)

Admin Management Module (admin-jwt)

Key: "admin-jwt"

Controls JWT tokens for administrative access.

{
  "secret": "admin-jwt-secret-key-32-characters-long",
  "token_expiration": "864000",
  "issuer": "https://admin.your-domain.com",
  "audience": "HeliumAdmin"
}

Configuration Details:

  • secret: CRITICAL: Secure secret for admin JWT signing
  • token_expiration: Admin token lifetime (in seconds, default: “864000” = 10 days)
  • issuer: JWT issuer for admin tokens
  • audience: JWT audience for admin tokens

Market Module (affiliate)

Key: "affiliate"

Controls the affiliate marketing system.

{
  "max_invite_code_per_user": 10,
  "default_reward_rate": "0.1",
  "default_trigger_time_per_user": 3
}

Configuration Details:

  • max_invite_code_per_user: Maximum invite codes per user (default: 10)
  • default_reward_rate: Default affiliate commission rate (default: “0.1” = 10%)
  • default_trigger_time_per_user: Required referrals before earning (default: 3)
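As a worked example of default_reward_rate, a referral purchase of 49.90 at the default rate of “0.1” yields a 4.99 commission. A minimal shell sketch of that arithmetic (the amounts are illustrative only; Helium computes payouts server-side):

```shell
# Illustrative payout arithmetic for the default affiliate rate.
order_amount="49.90"
reward_rate="0.1"   # matches default_reward_rate above

# awk keeps the math in floating point and rounds to two decimals
payout=$(awk -v amt="$order_amount" -v rate="$reward_rate" \
  'BEGIN { printf "%.2f", amt * rate }')
echo "commission: $payout"   # commission: 4.99
```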

Infrastructure Dependencies

PostgreSQL Database

Required Version: PostgreSQL 12+

Configuration:

  • Environment variable: DATABASE_URL
  • Format: postgres://user:password@host:port/database

Important Notes:

  • ⚠️ CRITICAL: Run migrations before starting: sqlx migrate run --database-url $DATABASE_URL
  • Use external managed PostgreSQL service for production (AWS RDS, Google Cloud SQL, etc.)
  • Ensure proper backup and high availability configuration

Redis

Required Version: Redis 6+

Configuration:

  • Environment variable: REDIS_URL
  • Format: redis://host:port or redis://user:password@host:port

Usage:

  • Session storage and authentication tokens
  • Module configuration caching
  • Temporary data (OAuth challenges, OTP codes)

RabbitMQ (AMQP)

Configuration:

  • Environment variable: MQ_URL
  • Format: amqp://user:password@host:port/

Usage:

  • Asynchronous task processing
  • Email sending queue
  • Inter-module communication

Configuration Templates

Development Environment

# .env file for development
WORK_MODE=grpc
DATABASE_URL=postgres://helium:password@localhost:5432/helium_dev
REDIS_URL=redis://localhost:6379
MQ_URL=amqp://guest:guest@localhost:5672/
LISTEN_ADDR=0.0.0.0:50051
RUST_LOG=debug

Production Environment

# Production environment variables
WORK_MODE=grpc
DATABASE_URL=postgres://helium_user:secure_password@db.example.com:5432/helium_prod
REDIS_URL=redis://redis.example.com:6379
MQ_URL=amqp://helium_user:secure_password@mq.example.com:5672/
LISTEN_ADDR=0.0.0.0:50051
RUST_LOG=info

Multi-Worker Deployment

For production, run multiple worker processes:

# API Server (can be scaled horizontally)
WORK_MODE=grpc ./helium-server &

# Background Tasks (can be scaled)
WORK_MODE=consumer ./helium-server &

# Email Processing (single instance recommended)
WORK_MODE=mailer ./helium-server &

# Scheduled Tasks (MUST be single instance)
WORK_MODE=cron_executor ./helium-server &
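The four worker commands above can be wrapped in a small supervisor script for hosts without an init system. This is a hedged sketch, not shipped tooling: HELIUM_BIN is an assumed variable, and in production you would normally let Kubernetes or systemd manage one process per mode instead.

```shell
#!/bin/sh
# Sketch: start one helium-server per worker mode and stop them as a group.
# HELIUM_BIN is an assumed variable; adjust the path for your deployment.
HELIUM_BIN="${HELIUM_BIN:-./helium-server}"

for mode in grpc consumer mailer cron_executor; do
  WORK_MODE="$mode" "$HELIUM_BIN" &
done

# Forward Ctrl-C / SIGTERM to every background worker, then wait for them.
trap 'kill 0' INT TERM
wait
```

Remember the constraint from the list above: cron_executor must never run as more than one instance, so do not run this script on multiple hosts at once.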

Configuration Management

To update module configurations:

  1. Via Database: Insert/update records in the application__config table
  2. Via Admin API: Use the management gRPC API to update configurations
  3. Configuration Sync: The system automatically syncs configurations from PostgreSQL to Redis cache

Example SQL for updating auth configuration:

INSERT INTO application__config (key, content)
VALUES ('auth', '{"jwt": {"secret": "new-secret"}, ...}')
ON CONFLICT (key) DO UPDATE SET
  content = EXCLUDED.content,
  updated_at = NOW();
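The same upsert can be driven from a shell pipeline. A hedged sketch: it assumes psql is installed and DATABASE_URL is set, and the config_json value here is a placeholder, not a complete auth document — validate the real document with helium-cli validate-config first.

```shell
# Apply an auth config upsert via psql (sketch; placeholder JSON only).
config_json='{"jwt": {"secret": "new-secret"}}'

# Guard so the snippet is a no-op where psql or the URL is unavailable.
if command -v psql >/dev/null 2>&1 && [ -n "$DATABASE_URL" ]; then
  psql "$DATABASE_URL" -v cfg="$config_json" <<'SQL'
INSERT INTO application__config (key, content)
VALUES ('auth', :'cfg')
ON CONFLICT (key) DO UPDATE SET
  content = EXCLUDED.content,
  updated_at = NOW();
SQL
fi
```

Passing the JSON through psql's -v variable and the :'cfg' quoting form avoids hand-escaping quotes inside the document.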

Security Considerations

⚠️ Critical Configuration Security:

  1. JWT Secrets: Use cryptographically secure random strings (32+ characters)
  2. VPN Server Token: Generate secure random tokens for server authentication
  3. Database Credentials: Use strong passwords and restrict database access
  4. SMTP Credentials: Use application-specific passwords, not primary account passwords
  5. OAuth Secrets: Keep OAuth client secrets secure and rotate them regularly

Troubleshooting

Common Configuration Issues

  1. Database Connection: Verify PostgreSQL accessibility and credentials
  2. Redis Connection: Check Redis server status and network connectivity
  3. RabbitMQ Connection: Ensure RabbitMQ server is running and accessible
  4. Email Delivery: Test SMTP configuration with your email provider
  5. OAuth Issues: Verify client IDs, secrets, and redirect URIs match provider settings

Validation Commands

# Test database connection
sqlx migrate info --database-url $DATABASE_URL

# Test Redis connection
redis-cli -u $REDIS_URL ping

# Test RabbitMQ connection
rabbitmqctl status  # on RabbitMQ server

Helium CLI

The Helium CLI (helium-cli) is a comprehensive administrative tool that allows operators to:

  • Initialize system configurations with default values
  • Manage admin accounts (create, list, view, delete)
  • Validate configuration files before deployment
  • Interact with both PostgreSQL database and Redis cache

Installation

The CLI is built as part of the main Helium project. After building the project:

cargo build --release --bin helium-cli

The binary will be available at target/release/helium-cli.

Global Configuration

The CLI requires database and Redis connections to function. These can be configured via:

Environment Variables

export DATABASE_URL="postgresql://user:password@localhost/helium"
export REDIS_URL="redis://localhost:6379"

Command Line Arguments

helium-cli --database-url "postgresql://user:password@localhost/helium" \
           --redis-url "redis://localhost:6379" \
           <command>

Verbose Logging

Enable detailed logging for troubleshooting:

helium-cli --verbose <command>

Skip Database Migrations

Skip automatic database migrations when connecting:

helium-cli --skip-migration <command>

This is useful when:

  • Migrations have already been run by another process
  • Running in a container where migrations are handled separately
  • Debugging database connection issues

Commands

Configuration Management

Initialize All Configurations

helium-cli init-config

This command initializes all system configurations with their default values. It:

  • Creates default configurations for all modules in the database
  • Updates Redis cache with the configurations
  • Handles the following configuration types:
    • Auth: Authentication and authorization settings
    • Admin JWT: JWT configuration for admin authentication
    • Telecom: Telecom service configurations
    • Shop: E-commerce and shop settings
    • Market: Affiliate and marketing configurations
    • Mailer: SMTP and email service settings

Example Output:

Initializing 6 configuration types...

Initializing Auth config... ✓ Success
Initializing Admin JWT config... ✓ Success
Initializing Telecom config... ✓ Success
Initializing Shop config... ✓ Success
Initializing Market/Affiliate config... ✓ Success
Initializing Mailer config... ✓ Success

Configuration initialization completed:
  ✓ Successful: 6

Use Cases:

  • Initial deployment setup
  • Resetting configurations to defaults
  • Disaster recovery scenarios

Validate Configuration Files

helium-cli validate-config --config-type <TYPE> <config-file.json>

Validates a JSON configuration file against the specified configuration schema.

Supported Configuration Types:

  • auth - Authentication configuration
  • admin-jwt / admin_jwt - Admin JWT configuration
  • telecom - Telecom service configuration
  • shop - Shop/e-commerce configuration
  • market / affiliate - Marketing/affiliate configuration
  • mailer - Email service configuration

Examples:

# Validate auth configuration
helium-cli validate-config --config-type auth auth-config.json

# Validate mailer configuration
helium-cli validate-config --config-type mailer smtp-config.json

Example Output:

✓ Configuration file is valid!
  File: auth-config.json
  Type: Auth
  Key: auth
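A typical pre-deployment loop writes the file and validates it in one step. A sketch, assuming helium-cli is on PATH (the guard skips validation where it is not installed); the SMTP values are placeholders:

```shell
# Write a mailer config, then validate it before uploading anywhere.
cat > smtp-config.json <<'EOF'
{
  "host": "smtp.example.com",
  "port": 587,
  "username": "mailer",
  "password": "change-me",
  "sender": "noreply@example.com",
  "starttls": true
}
EOF

# Only run the validator when the CLI is actually available here.
if command -v helium-cli >/dev/null 2>&1; then
  helium-cli validate-config --config-type mailer smtp-config.json
fi
```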

Admin Account Management

List Admin Accounts

helium-cli admin list [--limit <N>] [--offset <N>]

Lists all admin accounts with pagination support.

Options:

  • --limit <N> - Number of results to return (default: 50)
  • --offset <N> - Number of results to skip (default: 0)

Example:

# List first 10 admin accounts
helium-cli admin list --limit 10

# List admin accounts with pagination
helium-cli admin list --limit 25 --offset 50

Example Output:

Found 3 admin account(s):

ID                                   Role                 Name                           Email                          Created At
------------------------------------ -------------------- ------------------------------ ------------------------------ --------------------
123e4567-e89b-12d3-a456-426614174000 Super Admin          System Administrator           admin@example.com              2024-01-15T10:30:00Z
234e5678-e89b-12d3-a456-426614174001 Customer Support     Support Team Lead              support@example.com            2024-01-16T14:20:00Z
345e6789-e89b-12d3-a456-426614174002 Moderator            Content Moderator              moderator@example.com          2024-01-17T09:45:00Z

Show Admin Account Details

helium-cli admin show <ADMIN_ID>

Displays detailed information about a specific admin account.

Example:

helium-cli admin show 123e4567-e89b-12d3-a456-426614174000

Example Output:

Admin Account Details:
  ID: 123e4567-e89b-12d3-a456-426614174000
  Name: System Administrator
  Role: Super Admin
  Email: admin@example.com
  Avatar: https://example.com/avatar.jpg
  Created At: 2024-01-15T10:30:00Z

Create Admin Account

helium-cli admin create --name <NAME> --role <ROLE> [--email <EMAIL>] [--avatar <AVATAR_URL>]

Creates a new admin account with the specified details.

Required Options:

  • --name <NAME> - Display name for the admin
  • --role <ROLE> - Admin role (see roles below)

Optional Arguments:

  • --email <EMAIL> - Admin email address
  • --avatar <AVATAR_URL> - URL to admin avatar image

Available Roles:

  • super_admin / superadmin / super-admin - Full system access
  • moderator - Content moderation privileges
  • customer_support / customersupport / customer-support - Customer service access
  • support_bot / supportbot / support-bot - Automated support system access

Examples:

# Create super admin
helium-cli admin create \
  --name "System Administrator" \
  --role super_admin \
  --email "admin@example.com"

# Create customer support account
helium-cli admin create \
  --name "Support Agent" \
  --role customer_support \
  --email "support@example.com" \
  --avatar "https://example.com/avatars/support.jpg"

# Create moderator (minimal info)
helium-cli admin create \
  --name "Content Moderator" \
  --role moderator

Example Output:

Successfully created admin account:
  ID: 456e7890-e89b-12d3-a456-426614174003
  Name: System Administrator
  Role: Super Admin
  Email: admin@example.com
  Avatar: N/A

Delete Admin Account

helium-cli admin delete <ADMIN_ID> [--yes]

Deletes an admin account after confirmation.

Options:

  • --yes - Skip confirmation prompt (use with caution)

Examples:

# Delete with confirmation prompt
helium-cli admin delete 123e4567-e89b-12d3-a456-426614174000

# Delete without confirmation (automated scripts)
helium-cli admin delete 123e4567-e89b-12d3-a456-426614174000 --yes

Example Interactive Flow:

Admin account to delete:
  ID: 123e4567-e89b-12d3-a456-426614174000
  Name: Old Administrator
  Role: Super Admin
  Email: old-admin@example.com

Are you sure you want to delete this admin account? [y/N]: y
Successfully deleted admin account: 123e4567-e89b-12d3-a456-426614174000

Common Use Cases

Initial Deployment

  1. Set up environment variables:

    export DATABASE_URL="postgresql://helium:password@localhost/helium"
    export REDIS_URL="redis://localhost:6379"
    
  2. Initialize system configurations:

    helium-cli init-config
    
  3. Create initial super admin:

    helium-cli admin create \
      --name "System Administrator" \
      --role super_admin \
      --email "admin@yourcompany.com"
    

Configuration Management Workflow

  1. Prepare configuration file: Create a JSON file with your custom configuration.

  2. Validate before deployment:

    helium-cli validate-config --config-type auth ./configs/auth-config.json
    
  3. Deploy configuration: Use the web interface or API to upload the validated configuration.

Admin Account Maintenance

  1. Regular audit of admin accounts:

    helium-cli admin list --limit 100
    
  2. Create specialized support accounts:

    # Customer support team
    helium-cli admin create --name "Support Team A" --role customer_support
    
    # Content moderation team
    helium-cli admin create --name "Moderator Team B" --role moderator
    
  3. Remove inactive accounts:

    helium-cli admin delete <inactive-admin-id>
    

Error Handling

The CLI provides comprehensive error messages and logging:

  • Database Connection Issues: Check DATABASE_URL and database availability
  • Redis Connection Issues: Verify REDIS_URL and Redis service status
  • Configuration Validation Errors: Review JSON syntax and required fields
  • Admin Role Errors: Ensure role names match supported values exactly

Security Considerations

  1. Environment Variables: Use secure methods to set database credentials
  2. Admin Creation: Be selective with super_admin role assignments
  3. Account Deletion: Always verify admin identity before deletion
  4. Logging: Be aware that verbose mode may log sensitive information

Troubleshooting

Common Issues

“DATABASE_URL must be provided”

  • Set the DATABASE_URL environment variable or use --database-url flag

“Failed to connect to database”

  • Verify PostgreSQL is running and accessible
  • Check connection string format and credentials
  • Ensure the database exists

“Invalid admin role”

  • Use exact role names: super_admin, moderator, customer_support, support_bot
  • Role names are case-insensitive but must match supported variants

“Configuration validation failed”

  • Check JSON syntax with a JSON validator
  • Ensure all required fields are present
  • Verify field types match expected schema

Getting Help

Use the built-in help system:

# General help
helium-cli --help

# Command-specific help
helium-cli admin --help
helium-cli admin create --help

Integration with Deployment Scripts

The CLI is designed to work well in automated deployment scenarios:

#!/bin/bash
set -e

# Set environment
export DATABASE_URL="$HELIUM_DB_URL"
export REDIS_URL="$HELIUM_REDIS_URL"

# Initialize configurations
echo "Initializing Helium configurations..."
helium-cli init-config

# Create admin account (|| true tolerates the error if it already exists)
echo "Creating admin account..."
helium-cli admin create \
  --name "$ADMIN_NAME" \
  --role super_admin \
  --email "$ADMIN_EMAIL" || true

echo "Deployment initialization complete!"

This CLI tool is essential for proper Helium deployment and ongoing operational management. Use it as part of your deployment automation and regular maintenance procedures.

Migrate From SS-Panel UIM

This guide walks Helium operators through migrating an existing SS-Panel UIM deployment. The migration intentionally happens in two isolated passes so you can export data from the legacy MariaDB instance without touching the new Helium PostgreSQL database until you are ready.

At a high level:

  1. mariadb-pass reads all data from the SS-Panel MariaDB schema and saves it to a local rkyv archive.
  2. postgre-pass consumes that rkyv archive and writes normalized data into Helium’s PostgreSQL schema.

Because Helium normally targets PostgreSQL, the first pass uses a dedicated crate that bundles the MySQL client driver and builds separately from the rest of the project.

What Gets Migrated

The migration transfers the following SS-Panel data into Helium’s schema:

User Accounts

  • Email and password hashes (preserved as-is for seamless login)
  • User names and registration timestamps
  • Last active timestamps
  • Account balances (available balance for purchasing)
  • Referral relationships (affiliate ref_by links)
  • Traffic usage (upload/download totals)
  • VMess UUIDs (for node authentication)
  • Subscribe tokens (subscription links)
  • Invite codes (user-specific invite codes)

Helium creates corresponding entries in:

  • auth.user_account (login credentials and profile metadata)
  • shop.user_balance (financial data)
  • market.affiliate_user_policy (referral relationships)
  • telecom.user_nodes_token (node authentication tokens)

Products → Packages

SS-Panel products are converted to Helium packages with:

  • Package name
  • Price
  • Duration (time allowance in days)
  • Bandwidth quota

These populate the telecom.package table.

Orders → Package Queues

Historical purchase orders are replayed into Helium’s package queue system:

  • Order status (activated vs. pending)
  • Creation and update timestamps
  • Associated product/package

Orders are inserted into telecom.package_queue to preserve user entitlements and purchase history.

Nodes → Node Servers & Clients

SS-Panel nodes are split into two Helium entities:

  • Node servers (telecom.node_server): server address, rate, class
  • Node clients (telecom.node_client): protocol configurations (VMess, WebSocket, gRPC)

Each node’s custom configuration (ports, security, network transport) is normalized to Helium’s node client schema.

Data Not Migrated

The following SS-Panel data is not migrated:

  • Invoices (read but not written to Helium)
  • Payback records (read but not written)
  • Admin accounts (must be created manually via helium-cli)
  • System configurations (initialize via helium-cli init-config)
  • Announcements and tickets (start fresh in Helium)

Prerequisites

  • SS-Panel UIM running on MariaDB (or MySQL-compatible) that you can access in read-only mode during export.
  • A ready Helium PostgreSQL database with migrations applied and no production users yet. Run sqlx migrate run before importing.
  • Adequate disk space wherever you write the rkyv archive. Expect several hundred megabytes for large installs.
  • Rust toolchain (same as Helium) and network access to both databases from the machine performing the migration.
  • Optional: a safe location (e.g., object storage) to back up the generated rkyv file.

Pass 1 – Export From SS-Panel (MariaDB)

The exporter lives in ssp-migrator/mariadb-pass and is compiled with SQLx’s MySQL feature set. Build and run it separately from the main server binaries.

Build the exporter

mariadb-pass uses SQLx’s compile-time query checking. The workspace ships with .sqlx caches for PostgreSQL only, so generic commands such as cargo build --release -p mariadb-pass will fail. You must compile from the crate directory with access to a live SS-Panel database (or export SQLx metadata for MariaDB manually).

cd ssp-migrator/mariadb-pass
SQLX_OFFLINE=false DATABASE_URL="$SSP_DATABASE_URL" cargo build --release

The DATABASE_URL environment variable is required during compilation so SQLx can introspect the MariaDB schema. If you cannot open a direct connection from the build host, generate SQLx data offline with sqlx prepare against MariaDB and commit it alongside the crate before building.

Prepare connection settings

You can pass the database URL directly on the command line or export it as an environment variable. A typical MariaDB connection string looks like:

export SSP_DATABASE_URL="mysql://user:password@legacy-host:3306/sspanel"

Run the exporter

cd ssp-migrator/mariadb-pass
SQLX_OFFLINE=false DATABASE_URL="$SSP_DATABASE_URL" cargo run --release -- \
  --database-url "$SSP_DATABASE_URL" \
  --output-file /tmp/helium-migration.rkyv

The command performs several steps internally:

  • Streams each SS-Panel entity (users, products, orders, nodes, etc.) in batches.
  • Normalizes relationships to Helium’s intermediate structs.
  • Serializes the result to an rkyv archive (default name migration_data.rkyv).

Monitor the logs for warnings about rows that cannot be converted. The exporter skips invalid records but continues processing.

When the run finishes you should have an archive file similar to /tmp/helium-migration.rkyv. Back it up before moving on.
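Recording a checksum alongside the backup lets you catch a corrupted copy before pass 2 consumes it. A small sketch (the path matches the example above; the guard makes it a no-op if the export has not run yet):

```shell
# Checksum the export archive so the backup can be verified later
# with `sha256sum -c`.
archive="/tmp/helium-migration.rkyv"

if [ -f "$archive" ]; then
  sha256sum "$archive" | tee "$archive.sha256"
else
  echo "archive not found: $archive" >&2
fi
```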

Pass 2 – Import Into Helium (PostgreSQL)

The importer lives in ssp-migrator/postgre-pass and understands Helium’s canonical schema. Ensure the target PostgreSQL database is empty or freshly provisioned to avoid collisions.

Build the importer

cargo build --release -p postgre-pass

This binary only links the PostgreSQL driver, so it compiles with the same workspace settings as other Helium components.

Prepare connection settings

export HELIUM_DATABASE_URL="postgres://helium:password@new-host:5432/helium_db"

Run the importer

cargo run --release -p postgre-pass -- \
  --rkyv-file /tmp/helium-migration.rkyv \
  --database-url "$HELIUM_DATABASE_URL"

The importer performs conversions aligned with Helium’s modules:

  • Inserts node servers and clients in the correct dependency order.
  • Creates packages, affiliate policies, balances, and user accounts.
  • Replays historical purchases into the package queue so users retain entitlements.

If anything fails, no partial state is left behind: each insert group is committed atomically in dependency order. Fix the reported data issue, rebuild the rkyv archive if necessary, and rerun the importer.

Post-migration Checklist

  • Confirm the importer logs Migration completed successfully.
  • Inspect a handful of migrated users in Helium’s admin tools (profiles, balances, active packages).
  • Verify node configurations in telecom match the expected SS-Panel node inventory.
  • Rotate user credentials if required by your migration policy (password hashes are imported as-is).
  • Schedule DNS cutover and client config updates after validating the new deployment.

Troubleshooting

  • MariaDB TLS or authentication errors: confirm the MariaDB driver accepts your certificates or append parameters (e.g., ?ssl-mode=REQUIRED).
  • Missing subscribe links or invite codes: the exporter requires these tables to be populated for each user. Reconcile data in SS-Panel before exporting.
  • Importer stops on unique constraint violations: verify the PostgreSQL database is clean. Drop and recreate the schema, then rerun the importer.
  • Large datasets: run the exporter on a machine close to the database to reduce latency. You can copy the resulting rkyv file to the environment where the importer runs.

With both passes complete, Helium now has a faithful copy of the SS-Panel data and you can proceed with normal deployment and cutover activities.