Anti-XSS

The Anti-XSS utilities provide input sanitization functions to protect against cross-site scripting (XSS) attacks. These are synchronous, lightweight functions designed to sanitize user-generated content before storage or display.

Purpose

XSS attacks occur when malicious scripts are injected into web pages through user input. The anti-XSS utilities help prevent these attacks by removing or encoding potentially dangerous content while preserving legitimate formatting and functionality.

Sanitization Functions

anti_xss_text

Basic HTML entity encoding for plain text content. Converts special characters to their HTML entity equivalents to prevent HTML/script injection.

Encoded characters:

<, >, &, ", ', /

When to use:

Plain text user inputs (usernames, comments, descriptions)
Content that should contain no HTML or markdown
Simple text fields where formatting is not needed
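The encoding step can be sketched as follows. This is a minimal illustration, not the module's actual source: the function name `encode_entities` and the exact entity spellings (for example `&#x27;` for the apostrophe) are assumptions, and only the set of six characters comes from the list above.

```rust
/// Hypothetical stand-in for `anti_xss_text`; the real implementation may differ.
/// Replaces the six characters listed above with HTML entities and returns a new string.
fn encode_entities(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for ch in input.chars() {
        match ch {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            '>' => out.push_str("&gt;"),
            '"' => out.push_str("&quot;"),
            '\'' => out.push_str("&#x27;"), // assumed entity form
            '/' => out.push_str("&#x2F;"),  // assumed entity form
            other => out.push(other),
        }
    }
    out
}
```

Because each character is handled in a single pass, the ampersands produced by the encoder are never re-encoded within the same call.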

anti_xss_markdown

Sanitizes markdown while preserving safe formatting. Uses allowlist-based filtering to keep legitimate markdown features while blocking dangerous content.

Features:

Strips all HTML tags completely
Validates image URLs against a domain allowlist (configurable in code)
Validates link schemes (allows http, https, mailto only)
Preserves safe markdown formatting (headers, lists, emphasis, etc.)
Removes images from disallowed domains
Strips URLs from links with dangerous schemes (but keeps link text)

When to use:

Markdown content in announcements
Support ticket descriptions
User comments that support formatting
Any user-generated markdown that will be rendered

Domain Allowlist: The function checks image URLs against the ALLOWED_IMAGE_DOMAINS constant. Modify this list in the source code to add trusted image hosting services.
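The two allowlist checks described above might look roughly like this. It is a sketch under stated assumptions: the helper names (`has_allowed_link_scheme`, `host_of`, `is_allowed_image`) are hypothetical, the contents of `ALLOWED_IMAGE_DOMAINS` are a placeholder, and the host extraction uses plain string handling rather than a full URL parser.

```rust
/// Placeholder contents; the real list lives in the module's source code.
const ALLOWED_IMAGE_DOMAINS: &[&str] = &["images.example.com"];

/// Returns true when the URL starts with one of the permitted link schemes
/// (http, https, mailto). Everything else, such as `javascript:`, is rejected.
fn has_allowed_link_scheme(url: &str) -> bool {
    let lower = url.trim().to_ascii_lowercase();
    lower.starts_with("http://") || lower.starts_with("https://") || lower.starts_with("mailto:")
}

/// Extracts the host of an http(s) URL using simple string operations.
/// Returns None for anything that is not a lowercase http(s) URL.
fn host_of(url: &str) -> Option<&str> {
    let rest = url
        .strip_prefix("https://")
        .or_else(|| url.strip_prefix("http://"))?;
    // The authority ends at the first path, query, or fragment delimiter.
    let authority = rest.split(|c: char| matches!(c, '/' | '?' | '#')).next()?;
    // Drop optional userinfo ("user@") and port (":8080").
    let host = authority.rsplit('@').next()?;
    let host = host.split(':').next()?;
    if host.is_empty() { None } else { Some(host) }
}

/// Returns true only when the image host exactly matches an allowlisted domain.
fn is_allowed_image(url: &str) -> bool {
    match host_of(url) {
        Some(host) => ALLOWED_IMAGE_DOMAINS
            .iter()
            .any(|domain| host.eq_ignore_ascii_case(domain)),
        None => false,
    }
}
```

Matching the whole host, rather than testing whether the URL merely contains an allowed domain, is what keeps a lookalike such as `images.example.com.evil.net` from passing the check.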

anti_xss_enhanced

Advanced protection that actively detects and removes script injections before encoding. Combines pattern matching with entity encoding for multi-layered defense.

Protected against:

<script> tags (including multiline)
Dangerous URL schemes (javascript:, vbscript:, data:)
Event handler attributes (onclick, onload, etc.)
Embedded objects (<iframe>, <object>, <embed>)

Process:

Detects and removes script patterns using regex
Applies HTML entity encoding to remaining content

When to use:

Rich text content from untrusted sources
Content that might contain complex HTML
Extra protection layer for high-risk inputs
Any content where XSS risk is elevated
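The strip-then-encode process can be illustrated with a reduced sketch. To stay within the standard library it uses plain case-insensitive string search in place of the regex patterns the module uses, and it handles only `<script>` blocks; the names `strip_script_blocks`, `encode_entities`, and `sanitize_enhanced` are hypothetical.

```rust
/// Removes every `<script ...>...</script>` block, case-insensitively and across lines.
/// An unclosed `<script` drops the remainder of the input.
fn strip_script_blocks(input: &str) -> String {
    // ASCII lowercasing keeps byte offsets identical to those in `input`.
    let lower = input.to_ascii_lowercase();
    let mut out = String::with_capacity(input.len());
    let mut pos = 0;
    while let Some(rel) = lower[pos..].find("<script") {
        let start = pos + rel;
        out.push_str(&input[pos..start]);
        pos = match lower[start..].find("</script>") {
            Some(end_rel) => start + end_rel + "</script>".len(),
            None => input.len(),
        };
    }
    out.push_str(&input[pos..]);
    out
}

/// Entity-encodes the six characters `<`, `>`, `&`, `"`, `'`, `/` (entity forms assumed).
fn encode_entities(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for ch in input.chars() {
        match ch {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            '>' => out.push_str("&gt;"),
            '"' => out.push_str("&quot;"),
            '\'' => out.push_str("&#x27;"),
            '/' => out.push_str("&#x2F;"),
            other => out.push(other),
        }
    }
    out
}

/// Step 1: remove script blocks. Step 2: encode whatever remains.
fn sanitize_enhanced(input: &str) -> String {
    encode_entities(&strip_script_blocks(input))
}
```

Running the encoder last means that any markup the removal pass misses, or accidentally reassembles from fragments, still ends up as inert text.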

Integration

Backend Usage

These are utility functions exported from the shield::utils::anti_xss module. Call them directly in your service logic before storing or processing user input:

use shield::utils::anti_xss::{anti_xss_text, anti_xss_markdown, anti_xss_enhanced};

// Sanitize before storage
let clean_username = anti_xss_text(&user_input);
let clean_announcement = anti_xss_markdown(&announcement_text);
let clean_content = anti_xss_enhanced(&rich_text);

Frontend Considerations

Backend sanitization is the primary defense, but the frontend should also:

Validate input before submission
Use appropriate HTML escaping when rendering
Leverage framework-level XSS protection (React’s JSX escaping, Vue’s template binding, etc.)
Never use innerHTML or equivalent with user content
Use markdown renderers with built-in sanitization

When to Apply Sanitization

Always sanitize:

User-submitted text, markdown, or HTML
Content from external APIs or third-party sources
Any data that will be displayed to other users
Content stored in a database and rendered on the frontend

Timing:

Before storage (recommended): Sanitize once when data enters the system
Before display: Sanitize at render time if the stored data is raw (unsanitized)
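One way to enforce the "sanitize once at the boundary" approach is a wrapper type that can only be constructed through the sanitizer. The sketch below is an assumption about how calling code could be organized, not part of the anti_xss module: `SanitizedText`, `Comment`, and the local `sanitize_text` stand-in are all hypothetical.

```rust
/// Local stand-in for `shield::utils::anti_xss::anti_xss_text` so the sketch is self-contained.
fn sanitize_text(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for ch in input.chars() {
        match ch {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            '>' => out.push_str("&gt;"),
            '"' => out.push_str("&quot;"),
            '\'' => out.push_str("&#x27;"),
            '/' => out.push_str("&#x2F;"),
            other => out.push(other),
        }
    }
    out
}

/// Text that has passed through the sanitizer exactly once.
/// The inner field is private, so `new` is the only way to build a value.
pub struct SanitizedText(String);

impl SanitizedText {
    /// Sanitizes raw user input at the point where it enters the system.
    pub fn new(raw: &str) -> Self {
        SanitizedText(sanitize_text(raw))
    }

    /// Read-only access for storage or rendering code.
    pub fn as_str(&self) -> &str {
        &self.0
    }
}

/// Hypothetical record: storage code accepts only already-sanitized text.
pub struct Comment {
    pub body: SanitizedText,
}
```

With this shape, a function that persists a `Comment` cannot be handed a raw string by mistake, and already-sanitized text is not encoded a second time.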

Defense in depth: For high-risk scenarios, apply multiple layers:

Frontend validation
Backend sanitization (these functions)
Frontend rendering protection (framework escaping)

Limitations

Not a WAF: These functions handle input sanitization, not request-level filtering
Not HTML parsing: They use regex-based pattern matching, not full HTML/markdown parsers
Allowlist maintenance: Image domain allowlist must be manually maintained in code
Markdown edge cases: Complex or malformed markdown might not be handled perfectly

Additional security measures:

Serve Content Security Policy (CSP) headers for the frontend
Apply proper output encoding in templates
Enable framework-level XSS protection
Run regular security audits of user-facing features
Keep dependencies updated

Implementation Notes

Functions are pure and stateless (no side effects)
All functions return new strings (input is not modified)
Empty strings are handled safely
If a regex fails to compile, a fallback pattern is used
Domain checking for images uses full URL parsing
Link text is preserved even when URL is removed

Helium Documentation
