Anti-XSS
The Anti-XSS utilities provide input sanitization functions to protect against cross-site scripting (XSS) attacks. These are synchronous, lightweight functions designed to sanitize user-generated content before storage or display.
Purpose
XSS attacks occur when malicious scripts are injected into web pages through user input. The anti-XSS utilities help prevent these attacks by removing or encoding potentially dangerous content while preserving legitimate formatting and functionality.
Sanitization Functions
anti_xss_text
Basic HTML entity encoding for plain text content. Converts special characters to their HTML entity equivalents to prevent HTML/script injection.
Encoded characters:
<, >, &, ", ', /
When to use:
- Plain text user inputs (usernames, comments, descriptions)
- Content that should contain no HTML or markdown
- Simple text fields where formatting is not needed
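The encoding step described above can be sketched in a few lines. This is a minimal illustration, not the actual implementation: the function name encode_entities_sketch is hypothetical, and the specific entities chosen for the quote and slash characters (&#x27; and &#x2F;) are an assumption, since the document only lists which characters are encoded.

```rust
/// Hypothetical stand-in for `anti_xss_text`: encodes the six characters
/// listed above as HTML entities and returns a new string.
/// The exact entity spellings are an assumption.
fn encode_entities_sketch(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for ch in input.chars() {
        match ch {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            '>' => out.push_str("&gt;"),
            '"' => out.push_str("&quot;"),
            '\'' => out.push_str("&#x27;"),
            '/' => out.push_str("&#x2F;"),
            // Every other character is copied through unchanged.
            other => out.push(other),
        }
    }
    out
}
```

Because each character is handled in a single pass, the ampersands introduced by the encoding itself are never re-encoded within one call.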
anti_xss_markdown
Sanitizes markdown while preserving safe formatting. Uses allowlist-based filtering to keep legitimate markdown features while blocking dangerous content.
Features:
- Strips all HTML tags completely
- Validates image URLs against domain allowlist (configurable in code)
- Validates link schemes (allows http, https, mailto only)
- Preserves safe markdown formatting (headers, lists, emphasis, etc.)
- Removes images from disallowed domains
- Strips URLs from links with dangerous schemes (but keeps link text)
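The link-scheme rule from the list above might look roughly like the following. This is a sketch under assumptions: scheme_of and sanitize_link_sketch are hypothetical names, the scheme is extracted with plain string handling rather than a URL parser, and links without any scheme (relative links) are treated as disallowed, which the real function may or may not do.

```rust
/// Schemes the feature list names as allowed for links.
const ALLOWED_LINK_SCHEMES: [&str; 3] = ["http", "https", "mailto"];

/// Returns the lowercased scheme of `url`, or `None` when the text before
/// the first ':' is not a syntactically valid scheme (a letter followed by
/// letters, digits, '+', '-' or '.').
fn scheme_of(url: &str) -> Option<String> {
    let (scheme, _rest) = url.trim().split_once(':')?;
    let mut chars = scheme.chars();
    let first = chars.next()?;
    if !first.is_ascii_alphabetic() {
        return None;
    }
    if !chars.all(|c| c.is_ascii_alphanumeric() || matches!(c, '+' | '-' | '.')) {
        return None;
    }
    Some(scheme.to_ascii_lowercase())
}

/// Hypothetical helper: keeps the markdown link when its scheme is allowed,
/// otherwise drops the URL and returns only the link text.
fn sanitize_link_sketch(text: &str, url: &str) -> String {
    match scheme_of(url) {
        Some(s) if ALLOWED_LINK_SCHEMES.contains(&s.as_str()) => format!("[{text}]({url})"),
        _ => text.to_string(),
    }
}
```

Comparing the lowercased scheme against a short allowlist means mixed-case variants such as JavaScript: are rejected without having to enumerate every dangerous scheme.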
When to use:
- Markdown content in announcements
- Support ticket descriptions
- User comments that support formatting
- Any user-generated markdown that will be rendered
Domain Allowlist:
The function checks image URLs against ALLOWED_IMAGE_DOMAINS constant. Modify this list in the source code to add trusted image hosting services.
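As a rough illustration of the allowlist check, the sketch below compares an image URL's host against a constant list. Several things here are assumptions: the two domains are placeholders (the real contents of ALLOWED_IMAGE_DOMAINS are not shown in this document), host_of and is_allowed_image are hypothetical names, hosts are matched exactly rather than by subdomain, and the host is extracted with simplified std-only string handling where the real code uses full URL parsing.

```rust
/// Placeholder entries; the real list lives in the source code.
const ALLOWED_IMAGE_DOMAINS: [&str; 2] = ["images.example.com", "cdn.example.org"];

/// Simplified host extraction for http(s) URLs using only `std`.
/// A full URL parser should be preferred in real code.
fn host_of(url: &str) -> Option<String> {
    let lower = url.trim().to_ascii_lowercase();
    let rest = lower
        .strip_prefix("https://")
        .or_else(|| lower.strip_prefix("http://"))?;
    // The authority ends at the first '/', '\', '?' or '#'
    // (browsers treat a backslash like a slash here).
    let authority = rest.split(|c| matches!(c, '/' | '\\' | '?' | '#')).next()?;
    // Drop any userinfo ("user@host") and port ("host:443").
    let host_port = authority.rsplit('@').next()?;
    let host = host_port.split(':').next()?;
    if host.is_empty() {
        None
    } else {
        Some(host.to_string())
    }
}

/// Returns true only when the URL's host is exactly one of the allowed domains.
fn is_allowed_image(url: &str) -> bool {
    match host_of(url) {
        Some(host) => ALLOWED_IMAGE_DOMAINS.contains(&host.as_str()),
        None => false,
    }
}
```

Matching on the parsed host, rather than searching the whole URL for an allowed domain, is what keeps lookalikes such as images.example.com.evil.net from passing.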
anti_xss_enhanced
Advanced protection that actively detects and removes script injections before encoding. Combines pattern matching with entity encoding for multi-layered defense.
Protected against:
- <script> tags (including multiline)
- Dangerous URL schemes (javascript:, vbscript:, data:)
- Event handler attributes (onclick, onload, etc.)
- Embedded objects (<iframe>, <object>, <embed>)
Process:
- Detects and removes script patterns using regex
- Applies HTML entity encoding to remaining content
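The two-step process above can be sketched as follows. This is an illustration only: it covers just the script-tag case, uses plain string search as a std-only stand-in for the regex patterns the real function uses, and all function names are hypothetical.

```rust
/// Step 1 (simplified): removes every `<script ...> ... </script>` block,
/// case-insensitively and across newlines. An unterminated `<script`
/// drops the rest of the input.
fn strip_script_blocks(input: &str) -> String {
    // ASCII lowercasing keeps byte offsets identical to `input`.
    let lower = input.to_ascii_lowercase();
    let mut out = String::with_capacity(input.len());
    let mut pos = 0;
    while let Some(found) = lower[pos..].find("<script") {
        let start = pos + found;
        out.push_str(&input[pos..start]);
        match lower[start..].find("</script>") {
            Some(end) => pos = start + end + "</script>".len(),
            None => return out,
        }
    }
    out.push_str(&input[pos..]);
    out
}

/// Step 2: entity-encodes whatever is left (entity spellings are assumed).
fn encode_html_entities(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for ch in input.chars() {
        match ch {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            '>' => out.push_str("&gt;"),
            '"' => out.push_str("&quot;"),
            '\'' => out.push_str("&#x27;"),
            '/' => out.push_str("&#x2F;"),
            other => out.push(other),
        }
    }
    out
}

/// Hypothetical stand-in for `anti_xss_enhanced`: remove, then encode.
fn sanitize_enhanced_sketch(input: &str) -> String {
    encode_html_entities(&strip_script_blocks(input))
}
```

The second step matters because pattern removal alone can be defeated: an input like <scr<script></script>ipt> reassembles into a script tag once the inner block is removed, and only the encoding pass neutralizes the result.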
When to use:
- Rich text content from untrusted sources
- Content that might contain complex HTML
- Extra protection layer for high-risk inputs
- Any content where XSS risk is elevated
Integration
Backend Usage
These are utility functions exported from the shield::utils::anti_xss module. Call them directly in your service logic before storing or processing user input:
use shield::utils::anti_xss::{anti_xss_text, anti_xss_markdown, anti_xss_enhanced};
// Sanitize before storage
let clean_username = anti_xss_text(&user_input);
let clean_announcement = anti_xss_markdown(&announcement_text);
let clean_content = anti_xss_enhanced(&rich_text);
Frontend Considerations
Backend sanitization is the primary defense, but the frontend should also:
- Validate input before submission
- Use appropriate HTML escaping when rendering
- Leverage framework-level XSS protection (React’s JSX escaping, Vue’s template binding, etc.)
- Never use innerHTML or equivalent with user content
- Use markdown renderers with built-in sanitization
When to Apply Sanitization
Always sanitize:
- User-submitted text, markdown, or HTML
- Content from external APIs or third-party sources
- Any data that will be displayed to other users
- Content stored in a database and rendered on the frontend
Timing:
- Before storage (recommended): Sanitize once when data enters the system
- Before display: Sanitize at render time if the stored data is raw (unsanitized)
Defense in depth: For high-risk scenarios, apply multiple layers:
- Frontend validation
- Backend sanitization (these functions)
- Frontend rendering protection (framework escaping)
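One way to make "sanitize once when data enters the system" hard to get wrong is to wrap sanitized text in its own type, so that raw and sanitized strings cannot be mixed up. The sketch below is an assumption about how this could be done, not something the module provides: SanitizedText and escape_minimal are hypothetical, and escape_minimal is a reduced stand-in for a sanitizer such as anti_xss_text.

```rust
/// Minimal stand-in for a sanitizer such as `anti_xss_text`
/// (only three characters are handled here).
fn escape_minimal(raw: &str) -> String {
    raw.replace('&', "&amp;")
        .replace('<', "&lt;")
        .replace('>', "&gt;")
}

/// Hypothetical wrapper marking text that was sanitized on entry.
#[derive(Debug, Clone, PartialEq)]
struct SanitizedText(String);

impl SanitizedText {
    /// The only constructor: untrusted input is sanitized exactly once.
    fn from_untrusted(raw: &str) -> Self {
        SanitizedText(escape_minimal(raw))
    }

    /// Read-only access for storage or rendering.
    fn as_str(&self) -> &str {
        &self.0
    }
}
```

The type also guards against the opposite mistake. Entity encoding is not idempotent, so running it both before storage and again before display turns "a < b" into "a &amp;lt; b", which renders as visible entity text.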
Architecture Notes
Synchronous and Lightweight
All three functions are synchronous and suitable for use in request handlers. They don’t perform I/O or heavy computation, making them safe to call inline during request processing.
Regex-Based Detection
The sanitization uses regex patterns for detection. The patterns are compiled once using lazy_static, so no request pays the compilation cost again. They use multiline and case-insensitive matching so that script injections split across lines or written in mixed case are still detected.
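The compile-once idea can be shown with the standard library alone. In this sketch std::sync::OnceLock stands in for lazy_static, and a list of lowercase substrings stands in for compiled regexes; both substitutions are made only to keep the example dependency-free, and the names Patterns, patterns, and looks_dangerous are hypothetical.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Counts how often the pattern set is built (for demonstration only).
static BUILD_COUNT: AtomicUsize = AtomicUsize::new(0);

/// Stand-in for a set of compiled regexes: lowercase substrings to look for.
struct Patterns {
    needles: Vec<String>,
}

/// Builds the pattern set on first use and reuses it on every later call.
fn patterns() -> &'static Patterns {
    static PATTERNS: OnceLock<Patterns> = OnceLock::new();
    PATTERNS.get_or_init(|| {
        BUILD_COUNT.fetch_add(1, Ordering::SeqCst);
        Patterns {
            needles: ["<script", "javascript:", "onclick="]
                .iter()
                .map(|s| s.to_string())
                .collect(),
        }
    })
}

/// Case-insensitive check against the shared pattern set.
fn looks_dangerous(input: &str) -> bool {
    let lower = input.to_ascii_lowercase();
    patterns().needles.iter().any(|n| lower.contains(n.as_str()))
}
```

However many times looks_dangerous is called, the initializer runs once; with real regexes that is where the compilation cost would otherwise be paid on each request.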
No External Dependencies
The utilities rely only on Rust's regex and URL parsing libraries, which are ordinary crates rather than part of the standard library. No external sanitization services or heavyweight parsers are required.
Limitations
These utilities provide solid protection for common XSS vectors, but they are not a complete security solution:
- Not a WAF: These functions handle input sanitization, not request-level filtering
- Not HTML parsing: They use regex-based pattern matching, not full HTML/markdown parsers
- Allowlist maintenance: Image domain allowlist must be manually maintained in code
- Markdown edge cases: Complex or malformed markdown might not be handled perfectly
Additional security measures:
- Use Content Security Policy (CSP) headers in frontend
- Apply proper output encoding in templates
- Enable framework-level XSS protection
- Run regular security audits of user-facing features
- Keep dependencies updated
Implementation Notes
- Functions are pure and stateless (no side effects)
- All functions return new strings (input is not modified)
- Empty strings are handled safely
- If a regex fails to compile, a fallback pattern is used instead
- Domain checking for images uses full URL parsing
- Link text is preserved even when URL is removed