Anti-XSS

The Anti-XSS utilities provide input sanitization functions to protect against cross-site scripting (XSS) attacks. These are synchronous, lightweight functions designed to sanitize user-generated content before storage or display.

Purpose

XSS attacks occur when malicious scripts are injected into web pages through user input. The anti-XSS utilities help prevent these attacks by removing or encoding potentially dangerous content while preserving legitimate formatting and functionality.

Sanitization Functions

anti_xss_text

Basic HTML entity encoding for plain text content. Converts special characters to their HTML entity equivalents to prevent HTML/script injection.

Encoded characters:

  • <, >, &, ", ', /

When to use:

  • Plain text user inputs (usernames, comments, descriptions)
  • Content that should contain no HTML or markdown
  • Simple text fields where formatting is not needed
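As a rough illustration of the encoding described above, here is a minimal standard-library sketch. The function name `encode_html_entities` and the exact entity forms (for example `&#x27;` for the apostrophe) are assumptions made for this example; the actual output of anti_xss_text may differ in detail.

```rust
/// Hypothetical stand-in for `anti_xss_text`: encodes the six characters
/// listed above and copies every other character through unchanged.
/// The entity spellings chosen here are an assumption, not the confirmed output.
fn encode_html_entities(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for ch in input.chars() {
        match ch {
            // Ampersand is handled per character, so nothing is double-encoded.
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            '>' => out.push_str("&gt;"),
            '"' => out.push_str("&quot;"),
            '\'' => out.push_str("&#x27;"),
            '/' => out.push_str("&#x2F;"),
            other => out.push(other),
        }
    }
    out
}
```

With this sketch, an input such as `<b>Tom</b>` comes back as `&lt;b&gt;Tom&lt;&#x2F;b&gt;`, which a browser shows as literal text instead of interpreting as markup.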

anti_xss_markdown

Sanitizes markdown while preserving safe formatting. Uses allowlist-based filtering to keep legitimate markdown features while blocking dangerous content.

Features:

  • Strips all HTML tags completely
  • Validates image URLs against a domain allowlist (configurable in code)
  • Validates link schemes (allows http, https, mailto only)
  • Preserves safe markdown formatting (headers, lists, emphasis, etc.)
  • Removes images from disallowed domains
  • Strips URLs from links with dangerous schemes (but keeps link text)

When to use:

  • Markdown content in announcements
  • Support ticket descriptions
  • User comments that support formatting
  • Any user-generated markdown that will be rendered

Domain Allowlist: The function checks image URLs against the ALLOWED_IMAGE_DOMAINS constant. To trust an additional image hosting service, add its domain to this list in the source code.
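The scheme and domain checks described above could look roughly like the following. This is a simplified, standard-library-only sketch: the allowlist entries, the helper names (`is_allowed_link`, `image_host`, `is_allowed_image`, `sanitize_link`), and the hand-rolled host extraction are all assumptions for illustration. The real function is described as using full URL parsing, which handles many cases this sketch does not.

```rust
/// Hypothetical allowlist entries; the real contents of
/// `ALLOWED_IMAGE_DOMAINS` are not shown in this document.
const ALLOWED_IMAGE_DOMAINS: &[&str] = &["images.example.com", "cdn.example.org"];

/// Returns true when the URL starts with http, https, or mailto.
fn is_allowed_link(url: &str) -> bool {
    let lower = url.trim().to_ascii_lowercase();
    ["http://", "https://", "mailto:"]
        .iter()
        .any(|prefix| lower.starts_with(prefix))
}

/// Extracts the lowercase host from an http(s) URL, or None otherwise.
/// Simplified on purpose: a real URL parser should be preferred.
fn image_host(url: &str) -> Option<String> {
    let lower = url.trim().to_ascii_lowercase();
    let rest = lower
        .strip_prefix("https://")
        .or_else(|| lower.strip_prefix("http://"))?;
    // The authority ends at the first path, query, or fragment delimiter.
    let authority = rest.split(|c: char| c == '/' || c == '?' || c == '#').next()?;
    // Drop any "user:pass@" prefix, then any ":port" suffix.
    let host_port = authority.rsplit('@').next()?;
    let host = host_port.split(':').next()?;
    if host.is_empty() { None } else { Some(host.to_string()) }
}

/// Exact host match against the allowlist (no suffix or substring matching).
fn is_allowed_image(url: &str) -> bool {
    match image_host(url) {
        Some(host) => ALLOWED_IMAGE_DOMAINS.iter().any(|d| *d == host.as_str()),
        None => false,
    }
}

/// Rebuilds a markdown link, keeping only the text when the scheme is rejected.
fn sanitize_link(text: &str, url: &str) -> String {
    if is_allowed_link(url) {
        format!("[{}]({})", text, url)
    } else {
        text.to_string()
    }
}
```

Matching the host exactly, instead of checking whether the URL merely contains an allowed domain, is what rejects look-alikes such as `images.example.com.evil.net`.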

anti_xss_enhanced

Advanced protection that actively detects and removes script injections before encoding. Combines pattern matching with entity encoding for multi-layered defense.

Protected against:

  • <script> tags (including multiline)
  • Dangerous URL schemes (javascript:, vbscript:, data:)
  • Event handler attributes (onclick, onload, etc.)
  • Embedded objects (<iframe>, <object>, <embed>)

Process:

  1. Detects and removes script patterns using regex
  2. Applies HTML entity encoding to remaining content

When to use:

  • Rich text content from untrusted sources
  • Content that might contain complex HTML
  • Extra protection layer for high-risk inputs
  • Any content where XSS risk is elevated
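The two-step process (remove, then encode) can be sketched as follows. The real implementation is described as regex-based and also covers URL schemes, event handlers, and embedded objects; this example swaps in plain string searching, handles only `<script>` blocks, and uses hypothetical function names (`strip_script_blocks`, `encode_entities`, `enhanced_sketch`) so that it stays self-contained.

```rust
/// Step 1 (simplified): remove `<script ...>...</script>` blocks,
/// case-insensitively and across line breaks. An unclosed `<script`
/// causes everything after it to be dropped.
fn strip_script_blocks(input: &str) -> String {
    // ASCII lowercasing keeps byte offsets identical to the original string.
    let lower = input.to_ascii_lowercase();
    let mut out = String::with_capacity(input.len());
    let mut pos = 0;
    while let Some(rel) = lower[pos..].find("<script") {
        let start = pos + rel;
        out.push_str(&input[pos..start]);
        pos = match lower[start..].find("</script>") {
            Some(end_rel) => start + end_rel + "</script>".len(),
            None => input.len(),
        };
    }
    out.push_str(&input[pos..]);
    out
}

/// Step 2: entity-encode whatever remains (entity forms are an assumption).
fn encode_entities(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for ch in input.chars() {
        match ch {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            '>' => out.push_str("&gt;"),
            '"' => out.push_str("&quot;"),
            '\'' => out.push_str("&#x27;"),
            '/' => out.push_str("&#x2F;"),
            other => out.push(other),
        }
    }
    out
}

/// Hypothetical stand-in for `anti_xss_enhanced`: strip first, then encode.
fn enhanced_sketch(input: &str) -> String {
    encode_entities(&strip_script_blocks(input))
}
```

The order matters: a nested payload such as `<scr<script></script>ipt>` reassembles into `<script>` once the inner block is removed, and the encoding pass then turns that leftover into inert text.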

Integration

Backend Usage

These are utility functions exported from the shield::utils::anti_xss module. Call them directly in your service logic before storing or processing user input:

use shield::utils::anti_xss::{anti_xss_text, anti_xss_markdown, anti_xss_enhanced};

// Sanitize before storage
let clean_username = anti_xss_text(&user_input);
let clean_announcement = anti_xss_markdown(&announcement_text);
let clean_content = anti_xss_enhanced(&rich_text);

Frontend Considerations

Backend sanitization is the primary defense, but the frontend should also:

  • Validate input before submission
  • Use appropriate HTML escaping when rendering
  • Leverage framework-level XSS protection (React’s JSX escaping, Vue’s template binding, etc.)
  • Never use innerHTML or equivalent with user content
  • Use markdown renderers with built-in sanitization

When to Apply Sanitization

Always sanitize:

  • User-submitted text, markdown, or HTML
  • Content from external APIs or third-party sources
  • Any data that will be displayed to other users
  • Content stored in the database that is rendered on the frontend

Timing:

  • Before storage (recommended): Sanitize once when data enters the system
  • Before display: Sanitize at render time if the stored data is raw (unsanitized)

Defense in depth: For high-risk scenarios, apply multiple layers:

  1. Frontend validation
  2. Backend sanitization (these functions)
  3. Frontend rendering protection (framework escaping)

Architecture Notes

Synchronous and Lightweight

All three functions are synchronous and suitable for use in request handlers. They don’t perform I/O or heavy computation, making them safe to call inline during request processing.

Regex-Based Detection

Detection relies on regex patterns, which are compiled once using lazy_static so that individual calls do not pay the compilation cost. The more complex patterns are multiline and case-insensitive, so that injection attempts split across lines or written in mixed case (for example <ScRiPt>) are still matched.
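The compile-once idea can be illustrated without any crates. The sketch below uses the standard library's `OnceLock` as an analogue of lazy_static, and a list of lowercase scheme strings in place of compiled regexes; the names `dangerous_schemes`, `contains_dangerous_scheme`, and `BUILD_COUNT` are hypothetical and exist only for this example.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Counts how often the pattern table is built, to show it happens once.
static BUILD_COUNT: AtomicUsize = AtomicUsize::new(0);

/// Returns the shared pattern table, building it on first use only.
/// In the real module this slot would hold compiled regexes.
fn dangerous_schemes() -> &'static [String] {
    static SCHEMES: OnceLock<Vec<String>> = OnceLock::new();
    SCHEMES.get_or_init(|| {
        BUILD_COUNT.fetch_add(1, Ordering::SeqCst);
        ["javascript:", "vbscript:", "data:"]
            .iter()
            .map(|s| s.to_string())
            .collect()
    })
}

/// Case-insensitive check against the shared table.
fn contains_dangerous_scheme(input: &str) -> bool {
    let lower = input.to_ascii_lowercase();
    dangerous_schemes().iter().any(|s| lower.contains(s.as_str()))
}
```

However many times `contains_dangerous_scheme` is called, the initializer closure runs a single time, which is the property that keeps per-request overhead low.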

No External Dependencies

The utilities rely only on the commonly used Rust regex and URL parsing libraries. No external sanitization services or heavyweight parsers are required.

Limitations

These utilities provide solid protection for common XSS vectors, but they are not a complete security solution:

  • Not a WAF: These functions handle input sanitization, not request-level filtering
  • Not HTML parsing: They use regex-based pattern matching, not full HTML/markdown parsers
  • Allowlist maintenance: The image domain allowlist must be manually maintained in code
  • Markdown edge cases: Complex or malformed markdown might not be handled perfectly

Additional security measures:

  • Use Content Security Policy (CSP) headers on the frontend
  • Apply proper output encoding in templates
  • Enable framework-level XSS protection
  • Regular security audits of user-facing features
  • Keep dependencies updated

Implementation Notes

  • Functions are pure and stateless (no side effects)
  • All functions return new strings (input is not modified)
  • Empty strings are handled safely
  • If a regex fails to compile, a fallback pattern is used instead
  • Domain checking for images uses full URL parsing
  • Link text is preserved even when URL is removed