Understanding Cross-Site Scripting (XSS) Vulnerabilities

Cross-Site Scripting (XSS) is a browser-side vulnerability that happens when untrusted input is inserted into a web page without proper escaping or validation. The browser then interprets that input as code, which lets an attacker run scripts in a victim’s session. The result can be account takeover, data theft, or subtle manipulation of the UI that tricks users into dangerous actions.

This guide is long on purpose. It moves from beginner concepts to “hero level” defenses and real-world architecture decisions. If you only need the basics, read the first few sections and jump to the checklist at the end. If you want to level up, read straight through.

Why XSS Still Matters#

XSS has been around for decades, yet it still appears in modern apps because:

  • User content is everywhere (profiles, comments, tickets, rich text editors).
  • Frontends are dynamic and regularly touch the DOM.
  • Teams mix multiple frameworks, templates, and legacy pages.
  • One unsafe component can expose the whole session.

Even a small XSS can have large consequences. A single vulnerable page can become a phishing overlay, steal tokens stored in the browser, or silently perform actions with the victim’s credentials.

The Browser’s Point of View (Mental Model)#

To understand XSS, think like the browser. The browser parses HTML into a DOM tree. It then executes JavaScript from script tags, event handlers, and some URLs. The browser does not know whether a string is “user input” or “trusted.” It only knows how to interpret the markup it receives.

So XSS happens when:

  1. Untrusted data flows into the page.
  2. The data lands in a place where the browser treats it as code.
  3. That code runs in the context of the site’s origin.

This is why XSS is so dangerous. It runs as if the attacker were the site itself.

Quick Vocabulary (You’ll Use This Later)#

  • Source: Where untrusted data comes from (request parameters, headers, storage, API responses).
  • Sink: A place where data is interpreted as markup or code (innerHTML, document.write, eval, raw template output).
  • Context: The location where the browser interprets the data (HTML text, attribute, URL, JavaScript, CSS).
  • Encoding (escaping): Converting special characters so the browser treats them as text instead of code.
  • Sanitization: Removing or rewriting dangerous parts from input.
  • Trusted Types: A browser feature that limits unsafe DOM sinks to trusted data.

The Three Core Types of XSS#

1. Reflected XSS#

Reflected XSS happens when input from the current request is echoed straight back in the response. It often arrives through query parameters or form fields and appears immediately on the page.

2. Stored XSS#

Stored XSS happens when input is saved (database, cache, file) and later served to other users. It spreads more easily because the payload lives in your data.

3. DOM-Based XSS#

DOM XSS happens entirely in the browser. The server might be safe, but client-side JavaScript takes untrusted data and writes it into the DOM unsafely.

Easy Tier: A Simple Reflected XSS Example#

Let’s start with a minimal example that shows the core idea. Imagine a page that greets a user by name, taken from a query parameter.

// Node/Express example
app.get("/welcome", (req, res) => {
  const name = req.query.name || "guest";
  // Vulnerable: name is written into the HTML response unescaped
  res.send(`<h1>Welcome, ${name}</h1>`);
});

If name is untrusted, an attacker can inject HTML or JavaScript into the response. The browser treats it as part of the page.

Fix (output encoding):

function escapeHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

app.get("/welcome", (req, res) => {
  const name = req.query.name || "guest";
  res.send(`<h1>Welcome, ${escapeHtml(name)}</h1>`);
});

This simple change ensures that characters like < and > are rendered as text instead of HTML. Most template engines handle this for you by default, but you must avoid bypassing that behavior.

Easy Tier: HTML vs Attribute vs JavaScript Context#

Encoding is context-dependent. The same string can be safe in one place and dangerous in another.

HTML Text Context#

Inside text nodes, HTML encoding is enough.

<div>Welcome, USER_INPUT</div>

Attribute Context#

Attributes need special care, because a quote (or, in an unquoted attribute, a space) can end the value and let the input add new attributes.

<a href="/profile?name=USER_INPUT">Profile</a>

Here, " and ' must be encoded, and because the value sits inside a URL query string it should also be percent-encoded. It’s also safer to avoid placing raw input inside attribute values when possible.
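Here is a minimal sketch of the attribute case, assuming the value ends up in a query string inside a quoted href. The helper names escapeAttr and buildProfileLink are hypothetical, not part of any library. The input is percent-encoded for the URL first, then entity-encoded for the attribute.

```javascript
// Hypothetical helper: entity-encode a value for a quoted HTML attribute.
function escapeAttr(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Hypothetical helper: build the profile link from the attribute example.
function buildProfileLink(name) {
  // Step 1: percent-encode the value for the URL query component.
  const query = encodeURIComponent(String(name));
  // Step 2: entity-encode the finished URL for the attribute value.
  const href = escapeAttr(`/profile?name=${query}`);
  return `<a href="${href}">Profile</a>`;
}

// An input that tries to close the attribute and add an event handler.
console.log(buildProfileLink('" onmouseover="alert(1)'));
```

The order matters: each layer is encoded for the parser that will decode it, so the URL layer goes on first and the HTML layer last.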

JavaScript Context#

Putting input inside JavaScript is the most dangerous and the hardest to escape correctly.

<script>
const name = "USER_INPUT";
</script>

If input ends up inside <script> or an event handler, escaping becomes complex. The best practice is: do not put untrusted input inside JavaScript strings. Pass it as data from the server in JSON and parse it safely.

Medium Tier: Stored XSS in a Comment System#

Stored XSS is more serious because it persists. Consider a blog that allows comments.

// Save the comment
app.post("/comment", (req, res) => {
  const comment = req.body.comment; // untrusted
  saveToDatabase(comment);
  res.redirect("/post/123");
});

// Render comments
app.get("/post/123", async (req, res) => {
  const comments = await loadComments();
  res.render("post", { comments });
});

If the template renders comments using raw HTML output, stored XSS can appear for every future visitor.

Safer version:

  • Store raw user input as plain text.
  • Escape on output.
  • If you allow rich text, sanitize it.
import sanitizeHtml from "sanitize-html";

app.post("/comment", (req, res) => {
  const dirty = req.body.comment;
  const clean = sanitizeHtml(dirty, {
    allowedTags: ["b", "i", "em", "strong", "a"],
    allowedAttributes: { a: ["href", "title", "rel"] },
  });
  saveToDatabase(clean);
  res.redirect("/post/123");
});

Sanitization is context-specific. Be explicit about what HTML you allow, and strip everything else.
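For the plain-text case in the list above (store raw, escape on output), rendering might look like the following sketch. It assumes comments are an array of strings already loaded from storage; renderComments is a hypothetical name, and a real app would normally let its template engine do this step.

```javascript
// Characters that must be encoded in HTML text and quoted attributes.
const HTML_ENTITIES = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

function escapeHtml(value) {
  return String(value).replace(/[&<>"']/g, (ch) => HTML_ENTITIES[ch]);
}

// Hypothetical renderer: comments stay raw in storage and are
// escaped at the moment they are written into HTML.
function renderComments(comments) {
  const items = comments.map((text) => `<li>${escapeHtml(text)}</li>`);
  return `<ul>${items.join("")}</ul>`;
}

console.log(renderComments(["<script>alert(1)</script>", "Tom & Jerry"]));
```

Escaping at render time means the stored data stays reusable for other contexts (JSON APIs, email, search), each of which applies its own encoding.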

Medium Tier: DOM XSS in Client-Side JavaScript#

DOM XSS happens when front-end code inserts untrusted data into the DOM. It is common in single-page apps when developers use innerHTML or similar methods.

<div id="welcome"></div>
<script>
  const params = new URLSearchParams(window.location.search);
  const name = params.get("name") || "guest";
  // Dangerous
  document.getElementById("welcome").innerHTML = `Welcome, ${name}`;
</script>

Safer version:

<div id="welcome"></div>
<script>
  const params = new URLSearchParams(window.location.search);
  const name = params.get("name") || "guest";
  // Safe
  document.getElementById("welcome").textContent = `Welcome, ${name}`;
</script>

Avoid innerHTML when the data comes from untrusted sources. textContent is the simplest fix in many cases.

Medium Tier: Markdown and Rich Text#

Markdown is tricky. Many teams assume Markdown is safe because it’s “just text,” but Markdown engines often allow HTML inside Markdown by default. This can lead to XSS if you render Markdown without sanitizing.

Safer approach:

  • Use a Markdown renderer configured to strip HTML.
  • Or sanitize the output HTML with a library like DOMPurify.
import DOMPurify from "dompurify";
const html = markdownToHtml(userMarkdown);
const safe = DOMPurify.sanitize(html);
content.innerHTML = safe;

The goal is to restrict output to a known safe subset of tags and attributes.

Advanced Tier: Contextual Encoding (The “Hero” Lesson)#

Most XSS defenses fail because developers escape in the wrong context. Here’s the key idea:

Encoding must match the context where the input is placed.

HTML Context#

Use HTML entity encoding.

Attribute Context#

Encode HTML entities and ensure quotes are escaped. It is safer to avoid placing raw input in sensitive attributes like style, on* handlers, or href without validation.

URL Context#

Validate allowed protocols. Never allow javascript: URLs, and allow data: only if you truly need it.

function safeUrl(url) {
  try {
    const parsed = new URL(url, "https://example.com");
    if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
      return "#";
    }
    return parsed.toString();
  } catch {
    return "#";
  }
}

JavaScript Context#

Avoid injecting untrusted input directly. If you must, serialize safely with JSON:

<script id="data" type="application/json">
{"name": "Alice"}
</script>
<script>
  // The data block must appear before the script that reads it.
  const data = JSON.parse(document.getElementById("data").textContent);
</script>

JSON is safer than raw strings because the browser won’t execute a JSON data block as code. You still need to escape < (for example as \u003c) when you generate the JSON server-side, so that a value containing </script> cannot close the block early.
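One way to do that escaping is sketched below, assuming the data is JSON-serializable. serializeForScriptTag is a hypothetical helper name. It replaces <, > and & with their \uXXXX forms, which JSON.parse turns back into the original characters.

```javascript
// Hypothetical helper: serialize data for a <script type="application/json"> block.
// Assumes `data` is JSON-serializable (JSON.stringify returns a string).
function serializeForScriptTag(data) {
  return JSON.stringify(data)
    .replace(/</g, "\\u003c") // stops "</script>" and "<!--" from reaching the HTML parser
    .replace(/>/g, "\\u003e")
    .replace(/&/g, "\\u0026");
}

const payload = { name: "</script><script>alert(1)</script>" };
const json = serializeForScriptTag(payload);
const block = `<script id="data" type="application/json">${json}</script>`;

console.log(block);
// The round trip through JSON.parse restores the original value.
console.log(JSON.parse(json).name === payload.name);
```

The escaped characters only ever appear inside JSON strings, so swapping them for \uXXXX sequences does not change what the data means.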

Advanced Tier: Avoiding Dangerous DOM Sinks#

Certain APIs are inherently risky. Here are common ones and safer alternatives.

Dangerous:

  • innerHTML
  • outerHTML
  • insertAdjacentHTML
  • document.write
  • eval, Function, setTimeout with strings

Safer alternatives:

  • textContent
  • setAttribute with validation
  • DOM methods like createElement, appendChild

When you must use innerHTML, sanitize the input and keep the allowed tags minimal.
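As a rough illustration of the safer alternatives, the sketch below builds a link with DOM methods instead of an HTML string. createSafeLink is a hypothetical helper, and the document object is passed in as a parameter (an assumption made so the function is easy to exercise outside a browser).

```javascript
// Hypothetical helper: build a link with DOM methods instead of innerHTML.
// `doc` is the page's Document (pass `document` in a browser).
function createSafeLink(doc, label, url) {
  const link = doc.createElement("a");
  // textContent inserts the label as text; it is never parsed as HTML.
  link.textContent = label;

  // setAttribute with validation: only http(s) URLs are accepted.
  let parsed = null;
  try {
    parsed = new URL(url, "https://example.com");
  } catch {
    parsed = null;
  }
  const allowed =
    parsed !== null &&
    (parsed.protocol === "http:" || parsed.protocol === "https:");
  link.setAttribute("href", allowed ? parsed.toString() : "#");
  return link;
}

// Browser usage (assumes an element with id "links" exists):
// document.getElementById("links").appendChild(
//   createSafeLink(document, userLabel, userUrl)
// );
```

Because no HTML string is ever assembled, there is nothing for the parser to misinterpret; the only remaining decision is which URLs to accept.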

Advanced Tier: Trusted Types (Defense in Depth)#

Trusted Types is a browser feature that blocks unsafe sinks unless the value was produced by a registered policy. It helps teams enforce safe coding patterns in large applications.

High-level flow:

  1. Enable a Content Security Policy requiring Trusted Types.
  2. Define a policy that sanitizes HTML.
  3. Use that policy whenever inserting HTML.

This prevents accidental innerHTML usage from becoming an XSS bug. It is especially useful for large teams.
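The three steps above can be sketched as follows. The policy name "app-html" is hypothetical, and the policy body here simply entity-encodes everything as a conservative stand-in; in a real app this is where an HTML sanitizer would run instead (an assumption, since the right sanitizer depends on your stack).

```javascript
// Step 1 (server side): send a header such as
//   Content-Security-Policy: require-trusted-types-for 'script'; trusted-types app-html

// Step 2: define the policy rules. This stand-in encodes every HTML
// metacharacter, so no markup survives. Swap in a sanitizer if you need markup.
const policyRules = {
  createHTML: (input) =>
    String(input).replace(/[&<>"']/g, (ch) => `&#${ch.charCodeAt(0)};`),
};

// Register the policy only where the browser supports Trusted Types.
// Elsewhere, fall back to the same function returning a plain string.
const htmlPolicy =
  typeof trustedTypes !== "undefined" && trustedTypes.createPolicy
    ? trustedTypes.createPolicy("app-html", policyRules)
    : policyRules;

// Step 3: use the policy whenever inserting HTML, for example:
// element.innerHTML = htmlPolicy.createHTML(untrustedString);
console.log(String(htmlPolicy.createHTML("<img src=x>")));
```

With enforcement on, a plain string assigned to innerHTML throws, so code that skips the policy fails loudly during development instead of silently shipping.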

Advanced Tier: Content Security Policy (CSP)#

CSP limits where scripts can load from and whether inline scripts are allowed. A strong CSP can prevent many XSS payloads from running.

Example (strict but manageable):

Content-Security-Policy: default-src 'self'; script-src 'self' 'nonce-abc123'; object-src 'none'; base-uri 'none'

Key points:

  • Avoid unsafe-inline and unsafe-eval if possible.
  • Use nonces for inline scripts you control, and generate a fresh random nonce for every response.
  • Block object/embed content with object-src 'none'.

CSP does not fix XSS by itself, but it can reduce impact and make exploitation harder.

Advanced Tier: Rolling Out CSP Without Breaking Everything#

Teams often avoid CSP because it can be disruptive. The trick is to roll it out in phases and use reports to find violations before you enforce.

Practical rollout approach:

  • Start with Content-Security-Policy-Report-Only in production to collect violation reports.
  • Move inline scripts to external files where you can.
  • Add nonces to the inline scripts you truly need.
  • Remove unsafe-inline once you have nonces in place.
  • Enforce with a strict policy after your report stream is quiet.

This approach keeps the site stable while you increase security steadily. It also creates a feedback loop where new unsafe changes are caught early.
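The switch between the two phases can be as small as choosing the header name. The sketch below assumes a two-phase rollout; cspHeaderFor, the phase names, and the /csp-reports endpoint are all hypothetical.

```javascript
// Hypothetical rollout switch: the same policy is sent under a different
// header name depending on the phase.
// "report" collects violations only; "enforce" blocks them.
function cspHeaderFor(phase, policy) {
  const name =
    phase === "enforce"
      ? "Content-Security-Policy"
      : "Content-Security-Policy-Report-Only";
  return { name, value: policy };
}

// "/csp-reports" is a placeholder endpoint for violation reports.
const policy = "default-src 'self'; script-src 'self'; report-uri /csp-reports";

console.log(cspHeaderFor("report", policy).name);
console.log(cspHeaderFor("enforce", policy).name);
```

Keeping the policy string identical across phases means the reports you collect describe exactly what enforcement will block.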

Advanced Tier: Security Headers That Reduce XSS Risk#

These headers do not replace escaping or sanitization, but they limit how XSS can be exploited.

  • X-Content-Type-Options: nosniff prevents MIME-type confusion that can enable script execution.
  • Referrer-Policy: no-referrer or strict-origin-when-cross-origin reduces leakage of sensitive URLs.
  • Permissions-Policy can disable risky browser features on sensitive pages.
  • X-Frame-Options: DENY or frame-ancestors in CSP helps stop clickjacking that can amplify XSS attacks.

Think of these as guardrails. They make bad situations less severe, but they do not fix the root cause.
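A small sketch of applying these headers in one place follows. It assumes a response object with a setHeader method (as in Node's http module or Express). The specific Permissions-Policy features listed are illustrative choices, not a recommendation for every site.

```javascript
// Header values taken from the list above. The Permissions-Policy
// feature list is an illustrative assumption.
const SECURITY_HEADERS = {
  "X-Content-Type-Options": "nosniff",
  "Referrer-Policy": "strict-origin-when-cross-origin",
  "Permissions-Policy": "camera=(), microphone=(), geolocation=()",
  "X-Frame-Options": "DENY",
};

// Hypothetical helper: set every guardrail header on a response.
function applySecurityHeaders(res) {
  for (const [name, value] of Object.entries(SECURITY_HEADERS)) {
    res.setHeader(name, value);
  }
}
```

Centralizing the values in one object makes it harder for a single route to forget a header.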

Advanced Tier: Frameworks and “Safe by Default” Isn’t Absolute#

Modern frameworks like React, Vue, Svelte, and Angular typically escape output by default. This is great, but there are still foot-guns:

  • dangerouslySetInnerHTML (React)
  • v-html (Vue)
  • {@html} (Svelte)
  • Bypassing Angular’s sanitization (for example, DomSanitizer.bypassSecurityTrustHtml)

These are escape hatches, and they exist for legitimate reasons. But they must always be paired with sanitization.

If you must render HTML:

  1. Sanitize the content.
  2. Limit allowed tags and attributes.
  3. Consider a CSP to limit what can run.

Advanced Tier: Storage, Cookies, and Why XSS Beats CSRF#

XSS is powerful because it runs in the site’s origin, so it can do anything a user can do in the browser. That includes reading data and calling APIs with the user’s session.

Key points:

  • If tokens are stored in localStorage or sessionStorage, XSS can read them directly.
  • If you store session tokens in cookies, set HttpOnly so JavaScript cannot read them.
  • Even with HttpOnly, XSS can still make authenticated requests because the browser will attach cookies automatically.

This is why XSS is often more dangerous than CSRF. A CSRF attack can only send requests, but XSS can also read responses, scrape DOM content, and observe user actions in real time.

Defense tip:

  • Use HttpOnly cookies for session tokens.
  • Use SameSite cookies to reduce CSRF.
  • Combine that with CSP and strict output encoding to prevent XSS in the first place.
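The cookie attributes from the tip above might be assembled like this. buildSessionCookie and the cookie name "session" are hypothetical; frameworks usually offer an equivalent options object, so treat this as a sketch of the resulting header rather than a required implementation.

```javascript
// Hypothetical helper: build a Set-Cookie value for a session token.
function buildSessionCookie(token, maxAgeSeconds) {
  return [
    `session=${encodeURIComponent(token)}`,
    "Path=/",
    `Max-Age=${maxAgeSeconds}`,
    "HttpOnly", // not readable from JavaScript, so XSS cannot exfiltrate it
    "Secure", // only sent over HTTPS
    "SameSite=Lax", // not sent on most cross-site requests, which reduces CSRF
  ].join("; ");
}

console.log(buildSessionCookie("abc 123", 3600));
```

HttpOnly protects the token itself, but as noted above, injected script can still make authenticated requests while the victim's page is open.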

Advanced Tier: XSS via Third-Party Content#

Sometimes the vulnerability isn’t your code but your dependencies. Common examples:

  • Analytics snippets added inline.
  • Ad scripts that load third-party content.
  • UI libraries that allow HTML strings for tooltips or popovers.

If any third-party content can include user data, you can still get XSS. Keep dependencies updated and review their security advisories.

Advanced Tier: Sandboxing User Content#

If your app must support user HTML (forums, wikis, page builders), consider sandboxing:

  • Render the content inside an <iframe> with a restrictive sandbox attribute.
  • Disallow scripts and same-origin access.
  • Use a separate domain for untrusted content.

This is “hero level” because it changes architecture, but it can dramatically reduce risk.

Hero Tier: Context Cheat Sheet (Where XSS Sneaks In)#

If you only memorize one table, make it this one. It summarizes the most common contexts and the safest default handling.

Context         | Example             | Safe Handling
HTML text       | <div>USER</div>     | HTML entity encode
Attribute value | <a title="USER">    | HTML entity encode and validate
URL attribute   | <a href="USER">     | Validate protocol + encode
JS string       | const x = "USER";   | Avoid; use JSON data block
CSS             | style="color: USER" | Avoid; use allowlist values

This is why generic escaping is not enough. You need to know where the data lands and use the correct handling for that context.

Hero Tier: SVG and Template Edge Cases#

SVG is an XML-based format that can be embedded in HTML, and it can execute scripts through certain attributes and elements. If you allow SVG uploads or inline SVG content, treat it as untrusted HTML and sanitize it with a strict allowlist.

Template engines can also be abused when developers concatenate strings or bypass escaping. If you see a template helper that outputs “raw” HTML, treat it as a red flag that requires sanitization.

Simple rules:

  • Do not accept raw SVG from users unless you sanitize it.
  • Avoid any “raw HTML” helpers unless you own the input.
  • Review every instance of “raw output” in templates during code review.

Advanced Tier: Common XSS Mistakes (Even for Experts)#

  1. Escaping once and reusing elsewhere: A string safe for HTML text is not safe for JavaScript context.

  2. Sanitizing input instead of output: Data cleaned for one context can later be used in another. Output encoding must happen where data is used.

  3. Using regex to strip tags: HTML parsing is complex. Use a real HTML sanitizer.

  4. Assuming CSP is enough: CSP helps, but it is not a substitute for safe coding.

  5. Allowing data: or javascript: URLs: These can execute scripts in some contexts.
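Two of these mistakes are easy to demonstrate in a few lines. The sketch below uses a simple entity-encoding function as a stand-in for any HTML escaper, and all variable names are illustrative.

```javascript
// Stand-in HTML escaper (numeric entities for the five metacharacters).
const escapeHtml = (value) =>
  String(value).replace(/[&<>"']/g, (ch) => `&#${ch.charCodeAt(0)};`);

// Mistakes 1 and 5: HTML escaping does nothing to a javascript: URL,
// because the string contains no HTML metacharacters to encode.
const urlInput = "javascript:alert(1)";
const escapedUrl = escapeHtml(urlInput);
const riskyLink = `<a href="${escapedUrl}">profile</a>`;

// Mistake 3: a naive regex strip can reassemble the tag it removed.
const nested = "<scr<script>ipt>alert(1)</script>";
const stripped = nested.replace(/<script>/g, "");

console.log(riskyLink);
console.log(stripped);
```

In the first case the fix is protocol validation rather than more encoding; in the second it is a parser-based sanitizer with an allowlist.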

“Easy to Hero” Examples by Skill Level#

Easy: Safe Greeting (Template Escaping)#

<h1>Welcome, {{ name }}</h1>

Use a template engine that escapes by default. Avoid raw HTML output unless necessary.

Easy: Safe DOM Update#

welcomeEl.textContent = `Welcome, ${name}`;

Intermediate: Sanitized Rich Text#

const html = markdownToHtml(userMarkdown);
const safe = DOMPurify.sanitize(html, {
  ALLOWED_TAGS: ["p", "b", "i", "em", "strong", "a", "ul", "li"],
  ALLOWED_ATTR: ["href", "rel"],
});
content.innerHTML = safe;

Intermediate: Safe URL Handling#

function allowHttpOnly(url) {
  try {
    const u = new URL(url, "https://example.com");
    return ["http:", "https:"].includes(u.protocol) ? u.toString() : "#";
  } catch {
    return "#";
  }
}

Advanced: CSP with Nonces#

Content-Security-Policy: default-src 'self'; script-src 'self' 'nonce-r4nd0m'; object-src 'none'

Hero: Sandboxed User Content#

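<iframe
  src="/user-content/123"
  sandbox=""
  referrerpolicy="no-referrer"
></iframe>

An empty sandbox attribute applies every restriction, so the framed content gets no scripts and no same-origin access. Serve user content from a separate domain to reduce damage even further.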

Detection and Testing (Practical Workflow)#

XSS is easier to prevent than to clean up later, but teams should test for it consistently.

Code review checklist:

  • Are we using a template engine with auto-escaping?
  • Any direct use of innerHTML, dangerouslySetInnerHTML, or similar?
  • Any HTML passed to a component from external data?
  • Any user input in JavaScript strings or event handlers?

Static analysis:

  • Use linters or SAST tools that flag unsafe sinks.
  • Review warnings and confirm whether data is trusted.

Dynamic testing:

  • Check pages where user input is reflected.
  • Test rich text features thoroughly.
  • Include URL parameters, headers, and JSON fields.
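Dynamic checks like these can be partly automated. The sketch below feeds a few probe strings through a render function and reports any that come back verbatim. The probe list, findReflectedProbes, and both render functions are illustrative assumptions, and this kind of check only catches the simplest reflections.

```javascript
// Illustrative probe strings; real test suites use larger, curated lists.
const PROBES = [
  "<script>alert(1)</script>",
  '"><img src=x onerror=alert(1)>',
  "'><svg onload=alert(1)>",
];

// Returns the probes that appear unmodified in the rendered output.
function findReflectedProbes(render, probes) {
  return probes.filter((probe) => render(probe).includes(probe));
}

// Stand-ins for real rendering code: one unsafe, one escaped.
const unsafeRender = (name) => `<h1>Welcome, ${name}</h1>`;
const safeRender = (name) =>
  `<h1>Welcome, ${String(name).replace(
    /[&<>"']/g,
    (ch) => `&#${ch.charCodeAt(0)};`
  )}</h1>`;

console.log(findReflectedProbes(unsafeRender, PROBES).length);
console.log(findReflectedProbes(safeRender, PROBES).length);
```

A check like this fits well in CI as a regression guard, alongside manual testing of rich text features and less obvious sources such as headers and JSON fields.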

Hero Tier: Building a Long-Term XSS Defense Culture#

XSS prevention is not a one-time fix. The best teams treat it as a culture of safe defaults.

Things that help over the long term:

  • Centralize escaping and sanitization helpers so teams do not reinvent unsafe versions.
  • Create lint rules that block unsafe DOM sinks.
  • Run security tests in CI so regressions are caught early.
  • Keep a short list of approved UI components for rich text.
  • Document “safe patterns” with examples in your engineering handbook.

Over time, these guardrails reduce the number of decisions each developer must make, which reduces mistakes.
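As one example of such a guardrail, ESLint's built-in no-restricted-syntax, no-eval, and no-implied-eval rules can flag common sinks. The rule settings below are a sketch: the selectors cover only simple assignment and call patterns, the messages are illustrative, and how the object is wired into your ESLint config depends on your setup.

```javascript
// Sketch of ESLint rule settings that flag writes to risky DOM sinks.
// Merge into the `rules` section of your ESLint configuration.
const restrictedSinkRules = {
  "no-restricted-syntax": [
    "error",
    {
      selector: "AssignmentExpression[left.property.name='innerHTML']",
      message: "Use textContent or a sanitizing helper instead of innerHTML.",
    },
    {
      selector: "AssignmentExpression[left.property.name='outerHTML']",
      message: "Use DOM methods instead of outerHTML.",
    },
    {
      selector: "CallExpression[callee.property.name='insertAdjacentHTML']",
      message: "Use DOM methods instead of insertAdjacentHTML.",
    },
  ],
  "no-eval": "error",
  "no-implied-eval": "error",
};
```

Rules like these will not catch every sink (for example, sinks reached through aliases), but they turn the most common unsafe patterns into review-time errors.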

Incident Response if You Find XSS in Production#

  1. Fix the vulnerable output (escape/sanitize correctly).
  2. Review logs to understand the scope (who was affected).
  3. Rotate session tokens and invalidate suspicious sessions.
  4. Update CSP if you can do so safely.
  5. Review similar endpoints for the same pattern.

XSS can be exploited quickly, so response speed matters.

Quick Reference: Prevention Checklist#

  • Escape output by default.
  • Encode based on context (HTML, attribute, URL, JS).
  • Avoid dangerous DOM sinks and inline scripts.
  • Sanitize rich text and restrict allowed tags.
  • Validate URLs and block javascript: and data: protocols.
  • Use CSP and (when possible) Trusted Types.
  • Keep dependencies updated.
  • Consider sandboxing untrusted content.

Final Thoughts#

XSS is not just a “beginner vulnerability.” It can appear in modern apps due to tight deadlines, complex frontends, or unsafe third-party dependencies. The hero-level approach is to make unsafe patterns impossible by default: use safe template engines, avoid risky DOM APIs, and layer defenses like CSP and Trusted Types. If you do that, you’ll stop XSS at the source and keep it from coming back.