XML Security Essentials: Protecting Your Data from XXE, Injection & More
Why XML Security Matters in 2026
XML underpins some of the most sensitive data flows in existence: healthcare records (HL7 FHIR), financial transactions (ISO 20022, XBRL), identity federation (SAML), e-commerce (EDI), and enterprise integrations. A vulnerability in XML processing can expose confidential data, allow server compromise, or take down critical systems.
XML security flaws have a long history in the OWASP Top 10: XXE was its own category (A4) in the 2017 edition and was folded into A05 (Security Misconfiguration) in 2021. They consistently rank among the most impactful vulnerabilities in enterprise applications, and XXE alone has been used to breach financial institutions, healthcare providers, and government systems. Understanding and mitigating these risks is not optional for any developer working with XML.
Many XML parsers are configured to process external entities by default, even in 2026. Simply upgrading your library version is not sufficient. You must explicitly configure your parser to disable dangerous features. See the Parser Hardening section for specific instructions.
XXE (XML External Entity) Attacks CRITICAL
Attack: XXE (XML External Entity Injection)
OWASP A05 / CVE-CRITICAL
XXE attacks exploit XML parsers that process external entity references defined in a DTD (Document Type Definition). An attacker embeds a malicious entity declaration in submitted XML, causing the server to resolve the reference and potentially:
- Read arbitrary local files (including /etc/passwd, credentials, private keys)
- Perform Server-Side Request Forgery (SSRF) to reach internal networks
- Cause denial of service, for example by pointing an entity at a resource that never finishes reading (such as /dev/random)
- Scan internal network ports and services
<?xml version="1.0" encoding="UTF-8"?>
<!-- Attacker submits this XML to your API -->
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<userQuery>
<username>&xxe;</username>
</userQuery>
<!-- If your parser resolves external entities, -->
<!-- it will substitute &xxe; with the contents -->
<!-- of /etc/passwd and return them in the response -->
Defense: Disable External Entity Processing
The most effective defense is to completely disable DTD processing. If DTDs are required for your use case, at minimum disable external entity resolution and external document type declarations.
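// Disable external entities in DocumentBuilderFactory (Java)
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();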
// Completely disable DTD processing (recommended)
dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
// If DTDs are needed, at minimum disable these:
dbf.setFeature("http://xml.org/sax/features/external-general-entities", false);
dbf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
dbf.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
dbf.setXIncludeAware(false);
dbf.setExpandEntityReferences(false);
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc = db.parse(inputStream);
# Python: Use the defusedxml library (pip install defusedxml)
# It provides hardened drop-in replacements for the standard XML parsers
import defusedxml.ElementTree as ET
# By default defusedxml raises EntitiesForbidden or ExternalReferenceForbidden
# for entity declarations and external references; pass forbid_dtd=True
# to also reject any DOCTYPE (DTDForbidden)
tree = ET.parse('data.xml', forbid_dtd=True) # Rejects DTDs, entities, external references
# Or if using lxml:
from lxml import etree
parser = etree.XMLParser(
resolve_entities=False,
no_network=True,
load_dtd=False
)
tree = etree.parse('data.xml', parser)
// Node.js: Use fast-xml-parser with secure settings
const { XMLParser } = require("fast-xml-parser");
const parser = new XMLParser({
processEntities: false, // Disable entity processing
htmlEntities: false, // Disable HTML entities
ignoreDeclaration: true, // Ignore XML declarations
allowBooleanAttributes: false,
});
const result = parser.parse(xmlString);
// Alternatively, for SAX parsing use saxes or sax-js:
// Both do NOT support external entity resolution by default
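Where a hardened library is not available, the "disable DTD processing" defense described above can be approximated with the Python standard library alone. The sketch below is a minimal illustration, not a replacement for defusedxml: `parse_untrusted` and `DoctypeForbidden` are hypothetical names, and the approach assumes the whole document fits in memory because it is scanned once with expat and then parsed a second time.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat


class DoctypeForbidden(ValueError):
    """Raised when untrusted XML contains a DOCTYPE declaration (hypothetical name)."""


def _reject_doctype(name, system_id, public_id, has_internal_subset):
    # expat calls this as soon as it reaches "<!DOCTYPE ...";
    # raising here aborts the parse before any entity is declared or resolved.
    raise DoctypeForbidden(f"DOCTYPE {name!r} is not allowed in untrusted XML")


def parse_untrusted(xml_bytes: bytes) -> ET.Element:
    """Parse XML only if it declares no DTD (illustrative helper)."""
    scanner = expat.ParserCreate()
    scanner.StartDoctypeDeclHandler = _reject_doctype
    scanner.Parse(xml_bytes, True)   # first pass: raises on DOCTYPE or malformed XML
    return ET.fromstring(xml_bytes)  # second pass: build the element tree
```

Calling `parse_untrusted` on the XXE payload shown earlier raises `DoctypeForbidden` before the `xxe` entity is ever declared, while ordinary documents without a DOCTYPE parse as usual.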
XML Injection Attacks HIGH
XML injection occurs when an attacker injects malicious XML markup into data that is used to construct an XML document. Unlike SQL injection, the goal is to alter the XML structure rather than query a database, but the impact can be equally severe.
Consider a login system that constructs an XML query from user input:
// BAD: Directly interpolating user input into XML
const xmlQuery = `
<user>
<username>${userInput}</username>
<role>standard</role>
</user>
`;
// If attacker inputs: alice</username><role>admin</role><username>alice
// The resulting XML becomes:
// <user>
// <username>alice</username>
// <role>admin</role>
// <username>alice</username>
// <role>standard</role>
// </user>
// The document is still well-formed, and a parser that reads the first <role> sees: admin!
// GOOD: Use DOM APIs to build XML, never string concatenation
const doc = document.implementation.createDocument(null, "user");
const root = doc.documentElement;
const usernameEl = doc.createElement("username");
usernameEl.textContent = userInput; // DOM API escapes automatically
root.appendChild(usernameEl);
const roleEl = doc.createElement("role");
roleEl.textContent = "standard"; // Hard-coded, not from user input
root.appendChild(roleEl);
// If you MUST use string concatenation, escape ALL special characters:
function escapeXml(str) {
return str.replace(/&/g, '&amp;')
.replace(/</g, '&lt;')
.replace(/>/g, '&gt;')
.replace(/"/g, '&quot;')
.replace(/'/g, '&apos;');
}
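The same principle can be sketched in Python with the standard library's ElementTree, which escapes text content when it serializes the tree. `build_user_xml` is a hypothetical helper used only for illustration; the element names mirror the login example above.

```python
import xml.etree.ElementTree as ET


def build_user_xml(username: str) -> str:
    """Build the <user> document from an API instead of string interpolation."""
    user = ET.Element("user")
    ET.SubElement(user, "username").text = username   # stored as text, never parsed as markup
    ET.SubElement(user, "role").text = "standard"      # hard-coded, not from user input
    return ET.tostring(user, encoding="unicode")


# The injection attempt stays inside <username> as escaped text.
payload = "alice</username><role>admin</role><username>alice"
xml_out = build_user_xml(payload)
parsed = ET.fromstring(xml_out)
roles = [role.text for role in parsed.findall("role")]
```

Because the markup characters in `payload` are serialized as `&lt;` and `&gt;`, the re-parsed document contains a single `<role>` element and the username round-trips unchanged.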
Billion Laughs: XML DoS Attacks HIGH
The Billion Laughs attack (also called an XML bomb) is a denial-of-service attack using deeply nested entity definitions. A tiny XML file can cause exponential memory expansion when the parser resolves all entities:
<?xml version="1.0"?>
<!DOCTYPE lol [
<!ENTITY lol "lol">
<!ENTITY lol1 "&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;">
<!ENTITY lol2 "&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;">
<!ENTITY lol3 "&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;">
<!ENTITY lol4 "&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;">
<!ENTITY lol5 "&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;">
<!ENTITY lol6 "&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;">
<!ENTITY lol7 "&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;">
<!ENTITY lol8 "&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;">
<!ENTITY lol9 "&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;">
]>
<lolz>&lol9;</lolz>
<!-- Nine levels of tenfold expansion: 10^9 "lol" strings, roughly 3 GB in memory -->
Disabling DTD processing (the same defense used against XXE) also prevents Billion Laughs attacks. Additionally, modern parsers like libxml2 have built-in entity expansion limits. Configure your parser with a maximum entity expansion depth and size.
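The exponential growth is easy to check with a little arithmetic. The sketch below assumes the classic form of the attack (nine entity levels, each referencing the previous level ten times, with a three-byte base string); `expansion_size` is a hypothetical helper, not part of any parser API.

```python
def expansion_size(levels, fanout, payload):
    """Return (copies, bytes) produced by fully expanding the top-level entity."""
    copies = fanout ** levels                      # each level multiplies the count by `fanout`
    return copies, copies * len(payload.encode("utf-8"))


copies, size_bytes = expansion_size(levels=9, fanout=10, payload="lol")
size_gb = size_bytes / 1e9
```

With these assumptions `copies` is 10^9 and `size_bytes` is 3 × 10^9, which is where the "roughly 3 GB" figure comes from. Removing a single level cuts the result by a factor of ten, which is why expansion limits on both depth and total size are effective.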
XPath Injection MEDIUM
XPath injection is analogous to SQL injection, but targets XML databases and XPath queries. If user input is directly interpolated into an XPath expression, an attacker can manipulate the query to bypass authentication or extract unauthorized data.
// BAD: User input directly in XPath
const xpath = `//user[username='${username}' and password='${password}']`;
// If attacker enters username: alice and password: ' or '1'='1
// Query becomes: //user[username='alice' and password='' or '1'='1']
// "and" binds tighter than "or", so the predicate is true for every <user>: authentication bypassed!
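// GOOD: Bind user input as XPath variables (supported by most XPath APIs)
// Example in Java with Saxon (s9api):
XPathCompiler compiler = processor.newXPathCompiler();
// Variables must be declared before the expression is compiled
compiler.declareVariable(new QName("user"));
compiler.declareVariable(new QName("pass"));
XPathExecutable exec = compiler.compile(
"//user[username=$user and password=$pass]"
);
XPathSelector selector = exec.load();
selector.setContextItem(doc); // doc: the XdmNode being queried (assumed to exist)
selector.setVariable(new QName("user"), new XdmAtomicValue(username));
selector.setVariable(new QName("pass"), new XdmAtomicValue(password));
XdmValue result = selector.evaluate();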
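Python's standard-library ElementTree has no variable binding for its XPath subset, so the sketch below uses a different technique with the same goal: the query path stays constant and user input is only ever compared as data in Python code. `find_user` is a hypothetical helper, and the plaintext `password` element simply mirrors the example above; it is not a recommendation for storing credentials. With lxml, the closer equivalent is passing variables as keyword arguments to `xpath()`.

```python
import hmac
import xml.etree.ElementTree as ET


def _same(a: str, b: str) -> bool:
    # Constant-time comparison on bytes; the input never becomes part of a query.
    return hmac.compare_digest(a.encode("utf-8"), b.encode("utf-8"))


def find_user(root: ET.Element, username: str, password: str):
    """Return the matching <user> element, or None if no user matches."""
    for user in root.iter("user"):
        name_ok = _same(user.findtext("username", default=""), username)
        pass_ok = _same(user.findtext("password", default=""), password)
        if name_ok and pass_ok:
            return user
    return None
```

Because quotes and operators in the input are never interpreted as XPath, the `' or '1'='1` payload is just a password that matches nobody.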
XML Encryption (XMLEnc) PROTECT
XML Encryption (W3C XMLEnc) allows you to encrypt specific elements or the entire content of an XML document, ensuring confidentiality for sensitive data during storage or transmission. Unlike transport-level encryption (TLS/HTTPS), XML Encryption provides element-level protection that persists even after the document leaves its transport channel.
The encrypted data is represented using the <xenc:EncryptedData> element:
<xenc:EncryptedData xmlns:xenc="http://www.w3.org/2001/04/xmlenc#"
Type="http://www.w3.org/2001/04/xmlenc#Element">
<xenc:EncryptionMethod
Algorithm="http://www.w3.org/2001/04/xmlenc#aes256-cbc"/>
<ds:KeyInfo xmlns:ds="http://www.w3.org/2000/09/xmldsig#">
<xenc:EncryptedKey>
<xenc:EncryptionMethod
Algorithm="http://www.w3.org/2001/04/xmlenc#rsa-oaep-mgf1p"/>
<xenc:CipherData>
<xenc:CipherValue>Gj7MU1kI3Nz...==</xenc:CipherValue>
</xenc:CipherData>
</xenc:EncryptedKey>
</ds:KeyInfo>
<xenc:CipherData>
<xenc:CipherValue>RmxhZ3NhcmU...==</xenc:CipherValue>
</xenc:CipherData>
</xenc:EncryptedData>
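XMLEnc is commonly used in SOAP/WS-Security for protecting sensitive message parts (e.g., credit card numbers in a payment body) without encrypting the entire SOAP envelope. The example above uses AES-256 in CBC mode; where your toolkit supports it, prefer an authenticated mode such as AES-GCM (added in XML Encryption 1.1), because CBC-mode XML Encryption is known to be open to padding-oracle style attacks.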
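When reviewing or receiving encrypted documents, it can help to read out which algorithms an `EncryptedData` element actually declares before handing it to a crypto library. The sketch below only inspects the structure shown above with the standard library; it performs no decryption, and `encryption_algorithms` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

NS = {
    "xenc": "http://www.w3.org/2001/04/xmlenc#",
    "ds": "http://www.w3.org/2000/09/xmldsig#",
}


def encryption_algorithms(encrypted_data: ET.Element) -> dict:
    """Return the declared content and key-transport algorithm URIs (or None)."""
    # Direct child: the algorithm used to encrypt the content itself.
    data_method = encrypted_data.find("xenc:EncryptionMethod", NS)
    # Nested under KeyInfo/EncryptedKey: the algorithm that wraps the content key.
    key_method = encrypted_data.find(
        "ds:KeyInfo/xenc:EncryptedKey/xenc:EncryptionMethod", NS
    )
    return {
        "data": data_method.get("Algorithm") if data_method is not None else None,
        "key": key_method.get("Algorithm") if key_method is not None else None,
    }
```

A caller could compare the returned URIs against an allow-list chosen by its own policy and reject documents that declare anything else.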
XML Digital Signatures (XMLDSig) PROTECT
XML Signature (W3C XMLDSig) allows you to digitally sign XML documents or specific elements within them. It serves two purposes: authentication (verifying the signer's identity) and integrity (ensuring the signed content hasn't been modified).
XMLDSig is the backbone of SAML-based Single Sign-On, where identity providers sign assertions about authenticated users, and service providers verify those signatures before granting access.
<Signature xmlns="http://www.w3.org/2000/09/xmldsig#">
<SignedInfo>
<CanonicalizationMethod
Algorithm="http://www.w3.org/TR/2001/REC-xml-c14n-20010315"/>
<SignatureMethod
Algorithm="http://www.w3.org/2001/04/xmldsig-more#rsa-sha256"/>
<Reference URI="#order-12345">
<DigestMethod
Algorithm="http://www.w3.org/2001/04/xmlenc#sha256"/>
<DigestValue>8Vfho3L7Zx...==</DigestValue>
</Reference>
</SignedInfo>
<SignatureValue>KwIXh0oQ2...==</SignatureValue>
<KeyInfo>
<X509Data>
<X509Certificate>MIIBvTCC...</X509Certificate>
</X509Data>
</KeyInfo>
</Signature>
A common XMLDSig vulnerability is signature wrapping: an attacker keeps the signature valid but moves the signed element elsewhere in the document and injects a malicious replacement in the position the application expects. Always reference elements by ID and verify that the signed element is the one your application processes.
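The "verify that the signed element is the one your application processes" step can be sketched as follows. This is only the structural half of the defense: it does not verify the cryptographic signature, which should be done by a dedicated XMLDSig library. The function names and the `Id` attribute name are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

DS = {"ds": "http://www.w3.org/2000/09/xmldsig#"}


def signed_element(root: ET.Element, signature: ET.Element, id_attr: str = "Id") -> ET.Element:
    """Resolve the same-document Reference URI to exactly one element."""
    ref = signature.find("ds:SignedInfo/ds:Reference", DS)
    uri = ref.get("URI", "") if ref is not None else ""
    if not uri.startswith("#") or len(uri) == 1:
        raise ValueError("expected a same-document reference such as '#some-id'")
    matches = [el for el in root.iter() if el.get(id_attr) == uri[1:]]
    if len(matches) != 1:
        # Zero or duplicate IDs are a typical sign of a wrapping attempt.
        raise ValueError(f"reference {uri!r} matched {len(matches)} elements")
    return matches[0]


def is_covered_by_signature(root, signature, processed, id_attr="Id") -> bool:
    """True only if the element the application will process is the signed one."""
    return signed_element(root, signature, id_attr) is processed
```

The application would locate the element it intends to act on (for example, the first `Order` child), then proceed only if `is_covered_by_signature` returns True and the signature itself has been validated.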
Parser Hardening by Language
Default XML parser configurations are often insecure. Here's a concise hardening reference for the most common languages and frameworks:
// .NET: Use XmlReaderSettings with restrictions
XmlReaderSettings settings = new XmlReaderSettings();
settings.DtdProcessing = DtdProcessing.Prohibit; // Disallow DTDs entirely
settings.MaxCharactersFromEntities = 1024; // Cap entity expansion (0 would mean "no limit")
settings.XmlResolver = null; // Disable external resolution
using (XmlReader reader = XmlReader.Create(inputStream, settings)) {
// Safe parsing
while (reader.Read()) { /* ... */ }
}
<?php
// Disable entity loading before parsing (needed on PHP < 8.0 only; deprecated in 8.0)
if (PHP_VERSION_ID < 80000) {
libxml_disable_entity_loader(true);
}
// PHP 8.0+: external entity loading is disabled by default
// Use LIBXML_NOENT flag ONLY if you need entity substitution for known-safe docs
$dom = new DOMDocument();
// Do NOT use LIBXML_NOENT for untrusted input
$dom->loadXML($xmlString, LIBXML_NONET | LIBXML_NOERROR);
?>
XML Security Checklist
Use this checklist as a code review reference and security audit guide for any system that processes XML from external sources:
- Disable DTD/entity processing in your XML parser. This single measure prevents XXE, Billion Laughs, and SSRF via XML in one step.
- Never build XML via string concatenation with user-supplied data. Always use DOM APIs or a serialization library to construct XML programmatically.
- Validate all XML against a strict schema (XSD or RELAX NG) before processing. Use our XML Validator during development.
- Use parameterized XPath queries for any XPath expressions that include user-supplied values.
- Limit XML document size at the application layer. Reject documents above a configured size threshold before parsing.
- Set entity expansion limits if DTD support is required. Limit nesting depth and total expansion size.
- Verify that XML Digital Signatures cover the elements your application processes, not just any element in the document (signature wrapping defense).
- Use XMLEnc for sensitive element-level data in SOAP messages or documents that persist beyond their transport channel.
- Log and monitor XML parsing errors. Unusual parser errors may indicate attempted injection or fuzzing.
- Keep XML libraries updated. XXE vulnerabilities are frequently patched, and outdated parsers are a common root cause of real-world breaches.
- Use canonicalization (C14N) before signing or hashing XML to ensure consistent serialization across systems.
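The size-limit item in the checklist above can be sketched in a few lines. The 1 MB threshold and the `read_limited` name are assumptions for illustration; the right limit depends on the documents your system legitimately receives.

```python
import io

MAX_XML_BYTES = 1_000_000  # assumed threshold; tune for your application


def read_limited(stream, limit: int = MAX_XML_BYTES) -> bytes:
    """Read at most `limit` bytes from a binary stream, rejecting anything larger."""
    data = stream.read(limit + 1)  # one extra byte reveals an oversized body
    if len(data) > limit:
        raise ValueError(f"XML document exceeds {limit} bytes")
    return data
```

Reading `limit + 1` bytes means an oversized upload is rejected without buffering the whole body, and the returned bytes can then be passed to a hardened parser.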
Frequently Asked Questions
In an XXE attack, the attacker declares an external entity that points to a local file (such as /etc/passwd) or an internal network service. When the parser resolves the entity reference, it reads that file and the contents appear in the parser's output, which the attacker can then extract. The fix is to disable DTD processing in your XML parser configuration.

To prevent XML injection, use DOM APIs (createElement and textContent) to construct XML programmatically; the DOM will escape all special characters automatically. If you must use string building, escape all five XML special characters: & → &amp;, < → &lt;, > → &gt;, " → &quot;, ' → &apos;. Finally, validate the resulting XML against a strict schema before processing.

Python's standard library XML modules (xml.etree.ElementTree, xml.dom.minidom, etc.) can be vulnerable to XXE and Billion Laughs attacks when processing untrusted XML, depending on the Python and Expat versions in use. The Python documentation explicitly warns about this. The recommended solution is to use the defusedxml library (pip install defusedxml), which provides drop-in replacements for the standard library XML modules with dangerous features disabled by default. For new projects, lxml with explicit security settings is also a good choice.