HTML Entity Encoder & Decoder: The Complete Guide
What are HTML Entities?
HTML entities are special character codes that represent characters reserved by HTML or characters that are not easily typed on a keyboard. The browser renders these entities as their corresponding characters, allowing you to display symbols like <, >, &, and " without breaking the HTML parsing.
HTML entities always begin with an ampersand (&) and end with a semicolon (;). They come in two varieties: named entities like &amp; for & and numeric entities like &#60; for <. Our HTML Entity Encoder & Decoder handles both formats, making it easy to switch between raw text and entity-encoded representations.
Encoding vs Decoding
Encoding
Encoding converts special characters into their HTML entity equivalents. In raw text, the < character would be interpreted as the start of an HTML tag. Encoding transforms it to &lt; so it displays safely as text:
Raw text: 5 < 10 && 3 > 1
Encoded: 5 &lt; 10 &amp;&amp; 3 &gt; 1
Browser view: 5 < 10 && 3 > 1
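The same transformation can be sketched in a few lines of Python using the standard library's html.escape. This is an illustration of the technique, not a description of how the tool is implemented; the function name encode_html is made up for this example.

```python
import html

def encode_html(text: str) -> str:
    """Replace &, <, >, and both quote characters with entity references."""
    # html.escape handles the ampersand first, so the entities it
    # produces are not themselves re-encoded within a single call.
    return html.escape(text)

print(encode_html("5 < 10 && 3 > 1"))  # 5 &lt; 10 &amp;&amp; 3 &gt; 1
```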
Decoding
Decoding reverses the process — it converts HTML entity codes back into their actual characters. When you fetch HTML content from an API or scrape a web page, the text often contains encoded entities. Decoding restores them for human readability or further processing:
Encoded HTML: The &lt;title&gt; tag defines the document title.
Decoded: The <title> tag defines the document title.
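For scripted decoding, Python's html.unescape does the reverse conversion. The snippet below is a minimal sketch; decode_html is a hypothetical wrapper name used only here.

```python
import html

def decode_html(text: str) -> str:
    """Convert named, decimal, and hexadecimal references back to characters."""
    return html.unescape(text)

print(decode_html("The &lt;title&gt; tag defines the document title."))
# The <title> tag defines the document title.

# All three reference styles decode to the same character.
print(decode_html("&copy; &#169; &#xA9;"))  # © © ©
```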
Tip: Use the HTML Entity Encoder & Decoder to toggle between encoded and decoded forms instantly. Paste your text, click "Encode" or "Decode", and copy the result.
Named vs Numeric Entities
HTML supports two encoding schemes for special characters:
| Type | Format | Example | Result |
|---|---|---|---|
| Named entity | &name; | &copy; | © |
| Decimal numeric | &#DDDD; | &#169; | © |
| Hexadecimal numeric | &#xHHHH; | &#xA9; | © |
Named Entities
Named entities are easier to read and remember. HTML5 defines over 2,000 named character references, including common ones like &lt;, &gt;, &amp;, &quot;, and &apos;. The tool uses named entities by default for the five core HTML special characters.
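If you want to look names up programmatically, Python's standard library ships the HTML5 table as html.entities.html5. The sketch below only illustrates such a lookup; the helper name lookup_named is an assumption made for this example.

```python
from html.entities import html5

def lookup_named(name: str) -> str:
    """Return the character(s) for a named reference such as 'copy'."""
    # Keys in the html5 table include the trailing semicolon, e.g. 'copy;'.
    return html5[name + ";"]

print(lookup_named("lt"))    # <
print(lookup_named("copy"))  # ©

# Count the names in the table that end with a semicolon.
print(sum(1 for key in html5 if key.endswith(";")))
```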
Numeric Entities
Numeric entities use Unicode code points and cover every character in the Unicode standard. Use decimal form (&#169;) or hexadecimal (&#xA9;). Numeric entities are especially useful for characters that have no named entity equivalent, such as ☯ (&#9775;), and they work equally well for characters that do have one, such as ✓ (&#10003;).
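Both numeric forms come straight from the character's code point, which a short Python sketch makes concrete. The function name numeric_entities is hypothetical.

```python
from typing import Tuple

def numeric_entities(char: str) -> Tuple[str, str]:
    """Return the decimal and hexadecimal references for a single character."""
    code_point = ord(char)
    return f"&#{code_point};", f"&#x{code_point:X};"

print(numeric_entities("©"))  # ('&#169;', '&#xA9;')
print(numeric_entities("☯"))  # ('&#9775;', '&#x262F;')
```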
Encode All Non-ASCII Characters
The encoder provides an option to encode all non-ASCII characters. This is useful when you need to ensure your HTML is valid in legacy encoding contexts or when delivering text through systems that may not handle UTF-8 correctly:
Input: Café résumé (with encoding option ON)
Output: Caf&#233; r&#233;sum&#233;
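One way to approximate this option in Python is the xmlcharrefreplace error handler, which swaps every character outside ASCII for a decimal reference. This is a sketch under the assumption that numeric output is acceptable; the tool's own output format may differ, and encode_non_ascii is a made-up name.

```python
import html

def encode_non_ascii(text: str) -> str:
    """Escape the core HTML characters, then reference every non-ASCII one."""
    escaped = html.escape(text)
    # Characters that cannot be encoded as ASCII become &#NNN; references.
    return escaped.encode("ascii", "xmlcharrefreplace").decode("ascii")

print(encode_non_ascii("Café résumé"))  # Caf&#233; r&#233;sum&#233;
```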
XSS Prevention with Encoding
Cross-Site Scripting (XSS) remains one of the most common web security vulnerabilities. The root cause is almost always the same: user-controlled data is rendered as HTML without proper encoding. HTML entity encoding is your first line of defense.
How Encoding Prevents XSS
When user input contains HTML tags or JavaScript code, encoding converts the dangerous characters into harmless entity references:
User input: <script>alert('hacked')</script>
Encoded: &lt;script&gt;alert(&#39;hacked&#39;)&lt;/script&gt;
Result: The browser displays the text safely instead of executing it.
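As a rough server-side illustration, the sketch below encodes user input at the moment it is placed into markup. The render_comment function and the comment class name are assumptions for this example. Python writes the apostrophe as the hexadecimal reference &#x27;, which is equivalent to &#39;.

```python
import html

def render_comment(user_input: str) -> str:
    """Wrap untrusted text in a paragraph, encoding it for the HTML body."""
    return f'<p class="comment">{html.escape(user_input)}</p>'

print(render_comment("<script>alert('hacked')</script>"))
# <p class="comment">&lt;script&gt;alert(&#x27;hacked&#x27;)&lt;/script&gt;</p>
```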
Context-Specific Encoding
HTML encoding is essential but not sufficient in every context. Follow these guidelines:
- HTML body context: Encode <, >, &, ", and '
- HTML attribute context: Encode ", &, ', and <
- URL context: Use URL encoding (percent-encoding), not HTML entity encoding
- JavaScript context: Use JavaScript string escaping, not HTML entity encoding
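To make the difference between contexts concrete, here is a small Python sketch that encodes one value three ways with standard-library functions. The helper names are hypothetical. Note that JSON escaping alone is an approximation of JavaScript string escaping: it does not neutralize a literal </script> inside an inline script block, so treat that line as a starting point only.

```python
import html
import json
from urllib.parse import quote

def for_html(value: str) -> str:
    """HTML body or quoted attribute value: entity-encode."""
    return html.escape(value, quote=True)

def for_url_component(value: str) -> str:
    """URL query or path component: percent-encode everything reserved."""
    return quote(value, safe="")

def for_js_string(value: str) -> str:
    """JavaScript string literal (approximated with JSON escaping)."""
    return json.dumps(value)

value = 'Tom & "Jerry" <cat>'
print(for_html(value))           # Tom &amp; &quot;Jerry&quot; &lt;cat&gt;
print(for_url_component(value))  # Tom%20%26%20%22Jerry%22%20%3Ccat%3E
print(for_js_string(value))      # "Tom & \"Jerry\" <cat>"
```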
Tip: Always encode user-generated content before rendering it on a web page. Modern frameworks like React and Vue do this automatically, but when rendering raw HTML or building templates server-side, you must handle encoding explicitly.
Common HTML Entities Reference
Here is a quick reference of the most frequently used HTML entities:
| Character | Named Entity | Numeric Entity | Description |
|---|---|---|---|
| < | &lt; | &#60; | Less than |
| > | &gt; | &#62; | Greater than |
| & | &amp; | &#38; | Ampersand |
| " | &quot; | &#34; | Double quote |
| ' | &apos; | &#39; | Apostrophe / single quote |
| © | &copy; | &#169; | Copyright |
| ® | &reg; | &#174; | Registered trademark |
| € | &euro; | &#8364; | Euro sign |
| (non-breaking space) | &nbsp; | &#160; | Non-breaking space |
| — | &mdash; | &#8212; | Em dash |
Practical Examples
Displaying Code Snippets on a Blog
When writing a technical blog post, you need to display code that contains HTML tags. Without encoding, the browser interprets your example code as actual HTML. Use the encoder to safely escape code snippets before inserting them into your article:
Before: <div class="container">Hello</div>
After: &lt;div class=&quot;container&quot;&gt;Hello&lt;/div&gt;
Sanitizing User Comments
A comment system receives user input that may contain malicious HTML. Store the input as submitted and encode it every time you render it into a page, so it always appears as plain text. This prevents stored XSS attacks and keeps your users safe.
Email Content Rendering
HTML emails have notoriously inconsistent rendering across clients. Encoding special and non-ASCII characters as entities helps your content display as intended in Outlook, Gmail, and Apple Mail, even when a client or mail relay mishandles the message's character set.
Scraped Data Cleanup
Web scraping often yields HTML-encoded text like &lt;b&gt;hello&lt;/b&gt;. Decoding this restores the actual characters, making the scraped data usable for analysis, display, or further processing.
Best Practices
- Encode at the point of output: Store raw text in your database and encode it only when rendering HTML. This preserves the original data and gives you flexibility when choosing output formats.
- Never double-encode: If text is already encoded, encoding it again produces double entities like &amp;lt;. Always decode first if you are unsure of the encoding state.
- Use the correct encoding for the context: HTML entity encoding is for HTML contexts. Use URL encoding for URLs, JSON escaping for JSON strings, and database parameterization for SQL queries.
- Prefer named entities for readability: Use &copy; instead of &#169; when a named entity exists, because it is more readable when inspecting the raw HTML source.
- Always encode the five core characters: The minimum set for safe HTML output is <, >, &, ", and '. For defense-in-depth, consider encoding all non-ASCII characters as well.
- Test with both directions: After encoding, verify that decoding reproduces the original text exactly. A round-trip test catches edge cases like invalid entity names or missing semicolons.
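The double-encoding and round-trip points can be checked with a short Python sketch. It relies only on html.escape and html.unescape; the variable names are illustrative.

```python
import html

original = "<b>Tom & Jerry</b>"

once = html.escape(original)
twice = html.escape(once)  # encoding already-encoded text

print(once)   # &lt;b&gt;Tom &amp; Jerry&lt;/b&gt;
print(twice)  # &amp;lt;b&amp;gt;Tom &amp;amp; Jerry&amp;lt;/b&amp;gt;

# Round trip: one decode undoes exactly one encode.
assert html.unescape(once) == original
# Double-encoded text needs two decodes to get back to the original.
assert html.unescape(twice) == once
```

A browser would show the double-encoded string as literal entity text such as &lt;b&gt;, which is the usual symptom that something was encoded twice.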