Overview
The Unicode Encoder converts text into six Unicode encoding and escape formats: JavaScript escape sequences (\u), HTML decimal entities (&#), HTML hex entities (&#x), UTF-8 hex bytes, Unicode code points (U+), and URL percent-encoding. A character breakdown table shows the details of each character in the input individually.
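To make the six formats concrete, here is a minimal Python sketch of how each one can be derived from the same input. It is not the tool's own code: the function name encode_all and the dictionary keys are hypothetical. It also assumes every character is escaped (including plain ASCII), hex digits are uppercase, and URL percent-encoding leaves only unreserved characters (letters, digits, and _ . - ~) unescaped. The actual tool may differ on each of these points.

```python
from urllib.parse import quote


def js_escape(ch: str) -> str:
    """Return the JavaScript \\u escape for one character.

    Code points above U+FFFF are written as a UTF-16 surrogate pair,
    which is how classic \\uXXXX escapes represent them.
    """
    cp = ord(ch)
    if cp <= 0xFFFF:
        return f"\\u{cp:04X}"
    cp -= 0x10000
    high = 0xD800 + (cp >> 10)    # top 10 bits
    low = 0xDC00 + (cp & 0x3FF)   # bottom 10 bits
    return f"\\u{high:04X}\\u{low:04X}"


def encode_all(text: str) -> dict:
    """Encode text in six formats (hypothetical helper for illustration)."""
    return {
        "js": "".join(js_escape(c) for c in text),
        "html_dec": "".join(f"&#{ord(c)};" for c in text),
        "html_hex": "".join(f"&#x{ord(c):X};" for c in text),
        "utf8_hex": " ".join(f"{b:02X}" for b in text.encode("utf-8")),
        "code_points": " ".join(f"U+{ord(c):04X}" for c in text),
        # safe="" percent-encodes "/" as well; unreserved characters stay as-is
        "url": quote(text, safe=""),
    }
```

Under these assumptions, encode_all("é") yields \u00E9, &#233;, &#xE9;, C3 A9, U+00E9, and %C3%A9 for the six formats respectively.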
How to Use
Type or paste text in the input field, then select your desired encoding format from the six tabs. The encoded result appears below, ready to copy. The character breakdown table at the bottom lists each character’s glyph, code point, name, and decimal value. Non-ASCII characters like emoji, Chinese characters, and accented letters are encoded differently by each format.
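The character breakdown described above can be approximated with Python's standard unicodedata module. This is a rough sketch only: the function name breakdown and the row keys are hypothetical, and it assumes one row per code point, so an emoji built from several code points would appear as several rows. The tool itself may group characters differently.

```python
import unicodedata


def breakdown(text: str) -> list:
    """Return one row per code point: glyph, code point, name, decimal value."""
    rows = []
    for ch in text:
        cp = ord(ch)
        rows.append({
            "glyph": ch,
            "code_point": f"U+{cp:04X}",
            # Some code points (e.g. control characters) have no name,
            # so fall back to a placeholder instead of raising ValueError.
            "name": unicodedata.name(ch, "<unnamed>"),
            "decimal": cp,
        })
    return rows


if __name__ == "__main__":
    for row in breakdown("Aé"):
        print(row["glyph"], row["code_point"], row["name"], row["decimal"])
```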
Background & Context
Unicode is an international standard that aims to assign a unique code point to every character in every human writing system; as of Unicode 15 it defines over 149,000 characters covering 161 scripts. Unicode version 1.0 was released in 1991, and emoji were added in Unicode 6.0 in 2010. UTF-8 is the dominant encoding on the web, used by over 98% of all websites, in part because it is backward-compatible with ASCII.
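The ASCII compatibility of UTF-8 can be checked directly. The short Python sketch below (the helper name utf8_length is hypothetical) shows that ASCII text produces identical bytes under both encodings, while other characters take two to four bytes in UTF-8.

```python
def utf8_length(ch: str) -> int:
    """Return how many bytes UTF-8 uses for the given character."""
    return len(ch.encode("utf-8"))


# ASCII text encodes to exactly the same bytes in ASCII and in UTF-8.
assert "Hello".encode("utf-8") == "Hello".encode("ascii")

# Characters outside ASCII need more than one byte.
assert utf8_length("A") == 1    # U+0041, ASCII
assert utf8_length("é") == 2    # U+00E9
assert utf8_length("中") == 3   # U+4E2D
assert utf8_length("😀") == 4   # U+1F600
```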




