JavaScript: Escaping Special Characters

Every programming language has its special characters - characters that mean something special such as identifying a variable, the end of a line or a break in some data. JavaScript is no different, so it provides a number of functions that encode and decode special characters.

If you're interacting between PHP and JavaScript you will also need to be familiar with the PHP functions for encoding and decoding special characters, which is why we've created this special tool for testing and comparing different functions.

Encoding and decoding using JavaScript and PHP

The form below lets you see the output of various functions that are used to encode special characters when they appear in plain text or URL parameters (following the '?' in a URL). This page calls the PHP functions directly using Ajax rather than a JavaScript emulation. If you have a string to decode, use the buttons on the right instead.

INPUT:
The function you select will be applied to the INPUT text and not to already encoded (or decoded) text below.

  • JavaScript 1.0 - 1.4: escape
  • JavaScript 1.5+: encodeURI
  • JavaScript 1.5+: encodeURIComponent
  • PHP: urlencode
  • PHP: rawurlencode
  • PHP: htmlentities
  • PHP: addslashes
  • PHP: utf8_encode
  • PHP: json_encode

These functions perform replacements on certain characters as shown in the table further down the page and described briefly here:

  • The JavaScript escape function replaces most punctuation symbols with the equivalent hex-codes, but was found to be inadequate when it came to Unicode character encoding and has been superseded by the encodeURI function.
  • The encodeURIComponent function is an extension of encodeURI, the difference being that it also escapes the following characters: ; , / ? : @ & = + $ #
  • On the PHP side of things, the main difference between urlencode and rawurlencode is that the latter escapes the <space> character as %20 whereas urlencode uses the widely accepted + instead.
  • The htmlentities function escapes characters which have special meaning inside HTML by inserting HTML entities in their place (e.g. &amp; in place of &). See our article on ASCII Character Codes for more details.
  • All functions have a complementary 'decode' function that pretty much does the opposite.
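As a rough illustration of the differences described above, the following sketch runs the three JavaScript functions over one sample string. The sample value and variable names are made up for this example, and escape is included only for comparison since it is deprecated.

```javascript
// Sample input (an assumption for this example): a query-string value
// containing reserved characters, a space and a non-ASCII bullet (U+2022).
const sample = "a&b=c/d?e \u2022";

// escape() is deprecated; it uses the non-standard %uXXXX form for
// characters above U+00FF.
const escaped = escape(sample);

// encodeURI() leaves URI delimiters such as & = / ? untouched.
const uri = encodeURI(sample);

// encodeURIComponent() also escapes those delimiters, which makes it the
// usual choice for a single parameter value.
const component = encodeURIComponent(sample);

console.log(escaped);   // a%26b%3Dc/d%3Fe%20%u2022
console.log(uri);       // a&b=c/d?e%20%E2%80%A2
console.log(component); // a%26b%3Dc%2Fd%3Fe%20%E2%80%A2

// The matching decoder restores the original string.
console.log(decodeURIComponent(component) === sample); // true
```

Note that only encodeURIComponent produces a value that can be dropped into a query string without the & or = being mistaken for parameter separators.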

Escaping Double and Single Quotes

Another essential PHP function that comes in handy when passing data to JavaScript is addslashes, which adds a backslash before backslashes, single quotes, double quotes and NUL bytes.

For example, to echo a PHP variable into inline JavaScript code:

... onclick="return confirm('Delete this item: <?PHP echo addslashes(htmlspecialchars($name)); ?>?');" ...

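Please note that as of PHP 8.1.0 the default flags of htmlspecialchars changed from ENT_COMPAT to ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML401, so single quotes are now converted to &#039; as well. The above code relies on the old behaviour and will need patching, for example by passing ENT_COMPAT explicitly as the second argument.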

In the HTML we use double-quotes and in the JavaScript single-quotes, so any quotes within the JavaScript code will need to be escaped so that they don't conflict with either the HTML or JavaScript quotes.
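The layering of the two kinds of quotes can be sketched in JavaScript as well. The helpers addSlashes and escapeHtml below are hypothetical, simplified stand-ins for PHP's addslashes and htmlspecialchars (with the pre-8.1 default flags), written only to show which characters each layer takes care of; they are not a replacement for the PHP functions.

```javascript
// Hypothetical helper loosely mirroring PHP's addslashes(): prefixes
// backslashes, single quotes and double quotes with a backslash.
// (NUL bytes, which addslashes also handles, are ignored in this sketch.)
function addSlashes(text) {
  return text.replace(/[\\'"]/g, "\\$&");
}

// Hypothetical helper loosely mirroring htmlspecialchars() with ENT_COMPAT:
// converts & < > and double quotes, and leaves single quotes alone.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Sample value (an assumption) containing both kinds of quotes.
const name = 'O\'Reilly "Books"';

// Same order as the PHP example: HTML-escape first, then add slashes.
const safe = addSlashes(escapeHtml(name));

const attribute =
  "onclick=\"return confirm('Delete this item: " + safe + "?');\"";
console.log(attribute);
```

Here the double quotes become &quot; so they cannot end the HTML attribute, and the single quote becomes \' so it cannot end the JavaScript string.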

For more details on escaping PHP variables for use in JavaScript see our related article: Passing PHP variables to JavaScript.

Table of encoded characters

Here you can see how the various JavaScript and PHP functions apply to a range of common characters.

Input    JavaScript                              PHP
         escape   encodeURI  encodeURIComponent  urlencode  rawurlencode  htmlentities
<space>  %20      %20        %20                 +          %20           <space>
!        %21      !          !                   %21        %21           !
@        @        @          %40                 %40        %40           @
#        %23      #          %23                 %23        %23           #
$        %24      $          %24                 %24        %24           $
%        %25      %25        %25                 %25        %25           %
^        %5E      %5E        %5E                 %5E        %5E           ^
&        %26      &          %26                 %26        %26           &amp;
*        *        *          *                   %2A        %2A           *
(        %28      (          (                   %28        %28           (
)        %29      )          )                   %29        %29           )
-        -        -          -                   -          -             -
_        _        _          _                   _          _             _
=        %3D      =          %3D                 %3D        %3D           =
+        +        +          %2B                 %2B        %2B           +
:        %3A      :          %3A                 %3A        %3A           :
;        %3B      ;          %3B                 %3B        %3B           ;
.        .        .          .                   .          .             .
"        %22      %22        %22                 %22        %22           &quot;
'        %27      '          '                   %27        %27           ' or &#039;
\        %5C      %5C        %5C                 %5C        %5C           \
/        /        /          %2F                 %2F        %2F           /
?        %3F      ?          %3F                 %3F        %3F           ?
<        %3C      %3C        %3C                 %3C        %3C           &lt;
>        %3E      %3E        %3E                 %3E        %3E           &gt;
~        %7E      ~          ~                   %7E        %7E *         ~
[        %5B      %5B        %5B                 %5B        %5B           [
]        %5D      %5D        %5D                 %5D        %5D           ]
{        %7B      %7B        %7B                 %7B        %7B           {
}        %7D      %7D        %7D                 %7D        %7D           }
`        %60      %60        %60                 %60        %60           `

* Since PHP 5.3.0, rawurlencode leaves ~ unescaped; earlier versions returned %7E.
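The JavaScript columns of the table can be reproduced with a few lines of code. This is a minimal sketch: the character list is just a small sample chosen for illustration, and the PHP columns are left out because they need a server-side script.

```javascript
// Characters to test (a sample; extend the list as needed).
const characters = [" ", "!", "@", "#", "&", "+", "/", "?", "~"];

// Build one row per character:
// input, escape, encodeURI, encodeURIComponent.
const rows = characters.map((ch) => [
  ch === " " ? "<space>" : ch,
  escape(ch), // deprecated, included for comparison only
  encodeURI(ch),
  encodeURIComponent(ch),
]);

// Print the rows as tab-separated text.
for (const row of rows) {
  console.log(row.join("\t"));
}
```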

The RFC 1738 specification makes fascinating reading, considering that the document was published in December 1994 and much of it is still applicable.

Catching exceptions

In some cases, when the input is invalid, the decodeURI and decodeURIComponent functions can throw an exception, which we can catch and identify using a try...catch statement as follows:

let input = "%E0%A4%A";
try {
  let output = decodeURI(input);
} catch(error) {
  if(error instanceof URIError) {
    console.log("Caught " + error.name + ": " + error.message);
  } else {
    throw error;
  }
}

This example will log a message such as "Caught URIError: URI malformed". The exact wording of the message varies between browsers.
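If you decode user-supplied strings in several places, the same pattern can be wrapped in a small helper. The function name safeDecode and its fallback parameter are hypothetical, chosen for this sketch only.

```javascript
// Hypothetical helper: decode a URI component, returning a fallback value
// instead of throwing when the input is not validly encoded.
function safeDecode(input, fallback = null) {
  try {
    return decodeURIComponent(input);
  } catch (error) {
    if (error instanceof URIError) {
      return fallback;
    }
    throw error; // anything else is unexpected, so pass it on
  }
}

console.log(safeDecode("%E0%A4%A"));  // null (truncated escape sequence)
console.log(safeDecode("100%"));      // null (lone percent sign)
console.log(safeDecode("100%25"));    // 100%
console.log(safeDecode("caf%C3%A9")); // café
```

A lone % fails because the decoder expects two hex digits after every percent sign, so a literal percent sign has to arrive as %25.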

If you want to know about any JavaScript errors which might be occurring on your website, check out our article Using XMLHttpRequest to log JavaScript errors explaining how to use Ajax to record caught errors to a text file.

Function utf8_encode() is deprecated

The utf8 encode/decode functions have been deprecated as of PHP 8.2. Instead you need to use the more powerful mb_convert_encoding function as follows:

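utf8_encode   mb_convert_encoding($input, 'UTF-8');
              mb_convert_encoding($input, 'UTF-8', 'ISO-8859-1');
utf8_decode   mb_convert_encoding($input, 'ISO-8859-1');
              mb_convert_encoding($input, 'ISO-8859-1', 'UTF-8');

With mb_convert_encoding the second parameter is the desired character encoding of the output. The third parameter is optional and specifies the character encoding of your input string; when it is omitted, the mbstring internal encoding (UTF-8 by default) is assumed. Because utf8_encode always reads its input as ISO-8859-1 and utf8_decode always reads it as UTF-8, the three-parameter forms are the closest like-for-like replacements.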
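To see what these conversions do at the byte level, here is a small JavaScript sketch. It assumes a Node.js environment, where Buffer can read and write ISO-8859-1 ('latin1'), and it models a PHP byte string as a JavaScript string in which every character stands for one byte. The helper names are hypothetical and only approximate the PHP calls above; for instance, invalid input yields U+FFFD here, whereas PHP substitutes a question mark.

```javascript
// Assumes Node.js: Buffer is a Node global and is not available in browsers.

// Rough counterpart of mb_convert_encoding($input, 'UTF-8', 'ISO-8859-1'):
// reads the input as ISO-8859-1 and returns the UTF-8 bytes, one byte per
// character of the result.
function latin1ToUtf8(bytes) {
  return Buffer.from(bytes, "utf8").toString("latin1");
}

// Rough counterpart of mb_convert_encoding($input, 'ISO-8859-1', 'UTF-8'):
// reads the input bytes as UTF-8 and returns the decoded characters.
function utf8ToLatin1(bytes) {
  return Buffer.from(bytes, "latin1").toString("utf8");
}

const eAcute = "\u00e9"; // é as a single ISO-8859-1 byte (0xE9)

const encoded = latin1ToUtf8(eAcute);
console.log(encoded === "\u00c3\u00a9"); // true: two bytes, shown as Ã©

console.log(utf8ToLatin1(encoded) === eAcute); // true: back to é

// A lone 0xE9 byte is not valid UTF-8, so it cannot be decoded.
console.log(utf8ToLatin1(eAcute) === "\ufffd"); // true
```

This is also why encoding text that is already UTF-8 a second time produces sequences such as Ã© on the page.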

User Comments

2 January, 2022

After 15 years it is still useful... thanks!

20 May, 2020

Thanks for showing differences in JS encoding and decoding with an easy interactive copy-paste demo. This was even more helpful than Stack Overflow.

Seems you need encodeURIComponent to POST Unicode (UTF-8) to PHP, and decodeURIComponent on return, else bullets (•) for example become %u2022, percent signs replacing backslashes permanently.

18 April, 2013

Interesting that if you paste é (e-acute) into UTF8 Encode it gives you Ã©, but if you try to UTF8 decode this you get � (question mark in a diamond). Why?

If you put Ã© in the INPUT box and use UTF8 Decode it returns to é (e-acute). With é in the INPUT box you get � as it can't be decoded.

26 June, 2012

Thank you very much for posting this. Perhaps you might consider adding json_encode?

It's been added as an option now - the JSON functions didn't exist when the article was first written

23 December, 2011

Thanks so much for this site!

I have spent hours and hours on trying to get special characters to be properly encoded from PHP into XML code to be picked up by Ajax and then properly decoded, and I ran into nothing but issues with either the XML not being parseable or the characters getting messed up, and using this site I was finally able to find the one combination that worked for me - rawurlencode -> decodeURIComponent. I also noticed that I had to allow PHP to make 2 characters out of certain special characters (registration symbol) and save it to MySQL as two characters for the decoding to work.

In my case, when displaying my database content with special characters correctly in a text field using just plain PHP, what worked the best for me was:

echo htmlentities(rawurldecode($mixedString), ENT_QUOTES, "UTF-8");

Note that I think your database AND your HTML code need to be encoded with UTF-8, too.

Great way of demonstrating what these functions do - thanks again!

27 November, 2010

the pasting-from-Word prob can be fixed with a function such as transcribe_cp1252_to_latin1

php.net/manual/en/function.strtr.php#80591

24 November, 2010

thank you very much for this page, it has been very helpful to me. i just want to add an info:
if you are using ajax(post) forms or in a similar occasion, you can use encodeURIComponent on javascript side and 'conditional' stripslashes on php side, like:
$Name = get_magic_quotes_gpc() ? stripslashes($_POST['Name']) : $_POST['Name'];
- urldecode() will be automatically executed on server for $_POST parameters.
- if you want to output data in html directly, use php command nl2br, for line breaks etc.
- for unicode chars, html files should be saved with code page utf-8 char set. nothing extra to do with code.
- for details and whys, read official documentations of these commands.

28 September, 2010

I have found that characters like ' or - aren't encoded by htmlentities and therefore appear weird in the HTML.

A plain ' or - will not cause problems. It's when you copy and paste a 'smart quote' or different type of dash from a source (e.g. Microsoft Word) that uses a different character encoding from your web page. The easiest solution is to work only with UTF-8 (Unicode) or Plain Text (ASCII) content.

18 July, 2010

Excellent examples of escaping characters! Wish I found this site sooner. Thanks Ed.

3 July, 2010

Is it possible to see the sourcecode behind your "JavaScript: Escaping Special Characters" i would love to have that on my pc when i look at codes that has the special chars and i am working offline.

The JavaScript functions are already visible in the page code. The PHP functions require a server-side script - so won't work offline anyway.

21 September, 2008

I noticed that the percent sign % doesn't properly decode in javascript. It will encode to %25, but gives a javascript error when trying to decode. Were you aware of this bug and are there any known fixes?

The definition of the decodeURI function is that it "decodes a Uniform Resource Identifier (URI) previously created by encodeURI or by a similar routine" and "does not decode escape sequences that could not have been introduced by encodeURI". You will need to use a different function, or write your own, if your string is not already properly encoded.

24 March, 2008

very helpful page. Maybe you should add an encoding option so we can test the funcs with different encodings.

That's a great idea, but I'm still a bit clueless when it comes to encoding - everything we do is locked down as UTF-8. If I could just work out where to start...

12 January, 2008

THANK-YOU!!! I needed this table for my pmwiki and finally found it here and so beautifully organized.

26 April, 2007

When a user copies data from Microsoft Word to my PHP application this often inserts unidentified characters. Is there a permanent solution for this?

Not if they insist on using Word. The problem is that Word uses its own Character Set which uses characters that aren't compatible with a lot of websites or databases.

You can improve things slightly by disabling all "AutoCorrect" and "AutoReplace" features in Word which will fix a lot of problems with quotes, dashes and list markers, but beyond that things can get complicated.