Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


javascript - Why doesn't encodeURIComponent encode single quotes/apostrophes?

The escape() function was deprecated and replaced by encodeURIComponent, but encodeURIComponent doesn't encode the single quote/apostrophe character, which I need in order to escape the apostrophes in a person's surname (e.g. 'O'Neill') in an AJAX form. Why would they remove a capability from something they were trying to improve?

EDIT:

Here is a code example to explain the problem more thoroughly. As you can see, the surname 'O'Neill' contains an apostrophe that needs to be escaped when passing the variable in the URL. The same thing would happen in other places in the form, for instance if an address entered was 'Billy's Tavern'.

<input id='surname' value="O'Neill">                        
<script>
var get_url = '?surname='+encodeURIComponent($('#surname').val());
$.ajax({
    url: get_url
});
</script>

My current solution uses a custom function. My question is simply why a custom function is needed at all.

<script>
function customEncodeURIComponent(URI) {
    return encodeURIComponent(URI).replace(/'/g, "%27");
}
</script>

<input id='surname' value="O'Neill">
<script>
var get_url = '?surname='+customEncodeURIComponent($('#surname').val());
$.ajax({
    url: get_url
});
</script>
question from: https://stackoverflow.com/questions/18251399/why-doesnt-encodeuricomponent-encode-single-quotes-apostrophes


1 Reply


encodeURIComponent escapes all characters except the following:

alphabetic characters, decimal digits, and - _ . ! ~ * ' ( )
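
A quick check in a browser console or Node.js shows this list in action. This is a minimal sketch; the variable names are only for illustration.

```javascript
// Characters that encodeURIComponent passes through unchanged.
var untouched = "-_.!~*'()";
var sameAsInput = encodeURIComponent(untouched) === untouched; // true

// The apostrophe in a surname therefore survives as-is,
// while a space or an ampersand is percent-encoded.
var surname = encodeURIComponent("O'Neill");      // "O'Neill"
var address = encodeURIComponent("Tom & Jerry");  // "Tom%20%26%20Jerry"
```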

If you wish to use an encoding compatible with RFC 3986 (which reserves !, ', (, ), and *), you can use:

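function rfc3986EncodeURIComponent (str) {
    // escape() is deprecated and leaves * unencoded, so build each escape by hand.
    return encodeURIComponent(str).replace(/[!'()*]/g, function (c) {
        return '%' + c.charCodeAt(0).toString(16).toUpperCase();
    });
}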

You can get more information on this on MDN.
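
To see the difference in practice, here is a small sketch comparing plain encodeURIComponent with a stricter variant. The name strictEncode is made up for this example. It builds each escape from the character code rather than relying on the deprecated escape(), which leaves * untouched.

```javascript
// Hypothetical helper for illustration: percent-encode the five characters
// that encodeURIComponent leaves alone but RFC 3986 lists as sub-delims.
function strictEncode(str) {
    return encodeURIComponent(str).replace(/[!'()*]/g, function (c) {
        return '%' + c.charCodeAt(0).toString(16).toUpperCase();
    });
}

var plain = encodeURIComponent("Billy's Tavern (*new*)!");
var strict = strictEncode("Billy's Tavern (*new*)!");
// plain:  Billy's%20Tavern%20(*new*)!
// strict: Billy%27s%20Tavern%20%28%2Anew%2A%29%21

// The strict form decodes back to the original string.
var roundTrip = decodeURIComponent(strict);
```

Because decodeURIComponent accepts either form, switching to the stricter encoder should not require changes on the receiving side.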

UPDATE:

To answer your question of why ' and the other characters mentioned above are not encoded by encodeURIComponent: the short answer is that they only need to be encoded in certain URI schemes, and whether to encode them depends on the scheme you're using.

To quote RFC 3986:

URI producing applications should percent-encode data octets that correspond to characters in the reserved set unless these characters are specifically allowed by the URI scheme to represent data in that component. If a reserved character is found in a URI component and no delimiting role is known for that character, then it must be interpreted as representing the data octet corresponding to that character's encoding in US-ASCII.

Where "reserved set" is defined as

reserved    = gen-delims / sub-delims
gen-delims  = ":" / "/" / "?" / "#" / "[" / "]" / "@"
sub-delims  = "!" / "$" / "&" / "'" / "(" / ")"
            / "*" / "+" / "," / ";" / "="
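
As a rough illustration of this grammar, the two groups can be written as character sets and used to classify a character. The function name classifyUriChar is an assumption made for this sketch, not part of any standard API; the "unreserved" set it checks is the one RFC 3986 defines as letters, digits, and - . _ ~.

```javascript
// Character sets copied from the RFC 3986 ABNF.
var GEN_DELIMS = ":/?#[]@";
var SUB_DELIMS = "!$&'()*+,;=";

// Hypothetical helper: report which group a single ASCII character falls in.
function classifyUriChar(c) {
    if (c.length !== 1) return "other"; // only single characters are classified
    if (GEN_DELIMS.indexOf(c) !== -1) return "gen-delims";
    if (SUB_DELIMS.indexOf(c) !== -1) return "sub-delims";
    if (/^[A-Za-z0-9\-._~]$/.test(c)) return "unreserved";
    return "other"; // e.g. a space, which is not allowed as a literal data character
}

var apostrophe = classifyUriChar("'"); // "sub-delims"
var question = classifyUriChar("?");   // "gen-delims"
var tilde = classifyUriChar("~");      // "unreserved"
var space = classifyUriChar(" ");      // "other"
```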

The apostrophe is in the sub-delims group. In other words, these characters must be left unencoded wherever they act as delimiters that the consuming application is expected to interpret: for example, if you mistakenly encoded ? and &, they would no longer delimit the parts of a query. Historically there were also proposals for path segment parameters delimited with ; and , (these never saw wide adoption), so those characters are still allowed as well. It is not that the apostrophe is "free to use" (i.e. unreserved) in URI data; rather, it was assumed that it might carry some special meaning in a URI context. For example, the grammar for a path segment allows it unencoded:

segment       = *pchar
pchar         = unreserved / pct-encoded / sub-delims / ":" / "@"
unreserved    = ALPHA / DIGIT / "-" / "." / "_" / "~"
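
The segment rule can be turned into a regular expression to confirm that an apostrophe is legal, unencoded, in a path segment. This is only a sketch of the ABNF above; isValidSegment is a name invented for the example.

```javascript
// pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
var SEGMENT_RE = /^(?:[A-Za-z0-9\-._~]|%[0-9A-Fa-f]{2}|[!$&'()*+,;=]|[:@])*$/;

// Hypothetical helper: true if str matches the RFC 3986 "segment" rule.
function isValidSegment(str) {
    return SEGMENT_RE.test(str);
}

var withApostrophe = isValidSegment("O'Neill");    // true
var withEncoded = isValidSegment("O%27Neill");     // true
var withSpace = isValidSegment("Billy's Tavern");  // false: the space must be encoded
var withSlash = isValidSegment("a/b");             // false: "/" separates segments
```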
