javascript - Match everything but not quoted strings

I want to match everything except quoted strings.

I can match all quoted strings with this: /(("([^"\\]|\\.)*")|('([^'\\]|\\.)*'))/ So I tried to match everything except quoted strings with this: /[^(("([^"\\]|\\.)*")|('([^'\\]|\\.)*'))]/ but it doesn't work.

I would like to use only a regex, because I want to replace the matches and still get the quoted text back afterwards.

string.replace(regex, function(a, b, c) {
   // return after a lot of operations
});

For me, a quoted string is something like "bad string" or 'cool string'.

So if I input:

he\'re is "watever o\"k" efre 'dder\'4rdr'?

It should output these matches:

["he\'re is ", " efre ", "?"]

And then I want to replace them.

I know my question is very difficult but it is not impossible! Nothing is impossible.

Thanks

1 Reply

EDIT: Rewritten to cover more edge cases.

This can be done, but it's a bit complicated.

result = subject.match(/(?:(?=(?:(?:\\.|"(?:\\.|[^"\\])*"|[^\\'"])*'(?:\\.|"(?:\\.|[^"'\\])*"|[^\\'])*')*(?:\\.|"(?:\\.|[^"\\])*"|[^\\'])*$)(?=(?:(?:\\.|'(?:\\.|[^'\\])*'|[^\\'"])*"(?:\\.|'(?:\\.|[^'"\\])*'|[^\\"])*")*(?:\\.|'(?:\\.|[^'\\])*'|[^\\"])*$)(?:\\.|[^\\'"]))+/g);

will return

, he said. 
, she replied. 
, he reminded her. 
, 

from this string (line breaks added and enclosing quotes removed for clarity):

"Hello", he said. "What's up, \"doc\"?", she replied. 
'I need a 12" crash cymbal', he reminded her. 
"2\" by 4 inches", 'Back\"\'slashes \\ are OK!'
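
A minimal sketch of how the call above might be wrapped for reuse. The helper name `unquotedParts` and the sample strings are made up for illustration; the pattern is the one from this answer, written with the doubled backslashes a JavaScript regex literal needs. Since `match` with the `/g` flag returns `null` when nothing matches, the sketch falls back to an empty array.

```javascript
// Hypothetical helper (not part of the original answer): returns every run
// of text that lies outside single- or double-quoted strings.
function unquotedParts(subject) {
  const outsideQuotes = /(?:(?=(?:(?:\\.|"(?:\\.|[^"\\])*"|[^\\'"])*'(?:\\.|"(?:\\.|[^"'\\])*"|[^\\'])*')*(?:\\.|"(?:\\.|[^"\\])*"|[^\\'])*$)(?=(?:(?:\\.|'(?:\\.|[^'\\])*'|[^\\'"])*"(?:\\.|'(?:\\.|[^'"\\])*'|[^\\"])*")*(?:\\.|'(?:\\.|[^'\\])*'|[^\\"])*$)(?:\\.|[^\\'"]))+/g;
  // String.prototype.match with /g yields null (not []) when nothing matches.
  return subject.match(outsideQuotes) || [];
}

// Illustrative inputs, chosen for this sketch only.
const first = unquotedParts(`"Hello", he said. 'Bye', she replied.`);
// first is [", he said. ", ", she replied."]

const second = unquotedParts('"a\\"b" c'); // raw text: "a\"b" c
// second is [" c"]: the escaped quote does not end the quoted string

const third = unquotedParts('"all quoted"');
// third is []

console.log(first, second, third);
```

Because both lookaheads rescan the rest of the string at every position, this is likely to be slow on long inputs; for short strings like the ones above it should be fine.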

Explanation: (sort of, it's a bit mindboggling)

Breaking up the regex:

(?:
 (?=      # Assert even number of (relevant) single quotes, looking ahead:
  (?:
   (?:\\.|"(?:\\.|[^"\\])*"|[^\\'"])*
   '
   (?:\\.|"(?:\\.|[^"'\\])*"|[^\\'])*
   '
  )*
  (?:\\.|"(?:\\.|[^"\\])*"|[^\\'])*
  $
 )
 (?=      # Assert even number of (relevant) double quotes, looking ahead:
  (?:
   (?:\\.|'(?:\\.|[^'\\])*'|[^\\'"])*
   "
   (?:\\.|'(?:\\.|[^'"\\])*'|[^\\"])*
   "
  )*
  (?:\\.|'(?:\\.|[^'\\])*'|[^\\"])*
  $
 )
 (?:\\.|[^\\'"]) # Match text between quoted sections
)+

First, you can see that there are two similar parts. Both these lookahead assertions ensure that there is an even number of single/double quotes in the string ahead, disregarding escaped quotes and quotes of the opposite kind. I'll show it with the single quotes part:

(?=                   # Assert that the following can be matched:
 (?:                  # Match this group:
  (?:                 #  Match either:
   \\.                #  an escaped character
  |                   #  or
   "(?:\\.|[^"\\])*"  #  a double-quoted string
  |                   #  or
   [^\\'"]            #  any character except backslashes or quotes
  )*                  # any number of times.
  '                   # Then match a single quote
  (?:\\.|"(?:\\.|[^"'\\])*"|[^\\'])*'   # Repeat once to ensure even number,
                      # (but don't allow single quotes within nested double-quoted strings)
 )*                   # Repeat any number of times including zero
 (?:\\.|"(?:\\.|[^"\\])*"|[^\\'])*      # Then match the same until...
 $                    # ... end of string.
)                     # End of lookahead assertion.

The double quotes part works the same.
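
The "even number of quotes ahead" idea is easier to see in a stripped-down form. The sketch below is a deliberate simplification, not the regex from this answer: it assumes only double quotes exist and ignores backslash escapes. The name `outsideDoubleQuotes` and the sample string are hypothetical.

```javascript
// Simplified illustration (assumption: no escapes, no single quotes).
// A character is outside quotes when an even number of " follows it.
function outsideDoubleQuotes(subject) {
  const pattern = /(?:(?=(?:[^"]*"[^"]*")*[^"]*$)[^"])+/g;
  //                   ^^^^^^^^^^^^^^^^^^ pairs of quotes, any number of times
  //                                     ^^^^^^ then no more quotes until the end
  return subject.match(pattern) || [];
}

const parts = outsideDoubleQuotes('say "hi" to "all" now');
// parts is ["say ", " to ", " now"]
console.log(parts);
```

Inside `"hi"` an odd number of quotes remains ahead, so the lookahead fails there and those characters are skipped. The full regex applies the same parity check twice, once per quote type, and adds the `\\.` alternatives so that escaped quotes are not counted.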

Then, at each position in the string where these two assertions succeed, the next part of the regex actually tries to match something:

(?:      # Match either
 \\.     # an escaped character
|        # or
 [^\\'"] # any character except backslash, single or double quote
)        # End of non-capturing group

The whole thing is repeated once or more, as many times as possible. The /g modifier makes sure we get all matches in the string.
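
Since the goal in the question is to replace the unquoted text while keeping the quoted strings, here is a sketch of a different technique from the lookahead regex above: one alternation that matches either a whole quoted string or a run of unquoted text, with a `replace` callback that returns quoted strings unchanged. The function name `replaceOutsideQuotes` and the `transform` parameter are hypothetical, and the sample inputs are invented for illustration.

```javascript
// Alternative sketch (not the answer's regex): tokenize into quoted strings
// and unquoted runs, and transform only the unquoted runs.
function replaceOutsideQuotes(text, transform) {
  const token = /"(?:\\.|[^"\\])*"|'(?:\\.|[^'\\])*'|(?:\\.|[^\\'"])+/g;
  return text.replace(token, (match) => {
    // Only the first two alternatives can start with a quote character.
    const isQuoted = match[0] === '"' || match[0] === "'";
    return isQuoted ? match : transform(match);
  });
}

const upper = (s) => s.toUpperCase();

const a = replaceOutsideQuotes(`say "hi" to 'you'`, upper);
// a is: SAY "hi" TO 'you'

const b = replaceOutsideQuotes('a "b\\"c" d', upper); // raw text: a "b\"c" d
// b is: A "b\"c" D

console.log(a);
console.log(b);
```

This version scans the string once instead of looking ahead to the end at every position. It may treat unbalanced quotes differently from the lookahead regex, so it is worth checking against your own inputs.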

You can see it in action on RegExr.
