Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share

regex - Is there a regular expression to detect a valid regular expression?

Is it possible to detect a valid regular expression with another regular expression?

If so, please give example code below.

  asked by psytek (translated from Stack Overflow)

1 Reply

/
^                                             # start of string
(                                             # first group start
  (?:
    (?:[^?+*{}()[\]\\|]+                      # literals and ^, $
     | \\.                                    # escaped characters
     | \[ (?: \^?\\. | \^[^\\] | [^\\^] )     # character classes
          (?: [^\]\\]+ | \\. )* \]
     | \( (?:\?[:=!]|\?<[=!]|\?>)? (?1)?? \)  # parenthesis, with recursive content
     | \(\? (?:R|[+-]?\d+) \)                 # recursive matching
     )
    (?: (?:[?+*]|\{\d+(?:,\d*)?\}) [?+]? )?   # quantifiers
  | \|                                        # alternative
  )*                                          # repeat content
)                                             # end first group
$                                             # end of string
/

This is a recursive regex, and many regex engines do not support it. PCRE-based engines should.

Without whitespace and comments:

/^((?:(?:[^?+*{}()[\]\\|]+|\\.|\[(?:\^?\\.|\^[^\\]|[^\\^])(?:[^\]\\]+|\\.)*\]|\((?:\?[:=!]|\?<[=!]|\?>)?(?1)??\)|\(\?(?:R|[+-]?\d+)\))(?:(?:[?+*]|\{\d+(?:,\d*)?\})[?+]?)?|\|)*)$/

.NET does not support recursion directly (the (?1) and (?R) constructs). The recursion has to be converted to counting balanced groups:

^                                         # start of string
(?:
  (?: [^?+*{}()[\]\\|]+                   # literals and ^, $
   | \\.                                  # escaped characters
   | \[ (?: \^?\\. | \^[^\\] | [^\\^] )   # character classes
        (?: [^\]\\]+ | \\. )* \]
   | \( (?:\?[:=!]
         | \?<[=!]
         | \?>
         | \?<[^\W\d]\w*>
         | \?'[^\W\d]\w*'
         )?                               # opening of group
     (?<N>)                               #   increment counter
   | \)                                   # closing of group
     (?<-N>)                              #   decrement counter
   )
  (?: (?:[?+*]|\{\d+(?:,\d*)?\}) [?+]? )? # quantifiers
| \|                                      # alternative
)*                                        # repeat content
$                                         # end of string
(?(N)(?!))                                # fail if counter is non-zero.

Compacted:

^(?:(?:[^?+*{}()[\]\\|]+|\\.|\[(?:\^?\\.|\^[^\\]|[^\\^])(?:[^\]\\]+|\\.)*\]|\((?:\?[:=!]|\?<[=!]|\?>|\?<[^\W\d]\w*>|\?'[^\W\d]\w*')?(?<N>)|\)(?<-N>))(?:(?:[?+*]|\{\d+(?:,\d*)?\})[?+]?)?|\|)*$(?(N)(?!))
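The counting idea behind (?<N>) and (?<-N>) may be easier to see outside a regex. The following is a minimal Python sketch, not part of the original answer: the function name groups_balanced is hypothetical, and it only checks that groups are balanced, skipping escapes and character classes the way the pattern does. It does not check quantifiers or anything else.

```python
def groups_balanced(pattern: str) -> bool:
    """Return True if every unescaped '(' outside a character class has a matching ')'."""
    depth = 0          # plays the role of the N counter in the .NET pattern
    in_class = False   # True while scanning inside [...]
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\":
            i += 2     # skip the backslash and the character it escapes
            continue
        if in_class:
            if ch == "]":
                in_class = False
        elif ch == "[":
            in_class = True
            # A ']' directly after '[' or '[^' is treated as a literal, so step over it.
            if pattern[i + 1:i + 2] == "^":
                i += 1
            if pattern[i + 1:i + 2] == "]":
                i += 1
        elif ch == "(":
            depth += 1           # like (?<N>): increment counter
        elif ch == ")":
            depth -= 1           # like (?<-N>): decrement counter
            if depth < 0:
                return False     # a ')' with nothing open
        i += 1
    # Like (?(N)(?!)): fail if the counter is non-zero at the end.
    return depth == 0 and not in_class
```

For example, groups_balanced("(a(b)c)") is True, while groups_balanced("(a(b)c") and groups_balanced("a)b(") are both False.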

From the comments:

Will this validate substitutions and translations?

It will validate just the regex part of substitutions and translations: s/<this part>/.../

It is not theoretically possible to match all valid regex grammars with a regex. It becomes possible if the regex engine supports recursion, as PCRE does, but then the pattern can't really be called a regular expression any more.

Indeed, a "recursive regular expression" is not a regular expression. But this is an often-accepted extension to regex engines... Ironically, this extended regex doesn't match extended regexes.

"In theory, theory and practice are the same. In practice, they're not." Almost everyone who knows regular expressions knows that regular expressions do not support recursion. But PCRE and most other implementations support much more than basic regular expressions.
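In the same practical spirit, a commonly used alternative to a validating regex is to hand the candidate to the engine and see whether it compiles. This is a different technique from the one in the answer, and it only tells you what that particular engine accepts. The sketch below uses Python's built-in re module, which does not support the (?1) and (?R) recursion constructs; the function name is_valid_regex is hypothetical.

```python
import re


def is_valid_regex(pattern: str) -> bool:
    """Return True if Python's `re` module can compile `pattern`."""
    try:
        re.compile(pattern)
    except re.error:
        # Raised for syntax the engine rejects, e.g. an unclosed group.
        return False
    return True
```

With this check, "^abcdefg$" and "a{2,3}?" are accepted, while "(a", "[a" and the PCRE-only "(?1)" are rejected.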

Using this in a shell script with the grep command gives me an error: grep: Invalid content of {}. Can you please help? I am writing a script that greps a code base to find all the files that contain regular expressions.

This pattern relies on an extension called recursive regular expressions, which the POSIX flavor of regex does not support. You could try the -P switch to enable the PCRE regex flavor.

Regex itself "is not a regular language and hence cannot be parsed by regular expression..." This is true for classical regular expressions. Some modern implementations allow recursion, which lets them handle context-free languages, although the resulting pattern is somewhat verbose for this task.
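Because nested groups are a context-free construct, the same grammar can also be written as a small recursive-descent parser, where the recursive call takes the place of (?1). The following Python sketch is an illustration under that assumption, not code from the answer: the names is_valid, parse_sequence and parse_class are hypothetical, and it covers only the simplified grammar of the pattern above (literals, escapes, character classes, groups, recursion calls, quantifiers, alternation).

```python
import re

# Token patterns for the pieces that need no recursion.
QUANTIFIER = re.compile(r"(?:[?+*]|\{\d+(?:,\d*)?\})[?+]?")
GROUP_PREFIX = re.compile(r"\?(?:[:=!]|<[=!]|>)")
RECURSION_CALL = re.compile(r"\(\?(?:R|[+-]?\d+)\)")


def is_valid(pattern: str) -> bool:
    """Check `pattern` against a simplified regex grammar (illustrative only)."""
    pos = 0
    end = len(pattern)

    def parse_class() -> bool:
        nonlocal pos
        pos += 1                                  # consume '['
        if pattern.startswith("^", pos):
            pos += 1
        if pattern.startswith("]", pos):          # a leading ']' is a literal
            pos += 1
        while pos < end and pattern[pos] != "]":
            pos += 2 if pattern[pos] == "\\" else 1
        if pos >= end:
            return False                          # class never closed
        pos += 1                                  # consume ']'
        return True

    def parse_sequence(depth: int) -> bool:
        nonlocal pos
        while pos < end:
            ch = pattern[pos]
            if ch == "|":                         # alternative, takes no quantifier
                pos += 1
                continue
            if ch == ")":
                return depth > 0                  # the caller consumes the ')'
            if ch == "\\":
                if pos + 1 >= end:
                    return False                  # dangling backslash
                pos += 2
            elif ch == "[":
                if not parse_class():
                    return False
            elif ch == "(":
                call = RECURSION_CALL.match(pattern, pos)
                if call:
                    pos = call.end()
                else:
                    pos += 1
                    prefix = GROUP_PREFIX.match(pattern, pos)
                    if prefix:
                        pos = prefix.end()
                    if not parse_sequence(depth + 1):   # recurse into the group
                        return False
                    pos += 1                      # consume the matching ')'
            elif ch in "?+*{}":
                return False                      # quantifier or brace with no atom
            else:
                pos += 1                          # ordinary literal, ^, $ or .
            quantifier = QUANTIFIER.match(pattern, pos)
            if quantifier:
                pos = quantifier.end()
        return depth == 0                         # input ended: no group may be open

    return parse_sequence(0)
```

Under this sketch, is_valid("^abcdefg$") and is_valid(r"^(?:[\.]+)$") are both True, while is_valid("(a|b") and is_valid("*a") are False.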

I see where you're matching []()/\. and other special regex characters. Where are you allowing non-special characters? It seems like this will match ^(?:[\.]+)$, but not ^abcdefg$. That's a valid regex.

[^?+*{}()[\]\\|] will match any single character that is not part of any of the other constructs. This includes both literals (a - z) and certain special characters (^, $, .).
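This can be checked in isolation. The sketch below is an illustration rather than part of the thread: it uses Python's re module and a hypothetical helper named is_plain_run. The only change to the class is that [ is also escaped, which does not alter what it matches.

```python
import re

# The "literals" branch of the validator: one or more characters that are
# not ? + * { } ( ) [ ] \ or |
LITERALS = re.compile(r"[^?+*{}()\[\]\\|]+")


def is_plain_run(text: str) -> bool:
    """Return True if `text` consists only of characters the literal branch accepts."""
    return LITERALS.fullmatch(text) is not None
```

So is_plain_run("^abcdefg$") and is_plain_run("a.b") are True, because ^, $ and . fall into this branch, while is_plain_run("a+b") is False because + is left to the quantifier part of the pattern.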

