Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


Regex to find a string from last occurrence of a character to a number or [

I have a text

Word_1 (String_2)! String_3 - String_4 X_1:X_2

where String_2, String_3 and String_4 can have one or multiple words (with numbers also, like aaa b32), while X_1 and X_2 can each be either a single-digit number or a single-digit number wrapped in [ and ] (so something like 9:[4] or [9]:4).

I use this regex (inside an Android app) for matching String_4 (that is, the string between - and X_1):

(?<=-\s)[a-zA-Z0-9]+\s([a-zA-Z0-9]+\s)?

It works fine, but I've just realized that String_2 can also contain '- ' (for example String_2 = aaa - bbb ccc), so I need to fix my regex to start from the last occurrence of - (and again end before X_1, considering that String_4 can contain numbers as well).

How to do that?

I've tried (?:[^-](?!(-)))+$ but I can't find a way to avoid the space after - and to stop at X_1.

Thanks!



1 Reply


As 9:[4] or [9]:4 is the only allowed format after String_4, and String_2 is between parentheses, you can assert that only the allowed characters follow the match until the end of the string.

(?<=-\s)[a-zA-Z0-9]+\s(?:[a-zA-Z0-9]+\s)?(?=[\]\[:\s\d]*$)
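
As a rough illustration of how the lookahead version might be used from Java (and so from an Android app), here is a minimal sketch. The class name LastDashLookahead, the method extract and the sample input are hypothetical and not from the original post; the pattern is the one above with backslashes doubled for a Java string literal.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LastDashLookahead {
    // The lookahead pattern from the answer, escaped for a Java string literal.
    private static final Pattern STRING_4 = Pattern.compile(
            "(?<=-\\s)[a-zA-Z0-9]+\\s(?:[a-zA-Z0-9]+\\s)?(?=[\\]\\[:\\s\\d]*$)");

    // Hypothetical helper: returns String_4 without the trailing whitespace
    // that the pattern consumes, or null when nothing matches.
    static String extract(String text) {
        Matcher m = STRING_4.matcher(text);
        return m.find() ? m.group().trim() : null;
    }

    public static void main(String[] args) {
        // Made-up sample where String_2 itself contains "- ".
        System.out.println(extract("Word (aaa - bbb ccc)! foo bar - baz qux 9:[4]"));
    }
}
```

Note that this pattern includes the whitespace after the last word in the match, hence the trim(), and that it only covers a String_4 of one or two words.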


Or use a slightly more precise pattern, for example with a capturing group instead of lookarounds, matching 1 or more words for String_4:

-\s([a-zA-Z0-9]+(?:\s+[a-zA-Z0-9]+)*)\s+(?:\d+:\[\d+\]|\[\d+\]:\d)$

Explanation

  • -\s Match - and a whitespace char
  • ( Capture group 1
    • [a-zA-Z0-9]+ Match 1+ times any of the listed ranges
    • (?:\s+[a-zA-Z0-9]+)* Optionally repeat 1+ whitespace chars and 1+ chars from the character class to match multiple words
  • ) Close group 1
  • \s+ Match 1+ whitespace chars
  • (?: Non capture group
    • \d+:\[\d+\] Match the 9:[4] format
    • | Or
    • \[\d+\]:\d Match the [9]:4 format
  • ) Close non capture group
  • $ End of string
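
The capturing-group variant explained above could be applied in Java along these lines. This is only a sketch: the class name LastDashGroup, the method extract and the sample input are assumptions made for illustration, and the closing brackets are escaped as \] to stay on the safe side across regex engines.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LastDashGroup {
    // Capturing-group pattern from the answer, escaped for a Java string literal.
    private static final Pattern STRING_4 = Pattern.compile(
            "-\\s([a-zA-Z0-9]+(?:\\s+[a-zA-Z0-9]+)*)\\s+(?:\\d+:\\[\\d+\\]|\\[\\d+\\]:\\d)$");

    // Hypothetical helper: returns the content of group 1 (String_4),
    // or null when the text does not end in one of the two X_1:X_2 formats.
    static String extract(String text) {
        Matcher m = STRING_4.matcher(text);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Made-up sample with a three-word String_4 and the [9]:4 format.
        System.out.println(extract("A (x - y z)! b c - one two three [9]:4"));
    }
}
```

Because the value is read from group 1, no trimming is needed, and String_4 may contain any number of words.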


