Non greedy (reluctant) regex matching in sed?

Question

posted Oct 17, 2021 in Technique[技术] by 深蓝 (71.8m points)

I'm trying to use sed to clean up lines of URLs to extract just the domain.

So from:

http://www.suepearson.co.uk/product/174/71/3816/

I want:

http://www.suepearson.co.uk/

(either with or without the trailing slash, it doesn't matter)

I have tried:

sed 's|\(http://.*?/\).*|\1|'

and (escaping the non-greedy quantifier)

sed 's|\(http://.*\?/\).*|\1|'

but I cannot get the non-greedy quantifier (?) to work, so the pattern always ends up matching the whole string.

1 Reply

深蓝 · Answer 1 · 2021-10-16T21:10:11+0000

Neither basic nor extended POSIX/GNU regex recognizes the non-greedy quantifier; you need a more capable regex engine. Fortunately, Perl regex is pretty easy to get for this context:

perl -pe 's|(http://.*?/).*|\1|'
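If you would rather stay with sed, one common workaround is to emulate the non-greedy match with a negated character class. The following is only a sketch: it assumes every input line starts with http:// and has at least one slash after the host name, and the function name extract_domain is made up for illustration.

```shell
# Hypothetical helper: reads URLs on stdin, prints scheme + host + trailing slash.
extract_domain() {
  # [^/]* cannot cross a slash, so the group ends at the first "/" after the host,
  # which is the effect the non-greedy .*? was meant to achieve.
  sed 's|\(http://[^/]*/\).*|\1|'
}

printf '%s\n' 'http://www.suepearson.co.uk/product/174/71/3816/' | extract_domain
```

This works in basic regex syntax because the character class itself limits how far the match can run, so no reluctant quantifier is needed. Lines that do not match the pattern are passed through unchanged.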
