Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


c - Detecting and skipping line comments with Flex

How can I detect one-line comments such as // in Flex and skip them?

Also, for /* comments, will the following snippet be enough?

   "/*"         { comment(); }
  %%

    comment()
    {
        char c, c1;
    loop:
        while ((c = input()) != '*' && c != 0)
            putchar(c);

        if ((c1 = input()) != '/' && c != 0)
        {
            unput(c1);
            goto loop;
        }

        if (c != 0)
            putchar(c1);
    }


1 Reply


Why don't you just use regular expressions to recognize the comments? The whole point of lex/flex is to save you from having to write lexical scanners by hand. The code you present should work (provided the pattern "/*" starts at the very beginning of its line in the rules section, as flex requires of every pattern), but it's a bit ugly, and it is not obvious from reading it that it is correct.

Your question says that you want to skip comments, but the code you provide uses putchar() to print the comment, except for the /* at the beginning. Which is it that you want to do? If you want to echo the comments, you can use an ECHO action instead of doing nothing.

Here are the regular expressions:

Single line comment

This one is easy because in lex/flex, . does not match a newline. So the following rule matches from // up to, but not including, the newline at the end of the line, and then does nothing.

"//".*                                    { /* DO NOTHING */ }

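To make it concrete what that pattern consumes, here is a small sketch in plain C (not flex code, and not taken from the original answer). The function name line_comment_length is made up for illustration. It returns the number of characters the pattern "//".* would match at the start of a string, and it stops before the newline, so the newline is left for whichever rule handles it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, for illustration only: returns how many characters
 * the flex pattern  "//".*  would match at the start of s, or 0 if s does
 * not begin with "//". The trailing newline is NOT part of the match. */
size_t line_comment_length(const char *s)
{
    assert(s != NULL);
    if (s[0] != '/' || s[1] != '/')
        return 0;

    size_t n = 2;
    while (s[n] != '\0' && s[n] != '\n')   /* '.' matches anything but '\n' */
        n++;
    return n;
}
```

For the input "// hi" followed by a newline and more code, the function returns 5: the two slashes, the space and "hi", but not the newline.
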
Multiline comment

This is a bit trickier. Because * is a regular expression operator as well as a key part of the comment marker, the following regex is a bit hard to read. I use [*] as a pattern which matches the literal character *; in flex/lex, you can write "*" instead. Use whichever you find more readable. Essentially, the regular expression matches runs of characters, each ending with one or more *, until it finds a run where the next character is a /. In other words, it has the same logic as your C code.

[/][*][^*]*[*]+([^*/][^*]*[*]+)*[/]       { /* DO NOTHING */ }

The regular expression requires the terminating */; an unterminated comment will force the lexer to back up to the beginning of the comment and accept some other token, usually a / division operator. That's likely not what you want, but it's not easy to recover from an unterminated comment since there's no really good way to know where the comment should have ended. Consequently, I recommend adding an error rule. Because flex always prefers the longest match, the shorter [/][*] rule only fires when the full comment pattern cannot match, that is, when the comment is never closed:

[/][*][^*]*[*]+([^*/][^*]*[*]+)*[/]       { /* DO NOTHING */ }
[/][*]                                    { fatal_error("Unterminated comment"); }
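
Note that fatal_error in the rule above is not something flex provides; it stands for whatever error routine your program uses. A minimal sketch of one possibility follows. Everything in it is an assumption made for illustration: the names format_lex_error and current_line, the message layout, and the decision to exit. In a scanner built with %option yylineno, you would normally use yylineno in place of current_line.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Stand-in for the scanner's line counter (hypothetical; with flex's
 * "%option yylineno" you would read yylineno instead). */
static int current_line = 1;

/* Hypothetical helper: build the diagnostic text. Kept separate from
 * fatal_error so the formatting can be checked without exiting. */
int format_lex_error(char *buf, size_t size, int line, const char *msg)
{
    assert(buf != NULL && msg != NULL);
    return snprintf(buf, size, "line %d: error: %s", line, msg);
}

/* One possible definition of the fatal_error used in the rule above:
 * report the problem on stderr and stop the program. */
void fatal_error(const char *msg)
{
    char buf[256];
    format_lex_error(buf, sizeof buf, current_line, msg);
    fprintf(stderr, "%s\n", buf);
    exit(EXIT_FAILURE);
}
```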
