Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


c - Comparison of words in a string?

Is there a function which I can use to compare strings where the position of the words would not matter? I mean that "Aaron Jack Brussels" is the same as "Brussels Aaron Jack" etc.

Question from: https://stackoverflow.com/questions/66065038/comparison-of-words-in-a-string


1 Reply

The C standard library has no function that does this, so you need to write the comparison yourself. One approach is to iterate over the words of one string, search for each of them in the other string, and then do the same in the opposite direction.

Here is a simple implementation that does not modify the strings nor allocate any memory:

#include <stdio.h>
#include <string.h>

/* Count how many space-separated words of str are equal to the len bytes at w. */
int countword(const char *w, size_t len, const char *str) {
    size_t i;
    int count = 0;

    for (;;) {
        while (*str == ' ')
            str++;
        if (!*str)
            return count;
        for (i = 1; str[i] && str[i] != ' '; i++)
            continue;
        if (i == len && !memcmp(w, str, len))
            count++;
        str += i;
    }
}

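/* Return 1 if every word of s1 occurs the same number of times in s1 and in s2. */
static int checkwords(const char *s1, const char *s2) {
    const char *p0, *p;

    for (p = s1;;) {
        while (*p == ' ')
            p++;
        if (!*p)
            return 1;
        for (p0 = p++; *p && *p != ' '; p++)
            continue;
        if (countword(p0, p - p0, s1) != countword(p0, p - p0, s2))
            return 0;
    }
}

/* Check both directions: a one-way check would miss a word that appears
   only in s2, e.g. "Aaron Jack" vs "Aaron Jack Brussels". */
int samewords(const char *s1, const char *s2) {
    return checkwords(s1, s2) && checkwords(s2, s1);
}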

int main(void) {
    if (samewords("Aaron  Jack  Brussels", "Brussels Aaron Jack"))
        printf("OK\n");
    if (samewords("Aaron  Jack  Brussels", "AaronJackBrussels"))
        printf("Not OK\n");
    if (samewords("Aaron Jack", "Aaron Jack Jack"))
        printf("Not OK\n");
    if (samewords("Aaron Jack Brussels", "Aaron Jack"))
        printf("Not OK\n");
    if (samewords("John John Doe", "John Doe Doe"))
        printf("Not OK\n");
    return 0;
}

You can extend it to handle multiple separators such as space, TAB and newline using strspn() and strcspn() from <string.h>:

int countword(const char *w, size_t len, const char *str) {
    const char *separators = " \t\n";
    size_t i;
    int count = 0;

    for (;;) {
        str += strspn(str, separators);
        if (!*str)
            return count;
        i = strcspn(str, separators);
        if (i == len && !memcmp(w, str, len))
            count++;
        str += i;
    }
}

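/* Return 1 if every word of s1 occurs the same number of times in s1 and in s2. */
static int checkwords(const char *s1, const char *s2) {
    const char *separators = " \t\n";
    const char *p0, *p;

    for (p = s1;;) {
        p += strspn(p, separators);     /* skip leading separators */
        if (!*p)
            return 1;
        p0 = p;                         /* start of the current word */
        p += strcspn(p, separators);    /* advance to the end of the word */
        if (countword(p0, p - p0, s1) != countword(p0, p - p0, s2))
            return 0;
    }
}

/* Check both directions: a one-way check would miss a word that appears
   only in s2, e.g. "Aaron Jack" vs "Aaron Jack Brussels". */
int samewords(const char *s1, const char *s2) {
    return checkwords(s1, s2) && checkwords(s2, s1);
}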
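
To make the role of strspn() and strcspn() in these loops more explicit, here is a minimal sketch of a single tokenizing step. The helper name next_word and its parameters are hypothetical, introduced only for illustration; they are not part of the code above. strspn() returns the length of the leading run made up only of separator characters, and strcspn() returns the length of the leading run containing none of them, so one call of each isolates the next word without modifying the string.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (illustration only): find the next word at or
   after *pos. On success, return a pointer to the word, store its length
   in *len and advance *pos past it. Return NULL when no word remains. */
const char *next_word(const char **pos, const char *separators, size_t *len) {
    /* strspn: number of leading characters that ARE separators */
    const char *start = *pos + strspn(*pos, separators);

    if (*start == '\0')
        return NULL;
    /* strcspn: number of leading characters that are NOT separators */
    *len = strcspn(start, separators);
    *pos = start + *len;
    return start;
}
```

Calling next_word() repeatedly on the same position pointer walks through the words one at a time, which is essentially what the loops in countword() and samewords() do inline.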

Note: I updated the answer with a more general version that can handle duplicated words such as "John John Doe" <-> "John Doe Doe" which the previous version would have mistakenly considered equivalent.
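
The versions above rescan both strings for every word, so their cost grows quadratically with the number of words. If allocating memory is acceptable, a different technique is to collect the words of each string, sort both lists, and compare them element by element. The following is only a sketch of that alternative, not the approach used above; the names struct word, cmp_word, split_words and samewords_sorted are made up for this example, and the -1 return value on allocation failure is an assumption of the sketch.

```c
#include <stdlib.h>
#include <string.h>

/* A word is a (pointer, length) view into the original string. */
struct word {
    const char *s;
    size_t len;
};

/* qsort comparator: order by bytes first, shorter word first on a tie. */
static int cmp_word(const void *a, const void *b) {
    const struct word *x = a, *y = b;
    size_t n = x->len < y->len ? x->len : y->len;
    int c = memcmp(x->s, y->s, n);

    if (c != 0)
        return c;
    return (x->len > y->len) - (x->len < y->len);
}

/* Count the words of str; if out is not NULL, also store them in out. */
static size_t split_words(const char *str, const char *seps, struct word *out) {
    size_t n = 0;

    for (;;) {
        str += strspn(str, seps);
        if (!*str)
            return n;
        size_t len = strcspn(str, seps);
        if (out) {
            out[n].s = str;
            out[n].len = len;
        }
        n++;
        str += len;
    }
}

/* Return 1 if s1 and s2 contain the same words with the same
   multiplicities, 0 if not, -1 if memory allocation fails. */
int samewords_sorted(const char *s1, const char *s2) {
    const char *seps = " \t\n";
    size_t n1 = split_words(s1, seps, NULL);
    size_t n2 = split_words(s2, seps, NULL);

    if (n1 != n2)
        return 0;           /* different word counts can never match */
    if (n1 == 0)
        return 1;           /* both strings contain no words */

    struct word *w = malloc(2 * n1 * sizeof *w);
    if (!w)
        return -1;
    split_words(s1, seps, w);
    split_words(s2, seps, w + n1);
    qsort(w, n1, sizeof *w, cmp_word);
    qsort(w + n1, n1, sizeof *w, cmp_word);

    int same = 1;
    for (size_t i = 0; i < n1; i++) {
        if (w[i].len != w[n1 + i].len ||
            memcmp(w[i].s, w[n1 + i].s, w[i].len) != 0) {
            same = 0;
            break;
        }
    }
    free(w);
    return same;
}
```

Because both word lists are compared in full, this variant is symmetric by construction: samewords_sorted("Aaron Jack", "Aaron Jack Brussels") and the same call with the arguments swapped both return 0. The trade-off is one heap allocation per call, which the allocation-free versions avoid.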

