Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


c++ - Implementing a "string pool" that is guaranteed not to move

I need a "string pool" object into which I can repeatedly insert a "sequence of chars" (I use this phrase to mean "string" without confusing it with std::string or a C string), obtain a pointer to the sequence, and be guaranteed that the pointer will not become invalidated if/when the pool needs to grow. Using a simple std::string as the pool won't work, because of the possibility for the string to be reallocated when it outgrows its initial capacity, thus invalidating all previous pointers into it.

The pool will not grow without bound -- there are well-defined points at which I will call a clear() method on it -- but I don't want to reserve any maximum capacity on it, either. It should be able to grow, without moving.

One possibility I'm considering is inserting each new sequence of chars into a forward_list<string> and obtaining begin()->c_str(). Another is inserting into an unordered_set<string>, but I'm having a hard time finding out what happens when an unordered_set has to grow. The third possibility I'm considering (less enthusiastically) is rolling my own chain of 1K buffers into which I concatenate the sequence of chars. That has the advantage (I guess) of having the highest performance, which is a requirement for this project.
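
For reference, the third option (a chain of fixed-size buffers) could look roughly like the sketch below. The class name ChunkedStringPool, its members, and the 1 KiB default are illustrative assumptions, not an established API. Each chunk is allocated once and never resized, so a pointer handed out by insert() stays valid until clear() is called; only the small vector of chunk owners may reallocate.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <string>
#include <vector>

// Hypothetical chunked pool; names and the 1 KiB default are illustrative.
class ChunkedStringPool {
public:
    explicit ChunkedStringPool(std::size_t chunk_size = 1024)
        : chunk_size_(chunk_size) {}

    // Copies [data, data + len) into the pool, appends '\0', and returns a
    // pointer that stays valid until clear() or destruction.
    const char* insert(const char* data, std::size_t len) {
        assert(data != nullptr || len == 0);
        const std::size_t needed = len + 1;
        if (chunks_.empty() || needed > remaining_) {
            // Start a new chunk; an oversized sequence gets a chunk of its own
            // size. The unused tail of the previous chunk is simply wasted.
            const std::size_t size = std::max(chunk_size_, needed);
            chunks_.push_back(std::unique_ptr<char[]>(new char[size]));
            cursor_ = chunks_.back().get();
            remaining_ = size;
        }
        char* dest = cursor_;
        if (len != 0) std::memcpy(dest, data, len);
        dest[len] = '\0';
        cursor_ += needed;
        remaining_ -= needed;
        return dest;
    }

    const char* insert(const std::string& s) {
        return insert(s.data(), s.size());
    }

    // Frees every chunk; all previously returned pointers dangle afterwards.
    void clear() {
        chunks_.clear();
        cursor_ = nullptr;
        remaining_ = 0;
    }

private:
    std::size_t chunk_size_;
    std::vector<std::unique_ptr<char[]>> chunks_;  // owners may move, buffers do not
    char* cursor_ = nullptr;
    std::size_t remaining_ = 0;
};
```

Unlike the set-based options, this sketch does not deduplicate: inserting the same sequence twice stores two copies.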

I'd be interested in hearing how others would recommend approaching this.

UPDATE 1: edited to clarify my use of the phrase "sequence of chars" to be equivalent to the general notion of a "string" without implying either std::string or null-terminated char array.



1 Reply


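I've used this approach in the past. std::set is node-based, so inserting never relocates existing elements, and the returned pointers stay valid until the element is erased or the set is cleared: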

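#include <set>
#include <string>
using namespace std;

using Atom = const char*;

// Returns a stable pointer to the pooled copy of value.
Atom make_atom(string const& value)
{
    static set<string> interned;  // node-based: elements never move
    return interned.insert(value).first->c_str();
}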

Obviously, if you want/need to clear the set, you'd make it available in some wider scope.
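
As a minimal sketch of that wider scope, the set could live in a small class that exposes clear(). The name AtomPool and its members are hypothetical, chosen here only for illustration:

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <utility>

using Atom = const char*;

// Hypothetical wrapper; the name AtomPool and its members are illustrative.
class AtomPool {
public:
    // Returns a pointer to the pooled copy of value. Equal strings share one
    // entry, so interning the same text twice yields the same pointer.
    Atom intern(std::string value) {
        return interned_.insert(std::move(value)).first->c_str();
    }

    // Destroys every pooled string; all previously returned Atoms dangle.
    void clear() { interned_.clear(); }

    std::size_t size() const { return interned_.size(); }

private:
    std::set<std::string> interned_;
};
```

Callers would hold one AtomPool per scope whose strings share a lifetime and call clear() at the well-defined points mentioned in the question. No Atom obtained before a clear() may be used after it.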

For even more efficiency, move or emplace the strings into the set instead of copying them.

Update: I've added the emplace approach for completeness. See it Live on Coliru

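#include <iostream>
#include <set>
#include <string>
#include <type_traits>  // enable_if, is_constructible
#include <utility>      // forward
using namespace std;

using Atom = const char*;

// One pool shared by every emplace_atom instantiation; a static local inside
// the template would give each argument-type combination its own set.
set<string>& atom_pool()
{
    static set<string> interned;
    return interned;
}

// Constructs the string in place from any arguments std::string accepts.
template <typename... Args>
typename enable_if<
    is_constructible<string, Args...>::value, Atom
>::type emplace_atom(Args&&... args)
{
    return atom_pool().emplace(forward<Args>(args)...).first->c_str();
}

int main() {
    cout << emplace_atom("Hello World\n");
    cout << emplace_atom(80, '=');
}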
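
On the unordered_set option raised in the question: the standard specifies that rehashing an unordered container invalidates iterators but not pointers or references to its elements, so the same idea should carry over. A minimal sketch follows, where the function name make_atom_hashed is a hypothetical stand-in:

```cpp
#include <string>
#include <unordered_set>

using Atom = const char*;

// Hypothetical variant of make_atom backed by a hash set.
Atom make_atom_hashed(const std::string& value)
{
    static std::unordered_set<std::string> interned;
    // Growth triggers a rehash, which relinks nodes into new buckets but does
    // not move the std::string elements, so earlier pointers stay valid.
    return interned.insert(value).first->c_str();
}
```

Lookup and insertion are average constant time here versus logarithmic for std::set; which is faster in practice depends on string lengths and the hash, so it is worth measuring.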
