Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
468 views
in Technique[技术] by (71.8m points)

c++ - How to store binary data when you only care about speed?

I have N points in D dimensions, where let's say N is 1 million and D 1 hundred. All my points have binary coordinates, i.e. {0, 1}^D, and I am only interested in speed.

Currently my implementation uses std::vector<int>. I am wondering if I could benefit in terms of faster execution by changing my . I am only doing insertions and searches (I don't change the bits).

All related questions I found mention std::vector<char>, std::vector<bool> and std::bitset, but all mention the space benefits one should get by using such structures.

What's the appropriate data structure, when speed is of main concern, for binary data in C++?


I intend to populate my data structure with the binary data and then do a lot of contiguous searches (I mean that I don't really care for the i-th coordinate of a point, if I am accessing a point I will access all of its coordinates continuously). I will compute the Hamming distance between each other.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

If the values are independently, uniformly distributed, and you want to find the Hamming distance between two independently, randomly chosen points, the most efficient layout is a packed array of bits.

This packed array would ideally be chunked into the largest block size over which your popcnt instruction works: 64 bits. The hamming distance is the sum of popcnt(x_blocks[i] ^ y_blocks[i]). On processors with efficient unaligned accesses, byte alignment with unaligned reads is likely to be most efficient. On processors where unaligned reads incur a penalty, one should consider whether the memory overhead of aligned rows is worth faster logic.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...