Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
190 views
in Technique[技术] by (71.8m points)

python - Why is iterating over a dict so slow?

I have a script that does a lot of dict deletions and eventually iterates over it.

I've managed to reduce it to a simple benchmark:

> py -m timeit -s "a = {i:i for i in range(10000000)};[a.pop(i) for i in range(10000000-1)]" "next(iter(a))"
10 loops, best of 5: 30.8 msec per loop

How come iterating over a single key after I've deleted all previous values becomes slow?

question from:https://stackoverflow.com/questions/65895767/why-is-iterating-over-a-dict-so-slow

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Since 3.6, Python dictionaries work with an internal hash table and an array of entries.

When a key is removed from the dictionary, its entry is actually replaced in the array with a dummy value marking the entry as deleted.

Upon iteration, it skips all of these dummy values one by one, until it finds the next real item.

That's why if you'll skip the first value, and remove only the rest, you'll see the iteration is as fast as iterating over a single item dictionary:

> py -m timeit -s "a = {i:i for i in range(10000000)};[a.pop(i) for i in range(1,10000000-1)]" "next(iter(a))"
1000000 loops, best of 5: 219 nsec per loop

For more information about the internal dictionary structure, you may see this wonderful answer.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...