Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
611 views
in Technique[技术] by (71.8m points)

indexing - mysql not using index?

I have a table with columns like word, A_, E_, U_ .. these columns with X_ are tinyints having the value of how many times the specific letter exists in the word (to later help optimize the wildcard search query).

There is totally 252k rows. If i make search like WHERE u_ > 0 i get 60k rows. But if i do the explain of that select, it says there is 225k rows to go through and no index possible. Why? Column was added as index. Why it doesn't say there is 60k rows to go through and that possible key is U_?

enter image description here

listing the indexes on table (also strange that others are groupped under A_ index)

enter image description here

In comparison if i run query: where id > 250000 i get 2983 results, and if i do explain of that select it says there is 2982 rows and key to be used primary.

Btw if i group by U_ i get this: (but probably doesnt matter much because i already said the query returns 60k results)

enter image description here

EDIT:

If i create column U (varchar(1)) and do the update U = 'U' where U_ > 0, then if i do the select WHERE U = 'U' i get also 60k rows (obviously), but if i do explain i get this:

enter image description here

Still not so good (rows 120k not 60k) but at least better than rows 225k in previous case. Although this solution is bit more piggy that than the first one, but maybe bit more efficient.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

My experience is that MySQL chooses to do a tablescan, even if there is an index on the column you're searching, if your query would select more than approximately 25% of the rows in the table.

The reason for this is that using a secondary index in InnoDB is a bit more work than using a primary index.

  1. Look up value in secondary index, like your index on u_.
  2. Read index entry, and find corresponding primary key value(s) of rows where that value in u_ is stored.
  3. Look up row(s) by primary key.

It's actually at least double the work to look up by secondary key. This isn't a problem if you ultimately match a small minority of rows of the table, and there are definitely cases where a secondary index is really important for your query. So don't be reluctant to use secondary indexes.

But if your query matches too many rows, and that becomes a big portion of the table, then it would be less work to just scan the table start-to-finish.

By analogy, why doesn't the index at the back of a book contain the word "the"? Because the entry would naturally list every single page in the book, and it would be a waste for you to refer to the index and then use it to guide you to each page in the main part of the book. You would have been better off just reading the book.

MySQL does not have any officially documented threshold for choosing a tablescan over an indexed search. The 25% figure is only my experience (actually sometimes it seems closer to 21%, but I don't know the code well enough to understand exactly how the threshold is calculated).

I've seen cases where the proportion of rows matched was very close to whatever threshold is in the implementation, and the behavior of the optimizer can actually flip-flop from one query to the next, resulting in highly variable performance.

If this case applies to you, you can use an index hint to make MySQL's optimizer pretend that a tablescan is prohibitively expensive, and it should prefer an index to a tablescan. This is done with the FORCE INDEX hint.

SELECT * FROM words FORCE INDEX(U_) WHERE U_ > 0

I still try to use index hints conservatively. They aren't necessary except in rare cases, and using an index hint means your query must include the index name. This makes it hard to change indexes without breaking your application code.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...