Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
448 views
in Technique[技术] by (71.8m points)

indexing - Boosting Lucene Terms When Building the Index

Is it possible to determine that specific terms are more important then other when creating the index (not when querying it) ?

Consider for example a synonym filter:
doc 1: "this is a nice car"
doc 2: "this is a nice vehicle"

I want to add the term vehicle to the first doc and the term car to the second doc, but I want that if later the index is queried with the word car then the first document will be scored higher then the second one and if queried for vehicle it will be the other way around.

Will calling setBoost on the fields before adding them to their respective documents do the trick?

Or maybe I should add the synonyms to a different field name?

Or am I looking at this from a wrong point of view ?

Thanks

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Setting boost on a filed affects all terms in that field so this wouldn't work in your case.

But it should be posible using Lucene payloads (a byte array that can be set for every term). You would use them to set term specific boosts (vehicle to 0.5 for doc 1, for example). Then you'll implement your own Similarity and override scorePayload() method to decode that boost and then use PayloadTermQuery which allows you to contribute to the score based on the boots you have in the payload for that term.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...