Consider improving interaction between stop words and phrase search #793
-
Hi @LukasKalbertodt, thanks for bringing this to our attention. You make a good point about stop words affecting phrase search for your use case. We’ll bring this up with the team and share an update here. As a heads-up, this will probably happen in January since most of the team will be off for the holidays in the coming weeks.
-
Hello @LukasKalbertodt and @macraig, this is an old bug (it may be related to v1.0.0), and v1.12 will change the behavior without fixing the bug: it will not return documents anymore. 🤔 The proper way of computing a phrase search would be to treat stop words as "blank spaces"/"joker terms" that can be replaced by any word. For instance, the phrase "tour of the moon" would effectively become "tour __ __ moon", so Meilisearch would try to match any document containing "tour" and "moon" separated by exactly 2 words. Stop words are not considered searchable by Meilisearch and are completely skipped during the indexing process, which is why it would be difficult to retrieve the exact same phrase. 😞
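To make the "joker terms" idea concrete, here is a minimal Python sketch. It is not Meilisearch's implementation: the stop-word list, the tokenizer, and the function names (`tokenize`, `phrase_pattern`, `matches_phrase`) are all assumptions made for illustration, and it works on raw document text rather than on an index.

```python
import re

# Hypothetical stop-word list, chosen only for this illustration.
STOP_WORDS = {"of", "the", "a"}


def tokenize(text):
    """Lowercase the text and split it into word tokens (a simplification)."""
    return re.findall(r"\w+", text.lower())


def phrase_pattern(phrase, stop_words=STOP_WORDS):
    """Turn a phrase into a pattern where each stop word becomes a wildcard (None)."""
    return [None if token in stop_words else token for token in tokenize(phrase)]


def matches_phrase(document, phrase, stop_words=STOP_WORDS):
    """Return True if some window of the document fits the wildcard pattern."""
    pattern = phrase_pattern(phrase, stop_words)
    if not pattern:
        return False
    tokens = tokenize(document)
    width = len(pattern)
    for start in range(len(tokens) - width + 1):
        window = tokens[start:start + width]
        # A wildcard slot accepts any word; other slots must match exactly.
        if all(p is None or p == w for p, w in zip(pattern, window)):
            return True
    return False


print(phrase_pattern("tour of the moon"))
print(matches_phrase("A guided tour of the moon base", "tour of the moon"))
print(matches_phrase("A tour around our moon", "tour of the moon"))
print(matches_phrase("The moon tour", "tour of the moon"))
```

Under these assumptions, the second document ("tour around our moon") also matches, because the two wildcard slots accept any words. That illustrates why this approach could not tell the exact phrase apart from a phrase with different words in the stop-word positions.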
-
I just tried using stop words with Meili for the first time. My main goal was to reduce the number of match positions returned by the API, as I'm using `showMatchesPosition`. But of course I also wanted to improve relevancy, search speed, indexing speed and index size.

But I noticed that stop words basically make phrase search unusable. It still works for phrases that don't contain any stop words, like full names, so it's still useful. But as soon as the phrase contains a stop word, two things happen:

`_matchesPosition` does not contain the match.

Here you can see an example from our production application. Left is without stop words, right is with. The second and third results on the left have an exact match in the subtitles (just in case you were confused why they appeared). The second result on the right does not contain `moon` anywhere in the document.

I checked Google and there, phrase search also considers stop words. Well, at least I suspect that Google uses stop words somehow when not using phrase search. Sorry for the ego-centric example, but it's the easiest I could come up with: searching for `"performs the perspective divide and uses the angular radius as field of view"` only brings up exactly one result (well, I suppose it will soon also bring up this very discussion...). And the exact phrase is highlighted in the result, including `the`. Changing the `the` to `a` does not return any results.

I have not really found a way to work around this problem. From my perspective, using stop words cripples phrase search, making it completely useless in many situations and very likely confusing users. That's not really an option for us, so it seems we cannot use stop words at all?

This seems like a suboptimal solution, and I wanted to open this discussion to talk about it and to ask whether you have considered ways to improve the situation. I think it would be best if phrase search still worked as before, unaffected by any configured stop words. But I suspect that implementing this is not trivial?
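For anyone who wants to reproduce this, here is a rough sketch of the two requests involved, written as plain Python data so nothing is actually sent. The index uid `videos` and the stop-word list are hypothetical, not our real configuration. The routes and the `showMatchesPosition` parameter are the ones I understand the Meilisearch API to expose, so please check them against the docs for your version.

```python
import json

INDEX_UID = "videos"  # hypothetical index name, for illustration only

# Request that configures stop words on the index (hypothetical word list).
stop_words_request = {
    "method": "PUT",
    "path": f"/indexes/{INDEX_UID}/settings/stop-words",
    "body": ["the", "of", "a"],
}


def phrase_search_request(phrase):
    """Build a phrase-search request; the double quotes mark the phrase."""
    return {
        "method": "POST",
        "path": f"/indexes/{INDEX_UID}/search",
        "body": {"q": f'"{phrase}"', "showMatchesPosition": True},
    }


search_request = phrase_search_request("tour of the moon")
print(json.dumps(stop_words_request, indent=2))
print(json.dumps(search_request, indent=2))
```

Running the search once before and once after applying the stop-words setting, and comparing the hits and their `_matchesPosition` fields, should show the difference described above.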