Open
Labels
enhancement (New feature or request)
Description
Is there an existing issue?
- I have searched existing issues
Use case
Imagine you search for “AI in healthcare” in a news app.
A plain `nearestNeighborsF32` search might return the 10 most relevant articles, but 8 of them may be about chatbots for hospitals and say essentially the same thing.
With Maximal Marginal Relevance (MMR), the system instead returns:
- An article about chatbots in hospitals
- One about AI in medical imaging
- One about drug discovery
- One about patient data analysis
This way, you still get relevant results for your query, but also diverse perspectives, rather than numerous near-duplicates.
Proposed solution
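- Support Maximal Marginal Relevance (MMR): a method for retrieving documents that balances similarity to the query with diversity among the items already selected. In its standard formulation, each step picks the candidate that maximizes:

  $$\mathrm{MMR} = \arg\max_{d_i \in R \setminus S} \left[ \lambda \, \mathrm{sim}(d_i, q) - (1 - \lambda) \max_{d_j \in S} \mathrm{sim}(d_i, d_j) \right]$$

  where $q$ is the query, $R$ is the candidate set, $S$ is the set of items already selected, and $\lambda \in [0, 1]$ trades relevance (high $\lambda$) against diversity (low $\lambda$).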
- Alternative solution: expose an API for calculating the similarity between two vector embeddings, allowing developers to implement MMR within their own applications.
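
To make the request concrete, here is a minimal sketch of greedy MMR re-ranking as it could be done on the application side. It is written in Python purely for illustration and is not part of any existing API: `cosine_similarity`, `mmr_select`, and the `lambda_` parameter are hypothetical names, and I am assuming the candidate embeddings come from a nearest-neighbor search that over-fetches (for example, 50 hits when 10 are wanted).

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def mmr_select(query, candidates, k, lambda_=0.5):
    """Greedily pick up to k candidate indices using the MMR criterion.

    Hypothetical helper: lambda_=1.0 ranks by relevance only,
    lower values penalize similarity to items already selected.
    """
    # Relevance of each candidate to the query, computed once.
    query_sim = [cosine_similarity(query, c) for c in candidates]
    remaining = list(range(len(candidates)))
    selected = []

    while remaining and len(selected) < k:

        def score(i):
            # Redundancy: highest similarity to anything already selected.
            redundancy = max(
                (cosine_similarity(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            return lambda_ * query_sim[i] - (1.0 - lambda_) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)

    return selected


if __name__ == "__main__":
    query = [1.0, 0.0]
    # Candidates 0 and 1 are near-duplicates; candidate 2 points elsewhere.
    candidates = [[1.0, 0.1], [1.0, 0.11], [1.0, -1.0]]
    print(mmr_select(query, candidates, k=2, lambda_=0.5))
    print(mmr_select(query, candidates, k=2, lambda_=1.0))
```

With `lambda_=1.0` the second pick is the near-duplicate, while `lambda_=0.5` skips it in favor of the less similar candidate. The only building block this needs from the database is a way to compute the similarity between two stored embeddings, which is what the alternative solution asks for.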
Additional context
- I’d be happy to contribute if you can point me to where I should start.