Adding Maximal Marginal Relevance (MMR) for searching #750

@marshelino-maged

Is there an existing issue?

Use case

Imagine you search for “AI in healthcare” in a news app.

A normal nearestNeighborsF32 search might return the 10 most relevant articles, but 8 of them may cover chatbots for hospitals and say essentially the same thing.

With Maximal Marginal Relevance (MMR), the system instead returns:

  • An article about chatbots in hospitals
  • One about AI in medical imaging
  • One about drug discovery
  • One about patient data analysis

This way, you still get relevant results for your query, but also diverse perspectives, rather than numerous near-duplicates.

Proposed solution

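  • Support Maximal Marginal Relevance (MMR): a method for retrieving documents that balances similarity to the query with diversity among the selected items. In its standard formulation, each step selects the next item as:
    $$\mathrm{MMR} = \arg\max_{d_i \in R \setminus S} \left[ \lambda \, \mathrm{Sim}_1(d_i, q) - (1 - \lambda) \max_{d_j \in S} \mathrm{Sim}_2(d_i, d_j) \right]$$
    where q is the query, R is the candidate set, S is the set of items already selected, and λ ∈ [0, 1] trades relevance (λ = 1) against diversity (λ = 0).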
  • Alternative solution: expose an API for calculating the similarity between two vector embeddings, allowing developers to implement MMR within their own applications.
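
To make the proposal concrete, here is a minimal sketch of greedy MMR re-ranking in plain Python. It is an illustration only, not existing library code: the names cosine_similarity, mmr_select, and lambda_ are hypothetical, cosine similarity is assumed for both the relevance and the redundancy term, and the candidates are assumed to be embeddings already fetched by a larger nearest-neighbor search.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumption: a zero vector is similar to nothing
    return dot / (norm_a * norm_b)


def mmr_select(query, candidates, k, lambda_=0.5):
    """Return indices of up to k candidates, chosen greedily by MMR.

    lambda_ = 1.0 ranks purely by relevance; lower values favor diversity.
    """
    relevance = [cosine_similarity(query, c) for c in candidates]
    selected = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:
        best_idx, best_score = None, -math.inf
        for i in remaining:
            # Redundancy: highest similarity to anything already selected
            # (0.0 while nothing has been selected yet).
            redundancy = max(
                (cosine_similarity(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            score = lambda_ * relevance[i] - (1.0 - lambda_) * redundancy
            if score > best_score:
                best_idx, best_score = i, score
        selected.append(best_idx)
        remaining.remove(best_idx)
    return selected


# Toy data: candidates 0 and 1 are near-duplicates, candidate 2 is
# less relevant to the query but points in a different direction.
query = [1.0, 0.0]
candidates = [[1.0, 0.1], [1.0, 0.12], [0.7, 0.7]]
print(mmr_select(query, candidates, k=2, lambda_=0.3))  # [0, 2]
print(mmr_select(query, candidates, k=2, lambda_=1.0))  # [0, 1]
```

With lambda_ = 1.0 the near-duplicate is returned second, which mirrors the plain nearest-neighbor behavior described in the use case. With lambda_ = 0.3 the redundancy penalty pushes the distinct candidate ahead of it. The same loop would work on top of a built-in similarity function if the alternative solution were adopted.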

Additional context

  • I’d be happy to contribute if you can point me to where I should start.

Labels: enhancement (New feature or request)

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions