Conversation

Contributor

@ishaan-jaff ishaan-jaff commented Dec 8, 2025

[Feat] New model - add nvidia nim llama-3.2-nv-rerankqa-1b-v2

Adds support for the NVIDIA NIM /v1/ranking endpoint, which is used by rerank models such as nvidia/llama-3.2-nv-rerankqa-1b-v2.
Some NVIDIA NIM rerank models are served from /v1/ranking instead of the default /v1/retrieval/{model}/reranking endpoint. Users can now force requests to /v1/ranking by adding the ranking/ prefix to the model name.
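
The prefix-based routing described above can be sketched as follows. This is a minimal illustration, not the code merged in this PR: the function name `resolve_nim_rerank_route`, the constants, and the `api_base` value are hypothetical, and the sketch assumes that whatever follows the `ranking/` prefix is sent as the `model` field of the request.

```python
from typing import Tuple

# Hypothetical constants for illustration; not taken from the LiteLLM source.
PROVIDER_PREFIX = "nvidia_nim/"
RANKING_PREFIX = "ranking/"


def resolve_nim_rerank_route(model: str, api_base: str) -> Tuple[str, str]:
    """Return (url, payload_model) for an NVIDIA NIM rerank model name.

    Assumption: the text after "ranking/" is used as the model name
    in the request payload.
    """
    base = api_base.rstrip("/")

    # Drop the provider prefix if the caller passed the full model string.
    if model.startswith(PROVIDER_PREFIX):
        model = model[len(PROVIDER_PREFIX):]

    # "ranking/" forces the /v1/ranking endpoint.
    if model.startswith(RANKING_PREFIX):
        payload_model = model[len(RANKING_PREFIX):]
        return f"{base}/v1/ranking", payload_model

    # Otherwise fall back to the default per-model reranking endpoint.
    return f"{base}/v1/retrieval/{model}/reranking", model
```

With this sketch, `nvidia_nim/ranking/nvidia/llama-3.2-nv-rerankqa-1b-v2` resolves to the `/v1/ranking` URL with `nvidia/llama-3.2-nv-rerankqa-1b-v2` as the payload model, while the same name without `ranking/` resolves to the default `/v1/retrieval/{model}/reranking` URL. In LiteLLM itself, the prefixed name would presumably be passed as the `model` argument of `litellm.rerank(...)`.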

Relevant issues

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

  • I have added testing in the tests/litellm/ directory; adding at least 1 test is a hard requirement - see details
  • I have added a screenshot of my new test passing locally
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible, it only solves 1 specific problem

Type

🆕 New Feature
✅ Test

Changes

vercel bot commented Dec 8, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Preview | Comments | Updated (UTC) |
| --- | --- | --- | --- | --- |
| litellm | Ready | Preview | Comment | Dec 8, 2025 9:48pm |

@ishaan-jaff ishaan-jaff merged commit 601da4a into main Dec 8, 2025
54 of 59 checks passed

@sytianhe sytianhe left a comment

Looks good other than one question

**Ranking Endpoint (`/v1/ranking`):**

```
model: nvidia_nim/ranking/nvidia/llama-3.2-nv-rerankqa-1b-v2
```

So anything after nvidia_nim/ranking/ will be used as model field in the payload to the /v1/ranking endpoint, right?
