Skip to main content

[PRE-RELEASE] v1.73.6-stable

Krrish Dholakia
CEO, LiteLLM
Ishaan Jaffer
CTO, LiteLLM
warning

Known Issues​

The non-root docker image has a known issue around the UI not loading. If you use the non-root docker image we recommend waiting before upgrading to this version. We will post a patch fix for this.

Deploy this version​

This release is not out yet. The pre-release will be live on Sunday and the stable release will be live on Wednesday.


New Models / Updated Models​

Updated Models​

Bugs​

  • Sambanova
    • Handle float timestamps - PR s/o @neubig
  • Azure
    • support Azure Authentication method (azure ad token, api keys) - PR @hsuyuming
    • Map ‘image_url’ str as nested dict - PR s/o davis-featherstone
  • Watsonx
    • Set ‘model’ field to None when model is part of a custom deployment - fixes error raised by WatsonX in those cases - PR s/o @cbjuan
  • Perplexity
    • Support web_search_options - PR
    • Support citation token and search queries cost calculation - PR
  • Anthropic
    • Null value in usage block handling - PR

Features​

  • Azure OpenAI
    • Check if o-series model supports reasoning effort (enables drop_params to work for o1 models)
    • Add o3-pro model pricing
    • Assistant + tool use cost tracking - PR
  • OpenRouter
    • Add Mistral 3.2 24B to model mapping
  • Gemini (Google AI Studio + VertexAI)
    • Only use accepted format values (enum and datetime) - else gemini raises errors - PR
    • Cache tools if passed alongside cached content (else gemini raises an error) - PR
    • Json schema translation improvement: Fix unpack_def handling of nested $ref inside anyof items - PR
  • NVIDIA Nim
    • Add ‘response_format’ param support - PR @shagunb-acn 
  • Mistral
    • Fix thinking prompt to match hugging face recommendation - PR
    • Add supports_response_schema: true for all mistral models except codestral-mamba - PR
  • Ollama
    • Fix unnecessary await on embedding calls - PR
  • OpenAI
    • New o3 and o4-mini deep research models - PR
  • ElevenLabs
    • New STT provider - PR
  • Deepseek
    • Add deepseek-r1 + deepseek-v3 cost tracking - PR

LLM API Endpoints​

Features​

  • MCP
    • Send appropriate auth string value to /tool/call endpoint with x-mcp-auth - PR s/o @wagnerjt
  • /v1/messages
    • Custom LLM support - PR
  • /chat/completions
    • Azure Responses API via chat completion support - PR
  • /responses
    • Add reasoning content support for non-openai providers - PR
  • [NEW] /generateContent
    1. New endpoints for gemini cli support https://github.com/BerriAI/litellm/pull/12040
    2. Support calling Google AI Studio / VertexAI Gemini models in their native format - https://github.com/BerriAI/litellm/pull/12046
    3. Add logging + cost tracking for stream + non-stream vertex/google ai studio routes - https://github.com/BerriAI/litellm/pull/12058
    4. Add Bridge from generateContent to /chat/completions - https://github.com/BerriAI/litellm/pull/12081
  • /batches
    • Filter deployments to only those where managed file was written to - PR
    • Save all model / file id mappings in db (previously it was just the first one) - enables ‘true’ loadbalancing - PR
    • Support List Batches with target model name specified - PR

Spend Tracking / Budget Improvements​

Features​

  • Passthrough
    • Bedrock cost tracking (/invoke + /converse routes) on streaming + non-streaming - PR
    • VertexAI - anthropic cost calculation support - PR
  • Batches
    • Background job for cost tracking LiteLLM Managed batches - PR

Management Endpoints / UI​

Bugs​

  • General UI
    • Fix today selector date mutation in dashboard components - PR
  • Usage
    • Aggregate usage data across all pages of paginated endpoint - PR
  • Teams
    • De-duplicate models in team settings dropdown - PR
  • Models
    • Preserve public model name when selecting ‘test connect’ with azure model (previously would reset) - PR
  • Invitation Links
    • Ensure Invite links email contain the correct invite id when using tf provider - PR

Features​

  • Models
    • Add ‘last success’ column to health check table - PR
  • MCP
    • New UI component to support auth types: api key, bearer token, basic auth - PR s/o @wagnerjt
    • Ensure internal users can access /mcp and /mcp/ routes - PR
  • SCIM
    • Ensure default_internal_user_params are applied for new users - PR
  • Team
    • Support default key expiry for team member keys - PR
    • Expand team member add check to cover user email - PR
  • UI
    • Restrict UI access by SSO group - PR
  • Keys
    • Add new new_key param for regenerating key - PR
  • Test Keys
    • New ‘get code’ button for getting runnable python code snippet based on ui configuration - PR

Logging / Guardrail Integrations​

Bugs​

  • Braintrust
    • Adds model to metadata to enable braintrust cost estimation - PR

Features​

  • Callbacks
    • (Enterprise) - disable logging callbacks in request headers - PR
    • Add List Callbacks API Endpoint - PR
  • Bedrock Guardrail
    • Don't raise exception on intervene action - PR
    • Ensure PII Masking is applied on response streaming or non streaming content when using post call - PR
  • [NEW] Palo Alto Networks Prisma AIRS Guardrail
  • ElasticSearch
    • New Elasticsearch Logging Tutorial - PR
  • Message Redaction
    • Preserve usage / model information for Embedding redaction - PR

Performance / Loadbalancing / Reliability improvements​

Bugs​

  • Team-only models
    • Filter team-only models from routing logic for non-team calls
  • Context Window Exceeded error
    • Catch anthropic exceptions - PR

Features​

  • Router
    • allow using dynamic cooldown time for a specific deployment - PR
    • handle cooldown_time = 0 for deployments - PR
  • Redis
    • Add better debugging to see what variables are set - PR

General Proxy Improvements​

Bugs​

  • aiohttp
    • Check HTTP_PROXY vars in networking requests
    • Allow using HTTP_ Proxy settings with trust_env

Features​

  • Docs
    • Add recommended spec - PR
  • Swagger
    • Introduce new environment variable NO_REDOC to opt-out Redoc - PR

New Contributors​

Git Diff​