🚀 Xinference

Release Notes v1.12.0

✨ Key Highlights

🧠 New Model Support
Qwen3-Omni — Multi-modal understanding and generation (unified processing of text, image, and speech)
DeepSeek-OCR — High-precision document recognition model
Jina-Reranker-V3 — High-performance model for semantic reranking tasks
🐍 Python 3.13 Support
Xinference now installs and runs on Python 3.13.
🖥️ OCR Gradio UI
Built-in OCR graphical interface for easy testing and integration of document recognition capabilities.

🌍 Community Edition Updates

📦 Installation Methods

Pip Installation: pip install 'xinference==1.12.0'

Docker Usage: Pull the latest image, or upgrade with pip inside an existing container
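
After installing or upgrading, it can be useful to confirm which version actually ended up in the environment (host or container). The following is a minimal sketch using only the Python standard library; the helper names `installed_version` and `check_install` are hypothetical and not part of Xinference, and the check assumes the distribution is published under the name `xinference`, as in the pip command above.

```python
import sys
from importlib.metadata import PackageNotFoundError, version

# Release described in these notes.
EXPECTED_VERSION = "1.12.0"


def installed_version(package="xinference"):
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def check_install(expected=EXPECTED_VERSION):
    """Build a one-line status message comparing the installed and expected versions."""
    found = installed_version()
    py = f"{sys.version_info.major}.{sys.version_info.minor}"
    if found is None:
        return f"xinference is not installed (Python {py})"
    if found != expected:
        return f"xinference {found} found, expected {expected} (Python {py})"
    return f"xinference {expected} is installed (Python {py})"


if __name__ == "__main__":
    print(check_install())
```

Running the script inside the container after a pip upgrade shows whether the upgrade took effect and which Python version the environment uses.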

🧠 New Model Support

  • Qwen3-Omni
  • DeepSeek-OCR
  • Jina-Reranker-V3

✨ New Features

  • Python 3.13 support
  • Added OCR Gradio UI

⚙️ Build & Optimization

  • Fixed compatibility issues with torchaudio 2.9
  • Fixed Python 3.12 build configuration and Dockerfile
  • Optimized transformers and cu128 image dependencies

🧩 Bug Fixes

  • Fixed random characters appearing in Qwen3 model output
  • Fixed IndexTTS2 errors with transformers 4.57.1
  • Fixed Docker errors in OAuth2 mode
  • Fixed Qwen3-VL startup exceptions
  • Fixed progress bar display issues

🏢 Enterprise Edition Updates

🟢 0-Replica Model Startup Support
Allows models to remain in a "started but with zero replicas" state, which supports dynamic scaling and resource management.
🛡️ Stability Enhancements
Fixed multiple compatibility and reliability issues, improving system stability for large-scale deployments.