Xinference v1.12.0 Release Notes

✨ Key Highlights

🧠 New Model Support

Qwen3-Omni: Multi-modal understanding and generation (unified processing of text, image, and speech)

DeepSeek-OCR: High-precision document recognition model

Jina-Reranker-V3: High-performance model for semantic reranking tasks

🐍 Python 3.13 Support

Xinference now runs on Python 3.13, the latest Python runtime.

🖥️ OCR Gradio UI

Built-in OCR graphical interface for easy testing and integration of document recognition capabilities.

🌍 Community Edition Updates

📦 Installation Methods

Pip Installation: pip install 'xinference==1.12.0'

Docker Usage: Pull the latest image or update with pip in the container
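
After either installation route, it can be useful to confirm that the environment actually picked up this release. The following is a minimal sketch using only the Python standard library; the helper names installed_version and matches_release are hypothetical and are not part of Xinference itself.

```python
from importlib import metadata

# The release described in these notes.
EXPECTED_VERSION = "1.12.0"


def installed_version(package="xinference"):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def matches_release(found, expected=EXPECTED_VERSION):
    """True only when a version was found and it equals the expected release."""
    return found is not None and found == expected


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("xinference is not installed in this environment")
    elif matches_release(found):
        print(f"xinference {found} is installed")
    else:
        print(f"xinference {found} is installed, expected {EXPECTED_VERSION}")
```

Run it with the same interpreter that was used for pip install, for example inside the container when updating a Docker image in place.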

🎕️ New Model Support

Qwen3-Omni
DeepSeek-OCR
Jina-Reranker-V3

✨ New Features

Python 3.13 support
Added OCR Gradio UI

⚙️ Build & Optimization

Fixed compatibility issues with torchaudio 2.9
Fixed Python 3.12 build configuration and Dockerfile
Optimized transformers and cu128 image dependencies

🧩 Bug Fixes

Fixed random character issues in Qwen3 models
Fixed IndexTTS2 errors with transformers 4.57.1
Fixed Docker errors in OAuth2 mode
Fixed Qwen3-VL startup exceptions
Fixed progress bar display issues

🏢 Enterprise Edition Updates

🟢 0-Replica Model Startup Support

Models can now stay in a started state with zero replicas, which makes dynamic scaling and resource management easier.

🛡️ Stability Enhancements

Fixed multiple compatibility and reliability issues, improving system stability for large-scale deployments.

For more information and documentation, visit:

🌐 xinference.io
