🚀 Xinference v2.1.0 Release Notes
✅ Highlights
🧠 GLM-4.7 / GLM-4.7-Flash Support
Added full support for GLM-4.7 and GLM-4.7-Flash, further expanding the GLM model ecosystem.
🎤 Qwen3-ASR Series Launched
New additions:
- Qwen3-ASR-0.6B
- Qwen3-ASR-1.7B
Full support for the Qwen3-ASR speech recognition series, covering both lightweight and high-performance scenarios.
🖼️ FLUX.2-Klein Series Support
New additions:
- FLUX.2-Klein-4B
- FLUX.2-Klein-9B
These models enhance image generation and editing capabilities, further extending FLUX ecosystem support.
🔁 MinerU2.5-2509-1.2B Adjustment
Updated the MinerU2.5-2509-1.2B model, refining its configuration and adaptation process.
🌐 Community Edition Updates
📦 Installation
- pip:
pip install 'xinference==2.1.0'
- Docker: Pull the latest image, or update via pip inside the container.
🆕 New Model Support
- GLM-4.7
- GLM-4.7-Flash
- Qwen3-ASR-0.6B / 1.7B
- FLUX.2-Klein-4B / 9B
🛠 Enhancements
- Updated DeepSeek-V3.2 / DeepSeek-V3.2-Exp model configurations.
- Optimized image build dependencies (constrained setuptools < 82).
- Refactored API layer structure:
- Extracted Pydantic request Schemas.
- Modularized route registration for clearer code structure.
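The refactor above follows a common FastAPI-style pattern: request bodies live in standalone Pydantic schema modules, and each feature area registers its routes through its own router. A minimal sketch of the schema-extraction half, assuming Pydantic is available; the `ChatCompletionRequest` fields here are illustrative, not Xinference's actual schema:

```python
from typing import List, Optional

from pydantic import BaseModel, Field, ValidationError


# Hypothetical request schema, extracted into its own module so that
# route handlers declare a typed body instead of parsing raw JSON.
class Message(BaseModel):
    role: str
    content: str


class ChatCompletionRequest(BaseModel):
    model: str
    messages: List[Message]
    temperature: float = Field(default=1.0, ge=0.0, le=2.0)
    max_tokens: Optional[int] = None


# Validation happens declaratively when the schema is instantiated.
req = ChatCompletionRequest(
    model="glm-4.7",
    messages=[{"role": "user", "content": "hello"}],
)
print(req.temperature)  # default of 1.0 applied

# Out-of-range values are rejected by the field constraints.
try:
    ChatCompletionRequest(model="m", messages=[], temperature=5.0)
except ValidationError:
    print("temperature out of range rejected")
```

Keeping schemas in their own modules lets every route module import the same request types, which is what makes the per-feature route registration clean to split up.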
🐞 Bug Fixes
- Fixed vLLM embedding model error.
- Fixed vLLM reranker scoring anomaly.
- Fixed vLLM reranker GPU release issue.
- Added compatibility with vLLM's async tokenizer logic.
- Fixed CI issues related to setuptools.
📚 Documentation
- Added v2.0.0 release notes.
🏢 Enterprise Edition Updates
🔧 Stability Enhancements
Includes multiple underlying optimizations and bug fixes to improve overall runtime stability and enterprise deployment reliability.
Upgrade today and experience Xinference v2.1.0 🚀