Highlights
Extensive Model Support
- Qwen3-VL and Qwen3-Next — Latest members of the Qwen3 family
- Qwen-Image-Edit-2509 — Continuously enhanced image editing capabilities
- IndexTTS2 — New high-quality speech synthesis model
sglang Engine Supports Structured Output
Now supports OpenAI-style json structured output, further enhancing compatibility with external APIs.
Community Edition Updates
Installation
pip install:
pip install 'xinference==1.10.1'
Docker usage:
Pull the latest image, or update with pip inside the container
New Model Support
IndexTTS2
Qwen-Image-Edit-2509
Baichuan-M2
Qwen3-VL
Qwen3-Next
New Features
- sglang supports json structured output
- UI: Support for request_limits parameter
- UI: Dynamic detection of download_hub
- Support listing flexible models via WebUI and command line
Enhancements
- Optimized Qwen2.5-VL performance on Mac MPS
- deepseek-r1-0528 model supports tool_calls
- Upgraded funasr dependency
- Updated CUDA 12.8 Dockerfile
- UI model launch page refactored
Bug Fixes
- Optimized rerank model lookup logic, support for video model type
- Fixed VLLM version dependency issue required by seed-oss
- Fixed registration issue when model names are duplicated
- Fixed issue where custom model drawer component in UI cannot be opened
- Fixed issue where registered models cannot use tools
- Fixed finish_reason field processing logic
- Fixed vllm structured output compatibility issue
Documentation Updates
- Updated README with additional MaxKB related information
Enterprise Edition Updates (v0.2.4 Release)
Enhanced PDF Parsing Capabilities
Added support for more PDF parsing models, further strengthening enterprise document processing capabilities
Feature and Performance Optimization
Overall usability and stability continuously improved