Xinference v1.11.0.post1 Release Notes

✅ Key Highlights

🖼️ Direct support for the OpenAI-style images/edits interface, improving compatibility with image editing and generation models.
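As a rough illustration of the images/edits highlight above, the sketch below sends an edit request through the official openai Python SDK pointed at a Xinference server. It is a minimal sketch under assumptions, not the project's documented usage: the endpoint http://127.0.0.1:9997/v1, the placeholder API key, and the helper names edits_url and edit_image are illustrative, and the model UID must be one you have already launched.

```python
# Assumption: a Xinference server is reachable at this address and exposes
# OpenAI-compatible routes under /v1. Adjust host and port to your deployment.
BASE_URL = "http://127.0.0.1:9997/v1"


def edits_url(base_url: str = BASE_URL) -> str:
    """Return the full URL of the images/edits route (hypothetical helper)."""
    return f"{base_url.rstrip('/')}/images/edits"


def edit_image(image_path: str, prompt: str, model_uid: str,
               base_url: str = BASE_URL):
    """Send one image-edit request and return the SDK response object.

    model_uid is the UID of an image model you launched yourself;
    this sketch does not launch one.
    """
    # Imported lazily so the module loads even without the openai package.
    from openai import OpenAI

    # The key is a placeholder; a local server without auth ignores it.
    client = OpenAI(base_url=base_url, api_key="not-used")
    with open(image_path, "rb") as image_file:
        return client.images.edit(
            model=model_uid,
            image=image_file,
            prompt=prompt,
        )


if __name__ == "__main__":
    print("Requests will go to:", edits_url())
```

Calling edit_image("input.png", "make the sky orange", "my-image-model") would issue the request; the file name, prompt, and model UID here are placeholders.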

pip: pip install 'xinference==1.11.0.post1'

Docker: pull the latest image, or update with pip inside the container

Instruct / Thinking versions

Vision Language Model

OpenAI image edit API support

vLLM multi-model loading support (including Omni, image, video, and audio models)

vLLM AWQ 8-bit quantization support

CUDA 12.8 image: upgraded vLLM to 0.10.2

Fixed UI button issue when n_gpu_layers=-1

Fixed CI build and CUDA 12.8 Dockerfile issues

Synchronized multimodal model JSON (audio, image, video, LLM)

Automatic model replica scheduling and lifecycle management, providing a unified interface for clustered inference

Fixed several known issues; overall operation is more stable and reliable
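To make the vLLM, AWQ, and replica items above more concrete, here is a minimal sketch of launching a model with several replicas through the Xinference Python client. The model name, the quantization label "Int8", the lowercase engine and format strings, and the helper names build_launch_kwargs and launch are assumptions chosen for illustration; check the model's registration entry for the values your deployment actually accepts.

```python
def build_launch_kwargs(model_name: str, quantization: str,
                        replica: int = 1) -> dict:
    """Collect launch arguments for a vLLM-backed AWQ model (hypothetical helper)."""
    if replica < 1:
        raise ValueError("replica must be at least 1")
    return {
        "model_name": model_name,      # assumption: a model known to the server
        "model_type": "LLM",
        "model_engine": "vllm",        # assumption: engine name as accepted by the server
        "model_format": "awq",
        "quantization": quantization,  # assumption: label such as "Int8"
        "replica": replica,            # number of replicas to schedule
    }


def launch(endpoint: str, **launch_kwargs) -> str:
    """Launch the model on a running Xinference server and return its UID."""
    # Imported lazily so the helper above works without xinference installed.
    from xinference.client import Client

    client = Client(endpoint)
    return client.launch_model(**launch_kwargs)


if __name__ == "__main__":
    kwargs = build_launch_kwargs("my-model", "Int8", replica=2)
    print(kwargs)
    # Uncomment once a server is running (address is an assumption):
    # print(launch("http://127.0.0.1:9997", **kwargs))
```

Keeping the argument construction separate from the network call lets the settings be inspected or logged before anything is launched.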