The most advanced Qwen model yet, with major gains in text, vision, video, and reasoning.
GGUF version by Unsloth
Meet Qwen3-VL — the most powerful vision-language model in the Qwen series to date.
This generation delivers comprehensive upgrades across the board: superior text understanding & generation, deeper visual perception & reasoning, extended context length, enhanced spatial and video dynamics comprehension, and stronger agent interaction capabilities.

This is the weight repository for Qwen3-VL-8B-Instruct.
| Model variant | Parameters | Quantization | Context window | VRAM¹ | Size |
|---|---|---|---|---|---|
| `ai/qwen3-vl:8B`, `ai/qwen3-vl:8B-UD-Q4_K_XL`, `ai/qwen3-vl:latest` | 8B | MOSTLY_Q4_K_M | 262K tokens | 5.91 GiB | 4.79 GB |
| `ai/qwen3-vl:2B-BF16` | 2B | MOSTLY_BF16 | 262K tokens | 4.38 GiB | 3.21 GB |
| `ai/qwen3-vl:2B-Q8_K_XL` | 2B | MOSTLY_Q8_0 | 262K tokens | 3.34 GiB | 2.17 GB |
| `ai/qwen3-vl:2B-UD-Q4_K_XL` | 2B | MOSTLY_Q4_K_M | 262K tokens | 2.22 GiB | 1.05 GB |
| `ai/qwen3-vl:4B-Q8_K_XL` | 4B | MOSTLY_Q8_0 | 262K tokens | 6.13 GiB | 4.70 GB |
| `ai/qwen3-vl:8B-Q8_K_XL` | 8B | MOSTLY_Q8_0 | 262K tokens | 10.36 GiB | 10.08 GB |
| `ai/qwen3-vl:32B-Q8_K_XL` | 32B | MOSTLY_Q8_0 | 262K tokens | 37.46 GiB | 36.76 GB |
| `ai/qwen3-vl:32B-UD-Q4_K_XL` | 32B | MOSTLY_Q4_K_M | 262K tokens | 20.41 GiB | 18.67 GB |
| `ai/qwen3-vl:4B-BF16` | 4B | MOSTLY_BF16 | 262K tokens | 8.92 GiB | 7.49 GB |
| `ai/qwen3-vl:4B-UD-Q4_K_XL` | 4B | MOSTLY_Q4_K_M | 262K tokens | 3.80 GiB | 2.37 GB |
| `ai/qwen3-vl:8B-BF16` | 8B | MOSTLY_BF16 | 262K tokens | 15.54 GiB | 15.26 GB |
¹: VRAM estimated based on model characteristics.
The `latest` tag is an alias for `8B`.
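The VRAM column can guide which tag to pull on a given machine. The following is a minimal sketch, not part of the Docker CLI: `pick_variant` is a hypothetical helper that takes a VRAM budget in GiB and prints the tag with the highest estimated VRAM that still fits, using the estimates copied from the table. Treating "largest that fits" as the best choice is an assumption; a smaller model at higher precision may suit some workloads better.

```shell
# Hypothetical helper: print the variant whose estimated VRAM is the
# largest value not exceeding the given budget (in GiB).
# Returns a non-zero status if no variant fits.
pick_variant() {
  awk -v budget="$1" '
    # Keep the row with the highest VRAM estimate that is within budget.
    $2 <= budget && $2 > best { best = $2; tag = $1 }
    END { if (tag != "") print tag; else exit 1 }
  ' <<'EOF'
ai/qwen3-vl:2B-UD-Q4_K_XL 2.22
ai/qwen3-vl:2B-Q8_K_XL 3.34
ai/qwen3-vl:4B-UD-Q4_K_XL 3.80
ai/qwen3-vl:2B-BF16 4.38
ai/qwen3-vl:8B-UD-Q4_K_XL 5.91
ai/qwen3-vl:4B-Q8_K_XL 6.13
ai/qwen3-vl:4B-BF16 8.92
ai/qwen3-vl:8B-Q8_K_XL 10.36
ai/qwen3-vl:8B-BF16 15.54
ai/qwen3-vl:32B-UD-Q4_K_XL 20.41
ai/qwen3-vl:32B-Q8_K_XL 37.46
EOF
}
```

For example, `pick_variant 8` selects from the rows at or below 8 GiB. Because the VRAM figures are estimates, leave some headroom when choosing a budget.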
Run the model:
docker model run ai/qwen3-vl
For more information, check out the Docker Model Runner docs.
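To run a variant other than `latest`, append one of the tags from the table. The sketch below assumes Docker Model Runner is enabled; `model_ref` and `ask_variant` are hypothetical helper names introduced only for illustration. Passing a prompt as the last argument to `docker model run` requests a single response instead of an interactive chat session.

```shell
# Hypothetical helper: build a tag reference from a size and a
# quantization suffix, e.g. "4B" and "UD-Q4_K_XL".
model_ref() {
  printf 'ai/qwen3-vl:%s-%s\n' "$1" "$2"
}

# Hypothetical helper: pull the chosen variant, then send one prompt.
# Requires Docker with Model Runner enabled.
ask_variant() {
  ref="$(model_ref "$1" "$2")"
  docker model pull "$ref" && docker model run "$ref" "$3"
}

# Example call (not executed here):
#   ask_variant 4B UD-Q4_K_XL "Summarize what a vision-language model does."
```

Pulling explicitly before running is optional in this sketch; it only makes the download step visible as its own command.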
| Field | Value |
|---|---|
| Content type | Model |
| Digest | sha256:a18971a77… |
| Size | 5.9 GB |
| Last updated | 3 months ago |
Pull the model:
docker model pull ai/qwen3-vl