Instructions to use Qwen/Qwen3.6-35B-A3B with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use Qwen/Qwen3.6-35B-A3B with Transformers:
    # Use a pipeline as a high-level helper
    from transformers import pipeline

    pipe = pipeline("image-text-to-text", model="Qwen/Qwen3.6-35B-A3B")
    messages = [
        {
            "role": "user",
            "content": [
                {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
                {"type": "text", "text": "What animal is on the candy?"}
            ]
        },
    ]
    pipe(text=messages)

    # Load model directly
    from transformers import AutoProcessor, AutoModelForImageTextToText

    processor = AutoProcessor.from_pretrained("Qwen/Qwen3.6-35B-A3B")
    model = AutoModelForImageTextToText.from_pretrained("Qwen/Qwen3.6-35B-A3B")
    messages = [
        {
            "role": "user",
            "content": [
                {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
                {"type": "text", "text": "What animal is on the candy?"}
            ]
        },
    ]
    inputs = processor.apply_chat_template(
        messages,
        add_generation_prompt=True,
        tokenize=True,
        return_dict=True,
        return_tensors="pt",
    ).to(model.device)

    outputs = model.generate(**inputs, max_new_tokens=40)
    print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
- Inference
- HuggingChat
- Notebooks
- Google Colab
- Kaggle
- AMD Developer Cloud
- Local Apps
- vLLM
How to use Qwen/Qwen3.6-35B-A3B with vLLM:
Install from pip and serve model
    # Install vLLM from pip:
    pip install vllm

    # Start the vLLM server:
    vllm serve "Qwen/Qwen3.6-35B-A3B"

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:8000/v1/chat/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "Qwen/Qwen3.6-35B-A3B",
            "messages": [
                {
                    "role": "user",
                    "content": [
                        {
                            "type": "text",
                            "text": "Describe this image in one sentence."
                        },
                        {
                            "type": "image_url",
                            "image_url": {
                                "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
                            }
                        }
                    ]
                }
            ]
        }'
Use Docker
    docker model run hf.co/Qwen/Qwen3.6-35B-A3B
- SGLang
How to use Qwen/Qwen3.6-35B-A3B with SGLang:
Install from pip and serve model
    # Install SGLang from pip:
    pip install sglang

    # Start the SGLang server:
    python3 -m sglang.launch_server \
        --model-path "Qwen/Qwen3.6-35B-A3B" \
        --host 0.0.0.0 \
        --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/chat/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "Qwen/Qwen3.6-35B-A3B",
            "messages": [
                {
                    "role": "user",
                    "content": [
                        {
                            "type": "text",
                            "text": "Describe this image in one sentence."
                        },
                        {
                            "type": "image_url",
                            "image_url": {
                                "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
                            }
                        }
                    ]
                }
            ]
        }'
Use Docker images
    docker run --gpus all \
        --shm-size 32g \
        -p 30000:30000 \
        -v ~/.cache/huggingface:/root/.cache/huggingface \
        --env "HF_TOKEN=<secret>" \
        --ipc=host \
        lmsysorg/sglang:latest \
        python3 -m sglang.launch_server \
            --model-path "Qwen/Qwen3.6-35B-A3B" \
            --host 0.0.0.0 \
            --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/chat/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "Qwen/Qwen3.6-35B-A3B",
            "messages": [
                {
                    "role": "user",
                    "content": [
                        {
                            "type": "text",
                            "text": "Describe this image in one sentence."
                        },
                        {
                            "type": "image_url",
                            "image_url": {
                                "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
                            }
                        }
                    ]
                }
            ]
        }'
- Docker Model Runner
How to use Qwen/Qwen3.6-35B-A3B with Docker Model Runner:
    docker model run hf.co/Qwen/Qwen3.6-35B-A3B
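The curl calls in the vLLM and SGLang entries send the same OpenAI-style chat request, so they can also be issued from Python. The following is a minimal sketch using only the standard library. The helper names `build_chat_payload` and `chat` are hypothetical and introduced here for illustration, and the response parsing assumes the server returns the usual OpenAI chat-completions shape (`choices[0].message.content`).

```python
import json
import urllib.request

MODEL_ID = "Qwen/Qwen3.6-35B-A3B"


def build_chat_payload(prompt, image_url=None, model=MODEL_ID):
    """Build the same JSON body that the curl examples send (hypothetical helper)."""
    content = [{"type": "text", "text": prompt}]
    if image_url is not None:
        # Images are passed by URL, as in the curl examples.
        content.append({"type": "image_url", "image_url": {"url": image_url}})
    return {
        "model": model,
        "messages": [{"role": "user", "content": content}],
    }


def chat(base_url, payload, timeout=120):
    """POST the payload to <base_url>/v1/chat/completions and return the reply text.

    Assumes an OpenAI-compatible response body (hypothetical helper).
    """
    request = urllib.request.Request(
        base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With a server already running, something like `chat("http://localhost:8000", build_chat_payload("Describe this image in one sentence.", image_url))` should target the vLLM example, and `http://localhost:30000` the SGLang one. Nothing here is specific to either server beyond the port.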
Qwen3.6 support for TRT-LLM?
#60 opened 3 days ago
by
thecomputerguy81
Add MMMU-Pro evaluation result
#59 opened 3 days ago
by
SaylorTwift
Running Qwen 3.6 35B-A3B on a single 8GB consumer laptop GPU (RTX 4060)
#58 opened 4 days ago
by
myeongjun77
Request versions in other sizes.
#57 opened 8 days ago
by
jian2023
100b version
➕ 3
#56 opened 11 days ago
by
TheBigBlockPC
Wrong tag qwen3_5_moe, unless you guys use some architecture change as the tag, idk
#55 opened 12 days ago
by
Kz000000
how to enable non-thinking mode of this model in llama.cpp?
1
#54 opened 16 days ago
by
daijava
Create Alpha and Omega 1.01
#53 opened 21 days ago
by
Kjppmp
Look after the old-timers on Tesla T4: we need 8B, 9B, 14B, and 16B parameter sizes
2
#52 opened 22 days ago
by
zhousp666
Tool use failure [Fix Found]
3
#51 opened 27 days ago
by
JonnyWhatshisface
Is it really better in real world task?
9
#50 opened 29 days ago
by
BornSaint
Built a multimodal open source app on Qwen3.6-35B-A3B: vision reasoning, doc-to-JSON, screenshot-to-React, runs on Ollama or llama.cpp
#49 opened 30 days ago
by
gvij
Request: DOI
#48 opened 30 days ago
by
dancinlife
Qwen3.6_9-12B_A2B_MoE
1
#47 opened about 1 month ago
by
Konti201203
Qwen3.6 200B?
❤️ 4
2
#46 opened about 1 month ago
by
celikburak
Upload IMG-20260421-WA0030.jpg
#45 opened about 1 month ago
by
jannybaby99
Upload IMG-20260421-WA0030.jpg
#44 opened about 1 month ago
by
jannybaby99
Upload IMG-20260421-WA0030.jpg
#43 opened about 1 month ago
by
jannybaby99
My Mac mini 16GB would like to formally request Qwen3.6-9B 🤣
3
#42 opened about 1 month ago
by
OliverTD
Change the language to Spanish
#41 opened about 1 month ago
by
kenedy698
Very buggy in vllm + opencode
2
#40 opened about 1 month ago
by
void009
1 topic
➕🧠 6
#39 opened about 1 month ago
by
Yackiv
Long video - crash
1
#38 opened about 1 month ago
by
Marbuel
My RTX 3090 ran out of excuses: Qwen3.6-35B-A3B
🔥❤️ 32
7
#37 opened about 1 month ago
by
Kukedlc
Image - resolution - tokens
1
#36 opened about 1 month ago
by
Marbuel
Question about UI navigation and coordinates
1
#35 opened about 1 month ago
by
ztsvvstz
GPTQ Int4 Quant
#34 opened about 1 month ago
by
nikhilr12
Add LICENSE file
#33 opened about 1 month ago
by
deepsweet
Qwen3.6
#32 opened about 1 month ago
by
willowoods
a3b jinja template (165898 token session) no errors, opencode
1
#31 opened about 1 month ago
by
KuziaMother
tool calling failed with claude code
1
#30 opened about 1 month ago
by
weisunding
Something happened.
#29 opened about 1 month ago
by
owao
QWEN IS THE BEST ARTIFICIAL INTELLIGENCE COMPANY IN THE WHOLE WORLD
#28 opened about 1 month ago
by
josenoya
Add ParseBench evaluation results
#26 opened about 1 month ago
by
boyang-runllama
Question from a nOOb.
2
#24 opened about 1 month ago
by
m3thod
Inconsistent parameters recommendation
#23 opened about 1 month ago
by
PhilippeEiffel
Excellent SVG improve since last version!
1
#22 opened about 1 month ago
by
kq
ε€ͺεζΎεε¦
6
#21 opened about 1 month ago
by
yukojiangjiang
Poor tools use and endless reasoning loop
🔥 1
8
#20 opened about 1 month ago
by
qiaozhiyi
endless reasoning loops
8
#19 opened about 1 month ago
by
phoebdroid
Smaller models, embedding and reranker models
❤️ 11
#18 opened about 1 month ago
by
Duonglv
Your model is excellent, thank you very much! Waiting for the 122b model, version 3.6! But...
2
#17 opened about 1 month ago
by
Kosh69
Installation Video and Testing - Step by Step
#15 opened about 1 month ago
by
fahdmirzac
Incredible! Qwen 3.6 is simply unbelievable!
🔥 9
6
#14 opened about 1 month ago
by
Mithnick
Architectural Comparison: Qwen3.5-35B-A3B vs. Qwen3.6-35B-A3B
🔥 21
1
#12 opened about 1 month ago
by
BuiDoan
[GUIDE] Run Qwen3.6-35B-A3B at full context on a 4090 and GB10 Spark with vLLM and Llama.cpp
#11 opened about 1 month ago
by
erdal
A Quick Note of Thanks to the Qwen Team
🔥❤️ 21
1
#10 opened about 1 month ago
by
nikhilprasanth
Qwen/Qwen3.6-35B-A3B-GPTQ-Int4?
➕ 8
3
#9 opened about 1 month ago
by
sujithr