Instructions to use keras/qwen3_5_2b with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- KerasHub
How to use keras/qwen3_5_2b with KerasHub:
```python
import keras_hub

# Load CausalLM model (optional: use half precision for inference)
causal_lm = keras_hub.models.CausalLM.from_preset(
    "hf://keras/qwen3_5_2b", dtype="bfloat16"
)
causal_lm.compile(sampler="greedy")  # (optional) specify a sampler

# Generate text
causal_lm.generate("Keras: deep learning for", max_length=64)
```

```python
import keras_hub

# Create a Backbone model unspecialized for any task
backbone = keras_hub.models.Backbone.from_preset("hf://keras/qwen3_5_2b")
```
- Keras
How to use keras/qwen3_5_2b with Keras:
```python
# Available backend options are: "jax", "torch", "tensorflow".
import os
os.environ["KERAS_BACKEND"] = "jax"

import keras

model = keras.saving.load_model("hf://keras/qwen3_5_2b")
```
- Notebooks
- Google Colab
- Kaggle
Model Overview
Model Summary
## Description
This is the official release of Qwen3.5, introducing the open weights of the first model in the Qwen3.5 series. As a native vision-language model, it demonstrates outstanding results across a full range of benchmark evaluations, including reasoning, coding, agent capabilities, and multimodal understanding, helping developers and enterprises achieve significantly greater productivity. Built on a hybrid architecture that fuses linear attention (via Gated Delta Networks) with a sparse mixture-of-experts, the model attains remarkable inference efficiency.
Qwen3.5 Highlights
Qwen3.5 features the following enhancements:
- Unified Vision-Language Foundation: Early fusion training on multimodal tokens achieves cross-generational parity with Qwen3 and outperforms Qwen3-VL models across reasoning, coding, agents, and visual understanding benchmarks.
- Efficient Hybrid Architecture: Gated Delta Networks combined with sparse Mixture-of-Experts deliver high-throughput inference with minimal latency and cost overhead.
- Scalable RL Generalization: Reinforcement learning scaled across million-agent environments with progressively complex task distributions for robust real-world adaptability.
- Global Linguistic Coverage: Expanded support to 201 languages and dialects, enabling inclusive, worldwide deployment with nuanced cultural and regional understanding.
- Next-Generation Training Infrastructure: Near-100% multimodal training efficiency compared to text-only training, and asynchronous RL frameworks supporting massive-scale agent scaffolds and environment orchestration.
For more details, please refer to the Qwen3.5 Blog and GitHub.
Weights are released under the Apache 2 License. Keras model code is released under the Apache 2 License.
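The highlights name Gated Delta Networks as the linear-attention component. As a rough illustration of the general gated delta rule from the research literature, the sketch below performs one recurrent step on a small state matrix in pure Python. It is not the Qwen3.5 or KerasHub implementation, whose details are not given in this card; the function name `gated_delta_step` and the scalar gates `alpha` (decay) and `beta` (write strength) are assumptions chosen for illustration.

```python
def gated_delta_step(S, q, k, v, alpha, beta):
    """One step of a gated delta rule (illustrative sketch only).

    S is a d_v x d_k state (list of rows), q and k are length-d_k vectors,
    v is a length-d_v vector. The update is
        S_new = alpha * S @ (I - beta * k k^T) + beta * v k^T
    and the output is S_new @ q.
    """
    d_v, d_k = len(S), len(k)
    # What the decayed state currently returns for key k: alpha * S @ k
    pred = [alpha * sum(S[i][j] * k[j] for j in range(d_k)) for i in range(d_v)]
    # Decay the old state, then correct the entry for k toward the new value v
    S_new = [
        [alpha * S[i][j] + beta * (v[i] - pred[i]) * k[j] for j in range(d_k)]
        for i in range(d_v)
    ]
    # Read out with the query
    out = [sum(S_new[i][j] * q[j] for j in range(d_k)) for i in range(d_v)]
    return S_new, out


# Write a value under a unit-norm key, then overwrite it with a new value.
state = [[0.0, 0.0], [0.0, 0.0]]
key = [1.0, 0.0]
state, first = gated_delta_step(state, key, key, [3.0, 4.0], alpha=1.0, beta=1.0)
state, second = gated_delta_step(state, key, key, [5.0, 6.0], alpha=1.0, beta=1.0)
```

With `alpha=1` and `beta=1`, the second write replaces the value stored under the same key rather than adding to it, which is the "delta" behavior. The state has a fixed size regardless of sequence length, which is where the efficiency of this family of layers comes from.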
Links
- Qwen3.5 Quickstart Notebook (coming soon)
- Qwen3.5 API Documentation
- Qwen3.5 Model Card
- KerasHub Beginner Guide
- KerasHub Model Publishing Guide
Installation
Keras and KerasHub can be installed with:

```shell
pip install -U -q keras-hub
pip install -U -q keras
```
JAX, TensorFlow, and Torch come pre-installed in Kaggle Notebooks. For instructions on installing them in another environment, see the Keras Getting Started page.
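Outside Kaggle, it may not be obvious which backend is installed. The following is a minimal sketch, using only the standard library, that picks the first importable backend and exports it through `KERAS_BACKEND` before Keras is imported. The helper name `pick_backend` and the preference order are assumptions for illustration, not part of Keras.

```python
import importlib.util
import os


def pick_backend(preferred=("jax", "torch", "tensorflow")):
    """Return the first name in `preferred` whose package is importable, else None."""
    for name in preferred:
        if importlib.util.find_spec(name) is not None:
            return name
    return None


backend = pick_backend()
if backend is not None:
    # Respect a value the user has already set; otherwise use the detected one.
    os.environ.setdefault("KERAS_BACKEND", backend)

# Import keras only after this point: the backend is fixed at import time.
```

If `pick_backend()` returns `None`, no backend package was found and one should be installed first.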
Presets
The following model checkpoints are provided by the Keras team:
| Preset Name | Parameters | Description |
|---|---|---|
| qwen3_5_0.8b_base | 0.8 Billion | Ultra-lightweight foundation model. Ideal for edge devices and efficient, task-specific fine-tuning. Supports multimodal and video processing tasks. |
| qwen3_5_0.8b | 0.8 Billion | Instruction-tuned ultra-lightweight model. Best for simple chat and basic NLP tasks on resource-constrained devices. Supports multimodal and video processing tasks. |
| qwen3_5_2b_base | 2 Billion | Lightweight foundation model. Balances speed and capability; great for mobile deployment and domain-specific fine-tuning. Supports multimodal and video processing tasks. |
| qwen3_5_2b | 2 Billion | Instruction-tuned lightweight model. Optimized for fast chat applications and general assistance on consumer hardware. Supports multimodal and video processing tasks. |
| qwen3_5_4b_base | 4 Billion | Mid-small foundation model. Offers improved reasoning and context understanding for custom fine-tuning tasks. |
| qwen3_5_4b | 4 Billion | Instruction-tuned mid-small model. A capable assistant for general text generation and conversational tasks on standard GPUs. Supports multimodal and video processing tasks. |
| qwen3_5_9b_base | 9 Billion | Mid-sized foundation model. Delivers strong reasoning, coding, and math baseline capabilities for advanced fine-tuning. Supports multimodal and video processing tasks. |
| qwen3_5_9b | 9 Billion | Instruction-tuned mid-sized model. Highly capable chatbot offering strong logic, coding assistance, and multilingual support. Supports multimodal and video processing tasks. |
| qwen3_5_27b | 27 Billion | Instruction-tuned large model. Delivers high-tier performance for complex reasoning, coding, and extensive contextual tasks. Supports multimodal and video processing tasks. |
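
To make the table easier to act on, here is a small sketch that picks the largest preset fitting a parameter budget and builds a handle in the `hf://keras/<preset>` form used in the KerasHub example for `qwen3_5_2b`. The helper names are hypothetical, and the assumption that every preset is published under the same `hf://keras/` pattern is not confirmed by this card.

```python
# Parameter counts (in billions) copied from the preset table.
PRESETS = {
    "qwen3_5_0.8b_base": 0.8,
    "qwen3_5_0.8b": 0.8,
    "qwen3_5_2b_base": 2.0,
    "qwen3_5_2b": 2.0,
    "qwen3_5_4b_base": 4.0,
    "qwen3_5_4b": 4.0,
    "qwen3_5_9b_base": 9.0,
    "qwen3_5_9b": 9.0,
    "qwen3_5_27b": 27.0,
}


def hf_handle(preset, org="keras"):
    """Build an hf:// handle for a known preset (handle pattern is an assumption)."""
    if preset not in PRESETS:
        raise ValueError(f"Unknown preset: {preset!r}")
    return f"hf://{org}/{preset}"


def largest_preset_within(max_billion, instruction_tuned=True):
    """Return the largest preset at or under the budget, or None if none fits."""
    candidates = [
        (size, name)
        for name, size in PRESETS.items()
        if size <= max_billion and name.endswith("_base") != instruction_tuned
    ]
    return max(candidates)[1] if candidates else None


choice = largest_preset_within(5.0)  # instruction-tuned model up to 5B parameters
handle = hf_handle(choice)
```

The resulting `handle` can then be passed to `keras_hub.models.CausalLM.from_preset(...)` in place of the hard-coded string, provided that preset is actually available at that location.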