🚀 v0.1.6: Real-time Metrics & Blackwell-Optimized Docker (Recommended)

This model is fully compatible with DGX-Spark-llama.cpp-Bench, an inference engine optimized for NVIDIA Blackwell (DGX Spark) hardware.

🌟 Key Features (v0.1.6)

  • Real-time Performance Metrics: Displays input and output tokens per second (Input TPS and Output TPS) during streaming.
  • Improved Reasoning UI: Renders the model's Chain-of-Thought (CoT) output more stably.
  • Blackwell Optimization: Native support for ARM64/SM121 and CUDA 13.0 FP4.
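This card does not say how Input TPS and Output TPS are calculated. A common convention, assumed here, is prompt tokens divided by prompt-processing time and generated tokens divided by generation time. The `tps` helper and the sample numbers below are hypothetical and are not taken from the benchmark's code:

```shell
#!/bin/sh
# Hypothetical helper: tokens per second from a token count and elapsed seconds.
# Assumes TPS = tokens / seconds; the benchmark's own definition may differ.
tps() {
  awk -v n="$1" -v t="$2" 'BEGIN {
    if (t <= 0) { print "0.00"; exit }   # guard against division by zero
    printf "%.2f\n", n / t
  }'
}

# Illustrative numbers only (not measured results):
# 512 prompt tokens processed in 0.25 s, 128 tokens generated in 4 s.
echo "Input TPS:  $(tps 512 0.25)"
echo "Output TPS: $(tps 128 4)"
```

Keeping the two rates separate matters because prompt processing is batched and is typically much faster per token than generation.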

🐳 Quick Start

# Pull the latest optimized image
docker pull ghcr.io/sowilow/dgx-spark-llama.cpp-bench:v0.1.6
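
Because the image targets ARM64, a quick architecture check before pulling may save a wasted download on an x86 machine. The sketch below assumes the image is published for `arm64` only, which this card does not confirm. `docker_arch` is a hypothetical helper that maps `uname -m` output to Docker's platform names:

```shell
#!/bin/sh
# Hypothetical helper: map `uname -m` machine names to Docker platform names.
docker_arch() {
  case "$1" in
    aarch64|arm64) echo "arm64" ;;
    x86_64|amd64)  echo "amd64" ;;
    *)             echo "unknown" ;;
  esac
}

host_arch=$(docker_arch "$(uname -m)")
if [ "$host_arch" = "arm64" ]; then
  echo "Host is arm64; the image should match this platform."
else
  echo "Host is $host_arch; an arm64-only image would not run natively here."
fi

# Then pull the image as usual:
# docker pull ghcr.io/sowilow/dgx-spark-llama.cpp-bench:v0.1.6
```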

For more details, visit the GitHub repository: https://github.com/sowilow/DGX-Spark-llama.cpp-Bench




🚀 v0.1.5: Real-time Metrics & Blackwell-Optimized Docker (Recommended)

This model is fully compatible with DGX-Spark-llama.cpp-Bench, an inference engine optimized for NVIDIA Blackwell (DGX Spark) hardware.

🌟 Key Features (v0.1.5)

  • Real-time Performance Metrics: Displays input and output tokens per second (Input TPS and Output TPS) during streaming.
  • Improved Reasoning UI: Renders the model's Chain-of-Thought (CoT) output more stably.
  • Blackwell Optimization: Native support for ARM64/SM121 and CUDA 13.0 FP4.

🐳 Quick Start

# Pull the latest optimized image
docker pull ghcr.io/sowilow/dgx-spark-llama.cpp-bench:v0.1.5

For more details, visit the GitHub repository: https://github.com/sowilow/DGX-Spark-llama.cpp-Bench




🚀 v0.1.4: Quick Start with Blackwell-Optimized Docker (Recommended)

This model is fully compatible with DGX-Spark-llama.cpp-Bench, an inference engine optimized for NVIDIA Blackwell (DGX Spark) hardware.

🌟 Key Features (v0.1.4)

  • Blackwell Optimized: Native support for ARM64/SM121 and CUDA 13.0 FP4.
  • Intelligent Reasoning UI: Automatically extracts and visualizes the model's reasoning process (CoT).
  • One-Click Deployment: Standardized environment via the GHCR Docker image.

🐳 How to Run

# Pull the latest optimized image
docker pull ghcr.io/sowilow/dgx-spark-llama.cpp-bench:v0.1.4

# Follow the instructions in our repo to serve this model
# GitHub: https://github.com/sowilow/DGX-Spark-llama.cpp-Bench


🚀 Quick Start with Docker (Recommended)

You can easily run this model using the DGX-Spark-llama.cpp-Bench inference engine. It's pre-configured for high-performance inference on NVIDIA hardware (especially Blackwell/DGX Spark).

1. Pull the Docker Image

docker pull ghcr.io/sowilow/dgx-spark-llama.cpp-bench:latest

2. Run the Inference Server

For the run command, detailed configuration, and usage, see the GitHub repository: https://github.com/sowilow/DGX-Spark-llama.cpp-Bench


LFM2.5-1.2B-Instruct-DGX-Spark-GGUF

This repository contains GGUF-quantized weights for LFM2.5-1.2B-Instruct, specifically optimized for NVIDIA Blackwell (DGX Spark) hardware.

🚀 Key Features

  • Hardware Optimized: Built with CUDA 13.0 and SM121 (Blackwell) native acceleration.
  • Quantization:
    • Q4_K_M: Balanced performance and accuracy.
    • Q8_0: High precision preservation.
  • Base Model Integration: Linked directly to the original LiquidAI/LFM2.5-1.2B-Instruct.
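
To get a feel for the trade-off between the two quantizations, file size can be approximated as parameters × bits per weight ÷ 8. The sketch below assumes about 1.2 billion parameters (read from the model name) and ignores metadata and any tensors stored at other precisions. Q8_0 stores 8.5 bits per weight (32 int8 values plus one 16-bit scale per block). The 4.5 used for Q4_K_M is a lower bound, since that mix keeps some tensors at higher precision. `estimate_gb` is a hypothetical helper:

```shell
#!/bin/sh
# Hypothetical helper: rough file size in decimal GB.
# $1 = parameter count, $2 = bits per weight (both are assumptions here).
estimate_gb() {
  awk -v p="$1" -v b="$2" 'BEGIN { printf "%.2f\n", p * b / 8 / 1000000000 }'
}

PARAMS=1200000000   # assumed from the "1.2B" in the model name

echo "Q8_0   is about    $(estimate_gb "$PARAMS" 8.5) GB"
echo "Q4_K_M is at least $(estimate_gb "$PARAMS" 4.5) GB"
```

The actual file sizes in the repository are the authoritative numbers; this is only a back-of-the-envelope check.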

⚖️ License & Attribution

This model is a quantized version of the original LiquidAI/LFM2.5-1.2B-Instruct and is subject to its original license.

📂 Files Included

  • lfm2.5-1.2b-instruct-q4_k_m.gguf: 4-bit quantized model.
  • lfm2.5-1.2b-instruct-q8_0.gguf: 8-bit quantized model.
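
Files in a Hugging Face model repository can usually be fetched over HTTPS through the `resolve` endpoint. The sketch below assumes the repository ID `sowilow/LFM2.5-1.2B-Instruct-DGX-Spark-GGUF` and that the files live on the `main` branch. `gguf_url` is a hypothetical helper, and the `curl` call is commented out so nothing is downloaded by accident:

```shell
#!/bin/sh
# Assumed repository ID; adjust if the model lives elsewhere.
REPO="sowilow/LFM2.5-1.2B-Instruct-DGX-Spark-GGUF"

# Hypothetical helper: build a direct download URL.
# $1 = file name, $2 = branch (defaults to main, which is an assumption).
gguf_url() {
  printf 'https://huggingface.co/%s/resolve/%s/%s\n' "$REPO" "${2:-main}" "$1"
}

FILE="lfm2.5-1.2b-instruct-q4_k_m.gguf"
echo "Download URL: $(gguf_url "$FILE")"

# Uncomment to download (follows redirects, writes to the current directory):
# curl -L -o "$FILE" "$(gguf_url "$FILE")"
```

Swap `FILE` for `lfm2.5-1.2b-instruct-q8_0.gguf` to fetch the 8-bit variant instead.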

Created using DGX-Spark-llama.cpp-Bench

Model repository: sowilow/LFM2.5-1.2B-Instruct-DGX-Spark-GGUF
Format: GGUF · Model size: 1B params · Architecture: lfm2 · Quantizations: 4-bit, 8-bit