CohereLabsCommunity (Cohere Labs Community)

jjzha

submitted a paper to Daily Papers about 12 hours ago

CroCo: Cross-Lingual Contrastive Preference Tuning on Self-Generations

Paper • 2605.26293 • Published 3 days ago • 1

mridul3301

authored 2 papers 2 days ago

Global PIQA: Evaluating Physical Commonsense Reasoning Across 100+ Languages and Cultures

Paper • 2510.24081 • Published Oct 28, 2025 • 24

Learning POMDP World Models from Observations with Language-Model Priors

Paper • 2605.13740 • Published 15 days ago • 6

prithivMLmods

posted an update 6 days ago

Post

5364

I've made 8 Spaces in the Qwen-Image-Edit series, and out of them, 5 Spaces reached “Space of the Week”! A few Spaces are still topping the list even after many months.

Cumulatively, the series has crossed 8.2 million+ ZeroGPU runs and nearly 4 million visitors overall.

Thanks for all the community support! 🤗❤️

🔗 Spaces: https://huggingface.co/collections/prithivMLmods/image-generation-apps-collection

4 replies

Reubencf

posted an update 8 days ago

Post

4608

I have improved my portfolio. Please check it out:
Reubencf/Portfolio

6 replies

Tonic

posted an update 13 days ago

Post

2703

🙋🏻‍♂️ Hey there folks ,

Turns out: if we predict 🌏 Earth, we can spend more time looking for interesting things and less time looking at things that we expect to see.

Sentinel-2 imagery 🛰️ takes a long time to download to Earth, so our "near real time" systems are quite far from that in practical terms.

Meanwhile, if we "predict" what we will see, based on what we do see, we can send down much less data in a timely way and prioritize 📡 Earth-bound response.

I'm talking about illegal fishing, logging, mining, or building in nature reserves: the more of that we predict early, the more we're able to stop it in time.

At least that's the concept!
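As a rough illustration of the idea above, here is a minimal sketch that ranks image tiles by how much the observation differs from the prediction, so that only the "surprising" tiles are queued for downlink first. It is not the NuTonic pipeline: the tile representation (small grids of floats), the names `mean_abs_error` and `prioritize_tiles`, and the `threshold` parameter are all hypothetical and chosen only for this example.

```python
from typing import Dict, List, Tuple

# Hypothetical tile format: a small 2D grid of pixel intensities.
Tile = List[List[float]]


def mean_abs_error(predicted: Tile, observed: Tile) -> float:
    """Average absolute per-pixel difference between two same-sized tiles."""
    total = 0.0
    count = 0
    for p_row, o_row in zip(predicted, observed):
        for p, o in zip(p_row, o_row):
            total += abs(p - o)
            count += 1
    return total / count if count else 0.0


def prioritize_tiles(
    predicted: Dict[str, Tile],
    observed: Dict[str, Tile],
    threshold: float,
) -> List[Tuple[str, float]]:
    """Return (tile_id, error) pairs whose error exceeds the threshold,
    sorted so the most unexpected tiles come first."""
    scored = [
        (tile_id, mean_abs_error(predicted[tile_id], observed[tile_id]))
        for tile_id in observed
        if tile_id in predicted
    ]
    surprising = [(tile_id, err) for tile_id, err in scored if err > threshold]
    return sorted(surprising, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    predicted = {
        "a": [[0.2, 0.2], [0.2, 0.2]],
        "b": [[0.5, 0.5], [0.5, 0.5]],
    }
    observed = {
        "a": [[0.2, 0.2], [0.2, 0.2]],  # matches the prediction
        "b": [[0.9, 0.9], [0.5, 0.5]],  # something changed here
    }
    for tile_id, err in prioritize_tiles(predicted, observed, threshold=0.1):
        print(tile_id, round(err, 3))
```

In this toy run only tile `b` is selected, since tile `a` looks exactly as predicted and would not need to be sent down with priority.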

Check out the blog: https://huggingface.co/blog/Tonic/save-patagonia-by-predicting-earth

- Collection: https://huggingface.co/collections/NuTonic/earth-observation-with-temporal-and-general-understanding
- Code: https://github.com/Josephrp/Nutonic
- Dataset: NuTonic/sat-vl-sft-training-ready-v1
- Model: NuTonic/lspace
- Training: NuTonic/lspace-trackio
- Evals: NuTonic/Patagonia_Eval

2 replies

Cartinoe5930

authored 2 papers 15 days ago

What Users Leave Unsaid: Under-Specified Queries Limit Vision-Language Models

Paper • 2601.06165 • Published Jan 7 • 16

KMMMU: Evaluation of Massive Multi-discipline Multimodal Understanding in Korean Language and Context

Paper • 2604.13058 • Published Mar 18 • 2

Cartinoe5930

authored a paper 16 days ago

Soohak: A Mathematician-Curated Benchmark for Evaluating Research-level Math Capabilities of LLMs

Paper • 2605.09063 • Published 19 days ago • 79

prithivMLmods

posted an update 26 days ago

Post

5904

Multimodal-Edge Demo, a node-based inference canvas demo, is now live on Spaces. It features node-based Transformers for fast inference across 10+ edge-device multimodal models on the Hub, all within a single space. The series includes models from Qwen3.5, Qwen3-VL, Gemma 4, and the LFM 2.5 VL model series, with support for reasoning and grounding tasks.

🤗 Demo: prithivMLmods/Multimodal-Edge-Node
🔗 GitHub: https://github.com/PRITHIVSAKTHIUR/Multimodal-Edge-Node
✅ Multimodal Apps Collections: https://huggingface.co/collections/prithivMLmods/hall-of-multimodal-apps

🤗 > To learn more, visit the app page or the respective model pages.

Tonic

posted an update 29 days ago

Post

4256

🙋🏻‍♂️ Hey there folks,

Since everyone liked my previous announcement post ( https://huggingface.co/posts/Tonic/338509028435394 ) so much, I'm back with more high-quality procedural datasets in the geospatial domain for SFT training!

Check this one out:
NuTonic/sat-bbox-metadata-sft-v1

The goal is to be able to train vision models on multiple images for remote sensing analysis with one shot.

Hope you like it! 🚀

2 replies

Tonic

posted an update about 1 month ago

Post

3615

🙋🏻‍♂️ Hey there folks ,

I'm sharing Hugging Face's largest dataset of annotated satellite images today.

Check it out here: NuTonic/sat-image-boundingbox-sft-full

I hope you like it. The idea is to be able to use this with small vision models 🚀

prithivMLmods

posted an update about 1 month ago

Post

1903

Now, a collection of various compression schemes for Qwen3.6 and the abliterated version 1 of dense models is available on the Hub. Check it out via the links below. 👇

🔗 Qwen3.6-MoE: https://huggingface.co/collections/prithivMLmods/qwen36-35b-a3b-compressions
🔗 Qwen3.6-27B Compressions: https://huggingface.co/collections/prithivMLmods/qwen36-27b-compressions

🤗 > To learn more, visit the app page or the respective model pages.

prithivMLmods

posted an update about 1 month ago

Post

4204

HY-World-2.0 — A Multi-Modal World Model for Reconstructing, Generating, and Simulating 3D Worlds is now available on Spaces, and it works both as native Gradio components and in Gradio server mode.

> HY-World-2.0-Demo: prithivMLmods/HY-World-2.0-Demo
> HY-World-2.0 [Server Mode]: prithivMLmods/HY-World-2.0-Demo
> Featuring 3D reconstruction and Gaussian splats with the Rerun viewer, along with camera poses, depth maps, and surface normals.
> In Server Mode, Gradio is served via FastAPI, with FastAPI remaining the top-level server.
> Model: tencent/HY-World-2.0
> GitHub: https://github.com/PRITHIVSAKTHIUR/HY-World-2.0-Demo

🤗 To learn more, visit the app page or the respective model pages.

prithivMLmods

posted an update about 1 month ago

Post

6225

A new comparator on Spaces showcases Standard FLUX.2 Decoder vs. FLUX.2 Small Decoder. The Small Decoder is ~1.4× faster, uses ~1.4× less VRAM, and maintains near-identical image quality. It has ~28M parameters with narrower channels [96, 192, 384, 384] vs. [128, 256, 512, 512], and the demo supports sequence generation by running both decoders simultaneously and comparing the results side by side.
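To put the channel widths quoted above in perspective, the short sketch below computes the per-stage width ratio between the two configurations. It also shows the rule of thumb that a convolution whose input and output widths both shrink by a factor r has roughly r² as many weights. This is a back-of-the-envelope estimate only: it assumes nothing about the actual decoder layers, and the ~1.4× speed and VRAM figures are the post's reported measurements, not something this arithmetic derives. The function names are made up for this example.

```python
from typing import List

# Channel widths as quoted in the post.
STANDARD_CHANNELS = [128, 256, 512, 512]
SMALL_CHANNELS = [96, 192, 384, 384]


def width_ratios(narrow: List[int], wide: List[int]) -> List[float]:
    """Per-stage ratio of narrow width to wide width."""
    if len(narrow) != len(wide):
        raise ValueError("channel lists must have the same length")
    return [n / w for n, w in zip(narrow, wide)]


def conv_weight_ratio(r: float) -> float:
    """Weight-count ratio for a conv layer whose input and output
    channel counts are both scaled by r (weights scale with c_in * c_out)."""
    return r * r


if __name__ == "__main__":
    ratios = width_ratios(SMALL_CHANNELS, STANDARD_CHANNELS)
    print("width ratios:", ratios)
    print("approx. conv weight ratio:", conv_weight_ratio(ratios[0]))
```

Every stage is 0.75× as wide, so a layer of this kind would carry about 0.5625× the weights under the stated assumption.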

🤗 Comparator: https://huggingface.co/spaces/prithivMLmods/Flux.2-4B-Decoder-Comparator
🔗 FLUX.2-small-decoder: black-forest-labs/FLUX.2-small-decoder
🔗 GitHub: https://github.com/PRITHIVSAKTHIUR/Flux.2-4B-Encoder-Comparator
🚁 Collection: https://huggingface.co/collections/prithivMLmods/image-generation-apps-collection

🤗 > App built on the Gradio SDK. To learn more, visit the app page or the respective model pages.

prithivMLmods

posted an update about 2 months ago

Post

4246

Now, a collection of various compression schemes for Gemma 4 and the abliterated version 1 of dense models is available on the Hub. Check it out via the links below. 👇

🔗 Gemma 4 Compression(s): https://huggingface.co/collections/prithivMLmods/gemma-4-compressions
🔗 Gemma 4 Uncensored [MAX] + Compression(s) [β]: https://huggingface.co/collections/prithivMLmods/gemma-4-uncensored-max-compressions
🔗 Gemma 4 Compression(s) - MoE: https://huggingface.co/collections/prithivMLmods/gemma-4-compressions-moe
🔗 Gemma-4 F32 GGUF: https://huggingface.co/collections/prithivMLmods/gemma-4-f32-gguf

🤗 > To learn more, visit the app page or the respective model pages.

prithivMLmods

posted an update about 2 months ago

Post

2344

Now the demo for image detection based on SAM3 and Gemma-4 (*Filter) is available on Spaces, using full-fledged Transformers inference with multimodal reasoning for processed images. It also supports video segmentation (mask), video segmentation (annotation), and image click segmentation.

🤗 Demo Space: prithivMLmods/SAM3-Gemma4-CUDA
🥽 SAM3: facebook/sam3
🔗 gemma-4-E2B-it: google/gemma-4-E2B-it

To learn more, visit the app page or the respective model pages.

1 reply

prithivMLmods

posted an update about 2 months ago

Post

4778

The demo for Image Detection (*Filter) based on SAM3 and Qwen-3.5 is now available on Hugging Face Spaces using Transformers inference, with multimodal reasoning for processed images, and it also supports video segmentation (mask), video segmentation (annotation), and image click segmentation.

🤗 Demo Space: prithivMLmods/SAM3-Plus-Qwen3.5
🥽 SAM3: facebook/sam3
🔗 Qwen-3.5: Qwen/Qwen3.5-2B

To learn more, visit the app page or the respective model pages.

5 replies

kenza-ily

authored a paper about 2 months ago

DISCO: Document Intelligence Suite for COmparative Evaluation

Paper • 2603.23511 • Published Mar 4

prithivMLmods

posted an update 2 months ago

Post

5326

The Flux-Klein-KV-Edit-Consistency demo is now available on Spaces. It preserves character identity and delivers high-quality, realistic results after edits. No special prompts are needed: just upload the image, type your prompt, and get the resulting image quickly.

🔥 Demo Space: prithivMLmods/flux-klein-kv-edit-consistency
🤗 Model: black-forest-labs/FLUX.2-klein-9b-kv
🤗 Collection: https://huggingface.co/collections/prithivMLmods/image-generation-apps-collection
🔗 Gradio Server Mode: https://www.gradio.app/main/guides/server-mode

➔ Built with Headless Gradio, an alternative to using gr.Blocks for creating the frontend and triggering events, powered by FastAPI + Gradio. You can now design the frontend however you want, with continued support for APIs, MCP, and ZeroGPU.

➔ Gradio Server Mode is now available from gradio@v6.10.0.

To learn more, visit the app page or the respective model pages.

Cohere Labs Community

AI & ML interests

Recent Activity

Team members 171

CohereLabsCommunity's activity