Built a lane-based dataset bundle explorer for LLM training — would love feedback from the HF community

DinoDS · April 29, 2026, 6:52am

Hi everyone! I’ve been building DinoDS, a modular dataset system for LLM training built around lane-based dataset bundles.

The idea is simple: instead of treating training data like one giant premade dump, I’m organizing it into capability-focused bundles that map to specific assistant behaviors and failure types — things like:

retrieval grounding
workflow / tool routing
memory and continuity
structured outputs
identity and behavior shaping

I’ve started publishing some of these dataset bundle previews on Hugging Face, and I also made a Space that helps people explore which dataset bundle might actually be useful for their use case.

So the current flow is:

explore the DinoDS concept
identify what kind of assistant behavior you want to improve
see which bundle / lane family fits
check out the related dataset previews
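
To make that flow a bit more concrete, here is a minimal sketch of the kind of lookup it implies: describe a failure, get back candidate lanes. The lane names are the ones listed earlier in this post. The keyword lists, the `LANES` dictionary, and the `suggest_lanes` function are hypothetical and only illustrate the idea; this is not how the Space is actually implemented.

```python
# Hypothetical sketch: map a described assistant failure to candidate lanes.
# Lane names come from the post; the keywords are illustrative guesses only.

LANES = {
    "retrieval grounding": ["hallucinat", "citation", "source"],
    "workflow / tool routing": ["tool", "route", "routing", "workflow"],
    "memory and continuity": ["forget", "memory", "context"],
    "structured outputs": ["json", "schema", "format"],
    "identity and behavior shaping": ["persona", "tone", "refus"],
}


def suggest_lanes(failure_description: str) -> list[str]:
    """Return every lane whose keywords appear in the description."""
    text = failure_description.lower()
    return [
        lane
        for lane, keywords in LANES.items()
        if any(keyword in text for keyword in keywords)
    ]


if __name__ == "__main__":
    print(suggest_lanes("The model returns invalid JSON and picks the wrong tool"))
```

A plain keyword match like this is deliberately naive. It is only meant to show the shape of "failure in, lanes out" so the exploration question (failure type vs. capability vs. use case) is easier to reason about.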

I’d really love feedback from the HF community on a few things:

Does this bundle-first / lane-based way of presenting datasets make sense?
Is the Space + dataset bundle flow intuitive?
What would make these previews more useful for people evaluating training data?
Would you rather explore by failure type, capability, or use case?

You can check out the bundles, the Space, and the website here:

Hugging Face Space: Dinodataset Failure Mapper (by DinoDS)
Dataset bundles: DinoDS (DinoDS Labs)
Website: www.dinodsai.com

Would love thoughts, criticism, and suggestions — especially from people building assistants, copilots, routing systems, or structured-output workflows.
