Building on HF

Daniel Fox PRO

FlameF0X

https://flamef0x.github.io

AI & ML interests

Pre-training text generator. (Brother, I'm 18.) Please don't try to contact me.

Recent Activity

liked a model 11 minutes ago

nvidia/PiD

liked a model about 7 hours ago

jedisct1/MiMo-V2.5-coder-Q2

liked a model about 7 hours ago

openbmb/BitCPM-CANN-0.5B-gguf

Organizations

upvoted 2 papers about 19 hours ago

WorldKV: Efficient World Memory with World Retrieval and Compression

Paper • 2605.22718 • Published 5 days ago • 38

PiD: Fast and High-Resolution Latent Decoding with Pixel Diffusion

Paper • 2605.23902 • Published 4 days ago • 26

upvoted 4 papers 1 day ago

upvoted 3 articles 3 days ago

EMO: Pretraining mixture of experts for emergent modularity

Article • allenai • 18 days ago • 38

OlmoEarth v1.1: A more efficient family of Earth observation models

Article • allenai • 7 days ago • 18

Towards Speed-of-Light Text Generation with Nemotron-Labs Diffusion Language Models

Article • nvidia • 3 days ago • 17

upvoted 2 papers 3 days ago

Full Attention Strikes Back: Transferring Full Attention into Sparse within Hundred Training Steps

Paper • 2605.16928 • Published 10 days ago • 89

DelTA: Discriminative Token Credit Assignment for Reinforcement Learning from Verifiable Rewards

Paper • 2605.21467 • Published 6 days ago • 201

upvoted a changelog 4 days ago

Copy Repo Contents to Buckets Instantly

Hugging Face Changelog • 4 days ago • 40

upvoted a paper 5 days ago

HRM-Text: Efficient Pretraining Beyond Scaling

Paper • 2605.20613 • Published 6 days ago • 78

upvoted a changelog 6 days ago

Filter Leaderboards by Model Size

Hugging Face Changelog • 6 days ago • 98

upvoted a paper 6 days ago

Delta Attention Residuals

Paper • 2605.18855 • Published 13 days ago • 8

upvoted 4 papers 7 days ago

Quantitative Video World Model Evaluation for Geometric-Consistency

Paper • 2605.15185 • Published 12 days ago • 3

RAVEN: Real-time Autoregressive Video Extrapolation with Consistency-model GRPO

Paper • 2605.15190 • Published 12 days ago • 13

Long Context Pre-Training with Lighthouse Attention

Paper • 2605.06554 • Published 19 days ago • 29

Self-Distilled Agentic Reinforcement Learning

Paper • 2605.15155 • Published 12 days ago • 110

upvoted a paper 11 days ago

Why Are Linear RNNs More Parallelizable?

Paper • 2603.03612 • Published Mar 5 • 1
