SANA-Sprint Collection SANA-Sprint: One-Step Diffusion with Continuous-Time Consistency Distillation • 6 items • Updated 6 days ago • 35
Sana Collection ⚡️Sana: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer • 21 items • Updated 6 days ago • 90
DRAMA Collection A collection of small (sub-1B) multilingual dense retrievers that generalize well across a number of tasks and languages. • 3 items • Updated Feb 26 • 6
SuperBPE Collection SuperBPE tokenizers and models trained with them • 8 items • Updated 13 days ago • 14
Feature-Level Insights into Artificial Text Detection with Sparse Autoencoders Paper • 2503.03601 • Published Mar 5 • 230
SynthDetoxM: Modern LLMs are Few-Shot Parallel Detoxification Data Annotators Paper • 2502.06394 • Published Feb 10 • 90
🧠 Reasoning datasets Collection Datasets with reasoning traces for math and code released by the community • 21 items • Updated 7 days ago • 128
Ru Dialogue Benchmarks Collection A collection of benchmarks for evaluating the quality of dialogue models in Russian. • 3 items • Updated Jan 15 • 2
Byte Latent Transformer: Patches Scale Better Than Tokens Paper • 2412.09871 • Published Dec 13, 2024 • 101
Offline Reinforcement Learning for LLM Multi-Step Reasoning Paper • 2412.16145 • Published Dec 20, 2024 • 39
SmolLM2 Collection State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 16 items • Updated Feb 20 • 254
PingPong: A Benchmark for Role-Playing Language Models with User Emulation and Multi-Model Evaluation Paper • 2409.06820 • Published Sep 10, 2024 • 69
WebInstruct Embeddings 🧱 Models Collection A collection of SoTA embedding models fine-tuned on the WebInstruct dataset to learn to pair instructions with their responses • 3 items • Updated Sep 4, 2024 • 11