babycommando

AI & ML interests

ai!

babycommando's activity

liked a model 40 minutes ago

maxhirez/ShorNet

Graph Machine Learning • Updated 4 days ago • 1

upvoted a collection 13 days ago

Llasa

Collection

TTS foundation model compatible with Llama framework (160k hours tokenized speech data released) • 11 items • Updated Feb 21 • 18

liked a model 15 days ago

OuteAI/Llama-OuteTTS-1.0-1B-ONNX

Text-to-Speech • Updated 15 days ago • 68 • 7

liked a model 17 days ago

unsloth/Llama-4-Scout-17B-16E-Instruct

Image-Text-to-Text • Updated 11 days ago • 10.1k • 56

upvoted a paper 7 months ago

NVLM: Open Frontier-Class Multimodal LLMs

Paper • 2409.11402 • Published Sep 17, 2024 • 75

liked 2 models 8 months ago

yukiarimo/yuna-ai-v2

Text Generation • Updated Sep 21, 2024 • 105 • 4

Qwen/Qwen2-Audio-7B-Instruct

Audio-Text-to-Text • Updated Jan 12 • 86.6k • 418

upvoted a paper 9 months ago

Learning to Manipulate Anywhere: A Visual Generalizable Framework For Reinforcement Learning

Paper • 2407.15815 • Published Jul 22, 2024 • 14

liked a model 9 months ago

PatronusAI/Llama-3-Patronus-Lynx-8B-Instruct

Text Generation • Updated Jul 22, 2024 • 294 • 42

upvoted a paper 10 months ago

Wavelets Are All You Need for Autoregressive Image Generation

Paper • 2406.19997 • Published Jun 28, 2024 • 32

liked a model 10 months ago

internlm/internlm-xcomposer2d5-7b

Visual Question Answering • Updated Jul 22, 2024 • 1.56k • 203

upvoted a paper 10 months ago

InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output

Paper • 2407.03320 • Published Jul 3, 2024 • 96

liked 2 Spaces 10 months ago

754

Florence 2

📉

Analyze images to generate captions, detect objects, or perform OCR

536

AuraSR-v2

😻

Upscale images to x4

upvoted a paper 10 months ago

Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs

Paper • 2406.16860 • Published Jun 24, 2024 • 61

upvoted an article 10 months ago

Fine-tuning Florence-2 - Microsoft's Cutting-edge Vision Language Models

Article • Jun 24, 2024 • 192

upvoted a paper 10 months ago

LiveMind: Low-latency Large Language Models with Simultaneous Inference

Paper • 2406.14319 • Published Jun 20, 2024 • 14

liked 2 models 10 months ago

facebook/multi-token-prediction

Updated Jun 18, 2024 • 368

nvidia/Nemotron-4-340B-Base

Updated Jun 28, 2024 • 58 • 145