Joe Rocca

rocca

AI & ML interests

None yet

Recent Activity

liked a model about 13 hours ago
HiDream-ai/HiDream-I1-Full
liked a model about 14 hours ago
microsoft/bitnet-b1.58-2B-4T
liked a model about 14 hours ago
OpenGVLab/VideoMAE2

Organizations

DeepGHS

rocca's activity

reacted to merve's post with 🤗 29 days ago
So many open releases at Hugging Face past week 🤯 recapping all here ⤵️ merve/march-21-releases-67dbe10e185f199e656140ae

👀 Multimodal
> Mistral AI released a 24B vision LM, both base and instruction FT versions, sota 🔥 (OS)
> with IBM we released SmolDocling, a sota 256M document parser with Apache 2.0 license (OS)
> SpatialLM is a new vision LM that outputs 3D bounding boxes, comes with 0.5B (QwenVL based) and 1B (Llama based) variants
> SkyWork released SkyWork-R1V-38B, new vision reasoning model (OS)

💬 LLMs
> NVIDIA released new Nemotron models in 49B and 8B with their post-training dataset
> LG released EXAONE, new reasoning models in 2.4B, 7.8B and 32B
> Dataset: Glaive AI released a new reasoning dataset of 22M+ examples
> Dataset: NVIDIA released new helpfulness dataset HelpSteer3
> Dataset: OpenManusRL is a new agent dataset based on ReAct framework (OS)
> Open-R1 team released OlympicCoder, new competitive coder model in 7B and 32B
> Dataset: GeneralThought-430K is a new reasoning dataset (OS)

🖼️ Image Generation/Computer Vision
> Roboflow released RF-DETR, new real-time sota object detector (OS) 🔥
> YOLOE is a new real-time zero-shot object detector with text and visual prompts 🥹
> Stability AI released Stable Virtual Camera, a new novel view synthesis model
> Tencent released Hunyuan3D-2mini, new small and fast 3D asset generation model
> ByteDance released InfiniteYou, new realistic photo generation model
> StarVector is a new 8B model that generates svg from images
> FlexWorld is a new model that expands 3D views (OS)

🎤 Audio
> Sesame released CSM-1B new speech generation model (OS)

🤖 Robotics
> NVIDIA released GR00T, new robotics model for generalized reasoning and skills, along with the dataset

*OS ones have Apache 2.0 or MIT license
New activity in lodestones/Chroma about 1 month ago

Diffusers Roadmap?

#5 opened about 1 month ago by Impulse2000