Utility - a rivasmig Collection

updated Jul 3

StdGEN: Semantic-Decomposed 3D Character Generation from Single Images

Paper • 2411.05738 • Published Nov 8, 2024 • 15
A Pointer Network-based Approach for Joint Extraction and Detection of Multi-Label Multi-Class Intents

Paper • 2410.22476 • Published Oct 29, 2024 • 29
OS-ATLAS: A Foundation Action Model for Generalist GUI Agents

Paper • 2410.23218 • Published Oct 30, 2024 • 51
Training-free Regional Prompting for Diffusion Transformers

Paper • 2411.02395 • Published Nov 4, 2024 • 26
DimensionX: Create Any 3D and 4D Scenes from a Single Image with Controllable Video Diffusion

Paper • 2411.04928 • Published Nov 7, 2024 • 58
DynaMem: Online Dynamic Spatio-Semantic Memory for Open World Mobile Manipulation

Paper • 2411.04999 • Published Nov 7, 2024 • 18
GazeGen: Gaze-Driven User Interaction for Visual Content Generation

Paper • 2411.04335 • Published Nov 7, 2024 • 15
GarVerseLOD: High-Fidelity 3D Garment Reconstruction from a Single In-the-Wild Image using a Dataset with Levels of Details

Paper • 2411.03047 • Published Nov 5, 2024 • 9
CLEAR: Character Unlearning in Textual and Visual Modalities

Paper • 2410.18057 • Published Oct 23, 2024 • 210
Generative World Explorer

Paper • 2411.11844 • Published Nov 18, 2024 • 79
RedPajama: an Open Dataset for Training Large Language Models

Paper • 2411.12372 • Published Nov 19, 2024 • 57
Tensor Product Attention Is All You Need

Paper • 2501.06425 • Published Jan 11 • 89
WebWalker: Benchmarking LLMs in Web Traversal

Paper • 2501.07572 • Published Jan 13 • 21
UnCommon Objects in 3D

Paper • 2501.07574 • Published Jan 13 • 13
ChemAgent: Self-updating Library in Large Language Models Improves Chemical Reasoning

Paper • 2501.06590 • Published Jan 11 • 11
Evaluating Sample Utility for Data Selection by Mimicking Model Weights

Paper • 2501.06708 • Published Jan 12 • 5
MiniMax-01: Scaling Foundation Models with Lightning Attention

Paper • 2501.08313 • Published Jan 14 • 298
HoloPart: Generative 3D Part Amodal Segmentation

Paper • 2504.07943 • Published Apr 10 • 29
BitNet b1.58 2B4T Technical Report

Paper • 2504.12285 • Published Apr 16 • 74
Cobra: Efficient Line Art COlorization with BRoAder References

Paper • 2504.12240 • Published Apr 16 • 28
OmniSVG: A Unified Scalable Vector Graphics Generation Model

Paper • 2504.06263 • Published Apr 8 • 180
One-Minute Video Generation with Test-Time Training

Paper • 2504.05298 • Published Apr 7 • 109
Envisioning Beyond the Pixels: Benchmarking Reasoning-Informed Visual Editing

Paper • 2504.02826 • Published Apr 3 • 69
Any2Caption: Interpreting Any Condition to Caption for Controllable Video Generation

Paper • 2503.24379 • Published Mar 31 • 77
MoCha: Towards Movie-Grade Talking Character Synthesis

Paper • 2503.23307 • Published Mar 30 • 138
Open-Reasoner-Zero: An Open Source Approach to Scaling Up Reinforcement Learning on the Base Model

Paper • 2503.24290 • Published Mar 31 • 63
Segment Any Motion in Videos

Paper • 2503.22268 • Published Mar 28 • 19
Image as an IMU: Estimating Camera Motion from a Single Motion-Blurred Image

Paper • 2503.17358 • Published Mar 21 • 6
LPOSS: Label Propagation Over Patches and Pixels for Open-vocabulary Semantic Segmentation

Paper • 2503.19777 • Published Mar 25 • 1
StructEval: Benchmarking LLMs' Capabilities to Generate Structural Outputs

Paper • 2505.20139 • Published May 26 • 18
Embodied Web Agents: Bridging Physical-Digital Realms for Integrated Agent Intelligence

Paper • 2506.15677 • Published Jun 18 • 24
Phantom-Data: Towards a General Subject-Consistent Video Generation Dataset

Paper • 2506.18851 • Published Jun 23 • 29
3D Arena: An Open Platform for Generative 3D Evaluation

Paper • 2506.18787 • Published Jun 23 • 12
AnimaX: Animating the Inanimate in 3D with Joint Video-Pose Diffusion Models

Paper • 2506.19851 • Published Jun 24 • 59
Drag-and-Drop LLMs: Zero-Shot Prompt-to-Weights

Paper • 2506.16406 • Published Jun 19 • 126
Inverse-and-Edit: Effective and Fast Image Editing by Cycle Consistency Models

Paper • 2506.19103 • Published Jun 23 • 42
HiWave: Training-Free High-Resolution Image Generation via Wavelet-Based Diffusion Sampling

Paper • 2506.20452 • Published Jun 25 • 19
ReCode: Updating Code API Knowledge with Reinforcement Learning

Paper • 2506.20495 • Published Jun 25 • 8
LongAnimation: Long Animation Generation with Dynamic Global-Local Memory

Paper • 2507.01945 • Published Jul 2 • 76
JAM-Flow: Joint Audio-Motion Synthesis with Flow Matching

Paper • 2506.23552 • Published Jun 30 • 10