Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think Paper • 2409.11355 • Published Sep 17, 2024
Benchmarks and Challenges in Pose Estimation for Egocentric Hand Interactions with Objects Paper • 2403.16428 • Published Mar 25, 2024
Point2Vec for Self-Supervised Representation Learning on Point Clouds Paper • 2303.16570 • Published Mar 29, 2023
InstrumentGen: Generating Sample-Based Musical Instruments From Text Paper • 2311.04339 • Published Nov 7, 2023
Distortion Audio Effects: Learning How to Recover the Clean Signal Paper • 2202.01664 • Published Feb 3, 2022
Generating Sample-Based Musical Instruments Using Neural Audio Codec Language Models Paper • 2407.15641 • Published Jul 22, 2024
DSP-informed bandwidth extension using locally-conditioned excitation and linear time-varying filter subnetworks Paper • 2407.15624 • Published Jul 22, 2024
Development of Hybrid ASR Systems for Low Resource Medical Domain Conversational Telephone Speech Paper • 2210.13397 • Published Oct 24, 2022
Unsupervised Pre-Training for Vietnamese Automatic Speech Recognition in the HYKIST Project Paper • 2309.15869 • Published Sep 26, 2023
Real-time Speech Summarization for Medical Conversations Paper • 2406.15888 • Published Jun 22, 2024
VietMed: A Dataset and Benchmark for Automatic Speech Recognition of Vietnamese in the Medical Domain Paper • 2404.05659 • Published Apr 8, 2024