henern's Collections

Audio

updated Feb 12

Audio/Music/Speech/etc.

  • Language Model Can Listen While Speaking

    Paper • 2408.02622 • Published Aug 5, 2024 • 43

  • Llasa: Scaling Train-Time and Inference-Time Compute for Llama-based Speech Synthesis

    Paper • 2502.04128 • Published Feb 6, 2025 • 26