---
license: other
license_name: nvidia-oneway-noncommercial-license
---

# PyTorch Implementation of Audio-to-Audio Schrodinger Bridges

**Zhifeng Kong, Kevin J Shih, Weili Nie, Arash Vahdat, Sang-gil Lee, Joao Felipe Santos, Ante Jukic, Rafael Valle, Bryan Catanzaro**

[[paper]](https://arxiv.org/abs/2501.11311) [[GitHub]](https://github.com/NVIDIA/diffusion-audio-restoration) [[Demo]](https://research.nvidia.com/labs/adlr/A2SB/)

This repo contains the PyTorch implementation of [A2SB: Audio-to-Audio Schrodinger Bridges](https://arxiv.org/abs/2501.11311). A2SB is an audio restoration model tailored for high-res music at 44.1kHz. It supports both bandwidth extension (predicting high-frequency components) and inpainting (re-generating missing segments). Critically, A2SB is end-to-end: it predicts waveform outputs without a vocoder, and it can restore hour-long audio inputs. A2SB achieves state-of-the-art bandwidth extension and inpainting quality on several out-of-distribution music test sets.

- We propose A2SB, a state-of-the-art, end-to-end, vocoder-free, and multi-task diffusion Schrodinger Bridge model for 44.1kHz high-res music restoration, using an effective factorized audio representation.
- A2SB is the first long audio restoration model that can restore hour-long audio without boundary artifacts.

## License

The model is provided under the NVIDIA OneWay NonCommercial License.

## Citation

```
@article{kong2025a2sb,
  title={A2SB: Audio-to-Audio Schrodinger Bridges},
  author={Kong, Zhifeng and Shih, Kevin J and Nie, Weili and Vahdat, Arash and Lee, Sang-gil and Santos, Joao Felipe and Jukic, Ante and Valle, Rafael and Catanzaro, Bryan},
  journal={arXiv preprint arXiv:2501.11311},
  year={2025}
}
```