Update README.md
README.md
CHANGED
@@ -12,7 +12,7 @@ pipeline_tag: audio-to-audio

 # Scaling Analysis of Interleaved Speech-Text Language Models

-The model was presented in the paper [Scaling Analysis of Interleaved Speech-Text Language Models](https://arxiv.org/abs/).
+The model was presented in the paper [Scaling Analysis of Interleaved Speech-Text Language Models](https://arxiv.org/abs/2504.02398).

 # Paper abstract
 Existing Speech Language Model (SLM) scaling analysis paints a bleak picture. They predict that SLMs require much more compute and data
@@ -32,7 +32,7 @@ This is a Speech Language Model (SLM) trained for generating speech or text cont
 ## Model Details

 ### Model Description
-This Speech Language Model, introduced in ["Scaling Analysis of Interleaved Speech-Text Language Models"](https://arxiv.org/abs/), focuses on scaling analysis of interleaved speech-text SLMs.
+This Speech Language Model, introduced in ["Scaling Analysis of Interleaved Speech-Text Language Models"](https://arxiv.org/abs/2504.02398), focuses on scaling analysis of interleaved speech-text SLMs.
 It was fine-tuned from [Qwen/Qwen2.5-7B](https://huggingface.co/Qwen/Qwen2.5-7B) by extending its vocabulary with 500 speech tokens extracted from
 the 11-th layer of [mhubert-25hz](https://huggingface.co/slprl/mhubert-base-25hz).

@@ -44,7 +44,7 @@ the 11-th layer of [mhubert-25hz](https://huggingface.co/slprl/mhubert-base-25hz
 ### Model Sources

 - **Repository:** [https://github.com/slp-rl/slamkit](https://github.com/slp-rl/slamkit)
-- **Paper:** [https://arxiv.org/abs/](https://arxiv.org/abs/)
+- **Paper:** [https://arxiv.org/abs/](https://arxiv.org/abs/2504.02398)
 - **Demo:** [https://pages.cs.huji.ac.il/adiyoss-lab/sims/](https://pages.cs.huji.ac.il/adiyoss-lab/sims/)

 ## Uses
@@ -60,7 +60,7 @@ We refer users to the official repository for full usage explanations - [github]


 ## Training Details
-We highly encourage users to read the full [paper](https://arxiv.org/abs/
+We highly encourage users to read the full [paper](https://arxiv.org/abs/2504.02398), for full training details.


 ### Compute Infrastructure
@@ -76,6 +76,12 @@ easy and efficient training of Speech Language Models.
 **BibTeX:**
 ```
 @misc{maimon2025scaling,
-
+title={Scaling Analysis of Interleaved Speech-Text Language Models},
+author={Gallil Maimon and Michael Hassid and Amit Roth and Yossi Adi},
+year={2025},
+eprint={2504.02398},
+archivePrefix={arXiv},
+primaryClass={cs.CL},
+url={https://arxiv.org/abs/2504.02398},
 }
 ```