Update README.md
README.md CHANGED
@@ -65,20 +65,7 @@ For all the 4 sizes of SpeechLMM 1.0, the audio and video adapters are:

 Currently, this model can only be used via our [`speechlmm`](https://github.com/meetween/speechlmm) codebase. Refer to the instructions there for more details.

-Important: before you can use this model, you must
-1. Download the SeamlessM4T v2 speech encoder weights:
-```python
-from transformers import AutoProcessor, SeamlessM4Tv2Model
-
-processor = AutoProcessor.from_pretrained("facebook/seamless-m4t-v2-large")
-model = SeamlessM4Tv2Model.from_pretrained("facebook/seamless-m4t-v2-large")
-
-processor.save_pretrained("path/to/some_directory_1")
-model.speech_encoder.save_pretrained("path/to/some_directory_1")
-```
-2. Go to `config.json` and change `audio_encoder._name_or_path` to `path/to/some_directory_1`
-3. Download the Auto-AVSR video encoder weights from [here](https://drive.google.com/file/d/1shcWXUK2iauRhW9NbwCc25FjU1CoMm8i/view?usp=sharing) and put them in `path/to/some_directory_2`
-4. Go to `config.json` and change `video_encoder._name_or_path` to `path/to/some_directory_2/vsr_trlrs3vox2_base.pth`
+Important: before you can use this model, you must download the SeamlessM4T v2 speech encoder by following the instructions provided in the README of the above repo. Please note that by downloading SeamlessM4T v2, you agree to its license terms.

 ## Training Data
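
The manual `config.json` edits described in the removed steps can also be scripted. A minimal sketch: the `update_encoder_paths` helper and the stand-in config dict below are illustrative (not part of the `speechlmm` codebase), and the `path/to/...` placeholders must be replaced with the actual download locations:

```python
import json

def update_encoder_paths(config: dict, audio_dir: str, video_ckpt: str) -> dict:
    """Point a SpeechLMM config at locally downloaded encoder weights.

    Mirrors the manual edits: audio_encoder._name_or_path gets the directory
    holding the SeamlessM4T v2 speech encoder, video_encoder._name_or_path
    gets the Auto-AVSR checkpoint file.
    """
    config["audio_encoder"]["_name_or_path"] = audio_dir
    config["video_encoder"]["_name_or_path"] = video_ckpt
    return config

# Stand-in for the real config.json, reduced to the two keys touched here.
config = {
    "audio_encoder": {"_name_or_path": "facebook/seamless-m4t-v2-large"},
    "video_encoder": {"_name_or_path": ""},
}
config = update_encoder_paths(
    config,
    "path/to/some_directory_1",
    "path/to/some_directory_2/vsr_trlrs3vox2_base.pth",
)
print(json.dumps(config, indent=2))
```

In practice you would read the real `config.json` with `json.load`, apply the same two assignments, and write it back.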