Hobit2002/whisper_tiny_cs

Automatic Speech Recognition

knowledge-distillation

Hobit2002 committed on Apr 27

Commit 154d256 · verified · 1 parent: 8e01df4

Update README.md

Files changed (1)

README.md +4 -4

@@ -28,11 +28,11 @@ During early experiments, we observed that Whisper Tiny often produced invalid o
 The training loss combined the standard ASR loss with KD loss:
-\[
-L_{t} = \lambda_{lm} \, \text{CE}(\text{asr}, \text{true token}) + (1 - \lambda_{lm}) \, \text{KLD}(\text{asr distribution}, \text{mlm prediction})
-\]
-where \(\lambda_{lm}\) balances the two components.
+$$
+L_t = \lambda_{lm} \, \text{CE}(\text{asr}, \text{true token}) + (1 - \lambda_{lm}) \, \text{KLD}(\text{asr distribution}, \text{mlm prediction})
+$$
+where $\lambda_{lm}$ balances the two components.
 ### Hyperparameters
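
The combined loss in this hunk can be sketched for a single token position as below. This is a minimal illustration, not the repository's training code: the function names (`softmax`, `cross_entropy`, `kl_divergence`, `combined_loss`) are hypothetical, and the KL direction (teacher MLM distribution against the student ASR distribution) is an assumption, since the README does not state which argument is the reference.

```python
import math


def softmax(logits):
    """Convert raw logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, target_index):
    """CE between a predicted distribution and a one-hot target."""
    return -math.log(probs[target_index])


def kl_divergence(p, q):
    """KL(p || q); terms where p_i == 0 contribute nothing."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def combined_loss(asr_logits, mlm_probs, true_token, lambda_lm):
    """L_t = lambda_lm * CE + (1 - lambda_lm) * KLD.

    Assumption: the KLD term is KL(mlm || asr), i.e. the MLM prediction
    is treated as the teacher distribution.
    """
    asr_probs = softmax(asr_logits)
    ce = cross_entropy(asr_probs, true_token)
    kld = kl_divergence(mlm_probs, asr_probs)
    return lambda_lm * ce + (1.0 - lambda_lm) * kld
```

With `lambda_lm = 1.0` the expression reduces to the plain ASR cross-entropy, and with `lambda_lm = 0.0` only the distillation term remains, which is zero when the ASR distribution already matches the MLM prediction.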