|
--- |
|
library_name: transformers |
|
tags: |
|
- detoxification |
|
- text_style_transfer |
|
license: openrail++ |
|
datasets: |
|
- s-nlp/synthdetoxm |
|
language: |
|
- de |
|
- es |
|
- fr |
|
- ru |
|
base_model: |
|
- bigscience/mt0-xl |
|
pipeline_tag: text2text-generation |
|
--- |
|
|
|
# mT0-XL (SynthDetoxM Full) |
|
|
|
|
|
 |
|
|
|
|
|
|
This is a fine-tune of the [`bigscience/mt0-xl`](https://huggingface.co/bigscience/mt0-xl) model on the full multilingual text detoxification dataset [SynthDetoxM](https://huggingface.co/datasets/s-nlp/synthdetoxm), introduced in the NAACL 2025 Main Track paper *SynthDetoxM: Modern LLMs are Few-Shot Parallel Detoxification Data Annotators* by Daniil Moskovskiy et al.
|
|
|
## Usage |
|
|
|
The usage is similar to that of the base [`bigscience/mt0-xl`](https://huggingface.co/bigscience/mt0-xl) model. Prefix the input text with `Detoxify: `, as in the example below:
|
|
|
```python |
|
from transformers import pipeline |
|
|
|
toxic_text = "Your toxic text goes here." |
|
|
|
pipe = pipeline("text2text-generation", model="s-nlp/mt0-xl-detox-sdm-full") |
|
pipe(f"Detoxify: {toxic_text}") |
|
|
|
``` |
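
For more control over generation, the checkpoint can also be loaded directly. Below is a minimal sketch assuming the same `Detoxify: ` prompt; the generation parameters are illustrative, not the settings used in the paper:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "s-nlp/mt0-xl-detox-sdm-full"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# The model expects the same "Detoxify: " prefix as in the pipeline example.
inputs = tokenizer("Detoxify: Your toxic text goes here.", return_tensors="pt")
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```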
|
|
|
## Training Details |
|
|
|
The model was fine-tuned for 2 epochs on the [`s-nlp/synthdetoxm`](https://huggingface.co/datasets/s-nlp/synthdetoxm) dataset in full precision (FP32), using the Adafactor optimizer with a learning rate of `1e-4`, a batch size of `4`, and gradient checkpointing enabled. The full training configuration is given below:
|
|
|
```json |
|
{ |
|
"do_train": true, |
|
"do_eval": true, |
|
"per_device_train_batch_size": 4, |
|
"per_device_eval_batch_size": 4, |
|
"learning_rate": 1e-4, |
|
"weight_decay": 0, |
|
"num_train_epochs": 2, |
|
"gradient_accumulation_steps": 1, |
|
"logging_strategy": "steps", |
|
"logging_steps": 1, |
|
"save_strategy": "epoch", |
|
"save_total_limit": 1, |
|
"warmup_steps": 1, |
|
"report_to": "wandb", |
|
"optim": "adafactor", |
|
"lr_scheduler_type": "linear", |
|
"predict_with_generate": true, |
|
"bf16": false, |
|
"gradient_checkpointing": true, |
|
"output_dir": "/path/", |
|
"seed": 42, |
|
} |
|
|
|
``` |
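
For reference, here is a sketch of how this configuration maps onto `transformers.Seq2SeqTrainingArguments`; dataset preprocessing and the `Seq2SeqTrainer` setup are omitted:

```python
# A sketch mapping the configuration above onto Seq2SeqTrainingArguments.
# Pass the resulting object to a Seq2SeqTrainer together with the model,
# tokenizer, and the preprocessed s-nlp/synthdetoxm dataset.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="/path/",
    do_train=True,
    do_eval=True,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    learning_rate=1e-4,
    weight_decay=0.0,
    num_train_epochs=2,
    gradient_accumulation_steps=1,
    logging_strategy="steps",
    logging_steps=1,
    save_strategy="epoch",
    save_total_limit=1,
    warmup_steps=1,
    report_to="wandb",
    optim="adafactor",
    lr_scheduler_type="linear",
    predict_with_generate=True,
    bf16=False,
    gradient_checkpointing=True,
    seed=42,
)
```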
|
|
|
#### Metrics |
|
|
|
|
|
|
We use the multilingual detoxification evaluation setup from [TextDetox 2024 Multilingual Text Detoxification Shared Task](https://pan.webis.de/clef24/pan24-web/text-detoxification.html). |
|
Specifically, we use the following metrics: |
|
|
|
- **Style Transfer Accuracy** (**STA**) is computed with the [`textdetox/xlmr-large-toxicity-classifier`](https://huggingface.co/textdetox/xlmr-large-toxicity-classifier) model.

- **Text Similarity** (**SIM**) is computed as the cosine similarity between text embeddings produced by the [`sentence-transformers/LaBSE`](https://huggingface.co/sentence-transformers/LaBSE) encoder.

- **Fluency** (**FL**) is computed as the character n-gram F-score [ChrF1](https://github.com/m-popovic/chrF).
|
|
|
These metrics are aggregated in a final **Joint** metric (**J**): |
|
|
|
$$\textbf{J} = \frac{1}{n}\sum\limits_{i=1}^{n}\textbf{STA}(y_i) \cdot \textbf{SIM}(x_i,y_i) \cdot \textbf{FL}(x_i, y_i)$$ |
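
For illustration, here is a minimal sketch of computing **J**, assuming cosine similarity for **SIM**, `sacrebleu`'s sentence-level ChrF for **FL**, and a label convention for the toxicity classifier ("neutral" meaning non-toxic); these details are assumptions and may differ from the official evaluation scripts:

```python
# A sketch of the J metric under the setup described above. The classifier
# label names and the use of cosine similarity are assumptions.
import sacrebleu
from sentence_transformers import SentenceTransformer, util
from transformers import pipeline

sta_clf = pipeline("text-classification", model="textdetox/xlmr-large-toxicity-classifier")
labse = SentenceTransformer("sentence-transformers/LaBSE")

def joint_score(sources: list[str], outputs: list[str]) -> float:
    scores = []
    for x, y in zip(sources, outputs):
        pred = sta_clf(y)[0]
        # Assumed label convention: "neutral" = non-toxic output.
        sta = pred["score"] if pred["label"] == "neutral" else 1.0 - pred["score"]
        sim = util.cos_sim(labse.encode(x), labse.encode(y)).item()
        fl = sacrebleu.sentence_chrf(y, [x]).score / 100.0  # sacrebleu returns 0-100
        scores.append(sta * sim * fl)
    return sum(scores) / len(scores)
```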
|
|
|
### Evaluation Results |
|
|
|
This model was evaluated on the test set of the [`textdetox/multilingual_paradetox`](https://huggingface.co/datasets/textdetox/multilingual_paradetox) dataset from the [TextDetox 2024 Multilingual Text Detoxification Shared Task](https://pan.webis.de/clef24/pan24-web/text-detoxification.html).

The evaluation results (final **J** metric) are presented below.
|
|
|
| **Model** | **German** | **Spanish** | **Russian** |
|
|----------------|------------|-------------|-------------| |
|
| **Human References** | 0.733 | 0.709 | 0.732 | |
|
| **Baselines** | | | | |
|
| Duplicate | 0.287 | 0.090 | 0.048 | |
|
| Delete | 0.362 | 0.319 | 0.255 | |
|
| Backtranslation| 0.233 | 0.275 | 0.223 | |
|
| **mT0-XL supervised fine-tuning** | | | | |
|
| [MultiParaDetox](https://huggingface.co/datasets/textdetox/multilingual_paradetox) [`s-nlp/mt0-xl-detox-mpd`](https://huggingface.co/s-nlp/mt0-xl-detox-mpd) | 0.446 | 0.344 | 0.472 | |
|
| [SynthDetoxM](https://huggingface.co/datasets/s-nlp/synthdetoxm) (subset average) | 0.460 | 0.402 | 0.475 |
|
| [SynthDetoxM](https://huggingface.co/datasets/s-nlp/synthdetoxm) (full) [`s-nlp/mt0-xl-detox-sdm-full`](https://huggingface.co/s-nlp/mt0-xl-detox-sdm-full) (**this model**) | **0.482** | **0.470** | **0.546** |
|
|
|
|
|
#### Software |
|
|
|
Code for replicating the results from the paper can be found on [GitHub](https://github.com/s-nlp/synthdetoxm). |
|
|
|
## Citation |
|
|
|
**BibTeX:** |
|
|
|
```latex |
|
@misc{moskovskiy2025synthdetoxmmodernllmsfewshot, |
|
title={SynthDetoxM: Modern LLMs are Few-Shot Parallel Detoxification Data Annotators}, |
|
author={Daniil Moskovskiy and Nikita Sushko and Sergey Pletenev and Elena Tutubalina and Alexander Panchenko}, |
|
year={2025}, |
|
eprint={2502.06394}, |
|
archivePrefix={arXiv}, |
|
primaryClass={cs.CL}, |
|
url={https://arxiv.org/abs/2502.06394}, |
|
} |
|
``` |
|
|
|
## License |
|
|
|
This model is licensed under the OpenRAIL++ License, which supports the development of technologies, both industrial and academic, that serve the public good.
|
|
|
## Model Card Authors |
|
|
|
[Daniil Moskovskiy](https://huggingface.co/etomoscow) |
|
|
|
## Model Card Contact |
|
|
|
For any questions, please contact: [Daniil Moskovskiy](mailto:Daniil.Moskovskiy@skoltech.ru)