Update README.md
README.md
CHANGED
@@ -42,26 +42,128 @@ new_version: AnasAlokla/multilingual_go_emotions
@@ -133,3 +235,51 @@ The table below shows the performance of the test model with a threshold of 0.5:
pipeline_tag: text-classification
---

# Multilingual GoEmotions Classifier

## Table of Contents

- [Overview](#overview)
- [Key Features](#key-features)
- [Supported Emotions](#supported-emotions)
- [Links](#links)
- [Installation](#installation)
- [Quickstart: Emotion Detection](#quickstart-emotion-detection)
- [Evaluation](#evaluation)
- [Use Cases](#use-cases)
- [Trained On](#trained-on)
- [Fine-Tuning Guide](#fine-tuning-guide)
- [Tags](#tags)
- [Support & Contact](#support--contact)

## Overview

This repository contains a **multilingual, multi-label emotion classification model**. It is fine-tuned from `bert-base-multilingual-cased` on the `multilingual_go_emotions` dataset. The model classifies text into 27 emotions plus a neutral category. Because it is multi-label, it can assign several emotions to the same text, which helps with mixed or nuanced messages.

- **Model Name**: AnasAlokla/multilingual_go_emotions_V1.1
- **Architecture**: BERT (bert-base-multilingual-cased)
- **Tasks**: Multi-Label Text Classification | Emotion Detection | Sentiment Analysis
- **Languages**: Arabic, English, French, Spanish, Dutch, Turkish

## Key Features

- **Truly Multilingual**: Supports 6 languages (Arabic, English, French, Spanish, Dutch, Turkish), making it suitable for global applications.
- **Multi-Label Classification**: Detects multiple emotions in a single piece of text, capturing complex emotional expressions.
- **High Performance**: Built on `bert-base-multilingual-cased`. See the detailed [evaluation metrics](#evaluation) for per-emotion results.
- **Open & Accessible**: Comes with a live demo, the full dataset, and the complete training code for transparency and reproducibility.
- **Improved Version (V1.1)**: An updated model is available that specifically improves performance on low-frequency emotion samples.

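The multi-label behaviour can be illustrated with a small, self-contained sketch. It assumes (the model card does not confirm this) that each label is scored independently with a sigmoid, as is common for multi-label classification heads. The logits are made-up numbers, not model output.

```python
import math


def sigmoid(x: float) -> float:
    """Squash one logit into (0, 1), independently of the other labels."""
    return 1.0 / (1.0 + math.exp(-x))


def softmax(values):
    """Normalise logits so the scores compete and sum to 1."""
    exps = [math.exp(v) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for three labels (illustrative only).
logits = {"joy": 2.0, "sadness": 1.5, "anger": -3.0}

sigmoid_scores = {label: sigmoid(v) for label, v in logits.items()}
softmax_scores = dict(zip(logits, softmax(list(logits.values()))))

threshold = 0.5
multi_label = [l for l, s in sigmoid_scores.items() if s >= threshold]
single_label = [l for l, s in softmax_scores.items() if s >= threshold]

print(multi_label)   # labels kept with independent sigmoid scores
print(single_label)  # labels kept when scores compete via softmax
```

With independent scores, both `joy` and `sadness` clear the threshold; with softmax, the scores compete and only `joy` does.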
## Supported Emotions

The model is trained to classify text into 27 distinct emotion categories as well as a neutral class:

| Emotion        | Emotion     |
|----------------|-------------|
| Admiration     | Love        |
| Amusement      | Nervousness |
| Anger          | Optimism    |
| Annoyance      | Pride       |
| Approval       | Realization |
| Caring         | Relief      |
| Confusion      | Remorse     |
| Curiosity      | Sadness     |
| Desire         | Surprise    |
| Disappointment | Disapproval |
| Disgust        | Gratitude   |
| Embarrassment  | Grief       |
| Excitement     | Joy         |
| Fear           | Neutral     |

## Links

* **Live Demo:** [**Hugging Face Space**](https://huggingface.co/spaces/AnasAlokla/test_emotion_chatbot)
* **Dataset (Supports 6 Languages):** [**multilingual_go_emotions**](https://huggingface.co/datasets/AnasAlokla/multilingual_go_emotions)
* **Base Model Used:** [**AnasAlokla/multilingual_go_emotions**](https://huggingface.co/AnasAlokla/multilingual_go_emotions)
* **GitHub Code:** [**emotion_chatbot**](https://github.com/anasAloklah/emotion_chatbot)

## Installation

Install the required libraries using pip:

```bash
pip install transformers torch
```

## Quickstart: Emotion Detection

You can use this model for multi-label emotion classification with the `transformers` pipeline. Set `top_k=None` to return the score for every label rather than only the top one. To get the predicted emotions, keep the labels whose score reaches a threshold (the evaluation below uses 0.5).

```python
from transformers import pipeline

# Load the multilingual, multi-label emotion classification pipeline
emotion_classifier = pipeline(
    "text-classification",
    model="AnasAlokla/multilingual_go_emotions",
    top_k=None  # To return all scores for each label
)

# --- Example 1: English ---
text_en = "I'm so happy for you, but I'm also a little bit sad to see you go."
results_en = emotion_classifier(text_en)
print(f"Text (EN): {text_en}")
print(f"Predictions: {results_en}\n")

# --- Example 2: Spanish ---
text_es = "¡Qué sorpresa! No me lo esperaba para nada."
results_es = emotion_classifier(text_es)
print(f"Text (ES): {text_es}")
print(f"Predictions: {results_es}\n")

# --- Example 3: Arabic ---
text_ar = "أشعر بخيبة أمل وغضب بسبب ما حدث"
results_ar = emotion_classifier(text_ar)
print(f"Text (AR): {text_ar}")
print(f"Predictions: {results_ar}")
```

Expected Output (structure):

    Text (EN): I'm so happy for you, but I'm also a little bit sad to see you go.
    Predictions: [[{'label': 'joy', 'score': 0.9...}, {'label': 'sadness', 'score': 0.8...}, {'label': 'caring', 'score': 0.5...}, ...]]

    Text (ES): ¡Qué sorpresa! No me lo esperaba para nada.
    Predictions: [[{'label': 'surprise', 'score': 0.9...}, {'label': 'excitement', 'score': 0.4...}, ...]]

    Text (AR): أشعر بخيبة أمل وغضب بسبب ما حدث
    Predictions: [[{'label': 'disappointment', 'score': 0.9...}, {'label': 'anger', 'score': 0.9...}, ...]]

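Because `top_k=None` returns a score for every label, a small post-processing step is usually needed to turn the scores into a set of predicted emotions. The sketch below is one way to do that; `select_emotions` is a hypothetical helper, the scores are invented for illustration, and the 0.5 default mirrors the threshold listed in the evaluation rows rather than anything the pipeline applies itself.

```python
from typing import Dict, List, Union

Score = Dict[str, Union[str, float]]


def select_emotions(scores: List[Score], threshold: float = 0.5) -> List[str]:
    """Return the labels whose score reaches the threshold, highest first."""
    kept = [s for s in scores if s["score"] >= threshold]
    kept.sort(key=lambda s: s["score"], reverse=True)
    return [str(s["label"]) for s in kept]


# Invented scores in the shape of the inner list shown in the expected
# output (one dict per label for a single input text).
example_scores = [
    {"label": "sadness", "score": 0.82},
    {"label": "joy", "score": 0.91},
    {"label": "caring", "score": 0.55},
    {"label": "gratitude", "score": 0.10},
]

print(select_emotions(example_scores))
print(select_emotions(example_scores, threshold=0.9))
```

With the quickstart code, the same helper would be called as `select_emotions(results_en[0])`, assuming the nested-list output shape shown in the expected output.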
## Evaluation

The model was evaluated on the test set.

### Test Set Performance

The following table shows the performance metrics of the fine-tuned model on the test set, broken down by emotion category.

[...]

| sadness | 0.968 | 0.512 | 0.408 | 0.454 | 0.441 | 1062 | 0.5 |
| surprise | 0.974 | 0.492 | 0.430 | 0.459 | 0.447 | 828 | 0.5 |
| neutral | 0.742 | 0.648 | 0.440 | 0.524 | 0.368 | 10524 | 0.5 |

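The header row of this table is not shown in this excerpt. As a reading aid, the sketch below assumes the third, fourth, and fifth columns are precision, recall, and F1, and checks that assumption by recomputing F1 as the harmonic mean of the other two. The column interpretation is an inference from the numbers, not something the excerpt states.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (label, assumed precision, assumed recall, reported value in column 5)
rows = [
    ("sadness", 0.512, 0.408, 0.454),
    ("surprise", 0.492, 0.430, 0.459),
    ("neutral", 0.648, 0.440, 0.524),
]

for label, precision, recall, reported in rows:
    computed = round(f1_score(precision, recall), 3)
    print(f"{label}: computed F1 = {computed}, reported = {reported}")
```

The recomputed values agree with the fifth column to three decimals for all three rows, which supports the assumed column order.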
## Use Cases

This model is ideal for applications requiring nuanced emotional understanding across different languages:

- **Global Customer Feedback Analysis**: Analyze customer reviews, support tickets, and survey responses from around the world to gauge sentiment.
- **Multilingual Social Media Monitoring**: Track brand perception and public mood across different regions and languages.
- **Advanced Chatbot Development**: Build more empathetic and responsive chatbots that can understand user emotions in their native language.
- **Content Moderation**: Automatically flag toxic, aggressive, or sensitive content on international platforms.
- **Market Research**: Gain insights into how different cultures express emotions in text.

## Trained On

- **Base Model**: [**AnasAlokla/multilingual_go_emotions**](https://huggingface.co/AnasAlokla/multilingual_go_emotions) - A powerful pretrained model supporting 104 languages.
- **Dataset**: [**multilingual_go_emotions**](https://huggingface.co/datasets/AnasAlokla/multilingual_go_emotions) - A carefully translated and curated dataset for multilingual emotion analysis, based on the original Google GoEmotions dataset.

## Fine-Tuning Guide

To adapt this model for your own dataset or to replicate the training process, you can follow the methodology outlined in the official code repository. The repository provides a complete, end-to-end example, including data preprocessing, training scripts, and evaluation logic.

For full details, please refer to the GitHub repository:
[**emotion_chatbot**](https://github.com/anasAloklah/emotion_chatbot)

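Multi-label fine-tuning generally needs each example's emotions encoded as a multi-hot vector, one slot per label. The sketch below shows only that preprocessing step in plain Python. It is not taken from the `emotion_chatbot` repository: the `to_multi_hot` helper is hypothetical, and the order of `LABELS` is an assumption that should be replaced by the label mapping of the model you actually train.

```python
from typing import List

# The 27 emotions plus neutral listed under Supported Emotions.
# The order here is illustrative; use your model's own label mapping.
LABELS = [
    "admiration", "amusement", "anger", "annoyance", "approval", "caring",
    "confusion", "curiosity", "desire", "disappointment", "disapproval",
    "disgust", "embarrassment", "excitement", "fear", "gratitude", "grief",
    "joy", "love", "nervousness", "optimism", "pride", "realization",
    "relief", "remorse", "sadness", "surprise", "neutral",
]
LABEL_TO_ID = {label: i for i, label in enumerate(LABELS)}


def to_multi_hot(emotions: List[str]) -> List[float]:
    """Encode a list of emotion names as a float multi-hot vector."""
    vector = [0.0] * len(LABELS)
    for emotion in emotions:
        # Raises KeyError for an unknown label instead of ignoring it.
        vector[LABEL_TO_ID[emotion.lower()]] = 1.0
    return vector


target = to_multi_hot(["joy", "sadness"])
print(len(target), sum(target))
```

Float targets are used because multi-label losses such as binary cross-entropy typically expect floating-point labels.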
## Tags

`#multilingual-nlp` `#emotion-classification` `#text-classification` `#multi-label` `#bert`
`#transformer` `#natural-language-processing` `#sentiment-analysis` `#deep-learning`
`#arabic-nlp` `#french-nlp` `#spanish-nlp` `#goemotions`
`#BERT-Emotion` `#edge-nlp` `#emotion-detection` `#offline-nlp`
`#sentiment-analysis` `#emojis` `#emotions` `#embedded-nlp`
`#ai-for-iot` `#efficient-bert` `#nlp2025` `#context-aware` `#edge-ml`
`#smart-home-ai` `#emotion-aware` `#voice-ai` `#eco-ai` `#chatbot` `#social-media`
`#mental-health` `#short-text` `#smart-replies` `#tone-analysis`

## Support & Contact

For questions, bug reports, or collaboration inquiries, please open an issue on the Hugging Face Hub repository or contact the author directly.

- **Author**: Anas Hamid Alokla
- **Email**: anasaloklahaaa@gmail.com