AMR-KELEG/NADI2024-baseline

Text Classification

AMR-KELEG committed on Sep 22, 2024

Commit 65b3d25 · verified · 1 Parent(s): 3ab43b3

Update README.md

Files changed (1)

README.md +42 -0

@@ -88,4 +88,46 @@ print(s1, s1_pred)
 s2 = "خليلي في مساج بريفي كيفاش الاتصال"
 s2_pred = predict_top_p(s2) # ['Algeria', 'Tunisia']
 print(s2, s2_pred)
+```
+### Citation
+If you find the model useful, please cite the [accompanying paper](https://aclanthology.org/2024.arabicnlp-1.79/):
+```
+@inproceedings{abdul-mageed-etal-2024-nadi,
+    title = "{NADI} 2024: The Fifth Nuanced {A}rabic Dialect Identification Shared Task",
+    author = "Abdul-Mageed, Muhammad  and
+      Keleg, Amr  and
+      Elmadany, AbdelRahim  and
+      Zhang, Chiyu  and
+      Hamed, Injy  and
+      Magdy, Walid  and
+      Bouamor, Houda  and
+      Habash, Nizar",
+    editor = "Habash, Nizar  and
+      Bouamor, Houda  and
+      Eskander, Ramy  and
+      Tomeh, Nadi  and
+      Abu Farha, Ibrahim  and
+      Abdelali, Ahmed  and
+      Touileb, Samia  and
+      Hamed, Injy  and
+      Onaizan, Yaser  and
+      Alhafni, Bashar  and
+      Antoun, Wissam  and
+      Khalifa, Salam  and
+      Haddad, Hatem  and
+      Zitouni, Imed  and
+      AlKhamissi, Badr  and
+      Almatham, Rawan  and
+      Mrini, Khalil",
+    booktitle = "Proceedings of The Second Arabic Natural Language Processing Conference",
+    month = aug,
+    year = "2024",
+    address = "Bangkok, Thailand",
+    publisher = "Association for Computational Linguistics",
+    url = "https://aclanthology.org/2024.arabicnlp-1.79",
+    pages = "709--728",
+    abstract = "We describe the findings of the fifth Nuanced Arabic Dialect Identification Shared Task (NADI 2024). NADI{'}s objective is to help advance SoTA Arabic NLP by providing guidance, datasets, modeling opportunities, and standardized evaluation conditions that allow researchers to collaboratively compete on prespecified tasks. NADI 2024 targeted both dialect identification cast as a multi-label task (Subtask 1), identification of the Arabic level of dialectness (Subtask 2), and dialect-to-MSA machine translation (Subtask 3). A total of 51 unique teams registered for the shared task, of whom 12 teams have participated (with 76 valid submissions during the test phase). Among these, three teams participated in Subtask 1, three in Subtask 2, and eight in Subtask 3. The winning teams achieved 50.57 F1 on Subtask 1, 0.1403 RMSE for Subtask 2, and 20.44 BLEU in Subtask 3, respectively. Results show that Arabic dialect processing tasks such as dialect identification and machine translation remain challenging. We describe the methods employed by the participating teams and briefly offer an outlook for NADI.",
+}
 ```