XLMR-SocMed-UkrWarNarrative-5C_v1

Model Description

XLMR-SocMed-UkrWarNarrative-5C_v1 is a fine-tuned XLM-RoBERTa model for multi-class classification of social media narratives related to the Ukrainian conflict. It assigns each post to one of 5 classes: four disinformation and propaganda narrative categories, plus a "None" class for content outside those narratives.

Model Details

Model Type: XLM-RoBERTa (multilingual transformer)
Language(s): Multilingual (optimized for social media content)
License: cc-by-nc-4.0
Finetuned from: xlm-roberta-large
Domain: Social Media, Conflict Analysis, Narrative Classification
Classes: 5 (four narrative categories plus "None")
Version: 1.0

Primary Use Cases

Academic research on conflict-related social media discourse
Content moderation and disinformation detection
Media literacy and narrative analysis studies
Social media monitoring for Ukrainian conflict-related content

Performance

Test Set Results

Accuracy: 76.16%
Macro F1: 77.51%
Macro Precision: 75.06%
Macro Recall: 81.28%

Per-Class Performance

Label	Category	Precision	Recall	F1-Score	Support
1601	Anti-Western Narratives	0.76	0.75	0.75	174
1602	Economic Fallout/Domestic Welfare Neglected	0.69	0.88	0.77	50
1603	Illegitimate and Corrupt Ukrainian Leadership	0.72	0.83	0.77	76
1604	Ukraine/Nazi Allegations	0.77	0.95	0.85	64
1699	None (Legitimate Discourse)	0.81	0.65	0.72	194
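
The aggregate figures under Test Set Results can be cross-checked against this table. The sketch below uses only the rounded per-class values printed above, so it reproduces the reported metrics to about two decimals rather than exactly. It computes unweighted macro averages and recovers accuracy as the support-weighted mean of per-class recall, which holds for single-label classification. The names `PER_CLASS`, `macro_average`, and `accuracy_from_recall` are illustrative.

```python
# Per-class rows copied from the table above: (precision, recall, f1, support).
# The card rounds these to two decimals, so the results are approximate
# reconstructions, not the exact reported metrics.
PER_CLASS = {
    "1601": (0.76, 0.75, 0.75, 174),
    "1602": (0.69, 0.88, 0.77, 50),
    "1603": (0.72, 0.83, 0.77, 76),
    "1604": (0.77, 0.95, 0.85, 64),
    "1699": (0.81, 0.65, 0.72, 194),
}


def macro_average(column: int) -> float:
    """Unweighted mean of one metric column across the five classes."""
    values = [row[column] for row in PER_CLASS.values()]
    return sum(values) / len(values)


def accuracy_from_recall() -> float:
    """Accuracy as the support-weighted mean of per-class recall."""
    total = sum(row[3] for row in PER_CLASS.values())
    correct = sum(row[1] * row[3] for row in PER_CLASS.values())
    return correct / total


macro_precision = macro_average(0)
macro_recall = macro_average(1)
macro_f1 = macro_average(2)
accuracy = accuracy_from_recall()

print(f"macro precision ~ {macro_precision:.3f}")
print(f"macro recall    ~ {macro_recall:.3f}")
print(f"macro F1        ~ {macro_f1:.3f}")
print(f"accuracy        ~ {accuracy:.3f}")
```

Because macro averaging weights all classes equally, the small but high-recall classes (1602, 1604) pull macro recall above accuracy, which is dominated by the two largest classes.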

Training Details

Training Data

Source: Social media platforms (Twitter/X)

Training Procedure

Epochs: 4
Final Training Loss: 0.3397
Validation Accuracy: 78.63%
Validation F1 (Macro): 79.93%

Label Descriptions

1601 - Anti-Western Narratives

Consolidates NATO expansion criticism, Western interference allegations, historical grievances, propaganda claims, and Russian defense justifications into a single category. The consolidated category spans 13 clusters (~583,000 tweets) that show strong thematic coherence and overlapping linguistic patterns.

1602 - Economic Fallout/Domestic Welfare Neglected

Narratives criticizing the allocation of resources to Ukraine at the expense of domestic priorities. The category spans 2 clusters (~90,000 tweets) and emphasizes economic concerns and domestic welfare issues.

1603 - Illegitimate and Corrupt Ukrainian Leadership

Narratives aimed at delegitimizing the Ukrainian government through corruption allegations and personal attacks. Shows concentrated empirical evidence despite limited keyword matches, with a focus on undermining the credibility of Ukraine's leadership.

1604 - Ukraine/Nazi Allegations

Frames Ukraine as ideologically aligned with Nazi principles to justify Russian intervention. The most empirically robust category, present across 8 clusters (~157,000 tweets). It has the highest test-set recall of the five classes (0.95), with a precision of 0.77.

1699 - None (Legitimate Discourse)

Captures legitimate discourse, factual reporting, and content outside the scope of the above narrative categories. Represents balanced, non-propagandistic content related to the Ukrainian conflict.

Bias and Limitations

Known Limitations

Model performance varies across categories (F1: 0.72-0.85)
Recall for the "None" category is low (0.65): about a third of legitimate test-set content was assigned to a narrative category
Training data reflects social media biases and temporal patterns

Ethical Considerations

Responsible Use

Requires human oversight for content moderation decisions
Should not be used as sole authority for censorship decisions
Results should be interpreted within the broader context of media analysis
Deployments should be monitored regularly for performance drift and bias amplification
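
One way to keep a human in the loop, sketched under stated assumptions: every prediction of a narrative class is queued for review before any moderation action, and "None" predictions are queued too when the model's confidence is low. The names `Decision`, `route`, and `REVIEW_THRESHOLD` are hypothetical, and the 0.80 threshold is an illustrative value, not one tuned or recommended for this model.

```python
from typing import NamedTuple

NONE_LABEL = "1699"       # "None (Legitimate Discourse)" code from this card
REVIEW_THRESHOLD = 0.80   # ASSUMPTION: illustrative value, not tuned


class Decision(NamedTuple):
    label: str
    confidence: float
    needs_review: bool


def route(label: str, confidence: float,
          threshold: float = REVIEW_THRESHOLD) -> Decision:
    """Decide whether a single prediction must be checked by a human.

    Narrative-class predictions are always reviewed, so the model never
    acts as the sole authority. "None" predictions are reviewed only when
    the top-class probability falls below the threshold.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be a probability in [0, 1]")
    is_narrative = label != NONE_LABEL
    needs_review = is_narrative or confidence < threshold
    return Decision(label, confidence, needs_review)


# Hypothetical predictions (made-up confidences, not model output).
flagged = route("1604", 0.99)
clear = route("1699", 0.95)
uncertain = route("1699", 0.55)
```

In this sketch the threshold only affects "None" predictions. A stricter or looser policy is a deployment decision that should be validated on held-out data.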

Potential Risks

False positive classifications may affect legitimate speech
Model may reflect historical biases present in training data
Adversarial attacks could exploit classification boundaries

Technical Specifications

Model Architecture

Base Model: xlm-roberta-large
Fine-tuning: Classification head with 5 output classes
Input: Social media text (XLM-RoBERTa accepts at most 512 tokens per input; longer texts must be truncated)
Output: Class probabilities for 5 narrative categories
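
A minimal sketch of how the five raw output scores (logits) from the classification head become class probabilities and a label. The mapping from output index to label code is an assumption here (ascending code order); the authoritative mapping is the `id2label` entry in the model's configuration. The logits in the usage line are made-up numbers, not real model output, and `ASSUMED_ID2LABEL`, `softmax`, and `decode` are illustrative names.

```python
import math
from typing import List, Tuple

# Label codes and names from this card. The index order is an ASSUMPTION;
# check `id2label` in the model's config before relying on it.
ASSUMED_ID2LABEL = {
    0: "1601 Anti-Western Narratives",
    1: "1602 Economic Fallout/Domestic Welfare Neglected",
    2: "1603 Illegitimate and Corrupt Ukrainian Leadership",
    3: "1604 Ukraine/Nazi Allegations",
    4: "1699 None (Legitimate Discourse)",
}


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax: subtract the max before exponentiating."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits: List[float]) -> Tuple[str, List[float]]:
    """Return the most probable label and the full probability vector."""
    if len(logits) != len(ASSUMED_ID2LABEL):
        raise ValueError("expected one logit per class (5)")
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return ASSUMED_ID2LABEL[best], probs


# Hypothetical logits for a single post.
label, probs = decode([0.2, -1.0, 0.1, 3.0, 0.5])
print(label, [round(p, 3) for p in probs])
```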

Hardware Requirements

GPU recommended for inference at scale
Memory requirements are those of the xlm-roberta-large base model (reported at roughly 550M parameters)
Optimized for batch processing of social media content
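
For batch processing, posts are typically grouped into fixed-size chunks before tokenization and inference. The helper below is a generic sketch of that grouping step and does not call the model. The name `batched` and the batch size of 32 are assumptions for illustration; a suitable size depends on available GPU memory and text length.

```python
from typing import Iterable, Iterator, List


def batched(texts: Iterable[str], batch_size: int = 32) -> Iterator[List[str]]:
    """Yield consecutive lists of at most `batch_size` texts."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[str] = []
    for text in texts:
        batch.append(text)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # emit the final, possibly smaller, batch
        yield batch


# Hypothetical input: 70 placeholder posts split into batches of 32.
posts = [f"post {i}" for i in range(70)]
sizes = [len(b) for b in batched(posts, batch_size=32)]
print(sizes)
```

Each yielded list can then be passed to a tokenizer with padding and truncation enabled, so that all sequences in a batch share one length.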

Citation

@misc{xlmr-socmed-ukrwarnarrative-5c-v1,
  title={XLMR-SocMed-UkrWarNarrative-5C_v1: A Multilingual Classifier for Ukrainian Conflict Narratives on Social Media},
  author={[Your Name/Institution]},
  year={2025},
  url={[Model URL]}
}

Model Card Authors

Orsolya Ring / poltextLAB

This model card follows the guidelines established by Mitchell et al. (2019) for transparent AI model documentation.