XLMR-SocMed-UkrWarNarrative-5C_v1

Model Description

XLMR-SocMed-UkrWarNarrative-5C_v1 is a fine-tuned XLM-RoBERTa model for multi-class classification of social media narratives related to the Ukrainian conflict. It assigns each post to one of 5 classes: four disinformation and propaganda narrative categories, plus a "None" class for legitimate discourse.

Model Details

  • Model Type: XLM-RoBERTa (multilingual transformer)
  • Language(s): Multilingual (optimized for social media content)
  • License: cc-by-nc-4.0
  • Finetuned from: xlm-roberta-large
  • Domain: Social Media, Conflict Analysis, Narrative Classification
  • Classes: 5 categories
  • Version: 1.0

Primary Use Cases

  • Academic research on conflict-related social media discourse
  • Content moderation and disinformation detection
  • Media literacy and narrative analysis studies
  • Social media monitoring for Ukrainian conflict-related content

Performance

Test Set Results

  • Accuracy: 76.16%
  • Macro F1: 77.51%
  • Macro Precision: 75.06%
  • Macro Recall: 81.28%

Per-Class Performance

Label  Category                                       Precision  Recall  F1-Score  Support
1601   Anti-Western Narratives                        0.76       0.75    0.75      174
1602   Economic Fallout/Domestic Welfare Neglected    0.69       0.88    0.77      50
1603   Illegitimate and Corrupt Ukrainian Leadership  0.72       0.83    0.77      76
1604   Ukraine/Nazi Allegations                       0.77       0.95    0.85      64
1699   None (Legitimate Discourse)                    0.81       0.65    0.72      194
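
As a sanity check, the aggregate test-set figures can be approximately recomputed from the per-class rows. The sketch below is illustrative only: it uses the two-decimal values from the table, so the results agree with the reported metrics only up to rounding, and the variable names are not part of the model's code.

```python
# Per-class rows copied from the table: (label, precision, recall, f1, support)
rows = [
    ("1601", 0.76, 0.75, 0.75, 174),
    ("1602", 0.69, 0.88, 0.77, 50),
    ("1603", 0.72, 0.83, 0.77, 76),
    ("1604", 0.77, 0.95, 0.85, 64),
    ("1699", 0.81, 0.65, 0.72, 194),
]


def macro(column):
    """Unweighted mean of one metric column across the five classes."""
    return sum(row[column] for row in rows) / len(rows)


total_support = sum(row[4] for row in rows)
macro_precision = macro(1)
macro_recall = macro(2)
macro_f1 = macro(3)

# Accuracy is the support-weighted mean of per-class recall:
# recall * support is the number of correctly classified items per class.
accuracy = sum(row[2] * row[4] for row in rows) / total_support

print(f"macro precision: {macro_precision:.3f}")
print(f"macro recall:    {macro_recall:.3f}")
print(f"macro F1:        {macro_f1:.3f}")
print(f"accuracy:        {accuracy:.3f}")
```

The recomputed values land within rounding distance of the reported 75.06%, 81.28%, 77.51%, and 76.16%, which suggests the per-class table and the summary metrics describe the same 558-item test set.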

Training Details

Training Data

  • Source: Social media platforms (Twitter/X)

Training Procedure

  • Epochs: 4
  • Final Training Loss: 0.3397
  • Validation Accuracy: 78.63%
  • Validation F1 (Macro): 79.93%

Label Descriptions

1601 - Anti-Western Narratives

Consolidates NATO expansion criticism, Western interference allegations, historical grievances, propaganda claims, and Russian defense justifications. The consolidated category has substantial empirical presence across 13 clusters (~583,000 tweets), which share strong thematic coherence and overlapping linguistic patterns.

1602 - Economic Fallout/Domestic Welfare Neglected

Critical narratives focusing on resource allocation to Ukraine at the expense of domestic priorities. Shows strong empirical presence across 2 clusters (~90,000 tweets), emphasizing economic concerns and domestic welfare issues.

1603 - Illegitimate and Corrupt Ukrainian Leadership

Narratives aimed at delegitimizing the Ukrainian government through corruption allegations and personal attacks. Shows concentrated empirical evidence despite limited keyword matches, focusing on undermining leadership credibility.

1604 - Ukraine/Nazi Allegations

The most empirically robust category, with presence across 8 clusters (~157,000 tweets). Frames Ukraine as ideologically aligned with Nazi principles to justify Russian intervention. It has the highest recall of the five classes (0.95), so the model rarely misses this narrative on the test set.

1699 - None (Legitimate Discourse)

Captures legitimate discourse, factual reporting, and content outside the scope of the above narrative categories. Represents balanced, non-propagandistic content related to the Ukrainian conflict.

Bias and Limitations

Known Limitations

  • Model performance varies across categories (F1: 0.72-0.85)
  • Recall for the "None" category is low (0.65): about a third of legitimate test-set content is assigned to one of the narrative categories
  • Training data reflects social media biases and temporal patterns

Ethical Considerations

Responsible Use

  • Requires human oversight for content moderation decisions
  • Should not be used as sole authority for censorship decisions
  • Results should be interpreted within broader context of media analysis
  • Performance drift and bias amplification should be monitored regularly
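
One way to put the human-oversight requirement into practice is to route predictions to a reviewer instead of acting on them automatically. The following is a minimal sketch under stated assumptions: the function name, the 0.8 confidence threshold, and the policy of reviewing every narrative-category prediction are all hypothetical choices, not part of the released model.

```python
NONE_LABEL = "1699"  # "None (Legitimate Discourse)" label code from this card


def needs_human_review(label, confidence, threshold=0.8):
    """Return True when a prediction should be checked by a person.

    Assumed policy (hypothetical, tune for your own setting):
      * every narrative-category prediction is reviewed, because a false
        positive could affect legitimate speech;
      * "None" predictions are reviewed only when the model is unsure.
    """
    flagged_as_narrative = label != NONE_LABEL
    low_confidence = confidence < threshold
    return flagged_as_narrative or low_confidence


# Example: (label, confidence) pairs such as a classifier might return.
predictions = [("1604", 0.97), ("1699", 0.55), ("1699", 0.93)]
review_queue = [p for p in predictions if needs_human_review(*p)]
```

With this policy, only confident "None" predictions bypass review; everything else lands in `review_queue`.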

Potential Risks

  • False positive classifications may affect legitimate speech
  • Model may reflect historical biases present in training data
  • Adversarial attacks could exploit classification boundaries

Technical Specifications

Model Architecture

  • Base Model: xlm-roberta-large
  • Fine-tuning: Classification head with 5 output classes
  • Input: Social media text (up to 512 tokens, the XLM-RoBERTa maximum sequence length; longer inputs must be truncated)
  • Output: Class probabilities for 5 narrative categories
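
The input and output described above could be wired together roughly as follows. This is a sketch, not the authors' reference code: it assumes the checkpoint loads with the standard Hugging Face `transformers` sequence-classification classes, the `model_id` argument is a placeholder the caller must supply, and `ASSUMED_LABELS` is a guessed index order that should be checked against `model.config.id2label`.

```python
import math

# Assumed index -> label-code order; verify against model.config.id2label.
ASSUMED_LABELS = ["1601", "1602", "1603", "1604", "1699"]


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits, labels=ASSUMED_LABELS):
    """Return (label_code, probability) for the most likely class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


def classify(texts, model_id, max_length=512):
    """Classify a list of texts; model_id is supplied by the caller."""
    # Imported lazily so the helpers above work without these packages.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    batch = tokenizer(texts, padding=True, truncation=True,
                      max_length=max_length, return_tensors="pt")
    with torch.no_grad():
        logits = model(**batch).logits

    # Prefer the label mapping stored in the checkpoint over the assumed one.
    labels = [str(model.config.id2label[i]) for i in range(logits.shape[-1])]
    return [decode(row.tolist(), labels) for row in logits]
```

`decode` can be tried on its own with made-up logits, for example `decode([0.1, 0.2, 0.3, 5.0, 0.0])`, which picks the fourth class under the assumed label order.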

Hardware Requirements

  • GPU recommended for inference at scale
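  • Memory: roughly 2.2 GB for the weights alone (560M parameters stored as F32), plus activation memory that grows with batch size and sequence length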
  • Optimized for batch processing of social media content
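
For batch processing, inputs are typically split into fixed-size chunks so that memory use stays bounded. The sketch below shows one simple way to do that, alongside a back-of-the-envelope estimate of weight memory. The parameter count (560M) and tensor type (F32) are taken from the model's repository listing; the helper name and the batch size of 32 are illustrative assumptions.

```python
def batched(items, batch_size):
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Rough weight-memory estimate: parameters * bytes per parameter.
PARAMS = 560_000_000      # 560M parameters, per the repository listing
BYTES_PER_PARAM = 4       # F32 = 32 bits = 4 bytes
weights_gb = PARAMS * BYTES_PER_PARAM / 1e9

# Example: split 70 posts into batches of 32 (hypothetical batch size).
posts = [f"post {i}" for i in range(70)]
batch_sizes = [len(batch) for batch in batched(posts, 32)]
```

The estimate covers weights only; activations add to it, so the practical batch size depends on the available GPU memory and the length of the posts.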

Citation

@misc{xlmr-socmed-ukrwarnarrative-5c-v1,
  title={XLMR-SocMed-UkrWarNarrative-5C_v1: A Multilingual Classifier for Ukrainian Conflict Narratives on Social Media},
  author={[Your Name/Institution]},
  year={2025},
  url={[Model URL]}
}

Model Card Authors

Orsolya Ring / poltextLAB


This model card follows the guidelines established by Mitchell et al. (2019) for transparent AI model documentation.
