poltextlab/xlm-roberta-large-hungarian-sentiment-v2

Model description

This model is based on XLM-RoBERTa Large and fine-tuned for Hungarian sentiment analysis.
It classifies text into three sentiment categories:

  • 0 → Negative
  • 1 → Neutral
  • 2 → Positive
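
The mapping above can be applied to model outputs with a small helper. This is a minimal sketch, not part of the model's own code: the names `ID2LABEL` and `to_sentiment` are hypothetical, and the card does not state which label strings the model configuration exposes. The helper therefore accepts a bare class id, a sentiment name, or a string ending in the class id (such as `LABEL_2`, the default naming in `transformers` when no custom label names are configured).

```python
import re

# Class ids and names as listed in the model description above.
ID2LABEL = {0: "Negative", 1: "Neutral", 2: "Positive"}
NAME2ID = {name.lower(): idx for idx, name in ID2LABEL.items()}


def to_sentiment(label) -> str:
    """Map a class id or label string to a sentiment name (hypothetical helper)."""
    if isinstance(label, int):
        return ID2LABEL[label]
    text = str(label).strip()
    # Already a sentiment name, in any letter case.
    if text.lower() in NAME2ID:
        return ID2LABEL[NAME2ID[text.lower()]]
    # Assumption: otherwise the string ends in the class id, e.g. "LABEL_2".
    match = re.search(r"(\d+)$", text)
    if match is None:
        raise ValueError(f"Unrecognized label: {label!r}")
    return ID2LABEL[int(match.group(1))]
```

For example, `to_sentiment("LABEL_0")` and `to_sentiment(0)` both return `"Negative"`.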

Training data

The model was trained on a mix of original and synthetically generated Hungarian texts.
Synthetic samples were introduced to improve class balance and robustness.

Performance

Overall metrics

  • Accuracy: 0.8530
  • Precision (weighted): 0.8449
  • Recall (weighted): 0.8530
  • F1 Score (weighted): 0.8469

Classification Report

Label          Precision  Recall  F1-score  Support
0 (Negative)   0.89       0.92    0.91      1866
1 (Neutral)    0.65       0.50    0.56      583
2 (Positive)   0.86       0.92    0.89      1341

Metric         Precision  Recall  F1-score  Support
Accuracy       –          –       0.85      3790
Macro avg      0.80       0.78    0.79      3790
Weighted avg   0.84       0.85    0.85      3790
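
To illustrate how the macro and weighted averages relate to the per-class rows, the sketch below recomputes them from the rounded values in the report. The function names are hypothetical. Because the inputs are already rounded to two decimals, the results should only be expected to match the reported averages to within about 0.01.

```python
# Per-class values copied from the classification report (rounded to 2 decimals).
REPORT = {
    "Negative": {"precision": 0.89, "recall": 0.92, "f1": 0.91, "support": 1866},
    "Neutral": {"precision": 0.65, "recall": 0.50, "f1": 0.56, "support": 583},
    "Positive": {"precision": 0.86, "recall": 0.92, "f1": 0.89, "support": 1341},
}


def macro_avg(metric: str) -> float:
    """Unweighted mean over classes: every class counts equally."""
    return sum(row[metric] for row in REPORT.values()) / len(REPORT)


def weighted_avg(metric: str) -> float:
    """Mean over classes weighted by support (number of test examples)."""
    total = sum(row["support"] for row in REPORT.values())
    return sum(row[metric] * row["support"] for row in REPORT.values()) / total


for metric in ("precision", "recall", "f1"):
    print(f"{metric}: macro={macro_avg(metric):.3f} weighted={weighted_avg(metric):.3f}")
```

The macro averages are lower than the weighted ones because the neutral class, which has the weakest scores, also has the smallest support and so contributes less to the weighted figures.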

Confusion Matrix

[Figure: confusion matrix]

Usage

from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline

model_name = "poltextlab/xlm-roberta-large-hungarian-sentiment-v2"

# Load the tokenizer and the fine-tuned classification model from the Hub
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Wrap both in a text-classification pipeline; each result holds a label and a score
clf = pipeline("text-classification", model=model, tokenizer=tokenizer)

print(clf("Ez egy fantasztikus nap volt!"))  # "It was a fantastic day!" -> expected: Positive
print(clf("Ez borzalmas élmény volt."))      # "It was a terrible experience." -> expected: Negative
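
For more than a handful of texts, the pipeline can be called on a list. The sketch below is one possible wrapper, not the authors' method: `classify_texts` is a hypothetical name, and it assumes `clf` is the pipeline object created above. It passes `batch_size` and `truncation=True` to the pipeline call so that long inputs are cut to the model's maximum sequence length instead of raising an error.

```python
from typing import Callable, Dict, Iterable, List


def classify_texts(clf: Callable, texts: Iterable[str], batch_size: int = 16) -> List[Dict]:
    """Classify many texts and pair each input with its prediction.

    `clf` is assumed to be a transformers text-classification pipeline
    (hypothetical wrapper; the batch size of 16 is an arbitrary default).
    """
    # Drop empty or whitespace-only inputs before calling the model.
    cleaned = [t.strip() for t in texts if t and t.strip()]
    if not cleaned:
        return []
    # truncation=True cuts inputs longer than the model's maximum length.
    results = clf(cleaned, batch_size=batch_size, truncation=True)
    # Attach the original text to each {"label": ..., "score": ...} result.
    return [{"text": text, **result} for text, result in zip(cleaned, results)]
```

With the pipeline from the usage example, `classify_texts(clf, ["Ez egy fantasztikus nap volt!", "Ez borzalmas élmény volt."])` returns one dictionary per input text.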

License

MIT License

Limitations and notes

  • The neutral class performs noticeably worse than the positive and negative classes (F1 of 0.56 versus 0.89 and 0.91).

  • Performance may vary depending on text length, domain, and context.

  • Due to the use of synthetic data, some linguistic patterns may be over- or underrepresented.

Model size: 560M params (Safetensors, F32)

Evaluation results

  • Accuracy on Hungarian Sentiment (original + synthetic): 0.853 (self-reported)
  • Precision on Hungarian Sentiment (original + synthetic): 0.845 (self-reported)
  • Recall on Hungarian Sentiment (original + synthetic): 0.853 (self-reported)
  • F1 on Hungarian Sentiment (original + synthetic): 0.847 (self-reported)