paraphrase-MiniLM-L3-v2_immig
This SetFit model was trained on 48 title-abstract samples (24 per class) to differentiate between published studies related to immigration/migration research and those that are not.
- Model Type: SetFit
- Sentence Transformer body: sentence-transformers/paraphrase-MiniLM-L3-v2
- Classification head: a LogisticRegression instance
- Train data/script repository: SetFit on GitHub
Evaluation
Metrics
Label | Accuracy | Precision | Recall | F1 |
---|---|---|---|---|
all | 0.9812 | 0.9934 | 0.9868 | 0.9901 |
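
The metrics above are self-reported. A minimal sketch of how similar metrics could be recomputed with scikit-learn on a held-out test set; the texts and gold labels below are placeholders, and it is assumed that the model returns label strings from `predict()` (if it returns integer ids instead, map them via `model.labels` first):

```python
from setfit import SetFitModel
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# Hypothetical held-out test set; these placeholders are NOT the data behind the table above.
texts = ["TITLE: ... ABSTRACT: ...", "TITLE: ... ABSTRACT: ..."]
gold = ["immigration_topic", "other_topic"]

model = SetFitModel.from_pretrained("mmarbach/paraphrase-MiniLM-L3-v2_immig")
# Assumption: the saved model stores its label names, so predict() returns label strings.
preds = model.predict(texts)

acc = accuracy_score(gold, preds)
prec, rec, f1, _ = precision_recall_fscore_support(
    gold, preds, average="binary", pos_label="immigration_topic"
)
print(f"accuracy={acc:.4f} precision={prec:.4f} recall={rec:.4f} f1={f1:.4f}")
```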
Uses
Direct Use for Inference
First install the SetFit library:

```bash
pip install setfit
```
Then you can load this model and run inference:

```python
from setfit import SetFitModel

# Download the fine-tuned model from the Hugging Face Hub
model = SetFitModel.from_pretrained("mmarbach/paraphrase-MiniLM-L3-v2_immig")

# Run inference on a concatenated title + abstract string
preds = model("TITLE: ... ABSTRACT: ....")
```
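
For batch prediction or class probabilities, SetFit also exposes `predict` and `predict_proba`; the texts below are placeholders and the exact label format returned depends on how the label names were stored with the model:

```python
# Batch prediction over several title + abstract strings (placeholder texts)
texts = [
    "TITLE: ... ABSTRACT: ...",
    "TITLE: ... ABSTRACT: ...",
]
labels = model.predict(texts)        # one predicted class per input
probs = model.predict_proba(texts)   # class probabilities from the LogisticRegression head
print(labels)
print(probs)
```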
Training Details
Training Set Metrics
Training set | Min | Median | Max |
---|---|---|---|
Word count | 97 | 155.6458 | 262 |
Label | Training Sample Count |
---|---|
immigration_topic | 24 |
other_topic | 24 |
Training Hyperparameters
- batch_size: (16, 16)
- num_epochs: (4, 4)
- max_steps: -1
- sampling_strategy: oversampling
- body_learning_rate: (2e-05, 1e-05)
- head_learning_rate: 0.01
- loss: CosineSimilarityLoss
- distance_metric: cosine_distance
- margin: 0.25
- end_to_end: False
- use_amp: False
- warmup_proportion: 0.1
- l2_weight: 0.01
- seed: 42
- eval_max_steps: -1
- load_best_model_at_end: False
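
These values correspond to fields of SetFit's `TrainingArguments`. A minimal, hypothetical training sketch is shown below, under the assumption that the 48 labeled examples are available as a `datasets.Dataset` with `text` and `label` columns; the column names and placeholder texts are assumptions, not taken from the original training script, and `CosineSimilarityLoss` is the SetFit default loss:

```python
from datasets import Dataset
from setfit import SetFitModel, Trainer, TrainingArguments

# Hypothetical 48-example training set (24 per class); "text"/"label" column names are assumptions.
train_dataset = Dataset.from_dict({
    "text": ["TITLE: ... ABSTRACT: ..."] * 48,
    "label": ["immigration_topic"] * 24 + ["other_topic"] * 24,
})

model = SetFitModel.from_pretrained(
    "sentence-transformers/paraphrase-MiniLM-L3-v2",
    labels=["immigration_topic", "other_topic"],
)

args = TrainingArguments(
    batch_size=(16, 16),
    num_epochs=(4, 4),
    sampling_strategy="oversampling",
    body_learning_rate=(2e-05, 1e-05),
    head_learning_rate=0.01,
    warmup_proportion=0.1,
    l2_weight=0.01,
    seed=42,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
)
trainer.train()
```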
Training Results
Epoch | Step | Training Loss | Validation Loss |
---|---|---|---|
0.0133 | 1 | 0.288 | - |
0.6667 | 50 | 0.1935 | - |
1.0 | 75 | - | 0.0980 |
1.3333 | 100 | 0.0472 | - |
2.0 | 150 | 0.0118 | 0.0767 |
2.6667 | 200 | 0.0057 | - |
3.0 | 225 | - | 0.0719 |
3.3333 | 250 | 0.0047 | - |
4.0 | 300 | 0.0039 | 0.0718 |
Framework Versions
- Python: 3.12.11
- SetFit: 1.1.2
- Sentence Transformers: 5.0.0
- Transformers: 4.53.0
- PyTorch: 2.7.1
- Datasets: 3.6.0
- Tokenizers: 0.21.2