Update README.md
README.md
CHANGED
@@ -37,12 +37,34 @@ The preprocessing steps included:
|
|
37 |
Additionally, for fine-tuning the model for your own data, the preprocessing step involves converting new financial headlines into embeddings and feeding them into the RandomForest model.
|
38 |
|
39 |
### Model Evaluation
|
40 |
-
The model has been evaluated using metrics such as:
|
41 |
-
- **Accuracy**: The percentage of correctly classified headlines.
|
42 |
-
- **F1-score**: The harmonic mean of precision and recall, providing a better measure of model performance when dealing with imbalanced data.
|
43 |
-
- **Confusion Matrix**: Helps identify how well the model distinguishes between the different sentiment categories (positive, neutral, and negative).
|
44 |
|
45 |
-
|
46 |
|
47 |
### Usage
|
48 |
|
@@ -50,17 +72,3 @@ To use the model, first install the necessary dependencies:
|
|
50 |
|
51 |
```bash
|
52 |
pip install sentence-transformers scikit-learn
|
53 |
-
|
54 |
-
```
|
55 |
-
|
56 |
-
license: apache-2.0
|
57 |
-
datasets:
|
58 |
-
- NickyNicky/Finance_sentiment_and_topic_classification_En
|
59 |
-
language:
|
60 |
-
- en
|
61 |
-
metrics:
|
62 |
-
- accuracy
|
63 |
-
base_model:
|
64 |
-
- sentence-transformers/all-MiniLM-L6-v2
|
65 |
-
pipeline_tag: text-classification
|
66 |
-
---
|
|
|
37 |
Additionally, for fine-tuning the model for your own data, the preprocessing step involves converting new financial headlines into embeddings and feeding them into the RandomForest model.
|
38 |
|
39 |
### Model Evaluation
|
40 |
|
41 |
+
|
42 |
+
On the test data, the model achieves an **accuracy of 61%** and an **F1-score of 0.61**. This is not optimal, but it is acceptable given the model's simplicity and the small amount of data it was trained on.
|
43 |
+
|
44 |
+
#### Hyperparameters:
|
45 |
+
- **Number of Estimators (n_estimators)**: 200
|
46 |
+
- **Max Depth (max_depth)**: 20
|
47 |
+
- **Min Samples Split (min_samples_split)**: 5
|
48 |
+
- **Min Samples Leaf (min_samples_leaf)**: 1
|
49 |
+
- **Random State (random_state)**: 42
|
50 |
+
- **Max Features (max_features)**: 'sqrt' (default value for RandomForest)
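
As a minimal sketch, the hyperparameters listed above might map onto scikit-learn's `RandomForestClassifier` as follows. The random 384-dimensional vectors are an assumption standing in for real `all-MiniLM-L6-v2` headline embeddings, and the integer labels are placeholder sentiment classes; this is not the author's training script.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Assumption: synthetic stand-ins for sentence embeddings (all-MiniLM-L6-v2
# produces 384-dimensional vectors) and for three sentiment class labels.
rng = np.random.default_rng(42)
X_train = rng.normal(size=(60, 384))
y_train = rng.integers(0, 3, size=60)
X_new = rng.normal(size=(10, 384))

# Hyperparameters as listed in the README.
clf = RandomForestClassifier(
    n_estimators=200,
    max_depth=20,
    min_samples_split=5,
    min_samples_leaf=1,
    max_features="sqrt",
    random_state=42,
)
clf.fit(X_train, y_train)

# Predict a class index for each new (placeholder) embedding.
preds = clf.predict(X_new)
print(preds.shape)
```

With real data, `X_train` and `X_new` would be replaced by the embeddings of actual financial headlines, as described in the preprocessing section.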
|
51 |
+
|
52 |
+
#### Classification Report:
|
53 |
+
- **Precision**:
|
54 |
+
- Class 0: 0.66
|
55 |
+
- Class 1: 0.62
|
56 |
+
- Class 2: 0.55
|
57 |
+
- **Recall**:
|
58 |
+
- Class 0: 0.52
|
59 |
+
- Class 1: 0.80
|
60 |
+
- Class 2: 0.52
|
61 |
+
- **F1-Score**:
|
62 |
+
- Class 0: 0.58
|
63 |
+
- Class 1: 0.70
|
64 |
+
- Class 2: 0.54
|
65 |
+
- **Overall Accuracy**: 0.61
|
66 |
+
- **Macro Average**: 0.61 (Precision, Recall, F1-Score)
|
67 |
+
- **Weighted Average**: 0.61 (Precision, Recall, F1-Score)
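
As a quick sanity check, the macro averages can be recomputed from the per-class scores reported above, since a macro average is the unweighted mean over classes. The helper name `macro_average` is hypothetical. The weighted average cannot be recomputed the same way, because the per-class support counts are not listed.

```python
# Per-class scores copied from the classification report (classes 0, 1, 2).
precision = [0.66, 0.62, 0.55]
recall = [0.52, 0.80, 0.52]
f1 = [0.58, 0.70, 0.54]


def macro_average(scores):
    """Return the unweighted mean of the per-class scores."""
    return sum(scores) / len(scores)


macro_precision = round(macro_average(precision), 2)
macro_recall = round(macro_average(recall), 2)
macro_f1 = round(macro_average(f1), 2)

print(macro_precision, macro_recall, macro_f1)  # prints: 0.61 0.61 0.61
```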
|
68 |
|
69 |
### Usage
|
70 |
|
|
|
72 |
|
73 |
```bash
|
74 |
pip install sentence-transformers scikit-learn