---
license: apache-2.0
language:
- de
widget:
- text: "STS Group AG erhält Großauftrag von führendem Nutzfahrzeughersteller in Nordamerika und plant Bau eines ersten US-Werks"
- text: "Zukünftig soll jedoch je Geschäftsjahr eine Mindestdividende in Höhe von EUR 2,00 je dividendenberechtigter Aktie an die Aktionärinnen und Aktionäre ausgeschüttet werden."
- text: "Comet passt Jahresprognose nach Q3 unter Erwartungen an"
---

# German FinBERT For Sentiment Analysis (Pre-trained From Scratch Version, Fine-Tuned for Financial Sentiment Analysis)

<img src="https://github.com/mscherrmann/mscherrmann.github.io/blob/master/assets/img/publication_preview/germanBert.png?raw=true" alt="German FinBERT" width="500" height="300"/>

German FinBERT is a BERT language model focusing on the financial domain within the German language. In my [paper](https://arxiv.org/pdf/2311.08793.pdf), I describe in more detail the steps taken to train the model and show that it outperforms its generic benchmarks on finance-specific downstream tasks.

This model is the [pre-trained from scratch version of German FinBERT](https://huggingface.co/scherrmann/GermanFinBert_SC), fine-tuned on a translated version of the [financial news phrase bank](https://arxiv.org/abs/1307.5336) of Malo et al. (2013). The data is available [here](https://huggingface.co/datasets/scherrmann/financial_phrasebank_75agree_german).
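
The model can be queried with the Hugging Face `transformers` pipeline API. The snippet below is a minimal sketch: the repo ID `scherrmann/GermanFinBert_SC_Sentiment` is an assumption (substitute this model's actual identifier), and the exact label names depend on the fine-tuned head.

```python
# Minimal inference sketch with the transformers pipeline API.
# The repo ID below is an assumption; replace it with this model's actual ID.
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="scherrmann/GermanFinBert_SC_Sentiment",  # assumed repo ID
)

result = classifier("Comet passt Jahresprognose nach Q3 unter Erwartungen an")
print(result)  # e.g. [{'label': '...', 'score': ...}]
```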

## Overview

**Author:** Moritz Scherrmann

**Paper:** [here](https://arxiv.org/pdf/2311.08793.pdf)

**Architecture:** BERT base

**Language:** German

**Specialization:** Financial sentiment

**Base model:** [German_FinBert_SC](https://huggingface.co/scherrmann/GermanFinBert_SC)

### Fine-tuning

I fine-tune the model using the 1cycle policy of [Smith and Topin (2019)](https://arxiv.org/abs/1708.07120) and the Adam optimization method of [Kingma and Ba (2014)](https://arxiv.org/abs/1412.6980) with standard parameters. I run a grid search on the evaluation set to find the best hyperparameter setup, testing different values for the learning rate, batch size and number of epochs, following the suggestions of [Chalkidis et al. (2020)](https://aclanthology.org/2020.findings-emnlp.261/). To avoid getting good results by chance, I repeat the fine-tuning for each setup five times with different seeds. After finding the best model with respect to the evaluation set, I report the mean result across seeds for that model on the test set.
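
The sketch below illustrates this procedure under stated assumptions: the candidate grids, the number of labels, and the training-set size are placeholders, not the exact values used in the paper.

```python
import itertools

import torch
from torch.optim.lr_scheduler import OneCycleLR
from transformers import AutoModelForSequenceClassification, set_seed

# Placeholder grids; the paper's exact candidate values may differ.
learning_rates = [2e-5, 3e-5, 5e-5]
batch_sizes = [16, 32]
epoch_counts = [3, 4]
seeds = [0, 1, 2, 3, 4]    # five repetitions per setup
num_train_examples = 3000  # placeholder size of the training split

for lr, batch_size, epochs in itertools.product(learning_rates, batch_sizes, epoch_counts):
    for seed in seeds:
        set_seed(seed)
        model = AutoModelForSequenceClassification.from_pretrained(
            "scherrmann/GermanFinBert_SC",  # base checkpoint, per this card
            num_labels=3,                   # assumed: positive / neutral / negative
        )
        optimizer = torch.optim.Adam(model.parameters())  # standard Adam parameters
        scheduler = OneCycleLR(               # 1cycle learning-rate schedule
            optimizer,
            max_lr=lr,
            epochs=epochs,
            steps_per_epoch=num_train_examples // batch_size,
        )
        # ... standard training loop: forward pass, loss, backward pass,
        # optimizer.step(), scheduler.step(), optimizer.zero_grad() ...
        # Evaluate on the validation set; keep the setup with the best mean
        # validation score across seeds, then report its test-set score.
```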

### Results

Translated [financial news phrase bank](https://arxiv.org/abs/1307.5336) (Malo et al. (2013)), see [here](https://huggingface.co/datasets/scherrmann/financial_phrasebank_75agree_german) for the data:

- Accuracy: 95.95%
- Macro F1: 92.70%
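
Both metrics follow their standard definitions. As a reference point, here is a small sketch of how they can be computed with scikit-learn; the label arrays are placeholders standing in for the gold and predicted label ids on the test split.

```python
from sklearn.metrics import accuracy_score, f1_score

y_true = [0, 1, 2, 1, 0]  # placeholder gold labels
y_pred = [0, 1, 2, 1, 1]  # placeholder model predictions

print(f"Accuracy: {accuracy_score(y_true, y_pred):.2%}")
print(f"Macro F1: {f1_score(y_true, y_pred, average='macro'):.2%}")
```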

## Authors

Moritz Scherrmann: `scherrmann [at] lmu.de`

For additional details regarding the performance on fine-tuning datasets and benchmark results, please refer to the full documentation provided in the study.

See also:

- scherrmann/GermanFinBERT_SC
- scherrmann/GermanFinBERT_FP
- scherrmann/GermanFinBERT_FP_QuAD