---
license: apache-2.0
language:
- de
widget:
- text: "STS Group AG erhält Großauftrag von führendem Nutzfahrzeughersteller in Nordamerika und plant Bau eines ersten US-Werks"
- text: "Zukünftig soll jedoch je Geschäftsjahr eine Mindestdividende in Höhe von EUR 2,00 je dividendenberechtigter Aktie an die Aktionärinnen und Aktionäre ausgeschüttet werden."
- text: "Comet passt Jahresprognose nach Q3 unter Erwartungen an"
---

# German FinBERT For Sentiment Analysis (Pre-trained From Scratch Version, Fine-Tuned for Financial Sentiment Analysis)

<img src="https://github.com/mscherrmann/mscherrmann.github.io/blob/master/assets/img/publication_preview/germanBert.png?raw=true" alt="German FinBERT" width="500" height="300"/>

German FinBERT is a BERT language model focusing on the financial domain within the German language. In my [paper](https://arxiv.org/pdf/2311.08793.pdf), I describe in more detail the steps taken to train the model and show that it outperforms its generic benchmarks on finance-specific downstream tasks.

This model is the [pre-trained from scratch version of German FinBERT](https://huggingface.co/scherrmann/GermanFinBert_SC), fine-tuned on a translated version of the [financial news phrase bank](https://arxiv.org/abs/1307.5336) of Malo et al. (2013). The data is available [here](https://huggingface.co/datasets/scherrmann/financial_phrasebank_75agree_german).
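
The model can be queried with the Hugging Face `transformers` pipeline API. The snippet below is a minimal sketch: the repo ID `scherrmann/GermanFinBert_SC_Sentiment` is an assumption (substitute this model's actual identifier), and the exact label names depend on the fine-tuned head.

```python
# Minimal inference sketch with the transformers pipeline API.
# The repo ID below is an assumption; replace it with this model's actual ID.
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="scherrmann/GermanFinBert_SC_Sentiment",  # assumed repo ID
)

result = classifier("Comet passt Jahresprognose nach Q3 unter Erwartungen an")
print(result)  # e.g. [{'label': '...', 'score': ...}]
```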

## Overview

**Author:** Moritz Scherrmann

**Paper:** [here](https://arxiv.org/pdf/2311.08793.pdf)

**Architecture:** BERT base

**Language:** German

**Specialization:** Financial sentiment

**Base model:** [German_FinBert_SC](https://huggingface.co/scherrmann/GermanFinBert_SC)

### Fine-tuning

I fine-tune the model using the 1cycle policy of [Smith and Topin (2019)](https://arxiv.org/abs/1708.07120) and the Adam optimization method of [Kingma and Ba (2014)](https://arxiv.org/abs/1412.6980) with standard parameters. I run a grid search on the evaluation set to find the best hyperparameter setup, testing different values for the learning rate, batch size and number of epochs, following the suggestions of [Chalkidis et al. (2020)](https://aclanthology.org/2020.findings-emnlp.261/). To avoid getting good results by chance, I repeat the fine-tuning for each setup five times with different seeds. After finding the best model with respect to the evaluation set, I report the mean result across seeds for that model on the test set.
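
The sketch below illustrates this procedure under stated assumptions: the candidate grids, the number of labels, and the training-set size are placeholders, not the exact values used in the paper.

```python
import itertools

import torch
from torch.optim.lr_scheduler import OneCycleLR
from transformers import AutoModelForSequenceClassification, set_seed

# Placeholder grids; the paper's exact candidate values may differ.
learning_rates = [2e-5, 3e-5, 5e-5]
batch_sizes = [16, 32]
epoch_counts = [3, 4]
seeds = [0, 1, 2, 3, 4]    # five repetitions per setup
num_train_examples = 3000  # placeholder size of the training split

for lr, batch_size, epochs in itertools.product(learning_rates, batch_sizes, epoch_counts):
    for seed in seeds:
        set_seed(seed)
        model = AutoModelForSequenceClassification.from_pretrained(
            "scherrmann/GermanFinBert_SC",  # base checkpoint, per this card
            num_labels=3,                   # assumed: positive / neutral / negative
        )
        optimizer = torch.optim.Adam(model.parameters())  # standard Adam parameters
        scheduler = OneCycleLR(               # 1cycle learning-rate schedule
            optimizer,
            max_lr=lr,
            epochs=epochs,
            steps_per_epoch=num_train_examples // batch_size,
        )
        # ... standard training loop: forward pass, loss, backward pass,
        # optimizer.step(), scheduler.step(), optimizer.zero_grad() ...
        # Evaluate on the validation set; keep the setup with the best mean
        # validation score across seeds, then report its test-set score.
```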

### Results

Translated [financial news phrase bank](https://arxiv.org/abs/1307.5336) (Malo et al. (2013)), see [here](https://huggingface.co/datasets/scherrmann/financial_phrasebank_75agree_german) for the data:

- Accuracy: 95.95%
- Macro F1: 92.70%
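
Both metrics follow their standard definitions. As a reference point, here is a small sketch of how they can be computed with scikit-learn; the label arrays are placeholders standing in for the gold and predicted label ids on the test split.

```python
from sklearn.metrics import accuracy_score, f1_score

y_true = [0, 1, 2, 1, 0]  # placeholder gold labels
y_pred = [0, 1, 2, 1, 1]  # placeholder model predictions

print(f"Accuracy: {accuracy_score(y_true, y_pred):.2%}")
print(f"Macro F1: {f1_score(y_true, y_pred, average='macro'):.2%}")
```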

## Authors

Moritz Scherrmann: `scherrmann [at] lmu.de`

For additional details regarding the performance on fine-tuning datasets and benchmark results, please refer to the full documentation provided in the study.

See also:

- scherrmann/GermanFinBERT_SC
- scherrmann/GermanFinBERT_FP
- scherrmann/GermanFinBERT_FP_QuAD