MultiTACRED: A Multilingual Version of the TAC Relation Extraction Dataset Paper • 2305.04582 • Published May 8, 2023
TACRED Revisited: A Thorough Evaluation of the TACRED Relation Extraction Task Paper • 2004.14855 • Published Apr 30, 2020
LLMCheckup: Conversational Examination of Large Language Models via Interpretability Tools Paper • 2401.12576 • Published Jan 23, 2024 • 2
Factuality Detection using Machine Translation -- a Use Case for German Clinical Text Paper • 2308.08827 • Published Aug 17, 2023
Saliency Map Verbalization: Comparing Feature Importance Representations from Model-free and Instruction-based Methods Paper • 2210.07222 • Published Oct 13, 2022
Why only Micro-F1? Class Weighting of Measures for Relation Classification Paper • 2205.09460 • Published May 19, 2022
MobIE: A German Dataset for Named Entity Recognition, Entity Linking and Relation Extraction in the Mobility Domain Paper • 2108.06955 • Published Aug 16, 2021
Judging Quality Across Languages: A Multilingual Approach to Pretraining Data Filtering with Language Models Paper • 2505.22232 • Published May 28, 2025 • 18
Familiarity: Better Evaluation of Zero-Shot Named Entity Recognition by Quantifying Label Shifts in Synthetic Training Data Paper • 2412.10121 • Published Dec 13, 2024 • 2
BabyHGRN: Exploring RNNs for Sample-Efficient Training of Language Models Paper • 2412.15978 • Published Dec 20, 2024 • 1
Teuken-7B-v0.6 Collection OpenGPT-X Teuken 7B models trained on 6 trillion tokens • 1 item • Updated Dec 10, 2024
Teuken-7B-v0.5 Collection OpenGPT-X Teuken 7B models trained on 5 trillion tokens • 4 items • Updated Dec 9, 2024
Scaling Image Tokenizers with Grouped Spherical Quantization Paper • 2412.02632 • Published Dec 3, 2024 • 10
Teuken-7B-v0.55 Collection OpenGPT-X Teuken 7B models trained on 5.5 trillion tokens • 3 items • Updated Dec 3, 2024