---
license: apache-2.0
language:
- en
base_model:
- nasa-impact/nasa-smd-ibm-v0.1
pipeline_tag: token-classification
tags:
- astronomy
- uat
- KAILAS-v02
---

# KAILAS

KAILAS (Keyword Labeler At SciX, also known as Indus-UAT-Labeler or nasa-smd-ibm-v0.1_UAT_Labeler) is a RoBERTa-based, encoder-only transformer model, domain-adapted for NASA Science Mission Directorate (SMD) applications. It is fine-tuned on scientific journals and articles relevant to NASA SMD, with the aim of improving natural language technologies such as information retrieval and intelligent search.

This specific fork was fine-tuned on proprietary data from the SciX Digital Library (https://scixplorer.org/, formerly NASA-ADS) to label text with Unified Astronomy Thesaurus (UAT) labels (https://astrothesaurus.org/).
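
The labeling step described above can be sketched with the Hugging Face `transformers` pipeline. This is a minimal sketch under stated assumptions, not the authors' documented usage: `MODEL_ID` is a hypothetical placeholder (this card does not state the Hub repository id), the `token-classification` task is taken from the `pipeline_tag` in the metadata, and the 0.5 score threshold and the `collect_labels` and `label_text` helpers are illustrative choices.

```python
from typing import Any, Dict, List

# Hypothetical Hub repository id: this card does not state where the model
# is hosted, so replace this placeholder with the real location.
MODEL_ID = "your-org/KAILAS"


def collect_labels(predictions: List[Dict[str, Any]], threshold: float = 0.5) -> List[str]:
    """Return the unique labels scoring at least `threshold`, best score first.

    `predictions` is expected in the shape produced by the token-classification
    pipeline with an aggregation strategy: dicts with "entity_group" and "score".
    The default threshold is an assumption, not a value taken from this card.
    """
    best: Dict[str, float] = {}
    for pred in predictions:
        label = pred["entity_group"]
        score = float(pred["score"])
        # Keep only the highest score seen for each label.
        if score >= threshold and score > best.get(label, -1.0):
            best[label] = score
    return sorted(best, key=lambda label: best[label], reverse=True)


def label_text(text: str, threshold: float = 0.5) -> List[str]:
    """Run the model on `text` and return the labels that pass the threshold."""
    # Imported lazily so collect_labels can be used without transformers installed.
    from transformers import pipeline

    tagger = pipeline(
        "token-classification",
        model=MODEL_ID,
        aggregation_strategy="simple",
    )
    return collect_labels(tagger(text), threshold)
```

With a real repository id in place, `label_text("We measure the rotation curves of nearby spiral galaxies.")` would return the predicted labels for that sentence. Whether those labels are UAT concept names or numeric identifiers depends on the model's configuration, which this card does not specify.
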

## Model Details

- **Base Model**: RoBERTa
- **Tokenizer**: Custom
- **Parameters**: 125M

## Training Data

- 18K titles, abstracts, body texts, and acknowledgments from recent, high-quality astronomy papers
- Approximately 217M tokens

<!-- ## Note -->

<!-- ## Citation -->
<!-- If you find this work useful, please cite using the following bibtex citation: -->

<!-- ## Disclaimer -->