---
library_name: transformers
license: apache-2.0
datasets:
- ik-ram28/synthetic-NER-dataset
language:
- fr
base_model:
- Ihor/gliner-biomed-large-v1.0
---

# EvalLLM-GLiNER-Biomedical

## Model Description

This model is a fine-tuned version of [gliner-biomed-large-v1.0](https://huggingface.co/Ihor/gliner-biomed-large-v1.0) designed for French biomedical Named Entity Recognition (NER). It was developed as part of the EvalLLM 2025 challenge. The model builds on GLiNER's zero-shot capabilities and is fine-tuned on synthetic biomedical data to identify 21 types of biomedical entities in French text.

## Model Details

### Base Model

- **Architecture**: GLiNER (Generalist and Lightweight Model for Named Entity Recognition)
- **Base Version**: gliner-biomed-large-v1.0
- **Language**: French
- **Domain**: Biomedical and health-related text

### Training Configuration

- **Training Epochs**: 3 (early stopping at 2.85 epochs)
- **Learning Rate**: 1e-5
- **Weight Decay**: 0.01
- **Scheduler**: Cosine with 10% warm-up
- **Batch Size**: 8
- **Training Data**: 1,748 synthetic documents

## Entity Types (21 categories)

| Entity Type | French Label | Example |
|-------------|--------------|---------|
| `ABS_DATE` | Date absolue | "15 mars 2020" |
| `ABS_PERIOD` | Période absolue | "janvier 2019 à mars 2020" |
| `BIO_TOXIN` | Toxine biologique | "toxine botulique" |
| `DIS_REF_TO_PATH` | Référence maladie-pathogène | "infection par E. coli" |
| `DOC_AUTHOR` | Auteur de document | "Dr. Martin Dubois" |
| `DOC_DATE` | Date de document | "publié le 12/03/2021" |
| `DOC_SOURCE` | Source de document | "Journal of Medicine" |
| `EVENT_MACRO` | Événement macro | "épidémie de COVID-19" |
| `EVENT_MICRO` | Événement micro | "cas de contamination" |
| `EXPLOSIVE` | Explosif | "TNT", "dynamite" |
| `FUZZY_PERIOD` | Période floue | "début d'année", "récemment" |
| `INF_DISEASE` | Maladie infectieuse | "grippe", "tuberculose" |
| `LOCATION` | Localisation | "Paris", "France" |
| `LOC_REF_TO_ORG` | Référence lieu-organisation | "hôpital de Lyon" |
| `NON_INF_DISEASE` | Maladie non infectieuse | "diabète", "cancer" |
| `ORGANIZATION` | Organisation | "OMS", "Institut Pasteur" |
| `ORG_REF_TO_LOC` | Référence organisation-lieu | "OMS Europe" |
| `PATHOGEN` | Pathogène | "virus Ebola", "E. coli" |
| `PATH_REF_TO_DIS` | Référence pathogène-maladie | "virus causant la grippe" |
| `RADIOISOTOPE` | Radio-isotope | "uranium 235", "césium 137" |
| `REL_DATE` | Date relative | "hier", "la semaine dernière" |
| `REL_PERIOD` | Période relative | "depuis 3 mois" |
| `TOXIC_AGENT` | Agent toxique | "plomb", "mercure" |

## Citation

```bibtex
```

## Related Resources

- **GitHub Repository**: [EvalLLM2025](https://github.com/ikram28/EvalLLM2025)
- **Paper**: [Link to paper when published]
- **Challenge**: [EvalLLM 2025](https://evalllm2025.sciencesconf.org/)

## License

This model is released under the Apache 2.0 License.

## Acknowledgments

- GLiNER team for the base architecture
- EvalLLM 2025 organizers
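
## Learning-Rate Schedule Sketch

The Training Configuration section lists a cosine scheduler with 10% warm-up and a learning rate of 1e-5. The sketch below shows one common reading of that setup. It is not the training code used for this model: the linear warm-up from zero, the decay to zero, the function name `lr_at_step`, and the step count in the demo are all assumptions made for illustration.

```python
import math

BASE_LR = 1e-5          # learning rate from the Training Configuration section
WARMUP_FRACTION = 0.10  # "10% warm-up" from the same section


def lr_at_step(step: int, total_steps: int,
               base_lr: float = BASE_LR,
               warmup_fraction: float = WARMUP_FRACTION) -> float:
    """Return the learning rate at a 0-indexed optimizer step.

    Assumptions (not stated in the model card): the warm-up is linear
    starting from 0, and the cosine phase decays to 0 over the
    remaining steps.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    if step < 0:
        raise ValueError("step must be non-negative")

    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr.
        return base_lr * step / warmup_steps

    # Cosine decay from base_lr down to 0.
    decay_steps = max(1, total_steps - warmup_steps)
    progress = min(1.0, (step - warmup_steps) / decay_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    # Hypothetical step count: 1,748 documents at batch size 8 for 3 epochs,
    # assuming every document is used for training and there is no
    # gradient accumulation.
    steps_per_epoch = math.ceil(1748 / 8)
    total = steps_per_epoch * 3
    for s in (0, total // 10, total // 2, total - 1):
        print(s, f"{lr_at_step(s, total):.2e}")
```

With this shape, the rate climbs during the first tenth of training, peaks at 1e-5, and then falls smoothly, which keeps early updates small while the fine-tuned weights are still close to the base model.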