---
title: README
emoji: 🏢
colorFrom: indigo
colorTo: blue
sdk: static
pinned: true
license: cc-by-nc-sa-4.0
---
|
|
|
# BiMediX: Bilingual Medical Mixture of Experts LLM |
|
|
|
Welcome to the official HuggingFace repository for BiMediX, the bilingual medical Large Language Model (LLM) designed for English and Arabic interactions. BiMediX facilitates a broad range of **medical interactions**, including multi-turn chats, multiple-choice Q&A, and open-ended question answering. |
|
|
|
## Key Features |
|
|
|
- **Bilingual Support**: Seamless interaction in both English and Arabic across a wide range of medical tasks, including multi-turn chats, multiple-choice question answering, and open-ended question answering.
- **BiMed1.3M Dataset**: A unique dataset of 1.3 million bilingual medical interactions in English and Arabic, including 250k synthesized multi-turn doctor-patient chats for instruction tuning.
- **High-Quality Translation**: A semi-automated English-to-Arabic translation pipeline with human refinement ensures accuracy and quality of translations.
- **Evaluation Benchmark for Arabic Medical LLMs**: A comprehensive benchmark for evaluating Arabic medical language models, setting a new standard in the field.
- **State-of-the-Art Performance**: Outperforms existing models on medical benchmarks while being eight times faster than comparably sized existing models.
|
|
|
For full details of this model, please read our [paper (pre-print)](#).
|
|
|
## Getting Started |
|
|
|
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "BiMediX/BiMediX-Bi"

# Load the tokenizer and model weights from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Encode a medical query and generate a response.
text = "Hello BiMediX! I've been experiencing increased tiredness in the past week."
inputs = tokenizer(text, return_tensors="pt")

outputs = model.generate(**inputs, max_new_tokens=500)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
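The snippet above loads the model in full precision, which requires substantial GPU memory for a Mixtral-scale MoE. As a minimal sketch, assuming the standard `bitsandbytes` integration in `transformers` (the `BitsAndBytesConfig` shown here is a generic `transformers` API, not something BiMediX-specific), you can instead load the weights in 4-bit to fit a smaller memory budget:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "BiMediX/BiMediX-Bi"

# 4-bit NF4 quantization shrinks the memory footprint of the MoE weights.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # place layers across available GPUs automatically
)
```

Generation then works as in the example above; remember to move `inputs` to `model.device` before calling `generate`.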
|
|
|
|
|
## Model Details |
|
|
|
|
|
The BiMediX model, built on a Mixture of Experts (MoE) architecture, leverages the [Mixtral-8x7B](https://huggingface.co/mistralai/Mixtral-8x7B-v0.1) base model. It features a router network that allocates each token to the most relevant experts, each of which is a specialized feedforward block within the model.
|
This approach enables the model to scale efficiently through sparse computation: fewer than 13 billion parameters are active during inference.
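To make the routing idea concrete, the following is an illustrative sketch of Mixtral-style top-2 gating; the class, dimensions, and expert definition below are simplified assumptions for exposition, not BiMediX's actual implementation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Top2MoELayer(nn.Module):
    """Simplified Mixtral-style MoE block: a router picks 2 of 8 experts per token."""
    def __init__(self, dim=4096, hidden=14336, n_experts=8, top_k=2):
        super().__init__()
        self.router = nn.Linear(dim, n_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, hidden), nn.SiLU(), nn.Linear(hidden, dim))
            for _ in range(n_experts)
        )
        self.top_k = top_k

    def forward(self, x):  # x: (n_tokens, dim)
        logits = self.router(x)                            # (n_tokens, n_experts)
        weights, expert_idx = logits.topk(self.top_k, -1)  # top-2 experts per token
        weights = F.softmax(weights, dim=-1)               # renormalize over the 2
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            for k in range(self.top_k):
                mask = expert_idx[:, k] == e
                if mask.any():  # only selected experts run: sparse activation
                    out[mask] += weights[mask, k].unsqueeze(-1) * expert(x[mask])
        return out
```

Because only two of the eight experts execute per token, a forward pass touches far fewer parameters than the model's total size, which is where the sub-13B active-parameter figure comes from.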
|
The training utilized the BiMed1.3M dataset, focusing on bilingual medical interactions in both English and Arabic, with a substantial corpus of over 632 million healthcare-specialized tokens. |
|
The fine-tuning process uses QLoRA (quantized low-rank adaptation) to efficiently adapt the model to specific tasks while keeping computational demands manageable.
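As a hedged illustration of what QLoRA-style fine-tuning looks like with the `peft` library, the adapter rank, target modules, and other hyperparameters below are assumptions for exposition, not the exact recipe used to train BiMediX:

```python
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Assumes `model` was loaded in 4-bit, as in the quantized-loading sketch above.
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=16,                     # rank of the low-rank update matrices (illustrative)
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the small adapter weights are trainable
```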
|
|
|
| Model Name  | Download |
|-------------|----------|
| BiMediX-Eng | [HuggingFace Link](https://huggingface.co/BiMediX/BiMediX-Eng) |
| BiMediX-Ara | [HuggingFace Link](https://huggingface.co/BiMediX/BiMediX-Ara) |
| BiMediX-Bi  | [HuggingFace Link](https://huggingface.co/BiMediX/BiMediX-Bi) |
|
|
|
## Dataset |
|
|
|
BiMed1.3M comprises 1.3 million bilingual (English-Arabic) medical interactions, including over 250k synthesized multi-turn doctor-patient chats, for a total of more than 632 million healthcare-specialized tokens. (Details on dataset access to follow.)
|
|
|
## Benchmarks and Performance |
|
|
|
The BiMediX model was evaluated across several benchmarks, demonstrating its effectiveness in medical language understanding and question answering in both English and Arabic. |
|
|
|
1. **Medical Benchmarks Used for Evaluation:** |
|
- **PubMedQA**: A dataset for question answering from biomedical research papers, requiring reasoning over biomedical contexts. |
|
- **MedMCQA**: Multiple-choice questions from Indian medical entrance exams, covering a wide range of medical subjects. |
|
- **MedQA**: Questions from US and other medical board exams, testing specific knowledge and patient case understanding. |
|
- **Medical MMLU**: A compilation of questions from various medical subjects, requiring broad medical knowledge. |
|
|
|
2. **Results and Comparisons:** |
|
   - **Bilingual Evaluation**: In bilingual (Arabic-English) evaluation, BiMediX outperformed both the Mixtral-8x7B base model and Jais-30B, a model designed for Arabic, with average accuracy more than 10 and 15 points higher, respectively.
|
- **Arabic Benchmark**: In Arabic-specific evaluations, BiMediX outperformed Jais-30B in all categories, highlighting the effectiveness of the BiMed1.3M dataset and bilingual training. |
|
- **English Benchmark**: BiMediX also excelled in English medical benchmarks, surpassing other state-of-the-art models like Med42-70B and Meditron-70B in terms of average performance and efficiency. |
|
|
|
These results underscore BiMediX's advanced capability in handling medical queries and its significant improvement over existing models in both languages, leveraging its unique bilingual dataset and training approach. |
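For context on how multiple-choice benchmarks such as MedMCQA and MedQA are commonly scored, the helper below is a generic log-likelihood scoring sketch; this is a standard evaluation protocol, not the paper's own evaluation code:

```python
import torch

def option_log_likelihood(model, tokenizer, question, option):
    """Total log-probability the model assigns to `option` given `question`."""
    prompt_len = tokenizer(question, return_tensors="pt").input_ids.shape[1]
    full_ids = tokenizer(question + " " + option, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids).logits
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)  # next-token distributions
    option_ids = full_ids[0, prompt_len:]                  # tokens of the answer option
    positions = range(prompt_len - 1, full_ids.shape[1] - 1)
    return sum(log_probs[p, t].item() for p, t in zip(positions, option_ids))

# The predicted answer is the highest-scoring option:
# prediction = max(options, key=lambda o: option_log_likelihood(model, tokenizer, q, o))
```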
|
|
|
## Limitations and Ethical Considerations |
|
|
|
**This release, intended for research, is not ready for clinical or commercial use.** Users are urged to employ BiMediX responsibly, especially when applying its outputs in real-world medical scenarios. |
|
It is imperative to verify the model's advice with qualified healthcare professionals and not to rely on AI for medical diagnoses or treatment decisions. |
|
Despite the overall advancements BiMediX brings to the field of medical NLP, it shares common challenges with other language models, including hallucinations, toxicity, and stereotypes. BiMediX's medical diagnoses and recommendations are not infallible.
|
|
|
## License and Citation |
|
|
|
BiMediX is released under the CC-BY-NC-SA 4.0 License. |
|
For more details, please refer to the [LICENSE](https://huggingface.co/BiMediX/BiMediX-Bi/blob/main/LICENSE.txt) file included in this repository. |
|
|
|
If you use BiMediX in your research, please cite our work as follows: |
|
|
|
```bibtex
@article{yourModel2024,
  title={BiMediX: Bilingual Medical Mixture of Experts LLM},
  author={Your Name and Collaborators},
  journal={Journal of AI Research},
  year={2024},
  volume={xx},
  number={xx},
  pages={xx-xx},
  doi={xx.xxxx/xxxxxx}
}
```
|
|
|
Visit our [GitHub](https://github.com/mbzuai-oryx/BiMediX) for more information and resources. |