|
--- |
|
datasets: |
|
- ofir408/MedConceptsQA |
|
language: |
|
- en |
|
metrics: |
|
- accuracy |
|
base_model: |
|
- TinyLlama/TinyLlama-1.1B-Chat-v1.0 |
|
pipeline_tag: text-generation
|
library_name: transformers |
|
tags: |
|
- tinyllama |
|
- lora |
|
- instruction-tuned |
|
- peft |
|
|
- merged |
|
- medical |
|
- healthcare |
|
--- |
|
|
|
# 🩺 TinyLlama Medical Assistant (Merged LoRA) |
|
|
|
**Author:** Nabil Faieaz |
|
**Base model:** [TinyLlama/TinyLlama-1.1B-Chat-v1.0](https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0) |
|
**Fine-tuning method:** LoRA (Low-Rank Adaptation) using PEFT → merged into base weights |
|
**Intended use:** Concise, factual, general medical information |
|
|
|
--- |
|
|
|
## 📌 Overview |
|
|
|
This model is a **fine-tuned version of TinyLlama 1.1B-Chat** adapted for **medical question answering**. |
|
It was trained to give **brief and accurate** answers to medical questions, following a consistent Q/A style.
|
|
|
Key features: |
|
- ✅ LoRA fine-tuning for efficient adaptation on limited compute (T4 GPU) |
|
- ✅ Merged LoRA + base into a **single standalone model** (no separate adapter needed) |
|
- ✅ Optimized for short, factual answers — avoids overly verbose outputs |
|
- ✅ Context-aware: warns users to seek professional medical help for urgent/personal issues |
|
|
|
--- |
|
|
|
## ⚠️ Disclaimer |
|
|
|
> **This model is for educational and informational purposes only.** |
|
> It is **not** a substitute for professional medical advice, diagnosis, or treatment. |
|
> Always consult a qualified healthcare provider for medical concerns. |
|
|
|
--- |
|
|
|
## 🚀 Quick Start |
|
|
|
```python |
|
from transformers import AutoTokenizer, AutoModelForCausalLM |
|
|
|
model_id = "nabilfaieaz/tinyllama-med-full" |
|
|
|
# Load tokenizer and model |
|
tokenizer = AutoTokenizer.from_pretrained(model_id) |
|
if tokenizer.pad_token is None: |
|
tokenizer.pad_token = tokenizer.eos_token |
|
|
|
model = AutoModelForCausalLM.from_pretrained( |
|
model_id, |
|
torch_dtype="auto", |
|
device_map="auto" |
|
) |
|
|
|
# Example prompt |
|
system_prompt = ( |
|
"You are a helpful, concise medical assistant. Provide general information only, " |
|
"not a diagnosis. If urgent or personal issues are mentioned, advise seeing a clinician." |
|
) |
|
|
|
question = "What is hypertension?" |
|
prompt = f"{system_prompt}\n\nQuestion: {question}\nAnswer:" |
|
|
|
inputs = tokenizer(prompt, return_tensors="pt").to(model.device) |
|
outputs = model.generate( |
|
**inputs, |
|
max_new_tokens=128, |
|
    do_sample=False,  # greedy decoding; temperature/top_p are ignored when sampling is off
|
eos_token_id=tokenizer.eos_token_id, |
|
pad_token_id=tokenizer.pad_token_id |
|
) |
|
|
|
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
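
If you prefer the high-level API, the same model can be run through the `text-generation` pipeline. This is a minimal sketch; the prompt and greedy settings mirror the example above:

```python
from transformers import pipeline

# The merged model loads like any standalone causal LM
generator = pipeline(
    "text-generation",
    model="nabilfaieaz/tinyllama-med-full",
    device_map="auto",
)

result = generator(
    "Question: What is hypertension?\nAnswer:",
    max_new_tokens=128,
    do_sample=False,  # greedy decoding, matching the example above
)
print(result[0]["generated_text"])
```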
|
|
|
---

## 🧠 Training Details

- **Base model:** [TinyLlama/TinyLlama-1.1B-Chat-v1.0](https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0)

- **Fine-tuning method:** LoRA (via `peft`)

- **Target modules:** `q_proj`, `k_proj`, `v_proj`, `o_proj`

- **LoRA config:** r = 16, alpha = 16, dropout = 0.0

- **Max sequence length:** 512 tokens

- **Batch size:** 2 per device, with gradient accumulation for a larger effective batch

- **Learning rate:** 2e-4

- **Precision:** fp16

- **Evaluation:** every 200 steps

- **Checkpoints:** saved every 500 steps; final model merged from checkpoint-17000
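
For reference, here is a minimal sketch of the adapter setup and merge described above, using the `peft` API. The training loop itself is omitted, and the output path is illustrative:

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0")

# LoRA configuration matching the values listed above
config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, config)
model.print_trainable_parameters()  # only the low-rank adapter weights train

# ... fine-tune with your preferred Trainer setup ...

# Fold the adapters back into the base weights so the result loads as a
# plain transformers model (this is what the repo ships).
merged = model.merge_and_unload()
merged.save_pretrained("tinyllama-med-full")
```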
|
|
|
---

## 📊 Intended Use

**Intended:**

* Educational explanations of medical terms and concepts

* Study aid for medical students and healthcare professionals

* Healthcare-related chatbot demos

**Not intended:**

* Real-time clinical decision making

* Emergency medical guidance

* Handling sensitive personal medical data (PHI)
|
---

## ⚙️ Technical Notes

* The LoRA weights are already merged into the base model, so no separate adapter loading is required.

* Works with Hugging Face `transformers` ≥ 4.38.

* Can be quantized to 4-bit with `bitsandbytes` for local inference; see the sketch below.
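
A minimal sketch of 4-bit loading via `BitsAndBytesConfig` (requires the `bitsandbytes` package and a CUDA GPU; the NF4 settings below are common defaults, not values prescribed by this repo):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "nabilfaieaz/tinyllama-med-full"

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",             # NF4 4-bit quantization
    bnb_4bit_compute_dtype=torch.float16,  # compute in fp16
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
```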
|
|
|
|