---
language:
- en
license: apache-2.0
library_name: transformers
tags:
- medical
- spinal-cord-injury
- healthcare
- disability
- accessibility
- fine-tuned
- lora
- mistral
base_model: teknium/OpenHermes-2.5-Mistral-7B
pipeline_tag: text-generation
widget:
- text: "What is autonomic dysreflexia?"
  example_title: "Medical Question"
- text: "How can I transfer from my wheelchair to a car?"
  example_title: "Daily Living"
- text: "What exercises are good for someone with paraplegia?"
  example_title: "Exercise & Rehabilitation"
model-index:
- name: sci-assistant
  results: []
---

# SCI Assistant - Spinal Cord Injury Specialized AI Assistant

A specialized AI assistant fine-tuned for people with spinal cord injuries (SCI). This model is based on OpenHermes-2.5-Mistral-7B and was trained in two phases with LoRA (Low-Rank Adaptation) to provide contextually appropriate and medically informed responses for the SCI community.

## Model Description

This model was fine-tuned using a two-phase training approach:

1. **Phase 1**: Domain pretraining on SCI-related medical texts and resources
2. **Phase 2**: Instruction tuning on conversational SCI-focused Q&A pairs

The model is trained to reflect the unique challenges, medical realities, and daily life considerations of individuals living with spinal cord injuries.

## Training Details

- **Base Model**: teknium/OpenHermes-2.5-Mistral-7B
- **Training Method**: QLoRA (4-bit quantization with LoRA adapters)
- **Training Data**: 119,117 total entries (35,779 domain text + 83,337 instruction pairs)
- **Hardware**: RTX 4070 Super (12GB VRAM)
- **Training Time**: ~20 hours total (Phase 1 + Phase 2)

## Usage

This repository contains both the LoRA adapter and the full merged model.
Choose the option that works best for you:

### Option 1: Use the Full Merged Model (Recommended)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("basiphobe/sci-assistant")
tokenizer = AutoTokenizer.from_pretrained("basiphobe/sci-assistant")

# Example usage
prompt = "What are the signs of autonomic dysreflexia?"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_length=200)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```

### Option 2: Use the LoRA Adapter (Smaller Download)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
import torch

# Load the base model in 4-bit and attach the LoRA adapter
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)
base_model = AutoModelForCausalLM.from_pretrained(
    "teknium/OpenHermes-2.5-Mistral-7B",
    quantization_config=bnb_config,
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, "basiphobe/sci-assistant")
tokenizer = AutoTokenizer.from_pretrained("basiphobe/sci-assistant")

# Format prompt with SCI context
system_context = "You are a specialized medical assistant for people with spinal cord injuries. Your responses should always consider the unique needs, challenges, and medical realities of individuals living with SCI."
your_question = "What are the signs of autonomic dysreflexia?"  # replace with your own question
prompt = f"{system_context}\n\n### Instruction:\n{your_question}\n\n### Response:\n"

# Generate response (do_sample=True is required for temperature to take effect)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)

# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(response)
```

## Files in this Repository

- **Full Merged Model**: Ready-to-use model files (`model-*.safetensors`, `config.json`, etc.)
- **LoRA Adapter**: Smaller adapter files (`adapter_model.safetensors`, `adapter_config.json`)
- **Tokenizer**: Shared tokenizer files for both options

## GGUF Format Models

This repository also includes GGUF format models for use with **llama.cpp**, **Ollama**, and other GGUF-compatible inference engines. GGUF files run on a wide range of platforms, including CPU-only machines.

### Available GGUF Models

| File | Size | Format | Use Case | RAM Required |
|------|------|--------|----------|--------------|
| `merged-sci-model.gguf` | 14GB | F16 | Maximum quality inference | ~16GB |
| `merged-sci-model-q6_k.gguf` | 5.6GB | Q6_K | High quality with good compression | ~8GB |
| `merged-sci-model-q5_k_m.gguf` | 4.8GB | Q5_K_M | Excellent quality/size balance | ~7GB |
| `merged-sci-model-q5_k_s.gguf` | 4.7GB | Q5_K_S | Good quality, slightly smaller | ~7GB |
| `merged-sci-model-q4_k_m.gguf` | 4.1GB | Q4_K_M | Balanced quality/performance | ~6GB |

### Usage with Ollama

**1. Download and create Modelfile:**

```bash
# Download the Q5_K_M model (recommended balance of quality/size)
wget https://huggingface.co/basiphobe/sci-assistant/resolve/main/merged-sci-model-q5_k_m.gguf

# Create Modelfile
cat > Modelfile << 'EOF'
FROM ./merged-sci-model-q5_k_m.gguf

TEMPLATE """<|im_start|>system
You are a specialized medical assistant for people with spinal cord injuries. Your responses should always consider the unique needs, challenges, and medical realities of individuals living with SCI.<|im_end|>
<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
"""

PARAMETER stop "<|im_start|>"
PARAMETER stop "<|im_end|>"
PARAMETER temperature 0.7
PARAMETER top_p 0.9
EOF
```

**2. Create and run the model:**

```bash
ollama create sci-assistant -f Modelfile
ollama run sci-assistant "What are the signs of autonomic dysreflexia?"
```

### Usage with llama.cpp

**1. Install and set up:**

```bash
# Clone and build llama.cpp
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make

# Download model
wget https://huggingface.co/basiphobe/sci-assistant/resolve/main/merged-sci-model-q5_k_m.gguf
```

**2. Interactive chat:**

```bash
./main -m merged-sci-model-q5_k_m.gguf \
  --temp 0.7 \
  --repeat_penalty 1.1 \
  -c 4096 \
  --interactive \
  --in-prefix "<|im_start|>user\n" \
  --in-suffix "<|im_end|>\n<|im_start|>assistant\n"
```

**3. Single prompt:**

```bash
./main -m merged-sci-model-q5_k_m.gguf \
  --temp 0.7 \
  -c 2048 \
  -p "<|im_start|>system\nYou are a specialized medical assistant for people with spinal cord injuries.<|im_end|>\n<|im_start|>user\nWhat exercises are good for someone with paraplegia?<|im_end|>\n<|im_start|>assistant\n"
```

### Performance Comparison

- **F16 Model** (`merged-sci-model.gguf`): Maximum quality, largest memory footprint
- **Q6_K Model** (`merged-sci-model-q6_k.gguf`): Near-maximum quality with 60% size reduction
- **Q5_K_M Model** (`merged-sci-model-q5_k_m.gguf`): Excellent quality retention, good balance
- **Q5_K_S Model** (`merged-sci-model-q5_k_s.gguf`): Very good quality, slightly more compressed
- **Q4_K_M Model** (`merged-sci-model-q4_k_m.gguf`): Good quality, smallest size, recommended for resource-constrained environments

All models use the **ChatML** template format and support up to **32K context length**.
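
The ChatML layout used in the Ollama template and the llama.cpp commands above can also be assembled by hand when calling the model from Python. The following is a minimal sketch: the helper name `build_chatml_prompt` and the constant `SYSTEM_CONTEXT` are illustrative and not part of this repository, and the system text is copied from the Modelfile above.

```python
# System prompt copied from the Modelfile shown earlier in this card.
SYSTEM_CONTEXT = (
    "You are a specialized medical assistant for people with spinal cord injuries. "
    "Your responses should always consider the unique needs, challenges, and "
    "medical realities of individuals living with SCI."
)


def build_chatml_prompt(user_message: str, system_message: str = SYSTEM_CONTEXT) -> str:
    """Hypothetical helper: wrap a single-turn question in ChatML markers."""
    return (
        f"<|im_start|>system\n{system_message}<|im_end|>\n"
        f"<|im_start|>user\n{user_message}<|im_end|>\n"
        # The open assistant turn tells the model where to start generating.
        "<|im_start|>assistant\n"
    )


prompt = build_chatml_prompt("What are the signs of autonomic dysreflexia?")
print(prompt)
```

The resulting string can be passed to the tokenizer in place of a bare question. If the tokenizer in this repository ships a chat template (not confirmed here), `tokenizer.apply_chat_template` from Transformers is the usual way to get the same layout without hand-written markers.
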
## Intended Use

This model is designed to:

- Provide SCI-specific information and guidance
- Answer questions about daily life with spinal cord injuries
- Offer practical advice for common SCI challenges
- Support the SCI community with contextually appropriate responses

## Limitations

- This model is for informational purposes only and should not replace professional medical advice
- Always consult with healthcare providers for medical decisions
- The model may not have information about the latest medical developments
- Responses should be verified with medical professionals when making health-related decisions

## Direct Use

This model can be used directly for:

- Educational purposes about spinal cord injuries
- Providing general information and support to the SCI community
- Research into specialized medical AI assistants
- Personal use by individuals seeking SCI-related information

The model is designed to provide contextually appropriate responses that consider the unique challenges and medical realities of spinal cord injuries.
### Downstream Use

This model can be fine-tuned further for:

- Integration into healthcare applications
- Specialized medical chatbots for rehabilitation centers
- Educational platforms for SCI awareness and training
- Research applications in medical AI
- Custom applications for SCI support organizations

When used in downstream applications, implementers should:

- Maintain the medical disclaimer requirements
- Ensure proper supervision by medical professionals
- Implement appropriate safety measures and content filtering
- Validate outputs for medical accuracy in their specific use case

### Out-of-Scope Use

This model should NOT be used for:

- **Medical diagnosis or treatment decisions** - Always consult healthcare professionals
- **Emergency medical situations** - Seek immediate professional medical help
- **Legal or financial advice** related to SCI cases
- **Replacement for professional medical consultation**
- **Clinical decision-making** without physician oversight
- **Applications targeting vulnerable populations** without proper safeguards
- **Commercial medical applications** without appropriate medical validation and oversight

## Bias, Risks, and Limitations

### Medical Limitations

- **Not a substitute for medical professionals**: All medical advice should be verified with qualified healthcare providers
- **Training data limitations**: May not include the most recent medical research or treatments
- **Individual variation**: SCI affects individuals differently; responses may not apply to all cases
- **Geographic bias**: Training data may be biased toward certain healthcare systems or regions

### Technical Limitations

- **Hallucination risk**: Like all language models, may generate plausible-sounding but incorrect information
- **Context limitations**: Limited by input context window and may not retain information across long conversations
- **Language limitations**: Primarily trained on English content
- **Update lag**: Cannot access real-time medical research or current events

### Bias Considerations

- **Training data bias**: Reflects biases present in source medical literature and online content
- **Demographic representation**: May not equally represent all demographics within the SCI community
- **Healthcare access bias**: May reflect biases toward certain types of healthcare systems
- **Severity bias**: May be more informed about certain types or severities of SCI

### Risk Mitigation

- Always include medical disclaimers when using this model
- Implement content filtering for harmful or dangerous advice
- Regular evaluation by medical professionals is recommended
- Monitor outputs for accuracy and appropriateness

## Recommendations

Users should be aware of the following recommendations:

**For Direct Users:**

- Always verify medical information with qualified healthcare professionals
- Use responses as educational/informational starting points, not definitive advice
- Be aware that individual SCI experiences vary significantly
- Seek immediate professional help for urgent medical concerns

**For Developers/Implementers:**

- Implement clear medical disclaimers in any application using this model
- Provide easy access to professional medical resources alongside model responses
- Consider implementing content filtering for potentially harmful advice
- Regular review by medical professionals is strongly recommended
- Ensure compliance with relevant healthcare regulations (HIPAA, etc.)

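
As one way to act on the disclaimer recommendation for developers, an application could attach a fixed notice to every model response before displaying it. This is only a sketch under assumptions: the function name `with_disclaimer` and the wording of `DISCLAIMER` are hypothetical, and real deployments would need wording and placement reviewed by medical and legal professionals.

```python
# Hypothetical disclaimer text; adapt the wording to your own application.
DISCLAIMER = (
    "This information is for educational purposes only and does not replace "
    "professional medical advice. Seek immediate help for urgent medical concerns."
)


def with_disclaimer(model_response: str, disclaimer: str = DISCLAIMER) -> str:
    """Append the disclaimer to a model response unless it is already present."""
    text = model_response.strip()
    if disclaimer in text:
        # Avoid stacking the same notice when a response is processed twice.
        return text
    return f"{text}\n\n{disclaimer}"


print(with_disclaimer("Pressure relief every 15 to 30 minutes is commonly recommended."))
```

Keeping the notice in application code rather than relying on the model to produce it means the text is present even when a generated answer omits it.
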

**For Healthcare Organizations:**

- Professional medical oversight is essential when implementing in clinical settings
- Regular validation of model outputs against current medical standards
- Integration should complement, not replace, professional medical consultation
- Staff training on AI limitations and appropriate use cases

## Training Details

### Training Data

The training dataset consisted of 119,117 curated entries focused on spinal cord injury information:

**Domain Pretraining Data (35,779 entries):**

- Medical literature and research papers on SCI
- Educational materials from reputable SCI organizations
- Clinical guidelines and treatment protocols
- Rehabilitation and therapy documentation
- Patient education resources

**Instruction Tuning Data (83,337 entries):**

- SCI-focused question-answer pairs
- Conversational examples with appropriate medical context
- Real-world scenarios and practical advice situations
- Educational Q&A formatted for instruction following

All training data was filtered and curated to ensure:

- Sources from reputable medical organizations and healthcare professionals
- Content originally created or reviewed by medical professionals in the SCI field
- Appropriate tone and sensitivity for the SCI community
- Removal of potentially harmful or dangerous advice
- Proper medical disclaimers and context

**Note**: While the source materials were created by medical professionals, this model itself has not undergone independent medical validation.
### Training Procedure

The model was trained using a two-phase approach with QLoRA (Quantized Low-Rank Adaptation):

**Phase 1 - Domain Pretraining:**

- Focus: Medical terminology and SCI-specific knowledge
- Duration: 2 epochs (~8 hours)
- Data: 35,779 domain text entries
- Objective: Adapt base model to SCI medical domain

**Phase 2 - Instruction Tuning:**

- Focus: Conversational abilities and response formatting
- Duration: 2 epochs (~12 hours)
- Data: 83,337 instruction-response pairs
- Objective: Teach appropriate response patterns and tone

#### Preprocessing

Training data underwent the following preprocessing:

- Content sourced from materials created by healthcare professionals
- Sensitive content filtering and safety checks
- Standardized formatting for instruction-following
- Quality filtering to remove low-quality or inappropriate content
- Tokenization optimization for efficient training

#### Training Hyperparameters

- **Training regime:** 4-bit quantization with LoRA adapters (QLoRA)
- **Learning rate:** 2e-4 with cosine scheduling
- **LoRA rank:** 16
- **LoRA alpha:** 32
- **LoRA dropout:** 0.05
- **Target modules:** q_proj, v_proj
- **Batch size:** 4 with gradient accumulation
- **Max sequence length:** 512 tokens
- **Optimizer:** AdamW with weight decay

#### Speeds, Sizes, Times

- **Total training time:** ~20 hours (8h Phase 1 + 12h Phase 2)
- **Hardware:** RTX 4070 Super (12GB VRAM)
- **Final model size:** 30MB (LoRA adapter only)
- **Base model size:** 7B parameters (not included in adapter)
- **Training throughput:** ~3.5 samples/second average
- **Memory usage:** 6-7GB VRAM during training

## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

The model was evaluated using:

- Held-out test set of SCI-related questions (500 samples)
- Manual review of response quality and appropriateness
- Comparative analysis against general-purpose models on SCI topics
- Assessment of domain-specific knowledge retention

**Note**: Evaluation was conducted by the model developer, not independent medical professionals.

#### Factors

Evaluation considered multiple factors:

- **Medical accuracy**: Correctness of SCI-related information
- **Appropriateness**: Sensitivity and tone for the SCI community
- **Contextual relevance**: Understanding of SCI-specific challenges
- **Safety**: Avoidance of harmful or dangerous advice
- **Completeness**: Comprehensive responses to complex questions

#### Metrics

- **Medical accuracy score**: Based on consistency with source medical literature (not independently validated)
- **Appropriateness rating**: Developer assessment of tone and sensitivity (4.2/5.0 subjective rating)
- **Response relevance**: SCI-specific context understanding (82% relevance score)
- **Safety compliance**: No obviously harmful medical advice detected in test samples
- **Response quality**: Perplexity improvements over base model for SCI domain

### Results

**Quantitative Results:**

- 40% improvement in SCI domain perplexity over base model
- Responses demonstrate consistency with source medical literature
- 95% safety compliance (no obviously harmful medical advice detected)
- 82% average relevance score for SCI-specific contexts

**Qualitative Results:**

- Responses demonstrate clear understanding of SCI terminology and concepts
- Appropriate tone and sensitivity for the disability community
- Consistent inclusion of medical disclaimers
- Good balance between being helpful and cautious about medical advice

**Limitations of Evaluation:**

- Evaluation conducted by model developer, not independent medical experts
- No formal clinical validation or testing with SCI patients
- Results based on consistency with training sources, not independent medical verification

## Environmental Impact

Training carbon emissions estimated using energy consumption data:

- **Hardware Type:** RTX 4070 Super (12GB VRAM)
- **Hours used:** ~20 hours total training time
- **Cloud Provider:** Local training (personal hardware)
- **Compute Region:** North America
- **Carbon Emitted:** Approximately 2.1 kg CO2eq (estimated based on local energy grid)

The use of QLoRA reduced training time and energy consumption compared to full fine-tuning, making this a relatively efficient training approach.

## Technical Specifications

### Model Architecture and Objective

- **Base Architecture:** Mistral 7B transformer model
- **Adaptation Method:** QLoRA (Quantized Low-Rank Adaptation)
- **Objective:** Causal language modeling with SCI domain specialization
- **Quantization:** 4-bit precision for memory efficiency
- **LoRA Configuration:** Rank-16 adapters on attention projection layers

### Compute Infrastructure

#### Hardware

- **GPU:** NVIDIA RTX 4070 Super (12GB VRAM)
- **CPU:** Modern multi-core processor
- **RAM:** 32GB system memory
- **Storage:** NVMe SSD for fast data loading

#### Software

- **Framework:** Transformers 4.36+, PEFT 0.16.0
- **Training:** QLoRA with bitsandbytes quantization
- **Environment:** Python 3.10+, PyTorch 2.0+, CUDA 12.1

## Citation

If you use this model in your research or applications, please cite:

**BibTeX:**

```bibtex
@misc{sci_assistant_2025,
  title={SCI Assistant: A Specialized AI Assistant for Spinal Cord Injury Support},
  author={basiphobe},
  year={2025},
  howpublished={Hugging Face Model Repository},
  url={https://huggingface.co/basiphobe/sci-assistant}
}
```

**APA:**

basiphobe. (2025). *SCI Assistant: A Specialized AI Assistant for Spinal Cord Injury Support*. Hugging Face. https://huggingface.co/basiphobe/sci-assistant

## Glossary

**SCI**: Spinal Cord Injury - damage to the spinal cord that results in temporary or permanent changes in function

**QLoRA**: Quantized Low-Rank Adaptation - an efficient fine-tuning method that reduces memory requirements

**Domain Pretraining**: Training phase focused on learning domain-specific terminology and knowledge

**Instruction Tuning**: Training phase focused on learning conversational patterns and response formatting

**Perplexity**: A metric measuring how well a language model predicts text (lower is better)

**LoRA**: Low-Rank Adaptation - parameter-efficient fine-tuning technique

## Model Card Authors

**Primary Author:** basiphobe

**Model Development:** Individual research project for SCI community support

**Data Sources:** Curated from medical literature and educational materials created by healthcare professionals

**Validation Status:** Model has not undergone independent medical professional validation

## Model Card Contact

For questions, issues, or feedback regarding this model:

- **Hugging Face:** https://huggingface.co/basiphobe/sci-assistant
- **Issues:** Please report issues through the Hugging Face model repository
- **Medical Concerns:** Always consult qualified healthcare professionals

**Important Note:** This model is provided for educational and informational purposes. Always seek professional medical advice for health-related questions and decisions.

### Framework versions

- PEFT 0.16.0
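
As a rough cross-check of the adapter size listed under Speeds, Sizes, Times, the number of trainable LoRA parameters can be estimated from the hyperparameters given in this card (rank 16 on `q_proj` and `v_proj`). The sketch below assumes the published Mistral 7B dimensions (hidden size 4096, 32 layers, 8 key/value heads of dimension 128), which this card does not state, and assumes the adapter is stored in 32-bit floats.

```python
# Assumed Mistral 7B dimensions (not stated in this card).
HIDDEN_SIZE = 4096
NUM_LAYERS = 32
NUM_KV_HEADS = 8
HEAD_DIM = 128

# From the Training Hyperparameters section.
LORA_RANK = 16
TARGET_MODULES = {
    # module name: (input features, output features)
    "q_proj": (HIDDEN_SIZE, HIDDEN_SIZE),
    "v_proj": (HIDDEN_SIZE, NUM_KV_HEADS * HEAD_DIM),
}


def lora_param_count(in_features: int, out_features: int, rank: int) -> int:
    """LoRA adds two matrices per module: A (rank x in) and B (out x rank)."""
    return rank * (in_features + out_features)


per_layer = sum(lora_param_count(i, o, LORA_RANK) for i, o in TARGET_MODULES.values())
total_params = per_layer * NUM_LAYERS
size_mb_fp32 = total_params * 4 / 1e6  # 4 bytes per parameter, assumed fp32 storage

print(f"LoRA parameters per layer: {per_layer:,}")     # 212,992
print(f"Total LoRA parameters:     {total_params:,}")  # 6,815,744
print(f"Approximate size (fp32):   {size_mb_fp32:.1f} MB")  # 27.3 MB
```

Under these assumptions the adapter holds about 6.8 million parameters, or roughly 27 MB of weights, which is in line with the 30MB adapter size reported above once file metadata is included.
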