SCI Assistant - Spinal Cord Injury Specialized AI Assistant

A specialized AI assistant fine-tuned for people with spinal cord injuries (SCI). The model is based on OpenHermes-2.5-Mistral-7B and was trained in two phases with LoRA (Low-Rank Adaptation) to provide contextually appropriate, medically informed responses for the SCI community.

Model Description

This model was fine-tuned using a two-phase training approach:

  1. Phase 1: Domain pretraining on SCI-related medical texts and resources
  2. Phase 2: Instruction tuning on conversational SCI-focused Q&A pairs

The model is intended to reflect the unique challenges, medical realities, and daily life considerations of individuals living with spinal cord injuries.

Training Details

  • Base Model: teknium/OpenHermes-2.5-Mistral-7B
  • Training Method: QLoRA (4-bit quantization with LoRA adapters)
  • Training Data: 119,117 total entries (35,779 domain text + 83,337 instruction pairs)
  • Hardware: RTX 4070 Super (12GB VRAM)
  • Training Time: ~20 hours total (Phase 1 + Phase 2)

Usage

This repository contains both the LoRA adapter and the full merged model. Choose the option that works best for you:

Option 1: Use the Full Merged Model (Recommended)

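import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the merged weights in half precision; device_map="auto" requires the accelerate package
model = AutoModelForCausalLM.from_pretrained(
    "basiphobe/sci-assistant",
    torch_dtype=torch.float16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("basiphobe/sci-assistant")

# Example usage
prompt = "What are the signs of autonomic dysreflexia?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)  # keep inputs on the model's device
outputs = model.generate(**inputs, max_length=200)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)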

Option 2: Use the LoRA Adapter (Smaller Download)

from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
import torch

# Load model
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

base_model = AutoModelForCausalLM.from_pretrained(
    "teknium/OpenHermes-2.5-Mistral-7B",
    quantization_config=bnb_config,
    device_map="auto"
)

model = PeftModel.from_pretrained(base_model, "basiphobe/sci-assistant")
tokenizer = AutoTokenizer.from_pretrained("basiphobe/sci-assistant")

# Format prompt with SCI context
system_context = "You are a specialized medical assistant for people with spinal cord injuries. Your responses should always consider the unique needs, challenges, and medical realities of individuals living with SCI."

your_question = "What are the signs of autonomic dysreflexia?"  # replace with your own question
prompt = f"{system_context}\n\n### Instruction:\n{your_question}\n\n### Response:\n"

# Generate response (do_sample=True is needed for temperature to take effect)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(response)

Files in this Repository

  • Full Merged Model: Ready-to-use model files (model-*.safetensors, config.json, etc.)
  • LoRA Adapter: Smaller adapter files (adapter_model.safetensors, adapter_config.json)
  • Tokenizer: Shared tokenizer files for both options

GGUF Format Models

This repository also includes GGUF format models optimized for use with llama.cpp, Ollama, and other GGUF-compatible inference engines. These formats offer excellent performance and compatibility across different platforms.

Available GGUF Models

| File | Size | Format | Use Case | RAM Required |
|---|---|---|---|---|
| merged-sci-model.gguf | 14GB | F16 | Maximum quality inference | ~16GB |
| merged-sci-model-q6_k.gguf | 5.6GB | Q6_K | High quality with good compression | ~8GB |
| merged-sci-model-q5_k_m.gguf | 4.8GB | Q5_K_M | Excellent quality/size balance | ~7GB |
| merged-sci-model-q5_k_s.gguf | 4.7GB | Q5_K_S | Good quality, slightly smaller | ~7GB |
| merged-sci-model-q4_k_m.gguf | 4.1GB | Q4_K_M | Balanced quality/performance | ~6GB |
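
The table above can also serve as a simple selection rule. As a minimal sketch (the helper name `pick_gguf` and the idea of choosing purely by available RAM are illustrative assumptions, not something shipped in this repository), the following picks the highest-quality file whose listed RAM requirement fits:

```python
from typing import Optional

# (file name, format, approximate RAM required in GB), copied from the table above
# and ordered from highest to lowest quality.
GGUF_MODELS = [
    ("merged-sci-model.gguf", "F16", 16),
    ("merged-sci-model-q6_k.gguf", "Q6_K", 8),
    ("merged-sci-model-q5_k_m.gguf", "Q5_K_M", 7),
    ("merged-sci-model-q5_k_s.gguf", "Q5_K_S", 7),
    ("merged-sci-model-q4_k_m.gguf", "Q4_K_M", 6),
]


def pick_gguf(available_ram_gb: float) -> Optional[str]:
    """Return the first (highest-quality) file that fits, or None if none do."""
    for filename, _fmt, ram_gb in GGUF_MODELS:
        if ram_gb <= available_ram_gb:
            return filename
    return None


print(pick_gguf(8))  # merged-sci-model-q6_k.gguf
```

The RAM figures are the approximate values listed above, so it is worth leaving some headroom for the context window and other processes.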

Usage with Ollama

1. Download and create Modelfile:

# Download the Q5_K_M model (recommended balance of quality/size)
wget https://huggingface.co/basiphobe/sci-assistant/resolve/main/merged-sci-model-q5_k_m.gguf

# Create Modelfile
cat > Modelfile << 'EOF'
FROM ./merged-sci-model-q5_k_m.gguf
TEMPLATE """<|im_start|>system
You are a specialized medical assistant for people with spinal cord injuries. Your responses should always consider the unique needs, challenges, and medical realities of individuals living with SCI.<|im_end|>
<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
"""
PARAMETER stop "<|im_start|>"
PARAMETER stop "<|im_end|>"
PARAMETER temperature 0.7
PARAMETER top_p 0.9
EOF

2. Create and run the model:

ollama create sci-assistant -f Modelfile
ollama run sci-assistant "What are the signs of autonomic dysreflexia?"
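
Once the model has been created, it can also be queried from code through Ollama's local HTTP API. The sketch below assumes an Ollama server running at its default address (`http://localhost:11434`) and uses the `/api/generate` endpoint; the helper names `build_request` and `ask` are hypothetical.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # assumed default local Ollama address


def build_request(prompt: str, model: str = "sci-assistant") -> urllib.request.Request:
    """Build a non-streaming generate request for the model created above."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt: str) -> str:
    """Send the request and return the generated text (needs a running Ollama server)."""
    with urllib.request.urlopen(build_request(prompt)) as resp:
        return json.loads(resp.read().decode("utf-8"))["response"]


# Example (requires the Ollama server to be running locally):
# print(ask("What are the signs of autonomic dysreflexia?"))
```

Because the system prompt and sampling parameters live in the Modelfile, the request only needs the model name and the user prompt.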

Usage with llama.cpp

1. Install and setup:

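# Clone and build llama.cpp
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make
# Note: newer llama.cpp releases build with CMake instead
# (cmake -B build && cmake --build build --config Release) and name the
# binary llama-cli rather than main; adjust the commands to match your build.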

# Download model
wget https://huggingface.co/basiphobe/sci-assistant/resolve/main/merged-sci-model-q5_k_m.gguf

2. Interactive chat:

# -e makes llama.cpp interpret the \n escapes in the prefix and suffix
./main -m merged-sci-model-q5_k_m.gguf \
  --temp 0.7 \
  --repeat_penalty 1.1 \
  -c 4096 \
  --interactive \
  -e \
  --in-prefix "<|im_start|>user\n" \
  --in-suffix "<|im_end|>\n<|im_start|>assistant\n"

3. Single prompt:

# -e makes llama.cpp interpret the \n escapes in the prompt string
./main -m merged-sci-model-q5_k_m.gguf \
  --temp 0.7 \
  -c 2048 \
  -e \
  -p "<|im_start|>system\nYou are a specialized medical assistant for people with spinal cord injuries.<|im_end|>\n<|im_start|>user\nWhat exercises are good for someone with paraplegia?<|im_end|>\n<|im_start|>assistant\n"

Performance Comparison

  • F16 Model (merged-sci-model.gguf): Maximum quality, largest memory footprint
  • Q6_K Model (merged-sci-model-q6_k.gguf): Near-maximum quality with 60% size reduction
  • Q5_K_M Model (merged-sci-model-q5_k_m.gguf): Excellent quality retention, good balance
  • Q5_K_S Model (merged-sci-model-q5_k_s.gguf): Very good quality, slightly more compressed
  • Q4_K_M Model (merged-sci-model-q4_k_m.gguf): Good quality, smallest size, recommended for resource-constrained environments

All models use the ChatML template format and support up to 32K context length.
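
For callers that build prompts by hand, the ChatML layout used in the llama.cpp and Ollama examples can be sketched as a small helper. The function name `to_chatml` is hypothetical; the token strings are the ones that appear in the templates above.

```python
from typing import Dict, List


def to_chatml(messages: List[Dict[str, str]]) -> str:
    """Render messages as ChatML and leave the assistant turn open for generation."""
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages]
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = to_chatml([
    {"role": "system", "content": "You are a specialized medical assistant for people with spinal cord injuries."},
    {"role": "user", "content": "What exercises are good for someone with paraplegia?"},
])
print(prompt)
```

The resulting string matches the single-prompt llama.cpp example, with real newlines in place of the `\n` escapes.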

Intended Use

This model is designed to:

  • Provide SCI-specific information and guidance
  • Answer questions about daily life with spinal cord injuries
  • Offer practical advice for common SCI challenges
  • Support the SCI community with contextually appropriate responses

Limitations

  • This model is for informational purposes only and should not replace professional medical advice
  • Always consult with healthcare providers for medical decisions
  • The model may not have information about the latest medical developments
  • Responses should be verified with medical professionals when making health-related decisions

Direct Use

This model can be used directly for:

  • Educational purposes about spinal cord injuries
  • Providing general information and support to the SCI community
  • Research into specialized medical AI assistants
  • Personal use by individuals seeking SCI-related information

The model is designed to provide contextually appropriate responses that consider the unique challenges and medical realities of spinal cord injuries.

Downstream Use

This model can be fine-tuned further for:

  • Integration into healthcare applications
  • Specialized medical chatbots for rehabilitation centers
  • Educational platforms for SCI awareness and training
  • Research applications in medical AI
  • Custom applications for SCI support organizations

When used in downstream applications, implementers should:

  • Maintain the medical disclaimer requirements
  • Ensure proper supervision by medical professionals
  • Implement appropriate safety measures and content filtering
  • Validate outputs for medical accuracy in their specific use case

Out-of-Scope Use

This model should NOT be used for:

  • Medical diagnosis or treatment decisions - Always consult healthcare professionals
  • Emergency medical situations - Seek immediate professional medical help
  • Legal or financial advice related to SCI cases
  • Replacement for professional medical consultation
  • Clinical decision-making without physician oversight
  • Applications targeting vulnerable populations without proper safeguards
  • Commercial medical applications without appropriate medical validation and oversight

Bias, Risks, and Limitations

Medical Limitations

  • Not a substitute for medical professionals: All medical advice should be verified with qualified healthcare providers
  • Training data limitations: May not include the most recent medical research or treatments
  • Individual variation: SCI affects individuals differently; responses may not apply to all cases
  • Geographic bias: Training data may be biased toward certain healthcare systems or regions

Technical Limitations

  • Hallucination risk: Like all language models, may generate plausible-sounding but incorrect information
  • Context limitations: Limited by input context window and may not retain information across long conversations
  • Language limitations: Primarily trained on English content
  • Update lag: Cannot access real-time medical research or current events

Bias Considerations

  • Training data bias: Reflects biases present in source medical literature and online content
  • Demographic representation: May not equally represent all demographics within the SCI community
  • Healthcare access bias: May reflect biases toward certain types of healthcare systems
  • Severity bias: May be more informed about certain types or severities of SCI

Risk Mitigation

  • Always include medical disclaimers when using this model
  • Implement content filtering for harmful or dangerous advice
  • Regular evaluation by medical professionals is recommended
  • Monitor outputs for accuracy and appropriateness
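
As one possible starting point for the first two items, an application could post-process every response before showing it. This is only a sketch: the function name, the disclaimer wording, and especially the keyword list are hypothetical placeholders, and a real deployment would need filtering rules reviewed by medical professionals.

```python
DISCLAIMER = (
    "This information is for educational purposes only and does not replace "
    "professional medical advice."
)

# Hypothetical keyword list for illustration only; real terms would need to be
# chosen and reviewed by medical professionals.
URGENT_KEYWORDS = ("emergency", "can't breathe", "chest pain")


def wrap_response(question: str, model_output: str) -> str:
    """Append the disclaimer and, for urgent-sounding questions, prepend a referral."""
    text = model_output.strip()
    if any(keyword in question.lower() for keyword in URGENT_KEYWORDS):
        text = "If this is an emergency, seek immediate professional medical help.\n\n" + text
    return f"{text}\n\n{DISCLAIMER}"


print(wrap_response("How do I prevent pressure sores?", "Example model output."))
```

Keyword matching like this is deliberately crude; it illustrates where a safety layer would sit, not how robust filtering should be done.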

Recommendations

Users should be aware of the following recommendations:

For Direct Users:

  • Always verify medical information with qualified healthcare professionals
  • Use responses as educational/informational starting points, not definitive advice
  • Be aware that individual SCI experiences vary significantly
  • Seek immediate professional help for urgent medical concerns

For Developers/Implementers:

  • Implement clear medical disclaimers in any application using this model
  • Provide easy access to professional medical resources alongside model responses
  • Consider implementing content filtering for potentially harmful advice
  • Regular review by medical professionals is strongly recommended
  • Ensure compliance with relevant healthcare regulations (HIPAA, etc.)

For Healthcare Organizations:

  • Professional medical oversight is essential when implementing in clinical settings
  • Regular validation of model outputs against current medical standards
  • Integration should complement, not replace, professional medical consultation
  • Staff training on AI limitations and appropriate use cases

Training Details

Training Data

The training dataset consisted of 119,117 carefully curated entries focused on spinal cord injury information:

Domain Pretraining Data (35,779 entries):

  • Medical literature and research papers on SCI
  • Educational materials from reputable SCI organizations
  • Clinical guidelines and treatment protocols
  • Rehabilitation and therapy documentation
  • Patient education resources

Instruction Tuning Data (83,337 entries):

  • SCI-focused question-answer pairs
  • Conversational examples with appropriate medical context
  • Real-world scenarios and practical advice situations
  • Educational Q&A formatted for instruction following

All training data was filtered and curated to ensure:

  • Sources from reputable medical organizations and healthcare professionals
  • Content originally created or reviewed by medical professionals in the SCI field
  • Appropriate tone and sensitivity for SCI community
  • Removal of potentially harmful or dangerous advice
  • Proper medical disclaimers and context

Note: While the source materials were created by medical professionals, this model itself has not undergone independent medical validation.

Training Procedure

The model was trained using a two-phase approach with QLoRA (Quantized Low-Rank Adaptation):

Phase 1 - Domain Pretraining:

  • Focus: Medical terminology and SCI-specific knowledge
  • Duration: 2 epochs (~8 hours)
  • Data: 35,779 domain text entries
  • Objective: Adapt base model to SCI medical domain

Phase 2 - Instruction Tuning:

  • Focus: Conversational abilities and response formatting
  • Duration: 2 epochs (~12 hours)
  • Data: 83,337 instruction-response pairs
  • Objective: Teach appropriate response patterns and tone

Preprocessing

Training data underwent extensive preprocessing:

  • Content sourced from materials created by healthcare professionals
  • Sensitive content filtering and safety checks
  • Standardized formatting for instruction-following
  • Quality filtering to remove low-quality or inappropriate content
  • Tokenization optimization for efficient training

Training Hyperparameters

  • Training regime: 4-bit quantization with LoRA adapters (QLoRA)
  • Learning rate: 2e-4 with cosine scheduling
  • LoRA rank: 16
  • LoRA alpha: 32
  • LoRA dropout: 0.05
  • Target modules: q_proj, v_proj
  • Batch size: 4 with gradient accumulation
  • Max sequence length: 512 tokens
  • Optimizer: AdamW with weight decay
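
To make the "cosine scheduling" entry concrete, here is a minimal sketch of a cosine decay from the listed peak learning rate. It assumes no warmup and decay to zero, neither of which is stated in this card, and the total step count of 1000 is an arbitrary illustrative value.

```python
import math

PEAK_LR = 2e-4  # learning rate from the list above


def cosine_lr(step: int, total_steps: int, peak_lr: float = PEAK_LR) -> float:
    """Cosine decay from peak_lr at step 0 to 0 at total_steps (no warmup assumed)."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * progress))


# Print the schedule at a few points of a hypothetical 1000-step run.
for step in (0, 250, 500, 750, 1000):
    print(step, f"{cosine_lr(step, 1000):.2e}")
```

Under these assumptions the rate is halved at the midpoint of training and falls off fastest around that point.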

Speeds, Sizes, Times

  • Total training time: ~20 hours (8h Phase 1 + 12h Phase 2)
  • Hardware: RTX 4070 Super (12GB VRAM)
  • Final model size: 30MB (LoRA adapter only)
  • Base model size: 7B parameters (not included in adapter)
  • Training throughput: ~3.5 samples/second average
  • Memory usage: 6-7GB VRAM during training
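
The adapter size can be sanity-checked from the LoRA rank and target modules listed under Training Hyperparameters. The sketch below assumes the standard Mistral 7B dimensions (hidden size 4096, 32 layers, and a 1024-wide value projection from grouped-query attention), which come from the public base-model configuration rather than from this card, and it assumes the adapter is stored as 32-bit floats.

```python
# Mistral 7B dimensions (from the public base-model config; assumptions here,
# since this card does not list them).
HIDDEN_SIZE = 4096
NUM_LAYERS = 32
KV_DIM = 1024  # 8 key/value heads x 128 dims (grouped-query attention)

RANK = 16  # LoRA rank from the hyperparameters above


def lora_params(in_features: int, out_features: int, rank: int = RANK) -> int:
    """A LoRA pair adds A (rank x in) and B (out x rank) to one linear layer."""
    return rank * (in_features + out_features)


# q_proj maps 4096 -> 4096, v_proj maps 4096 -> 1024
per_layer = lora_params(HIDDEN_SIZE, HIDDEN_SIZE) + lora_params(HIDDEN_SIZE, KV_DIM)
total = per_layer * NUM_LAYERS

print(total)                      # 6815744 trainable parameters
print(round(total * 4 / 1e6, 1))  # 27.3 (MB if stored as 32-bit floats)
```

Roughly 27 MB of weights plus file metadata is in line with the ~30MB adapter size reported above.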

Evaluation

Testing Data, Factors & Metrics

Testing Data

The model was evaluated using:

  • Held-out test set of SCI-related questions (500 samples)
  • Manual review of response quality and appropriateness
  • Comparative analysis against general-purpose models on SCI topics
  • Assessment of domain-specific knowledge retention

Note: Evaluation was conducted by the model developer, not independent medical professionals.

Factors

Evaluation considered multiple factors:

  • Medical accuracy: Correctness of SCI-related information
  • Appropriateness: Sensitivity and tone for SCI community
  • Contextual relevance: Understanding of SCI-specific challenges
  • Safety: Avoidance of harmful or dangerous advice
  • Completeness: Comprehensive responses to complex questions

Metrics

  • Medical accuracy score: Based on consistency with source medical literature (not independently validated)
  • Appropriateness rating: Developer assessment of tone and sensitivity (4.2/5.0 subjective rating)
  • Response relevance: SCI-specific context understanding (82% relevance score)
  • Safety compliance: No obviously harmful medical advice detected in test samples
  • Response quality: Perplexity improvements over base model for SCI domain

Results

Quantitative Results:

  • 40% lower perplexity on SCI-domain text compared with the base model
  • Responses demonstrate consistency with source medical literature
  • 95% safety compliance (no obviously harmful medical advice detected)
  • 82% average relevance score for SCI-specific contexts
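
For readers unfamiliar with the perplexity figure, the sketch below shows how perplexity is usually computed from per-token log-probabilities and how a relative change between two models can be expressed. Reading the reported improvement as a relative reduction is an assumption, and the numbers in the example are made up for illustration, not results from this model.

```python
import math
from typing import Sequence


def perplexity(token_log_probs: Sequence[float]) -> float:
    """Perplexity is exp of the mean negative log-likelihood per token."""
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


def relative_improvement(base_ppl: float, tuned_ppl: float) -> float:
    """Fractional reduction in perplexity relative to the base model."""
    return (base_ppl - tuned_ppl) / base_ppl


# Made-up numbers purely for illustration; they are not results from this model.
print(relative_improvement(10.0, 6.0))  # 0.4, i.e. a 40% reduction
```

In practice the log-probabilities would come from scoring the same held-out SCI text with both the base and the fine-tuned model.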

Qualitative Results:

  • Responses demonstrate clear understanding of SCI terminology and concepts
  • Appropriate tone and sensitivity for disability community
  • Consistent inclusion of medical disclaimers
  • Good balance between being helpful and cautious about medical advice

Limitations of Evaluation:

  • Evaluation conducted by model developer, not independent medical experts
  • No formal clinical validation or testing with SCI patients
  • Results based on consistency with training sources, not independent medical verification

Environmental Impact

Training carbon emissions were estimated from energy consumption data:

  • Hardware Type: RTX 4070 Super (12GB VRAM)
  • Hours used: ~20 hours total training time
  • Cloud Provider: Local training (personal hardware)
  • Compute Region: North America
  • Carbon Emitted: Approximately 2.1 kg CO2eq (estimated based on local energy grid)

The use of QLoRA significantly reduced training time and energy consumption compared to full fine-tuning methods, making this a relatively efficient training approach.

Technical Specifications

Model Architecture and Objective

  • Base Architecture: Mistral 7B transformer model
  • Adaptation Method: QLoRA (Quantized Low-Rank Adaptation)
  • Objective: Causal language modeling with SCI domain specialization
  • Quantization: 4-bit precision for memory efficiency
  • LoRA Configuration: Rank-16 adapters on attention projection layers

Compute Infrastructure

Hardware

  • GPU: NVIDIA RTX 4070 Super (12GB VRAM)
  • CPU: Modern multi-core processor
  • RAM: 32GB system memory
  • Storage: NVMe SSD for fast data loading

Software

  • Framework: Transformers 4.36+, PEFT 0.16.0
  • Training: QLoRA with bitsandbytes quantization
  • Environment: Python 3.10+, PyTorch 2.0+, CUDA 12.1

Citation

If you use this model in your research or applications, please cite:

BibTeX:

@misc{sci_assistant_2025,
  title={SCI Assistant: A Specialized AI Assistant for Spinal Cord Injury Support},
  author={basiphobe},
  year={2025},
  howpublished={Hugging Face Model Repository},
  url={https://huggingface.co/basiphobe/sci-assistant}
}

APA: basiphobe. (2025). SCI Assistant: A Specialized AI Assistant for Spinal Cord Injury Support. Hugging Face. https://huggingface.co/basiphobe/sci-assistant

Glossary

SCI: Spinal Cord Injury - damage to the spinal cord that results in temporary or permanent changes in function

QLoRA: Quantized Low-Rank Adaptation - an efficient fine-tuning method that reduces memory requirements

Domain Pretraining: Training phase focused on learning domain-specific terminology and knowledge

Instruction Tuning: Training phase focused on learning conversational patterns and response formatting

Perplexity: A metric measuring how well a language model predicts text (lower is better)

LoRA: Low-Rank Adaptation - parameter-efficient fine-tuning technique

Model Card Authors

  • Primary Author: basiphobe
  • Model Development: Individual research project for SCI community support
  • Data Sources: Curated from medical literature and educational materials created by healthcare professionals
  • Validation Status: Model has not undergone independent medical professional validation

Model Card Contact

For questions, issues, or feedback regarding this model:

Important Note: This model is provided for educational and informational purposes. Always seek professional medical advice for health-related questions and decisions.

Framework versions

  • PEFT 0.16.0