SCI Assistant - Spinal Cord Injury Specialized AI Assistant

A specialized AI assistant fine-tuned for people with spinal cord injuries (SCI). The model is based on OpenHermes-2.5-Mistral-7B and was trained in two phases with LoRA (Low-Rank Adaptation) to provide contextually appropriate, medically informed responses for the SCI community.

Model Description

This model was fine-tuned using a two-phase training approach:

Phase 1: Domain pretraining on SCI-related medical texts and resources
Phase 2: Instruction tuning on conversational SCI-focused Q&A pairs

The model is intended to reflect the unique challenges, medical realities, and daily-life considerations of people living with spinal cord injuries.

Training Details

Base Model: teknium/OpenHermes-2.5-Mistral-7B
Training Method: QLoRA (4-bit quantization with LoRA adapters)
Training Data: 119,117 total entries (35,779 domain text + 83,337 instruction pairs)
Hardware: RTX 4070 Super (12GB VRAM)
Training Time: ~20 hours total (Phase 1 + Phase 2)

Usage

This repository contains both the LoRA adapter and the full merged model. Choose the option that works best for you:

Option 1: Use the Full Merged Model (Recommended)

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("basiphobe/sci-assistant")
tokenizer = AutoTokenizer.from_pretrained("basiphobe/sci-assistant")

# Example usage
prompt = "What are the signs of autonomic dysreflexia?"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_length=200)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)

Option 2: Use the LoRA Adapter (Smaller Download)

from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
import torch

# Load model
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

base_model = AutoModelForCausalLM.from_pretrained(
    "teknium/OpenHermes-2.5-Mistral-7B",
    quantization_config=bnb_config,
    device_map="auto"
)

model = PeftModel.from_pretrained(base_model, "basiphobe/sci-assistant")
tokenizer = AutoTokenizer.from_pretrained("basiphobe/sci-assistant")

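# Format prompt with SCI context
system_context = "You are a specialized medical assistant for people with spinal cord injuries. Your responses should always consider the unique needs, challenges, and medical realities of individuals living with SCI."

# Replace with your own question
your_question = "What are the signs of autonomic dysreflexia?"

prompt = f"{system_context}\n\n### Instruction:\n{your_question}\n\n### Response:\n"

# Generate response; inputs must be on the same device as the model,
# and do_sample=True is needed for temperature to take effect
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)

# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(response)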

Files in this Repository

Full Merged Model: Ready-to-use model files (model-*.safetensors, config.json, etc.)
LoRA Adapter: Smaller adapter files (adapter_model.safetensors, adapter_config.json)
Tokenizer: Shared tokenizer files for both options

GGUF Format Models

This repository also includes GGUF format models optimized for use with llama.cpp, Ollama, and other GGUF-compatible inference engines. These formats offer excellent performance and compatibility across different platforms.

Available GGUF Models

File	Size	Format	Use Case	RAM Required
`merged-sci-model.gguf`	14GB	F16	Maximum quality inference	~16GB
`merged-sci-model-q6_k.gguf`	5.6GB	Q6_K	High quality with good compression	~8GB
`merged-sci-model-q5_k_m.gguf`	4.8GB	Q5_K_M	Excellent quality/size balance	~7GB
`merged-sci-model-q5_k_s.gguf`	4.7GB	Q5_K_S	Good quality, slightly smaller	~7GB
`merged-sci-model-q4_k_m.gguf`	4.1GB	Q4_K_M	Balanced quality/performance	~6GB
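
As a rough aid for choosing among the files in the table, the sketch below picks the highest-quality GGUF file whose approximate RAM requirement fits a given budget. The file names and RAM figures are copied from the table; the helper name `pick_gguf` is hypothetical and not part of this repository.

```python
# File names, formats, and approximate RAM needs (GB) copied from the table above,
# ordered from highest to lowest quality.
GGUF_MODELS = [
    ("merged-sci-model.gguf", "F16", 16),
    ("merged-sci-model-q6_k.gguf", "Q6_K", 8),
    ("merged-sci-model-q5_k_m.gguf", "Q5_K_M", 7),
    ("merged-sci-model-q5_k_s.gguf", "Q5_K_S", 7),
    ("merged-sci-model-q4_k_m.gguf", "Q4_K_M", 6),
]


def pick_gguf(available_ram_gb):
    """Return the first (highest-quality) file whose RAM need fits, or None."""
    for filename, _fmt, ram_gb in GGUF_MODELS:
        if ram_gb <= available_ram_gb:
            return filename
    return None


print(pick_gguf(8))
```

The RAM column is approximate, so leaving some headroom (for example, treating an 8GB machine as a 7GB budget) is likely the safer choice.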

Usage with Ollama

1. Download the model and create a Modelfile:

# Download the Q5_K_M model (recommended balance of quality/size)
wget https://huggingface.co/basiphobe/sci-assistant/resolve/main/merged-sci-model-q5_k_m.gguf

# Create Modelfile
cat > Modelfile << 'EOF'
FROM ./merged-sci-model-q5_k_m.gguf
TEMPLATE """<|im_start|>system
You are a specialized medical assistant for people with spinal cord injuries. Your responses should always consider the unique needs, challenges, and medical realities of individuals living with SCI.<|im_end|>
<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
"""
PARAMETER stop "<|im_start|>"
PARAMETER stop "<|im_end|>"
PARAMETER temperature 0.7
PARAMETER top_p 0.9
EOF

2. Create and run the model:

ollama create sci-assistant -f Modelfile
ollama run sci-assistant "What are the signs of autonomic dysreflexia?"
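
Once the model has been created, it can also be queried from Python through Ollama's local REST API. The following is a minimal sketch, assuming the Ollama server is running at its default address (`http://localhost:11434`) and that the model was created under the name `sci-assistant` as above; the helper names `build_generate_request` and `ask` are illustrative only.

```python
import json
import urllib.request

# Assumption: Ollama is listening on its default local address.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(prompt, model="sci-assistant"):
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt):
    """Send the prompt and return the generated text (needs a running Ollama server)."""
    with urllib.request.urlopen(build_generate_request(prompt)) as resp:
        return json.loads(resp.read().decode("utf-8"))["response"]
```

With the server running, `ask("What are the signs of autonomic dysreflexia?")` should return the model's reply as a string. The system prompt and stop tokens come from the Modelfile, so they do not need to be repeated here.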

Usage with llama.cpp

1. Install and set up:

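# Clone and build llama.cpp
# (recent llama.cpp releases build with CMake and name the binary llama-cli rather than main;
# adjust the build step and the ./main commands in this section if you are on a newer checkout)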
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make

# Download model
wget https://huggingface.co/basiphobe/sci-assistant/resolve/main/merged-sci-model-q5_k_m.gguf

2. Interactive chat:

# -e makes llama.cpp interpret the \n escapes in the prefix and suffix
./main -m merged-sci-model-q5_k_m.gguf \
  --temp 0.7 \
  --repeat-penalty 1.1 \
  -c 4096 \
  -e \
  --interactive \
  --in-prefix "<|im_start|>user\n" \
  --in-suffix "<|im_end|>\n<|im_start|>assistant\n"

3. Single prompt:

# -e makes llama.cpp interpret the \n escapes in the prompt
./main -m merged-sci-model-q5_k_m.gguf \
  --temp 0.7 \
  -c 2048 \
  -e \
  -p "<|im_start|>system\nYou are a specialized medical assistant for people with spinal cord injuries.<|im_end|>\n<|im_start|>user\nWhat exercises are good for someone with paraplegia?<|im_end|>\n<|im_start|>assistant\n"

Performance Comparison

F16 Model (merged-sci-model.gguf): Maximum quality, largest memory footprint
Q6_K Model (merged-sci-model-q6_k.gguf): Near-maximum quality with 60% size reduction
Q5_K_M Model (merged-sci-model-q5_k_m.gguf): Excellent quality retention, good balance
Q5_K_S Model (merged-sci-model-q5_k_s.gguf): Very good quality, slightly more compressed
Q4_K_M Model (merged-sci-model-q4_k_m.gguf): Good quality, smallest size, recommended for resource-constrained environments

All models use the ChatML template format and support up to 32K context length.
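
For callers that assemble prompts by hand, the ChatML layout used in the Ollama template and the llama.cpp commands above can be sketched as a small helper. The function name `build_chatml_prompt` is hypothetical; the markers and the system message are taken from the examples in this card.

```python
SYSTEM_CONTEXT = (
    "You are a specialized medical assistant for people with spinal cord injuries. "
    "Your responses should always consider the unique needs, challenges, and "
    "medical realities of individuals living with SCI."
)


def build_chatml_prompt(user_message, system_message=SYSTEM_CONTEXT):
    """Wrap system and user messages in ChatML markers, ending with an open assistant turn."""
    return (
        f"<|im_start|>system\n{system_message}<|im_end|>\n"
        f"<|im_start|>user\n{user_message}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


print(build_chatml_prompt("What exercises are good for someone with paraplegia?"))
```

Generation should stop at `<|im_end|>`, which is why the Modelfile above registers it as a stop sequence.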

Intended Use

This model is designed to:

Provide SCI-specific information and guidance
Answer questions about daily life with spinal cord injuries
Offer practical advice for common SCI challenges
Support the SCI community with contextually appropriate responses

Limitations

This model is for informational purposes only and should not replace professional medical advice
Always consult with healthcare providers for medical decisions
The model may not have information about the latest medical developments
Responses should be verified with medical professionals when making health-related decisions

Direct Use

This model can be used directly for:

Educational purposes about spinal cord injuries
Providing general information and support to the SCI community
Research into specialized medical AI assistants
Personal use by individuals seeking SCI-related information

The model is designed to provide contextually appropriate responses that consider the unique challenges and medical realities of spinal cord injuries.

Downstream Use

This model can be fine-tuned further for:

Integration into healthcare applications
Specialized medical chatbots for rehabilitation centers
Educational platforms for SCI awareness and training
Research applications in medical AI
Custom applications for SCI support organizations

When used in downstream applications, implementers should:

Maintain the medical disclaimer requirements
Ensure proper supervision by medical professionals
Implement appropriate safety measures and content filtering
Validate outputs for medical accuracy in their specific use case

Out-of-Scope Use

This model should NOT be used for:

Medical diagnosis or treatment decisions - Always consult healthcare professionals
Emergency medical situations - Seek immediate professional medical help
Legal or financial advice related to SCI cases
Replacement for professional medical consultation
Clinical decision-making without physician oversight
Applications targeting vulnerable populations without proper safeguards
Commercial medical applications without appropriate medical validation and oversight

Bias, Risks, and Limitations

Medical Limitations

Not a substitute for medical professionals: All medical advice should be verified with qualified healthcare providers
Training data limitations: May not include the most recent medical research or treatments
Individual variation: SCI affects individuals differently; responses may not apply to all cases
Geographic bias: Training data may be biased toward certain healthcare systems or regions

Technical Limitations

Hallucination risk: Like all language models, may generate plausible-sounding but incorrect information
Context limitations: Limited by input context window and may not retain information across long conversations
Language limitations: Primarily trained on English content
Update lag: Cannot access real-time medical research or current events

Bias Considerations

Training data bias: Reflects biases present in source medical literature and online content
Demographic representation: May not equally represent all demographics within the SCI community
Healthcare access bias: May reflect biases toward certain types of healthcare systems
Severity bias: May be more informed about certain types or severities of SCI

Risk Mitigation

Always include medical disclaimers when using this model
Implement content filtering for harmful or dangerous advice
Regular evaluation by medical professionals is recommended
Monitor outputs for accuracy and appropriateness

Recommendations

Users should be aware of the following recommendations:

For Direct Users:

Always verify medical information with qualified healthcare professionals
Use responses as educational/informational starting points, not definitive advice
Be aware that individual SCI experiences vary significantly
Seek immediate professional help for urgent medical concerns

For Developers/Implementers:

Implement clear medical disclaimers in any application using this model
Provide easy access to professional medical resources alongside model responses
Consider implementing content filtering for potentially harmful advice
Regular review by medical professionals is strongly recommended
Ensure compliance with relevant healthcare regulations (HIPAA, etc.)
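
One possible way to act on the disclaimer and filtering recommendations for developers is a thin wrapper around model output. This is only a sketch: the disclaimer wording, the keyword list, and the function name `wrap_response` are all assumptions made for illustration, and a simple keyword match is not a substitute for reviewed safety measures.

```python
DISCLAIMER = (
    "This information is for educational purposes only and does not replace "
    "professional medical advice. Consult your healthcare provider for medical decisions."
)

# Hypothetical, non-exhaustive keyword list; a real deployment would need
# terms chosen and reviewed by medical professionals.
URGENT_TERMS = ("autonomic dysreflexia", "chest pain", "can't breathe")

URGENT_NOTICE = "If this is an emergency, seek immediate professional medical help."


def wrap_response(question, model_response):
    """Append the disclaimer; prepend an urgent-care notice if the question looks urgent."""
    parts = []
    if any(term in question.lower() for term in URGENT_TERMS):
        parts.append(URGENT_NOTICE)
    parts.append(model_response.strip())
    parts.append(DISCLAIMER)
    return "\n\n".join(parts)


print(wrap_response("What is autonomic dysreflexia?", "Example model output."))
```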

For Healthcare Organizations:

Professional medical oversight is essential when implementing in clinical settings
Regular validation of model outputs against current medical standards
Integration should complement, not replace, professional medical consultation
Staff training on AI limitations and appropriate use cases

Training Details

Training Data

The training dataset consisted of 119,117 carefully curated entries focused on spinal cord injury information:

Domain Pretraining Data (35,779 entries):

Medical literature and research papers on SCI
Educational materials from reputable SCI organizations
Clinical guidelines and treatment protocols
Rehabilitation and therapy documentation
Patient education resources

Instruction Tuning Data (83,337 entries):

SCI-focused question-answer pairs
Conversational examples with appropriate medical context
Real-world scenarios and practical advice situations
Educational Q&A formatted for instruction following

All training data was filtered and curated to ensure:

Sources from reputable medical organizations and healthcare professionals
Content originally created or reviewed by medical professionals in the SCI field
Appropriate tone and sensitivity for the SCI community
Removal of potentially harmful or dangerous advice
Proper medical disclaimers and context

Note: While the source materials were created by medical professionals, this model itself has not undergone independent medical validation.

Training Procedure

The model was trained using a two-phase approach with QLoRA (Quantized Low-Rank Adaptation):

Phase 1 - Domain Pretraining:

Focus: Medical terminology and SCI-specific knowledge
Duration: 2 epochs (~8 hours)
Data: 35,779 domain text entries
Objective: Adapt base model to SCI medical domain

Phase 2 - Instruction Tuning:

Focus: Conversational abilities and response formatting
Duration: 2 epochs (~12 hours)
Data: 83,337 instruction-response pairs
Objective: Teach appropriate response patterns and tone

Preprocessing

Training data underwent extensive preprocessing:

Content sourced from materials created by healthcare professionals
Sensitive content filtering and safety checks
Standardized formatting for instruction-following
Quality filtering to remove low-quality or inappropriate content
Tokenization optimization for efficient training

Training Hyperparameters

Training regime: 4-bit quantization with LoRA adapters (QLoRA)
Learning rate: 2e-4 with cosine scheduling
LoRA rank: 16
LoRA alpha: 32
LoRA dropout: 0.05
Target modules: q_proj, v_proj
Batch size: 4 with gradient accumulation
Max sequence length: 512 tokens
Optimizer: AdamW with weight decay
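
To give a sense of scale for these settings, the sketch below counts the trainable LoRA parameters for rank 16 on `q_proj` and `v_proj`. The layer shapes are assumptions taken from the public Mistral 7B configuration (hidden size 4096, 32 layers, 8 key/value heads of dimension 128), not from this card, and the size estimate assumes the adapter weights are stored in 32-bit floats.

```python
# Assumed Mistral 7B shapes (from the public model config, not from this card).
HIDDEN_SIZE = 4096   # input width of q_proj and v_proj
Q_OUT = 4096         # 32 attention heads x 128 head dim
V_OUT = 1024         # 8 key/value heads x 128 head dim
NUM_LAYERS = 32
RANK = 16            # LoRA rank from the list above


def lora_params(in_features, out_features, rank):
    """A LoRA pair adds A (rank x in) and B (out x rank) per adapted weight matrix."""
    return rank * in_features + out_features * rank


per_layer = lora_params(HIDDEN_SIZE, Q_OUT, RANK) + lora_params(HIDDEN_SIZE, V_OUT, RANK)
total_params = per_layer * NUM_LAYERS
size_mb_fp32 = total_params * 4 / 1e6  # 4 bytes per parameter (assumed fp32 storage)

print(total_params)            # 6815744
print(round(size_mb_fp32, 1))  # 27.3
```

Under these assumptions the adapter holds about 6.8M trainable parameters, roughly 0.1% of the 7B base model, which is in the same range as the ~30MB adapter size reported in this card.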

Speeds, Sizes, Times

Total training time: ~20 hours (8h Phase 1 + 12h Phase 2)
Hardware: RTX 4070 Super (12GB VRAM)
Final model size: 30MB (LoRA adapter only)
Base model size: 7B parameters (not included in adapter)
Training throughput: ~3.5 samples/second average
Memory usage: 6-7GB VRAM during training
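
As a rough cross-check of the throughput figure, average samples per second can be derived from the entry counts, epoch counts, and phase durations stated in this card. Because the durations are approximate, the derived rates are only expected to land near the reported average, not match it exactly.

```python
def samples_per_second(entries, epochs, hours):
    """Average throughput: total samples processed divided by wall-clock seconds."""
    return entries * epochs / (hours * 3600)


phase1 = samples_per_second(35_779, 2, 8)           # domain pretraining
phase2 = samples_per_second(83_337, 2, 12)          # instruction tuning
overall = samples_per_second(35_779 + 83_337, 2, 20)  # both phases combined

print(round(phase1, 2))   # 2.48
print(round(phase2, 2))   # 3.86
print(round(overall, 2))  # 3.31
```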

Evaluation

Testing Data, Factors & Metrics

Testing Data

The model was evaluated using:

Held-out test set of SCI-related questions (500 samples)
Manual review of response quality and appropriateness
Comparative analysis against general-purpose models on SCI topics
Assessment of domain-specific knowledge retention

Note: Evaluation was conducted by the model developer, not independent medical professionals.

Factors

Evaluation considered multiple factors:

Medical accuracy: Correctness of SCI-related information
Appropriateness: Sensitivity and tone for the SCI community
Contextual relevance: Understanding of SCI-specific challenges
Safety: Avoidance of harmful or dangerous advice
Completeness: Comprehensive responses to complex questions

Metrics

Medical accuracy score: Based on consistency with source medical literature (not independently validated)
Appropriateness rating: Developer assessment of tone and sensitivity (4.2/5.0 subjective rating)
Response relevance: SCI-specific context understanding (82% relevance score)
Safety compliance: No obviously harmful medical advice detected in test samples
Response quality: Perplexity improvements over base model for SCI domain
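
For readers unfamiliar with the perplexity metric, the sketch below shows how perplexity is computed from per-token log-probabilities and how a relative improvement over a base model can be expressed. The log-probabilities are made-up illustration values, not measurements from this model, and the function names are hypothetical.

```python
import math


def perplexity(token_log_probs):
    """Perplexity = exp of the negative mean natural-log probability per token."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def relative_improvement(base_ppl, tuned_ppl):
    """Fractional perplexity reduction of the tuned model relative to the base model."""
    return (base_ppl - tuned_ppl) / base_ppl


# Hypothetical log-probabilities, for illustration only (not measured values).
base = perplexity([math.log(0.25)] * 4)   # every token predicted with p = 0.25
tuned = perplexity([math.log(0.5)] * 4)   # every token predicted with p = 0.5

print(base, tuned, relative_improvement(base, tuned))
```

In this toy case the tuned model halves the perplexity, a 50% relative improvement; lower perplexity means the model assigns higher probability to the held-out text.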

Results

Quantitative Results:

40% improvement in SCI domain perplexity over base model
Responses demonstrate consistency with source medical literature
95% safety compliance (no obviously harmful medical advice detected)
82% average relevance score for SCI-specific contexts

Qualitative Results:

Responses demonstrate clear understanding of SCI terminology and concepts
Appropriate tone and sensitivity for the disability community
Consistent inclusion of medical disclaimers
Good balance between being helpful and cautious about medical advice

Limitations of Evaluation:

Evaluation conducted by model developer, not independent medical experts
No formal clinical validation or testing with SCI patients
Results based on consistency with training sources, not independent medical verification

Environmental Impact

Training carbon emissions were estimated from energy consumption data:

Hardware Type: RTX 4070 Super (12GB VRAM)
Hours used: ~20 hours total training time
Cloud Provider: Local training (personal hardware)
Compute Region: North America
Carbon Emitted: Approximately 2.1 kg CO2eq (estimated based on local energy grid)

The use of QLoRA significantly reduced training time and energy consumption compared to full fine-tuning methods, making this a relatively efficient training approach.

Technical Specifications

Model Architecture and Objective

Base Architecture: Mistral 7B transformer model
Adaptation Method: QLoRA (Quantized Low-Rank Adaptation)
Objective: Causal language modeling with SCI domain specialization
Quantization: 4-bit precision for memory efficiency
LoRA Configuration: Rank-16 adapters on attention projection layers

Compute Infrastructure

Hardware

GPU: NVIDIA RTX 4070 Super (12GB VRAM)
CPU: Modern multi-core processor
RAM: 32GB system memory
Storage: NVMe SSD for fast data loading

Software

Framework: Transformers 4.36+, PEFT 0.16.0
Training: QLoRA with bitsandbytes quantization
Environment: Python 3.10+, PyTorch 2.0+, CUDA 12.1

Citation

If you use this model in your research or applications, please cite:

BibTeX:

@misc{sci_assistant_2025,
  title={SCI Assistant: A Specialized AI Assistant for Spinal Cord Injury Support},
  author={basiphobe},
  year={2025},
  howpublished={Hugging Face Model Repository},
  url={https://huggingface.co/basiphobe/sci-assistant}
}

APA: basiphobe. (2025). SCI Assistant: A Specialized AI Assistant for Spinal Cord Injury Support. Hugging Face. https://huggingface.co/basiphobe/sci-assistant

Glossary

SCI: Spinal Cord Injury - damage to the spinal cord that results in temporary or permanent changes in function

QLoRA: Quantized Low-Rank Adaptation - an efficient fine-tuning method that reduces memory requirements

Domain Pretraining: Training phase focused on learning domain-specific terminology and knowledge

Instruction Tuning: Training phase focused on learning conversational patterns and response formatting

Perplexity: A metric measuring how well a language model predicts text (lower is better)

LoRA: Low-Rank Adaptation - parameter-efficient fine-tuning technique

Model Card Authors

Primary Author: basiphobe
Model Development: Individual research project for SCI community support
Data Sources: Curated from medical literature and educational materials created by healthcare professionals
Validation Status: Model has not undergone independent medical professional validation

Model Card Contact

For questions, issues, or feedback regarding this model:

Hugging Face: https://huggingface.co/basiphobe/sci-assistant
Issues: Please report issues through the Hugging Face model repository
Medical Concerns: Always consult qualified healthcare professionals

Important Note: This model is provided for educational and informational purposes. Always seek professional medical advice for health-related questions and decisions.

Framework versions

PEFT 0.16.0
