---
model_name: "Qwen3-0.6B-en-law-qa"
finetuned_by: "Ahsan Ahmed Khan (Ontario)"
model_type: "Fine-tuned Causal Language Model for Legal Q&A"
base_model: "Qwen/Qwen3-0.6B"
language: "en"
finetuning_method: "LoRA (Low-Rank Adaptation)"
license: "apache-2.0"
datasets:
  - "haistudy/en_law_qa"
tags:
  - "legal"
  - "question-answering"
  - "law"
  - "instruction-tuned"
---

# Model Card for Qwen3-0.6B-en-law-qa

## Model Details

- **Developed by:** Ahsan Ahmed Khan (Ontario)
- **Base Model:** [Qwen/Qwen3-0.6B](https://huggingface.co/Qwen/Qwen3-0.6B)
- **Dataset:** [haistudy/en_law_qa](https://huggingface.co/datasets/haistudy/en_law_qa)
- **Language:** English
- **License:** Apache 2.0
- **Fine-tuning Approach:** Parameter-Efficient Fine-Tuning (LoRA)

## Model Description

A fine-tuned version of Qwen3-0.6B optimized for legal question answering, trained on 5,560 legal QA pairs covering:

- Contract law
- Intellectual property
- Criminal law
- Family law
- Environmental law

## Intended Uses

✅ Legal research assistance
✅ Legal education
✅ Explaining legal concepts

❌ Actual legal advice
❌ Handling sensitive personal legal matters

## Training Configuration

```yaml
training_parameters:
  epochs: 73  # partial training; the run was stopped early
  batch_size: 16
  gradient_accumulation_steps: 16
  learning_rate: 2e-4
  optimizer: "paged_adamw_8bit"

quantization:
  load_in_4bit: true
  bnb_4bit_quant_type: "nf4"
  bnb_4bit_compute_dtype: "bfloat16"

lora_config:
  r: 8
  lora_alpha: 32
  target_modules:
    - "q_proj"
    - "k_proj"
    - "v_proj"
    - "o_proj"
  lora_dropout: 0.05
  bias: "none"
```

A hedged code sketch showing how these settings map onto the `transformers` and `peft` APIs is given in the appendix at the end of this card.

## Usage Example

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load the base model and attach the LoRA adapter
model_name = "Qwen/Qwen3-0.6B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
base_model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.float16, device_map="auto"
)
model = PeftModel.from_pretrained(base_model, "your-username/Qwen3-0.6B-en-law-qa")

# Create prompt
question = "What are the key elements of a valid contract?"
messages = [
    {"role": "user", "content": question}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

# Generate response
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Training Data

```yaml
dataset_stats:
  samples: 5560
  format: |
    <|im_start|>user
    {Question}<|im_end|>
    <|im_start|>assistant
    {Answer}<|im_end|>
data_sources:
  - Contract law
  - Intellectual property
  - Criminal law
  - Family law
  - Environmental law
```

A sketch of how a QA pair is rendered into this format is also included in the appendix.

## Limitations

- Limited to the knowledge in the training data (2023 cutoff)
- May generate plausible but incorrect information
- Not a substitute for professional legal advice
- English-only capability

## Environmental Impact

- **Hardware:** 1 × NVIDIA T4 GPU (Google Colab)
- **CO₂ Emissions:** ≈0.8 kg (estimated for the partial training run)
- **Method:** Calculated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact)

## Contact

For questions or feedback: ahsanahmedkhan@proton.me
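
## Appendix: Training Setup Sketch

The snippet below is a minimal sketch of how the values in the Training Configuration section map onto the `transformers` and `peft` APIs. It is not the original training script: the `prepare_model_for_kbit_training` step, the output directory name, and the dataset loading call are assumptions filled in around the documented hyperparameters.

```python
import torch
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    TrainingArguments,
)
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_name = "Qwen/Qwen3-0.6B"

# Quantization settings from the "quantization" block above
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)  # assumed standard k-bit prep step

# LoRA settings from the "lora_config" block above
lora_config = LoraConfig(
    r=8,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# Hyperparameters from the "training_parameters" block above
training_args = TrainingArguments(
    output_dir="qwen3-0.6b-en-law-qa",  # hypothetical output path
    per_device_train_batch_size=16,
    gradient_accumulation_steps=16,
    learning_rate=2e-4,
    optim="paged_adamw_8bit",
    num_train_epochs=73,  # the card notes this run was stopped early
)

dataset = load_dataset("haistudy/en_law_qa", split="train")  # 5,560 QA pairs
```

A `transformers.Trainer` (or `trl.SFTTrainer`) would then consume `training_args` together with the formatted, tokenized dataset.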
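
The chat format shown in the Training Data section can be produced with a small mapping function. Again a sketch rather than the original preprocessing code: the column names `question` and `answer` are assumptions about the `haistudy/en_law_qa` schema.

```python
def format_example(example: dict) -> dict:
    """Render one QA pair into the <|im_start|>/<|im_end|> format above."""
    text = (
        "<|im_start|>user\n"
        f"{example['question']}<|im_end|>\n"
        "<|im_start|>assistant\n"
        f"{example['answer']}<|im_end|>"
    )
    return {"text": text}

# Applied over the whole dataset:
# dataset = dataset.map(format_example)
```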