onisj committed on
Commit
903b224
·
verified ·
1 Parent(s): 09a9ec5

Update README.md

  1. README.md +170 -3
README.md CHANGED
@@ -1,3 +1,170 @@
1
- ---
2
- license: mit
3
- ---
1
+ ---
2
+ license: mit
3
+ language:
4
+ - en
5
+ tags:
6
+ - intent-classification
7
+ - mental-health
8
+ - transformer
9
+ - conversational-ai
10
+ pipeline_tag: text-classification
11
+ base_model: distilbert-base-uncased
12
+ ---
13
+
14
+ # 🧠 Intent Classifier (MindPadi)
15
+
16
+ The `intent_classifier` is a transformer-based text classification model trained to detect **user intents** in a mental health support setting. It lets the MindPadi assistant route each conversation to the appropriate module (emotional support, scheduling, reflection, or journal analysis) based on the user’s message.
17
+
18
+
19
+
20
+ ## 📝 Model Overview
21
+
22
+ - **Model Architecture:** DistilBERT (uncased) + classification head
23
+ - **Task:** Intent Classification
24
+ - **Classes:** Over 20 intent categories (e.g., `vent`, `gratitude`, `help_request`, `journal_analysis`)
25
+ - **Model Size:** ~66M parameters
26
+ - **Files:**
27
+   - `config.json`
28
+   - `pytorch_model.bin` or `model.safetensors`
29
+   - `tokenizer_config.json`, `vocab.txt`, `tokenizer.json`
30
+   - `checkpoint-*/` (optional training checkpoints)
31
+
32
+
33
+
34
+ ## ✅ Intended Use
35
+
36
+ ### ✔️ Use Cases
37
+ - Detecting user intent in MindPadi mental health conversations
38
+ - Enabling context-specific dialogue flows
39
+ - Assisting with journal entry triage and tagging
40
+ - Triggering therapy-related tools (e.g., emotion check-ins, PubMed summaries)
41
+
42
+ ### 🚫 Not Intended For
43
+ - Multilingual intent classification (English-only)
44
+ - Legal or medical diagnosis tasks
45
+ - Multi-label classification (currently single-label per input)
46
+
47
+
48
+
49
+ ## 💡 Example Intents Detected
50
+
51
+ | Intent | Description |
52
+ |--------------------|-------------------------------------------------------|
53
+ | `vent` | User expressing frustration or emotion freely |
54
+ | `help_request` | Seeking mental health support |
55
+ | `schedule_session` | Booking a therapy check-in |
56
+ | `gratitude` | Showing appreciation for support |
57
+ | `journal_analysis` | Submitting a journal entry for AI feedback |
58
+ | `reflection` | Talking about personal growth or setbacks |
59
+ | `not_sure` | Unsure or unclear message from user |
60
+
61
+
62
+
63
+ ## 🛠️ Training Details
64
+
65
+ - **Base Model:** `distilbert-base-uncased`
66
+ - **Dataset:** Curated and annotated conversations (`training/datasets/finetuned/intents/`)
67
+ - **Script:** `training/train_intent_classifier.py`
68
+ - **Preprocessing:**
69
+   - Text normalization (lowercasing, punctuation removal)
70
+   - Label encoding
71
+ - **Loss:** CrossEntropyLoss
72
+ - **Metrics:** Accuracy, F1-score
73
+ - **Tokenizer:** WordPiece (DistilBERT tokenizer)
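
The preprocessing steps listed above (lowercasing, punctuation removal, label encoding) can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the contents of `training/train_intent_classifier.py`; the helper names `normalize_text` and `build_label_maps` and the sample sentences are hypothetical.

```python
import string

# Translation table that deletes every ASCII punctuation character.
# Assumption: typographic characters such as curly quotes are left untouched.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def normalize_text(text: str) -> str:
    """Lowercase, drop ASCII punctuation, and collapse runs of whitespace."""
    cleaned = text.lower().translate(_PUNCT_TABLE)
    return " ".join(cleaned.split())


def build_label_maps(labels):
    """Give each distinct label an integer ID, assigned in alphabetical order."""
    label2id = {label: idx for idx, label in enumerate(sorted(set(labels)))}
    id2label = {idx: label for label, idx in label2id.items()}
    return label2id, id2label


# Hypothetical training samples, for illustration only.
samples = [
    ("I just need to VENT!!!", "vent"),
    ("Can I book a session?", "schedule_session"),
    ("Thank you, that helped.", "gratitude"),
]

texts = [normalize_text(text) for text, _ in samples]
label2id, id2label = build_label_maps(label for _, label in samples)
encoded = [label2id[label] for _, label in samples]
```

Assigning IDs in sorted order keeps the mapping reproducible between runs, which matters because the classification head only ever sees the integers.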
74
+
75
+
76
+
77
+ ## 📊 Evaluation
78
+
79
+ | Metric | Score |
80
+ |-----------|-------------|
81
+ | Accuracy | 91.3% |
82
+ | F1-score | 89.8% |
83
+ | Recall@3 | 97.1% |
84
+ | Precision | 88.4% |
85
+
86
+ Evaluation was performed on a held-out validation split of the MindPadi intent dataset.
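
For readers who want to compute comparable numbers on their own split, here is a small standard-library sketch of the four metrics. Two assumptions are made that the card does not confirm: precision and F1 are macro-averaged, and Recall@3 means the share of examples whose true intent appears among the three highest-ranked predictions. The toy labels below are illustrative and unrelated to the reported scores.

```python
def accuracy(y_true, y_pred):
    """Fraction of examples where the top prediction equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_prf(y_true, y_pred):
    """Macro-averaged precision, recall, and F1 over every label that occurs."""
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


def recall_at_k(y_true, ranked_preds, k=3):
    """Fraction of examples whose true label is within the top-k ranked labels."""
    hits = sum(t in ranked[:k] for t, ranked in zip(y_true, ranked_preds))
    return hits / len(y_true)


# Toy data: one ranked list of candidate intents per example.
y_true = ["vent", "help_request", "gratitude", "vent"]
ranked = [
    ["vent", "help_request", "reflection"],
    ["vent", "help_request", "not_sure"],
    ["gratitude", "vent", "reflection"],
    ["reflection", "not_sure", "vent"],
]
y_pred = [candidates[0] for candidates in ranked]

acc = accuracy(y_true, y_pred)
macro_precision, macro_recall, macro_f1 = macro_prf(y_true, y_pred)
top3 = recall_at_k(y_true, ranked, k=3)
```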
87
+
88
+
89
+
90
+ ## 🔍 Example Usage
91
+
92
+ ```python
93
+ from transformers import AutoTokenizer, AutoModelForSequenceClassification
94
+ import torch
95
+
96
+ model = AutoModelForSequenceClassification.from_pretrained("mindpadi/intent_classifier")
97
+ tokenizer = AutoTokenizer.from_pretrained("mindpadi/intent_classifier")
98
+
99
+ text = "I’m struggling with my emotions today"
100
+ inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=256)
101
+ outputs = model(**inputs)
102
+
103
+ predicted_class = torch.argmax(outputs.logits, dim=1).item()
104
+ print("Predicted intent ID:", predicted_class)
105
+ ```
106
+
107
+ To map the predicted intent ID to its label, load the label encoder:
108
+
109
+ ```python
110
+ from joblib import load
111
+ label_encoder = load("intent_encoder/label_encoder.joblib")
112
+ print("Predicted intent:", label_encoder.inverse_transform([predicted_class])[0])
113
+ ```
114
+
115
+
116
+ ## 🔌 Inference Endpoint Example
117
+
118
+ ```python
119
+ import requests
120
+
121
+ API_URL = "https://api-inference.huggingface.co/models/mindpadi/intent_classifier"
122
+ headers = {"Authorization": "Bearer <your-api-token>"}
123
+ payload = {"inputs": "Can I book a mental health session?"}
124
+
125
+ response = requests.post(API_URL, headers=headers, json=payload)
126
+ print(response.json())
127
+ ```
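
The JSON returned by the endpoint still has to be reduced to a single intent. The sketch below assumes the usual text-classification response shape, a list holding one list of `{"label", "score"}` objects per input, and assumes that failures arrive as a dictionary with an `error` key; neither is confirmed by this card. `top_intent` is a hypothetical helper, and the labels may appear as `LABEL_0`, `LABEL_1`, and so on if the model config does not define `id2label`.

```python
def top_intent(response_json):
    """Return (label, score) for the best prediction, or None if unavailable."""
    # Assumption: error payloads are dictionaries such as {"error": "..."}.
    if isinstance(response_json, dict):
        return None
    candidates = response_json
    # Assumption: a single input yields [[{...}, ...]]; a flat list is tolerated.
    if candidates and isinstance(candidates[0], list):
        candidates = candidates[0]
    if not candidates:
        return None
    best = max(candidates, key=lambda item: item["score"])
    return best["label"], best["score"]


# Hypothetical response body, shaped like the assumption above.
sample_response = [[
    {"label": "schedule_session", "score": 0.93},
    {"label": "help_request", "score": 0.05},
]]

result = top_intent(sample_response)
```

Returning `None` rather than raising lets the caller decide whether to retry or hand the message to a fallback path.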
128
+
129
+
130
+
131
+ ## ⚠️ Limitations
132
+
133
+ * Not robust to long-form texts (>256 tokens); truncate or summarize the input.
134
+ * May confuse overlapping intents such as `vent` and `help_request`.
135
+ * False positives are possible for vague or sarcastic inputs.
136
+ * Requires pairing with a fallback model (`intent_fallback`) for reliability.
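
One way to decide when to hand a message to the fallback model is to check how confident the classifier is. The sketch below is an assumption about how such a gate could look, not MindPadi's actual logic: it converts logits to probabilities and flags a prediction when the top probability is low or when the top two intents are nearly tied (the `vent` versus `help_request` case). The thresholds `0.6` and `0.15` are illustrative values, not tuned ones.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a plain list of floats."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def needs_fallback(logits, min_confidence=0.6, min_margin=0.15):
    """Return True when the prediction is too uncertain to act on directly."""
    probs = sorted(softmax(logits), reverse=True)
    top = probs[0]
    runner_up = probs[1] if len(probs) > 1 else 0.0
    # Low absolute confidence, or a near tie between the two best intents.
    return top < min_confidence or (top - runner_up) < min_margin


confident = needs_fallback([4.0, 0.5, 0.2])   # one intent clearly dominates
ambiguous = needs_fallback([2.0, 1.9, 0.1])   # two intents are almost tied
```

With a PyTorch model, the logits for one input could be passed in as `outputs.logits[0].tolist()`.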
137
+
138
+
139
+
140
+ ## 🔐 Ethical Considerations
141
+
142
+ * This model is for **supportive routing**, not clinical diagnosis
143
+ * Use with user consent and proper data privacy safeguards
144
+ * Intent predictions should not override human judgment in sensitive contexts
145
+
146
+
147
+
148
+ ## 📂 Integration Points
149
+
150
+ | Location | Functionality |
151
+ | ---------------------------------- | --------------------------------------------- |
152
+ | `app/chatbot/intent_classifier.py` | Main classifier logic |
153
+ | `app/chatbot/intent_router.py` | Routes based on predicted intent |
154
+ | `app/utils/embedding_search.py` | Uses `intent_encoder` for similarity fallback |
155
+ | `data/processed_intents.json` | Annotated intent samples |
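
To illustrate what routing on a predicted intent can look like, here is a minimal dispatch-table sketch. It does not reproduce `app/chatbot/intent_router.py`; the handler functions and their return strings are hypothetical, and only the intent names come from the table of example intents.

```python
from typing import Callable, Dict


# Hypothetical handlers; real modules would call into the assistant's services.
def handle_support(message: str) -> str:
    return f"[support] {message}"


def handle_scheduling(message: str) -> str:
    return f"[scheduling] {message}"


def handle_journal(message: str) -> str:
    return f"[journal] {message}"


def handle_unclear(message: str) -> str:
    return f"[clarify] {message}"


# Intent label -> handler. Unlisted intents fall through to handle_unclear.
ROUTES: Dict[str, Callable[[str], str]] = {
    "vent": handle_support,
    "help_request": handle_support,
    "schedule_session": handle_scheduling,
    "journal_analysis": handle_journal,
    "not_sure": handle_unclear,
}


def route(intent: str, message: str) -> str:
    """Dispatch a message to the handler registered for its predicted intent."""
    handler = ROUTES.get(intent, handle_unclear)
    return handler(message)


reply = route("schedule_session", "Can I book a mental health session?")
```

A dictionary keeps the mapping declarative, so adding an intent means adding one entry rather than another branch.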
156
+
157
+
158
+
159
+ ## 📜 License
160
+
161
+ MIT License – freely available for commercial and non-commercial use.
162
+
163
+
164
+ ## 📬 Contact
165
+
166
+ * **Team:** MindPadi AI Developers
167
+ * **Profile:** [https://huggingface.co/mindpadi](https://huggingface.co/mindpadi)
168
+ * **Email:** [you@example.com](mailto:you@example.com)
169
+
170
+ *Last updated: May 2025*