This repository contains a fine-tuned version of BiomedCLIP (specifically the PubMedBERT_256-vit_base_patch16_224 variant) built with OpenCLIP. The model recognizes and classifies various medical images (e.g., chest X-rays, histopathology slides) in a zero-shot manner. It was further adapted on a subset of medical data (e.g., from the WinterSchool/MedificsDataset) to improve performance on specific image classes.

## Model Details

- **Architecture:** Vision Transformer (ViT-B/16) image encoder + PubMedBERT-based text encoder, loaded through `open_clip`.
- **Training Objective:** CLIP-style contrastive learning to align medical text prompts with images (see the sketch after this list).
- **Fine-Tuned On:** Selected medical image-text pairs, including X-rays, histopathology images, etc.
- **Intended Use:**
  - Zero-shot classification of medical images (e.g., "This is a photo of a chest X-ray").
  - Exploratory research and educational demos showcasing multi-modal (image-text) alignment in the medical domain.
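
For readers unfamiliar with the objective, here is a minimal, illustrative sketch of the symmetric contrastive (CLIP-style) loss; it is not the exact training code used for this model:

```python
import torch
import torch.nn.functional as F

def clip_contrastive_loss(image_features, text_features, logit_scale):
    """Symmetric contrastive (InfoNCE) loss over a batch of paired image/text embeddings."""
    # Normalize embeddings so the dot product is a cosine similarity.
    image_features = F.normalize(image_features, dim=-1)
    text_features = F.normalize(text_features, dim=-1)

    # Pairwise similarities, scaled by the learned temperature.
    logits = logit_scale * image_features @ text_features.t()

    # The i-th image in the batch matches the i-th text.
    targets = torch.arange(logits.size(0), device=logits.device)
    loss_i2t = F.cross_entropy(logits, targets)
    loss_t2i = F.cross_entropy(logits.t(), targets)
    return (loss_i2t + loss_t2i) / 2
```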

## Usage

Below is a minimal Python snippet using OpenCLIP (it assumes the `torch`, `open_clip_torch`, and `pillow` packages are installed). Adjust the labels and text prompts as needed:

```python
import torch
import open_clip
from PIL import Image

# 1) Load the fine-tuned model
model, preprocess_train, preprocess_val = open_clip.create_model_and_transforms(
    "hf-hub:your-username/OpenCLIP-BiomedCLIP-Finetuned",
    pretrained=None
)
tokenizer = open_clip.get_tokenizer("hf-hub:your-username/OpenCLIP-BiomedCLIP-Finetuned")

device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)
model.eval()

# 2) Example labels
labels = [
    "chest X-ray",
    "brain MRI",
    "bone X-ray",
    "squamous cell carcinoma histopathology",
    "adenocarcinoma histopathology",
    "immunohistochemistry histopathology"
]

# 3) Load and preprocess an image
image_path = "path/to/your_image.jpg"
image = Image.open(image_path).convert("RGB")
image_tensor = preprocess_val(image).unsqueeze(0).to(device)

# 4) Create text prompts & tokenize
text_prompts = [f"This is a photo of a {label}" for label in labels]
tokens = tokenizer(text_prompts).to(device)

# 5) Forward pass
with torch.no_grad():
    image_features = model.encode_image(image_tensor)
    text_features = model.encode_text(tokens)
    # Normalize features so the dot product is a cosine similarity (standard CLIP zero-shot setup)
    image_features = image_features / image_features.norm(dim=-1, keepdim=True)
    text_features = text_features / text_features.norm(dim=-1, keepdim=True)
    logit_scale = model.logit_scale.exp()
    logits = (logit_scale * image_features @ text_features.t()).softmax(dim=-1)

# 6) Get predictions
probs = logits[0].cpu().tolist()
for label, prob in zip(labels, probs):
    print(f"{label}: {prob:.4f}")
```
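
Zero-shot results can be sensitive to prompt wording. A common refinement, shown here only as an illustrative sketch (it reuses `model`, `tokenizer`, `labels`, `device`, and `image_tensor` from the snippet above), is to average text embeddings over several prompt templates per label:

```python
# Illustrative sketch: prompt ensembling. The template strings are assumptions;
# adjust them to your own data.
templates = [
    "This is a photo of a {}",
    "A medical image showing a {}",
    "An example of a {}",
]

with torch.no_grad():
    text_embeddings = []
    for label in labels:
        tokens = tokenizer([t.format(label) for t in templates]).to(device)
        feats = model.encode_text(tokens)
        feats = feats / feats.norm(dim=-1, keepdim=True)
        text_embeddings.append(feats.mean(dim=0))  # average over templates
    text_features = torch.stack(text_embeddings)
    text_features = text_features / text_features.norm(dim=-1, keepdim=True)

    image_features = model.encode_image(image_tensor)
    image_features = image_features / image_features.norm(dim=-1, keepdim=True)
    ensembled_probs = (model.logit_scale.exp() * image_features @ text_features.t()).softmax(dim=-1)[0]

for label, prob in zip(labels, ensembled_probs.cpu().tolist()):
    print(f"{label}: {prob:.4f}")
```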

## Example Gradio App

You can also deploy a simple Gradio demo:

```python
import gradio as gr
import torch
import open_clip
from PIL import Image

model, preprocess_train, preprocess_val = open_clip.create_model_and_transforms(
    "hf-hub:your-username/OpenCLIP-BiomedCLIP-Finetuned",
    pretrained=None
)
tokenizer = open_clip.get_tokenizer("hf-hub:your-username/OpenCLIP-BiomedCLIP-Finetuned")
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)
model.eval()

labels = ["chest X-ray", "brain MRI", "histopathology", "etc."]

def classify_image(img):
    if img is None:
        return {}
    image_tensor = preprocess_val(img).unsqueeze(0).to(device)
    prompts = [f"This is a photo of a {label}" for label in labels]
    tokens = tokenizer(prompts).to(device)
    with torch.no_grad():
        image_feats = model.encode_image(image_tensor)
        text_feats = model.encode_text(tokens)
        # Normalize features before computing cosine similarities
        image_feats = image_feats / image_feats.norm(dim=-1, keepdim=True)
        text_feats = text_feats / text_feats.norm(dim=-1, keepdim=True)
        logit_scale = model.logit_scale.exp()
        logits = (logit_scale * image_feats @ text_feats.T).softmax(dim=-1)
    probs = logits.squeeze().cpu().numpy().tolist()
    return {label: float(prob) for label, prob in zip(labels, probs)}

demo = gr.Interface(fn=classify_image, inputs=gr.Image(type="pil"), outputs="label")
demo.launch()
```

## Performance

- **Accuracy:** Varies with your specific dataset. The model can classify medical images such as chest X-rays or histopathology slides, but performance depends heavily on how well the fine-tuning data covers your target classes.
- **Potential Limitations:**
  - Ultrasound, CT, MRI, or other modalities may not be recognized if they were not included in the training data.
  - The model may incorrectly label images that fall outside its known categories; a simple mitigation is sketched below.
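
One minimal way to guard against out-of-category inputs is to treat low-confidence predictions as "unknown". This is an illustrative sketch, not part of the model: it reuses `labels` and `probs` from the usage snippet above, and the threshold is an arbitrary placeholder to tune on your own validation data.

```python
# Illustrative sketch: flag low-confidence predictions as "unknown".
# The 0.5 threshold is a placeholder; tune it on validation data.
def predict_with_rejection(labels, probs, threshold=0.5):
    best_idx = max(range(len(probs)), key=lambda i: probs[i])
    if probs[best_idx] < threshold:
        return "unknown"
    return labels[best_idx]

print(predict_with_rejection(labels, probs))
```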

## Limitations & Caveats

- **Not a Medical Device:** This model is not FDA-approved or clinically validated. It is intended for research and educational purposes only.
- **Data Bias:** If the training dataset lacked certain pathologies or modalities, the model may systematically misclassify them.
- **Security:** This model uses standard PyTorch and `open_clip`. Be mindful of potential vulnerabilities when loading models or code from untrusted sources.
- **Privacy:** If you use patient data, comply with local regulations (HIPAA, GDPR, etc.).

## Citation & Acknowledgements

- **Base model:** BiomedCLIP by Microsoft
- **OpenCLIP:** GitHub – open_clip
- **Fine-tuning dataset:** WinterSchool/MedificsDataset

If you use this model in your research or demos, please cite the above works accordingly.

## License

[Specify your license here (e.g., MIT, Apache 2.0, or a custom license).]

**Note:** Always include a disclaimer that this model is not a substitute for professional medical advice and that it may not generalize to all imaging modalities or patient populations.