Commit 22f67ae (verified) by numb3r3 · 1 Parent(s): e14ddde

update readme

Files changed (1): README.md (+8 −0)
README.md CHANGED
@@ -144,15 +144,23 @@ Compared to `jina-reranker-v2-base-multilingual`, `jina-reranker-m0` significant
  pip install "transformers>=4.47.3"
  ```
 
+ If you run it on a GPU that supports FlashAttention-2 (as of 2024.9.12, that means Ampere, Ada, or Hopper GPUs, e.g., A100, RTX 3090, RTX 4090, H100), install it with:
+
+ ```bash
+ pip install flash-attn --no-build-isolation
+ ```
+
  And then use the following code snippet to load the model:
 
  ```python
  from transformers import AutoModel
 
+ # comment out the flash_attention_2 line if you don't have a compatible GPU
  model = AutoModel.from_pretrained(
      'jinaai/jina-reranker-m0',
      torch_dtype="auto",
      trust_remote_code=True,
+     attn_implementation="flash_attention_2"
  )
 
  model.to('cuda')  # or 'cpu' if no GPU is available
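The added `attn_implementation="flash_attention_2"` argument raises an error at load time if flash-attn is not actually installed, which is why the diff tells incompatible-GPU users to comment the line out. A minimal sketch of doing that selection automatically instead (the helper name `pick_attn_implementation` is my own, not part of this repo):

```python
import importlib.util


def pick_attn_implementation() -> str:
    """Return 'flash_attention_2' only when the flash-attn package is
    importable; otherwise fall back to the default 'eager' attention."""
    if importlib.util.find_spec("flash_attn") is not None:
        return "flash_attention_2"
    return "eager"


# Usage sketch (assumes transformers >= 4.47.3, as in the README above):
# model = AutoModel.from_pretrained(
#     'jinaai/jina-reranker-m0',
#     torch_dtype="auto",
#     trust_remote_code=True,
#     attn_implementation=pick_attn_implementation(),
# )
```

This keeps one code path for both GPU and CPU setups instead of asking the user to edit the snippet by hand.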