Add/update the quantized ONNX model files and README.md for Transformers.js v3 (#2)
- Add/update the quantized ONNX model files and README.md for Transformers.js v3 (82873170d9e4b76ccd9a5c2ac0c3b428879f72e9)
Co-authored-by: Yuichiro Tachibana <whitphx@users.noreply.huggingface.co>
- README.md +5 -5
- onnx/model_bnb4.onnx +3 -0
- onnx/model_int8.onnx +3 -0
- onnx/model_q4.onnx +3 -0
- onnx/model_q4f16.onnx +3 -0
- onnx/model_uint8.onnx +3 -0
README.md
CHANGED
@@ -18,19 +18,19 @@ https://huggingface.co/jinaai/jina-embeddings-v2-base-de with ONNX weights to be
 
 ## Usage (Transformers.js)
 
-If you haven't already, you can install the [Transformers.js](https://huggingface.co/docs/transformers.js) JavaScript library from [NPM](https://www.npmjs.com/package/@
+If you haven't already, you can install the [Transformers.js](https://huggingface.co/docs/transformers.js) JavaScript library from [NPM](https://www.npmjs.com/package/@huggingface/transformers) using:
 ```bash
-npm i @
+npm i @huggingface/transformers
 ```
 
 You can then use the model to compute embeddings, as follows:
 
 ```js
-import { pipeline, cos_sim } from '@
+import { pipeline, cos_sim } from '@huggingface/transformers';
 
 // Create a feature extraction pipeline
 const extractor = await pipeline('feature-extraction', 'Xenova/jina-embeddings-v2-base-de', {
-
+    dtype: "fp32" // Options: "fp32", "fp16", "q8", "q4"
 });
 
 // Compute sentence embeddings
@@ -51,4 +51,4 @@ console.log(score);
 
 ---
 
-Note: Having a separate repo for ONNX weights is intended to be a temporary solution until WebML gains more traction. If you would like to make your models web-ready, we recommend converting to ONNX using [🤗 Optimum](https://huggingface.co/docs/optimum/index) and structuring your repo like this one (with ONNX weights located in a subfolder named `onnx`).
+Note: Having a separate repo for ONNX weights is intended to be a temporary solution until WebML gains more traction. If you would like to make your models web-ready, we recommend converting to ONNX using [🤗 Optimum](https://huggingface.co/docs/optimum/index) and structuring your repo like this one (with ONNX weights located in a subfolder named `onnx`).
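The `dtype` option introduced in the README change chooses which ONNX weight file gets loaded. As a rough illustration only, the sketch below maps a `dtype` value to a path under `onnx/`. The helper `onnxPathForDtype` and the suffix table are assumptions made for this example: the `_int8`, `_uint8`, `_q4`, `_q4f16`, and `_bnb4` suffixes mirror the files added in this commit, while the `fp32`, `fp16`, and `q8` entries are not shown anywhere in this diff. The library's actual resolution logic may differ.

```javascript
// Assumed suffix table (illustrative; not taken from this commit's contents).
const DTYPE_SUFFIX = Object.freeze({
  fp32: '',           // assumption: plain model.onnx
  fp16: '_fp16',      // assumption
  q8: '_quantized',   // assumption
  int8: '_int8',
  uint8: '_uint8',
  q4: '_q4',
  q4f16: '_q4f16',
  bnb4: '_bnb4',
});

// Hypothetical helper: repo-relative path of the weights for a given dtype.
function onnxPathForDtype(dtype, baseName = 'model') {
  if (!Object.prototype.hasOwnProperty.call(DTYPE_SUFFIX, dtype)) {
    throw new Error(`Unsupported dtype: ${dtype}`);
  }
  return `onnx/${baseName}${DTYPE_SUFFIX[dtype]}.onnx`;
}

console.log(onnxPathForDtype('q4')); // onnx/model_q4.onnx
```

Under these assumptions, passing `dtype: "q4"` to `pipeline` would correspond to the `onnx/model_q4.onnx` file listed in this commit.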
onnx/model_bnb4.onnx
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e8ab7347d77642646bbff73027b7d294806a73d67e93d81d739ea5808c74aa2d
+size 251953322
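Each `.onnx` entry in this commit is a Git LFS pointer rather than the weights themselves: three `key value` lines giving the spec version, the SHA-256 object ID, and the size in bytes. A minimal sketch of reading such a pointer follows. The function name `parseLfsPointer` is hypothetical, and the parser only handles the three keys that appear in this diff.

```javascript
// Hypothetical helper: parse the text of a Git LFS pointer file into an object.
function parseLfsPointer(text) {
  const fields = {};
  for (const line of text.split('\n')) {
    if (line.trim() === '') continue;            // skip blank lines
    const space = line.indexOf(' ');
    if (space === -1) throw new Error(`Malformed pointer line: ${line}`);
    fields[line.slice(0, space)] = line.slice(space + 1).trim();
  }
  // The oid value has the form "<algorithm>:<hex digest>".
  const [hashAlgorithm, digest] = (fields.oid ?? '').split(':');
  if (!fields.version || !digest || !/^\d+$/.test(fields.size ?? '')) {
    throw new Error('Not a Git LFS pointer');
  }
  return { version: fields.version, hashAlgorithm, digest, sizeBytes: Number(fields.size) };
}

// Pointer contents for onnx/model_bnb4.onnx, copied from the diff above.
const pointer = parseLfsPointer([
  'version https://git-lfs.github.com/spec/v1',
  'oid sha256:e8ab7347d77642646bbff73027b7d294806a73d67e93d81d739ea5808c74aa2d',
  'size 251953322',
].join('\n'));

console.log(pointer.hashAlgorithm, pointer.sizeBytes);
```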
onnx/model_int8.onnx
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:89d0966581ca4c4fae3ec21e23bc387637e0f053a131f0322dac568527eb00ad
+size 160893546
onnx/model_q4.onnx
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f6a7e897f75576b09a2245c04070b816a917f74da6818ad1e9443970d1240010
+size 259030670
onnx/model_q4f16.onnx
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2d06c3d3a791cf1bba67fa8b4bbe205759dcf38e241c0110caf8058cbd9ee0f6
+size 157999063
onnx/model_uint8.onnx
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6966e78f28c22c71d4bec84a9e8b3310d1684400eb3e98fb85d465d53cd3c5ae
+size 160893583
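For a quick sense of scale, the `size` fields of the five pointers can be converted to megabytes and ranked. The sketch below copies the byte counts from the diff; `toMB` is a hypothetical helper that assumes 1 MB = 10^6 bytes.

```javascript
// Byte counts copied from the `size` lines of the five LFS pointers in this commit.
const sizes = {
  'onnx/model_bnb4.onnx': 251953322,
  'onnx/model_int8.onnx': 160893546,
  'onnx/model_q4.onnx': 259030670,
  'onnx/model_q4f16.onnx': 157999063,
  'onnx/model_uint8.onnx': 160893583,
};

// Hypothetical helper: decimal megabytes (1 MB = 1,000,000 bytes).
const toMB = (bytes) => bytes / 1e6;

// Sort ascending by size and print one line per file.
const ranked = Object.entries(sizes).sort(([, a], [, b]) => a - b);
for (const [file, bytes] of ranked) {
  console.log(`${file.padEnd(22)} ${toMB(bytes).toFixed(1)} MB`);
}
```

By these numbers, `model_q4f16.onnx` is the smallest file at roughly 158 MB and `model_q4.onnx` the largest at roughly 259 MB.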