---
library_name: sentence-transformers
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- feature-extraction
- sentence-similarity
- transformers
- phobert
- vietnamese
- sentence-embedding
license: apache-2.0
language:
- vi
metrics:
- pearsonr
- spearmanr
---

# Vietnamese Embedding ONNX

This repository contains the ONNX version of the [dangvantuan/vietnamese-embedding](https://huggingface.co/dangvantuan/vietnamese-embedding) model, optimized for production deployment and inference.

## Model Description

`laituanmanh32/vietnamese-embedding-onnx` is an ONNX conversion of the original Vietnamese embedding model created by dangvantuan. The original is a sentence-embedding model trained specifically for Vietnamese and built on PhoBERT, a pre-trained language model based on the RoBERTa architecture.

The model encodes Vietnamese sentences into a 768-dimensional vector space, which supports applications such as:

- Semantic search
- Text clustering
- Document similarity
- Question answering
- Information retrieval
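
As a rough illustration of the semantic search use case, the sketch below ranks a small corpus against a query by cosine similarity. The helper name `cosine_top_k` and the toy 3-dimensional vectors are assumptions made for this example; real inputs would be the 768-dimensional embeddings produced by the model.

```python
import numpy as np

def cosine_top_k(query_vec, corpus_vecs, k=3):
    """Return (index, score) pairs for the k corpus vectors most similar to the query."""
    query = np.asarray(query_vec, dtype=np.float32)
    corpus = np.asarray(corpus_vecs, dtype=np.float32)
    # L2-normalize so that the dot product equals cosine similarity.
    query = query / np.linalg.norm(query)
    corpus = corpus / np.linalg.norm(corpus, axis=1, keepdims=True)
    scores = corpus @ query
    # Sort by descending similarity and keep the top k.
    top = np.argsort(-scores)[:k]
    return [(int(i), float(scores[i])) for i in top]

# Toy 3-dimensional vectors stand in for real 768-dimensional embeddings.
corpus = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.7, 0.7, 0.0]])
query = np.array([1.0, 0.1, 0.0])
results = cosine_top_k(query, corpus, k=2)
print(results)
```

The same normalized dot product can serve as the similarity measure for clustering or document similarity.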

## Why ONNX?

The Open Neural Network Exchange (ONNX) format provides several advantages:

- **Improved inference speed**: The exported graph can be optimized for production inference
- **Cross-platform compatibility**: Run the model on various hardware and software platforms
- **Reduced dependencies**: Inference does not require the full PyTorch ecosystem
- **Smaller deployment size**: A lighter footprint for production systems
- **Hardware acceleration**: Better utilization of CPU/GPU resources

## Usage

### Installation

```bash
pip install onnxruntime
pip install pyvi
pip install transformers
```

### Basic Usage

```python
from transformers import AutoTokenizer
import onnxruntime as ort
import numpy as np
from pyvi.ViTokenizer import tokenize

# Load the tokenizer and the ONNX model
tokenizer = AutoTokenizer.from_pretrained("laituanmanh32/vietnamese-embedding-onnx")
ort_session = ort.InferenceSession("path/to/model.onnx")  # placeholder: local path to the .onnx file

# Prepare input sentences. PhoBERT expects word-segmented Vietnamese text,
# so each sentence is segmented with pyvi before tokenization.
sentences = ["Hà Nội là thủ đô của Việt Nam", "Đà Nẵng là thành phố du lịch"]
tokenized_sentences = [tokenize(sent) for sent in sentences]

# Tokenize into NumPy arrays
encoded_input = tokenizer(tokenized_sentences, padding=True, truncation=True, return_tensors="np")

# Pass only the inputs the ONNX graph declares, cast to int64
# (the integer type exported transformer graphs typically expect)
input_names = {node.name for node in ort_session.get_inputs()}
inputs = {k: v.astype(np.int64) for k, v in encoded_input.items() if k in input_names}

# Run inference
outputs = ort_session.run(None, inputs)
embeddings = outputs[0]

# Use the embeddings for your downstream tasks
# Expected shape is (2, 768) here if the graph returns pooled sentence embeddings;
# a 3-D (batch, tokens, 768) output would still need pooling.
print(embeddings.shape)
```
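
Whether the exported graph already pools token vectors into one vector per sentence is not stated here. If `outputs[0]` turns out to be token-level, with shape `(batch, tokens, 768)`, one common approach for sentence-embedding models is attention-mask-aware mean pooling. The following is a minimal sketch under that assumption; `mean_pool` is a hypothetical helper, and the toy arrays stand in for real model output.

```python
import numpy as np

def mean_pool(token_embeddings, attention_mask):
    """Average token vectors per sentence, ignoring padding positions."""
    mask = attention_mask[..., None].astype(token_embeddings.dtype)  # (batch, tokens, 1)
    summed = (token_embeddings * mask).sum(axis=1)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)  # guard against division by zero
    return summed / counts

# Toy input: 2 sentences, 3 token positions, 4 dimensions.
# The last position of the second sentence is padding.
token_embeddings = np.arange(24, dtype=np.float32).reshape(2, 3, 4)
attention_mask = np.array([[1, 1, 1], [1, 1, 0]])
sentence_embeddings = mean_pool(token_embeddings, attention_mask)
print(sentence_embeddings.shape)
```

With real data, the `attention_mask` returned by the tokenizer would be passed alongside the token-level output.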

## Performance

The ONNX version maintains the same accuracy as the original model while providing improved inference speed:

| Model | Inference Time (ms/sentence) | Memory Usage |
|-------|------------------------------|--------------|
| Original PyTorch | 15-20 | ~500 MB |
| ONNX | 5-10 | ~200 MB |

*Note: Performance may vary depending on hardware and batch size.*
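
Because these figures depend on hardware and batch size, it can be worth measuring latency on the target machine. The sketch below shows one way to do that with the standard library; `time_per_call_ms` and `dummy_workload` are hypothetical names, and the dummy workload stands in for an actual ONNX Runtime or PyTorch inference call.

```python
import statistics
import time

def time_per_call_ms(fn, warmup=3, repeats=20):
    """Return the median wall-clock time of fn() in milliseconds."""
    for _ in range(warmup):
        fn()  # warm-up runs are excluded from the measurement
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(timings)

# Stand-in workload; in practice this would wrap the inference call.
def dummy_workload():
    return sum(i * i for i in range(10_000))

median_ms = time_per_call_ms(dummy_workload)
print(f"median: {median_ms:.3f} ms per call")
```

To obtain a per-sentence figure, divide the per-call time by the number of sentences in the batch.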

## Original Model Performance

The original model achieves state-of-the-art performance on Vietnamese semantic textual similarity tasks:

**Pearson score**

| Model | STSB | STS12 | STS13 | STS14 | STS15 | STS16 | SICK | Mean |
|-------|------|-------|-------|-------|-------|-------|------|------|
| dangvantuan/vietnamese-embedding | 84.87 | 87.23 | 85.39 | 82.94 | 86.91 | 79.39 | 82.77 | 84.21 |
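
The values in the table appear to be Pearson correlation coefficients multiplied by 100, as is conventional for STS results. As a sketch of how such a score is typically computed, the example below correlates predicted similarities with gold ratings. The helper `pearson_score` is hypothetical, and the numbers are made up for illustration; they are not benchmark data.

```python
import numpy as np

def pearson_score(predicted, gold):
    """Pearson correlation between predicted and gold scores, multiplied by 100."""
    return float(np.corrcoef(predicted, gold)[0, 1] * 100.0)

# Made-up values for illustration only.
predicted_similarity = np.array([0.91, 0.35, 0.62, 0.10])  # e.g. cosine similarities
gold_score = np.array([4.8, 2.0, 3.1, 0.5])                # e.g. human ratings
score = pearson_score(predicted_similarity, gold_score)
print(f"{score:.2f}")
```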

## Conversion Process

This model was converted from the original PyTorch model to ONNX format using PyTorch's built-in ONNX export functionality and ONNX Runtime. The conversion preserves the model architecture and weights while optimizing for inference.
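
One way to check that a conversion preserved the model's behavior is to embed the same sentences with both versions and compare the results numerically. The sketch below illustrates the comparison step only; `max_abs_diff` is a hypothetical helper, the toy arrays stand in for real PyTorch and ONNX Runtime outputs, and the tolerance is an assumption rather than a documented value.

```python
import numpy as np

def max_abs_diff(reference, candidate):
    """Largest element-wise absolute difference between two embedding arrays."""
    reference = np.asarray(reference, dtype=np.float64)
    candidate = np.asarray(candidate, dtype=np.float64)
    if reference.shape != candidate.shape:
        raise ValueError(f"shape mismatch: {reference.shape} vs {candidate.shape}")
    return float(np.abs(reference - candidate).max())

# Toy arrays stand in for PyTorch and ONNX Runtime embeddings of the same sentences.
pytorch_embeddings = np.array([[0.10, -0.20, 0.30], [0.05, 0.00, -0.40]])
onnx_embeddings = pytorch_embeddings + 1e-6  # simulated numerical noise
diff = max_abs_diff(pytorch_embeddings, onnx_embeddings)
outputs_match = diff < 1e-4  # assumed tolerance; pick one suited to your use case
print(diff, outputs_match)
```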

## Citation

If you use this model, please cite the original work:

```
@article{reimers2019sentence,
  title={Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks},
  author={Reimers, Nils and Gurevych, Iryna},
  journal={arXiv preprint arXiv:1908.10084},
  url={https://arxiv.org/abs/1908.10084},
  year={2019}
}
```

## License

This model is released under the same license as the original model: Apache 2.0.

## Acknowledgements

Special thanks to [dangvantuan](https://huggingface.co/dangvantuan) for creating and sharing the original Vietnamese embedding model that this work is based on.