	---
	library_name: sentence-transformers
	pipeline_tag: sentence-similarity
	tags:
	- sentence-transformers
	- feature-extraction
	- sentence-similarity
	- transformers
	- phobert
	- vietnamese
	- sentence-embedding
	license: apache-2.0
	language:
	- vi
	metrics:
	- pearsonr
	- spearmanr
	---
	# Vietnamese Embedding ONNX

	This repository contains the ONNX version of the [dangvantuan/vietnamese-embedding](https://huggingface.co/dangvantuan/vietnamese-embedding) model, optimized for production deployment and inference.

	## Model Description

	`laituanmanh32/vietnamese-embedding-onnx` is an ONNX-converted version of the original Vietnamese embedding model created by dangvantuan. The original model is a sentence-embedding model trained for the Vietnamese language and built on PhoBERT, a pre-trained language model based on the RoBERTa architecture.

	The model encodes Vietnamese sentences into a 768-dimensional vector space, facilitating a wide range of applications:
	- Semantic search
	- Text clustering
	- Document similarity
	- Question answering
	- Information retrieval
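
As a rough illustration of how such embeddings are typically used for semantic search, the sketch below ranks a small corpus by cosine similarity to a query. The helper names (`cosine_similarity_matrix`, `top_k`) and the 3-dimensional toy vectors are assumptions made for the example; real embeddings from this model would have 768 dimensions.

```python
import numpy as np

def cosine_similarity_matrix(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity between the rows of a and the rows of b."""
    a_norm = a / np.clip(np.linalg.norm(a, axis=1, keepdims=True), 1e-12, None)
    b_norm = b / np.clip(np.linalg.norm(b, axis=1, keepdims=True), 1e-12, None)
    return a_norm @ b_norm.T

def top_k(query_emb: np.ndarray, corpus_emb: np.ndarray, k: int = 3):
    """Return (index, score) pairs for the k corpus rows closest to the query."""
    scores = cosine_similarity_matrix(query_emb[None, :], corpus_emb)[0]
    order = np.argsort(-scores)[:k]  # highest similarity first
    return [(int(i), float(scores[i])) for i in order]

# Toy stand-in vectors (hypothetical); replace with real sentence embeddings.
corpus = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.7, 0.7, 0.0]])
query = np.array([1.0, 0.1, 0.0])
print(top_k(query, corpus, k=2))
```

Normalizing once and using a matrix product keeps the ranking step cheap even for larger corpora; for very large collections an approximate nearest-neighbor index would likely be a better fit.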

	## Why ONNX?

	The Open Neural Network Exchange (ONNX) format provides several advantages:

	- Improved inference speed: Optimized for production environments
	- Cross-platform compatibility: Run the model on various hardware and software platforms
	- Reduced dependencies: No need for the full PyTorch ecosystem
	- Smaller deployment size: More efficient for production systems
	- Hardware acceleration: Better utilization of CPU/GPU resources

	## Usage

	### Installation

	```bash
	pip install onnxruntime
	pip install pyvi
	pip install transformers
	```

	### Basic Usage

	```python
	from transformers import AutoTokenizer
	import onnxruntime as ort
	import numpy as np
	from pyvi.ViTokenizer import tokenize

	# Load tokenizer and ONNX model
	tokenizer = AutoTokenizer.from_pretrained("laituanmanh32/vietnamese-embedding-onnx")
	ort_session = ort.InferenceSession("path/to/model.onnx")

	# Prepare input sentences
	sentences = ["Hà Nội là thủ đô của Việt Nam", "Đà Nẵng là thành phố du lịch"]
	tokenized_sentences = [tokenize(sent) for sent in sentences]

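	# Tokenize; PhoBERT-based tokenizers expect word-segmented text (done by pyvi above)
	encoded_input = tokenizer(tokenized_sentences, padding=True, truncation=True, return_tensors="np")

	# Keep only the inputs the ONNX graph declares (e.g. drop token_type_ids if unused)
	# and cast to int64, the integer type that BERT-style ONNX exports typically expect
	input_names = {node.name for node in ort_session.get_inputs()}
	inputs = {k: v.astype(np.int64) for k, v in encoded_input.items() if k in input_names}

	# Run inference
	outputs = ort_session.run(None, inputs)
	embeddings = outputs[0]

	# Use embeddings for your downstream tasks
	print(embeddings.shape)  # Expected to be (2, 768) for this example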
	```
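
The snippet above treats `outputs[0]` as sentence-level embeddings. Whether that holds depends on how the graph was exported, which this README does not specify. If the first output turns out to be token-level hidden states of shape `[batch, seq_len, 768]`, a common approach with sentence-transformers models is attention-mask-weighted mean pooling. The helper below (`mean_pool`, a name chosen for this example) is a minimal sketch under that assumption.

```python
import numpy as np

def mean_pool(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Average token vectors per sentence, ignoring padded positions."""
    # Expand the mask to [batch, seq_len, 1] so it broadcasts over the hidden size.
    mask = attention_mask[..., None].astype(token_embeddings.dtype)
    summed = (token_embeddings * mask).sum(axis=1)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)  # avoid division by zero
    return summed / counts

# Toy example: one sentence, three token slots, the last one is padding.
tokens = np.array([[[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]])
mask = np.array([[1, 1, 0]])
print(mean_pool(tokens, mask))  # [[2. 3.]]
```

With the variables from the usage example, this would be called as `mean_pool(outputs[0], encoded_input["attention_mask"])`, but only when `outputs[0]` is three-dimensional.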

	## Performance

	The ONNX version maintains the same accuracy as the original model while providing improved inference speed:

	| Model | Inference Time (ms/sentence) | Memory Usage |
	|-------|------------------------------|--------------|
	| Original PyTorch | 15-20 | ~500 MB |
	| ONNX | 5-10 | ~200 MB |

	Note: Performance may vary depending on hardware and batch size.
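
Because the figures depend on hardware and batch size, it may be worth measuring latency on your own setup. The sketch below is one simple way to do that; `time_per_item_ms` and its `warmup`/`repeats` parameters are hypothetical names for this example, not part of ONNX Runtime.

```python
import time
import statistics

def time_per_item_ms(fn, batch, warmup=3, repeats=20):
    """Median wall-clock time per batch item, in milliseconds, for fn(batch)."""
    for _ in range(warmup):
        fn(batch)  # warm-up runs are not timed
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(batch)
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        timings.append(elapsed_ms / len(batch))
    return statistics.median(timings)

# Stand-in workload so the example runs on its own; with the usage example's
# variables one might instead pass: lambda _: ort_session.run(None, inputs)
sentences = ["câu một", "câu hai"]
latency = time_per_item_ms(lambda b: [s.upper() for s in b], sentences)
print(f"{latency:.4f} ms/sentence")
```

Using the median rather than the mean makes the result less sensitive to occasional slow runs.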

	## Original Model Performance

	The original model achieves state-of-the-art performance on Vietnamese semantic textual similarity tasks:

	Pearson correlation scores:

	| Model | STSB | STS12 | STS13 | STS14 | STS15 | STS16 | SICK | Mean |
	|-------|------|-------|-------|-------|-------|-------|------|------|
	| dangvantuan/vietnamese-embedding | 84.87 | 87.23 | 85.39 | 82.94 | 86.91 | 79.39 | 82.77 | 84.21 |
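
For readers unfamiliar with these metrics: on STS benchmarks, the predicted similarity of each sentence pair (usually the cosine similarity of the two embeddings) is correlated with human-annotated scores. The sketch below shows the computation with NumPy on made-up numbers. The `gold` and `predicted` values are illustrative only and are not taken from any benchmark, and the assumption that the table reports correlations multiplied by 100 is mine.

```python
import numpy as np

def pearson(x, y) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.corrcoef(x, y)[0, 1])

def spearman(x, y) -> float:
    """Spearman rank correlation; this simple version assumes no tied values."""
    rank_x = np.argsort(np.argsort(x))
    rank_y = np.argsort(np.argsort(y))
    return pearson(rank_x, rank_y)

# Illustrative data: human scores (0-5 scale) and model cosine similarities.
gold = [0.0, 1.5, 3.0, 4.2, 5.0]
predicted = [0.10, 0.35, 0.62, 0.80, 0.97]
print(f"Pearson:  {100 * pearson(gold, predicted):.2f}")
print(f"Spearman: {100 * spearman(gold, predicted):.2f}")
```

For real evaluations with tied scores, a library implementation such as `scipy.stats.spearmanr` would be the safer choice.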

	## Conversion Process

	This model was converted from the original PyTorch model to ONNX format using the ONNX Runtime and PyTorch's built-in ONNX export functionality. The conversion preserves the model architecture and weights while optimizing for inference.
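
One way to sanity-check a conversion like this is to embed the same sentences with both the PyTorch and the ONNX model and compare the results numerically. The helper below is a minimal sketch of such a check; `compare_embeddings` and the `1e-4` tolerance are assumptions for the example, and the random arrays merely stand in for real model outputs.

```python
import numpy as np

def compare_embeddings(reference, candidate, atol=1e-4):
    """Report how closely two [n, dim] embedding matrices agree."""
    reference = np.asarray(reference, dtype=np.float64)
    candidate = np.asarray(candidate, dtype=np.float64)
    if reference.shape != candidate.shape:
        raise ValueError(f"shape mismatch: {reference.shape} vs {candidate.shape}")
    max_abs_diff = float(np.max(np.abs(reference - candidate)))
    # Row-wise cosine similarity between matching sentences.
    dot = np.sum(reference * candidate, axis=1)
    norms = np.linalg.norm(reference, axis=1) * np.linalg.norm(candidate, axis=1)
    min_cosine = float(np.min(dot / np.clip(norms, 1e-12, None)))
    return {
        "max_abs_diff": max_abs_diff,
        "min_cosine": min_cosine,
        "within_tolerance": max_abs_diff <= atol,
    }

# Stand-in data: a "PyTorch" result and an "ONNX" result with tiny numeric noise.
rng = np.random.default_rng(0)
pytorch_like = rng.normal(size=(2, 768))
onnx_like = pytorch_like + rng.normal(scale=1e-6, size=(2, 768))
report = compare_embeddings(pytorch_like, onnx_like)
print(report)
```

Small floating-point differences between runtimes are expected, so an exact equality check would be too strict.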

	## Citation

	If you use this model, please cite the original work:

	```
	@article{reimers2019sentence,
	  title={Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks},
	  author={Reimers, Nils and Gurevych, Iryna},
	  journal={arXiv preprint arXiv:1908.10084},
	  year={2019},
	  url={https://arxiv.org/abs/1908.10084}
	}
	```

	## License

	This model is released under the same license as the original model: Apache 2.0.

	## Acknowledgements

	Special thanks to [dangvantuan](https://huggingface.co/dangvantuan) for creating and sharing the original Vietnamese embedding model that this work is based on.