---
license: mit
datasets:
- mteb/scifact
language:
- en
pipeline_tag: text-retrieval
library_name: sentence-transformers
tags:
- mteb
- text
- transformers
- text-embeddings-inference
- CSR
model-index:
- name: CSR
  results:
  - dataset:
      name: MTEB SciFact
      type: mteb/scifact
      revision: 0228b52cf27578f30900b9e5271d331663a030d7
      config: default
      split: test
      languages:
      - eng-Latn
    metrics:
    - type: ndcg@1
      value: 0.59333
    - type: ndcg@3
      value: 0.65703
    - type: ndcg@5
      value: 0.67072
    - type: ndcg@10
      value: 0.68412
    - type: ndcg@20
      value: 0.69238
    - type: ndcg@100
      value: 0.70514
    - type: ndcg@1000
      value: 0.71517
    - type: map@1
      value: 0.5675
    - type: map@3
      value: 0.63602
    - type: map@5
      value: 0.64712
    - type: map@10
      value: 0.65301
    - type: map@20
      value: 0.65552
    - type: map@100
      value: 0.65778
    - type: map@1000
      value: 0.65815
    - type: recall@1
      value: 0.5675
    - type: recall@3
      value: 0.69772
    - type: recall@5
      value: 0.73367
    - type: recall@10
      value: 0.77333
    - type: recall@20
      value: 0.80367
    - type: recall@100
      value: 0.86667
    - type: recall@1000
      value: 0.945
    - type: precision@1
      value: 0.59333
    - type: precision@3
      value: 0.25667
    - type: precision@5
      value: 0.164
    - type: precision@10
      value: 0.08667
    - type: precision@20
      value: 0.04533
    - type: precision@100
      value: 0.0099
    - type: precision@1000
      value: 0.00107
    - type: mrr@1
      value: 0.59333
    - type: mrr@3
      value: 0.64667
    - type: mrr@5
      value: 0.65333
    - type: mrr@10
      value: 0.65883
    - type: mrr@20
      value: 0.66105
    - type: mrr@100
      value: 0.66254
    - type: mrr@1000
      value: 0.66292
    - type: main_score
      value: 0.68412
    task:
      type: Retrieval
---
|
|
|
For more details, including benchmark evaluation, hardware requirements, and inference performance, please refer to our [GitHub repository](https://github.com/neilwen987/CSR_Adaptive_Rep).
|
## Usage |
|
📌 **Tip**: For NV-Embed-V2, using Transformers versions **later** than 4.47.0 may lead to performance degradation, as `model_type=bidir_mistral` in `config.json` is no longer supported.

We therefore recommend pinning `transformers==4.47.0`.
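If you want to fail fast rather than risk silent degradation, a guard along these lines can help. This is an illustrative sketch, not part of the original instructions; the error message is ours:

```python
import transformers

# Version guard (illustrative): releases after 4.47.0 drop support for
# model_type=bidir_mistral in config.json, which NV-Embed-V2 relies on.
if transformers.__version__ != "4.47.0":
    raise RuntimeError(
        f"Found transformers {transformers.__version__}; "
        "this CSR checkpoint is validated with transformers==4.47.0."
    )
```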
|
|
|
### Sentence Transformers Usage |
|
You can load this model with Sentence Transformers and evaluate it on SciFact using the snippet below:
|
```python
import mteb
from sentence_transformers import SparseEncoder

model = SparseEncoder(
    "Y-Research-Group/CSR-NV_Embed_v2-Retrieval-SciFACT",
    trust_remote_code=True,
)
model.prompts = {
    "SciFact-query": "Instruct: Given a scientific claim, retrieve documents that support or refute the claim\nQuery:"
}
tasks = mteb.get_tasks(tasks=["SciFact"])
evaluation = mteb.MTEB(tasks=tasks)
evaluation.run(
    model,
    eval_splits=["test"],
    output_folder="./results/SciFact",
    show_progress_bar=True,
    # MTEB doesn't support sparse tensors yet, so we convert to dense tensors.
    encode_kwargs={"convert_to_sparse_tensor": False, "batch_size": 8},
)
```
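Beyond benchmark evaluation, the same checkpoint can be used for ad-hoc retrieval. The sketch below is illustrative and not from the original card: the claim and documents are invented, and it assumes `encode` returns dense embeddings when `convert_to_sparse_tensor=False` (consistent with the `encode_kwargs` used above):

```python
from sentence_transformers import SparseEncoder, util

model = SparseEncoder(
    "Y-Research-Group/CSR-NV_Embed_v2-Retrieval-SciFACT",
    trust_remote_code=True,
)

# The instruction prefix mirrors the SciFact-query prompt used for evaluation.
query = (
    "Instruct: Given a scientific claim, retrieve documents that support or "
    "refute the claim\nQuery: Vitamin D supplementation reduces fracture risk."
)
docs = [
    "A randomized trial found no effect of vitamin D on fracture incidence.",
    "Self-attention lets transformers model long-range token interactions.",
]

# Encode to dense tensors so standard similarity utilities apply.
q_emb = model.encode([query], convert_to_sparse_tensor=False)
d_emb = model.encode(docs, convert_to_sparse_tensor=False)

# Rank documents by dot-product similarity (higher = more relevant).
print(util.dot_score(q_emb, d_emb))
```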
|
|
|
## Citation |
|
```bibtex
@inproceedings{wenbeyond,
  title={Beyond Matryoshka: Revisiting Sparse Coding for Adaptive Representation},
  author={Wen, Tiansheng and Wang, Yifei and Zeng, Zequn and Peng, Zhong and Su, Yudi and Liu, Xinyang and Chen, Bo and Liu, Hongwei and Jegelka, Stefanie and You, Chenyu},
  booktitle={Forty-second International Conference on Machine Learning}
}
```