Marquis committed
Commit f9f5680 · verified · 1 Parent(s): bec9664

Update README.md

Files changed (1): README.md (+7 -7)
README.md CHANGED
@@ -3,18 +3,18 @@ license: mit
 ---
 # MM-R5: MultiModal Reasoning-Enhanced ReRanker via Reinforcement Learning for Document Retrieval
 
-[![arXiv](https://img.shields.io/badge/arXiv-2506.12494-b31b1b.svg)](https://arxiv.org/abs/2506.12494)
-[![Hugging Face](https://img.shields.io/badge/huggingface-MMR5-yellow.svg)](https://huggingface.co/i2vec/MM-R5)
-[![Github](https://img.shields.io/badge/Github-MMR5-black.svg)](https://github.com/i2vec/MM-R5)
+[![arXiv](https://img.shields.io/badge/arXiv-2506.12364-b31b1b.svg)](https://arxiv.org/abs/2506.12364)
+[![Hugging Face](https://img.shields.io/badge/huggingface-MM--R5-yellow.svg)](https://huggingface.co/i2vec/MM-R5)
+[![Github](https://img.shields.io/badge/Github-MM--R5-black.svg)](https://github.com/i2vec/MM-R5)
 ****
-# 📒 News
+## 📒 News
 - **2025-06-20**: Our model [MM-R5](https://huggingface.co/i2vec/MM-R5) is now publicly available on Hugging Face!
 - **2025-06-14**: Our publication [MM-R5: MultiModal Reasoning-Enhanced ReRanker via Reinforcement Learning for Document Retrieval](https://arxiv.org/abs/2506.12364) is now available!
 
-# 📖 Introduction
+## 📖 Introduction
 We introduce **MM-R5**, a novel *Multimodal Reasoning-Enhanced ReRanker* designed to improve document retrieval in complex, multimodal settings. Unlike traditional rerankers that treat candidates as isolated inputs, MM-R5 incorporates explicit chain-of-thought reasoning across textual, visual, and structural modalities to better assess relevance. The model follows a two-stage training paradigm: during the supervised fine-tuning (SFT) stage, it is trained to produce structured reasoning chains over multimodal content. To support this, we design a principled data construction method that generates high-quality reasoning traces aligned with retrieval intent, enabling the model to learn interpretable and effective decision paths. In the second stage, reinforcement learning is applied to further optimize the reranking performance using carefully designed reward functions, including task-specific ranking accuracy and output format validity. This combination of reasoning supervision and reward-driven optimization allows MM-R5 to deliver both accurate and interpretable reranking decisions. Experiments on the MMDocIR benchmark show that MM-R5 achieves state-of-the-art top-k retrieval performance, outperforming strong unimodal and large-scale multimodal baselines in complex document understanding scenarios.
 
-# 🚀 Getting Started
+## 🚀 Getting Started
 You can get the reranker from [here](https://github.com/i2vec/MM-R5/blob/main/examples/reranker.py)
 ```python
 from reranker import QueryReranker
@@ -36,7 +36,7 @@ print(f"Query: {query}")
 print(f"Reranked order: {predicted_order}")
 ```
 
-# 🖋️ Citation
+## 🖋️ Citation
 If you use MM-R5 in your research, please cite our project:
 ```bibtex
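The Getting Started snippet in the diff is truncated by the hunk boundaries: only the import and the two final print statements survive. As an illustration of the call pattern those lines imply, here is a minimal, self-contained sketch. The `QueryReranker` below is a hypothetical toy stand-in for the real class in `examples/reranker.py`; its `rerank` signature and the keyword-overlap scoring are assumptions made for illustration, not MM-R5's documented API.

```python
# Hypothetical stand-in for the QueryReranker in examples/reranker.py of the
# MM-R5 repo. Only the import name and the two print statements appear in the
# diff above; this toy class and its rerank() signature are assumptions.
# It scores candidates by naive query-term overlap, not the multimodal model.
class QueryReranker:
    def rerank(self, query: str, candidates: list[str]) -> list[int]:
        """Return candidate indices ordered from most to least relevant."""
        terms = set(query.lower().split())
        scores = [len(terms & set(c.lower().split())) for c in candidates]
        # Best score first; sort is stable, so ties keep their input order.
        return sorted(range(len(candidates)), key=lambda i: -scores[i])


query = "reinforcement learning for document retrieval"
candidates = [
    "a survey of image compression",
    "document retrieval with reinforcement learning",
    "reranking candidates for retrieval",
]
predicted_order = QueryReranker().rerank(query, candidates)
print(f"Query: {query}")
print(f"Reranked order: {predicted_order}")  # [1, 2, 0] for this toy scorer
```

With the real reranker from the repo, the same two print lines report the model's predicted ordering over multimodal document-page candidates rather than this keyword heuristic.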