---
license: apache-2.0
language:
- en
base_model:
- MBZUAI/LLaVA-Phi-3-mini-4k-instruct
pipeline_tag: image-segmentation
---

<div align="center">
<br>
<h3>One Token to Seg Them All: Language Instructed Reasoning Segmentation in Videos</h3>

[Zechen Bai](https://www.baizechen.site/) <sup>1</sup>
[Tong He](https://hetong007.github.io/) <sup>2</sup>
[Haiyang Mei](https://mhaiyang.github.io/) <sup>1</sup>
[Pichao Wang](https://wangpichao.github.io/) <sup>2</sup>
[Ziteng Gao](https://sebgao.github.io/) <sup>1</sup>
[Joya Chen](https://chenjoya.github.io/) <sup>1</sup>
[Lei Liu](https://openreview.net/profile?id=~liulei2) <sup>2</sup>
[Zheng Zhang](https://scholar.google.com/citations?user=k0KiE4wAAAAJ&hl=en) <sup>2</sup>
[Mike Zheng Shou](https://sites.google.com/view/showlab) <sup>1</sup>

NeurIPS 2024

<sup>1</sup> [Show Lab, National University of Singapore](https://sites.google.com/view/showlab/home?authuser=0) <sup>2</sup> Amazon

[arXiv:2409.19603](https://arxiv.org/abs/2409.19603)

The code is available at: https://github.com/showlab/VideoLISA