Valentin Buchner committed on
Commit
3bc13ec
·
1 Parent(s): 1e5dbee

put config at bottom of readme

Files changed (3)
  1. Leaderboard.md +0 -163
  2. README.md +16 -12
  3. app.py +1 -1
Leaderboard.md DELETED
@@ -1,163 +0,0 @@
- # 🔥🏅️GenCeption Leaderboard 🏅️🔥
-
- Evaluated MLLMs: [ChatGPT-4V](https://cdn.openai.com/papers/GPTV_System_Card.pdf), [mPLUG-Owl2](https://arxiv.org/pdf/2311.04257.pdf), [LLaVA-13B](https://arxiv.org/pdf/2304.08485.pdf), [LLaVA-7B](https://arxiv.org/pdf/2304.08485.pdf)
-
- <table>
- <tr><th>Existence </th><th>Count</th></tr>
- <tr><td>
-
- | Model | GC@3|
- |--|--|
- | ChatGPT-4V|0.422 |
- | mPLUG-Owl2|0.323 |
- | LLaVA-7B|0.308 |
- | LLaVA-13B|0.305 |
-
- </td><td>
-
- | Model | GC@3|
- |--|--|
- | ChatGPT-4V|0.404 |
- | mPLUG-Owl2|0.299 |
- | LLaVA-13B|0.294 |
- | LLaVA-7B|0.353 |
-
- </td></tr> </table>
-
-
- <table>
- <tr><th>Position </th><th>Color</th></tr>
- <tr><td>
-
- | Model | GC@3|
- |--|--|
- | ChatGPT-4V|0.408|
- | mPLUG-Owl2|0.306 |
- | LLaVA-7B|0.285 |
- | LLaVA-13B|0.255 |
-
- </td><td>
-
- | Model | GC@3|
- |--|--|
- | ChatGPT-4V|0.403 |
- | LLaVA-13B|0.300 |
- | mPLUG-Owl2|0.290 |
- | LLaVA-7B|0.284 |
-
- </td></tr> </table>
-
-
- <table>
- <tr><th>Poster </th><th>Celebrity</th></tr>
- <tr><td>
-
- | Model | GC@3|
- |--|--|
- | ChatGPT-4V|0.324|
- | mPLUG-Owl2|0.243 |
- | LLaVA-13B|0.215 |
- | LLaVA-7B|0.214 |
-
- </td><td>
-
- | Model | GC@3|
- |--|--|
- | ChatGPT-4V|0.332 |
- | mPLUG-Owl2|0.232 |
- | LLaVA-13B|0.206 |
- | LLaVA-7B|0.188 |
-
- </td></tr> </table>
-
-
- <table>
- <tr><th>Scene </th><th>Landmark</th></tr>
- <tr><td>
-
- | Model | GC@3|
- |--|--|
- | ChatGPT-4V|0.393|
- | mPLUG-Owl2|0.299 |
- | LLaVA-13B|0.277 |
- | LLaVA-7B|0.266 |
-
- </td><td>
-
- | Model | GC@3|
- |--|--|
- | ChatGPT-4V|0.353 |
- | mPLUG-Owl2|0.275 |
- | LLaVA-7B|0.252 |
- | LLaVA-13B|0.242 |
-
- </td></tr> </table>
-
-
- <table>
- <tr><th>Artwork </th><th>Commonsense Reasoning</th></tr>
- <tr><td>
-
- | Model | GC@3|
- |--|--|
- | ChatGPT-4V|0.421|
- | mPLUG-Owl2|0.252 |
- | LLaVA-13B|0.212 |
- | LLaVA-7B|0.210 |
-
- </td><td>
-
- | Model | GC@3|
- |--|--|
- | ChatGPT-4V|0.471 |
- | mPLUG-Owl2|0.353 |
- | LLaVA-13B|0.334 |
- | LLaVA-7B|0.294 |
-
- </td></tr> </table>
-
-
- <table>
- <tr><th>Code Reasoning </th><th>Numerical Calculation</th></tr>
- <tr><td>
-
- | Model | GC@3|
- |--|--|
- | ChatGPT-4V|0.193|
- | mPLUG-Owl2|0.176 |
- | LLaVA-13B|0.144 |
- | LLaVA-7B|0.107 |
-
- </td><td>
-
- | Model | GC@3|
- |--|--|
- | ChatGPT-4V|0.240 |
- | LLaVA-13B|0.195 |
- | mPLUG-Owl2|0.192 |
- | LLaVA-7B|0.155 |
-
- </td></tr> </table>
-
-
- <table>
- <tr><th>Text Translation </th><th>OCR</th></tr>
- <tr><td>
-
- | Model | GC@3|
- |--|--|
- | ChatGPT-4V|0.157|
- | LLaVA-13B|0.116 |
- | LLaVA-7B|0.111 |
- | mPLUG-Owl2|0.081 |
-
- </td><td>
-
- | Model | GC@3|
- |--|--|
- | ChatGPT-4V|0.393 |
- | mPLUG-Owl2|0.276 |
- | LLaVA-13B|0.239 |
- | LLaVA-7B|0.222 |
-
- </td></tr> </table>
README.md CHANGED
@@ -1,14 +1,3 @@
- ---
- title: Genception Leaderboard
- emoji: 🔥
- colorFrom: red
- colorTo: green
- sdk: gradio
- sdk_version: 4.19.2
- app_file: app.py
- pinned: true
- ---
-
  # GenCeption: Evaluate Multimodal LLMs with Unlabeled Unimodal Data
  
  <div>
@@ -33,7 +22,7 @@ We demostrate a 5-iteration GenCeption procedure below run on a seed images to e
  
  
  ## Contribute
- Please add your model details and results to `leaderboard/leaderboard.json` and **create a PR (Pull-Request)** to contribute your results to the [🔥🏅️**Leaderboard**🏅️🔥](https://huggingface.co/spaces/). Start by creating your virtual environment:
+ Please add your model details and results to `leaderboard/leaderboard.json` and **create a PR (Pull-Request)** to contribute your results to the [🔥🏅️**Leaderboard**🏅️🔥](https://huggingface.co/spaces/valbuc/GenCeption). Start by creating your virtual environment:
  
  ```{bash}
  conda create --name genception python=3.10 -y
@@ -64,3 +53,18 @@ The MME dataset, of which the image modality was used in our paper, can be obtai
  primaryClass={cs.AI,cs.CL,cs.LG}
  }
  ```
+
+ ## HF Space config
+
+ Please dont be distracted by this content - it just configues the [🤗 Leaderboard](https://huggingface.co/spaces/valbuc/GenCeption).
+
+ ---
+ title: Genception Leaderboard
+ emoji: 🔥
+ colorFrom: red
+ colorTo: green
+ sdk: gradio
+ sdk_version: 4.19.2
+ app_file: app.py
+ pinned: true
+ ---
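The README change above moves the `---` delimited config block from the top of the file to the bottom. A minimal sketch of how such a block could be located and read regardless of its position, using only the standard library, is shown below. The helper name `extract_config_block` is hypothetical, and the parsing handles only flat `key: value` lines; it does not reproduce how the Hugging Face Hub reads Space metadata, and it does not check whether the Hub accepts a block that is not at the top of the README.

```python
def extract_config_block(readme_text):
    """Return flat key/value pairs from the first '---' delimited block.

    Hypothetical helper for illustration only: it scans the whole text,
    so the block may sit at the top or the bottom of the README.
    """
    lines = readme_text.splitlines()
    # Indices of lines that consist solely of the '---' delimiter.
    fences = [i for i, line in enumerate(lines) if line.strip() == "---"]
    if len(fences) < 2:
        return {}
    start, end = fences[0], fences[1]
    config = {}
    for line in lines[start + 1:end]:
        if ":" not in line:
            continue  # skip anything that is not a simple 'key: value' pair
        key, _, value = line.partition(":")
        config[key.strip()] = value.strip()  # values are kept as strings
    return config


sample = """# GenCeption

## HF Space config

---
title: Genception Leaderboard
sdk: gradio
sdk_version: 4.19.2
app_file: app.py
pinned: true
---
"""

config = extract_config_block(sample)
print(config["sdk"], config["app_file"])
```

A check like this could run before pushing, for example to confirm that `app_file` still names a file that exists in the repository.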
app.py CHANGED
@@ -104,4 +104,4 @@ scheduler = BackgroundScheduler()
  scheduler.add_job(update_data, "cron", hour=0) # Update data once a day at midnight
  scheduler.start()
  
- demo.queue(default_concurrency_limit=40).launch(share=True)
+ demo.queue(default_concurrency_limit=40).launch()
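The `app.py` change drops `share=True` from the `launch()` call. One way to keep a public share link available for local runs while never requesting it on a hosted Space is to derive the argument from the environment, as sketched below. The sketch assumes that the hosting environment sets a `SPACE_ID` variable; `GENCEPTION_SHARE` and `launch_kwargs` are hypothetical names that do not appear in the repository.

```python
import os


def launch_kwargs(environ=None):
    """Build keyword arguments for demo.launch() from environment variables.

    Assumption: SPACE_ID is set when the app runs on a hosted Space.
    GENCEPTION_SHARE is a hypothetical opt-in flag for local runs.
    """
    env = os.environ if environ is None else environ
    on_space = bool(env.get("SPACE_ID"))
    wants_share = env.get("GENCEPTION_SHARE", "").lower() in {"1", "true", "yes"}
    # Only request a share link when explicitly asked for and not on a Space.
    return {"share": wants_share and not on_space}


# Intended use in app.py (not executed here, since it needs gradio):
# demo.queue(default_concurrency_limit=40).launch(**launch_kwargs())
print(launch_kwargs({"GENCEPTION_SHARE": "1"}))
```

With no variables set, the result is `{"share": False}`, which matches the behaviour of the plain `launch()` call introduced by this commit.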