Tags: Sentence Similarity · sentence-transformers · Safetensors · bert · feature-extraction · Generated from Trainer · dataset_size:182886 · loss:ReasoningGuidedRankingLoss · Eval Results · text-embeddings-inference

Committed by bwang0911 · Commit 9ac1253 · verified · 1 Parent(s): 9525764

Add new SentenceTransformer model

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+ "word_embedding_dimension": 768,
+ "pooling_mode_cls_token": true,
+ "pooling_mode_mean_tokens": false,
+ "pooling_mode_max_tokens": false,
+ "pooling_mode_mean_sqrt_len_tokens": false,
+ "pooling_mode_weightedmean_tokens": false,
+ "pooling_mode_lasttoken": false,
+ "include_prompt": true
+ }
README.md ADDED
@@ -0,0 +1,902 @@
+ ---
+ tags:
+ - sentence-transformers
+ - sentence-similarity
+ - feature-extraction
+ - generated_from_trainer
+ - dataset_size:182886
+ - loss:ReasoningGuidedRankingLoss
+ base_model: BAAI/bge-base-en-v1.5
+ widget:
+ - source_sentence: Hey Reddit, what do you do in New York City?
+ sentences:
+ - The second text directly answers the question posed in the first text. It provides
+ personal recommendations for places to eat and things to do in New York City,
+ fulfilling the user's query. The text also offers a specific recommendation for
+ a restaurant, Crif Dogs, and a menu item.
+ - "For example, let's say you're at a section containing 9 tables\n\n 1 2 \
+ \ 3\n 4 5 6\n 7 8 9\n\nI'm sitting on the west side of Table 7,\
+ \ there are people at Tables 5 and 6. Someone comes in through the crowd and sits\
+ \ on the east side of table 8, making awkward eye contact while we've got our\
+ \ mouths full.\n\nI always found it extremely uncomfortable... why oh why can't\
+ \ they just sit with their back to me? As far as I'm concerned this is almost\
+ \ as canonical as urinal rules."
+ - This is my first year living here and I was just wondering if you knew of any
+ awesome places to eat, fun places to go, trees to climb, anything of the sort.
+ I for one would recommend Crif Dogs to anyone who has not been. Go there and get
+ the "Spicy Redneck," you won't regret it.
+ - source_sentence: 'KEYC - Charges: Man Lived With Dead Bodies of His Mother, Brother'
+ sentences:
+ - The second text provides a detailed elaboration of the headline. It specifies
+ the location, the man's name, the charges, and the circumstances surrounding the
+ discovery of the bodies. It expands on the initial information, providing specific
+ details about the case.
+ - 'Well, this is one way to go out.
+
+ Robert Gene White took a trip to El Paso to visit the Red Parrot, a full service
+ gentlemen’s club. While Mr. White was enjoying a lap dance from one of the lovely
+ ladies, he passed away.
+
+ It wasn’t until the dance was over that they noticed Mr. White wasn’t moving.
+ Initially, the club thought Mr. White was “playing dead” just trying to get out
+ of paying his bill. Quickly they realized he wasn’t faking and began CPR, then
+ called 911. Unfortunately paramedics were unable to revive him.
+
+ Is anyone else completely encapsulated at the idea that this clearly isn’t the
+ first time someone has tried to “play dead” to get out of the bill?'
+ - 'Prosecutors say a Minnesota man lived in his house with the decomposing bodies
+ of his mother and twin brother for about a year.
+
+ Sixty-year-old Robert James Kuefler of White Bear Lake is charged with interference
+ with a dead body or scene of death because he neglected to tell authorities they
+ died of natural causes, according to the St. Paul Pioneer Press .
+
+ The bodies were found last year. Kuefler was charged this week. He allegedly told
+ police his mother, 94-year-old Evelyn Kuefler, died in August 2015 and his brother,
+ Richard Kuefler, died before that and he couldn''t bring himself to bury them.
+
+ The complaint says his mother''s body was decayed and skeletal and his brother''s
+ body was "mummified."
+
+ Robert Kuefler didn''t return a message left by The Associated Press.
+
+ -KEYC News 12'
+ - source_sentence: Innovative procedure saves baby alpaca in Lebanon
+ sentences:
+ - 'Police say 28-year-old Wesley Flores pulled out a gun and shot himself in the
+ jaw after four hours of unsuccessful negotiations. He''s since been sent to a
+ hospital in Lubbock.
+
+ Authorities say Flores was originally taken into custody on a warrant for failing
+ to show up to a scheduled court appearance.'
+ - The second text elaborates on the innovative procedure mentioned in the first
+ text. It provides details about the specific case of an alpaca named Hercules,
+ the innovative treatment (NuCress scaffold), the medical team involved, and the
+ positive outcome of the procedure, thus expanding on the initial claim.
+ - 'Hercules the alpaca was only 24 hours old when he broke his front left leg at
+ Cedar Rock Ranch in Lebanon. He received a plasma transfusion and was bottle-fed
+ for months. The open wound and exposed bone led to a serious infection, preventing
+ the bone break from healing properly.
+
+ The animal’s veterinarian referred him to the University of Tennessee College
+ of Veterinary Medicine for advanced treatment.
+
+ Dr. Pierre-Yves Mulon, UTCVM assistant professor in farm animal medicine and surgery,
+ determined the NuCress scaffold was the best option to heal the fragile animal.
+
+ The Nucress scaffold is a nanomaterial-based bone regeneration device pioneered
+ by University of Arkansas at Little Rock’s systems engineering professor Dr. Alexadru
+ S. Biris, UTCVM head of large animal clinical sciences, Dr. David Anderson and
+ a team of designated researchers.
+
+ The scaffold is designed to be implanted directly into the wound by a surgeon
+ and can be loaded with drugs to fight infection or with hormones and stem cells
+ to encourage bone growth. As a result, the scaffold can deliver bacteria-fighting
+ drugs directly to the wound and be safely absorbed by the body, generally eliminating
+ the need for additional surgeries.
+
+ Mulon loaded the scaffold with antibiotics and implanted it into Hercules’ wound,
+ expecting a long wait due to the alpaca’s condition. The process proved quicker
+ than he expected.
+
+ “Hercules responded well and fast,” said Mulon. “We was able to walk immediately
+ after surgery and has been very active. The bone repaired within the time range
+ expected for a closed fracture, though it was an open one.”
+
+ Mulon said while other options, such as traditionally administered drugs, could
+ have been used, they would have presented more obstacles such as future surgeries.
+
+ “It is difficult to confirm if the results would have changed using any other
+ option; however, I think it would have necessitated more time,” said Mulon. “Any
+ open fracture carries a guarded to poor prognosis, and Hercules made it as we
+ are very happy,”
+
+ Researchers received a grant of more than $5 million from the Department of Defense
+ and hope to develop the product for use with humans.'
+ - source_sentence: Trump, Macron To Hold Joint Press Conference During State Visit
+ sentences:
+ - 'Updated at 10:58 a.m. ET
+
+ President Trump and French President Emmanuel Macron will field questions from
+ reporters on Tuesday, in between talks on the Iran nuclear deal and a lavish state
+ dinner.
+
+ Macron is the first of two European leaders Trump is hosting this week. German
+ Chancellor Angela Merkel will be in Washington, D.C., on Friday. Both France and
+ Germany joined the U.S. in a six-nation pact with Iran to halt its nuclear program
+ in exchange for sanctions relief. Trump has threatened to pull the U.S. out of
+ that deal. Macron and Merkel want him to stay in.
+
+ Trump''s former advisers struggled to make the case for the nuclear deal, and
+ the newest members of Trump''s national security team are as skeptical of the
+ agreement as he is.
+
+ "People know my views on the Iran deal. It was a terrible deal. It should have
+ never, ever been made," Trump said Tuesday during an Oval Office photo opportunity
+ with Macron. "It''s insane. It''s ridiculous. It should have never been made,
+ but we will be talking about it."
+
+ Macron argues the nuclear agreement is worth preserving.
+
+ "We have a common objective, we want to make sure there''s no escalation and no
+ nuclear proliferation in the region. We now need to find the right path forward,"
+ Macron said, through an interpreter.
+
+ Macron has skillfully courted Trump, inviting the U.S. president to be his guest
+ last year at an elaborate military parade marking Bastille Day in Paris. Trump
+ was so impressed, he ordered his own military parade this November, marking the
+ 100th anniversary of the end of World War I.
+
+ The two presidents and their wives celebrated the wartime alliance between the
+ U.S. and France on Monday by planting an oak tree on the South Lawn of the White
+ House. The sapling comes from Belleau Wood, where more than 9,000 Marines died
+ in the final months of the first world war, according to a White House statement.
+
+ Later, the two couples took a sightseeing helicopter tour of Washington, then
+ held a private dinner at George Washington''s historic Mt. Vernon estate.
+
+ Despite their evident personal chemistry, Trump and Macron have significant policy
+ differences to discuss. In addition to the Iran nuclear deal, Macron wants a permanent
+ exemption from the president''s new steel and aluminum tariffs. And he''d like
+ to see a more lasting commitment from the U.S. to stabilization efforts in Syria.
+ Military forces from France and the U.K. joined the U.S. in launching air strikes
+ on Syria earlier this month in retaliation for a suspected chemical weapons attack.
+ But Trump is impatient to withdraw U.S. troops from that country as quickly as
+ possible.
+
+ "What you do have are two leaders who have a great deal of respect for one another,
+ who have a great friendship," said White House spokeswoman Sarah Sanders. She
+ added that friendship allows the two men to have "very open and candid conversations."
+
+ Sanders said she expects "a very productive and very positive state visit for
+ both countries."
+
+ The visit will be marked by the first state dinner of the Trump administration.
+ The White House has been decorated for the event with cherry blossoms, sweet peas
+ and white lilacs. The menu is American with French influences: spring lamb and
+ jambalaya.
+
+ On Wednesday, Macron is set to address a joint session of Congress.'
+ - 'Liverpool manager Jurgen Klopp admits that he cannot explain his side''s performance
+ during their 2-2 draw with Sunderland at the Stadium of Light.
+
+ Liverpool manager Jurgen Klopp has admitted that he cannot explain his side''s
+ performance during the 2-2 draw with Sunderland at the Stadium of Light this afternoon.
+
+ The Reds led twice through goals from Daniel Sturridge and Sadio Mane, but on
+ both occasions they were pegged back by penalties from Jermain Defoe.
+
+ Liverpool had been looking for five straight league wins for the first time under
+ Klopp, but the German suggested that the two-day turnaround between matches prevented
+ them from playing their best football.
+
+ "I am not able to explain it because I don''t know exactly what I saw, my team
+ were fighting but I wasn''t sure if they could do it. We can play better football
+ but I''m not sure if you can play better with that break," he told BBC Sport.
+
+ "I don''t know how it feels when you have to do the things you have to do today.
+ I told the players if nobody wanted to play I would never speak about and not
+ tell anyone, but nobody came and that was a good thing. About the football we
+ played, I actually have no idea how to speak about it.
+
+ "There was no foul before the free kick for the second penalty. You need a little
+ bit of luck, but Sunderland worked hard too and maybe they deserved it."
+
+ The results means that Liverpool miss the chance to close the gap on Premier League
+ leaders Chelsea to three points.'
+ - The second text elaborates on the title by providing details about the joint press
+ conference, including the date, topics to be discussed (Iran nuclear deal, tariffs,
+ Syria), and the context of the state visit. It also mentions the leaders' differing
+ views and the overall atmosphere of the visit.
+ - source_sentence: Crossover and multicriticality due to the Dzyaloshinsky-Moriya
+ interaction
+ sentences:
+ - Attention is focused on the theoretical principles governing the underlying geometry
+ of motifs, border patterns and all-over patterns. The systematic classification
+ and construction of two-dimensional periodic patterns and tilings is introduced,
+ with particular relerence to two-colour and higher colour counterchange possibilities.
+ An identification is made of the geometrical restraints encountered when introducing
+ systematic interchange of colour. A wide ranging series of original patterns and
+ tilings is constructed and fully illustrated; these designs have been printed
+ in fabric form and are presented in the accompanying exhibition.
+ - We show that the addition of a Dzyaloshinsky-Moriya interaction to a Heisenberg
+ ferromagnet introduces only one crossover exponent, which is the same as for the
+ usual uniaxial anisotropy. This result is in contrast to a previous report by
+ Liu.
+ - 'The second text elaborates on the first by specifying the impact of the Dzyaloshinsky-Moriya
+ interaction on a Heisenberg ferromagnet. It highlights a key finding: the introduction
+ of only one crossover exponent, contrasting with a prior study. This directly
+ addresses the topic introduced in the title.'
+ datasets:
+ - bwang0911/reasoning_pairs_filtered_w_reason_ccnews
+ - bwang0911/reasoning_pairs_filtered_w_reason
+ - bwang0911/reasoning_pairs_filtered_w_reason_s2orc
+ pipeline_tag: sentence-similarity
+ library_name: sentence-transformers
+ metrics:
+ - cosine_accuracy@1
+ - cosine_accuracy@3
+ - cosine_accuracy@5
+ - cosine_accuracy@10
+ - cosine_precision@1
+ - cosine_precision@3
+ - cosine_precision@5
+ - cosine_precision@10
+ - cosine_recall@1
+ - cosine_recall@3
+ - cosine_recall@5
+ - cosine_recall@10
+ - cosine_ndcg@10
+ - cosine_mrr@10
+ - cosine_map@100
+ model-index:
+ - name: SentenceTransformer based on BAAI/bge-base-en-v1.5
+ results:
+ - task:
+ type: information-retrieval
+ name: Information Retrieval
+ dataset:
+ name: mteb/nfcorpus
+ type: mteb/nfcorpus
+ metrics:
+ - type: cosine_accuracy@1
+ value: 0.5046439628482973
+ name: Cosine Accuracy@1
+ - type: cosine_accuracy@3
+ value: 0.6346749226006192
+ name: Cosine Accuracy@3
+ - type: cosine_accuracy@5
+ value: 0.6965944272445821
+ name: Cosine Accuracy@5
+ - type: cosine_accuracy@10
+ value: 0.7678018575851393
+ name: Cosine Accuracy@10
+ - type: cosine_precision@1
+ value: 0.5046439628482973
+ name: Cosine Precision@1
+ - type: cosine_precision@3
+ value: 0.3993808049535604
+ name: Cosine Precision@3
+ - type: cosine_precision@5
+ value: 0.3572755417956657
+ name: Cosine Precision@5
+ - type: cosine_precision@10
+ value: 0.28668730650154794
+ name: Cosine Precision@10
+ - type: cosine_recall@1
+ value: 0.06516889989501519
+ name: Cosine Recall@1
+ - type: cosine_recall@3
+ value: 0.11387269263353653
+ name: Cosine Recall@3
+ - type: cosine_recall@5
+ value: 0.1396374157566347
+ name: Cosine Recall@5
+ - type: cosine_recall@10
+ value: 0.18692123966555005
+ name: Cosine Recall@10
+ - type: cosine_ndcg@10
+ value: 0.38253279961982706
+ name: Cosine Ndcg@10
+ - type: cosine_mrr@10
+ value: 0.5874551575015973
+ name: Cosine Mrr@10
+ - type: cosine_map@100
+ value: 0.195968677576039
+ name: Cosine Map@100
+ - task:
+ type: information-retrieval
+ name: Information Retrieval
+ dataset:
+ name: mteb/trec covid
+ type: mteb/trec-covid
+ metrics:
+ - type: cosine_accuracy@1
+ value: 0.86
+ name: Cosine Accuracy@1
+ - type: cosine_accuracy@3
+ value: 1.0
+ name: Cosine Accuracy@3
+ - type: cosine_accuracy@5
+ value: 1.0
+ name: Cosine Accuracy@5
+ - type: cosine_accuracy@10
+ value: 1.0
+ name: Cosine Accuracy@10
+ - type: cosine_precision@1
+ value: 0.86
+ name: Cosine Precision@1
+ - type: cosine_precision@3
+ value: 0.8799999999999999
+ name: Cosine Precision@3
+ - type: cosine_precision@5
+ value: 0.856
+ name: Cosine Precision@5
+ - type: cosine_precision@10
+ value: 0.8320000000000001
+ name: Cosine Precision@10
+ - type: cosine_recall@1
+ value: 0.0007006541633990996
+ name: Cosine Recall@1
+ - type: cosine_recall@3
+ value: 0.002166976340027841
+ name: Cosine Recall@3
+ - type: cosine_recall@5
+ value: 0.003562871514029663
+ name: Cosine Recall@5
+ - type: cosine_recall@10
+ value: 0.00692643022454112
+ name: Cosine Recall@10
+ - type: cosine_ndcg@10
+ value: 0.843458611785082
+ name: Cosine Ndcg@10
+ - type: cosine_mrr@10
+ value: 0.9233333333333333
+ name: Cosine Mrr@10
+ - type: cosine_map@100
+ value: 0.5214168404644098
+ name: Cosine Map@100
+ - task:
+ type: information-retrieval
+ name: Information Retrieval
+ dataset:
+ name: mteb/fiqa
+ type: mteb/fiqa
+ metrics:
+ - type: cosine_accuracy@1
+ value: 0.35802469135802467
+ name: Cosine Accuracy@1
+ - type: cosine_accuracy@3
+ value: 0.5231481481481481
+ name: Cosine Accuracy@3
+ - type: cosine_accuracy@5
+ value: 0.5848765432098766
+ name: Cosine Accuracy@5
+ - type: cosine_accuracy@10
+ value: 0.6743827160493827
+ name: Cosine Accuracy@10
+ - type: cosine_precision@1
+ value: 0.35802469135802467
+ name: Cosine Precision@1
+ - type: cosine_precision@3
+ value: 0.23251028806584362
+ name: Cosine Precision@3
+ - type: cosine_precision@5
+ value: 0.16944444444444445
+ name: Cosine Precision@5
+ - type: cosine_precision@10
+ value: 0.10648148148148148
+ name: Cosine Precision@10
+ - type: cosine_recall@1
+ value: 0.18514227970246488
+ name: Cosine Recall@1
+ - type: cosine_recall@3
+ value: 0.31801450435709694
+ name: Cosine Recall@3
+ - type: cosine_recall@5
+ value: 0.3720212443592073
+ name: Cosine Recall@5
+ - type: cosine_recall@10
+ value: 0.45586599186136223
+ name: Cosine Recall@10
+ - type: cosine_ndcg@10
+ value: 0.3826690717843391
+ name: Cosine Ndcg@10
+ - type: cosine_mrr@10
+ value: 0.4577338085439937
+ name: Cosine Mrr@10
+ - type: cosine_map@100
+ value: 0.32368570015506426
+ name: Cosine Map@100
+ - task:
+ type: information-retrieval
+ name: Information Retrieval
+ dataset:
+ name: mteb/quora
+ type: mteb/quora
+ metrics:
+ - type: cosine_accuracy@1
+ value: 0.8112
+ name: Cosine Accuracy@1
+ - type: cosine_accuracy@3
+ value: 0.9258
+ name: Cosine Accuracy@3
+ - type: cosine_accuracy@5
+ value: 0.9553
+ name: Cosine Accuracy@5
+ - type: cosine_accuracy@10
+ value: 0.9773
+ name: Cosine Accuracy@10
+ - type: cosine_precision@1
+ value: 0.8112
+ name: Cosine Precision@1
+ - type: cosine_precision@3
+ value: 0.3723666666666666
+ name: Cosine Precision@3
+ - type: cosine_precision@5
+ value: 0.24552000000000013
+ name: Cosine Precision@5
+ - type: cosine_precision@10
+ value: 0.13407000000000002
+ name: Cosine Precision@10
+ - type: cosine_recall@1
+ value: 0.7047405405718852
+ name: Cosine Recall@1
+ - type: cosine_recall@3
+ value: 0.8691192994653526
+ name: Cosine Recall@3
+ - type: cosine_recall@5
+ value: 0.9144622696502942
+ name: Cosine Recall@5
+ - type: cosine_recall@10
+ value: 0.9524565789137283
+ name: Cosine Recall@10
+ - type: cosine_ndcg@10
+ value: 0.8811914153543994
+ name: Cosine Ndcg@10
+ - type: cosine_mrr@10
+ value: 0.8729545634920601
+ name: Cosine Mrr@10
+ - type: cosine_map@100
+ value: 0.8501811476426027
+ name: Cosine Map@100
+ ---
+
+ # SentenceTransformer based on BAAI/bge-base-en-v1.5
+
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) on the [reason_ccnews](https://huggingface.co/datasets/bwang0911/reasoning_pairs_filtered_w_reason_ccnews), [reason_reddit](https://huggingface.co/datasets/bwang0911/reasoning_pairs_filtered_w_reason) and [reason_s2orc](https://huggingface.co/datasets/bwang0911/reasoning_pairs_filtered_w_reason_s2orc) datasets. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+
+ ## Model Details
+
+ ### Model Description
+ - **Model Type:** Sentence Transformer
+ - **Base model:** [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) <!-- at revision a5beb1e3e68b9ab74eb54cfd186867f64f240e1a -->
+ - **Maximum Sequence Length:** 256 tokens
+ - **Output Dimensionality:** 768 dimensions
+ - **Similarity Function:** Cosine Similarity
+ - **Training Datasets:**
+ - [reason_ccnews](https://huggingface.co/datasets/bwang0911/reasoning_pairs_filtered_w_reason_ccnews)
+ - [reason_reddit](https://huggingface.co/datasets/bwang0911/reasoning_pairs_filtered_w_reason)
+ - [reason_s2orc](https://huggingface.co/datasets/bwang0911/reasoning_pairs_filtered_w_reason_s2orc)
+ <!-- - **Language:** Unknown -->
+ <!-- - **License:** Unknown -->
+
+ ### Model Sources
+
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
+
+ ### Full Model Architecture
+
+ ```
+ SentenceTransformer(
+ (0): Transformer({'max_seq_length': 256, 'do_lower_case': True}) with Transformer model: BertModel
+ (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+ (2): Normalize()
+ )
+ ```
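The modules above form a three-step pipeline: the BERT transformer produces per-token vectors, the Pooling module keeps only the [CLS] token (`pooling_mode_cls_token: True`), and `Normalize()` scales the result to unit length. A minimal numpy sketch of the last two steps, using random dummy token vectors in place of real BertModel output:

```python
import numpy as np

def cls_pool_and_normalize(token_embeddings: np.ndarray) -> np.ndarray:
    """Mirror the Pooling (CLS mode) and Normalize modules above.

    token_embeddings: (seq_len, 768) array of per-token vectors.
    """
    cls_vector = token_embeddings[0]                 # CLS pooling: keep the first token's vector
    return cls_vector / np.linalg.norm(cls_vector)   # Normalize(): L2-normalize to unit length

# Dummy per-token embeddings standing in for transformer output (not real model output).
tokens = np.random.default_rng(0).normal(size=(12, 768))
emb = cls_pool_and_normalize(tokens)
print(emb.shape)  # (768,)
```

Because of the final normalization step, every sentence embedding this model produces has unit L2 norm.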
+
+ ## Usage
+
+ ### Direct Usage (Sentence Transformers)
+
+ First install the Sentence Transformers library:
+
+ ```bash
+ pip install -U sentence-transformers
+ ```
+
+ Then you can load this model and run inference.
+ ```python
+ from sentence_transformers import SentenceTransformer
+
+ # Download from the 🤗 Hub
+ model = SentenceTransformer("bwang0911/reasoning-bge")
+ # Run inference
+ sentences = [
+ 'Crossover and multicriticality due to the Dzyaloshinsky-Moriya interaction',
+ 'We show that the addition of a Dzyaloshinsky-Moriya interaction to a Heisenberg ferromagnet introduces only one crossover exponent, which is the same as for the usual uniaxial anisotropy. This result is in contrast to a previous report by Liu.',
+ 'The second text elaborates on the first by specifying the impact of the Dzyaloshinsky-Moriya interaction on a Heisenberg ferromagnet. It highlights a key finding: the introduction of only one crossover exponent, contrasting with a prior study. This directly addresses the topic introduced in the title.',
+ ]
+ embeddings = model.encode(sentences)
+ print(embeddings.shape)
+ # [3, 768]
+
+ # Get the similarity scores for the embeddings
+ similarities = model.similarity(embeddings, embeddings)
+ print(similarities.shape)
+ # [3, 3]
+ ```
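Since the model ends in a `Normalize()` module, its embeddings are unit-length, so the cosine similarity that `model.similarity` computes reduces to a plain dot product. A small numpy sketch of that computation, using dummy normalized vectors rather than real model output:

```python
import numpy as np

# Dummy stand-ins for three sentence embeddings, L2-normalized as the model's
# Normalize() module would produce.
rng = np.random.default_rng(42)
embeddings = rng.normal(size=(3, 768))
embeddings /= np.linalg.norm(embeddings, axis=1, keepdims=True)

# For unit vectors, the cosine-similarity matrix is just the Gram matrix.
similarities = embeddings @ embeddings.T
print(similarities.shape)  # (3, 3)
```

Each diagonal entry is 1.0 (a vector compared with itself), and off-diagonal entries lie in [-1, 1].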
+
+ <!--
+ ### Direct Usage (Transformers)
+
+ <details><summary>Click to see the direct usage in Transformers</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Downstream Usage (Sentence Transformers)
+
+ You can finetune this model on your own dataset.
+
+ <details><summary>Click to expand</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Out-of-Scope Use
+
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+
+ ## Evaluation
+
+ ### Metrics
+
+ #### Information Retrieval
+
+ * Datasets: `mteb/nfcorpus`, `mteb/trec-covid`, `mteb/fiqa` and `mteb/quora`
+ * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
+
+ | Metric              | mteb/nfcorpus | mteb/trec-covid | mteb/fiqa  | mteb/quora |
+ |:--------------------|:--------------|:----------------|:-----------|:-----------|
+ | cosine_accuracy@1   | 0.5046        | 0.86            | 0.358      | 0.8112     |
+ | cosine_accuracy@3   | 0.6347        | 1.0             | 0.5231     | 0.9258     |
+ | cosine_accuracy@5   | 0.6966        | 1.0             | 0.5849     | 0.9553     |
+ | cosine_accuracy@10  | 0.7678        | 1.0             | 0.6744     | 0.9773     |
+ | cosine_precision@1  | 0.5046        | 0.86            | 0.358      | 0.8112     |
+ | cosine_precision@3  | 0.3994        | 0.88            | 0.2325     | 0.3724     |
+ | cosine_precision@5  | 0.3573        | 0.856           | 0.1694     | 0.2455     |
+ | cosine_precision@10 | 0.2867        | 0.832           | 0.1065     | 0.1341     |
+ | cosine_recall@1     | 0.0652        | 0.0007          | 0.1851     | 0.7047     |
+ | cosine_recall@3     | 0.1139        | 0.0022          | 0.318      | 0.8691     |
+ | cosine_recall@5     | 0.1396        | 0.0036          | 0.372      | 0.9145     |
+ | cosine_recall@10    | 0.1869        | 0.0069          | 0.4559     | 0.9525     |
+ | **cosine_ndcg@10**  | **0.3825**    | **0.8435**      | **0.3827** | **0.8812** |
+ | cosine_mrr@10       | 0.5875        | 0.9233          | 0.4577     | 0.873      |
+ | cosine_map@100      | 0.196         | 0.5214          | 0.3237     | 0.8502     |
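The @k metrics in the table follow the standard information-retrieval definitions. This also explains an apparent oddity: on `mteb/trec-covid`, precision@k is high while recall@k is near zero, which is expected when each query has far more relevant documents than k. A toy sketch of the definitions (hypothetical document IDs, not taken from the evaluation above):

```python
def accuracy_at_k(relevant: set, ranked: list, k: int) -> float:
    """1.0 if any relevant document appears in the top-k results, else 0.0."""
    return float(any(doc in relevant for doc in ranked[:k]))

def precision_at_k(relevant: set, ranked: list, k: int) -> float:
    """Fraction of the top-k results that are relevant."""
    return sum(doc in relevant for doc in ranked[:k]) / k

def recall_at_k(relevant: set, ranked: list, k: int) -> float:
    """Fraction of all relevant documents that appear in the top k."""
    return sum(doc in relevant for doc in ranked[:k]) / len(relevant)

# Toy query: 4 relevant documents, and the top-5 ranking retrieves two of them.
relevant = {"d1", "d2", "d3", "d4"}
ranked = ["d1", "x", "d3", "y", "z"]
print(accuracy_at_k(relevant, ranked, 5))   # 1.0
print(precision_at_k(relevant, ranked, 5))  # 0.4
print(recall_at_k(relevant, ranked, 5))     # 0.5
```

With thousands of relevant documents per query, even a perfect top-10 ranking can reach a recall@10 of only 10 / |relevant|, while precision@10 stays at 1.0.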
+
+ <!--
+ ## Bias, Risks and Limitations
+
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+
+ <!--
+ ### Recommendations
+
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+
+ ## Training Details
+
+ ### Training Datasets
+
+ #### reason_ccnews
+
+ * Dataset: [reason_ccnews](https://huggingface.co/datasets/bwang0911/reasoning_pairs_filtered_w_reason_ccnews) at [2e4fb05](https://huggingface.co/datasets/bwang0911/reasoning_pairs_filtered_w_reason_ccnews/tree/2e4fb0585e862af0623b97b64d34325001b218a2)
+ * Size: 44,978 training samples
+ * Columns: <code>title</code>, <code>body</code>, and <code>reason</code>
+ * Approximate statistics based on the first 1000 samples:
+ |         | title                                                                             | body                                                                                 | reason                                                                             |
+ |:--------|:----------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|
+ | type    | string                                                                            | string                                                                               | string                                                                             |
+ | details | <ul><li>min: 6 tokens</li><li>mean: 15.34 tokens</li><li>max: 42 tokens</li></ul> | <ul><li>min: 21 tokens</li><li>mean: 221.75 tokens</li><li>max: 256 tokens</li></ul> | <ul><li>min: 28 tokens</li><li>mean: 59.19 tokens</li><li>max: 88 tokens</li></ul> |
+ * Samples:
+ | title | body | reason |
+ |:----------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+ | <code>Fight Leaves Wayne Simmonds Shirtless</code> | <code>Reed Saxon/AP Images<br>Kevin Bieksa and Wayne Simmonds dropped the gloves just 95 seconds into last night’s 4-3 Ducks shootout win over the Flyers, and Bieksa immediately yanked his opponent’s jersey over his head, to the delight of the crowd and to grins from Simmonds and the officials.<br>That’s not supposed to happen. NHL players wear something called a fight strap, which binds the back of the jersey to the pants, preventing the jersey from being pulled off. (Losing a jersey is an advantage in a fight, as it gives the shirtless player’s opponent nothing to grab on to. Sabres enforcer Rob Ray was notorious for losing his gear in a fight, occasionally taking it off himself before clinching.) Any player who engaged in a fight without wearing a fight strap is subject to an automatic game misconduct.<br>Advertisement<br>Simmonds wasn’t ejected, though; at the one-minute mark of the video above, you can see he did have his fight strap properly attached. It just broke, which happens on occasion.</code> | <code>The article describes a hockey fight involving Wayne Simmonds, confirming the title's claim. It details the fight, including Simmonds' jersey being pulled off, and explains the rules and context around the incident, directly elaborating on the event suggested by the title.</code> |
+ | <code>Merck CEO Kenneth Frazier ditches Trump over Charlottesville silence</code> | <code>Merck CEO Kenneth C. Frazier resigned from the president’s council on manufacturing Monday in direct protest of President Donald Trump’s lack of condemnation of white nationalist actions in Charlottesville, Va. over the weekend.<br>In a statement, Frazier, who is African-American, said he believes the country’s strength comes from the diversity of its citizens and that he feels personally compelled to stand up for that diversity and against intolerance.<br>“America’s leaders must honor our fundamental values by clearly rejecting expressions of hatred, bigotry and group supremacy, which run counter to the American ideal that all people are created equal,” he wrote. “As CEO of Merck, and as a matter of personal conscience, I feel a responsibility to take a stand against intolerance and extremism.”<br>RELATED: At least one death has been confirmed after a car plowed into a crowd of protesters in Charlottesville<br>Trump immediately fired back at Frazier on Twitter, saying the Merck CEO now “will have...</code> | <code>The second text provides a detailed elaboration of the first. It explains the context of Kenneth Frazier's resignation, the reasons behind it (Trump's silence on Charlottesville), and includes Frazier's statement. It also provides additional background information about Frazier and the President's Manufacturing Council.</code> |
614
+ | <code>Lightning's Braydon Coburn: Joining road trip</code> | <code>Coburn (lower body) will travel with the team on its upcoming four-game road trip and is hoping to play at some point in the second half of the trip, Bryan Burns of the Lightning's official site reports.<br>The veteran blueliner is yet to play in the month of December, having already missed four games. However, the fact that Coburn is traveling with the team and has been given a chance to play at some point within the next week will be music to the ears of fantasy owners who benefited from Coburn's surprising production -- seven points in 25 games -- earlier in the season. Keep an eye out for updates as the trip progresses.</code> | <code>The second text elaborates on the first by providing details about Braydon Coburn's situation. It specifies that he will join the team on a road trip and offers context about his injury, recovery timeline, and potential for playing, directly expanding on the initial announcement.</code> |
615
+ * Loss: [<code>ReasoningGuidedRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#reasoningguidedrankingloss) with these parameters:
616
+ ```json
617
+ {
618
+ "scale": 20.0,
619
+ "similarity_fct": "cos_sim"
620
+ }
621
+ ```
622
+
623
+ #### reason_reddit
624
+
625
+ * Dataset: [reason_reddit](https://huggingface.co/datasets/bwang0911/reasoning_pairs_filtered_w_reason) at [2fd69ee](https://huggingface.co/datasets/bwang0911/reasoning_pairs_filtered_w_reason/tree/2fd69eed3d8056fbdd0c5a5e4572d2524d861626)
626
+ * Size: 41,703 training samples
627
+ * Columns: <code>title</code>, <code>body</code>, and <code>reason</code>
628
+ * Approximate statistics based on the first 1000 samples:
629
+ | | title | body | reason |
630
+ |:--------|:----------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|
631
+ | type | string | string | string |
632
+ | details | <ul><li>min: 6 tokens</li><li>mean: 18.82 tokens</li><li>max: 69 tokens</li></ul> | <ul><li>min: 16 tokens</li><li>mean: 126.63 tokens</li><li>max: 256 tokens</li></ul> | <ul><li>min: 42 tokens</li><li>mean: 59.32 tokens</li><li>max: 84 tokens</li></ul> |
633
+ * Samples:
634
+ | title | body | reason |
635
+ |:-----------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
636
+ | <code>The one feature the iPad is really missing.</code> | <code>I don't care about the lack of camera. I never use the one on my MacBook, and even if I did the angle would be terrible on the iPad.<br><br>I don't care if third party apps can't run in the background. I don't listen to streaming music.<br><br>I don't care that the App Store is a closed system. I can jailbreak for myself and I think the closed system works better for most users.<br><br>The one feature I want is User Accounts and a Guest Account. If this device is meant to be a coffee table computer, it needs to be able to accomadate multiple users.</code> | <code>The second text identifies the missing feature from the iPad as user accounts and a guest account. The first sentence in the second text sets up a contrast by stating what the author *doesn't* care about. The final sentence directly addresses the prompt by stating the feature the author *does* want.</code> |
637
+ | <code>Dear Sydney Reddit'ers, Would you like any changes made to the style of this subreddit?</code> | <code>I was going to subtly edit the style of the Sydney subreddit but then I found this post and realised that people have very strong opinions about how their reddit should look. <br><br><br><br>So before I make any changes do you have any opinions or suggestions?</code> | <code>The second text directly responds to the question in the first text. It acknowledges the query about subreddit style changes and seeks further input from the community before making any modifications. It demonstrates an understanding of the original post's intent and a willingness to engage with user preferences.</code> |
638
+ | <code>I skipped bail, ran away, and never got caught. AM(A)A.</code> | <code>Long/short story, I went to work in the United States in the last 90s and was busted in a major drug raid. I risked up to lifetime in jail if caught since I was associated with so many crimes; at the bare minimum, said my attorney, I was looking at 7 years in jail, and much more likely more than this.<br><br>My attorney said I was in a lot of trouble. He was the first to bring it up. I did not want to lose 10, 15 or 25 years of my life in jail, especially at my age. Since I was not a United States citizen, I should simply skip bail and run away. And never come back.<br><br>My bail was initially supposed to be $300,000 but my attorney managed to get the judge to set a final bail of $100,000. He explained I was a trustworthy person, lawfully employed, who never did anything wrong and never committed any crime. He portrayed me as someone trustworthy and intelligent who could take care of his responsibilities. The judge agreed and decided on a very low bail, especially for the crimes I was accused of....</code> | <code>The second text provides a detailed account of the events summarized in the first text. It elaborates on the circumstances of skipping bail, running away, and avoiding capture, offering specific details about the legal situation, the escape plan, and the aftermath. The AMAA at the end indicates the user is open to questions about the story.</code> |
639
+ * Loss: [<code>ReasoningGuidedRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#reasoningguidedrankingloss) with these parameters:
640
+ ```json
641
+ {
642
+ "scale": 20.0,
643
+ "similarity_fct": "cos_sim"
644
+ }
645
+ ```
646
+
647
+ #### reason_s2orc
648
+
649
+ * Dataset: [reason_s2orc](https://huggingface.co/datasets/bwang0911/reasoning_pairs_filtered_w_reason_s2orc) at [4d04170](https://huggingface.co/datasets/bwang0911/reasoning_pairs_filtered_w_reason_s2orc/tree/4d04170e1df7f9f7fc63aa92a28dddee804ef0e5)
650
+ * Size: 96,205 training samples
651
+ * Columns: <code>title</code>, <code>body</code>, and <code>reason</code>
652
+ * Approximate statistics based on the first 1000 samples:
653
+ | | title | body | reason |
654
+ |:--------|:----------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|
655
+ | type | string | string | string |
656
+ | details | <ul><li>min: 6 tokens</li><li>mean: 19.26 tokens</li><li>max: 75 tokens</li></ul> | <ul><li>min: 17 tokens</li><li>mean: 138.29 tokens</li><li>max: 256 tokens</li></ul> | <ul><li>min: 47 tokens</li><li>mean: 67.13 tokens</li><li>max: 107 tokens</li></ul> |
657
+ * Samples:
658
+ | title | body | reason |
659
+ |:----------------------------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
660
+ | <code>Syntheses, Structures and Properties of Two Transition Metal-Flexible Ligand Coordination Polymers</code> | <code>Two coordination polymers based on 3,5-bis(4-carboxyphenylmethyloxy) benzoic acid (H3L), [M(HL)]·2H2O M = Mn(1), Co(2), have been synthesized under hydrothermal conditions. Their structures have been determined by single-crystal X-ray diffraction and further characterized by elemental analysis, IR spectra and TGA. The two complexes possess 3D framework with diamond channels resulting from the trans-configuration of the flexible ligand and three coordination modes, 3(η2, η1), 2(η1, η1), η1, of carboxyl groups in the ligand. The framework can be represented with Schlafli symbol of (48·66)(47·66). The wall of the channel consists of left- or right-handed helical polymeric chains. UV–visible–NIR and photoluminescence spectra, magnetic properties of 1 and 2 have also been discussed.</code> | <code>The second text elaborates on the title by detailing the synthesis, structure, and properties of two specific transition metal coordination polymers. It provides the chemical formula, synthesis method, structural characteristics (3D framework, channels), and characterization techniques (X-ray diffraction, IR spectra, etc.) mentioned in the title.</code> |
661
+ | <code>Discussion on the Influence and Development of Technical Aesthetics in Modern Landscape Design</code> | <code>The source of technical aesthetics was introduced and its meaning was explained.The relations between technical aesthetics and modern landscpae design were discussed.The embodiment of technical aesthetics in landscpae design was discussed in the aspects of new material,new technology,new structureand new apparatus.It was put forward that the the development direction of technical aesthetics were tending to sensibility, native land and zoology.</code> | <code>The second text directly addresses the topic introduced in the first text. It explores the meaning, application, and future directions of technical aesthetics within modern landscape design, elaborating on the influence and development mentioned in the title.</code> |
662
+ | <code>GRIN optics for dual-band IR sensors (Conference Presentation)</code> | <code>Graded index (GRIN) optics offer potential for both weight savings and increased performance but have until recently been limited to visible and NIR bands (wavelengths shorter than about 0.9 µm). NRL has developed glass-based IR-GRIN lenses compatible with SWIR-LWIR wavebands. Recent designs show the potential for significant SWaP reduction benefits and improved performance using IR-GRIN lens elements in dual-band, MWIR-LWIR sensors. The SWaP and performance advantages of IR-GRIN lenses in platform-relevant dual-band imagers will be presented.</code> | <code>The second text elaborates on the first by providing a detailed description of GRIN optics, specifically for dual-band IR sensors. It explains the potential benefits (weight savings, increased performance) and highlights the development of IR-GRIN lenses compatible with SWIR-LWIR wavebands, aligning directly with the conference presentation topic.</code> |
663
+ * Loss: [<code>ReasoningGuidedRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#reasoningguidedrankingloss) with these parameters:
664
+ ```json
665
+ {
666
+ "scale": 20.0,
667
+ "similarity_fct": "cos_sim"
668
+ }
669
+ ```
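The `scale` and `similarity_fct` parameters above follow the convention of sentence-transformers ranking losses such as `MultipleNegativesNankingLoss`: candidate scores are cosine similarities multiplied by the scale factor before being fed to a softmax cross-entropy over in-batch candidates. The exact formulation of `ReasoningGuidedRankingLoss` is defined by this model's custom loss, but the effect of these two parameters can be sketched in plain Python:

```python
import math

def cos_sim(a, b):
    # Plain cosine similarity between two vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def scaled_scores(query, candidates, scale=20.0):
    # Scores that a ranking loss with scale=20.0 and
    # similarity_fct="cos_sim" would feed into softmax cross-entropy.
    return [scale * cos_sim(query, c) for c in candidates]

q = [1.0, 0.0]
cands = [[1.0, 0.0], [0.0, 1.0]]
print(scaled_scores(q, cands))  # [20.0, 0.0] -- matching candidate wins
```

The scale of 20.0 sharpens the softmax so that even modest cosine gaps between the positive and the in-batch negatives produce a strong training signal.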
670
+
671
+ ### Training Hyperparameters
672
+ #### Non-Default Hyperparameters
673
+
674
+ - `eval_strategy`: steps
675
+ - `per_device_train_batch_size`: 128
676
+ - `learning_rate`: 5e-06
677
+ - `num_train_epochs`: 1
678
+ - `warmup_ratio`: 0.2
679
+ - `fp16`: True
680
+ - `batch_sampler`: no_duplicates
681
+
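A quick sanity check on how these values interact, assuming the standard Transformers convention that `warmup_steps = ceil(total_steps * warmup_ratio)` when `warmup_steps` is left at 0 (as it is here), and using the 182,886 training pairs and batch size 128 from this card:

```python
import math

num_pairs = 182_886   # total training pairs across the three datasets
batch_size = 128      # per_device_train_batch_size
epochs = 1
warmup_ratio = 0.2

steps_per_epoch = math.ceil(num_pairs / batch_size)
total_steps = steps_per_epoch * epochs
# With warmup_steps == 0, Transformers derives warmup from the ratio.
warmup_steps = math.ceil(total_steps * warmup_ratio)

print(steps_per_epoch, warmup_steps)  # 1429 steps/epoch, 286 warmup steps
```

This is consistent with the training logs below, where step 500 corresponds to epoch ≈ 0.35; the multi-dataset proportional batch sampler may shift the exact step count per epoch by a step or two.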
682
+ #### All Hyperparameters
683
+ <details><summary>Click to expand</summary>
684
+
685
+ - `overwrite_output_dir`: False
686
+ - `do_predict`: False
687
+ - `eval_strategy`: steps
688
+ - `prediction_loss_only`: True
689
+ - `per_device_train_batch_size`: 128
690
+ - `per_device_eval_batch_size`: 8
691
+ - `per_gpu_train_batch_size`: None
692
+ - `per_gpu_eval_batch_size`: None
693
+ - `gradient_accumulation_steps`: 1
694
+ - `eval_accumulation_steps`: None
695
+ - `torch_empty_cache_steps`: None
696
+ - `learning_rate`: 5e-06
697
+ - `weight_decay`: 0.0
698
+ - `adam_beta1`: 0.9
699
+ - `adam_beta2`: 0.999
700
+ - `adam_epsilon`: 1e-08
701
+ - `max_grad_norm`: 1.0
702
+ - `num_train_epochs`: 1
703
+ - `max_steps`: -1
704
+ - `lr_scheduler_type`: linear
705
+ - `lr_scheduler_kwargs`: {}
706
+ - `warmup_ratio`: 0.2
707
+ - `warmup_steps`: 0
708
+ - `log_level`: passive
709
+ - `log_level_replica`: warning
710
+ - `log_on_each_node`: True
711
+ - `logging_nan_inf_filter`: True
712
+ - `save_safetensors`: True
713
+ - `save_on_each_node`: False
714
+ - `save_only_model`: False
715
+ - `restore_callback_states_from_checkpoint`: False
716
+ - `no_cuda`: False
717
+ - `use_cpu`: False
718
+ - `use_mps_device`: False
719
+ - `seed`: 42
720
+ - `data_seed`: None
721
+ - `jit_mode_eval`: False
722
+ - `use_ipex`: False
723
+ - `bf16`: False
724
+ - `fp16`: True
725
+ - `fp16_opt_level`: O1
726
+ - `half_precision_backend`: auto
727
+ - `bf16_full_eval`: False
728
+ - `fp16_full_eval`: False
729
+ - `tf32`: None
730
+ - `local_rank`: 0
731
+ - `ddp_backend`: None
732
+ - `tpu_num_cores`: None
733
+ - `tpu_metrics_debug`: False
734
+ - `debug`: []
735
+ - `dataloader_drop_last`: False
736
+ - `dataloader_num_workers`: 0
737
+ - `dataloader_prefetch_factor`: None
738
+ - `past_index`: -1
739
+ - `disable_tqdm`: False
740
+ - `remove_unused_columns`: True
741
+ - `label_names`: None
742
+ - `load_best_model_at_end`: False
743
+ - `ignore_data_skip`: False
744
+ - `fsdp`: []
745
+ - `fsdp_min_num_params`: 0
746
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
747
+ - `tp_size`: 0
748
+ - `fsdp_transformer_layer_cls_to_wrap`: None
749
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
750
+ - `deepspeed`: None
751
+ - `label_smoothing_factor`: 0.0
752
+ - `optim`: adamw_torch
753
+ - `optim_args`: None
754
+ - `adafactor`: False
755
+ - `group_by_length`: False
756
+ - `length_column_name`: length
757
+ - `ddp_find_unused_parameters`: None
758
+ - `ddp_bucket_cap_mb`: None
759
+ - `ddp_broadcast_buffers`: False
760
+ - `dataloader_pin_memory`: True
761
+ - `dataloader_persistent_workers`: False
762
+ - `skip_memory_metrics`: True
763
+ - `use_legacy_prediction_loop`: False
764
+ - `push_to_hub`: False
765
+ - `resume_from_checkpoint`: None
766
+ - `hub_model_id`: None
767
+ - `hub_strategy`: every_save
768
+ - `hub_private_repo`: None
769
+ - `hub_always_push`: False
770
+ - `gradient_checkpointing`: False
771
+ - `gradient_checkpointing_kwargs`: None
772
+ - `include_inputs_for_metrics`: False
773
+ - `include_for_metrics`: []
774
+ - `eval_do_concat_batches`: True
775
+ - `fp16_backend`: auto
776
+ - `push_to_hub_model_id`: None
777
+ - `push_to_hub_organization`: None
778
+ - `mp_parameters`:
779
+ - `auto_find_batch_size`: False
780
+ - `full_determinism`: False
781
+ - `torchdynamo`: None
782
+ - `ray_scope`: last
783
+ - `ddp_timeout`: 1800
784
+ - `torch_compile`: False
785
+ - `torch_compile_backend`: None
786
+ - `torch_compile_mode`: None
787
+ - `dispatch_batches`: None
788
+ - `split_batches`: None
789
+ - `include_tokens_per_second`: False
790
+ - `include_num_input_tokens_seen`: False
791
+ - `neftune_noise_alpha`: None
792
+ - `optim_target_modules`: None
793
+ - `batch_eval_metrics`: False
794
+ - `eval_on_start`: False
795
+ - `use_liger_kernel`: False
796
+ - `eval_use_gather_object`: False
797
+ - `average_tokens_across_devices`: False
798
+ - `prompts`: None
799
+ - `batch_sampler`: no_duplicates
800
+ - `multi_dataset_batch_sampler`: proportional
801
+
802
+ </details>
803
+
804
+ ### Training Logs
805
+ | Epoch | Step | Training Loss | mteb/nfcorpus_cosine_ndcg@10 | mteb/trec-covid_cosine_ndcg@10 | mteb/fiqa_cosine_ndcg@10 | mteb/quora_cosine_ndcg@10 |
806
+ |:------:|:----:|:-------------:|:----------------------------:|:------------------------------:|:------------------------:|:-------------------------:|
807
+ | -1 | -1 | - | 0.3714 | 0.8385 | 0.3831 | 0.8889 |
808
+ | 0.0070 | 10 | 0.9492 | - | - | - | - |
809
+ | 0.0140 | 20 | 0.9799 | - | - | - | - |
810
+ | 0.0210 | 30 | 0.84 | - | - | - | - |
811
+ | 0.0280 | 40 | 0.9555 | - | - | - | - |
812
+ | 0.0350 | 50 | 0.9292 | 0.3695 | 0.8401 | 0.3840 | 0.8892 |
813
+ | 0.0420 | 60 | 1.1549 | - | - | - | - |
814
+ | 0.0490 | 70 | 0.8573 | - | - | - | - |
815
+ | 0.0559 | 80 | 0.5784 | - | - | - | - |
816
+ | 0.0629 | 90 | 0.7275 | - | - | - | - |
817
+ | 0.0699 | 100 | 0.4792 | 0.3766 | 0.8457 | 0.3886 | 0.8887 |
818
+ | 0.0769 | 110 | 0.6293 | - | - | - | - |
819
+ | 0.0839 | 120 | 0.5167 | - | - | - | - |
820
+ | 0.0909 | 130 | 0.3838 | - | - | - | - |
821
+ | 0.0979 | 140 | 0.3458 | - | - | - | - |
822
+ | 0.1049 | 150 | 0.4897 | 0.3739 | 0.8494 | 0.3866 | 0.8876 |
823
+ | 0.1119 | 160 | 0.3124 | - | - | - | - |
824
+ | 0.1189 | 170 | 0.4367 | - | - | - | - |
825
+ | 0.1259 | 180 | 0.3565 | - | - | - | - |
826
+ | 0.1329 | 190 | 0.2646 | - | - | - | - |
827
+ | 0.1399 | 200 | 0.2 | 0.3757 | 0.8508 | 0.3852 | 0.8860 |
828
+ | 0.1469 | 210 | 0.2051 | - | - | - | - |
829
+ | 0.1538 | 220 | 0.1248 | - | - | - | - |
830
+ | 0.1608 | 230 | 0.2398 | - | - | - | - |
831
+ | 0.1678 | 240 | 0.1599 | - | - | - | - |
832
+ | 0.1748 | 250 | 0.3251 | 0.3743 | 0.8527 | 0.3840 | 0.8840 |
833
+ | 0.1818 | 260 | 0.263 | - | - | - | - |
834
+ | 0.1888 | 270 | 0.2523 | - | - | - | - |
835
+ | 0.1958 | 280 | 0.2156 | - | - | - | - |
836
+ | 0.2028 | 290 | 0.1587 | - | - | - | - |
837
+ | 0.2098 | 300 | 0.1977 | 0.3777 | 0.8557 | 0.3859 | 0.8830 |
838
+ | 0.2168 | 310 | 0.1544 | - | - | - | - |
839
+ | 0.2238 | 320 | 0.1301 | - | - | - | - |
840
+ | 0.2308 | 330 | 0.1178 | - | - | - | - |
841
+ | 0.2378 | 340 | 0.1084 | - | - | - | - |
842
+ | 0.2448 | 350 | 0.1784 | 0.3800 | 0.8540 | 0.3860 | 0.8821 |
843
+ | 0.2517 | 360 | 0.1541 | - | - | - | - |
844
+ | 0.2587 | 370 | 0.0982 | - | - | - | - |
845
+ | 0.2657 | 380 | 0.1897 | - | - | - | - |
846
+ | 0.2727 | 390 | 0.117 | - | - | - | - |
847
+ | 0.2797 | 400 | 0.1806 | 0.3785 | 0.8458 | 0.3861 | 0.8818 |
848
+ | 0.2867 | 410 | 0.1258 | - | - | - | - |
849
+ | 0.2937 | 420 | 0.1249 | - | - | - | - |
850
+ | 0.3007 | 430 | 0.1987 | - | - | - | - |
851
+ | 0.3077 | 440 | 0.1512 | - | - | - | - |
852
+ | 0.3147 | 450 | 0.1646 | 0.3817 | 0.8422 | 0.3829 | 0.8814 |
853
+ | 0.3217 | 460 | 0.1322 | - | - | - | - |
854
+ | 0.3287 | 470 | 0.1464 | - | - | - | - |
855
+ | 0.3357 | 480 | 0.1488 | - | - | - | - |
856
+ | 0.3427 | 490 | 0.1033 | - | - | - | - |
857
+ | 0.3497 | 500 | 0.1209 | 0.3825 | 0.8435 | 0.3827 | 0.8812 |
858
+
859
+
860
+ ### Framework Versions
861
+ - Python: 3.10.12
862
+ - Sentence Transformers: 3.5.0.dev0
863
+ - Transformers: 4.50.0
864
+ - PyTorch: 2.6.0+cu124
865
+ - Accelerate: 1.5.2
866
+ - Datasets: 2.21.0
867
+ - Tokenizers: 0.21.1
868
+
869
+ ## Citation
870
+
871
+ ### BibTeX
872
+
873
+ #### Sentence Transformers
874
+ ```bibtex
875
+ @inproceedings{reimers-2019-sentence-bert,
876
+ title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
877
+ author = "Reimers, Nils and Gurevych, Iryna",
878
+ booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
879
+ month = "11",
880
+ year = "2019",
881
+ publisher = "Association for Computational Linguistics",
882
+ url = "https://arxiv.org/abs/1908.10084",
883
+ }
884
+ ```
885
+
886
+ <!--
887
+ ## Glossary
888
+
889
+ *Clearly define terms in order to be accessible across audiences.*
890
+ -->
891
+
892
+ <!--
893
+ ## Model Card Authors
894
+
895
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
896
+ -->
897
+
898
+ <!--
899
+ ## Model Card Contact
900
+
901
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
902
+ -->
config.json ADDED
@@ -0,0 +1,31 @@
1
+ {
2
+ "architectures": [
3
+ "BertModel"
4
+ ],
5
+ "attention_probs_dropout_prob": 0.1,
6
+ "classifier_dropout": null,
7
+ "gradient_checkpointing": false,
8
+ "hidden_act": "gelu",
9
+ "hidden_dropout_prob": 0.1,
10
+ "hidden_size": 768,
11
+ "id2label": {
12
+ "0": "LABEL_0"
13
+ },
14
+ "initializer_range": 0.02,
15
+ "intermediate_size": 3072,
16
+ "label2id": {
17
+ "LABEL_0": 0
18
+ },
19
+ "layer_norm_eps": 1e-12,
20
+ "max_position_embeddings": 512,
21
+ "model_type": "bert",
22
+ "num_attention_heads": 12,
23
+ "num_hidden_layers": 12,
24
+ "pad_token_id": 0,
25
+ "position_embedding_type": "absolute",
26
+ "torch_dtype": "float32",
27
+ "transformers_version": "4.50.0",
28
+ "type_vocab_size": 2,
29
+ "use_cache": true,
30
+ "vocab_size": 30522
31
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
1
+ {
2
+ "__version__": {
3
+ "sentence_transformers": "3.5.0.dev0",
4
+ "transformers": "4.50.0",
5
+ "pytorch": "2.6.0+cu124"
6
+ },
7
+ "prompts": {},
8
+ "default_prompt_name": null,
9
+ "similarity_fn_name": "cosine"
10
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:3c1d360a375b56ce3e1f29e9a8cd07962ad6e61b05cd47695c89ec2388b49c40
3
+ size 437951328
modules.json ADDED
@@ -0,0 +1,20 @@
1
+ [
2
+ {
3
+ "idx": 0,
4
+ "name": "0",
5
+ "path": "",
6
+ "type": "sentence_transformers.models.Transformer"
7
+ },
8
+ {
9
+ "idx": 1,
10
+ "name": "1",
11
+ "path": "1_Pooling",
12
+ "type": "sentence_transformers.models.Pooling"
13
+ },
14
+ {
15
+ "idx": 2,
16
+ "name": "2",
17
+ "path": "2_Normalize",
18
+ "type": "sentence_transformers.models.Normalize"
19
+ }
20
+ ]
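The three modules above define the embedding pipeline: the BERT encoder produces per-token embeddings, the pooling module selects the `[CLS]` token vector (per `pooling_mode_cls_token: true` in `1_Pooling/config.json`), and the final module L2-normalizes the result, which is why cosine similarity and dot product coincide for this model's embeddings. A minimal sketch of the two post-transformer stages, using a toy 2-dimensional vector in place of the real 768-dimensional one:

```python
import math

def cls_pool(token_embeddings):
    # pooling_mode_cls_token: keep only the first ([CLS]) token's vector.
    return token_embeddings[0]

def l2_normalize(vec):
    # 2_Normalize: rescale the pooled vector to unit length so that
    # dot product equals cosine similarity downstream.
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

tokens = [[3.0, 4.0], [1.0, 0.0]]  # toy "token embeddings"
emb = l2_normalize(cls_pool(tokens))
print(emb)  # [0.6, 0.8]
```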
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
1
+ {
2
+ "max_seq_length": 256,
3
+ "do_lower_case": true
4
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
1
+ {
2
+ "cls_token": {
3
+ "content": "[CLS]",
4
+ "lstrip": false,
5
+ "normalized": false,
6
+ "rstrip": false,
7
+ "single_word": false
8
+ },
9
+ "mask_token": {
10
+ "content": "[MASK]",
11
+ "lstrip": false,
12
+ "normalized": false,
13
+ "rstrip": false,
14
+ "single_word": false
15
+ },
16
+ "pad_token": {
17
+ "content": "[PAD]",
18
+ "lstrip": false,
19
+ "normalized": false,
20
+ "rstrip": false,
21
+ "single_word": false
22
+ },
23
+ "sep_token": {
24
+ "content": "[SEP]",
25
+ "lstrip": false,
26
+ "normalized": false,
27
+ "rstrip": false,
28
+ "single_word": false
29
+ },
30
+ "unk_token": {
31
+ "content": "[UNK]",
32
+ "lstrip": false,
33
+ "normalized": false,
34
+ "rstrip": false,
35
+ "single_word": false
36
+ }
37
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,65 @@
1
+ {
2
+ "added_tokens_decoder": {
3
+ "0": {
4
+ "content": "[PAD]",
5
+ "lstrip": false,
6
+ "normalized": false,
7
+ "rstrip": false,
8
+ "single_word": false,
9
+ "special": true
10
+ },
11
+ "100": {
12
+ "content": "[UNK]",
13
+ "lstrip": false,
14
+ "normalized": false,
15
+ "rstrip": false,
16
+ "single_word": false,
17
+ "special": true
18
+ },
19
+ "101": {
20
+ "content": "[CLS]",
21
+ "lstrip": false,
22
+ "normalized": false,
23
+ "rstrip": false,
24
+ "single_word": false,
25
+ "special": true
26
+ },
27
+ "102": {
28
+ "content": "[SEP]",
29
+ "lstrip": false,
30
+ "normalized": false,
31
+ "rstrip": false,
32
+ "single_word": false,
33
+ "special": true
34
+ },
35
+ "103": {
36
+ "content": "[MASK]",
37
+ "lstrip": false,
38
+ "normalized": false,
39
+ "rstrip": false,
40
+ "single_word": false,
41
+ "special": true
42
+ }
43
+ },
44
+ "clean_up_tokenization_spaces": true,
45
+ "cls_token": "[CLS]",
46
+ "do_basic_tokenize": true,
47
+ "do_lower_case": true,
48
+ "extra_special_tokens": {},
49
+ "mask_token": "[MASK]",
50
+ "max_length": 256,
51
+ "model_max_length": 256,
52
+ "never_split": null,
53
+ "pad_to_multiple_of": null,
54
+ "pad_token": "[PAD]",
55
+ "pad_token_type_id": 0,
56
+ "padding_side": "right",
57
+ "sep_token": "[SEP]",
58
+ "stride": 0,
59
+ "strip_accents": null,
60
+ "tokenize_chinese_chars": true,
61
+ "tokenizer_class": "BertTokenizer",
62
+ "truncation_side": "right",
63
+ "truncation_strategy": "longest_first",
64
+ "unk_token": "[UNK]"
65
+ }
vocab.txt ADDED
The diff for this file is too large to render. See raw diff