Commit 29dda54 (verified), committed by codebyzeb
1 Parent(s): b3b5b2d

Upload folder using huggingface_hub

fw57M_Entropy_thresholdM_500/merges.txt ADDED
@@ -0,0 +1,243 @@
+ #version: 0.2
+ n g
+ e r
+ u r
+ h e
+ n d
+ s s
+ g h
+ g e
+ e n
+ g r
+ u p
+ k e
+ d y
+ d e
+ n t
+ r d
+ d g
+ v e
+ r t
+ a r
+ p l
+ i n
+ l d
+ r g
+ l y
+ r o
+ e d
+ a n
+ m b
+ t e
+ n c
+ l e
+ e l
+ u l
+ u s
+ y m
+ q u
+ g u
+ l l
+ w n
+ r k
+ h er
+ u m
+ i g
+ e s
+ c e
+ e m
+ i b
+ t r
+ o n
+ c l
+ m a
+ s e
+ l i
+ m e
+ t h
+ b l
+ d en
+ g n
+ o w
+ r c
+ a t
+ e t
+ c h
+ d u
+ u d
+ a l
+ u c
+ u g
+ i v
+ b u
+ a nd
+ u t
+ s h
+ l a
+ e c
+ i o
+ a rd
+ r a
+ o m
+ o u
+ o g
+ c t
+ o up
+ v i
+ d ro
+ t i
+ e nt
+ c k
+ pl e
+ u e
+ i ng
+ ma l
+ o r
+ p o
+ r e
+ l t
+ r n
+ s t
+ o d
+ r m
+ a g
+ s k
+ e a
+ t o
+ i l
+ b i
+ th e
+ e y
+ u nd
+ i a
+ h i
+ r i
+ ug h
+ a m
+ c a
+ g et
+ i t
+ i c
+ a k
+ h t
+ r v
+ b ut
+ a y
+ n a
+ a s
+ a d
+ n ce
+ u n
+ n s
+ s u
+ i r
+ t y
+ o v
+ t u
+ o y
+ a p
+ g d
+ i p
+ ro m
+ t ur
+ ec t
+ f f
+ i on
+ i m
+ l o
+ p t
+ u se
+ u i
+ i sh
+ n e
+ d i
+ a v
+ g a
+ n o
+ o p
+ ge r
+ i s
+ d rom
+ i e
+ w e
+ p h
+ d a
+ r r
+ m at
+ m o
+ b e
+ a i
+ i d
+ ll y
+ gr a
+ u re
+ l er
+ a c
+ h a
+ d er
+ d o
+ n k
+ ag e
+ s i
+ m p
+ d l
+ ig h
+ b a
+ o l
+ u a
+ a b
+ the r
+ e ss
+ rd e
+ ve l
+ n i
+ t em
+ r ch
+ l em
+ c i
+ m u
+ b er
+ m en
+ o s
+ m n
+ k o
+ a se
+ ve r
+ o o
+ d r
+ i de
+ r ap
+ a te
+ e st
+ r ty
+ v a
+ c u
+ e e
+ p e
+ l f
+ f t
+ h o
+ t t
+ t a
+ e g
+ i cl
+ l io
+ m i
+ c er
+ y l
+ o k
+ o rk
+ k a
+ f e
+ r y
+ ul t
+ l tur
+ l u
+ r am
+ e the
+ e v
+ n at
+ r s
+ t l
+ gh t
+ t ion
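
The merge list in merges.txt is ordered, and in BPE tokenizers the position of a pair in the list usually acts as its priority. The following is a minimal sketch in plain Python of how such a list can be applied to one word. The names `parse_merges` and `apply_merges` are illustrative and not part of any library, and the sketch skips the byte-level pre-tokenization (including the `Ġ` space prefix) that the real tokenizer performs first. The excerpt keeps six merges from the file in their original relative order.

```python
MERGES_EXCERPT = """#version: 0.2
e r
h e
h er
t h
th e
the r
"""


def parse_merges(text):
    """Parse merges.txt content into an ordered list of (left, right) pairs."""
    pairs = []
    for line in text.splitlines():
        # Skip blank lines and the "#version" header
        # (assumes no merge line itself starts with "#").
        if not line.strip() or line.startswith("#"):
            continue
        left, right = line.split(" ")
        pairs.append((left, right))
    return pairs


def apply_merges(word, merges):
    """Greedily merge adjacent symbols, always taking the earliest-listed pair."""
    ranks = {pair: rank for rank, pair in enumerate(merges)}
    symbols = list(word)
    while len(symbols) > 1:
        candidates = [
            (ranks[pair], index)
            for index, pair in enumerate(zip(symbols, symbols[1:]))
            if pair in ranks
        ]
        if not candidates:
            break
        # Lowest rank wins; among equal ranks the leftmost pair is merged.
        _, index = min(candidates)
        symbols[index:index + 2] = [symbols[index] + symbols[index + 1]]
    return symbols


merges = parse_merges(MERGES_EXCERPT)
print(apply_merges("other", merges))  # ['o', 't', 'her']
print(apply_merges("the", merges))    # ['t', 'he']
```

Because `h e` is listed before `t h`, the word "the" comes out as `t` + `he` in this sketch rather than as the single symbol `the`.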
fw57M_Entropy_thresholdM_500/special_tokens_map.json ADDED
@@ -0,0 +1,5 @@
+ {
+   "bos_token": "<|endoftext|>",
+   "eos_token": "<|endoftext|>",
+   "pad_token": "<|padding|>"
+ }
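
As a rough illustration of what this map is for, the sketch below resolves the EOS and padding tokens to ids and uses them to batch two sequences. The helper `prepare_batch` is hypothetical. Appending EOS and padding on the right are assumptions made for the example, not behaviour confirmed by this commit. The ids 0 and 1 are the ones listed for the two added tokens in tokenizer.json.

```python
import json

SPECIAL_TOKENS_MAP = json.loads(
    '{"bos_token": "<|endoftext|>", "eos_token": "<|endoftext|>", "pad_token": "<|padding|>"}'
)
# Ids of the two added tokens, copied from tokenizer.json in this commit.
TOKEN_IDS = {"<|padding|>": 0, "<|endoftext|>": 1}


def prepare_batch(sequences, special_tokens_map, token_ids):
    """Append the EOS id to each sequence, then right-pad to a common length."""
    eos_id = token_ids[special_tokens_map["eos_token"]]
    pad_id = token_ids[special_tokens_map["pad_token"]]
    with_eos = [list(seq) + [eos_id] for seq in sequences]
    width = max(len(seq) for seq in with_eos)
    input_ids = [seq + [pad_id] * (width - len(seq)) for seq in with_eos]
    # 1 marks real tokens (including EOS), 0 marks padding.
    attention_mask = [[1] * len(seq) + [0] * (width - len(seq)) for seq in with_eos]
    return input_ids, attention_mask


ids, mask = prepare_batch([[365, 349], [329]], SPECIAL_TOKENS_MAP, TOKEN_IDS)
print(ids)   # [[365, 349, 1], [329, 1, 0]]
print(mask)  # [[1, 1, 1], [1, 1, 0]]
```

Note that BOS and EOS share the same token here, so a model using this tokenizer cannot tell the two roles apart by id alone.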
fw57M_Entropy_thresholdM_500/stats.csv ADDED
The diff for this file is too large to render. See raw diff
 
fw57M_Entropy_thresholdM_500/tokenizer.json ADDED
@@ -0,0 +1,1533 @@
1
+ {
2
+ "version": "1.0",
3
+ "truncation": null,
4
+ "padding": null,
5
+ "added_tokens": [
6
+ {
7
+ "id": 0,
8
+ "content": "<|padding|>",
9
+ "single_word": false,
10
+ "lstrip": false,
11
+ "rstrip": false,
12
+ "normalized": false,
13
+ "special": true
14
+ },
15
+ {
16
+ "id": 1,
17
+ "content": "<|endoftext|>",
18
+ "single_word": false,
19
+ "lstrip": false,
20
+ "rstrip": false,
21
+ "normalized": false,
22
+ "special": true
23
+ }
24
+ ],
25
+ "normalizer": {
26
+ "type": "Sequence",
27
+ "normalizers": [
28
+ {
29
+ "type": "NFD"
30
+ }
31
+ ]
32
+ },
33
+ "pre_tokenizer": {
34
+ "type": "ByteLevel",
35
+ "add_prefix_space": true,
36
+ "trim_offsets": true,
37
+ "use_regex": true
38
+ },
39
+ "post_processor": {
40
+ "type": "ByteLevel",
41
+ "add_prefix_space": true,
42
+ "trim_offsets": true,
43
+ "use_regex": true
44
+ },
45
+ "decoder": {
46
+ "type": "ByteLevel",
47
+ "add_prefix_space": true,
48
+ "trim_offsets": true,
49
+ "use_regex": true
50
+ },
51
+ "model": {
52
+ "type": "BPE",
53
+ "dropout": null,
54
+ "unk_token": null,
55
+ "continuing_subword_prefix": null,
56
+ "end_of_word_suffix": null,
57
+ "fuse_unk": false,
58
+ "byte_fallback": false,
59
+ "ignore_merges": false,
60
+ "vocab": {
61
+ "<|padding|>": 0,
62
+ "<|endoftext|>": 1,
63
+ "!": 2,
64
+ "\"": 3,
65
+ "#": 4,
66
+ "$": 5,
67
+ "%": 6,
68
+ "&": 7,
69
+ "'": 8,
70
+ "(": 9,
71
+ ")": 10,
72
+ "*": 11,
73
+ "+": 12,
74
+ ",": 13,
75
+ "-": 14,
76
+ ".": 15,
77
+ "/": 16,
78
+ "0": 17,
79
+ "1": 18,
80
+ "2": 19,
81
+ "3": 20,
82
+ "4": 21,
83
+ "5": 22,
84
+ "6": 23,
85
+ "7": 24,
86
+ "8": 25,
87
+ "9": 26,
88
+ ":": 27,
89
+ ";": 28,
90
+ "<": 29,
91
+ "=": 30,
92
+ ">": 31,
93
+ "?": 32,
94
+ "@": 33,
95
+ "A": 34,
96
+ "B": 35,
97
+ "C": 36,
98
+ "D": 37,
99
+ "E": 38,
100
+ "F": 39,
101
+ "G": 40,
102
+ "H": 41,
103
+ "I": 42,
104
+ "J": 43,
105
+ "K": 44,
106
+ "L": 45,
107
+ "M": 46,
108
+ "N": 47,
109
+ "O": 48,
110
+ "P": 49,
111
+ "Q": 50,
112
+ "R": 51,
113
+ "S": 52,
114
+ "T": 53,
115
+ "U": 54,
116
+ "V": 55,
117
+ "W": 56,
118
+ "X": 57,
119
+ "Y": 58,
120
+ "Z": 59,
121
+ "[": 60,
122
+ "\\": 61,
123
+ "]": 62,
124
+ "^": 63,
125
+ "_": 64,
126
+ "`": 65,
127
+ "a": 66,
128
+ "b": 67,
129
+ "c": 68,
130
+ "d": 69,
131
+ "e": 70,
132
+ "f": 71,
133
+ "g": 72,
134
+ "h": 73,
135
+ "i": 74,
136
+ "j": 75,
137
+ "k": 76,
138
+ "l": 77,
139
+ "m": 78,
140
+ "n": 79,
141
+ "o": 80,
142
+ "p": 81,
143
+ "q": 82,
144
+ "r": 83,
145
+ "s": 84,
146
+ "t": 85,
147
+ "u": 86,
148
+ "v": 87,
149
+ "w": 88,
150
+ "x": 89,
151
+ "y": 90,
152
+ "z": 91,
153
+ "{": 92,
154
+ "|": 93,
155
+ "}": 94,
156
+ "~": 95,
157
+ "¡": 96,
158
+ "¢": 97,
159
+ "£": 98,
160
+ "¤": 99,
161
+ "¥": 100,
162
+ "¦": 101,
163
+ "§": 102,
164
+ "¨": 103,
165
+ "©": 104,
166
+ "ª": 105,
167
+ "«": 106,
168
+ "¬": 107,
169
+ "®": 108,
170
+ "¯": 109,
171
+ "°": 110,
172
+ "±": 111,
173
+ "²": 112,
174
+ "³": 113,
175
+ "´": 114,
176
+ "µ": 115,
177
+ "¶": 116,
178
+ "·": 117,
179
+ "¸": 118,
180
+ "¹": 119,
181
+ "º": 120,
182
+ "»": 121,
183
+ "¼": 122,
184
+ "½": 123,
185
+ "¾": 124,
186
+ "¿": 125,
187
+ "À": 126,
188
+ "Á": 127,
189
+ "Â": 128,
190
+ "Ã": 129,
191
+ "Ä": 130,
192
+ "Å": 131,
193
+ "Æ": 132,
194
+ "Ç": 133,
195
+ "È": 134,
196
+ "É": 135,
197
+ "Ê": 136,
198
+ "Ë": 137,
199
+ "Ì": 138,
200
+ "Í": 139,
201
+ "Î": 140,
202
+ "Ï": 141,
203
+ "Ð": 142,
204
+ "Ñ": 143,
205
+ "Ò": 144,
206
+ "Ó": 145,
207
+ "Ô": 146,
208
+ "Õ": 147,
209
+ "Ö": 148,
210
+ "×": 149,
211
+ "Ø": 150,
212
+ "Ù": 151,
213
+ "Ú": 152,
214
+ "Û": 153,
215
+ "Ü": 154,
216
+ "Ý": 155,
217
+ "Þ": 156,
218
+ "ß": 157,
219
+ "à": 158,
220
+ "á": 159,
221
+ "â": 160,
222
+ "ã": 161,
223
+ "ä": 162,
224
+ "å": 163,
225
+ "æ": 164,
226
+ "ç": 165,
227
+ "è": 166,
228
+ "é": 167,
229
+ "ê": 168,
230
+ "ë": 169,
231
+ "ì": 170,
232
+ "í": 171,
233
+ "î": 172,
234
+ "ï": 173,
235
+ "ð": 174,
236
+ "ñ": 175,
237
+ "ò": 176,
238
+ "ó": 177,
239
+ "ô": 178,
240
+ "õ": 179,
241
+ "ö": 180,
242
+ "÷": 181,
243
+ "ø": 182,
244
+ "ù": 183,
245
+ "ú": 184,
246
+ "û": 185,
247
+ "ü": 186,
248
+ "ý": 187,
249
+ "þ": 188,
250
+ "ÿ": 189,
251
+ "Ā": 190,
252
+ "ā": 191,
253
+ "Ă": 192,
254
+ "ă": 193,
255
+ "Ą": 194,
256
+ "ą": 195,
257
+ "Ć": 196,
258
+ "ć": 197,
259
+ "Ĉ": 198,
260
+ "ĉ": 199,
261
+ "Ċ": 200,
262
+ "ċ": 201,
263
+ "Č": 202,
264
+ "č": 203,
265
+ "Ď": 204,
266
+ "ď": 205,
267
+ "Đ": 206,
268
+ "đ": 207,
269
+ "Ē": 208,
270
+ "ē": 209,
271
+ "Ĕ": 210,
272
+ "ĕ": 211,
273
+ "Ė": 212,
274
+ "ė": 213,
275
+ "Ę": 214,
276
+ "ę": 215,
277
+ "Ě": 216,
278
+ "ě": 217,
279
+ "Ĝ": 218,
280
+ "ĝ": 219,
281
+ "Ğ": 220,
282
+ "ğ": 221,
283
+ "Ġ": 222,
284
+ "ġ": 223,
285
+ "Ģ": 224,
286
+ "ģ": 225,
287
+ "Ĥ": 226,
288
+ "ĥ": 227,
289
+ "Ħ": 228,
290
+ "ħ": 229,
291
+ "Ĩ": 230,
292
+ "ĩ": 231,
293
+ "Ī": 232,
294
+ "ī": 233,
295
+ "Ĭ": 234,
296
+ "ĭ": 235,
297
+ "Į": 236,
298
+ "į": 237,
299
+ "İ": 238,
300
+ "ı": 239,
301
+ "IJ": 240,
302
+ "ij": 241,
303
+ "Ĵ": 242,
304
+ "ĵ": 243,
305
+ "Ķ": 244,
306
+ "ķ": 245,
307
+ "ĸ": 246,
308
+ "Ĺ": 247,
309
+ "ĺ": 248,
310
+ "Ļ": 249,
311
+ "ļ": 250,
312
+ "Ľ": 251,
313
+ "ľ": 252,
314
+ "Ŀ": 253,
315
+ "ŀ": 254,
316
+ "Ł": 255,
317
+ "ł": 256,
318
+ "Ń": 257,
319
+ "ng": 258,
320
+ "er": 259,
321
+ "ur": 260,
322
+ "he": 261,
323
+ "nd": 262,
324
+ "ss": 263,
325
+ "gh": 264,
326
+ "ge": 265,
327
+ "en": 266,
328
+ "gr": 267,
329
+ "up": 268,
330
+ "ke": 269,
331
+ "dy": 270,
332
+ "de": 271,
333
+ "nt": 272,
334
+ "rd": 273,
335
+ "dg": 274,
336
+ "ve": 275,
337
+ "rt": 276,
338
+ "ar": 277,
339
+ "pl": 278,
340
+ "in": 279,
341
+ "ld": 280,
342
+ "rg": 281,
343
+ "ly": 282,
344
+ "ro": 283,
345
+ "ed": 284,
346
+ "an": 285,
347
+ "mb": 286,
348
+ "te": 287,
349
+ "nc": 288,
350
+ "le": 289,
351
+ "el": 290,
352
+ "ul": 291,
353
+ "us": 292,
354
+ "ym": 293,
355
+ "qu": 294,
356
+ "gu": 295,
357
+ "ll": 296,
358
+ "wn": 297,
359
+ "rk": 298,
360
+ "her": 299,
361
+ "um": 300,
362
+ "ig": 301,
363
+ "es": 302,
364
+ "ce": 303,
365
+ "em": 304,
366
+ "ib": 305,
367
+ "tr": 306,
368
+ "on": 307,
369
+ "cl": 308,
370
+ "ma": 309,
371
+ "se": 310,
372
+ "li": 311,
373
+ "me": 312,
374
+ "th": 313,
375
+ "bl": 314,
376
+ "den": 315,
377
+ "gn": 316,
378
+ "ow": 317,
379
+ "rc": 318,
380
+ "at": 319,
381
+ "et": 320,
382
+ "ch": 321,
383
+ "du": 322,
384
+ "ud": 323,
385
+ "al": 324,
386
+ "uc": 325,
387
+ "ug": 326,
388
+ "iv": 327,
389
+ "bu": 328,
390
+ "and": 329,
391
+ "ut": 330,
392
+ "sh": 331,
393
+ "la": 332,
394
+ "ec": 333,
395
+ "io": 334,
396
+ "ard": 335,
397
+ "ra": 336,
398
+ "om": 337,
399
+ "ou": 338,
400
+ "og": 339,
401
+ "ct": 340,
402
+ "oup": 341,
403
+ "vi": 342,
404
+ "dro": 343,
405
+ "ti": 344,
406
+ "ent": 345,
407
+ "ck": 346,
408
+ "ple": 347,
409
+ "ue": 348,
410
+ "ing": 349,
411
+ "mal": 350,
412
+ "or": 351,
413
+ "po": 352,
414
+ "re": 353,
415
+ "lt": 354,
416
+ "rn": 355,
417
+ "st": 356,
418
+ "od": 357,
419
+ "rm": 358,
420
+ "ag": 359,
421
+ "sk": 360,
422
+ "ea": 361,
423
+ "to": 362,
424
+ "il": 363,
425
+ "bi": 364,
426
+ "the": 365,
427
+ "ey": 366,
428
+ "und": 367,
429
+ "ia": 368,
430
+ "hi": 369,
431
+ "ri": 370,
432
+ "ugh": 371,
433
+ "am": 372,
434
+ "ca": 373,
435
+ "get": 374,
436
+ "it": 375,
437
+ "ic": 376,
438
+ "ak": 377,
439
+ "ht": 378,
440
+ "rv": 379,
441
+ "but": 380,
442
+ "ay": 381,
443
+ "na": 382,
444
+ "as": 383,
445
+ "ad": 384,
446
+ "nce": 385,
447
+ "un": 386,
448
+ "ns": 387,
449
+ "su": 388,
450
+ "ir": 389,
451
+ "ty": 390,
452
+ "ov": 391,
453
+ "tu": 392,
454
+ "oy": 393,
455
+ "ap": 394,
456
+ "gd": 395,
457
+ "ip": 396,
458
+ "rom": 397,
459
+ "tur": 398,
460
+ "ect": 399,
461
+ "ff": 400,
462
+ "ion": 401,
463
+ "im": 402,
464
+ "lo": 403,
465
+ "pt": 404,
466
+ "use": 405,
467
+ "ui": 406,
468
+ "ish": 407,
469
+ "ne": 408,
470
+ "di": 409,
471
+ "av": 410,
472
+ "ga": 411,
473
+ "no": 412,
474
+ "op": 413,
475
+ "ger": 414,
476
+ "is": 415,
477
+ "drom": 416,
478
+ "ie": 417,
479
+ "we": 418,
480
+ "ph": 419,
481
+ "da": 420,
482
+ "rr": 421,
483
+ "mat": 422,
484
+ "mo": 423,
485
+ "be": 424,
486
+ "ai": 425,
487
+ "id": 426,
488
+ "lly": 427,
489
+ "gra": 428,
490
+ "ure": 429,
491
+ "ler": 430,
492
+ "ac": 431,
493
+ "ha": 432,
494
+ "der": 433,
495
+ "do": 434,
496
+ "nk": 435,
497
+ "age": 436,
498
+ "si": 437,
499
+ "mp": 438,
500
+ "dl": 439,
501
+ "igh": 440,
502
+ "ba": 441,
503
+ "ol": 442,
504
+ "ua": 443,
505
+ "ab": 444,
506
+ "ther": 445,
507
+ "ess": 446,
508
+ "rde": 447,
509
+ "vel": 448,
510
+ "ni": 449,
511
+ "tem": 450,
512
+ "rch": 451,
513
+ "lem": 452,
514
+ "ci": 453,
515
+ "mu": 454,
516
+ "ber": 455,
517
+ "men": 456,
518
+ "os": 457,
519
+ "mn": 458,
520
+ "ko": 459,
521
+ "ase": 460,
522
+ "ver": 461,
523
+ "oo": 462,
524
+ "dr": 463,
525
+ "ide": 464,
526
+ "rap": 465,
527
+ "ate": 466,
528
+ "est": 467,
529
+ "rty": 468,
530
+ "va": 469,
531
+ "cu": 470,
532
+ "ee": 471,
533
+ "pe": 472,
534
+ "lf": 473,
535
+ "ft": 474,
536
+ "ho": 475,
537
+ "tt": 476,
538
+ "ta": 477,
539
+ "eg": 478,
540
+ "icl": 479,
541
+ "lio": 480,
542
+ "mi": 481,
543
+ "cer": 482,
544
+ "yl": 483,
545
+ "ok": 484,
546
+ "ork": 485,
547
+ "ka": 486,
548
+ "fe": 487,
549
+ "ry": 488,
550
+ "ult": 489,
551
+ "ltur": 490,
552
+ "lu": 491,
553
+ "ram": 492,
554
+ "ethe": 493,
555
+ "ev": 494,
556
+ "nat": 495,
557
+ "rs": 496,
558
+ "tl": 497,
559
+ "ght": 498,
560
+ "tion": 499
561
+ },
562
+ "merges": [
563
+ [
564
+ "n",
565
+ "g"
566
+ ],
567
+ [
568
+ "e",
569
+ "r"
570
+ ],
571
+ [
572
+ "u",
573
+ "r"
574
+ ],
575
+ [
576
+ "h",
577
+ "e"
578
+ ],
579
+ [
580
+ "n",
581
+ "d"
582
+ ],
583
+ [
584
+ "s",
585
+ "s"
586
+ ],
587
+ [
588
+ "g",
589
+ "h"
590
+ ],
591
+ [
592
+ "g",
593
+ "e"
594
+ ],
595
+ [
596
+ "e",
597
+ "n"
598
+ ],
599
+ [
600
+ "g",
601
+ "r"
602
+ ],
603
+ [
604
+ "u",
605
+ "p"
606
+ ],
607
+ [
608
+ "k",
609
+ "e"
610
+ ],
611
+ [
612
+ "d",
613
+ "y"
614
+ ],
615
+ [
616
+ "d",
617
+ "e"
618
+ ],
619
+ [
620
+ "n",
621
+ "t"
622
+ ],
623
+ [
624
+ "r",
625
+ "d"
626
+ ],
627
+ [
628
+ "d",
629
+ "g"
630
+ ],
631
+ [
632
+ "v",
633
+ "e"
634
+ ],
635
+ [
636
+ "r",
637
+ "t"
638
+ ],
639
+ [
640
+ "a",
641
+ "r"
642
+ ],
643
+ [
644
+ "p",
645
+ "l"
646
+ ],
647
+ [
648
+ "i",
649
+ "n"
650
+ ],
651
+ [
652
+ "l",
653
+ "d"
654
+ ],
655
+ [
656
+ "r",
657
+ "g"
658
+ ],
659
+ [
660
+ "l",
661
+ "y"
662
+ ],
663
+ [
664
+ "r",
665
+ "o"
666
+ ],
667
+ [
668
+ "e",
669
+ "d"
670
+ ],
671
+ [
672
+ "a",
673
+ "n"
674
+ ],
675
+ [
676
+ "m",
677
+ "b"
678
+ ],
679
+ [
680
+ "t",
681
+ "e"
682
+ ],
683
+ [
684
+ "n",
685
+ "c"
686
+ ],
687
+ [
688
+ "l",
689
+ "e"
690
+ ],
691
+ [
692
+ "e",
693
+ "l"
694
+ ],
695
+ [
696
+ "u",
697
+ "l"
698
+ ],
699
+ [
700
+ "u",
701
+ "s"
702
+ ],
703
+ [
704
+ "y",
705
+ "m"
706
+ ],
707
+ [
708
+ "q",
709
+ "u"
710
+ ],
711
+ [
712
+ "g",
713
+ "u"
714
+ ],
715
+ [
716
+ "l",
717
+ "l"
718
+ ],
719
+ [
720
+ "w",
721
+ "n"
722
+ ],
723
+ [
724
+ "r",
725
+ "k"
726
+ ],
727
+ [
728
+ "h",
729
+ "er"
730
+ ],
731
+ [
732
+ "u",
733
+ "m"
734
+ ],
735
+ [
736
+ "i",
737
+ "g"
738
+ ],
739
+ [
740
+ "e",
741
+ "s"
742
+ ],
743
+ [
744
+ "c",
745
+ "e"
746
+ ],
747
+ [
748
+ "e",
749
+ "m"
750
+ ],
751
+ [
752
+ "i",
753
+ "b"
754
+ ],
755
+ [
756
+ "t",
757
+ "r"
758
+ ],
759
+ [
760
+ "o",
761
+ "n"
762
+ ],
763
+ [
764
+ "c",
765
+ "l"
766
+ ],
767
+ [
768
+ "m",
769
+ "a"
770
+ ],
771
+ [
772
+ "s",
773
+ "e"
774
+ ],
775
+ [
776
+ "l",
777
+ "i"
778
+ ],
779
+ [
780
+ "m",
781
+ "e"
782
+ ],
783
+ [
784
+ "t",
785
+ "h"
786
+ ],
787
+ [
788
+ "b",
789
+ "l"
790
+ ],
791
+ [
792
+ "d",
793
+ "en"
794
+ ],
795
+ [
796
+ "g",
797
+ "n"
798
+ ],
799
+ [
800
+ "o",
801
+ "w"
802
+ ],
803
+ [
804
+ "r",
805
+ "c"
806
+ ],
807
+ [
808
+ "a",
809
+ "t"
810
+ ],
811
+ [
812
+ "e",
813
+ "t"
814
+ ],
815
+ [
816
+ "c",
817
+ "h"
818
+ ],
819
+ [
820
+ "d",
821
+ "u"
822
+ ],
823
+ [
824
+ "u",
825
+ "d"
826
+ ],
827
+ [
828
+ "a",
829
+ "l"
830
+ ],
831
+ [
832
+ "u",
833
+ "c"
834
+ ],
835
+ [
836
+ "u",
837
+ "g"
838
+ ],
839
+ [
840
+ "i",
841
+ "v"
842
+ ],
843
+ [
844
+ "b",
845
+ "u"
846
+ ],
847
+ [
848
+ "a",
849
+ "nd"
850
+ ],
851
+ [
852
+ "u",
853
+ "t"
854
+ ],
855
+ [
856
+ "s",
857
+ "h"
858
+ ],
859
+ [
860
+ "l",
861
+ "a"
862
+ ],
863
+ [
864
+ "e",
865
+ "c"
866
+ ],
867
+ [
868
+ "i",
869
+ "o"
870
+ ],
871
+ [
872
+ "a",
873
+ "rd"
874
+ ],
875
+ [
876
+ "r",
877
+ "a"
878
+ ],
879
+ [
880
+ "o",
881
+ "m"
882
+ ],
883
+ [
884
+ "o",
885
+ "u"
886
+ ],
887
+ [
888
+ "o",
889
+ "g"
890
+ ],
891
+ [
892
+ "c",
893
+ "t"
894
+ ],
895
+ [
896
+ "o",
897
+ "up"
898
+ ],
899
+ [
900
+ "v",
901
+ "i"
902
+ ],
903
+ [
904
+ "d",
905
+ "ro"
906
+ ],
907
+ [
908
+ "t",
909
+ "i"
910
+ ],
911
+ [
912
+ "e",
913
+ "nt"
914
+ ],
915
+ [
916
+ "c",
917
+ "k"
918
+ ],
919
+ [
920
+ "pl",
921
+ "e"
922
+ ],
923
+ [
924
+ "u",
925
+ "e"
926
+ ],
927
+ [
928
+ "i",
929
+ "ng"
930
+ ],
931
+ [
932
+ "ma",
933
+ "l"
934
+ ],
935
+ [
936
+ "o",
937
+ "r"
938
+ ],
939
+ [
940
+ "p",
941
+ "o"
942
+ ],
943
+ [
944
+ "r",
945
+ "e"
946
+ ],
947
+ [
948
+ "l",
949
+ "t"
950
+ ],
951
+ [
952
+ "r",
953
+ "n"
954
+ ],
955
+ [
956
+ "s",
957
+ "t"
958
+ ],
959
+ [
960
+ "o",
961
+ "d"
962
+ ],
963
+ [
964
+ "r",
965
+ "m"
966
+ ],
967
+ [
968
+ "a",
969
+ "g"
970
+ ],
971
+ [
972
+ "s",
973
+ "k"
974
+ ],
975
+ [
976
+ "e",
977
+ "a"
978
+ ],
979
+ [
980
+ "t",
981
+ "o"
982
+ ],
983
+ [
984
+ "i",
985
+ "l"
986
+ ],
987
+ [
988
+ "b",
989
+ "i"
990
+ ],
991
+ [
992
+ "th",
993
+ "e"
994
+ ],
995
+ [
996
+ "e",
997
+ "y"
998
+ ],
999
+ [
1000
+ "u",
1001
+ "nd"
1002
+ ],
1003
+ [
1004
+ "i",
1005
+ "a"
1006
+ ],
1007
+ [
1008
+ "h",
1009
+ "i"
1010
+ ],
1011
+ [
1012
+ "r",
1013
+ "i"
1014
+ ],
1015
+ [
1016
+ "ug",
1017
+ "h"
1018
+ ],
1019
+ [
1020
+ "a",
1021
+ "m"
1022
+ ],
1023
+ [
1024
+ "c",
1025
+ "a"
1026
+ ],
1027
+ [
1028
+ "g",
1029
+ "et"
1030
+ ],
1031
+ [
1032
+ "i",
1033
+ "t"
1034
+ ],
1035
+ [
1036
+ "i",
1037
+ "c"
1038
+ ],
1039
+ [
1040
+ "a",
1041
+ "k"
1042
+ ],
1043
+ [
1044
+ "h",
1045
+ "t"
1046
+ ],
1047
+ [
1048
+ "r",
1049
+ "v"
1050
+ ],
1051
+ [
1052
+ "b",
1053
+ "ut"
1054
+ ],
1055
+ [
1056
+ "a",
1057
+ "y"
1058
+ ],
1059
+ [
1060
+ "n",
1061
+ "a"
1062
+ ],
1063
+ [
1064
+ "a",
1065
+ "s"
1066
+ ],
1067
+ [
1068
+ "a",
1069
+ "d"
1070
+ ],
1071
+ [
1072
+ "n",
1073
+ "ce"
1074
+ ],
1075
+ [
1076
+ "u",
1077
+ "n"
1078
+ ],
1079
+ [
1080
+ "n",
1081
+ "s"
1082
+ ],
1083
+ [
1084
+ "s",
1085
+ "u"
1086
+ ],
1087
+ [
1088
+ "i",
1089
+ "r"
1090
+ ],
1091
+ [
1092
+ "t",
1093
+ "y"
1094
+ ],
1095
+ [
1096
+ "o",
1097
+ "v"
1098
+ ],
1099
+ [
1100
+ "t",
1101
+ "u"
1102
+ ],
1103
+ [
1104
+ "o",
1105
+ "y"
1106
+ ],
1107
+ [
1108
+ "a",
1109
+ "p"
1110
+ ],
1111
+ [
1112
+ "g",
1113
+ "d"
1114
+ ],
1115
+ [
1116
+ "i",
1117
+ "p"
1118
+ ],
1119
+ [
1120
+ "ro",
1121
+ "m"
1122
+ ],
1123
+ [
1124
+ "t",
1125
+ "ur"
1126
+ ],
1127
+ [
1128
+ "ec",
1129
+ "t"
1130
+ ],
1131
+ [
1132
+ "f",
1133
+ "f"
1134
+ ],
1135
+ [
1136
+ "i",
1137
+ "on"
1138
+ ],
1139
+ [
1140
+ "i",
1141
+ "m"
1142
+ ],
1143
+ [
1144
+ "l",
1145
+ "o"
1146
+ ],
1147
+ [
1148
+ "p",
1149
+ "t"
1150
+ ],
1151
+ [
1152
+ "u",
1153
+ "se"
1154
+ ],
1155
+ [
1156
+ "u",
1157
+ "i"
1158
+ ],
1159
+ [
1160
+ "i",
1161
+ "sh"
1162
+ ],
1163
+ [
1164
+ "n",
1165
+ "e"
1166
+ ],
1167
+ [
1168
+ "d",
1169
+ "i"
1170
+ ],
1171
+ [
1172
+ "a",
1173
+ "v"
1174
+ ],
1175
+ [
1176
+ "g",
1177
+ "a"
1178
+ ],
1179
+ [
1180
+ "n",
1181
+ "o"
1182
+ ],
1183
+ [
1184
+ "o",
1185
+ "p"
1186
+ ],
1187
+ [
1188
+ "ge",
1189
+ "r"
1190
+ ],
1191
+ [
1192
+ "i",
1193
+ "s"
1194
+ ],
1195
+ [
1196
+ "d",
1197
+ "rom"
1198
+ ],
1199
+ [
1200
+ "i",
1201
+ "e"
1202
+ ],
1203
+ [
1204
+ "w",
1205
+ "e"
1206
+ ],
1207
+ [
1208
+ "p",
1209
+ "h"
1210
+ ],
1211
+ [
1212
+ "d",
1213
+ "a"
1214
+ ],
1215
+ [
1216
+ "r",
1217
+ "r"
1218
+ ],
1219
+ [
1220
+ "m",
1221
+ "at"
1222
+ ],
1223
+ [
1224
+ "m",
1225
+ "o"
1226
+ ],
1227
+ [
1228
+ "b",
1229
+ "e"
1230
+ ],
1231
+ [
1232
+ "a",
1233
+ "i"
1234
+ ],
1235
+ [
1236
+ "i",
1237
+ "d"
1238
+ ],
1239
+ [
1240
+ "ll",
1241
+ "y"
1242
+ ],
1243
+ [
1244
+ "gr",
1245
+ "a"
1246
+ ],
1247
+ [
1248
+ "u",
1249
+ "re"
1250
+ ],
1251
+ [
1252
+ "l",
1253
+ "er"
1254
+ ],
1255
+ [
1256
+ "a",
1257
+ "c"
1258
+ ],
1259
+ [
1260
+ "h",
1261
+ "a"
1262
+ ],
1263
+ [
1264
+ "d",
1265
+ "er"
1266
+ ],
1267
+ [
1268
+ "d",
1269
+ "o"
1270
+ ],
1271
+ [
1272
+ "n",
1273
+ "k"
1274
+ ],
1275
+ [
1276
+ "ag",
1277
+ "e"
1278
+ ],
1279
+ [
1280
+ "s",
1281
+ "i"
1282
+ ],
1283
+ [
1284
+ "m",
1285
+ "p"
1286
+ ],
1287
+ [
1288
+ "d",
1289
+ "l"
1290
+ ],
1291
+ [
1292
+ "ig",
1293
+ "h"
1294
+ ],
1295
+ [
1296
+ "b",
1297
+ "a"
1298
+ ],
1299
+ [
1300
+ "o",
1301
+ "l"
1302
+ ],
1303
+ [
1304
+ "u",
1305
+ "a"
1306
+ ],
1307
+ [
1308
+ "a",
1309
+ "b"
1310
+ ],
1311
+ [
1312
+ "the",
1313
+ "r"
1314
+ ],
1315
+ [
1316
+ "e",
1317
+ "ss"
1318
+ ],
1319
+ [
1320
+ "rd",
1321
+ "e"
1322
+ ],
1323
+ [
1324
+ "ve",
1325
+ "l"
1326
+ ],
1327
+ [
1328
+ "n",
1329
+ "i"
1330
+ ],
1331
+ [
1332
+ "t",
1333
+ "em"
1334
+ ],
1335
+ [
1336
+ "r",
1337
+ "ch"
1338
+ ],
1339
+ [
1340
+ "l",
1341
+ "em"
1342
+ ],
1343
+ [
1344
+ "c",
1345
+ "i"
1346
+ ],
1347
+ [
1348
+ "m",
1349
+ "u"
1350
+ ],
1351
+ [
1352
+ "b",
1353
+ "er"
1354
+ ],
1355
+ [
1356
+ "m",
1357
+ "en"
1358
+ ],
1359
+ [
1360
+ "o",
1361
+ "s"
1362
+ ],
1363
+ [
1364
+ "m",
1365
+ "n"
1366
+ ],
1367
+ [
1368
+ "k",
1369
+ "o"
1370
+ ],
1371
+ [
1372
+ "a",
1373
+ "se"
1374
+ ],
1375
+ [
1376
+ "ve",
1377
+ "r"
1378
+ ],
1379
+ [
1380
+ "o",
1381
+ "o"
1382
+ ],
1383
+ [
1384
+ "d",
1385
+ "r"
1386
+ ],
1387
+ [
1388
+ "i",
1389
+ "de"
1390
+ ],
1391
+ [
1392
+ "r",
1393
+ "ap"
1394
+ ],
1395
+ [
1396
+ "a",
1397
+ "te"
1398
+ ],
1399
+ [
1400
+ "e",
1401
+ "st"
1402
+ ],
1403
+ [
1404
+ "r",
1405
+ "ty"
1406
+ ],
1407
+ [
1408
+ "v",
1409
+ "a"
1410
+ ],
1411
+ [
1412
+ "c",
1413
+ "u"
1414
+ ],
1415
+ [
1416
+ "e",
1417
+ "e"
1418
+ ],
1419
+ [
1420
+ "p",
1421
+ "e"
1422
+ ],
1423
+ [
1424
+ "l",
1425
+ "f"
1426
+ ],
1427
+ [
1428
+ "f",
1429
+ "t"
1430
+ ],
1431
+ [
1432
+ "h",
1433
+ "o"
1434
+ ],
1435
+ [
1436
+ "t",
1437
+ "t"
1438
+ ],
1439
+ [
1440
+ "t",
1441
+ "a"
1442
+ ],
1443
+ [
1444
+ "e",
1445
+ "g"
1446
+ ],
1447
+ [
1448
+ "i",
1449
+ "cl"
1450
+ ],
1451
+ [
1452
+ "l",
1453
+ "io"
1454
+ ],
1455
+ [
1456
+ "m",
1457
+ "i"
1458
+ ],
1459
+ [
1460
+ "c",
1461
+ "er"
1462
+ ],
1463
+ [
1464
+ "y",
1465
+ "l"
1466
+ ],
1467
+ [
1468
+ "o",
1469
+ "k"
1470
+ ],
1471
+ [
1472
+ "o",
1473
+ "rk"
1474
+ ],
1475
+ [
1476
+ "k",
1477
+ "a"
1478
+ ],
1479
+ [
1480
+ "f",
1481
+ "e"
1482
+ ],
1483
+ [
1484
+ "r",
1485
+ "y"
1486
+ ],
1487
+ [
1488
+ "ul",
1489
+ "t"
1490
+ ],
1491
+ [
1492
+ "l",
1493
+ "tur"
1494
+ ],
1495
+ [
1496
+ "l",
1497
+ "u"
1498
+ ],
1499
+ [
1500
+ "r",
1501
+ "am"
1502
+ ],
1503
+ [
1504
+ "e",
1505
+ "the"
1506
+ ],
1507
+ [
1508
+ "e",
1509
+ "v"
1510
+ ],
1511
+ [
1512
+ "n",
1513
+ "at"
1514
+ ],
1515
+ [
1516
+ "r",
1517
+ "s"
1518
+ ],
1519
+ [
1520
+ "t",
1521
+ "l"
1522
+ ],
1523
+ [
1524
+ "gh",
1525
+ "t"
1526
+ ],
1527
+ [
1528
+ "t",
1529
+ "ion"
1530
+ ]
1531
+ ]
1532
+ }
1533
+ }
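
The id layout in this tokenizer.json appears to follow a simple pattern: two special tokens (ids 0 and 1), then 256 byte-level symbols (ids 2 to 257), then one token per merge in list order (ids 258 to 499). The sketch below rebuilds that layout from its parts so it can be compared against the file. `byte_level_symbols` and `build_vocab` are illustrative names, and the symbol ordering is the GPT-2 style byte-to-character mapping, which is assumed here because it reproduces the ids shown in the diff.

```python
def byte_level_symbols():
    """Return 256 byte-level symbols in an order that matches the ids in the diff."""
    # Printable bytes keep their own code point ...
    kept = list(range(0x21, 0x7F)) + list(range(0xA1, 0xAD)) + list(range(0xAE, 0x100))
    # ... and every other byte is shifted to a code point starting at 256.
    remapped = [b for b in range(256) if b not in kept]
    return [chr(b) for b in kept] + [chr(256 + n) for n in range(len(remapped))]


def build_vocab(special_tokens, merges):
    """Assign ids in the order: special tokens, byte-level symbols, merged tokens."""
    tokens = list(special_tokens) + byte_level_symbols()
    tokens += [left + right for left, right in merges]
    return {token: index for index, token in enumerate(tokens)}


# Only the first three merges are used here; the file lists 242.
vocab = build_vocab(
    ["<|padding|>", "<|endoftext|>"],
    [("n", "g"), ("e", "r"), ("u", "r")],
)
print(vocab["!"], vocab["a"], vocab["\u0120"], vocab["\u0143"])  # 2 66 222 257
print(vocab["ng"], vocab["er"], vocab["ur"])                      # 258 259 260
```

With all 242 merges the same construction gives 2 + 256 + 242 = 500 entries, which matches the `_500` suffix in the folder name and the last id, 499, assigned to `tion`.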
fw57M_Entropy_thresholdM_500/tokenizer_config.json ADDED
@@ -0,0 +1,29 @@
+ {
+   "add_prefix_space": true,
+   "added_tokens_decoder": {
+     "0": {
+       "content": "<|padding|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "1": {
+       "content": "<|endoftext|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "bos_token": "<|endoftext|>",
+   "clean_up_tokenization_spaces": false,
+   "eos_token": "<|endoftext|>",
+   "extra_special_tokens": {},
+   "model_max_length": 1000000000000000019884624838656,
+   "pad_token": "<|padding|>",
+   "tokenizer_class": "PreTrainedTokenizer",
+   "unk_token": null
+ }
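
The `model_max_length` value above equals `int(1e30)`, which is commonly used as a placeholder when no real limit was configured. A small sketch of how downstream code might treat it follows. The helper `effective_max_length` is hypothetical, and reading the value as "no limit" is an assumption based on that convention, not something this commit states.

```python
import json

CONFIG_EXCERPT = json.loads(
    '{"model_max_length": 1000000000000000019884624838656, "pad_token": "<|padding|>"}'
)

# Assumption: values at or above int(1e30) mean "no limit configured".
NO_LIMIT_SENTINEL = int(1e30)


def effective_max_length(config):
    """Return the configured maximum length, or None when no real limit is set."""
    value = config.get("model_max_length")
    if value is None or value >= NO_LIMIT_SENTINEL:
        return None
    return value


print(CONFIG_EXCERPT["model_max_length"] == NO_LIMIT_SENTINEL)  # True
print(effective_max_length(CONFIG_EXCERPT))                      # None
print(effective_max_length({"model_max_length": 512}))           # 512
```

Code that truncates or pads to `model_max_length` would need an explicit length in this case, since the placeholder is far larger than any usable sequence length.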
fw57M_Entropy_thresholdM_500/vocab.json ADDED
@@ -0,0 +1 @@
+ {"g": 72, "\u012b": 233, "c": 68, "\u00e3": 161, "O": 48, "\u00bd": 123, "\u00a4": 99, "\u0133": 241, "\u011f": 221, "!": 2, "0": 17, "\u00c9": 135, "\u010a": 200, "\u0131": 239, "`": 65, "\u00a5": 100, "C": 36, "\u00c1": 127, "j": 75, "#": 4, "\u00c4": 130, "\u0130": 238, "\u0120": 222, "\u00b4": 114, "\u00d5": 147, "\u00f4": 178, "\u00dc": 154, "\u00e1": 159, "\u00a2": 97, "H": 41, "d": 69, "%": 6, "Y": 58, "\u0107": 197, "\u0119": 215, "\u011c": 218, "\u00db": 153, "\u0129": 231, "\u013e": 252, "\u00b5": 115, "\u00d6": 148, "\u00f2": 176, "\u00fd": 187, "\u00fe": 188, "\u011d": 219, "\u00b3": 113, "\u0142": 256, "5": 22, "T": 53, "\u00d9": 151, "\u0113": 209, "\u00a1": 96, "\u0135": 243, "a": 66, "<|padding|>": 0, "8": 25, "|": 93, "S": 52, "\u00ba": 120, "W": 56, "\u0139": 247, "\u0140": 254, "X": 57, "E": 38, "\u00dd": 155, "\u00ab": 106, "\u00d8": 150, "\u0100": 190, ":": 27, "K": 44, "\u00a3": 98, "f": 71, "}": 94, "U": 54, "\u0105": 195, "\u00aa": 105, "h": 73, "\u013a": 248, "\u00be": 124, "\u00c5": 131, "P": 49, "s": 84, "\u0125": 227, "9": 26, "{": 92, "*": 11, "-": 14, "N": 47, "$": 5, "\u00ce": 140, "A": 34, "\u00f7": 181, "\u00fc": 186, "<": 29, "\u0132": 240, "V": 55, "(": 9, "L": 45, "l": 77, "\u00d1": 143, "r": 83, "\u00ef": 173, "G": 40, "\u00d7": 149, "J": 43, "\u0137": 245, "@": 33, "R": 51, "\u00f9": 183, "B": 35, "\u00bc": 122, "7": 24, "\u00ac": 107, "\u00e9": 167, "\u00b0": 110, "n": 79, "u": 86, "~": 95, "\u00f5": 179, "\u011a": 216, "\u011b": 217, "\u0134": 242, "\u0114": 210, "\u00a7": 102, "\u00f3": 177, "\u00eb": 169, "\u012f": 237, "y": 90, "\u00b8": 118, "\u00f6": 180, "\u00d4": 146, "\u00e4": 162, "\u00ee": 172, "\u00b1": 111, "1": 18, ".": 15, "\u00cd": 139, "\u013c": 250, "D": 37, "\u012d": 235, "m": 78, "\u00e7": 165, "\u013f": 253, "\u0141": 255, "\u00e2": 160, "+": 12, "<|endoftext|>": 1, "\u0118": 214, "\u00c0": 126, "\u00cb": 137, "\u00d2": 144, ";": 28, "z": 91, "F": 39, "\u00d0": 142, "\u0143": 257, "'": 8, "[": 60, 
"\u00df": 157, "\u00fb": 185, "\u00ed": 171, "\u0109": 199, "i": 74, "\u0110": 206, "\u0103": 193, "\u012e": 236, "/": 16, "\u00ff": 189, "\u00fa": 184, "\u0111": 207, "\u013b": 249, "\u013d": 251, "\u00b9": 119, "\u00da": 152, "\\": 61, "q": 82, "\u00f0": 174, "\u0115": 211, "\u00a6": 101, "\u0121": 223, "\u0112": 208, ",": 13, ">": 31, "v": 87, "\u0123": 225, "&": 7, "\u00cc": 138, "\u00c2": 128, "k": 76, "e": 70, "\u010c": 202, "\u012a": 232, "p": 81, "\u00e8": 166, "^": 63, "\u00b6": 116, "I": 42, "\u0126": 228, "Z": 59, "M": 46, "\u00b2": 112, "\u011e": 220, "\u00c3": 129, "\u00ea": 168, "\u00ec": 170, "\u012c": 234, "x": 89, "\u00bb": 121, "\u0102": 192, "w": 88, "\u00b7": 117, "_": 64, "\u0136": 244, "\u00c6": 132, "\u0122": 224, "o": 80, "\u00c8": 134, "\"": 3, "\u0104": 194, "\u0138": 246, "2": 19, "\u00c7": 133, "\u00ca": 136, "\u00f1": 175, ")": 10, "\u0106": 196, "]": 62, "\u00f8": 182, "t": 85, "\u0116": 212, "\u0117": 213, "\u010f": 205, "\u00bf": 125, "\u0101": 191, "=": 30, "\u00ae": 108, "\u0128": 230, "4": 21, "6": 23, "?": 32, "\u00cf": 141, "\u00a8": 103, "b": 67, "\u0108": 198, "\u00e0": 158, "\u00af": 109, "\u0124": 226, "3": 20, "\u00e6": 164, "\u0127": 229, "\u010e": 204, "\u00de": 156, "\u010b": 201, "\u00a9": 104, "Q": 50, "\u00e5": 163, "\u00d3": 145, "\u010d": 203, "ng": 258, "er": 259, "ur": 260, "he": 261, "nd": 262, "ss": 263, "gh": 264, "ge": 265, "en": 266, "gr": 267, "up": 268, "ke": 269, "dy": 270, "de": 271, "nt": 272, "rd": 273, "dg": 274, "ve": 275, "rt": 276, "ar": 277, "pl": 278, "in": 279, "ld": 280, "rg": 281, "ly": 282, "ro": 283, "ed": 284, "an": 285, "mb": 286, "te": 287, "nc": 288, "le": 289, "el": 290, "ul": 291, "us": 292, "ym": 293, "qu": 294, "gu": 295, "ll": 296, "wn": 297, "rk": 298, "her": 299, "um": 300, "ig": 301, "es": 302, "ce": 303, "em": 304, "ib": 305, "tr": 306, "on": 307, "cl": 308, "ma": 309, "se": 310, "li": 311, "me": 312, "th": 313, "bl": 314, "den": 315, "gn": 316, "ow": 317, "rc": 318, "at": 319, 
"et": 320, "ch": 321, "du": 322, "ud": 323, "al": 324, "uc": 325, "ug": 326, "iv": 327, "bu": 328, "and": 329, "ut": 330, "sh": 331, "la": 332, "ec": 333, "io": 334, "ard": 335, "ra": 336, "om": 337, "ou": 338, "og": 339, "ct": 340, "oup": 341, "vi": 342, "dro": 343, "ti": 344, "ent": 345, "ck": 346, "ple": 347, "ue": 348, "ing": 349, "mal": 350, "or": 351, "po": 352, "re": 353, "lt": 354, "rn": 355, "st": 356, "od": 357, "rm": 358, "ag": 359, "sk": 360, "ea": 361, "to": 362, "il": 363, "bi": 364, "the": 365, "ey": 366, "und": 367, "ia": 368, "hi": 369, "ri": 370, "ugh": 371, "am": 372, "ca": 373, "get": 374, "it": 375, "ic": 376, "ak": 377, "ht": 378, "rv": 379, "but": 380, "ay": 381, "na": 382, "as": 383, "ad": 384, "nce": 385, "un": 386, "ns": 387, "su": 388, "ir": 389, "ty": 390, "ov": 391, "tu": 392, "oy": 393, "ap": 394, "gd": 395, "ip": 396, "rom": 397, "tur": 398, "ect": 399, "ff": 400, "ion": 401, "im": 402, "lo": 403, "pt": 404, "use": 405, "ui": 406, "ish": 407, "ne": 408, "di": 409, "av": 410, "ga": 411, "no": 412, "op": 413, "ger": 414, "is": 415, "drom": 416, "ie": 417, "we": 418, "ph": 419, "da": 420, "rr": 421, "mat": 422, "mo": 423, "be": 424, "ai": 425, "id": 426, "lly": 427, "gra": 428, "ure": 429, "ler": 430, "ac": 431, "ha": 432, "der": 433, "do": 434, "nk": 435, "age": 436, "si": 437, "mp": 438, "dl": 439, "igh": 440, "ba": 441, "ol": 442, "ua": 443, "ab": 444, "ther": 445, "ess": 446, "rde": 447, "vel": 448, "ni": 449, "tem": 450, "rch": 451, "lem": 452, "ci": 453, "mu": 454, "ber": 455, "men": 456, "os": 457, "mn": 458, "ko": 459, "ase": 460, "ver": 461, "oo": 462, "dr": 463, "ide": 464, "rap": 465, "ate": 466, "est": 467, "rty": 468, "va": 469, "cu": 470, "ee": 471, "pe": 472, "lf": 473, "ft": 474, "ho": 475, "tt": 476, "ta": 477, "eg": 478, "icl": 479, "lio": 480, "mi": 481, "cer": 482, "yl": 483, "ok": 484, "ork": 485, "ka": 486, "fe": 487, "ry": 488, "ult": 489, "ltur": 490, "lu": 491, "ram": 492, "ethe": 493, "ev": 494, "nat": 495, 
"rs": 496, "tl": 497, "ght": 498, "tion": 499}