Upload tokenizer

Files changed (4)

README.md +199 -0
special_tokens_map.json +9 -0
tokenizer.json +266 -0
tokenizer_config.json +48 -0

README.md ADDED

	@@ -0,0 +1,199 @@

+---
+library_name: transformers
+tags: []
+---
+# Model Card for Model ID
+<!-- Provide a quick summary of what the model is/does. -->
+## Model Details
+### Model Description
+<!-- Provide a longer summary of what this model is. -->
+This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
+- **Developed by:** [More Information Needed]
+- **Funded by [optional]:** [More Information Needed]
+- **Shared by [optional]:** [More Information Needed]
+- **Model type:** [More Information Needed]
+- **Language(s) (NLP):** [More Information Needed]
+- **License:** [More Information Needed]
+- **Finetuned from model [optional]:** [More Information Needed]
+### Model Sources [optional]
+<!-- Provide the basic links for the model. -->
+- **Repository:** [More Information Needed]
+- **Paper [optional]:** [More Information Needed]
+- **Demo [optional]:** [More Information Needed]
+## Uses
+<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
+### Direct Use
+<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
+[More Information Needed]
+### Downstream Use [optional]
+<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
+[More Information Needed]
+### Out-of-Scope Use
+<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
+[More Information Needed]
+## Bias, Risks, and Limitations
+<!-- This section is meant to convey both technical and sociotechnical limitations. -->
+[More Information Needed]
+### Recommendations
+<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
+Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
+## How to Get Started with the Model
+Use the code below to get started with the model.
+[More Information Needed]
+## Training Details
+### Training Data
+<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
+[More Information Needed]
+### Training Procedure
+<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
+#### Preprocessing [optional]
+[More Information Needed]
+#### Training Hyperparameters
+- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
+#### Speeds, Sizes, Times [optional]
+<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
+[More Information Needed]
+## Evaluation
+<!-- This section describes the evaluation protocols and provides the results. -->
+### Testing Data, Factors & Metrics
+#### Testing Data
+<!-- This should link to a Dataset Card if possible. -->
+[More Information Needed]
+#### Factors
+<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
+[More Information Needed]
+#### Metrics
+<!-- These are the evaluation metrics being used, ideally with a description of why. -->
+[More Information Needed]
+### Results
+[More Information Needed]
+#### Summary
+## Model Examination [optional]
+<!-- Relevant interpretability work for the model goes here -->
+[More Information Needed]
+## Environmental Impact
+<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
+Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
+- **Hardware Type:** [More Information Needed]
+- **Hours used:** [More Information Needed]
+- **Cloud Provider:** [More Information Needed]
+- **Compute Region:** [More Information Needed]
+- **Carbon Emitted:** [More Information Needed]
+## Technical Specifications [optional]
+### Model Architecture and Objective
+[More Information Needed]
+### Compute Infrastructure
+[More Information Needed]
+#### Hardware
+[More Information Needed]
+#### Software
+[More Information Needed]
+## Citation [optional]
+<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
+**BibTeX:**
+[More Information Needed]
+**APA:**
+[More Information Needed]
+## Glossary [optional]
+<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
+[More Information Needed]
+## More Information [optional]
+[More Information Needed]
+## Model Card Authors [optional]
+[More Information Needed]
+## Model Card Contact
+[More Information Needed]

special_tokens_map.json ADDED

	@@ -0,0 +1,9 @@

+{
+  "pad_token": {
+    "content": "[PAD]",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  }
+}

tokenizer.json ADDED

	@@ -0,0 +1,266 @@

+{
+  "version": "1.0",
+  "truncation": null,
+  "padding": null,
+  "added_tokens": [
+    {
+      "id": 0,
+      "content": "[UNK]",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    {
+      "id": 1,
+      "content": "[CLS]",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    {
+      "id": 2,
+      "content": "[SEP]",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    {
+      "id": 3,
+      "content": "[PAD]",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    {
+      "id": 4,
+      "content": "[MASK]",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    }
+  ],
+  "normalizer": null,
+  "pre_tokenizer": {
+    "type": "WhitespaceSplit"
+  },
+  "post_processor": null,
+  "decoder": null,
+  "model": {
+    "type": "WordLevel",
+    "vocab": {
+      "[UNK]": 0,
+      "[CLS]": 1,
+      "[SEP]": 2,
+      "[PAD]": 3,
+      "[MASK]": 4,
+      "Velocity_67": 5,
+      "Velocity_63": 6,
+      "Velocity_59": 7,
+      "Velocity_71": 8,
+      "Duration_0.4.8": 9,
+      "Duration_0.5.8": 10,
+      "Duration_1.0.8": 11,
+      "Velocity_55": 12,
+      "Duration_1.1.8": 13,
+      "Bar_None": 14,
+      "Position_0": 15,
+      "Position_16": 16,
+      "Position_24": 17,
+      "Position_8": 18,
+      "Velocity_51": 19,
+      "Duration_2.0.8": 20,
+      "Position_12": 21,
+      "Position_28": 22,
+      "Velocity_75": 23,
+      "Duration_0.6.8": 24,
+      "Position_20": 25,
+      "Duration_2.1.8": 26,
+      "Position_4": 27,
+      "Duration_1.2.8": 28,
+      "Duration_1.4.8": 29,
+      "Velocity_47": 30,
+      "Duration_0.1.8": 31,
+      "Pitch_55": 32,
+      "Pitch_43": 33,
+      "Pitch_72": 34,
+      "Pitch_74": 35,
+      "Duration_1.5.8": 36,
+      "Pitch_57": 37,
+      "Pitch_52": 38,
+      "Pitch_60": 39,
+      "Pitch_53": 40,
+      "Duration_0.2.8": 41,
+      "Pitch_54": 42,
+      "Pitch_76": 43,
+      "Pitch_69": 44,
+      "Pitch_48": 45,
+      "Pitch_67": 46,
+      "Pitch_62": 47,
+      "Pitch_71": 48,
+      "Pitch_59": 49,
+      "Pitch_45": 50,
+      "Pitch_50": 51,
+      "Pitch_70": 52,
+      "Duration_0.3.8": 53,
+      "Pitch_77": 54,
+      "Duration_0.7.8": 55,
+      "Pitch_64": 56,
+      "Pitch_81": 57,
+      "Pitch_58": 58,
+      "Pitch_79": 59,
+      "Pitch_73": 60,
+      "Pitch_46": 61,
+      "Pitch_56": 62,
+      "Pitch_51": 63,
+      "Pitch_44": 64,
+      "Pitch_63": 65,
+      "Pitch_68": 66,
+      "Pitch_61": 67,
+      "Pitch_78": 68,
+      "Pitch_47": 69,
+      "Pitch_75": 70,
+      "Pitch_49": 71,
+      "Pitch_65": 72,
+      "Duration_2.2.8": 73,
+      "Duration_1.3.8": 74,
+      "Pitch_66": 75,
+      "Pitch_82": 76,
+      "Duration_3.1.8": 77,
+      "Pitch_80": 78,
+      "Pitch_41": 79,
+      "Duration_1.6.8": 80,
+      "Duration_4.1.4": 81,
+      "Velocity_79": 82,
+      "Duration_3.0.8": 83,
+      "Duration_2.4.8": 84,
+      "Pitch_42": 85,
+      "Duration_2.5.8": 86,
+      "Velocity_43": 87,
+      "Pitch_83": 88,
+      "Duration_1.7.8": 89,
+      "Duration_4.0.4": 90,
+      "Pitch_40": 91,
+      "Pitch_84": 92,
+      "Duration_2.3.8": 93,
+      "Duration_3.2.8": 94,
+      "Pitch_38": 95,
+      "Position_6": 96,
+      "Duration_2.6.8": 97,
+      "Pitch_39": 98,
+      "Position_31": 99,
+      "Position_22": 100,
+      "Pitch_36": 101,
+      "Position_14": 102,
+      "Position_30": 103,
+      "Pitch_86": 104,
+      "Position_15": 105,
+      "Duration_3.5.8": 106,
+      "Duration_3.4.8": 107,
+      "Duration_2.7.8": 108,
+      "Position_7": 109,
+      "Pitch_85": 110,
+      "Position_23": 111,
+      "Duration_4.2.4": 112,
+      "Duration_5.1.4": 113,
+      "Pitch_37": 114,
+      "Position_26": 115,
+      "Duration_3.3.8": 116,
+      "Position_10": 117,
+      "Position_18": 118,
+      "Position_27": 119,
+      "Duration_4.3.4": 120,
+      "Position_2": 121,
+      "Duration_5.0.4": 122,
+      "Position_11": 123,
+      "Duration_3.6.8": 124,
+      "Duration_3.7.8": 125,
+      "Pitch_87": 126,
+      "Velocity_39": 127,
+      "Position_3": 128,
+      "Duration_6.1.4": 129,
+      "Position_19": 130,
+      "Pitch_88": 131,
+      "Position_21": 132,
+      "Pitch_35": 133,
+      "Velocity_83": 134,
+      "Duration_6.0.4": 135,
+      "Pitch_89": 136,
+      "Position_5": 137,
+      "Position_29": 138,
+      "Position_1": 139,
+      "Position_13": 140,
+      "Duration_5.2.4": 141,
+      "Pitch_34": 142,
+      "Pitch_91": 143,
+      "Duration_5.3.4": 144,
+      "Pitch_33": 145,
+      "Pitch_90": 146,
+      "Duration_7.1.4": 147,
+      "Duration_8.1.4": 148,
+      "Position_17": 149,
+      "Pitch_31": 150,
+      "Duration_8.0.4": 151,
+      "Position_25": 152,
+      "Duration_6.2.4": 153,
+      "Position_9": 154,
+      "Duration_7.0.4": 155,
+      "Duration_6.3.4": 156,
+      "Pitch_93": 157,
+      "Duration_12.0.4": 158,
+      "Pitch_32": 159,
+      "Duration_7.2.4": 160,
+      "Pitch_92": 161,
+      "Duration_7.3.4": 162,
+      "Velocity_35": 163,
+      "Pitch_94": 164,
+      "Duration_8.2.4": 165,
+      "Pitch_29": 166,
+      "Pitch_30": 167,
+      "Duration_9.1.4": 168,
+      "Duration_9.0.4": 169,
+      "Pitch_95": 170,
+      "Pitch_96": 171,
+      "Duration_8.3.4": 172,
+      "Pitch_28": 173,
+      "Pitch_98": 174,
+      "Duration_9.2.4": 175,
+      "Duration_10.0.4": 176,
+      "Duration_10.1.4": 177,
+      "Pitch_26": 178,
+      "Duration_9.3.4": 179,
+      "Duration_11.0.4": 180,
+      "Pitch_97": 181,
+      "Duration_11.1.4": 182,
+      "Duration_10.3.4": 183,
+      "Duration_11.2.4": 184,
+      "Pitch_27": 185,
+      "Duration_10.2.4": 186,
+      "Duration_11.3.4": 187,
+      "Pitch_24": 188,
+      "Pitch_100": 189,
+      "Velocity_31": 190,
+      "Pitch_99": 191,
+      "Pitch_101": 192,
+      "Pitch_25": 193,
+      "Pitch_102": 194,
+      "Pitch_103": 195,
+      "Velocity_27": 196,
+      "Pitch_104": 197,
+      "Velocity_87": 198,
+      "Pitch_105": 199,
+      "Pitch_22": 200,
+      "Velocity_23": 201
+    },
+    "unk_token": "[UNK]"
+  }
+}
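
Note on `tokenizer.json`: the configuration combines a `WhitespaceSplit` pre-tokenizer with a `WordLevel` model and no normalizer, so encoding amounts to splitting on whitespace and looking each piece up in the vocabulary, falling back to `[UNK]`. The sketch below imitates that behaviour in plain Python on a small excerpt of the vocabulary. It is an illustration under those assumptions, not the `tokenizers` implementation, and the `encode` and `decode` helpers are hypothetical.

```python
from typing import Dict, List

# Excerpt of the "vocab" mapping above; the full file has 202 entries (ids 0 to 201).
VOCAB: Dict[str, int] = {
    "[UNK]": 0, "[CLS]": 1, "[SEP]": 2, "[PAD]": 3, "[MASK]": 4,
    "Velocity_67": 5, "Duration_1.0.8": 11, "Bar_None": 14,
    "Position_0": 15, "Pitch_60": 39,
}
UNK_TOKEN = "[UNK]"  # matches "unk_token" in the model section
ID_TO_TOKEN: Dict[int, str] = {idx: tok for tok, idx in VOCAB.items()}


def encode(text: str) -> List[int]:
    """Split on whitespace, then map each piece to its id (unknown pieces map to [UNK])."""
    return [VOCAB.get(piece, VOCAB[UNK_TOKEN]) for piece in text.split()]


def decode(ids: List[int]) -> str:
    """Map ids back to token strings and join them with single spaces."""
    return " ".join(ID_TO_TOKEN[i] for i in ids)


ids = encode("Bar_None Position_0 Pitch_60 Velocity_67 Duration_1.0.8")
print(ids)
print(decode(ids))
print(encode("Pitch_999"))  # not in the vocabulary, so it falls back to [UNK]
```

Because the model is word-level, any event string that is missing from the vocabulary (for example a pitch or duration never seen when the vocabulary was built) collapses to `[UNK]` rather than being split into smaller pieces.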

tokenizer_config.json ADDED

	@@ -0,0 +1,48 @@

+{
+  "added_tokens_decoder": {
+    "0": {
+      "content": "[UNK]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "1": {
+      "content": "[CLS]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "2": {
+      "content": "[SEP]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "3": {
+      "content": "[PAD]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "4": {
+      "content": "[MASK]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    }
+  },
+  "clean_up_tokenization_spaces": true,
+  "model_max_length": 1000000000000000019884624838656,
+  "pad_token": "[PAD]",
+  "tokenizer_class": "PreTrainedTokenizerFast"
+}
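
Note on `tokenizer_config.json`: the `model_max_length` value is not a practical limit. It equals `int(1e30)`, which, as far as I know, is the placeholder `transformers` writes when no maximum length was set. The sketch below checks that equality and shows one way a caller might substitute a real limit. The helper `effective_max_length` and the fallback of 1024 are hypothetical and chosen only for illustration.

```python
# Assumption: int(1e30) is the "no limit configured" placeholder.
SENTINEL = int(1e30)


def effective_max_length(model_max_length: int, fallback: int) -> int:
    """Return `fallback` when the configured value is the placeholder, else the value itself."""
    return fallback if model_max_length >= SENTINEL else model_max_length


# Value copied from the "model_max_length" field above.
configured = 1000000000000000019884624838656

print(configured == SENTINEL)
print(effective_max_length(configured, fallback=1024))
print(effective_max_length(512, fallback=1024))
```

In practice this means truncation has to be requested with an explicit length, since the configured value never triggers it on its own.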