When prompted, enter your token to login: | |
from huggingface_hub import notebook_login | |
notebook_login() | |
Load OPUS Books dataset | |
Start by loading the English-French subset of the OPUS Books dataset from the 🤗 Datasets library: | |
from datasets import load_dataset | |
books = load_dataset("opus_books", "en-fr") | |
Split the dataset into a train and test set with the [~datasets.Dataset.train_test_split] method: | |
books = books["train"].train_test_split(test_size=0.2) | |
Then take a look at an example: | |
books["train"][0] | |
{'id': '90560', | |
'translation': {'en': 'But this lofty plateau measured only a few fathoms, and soon we reentered Our Element. |