johnlockejrr/pylaia_catmus_medieval


johnlockejrr committed on Apr 2

Commit 1de99ec (verified) · 1 parent: dea0fff

Update README.md

Files changed (1)

README.md +51 -3


@@ -1,3 +1,51 @@
----
-license: mit
----

+---
+library_name: PyLaia
+license: mit
+tags:
+- PyLaia
+- PyTorch
+- atr
+- htr
+- ocr
+- historical
+- handwritten
+metrics:
+- CER
+- WER
+language:
+- 'fr'
+datasets:
+- CATMuS/medieval
+pipeline_tag: image-to-text
+---
+# PyLaia - CATMuS/medieval
+This model performs Handwritten Text Recognition (HTR) on historical documents written in Latin and Romance languages.
+## Model description
+The model was trained using the PyLaia library on the [CATMuS/medieval](https://huggingface.co/datasets/CATMuS/medieval) dataset.
+Training images were resized with a fixed height of {dimension} pixels, keeping the original aspect ratio. Vertical lines are discarded.
+| set | lines |
+| :----- | ------: |
+| train | 152,816 |
+| val   |  19,402 |
+| test  |  22,590 |
+An external 6-gram character language model can be used to improve recognition. The language model is trained on the text from the CATMuS/medieval training set.
+## Evaluation results
+The model achieves the following results:
+| set   | Language model | CER (%)    | WER (%) | lines   |
+|:------|:---------------| ----------:| -------:|----------:|
+| test  | no             | 10.54      |   28.12 |     3,819 |
+| test  | yes            |  9.52      |   23.73 |     3,819 |
+## How to use?
+Please refer to the [PyLaia documentation](https://atr.pages.teklia.com/pylaia/usage/prediction/) to use this model.
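
Note on the metrics in the results table: CER and WER are conventionally computed as the total edit distance between reference and predicted transcriptions, divided by the total number of reference characters or words. The sketch below shows that conventional definition only. It is not taken from PyLaia, whose evaluation may normalize or tokenize text differently. The function names `edit_distance` and `error_rate` are hypothetical, and the sample strings are invented for illustration rather than drawn from CATMuS/medieval.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def error_rate(references, hypotheses, unit="char"):
    """Corpus-level error rate in percent: total edits / total reference units.

    unit="char" gives a CER-style score, unit="word" a WER-style score
    (words are split on whitespace, which is an assumption of this sketch).
    """
    total_edits = 0
    total_units = 0
    for ref, hyp in zip(references, hypotheses):
        if unit == "word":
            ref, hyp = ref.split(), hyp.split()
        total_edits += edit_distance(ref, hyp)
        total_units += len(ref)
    return 100.0 * total_edits / total_units


# Invented example: one substituted character ("v" read as "u").
refs = ["in principio erat verbum"]
hyps = ["in principio erat uerbum"]
cer = error_rate(refs, hyps, unit="char")
wer = error_rate(refs, hyps, unit="word")
```

A single wrong character counts as one error out of 24 characters but makes one of four words wrong, which is why WER is typically much higher than CER for the same output, as in the table above.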