Commit 1de99ec by johnlockejrr · verified · 1 parent: dea0fff

Update README.md

Files changed (1)
  1. README.md +51 -3
README.md CHANGED
@@ -1,3 +1,51 @@
- ---
- license: mit
- ---
+ ---
+ library_name: PyLaia
+ license: mit
+ tags:
+ - PyLaia
+ - PyTorch
+ - atr
+ - htr
+ - ocr
+ - historical
+ - handwritten
+ metrics:
+ - CER
+ - WER
+ language:
+ - 'fr'
+ datasets:
+ - CATMuS/medieval
+ pipeline_tag: image-to-text
+ ---
+
+ # PyLaia - CATMuS/medieval
+
+ This model performs Handwritten Text Recognition in Latin/Romance on historical documents.
+
+ ## Model description
+
+ The model was trained using the PyLaia library on the [CATMuS/medieval](https://huggingface.co/datasets/CATMuS/medieval) dataset.
+
+ Training images were resized to a fixed height of {dimension} pixels, keeping the original aspect ratio. Vertical lines are discarded.
+
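As an illustration of the resizing step, the sketch below computes the output size of a line image scaled to a fixed height with its aspect ratio preserved. The function name `scaled_size` and the example height of 64 pixels are assumptions made for this sketch only: the README leaves the actual height as the `{dimension}` placeholder, and PyLaia's own preprocessing may differ.

```python
from __future__ import annotations


def scaled_size(width: int, height: int, target_height: int) -> tuple[int, int]:
    """Return (new_width, new_height) for a fixed target height, keeping the aspect ratio."""
    if width <= 0 or height <= 0 or target_height <= 0:
        raise ValueError("all dimensions must be positive")
    # Scale the width by the same factor applied to the height; keep at least 1 pixel.
    new_width = max(1, round(width * target_height / height))
    return new_width, target_height


# Hypothetical example: a 1200x150 line image scaled to an assumed height of 64 pixels.
print(scaled_size(1200, 150, 64))  # (512, 64)
```

The resulting size can be passed to any image library's resize call; only the width varies from line to line.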
+ | set   |   lines |
+ | :---- | ------: |
+ | train | 152,816 |
+ | val   |  19,402 |
+ | test  |  22,590 |
+
+ An external 6-gram character language model can be used to improve recognition. The language model is trained on the text from the CATMuS/medieval training set.
+
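To make the idea of a character n-gram language model concrete, here is a toy count-based sketch: it counts which character follows each context of up to n-1 characters and returns a maximum-likelihood probability. The function names are hypothetical, there is no smoothing or backoff, and this is not the tooling used to build the language model mentioned above.

```python
from collections import Counter, defaultdict


def train_char_ngrams(lines, n=6):
    """Count how often each character follows each context of up to n-1 characters."""
    counts = defaultdict(Counter)
    for line in lines:
        for i, char in enumerate(line):
            context = line[max(0, i - (n - 1)):i]
            counts[context][char] += 1
    return counts


def next_char_probability(counts, context, char, n=6):
    """Maximum-likelihood estimate of P(char | last n-1 characters of context)."""
    context = context[-(n - 1):] if n > 1 else ""
    seen = counts.get(context)
    if not seen:
        return 0.0  # unseen context: a real model would back off or smooth here
    return seen[char] / sum(seen.values())


# Toy corpus with n=3 so the contexts are easy to follow by hand.
counts = train_char_ngrams(["abab", "abac"], n=3)
print(next_char_probability(counts, "aba", "b", n=3))  # 0.5
```

During decoding, scores like these would be combined with the recognizer's own character scores so that unlikely character sequences are penalized.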
+ ## Evaluation results
+
+ The model achieves the following results:
+
+ | set  | Language model | CER (%) | WER (%) | lines |
+ |:-----|:---------------|--------:|--------:|------:|
+ | test | no             |   10.54 |   28.12 | 3,819 |
+ | test | yes            |    9.52 |   23.73 | 3,819 |
+
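For readers unfamiliar with the two metrics in the table above, the sketch below shows one common way to compute them: the Levenshtein edit distance between reference and prediction, summed over all lines and divided by the total reference length, at character level for CER and at word level for WER. The README does not state how its figures were computed, so the corpus-level aggregation and the whitespace tokenization here are assumptions.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance between two sequences (substitutions, insertions, deletions)."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_item in enumerate(reference, start=1):
        current = [i]
        for j, hyp_item in enumerate(hypothesis, start=1):
            cost = 0 if ref_item == hyp_item else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1]


def error_rate(references, hypotheses, tokenize):
    """Total edit distance divided by total reference length, as a percentage."""
    errors = sum(edit_distance(tokenize(r), tokenize(h))
                 for r, h in zip(references, hypotheses))
    length = sum(len(tokenize(r)) for r in references)
    return 100.0 * errors / length


def cer(references, hypotheses):
    """Character error rate: edit distance over characters."""
    return error_rate(references, hypotheses, list)


def wer(references, hypotheses):
    """Word error rate: edit distance over whitespace-separated words."""
    return error_rate(references, hypotheses, str.split)


refs = ["the cat sat"]
hyps = ["the cat sit"]
print(round(cer(refs, hyps), 2), round(wer(refs, hyps), 2))  # 9.09 33.33
```

Read this way, the table's drop from 10.54 to 9.52 CER with the language model is a relative reduction of about 9.7%, and the drop from 28.12 to 23.73 WER is about 15.6%.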
+ ## How to use?
+
+ Please refer to the [PyLaia documentation](https://atr.pages.teklia.com/pylaia/usage/prediction/) to use this model.