emanuelaboros committed on
Commit a923b82 · 1 Parent(s): 5e2a201

review readme

Files changed (1)
  1. README.md +5 -4
README.md CHANGED
@@ -13,7 +13,7 @@ tags:
  - v1.0.6
  ---
 
- # Model Card for impresso-project/ocr-quality-assessor-unigram-light
+ # Model Card for `impresso-project/ocr-quality-assessor-unigram-light`
 
  ## Overview
 
@@ -22,15 +22,16 @@ This model is a **lightweight OCR quality assessor** for historical French and G
  It uses **Bloom filters** containing known word unigrams to evaluate text quality by measuring the proportion of known vs. unknown words in OCR outputs. It is part of the [Impresso Project](https://impresso-project.ch), which develops tools for media archive processing and exploration.
 
  ## Model Details
+ ### Model Description
 
+ - **Developed by:** University of Zurich (UZH) from the [Impresso team](https://impresso-project.ch). The project is an interdisciplinary project focused on historical media analysis across languages, time, and modalities. Funded by the Swiss National Science Foundation ([CRSII5_173719](http://p3.snf.ch/project-173719), [CRSII5_213585](https://data.snf.ch/grants/grant/213585)) and the Luxembourg National Research Fund (grant No. 17498891).
  - **Model type:** Bloom filter–based scoring via a Transformers-compatible pipeline
  - **Languages:** French (fr), German (de)
  - **License:** GPL-3.0
  - **Base resource:** [`impresso-project/OCR-quality-assessment-unigram`](https://huggingface.co/impresso-project/OCR-quality-assessment-unigram)
- - **Interface:** Transformers `pipeline`
+ - **Interface:** Hugging Face `transformers` pipeline
  - **Input format:** Raw text string
- - **Output format:** Float score (OCR quality proxy)
+ - **Output format:** Float score representing OCR quality
- - **Developed by:** UZH, Switzerland
 
  ## How to Use
 
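
Note on the scoring idea described in the README: the Overview says the model looks up word unigrams in Bloom filters and scores text by the proportion of known vs. unknown words. The sketch below illustrates that idea only. It is not the Impresso implementation: the `BloomFilter` class, the `ocr_quality_score` function, the SHA-256 based hashing, the filter size, and the `\w+` tokenization are all assumptions made for illustration.

```python
import hashlib
import re


class BloomFilter:
    """Tiny illustrative Bloom filter (hypothetical, not the Impresso code)."""

    def __init__(self, size_bits: int = 1 << 16, num_hashes: int = 4):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8 + 1)

    def _positions(self, item: str):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        # True means "probably known"; False means "definitely unknown".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def ocr_quality_score(text: str, known_words: BloomFilter) -> float:
    """Return the fraction of tokens found in the filter (0.0 for empty text)."""
    tokens = re.findall(r"\w+", text.lower())  # assumed tokenization
    if not tokens:
        return 0.0
    known = sum(1 for token in tokens if token in known_words)
    return known / len(tokens)


# Hypothetical three-word vocabulary standing in for a real unigram list.
vocab = BloomFilter()
for word in ["le", "chat", "dort"]:
    vocab.add(word)

clean_score = ocr_quality_score("Le chat dort", vocab)
noisy_score = ocr_quality_score("Le ch4t dort", vocab)
```

A Bloom filter never reports a stored word as missing, so `clean_score` is 1.0. It can occasionally report an unseen word as present, so a score of this kind is an upper-biased estimate of the true known-word ratio; with a filter this sparse, `noisy_score` should come out at 2/3.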