Base SaT (Segment any Text) models, to be used for sentence and paragraph segmentation. Easily adaptable via LoRA.
AI & ML interests
https://arxiv.org/abs/2406.16678
Organization Card
Authors of the paper "Segment Any Text: A Universal Approach for Robust, Efficient and Adaptable Sentence Segmentation".
We host state-of-the-art sentence segmentation models.
SaT (Segment any Text) models, further trained on a Supervised Mixture of diverse styles and corruptions. Universal Sentence Segmentation models!
segment-any-text/sat-1l-sm • Token Classification • 0.2B • 1.58k
segment-any-text/sat-3l-sm • Token Classification • 0.2B • 1.82M • 8
segment-any-text/sat-6l-sm • Token Classification • 0.2B • 1.56k • 3
segment-any-text/sat-9l-sm • Token Classification • 0.3B • 3.08k
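
As a rough illustration of how one of the "-sm" checkpoints listed above might be used, here is a minimal sketch. It assumes the `wtpsplit` Python package (`pip install wtpsplit`) is installed and that its `SaT` class accepts the short checkpoint name without the `segment-any-text/` prefix. The helper names `sm_checkpoint` and `split_sentences` are hypothetical and exist only for this example.

```python
# Sketch only: assumes the `wtpsplit` package is installed.
# `sm_checkpoint` and `split_sentences` are hypothetical helper names.

SM_LAYERS = (1, 3, 6, 9)  # layer counts of the "-sm" checkpoints listed above


def sm_checkpoint(layers: int) -> str:
    """Return the short name of an "-sm" checkpoint, e.g. "sat-3l-sm"."""
    if layers not in SM_LAYERS:
        raise ValueError(f"no -sm checkpoint listed with {layers} layers")
    return f"sat-{layers}l-sm"


def split_sentences(text: str, layers: int = 3) -> list[str]:
    """Segment `text` into sentences with the chosen "-sm" checkpoint."""
    # Imported lazily so the naming helper works without wtpsplit installed.
    # The model weights are downloaded from the Hub on first use.
    from wtpsplit import SaT

    sat = SaT(sm_checkpoint(layers))
    return sat.split(text)


if __name__ == "__main__":
    # Unpunctuated, lowercased input is the kind of text these models target.
    for sentence in split_sentences("this is a test this is another test"):
        print(repr(sentence))
```

Smaller layer counts trade accuracy for speed, so the default of 3 here is only a convenient starting point, not a recommendation from the authors.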
models (15)

segment-any-text/sat-12l-no-limited-lookahead • Token Classification • 24 • 2
segment-any-text/sat-9l-no-limited-lookahead • Token Classification • 10
segment-any-text/sat-6l-no-limited-lookahead • Token Classification • 9
segment-any-text/sat-3l-no-limited-lookahead • Token Classification • 7
segment-any-text/sat-1l-no-limited-lookahead • Token Classification • 6
segment-any-text/sat-12l • Token Classification • 2.03k • 6
segment-any-text/sat-9l • Token Classification • 58
segment-any-text/sat-6l • Token Classification • 185
segment-any-text/sat-3l • Token Classification • 5.01k • 4
segment-any-text/sat-1l • Token Classification • 41 • 1
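
The base checkpoints are described above as usable for both sentence and paragraph segmentation. The following sketch shows one way that might look. It assumes the `wtpsplit` package and its `SaT.split(..., do_paragraph_segmentation=True)` option. The helpers `parse_checkpoint` and `split_paragraphs` are hypothetical names introduced for this example, and the reading of the number before `l` as a layer count is taken from the checkpoint naming pattern.

```python
# Sketch only: assumes the `wtpsplit` package is installed.
# `parse_checkpoint` and `split_paragraphs` are hypothetical helper names.
import re

# Matches names such as "sat-3l", "sat-9l-sm" or
# "segment-any-text/sat-12l-no-limited-lookahead".
_NAME = re.compile(r"^(?:segment-any-text/)?sat-(\d+)l(?:-(.+))?$")


def parse_checkpoint(name: str) -> tuple[int, str]:
    """Split a checkpoint name into (layer count, variant suffix or "")."""
    match = _NAME.match(name)
    if match is None:
        raise ValueError(f"not a SaT checkpoint name: {name!r}")
    return int(match.group(1)), match.group(2) or ""


def split_paragraphs(text: str, checkpoint: str = "sat-3l") -> list[list[str]]:
    """Return paragraphs, each as a list of sentences, using a base checkpoint."""
    layers, variant = parse_checkpoint(checkpoint)
    if variant:
        raise ValueError("this sketch expects a base checkpoint such as sat-3l")

    # Imported lazily so the name parser works without wtpsplit installed.
    from wtpsplit import SaT

    # Pass the short name; the organization prefix is assumed to be added
    # by the library when it fetches the weights.
    sat = SaT(f"sat-{layers}l")
    return sat.split(text, do_paragraph_segmentation=True)
```

Parsing the name first keeps the example from silently loading a variant checkpoint when a base one was intended.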
datasets (0)

None public yet