🇧🇷 pt-ai-detector-sent

Sentence-level Portuguese classifier that estimates whether a single sentence was likely written by a large language model (LLM) or by a human.

Why? The document-level model Detecting-ai/pt-ai-detector works well on paragraphs but loses accuracy on very short inputs.
This checkpoint inherits that backbone and is fine-tuned on 200 k sentences, balanced between human and AI text.

| Property | Value |
|---|---|
| Base checkpoint | `Detecting-ai/pt-ai-detector` |
| Fine-tune data | 100 000 human + 100 000 AI sentences (≥ 4 words) |
| LLMs used for AI text | Azure OpenAI gpt-4o-mini, gpt-4o, gpt-35-turbo |
| Training | 1 epoch · batch size 16 · learning rate 1e-5 · A100-40 GB |
| Validation F1 | 0.989 (balanced sentences) |
| Intended use | quick checks inside larger pipelines, sentence-by-sentence highlighting |

Demo: see the free web checker at detecting-ai.com, which is powered by this model.

✨ Quick start

from transformers import pipeline

clf = pipeline(
    "text-classification",
    model="Detecting-ai/pt-ai-detector-sent",
    tokenizer="Detecting-ai/pt-ai-detector-sent",
    device_map="auto"          # GPU if available
)

txt = "A inteligência artificial está transformando a educação."
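print(clf(txt, top_k=None))   # top_k=None returns a score for every label
# → [{'label': 'LABEL_1', 'score': 0.87}, {'label': 'LABEL_0', 'score': 0.13}]
# illustrative values; LABEL_1 = AI, LABEL_0 = Human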
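
Because `top_k=None` returns one entry per label, downstream code usually wants only the AI score. The sketch below shows one way to extract it and to score several sentences in a single call. The helper names `ai_probability` and `score_sentences` are hypothetical (not part of the model or of `transformers`), and `clf` is assumed to be the pipeline object built above.

```python
def ai_probability(label_scores, ai_label="LABEL_1"):
    """Return the AI-label score from one `top_k=None` result (a list of label dicts)."""
    for entry in label_scores:
        if entry["label"] == ai_label:
            return entry["score"]
    raise KeyError(f"{ai_label!r} not found in pipeline output")


def score_sentences(clf, sentences):
    """Classify a list of sentences and return one AI score per sentence."""
    # With a list input, the pipeline returns one list of label dicts per sentence.
    results = clf(sentences, top_k=None)
    return [ai_probability(r) for r in results]
```

Usage, assuming the `clf` from the quick start: `score_sentences(clf, ["Primeira frase de exemplo.", "Segunda frase de exemplo."])` returns a list of two floats.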

🔧 Recommended threshold

| AI score (`LABEL_1`) | Interpretation |
|---|---|
| > 0.70 | likely AI (LLM-generated / paraphrased) |
| 0.30 – 0.70 | uncertain: review in context |
| < 0.30 | likely Human |

For full documents, classify every sentence and aggregate (e.g. “flag as AI if ≥ 30 % of sentences score > 0.70”).
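
The thresholds and the aggregation rule can be sketched as two small functions. This is a minimal illustration, assuming the input is the per-sentence `LABEL_1` (AI) score; the function names `interpret` and `flag_document` are hypothetical, and the defaults simply mirror the numbers quoted above.

```python
def interpret(ai_score, low=0.30, high=0.70):
    """Map one AI score to the three bands from the threshold table."""
    if ai_score > high:
        return "likely AI"
    if ai_score < low:
        return "likely Human"
    return "uncertain"


def flag_document(ai_scores, score_threshold=0.70, min_fraction=0.30):
    """Flag a document if at least `min_fraction` of its sentences score above `score_threshold`."""
    if not ai_scores:
        return False  # nothing to judge
    flagged = sum(1 for s in ai_scores if s > score_threshold)
    return flagged / len(ai_scores) >= min_fraction


# Made-up scores for five sentences of one document.
scores = [0.91, 0.12, 0.75, 0.40, 0.05]
print([interpret(s) for s in scores])
print(flag_document(scores))  # 2 of 5 sentences exceed 0.70, i.e. 40 %
```

Both cut-offs are parameters, so they can be tuned on your own data if the defaults produce too many or too few flags.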

🗂️ Training data

| Corpus | Purpose |
|---|---|
| wiki40b-pt, oscar-pt, cc100-pt, europarl-pt, opus-books-pt | human prose (web, books, parliament) |
| Detecting-ai/ai_pt_corpus | 1 M AI sentences generated with Azure OpenAI models (news, essays, chat, tweets, dialogs, code comments) |

All human corpora were cleaned (language-ID filter, deduplication, URL removal).
Sentences shorter than 4 tokens were dropped.
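
The cleaning steps might look roughly like the sketch below. It is not the authors' pipeline: the URL regex, the case-insensitive exact-match deduplication, and whitespace tokenization are all assumptions made for illustration, and the language-ID filter is left out because it needs an external model.

```python
import re

# Assumed URL pattern; the actual filter used for training is not documented.
URL_RE = re.compile(r"https?://\S+|www\.\S+")


def clean_sentences(sentences, min_tokens=4):
    """Strip URLs, drop sentences under `min_tokens` whitespace tokens, and deduplicate."""
    seen = set()
    kept = []
    for s in sentences:
        s = URL_RE.sub("", s)
        s = " ".join(s.split())          # collapse whitespace left behind by URL removal
        if len(s.split()) < min_tokens:  # too short to keep
            continue
        key = s.casefold()               # case-insensitive duplicate check
        if key in seen:
            continue
        seen.add(key)
        kept.append(s)
    return kept
```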

📈 Validation metrics

| Class | Precision | Recall | F1 |
|---|---|---|---|
| Human | 0.987 | 0.990 | 0.989 |
| AI | 0.991 | 0.988 | 0.989 |
| Macro | 0.989 | 0.989 | 0.989 |

Evaluated on a held-out, balanced set of 20 k sentences.
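
To reproduce this kind of table on your own labelled sentences, per-class precision, recall, and F1 can be computed as in the sketch below, with the macro row taken as the unweighted mean of the two classes. The labels shown are toy values for illustration only, not the model's validation data, and the function names are hypothetical.

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall, and F1 for one class treated as the positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def macro_average(per_class):
    """Unweighted mean of (precision, recall, F1) tuples across classes."""
    n = len(per_class)
    return tuple(sum(m[i] for m in per_class) / n for i in range(3))


# Toy labels, NOT the held-out set: 1 = AI, 0 = Human.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1]
human = precision_recall_f1(y_true, y_pred, positive=0)
ai = precision_recall_f1(y_true, y_pred, positive=1)
print(human, ai, macro_average([human, ai]))
```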

⚠️ Limitations & caveats

Best on Portuguese sentences of at least 8–10 tokens; scores for very short fragments are mostly noise.
Trained on output from the GPT family (gpt-4o, gpt-4o-mini, gpt-35-turbo); accuracy may drop on text from other model families or on heavily prompt-engineered output.
Occasional false positives on very formal human writing; false negatives on slang-heavy AI output.
Not a plagiarism detector; it cannot prove authorship.
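
One way to act on the length caveat is to set aside short fragments before scoring instead of trusting their scores. The sketch below is an assumption-laden illustration: `split_by_length` is a hypothetical helper, and it counts whitespace-separated words, which only approximates the model's own tokenization.

```python
def split_by_length(sentences, min_tokens=8):
    """Separate sentences long enough to score from fragments that are too short."""
    scorable, too_short = [], []
    for s in sentences:
        # Whitespace tokens are a rough proxy for the model's subword tokens.
        if len(s.split()) >= min_tokens:
            scorable.append(s)
        else:
            too_short.append(s)
    return scorable, too_short
```

Fragments in `too_short` can be reported as "not scored" or merged with a neighbouring sentence before classification.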

📜 License

Creative Commons CC-BY-NC 4.0 – free for research & non-commercial use.
Commercial use requires written permission from the authors.

🤝 Team & contact

Built with ❤️ by the team behind detecting-ai.com.
Questions, issues, partnership requests → support@detecting-ai.com
