arXiv:2507.21406

Shapley Uncertainty in Natural Language Generation

Published on Jul 29, 2025

Abstract

AI-generated summary: A Shapley-based uncertainty metric is proposed to improve the prediction of large language model performance on question-answering tasks by capturing continuous semantic relationships.

In question-answering tasks, determining when to trust the outputs is crucial to the alignment of large language models (LLMs). Kuhn et al. (2023) introduce semantic entropy as a measure of uncertainty that incorporates linguistic invariances among outputs sharing the same meaning; it relies primarily on setting a threshold to decide whether two outputs are semantically equivalent. We propose a more nuanced framework that extends beyond such thresholding by developing a Shapley-based uncertainty metric that captures the continuous nature of semantic relationships. We establish three fundamental properties that characterize valid uncertainty metrics and prove that our Shapley uncertainty satisfies them. Through extensive experiments, we demonstrate that our Shapley uncertainty predicts LLM performance on question-answering and other datasets more accurately than comparable baseline measures.
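To make the contrast with threshold-based semantic entropy concrete, here is a minimal Python sketch of the general idea: sample several answers, score their pairwise semantic similarity, and assign each answer continuous Shapley credit instead of a hard equivalence cluster. The characteristic function `coalition_value` (mean pairwise similarity within a coalition) and the entropy-style aggregation in `shapley_uncertainty` are illustrative assumptions, not the paper's exact formulation, which the abstract does not specify.

```python
"""Illustrative sketch only: the coalition value and the final aggregation
below are hypothetical stand-ins for the paper's method. They show how
Shapley credit can replace a hard semantic-equivalence threshold."""
from itertools import combinations
from math import factorial

import numpy as np


def coalition_value(S, sim):
    """Hypothetical characteristic function: mean pairwise semantic
    similarity within coalition S (coalitions of fewer than two
    responses get value 0)."""
    if len(S) < 2:
        return 0.0
    pairs = list(combinations(S, 2))
    return sum(sim[i, j] for i, j in pairs) / len(pairs)


def shapley_values(sim):
    """Exact Shapley values over n sampled responses.

    Uses the standard weight |S|! (n - |S| - 1)! / n! on each marginal
    contribution; exact enumeration is feasible for small n (e.g. n <= 10).
    """
    n = sim.shape[0]
    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for r in range(len(others) + 1):
            for S in combinations(others, r):
                w = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
                phi[i] += w * (coalition_value(S + (i,), sim)
                               - coalition_value(S, sim))
    return phi


def shapley_uncertainty(sim):
    """Hypothetical aggregation: entropy of the normalized Shapley values.
    Credit spread evenly across semantically scattered answers yields
    higher entropy, i.e. higher uncertainty. Negative marginal
    contributions are clipped before normalizing."""
    phi = np.clip(shapley_values(sim), 1e-12, None)
    p = phi / phi.sum()
    return float(-(p * np.log(p)).sum())


# Toy usage: sim[i, j] in [0, 1] is the semantic similarity between sampled
# answers i and j (in practice from an NLI or embedding model). Here three
# answers agree and one is an outlier.
sim = np.array([
    [1.0, 0.9, 0.8, 0.1],
    [0.9, 1.0, 0.85, 0.15],
    [0.8, 0.85, 1.0, 0.2],
    [0.1, 0.15, 0.2, 1.0],
])
print(shapley_uncertainty(sim))
```

Unlike a thresholded equivalence relation, which forces answer pairs into same-meaning or different-meaning clusters, the Shapley credit above varies smoothly with the similarity scores, which is the continuity property the abstract emphasizes.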
