Papers
arxiv:2304.03277

Instruction Tuning with GPT-4

Published on Apr 6, 2023
Authors:
,
,

Abstract

Prior work has shown that finetuning large language models (LLMs) using machine-generated instruction-following data enables such models to achieve remarkable zero-shot capabilities on new tasks, and no human-written instructions are needed. In this paper, we present the first attempt to use GPT-4 to generate instruction-following data for LLM finetuning. Our early experiments on instruction-tuned LLaMA models show that the 52K English and Chinese instruction-following data generated by GPT-4 leads to superior zero-shot performance on new tasks to the instruction-following data generated by previous state-of-the-art models. We also collect feedback and comparison data from GPT-4 to enable a comprehensive evaluation and reward model training. We make our data generated using GPT-4 as well as our codebase publicly available.

Community

Your need to confirm your account before you can post a new comment.

Sign up or log in to comment

Models citing this paper 32

Browse 32 models citing this paper

Datasets citing this paper 10

Browse 10 datasets citing this paper

Spaces citing this paper 40

Collections including this paper 2