---
license: cc-by-nc-4.0
language:
- en
pipeline_tag: text-classification
tags:
- pytorch
- reward_model
- transformers
- RLHF
---

# Model Card for Model ID

This model is part of the Chai reward-model series. It uses the GPT-2 architecture with a classification head and is optimised to predict whether a user will accept a completion generated by the base model.

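A minimal sketch of how a reward model of this kind might be loaded and queried with the `transformers` library. The model ID below is a placeholder, and the sketch assumes the classification head emits two logits with index 1 meaning "accepted" and that the repository ships a compatible tokenizer; none of these details are confirmed by this card, so check the model's `config.json` before relying on them.

```python
import math

# Placeholder: replace with the actual repository ID of the reward model.
MODEL_ID = "<org>/<model-id>"


def load_reward_model(model_id):
    """Load tokenizer and classifier; imports are local so the helper below has no heavy dependencies."""
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()
    return tokenizer, model


def acceptance_probability(logits, accept_index=1):
    """Softmax over the class logits, returning the probability of the 'accepted' class.

    Assumption: the head has at least two outputs and `accept_index` marks acceptance.
    """
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]  # subtract the max for numerical stability
    return exps[accept_index] / sum(exps)


def score(tokenizer, model, text):
    """Score one conversation-plus-completion string (the input format is an assumption)."""
    import torch

    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return acceptance_probability(logits)
```

With a real model ID, usage would look like `tokenizer, model = load_reward_model(MODEL_ID)` followed by `score(tokenizer, model, "User: hi\nBot: hello!")`. Several candidate completions can then be ranked by their scores.
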
The model is trained on purely user-generated content from the [retry_and_continue_50m_reward_model](https://huggingface.co/datasets/ChaiML/retry_and_continue_50m_reward_model) dataset, in which a user can decline a generated response by pressing the retry button or by ending the conversation.

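The feedback signal described above can be pictured as a binary label per generated response. The sketch below is illustrative only: the `Turn` record, its field names, and the action strings are hypothetical, and treating both a retry and an ended conversation as a rejection is an assumption drawn from the wording of this card rather than from the dataset's documented schema.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    response: str
    user_action: str  # hypothetical values: "continue", "retry", "end"


def to_label(turn: Turn) -> int:
    """Return 1 if the user accepted the response, 0 if they declined it."""
    if turn.user_action == "continue":
        return 1
    if turn.user_action in {"retry", "end"}:
        # Assumption: both retrying and ending the conversation count as declining.
        return 0
    raise ValueError(f"unknown user action: {turn.user_action!r}")


turns = [
    Turn("Sure, tell me more!", "continue"),
    Turn("I don't understand.", "retry"),
    Turn("Goodbye.", "end"),
]
labels = [to_label(t) for t in turns]
```
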
## Model Details

- Developed by [Chai Research](https://www.chai-research.com/)
- Model type: Transformer-based Classification Model
- Language: English
- License: cc-by-nc-4.0
- Contact: to ask questions about this model, join the [Chai Discord](https://discord.com/invite/4KPHkeG6VX). For general correspondence: [hello@chai-research.com](mailto:hello@chai-research.com?subject=Huggingface%20Model%20Inquiry)