|
|
|
# Decision Transformer
|
## Overview
|
The Decision Transformer model was proposed in [Decision Transformer: Reinforcement Learning via Sequence Modeling](https://arxiv.org/abs/2106.01345) by Lili Chen, Kevin Lu, Aravind Rajeswaran, Kimin Lee, Aditya Grover, Michael Laskin, Pieter Abbeel, Aravind Srinivas, Igor Mordatch.
|
The abstract from the paper is the following: |
|
We introduce a framework that abstracts Reinforcement Learning (RL) as a sequence modeling problem. This allows us to draw upon the simplicity and scalability of the Transformer architecture, and associated advances in language modeling such as GPT-x and BERT. In particular, we present Decision Transformer, an architecture that casts the problem of RL as conditional sequence modeling. Unlike prior approaches to RL that fit value functions or compute policy gradients, Decision Transformer simply outputs the optimal actions by leveraging a causally masked Transformer. By conditioning an autoregressive model on the desired return (reward), past states, and actions, our Decision Transformer model can generate future actions that achieve the desired return. Despite its simplicity, Decision Transformer matches or exceeds the performance of state-of-the-art model-free offline RL baselines on Atari, OpenAI Gym, and Key-to-Door tasks.
|
This version of the model is for tasks where the state is a vector. |
|
This model was contributed by [edbeeching](https://huggingface.co/edbeeching). The original code can be found [here](https://github.com/kzl/decision-transformer).
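
The snippet below is a minimal sketch of a forward pass on dummy data, assuming an environment whose observation is a 17-dimensional vector and whose action space is 4-dimensional continuous; these dimensions, the context length of 20, and the target return of 3.6 are illustrative only. In practice you would load trained weights with `DecisionTransformerModel.from_pretrained` and feed the model the states, actions, and returns-to-go collected during a rollout.

```python
import torch
from transformers import DecisionTransformerConfig, DecisionTransformerModel

# Build a small, randomly initialized model; state_dim and act_dim must match your environment.
config = DecisionTransformerConfig(state_dim=17, act_dim=4)
model = DecisionTransformerModel(config)
model.eval()

batch_size, seq_len = 1, 20
states = torch.randn(batch_size, seq_len, config.state_dim)     # vector observations
actions = torch.zeros(batch_size, seq_len, config.act_dim)      # past actions (zeros as placeholders)
rewards = torch.zeros(batch_size, seq_len, 1)                   # per-step rewards
returns_to_go = torch.full((batch_size, seq_len, 1), 3.6)       # desired (target) return to condition on
timesteps = torch.arange(seq_len).unsqueeze(0)                  # absolute timesteps, shape (batch, seq_len)
attention_mask = torch.ones(batch_size, seq_len)

with torch.no_grad():
    outputs = model(
        states=states,
        actions=actions,
        rewards=rewards,
        returns_to_go=returns_to_go,
        timesteps=timesteps,
        attention_mask=attention_mask,
    )

# The model predicts an action for every position in the sequence;
# at rollout time you would execute the prediction for the last timestep.
next_action = outputs.action_preds[0, -1]
```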
|
## DecisionTransformerConfig
|
[[autodoc]] DecisionTransformerConfig |
|
## DecisionTransformerGPT2Model
|
[[autodoc]] DecisionTransformerGPT2Model |
|
- forward |
|
## DecisionTransformerModel
|
[[autodoc]] DecisionTransformerModel |
|
- forward |