This stage is primarily used for training, since its features are not relevant to inference.