# Optimization

The `.optimization` module provides:

- an optimizer with weight decay fixed that can be used to fine-tune models,
- several schedules in the form of schedule objects that inherit from `_LRSchedule`, and
- a gradient accumulation class to accumulate the gradients of multiple batches.

## AdamW (PyTorch)

[[autodoc]] AdamW

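
A minimal construction sketch (the checkpoint name, weight decay value, and learning rate below are illustrative assumptions, not recommendations). It follows the common practice of excluding biases and LayerNorm weights from weight decay:

```python
from transformers import AdamW, AutoModelForSequenceClassification

# Illustrative checkpoint and hyperparameters
model = AutoModelForSequenceClassification.from_pretrained("bert-base-cased")

# Apply weight decay to all parameters except biases and LayerNorm weights
no_decay = ["bias", "LayerNorm.weight"]
grouped_parameters = [
    {
        "params": [p for n, p in model.named_parameters() if not any(nd in n for nd in no_decay)],
        "weight_decay": 0.01,
    },
    {
        "params": [p for n, p in model.named_parameters() if any(nd in n for nd in no_decay)],
        "weight_decay": 0.0,
    },
]
optimizer = AdamW(grouped_parameters, lr=5e-5)
```
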
## AdaFactor (PyTorch)

[[autodoc]] Adafactor

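
A short sketch of the two common ways to configure Adafactor: with an external, fixed learning rate, or letting it manage the learning rate internally via relative step sizes (the toy model and the value `1e-3` are placeholders):

```python
import torch
from transformers import Adafactor

model = torch.nn.Linear(10, 2)  # stand-in for any model

# Option 1: drive Adafactor with an external learning rate
optimizer = Adafactor(
    model.parameters(),
    lr=1e-3,
    scale_parameter=False,
    relative_step=False,
    warmup_init=False,
)

# Option 2: let Adafactor compute its own (relative) step sizes
optimizer = Adafactor(
    model.parameters(),
    lr=None,
    scale_parameter=True,
    relative_step=True,
    warmup_init=True,
)
```
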
## AdamWeightDecay (TensorFlow)

[[autodoc]] AdamWeightDecay

[[autodoc]] create_optimizer

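
`create_optimizer` builds an `AdamWeightDecay` instance together with the learning rate schedule it uses. A sketch with illustrative step counts and rates (pass the returned optimizer to `model.compile(...)` as usual):

```python
from transformers import create_optimizer

# Illustrative values: 1000 training steps with a 100-step warmup
optimizer, lr_schedule = create_optimizer(
    init_lr=2e-5,
    num_train_steps=1000,
    num_warmup_steps=100,
    weight_decay_rate=0.01,
)
```
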
## Schedules

### Learning Rate Schedules (PyTorch)

[[autodoc]] SchedulerType

[[autodoc]] get_scheduler

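
`get_scheduler` resolves a schedule by name (one of `SchedulerType`) instead of calling the specific helper directly. A minimal sketch with a placeholder model and step counts:

```python
import torch
from transformers import get_scheduler

model = torch.nn.Linear(10, 2)  # stand-in for any model
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

lr_scheduler = get_scheduler(
    name="linear",
    optimizer=optimizer,
    num_warmup_steps=100,
    num_training_steps=1000,
)
```
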
[[autodoc]] get_constant_schedule

[[autodoc]] get_constant_schedule_with_warmup

[[autodoc]] get_cosine_schedule_with_warmup

[[autodoc]] get_cosine_with_hard_restarts_schedule_with_warmup

[[autodoc]] get_linear_schedule_with_warmup

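
These warmup schedules are stepped once per optimizer update. A sketch of the usual ordering with `get_linear_schedule_with_warmup` (the toy model, random batch, and step counts are placeholders):

```python
import torch
from transformers import get_linear_schedule_with_warmup

model = torch.nn.Linear(10, 2)  # stand-in for any model
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
num_training_steps = 1000  # illustrative
scheduler = get_linear_schedule_with_warmup(
    optimizer, num_warmup_steps=100, num_training_steps=num_training_steps
)

for step in range(num_training_steps):
    inputs = torch.randn(8, 10)          # placeholder batch
    loss = model(inputs).pow(2).mean()   # placeholder loss
    loss.backward()
    optimizer.step()
    scheduler.step()  # advance the learning rate once per optimizer update
    optimizer.zero_grad()
```
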
[[autodoc]] get_polynomial_decay_schedule_with_warmup

[[autodoc]] get_inverse_sqrt_schedule

### Warmup (TensorFlow)

[[autodoc]] WarmUp

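
`WarmUp` wraps another Keras learning rate schedule and ramps the learning rate up linearly before handing control to it. A sketch assuming a polynomial decay as the wrapped schedule; all values are illustrative:

```python
import tensorflow as tf
from transformers import WarmUp

# Decay from 2e-5 to 0 over 900 steps once the 100-step warmup is done (illustrative)
decay = tf.keras.optimizers.schedules.PolynomialDecay(
    initial_learning_rate=2e-5,
    decay_steps=900,
    end_learning_rate=0.0,
)
lr_schedule = WarmUp(
    initial_learning_rate=2e-5,
    decay_schedule_fn=decay,
    warmup_steps=100,
)
optimizer = tf.keras.optimizers.Adam(learning_rate=lr_schedule)
```
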
## Gradient Strategies

### GradientAccumulator (TensorFlow)

[[autodoc]] GradientAccumulator

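
A sketch of the accumulate-then-apply pattern: gradients from several micro-batches are summed in the accumulator and only applied (then reset) every `accumulation_steps` steps. The tiny model, random data, and step counts are placeholders:

```python
import tensorflow as tf
from transformers import GradientAccumulator

model = tf.keras.Sequential([tf.keras.layers.Dense(1)])  # stand-in for any model
optimizer = tf.keras.optimizers.SGD(learning_rate=1e-2)
accumulation_steps = 4
accumulator = GradientAccumulator()

for micro_batch in range(8):
    x = tf.random.normal((2, 10))  # placeholder micro-batch
    with tf.GradientTape() as tape:
        loss = tf.reduce_mean(model(x) ** 2)  # placeholder loss
    grads = tape.gradient(loss, model.trainable_variables)
    accumulator(grads)  # add this micro-batch's gradients to the running sum
    if accumulator.step == accumulation_steps:
        optimizer.apply_gradients(zip(accumulator.gradients, model.trainable_variables))
        accumulator.reset()  # start accumulating for the next update
```
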