# Optimization

The `.optimization` module provides:

- an optimizer with weight decay fixed that can be used to fine-tune models,
- several schedules in the form of schedule objects that inherit from `_LRSchedule`, and
- a gradient accumulation class to accumulate the gradients of multiple batches.

A short sketch combining an optimizer and a schedule in a training loop appears at the end of this page.

## AdamW (PyTorch)

[[autodoc]] AdamW

## AdaFactor (PyTorch)

[[autodoc]] Adafactor

## AdamWeightDecay (TensorFlow)

[[autodoc]] AdamWeightDecay

[[autodoc]] create_optimizer

## Schedules

### Learning Rate Schedules (PyTorch)

[[autodoc]] SchedulerType

[[autodoc]] get_scheduler

[[autodoc]] get_constant_schedule

[[autodoc]] get_constant_schedule_with_warmup

[[autodoc]] get_cosine_schedule_with_warmup

[[autodoc]] get_cosine_with_hard_restarts_schedule_with_warmup

[[autodoc]] get_linear_schedule_with_warmup

[[autodoc]] get_polynomial_decay_schedule_with_warmup

[[autodoc]] get_inverse_sqrt_schedule

### Warmup (TensorFlow)

[[autodoc]] WarmUp

## Gradient Strategies

### GradientAccumulator (TensorFlow)

[[autodoc]] GradientAccumulator
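
The optimizer and schedule objects documented above are typically paired in a standard PyTorch training loop: the optimizer applies the decoupled weight decay, and the schedule adjusts the learning rate once per optimizer step. The snippet below is a minimal sketch, not part of the library's documentation; the toy `torch.nn.Linear` model, the learning rate, and the step counts are placeholder assumptions, while `AdamW` and `get_linear_schedule_with_warmup` are the objects documented on this page. Depending on your installed version, `transformers.AdamW` may emit a deprecation warning or be unavailable; `torch.optim.AdamW` is a drop-in replacement in that case.

```python
import torch
from transformers import AdamW, get_linear_schedule_with_warmup

# Placeholder model; in practice this would be a pretrained model being fine-tuned.
model = torch.nn.Linear(10, 2)

# AdamW applies the decoupled ("fixed") weight decay provided by the .optimization module.
optimizer = AdamW(model.parameters(), lr=5e-5, weight_decay=0.01)

num_training_steps = 1000
scheduler = get_linear_schedule_with_warmup(
    optimizer,
    num_warmup_steps=100,                   # linearly ramp the LR up over the first 100 steps
    num_training_steps=num_training_steps,  # then decay it linearly to zero
)

for step in range(num_training_steps):
    # ... forward pass and loss.backward() would go here ...
    optimizer.step()
    scheduler.step()       # advance the schedule once per optimizer step
    optimizer.zero_grad()
```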