Update README.md
README.md
CHANGED
@@ -12,6 +12,14 @@ The runs were all performed training a smaller ViT (`vit_wee_patch16_reg1_gap_25
So far I have results for `adamw`, `laprop`, and `mars`. You can find full results in sub-folders named after each optimizer. In all of these runs, the experiments with a 'c' prefix in the name have caution enabled.

This is what the 'caution' addition looks like in an optimizer:
```python
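# 1 where the momentum and the current gradient agree in sign, 0 elsewhere.
mask = (exp_avg * grad > 0).to(grad.dtype)
# Rescale by the fraction of surviving entries (clamped to avoid dividing by ~0).
mask.div_(mask.mean().clamp_(min=1e-3))
exp_avg = exp_avg * mask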
```
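To make the arithmetic concrete, here is a small dependency-free sketch that mirrors the snippet above using plain Python lists instead of tensors. The helper name `cautious_mask`, its `min_scale` parameter, and the input values are made up for illustration and are not part of any optimizer in this repo.

```python
def cautious_mask(exp_avg, grad, min_scale=1e-3):
    """Plain-Python mirror of the three-line tensor snippet (illustrative only)."""
    # 1.0 where momentum and gradient share a sign, 0.0 elsewhere.
    mask = [1.0 if m * g > 0 else 0.0 for m, g in zip(exp_avg, grad)]
    # Fraction of entries that survived, clamped from below like clamp_(min=1e-3).
    scale = max(sum(mask) / len(mask), min_scale)
    mask = [v / scale for v in mask]
    # Zero out the disagreeing momentum entries and scale up the rest.
    return [m * v for m, v in zip(exp_avg, mask)]


# Hypothetical momentum and gradient values: entries 0 and 3 agree in sign.
exp_avg = [0.5, 0.2, 0.3, -0.4]
grad = [1.0, -0.1, -0.2, -0.3]
masked = cautious_mask(exp_avg, grad)
print(masked)  # [1.0, 0.0, 0.0, -0.8]
```

Half of the entries agree in sign here, so the mask mean is 0.5 and the surviving momentum entries are doubled while the others are zeroed. If no entry agrees, the clamp keeps the division finite and the masked momentum is simply all zeros.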
# LaProp

|optim |best_epoch|train_loss |eval_loss |eval_top1 |eval_top5 |lr |