Mirror of https://github.com/ultralytics/ultralytics (synced 2026-04-29 11:29:16 +00:00)
Motivation
fastvit-s x adaptor diverges at full scale on 7-source training (final kNN
5.9%, chance level). Forensic smoke tests ruled out norm hot-swap, beta2
sweep, fixed-wd changes, and BN running-stat freezes. Two recipe-level
mismatches with the DINOv3 / EUPE / UNIC / DUNE distillation papers remained:
* our pipeline still pulls the Ultralytics defaults (RandAugment +
RandomErasing 0.4) from cfg/default.yaml, while every reference recipe
disables both and instead uses ColorJitter + Grayscale + GaussianBlur +
Solarize;
* we use a fixed weight_decay of 0.02 with ~1% warmup, while DINOv3 ramps
wd 0.04 -> 0.2 over training and warms up for 16% of epochs.
What changed
callbacks/distill_aug.py: add classify_augmentations_distill, a sibling of
ultralytics/data/augment.py:classify_augmentations. Same signature plus
grayscale, gaussian_blur, solarize knobs (default 0.0, which is
bit-equivalent to upstream). Order mirrors UNIC main_unic.py:485-521. Kept
out of ultralytics/data/ to avoid touching the upstream cls training pipeline.
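
The knob gating described above can be sketched roughly as follows. This is not the code in callbacks/distill_aug.py: the function names, the blur kernel size and sigma range, the solarize threshold, and the op order are all assumptions made for illustration (the order simply follows the Grayscale, GaussianBlur, Solarize listing in the motivation and was not checked against main_unic.py). The point it shows is that a knob left at 0.0 contributes no op at all, which is how the default stays equivalent to the upstream pipeline.

```python
def distill_extra_specs(grayscale=0.0, gaussian_blur=0.0, solarize=0.0):
    """Return (transform name, kwargs, RandomApply prob or None) for each enabled knob.

    Hypothetical helper: a knob at 0.0 adds nothing, so the defaults leave
    the upstream transform list unchanged.
    """
    specs = []
    if grayscale > 0:
        specs.append(("RandomGrayscale", {"p": grayscale}, None))
    if gaussian_blur > 0:
        # kernel_size and sigma are assumed values, not taken from the commit.
        specs.append(("GaussianBlur", {"kernel_size": 9, "sigma": (0.1, 2.0)}, gaussian_blur))
    if solarize > 0:
        # threshold=128 assumes 8-bit PIL / uint8 inputs.
        specs.append(("RandomSolarize", {"threshold": 128, "p": solarize}, None))
    return specs


def build_transforms(specs):
    """Turn the specs into torchvision transform objects (torchvision imported lazily)."""
    import torchvision.transforms as T

    ops = []
    for name, kwargs, apply_p in specs:
        op = getattr(T, name)(**kwargs)
        # GaussianBlur has no probability argument of its own, so wrap it.
        ops.append(T.RandomApply([op], p=apply_p) if apply_p is not None else op)
    return ops
```

Calling `build_transforms(distill_extra_specs(grayscale=0.2, gaussian_blur=0.5, solarize=0.2))` would give the three extra ops for the dinov3 recipe values; with no arguments the list is empty.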
callbacks/wd_schedule.py: half-cosine wd ramp matching DINOv3
dinov3/optim/schedulers.py CosineSchedule, registered DDP-safe inside
the trainer __init__ (per utils/dist.py:79 callbacks-on-rank-0 footgun).
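
For reference, a half-cosine ramp of this kind can be sketched as below. It assumes the usual form wd(t) = wd_end + 0.5 * (wd_start - wd_end) * (1 + cos(pi * t / T)) and was not copied from dinov3/optim/schedulers.py or callbacks/wd_schedule.py. The names cosine_wd and apply_wd are hypothetical, and plain dicts stand in for optimizer param groups. Skipping groups whose decay is 0.0 is also an assumption, intended to leave no-decay groups (for example biases and norm weights) untouched.

```python
import math


def cosine_wd(step, total_steps, wd_start=0.04, wd_end=0.2):
    """Half-cosine interpolation from wd_start (step 0) to wd_end (last step)."""
    # Clamp progress to [0, 1] so out-of-range steps hold the endpoint values.
    t = min(max(step, 0), total_steps) / max(total_steps, 1)
    return wd_end + 0.5 * (wd_start - wd_end) * (1.0 + math.cos(math.pi * t))


def apply_wd(param_groups, wd):
    """Write wd into every group that already uses weight decay."""
    for group in param_groups:
        if group.get("weight_decay", 0.0) > 0:
            group["weight_decay"] = wd
```

With the recipe values 0.04 -> 0.2, the ramp starts slowly, passes 0.12 at the midpoint, and flattens out as it approaches 0.2.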
ultralytics/cfg/__init__.py: extend allowed_custom_keys with wd_end,
grayscale, gaussian_blur, solarize so DDP arg serialisation passes.
ultralytics/models/yolo/classify/train_image_encoder.py: switch
_build_transforms to classify_augmentations_distill and forward the
three new self.args knobs; register wd_schedule callback when wd_end > 0.
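
The registration gate can be illustrated with a small stand-in. TinyTrainer and record_epoch are hypothetical and only mimic an event-to-callbacks registry; this is not the train_image_encoder.py code. The sketch assumes that each DDP rank constructs its own trainer from the serialized args, so a callback attached inside __init__ exists on every rank, whereas one attached from outside would exist only in the launching process.

```python
from collections import defaultdict
from types import SimpleNamespace


def record_epoch(trainer):
    """Placeholder callback: a real one would update the weight decay here."""
    trainer.seen.append(trainer.epoch)


class TinyTrainer:
    """Hypothetical stand-in for a trainer with named callback events."""

    def __init__(self, args):
        self.args = args
        self.epoch = 0
        self.seen = []
        self.callbacks = defaultdict(list)
        # Gate: only register the schedule when a ramp target is configured.
        if getattr(args, "wd_end", 0.0) > 0:
            self.add_callback("on_train_epoch_start", record_epoch)

    def add_callback(self, event, fn):
        self.callbacks[event].append(fn)

    def run_callbacks(self, event):
        for fn in self.callbacks[event]:
            fn(self)
```

Constructing `TinyTrainer(SimpleNamespace(wd_end=0.2))` registers one callback, while `wd_end=0.0` registers none, so recipes without a ramp keep their fixed weight decay.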
run_enc_distill_phase1.py: new dinov3 recipe (lr0=2e-4, wd 0.04->0.2,
warmup 18 ep, ColorJitter 0.4/0.4/0.2/0.1, grayscale 0.2, blur 0.5,
solarize 0.2, auto_augment off, erasing off) plus override forwarding.
Existing default / eupe / radio / unic recipes untouched.
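
As an illustration only, the recipe plus override forwarding could look like the sketch below. The values come from the description above; the dict layout, the resolve_recipe helper, the color_jitter and warmup_epochs key names, and the use of None / 0.0 to mean "off" are assumptions rather than the contents of run_enc_distill_phase1.py.

```python
RECIPES = {
    "dinov3": {
        "lr0": 2e-4,
        "weight_decay": 0.04,  # start of the ramp
        "wd_end": 0.2,         # end of the ramp
        "warmup_epochs": 18,
        "color_jitter": (0.4, 0.4, 0.2, 0.1),  # hypothetical key name
        "grayscale": 0.2,
        "gaussian_blur": 0.5,
        "solarize": 0.2,
        "auto_augment": None,  # assumed encoding of "off"
        "erasing": 0.0,        # assumed encoding of "off"
    },
}


def resolve_recipe(name, overrides=None):
    """Return a copy of the named recipe with user overrides applied on top."""
    cfg = dict(RECIPES[name])  # copy so the stored recipe is never mutated
    cfg.update(overrides or {})
    return cfg
```

For example, `resolve_recipe("dinov3", {"lr0": 1e-4})` changes only the learning rate and leaves the stored recipe as it was.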
| File |
|---|
| __init__.py |
| alpha_schedule.py |
| attn_prescale.py |
| beta2_override.py |
| cls_to_det_remap.py |
| distill_aug.py |
| droppath.py |
| grad_clip.py |
| grayscale.py |
| mixup.py |
| muon_w.py |
| nfs_sync.py |
| paths.py |
| wandb_config.py |
| wd_schedule.py |