
The Gentle Collapse: Distributional Metrics for Continual Learning

Ahmed Anwar1,2, Andreas Wagner3, Federico Raue1, Tobias Christian Nauen1,2, Andreas Dengel1,2

1DFKI · Smart Data & Knowledge Services  ·  2RPTU Kaiserslautern–Landau  ·  3Hochschule Karlsruhe University of Applied Sciences

Figure: teaser for The Gentle Collapse: Distributional Metrics for Continual Learning.
tl;dr: Accuracy alone hides how models forget in continual learning. We introduce six softmax-derived metrics covering rank, confidence, and distributional divergence that expose class-level forgetting patterns invisible to accuracy. Using these metrics as loss weights or replay sampling criteria reduces forgetting by up to 7.7 pp on TinyImageNet over uniform experience replay.

Abstract

Accuracy degradation is the standard metric for catastrophic forgetting (CF), but it records only whether forgetting occurred. It saturates at the extremes and collapses discretely at task boundaries, hiding the internal structure of what is being forgotten. We introduce six softmax-derived metrics spanning true-label rank (TLR), predictive confidence, and distributional divergence that characterize forgetting continuously; each is normalized to [0, 1] and requires no modification to training. On CIFAR-100, these metrics carry information where accuracy does not: at 0% accuracy, the Confusion Margin spans an IQR of [0.32, 0.50] across classes that accuracy treats identically. We demonstrate that this richer signal is actionable for mitigating catastrophic forgetting. Per-sample metric scores used as loss weights reduce forgetting by 1.3 percentage points over uniform experience replay (ER) on CIFAR-100. Furthermore, the slope of a metric over a small window provides a stable sampling criterion: at a small window size (e.g., 3 epochs), accuracy-trend degrades to 34.79% (std. = 2.32) while log-TLR achieves 41.07% (std. = 0.57). This gap is structural, since reliable small-window trend estimation requires a continuous signal. On TinyImageNet, log-TLR trend sampling reduces forgetting by 7.7 percentage points over the ER baseline.
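
To make the small-window argument concrete, the following is a minimal sketch of a log-scaled true-label rank score and a trend-based replay weighting. The exact metric definitions are in the paper and are not reproduced on this page, so everything here is an assumption for illustration: the normalization `1 - log(rank) / log(num_classes)`, the least-squares slope, the rule "steeper decline gets more replay", and the function names `log_tlr_score`, `trend_slope`, and `replay_weights` are all hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def log_tlr_score(probs, label):
    """Assumed log-scaled true-label rank, mapped to [0, 1].

    Returns 1.0 when the true class has the highest probability and
    0.0 when it is ranked last. This normalization is an illustrative
    assumption, not the paper's stated formula.
    """
    n = len(probs)
    if n < 2:
        return 1.0
    # Rank 1 means no other class has a strictly higher probability.
    rank = 1 + sum(1 for p in probs if p > probs[label])
    return 1.0 - math.log(rank) / math.log(n)


def trend_slope(history):
    """Least-squares slope of a metric over equally spaced epochs."""
    n = len(history)
    if n < 2:
        return 0.0
    x_mean = (n - 1) / 2
    y_mean = sum(history) / n
    num = sum((i - x_mean) * (y - y_mean) for i, y in enumerate(history))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den


def replay_weights(slopes, eps=1e-8):
    """Assumed sampling rule: classes whose score falls fastest get
    proportionally more replay; non-declining classes get ~0 weight."""
    urgency = [max(-s, 0.0) + eps for s in slopes]
    total = sum(urgency)
    return [u / total for u in urgency]


# A sample whose true class (index 1) is ranked second out of four.
score = log_tlr_score(softmax([2.0, 1.0, 0.1, -1.0]), label=1)

# Three-epoch windows for one class that is already at 0% accuracy:
# accuracy is flat, while the continuous score still shows a decline.
accuracy_slope = trend_slope([0.0, 0.0, 0.0])
log_tlr_slope = trend_slope([0.6, 0.5, 0.4])

# Per-class slopes turned into replay sampling probabilities.
weights = replay_weights([-0.2, 0.0, -0.6])
```

In this toy window the accuracy slope is exactly zero, so an accuracy-based trend cannot distinguish this class from a stable one, whereas the continuous score yields a negative slope that the sampler can act on.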

For more information, see the paper PDF.

Citation

If you use this work, please cite our paper:

BibTeX
@misc{anwar2026gentlecollapsedistributionalmetrics,
      title={The Gentle Collapse: Distributional Metrics for Continual Learning},
      author={Ahmed Anwar and Andreas Wagner and Federico Raue and Tobias Nauen and Andreas Dengel},
      year={2026},
      eprint={2606.25165},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2606.25165},
}
