Scaling Laws vs. Neural Laws: Toward More Natural Artificial Vision
Abstract
The remarkable progress of modern computer vision has been propelled by the relentless logic of scaling laws: bigger models, more data, more compute, predictably better performance. On benchmarks like ImageNet, deep networks now match or even surpass human accuracy. Yet beneath these headline results, the alignment with human vision is fragile: on deceptively simple probes from the cognitive sciences, even the largest models drop to near-chance, and on ImageNet itself, models that reach human accuracy do so by strikingly different visual strategies — a divergence that, troublingly, widens with scale.
In this plenary, I will argue that the path to more natural artificial vision lies not in pushing scaling laws further, but in a deeper engagement with the neural laws of biological vision: developmental principles that shape how brains learn to see, and architectural constraints that impose strong inductive biases on cortical processing. I will share recent work from my lab on two such laws. On the learning side, I will present preliminary evidence that pairing the right learning objectives with naturalistic video — sequences of object transformations like those the developing brain encounters — can pull deep networks toward markedly more human-like visual strategies. On the architectural side, I will show how recent advances in state space models can scale cortical recurrent feedback into a brain-inspired alternative to transformer self-attention, one that closes the gap on the cognitive probes where transformers fail and, on ImageNet, follows more favorable scaling laws than transformers do.
Together, these results point toward a future in which scaling laws and neural laws are in agreement rather than in tension, and in which computer vision, in dialogue with the brain sciences, helps build AI systems that are not only more capable but more aligned with the kind of intelligence we ultimately seek to understand and emulate.
Speaker