Speaker
Description
We can view cortex from two fundamentally different perspectives: a powerful device for performing optimal inference, or an assembly of biological components not built for achieving statistical optimality. The former approach is attractive thanks to its elegance and potentially wide applicability; however, the basic facts of human pattern vision do not support it. Instead, they indicate that the idiosyncratic behaviour produced by visual cortex is largely dictated by its hardware components. The output of these components can be steered towards optimality by our cognitive apparatus, but only to a marginal extent. We conclude that current theories of visually guided behaviour are at best inadequate, and we turn to neural networks in an attempt to establish whether the idiosyncratic character of human vision may be learnt from a larger repertoire of functional constraints, such as the statistics of the natural environment. We challenge deep convolutional networks with the same stimuli and tasks used with human observers, and apply an equivalent characterization of the stimulus–response coupling. At a shallow depth of behavioural characterization, some combinations of network architecture and training protocol produce human-like trends; however, more articulate empirical descriptors expose glaring discrepancies. Our results urge caution in assessing whether neural networks do or do not capture human behaviour: ultimately, our ability to assess “success” in this area can only be as good as the depth of behavioural characterization against which the network is evaluated. More generally, our results provide a compelling demonstration of how far we still are from securing an adequate computational account of even the most basic operations carried out by human vision.