Divisive normalization processors in the early visual system of the Drosophila brain
Biological Cybernetics
https://doi.org/10.1007/s00422-023-00972-x
ORIGINAL ARTICLE
Aurel A. Lazar¹ · Yiyin Zhou¹,²
Received: 31 December 2022 / Accepted: 6 August 2023
© The Author(s) 2023
Abstract
Divisive normalization is a model of canonical computation of brain circuits. We demonstrate that two cascaded divisive
normalization processors (DNPs), carrying out intensity/contrast gain control and elementary motion detection, respectively,
can model the robust motion detection realized by the early visual system of the fruit fly. We first introduce a model of
elementary motion detection and rewrite its underlying phase-based motion detection algorithm as a feedforward divisive
normalization processor. We then cascade the DNP modeling the photoreceptor/amacrine cell layer with the motion detection
DNP. We extensively evaluate the DNP for motion detection in dynamic environments where light intensity varies by orders
of magnitude. The results are compared to other bio-inspired motion detectors as well as state-of-the-art optic flow algorithms
under natural conditions. Our results demonstrate the potential of DNPs as canonical building blocks modeling the analog
processing of early visual systems. The model highlights analog processing for accurately detecting visual motion, in both
vertebrates and invertebrates. The results presented here shed new light on employing DNP-based algorithms in computer
vision.
Keywords Motion detection · Divisive normalization · Drosophila · Phase processing · Gain control
1 Introduction
Robust motion detection is the key first processing step for
insects to safely navigate complex environments. Current
state-of-the-art computer vision algorithms achieve good performance under demanding navigation conditions. However,
under extreme conditions, their performance often quickly
degrades (Mathis et al. 2016; Li et al. 2018). Surprisingly,
the early vision system of the fruit fly is remarkably accurate at detecting motion in complex environments
under a vast range of light intensity conditions. The logic of
computation in the fly visual system is substantially different from the traditional methods of computation employed
by current man-made counterparts. This enables the fly to
navigate through terrains with rapid light intensity changes,
even though it only possesses a single photoreceptor type
(van Hateren 1997).
Communicated by Benjamin Lindner.
The authors’ names are listed in alphabetical order.
This article is published as part of the Special Issue on “What can Computer Vision learn from Visual Neuroscience?”.
✉ Aurel A. Lazar · Yiyin Zhou
¹ Department of Electrical Engineering, Columbia University, New York, NY 10027, USA
² Present Address: Department of Computer and Information Science, Fordham University, New York, NY 10023, USA
Modern optic flow algorithms may take seconds or even
minutes to compare two consecutive frames (Baker et al.
2011; Menze et al. 2018). Although deep neural network-based
algorithms have recently improved processing speed, their training time and the large
amounts of training data they require remain excessive. For low-level
tasks such as elementary motion detection, fly vision is far
more efficient, faster and more robust, without loss of precision during events that are critical for survival, such as
rapid predator attacks taking place on short time scales (hundreds of milliseconds). Strikingly, in fruit flies, as in many
other insects and mammals, processing delays are minimal. It takes only 3 synapses from photoreceptors to reach
the neurons responsible for detecting low-level directional
motion with minimal energy expenditure (Sy et al. 2013)
(see Fig. 1A); an efficient computational principle of
motion detection seems to be at work.
This calls for developing biologically informed robust
motion detection algorithms. Two half-century-old computational theories of motion detection, namely the Reichardt
motion detector (Hassenstein and Reichardt 1956) and
Barlow–Levick motion detector (Barlow and Levick 1965),
have dominated the field. Recent studies have unveiled the
basic anatomical structure of the fly’s motion detection pathways (Yang and Clandinin 2018; Borst et al. 2020). While
these and other studies have rapidly advanced our understanding of motion detection in the early vision system of the
fruit fly, the underlying models have yet to capture the surprising robustness of fly vision. In
Lazar et al. (2016), we compared the two prevailing models
of fly motion detection with a more complex phase-based
algorithm that we devised. Under different luminance and
contrast conditions, we demonstrated that (i) none of the three
algorithms could fully account for motion in natural scenes,
and (ii) the detection of motion was not robust at low luminance/contrast levels. This suggests fundamental limits in
current modeling approaches to visual motion detection that
are narrowly focused on simple feedforward motion detection mechanisms. The latter do not match the vastly superior
performance of the motion detection circuits in flies.
There is strong evidence showing that divisive normalization may contribute to gain control in olfaction (Olsen
et al. 2010), vertebrate retina (Beaudoin et al. 2007), primary
visual cortex (Carandini and Heeger 1994), primary auditory cortex (Rabinowitz et al. 2011) and sensory integration
(Ohshiro et al. 2017). Feedforward divisive normalization
has been proposed as a model of canonical neural computation (Carandini and Heeger 2012), and used in nonlinear
image representation for achieving perceptual advantages
(Lyu and Simoncelli 2008). This computation is key to many
sensory processing circuits underlying adaptation and attention. Feedforward normalization is also frequently used in
deep neural networks (Goodfellow et al. 2016; Ioffe and
Szegedy 2015).
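To make the gain-control role of feedforward divisive normalization concrete, the following is a minimal sketch in Python. It assumes the commonly used static form in which each channel's drive, raised to a power, is divided by a constant plus the pooled drive of all channels. The function name `divisive_normalization` and the parameters `sigma` and `n` are illustrative choices, not taken from this article, and the sketch is not the DNP model developed here.

```python
import numpy as np


def divisive_normalization(drive, sigma=1.0, n=2.0):
    """Static feedforward divisive normalization (illustrative sketch).

    Computes r_i = |d_i|^n / (sigma^n + sum_j |d_j|^n) along the last axis.
    `sigma` and `n` are hypothetical parameter names chosen for this example.
    """
    drive = np.asarray(drive, dtype=float)
    powered = np.abs(drive) ** n
    # Pool the powered drive of all channels; sigma keeps the ratio finite
    # when the total drive is close to zero.
    pool = sigma ** n + powered.sum(axis=-1, keepdims=True)
    return powered / pool


# The same spatial pattern presented at two intensities that differ by
# three orders of magnitude.
pattern = np.array([1.0, 2.0, 4.0])
dim = divisive_normalization(100.0 * pattern)
bright = divisive_normalization(100000.0 * pattern)
```

When the pooled drive is large compared with `sigma`, the two outputs are nearly identical: the normalized response depends on the relative pattern across channels rather than on the absolute intensity, which is the property that makes this computation attractive for gain control.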
In Lazar et al. (2020), we presented a class of divisive normalization processors (DNPs) that operate in the time and the
space-time domain. A DNP example is shown in the left block
of Fig. 1B. Each DNP channel exhibits Volterra processors
(VP) in a feedforward and local feedback divisive normalization branch. In addition, a multi-input Volterra processor
(MVP) provides global feedback. With input from all channel outputs, the MVP provides feedback into each channel.
This type of MIMO circuit architecture has been observed in
many neural systems (Lazar et al. 2022b).
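One way to read the channel structure described above is sketched below in LaTeX. The operator symbols are assumptions introduced for illustration: $\mathcal{T}^{1}$ and $\mathcal{T}^{2}$ stand for feedforward Volterra processors acting on the channel input $u^{i}$, $\mathcal{T}^{3}$ for the local feedback Volterra processor acting on the channel output $v^{i}$, and $\mathcal{L}$ for the multi-input Volterra processor acting on the vector $\mathbf{v}$ of all $N$ channel outputs. The placement of each term in the numerator or denominator is likewise an assumption; Lazar et al. (2020) should be consulted for the precise definition.

```latex
% Illustrative form of a single DNP channel (symbols are assumptions):
% feedforward drive divided by feedforward, local-feedback and
% global-feedback normalization terms.
v^{i}(t) =
  \frac{\left(\mathcal{T}^{1} u^{i}\right)(t)}
       {\left(\mathcal{T}^{2} u^{i}\right)(t)
        + \left(\mathcal{T}^{3} v^{i}\right)(t)
        + \left(\mathcal{L}\,\mathbf{v}\right)(t)},
\qquad i = 1, \dots, N .
```

In this reading, dropping the two feedback terms in the denominator leaves a purely feedforward divisive normalization processor, the kind of structure the abstract refers to for the motion detection stage.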
Stimuli processed by DNPs can be faithfully recovered
from the output (Lazar et al. 2022a), suggesting that no
information is lost during processing. We posit that a feedforward/feedback divisive normalization processor is embedded
in every layer of the motion detection pathway (see Fig. 1A),
including the lamina, medulla and a single layer in the lobula.
In this paper, we demonstrate th (...truncated)