A fast and accurate zebra finch syllable detector

RESEARCH ARTICLE

Ben Pearre1*, L. Nathan Perkins1, Jeffrey E. Markowitz2, Timothy J. Gardner1

1 Department of Biology, Boston University, Boston, Massachusetts, United States of America
2 Department of Neurobiology, Harvard Medical School, Boston, Massachusetts, United States of America

OPEN ACCESS

Citation: Pearre B, Perkins LN, Markowitz JE, Gardner TJ (2017) A fast and accurate zebra finch syllable detector. PLoS ONE 12(7): e0181992. https://doi.org/10.1371/journal.pone.0181992

Editor: Brenton G. Cooper, Texas Christian University, UNITED STATES

Abstract

The song of the adult male zebra finch is strikingly stereotyped. Efforts to understand motor output, pattern generation, and learning have taken advantage of this consistency by investigating the bird’s ability to modify specific parts of song under external cues, and by examining timing relationships between neural activity and vocal output. Such experiments require that precise moments during song be identified in real time as the bird sings. Various syllable-detection methods exist, but many require special hardware, software, and know-how, and details on their implementation and performance are scarce. We present an accurate, versatile, and fast syllable detector that can control hardware at precisely timed moments during zebra finch song. Many moments during song can be isolated and detected with false negative and false positive rates well under 1% and 0.005% respectively. The detector can run on a stock Mac Mini with triggering delay of less than a millisecond and a jitter of σ ≈ 2 milliseconds.

Received: September 15, 2016
Accepted: March 31, 2017
Published: July 28, 2017

Copyright: © 2017 Pearre et al.
This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability Statement: Song data used for training and testing are available at 10.17605/OSF.IO/BX76R. The four software packages are available under Open Source licenses from DOIs listed in Appendix A of the manuscript, and also here: 10.5281/zenodo.437555, 10.5281/zenodo.437557, 10.5281/zenodo.437559, 10.5281/zenodo.437558.

Funding: This work was funded by NIH grants 5R01NS089679-02 and 5U01NS090454-02. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

1 Introduction

The adult zebra finch (Taeniopygia guttata) sings a song made up of 2–6 syllables, with longer songs taking on the order of a second. The song may be repeated hundreds of times per day, and is almost identical each time. Several brain areas reflect this consistency in highly stereotyped neural firing patterns, which makes the zebra finch one of the most popular models for the study of the neural basis of learning, audition, and control. If precise moments in song can reliably be detected quickly enough to trigger other apparatus during singing, then this consistency of behaviour allows a variety of experiments.

A common area of study with song-triggered experiments is the anterior forebrain pathway (AFP), a homologue of mammalian basal ganglia consisting of a few distinct brain areas concerned with the learning and production of song. For example, stimulation of the lateral magnocellular nucleus of the anterior nidopallium (LMAN), the output nucleus of the AFP, at precisely timed moments during song showed that this area controls specific variables in song output [1].
Song-synchronised stimulation of LMAN and the high vocal centre (HVC) in one hemisphere or the other showed that control of song rapidly switches between hemispheres [2]. Feedback experiments have shown that Field L and the caudolateral mesopallium may hold a representation of song against which auditory signals are compared [3].

The disruption by white noise of renditions of a syllable that were slightly above (or below) the syllable’s average pitch showed that the apparently random natural variability in songbird motor output is used to drive change in the song [4], and that the AFP produces a corrective signal to bias song away from those disruptions [5]. The song change is isolated to within roughly 10 milliseconds (ms) of the stimulus, and the shape of the learned response can be predicted by a simple mechanism [6]. The AFP transfers the error signal to the robust nucleus of the arcopallium (RA) using NMDA-receptor–mediated glutamatergic transmission [7]. The course of song recovery after applying such a pitch-shift paradigm showed that the caudal medial nidopallium is implicated in memorising or recalling a recent song target, but in neither auditory processing nor directed motor learning [8].

Despite the power and versatility of vocal feedback experiments, there is no standard syllable detector. Desiderata for such a detector include:

Accuracy: How often does the system produce false positives or false negatives?
Latency: The average delay between the target syllable being sung and the detection.
Jitter: The amount that latency changes from instance to instance of song. Our measure of jitter is the standard deviation of latency.
Versatility: Is detection possible at “difficult” syllables?
Ease of use: How automated is the process of programming a detector?
Cost: What are the hardware and software requirements?

A variety of syllable-triggering systems have been used, but few have been documented or characterised in detail. In 1999, detection was achieved by a group of IIR filters with hand-tuned logical operators [9]. The system had a latency of 50 or 100 ms, and accuracy and jitter were not reported. As access to computational resources has improved, approaches have changed: in 2009, hand-tuned filters were implemented on a Tucker-Davis Technologies digital signal processor, bringing latency down to around 4 ms [5]. But as with other filter-bank techniques, it is not strictly a syllable detector but rather a pitch and timbre detector (it cannot identify a frequency sweep, or distinguish a short chirp from a long one) and thus requires careful selection of target syllables. Furthermore, the method is neither inexpensive nor, based on our experience with a similar technique, accurate. The same year saw the application of a neural network to a spectral image of song [3]. Its authors reported a jitter of 4.3 ms, but further implementation and performance details are not available. In 2011, stable portions of syllables were matched to spectral templates in 8-ms segments [7]. This detector achieved a jitter of 4.5 ms, and false-negative and false-positive rates of (...truncated)