Imperfect pitch: Gabor’s uncertainty principle and the pitch of extremely brief sounds
Psychon Bull Rev
I-Hui Hsieh 0 1
Kourosh Saberi 0 1
0 Department of Cognitive Sciences, University of California, Irvine, CA 92697-5100, USA
1 Institute of Cognitive Neuroscience, National Central University, Taoyuan City, Taiwan
How brief must a sound be before its pitch is no longer perceived? The uncertainty tradeoff between temporal and spectral resolution (Gabor's principle) limits the minimum duration required for accurate pitch identification or discrimination. Prior studies have reported that pitch can be extracted from sinusoidal pulses as brief as half a cycle. This finding has been used in a number of classic papers to develop models of pitch encoding. We have found that phase randomization, which eliminates timbre confounds, degrades this ability to chance, raising serious concerns over the foundation on which classic pitch models have been built. The current study investigated whether subthreshold pitch cues may still exist in partial-cycle pulses and be revealed through statistical integration in a time series containing multiple pulses. To this end, we measured frequency-discrimination thresholds in a two-interval forced-choice task for trains of partial-cycle random-phase tone pulses. We found that residual pitch cues exist in these pulses, but discriminating them requires a frequency difference an order of magnitude (ten times) larger than that reported previously, necessitating a reevaluation of pitch models built on earlier findings. We also found that as pulse duration is decreased to less than two cycles, its pitch becomes biased toward higher frequencies, consistent with predictions of an auto-correlation model of pitch extraction.
Discrimination; Computational modeling; Pitch
In 1946, Dennis Gabor published his seminal work on
communication theory based on Heisenberg’s
uncertainty principle in quantum physics. He showed that one
cannot simultaneously specify a sound’s exact frequency
and time of occurrence. Encapsulated in the
mathematical inequality ΔfΔt ≥ 0.5, the theory states that there is a
tradeoff between temporal and spectral resolution. In
colloquial terms, the briefer the sound, the broader is
its observed spectrum. Transient sounds such as clicks
have broad bandwidths. Pure tones of long durations
have narrow bandwidths. The question then arises as
to the efficiency with which the auditory system can
perceptually encode the pitch of very brief sounds given
the limitations imposed on physical stimuli by Gabor’s
uncertainty principle.
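The tradeoff can be checked numerically. The sketch below is illustrative and not part of the original study. It assumes a Gaussian-envelope pulse on a complex 1500-Hz carrier (the signal shape that meets the bound with equality) and assumes Gabor's convention that the effective duration and bandwidth are √(2π) times the r.m.s. spread of the energy density in time and in frequency. The function name `gabor_widths`, the sampling rate, and the pulse widths are hypothetical choices made only for this example.

```python
import numpy as np


def gabor_widths(sigma_t, f0=1500.0, fs=200_000.0, n=2**16):
    """Return (dt, df): effective duration (s) and bandwidth (Hz).

    Assumption: effective width = sqrt(2*pi) * r.m.s. spread of the
    energy density, the convention under which dt * df >= 0.5.
    sigma_t is the r.m.s. spread (s) of the pulse energy |s(t)|^2.
    """
    t = (np.arange(n) - n // 2) / fs
    # Gaussian amplitude envelope whose squared magnitude has std sigma_t,
    # carried on a complex tone at f0 (an analytic-signal idealization).
    envelope = np.exp(-t**2 / (4.0 * sigma_t**2))
    s = envelope * np.exp(2j * np.pi * f0 * t)

    # Normalized energy density in time and its r.m.s. spread.
    p_t = np.abs(s) ** 2
    p_t /= p_t.sum()
    t_mean = (t * p_t).sum()
    rms_t = np.sqrt((((t - t_mean) ** 2) * p_t).sum())

    # Normalized energy density in frequency and its r.m.s. spread.
    spectrum = np.fft.fftshift(np.fft.fft(s))
    f = np.fft.fftshift(np.fft.fftfreq(n, d=1.0 / fs))
    p_f = np.abs(spectrum) ** 2
    p_f /= p_f.sum()
    f_mean = (f * p_f).sum()
    rms_f = np.sqrt((((f - f_mean) ** 2) * p_f).sum())

    scale = np.sqrt(2.0 * np.pi)
    return scale * rms_t, scale * rms_f


if __name__ == "__main__":
    for sigma_t in (1e-4, 1e-3, 5e-3):  # hypothetical pulse widths (s)
        dt, df = gabor_widths(sigma_t)
        print(f"dt = {dt * 1e3:.3f} ms, df = {df:.1f} Hz, product = {dt * df:.4f}")
```

Under these assumptions the product stays close to 0.5 while the bandwidth grows as the pulse shortens. Other pulse shapes, such as the rectangular-gated tones common in psychoacoustics, give a larger product.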
Several studies have investigated the minimum number of
pure-tone periods required for reliable identification or
discrimination of pitch (Freyman and Nelson, 1986; Henning,
1970; Hsieh and Saberi, 2007; Kietz, 1963; Konig, 1957;
Moore, 1973; Patterson et al., 1983; Robinson and Patterson,
1995; Ronken, 1971; Savart, 1830; Sekey, 1963; Turnbull,
1944; von Békésy, 1972). The question has been of interest
not only for what it can reveal about how pitch salience
declines as a function of duration, but also for what it may
contribute to models of pitch encoding (Freyman and Nelson,
1986; Hsieh and Saberi, 2007; Moore, 1973; Patterson et al.,
1983; Robinson and Patterson, 1995; Zwicker, 1970). To our
knowledge, two studies have attempted to evaluate pitch
extraction from partial- or single-cycle tones. Sipovsky et al.
(1972) reported a 2 % frequency discrimination threshold for
a 0.5-cycle pure tone (Δf=30 Hz at 1500 Hz) and Mark and
Rattay (1990) reported thresholds as low as 5 % for single-cycle
tones. One difficulty with interpreting the results of
these studies is that discrimination thresholds may not have
reflected pitch extracted from waveform fine structure, as
intended, but rather confounds associated with pulse duration and
phase. Given a fixed number of cycles, changing
stimulus frequency results in a change in duration and a
detectable change in timbre associated with burst
duration in a two-interval forced-choice (2IFC) task. This
is especially problematic for very brief tone pulses.
Decreasing pulse duration results in an upward shift in the
cutoff frequency of the pulse spectrum and hence an
increase in high-frequency energy that may be used in
a frequency discrimination task. Using zero-phase pulses
also introduces a timbre confound in a 2IFC frequency
discrimination task.
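The duration confound follows directly from the arithmetic: with the number of cycles held fixed, pulse duration is the cycle count divided by frequency, so any frequency change is also a duration change. The sketch below is illustrative only. The helper name `pulse_duration` is hypothetical, and the example values are the 0.5-cycle, 1500-Hz, Δf = 30 Hz condition quoted above.

```python
def pulse_duration(n_cycles, freq_hz):
    """Duration in seconds of a tone pulse containing n_cycles periods."""
    if freq_hz <= 0:
        raise ValueError("freq_hz must be positive")
    return n_cycles / freq_hz


def relative_duration_change(n_cycles, freq_hz, delta_f_hz):
    """Fractional shortening of the pulse when frequency rises by delta_f_hz."""
    d_standard = pulse_duration(n_cycles, freq_hz)
    d_comparison = pulse_duration(n_cycles, freq_hz + delta_f_hz)
    return (d_standard - d_comparison) / d_standard


if __name__ == "__main__":
    d1 = pulse_duration(0.5, 1500.0)
    d2 = pulse_duration(0.5, 1530.0)
    change = relative_duration_change(0.5, 1500.0, 30.0)
    print(f"standard: {d1 * 1e3:.4f} ms, comparison: {d2 * 1e3:.4f} ms")
    print(f"relative duration change: {change * 100:.2f} %")
```

For these values the two pulses last roughly 0.333 ms and 0.327 ms, a duration difference of about 2 %. The frequency step and the duration step are therefore of the same relative size, and a listener could in principle rely on either cue.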
The current study was designed to investigate
whether pitch cues may be extracted from the fine structure of
partial-cycle pure tones under conditions that
appropriately control for confounds. This has not been
previously demonstrated. When confounds are accounted for,
pitch discrimination performance is at chance for a
0.5-cycle pulse. However, this does not mean that
fine-structure pitch cues are inaccessible to the auditory system.
Subthreshold pitch cues may be detected (and hence
quantified) if vectorially summed in a time series
containing multiple pulses. In the current study, we
mea (...truncated)