The Journal of Neurophysiology Vol. 85 No. 3 March 2001, pp. 1220-1234
Copyright ©2001 by the American Physiological Society
1Institute for Systems Research and 2Department of Electrical and Computer Engineering, University of Maryland, College Park, Maryland 20742-3311
ABSTRACT
Depireux, Didier A., Jonathan Z. Simon, David J. Klein, and Shihab A. Shamma. Spectro-Temporal Response Field Characterization With Dynamic Ripples in Ferret Primary Auditory Cortex. J. Neurophysiol. 85: 1220-1234, 2001. To understand the neural representation of broadband, dynamic sounds in primary auditory cortex (AI), we characterize responses using the spectro-temporal response field (STRF). The STRF describes, predicts, and fully characterizes the linear dynamics of neurons in response to sounds with rich spectro-temporal envelopes. It is computed from the responses to elementary "ripples," a family of sounds with drifting sinusoidal spectral envelopes. The collection of responses to all elementary ripples is the spectro-temporal transfer function. The complex spectro-temporal envelope of any broadband, dynamic sound can be expressed as the linear sum of individual ripples. Previous experiments using ripples with downward drifting spectra suggested that the transfer function is separable, i.e., it is reducible into a product of purely temporal and purely spectral functions. Here we measure the responses to upward and downward drifting ripples, assuming separability within each direction, to determine if the total bidirectional transfer function is fully separable. In general, the combined transfer function for two directions is not symmetric, and hence units in AI are not, in general, fully separable. Consequently, many AI units have complex response properties such as sensitivity to direction of motion, though most inseparable units are not strongly directionally selective. We show that for most neurons, the lack of full separability stems from differences between the upward and downward spectral cross-sections but not from the temporal cross-sections; this places strong constraints on the neural inputs of these AI units.
INTRODUCTION
Only a few general organizational features are known in primary auditory cortex (AI). They include a spatially ordered tonotopic axis (Evans et al. 1965), bands of alternating binaural response properties (Imig and Adrian 1977; Middlebrooks et al. 1980), and a variety of other response features that change systematically along the isofrequency planes such as thresholds (Heil et al. 1994; Schreiner et al. 1992), bandwidths (Schreiner and Sutter 1992), FM selectivity (Heil et al. 1992; Mendelson et al. 1993; Shamma et al. 1993), and asymmetry of response areas (RAs; the span of frequencies that influence, both through excitation and inhibition, the response of a cell) (Shamma et al. 1993). To derive a functionally coherent picture of these maps, it is necessary to integrate these features within a comprehensive descriptor of the unit responses; one that can be quantitatively derived and employed to predict responses to novel stimuli.
Traditionally measured response areas are inadequate because they rarely include response dynamics and cannot be used to predict responses quantitatively. An alternative is the response field (RF) (Schreiner and Calhoun 1994; Shamma et al. 1995), a static, purely spectral function analogous to the RA except for the use of broadband sounds (but see Nelken et al. 1994; Sutter et al. 1996). A dynamic generalization of the RF is the spectro-temporal response field (STRF), a characteristic function of a neuron obtained using broadband sounds (Aertsen and Johannesma 1981; deCharms et al. 1998; Eggermont 1993 and references therein; Escabi and Schreiner 1999; Kowalski et al. 1996a; Kvale and Schreiner 1995; Theunissen et al. 2000). A schematic of an idealized STRF is illustrated in Fig. 1. Qualitatively, its spectral axis reflects the range of frequencies that influence the response or firing rate of the neuron being characterized, and its temporal axis reflects how this influence changes as a function of time. Positive-valued regions of the STRF describe excitatory influence, and negative regions describe inhibitory influence. The interplay between the spectral and temporal axes can give multiple interpretations to the STRF, e.g., as a time-evolving spectral response field or a family of impulse responses labeled by frequency band.
Over the last few years, we have developed new methods to derive the STRFs and characterize the responses of both single and multiple units in the ferret AI (Kowalski et al. 1996a,b). These methods use "moving ripples": time-varying broadband sounds with sinusoidal spectral envelopes that drift at a constant velocity along the logarithmic frequency axis. Figure 2 illustrates the spectrogram of such a stimulus. Neuronal responses are vigorous and well phase-locked to these spectral and temporal envelope modulations over a range of ripple velocities and densities. Measuring the amplitude and phase of the locked component of the response enables one to construct transfer functions. A transfer function can be inverse-Fourier transformed to obtain the STRF that characterizes a unit's dynamics and selectivity along the tonotopic axis.
In developing these measurement and analysis methods, we use two fundamental assumptions. The first is that the responses are substantially linear with respect to the time-varying spectral envelope of stimuli. In particular, this implies that the response to a spectro-temporally rich stimulus (whose envelope can always be described as the sum of multiple moving ripples) will be the sum of the responses to the individual ripple components. This assumption was confirmed by successfully predicting responses to the superposition of multiple ripples (Kowalski et al. 1996b).
The second important assumption deals with the separability of the
temporal and spectral aspects of the responses. Specifically we have
demonstrated in other reports that temporal and spectral transfer
functions can be measured independently of each other and then combined
with a simple product to compute the total transfer function
(Kowalski et al. 1996a). The importance of this finding stems from its experimental implications for measuring the STRFs and
theoretical consequences for the biophysical and functional models of
the STRFs. On the experimental side, separability makes it possible to
infer responses to all ripple velocities and peak densities based on
only a pair of temporal and spectral transfer functions. Without this
assumption, measuring the two-dimensional transfer function is
difficult because of the extended times needed to collect adequate
spike counts. On the theoretical side, separability suggests that
certain features of the STRF (as we shall discuss in detail in the
following text) are formed by independent (and likely sequential)
spectral and temporal processing stages.
In our earlier study (Kowalski et al. 1996a),
separability was validated for ripples moving only in one direction
(spectral envelope moving downward in frequency), a notion also known
as "quadrant separability." In this report, we compare the
separable functions (spectral and temporal) across upward and downward
quadrants. If the functions are the same across quadrants, the
responses are "fully separable" (i.e., they are separable);
otherwise they are quadrant separable, which is a (specialized) form of inseparability.
Like quadrant separability, full separability has experimental and
theoretical implications. On the experimental side, fully separable
STRFs can be measured with either upward or downward moving ripples.
Theoretically, fully separable responses imply an STRF that is fully
decomposable into the product of a purely temporal impulse response and
a purely spectral response field. It also implies a unit that responds
equally well to upward and downward moving ripples and hence has
necessarily a symmetric transfer function magnitude with respect to
direction (Watson and Ahumada 1985). By contrast, cells
that are only quadrant separable necessarily respond in asymmetric
fashion with respect to direction, i.e., are direction sensitive.
We restrict our presentation in this paper to measurements with singly
presented moving ripples in contrast to simultaneously presented
ripples discussed in Klein et al. (2000).
There are several goals of this paper. We present a method of measuring the complete descriptor of the linear spectro-temporal properties of an auditory cell, the STRF. We describe examples of STRFs measured in AI and summarize the distribution of the STRF and transfer function parameters encountered. We show that there is a directional sensitivity in the response to the upward versus downward moving components of a sound's spectral envelope. This breaks the symmetry of full spectro-temporal separability and produces quadrant separability. We propose measures to quantify quadrant and full separability. Finally, we discuss the significance of the results and their relationship to results from similar auditory and analogous visual experimental paradigms.
METHODS
Surgery and animal preparation
Data were collected from a total of 11 domestic ferrets
(Mustela putorius) supplied by Marshall Farms (Rochester,
NY). The ferrets were anesthetized with pentobarbital sodium (40 mg/kg) and maintained under deep anesthesia during the surgery. Once the
recording session started, a combination of ketamine (8 mg · kg⁻¹ · h⁻¹), xylazine (1.6 mg · kg⁻¹ · h⁻¹), atropine (10 µg · kg⁻¹ · h⁻¹), and dexamethasone (40 µg · kg⁻¹ · h⁻¹) was given throughout the experiment by continuous intravenous infusion, together with dextrose, 5% in Ringer solution, at a rate of 1 ml · kg⁻¹ · h⁻¹ to maintain metabolic stability. The
ectosylvian gyrus, which includes the primary auditory cortex, was
exposed by craniotomy and the dura was reflected. The contralateral ear
canal was exposed and partly resected, and a cone-shaped speculum
containing a miniature speaker (Sony MDR-E464) was sutured to the
meatal stump. For more details on the surgery, see Shamma et al. (1993).
Recordings
Action potentials from single units were recorded using
glass-insulated tungsten microelectrodes with 5-7 MΩ tip impedance at 1 kHz. Neural signals were fed through a window discriminator, and
the time of spike occurrence relative to stimulus delivery was stored
using a computer. In each animal, electrode penetrations were made
orthogonal to the cortical surface. In each penetration, cells were
typically isolated at depths of 350-600 µm corresponding to cortical
layers III and IV (Shamma et al. 1993). In many
instances, it was difficult to isolate reliably a single unit for
extended recordings, and hence several units were recorded instead.
Such data were labeled "multiunit recordings" and are explicitly
designated as such and separated from the single-unit records in all
data presentations in the paper.
Acoustic stimuli
All stimuli are computer synthesized. For each unit isolated, initial tests are carried out using tonal stimuli to measure the basic frequency response at several intensities to determine the best frequency (BF) and response threshold. All other stimuli used in these experiments have broadband spectra with a sinusoidally modulated (or rippled) envelope. We used the knowledge of the cell's BF to adjust the frequency range of the broadband sound so that the cell's excitatory and inhibitory regions lay well within the frequency range of the sounds.
In practice, it is hard to generate noise and then shape it with
filters to a desired dynamic spectral envelope, so we generate ripples
over a range of five octaves by taking logarithmically spaced pure
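tones with random (temporal) phases. The amplitude S(t, x) of each tone is then

$$S(t, x) = L\,\bigl(1 + \Delta A \cdot \sin[2\pi(w t + \Omega x) + \Phi]\bigr) \qquad (1)$$

where t is time, x is position along the logarithmic frequency axis (in octaves), and L is the overall level of the stimulus. The stimulus parameters are the ripple amplitude (ΔA) in percentage or decibels; ripple velocity (w) in units of cycles/s (or Hz); ripple density (Ω) in units of cycles/octave; and the initial phase of the ripple Φ. The spectra consist either of 20 or 100 tones per octave equally spaced along the logarithmic frequency axis or with a spacing of 1 tone/Hz with an amplitude decay producing equal power per octave. The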
spectra typically span five octaves (e.g., 0.25-8 kHz) with the range
chosen such that the response area of the cell tested lay within the
stimulus spectrum. The choice of a density of 20 or 100 tones per
octave does not alter the cortical responses; hence we do not specify
which density was used.
A single-ripple stimulus at overall level L dB SPL would
typically be composed of N logarithmically spaced
components, each at L − 10 log10(N) ≈ L − 20 dB for N = 101. The overall stimulus level was
chosen on the basis of threshold at BF; typically L was set
10-20 dB above threshold. High levels (L > 70 dB) were avoided to ensure the linearity of our stimulus delivery system. The
amplitude of a single ripple was defined as the maximum percentage or
logarithm change in the component amplitudes. Ripple amplitudes were
either 90% (linear) or 10 dB (logarithmic) modulations.
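The stimulus construction described above can be sketched in a few lines. This is a minimal illustration, not the synthesis code used in the experiments: it assumes the envelope has the form S = L(1 + ΔA sin[2π(wt + Ωx) + Φ]) with x in octaves, and the function names (`ripple_envelope`, `component_level_db`, `tone_positions`) are hypothetical.

```python
import math

def ripple_envelope(t, x, w, omega, level=1.0, delta_a=0.9, phase=0.0):
    """Linear envelope of one moving ripple at time t (s) and position x (octaves).

    Assumed form: S = L * (1 + dA * sin(2*pi*(w*t + omega*x) + phase)).
    """
    return level * (1.0 + delta_a * math.sin(2.0 * math.pi * (w * t + omega * x) + phase))

def component_level_db(overall_db, n_components):
    """Level of each of N equal-level tones whose summed power equals overall_db."""
    return overall_db - 10.0 * math.log10(n_components)

def tone_positions(octaves=5.0, tones_per_octave=20):
    """Logarithmically spaced tone positions, in octaves above the lowest tone."""
    n = int(round(octaves * tones_per_octave)) + 1
    return [i / tones_per_octave for i in range(n)]

# With positive w and positive omega the envelope peak moves toward lower x over time.
peak_now = ripple_envelope(0.0, 0.625, w=4.0, omega=0.4)
peak_later = ripple_envelope(0.01, 0.525, w=4.0, omega=0.4)
```

Five octaves at 20 tones per octave gives 101 components, so each component sits roughly 20 dB below the overall level, in line with the text.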
The ripple velocities w and ripple densities Ω used were determined by the response properties of the neuron, but the typical range was |w| < 25 Hz (with some units requiring up to 100 Hz) and |Ω| < 1.6 cycles/octave (with some units requiring up to 4 cycles/octave). Single ripples were always presented with Φ = 0.
By the convention established in Eq. 1, a ripple whose spectral envelope is moving downward in frequency, as in Fig. 2, has positive w and positive Ω; equivalently, it can be described by a ripple with negative w and negative Ω, and an added phase shift of π, by Eq. 1 and the identity sin(−u) = sin(u + π). A ripple whose spectral peaks are moving upward in frequency has negative w and positive Ω, or by Eq. 1 and the same identity, positive w, negative Ω, and an added phase shift of π.
The stimulus bursts had an 8-ms rise/fall time and duration of 1.0 or
1.7 s, repeated every 3-4 s. All stimuli were gated and fed
through an equalizer into the earphone. Calibration of the sound
delivery system (to obtain a flat frequency response up to 20 kHz) was
performed in situ with the use of a 
.
Theoretical considerations
DEFINING THE STRF.
The fundamental tool for assessing the linearity and separability of primary cortical cells is their STRF. The STRF is a spectro-temporal function STRF(t, x). The linear response rate y(t) of a cell is related to its STRF(t, x) and the spectro-temporal envelope of the stimulus S(t, x) by

$$y(t) = \int\!\!\int dt'\,dx\; S(t - t', x) \cdot \mathrm{STRF}(t', x),$$

i.e., convolution along the time dimension t and integration along the spectral dimension x. The STRF is obtained from its two-dimensional Fourier transform, the transfer function $T(w, \Omega) = \mathcal{F}_{w\Omega}[\mathrm{STRF}(t, x)]$, which is measured and then inverse transformed to compute the STRF, where the coordinates dual to t and x are w and Ω, respectively (see Fig. 3). By measuring the sinusoidal component with temporal frequency w of the response $y_{w\Omega}(t)$ of a cell to a ripple of specific ripple velocity w and ripple density Ω, we can obtain the transfer function T(w, Ω) at one point in w-Ω space (Depireux et al. 1998)

$$y_{w\Omega}(t) = |T(w, \Omega)|\,\sin[2\pi w t + \Phi(w, \Omega)] \qquad (2)$$

We obtain the magnitude |T(w, Ω)| and phase Φ(w, Ω) of the complex transfer function T(w, Ω) by measuring the amplitude and phase of the (real) response of the cell. Note that the use of complex numbers is not theoretically necessary, but it does simplify the calculations in the transfer function space considerably. By the definition of the transfer function, it follows that the inverse Fourier transform of T(w, Ω) is the STRF of the cell

$$\mathrm{STRF}(t, x) = \mathcal{F}^{-1}_{t x}[T(w, \Omega)] \qquad (3)$$

Because the STRF is real while T(w, Ω) is complex, there is complex conjugate symmetry

$$T(-w, -\Omega) = T^{*}(w, \Omega) \qquad (4)$$
DEFINING AND ASSESSING SEPARABILITY.
Separability is an important property of the transfer functions. A fully separable transfer function is one that factorizes into a function of w and a function of Ω over all quadrants: T(w, Ω) = F(w) · G(Ω). This implies that STRF(t, x) is time-spectrum separable: STRF(t, x) = IR(t) · RF(x). In this case, one needs only measure the transfer function for all Ω at a convenient w and for all w at a convenient Ω. F(w) and G(Ω) are each complex-conjugate symmetric [F(−w) = F*(w), G(−Ω) = G*(Ω)] because IR(t) and RF(x) are real, so one needs only consider the positive values of each. This dramatically decreases the number of measurements needed to characterize the STRF.
A quadrant-separable transfer function factorizes within each quadrant separately

$$T(w, \Omega) = \begin{cases} F_1(w)\,G_1(\Omega), & w > 0,\ \Omega > 0 \\ F_2(w)\,G_2(\Omega), & w < 0,\ \Omega > 0 \end{cases} \qquad (5)$$

where the subscript 1 denotes the w > 0, Ω > 0 quadrant, and the subscript 2 the w < 0, Ω > 0 quadrant (see Fig. 3). Note that by reality of the STRF, the value of the transfer function in quadrants 3 (w < 0, Ω < 0) and 4 (w > 0, Ω < 0) is complex conjugate to the value in quadrants 1 and 2, respectively. In this case, the STRF is not separable in spectrum and time but is the linear superposition of two functions, one with support only in quadrant 1 (and 3) and one with support only in quadrant 2 (and 4).
Separability need not be an all-or-none property but rather can be assessed in a graded fashion. To do so, we apply singular value decomposition (SVD) to the matrix T of measured transfer-function values (Haykin 1996), which we regard as the true values with random noise added to each sample. SVD decomposes T as

$$T = U \Lambda V^{\dagger} = \sum_{i=1}^{n} \lambda_i\, u_i v_i^{\dagger} \qquad (6)$$

where † denotes the Hermitian transpose and U, V are matrices containing the "singular" vectors $u_i$ and $v_i$ corresponding to spectral and temporal cross-sections, respectively, of separable transfer functions. Thus the SVD can be viewed as decomposing T into a linear sum of n separable matrices, each weighted by its ability to approximate T as a weighted product of two vectors as in Eq. 6, as given by the "singular values" $\lambda_i$. Because of the presence of noise in the measurement, the $\lambda_i$ are all expected to be nonzero with their values decreasing monotonically to a noise floor, which depends on the level of the noise.
With respect to this floor, the number of significant singular values depends on the nature of the measured transfer function T. The closer T is to being separable, the more dominant the first singular value $\lambda_1$ will be over its counterparts, which share the residual error in a manner that depends on the precise nature of the inseparability. We have used this fact to define a single measure of the "distance" of the system from separability or alternatively the "degree of inseparability" $\alpha_{SVD}$

$$\alpha_{SVD} = 1 - \lambda_1^2 \Big/ \sum_i \lambda_i^2 \qquad (7)$$

$\alpha_{SVD}$ brands inseparability by its strength but otherwise reveals nothing of its nature. Therefore we examine the origin of inseparability by other means. Specifically we shall analyze three factors that give rise to inseparability.
1) The relative power in the first and second quadrants, measured by the index $\alpha_d$ (Eq. 8). An index $\alpha_d$ near one implies strong selectivity of the responses to the direction of ripple movement and hence strong inseparability.
2) The asymmetry of the spectral transfer function around Ω = 0, measured by the index $\alpha_s$ (Eq. 9), which compares the spectral cross-sections $G_1(\Omega)$ and $G_2(\Omega)$. Index $\alpha_s$ values near one imply strong asymmetry (i.e., lack of correlation) in the transfer function to different directions and hence strong inseparability.
3) The asymmetry of the temporal transfer function around w = 0, measured by the index $\alpha_t$ (Eq. 10), which compares the temporal cross-sections $F_1(w)$ and $F_2(-w)$. Index $\alpha_t$ values near 1 imply strong asymmetry (i.e., lack of correlation) in the transfer function to different directions, and hence strong inseparability.
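A graded inseparability measure of the kind described above can be illustrated numerically. The sketch below assumes the index is one minus the fraction of the total energy carried by the first singular value; that definition, the function names, and the use of power iteration in place of a full SVD are all assumptions made for illustration. The sum of squared singular values equals the squared Frobenius norm, so only the largest singular value has to be found.

```python
import math

def frobenius_sq(T):
    """Sum of squared magnitudes of all entries (equals the sum of squared singular values)."""
    return sum(abs(v) ** 2 for row in T for v in row)

def top_singular_value_sq(T, iterations=200):
    """Square of the largest singular value, by power iteration on T^H T."""
    n_rows, n_cols = len(T), len(T[0])
    v = [1.0 + 0.1 * k for k in range(n_cols)]          # arbitrary nonzero start vector
    norm = math.sqrt(sum(abs(c) ** 2 for c in v))
    v = [c / norm for c in v]
    lam = 0.0
    for _ in range(iterations):
        u = [sum(T[r][k] * v[k] for k in range(n_cols)) for r in range(n_rows)]
        w = [sum(T[r][k].conjugate() * u[r] for r in range(n_rows)) for k in range(n_cols)]
        lam = math.sqrt(sum(abs(c) ** 2 for c in w))    # norm of T^H T v, with |v| = 1
        if lam == 0.0:
            return 0.0
        v = [c / lam for c in w]
    return lam

def inseparability_index(T):
    """Assumed definition: 1 - (largest singular value)^2 / (sum of squared singular values)."""
    return 1.0 - top_singular_value_sq(T) / frobenius_sq(T)

# A separable (rank-one) matrix built as an outer product gives an index near zero.
F = [1 + 0j, 2j, -1 + 0j]
G = [1 + 0j, 0.5 + 0j, 0.25j]
separable = [[f * g for g in G] for f in F]
```

For the 2 × 2 identity matrix, whose two singular values are equal, the same function returns 0.5.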
EFFECT OF FINITE SAMPLING.
We measure the transfer function of cells by varying two parameters,
ripple velocity and ripple density. For consistency's sake, we used
the same range of parameters for a majority of cells. However, for some
cells, the transfer function has not decreased significantly at the
"edges" (for instance, in Fig. 9C, the temporal transfer
function is clearly still strong at ±64 Hz and above). This is
equivalent to multiplying the true transfer function by a rectangular
function which is zero everywhere except between −64 and 64 Hz, over
which range it is 1. In the dual Fourier space of the transfer function
space, that is, in the STRF space with coordinates t and
x, this corresponds to convolving along each dimension the
STRF with the Fourier transform of a rectangular pulse, that is,
with sin (x)/x. This leads to spurious
oscillations in the display of the STRF as can be seen in
Fig. 9C and others. These oscillations would disappear if we
had measured the transfer functions all the way to their vanishing values.
DEVIATIONS FROM LINEARITY.
Because the STRF is a measure of the linear part of the dynamics of a
cell, we only consider effects that might modify the measurement of the
first component of the Fourier transform of the period histograms. The
most prominent nonlinearities are (approximate) half-wave rectification
and compression. The half-wave rectification is primarily due to the
positivity of spike rates (ordinarily the steady-state response to a
flat spectrum is significantly less than half the peak firing rate of
the unit); the distortion of a sinusoid due to half-wave rectification
does not affect the phase of the response, and its effect on the
amplitude of the first Fourier component is a constant factor,
independent of w and Ω. The distortion due to compression
or saturation, similarly, does not affect the phase of the Fourier
transform components of the response and similarly affects the
amplitude only by an overall constant factor for stimuli of moderate level.
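The claim about half-wave rectification can be checked with a short numerical sketch. It assumes an idealized case (a sinusoid with zero baseline, sampled in 16 bins and clipped at zero); the function names are illustrative only. In this case the first Fourier component keeps the phase of the unrectified sinusoid and its amplitude is scaled by a constant factor of one half.

```python
import cmath
import math

def first_fourier_component(samples):
    """Complex amplitude of the fundamental of one sampled period."""
    n = len(samples)
    return sum(s * cmath.exp(-2j * math.pi * k / n) for k, s in enumerate(samples)) * 2.0 / n

def rectified_response(amplitude, phase, n_bins=16):
    """One period of amplitude * cos(theta + phase), clipped at zero (half-wave rectified)."""
    return [max(0.0, amplitude * math.cos(2.0 * math.pi * k / n_bins + phase))
            for k in range(n_bins)]

# The recovered phase matches the input phase; the magnitude is half the input amplitude.
component = first_fourier_component(rectified_response(5.0, 0.3))
recovered_phase = cmath.phase(component)
recovered_amplitude = abs(component)
```

Because the factor does not depend on the phase or amplitude of the underlying sinusoid, it does not distort the shape of the measured transfer function in this idealized setting.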
Data reduction
Many of the data analysis methods described here are similar or
straightforward extensions of those developed earlier in
Kowalski et al. (1996a), and those will be only briefly
reviewed here. Figures 4 and
5 illustrate the nature of the responses
to the ripple stimuli and the analysis to extract the spectral (Fig. 4)
and temporal (Fig. 5) transfer functions. In Fig. 4A, the
ripples are presented at 8 Hz for ripple densities from −1.6 to 1.6 cycle/octave in steps of 0.2 cycle/octave. Each stimulus is presented
15 times.
For each ripple density, we compute a 16-bin period histogram based on
the responses starting at 120 ms (to exclude the onset response; Fig.
4B). A 16-point Fourier transform (FFT) is then performed on the period histogram, and the amplitude and phase of the
first component is taken to be the amplitude and phase of the transfer
function. If the modulation of the response was that of a purely linear
system, the higher FFT coefficients would be negligible, but because of
half-wave rectification and compression, they sometimes are
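significant. In general $T_w(\Omega)$ can be written as

$$T_w(\Omega) = |T_w(\Omega)|\, e^{j\Phi_w(\Omega)} \qquad (11)$$

where $|T_w(\Omega)|$ is the magnitude and $\Phi_w(\Omega)$ the unwrapped phase of the transfer function $T_w(\Omega)$. The ripple density at which $|T_w(\Omega)|$ is a maximum is designated as $\Omega_m$ (= 0.0 cycle/octave in Fig. 4C).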
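The histogram and Fourier step described above can be sketched as follows. This is an illustrative reading rather than the original analysis code: spike times are assumed to be in seconds relative to stimulus onset, the phase within the ripple period is referenced to onset, and the function names are hypothetical.

```python
import cmath
import math

def period_histogram(spike_times, ripple_velocity_hz, n_bins=16, t_start=0.120):
    """Fold spike times into one ripple period, skipping the onset response."""
    period = 1.0 / abs(ripple_velocity_hz)
    counts = [0] * n_bins
    for t in spike_times:
        if t < t_start:                      # discard spikes in the first 120 ms
            continue
        frac = (t % period) / period         # position within the period, in [0, 1)
        counts[min(int(frac * n_bins), n_bins - 1)] += 1
    return counts

def first_fourier_component(counts):
    """First (fundamental) coefficient of the discrete Fourier transform of the histogram."""
    n = len(counts)
    return sum(c * cmath.exp(-2j * math.pi * k / n) for k, c in enumerate(counts))

# Ten spikes locked a quarter period into each cycle of an 8-Hz ripple, plus one onset spike.
spikes = [0.05] + [m * 0.125 + 0.03225 for m in range(1, 11)]
counts = period_histogram(spikes, 8.0)
component = first_fourier_component(counts)
magnitude, phase = abs(component), cmath.phase(component)
```

The magnitude and phase of `component` play the role of the amplitude and phase assigned to the transfer function at that ripple velocity and density.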
Analogous steps are followed in measuring the temporal transfer
function as shown in Fig. 5 where ripples are presented at 0.2 cycle/octave for ripple velocities from −24 to 24 Hz in steps of 4 Hz.
Note that in the previous paper (Kowalski et al. 1996a),
we weighted the measurement of the first component of the Fourier transforms of the period histograms by a weighted sum of the higher frequency components of the transform. This, however, is not compatible with the idea of a linear system so that the resultant STRF or equivalently the ripple transfer function T would not be
expected to be the best possible predictor of the response to new
sounds. Therefore in this paper, the values of T correspond
directly to the first component of the Fourier transform.
Once the ripple transfer function has been measured, it can be inverse Fourier transformed to display the STRF. Since the transfer function is typically measured over fewer than 8 points along each dimension in each quadrant, the resulting STRF as computed would look very jagged even if the underlying STRF was smooth. We therefore interpolate to a smooth STRF for display purposes, padding the transfer function with zeros to a size of 64 × 64. All statistics and predictions use the measured unsmoothed STRF.
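To construct the two-dimensional transfer function, we assume quadrant separability, measure the transfer function along the cross-sections shown in Fig. 3, and combine these spectral and temporal cross-sections as illustrated in Fig. 6. For each quadrant, the transfer function is the outer product of the cross-sections, divided by the (complex) value of the transfer function at the crossover (×) point. In Fig. 6, the point is $(w_{\times 1}, \Omega_{\times 1})$ = (8 Hz, 0.2 cycle/octave) in quadrant 1 and $(w_{\times 2}, \Omega_{\times 2})$ = (−8 Hz, 0.2 cycle/octave) in quadrant 2.

$$T_q(w, \Omega) = \frac{T(w, \Omega_{\times q})\; T(w_{\times q}, \Omega)}{T(w_{\times q}, \Omega_{\times q})} \qquad (12)$$

The crossover point is measured twice, once in each cross-section, giving $T_{1st}(w_{\times q}, \Omega_{\times q})$ and $T_{2nd}(w_{\times q}, \Omega_{\times q})$. The results of the two measurements may differ, and so we use the (complex) geometric mean of the two measured values as the divisor in Eq. 12, $T_{eff}(w_{\times q}, \Omega_{\times q}) = [T_{1st}(w_{\times q}, \Omega_{\times q})\, T_{2nd}(w_{\times q}, \Omega_{\times q})]^{1/2}$.
The ratio $T_{1st}(w_{\times q}, \Omega_{\times q}) / T_{2nd}(w_{\times q}, \Omega_{\times q})$, which should be unity, reflects noise in the system and is used to estimate reliability in the following text.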
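The construction of one quadrant from its two cross-sections can be sketched as below. The names are hypothetical, and one detail is an assumption not stated in the text: the complex square root has two branches, so the sketch picks the root closer to the two measured crossover values.

```python
import cmath

def effective_crossover(t_first, t_second):
    """Complex geometric mean of the two crossover measurements.

    Assumption: of the two square roots, take the one nearer the measurements.
    """
    root = cmath.sqrt(t_first * t_second)
    if (root * (t_first + t_second).conjugate()).real < 0:
        root = -root
    return root

def quadrant_transfer_function(temporal_section, spectral_section, t_cross):
    """Outer product of the cross-sections divided by the crossover value.

    temporal_section[i] is T(w_i, Omega_x); spectral_section[k] is T(w_x, Omega_k).
    """
    return [[tw * to / t_cross for to in spectral_section] for tw in temporal_section]

# Noise-free check: cross-sections of a separable T = F(w) * G(Omega) reproduce it exactly.
F = [1 + 1j, 2 + 0j, 0.5j]
G = [1 + 0j, -1j, 0.25 + 0j]
temporal = [f * G[0] for f in F]      # cross-section at the first ripple density
spectral = [F[1] * g for g in G]      # cross-section at the second ripple velocity
t_cross = effective_crossover(F[1] * G[0], F[1] * G[0])
T_q = quadrant_transfer_function(temporal, spectral, t_cross)
```

With noisy data the two crossover measurements differ, and their ratio can be inspected separately as a reliability check.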
The value of the transfer function along the w = 0 axis
is set to zero because the modulation transfer function is not well defined there, i.e., there is no modulation of firing rate around the
DC (average) rate with a frequency of 0 Hz. The value of the transfer
function along the Ω = 0 axis is not measured directly, so the
value used is the mean of the value inferred from being the boundary of
quadrant 1 and that inferred from being the boundary of quadrant 2.
Once the values of transfer functions for quadrants 1 and 2 and their
boundaries are measured, the values for quadrants 3 and 4 are given by
Eq. 4 (see also Fig. 3). The STRF is then computed by an
inverse Fourier transform (as in Eq. 3) and is illustrated in Fig. 6B (left). This interpolated version of
the STRF (used for display) is obtained by using Eq. 3 on
the transfer function padded with zeros at high |w| and |Ω| (see Fig. 6A).
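The last two steps (filling quadrants 3 and 4 by conjugate symmetry, then inverse transforming) can be sketched with a small direct transform. The sketch assumes a plain two-dimensional inverse DFT with positive exponents and no normalization, which is a choice of convention rather than something fixed by the text; the integer indices and function names are illustrative. Evaluating on a grid finer than the measured range is equivalent to padding the transfer function with zeros.

```python
import cmath
import math

def fill_conjugate_half(measured):
    """Extend values on the Omega >= 0 half plane to the full plane.

    measured maps integer index pairs (iw, iom) to complex values. Points with
    Omega < 0 follow from T(-w, -Omega) = conj(T(w, Omega)); the w = 0 axis is zero.
    """
    full = {}
    for (iw, iom), value in measured.items():
        if iom < 0 or (iom == 0 and iw < 0):
            continue                              # filled by symmetry below
        value = 0j if iw == 0 else complex(value)
        full[(iw, iom)] = value
        full[(-iw, -iom)] = value.conjugate()
    return full

def strf_from_transfer(full, n_t, n_x):
    """Direct inverse DFT on an n_t by n_x grid (assumed sign convention)."""
    strf = []
    for it in range(n_t):
        row = []
        for ix in range(n_x):
            row.append(sum(v * cmath.exp(2j * math.pi * (iw * it / n_t + iom * ix / n_x))
                           for (iw, iom), v in full.items()))
        strf.append(row)
    return strf

full = fill_conjugate_half({(1, 1): 1.0, (-2, 1): 0.5j, (0, 1): 3.0})
strf = strf_from_transfer(full, 8, 8)
largest_imaginary_part = max(abs(v.imag) for row in strf for v in row)
```

The imaginary parts vanish to rounding error, which is the numerical counterpart of the statement that conjugate symmetry makes the STRF real.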
Deriving STRF parameters from the phase functions
Numerous parameters can be derived from the STRF (or equivalently the transfer function) that are analogous to traditional response measures such as BF, tuning curve bandwidth, and latency. Most of these parameters are best derived from analysis of the phase of the transfer functions (Fig. 7).
We model the phase of the transfer function within each quadrant $\Phi_q(w, \Omega)$, q = 1, 2 (see Eq. 2) as a linear function of w and Ω

$$\Phi_q(w, \Omega) = -2\pi w\, \tau_d^{q} + 2\pi \Omega\, x_m^{q} + \chi_q \qquad (13)$$

where $\tau_d^{q}$ is a time delay, $x_m^{q}$ is a position along the spectral axis, and $\chi_q$ is a constant phase angle, for each quadrant q. The complex-conjugate symmetry of the transfer function means that these six independent parameters describe the phase everywhere in the w-Ω plane. The convention of the minus sign before $\tau_d$ allows the time-dependent responses to be functions of $(t - \tau_d)$ as is appropriate for a delay.
The justification for assuming linear fits of the phase functions has been discussed in detail earlier (Depireux et al. 1998) and is strongly motivated by the data (Kowalski et al. 1996a). Note, however, that the assumption of phase linearity is used only for parameter estimation and is not assumed in computing the STRF. The first linear term in Eq. 13 stems from the fact that auditory units differing in their mean neural delays will exhibit linear phase dependence on w with different slope depending on delay. Analogous arguments apply for units that are located at different places along the tonotopic axis: the response phase of different units (with otherwise identical STRFs) changes linearly with Ω at different rates, depending on the relative center frequency locations. In both cases, the slopes of the linear phase function indicate the absolute shift of the STRF relative to the origin, i.e., the mean time delay $\tau_d$ and the mean frequency $f_m$ about which the STRF is centered.
An interpretation of $\tau_d$, for each quadrant, is that it is the sum of the pure response latency and (roughly) half the temporal width of the STRF. This is in contrast to the STRF's peak delay, $\tau_{STRF}$, defined to be the delay for which the STRF achieves its maximum value, which may lead or lag $\tau_d$, depending on the constant temporal phase shift, θ, defined in the following text. Similarly, $f_m$ for each quadrant may or may not fall on the STRF's best frequency, $BF_{STRF}$, defined to be the frequency at which the STRF achieves its maximum value, depending on the constant spectral phase shift, φ, defined in the following text.
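Estimating a delay from the slope of the phase can be sketched with an ordinary least-squares line fit. The sketch treats a single temporal cross-section and assumes the phase follows −2πwτ_d plus a constant; the simple one-dimensional unwrapping and the function names are illustrative and are not the two-dimensional fit described in the text.

```python
import math

def unwrap(phases):
    """Remove 2*pi jumps between successive phase samples."""
    out = [phases[0]]
    for p in phases[1:]:
        step = p - out[-1]
        step -= 2.0 * math.pi * round(step / (2.0 * math.pi))
        out.append(out[-1] + step)
    return out

def fit_delay(velocities_hz, phases):
    """Least-squares fit of phase = -2*pi*w*tau_d + constant; returns (tau_d, constant)."""
    ph = unwrap(phases)
    n = len(velocities_hz)
    mean_w = sum(velocities_hz) / n
    mean_p = sum(ph) / n
    slope = (sum((w - mean_w) * (p - mean_p) for w, p in zip(velocities_hz, ph))
             / sum((w - mean_w) ** 2 for w in velocities_hz))
    return -slope / (2.0 * math.pi), mean_p - slope * mean_w

# Synthetic example: a 30-ms delay and a constant phase of 0.5 rad, with wrapped phases.
ws = [4.0, 8.0, 12.0, 16.0, 20.0, 24.0]
raw = [-2.0 * math.pi * w * 0.030 + 0.5 for w in ws]
wrapped = [math.atan2(math.sin(p), math.cos(p)) for p in raw]
tau_d, constant = fit_delay(ws, wrapped)
```

The constant term is only determined up to a multiple of 2π, and the unwrapping step fails if the phase changes by more than half a cycle between adjacent ripple velocities.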
A convenient convention for interpreting the constant component of the phase is to break up the constant phase angle $\chi_q$ into two parts

$$\chi_1 = \theta + \phi, \qquad \chi_2 = -\theta + \phi \qquad (14)$$

where θ and φ are, respectively, the temporal polarity and spectral asymmetry of the STRF. Spectral asymmetry parameterizes the balance of the STRF along the spectral axis about its center. For example, a unit with φ = 0 would have its $BF_{STRF}$ in the center of the spectral envelope of the STRF, possibly surrounded by inhibitory regions. A unit with φ > 0 would have its $BF_{STRF}$ at a lower frequency than the center of the STRF with an inhibitory sideband above $BF_{STRF}$. A unit with φ < 0 would have its $BF_{STRF}$ at a higher frequency than the center of the STRF, with an inhibitory sideband below $BF_{STRF}$ (see example in Fig. 4C of Shamma et al. 1995). Temporal polarity distinguishes units with θ < 0 ("onset response at BF") from units with θ > 0 ("offset response at BF"). There is an
ambiguity in fixing
and
that we remove by restricting
to
lie between
90 and +90°, while
ranges the full
180 to +180°. See Fig. 7 as an illustration of
the phase behavior in the different quadrants.
In past reports (Kowalski et al. 1996a), θ and φ could be measured without measuring the transfer function in the upward moving quadrant 2 by measuring the constant component of the phase in quadrant 1 ($\chi_1$ = θ + φ) and along the w axis, where the constant component of the phase is expected to be the mean across the quadrants [($\chi_1 - \chi_2$)/2 = θ
; note the change in convention of

between the present work and Kowalski et al. (1996a)
].
Because of response variability, we only fit to those points of the
transfer function that have more than half of the response power in the
first component of the Fourier transform. Then the fit is done across
the entire two-dimensional phase plane for each quadrant. Ultimately
our unwrapping method is less than ideal, and estimates of
and
especially reflect that (Ghiglia and Pritt 1998
).
Estimating response variability: the bootstrap method
Variability in our experiments originates from multiple sources,
including internal neural mechanisms (e.g., Poisson-like distributions
of spike times), extracellular recording/identifying methods, and
equipment noise. Quantitative estimates of the reliability of our
measurements are crucial to their analysis and subsequent interpretation.
A method of variability estimation that is especially appropriate to
these measurements is the bootstrap method (Efron and Tibshirani 1993; Politis 1998).
The essence of this method is to use "resamples," in which N samples of bootstrap data are drawn with replacement from the N original samples of data. Repeating this procedure a large number of times creates a population of bootstrap resamples whose probability distribution is a good estimator of the probability distribution from which the original data were drawn.
To illustrate this procedure, consider measuring the transfer function at a point (w, Ω). This is done by presenting the same (w, Ω) stimulus N times and constructing a
period histogram based on all N sweeps. The amplitude and
phase of the first Fourier component of the period histogram are
assigned to the amplitude and phase of the transfer function. A single
bootstrap resampling of the responses will have N sweeps,
where, because they are drawn from the original responses with
replacement, some will be duplicated and some will be unused.
Nevertheless a period histogram is constructed, and the bootstrap
estimate of the transfer function is assigned to its first Fourier
component. Performing a large number of bootstrap resamples results in
a population of estimates for the transfer function. This population
has a mean, variance, and higher-order moments. These moments are
estimators of the moments of the original population (of all transfer
functions of all allowable neuronal responses to the stimulus). For
example, the standard deviation of all bootstrap estimates of the
transfer function is an estimator of the standard deviation of
measurements of the transfer function. This allows us to put error bars
on our transfer functions and STRFs.
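The resampling procedure can be sketched as follows. The sketch assumes each sweep has already been reduced to its own period histogram and resamples whole sweeps with replacement; the number of resamples, the fixed seed, and the function names are illustrative choices, not values from the text.

```python
import cmath
import math
import random

def first_component(sweeps):
    """First Fourier component of the period histogram pooled over sweeps, per sweep."""
    n_bins = len(sweeps[0])
    pooled = [sum(s[k] for s in sweeps) for k in range(n_bins)]
    return sum(c * cmath.exp(-2j * math.pi * k / n_bins)
               for k, c in enumerate(pooled)) / len(sweeps)

def bootstrap_transfer(sweeps, n_resamples=500, seed=0):
    """Mean and standard deviation of the bootstrap estimates of the transfer function."""
    rng = random.Random(seed)
    n = len(sweeps)
    estimates = []
    for _ in range(n_resamples):
        # Draw N sweeps with replacement: some are duplicated, some are left out.
        resample = [sweeps[rng.randrange(n)] for _ in range(n)]
        estimates.append(first_component(resample))
    mean = sum(estimates) / len(estimates)
    variance = sum(abs(e - mean) ** 2 for e in estimates) / (len(estimates) - 1)
    return mean, math.sqrt(variance)

# Fifteen sweeps whose spike counts differ from sweep to sweep.
sweeps = [[i] + [0] * 15 for i in range(15)]
mean_estimate, std_estimate = bootstrap_transfer(sweeps)
```

The standard deviation returned here is the quantity that would be drawn as an error bar on the corresponding transfer function value.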
Effects of crossover point errors
Another significant source of error is the difference between
the responses of repeated measurements at the transfer function crossover points. The ratio of these independent measurements, T1st(w





×
|
(15) |
×(t, x) captures the
systematic error from not having taken all data at the same time and is
given by
|
(16) |
|
and
|
(17) |
|
(18) |
T and
X are the length of time
and number of octaves over which the STRF was measured.
is a measure of the average standard deviation in units of the
maximum of the STRF.
is a measure of the variance in units of
power. If noise is additive, then
= P
/(P + P
) = 1/(SNR + 1), with
P = power, P
= noise power, and SNR = signal-to-noise ratio.
should go down
with the number of recordings, assuming the system can be described as
a time-invariant random process.
RESULTS
Data presented here were collected from 22 single-unit and 54 multiunit recordings in 11 ferrets. In the summary histograms, both single units and multiunit are included but are distinguished from each other.
Most units encountered in AI respond well to moving ripples. Responses
are typically phase-locked to the moving envelope of the ripple over a
range of ripple velocities and densities. However, of a total of 172 recordings made, only 76 cases provided adequate quality and quantity
of responses. The reasons for this low yield vary. For example, we have
encountered responses from a few units that were either poorly
phase-locked or were inconsistent from trial to trial; such units were
abandoned since our analysis methods are unsuitable for their
characterization. Also because of extended recording times, typically
over an hour, units were sometimes lost before sufficient data could be
collected to carry out a full analysis. In other cases, the unit or
animal changed state during the recording session, rendering the data
unreliable. The reason for the extended recording time is to present
ripple sounds and other sounds consisting of combinations of ripples,
so we can verify linearity by using the STRFs to predict the response of the cell to new sounds. We found empirically that about 10,000 spikes are typically needed to obtain an STRF with well-defined features in response to single ripples, which with our sound paradigm usually corresponds to a 20-min presentation per cross-section. To
eliminate data corresponding to unreliable cells, as described in the
preceding text, we use units only with values of
0.12 and
0.7 (see METHODS) as the threshold for
rejecting the data. These reliability statistics take into account
most of the preceding sources of error. The values of 0.12 and 0.7 are
somewhat arbitrary, though we found that cells tended to separate
themselves into two populations above and below these thresholds,
respectively, and that the mathematical criteria of reliable versus
noisy cell corresponded well with our intuitive perception based on
visual inspection.
Responses to moving ripples
On average, AI units synchronize their responses to upward and downward moving ripples equally effectively with ripple velocities ranging from 2 to over 100 Hz, and ripple densities up to 4 cycle/octave. Examples of several temporal and spectral transfer function magnitudes are shown in Figs. 8-10, each with its corresponding STRF. In all cases, units respond well only over a specific range of ripple velocities and ripple densities, but the detailed shape and extent of the transfer functions vary from one unit to another. For instance, the unit in Fig. 9A responds well only to ripple velocities of ±4 Hz, whereas the unit in Fig. 9C responds well at least up to ±64 Hz. The unit in Fig. 6 responds well to ripple densities within ±0.4 cycle/octave, whereas the unit in Fig. 10A responds over a wider range of densities but poorly at 0 cycle/octave.