J Neurophysiol (February 1, 2003). 10.1152/jn.00563.2002
Submitted 15 July 2002; accepted in final form 3 October 2002
Departments of Physiology and Psychology and the Waisman Center, University of Wisconsin, Madison, Wisconsin 53711
| |
ABSTRACT |
|---|
|
|
|---|
Reale, Richard A., Rick L. Jenison, and John F. Brugge. Directional Sensitivity of Neurons in the Primary Auditory (AI) Cortex: Effects of Sound-Source Intensity Level. J. Neurophysiol. 89: 1024-1038, 2003. Transient sounds were delivered from different directions in virtual acoustic space while recording from single neurons in primary auditory cortex (AI) of cats under general anesthesia. The intensity level of the sound source was varied parametrically to determine the operating characteristics of the spatial receptive field. The spatial receptive field was constructed from the onset latency of the response to a sound at each sampled direction. Spatial gradients of response latency composing a receptive field are due partially to a systematic co-dependence on sound-source direction and intensity level. Typically, at any given intensity level, the distribution of response latency within the receptive field was unimodal with a range of approximately 3-4 ms, although for some cells and some levels, the spread could be as much as 20 or as little as 2 ms. Response latency, averaged across directions, differed among neurons for the same intensity level, and also differed among intensity levels for the same neuron. Generally, increases in intensity level resulted in decreases in the mean and variance of response latency, consistent with an inverse Gaussian distribution. Receptive field models, based on response latency, are developed using multiple parameters (azimuth, elevation, intensity), validated with Monte Carlo simulation, and their spatial filtering described using spherical harmonic analysis. Observations from an ensemble of modeled receptive fields are obtained by linking the inverse Gaussian density to the probabilistic inverse problem of estimating sound-source direction and intensity. Upper bounds on acuity are derived from the ensemble using Fisher information, and the predicted patterns of estimation errors are related to psychophysical performance.
| |
INTRODUCTION |
|---|
|
|
|---|
Animals must localize the source of transient sound under a wide range of conditions in the natural world. This localization ability is tied to the functional integrity of primary auditory (AI) cortex (Jenkins and Merzenich 1984) and, presumably, to those AI neurons sensitive to sound source direction (Barone et al. 1996; Benson et al. 1981; Brugge et al. 1994, 1996; Eggermont and Mossop 1998; Eisenman 1974; Imig et al. 1990; Middlebrooks and Pettigrew 1981; Rajan et al. 1990; Samson et al. 1993, 1994; Soviarvi and Hyvarinen 1974) and to the major cues that the animal uses in localizing sound on the horizontal plane (Brugge et al. 1969, 1973; Irvine et al. 1996; Phillips and Irvine 1981, 1983; Reale and Brugge 1990; Reale and Kettner 1986; Semple and Kitzes 1993a,b).
Directional sensitivity and selectivity of an AI neuron are embodied in the auditory spatial receptive field, which is defined by those sound-source directions in azimuth and elevation from which a sound systematically affects the response of the cell (Brugge et al. 1994, 1996). Auditory spatial receptive fields are not static; they change in size and shape when competing sound is introduced into the acoustic environment (Brugge et al. 1998; Reale et al. 2000) and when the intensity of the sound source varies (Brugge et al. 1996). Typically, when there are no competing sounds and the intensity level of the source is at or very near the threshold, an AI spatial receptive field is confined to a small portion of acoustic space (Brugge et al. 1994, 1996, 1998; Eisenman 1974; Middlebrooks and Pettigrew 1981; Rajan et al. 1990). With few exceptions (see Imig et al. 1990; Rajan et al. 1990; Samson et al. 1993, 1994), increasing the stimulus strength by no more than 10 dB over a wide range (40-80 dB) of suprathreshold intensities results in a marked increase in receptive field size whether measured along the azimuth, along the elevation, or along both (Brugge et al. 1994, 1996, 1998; Imig et al. 1990; Middlebrooks and Pettigrew 1981; Rajan et al. 1990). This attribute of AI spatial tuning is observed in other auditory cortical fields as well (Middlebrooks et al. 1994, 1998). Thus rather than providing a highly restricted view of auditory space, spatial tuning at the cortical level affords a nearly complete view of possible sound-source directions. In this setting, detailed information on sound-source direction must be encoded by a receptive field metric (e.g., discharge rate or latency or pattern) that has an orderly relationship to direction (Brugge et al. 1996; Jenison 1998; Middlebrooks et al. 1994, 1998; Furukawa and Middlebrooks 2002).
Previous studies from our laboratory showed that the first-spike latency of most AI cells was tightly locked to the onset of a transient directional stimulus and that for a sizeable proportion of these cells this latency metric was distributed in an orderly way across the spatial receptive field (Brugge et al. 1996, 1998; Reale et al. 2000). We hypothesized that directional information was derived from these spatial gradients. To examine this possibility in a more rigorous way and to relate the findings to extant psychophysical data, we first developed functional approximations to auditory spatial receptive fields. These approximations employed only directional dimensions of azimuth and elevation (Jenison 1998), although we later extended that approach to space-time (Jenison et al. 2001b). We then used maximum likelihood estimation methods to demonstrate that an ensemble of AI neurons with receptive fields having gradients of response latency contained sufficient information to account for auditory spatial acuity of both cat and human (Jenison 1998; Jenison et al. 1998). In an extension to this theoretical observer approach, we showed how a relative timing referent could be derived from the ensemble response and thereby obviate the apparent lack of a temporal marker for the onset of the stimulus (Jenison 2001a).
Systematic spatial gradients of response latency were often evident in AI spatial receptive fields over a wide range of intensity levels of the sound source and hence over a wide range of field sizes. We argued that this preservation of a functional gradient may account for a listener's ability to localize sound under changing intensity conditions (Brugge et al. 1996). Under most natural listening conditions the intensity of the sound changes and is unknown to the listener. In our initial receptive field analyses, however, the intensity level of the sound source was assumed to be known to the theoretical observer and hence was ignored in the functional approximation. We have now extended our functional modeling approach to include, along with directional parameters, the intensity level of the sound source as an unknown parameter to be estimated by the theoretical observer.
In this paper we describe the changes in the AI spatial receptive field that typically occur with changes in intensity of the source. We show that the resultant systematic changes of response latency within the receptive field are faithfully modeled using functional approximation techniques that include intensity as an unknown parameter. We then introduce the use of spherical harmonic analysis to quantify concomitant changes in receptive field shape. We go on to develop a probability density function that links the receptive field model with an inverse Gaussian distribution for response latency, which we then use in an information theoretic analysis of a simulated ensemble of AI neurons to derive estimation errors in azimuth and elevation when the intensity level of the sound source is assumed to be unknown. The modeled results are in agreement with psychophysical findings. Preliminary reports have been presented (Jenison 2001b; Jenison et al. 2001a).
| |
METHODS |
|---|
|
|
|---|
Adult cats, with no sign of external or middle ear infection, were premedicated with acepromazine maleate (0.2 mg/kg, im) and ketamine hydrochloride (20 mg/kg, im). A catheter was inserted into the femoral vein for iv drug administration and fluid replacement. Atropine sulfate (0.1 mg/kg, sc), dexamethasone sodium (0.2 mg/kg, iv), and procaine penicillin (300,000 units, im) were also administered before the animal was deeply anesthetized either with sodium pentobarbital (11 cats) or with halothane (4 cats). Sodium pentobarbital was administered intravenously (40 mg/kg). Halothane (0.8-1.8%) was administered with a carrier-gas mixture of oxygen (33%) and nitrous oxide (66%) through an endotracheal tube using a scavenged Verni-Trol vaporizer system and an anesthesia ventilator. Samples of inspiratory and expiratory air were drawn continuously from within the endotracheal tube, and a respiratory gas analyzer (Ohmeda 5250) was used to measure pulse rate, oxygen saturation, airway pressure, and concentrations of O2, CO2, N2O, and halothane on a breath-by-breath basis. When halothane was employed, a muscle relaxant (pancuronium bromide, 0.15 mg/kg, iv) was administered just before recordings began if spontaneous respiration was irregular or otherwise compromised. Paralysis could be maintained throughout the experiment by supplemental doses of pancuronium. Muscle relaxation under halothane anesthesia, combined with careful monitoring of inspired and expired gases and vital signs, provided a highly stable long-term recording environment. Experimental protocols were approved by the University of Wisconsin Institutional Animal Care and Use Committee.
When the animal reached a surgical plane of anesthesia, the pinnae and
other soft tissue were removed from the head. Hollow earpieces were
inserted into the truncated ear canals, sealed in place, and connected
to specially designed earphones. The transfer characteristics of the
left- and right-ear sound delivery systems were measured in vivo near
the tympanic membrane. A chamber was cemented to the skull over the
exposed left auditory cortex, filled with warm silicone oil, and
hydraulically sealed with a glass plate on which a Davies-type
microdrive was mounted. Action potentials were recorded extracellularly
with tungsten-in-glass microelectrodes in cortical area AI; their
times-of-occurrence were measured with respect to stimulus onset within
a window extending from 5 to 100 ms either by using a 1-µs resolution
and storing for off-line analyses or by digitizing their waveforms at
25 kHz and using BrainWare software (TDT, Gainesville, FL) on-line and
off-line to sort action potentials among single units. Tone burst
stimuli delivered monaurally or binaurally were used to estimate the
best frequency (BF) of a neuron and some response area features related to binaural interactions as described previously (Brugge et al. 1996). The partial tonotopic map obtained by repeated electrode penetrations made during the course of an experiment confirmed that the
recordings were obtained from neurons in AI.
Sound-source stimuli were impulsive transients (either 6.4 or 10 ms duration) delivered in virtual acoustic space, as described previously (Brugge et al. 1994; Reale et al. 1996). In later experiments this stimulus presentation was accomplished with a TDT System II (TDT). A veridical model of virtual acoustic space (Chen et al. 1995; Wu et al. 1997) was used to synthesize, in quasi-real-time, transient signals for sound-source directions positioned in a spherical coordinate system (-180° to +180° azimuth, -36° to +90° elevation) and centered on the cat's interaural axis. The virtual acoustic space used was derived from head-related transfer functions (HRTFs) measured in a single cat and hence was not tailored to each of our experimental animals. Acoustic calibration, performed in-ear on each animal, was used to provide a common intensity reference among animals; namely, the maximum intensity for a particular sound-source stimulus that occurs in the virtual acoustic space. Intensity level is expressed as decibels of attenuation (dBA) from this maximum. The spatial receptive field of a neuron, for a sound source of a particular intensity level, was mapped using a virtual acoustic space composed of either 1,650 directions on a Cartesian graticule (approximately 4.5° azimuth by 9° elevation spacing, Brugge et al. 1996) or 524 directions arranged along a spiral path (approximately 9° separation) to provide uniform spherical sampling (Jenison et al. 2001b). In either case, the spatial receptive field is rendered (on paper) using the quartic-authalic equal area projection, which minimizes distortion in the frontal hemisphere and includes all of auditory space around the cat.
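To make the idea of near-uniform spiral sampling concrete, the sketch below places directions along a golden-angle spiral restricted to the upper part of the sphere. This is only an illustration: the golden-angle construction, the function name `spiral_directions`, and the default lower elevation limit are assumptions made here and are not the spiral actually used in the experiments.

```python
import math

def spiral_directions(n, min_elevation_deg=-36.0):
    """Return n (azimuth, elevation) pairs in degrees on a golden-angle spiral.

    Assumption: a golden-angle spiral stands in for the experimental spiral path.
    Spacing samples uniformly in sin(elevation) gives equal-area coverage.
    """
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    z_min = math.sin(math.radians(min_elevation_deg))
    points = []
    for k in range(n):
        # z is sin(elevation); uniform steps in z give equal-area bands.
        z = z_min + (1.0 - z_min) * (k + 0.5) / n
        elevation = math.degrees(math.asin(z))
        # Wrap the accumulated spiral angle into [-180, 180) degrees of azimuth.
        azimuth = math.degrees((k * golden_angle) % (2.0 * math.pi)) - 180.0
        points.append((azimuth, elevation))
    return points

directions = spiral_directions(524)
```

The equal-area step in `z` is the design choice that matters: stepping uniformly in elevation instead would crowd samples near the pole.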
| |
RESULTS |
|---|
|
|
|---|
Our results were derived from 244 single AI neurons from which spatial receptive fields were obtained at different intensity levels. The BFs of these neurons ranged from 5.9 to 29.1 kHz. Typically, AI neurons in our sample exhibited little or no spontaneous activity and responded to an effective spatial stimulus with a single spike or a short burst of spikes for up to tens of hundreds of closely spaced directions. At each effective direction, we measured the latency to the first spike evoked by that stimulus. There were several observations common to all neurons studied, as described in detail in a previous report (Brugge et al. 1996). At any given intensity, response latency often varied within the receptive field by approximately 3-4 ms, although for some cells the spread could be as much as 20 ms. The distribution of response latency within the receptive field at any given intensity was typically unimodal. The mean of the distribution differed among neurons for the same intensity level, and most often also differed among intensity levels for the same neuron. Increases in intensity typically resulted in decreases in the mean of the distribution. Response latency, averaged across directions, was longest for an intensity level near threshold for the cell, and decreased rapidly at intensity levels approximately 20-30 dB above this threshold. Further increases in intensity could evoke either asymptotic or nonmonotonic behaviors. Over a range of 10-50 dB, the mean latency across a receptive field would typically shorten by roughly 1-5 ms or more. Figure 1 illustrates several of these response attributes for one neuron. The left-hand column shows quartic-authalic maps of response latency (color coded) at each of six intensity levels with highest intensity at the top. Here the empirically measured response latency, averaged across all effective directions, increased from 12.5 to 32.7 ms over the 45-dB range of intensity studied (right-hand column, open symbols). Similarly, the SD of the distribution increased from 2.2 to 27.2 ms. This empirical SD (dashed horizontal lines) contains both the unsystematic and the systematic components of the response variability. The systematic component at any given intensity is due to the dependence of response latency on the direction of the sound source.
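The split of the empirical variability into a systematic (direction-dependent) part and an unsystematic part can be illustrated with a simple variance partition. The sketch below is an assumption-laden stand-in: it uses per-direction mean latencies as the "systematic" component, whereas the study estimates that component with a fitted receptive field model. The function name `partition_variance` and the toy numbers are hypothetical.

```python
import statistics

def partition_variance(latencies_by_direction):
    """Split total latency variance into between-direction and within-direction parts.

    latencies_by_direction: dict mapping a direction label to a list of latencies (ms).
    Returns (total, systematic, unsystematic) population variances.
    """
    all_values = [x for values in latencies_by_direction.values() for x in values]
    n = len(all_values)
    grand_mean = statistics.fmean(all_values)
    # Systematic part: spread of the per-direction means around the grand mean.
    systematic = sum(
        len(values) * (statistics.fmean(values) - grand_mean) ** 2
        for values in latencies_by_direction.values()
    ) / n
    # Unsystematic part: spread of single responses around their own direction mean.
    unsystematic = sum(
        (x - statistics.fmean(values)) ** 2
        for values in latencies_by_direction.values()
        for x in values
    ) / n
    total = statistics.pvariance(all_values)
    return total, systematic, unsystematic

toy = {"front": [10.0, 12.0], "side": [20.0, 22.0]}
total, systematic, unsystematic = partition_variance(toy)
```

For the toy data the two parts add up to the total variance, which is the sense in which the empirical SD "contains" both components.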
|
Functional approximation of the auditory spatial receptive field under changes in sound source intensity
Initially we developed a receptive field model that accounted for the systematic variability in response latency that depended on sound-source direction, and partitioned out the unsystematic variability (Jenison 1998). Here we illustrate the behavior of this model, which has now been extended to account for the systematic variability that depends also on sound-source intensity (for a theoretical treatment, see Jenison 2001b; Jenison et al. 2001a). This extended functional approximation method was applied to all 244 single units in our sample. No systematic differences were seen in the modeled data that could be attributed to the anesthetic used.
Our prior functional approximation work on spatial receptive fields used spherical basis functions, each with four free parameters: a weight (w_ij), a width parameter, and two placement parameters. The free parameters serve only to mathematically approximate the receptive field: the two placement parameters specify the position of each basis function on the sphere, the width parameter specifies its width, and w specifies its weight. The current extension of this approximation now includes an exponential dependence on the intensity level of the sound source, defined as
|
(1) |
|
The nonlinear dependence on intensity level was introduced to modulate the width parameter of the spherical basis functions and allow for the shape of the receptive field to depend on sound intensity. Similarly, the nonlinear dependence of mean latency on intensity is provided by introduction of the final exponential term in Eq. 1.
Constrained optimization techniques were used to fit the weight, width, and placement parameters of the M basis functions defined by Eq. 1 to the dependent neural response of interest, which in this case was response latency. The details of the approximation techniques can be found in Jenison (1998).
The middle column of Fig. 1 shows the results of applying this extended model to the empirical data (left column) from the same neuron. Although the latency data occupy a spherical coordinate system, the model approximation is simply a form of regression through a scatter of data where the predicted latency corresponds to the systematic mean value for any given direction and intensity. Thus the model captures the mean latency as a function of direction (middle column) as well as the increase in response latency, averaged across directions, as a function of decreasing intensity (right column, filled symbols). The modeled receptive fields also exhibit an increasing RMS residual error with decreasing intensity (right column, solid horizontal lines). Since this RMS residual error estimates the unsystematic component of response variability, it is smaller than the total variability (i.e., corresponding dashed horizontal lines). We note here features of these receptive fields that we return to in the DISCUSSION. At the highest stimulus intensities, the receptive field is very large and the latency distribution is very narrow. Under this condition, the spatial latency gradients are shallow and variance in response latency is small. At the other end of the dynamic range, the receptive fields are relatively small with latency gradients that are steep and latency variance that is high.
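As a rough illustration of the kind of model described in this section, the sketch below sums weighted bumps on the sphere whose width and overall gain both depend exponentially on intensity level. It is a hypothetical form written for illustration only: the von Mises-type bump, the parameter names (`width_gain`, `level_gain`), and the way intensity enters are assumptions and should not be read as Eq. 1 of the paper.

```python
import math

def cos_angle(az, el, az0, el0):
    """Cosine of the great-circle angle between two directions (radians, el as latitude)."""
    return (math.sin(el) * math.sin(el0)
            + math.cos(el) * math.cos(el0) * math.cos(az - az0))

def modeled_latency(az, el, level, units):
    """Predicted mean latency at direction (az, el) and attenuation `level` (dB).

    Each unit is a dict with hypothetical keys: weight, az, el, width,
    width_gain, level_gain.
    """
    total = 0.0
    for u in units:
        # Width (concentration) of the bump is modulated by intensity level.
        kappa = u["width"] * math.exp(u["width_gain"] * level)
        bump = math.exp(kappa * (cos_angle(az, el, u["az"], u["el"]) - 1.0))
        # A final exponential term scales the contribution with intensity level.
        total += u["weight"] * bump * math.exp(u["level_gain"] * level)
    return total

units = [{"weight": 10.0, "az": 0.0, "el": 0.0,
          "width": 2.0, "width_gain": 0.01, "level_gain": 0.02}]
latency_at_center = modeled_latency(0.0, 0.0, 0.0, units)
```

In practice the parameters would be fitted to measured latencies by constrained optimization; here they are fixed by hand to show how direction and level jointly shape the output.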
The relationship between modeled and empirical data is further exemplified using data from six additional neurons in Fig. 2. For each neuron, the spatial receptive field was sampled at between four and six intensity levels separated by 5, 10, or 20 dB. The lowest level was usually chosen to be within 20 dB of the minimum threshold for that cell. For many cells, this lowest level produced a mean (and SD) of response latency that measured in tens of milliseconds. Furthermore, increasing the intensity level resulted in significantly reduced mean values and SDs that typically reached an asymptote of <5 ms. For other cells, like those in the lower right of Fig. 2, the magnitude of change observed across all sampled intensity levels was <10 ms. Regardless of these idiosyncrasies, the modeling process is clearly faithful to the specifics of each neuron's "latency versus intensity" profile. For each neuron modeled, the input set consists of all measured response latencies (1,003-4,334 for these 6 units) together with their corresponding sound-source direction and intensity level.
|
Monte Carlo simulation
The veracity of the model in estimating the systematic component of response variability is limited by the inherent variability (noise) introduced by the modeling process itself. This model noise is the result of the data dependence of the constrained optimization techniques that were used to fit parameters to the basis functions that served to model the response latency (see Jenison 1998, 2001b). There is no true solution of the model for a particular neuron given a finite input set consisting of measured response latencies together with their corresponding sound-source directions and intensity levels. Therefore the estimates of systematic variability shown in Figs. 1 and 2 need to be compared with this inherent noise to judge their validity. The method we employed to measure the model's noise used Monte Carlo simulations. Figure 3 illustrates the results of this analysis using the same neuron that was depicted in Fig. 1. The total number of measured response latencies composing the input set for this cell was 4,987. In the Monte Carlo approach, one-half of these potential inputs were chosen at random (with replacement), and that subset was used to model the cell's spatial receptive field. The sample and model process was repeated 40 times as prescribed by Efron and Tibshirani (1993). The left column maps the receptive fields for each intensity level using the mean value at each direction obtained from the Monte Carlo simulation. The middle column maps the receptive fields using the SD at each direction. In general, regions with the shortest response latencies are also the regions that map to the smallest SD. The right column reproduces the function (model) from Fig. 1 that showed the systematic increases in response latency and RMS error (from 2.6 to 22.1), collapsed across directions (filled circles, solid horizontal lines). In addition, the function that results from Monte Carlo simulation plots the mean response latency (filled squares) and its RMS error (i.e., model noise), collapsed across directions, for direct comparison. At each intensity level, the model noise is seen to be significantly less than its paired value. These findings suggest that the model approximations to the receptive fields shown in Figs. 1 and 2 were indeed valid estimates of the systematic variability in response latency that is due both to the direction of the sound source and to the intensity level of the source.
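The resampling procedure described above (draw half of the inputs at random with replacement, refit, repeat 40 times, then summarize the fits by their mean and SD at each direction) can be sketched as follows. The stand-in "model" here is just a per-direction mean, an assumption made to keep the example short; the study refits the full basis-function model on every draw. The names `fit_direction_means` and `monte_carlo_noise` are hypothetical.

```python
import random
import statistics

def fit_direction_means(samples):
    """Stand-in model fit: mean latency per direction (not the basis-function fit)."""
    by_direction = {}
    for direction, latency in samples:
        by_direction.setdefault(direction, []).append(latency)
    return {d: statistics.fmean(v) for d, v in by_direction.items()}

def monte_carlo_noise(samples, n_repeats=40, fraction=0.5, seed=0):
    """Refit on random half-size draws and return {direction: (mean, sd)} of the fits."""
    rng = random.Random(seed)
    k = max(1, int(len(samples) * fraction))
    fits = {}
    for _ in range(n_repeats):
        subset = rng.choices(samples, k=k)  # random draw with replacement
        for direction, value in fit_direction_means(subset).items():
            fits.setdefault(direction, []).append(value)
    # The SD across repeats plays the role of the model noise at each direction.
    return {d: (statistics.fmean(v), statistics.pstdev(v)) for d, v in fits.items()}

samples = [("A", 12.0)] * 50 + [("B", 20.0)] * 50
noise = monte_carlo_noise(samples)
```

Comparing the per-direction SD from the repeats with the spread of the fitted values across directions is the check that the systematic structure exceeds the noise of the fitting process.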
|
Receptive field shape changes with intensity level
We showed above that average response latency data from the functionally modeled receptive fields in our study typically followed their empirically derived counterparts, whether the nonlinear growth in latency as a function of intensity was expansive or compressive. We have also observed that in extending the model from its original form there appear to be sufficient degrees of freedom to capture the potential changes in shape of the receptive field that typically attend intensity level changes. One visualization of these shape changes is provided by iso-response contours, as shown by solid white lines on the spatial receptive fields of three neurons illustrated in Fig. 4. Because the approximation encompasses both the dependence on spatial direction and on intensity level, a spatial receptive field can be examined at any chosen intensity level and on any graticule. Here we chose intensities varying from 10 (top) to 60 dBA (bottom) and a graticule with 9° spacing. At a near-threshold intensity level, a particular response contour is typically confined to only a quadrant of acoustic space, most often contralateral to the cerebral location of the cell under study. For these exemplar cells, and for most others in our sample, raising the intensity level produced concomitant changes in size and location of the contours. Here, for example, the iso-response-latency contours for one unit (left column) are seen to change in size, location, and orientation as intensity is increased from 50 to 10 dBA. Neither contour is present in this unit's receptive field determined at 60 dBA. In other neurons, still more complicated changes are seen due to a nonmonotonic relationship between response latency and intensity level. To capture these changes quantitatively, and thereby study systematically the intensity-related changes in spatial receptive field shape, we turned to spherical harmonic analysis.
|
Spherical harmonic analysis
Spherical harmonics provide a complete orthonormal basis for expressing a function defined on a sphere (Hobson 1965), and the spatial receptive field is a natural spherical function. They play a role similar to that of cosines and sines in the Fourier transform. Spherical harmonics vary according to their so-called spatial frequency, l, ranging from 0 to ∞, and their moment, m, which ranges from -l to +l. The complex spherical harmonics themselves are defined in terms of the associated Legendre functions as follows
|
with elevational dependence via the Legendre function, and azimuthal dependence via the moment parameter m in the complex exponential function.
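For orientation, the standard textbook form of an orthonormal complex spherical harmonic is sketched below. The symbols $\theta_{\mathrm{az}}$ (azimuth) and $\phi_{\mathrm{col}}$ (colatitude, measured from the pole) are notation assumed here, and sign and normalization conventions differ between texts, so this should not be read as the exact expression used in the paper.

```latex
Y_l^m(\theta_{\mathrm{az}}, \phi_{\mathrm{col}})
  = \sqrt{\frac{2l+1}{4\pi}\,\frac{(l-m)!}{(l+m)!}}\;
    P_l^m(\cos\phi_{\mathrm{col}})\, e^{\,i m \theta_{\mathrm{az}}},
\qquad l = 0, 1, 2, \ldots, \quad m = -l, \ldots, +l
```

Here $P_l^m$ is the associated Legendre function, which carries the dependence on the polar angle, while the complex exponential carries the azimuthal dependence.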
Because spherical harmonics form a complete orthonormal basis, an arbitrary function, in our case the receptive field rf (a function of azimuth, elevation, and intensity level), can be expanded in terms of complex spherical harmonics such that
|
where the expansion coefficients are obtained by integrating the product of rf and the complex conjugate of each spherical harmonic over azimuth and elevation. The spherical magnitude coefficients |a_lm| can be arranged as a spectrum pyramid whose base extends from [l,m] = Lmax, -Lmax to [l,m] = Lmax, +Lmax. The magnitude weight of each harmonic is color coded. Lmax can be as large as necessary to account for the highest spatial frequency in the receptive field, although in our analyses it has been conservatively set to 20. Examples of zonal, sectorial, and tesseral harmonics are shown for their corresponding element in this spherical
magnitude spectrum. Since the coefficients
a
|
We have used spherical harmonic analysis to study shape changes in our sample population. Figure 6 illustrates an example of that analysis using the data from the three units in Fig. 4. As sound intensity was decreased (top to bottom), there was a general trend for the receptive field to become more spatially high-pass, as signaled by the increased magnitudes of the coefficients toward the base of the spectrum pyramid. The spectra of the unit shown in the right panel tend to be composed primarily of sectorial harmonics relative to zonal harmonics, in contrast with the other two units. All include the more complex tesseral harmonics. To quantify these changes across intensity we derived distributions using the ratio of averaged magnitude coefficients from 165 units with appropriately sampled intensity levels. Specifically, the log ratio of average sectorial (l = m) magnitude coefficients to average zonal (m = 0) magnitude coefficients

|
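The sectorial-to-zonal comparison described above can be sketched as a small helper that averages magnitude coefficients in each class and takes the log of their ratio. Several details are assumptions made for this illustration: the coefficients are supplied as a dictionary keyed by (l, m), sectorial terms are taken as |m| = l, the l = 0 term (which would fall in both classes) is excluded, and the logarithm is base 10.

```python
import math

def log_sectorial_to_zonal(magnitudes):
    """Log10 ratio of mean sectorial to mean zonal magnitude coefficients.

    magnitudes: dict mapping (l, m) -> |a_lm|. Terms with l == 0 are skipped
    because that single harmonic is both sectorial and zonal (an assumption).
    """
    sectorial = [v for (l, m), v in magnitudes.items() if l >= 1 and abs(m) == l]
    zonal = [v for (l, m), v in magnitudes.items() if l >= 1 and m == 0]
    mean_sectorial = sum(sectorial) / len(sectorial)
    mean_zonal = sum(zonal) / len(zonal)
    return math.log10(mean_sectorial / mean_zonal)

toy_spectrum = {
    (1, 0): 1.0, (1, 1): 10.0, (1, -1): 10.0,
    (2, 0): 1.0, (2, 2): 10.0, (2, -2): 10.0,
    (2, 1): 5.0,  # tesseral term, ignored by both averages
}
ratio = log_sectorial_to_zonal(toy_spectrum)
```

A positive value indicates a spectrum dominated by sectorial (azimuthally periodic) harmonics, a negative value one dominated by zonal (elevationally periodic) harmonics, and a value near zero a mixture.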
|
Information theoretic analysis of sound-source direction from ensemble responses
Information provided by a single neuron is not likely to be sufficient to localize the direction of a sound source due both to the broadness of the spatial receptive field and to the ambiguity between a given response and the direction and/or intensity of the source eliciting that response. This can be appreciated by inspecting the iso-response contours within a receptive field (e.g., Fig. 4). Rather, we hypothesize that sound direction is encoded by a population of neurons having different but overlapping spatial receptive fields (Jenison 1998, 2001a,b; Jenison et al. 2001a). In this approach, the neural responses (in our case response latency) are considered as random variables, and each neuron has a probability density function that links the receptive field model to the statistical behavior of the random variable. A population of such cells can then be investigated analytically using Fisher information to show how directional acuity is enhanced or degraded as a consequence of neural response variability and the structure of the receptive field.
Most analytical derivations for Fisher information are based on the assumption of a multivariate Gaussian distribution of error. However, we have recently argued that the inverse Gaussian (IG) density also performs well in capturing the dependence of response-latency variance as a function of the mean latency (Jenison 2001b; Jenison et al. 2001a). Previously, we had only considered a linear model of variance growth. The inverse Gaussian density, with the linked receptive field model, rf_i, is
|
(2) |
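where the mean of the density is given by the receptive field model rf_i and the variance is proportional to the cube of that mean, scaled by the inverse of a shape parameter. The shape parameter has typically been used as an additional degree of freedom for purposes of improving the fit of univariate models. The Fisher information derivation for the IG distribution can be found in Jenison (2001b). The diagonal cells of the Fisher information matrix correspond to the information about the directional parameters, azimuth and elevation, and the intensity parameter. The off-diagonal cells correspond to the cross-information. Inverting the Fisher information matrix results in a covariance matrix containing the individual Cramér-Rao lower bounds on estimation variance for each parameter and the covariance in the off-diagonals.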
Consideration of only one parameter in the Fisher information matrix leads to the following construction with respect to the azimuth direction parameter for a population of N neurons
|
(3) |
|
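A minimal numerical sketch of the single-parameter case may help fix ideas. It uses the standard inverse Gaussian density in its mean/shape parameterization, for which the variance is mean³/shape and the Fisher information about the mean is shape/mean³; the chain rule then gives the information about azimuth from the local slope of each neuron's modeled latency. Independence between neurons, a shape parameter that does not depend on azimuth, and the function names are assumptions of this sketch, which is not offered as the paper's Eq. 3.

```python
import math

def ig_pdf(t, mean, shape):
    """Inverse Gaussian density at latency t > 0 (variance = mean**3 / shape)."""
    return math.sqrt(shape / (2.0 * math.pi * t ** 3)) * math.exp(
        -shape * (t - mean) ** 2 / (2.0 * mean ** 2 * t))

def fisher_information_azimuth(means, slopes, shapes):
    """Population Fisher information about azimuth.

    means:  modeled mean latency of each neuron at the source direction (ms)
    slopes: derivative of each mean latency with respect to azimuth (ms/deg)
    shapes: inverse Gaussian shape parameter of each neuron
    Assumes independent neurons, so single-neuron information terms add.
    """
    return sum(lam * s ** 2 / mu ** 3 for mu, s, lam in zip(means, slopes, shapes))

def cramer_rao_sd(means, slopes, shapes):
    """Lower bound on the SD of an unbiased azimuth estimate (deg)."""
    return 1.0 / math.sqrt(fisher_information_azimuth(means, slopes, shapes))

bound_one = cramer_rao_sd([10.0], [0.1], [1000.0])
bound_four = cramer_rao_sd([10.0] * 4, [0.1] * 4, [1000.0] * 4)
```

With four identical neurons the bound is half that of one neuron, which shows how pooling across an ensemble sharpens the attainable acuity; steeper latency gradients and smaller latency variance both raise the information.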
| |
DISCUSSION |
|---|
|
|
|---|
Characterizing auditory spatial receptive fields is not a single-factor (i.e., sound-source direction) problem since under natural conditions listeners localize sounds under varying behavioral conditions that include environments where sound sources vary in intensity over a wide range. These nondirectional variables can be reasonably hypothesized or experimentally demonstrated to be important in determining the operating characteristics of a cell's spatial receptive field (Benson et al. 1981; Brugge et al. 1998; Furukawa and Middlebrooks 2001; Reale et al. 2000; Recanzone et al. 1998, 2000; Su and Recanzone 2001), which in turn could confound localization ability.
The dependence of AI spatial receptive field properties on sound-source intensity level is also indicated by dichotic stimulation studies that employed the two major interaural cues for directional hearing (i.e., interaural time and level differences). In these experiments, the exact relationship between neural response and an interaural cue was often critically dependent on the intensity level of the source (Brugge et al. 1969, 1973; Irvine et al. 1996; Phillips and Irvine 1981, 1983; Reale and Brugge 1990; Reale and Kettner 1986; Semple and Kitzes 1993a,b). Thus for most AI cells, uncertainty in the intensity level of the source introduces an inherent ambiguity between a given response and the interaural difference cue that maps onto the azimuthal direction of that source.
We have extended the nonparametric functional modeling of auditory space receptive fields to include the dimension of sound-source intensity. This new construction of the functional model is an important extension because it characterizes formally the covariation of response latency between two stimulus dimensions. Thus the model captured the systematic response variability due to the interplay of sound-source direction and sound-source intensity with negligible modeling error as demonstrated by cross-validation (i.e., Monte Carlo simulation).
Spherical harmonics
One interpretation of the spatial receptive field is that it reveals the spatial filtering characteristics of the neuron. The neuron, however, responds in both a linear and a nonlinear fashion as a function of space, time, and intensity (Jenison et al. 2001b). A linear systems analysis analogous to Fourier analysis was used to expand the spatial function into characteristic patterns of spherical harmonics. The composition of the receptive field may be dominated by a particular class of spherical harmonics. Although the tesseral harmonics are difficult to interpret in terms of patterns of directional sensitivity, the zonal harmonics reflect elevational spatial periodicity, and the sectorial harmonics reflect azimuthal periodicity. The azimuthal sensitivity of spatial receptive fields obtained from high-frequency neurons in other auditory cortical and subcortical areas appears to be determined largely by the pattern of interaural intensity differences (IID) caused by separation of the ears on the head (Delgutte et al. 1995; Fuzessery et al. 1990; Nelken et al. 1998; Tollin and Yin 2002b; Wenstrup et al. 1988). However, across these frequencies, the IID-azimuth relationship in the cat varies with spherical elevation (Martin and Webster 1989; Musicant et al. 1990). Taken together, these relationships predict that spatial receptive fields should be characterized by a predominance of neither solely zonal nor solely sectorial harmonics. Our data (Fig. 7) are consistent with this prediction in that the population distribution characterizing the range from zonal to sectorial was peaked not at either extreme but rather in the middle.
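The three classes of spherical harmonics named above can be told apart from the degree and order of each term alone. The following is a minimal sketch, not the analysis used in this study: the function names, the coefficient dictionary, and the use of squared coefficient magnitude as "power" are all assumptions made for illustration.

```python
from collections import defaultdict


def harmonic_class(degree, order):
    """Classify a spherical harmonic by its degree l and order m."""
    if degree < 0 or abs(order) > degree:
        raise ValueError("need degree >= 0 and |order| <= degree")
    if order == 0:
        # Nodal lines follow parallels only: periodicity in elevation.
        # The constant term (0, 0) also lands here.
        return "zonal"
    if abs(order) == degree:
        # Nodal lines follow meridians only: periodicity in azimuth.
        return "sectorial"
    # Mixed pattern of parallels and meridians.
    return "tesseral"


def power_by_class(coefficients):
    """Fraction of total squared-coefficient power in each class.

    coefficients: dict mapping (degree, order) -> expansion coefficient.
    """
    power = defaultdict(float)
    for (degree, order), value in coefficients.items():
        power[harmonic_class(degree, order)] += abs(value) ** 2
    total = sum(power.values())
    if total == 0.0:
        raise ValueError("all coefficients are zero")
    return {name: power[name] / total
            for name in ("zonal", "sectorial", "tesseral")}


# Hypothetical coefficients, chosen only to exercise the functions.
example = {(1, 0): 2.0, (2, 0): 1.0, (1, 1): 1.0, (2, 2): 1.0, (2, 1): 1.0}
fractions = power_by_class(example)
```

A receptive field dominated by neither zonal nor sectorial terms would show comparable fractions in both of those classes under this kind of bookkeeping.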
Analytic formulation of the spatial receptive field
All cortical response metrics that have been studied as neural-code candidates for directional hearing suffer from some form of ambiguity between stimulus dimensions and unique response measurement. For example, in our AI sample, it is common for a cell to produce the same discharge rate or response latency for an array of sound-source directions (i.e., iso-response contour) in acoustic space, even when all other stimulus variables are held constant (Brugge et al. 1996, 1997; Jenison 1998). This ambiguity is easily compounded when additional stimulus dimensions (e.g., background noise or competing sound) are investigated (Brugge et al. 1998; Reale and Brugge 2000). The intensity level of the sound source is particularly notable in this regard (Heil et al. 1994; Phillips et al. 1994; Schreiner 1998). For example, most high-BF neurons in cat AI cortex are reported to exhibit an azimuthal sensitivity that is dependent on the intensity level of a free-field stimulus (Clarey et al. 1994; Imig et al. 1990; Rajan et al. 1990; Samson et al. 1993, 1994). A similar result is inferred when cat AI neurons are studied for the effect of intensity level on their sensitivity to interaural intensity difference, a major cue for the azimuthal direction of high-frequency sound sources (Irvine et al. 1996; Semple and Kitzes 1993a,b). These intensity level effects are also common in other auditory cortical areas (Middlebrooks et al. 1998) and in lower levels of the mammalian central auditory system using either free-field or dichotic stimulus delivery (Boudreau and Tsuchitani 1968; Fuzessery et al. 1990; Irvine and Gago 1990; Semple and Kitzes 1987; Tollin and Yin 2002a; Wenstrup et al. 1988).
In most of the studies cited above, a small proportion of neurons has been identified with spatial receptive field characteristics that can be classified as intensity invariant. One reasonable suggestion, therefore, is that a neural code for sound-source direction is carried by this sub-population using one of the hypothesized receptive field characteristics (e.g., maximal response). In our studies, however, we have investigated an alternative proposal, namely, that within an ensemble of AI cortical neurons with spatial receptive fields that are typically large and exhibit multiple covariations among stimulus dimensions, there is sufficient information (in a statistical sense) to code for sound-source direction (Jenison 1998, 2000, 2001a; Jenison et al. 2001a). This information-theoretic approach benefits greatly from the analytic formulation of the spatial receptive field and the application of standard quantitative tools for parameter estimation.
Fisher information
We, as well as others, have investigated the consequences of broad receptive fields on population coding using Fisher information and the Cramér-Rao lower bound (CRLB) under the assumption of independent noise (Jenison 1998; Paradiso 1988; Seung and Sompolinsky 1993) and of correlated noise (Abbott and Dayan 1999; Gruner and Johnson 1999; Jenison 2000; Sompolinsky et al. 2001). The CRLB is a lower bound on the variance, and hence on the SE, of any unbiased estimator, and is derived from Fisher information with respect to a family of parametric probability distributions. The CRLB is inversely related to Fisher information, both mathematically and intuitively: as the magnitude of Fisher information increases, we expect the estimated SE to diminish. If the CRLB can be derived analytically, it can be used to compute the minimum possible variance about any value estimated by a theoretical ideal observer. Under the assumption of independence, even very broad and nonuniform spatial receptive fields in auditory cortex can achieve psychophysical localization acuity with as few as 10 cells in the population (Jenison 1998, 2000).
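As a rough illustration of how the CRLB follows from Fisher information, the sketch below assumes independent Gaussian noise with a fixed SD per cell, in which case the population Fisher information about a direction is the sum over cells of the squared tuning slope divided by the noise variance. The cosine-shaped latency tuning, the 10 evenly spaced preferred azimuths, and the 1-ms noise SD are hypothetical choices, not values from this study.

```python
import math


def population_fisher_information(theta, tuning_curves, noise_sd, step=1e-4):
    """Sum of (df_i/dtheta)^2 / sigma_i^2 over independent Gaussian cells."""
    total = 0.0
    for curve, sd in zip(tuning_curves, noise_sd):
        # Central-difference estimate of the tuning slope at theta.
        slope = (curve(theta + step) - curve(theta - step)) / (2.0 * step)
        total += (slope / sd) ** 2
    return total


def crlb_standard_error(fisher_information):
    """Smallest SE attainable by any unbiased estimator."""
    return math.sqrt(1.0 / fisher_information)


def make_cell(preferred_deg, depth_ms=2.0, base_ms=15.0):
    """Hypothetical broad tuning: latency (ms) is shortest at the preferred azimuth."""
    def latency(azimuth_deg):
        return base_ms - depth_ms * math.cos(math.radians(azimuth_deg - preferred_deg))
    return latency


# Ten hypothetical cells with preferred azimuths every 36 degrees.
cells = [make_cell(p) for p in range(-180, 180, 36)]
sds = [1.0] * len(cells)  # assumed latency noise SD in ms

info = population_fisher_information(20.0, cells, sds)
se_deg = crlb_standard_error(info)
```

Because each cell contributes a nonnegative term, adding cells can only shrink the bound, which is why even broadly tuned cells can support fine acuity as an ensemble.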
Most analytical derivations for Fisher information are based on the assumption of a multivariate Gaussian distribution of error; however, deviation from the standard Gaussian assumption requires alternative constructions for Fisher information. By examining the residual error from the current model, we ascertained the magnitude of response-latency variance and modeled that variability using an alternative to the Gaussian distribution, the inverse Gaussian (IG), whose variance depends on the mean latency and allows formal evaluation of the growth in variance using Fisher information. This relationship may prove useful beyond field AI, since response latency metrics have recently been shown to carry a significant proportion of the directional information in nonprimary auditory cortical areas (Furukawa and Middlebrooks 2002), and in visual (Gawne et al. 1996; Heller et al. 1995; Wiener and Richmond 1999) and somesthetic (Petersen et al. 2001) sensory representations.
The normal and Poisson densities are well known. The Poisson and its variants have been used extensively as point-process and rate models. Less familiar Tweedie densities include the inverse Gaussian and gamma. Most recently, Barbieri et al. (2001) have modeled spike trains using these densities to address deficiencies in their earlier Poisson models (Brown et al. 1998). Tweedie densities are characterized by an index p where E[x] = µ and var[x] = µ^p. The indices p = [0, 1, 2, 3] correspond to the normal, Poisson, gamma, and IG, respectively (Jorgenson 1987, 1999). The IG distribution has a history dating back to 1915 when Schrödinger presented derivations of the density of the first passage time distribution of Brownian motion with drift (Chhikara and Folks 1989; Seshadri 1999). Tweedie (1941) coined the term inverse Gaussian based on his observation that the cumulant generating function of the IG is the inverse of the cumulant generating function for the Gaussian. We have analyzed the goodness-of-fit of the IG (Jenison 2001b; Jenison et al. 2001a), and found it to be a reasonable model of increasing variability as a function of mean first-spike latency.
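The variance-mean relationship for the IG (index p = 3) can be checked numerically. The sketch below assumes the common parameterization of the IG by a mean µ and a shape parameter λ, for which the variance is µ^3/λ; that is, the Tweedie relation with a dispersion factor of 1/λ, which the text above leaves implicit. The function names and the example values µ = 2 and λ = 8 are illustrative only.

```python
import math


def tweedie_variance(mu, p, phi=1.0):
    """Variance implied by the Tweedie relation var = phi * mu**p."""
    return phi * mu ** p


def inverse_gaussian_pdf(x, mu, lam):
    """IG density with mean mu and shape lam, for x > 0."""
    return (math.sqrt(lam / (2.0 * math.pi * x ** 3))
            * math.exp(-lam * (x - mu) ** 2 / (2.0 * mu ** 2 * x)))


def numeric_moments(mu, lam, step=1e-3):
    """Midpoint-rule estimates of the mean and variance of IG(mu, lam)."""
    # Integrate out to 40 theoretical SDs above the mean; the tail beyond is negligible.
    upper = mu + 40.0 * math.sqrt(mu ** 3 / lam)
    m0 = m1 = m2 = 0.0
    for k in range(int(upper / step)):
        x = (k + 0.5) * step
        w = inverse_gaussian_pdf(x, mu, lam) * step
        m0 += w
        m1 += w * x
        m2 += w * x * x
    mean = m1 / m0
    return mean, m2 / m0 - mean ** 2


mean, variance = numeric_moments(2.0, 8.0)
predicted = tweedie_variance(2.0, 3, phi=1.0 / 8.0)  # mu**3 / lam
```

Under this parameterization, doubling the mean latency at fixed λ multiplies the variance by eight, which is the kind of growth in variability with latency that motivates the IG here.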
In this study, we employed the Fisher information derivation for the IG distribution that was recently suggested as a viable alternative to the standard Gaussian (Jenison 2001b; Jenison et al. 2001a). When a small ensemble of AI cells was studied in this way, the influence of sound-source intensity was manifested as a nonmonotonic relationship with acuity. Except near the midline (i.e., 0° azimuth), acuity was best at an intensity between the minimum and maximum levels tested. These results have some support in the psychophysical literature at high intensity levels (MacPherson and Middlebrooks 2000) and at low intensity levels (Su and Recanzone 2001). The nonmonotonic behavior can be explained in terms of two competing contributions to population coding. As the sound intensity increases, the general trend for the population is for the receptive fields to broaden and flatten, which results in a general decrease in spatial gradients (see Figs. 1 and 6). However, as the intensity decreases, the mean first-spike latency increases together with an increase in variance (see Figs. 1 and 2). These two characteristics contribute to the increase in the standard error at high and low intensities, respectively.
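The trade-off described above can be made concrete with a toy calculation. The sketch assumes an IG latency model with a fixed shape parameter λ, for which the Fisher information about the mean µ is λ/µ^3, so that the information about a direction θ carried by one cell is (dµ/dθ)^2 λ/µ^3. This simplified form is an assumption made for illustration and is not necessarily the derivation of Jenison (2001b); the latencies, gradients, and λ below are hand-picked hypothetical values, not data from this study.

```python
import math


def ig_fisher_information(mean_latency, latency_gradient, shape):
    """Per-cell Fisher information about direction, assuming fixed IG shape.

    mean_latency:     mu, mean first-spike latency (ms)
    latency_gradient: dmu/dtheta, spatial gradient of latency (ms/deg)
    shape:            lambda, IG shape parameter (variance = mu**3 / lambda)
    """
    return latency_gradient ** 2 * shape / mean_latency ** 3


def population_standard_error(cells):
    """CRLB on the SE of direction (deg) for independent cells."""
    total = sum(ig_fisher_information(mu, grad, lam) for mu, grad, lam in cells)
    return math.sqrt(1.0 / total)


# Hypothetical ensembles of 10 identical cells per intensity level:
# (mean latency in ms, latency gradient in ms/deg, shape lambda).
levels = {
    "low": [(30.0, 0.05, 500.0)] * 10,   # long latency, hence large variance
    "mid": [(15.0, 0.04, 500.0)] * 10,
    "high": [(12.0, 0.01, 500.0)] * 10,  # short latency but flattened gradient
}
errors = {name: population_standard_error(cells) for name, cells in levels.items()}
```

With these illustrative numbers the bound is smallest at the middle level, because the long latencies at the low level inflate the variance while the flattened gradients at the high level remove the signal.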
| |
ACKNOWLEDGMENTS |
|---|
This work was supported by National Institutes of Health Grants DC-03554, DC-00116, and HD-03352.
| |
FOOTNOTES |
|---|
Address for reprint requests: R. A. Reale, 627 Waisman Center, University of Wisconsin, Madison, WI 53711 (E-mail: reale{at}physiology.wisc.edu).
| |
REFERENCES |
|---|
|
|
|---|