Abstract
Heil, Peter. Auditory cortical onset responses revisited. I. First-spike timing. J. Neurophysiol. 77: 2616–2641, 1997. Sound onsets are salient and behaviorally relevant, and most auditory neurons discharge spikes locked to such transients. The acoustic parameters of sound onsets that shape such onset responses are unknown. This paper analyzes the timing of spikes of single neurons in the primary auditory cortex of barbiturate-anesthetized cats in response to the onsets of tone bursts. By parametric variation of sound pressure level, rise time, and rise function (linear or cosine-squared), the time courses of peak pressure, rate of change of peak pressure, and acceleration of peak pressure during the tones' onsets were systematically varied. For cosine-squared rise function tones of a given frequency and laterality, any neuron's mean first-spike latency was an invariant and inverse function of the maximum acceleration of peak pressure occurring at tone onset. For linear rise function tones, latency was an invariant and inverse function of the rate of change of peak pressure. Thus latency is independent of rise time or sound pressure level per se. Latency-acceleration functions, obtained with cosine-squared rise function tones under different stimulus conditions (frequency, laterality) from any given neuron and across the neuronal pool, were of strikingly similar shape. The same was true for latency–rate of change of peak pressure functions obtained with linear rise function tones. Latency–acceleration/rate of change of peak pressure functions could differ in their extent and in their position within the coordinate system. The positional differences reflect neuronal differences in minimum latency L_{min} and in a sensitivity S to acceleration and rate of change of peak pressure (transient sensitivity), a hitherto unrecognized neuronal property that is distinctly different from firing threshold.
Estimates of L_{min} and S, which were derived by fitting a simple function to the neuronal latency–acceleration/rate of change of peak pressure functions, were independent of rise function. On average, L_{min} decreased with increasing characteristic frequency (CF), but varied widely for neurons with the same CF. S varied with CF in a fashion similar to the cat's audiogram and, for a given neuron, varied with frequency. SD of first-spike latency was roughly proportional to the slope of the functions relating latency to acceleration/rate of change of peak pressure. Thus SD increased exponentially, rather than linearly, with mean latency, and did so at about twice the rate for linear as for cosine-squared rise function tones. The proportionality coefficients were quite similar across the neuronal pool and similar for both rise functions. Minimum SD increased nonlinearly with increasing L_{min}. These findings suggest a peripheral origin of S and a peripheral establishment of latency–acceleration/rate of change of peak pressure functions. Because of the striking similarity in the shapes of such functions across the neuronal pool, sound onsets will produce orderly and predictable spatiotemporal patterns of first-spike timing, which could be used to instantaneously track rapid transients and to represent transient features by partly scale-invariant temporal codes.
INTRODUCTION
Natural acoustic signals, including many of those used by animals and humans for auditory communication, are spectrally and temporally complex. A recent study has emphasized the importance of the temporal structure of the envelope by showing that it can convey an unexpected amount of information needed for speech recognition (Shannon et al. 1995). Animal studies have shown that throughout the auditory pathway neurons can be excited by rapid temporal changes in stimulus envelopes, provided that the stimuli have an adequate spectral content. In many studies researchers have used stimuli with repetitive envelope fluctuations, such as periodically amplitude-modulated sinusoids or noise or click trains, and have demonstrated that neuronal responses can be locked to the individual repetitive envelope fluctuations (e.g., auditory nerve: Joris and Yin 1992; cochlear nucleus: Frisina et al. 1985; Rhode and Greenberg 1994; inferior colliculus: Heil et al. 1995; Langner and Schreiner 1988; Rees and Møller 1983; thalamus: Rouiller et al. 1981; cortex: Eggermont 1993; Schreiner and Urbas 1988).
A particularly salient temporal envelope change is the onset of a sound, and nearly all neurons along the auditory pathway respond briskly to such a transient. For example, all physiologically classified neuron types of the cochlear nucleus, with the exception of buildup neurons in the dorsal division, display an initial peak in their poststimulus time histograms recorded in response to short tone bursts (e.g., Rhode and Greenberg 1992). This peak reflects the locking of the neuron's initial spike(s) to the tone's onset, and therefore such responses or response components are sometimes referred to as onset responses. Because of the demonstrated phase-locking of spikes to amplitude-modulated signals or click trains, such signals may constitute a rapid series of like onsets for a neuron. In fact, Rhode and Greenberg (1992) have noted that cochlear nucleus neurons, classified as onset units, also phase-lock with high precision to low-frequency signals (sinusoidal carriers and amplitude-modulated sounds) “. . . responding as if each cycle is an effective excitatory stimulus” (p. 100). Onset response components are also evident in the discharge patterns of neurons in locations higher up the pathway, such as the medial geniculate or the auditory cortex (for review see Clarey et al. 1992). Onset responses appear to be least vulnerable to the effects of anesthesia (Zurita et al. 1994), and the responses of neurons in the auditory cortices of chloralose- and barbiturate-anesthetized animals are dominated by discharges locked to the stimulus onset (e.g., Brugge et al. 1969; Phillips 1988; Zurita et al. 1994).
Although it is widely accepted that the initial discharges of most auditory neurons are evoked by stimulus onset, little attention has been given to the question of which physical parameters of the stimulus onset actually shape a neuron's onset response. When auditory neurons are probed with narrowband stimuli, such as pure tone bursts, the effects of the abruptness of the amplitude change on the short-term frequency spectrum (e.g., Durrant and Lovrinic 1984; Pickles 1988) have been of some concern, and to reduce spectral splatter at signal onset, signals are generally shaped with some finite rise time. The neglect of the physical parameters of sound onsets (other than the general concern about spectral splatter), despite the recognition that the initial discharges of most auditory neurons are evoked by stimulus onsets, has an almost paradoxical consequence: in innumerable studies, measures of neuronal properties that were extracted from onset responses (or responses that contained an onset component) are reported and analyzed with respect to stimulus parameters that characterize features of the steady-state or plateau portion of the stimulus. An important case in point is the effect of sound pressure level (SPL) on neuronal onset responses. Alterations of the SPL of a stimulus inevitably co-alter features of its onset, particularly when the rise function and the rise time are held constant, as is routinely done. When stimuli are shaped with the widely used linear rise function, for example, the most obvious feature is the slope of the envelope, i.e., the rate at which the peak pressure changes until the plateau value is reached. Any 6-dB increase in SPL will double this rate. A second feature that is co-altered with SPL under such conditions is the quasi-instantaneous acceleration of peak pressure, a parameter whose potential relevance has not been recognized at all.
Both stimulus onset parameters are also co-altered when the rise time is altered and the SPL is held constant. Thus it is conceivable that neuronal onset responses might be shaped by factors other than the SPL or the short-term frequency spectrum.
Natural sound onsets will vary not only due to variation of signal SPL but, because of differences in the manner in which sounds are produced, also due to variation in signal rise time (e.g., Cutting and Rosner 1974; Hall and Feng 1988). In speech sounds, for example, rise time can vary with the manner of articulation (Pickett 1980; Stevens 1980). Rise time can in fact cue perceptual categories in speech (Cutting and Rosner 1974; Stevens 1980), but clearly affects the perception of nonspeech sounds as well (Cutting and Rosner 1974). In humans, the just noticeable difference for a change in rise time is ∼25% of the duration of the rise time (van Heuven and van den Broecke 1979). Natural signals, including speech sounds, also differ in rise function, but to our knowledge no physiological or psychophysical study has investigated the potential relevance of this onset feature. Nevertheless, the auditory system will experience, and may be able to discriminate, a wealth of different sound onsets.
In the present study and the companion paper (Heil 1997) the question of how auditory onset responses code or represent auditory onsets is investigated. This question is addressed by focusing the analysis on onset parameters such as the rate of change or the acceleration of peak pressure. Onset features were varied by varying SPL, rise time, and rise function. In addition to the widely used linear rise function, which is characterized by a constant rate of change of peak pressure during the rise time, cosine-squared rise functions were used. These have the advantage that peak pressure, rate of change, and acceleration of peak pressure are smooth and assessable functions of time that reach their maxima at different points during the rise time and are differentially affected by manipulations of rise time or SPL. Neurons of the primary auditory cortex (AI) are particularly suited to tackle the issue of onset coding because they preferentially respond to sound onsets, and any later discharges, if they occur, can be readily distinguished (e.g., Brugge et al. 1969). Because auditory cortical neurons have complex frequency filters, we have employed simple tonal stimuli to more easily decipher the effects of carrier frequency. A thorough understanding of coding strategies for isolated onsets will also promote our understanding of the coding of envelope transients that occur periodically or aperiodically during the course of complex auditory signals and that are so critical for speech recognition (Shannon et al. 1995). Preliminary reports of some of the findings have been presented (Heil 1996; Heil and Irvine 1996a).
METHODS
Animal preparation
Seven adult cats (3 females and 4 males, weighing between 2.6 and 3.8 kg) contributed data to this study. All had healthy ears as judged by otoscopic inspections of the tympani and middle ears and by the shapes and sensitivities of the N_{1} audiogram. Each cat was deeply anesthetized with pentobarbitone sodium (40 mg/kg ip). Atropine (0.3 ml im) was administered to reduce tracheal mucous secretion. A broad-spectrum antibiotic (Amoxil; 0.5 ml im) was also given. The trachea and the radial vein were cannulated and anesthesia was maintained throughout surgery and recordings (up to 30 h) by intravenous injections of pentobarbitone in a physiological saline solution that also contained a few drops of heparin. The electrocardiogram was continuously monitored and rectal temperature was held near 38°C by a thermostatically controlled DC blanket. Surgical procedures have been described in detail elsewhere (Heil et al. 1992b). In brief, the left auditory cortex was exposed by trepanation of the overlying skull and removal of the dura. A specially designed Perspex chamber was mounted to the skull surrounding the opening, filled with warm saline, and sealed with a glass plate on which a small hydraulic microdrive was mounted and that housed the glass-insulated tungsten microelectrode. Each bulla was exposed and a round-window electrode and a length of fine-bore polyethylene tubing, allowing static pressure equalization within the middle ear, were inserted through a small hole. Thereafter the bullae were resealed with dental acrylic. The external meati were also cleared of surrounding tissue and transected to leave only short meatal stubs.
Acoustic stimulation and recording procedures
The cat was located in a sound-attenuating chamber. Stimuli were digitally produced (Tucker Davis Technology) and presented to the cat's ears via precalibrated sealed sound delivery systems. Each system consisted of a STAX SRSMK3 transducer in a coupler. The sound delivery tube of the coupler fitted snugly into the meatal stub.
During viewing under an operating microscope, the microelectrode (tip diameter ∼10 μm; impedances ∼3–5 MΩ at 1 kHz) was positioned manually close above a chosen point on the cortical surface and was then advanced near-normal to the surface by means of the microdrive. Neural activity was amplified (×1,000) and, for recording of action potentials, also filtered (500–5,000 Hz) and displayed on storage oscilloscopes.
Once a neuron was well isolated, its characteristic frequency (CF; frequency of lowest response threshold) and its preferred laterality of stimulus presentation (viz., monaural ipsilateral, monaural contralateral, or binaural with identical tones to each ear) were determined by manually varying the appropriate stimulus parameters. The discriminator level was set to trigger off either the positive or the negative slope of the filtered action potential waveform, but was not switched between the two during data acquisition. Adjustments of the trigger level during data acquisition were sometimes necessary. However, the effects of this procedure on the trigger instant were very small (<0.1 ms, as judged by inspection of the oscilloscope traces). Event times were stored on disk with 10-μs resolution for offline analysis.
Under computer control, 20 repetitions of CF tones with a given rise function and a fixed rise time were presented at 1 Hz, at SPLs ranging from below threshold up to 90 dB SPL in 10-dB steps, followed by a measure of spontaneous activity. A different rise time was then selected and the recording procedure was repeated. As many as seven different rise times, covering the range of 1–170 ms, were tested and presented in random sequence. Most neurons were tested with CF tones of their preferred stimulus laterality, but some were tested with other stimulus lateralities and at other frequencies as well. In the latter cases, tones of a given rise function and rise time but of different frequencies and amplitudes were presented pseudorandomly as described in detail elsewhere (Heil et al. 1992b).
All tone bursts were 400 ms in duration, including the symmetrical rise and fall times. Tone bursts were shaped with either linear or cosine-squared rise and fall functions. Because it is the peak pressure PP (measured in Pa), and not the SPL (expressed in dB SPL), that changes according to the rise function, it is convenient for the present purpose to express the SPL as the plateau peak pressure PP_{plateau}.
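The conversion between level and plateau peak pressure can be sketched as follows. This is a minimal illustration, not the calibration procedure used in the study: it assumes the conventional 20-μPa rms reference for dB SPL and a sinusoidal carrier (peak = √2 × rms), and the function names are hypothetical.

```python
import math

P_REF_RMS = 20e-6  # Pa; conventional dB SPL reference (assumption, not stated in the text)

def spl_to_peak_pressure(spl_db):
    """Plateau peak pressure (Pa) of a sinusoid at the given level (dB SPL)."""
    rms = P_REF_RMS * 10 ** (spl_db / 20.0)
    return math.sqrt(2.0) * rms  # peak of a sinusoid is sqrt(2) times its rms value

def peak_pressure_to_spl(pp_pa):
    """Inverse conversion: level (dB SPL) of a sinusoid with the given peak pressure (Pa)."""
    return 20.0 * math.log10(pp_pa / (math.sqrt(2.0) * P_REF_RMS))
```

Under these assumptions a 20-dB SPL tone has a plateau peak pressure of about 0.00028 Pa, which agrees with the value quoted in RESULTS, and each 6-dB step very nearly doubles the plateau peak pressure.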
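With the cosine-squared rise function used here, peak pressure PP (in Pa) changes as a function of time t (in s) during the rise time t_{r} (0 ≤ t ≤ t_{r}) according to
$$PP(t) = PP_{plateau} \cdot \sin^{2}\left(\frac{\pi t}{2 t_{r}}\right) \qquad (1)$$
The rate of change of peak pressure RCPP (in Pa/s) varies with time according to
$$RCPP(t) = \frac{\pi \cdot PP_{plateau}}{2 t_{r}} \cdot \sin\left(\frac{\pi t}{t_{r}}\right) \qquad (2)$$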
The time courses of peak pressure, rate of change, and acceleration of peak pressure for cosine-squared and for linear rise functions are schematically illustrated in Figs. 1 and 9, respectively.
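For readers who want to reproduce these time courses numerically, the sketch below evaluates peak pressure, its rate of change, and its acceleration during the rise. It assumes the standard cosine-squared gate, PP(t) = PP_plateau · sin²(πt / 2t_r), and obtains the other two quantities by differentiation; the function names and the symbol `rise_time` are illustrative rather than taken from the paper.

```python
import math

def cos2_onset(t, pp_plateau, rise_time):
    """Return (PP, RCPP, APP) at time t (s) for an assumed cosine-squared rise.

    PP in Pa, RCPP in Pa/s, APP in Pa/s^2. Outside the rise the envelope is flat.
    """
    if t < 0.0:
        return 0.0, 0.0, 0.0
    if t > rise_time:
        return pp_plateau, 0.0, 0.0
    w = math.pi / rise_time
    pp = pp_plateau * math.sin(0.5 * w * t) ** 2
    rcpp = 0.5 * pp_plateau * w * math.sin(w * t)       # first derivative of PP
    app = 0.5 * pp_plateau * w * w * math.cos(w * t)    # second derivative of PP
    return pp, rcpp, app

def cos2_maxima(pp_plateau, rise_time):
    """Maximum RCPP (reached at mid-rise) and maximum APP (reached at t = 0)."""
    rcpp_max = math.pi * pp_plateau / (2.0 * rise_time)
    app_max = math.pi ** 2 * pp_plateau / (2.0 * rise_time ** 2)
    return rcpp_max, app_max

def linear_rcpp(pp_plateau, rise_time):
    """Constant rate of change of peak pressure for a linear rise."""
    return pp_plateau / rise_time
```

In this form the maximum rate of change scales with PP_plateau / t_r and the maximum acceleration with PP_plateau / t_r², so halving the rise time doubles the former and quadruples the latter.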
Data analysis
Spikes in response to the 20 presentations of a given stimulus were displayed offline as a poststimulus time histogram. The histogram was used to select analysis windows that would comprise only onset responses and would discard late discharges, offset responses, and occasional presumed spontaneous spikes. Spontaneous activity was generally very low (<3 spikes/s) and late discharges, if they occurred at all, were clearly separated in time from onset responses by marked intervals of no activity. Thus the selection of an appropriate onset window was generally straightforward. In most cases, analysis windows used for a given neuron were the same for all rise times and amplitudes studied (e.g., from 5 to 100 ms after tone burst onset). In some instances, however, different windows had to be selected. In these cases, windows for tones of long rise times and low amplitudes were longer or delayed relative to windows for tones of short rise times and high amplitudes, because otherwise onset responses would have been missed or late responses would have been included, respectively. In the present paper aspects of spike timing are analyzed, whereas in the companion report the focus is on response magnitudes. Only the timing of the first (and in many neurons the only) spike will be considered because the interspike intervals of the onset responses of auditory cortex neurons, which discharge more than one spike per stimulus, are very regular and independent of stimulus level (Phillips and Sark 1991). Mean and SD of first-spike latency (measured from stimulus onset), response probability, and number of discharges in the window were computed. As a rule, only means and SDs based on response probabilities of ≥0.15 were considered further.
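The windowed first-spike measures described above can be sketched as follows. This is an illustrative reimplementation, not the original analysis code: the data layout (one list of spike times in ms per presentation), the default window, the use of the sample SD, and all names are assumptions.

```python
import statistics

def first_spike_stats(trials, window=(5.0, 100.0), min_prob=0.15):
    """Summarize onset responses within an analysis window.

    trials: list of lists of spike times (ms re stimulus onset), one per presentation.
    Returns mean and SD of first-spike latency, response probability,
    total spike count in the window, and whether the probability criterion is met.
    """
    lo, hi = window
    first_spikes = []
    n_spikes = 0
    for spikes in trials:
        in_window = sorted(t for t in spikes if lo <= t <= hi)
        n_spikes += len(in_window)
        if in_window:
            first_spikes.append(in_window[0])  # earliest spike inside the window
    prob = len(first_spikes) / len(trials)
    mean = statistics.mean(first_spikes) if first_spikes else None
    # Sample SD (n - 1) is an assumption; the text does not say which estimator was used.
    sd = statistics.stdev(first_spikes) if len(first_spikes) > 1 else None
    return {"mean": mean, "sd": sd, "prob": prob,
            "n_spikes": n_spikes, "usable": prob >= min_prob}
```

With 20 presentations, the 0.15 criterion corresponds to at least three presentations with a spike in the window.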
RESULTS
The results on mean first-spike latency are presented first, and then those on the variability of first-spike latency. In each section, data recorded with cosine-squared rise function tones are presented before those recorded with linear rise function tones, followed by a comparison of the results obtained with the two different rise functions.
Data base
This study is based on 74 well-isolated single neurons, recorded in the left AI, as inferred from the locations of the recording sites with respect to the sulcal pattern, the tonotopic sequence, and the presence of a short-latency strong evoked potential to tone bursts. In only one penetration in one cat did we not see an AI-like evoked potential. The two neurons recorded in this penetration (9587/03 and 9587/04) had very long minimum latencies (>30 ms). A few isolated AI neurons, which were spontaneously active, appeared not to be driven by tone bursts. Sixty-five neurons were studied with tones shaped with cosine-squared rise functions, 39 neurons were studied with tone bursts shaped with linear rise functions, and 30 neurons were studied with both types of tones. Tones were presented with the neuron's preferred stimulus laterality, which was binaural for 31 neurons, contralateral for 40 neurons, and ipsilateral for 3 neurons. Four neurons were in addition studied with several stimulus lateralities. The neurons in the sample had CFs ranging from 1.5 to 35.2 kHz, with most CFs in the octave band from 12 to 24 kHz. Three neurons were also studied at multiple frequencies other than their CFs.
Mean first-spike timing
ASPECTS OF COSINE-SQUARED RISE FUNCTION TONES.
Figure 1, left, schematically illustrates the time courses of the envelopes of the onsets of cosine-squared rise function signals. During the rise time the peak pressure (in Pa), but not the SPL (in dB SPL), of the signal changes according to the rise function (Fig. 1, top left). The rate of change of peak pressure also changes gradually during the rise time (Fig. 1, top middle). It is zero at the beginning and at the end of the rise time and reaches a maximum halfway through the rise time. Acceleration of peak pressure is maximal at the beginning of the rise time and decreases smoothly with time. It is zero halfway through the rise time. From then on acceleration becomes increasingly negative (deceleration) and reaches a negative maximum at the end of the rise time (Fig. 1, top right). Thereafter acceleration is zero. Alterations of both plateau peak pressure and rise time affect the onset of stimuli shaped with cosine-squared rise functions, but in different fashions. A 6-dB increase in the plateau SPL of stimuli with a given rise time will lead to a twofold increase in the maximum rate of change of peak pressure and in the maximum acceleration of peak pressure. Shortening the rise time by a factor of 2 for any given plateau SPL also leads to a twofold increase in the maximum rate of change of peak pressure (Fig. 1, 2nd row, middle), but maximum acceleration of peak pressure increases fourfold (Fig. 1, 2nd row, right). Therefore signals can be grouped to match in rise time, plateau peak pressure, maximum rate of change of peak pressure, or maximum acceleration of peak pressure (Fig. 1, 1st–4th rows, respectively). Signals that share the same value of maximum acceleration of peak pressure differ in rise time and in plateau peak pressure (Fig. 1, bottom row).
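The grouping of signals by a shared maximum acceleration can be made concrete with a short calculation. The sketch below assumes the standard cosine-squared gate, for which maximum acceleration equals π² · PP_plateau / (2 · t_r²), together with the conventional 20-μPa rms reference for dB SPL; the target acceleration, the rise times, and the function names are illustrative choices only.

```python
import math

P_REF_RMS = 20e-6  # Pa; conventional dB SPL reference (assumption)

def plateau_for_max_acceleration(app_max, rise_time):
    """Plateau peak pressure (Pa) that yields the requested maximum
    acceleration (Pa/s^2) for an assumed cosine-squared rise of given duration (s)."""
    return 2.0 * app_max * rise_time ** 2 / math.pi ** 2

def peak_to_spl(pp_pa):
    """Level (dB SPL) of a sinusoid with the given peak pressure (Pa)."""
    return 20.0 * math.log10(pp_pa / (math.sqrt(2.0) * P_REF_RMS))

# Plateau levels needed to hold maximum acceleration at an arbitrary 100 Pa/s^2
rise_times = [0.0017, 0.017, 0.17]  # s
levels = [peak_to_spl(plateau_for_max_acceleration(100.0, rt)) for rt in rise_times]
# Each tenfold increase in rise time requires a 40-dB higher plateau level.
```

Because the required plateau peak pressure grows with the square of the rise time, tones matched in maximum acceleration necessarily span a wide range of plateau levels.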
MEAN FIRST-SPIKE LATENCY TO COSINE-SQUARED RISE FUNCTION TONES.
Figure 2a shows the mean first-spike latencies of one AI neuron (9595/04) to contralateral CF tone bursts of 22 kHz, all shaped with cosine-squared rise functions. The data are plotted as a function of plateau peak pressure (in Pa). The longest mean first-spike latency of ∼100 ms was measured in response to tones with 170-ms rise times and plateau peak pressures of 0.00028 Pa, equivalent to 20 dB SPL. For each rise time, latency declines nonlinearly with increasing plateau peak pressure. For tones of any given plateau peak pressure, latency increases systematically with rise time, although the different functions appear to converge on a single minimum at ∼12.3 ms.
Similar observations can be made when latency is plotted as a function of the maximum rate of change of peak pressure (Fig. 2b). Although the functions relating latency to maximum rate of change of peak pressure obtained with different rise times are closer together than those relating latency to plateau peak pressure, latency still increases with rise time for signals with the same maximum rate of change of peak pressure. Also, for some tones of long rise time the neuron discharges before the maximum rate of change of peak pressure is reached.
In contrast, when latency is plotted as a function of maximum acceleration of peak pressure, all five latency-acceleration functions obtained with different rise times are in close register, i.e., at any given acceleration the functions are within 1 SD of the means (Fig. 2c). For clarity, SDs are not plotted in Fig. 2, but decreased for neuron 9595/04 from 6.4 to 0.5 ms.
Tones with a common maximum acceleration at their onsets also share a number of other properties. These are the maximum deceleration occurring at the end of the rise time; the mean acceleration and mean deceleration averaged over the first and second half of the rise time, respectively; the ratio of RCPP_{max} and rise time; and the ratio of PP_{plateau} and the square of the rise time (Fig. 1; Eq. 4 and 5). However, these parameters or any combination thereof can be ruled out as determinants of latency because in response to many tones, particularly of long rise times, the first spike occurs long before the end, or even the midpoint, of the rise time (Fig. 2).
Data from a second neuron (9598/03) are illustrated in Fig. 2, d–f. This neuron's CF was similar to that of 9595/04 (viz., 21 kHz), but the neuron was excited best by tones presented to the ipsilateral ear. The total range of latencies obtained with the six different rise times tested was nearly 100 ms. As was the case for neuron 9595/04, all latency functions obtained with different rise times were in close register only when plotted as a function of the acceleration of peak pressure (Fig. 2f).
In all 65 neurons studied with cosine-squared rise function tones, mean latencies obtained at any given acceleration of peak pressure with tones of different rise times were within 1 SD of each other. Note that over the range of rise times used (1.7–170 ms), tones with the same maximum acceleration differ in plateau peak pressure by as much as 80 dB, i.e., by a factor of 10,000.
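The 80-dB figure follows from the way maximum acceleration scales with rise time. As a worked check, assuming the standard cosine-squared gate PP(t) = PP_plateau · sin²(πt / 2t_r) (the symbol t_r for rise time is introduced here only for illustration):

```latex
\begin{align*}
APP_{\max} &= \frac{\pi^{2}\, PP_{\text{plateau}}}{2\, t_r^{2}}
  \;\Rightarrow\; PP_{\text{plateau}} \propto t_r^{2}
  \quad \text{at fixed } APP_{\max},\\[4pt]
\frac{PP_{\text{plateau}}(170\ \text{ms})}{PP_{\text{plateau}}(1.7\ \text{ms})}
  &= \left(\frac{170}{1.7}\right)^{2} = 10^{4},\\[4pt]
20 \log_{10}\!\left(10^{4}\right) &= 80\ \text{dB}.
\end{align*}
```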
In several cases, some mean latencies could be systematically longer than others recorded to tones of the same maximum acceleration of peak pressure. In all these cases, the extraordinarily long latencies were measured to tones closest to the firing threshold of the neuron (e.g., the mean latencies to the 1.7-ms rise time tones with accelerations between 100 and 1,000 Pa/s^{2} in Fig. 2, c and f). Neuron 9598/14 (Fig. 14a) represents the most drastic example of this “near-threshold effect.” In this case the means of first-spike latency closest to threshold are based on the same response probability (viz., 100%) as are all the other means. In other cases in which the near-threshold effect was observed, the exceptionally long near-threshold means were mostly based on much lower response probabilities (e.g., neuron 9598/08 in Fig. 3a).
Latency-acceleration functions obtained with shorter rise times take up the common course of the latency-acceleration functions obtained with longer rise times at consecutively higher values of maximum acceleration of peak pressure (e.g., Figs. 2, 3, and 13, a and d). Thus a neuron's firing threshold is not determined by the maximum acceleration of peak pressure at signal onset (see also companion paper).
COMPARISON OF LATENCY-ACCELERATION FUNCTIONS AMONG DIFFERENT NEURONS.
The latencies of neurons 9595/04 and 9598/03 in Fig. 2 are plotted with the same resolution, and comparison of Fig. 2, c and f, reveals that their latency-acceleration functions are very similar in shape. In Fig. 3a, latency-acceleration functions obtained from another five neurons are plotted in a single graph, facilitating a comparison of latency-acceleration functions among different neurons. The data illustrated in Fig. 3a were selected to represent neurons recorded in different cats and with widely different CFs (range 2.3–30 kHz), data obtained with different laterality of presentation, and functions covering very different ranges of latency. The latency-acceleration function of neuron 9598/08 (♦) covered an extensive range of latency (130–15 ms) and of maximum acceleration of peak pressure (>8 orders of magnitude). Because of higher response thresholds, strongly nonmonotonic spike count functions, or both, the latency-acceleration functions of the other neurons were more restricted along the abscissa, but also along the ordinate. However, an inspection of Fig. 3a suggests that the shapes of these more restricted functions closely resemble sectors of the extensive function of neuron 9598/08. This is most obvious for neuron 9595/03 (○), which had a threshold slightly higher than that of neuron 9598/08. Neuron 9592/21 (▴) had a considerably higher threshold, but also slightly longer mean latencies, than neuron 9598/08. But even the course of the latency-acceleration function of neuron 9598/16 (•), which is restricted at each end, resembles the course of the extensive function of neuron 9598/08 in its intermediate part.
All 93 latency-acceleration functions, obtained from the 65 neurons studied with cosine-squared rise function tones, had strikingly similar shapes. All functions could be brought into very close register by allowing them to be shifted only along the ordinate and along the abscissa, as initially judged by visual inspection. Shifts along the ordinate compensate for differences in the minimum or asymptotic latency, and shifts along the abscissa compensate for differences in sensitivity to acceleration.
MATHEMATICAL DESCRIPTION OF LATENCY-ACCELERATION FUNCTIONS.
To get quantitative measures of the similarity of the latency-acceleration functions of different neurons and of the shifts along the ordinate and abscissa required to obtain congruence, a simple mathematical function was selected that described the form of the latency-acceleration functions, and also allowed quantification of the positional differences along the abscissa and the ordinate
Iterative curve fitting was performed in the following way. In initial fitting procedures, L_{min}, A_{CRF}, S, and the exponent α were allowed to vary. Each deviation of the fitted function from the measured mean latency was squared and then weighted by multiplying it with the response probability on which the measured mean was based. The smallest sum of the weighted squared deviations, i.e., the best fit, was generally found with <1,000 iterations. In some cases the fit was found to improve with increasing α. The improvement, however, was marginal for α > 4, and also pushed A_{CRF} into unwieldy dimensions (e.g., years for α = 10). For a second fitting step, we therefore selected α = 4, and allowed L_{min}, A_{CRF}, and S to vary. For the 93 different functions fitted, A_{CRF} showed a unimodal distribution. Figure 4a shows a scatterplot of A_{CRF} against the number of first spikes that had contributed to the fitted function. The figure shows that the width of the distribution of A_{CRF} diminished rapidly with increasing number of first spikes and converged toward the weighted average of Ã_{CRF} = 12,791 ms (Fig. 4a, – – –). In a third and final fitting procedure, A_{CRF} was also kept constant (at 12,791 ms). In this way, a function with a fixed shape, as determined by α and A_{CRF}, but free to be placed within the coordinate system of latency and maximum acceleration of peak pressure, was fitted to the data (Eq. 8).
In Fig. 3b, the mean latencies of two of the neurons of Fig. 3a (viz., 9595/03 and 9598/16) are reproduced together with the fitted functions (Eq. 8), which are of identical shape. The figure allows a visual assessment of the quality of the fit and the similarity of the fitted function with neuronal latency-acceleration functions. The best solutions for S and L_{min} found by the final fitting procedure are 3.96 and 18.8 ms for neuron 9598/16 and 4.91 and 10.5 ms for neuron 9595/03. Thus, according to the fitting results, the latency-acceleration function of neuron 9598/16 is displaced upward by 8.3 ms and rightward by 0.95 log units of acceleration relative to the function of neuron 9595/03.
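The final fitting step can be illustrated with a small sketch. It rests on an assumption about the functional form, which is not legible in this text: latency is taken to be L = L_min + A_CRF · (log10(APP_max) + S)^(−α), a form chosen because it is consistent with the parameters named above, with A_CRF in ms, and with shifts in S expressed in log units of acceleration. The grid scan over S is a simple stand-in for the iterative procedure described, not the author's algorithm, and all names are hypothetical.

```python
import math

ALPHA = 4            # exponent fixed in the second fitting step
A_CRF_MS = 12791.0   # weighted average reported in the text (ms)

def model_latency(app_max, s, l_min):
    """Latency (ms) under the ASSUMED form L_min + A_CRF * (log10(APP_max) + S)^-alpha."""
    return l_min + A_CRF_MS * (math.log10(app_max) + s) ** (-ALPHA)

def fit_s_lmin(app_max, latency_ms, resp_prob, s_step=0.001, s_span=10.0):
    """Weighted least-squares estimates of S and L_min with alpha and A_CRF fixed.

    Squared deviations are weighted by response probability, as in the text.
    For each candidate S the optimal L_min has a closed form (a weighted mean),
    so only S needs to be scanned.
    """
    x = [math.log10(a) for a in app_max]
    w_sum = sum(resp_prob)
    s_start = -min(x) + s_step  # keeps log10(APP_max) + S positive for all data
    best = None
    for k in range(int(s_span / s_step)):
        s = s_start + k * s_step
        g = [A_CRF_MS * (xi + s) ** (-ALPHA) for xi in x]
        l_min = sum(w * (lat - gi)
                    for w, lat, gi in zip(resp_prob, latency_ms, g)) / w_sum
        err = sum(w * (lat - l_min - gi) ** 2
                  for w, lat, gi in zip(resp_prob, latency_ms, g))
        if best is None or err < best[0]:
            best = (err, s, l_min)
    return best[1], best[2]
```

Under this assumed form, a larger S shifts the function leftward along the log-acceleration axis and a larger L_min shifts it upward, which matches the direction of the displacements described for neurons 9598/16 and 9595/03.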
COMPARISON OF TRANSIENT SENSITIVITY AND FIRING THRESHOLD.
S is not to be confused with firing threshold, a measure generally expressed in dB SPL and related to peak pressure. To emphasize this point more clearly, note, for example, that in Fig. 3a, the latency functions of neurons 9598/08 (♦) and 9595/03 (○) are in nearly perfect register, without requiring any notable shifts to obtain congruence, i.e., the two neurons have the same S. However, the latency functions do not start at the same point along the abscissa, reflecting differences in their firing thresholds. Figure 5 presents, for all neurons in the sample, a scatterplot of the firing thresholds (in dB SPL) against S. Each neuron contributed multiple data points to the plot, because threshold SPL increased with rise time (see companion paper and also Fig. 6). Although a low transient sensitivity seems to exclude low threshold SPLs, there is only a loose relationship between the two parameters (r^{2} = 0.123; n = 319). Threshold SPLs can vary over a range of ≥100 dB for the same S.
EFFECTS OF STIMULUS LATERALITY ON LATENCY-ACCELERATION FUNCTIONS.
In four neurons latencies were obtained to tone bursts presented with different stimulus lateralities, i.e., binaural, monaural contralateral, and monaural ipsilateral. In general, stimulus laterality had a very small, if any, effect on the shapes of the latency-acceleration functions or their horizontal position within the coordinate systems. In a comparison of stimulus laterality in a given neuron, fitting results yielded differences in S that averaged 0.1 and were all <0.3, ∼1/10 of the variation seen across neurons. The largest effect of stimulus laterality was on the estimated L_{min}. With monaural ipsilateral stimulation L_{min} was consistently 2–3 ms longer than with contralateral or binaural stimulation, whereas differences in L_{min} between monaural contralateral and binaural stimulation were <0.9 ms.
EFFECTS OF STIMULUS FREQUENCY ON LATENCY-ACCELERATION FUNCTIONS IN A GIVEN NEURON.
In three neurons latencies were obtained to tone bursts of different frequencies including the CF. Results from two of these neurons (9595/18 and 9595/09) are illustrated in Fig. 6. Figure 6, a and d, shows mean latencies plotted against maximum acceleration of peak pressure. The latency-acceleration functions for different frequencies all have similar shape, but are obviously dispersed along the abscissa. The analysis of the fitting results illustrates the systematic nature of this dispersion: in Fig. 6, b and e, the value of S obtained from these fits is plotted against tone burst frequency. For neuron 9595/18 the highest transient sensitivity is obtained for 26.8 and 24.8 kHz, and S decreases toward higher and lower frequencies, whereas for neuron 9595/09 the function is more complex.
The transient sensitivity versus frequency functions can be compared with the more conventional threshold or tuning curves based on firing probabilities (Fig. 6, c and f). Tones of the same rise time that differ in peak pressure by 20 dB differ in the acceleration of peak pressure by a factor of 10. This is equivalent to a difference in S of 1, so that the ordinates in Fig. 6, b and c and e and f, have the same relative scaling. The CF of neuron 9595/18 was near 26.8 kHz when tone bursts with 1.7- and 8.5-ms rise time were used, but shifted to 28.8 kHz with tone bursts having 17-ms rise times. In addition, with prolongation of the rise time systematic elevations in response threshold were observed throughout the excitatory frequency range, when threshold was expressed as a function of the plateau peak pressure or level (in dB SPL; Fig. 6 c, see also companion paper). Neuron 9595/09 had twin-peaked tuning curves with lowest thresholds at 24 and at 14 kHz. Again, threshold SPLs increased systematically with rise time at all frequencies, although not by identical amounts (Fig. 6 f). Note that the transient sensitivity versus frequency curves obtained from the analysis of latency-acceleration functions and the tuning curves share common features, but are not identical. For neuron 9595/09, S and the tuning curves show a dip at 18 kHz (cf. Figs. 6, e and f), but this dip is more pronounced in the tuning curves, particularly those obtained with longer rise times.
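The scaling between level differences and shifts along the log acceleration axis can be written out as a short sketch. It relies only on the statement that, for a fixed rise time and rise function, acceleration of peak pressure scales in proportion to peak pressure; the function names are hypothetical.

```python
import math


def level_difference_db(pressure_ratio):
    """Level difference (dB) corresponding to a ratio of peak pressures."""
    return 20.0 * math.log10(pressure_ratio)


def acceleration_shift_log_units(level_difference):
    """Shift along the log10 acceleration axis caused by a level difference (dB),
    for tones of the same rise time and rise function."""
    return level_difference / 20.0
```

A 20-dB step thus corresponds to one log unit of acceleration, i.e., to a difference in S of 1, and a 10-dB step to 0.5 log units.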
For neuron 9595/18, the estimated values of L _{min} obtained from the fits of the latency-acceleration functions varied between 8 and 9 ms and were not systematically related to frequency, whereas for neuron 9595/09, L _{min} varied between ∼4.5 and 7 ms with frequency, and its course approximately paralleled the tuning curve, with L _{min} being shortest at 14 and 26 kHz (not shown).
EFFECTS OF STIMULUS FREQUENCY ON LATENCY-ACCELERATION FUNCTIONS ACROSS NEURONS.
For a comparison of latency-acceleration functions among different neurons, only measures obtained at CF were considered. Figure 7 provides a scatterplot of L _{min}, as obtained from the fits, against frequency. In different neurons, L _{min} varied between 5.6 and 37 ms, with most values between 9 and 15 ms. On average, L _{min} decreased with increasing CF. This decrease is obvious for the shortest L _{min}, and a similar trend for the entire data set emerged from a regression analysis. L _{min} was closely correlated with the shortest measured latency (r ^{2} = 0.894), but on average was 1.8 ms shorter.
Figure 8 shows a scatterplot of S obtained from the fits over frequency. The least reliable S estimates are shown by open squares. The degree of reliability of S was quantified by the increase in the sum of the weighted least-squared deviations, when S was arbitrarily incremented by 1 after the best fit had been obtained. This increase could be as small as twofold, indicating low reliability for the obtained value of S, and as high as 1,200-fold, with a mean of 60-fold. For the open squares, the increase was <15-fold. The distribution of S, particularly that of the most reliable measures (solid squares), is similar to the cat's compound action potential audiogram. The audiogram shows highest sensitivity at ∼10 kHz, and a steeper roll-off for higher than for lower frequencies (see Rajan et al. 1991 for illustrations of audiograms measured under different stimulus conditions). At most frequencies the vertical scatter in the data points of Fig. 8 is in the range of only 0.5, equivalent to 10 dB. Because there may have been differences in hearing sensitivity among the six cats that contributed data to this figure, differences in the sensitivities of the two ears in a given cat, and imprecisions in CF determination (cf. Fig. 6), it is conceivable that some, if not all, of this vertical scatter may be noise due to these factors.
ASPECTS OF LINEAR RISE FUNCTION TONES.
With linear rise functions, the rate of change of peak pressure during the rise time is constant (Fig. 9) and, for a given rise time, its magnitude is directly proportional to the plateau peak pressure achieved at the end of the rise time, and, for a given plateau peak pressure, is inversely proportional to rise time. Thus the first derivative of the stimulus envelope has the shape of a rectangle, with its vertical axis proportional to rate of change of peak pressure (expressed in Pa/s) and its horizontal axis equivalent to the rise time (Fig. 9, middle). Signals shaped with linear rise functions can be grouped to match either in rise time (Fig. 9, top), in plateau peak pressure (middle), or in the rate of change of peak pressure (bottom). Acceleration of peak pressure occurs at the beginning of the rise time and deceleration occurs at the end of the rise time. Mathematically, acceleration and deceleration are instantaneous and their magnitudes are infinite.
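The properties of the linear rise function described above can be illustrated with a small sketch. The function names, the units chosen for the arguments, and the finite-difference step are assumptions made for illustration only.

```python
def linear_rise_envelope(plateau_pa, rise_time_ms, t_ms):
    """Peak pressure (Pa) of a linear rise function envelope at time t_ms."""
    if t_ms <= 0.0:
        return 0.0
    if t_ms >= rise_time_ms:
        return plateau_pa
    return plateau_pa * t_ms / rise_time_ms


def rate_of_change_pa_per_s(plateau_pa, rise_time_ms):
    """Constant rate of change of peak pressure (Pa/s) during the rise time."""
    return plateau_pa / (rise_time_ms / 1000.0)


def numerical_rate_pa_per_s(plateau_pa, rise_time_ms, t_ms, dt_ms=0.001):
    """Finite-difference estimate of the envelope's first derivative (Pa/s)."""
    p0 = linear_rise_envelope(plateau_pa, rise_time_ms, t_ms)
    p1 = linear_rise_envelope(plateau_pa, rise_time_ms, t_ms + dt_ms)
    return (p1 - p0) / (dt_ms / 1000.0)
```

The derivative is constant within the rise time and zero on the plateau, which is the rectangular shape referred to in the text. Doubling both plateau peak pressure and rise time leaves the rate of change unaltered, so such tones fall into the same rate-matched group.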
MEAN FIRST-SPIKE TIMING TO LINEAR RISE FUNCTION TONES.
Figure 10, a and b, shows the mean first-spike latencies of neuron 9595/04 to linear rise function tones. It is the same neuron for which latencies obtained with cosine-squared rise function tones were illustrated in Fig. 2, a–c. Figure 10 a illustrates that for each rise time, latency declines nonlinearly with plateau peak pressure. For tones of a given plateau peak pressure, latency increases with rise time. As was the case with cosine-squared rise function tones, the curves appear to converge on a single minimum and in response to some tones of long rise times the neuron discharges long before the plateau peak pressure is reached.
Figure 10 b shows mean first spike latencies plotted over the rate of change of peak pressure during the rise time. Note that all five functions are now in very close register, i.e., for any given rate of change of peak pressure, mean latencies are within <1 SD of each other.
A second example (neuron 9587/13) is illustrated in Fig. 10, c and d. This neuron had a CF similar to that of neuron 9595/04 (viz., 20.5 kHz), but was stimulated binaurally and had a much more restricted range of latencies (∼12–18 ms).
In all 39 neurons studied with linear rise function tones, tone bursts characterized by the same rate of change of peak pressure during the rise times elicited a response from a given neuron with the same first-spike latency, i.e., within 1 SD of the mean, irrespective of differences in rise time or plateau peak pressure. Tones of identical rate of change of peak pressure that differ in rise times by a factor of 100 differ in plateau peak pressure by the same factor, i.e., by 40 dB. The finding that tones with rise times of 1–100 ms and possibly beyond those limits initiate spikes with the same latency, provided they have identical rate of change of peak pressure, suggests that the latency of the first spike must be determined very early during the rise time, viz., within <1 ms after stimulus onset.
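The 40-dB figure follows directly from the definition of the linear rise function, as the following sketch shows. The function names are hypothetical, and the rate of change passed in is an arbitrary illustration value that cancels in the ratio.

```python
import math


def plateau_pressure_pa(rcpp_pa_per_s, rise_time_ms):
    """Plateau peak pressure (Pa) reached by a linear rise of given rate and duration."""
    return rcpp_pa_per_s * rise_time_ms / 1000.0


def level_gap_db(rcpp_pa_per_s, short_rise_ms, long_rise_ms):
    """Difference in plateau level (dB) between two tones sharing one rate of change."""
    p_short = plateau_pressure_pa(rcpp_pa_per_s, short_rise_ms)
    p_long = plateau_pressure_pa(rcpp_pa_per_s, long_rise_ms)
    return 20.0 * math.log10(p_long / p_short)
```

For rise times of 1 and 100 ms, `level_gap_db` returns 40 dB regardless of the common rate of change.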
POST HOC ANALYSIS OF PREVIOUSLY PUBLISHED LATENCY DATA.
There has been one previous report on the effect of varying rise time and level of linear rise function tones on the responses of AI neurons (Phillips 1988). In the following, I present a post hoc analysis of latency data published in that paper, because they showed a behavior that is markedly different from that of all neurons in my sample. Figure 11, left, replots latency of one of the three units (viz., RT206) for which Phillips has presented data, and Fig. 11 a does so in the published and conventional form, viz., as a function of plateau peak pressure or tone level (in dB SPL). As noted by Phillips (1988), for each rise time latency declines with increasing level toward asymptotic values, but the functions do not converge on a single minimum.
In Fig. 11 b, some of these same data are replotted as functions of rise time for tones of specified plateau peak pressure. Note that for every plateau peak pressure, latency increases roughly linearly with rise time. The slopes (as well as the Yintercepts) decline systematically with increasing level, but unlike those of the neurons in our sample (see Heil and Irvine 1996b), all slopes are ≥1 (for comparison, unity slope is illustrated by the dashed line in Fig. 11 b). Linear regression analysis revealed slopes of 2.94 ± 0.05, 1.91 ± 0.02, 1.54 ± 0.04, 1.46 ± 0.05, and 1.05 ± 0.03 for plateau peak pressures equivalent to 22, 34, 46, 58, and 70 dB SPL, respectively. In other words, the differences in response latencies to tone bursts of the same plateau peak pressure are larger than the differences in rise time. In Fig. 11 c, the latency data of Fig. 11 a are plotted against the rate of change of peak pressure during the rise time. Note that the functions obtained with different rise times are not in register, quite unlike the behavior of all neurons in our sample (cf. Fig. 10, b and d). Instead, latency for tones of the same rate of change of peak pressure still increases systematically with rise time. This is more clearly illustrated in Fig. 11 d, where latency is plotted as a function of rise time, and where each function represents latencies obtained from tone bursts characterized by the same rate of change of peak pressure.
Several points are noteworthy here. First, for any given rate of change of peak pressure latency increases with rise time, and thus increases with the plateau peak pressure of the tone bursts (cf. Fig. 9, bottom left), a result that at first glance may seem paradoxical or at least counterintuitive. Second, all functions can be approximated by linear functions, but their slopes do not vary with rate of change of peak pressure. In fact, the slopes relating latency to rise time were 1.01 ± 0.20, 1.06 ± 0.14, 1.03 ± 0.13, 1.02 ± 0.02, and 0.95 ± 0.06 for the five rates of change of peak pressure in ascending order, i.e., they are very close to, and not significantly different from, 1.
The only reasonable interpretation of this result is that the spikes are in fact triggered at or by the end of the rise time. This point in time is characterized by the quasi-instantaneous deceleration of peak pressure. Figure 11 e therefore plots the response latency corrected for the rise time as a function of the rate of change of peak pressure. Now the five functions are in close register, and the corrected latency decreases nonlinearly with this parameter. The same results were obtained for the other two neurons for which Phillips (1988) has published latency data, and are illustrated for RT209 in Fig. 11, f–j.
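The logic of the correction can be sketched as follows. The data in this example are hypothetical and chosen only to mimic a unit whose spikes are triggered a fixed interval after the end of the rise time; they are not values from Phillips (1988). The function names are likewise illustrative.

```python
def corrected_latency_ms(latency_ms, rise_time_ms):
    """Latency measured from the end of the rise time instead of stimulus onset."""
    return latency_ms - rise_time_ms


def regression_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical illustration: tones of one rate of change of peak pressure,
# with spikes triggered 12 ms after the end of each rise time.
HYPOTHETICAL_RISE_TIMES_MS = [5.0, 10.0, 20.0, 40.0]
HYPOTHETICAL_LATENCIES_MS = [r + 12.0 for r in HYPOTHETICAL_RISE_TIMES_MS]
```

For such a unit the slope of latency on rise time is 1, and subtracting the rise time removes the dependence on rise time altogether, which is the pattern reported above for the replotted data.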
COMPARISON OF LATENCY–RATE OF CHANGE OF PEAK PRESSURE FUNCTIONS AMONG DIFFERENT NEURONS.
As was the case for latency-acceleration functions obtained with cosine-squared rise function tones, the latency–rate of change of peak pressure functions of different neurons obtained with linear rise function tones could be brought into very close register by allowing shifts along the ordinate and the abscissa (not shown). The common form of these latency–rate of change of peak pressure functions, which differed from that of the latency-acceleration functions, and the shifts along the coordinates were found with fitting procedures analogous to those described above for cosine-squared rise functions and using the same type of formula (Eq. 12).
COMPARISON OF LATENCY WITH LINEAR AND WITH COSINE-SQUARED RISE FUNCTION TONES.
Thirty neurons were studied with both linear and cosine-squared rise function tones of the same frequency and can therefore be used for a direct comparison of the relevant features of latency functions. Figure 12 a shows a scatterplot of the estimated minimum latencies obtained with linear and with cosine-squared rise function tones. As expected, the two estimates are nearly identical. Note that they lie close to the line of unity slope. A linear regression analysis yielded a slope of 0.87 with r ^{2} = 0.973. Exclusion of only the rightmost point increases the slope to 0.94.
Figure 12 b shows a scatterplot of the corresponding estimated S values obtained from the fits. Again, the estimates are nearly identical. A linear regression analysis yielded a slope of 0.91 with r ^{2} = 0.898.
Because the estimates of minimum latency and transient sensitivity are basically independent of the rise function, it is easy to derive the characteristics of linear and of cosine-squared rise function tones that would ideally yield a response from a given neuron with the same first-spike latency. With the formulas used here to describe the neuronal latency functions (Eq. 8 and 12), this is the case when
With Ã _{CRF} = 12,791 ms and Ã _{LRF} = 1,277 ms, it follows
Thus, for a given neuron and for tones of the same frequency, the isolatency conditions are described by a linear relationship between the logarithm of the rate of change of peak pressure of linear rise function tones and the logarithm of the maximum acceleration of peak pressure of cosine-squared rise function tones. The ordinate intercept is proportional to the neuron's S. In other words, to yield the same latency from a neuron as a cosine-squared rise function tone with a given acceleration of peak pressure, a linear rise function tone with only a low rate of change of peak pressure is required when the neuron's transient sensitivity is high, whereas a higher rate of change of peak pressure is required when the neuron's transient sensitivity is low.
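The isolatency condition can be sketched explicitly. The derivation below assumes that Eq. 8 and 12 share a power-law form with an exponent of −4. This exponent is an assumption and is not stated in this section. It is adopted here because the fourth root of the ratio of the two scaling factors, (12,791/1,277)^{1/4} ≈ 1.78, matches the slope of 1.78 quoted for Fig. 18 b.

```latex
% Assumed common form of Eq. 8 and Eq. 12 (the exponent -4 is an assumption)
L_{CRF} = L_{min} + \tilde{A}_{CRF}\,(\log APP_{max} + S)^{-4}, \qquad
L_{LRF} = L_{min} + \tilde{A}_{LRF}\,(\log RCPP + S)^{-4}

% Equal latency for the same neuron (same L_{min} and S)
\tilde{A}_{CRF}\,(\log APP_{max} + S)^{-4} = \tilde{A}_{LRF}\,(\log RCPP + S)^{-4}

% Solving for the linear rise function tone
\log RCPP + S = \left(\frac{\tilde{A}_{LRF}}{\tilde{A}_{CRF}}\right)^{1/4} (\log APP_{max} + S)
              \approx 0.562\,(\log APP_{max} + S)

\log RCPP \approx 0.562\,\log APP_{max} - 0.438\,S
```

Under these assumptions the relationship between log RCPP and log APP_max is linear, and its intercept (−0.438 S) is proportional to S and becomes more negative as transient sensitivity increases, in line with the verbal description above.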
SD of first-spike timing
The data presented so far have been based on the mean first-spike latency derived from up to 20 individual measures of latency on consecutive stimulus repetitions. However, the timing of the first spike varied from trial to trial. In accordance with previous studies (e.g., Aitkin et al. 1970; Brugge et al. 1969; Kitzes et al. 1978; Phillips and Hall 1990; Phillips et al. 1989), the SD of the first-spike latency around the mean will be used here as a measure of this variability.
COSINE-SQUARED RISE FUNCTIONS.
The finding that with cosine-squared rise function tones a neuron's mean first-spike latency is a function of the maximum acceleration of peak pressure suggests the possibility that the SD of the first-spike latency may also be a function of this parameter.
Figures 13 and 14 present data on SD and its relationship with maximum acceleration of peak pressure for three neurons. Figures 13, a and d, and 14a show the now familiar finding that mean first-spike latency is a unique function of maximum acceleration of peak pressure. Figures 13, b and e, and 14b show the corresponding SDs of first-spike latency, also plotted against this parameter. Several observations are important.
First, SD is also inversely related to maximum acceleration of peak pressure, and consequently increases with mean latency (Figs. 13, c and f, and 14c).
Second, the SD of first-spike latency is also an unambiguous function of maximum acceleration of peak pressure, irrespective of the rise time or of the plateau peak pressure, and also approaches some asymptotic value. SD is more variable than mean latency when the two parameters are compared for stimuli of identical maximum acceleration of peak pressure. Therefore SD-acceleration functions were generally noisier than the mean latency-acceleration functions. As illustrated in Fig. 14 b, the near-threshold effect, described above for mean latency, could be quite pronounced for SD.
Third, the shapes of the SD-acceleration functions are distinctly different from the shapes of the mean latency-acceleration functions. In particular, the decline in SD with acceleration is relatively steeper than the decline of mean latency for low magnitudes and relatively shallower for high magnitudes of maximum acceleration of peak pressure.
Such differences in function shape are inconsistent with a linear relationship between SD and mean first-spike latency as proposed by Phillips and Hall (1990). Instead, the shape differences suggest that SD may be proportional to the slope of the latency-acceleration function. Such a relationship would result from jitter in the effective acceleration of peak pressure, viz., in the term (log APP _{max} + S).
Let us therefore assume that
This prediction was tested as follows. For each set of data, best fits to the SD-acceleration functions and to the SD–mean latency plots were found by applying Eq. 15 and 16, respectively. SD_{min} and c _{CRF} were allowed to vary, whereas S and L _{min} were adopted from the best solution of Eq. 8 found for the same data set, as reported above.
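The two fitted relationships can be sketched from the proportionality assumption. The expressions below again assume that Eq. 8 is a power law with an exponent of −4, which is not stated in this section, so they should be read as an illustration of the reasoning rather than as the published equations.

```latex
% SD exceeds its minimum in proportion to the slope of the latency function
% (assumed form of Eq. 8: L_{CRF} = L_{min} + \tilde{A}_{CRF}(\log APP_{max} + S)^{-4})
SD_{CRF} = SD_{min} + c_{CRF}\,\frac{dL_{CRF}}{d(\log APP_{max})}
         = SD_{min} - 4\,c_{CRF}\,\tilde{A}_{CRF}\,(\log APP_{max} + S)^{-5}

% Eliminating (\log APP_{max} + S) by means of
% L_{CRF} - L_{min} = \tilde{A}_{CRF}\,(\log APP_{max} + S)^{-4}
SD_{CRF} = SD_{min} - 4\,c_{CRF}\,\tilde{A}_{CRF}^{-1/4}\,(L_{CRF} - L_{min})^{5/4}
```

Because the slope of the latency function is negative, a negative proportionality coefficient yields an SD that exceeds SD_min and grows faster than linearly with mean latency.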
The best solutions obtained for the variation of the SD of first-spike latency with maximum acceleration of peak pressure are plotted in Figs. 13, b and e, and 14b by solid lines without symbols. The fact that these lines are difficult to discern in the graphs actually emphasizes the high quality of the fits to the data. For the fit of the data of neuron 9598/14, shown in Fig. 14 b, the seven near-threshold points were discarded. This was the only neuron in which the inclusion of the near-threshold points severely impaired the quality of the fit.
The exponential functions describing the relationship between SD and mean latency (Eq. 16), which emerged from the same fitting procedure, are also plotted by solid lines in Figs. 13, c and f, and 14c. Note their good approximation of the data.
For comparison, the dashed lines in these charts represent the best linear fits to the data. In some cases linear fits would systematically underestimate the SDs at short mean latencies and systematically overestimate them at longer mean latencies (e.g., neuron 9598/11, Fig. 13 c). Nevertheless, and in agreement with previous findings (Phillips and Hall 1990), linear fits did provide good descriptions of the relationships between SD and mean latency: values of r ^{2} were as high as 0.969 with a weighted average of 0.610. However, the nonlinear fits proposed here (Eq. 16) were in some cases markedly better than linear fits (up to 40%). Averaged over the entire data sample, the nonlinear functions provided a fit that was ∼2% better than the linear functions. In a small number of data sets, SD appeared to be independent of maximum acceleration of peak pressure, and thus also independent of mean first-spike latency. This was the case with 10 linear and 8 nonlinear fits. These data sets were all among those with the smallest numbers of first spikes.
Figures 15 and 16 show the population data for the proportionality coefficient and the estimated minimum SD. In Fig. 15, the c _{CRF} of Eq. 15 and 16 is plotted against the number of first spikes that contributed to the fit. This coefficient shows a unimodal distribution, which narrows rapidly with increasing number of first spikes and converges on the weighted average of c _{CRF} = −0.102. This observation is reminiscent of the one made above for the scaling factor A (Fig. 4 a), used in the description of the latency–acceleration functions. Thus the proportionality coefficient c _{CRF} between SD and slope of the latency-acceleration function may in fact be very similar across the neuronal pool and the range of stimulus conditions used here. For a description of the average relationship between SD and acceleration of peak pressure, Eq. 15 may therefore be written as
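The averaged relationship can be cross-checked numerically. The sketch below assumes the power-law form of Eq. 8 with an exponent of −4 (an assumption, not stated in this section), for which the coefficient linking SD − SD_min to (L − L_min)^{5/4} is −4 c Ã^{−1/4}. Function names are hypothetical; c _{CRF} and Ã _{CRF} are the weighted averages given in the text.

```python
C_CRF = -0.102    # weighted average proportionality coefficient (from the text)
A_CRF = 12791.0   # ms, average scaling factor for cosine-squared tones (from the text)
N = 4             # assumed magnitude of the exponent in Eq. 8


def sd_latency_coefficient(c, a, n=N):
    """Coefficient relating (SD - SD_min) to (L - L_min) ** ((n + 1) / n)."""
    return -n * c * a ** (-1.0 / n)


def sd_excess_ms(latency_excess_ms, c=C_CRF, a=A_CRF, n=N):
    """SD above its minimum (ms) for a mean latency above L_min (ms)."""
    return sd_latency_coefficient(c, a, n) * latency_excess_ms ** ((n + 1.0) / n)
```

With the averages given in the text, `sd_latency_coefficient(C_CRF, A_CRF)` evaluates to about 0.038, which rounds to the value of 0.04 quoted later for Eq. 18 and thus supports the assumed exponent.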
LINEAR RISE FUNCTION TONES.
Comparable results were obtained with linear rise function tones. Data from one neuron (9592/02) are illustrated in Fig. 17. With linear rise function tones, SD of first-spike latency is an inverse function of the rate of change of peak pressure (Fig. 17 b), just as for mean latency (Fig. 17 a). The systematic differences in the shapes of the functions relating mean latency and SD to rate of change of peak pressure again suggest that SD may be proportional to the slope of the function relating mean latency to rate of change of peak pressure.
On average then, Eq. 20 can be written as
COMPARISON OF COSINE-SQUARED AND LINEAR RISE FUNCTION TONES.
Figure 18 a shows a scatterplot of the estimated minimum SDs obtained with linear and cosine-squared rise function tones. Both stimuli yielded very similar estimates, the points lying close to the line of unity slope. A linear regression analysis yielded a slope of 0.75 with r ^{2} = 0.912. Exclusion of only the rightmost point increased the slope to 0.98 with r ^{2} = 0.938. In Fig. 19, the estimated minimum SD obtained with linear rise function tones is plotted against the estimated minimum first-spike latency (open circles). This plot also suggests a nonlinear relationship between the two estimates. To emphasize the similarity with the results obtained with cosine-squared rise functions, the data of Fig. 16 are retained in Fig. 19 (solid squares).
Figure 18 b provides a scatterplot of the proportionality coefficients between SD of first-spike latency and the slopes of the functions relating mean latency to rate of change and to acceleration of peak pressure, obtained with linear and with cosine-squared rise function tones, respectively. The two estimates are also correlated, although more loosely than those of minimum SD. A regression analysis yielded a slope of 1.21 ± 0.13 with r ^{2} = 0.761. Exclusion of the rightmost data point decreased the slope to 1.09 ± 0.17. The slope is not significantly different from 1 (dashed line). With one exception, all data points are below the line with a slope of 1.78 (dot-dashed line). This slope would have been expected if there were a fixed exponential relationship between SD and mean latency, irrespective of the rise function. If this were the case, the coefficients of Eq. 18 (i.e., 0.04) and 23 (i.e., 0.08) should have been identical. Rather, comparison of these equations, which provide average descriptions of the data, reveals that for any given mean latency (L _{LRF} = L _{CRF}) the difference between the corresponding SD and the minimum SD is twice as high for linear as for cosine-squared rise function tones.
These observations, as summarized by Eq. 24, are also difficult to reconcile with the assumption of a linear relationship between SD and mean first-spike latency. However, they are compatible with the suggestion made here that the SD may originate from jitter in the effective acceleration (i.e., log APP _{max} + S) or the effective rate of change of peak pressure (i.e., log RCPP + S). It follows from Eq. 8 and 12 that for a given neuron (same S) and for any two linear and cosine-squared rise function tones that yield the same mean latency, the SD produced by the same jitter in S (say ±0.1) will be about twice as high for linear as for cosine-squared rise functions.
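The jitter argument can be illustrated with a short sketch. It assumes that Eq. 8 and 12 are power laws with a common exponent of −4 (an assumption, not stated in this section) and uses the average scaling factors given in the text; the function names and the test latencies are hypothetical.

```python
A_CRF = 12791.0  # ms, cosine-squared rise function tones (from the text)
A_LRF = 1277.0   # ms, linear rise function tones (from the text)
N = 4            # assumed magnitude of the exponent in Eq. 8 and 12


def effective_drive(latency_excess_ms, a):
    """Value of (log X + S) that yields the given latency above L_min."""
    return (a / latency_excess_ms) ** (1.0 / N)


def latency_spread_ms(latency_excess_ms, a, jitter=0.1):
    """Latency spread (ms) when the effective drive jitters by +/- jitter."""
    x = effective_drive(latency_excess_ms, a)
    return a * (x - jitter) ** -N - a * (x + jitter) ** -N


def spread_ratio(latency_excess_ms, jitter=0.1):
    """Ratio of linear to cosine-squared latency spread at equal mean latency."""
    return (latency_spread_ms(latency_excess_ms, A_LRF, jitter)
            / latency_spread_ms(latency_excess_ms, A_CRF, jitter))
```

Under these assumptions the ratio is close to 1.8 for latencies 1 or 5 ms above L_min, i.e., of the order of the factor of about two stated above.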
DISCUSSION
The present paper demonstrates that first-spike latency of auditory cortical neurons is an unambiguous function of acceleration of peak pressure at tone onset for cosine-squared rise function tones, and of the (constant) rate of change of peak pressure for linear rise function tones. With linear rise functions, acceleration of peak pressure is mathematically instantaneous and infinite in amplitude. However, the acoustic signal is transformed into a receptor potential, and, as judged from intracellular recordings of inner hair cells (e.g., Russell and Sellick 1983), the rise of the DC receptor potential to high-frequency tones shaped with linear rise functions is no longer precisely linear, but rather somewhat curvilinear. Consequently, the rate of rise of the DC receptor potential is no longer constant, and its acceleration is no longer instantaneous and infinite. Acceleration of the DC receptor potential may rather be a rapidly decaying function of time (in principle similar to the course of acceleration of peak pressure in cosine-squared rise function tones; Fig. 1). Thus, with linear rise function tones, latency may also be a function of the (transformed) acceleration at tone onset, a view also favored by the finding that latency to such tones could be determined very early during the rise time (i.e., within <1 ms). For both cosine-squared and linear rise functions, acceleration is maximal at the beginning of the rise time. The present study therefore cannot resolve the question of whether it is the magnitude of the initial or the maximum acceleration occurring during the rise time that is the critical value.
The initial time course of the peak pressure in signals with a common maximum acceleration or a common rate of change of peak pressure is very similar or identical (see Figs. 1 and 9, bottom left), so that it may be argued that such signals could reach some (very low) firing threshold peak pressure at nearly the same time and therefore lead to a response with the same latency. However, careful analysis of latency of responses to linear rise function tones of different rise time and plateau peak pressure showed that the change in a neuron's latency with alterations of rate of change of peak pressure is incompatible with such a firing threshold interpretation of latency and the opposite of what would be expected if adaptive processes were to prolong the time necessary to reach the presumed threshold for long rise times or low rates of change of peak pressure (Heil and Irvine 1996b). Furthermore, as is shown in the companion paper, the first spike can be triggered at very different signal amplitudes, even for signals that share the same acceleration or, with linear rise functions, the same rate of change of peak pressure.
Comparison with previous studies
Acceleration of peak pressure has previously not been recognized as a relevant parameter of acoustic signals. But this parameter has been varied, almost certainly without the experimenters' awareness, in a huge number of studies, e.g., in all those in which stimulus manipulations affected the SPL at the eardrum, whereas rise time and rise function were kept constant. In the context of the recent proposal of a temporal code for sound location by cortical neurons (Middlebrooks et al. 1994), it is worth emphasizing that the SPL is also affected by changes in a sound source's position in threedimensional space (azimuth, elevation, and distance) given the frequencydependent shadowing effect of the head and the frequencydependent pressure transformations performed by the pinna (for review see Carlile 1996). The neglect of acceleration in onsets has likely been due to the focus of attention on features characteristic of the steadystate parts of signals, such as the SPL or, in binaural studies, interaural intensity differences. Thus, in previous studies, latency was usually plotted as a function of SPL, and was found to be inversely related to it (e.g., Aitkin et al. 1970; Brugge et al. 1969; Hind et al. 1963; Kitzes et al. 1978; Phillips 1985, 1988; Phillips and Hall 1990; Phillips et al. 1989). Rise functions of artificial auditory signals have mainly been introduced with the purpose of reducing spectral splatter at signal onsets. However, even in studies specifically designed to investigate the effects of the shape of tone onsets on neuronal responses by altering rise times, neuronal response properties were plotted with respect to SPL (Phillips 1988; Phillips et al. 1995).
For all AI neurons sampled in the present study, latency was a function of the positive acceleration occurring at the beginning of the rise time. However, the post hoc analysis of data published previously by Phillips (1988) showed that for these neurons the first spike is triggered at the end of the rise time, i.e., at the instant of deceleration of peak pressure (see Fig. 11). A consequence of this behavior is that when signals are grouped according to the same rate of change of peak pressure during the rise time, latency increases with rise time, and thus increases with tone level, a phenomenon that at first glance seems paradoxical. Phillips (1988) did not measure firstspike latency, but instead latency was defined as the interval between stimulus onset and the peak bin of the poststimulus time histogram. Although this methodological difference is highly unlikely to account for the observed differences in response latency between the present data and those of Phillips, there may be further, undetected, methodological differences between the two studies. On the other hand, it cannot be excluded that there might be two classes of cells in auditory cortex, in one of which the latencies are a function of acceleration and in the other a function of deceleration of peak pressure. The latter neurons might be either very rare, or located in areas that were not surveyed in the present study, although we probably did study a representative sample of AI neurons, as judged by their distributions of CF and minimum latency, their frequency tuning and binaural characteristics (see Data base), and the shapes of their spike count functions (see companion paper).
SD of firstspike latency was shown here also to be a function of maximum acceleration or, for linear rise function tones, of rate of change of peak pressure. In previous studies, SD was plotted as a function of tone level (e.g., Aitkin et al. 1970; Brugge et al. 1969; Kitzes et al. 1978; Phillips 1985, 1988; Phillips and Hall 1990; Phillips et al. 1989), and in two studies as a function of mean firstspike latency (e.g., Phillips and Hall 1990; Phillips et al. 1989). Although Phillips and Hall proposed a linear relationship between SD and mean latency, several observations in the present study are incompatible with a linear relationship. The different growth rates of SD with mean latency for cosinesquared and linear rise function tones (Fig. 20) and the finding that SD declined more rapidly than latency with acceleration (or rate of change) of peak pressure for low values of this parameter and less rapidly for higher values are difficult to reconcile with a linear relationship. Careful inspection of previous publications (e.g., Aitkin et al. 1970; Brugge et al. 1969; Kitzes et al. 1978; Phillips and Hall 1990) reveals that the latter result was also obtained by these authors. The nature of the differences in the shapes of the functions relating mean latency and SD to maximum acceleration (or rate of change) of peak pressure suggested that SD is proportional to the slope of the functions relating mean latency to acceleration (or rate of change) of peak pressure, and this relationship described the data somewhat better than a linear one. Jitter in the effective acceleration of peak pressure, i.e., in the term (log APP _{max} + S) of Eq. 8 or, for linear rise function tones, jitter in the effective rate of change of peak pressure, i.e., in the term (log RCPP + S) of Eq. 12, would cause or closely approach such a relationship. 
Thus jitter in the neuronal S or jitter in the way in which a given acceleration or rate of change of peak pressure is represented in the motion of the basilar membrane, i.e., jitter in peripheral mechanics and cochlear amplifiers, might underlie the SD of firstspike timing. The finding that the proportionality coefficient between SD and the slope of the latency functions is so similar across the neuronal population, and similar for the two rise functions, also favors this notion.
I have also assumed a minimum SD, thought to reflect the jitter in the same processes that underlie the minimum latency, such as cochlear travel time, axonal travel times, and synaptic factors. Whereas these delays simply add up to yield a minimum latency, this will not be the case for the minimum SD. For example, at every synapse along the pathway the variability of firstspike timing in the afferent axon(s) would be expected to increase (or decrease, see, e.g., Joris et al. 1994) by some factor in the postsynaptic neuron. Thus a nonlinear relationship between minimum SD and minimum latency should be expected, in line with the finding of the nonlinear growth of the minimum SD with the minimum mean latency (Figs. 16 and 19).
Other factors influencing first-spike latency
Although first-spike latency is a function of the acceleration/rate of change of peak pressure at tone onset, this is not to say that a particular acceleration/rate of change in a signal will under all circumstances evoke a response from a neuron with the same latency. The near-threshold effect, as described in this paper (e.g., Fig. 14), is a case in point. The laterality of stimulus presentation is another factor. Laterality does not appear to influence the shape of latency–acceleration/rate of change functions, but it affects the functions' position along the ordinate, i.e., it affects the minimum latency. In the few neurons excited by stimulation of either ear, studied here, minimum latency was systematically longer by 2–3 ms for ipsilateral stimulation than for contralateral or binaural stimulation. This could reflect one or two additional synapses, slower conduction velocities, or longer lengths of the ipsilateral pathways to these neurons.
Latency of a given neuron is also a function of frequency, generally being shortest at or near the CF for tones of the same plateau peak pressure and rise time (e.g., Aitkin et al. 1970; Brugge et al. 1969; Heil et al. 1992a; Hind et al. 1963; Kitzes et al. 1978). The present study shows in addition that the forms of the functions relating latency to acceleration are very similar for different frequencies, and that it is their position along the acceleration axis that differs systematically with frequency. Latency to a tone burst has also been shown to increase by up to 3 ms when the tone burst is preceded by a masking tone (Calford and Semple 1995). Similarly, latency also increases with the repetition rate of tone bursts (Phillips and Hall 1990; Phillips et al. 1989), a parameter that was not varied in the present study. Close inspection of the figures provided by Phillips and coworkers (Figs. 1c, 2c, and 8b in Phillips et al. 1989; Fig. 3a in Phillips and Hall 1990) reveals that the effect of repetition rate on latency seems to be a systematic displacement of the latency functions (i.e., latency-level functions) along the ordinate. This suggests that the effects of repetition rate and frequency on latency are different in origin. Finally, latency to tone bursts is prolonged when the tone bursts are presented after the onset of a long-duration broadband noise masker (Phillips 1985). Inspection of the relevant figures (Figs. 2, c and f, 6, and 7, c and f) suggests that background noise has an effect similar to that of frequency, i.e., it brings about a displacement of the latency functions along the abscissa.
Common shape of latency-acceleration functions
A comparison of latency-acceleration functions, or of latency–rate of change of peak pressure functions, among different neurons or stimulus conditions revealed that they are all of strikingly similar shape despite differences in the position and in the extent of these functions along the coordinates. The latter observation simply reflects differences in the shape of the spike count functions (see companion paper for a detailed account of these issues).
The mathematical function selected to describe the similar shape of the latency-acceleration (or rate of change of peak pressure) functions and to quantify their different positions in the coordinate system (Eqs. 8 and 12) may not adequately reflect the (unknown) nature of the physical and biochemical processes that may underlie the generation of the latency–acceleration/rate of change of peak pressure relationships. However, it provides an excellent approximation of the functions, it is simple, it quantifies the dispersion of the functions along either coordinate, and it makes the reasonable assumption that the latency of a neuron contains components that are independent of the magnitude of the stimulus. These components, such as acoustic delays, cochlear travel time (e.g., Robles et al. 1976; Ruggero and Rich 1987), axonal travel times, and possibly some synaptic components, simply add up and yield a minimum latency.
The estimated minimum latencies varied considerably for different neurons (Fig. 7), a finding readily expected in light of the diversity of pathways over which they may derive their inputs, even at CF. Furthermore, the shortest estimated minimum latencies decreased with increasing CF. Such a trend would have to be expected if cochlear travel time is one of the components contributing to the estimated minimum latency. A similar decrease in latency to CF tones with increasing CF was also observed in other nuclei of the auditory pathway (e.g., Heil and Scheich 1991; Heil et al. 1995; Kitzes et al. 1978; Langner et al. 1987), although latency was measured either at some fixed tone level or at some level above firing threshold (30–60 dB in different studies).
The differences in the horizontal positions of the latency–acceleration/rate of change of peak pressure functions reflect differences in sensitivity to acceleration or rate of change of peak pressure (transient sensitivity). It is worth reemphasizing here that this measure is not equivalent to the firing threshold of the neuron (Fig. 5), although for a given neuron the transient sensitivity-frequency filter functions and the conventional tuning curves share some characteristics (Fig. 6). Neurons with the same transient sensitivity, reflected in a common position of their latency–acceleration/rate of change of peak pressure functions along the abscissa, can differ in firing threshold by some 100 dB (Fig. 5). Moreover, in response to tones of the same frequency, the measure of transient sensitivity is unambiguous, i.e., it is a single value, whereas firing threshold in dB SPL can increase with rise time (see Fig. 6, b and e) (Phillips 1988; but see companion paper). The distribution of transient sensitivities at CF for different neurons is also a function of frequency, and appears to be grossly similar to the cat's audiogram. For comparison, the reader is referred to a study by Rajan et al. (1991) that summarizes a range of N_{1} audiograms measured under various experimental conditions in barbiturate-anesthetized cats. Although N_{1} thresholds depend critically on stimulus paradigms, such as rise time, and other conditions, audiogram shapes are fairly similar, particularly for frequencies >3 kHz.
Possible origin of the latency–acceleration/rate of change of peak pressure relationship
The finding of nearly identical shapes of latency–acceleration/rate of change of peak pressure functions among different neurons and different stimulus conditions is surprising given the enormous degree of convergence and divergence of connections at nuclei peripheral to the cortex, as well as within the cortex itself, and given that cortical cells are likely to differ widely in the number of serial synapses in their afferent pathways. However, the findings that the shortest estimated minimum latencies decrease with increasing CF, that the distribution of estimated transient sensitivities grossly parallels the cat's audiogram, and that the relationship between SD and firstspike latency could possibly be accounted for by jitter in peripheral mechanics, suggest that the common relationship between latency and acceleration/rate of change of peak pressure may have its origin in the peripheral auditory system. The latencies of basilar membrane vibration and of the receptor potential of inner hair cells appear to be independent of the amplitude of acoustic stimuli, such as clicks (Robles et al. 1976), whereas in response to similar stimuli the latencies of auditory nerve fibers are a sensitive function of amplitude (e.g., Pfeiffer and Kim 1972). If these findings can be extrapolated to stimuli with longer rise times, it then appears that the relationship between latency and acceleration/rate of change of peak pressure originates in the synapses between inner hair cells and afferent fibers. This proposal requires that the same relationship between latency and acceleration as seen in cortical cells must exist in the auditory nerve.
Functional implications
Although a particular acceleration or rate of change of peak pressure at tone onset is transformed into a particular neuronal latency in a smooth analog fashion, the brain has no means of measuring the latency. However, one way for the brain to derive useful information through latency is by means of a comparison of the timing of spikes across a neuronal population, as originally proposed by Hind et al. (1963). It has long been recognized, for example, that differences in the timing of inputs from the two ears provide an important cue for sound localization, and that the differences are extracted via appropriate delays and coincidence detection mechanisms in brain stem auditory nuclei (e.g., Goldberg and Brown 1969; Overholt et al. 1992; Yin and Chan 1990). Likewise, interaural intensity differences of otherwise identical signals will lead to interaural latency differences from the two ears, and thus interaural intensity differences could be processed in a way similar to neural time differences (“latency hypothesis”; Jeffress 1948). However, as suggested by the present study, such latency differences would not be brought about by the intensity differences per se, but rather by the associated differences in acceleration or rate of change of peak pressure, urging a reinterpretation of observations made on time-intensity trading (see, e.g., Irvine et al. 1995).
The target-range sensitivities of neurons in the auditory system of echolocating bats are also thought to be mediated by coincidence detection (e.g., Sullivan 1986). These neurons are best excited by a particular delay between a component of the pulse emitted by the bat and a component of the returning echo (e.g., O'Neill and Suga 1982). Interestingly, these neurons are tuned to combinations of an echo component with the highest amplitude preceded by the pulse component with the lowest amplitude, and thus likely with the lowest acceleration of peak pressure. This particular selection of pulse and echo components for target range computation is ideal with respect to minimizing the delay requirements of spikes triggered by the pulse so that they coincide at some comparator neuron with the spikes triggered by the echo.
The fact that latency–acceleration/rate of change of peak pressure functions of different neurons are strikingly similar in shape has some interesting consequences for the sequence of first spikes in neuronal populations. Consider a population of neurons that differ in minimum latency and in S to a stimulus of a given frequency (Fig. 21, top, neurons 1–6). Depending on the acceleration of peak pressure, each of these neurons will fire with a particular delay, and the latency functions of the different neurons can cross each other at different acceleration magnitudes. However, function crossing only occurs for neurons that differ in transient sensitivity and in minimum latency. The latency functions of neurons that share the same sensitivity, but that have different minimum latencies, are only shifted along the ordinate (neurons 1–3 in Fig. 21, solid lines). Consequently, the temporal relationships between the first spikes in such a population of neurons are constant and independent of the magnitude of acceleration of peak pressure (Fig. 21, middle). Such a temporal pattern could therefore constitute a scale-invariant representation, as recently suggested by Hopfield (1995), of this stimulus parameter. Minimum latencies seem to be laid out in orderly topographic fashions within isofrequency domains of various auditory nuclei (e.g., Heil and Scheich 1991; Heil et al. 1992a; Park and Pollak 1993; Schreiner and Langner 1988). It is therefore conceivable that the detection of such temporal patterns might be mediated through neurons receiving coincident inputs from neurons with some range of minimum latencies. Discharges from the higher-order neurons could then decode the presence of acceleration of peak pressure in a scale-invariant fashion, i.e., such neurons would signal the presence of a transient.
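The constancy of first-spike intervals among neurons that share a transient sensitivity can be sketched in a few lines. This is an illustration under stated assumptions rather than a reproduction of Fig. 21: the latency function, its parameters `a` and `k`, the three minimum latencies, and the shared sensitivity are all hypothetical values.

```python
# Hypothetical latency function, assumed only for illustration:
#   L(x) = l_min + a * (x + s) ** (-k), valid for x + s > 0,
# with x standing for the log of the maximum acceleration of peak pressure.
def latency(x, l_min, s, a=1000.0, k=4):
    return l_min + a * (x + s) ** (-k)

# Three hypothetical neurons with a common sensitivity but different
# minimum latencies (ms).
L_MINS = [10.0, 12.0, 15.0]
S_SHARED = 4.0

def first_spike_intervals(x):
    # Spike times relative to the earliest-firing neuron in the population.
    times = [latency(x, l_min, S_SHARED) for l_min in L_MINS]
    return [t - times[0] for t in times]

for x in (-2.0, 0.0, 2.0):
    print(x, [round(d, 6) for d in first_spike_intervals(x)])
```

Because the stimulus-dependent term is identical for all three neurons, it cancels in every pairwise difference, and the intervals reduce to the differences in minimum latency at every acceleration. This holds for any common shape of the stimulus-dependent term, not only for the form assumed here.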
In contrast, the intervals between the first spikes in a population of neurons with similar minimum latencies but different transient sensitivities vary systematically with the magnitude of acceleration (Fig. 21, bottom). The temporal dispersion of spikes in such a population increases in an orderly fashion with decreasing acceleration, but the order of succession of first spikes remains unaltered. Thus neurons in such a population would fire at various instants of a transient, and could possibly be used to systematically track instantaneous properties (such as peak pressure, see companion paper) of transients. Because the transient sensitivity of a given neuron is a function of frequency (Fig. 6), this proposed transient tracking would consecutively involve neurons with different CF that are laid out in an orderly tonotopic map. Such a mechanism might contribute to the instantaneous coding of transients thought to underlie the categorical perception of speech and some nonlinguistic sounds (Cutting and Rosner 1974).
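The behavior of a population with a common minimum latency but different sensitivities can be sketched in the same way. Again, the latency function and every numerical value below are assumptions made for illustration only, and the helper names are hypothetical.

```python
# Hypothetical latency function, assumed only for illustration:
#   L(x) = l_min + a * (x + s) ** (-k), valid for x + s > 0,
# with x standing for the log of the maximum acceleration of peak pressure.
def latency(x, l_min, s, a=1000.0, k=4):
    return l_min + a * (x + s) ** (-k)

L_MIN_SHARED = 10.0              # common minimum latency (ms), assumed
SENSITIVITIES = [5.0, 4.5, 4.0]  # most to least sensitive neuron, assumed

def first_spike_times(x):
    return [latency(x, L_MIN_SHARED, s) for s in SENSITIVITIES]

def dispersion(x):
    # Time between the earliest and the latest first spike in the population.
    times = first_spike_times(x)
    return max(times) - min(times)

def firing_order(x):
    # Neuron indices sorted by first-spike time.
    times = first_spike_times(x)
    return sorted(range(len(times)), key=lambda i: times[i])

for x in (2.0, 0.0, -2.0):
    print(x, round(dispersion(x), 3), firing_order(x))
```

With these values the dispersion grows as acceleration decreases while the firing order stays the same, because functions that differ only in their position along the abscissa never cross.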
Although the present study focuses on the timing of the first spike to isolated tone onsets, similar relationships between latency and acceleration/rate of change of peak pressure might hold for signals other than tone bursts and for more rapid sequences of envelope transients. The phase-locking of spikes to various amplitude-modulated signals, as observed at different levels of the auditory pathway (for references see introduction), suggests that each cycle effectively constitutes a new onset. The observation that cortical neurons have acceleration-frequency filters (Fig. 6) suggests that a given neuron will “view” a spectrally complex signal through this filter, and that the acceleration of the frequency component to which the neuron is most sensitive will determine its response latency. It is worthwhile reemphasizing that every onset, even that of a “pure tone,” is spectrally complex. These considerations also suggest that the perceived isochrony of anisochronous musical and speech sounds (Tuller and Fowler 1980; Vos and Rasch 1981) may have its origin in differences in acceleration of peak pressure, rather than in “. . . adaptation of the hearing mechanism to a certain relative stimulus level . . .” (Vos and Rasch 1981, p. 323).
Finally, the companion paper reveals that the first-spike latency of a given neuron has an unexpected consequence for the response of the neuron itself.
Acknowledgments
I am grateful to Drs. D.R.F. Irvine and R. Rajan for help with the experiments; to J. F. Cassell, M. Farrington, V. N. Park, and R. Williams for technical support; to Drs. M. B. Calford, D.R.F. Irvine, and G. K. Yates and two anonymous reviewers for comments on the manuscript; and to many colleagues for critical discussions.
This study was supported by the National Health and Medical Research Council of Australia.
Footnotes

FN1 Tucker-Davis Technologies uses a scaling factor ("fudge factor") of 0.5903 in its built-in cosine-squared rise function, so that the actual rise time is 1.69 times as long as the one specified by the experimenter.