Neurophysiological studies with animals suggest that sounds modulate activity in primary visual cortex in the presence of concurrent visual stimulation. Noninvasive neuroimaging studies in humans have similarly shown that sounds modulate activity in visual areas even in the absence of visual stimuli or visual task demands. However, the spatial and temporal limitations of these noninvasive methods prevent the determination of how rapidly sounds activate early visual cortex and what information about the sounds is relayed there. Using spatially and temporally precise measures of local synaptic activity acquired from depth electrodes in humans, we demonstrate that peripherally presented sounds evoke activity in the anterior portion of the contralateral, but not ipsilateral, calcarine sulcus within 28 ms of sound onset. These results suggest that auditory stimuli rapidly evoke spatially specific activity in visual cortex even in the absence of concurrent visual stimulation or visual task demands. This rapid auditory-evoked activation of primary visual cortex is likely to be mediated by subcortical pathways or direct cortical projections from auditory to visual areas.
- auditory localization
Sounds facilitate visual processes in natural environments: concurrent auditory stimuli lower the threshold to detect and discriminate visual stimuli (McDonald et al. 2000; Noesselt et al. 2010), increase their subjective intensity (Stormer et al. 2009), and speed perceptual and motor responses to visual targets (Brang et al. 2013; Cappe et al. 2010; Miller 1982). These benefits of sounds on visual perception are due in part to the redundancy of spatial/temporal information across modalities (for a review, see Stein and Stanford 2008). However, it remains poorly understood precisely what types of information can be relayed between the senses and the mechanisms that mediate cross-modal facilitation.
Neuroimaging research in humans indicates that sounds can modulate activity in early visual cortex (e.g., Mercier et al. 2013; Raij et al. 2010) leading to increased cortical excitability in visual areas (Romei et al. 2007). However, research on sound-evoked activation of visual cortex in humans has not systematically investigated what auditory information is relayed to primary visual cortex. Recently, McDonald and colleagues (2013) reported event-related potential (ERP) results suggesting that peripherally presented sounds activate visual cortex with some spatial selectivity, consistent with prior neurophysiological demonstrations in cats (Morrell 1972). They demonstrated a late (∼200 ms postsound onset) ERP positivity for contralateral (relative to ipsilateral) sounds that was localized to neural generators in extrastriate visual areas using dipole modeling, suggesting that sounds activate visual areas in a hemifield-specific manner. However, the poor spatial resolution of scalp-recorded ERPs does not permit a precise specification of the neural locus of this cross-modal effect, making it difficult to identify any rapid auditory-evoked activation of primary visual cortex or its spatial selectivity within a visual hemifield. In particular, if peripheral sounds elicited short-latency neural activity in anterior regions of the calcarine sulcus, which maps retinotopically to the peripheral position of the sound source, ERP recordings from the scalp would be relatively insensitive to the electrical activity generated from this region of cortex because it lies several centimeters away from the surface of the scalp.
To overcome these limitations, we recorded electrocorticographic (ECoG) activity from intracranial depth electrodes localized to the calcarine sulcus (primary visual cortex) in two patients with intractable epilepsy. ECoG affords excellent spatial and temporal resolution as well as a high signal-to-noise ratio, permitting the direct evaluation of auditory-evoked activity in primary visual cortex. Patients performed a central-sound detection task similar to that used by McDonald and colleagues (2013). On each trial, either a tone was presented centrally or a noise burst was presented peripherally at 45° to the left or right of the patient's midline. Patients were required to respond only to the central tone so that neural responses to the peripheral sounds were obtained without potentially confounding response-related activity (Fig. 1A). ECoG activity was examined to determine whether the peripherally presented sounds evoked early visual cortical activity in a spatially specific manner.
MATERIALS AND METHODS
Two right-handed male patients with mesial temporal lobe epilepsy (27 and 41 yr of age) participated in this study during clinical monitoring for medically intractable seizures (due to hippocampal sclerosis in patient 1 and viral encephalitis in patient 2) using ECoG recordings from chronically implanted depth electrodes (5-mm center-to-center spacing, 2-mm diameter). Patient 1 was implanted with four temporal and one frontal depth electrode probes in the left hemisphere as well as scalp EEG; the posterior portion of an anterior-posterior probe passed through medial temporal areas and included coverage of the calcarine sulcus as did the posterior portion of a medial-lateral probe angled posteriorly from the middle temporal gyrus (Fig. 1B). Patient 2 was implanted with 2 depth electrode probes, 1 in each of the left and right temporal lobes with the posterior half of the left probe including coverage of the calcarine sulcus (Fig. 1C), and 56 subdural electrodes along the right frontal and parietal lobes and right superior temporal gyrus. Suspected mesial temporal lobe epilepsies commonly require an anterior-posterior probe through visual cortex to reach the hippocampus. Use of this probe does not produce any detectable visual scotoma. Electrodes were placed according to the clinical needs of the patients. This study was approved by the Institutional Review Board at the University of Chicago. Informed, written consent was obtained from each patient.
MRI and CT acquisition and processing.
Preoperative T1-weighted MRI and postoperative computerized tomography (CT) scans were acquired for each patient to aid in localization of electrodes. Cortical reconstruction and volumetric segmentation of each patient's MRI were performed with Freesurfer (Dale et al. 1999). Postoperative CT scans were registered to the T1-weighted MRI using Statistical Parametric Mapping (SPM), and electrodes were localized using software developed in our laboratories (https://github.com/towle-lab/electrode-registration-app/).
Sound localization paradigm.
Patients were seated in a hospital bed or nearby chair. Stimuli were delivered using a laptop (using Psychtoolbox; Brainard 1997) and 2 free-field speakers placed ∼45° to the right and left of the patient's midline. Three types of sounds were presented in random order during the experiment: a 53-ms, 1,000-Hz sinewave tone presented from both speakers simultaneously and thus localized centrally (30 trials) and an 83-ms pink noise burst presented from either the left (60 trials) or right (60 trials) speaker. Stimuli were selected for consistency with the paradigm used by McDonald and colleagues (2013). A central fixation cross was displayed on the laptop throughout the experiment. Patients were instructed to maintain central fixation and to respond via button press to the central 1,000-Hz tone while making no response to the peripheral noise bursts (for which ECoG responses were analyzed). The interstimulus interval varied randomly between 2.0 and 2.5 s (uniform distribution).
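The trial structure described above can be sketched as follows. This is a minimal illustration in Python rather than the Psychtoolbox code used in the study; the function name `make_trial_sequence` and the condition labels are hypothetical, and only the trial counts and the interstimulus interval range are taken from the text.

```python
import random

def make_trial_sequence(seed=None):
    """Return a shuffled list of (sound, isi_seconds) tuples.

    Trial counts (30 central, 60 left, 60 right) and the uniform
    2.0-2.5 s interstimulus interval follow the text; the labels
    themselves are placeholders chosen for this sketch.
    """
    rng = random.Random(seed)
    trials = ["central_tone"] * 30 + ["left_noise"] * 60 + ["right_noise"] * 60
    rng.shuffle(trials)  # random presentation order
    # Draw one interstimulus interval per trial from U(2.0, 2.5) seconds.
    return [(sound, rng.uniform(2.0, 2.5)) for sound in trials]

sequence = make_trial_sequence(seed=0)
```

Only the central tone requires a button press, so the 120 peripheral trials in such a sequence are the ones that would contribute to the ECoG analysis.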
While retinotopic mapping was unavailable, we used two methods to record visual-evoked responses in visual cortex. Patient 2 viewed centrally presented visual images (1,000 ms per image) spanning ∼20° visual angle at a rate of 1 picture every 3 s during a control task. A drawback was that these central images would not effectively stimulate visual neurons that were tuned to peripheral locations. For patient 1, we measured visual-evoked responses to luminance changes across the entire visual field by identifying 60 natural eye blinks using the frontally recorded scalp EEG (electrode FP1) taken during rest periods. Events were time-locked to the offset of the blink (when eyes began to open), which has been shown to elicit increased neuronal firing in V1 (Gawne and Martin 2000). Unfortunately, scalp-EEG data were unavailable from patient 2.
ECoG recordings and analyses.
ECoG signals were analyzed from 5 depth electrode probes in patient 1 (60 electrode contacts) and 2 depth electrode probes in patient 2 (24 contacts) at a sampling rate of 1,024 Hz with no filtering for maximal temporal resolution. Data from electrodes near regions previously or subsequently surgically resected were removed from analyses, as were excessively noisy electrodes (with average amplitude variability exceeding 3 SD). Electrodes within primary visual cortex were selected for analyses based on their anatomic location within the calcarine sulcus. Studies examining the cytoarchitectonically delineated border of V1 in humans show that in a small subset of individuals secondary visual cortex and visual region prostriata expand into the anterior calcarine sulcus (Amunts et al. 2000; Yu et al. 2012), introducing the possibility that the recorded activity in our patients included neural responses from these regions as well as from primary visual cortex.
The onset of the auditory stimulus was marked on ECoG recordings with a voltage-isolated transistor-transistor logic (TTL) pulse. ECoG-ERPs were obtained by averaging ECoG signals offline across trials with a time window of −100 to 500 ms around auditory onset. Epochs containing excessively variable activity (≥3 SD from the mean variance of all epochs) were rejected offline (<1% of trials). The 50-ms period preceding auditory onset served as the baseline for each epoch. Statistical significance of the difference between the contralateral and ipsilateral conditions was estimated at each electrode/time point by comparing the observed contralateral-ipsilateral difference in the ECoG-ERP with a null distribution generated by randomly shuffling the contralateral and ipsilateral labels across trials, computing the mean contralateral-ipsilateral difference, and repeating this procedure 1,000 times (2-sample permutation tests). This analysis revealed the hemifield dependence of auditory-evoked visual activity. To evaluate the effects of the contralateral and ipsilateral sounds separately, we also compared the observed contralateral or ipsilateral ECoG-ERP with a corresponding null baseline that was generated by randomly shuffling the sign (positive or negative) of the ECoG-ERP at each time bin for each trial, computing the mean waveform, and repeating this procedure 1,000 times (1-sample permutation tests; Cohen 2014). To control for the identification of spuriously significant differences, an ECoG-ERP effect at each time bin was considered to be statistically significant only if the time bin was a part of a series of 10 contiguous significant time bins (Guthrie and Buchwald 1991). Figures 2–4 highlight these significant regions with gray boxes.
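The permutation procedures described above can be illustrated with a short NumPy sketch. It is not the analysis code used in the study: the function names, the two-sided p-value definition, and the array layout (trials × time bins for a single electrode) are assumptions made for illustration, whereas the 50-ms baseline, the 1,000 permutations, and the 10-contiguous-bin criterion come from the text.

```python
import numpy as np

def baseline_correct(epochs, times, t0=-0.050, t1=0.0):
    """Subtract each epoch's mean over the prestimulus window [t0, t1) s."""
    mask = (times >= t0) & (times < t1)
    return epochs - epochs[:, mask].mean(axis=1, keepdims=True)

def two_sample_perm_p(contra, ipsi, n_perm=1000, rng=None):
    """P value per time bin for mean(contra) - mean(ipsi), label shuffling."""
    rng = np.random.default_rng(rng)
    data = np.vstack([contra, ipsi])
    n_contra = contra.shape[0]
    observed = contra.mean(axis=0) - ipsi.mean(axis=0)
    null = np.empty((n_perm, data.shape[1]))
    for i in range(n_perm):
        order = rng.permutation(data.shape[0])  # shuffle condition labels
        null[i] = (data[order[:n_contra]].mean(axis=0)
                   - data[order[n_contra:]].mean(axis=0))
    # Assumption: two-sided test on the absolute difference.
    return (np.abs(null) >= np.abs(observed)).mean(axis=0)

def sign_flip_perm_p(epochs, n_perm=1000, rng=None):
    """P value per time bin for a nonzero mean, using random sign flips
    applied at each time bin of each trial, as worded in the text."""
    rng = np.random.default_rng(rng)
    observed = epochs.mean(axis=0)
    null = np.empty((n_perm, epochs.shape[1]))
    for i in range(n_perm):
        signs = rng.choice([-1.0, 1.0], size=epochs.shape)
        null[i] = (signs * epochs).mean(axis=0)
    return (np.abs(null) >= np.abs(observed)).mean(axis=0)

def contiguous_runs(sig, min_len=10):
    """Keep only significant bins belonging to runs of >= min_len bins."""
    out = np.zeros_like(sig, dtype=bool)
    start = None
    # A trailing False closes any run that reaches the last bin.
    for i, flag in enumerate(np.append(sig, False)):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if i - start >= min_len:
                out[start:i] = True
            start = None
    return out
```

A typical use would be `contiguous_runs(two_sample_perm_p(contra, ipsi) < 0.05)`, which returns a boolean mask over time bins analogous to the gray boxes in the figures.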
Separately, we applied time-point-based multiple-comparison corrections (Cohen 2014) between −100 and 100 ms in which a distribution of maximum difference values was constructed from the permuted (null) data and the 95th percentile of that distribution was used as a cutoff for significance across all time points in the real data; this method controls for multiple comparisons of ERP data at the familywise error rate (FWER) of 0.05 (Cohen 2014).
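The maximum-statistic correction can be sketched in the same way. This is an illustrative approximation, not the authors' implementation: the function name `max_stat_fwer` and the use of the absolute difference (a two-sided variant) are assumptions, and the input arrays are assumed to be already restricted to the −100 to 100 ms window.

```python
import numpy as np

def max_stat_fwer(contra, ipsi, n_perm=1000, alpha=0.05, rng=None):
    """Return (significant_mask, threshold) using a max-statistic null.

    contra, ipsi: arrays of shape (trials, time_bins) for one electrode.
    """
    rng = np.random.default_rng(rng)
    data = np.vstack([contra, ipsi])
    n_contra = contra.shape[0]
    observed = contra.mean(axis=0) - ipsi.mean(axis=0)
    max_null = np.empty(n_perm)
    for i in range(n_perm):
        order = rng.permutation(data.shape[0])  # shuffle condition labels
        diff = (data[order[:n_contra]].mean(axis=0)
                - data[order[n_contra:]].mean(axis=0))
        # Store only the largest difference across all time bins.
        max_null[i] = np.abs(diff).max()
    # The (1 - alpha) quantile of the maxima is one cutoff for every bin.
    threshold = np.quantile(max_null, 1.0 - alpha)
    return np.abs(observed) > threshold, threshold
```

Because a single threshold derived from the permutation maxima is applied to every time bin, the probability of any false positive across the window is held near `alpha`, at the cost of lower sensitivity than the uncorrected bin-wise tests.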
RESULTS
In patient 1, ECoG-ERPs obtained from three of the four electrodes implanted in the anterior-inferior portion of the calcarine sulcus (Fig. 2A) revealed significantly greater positive amplitudes to contralateral than ipsilateral sounds at latencies under 100 ms. Significant differences (P < 0.05) were identified at electrode 1 (28–70 ms postsound onset), electrode 3 (51–64 ms), and electrode 4 (35–71 ms). Applying FWER multiple-comparison corrections between the prestimulus interval and a priori time points of interest (−100 to 100 ms), the differences observed before 100 ms remained significant at electrodes 1 and 4. The short onset times at these electrodes suggest the presence of a fast mechanism through which contralateral sounds evoke neural activity in the anterior portion of primary visual cortex.
Patient 2, who had five depth electrodes along the lateral edge of the calcarine sulcus (Fig. 2B), showed a similar pattern of results, demonstrating significant early (<100 ms) differences in ECoG-ERPs evoked by contralateral and ipsilateral sounds during 45–97 ms postsound onset at electrode 6 and 53–76 ms at electrode 7. Using FWER corrections, the differences observed before 100 ms remained significant at electrode 6. Interestingly, this electrode was near the anterior calcarine sulcus, a region that retinotopically maps onto the peripheral visual field, whereas the early auditory-evoked response was absent at electrodes 8 and 9 in the posterior calcarine sulcus, a region that retinotopically maps onto the central visual field, suggestive of a coarse retinotopic mapping of peripheral sounds within a visual hemifield (note that all electrodes in patient 1 were located in the anterior calcarine sulcus).
In addition to the early contralateral-ipsilateral differences (<100 ms), electrodes 2 and 3 in patient 1 and electrodes 6-9 in patient 2 showed significant differences in later time windows (200–500 ms; see Fig. 2). These later differences, obtained from both the anterior and posterior calcarine electrodes, may underlie the long-latency contralateral occipital positivity (over 200–500 ms) observed by McDonald and colleagues (2013) using scalp recordings.
To confirm that these early differences were driven by modulation of neural activity elicited by the contralateral sounds (and not by the ipsilateral sounds), we separately analyzed the contralateral and ipsilateral ECoG-ERP waveforms for the anterior calcarine electrodes (electrodes 1-4 in patient 1 and electrodes 5-7 in patient 2) relative to their respective null baselines (see materials and methods). In five of these seven electrodes, the early differences were driven by increased responses to contralateral sounds over baseline beginning 28–53 ms after sound onset: electrode 1 (28–87 ms), electrode 2 (53–87 ms), electrode 3 (52–72 ms), electrode 4 (34–97 ms), and electrode 6 (50–76 ms); these significant differences were confirmed at electrodes 1, 4, and 6 after controlling for the FWER at P < 0.05. Critically, ipsilateral sounds evoked no significant changes in activity relative to baseline in the first 100 ms at any of these seven electrodes, suggesting that peripheral sounds have privileged early access to contralateral visual cortex.
To compare the timing of auditory-evoked and visual-evoked activity in visual cortex, we analyzed responses to visual stimuli at these electrodes. Although retinotopic mapping was unavailable for these patients, we recorded responses to full-field luminance changes elicited by eye blinks (Gawne and Martin 2000) in patient 1 and responses to centrally flashed visual stimuli in patient 2. Significant visual activity began between 39 and 58 ms (Fig. 3; significant responses before 100 ms at electrodes 1-4 and 9 were confirmed after controlling for the FWER at P < 0.05). The full-field luminance changes evoked strong activity from all four anterior calcarine electrodes (Fig. 3A), whereas the centrally flashed images evoked activity only from electrodes neighboring the posterior calcarine sulcus (Fig. 3B). Importantly, contralateral sounds elicited visual cortical responses (Fig. 2) as rapidly as did visual stimuli (Fig. 3).
To compare the timing of auditory-evoked activity between visual and auditory cortices, we analyzed auditory cortical responses to the noise bursts for patient 1 (patient 2 did not have any superior temporal gyrus electrodes that showed significant auditory responses). Figure 4 shows the auditory activity from 6 electrodes along a single depth probe penetrating the inferior side of the superior temporal sulcus. Maximal early responses were observed at the 2 most medial electrodes (15 mm inferior to primary auditory cortex; Fig. 4, top 2 panels) beginning at 26 ms (confirmed after controlling for the FWER at P < 0.05). Crucially, contralateral sounds elicited visual cortical responses (Fig. 2) nearly as rapidly as they elicited auditory cortical responses (Fig. 4).
As we observed maximal auditory-evoked responses from peripheral sounds at electrodes along the anterior calcarine sulcus, which maps retinotopically to peripheral regions of visual space, these data provide preliminary evidence of a sound-based spatial topography in visual cortex that operates at a resolution finer than visual hemifields. Although this pattern was observed in both patients, unexpectedly, no difference between contralateral and ipsilateral sounds was observed at the most anterior electrode in patient 2 (electrode 5). However, as Fig. 3 shows, the responses to visual stimuli at this electrode differed markedly from those at the remaining electrodes in patient 2, raising the possibility that this electrode was in fact outside of the calcarine sulcus. If so, our results of anterior calcarine activation by peripheral sounds (Fig. 2) and posterior calcarine activation by central visual images (Fig. 3B) are consistent with the possibility that sounds generate spatially selective activity in primary visual cortex. However, the determination of the extent of sound-based spatial selectivity in visual cortex would require systematic spatial sampling of visual and auditory stimuli to compare the visual and auditory spatial selectivity for each calcarine electrode.
A general concern of both surface and intracranial electrophysiological recordings is an uncertainty of the spatial specificity of the observed potentials as dipolar electrical activity may spread through neighboring tissue (i.e., far-field effects). To mitigate this risk, we additionally examined the ECoG-ERPs using local common average referencing in which the average electrical activity generated by all electrodes along each depth probe was subtracted from the activity generated by each electrode on the probe. This local referencing scheme ensures that any spatially dispersed activity present broadly at each of the 5.5-cm depth electrode probes is removed while preserving the signal unique to the local cortex. Using this local referencing framework, the pattern of results was unchanged. To confirm further that the observed potentials were generated locally along the calcarine sulci, we examined activity at the depth electrodes adjacent to (but not along) the calcarine sulcus (gray circles closest to the yellow circles in Fig. 2); neither of the two electrodes in patient 1 nor the two electrodes in patient 2 showed significant differences between contralateral and ipsilateral sounds at any time point.
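The local common average referencing step can be illustrated with a brief sketch, assuming the data are arranged as an electrodes × samples array with a probe label for each electrode. The function name and data layout are hypothetical and are not taken from the study's analysis code.

```python
import numpy as np

def local_common_average_reference(data, probe_ids):
    """Subtract, per depth probe, the mean signal across its electrodes.

    data: array of shape (electrodes, samples).
    probe_ids: length-electrodes sequence naming the probe of each row.
    """
    data = np.asarray(data, dtype=float)
    probe_ids = np.asarray(probe_ids)
    out = np.empty_like(data)
    for probe in np.unique(probe_ids):
        rows = probe_ids == probe
        # Activity shared by all contacts on this probe is removed;
        # what remains is each contact's deviation from the probe mean.
        out[rows] = data[rows] - data[rows].mean(axis=0, keepdims=True)
    return out
```

After this step, the signals on each probe sum to zero at every sample, so a potential that appears equally at all contacts of a probe (as a far-field source would) no longer contributes to any single electrode's waveform.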
Another potential concern could be the impact of patients' inadvertent eye movements to the sound source. Although eye movements were not rigorously monitored in these patients, in the study of McDonald et al. (2013), which presented lateralized sounds very similar to those used here, the sounds did not elicit any eye deviations as measured by the electrooculogram. In addition, the early onset of the auditory-evoked visual cortical responses (<50 ms) precludes the possibility that eye movements contaminated our ECoG-ERP results because saccade-induced neural responses in primary visual cortex occur 50–100 ms postsaccade onset (e.g., Kagan et al. 2008). Moreover, experimenters monitored patients' gaze position throughout the study, providing feedback to the patients if their eyes deviated from the central fixation cross.
DISCUSSION
In two patients with uncommon but fortuitously located depth electrodes within the calcarine sulcus, we found evidence that peripheral auditory signals rapidly (beginning between 28 and 45 ms) evoke spatially specific responses in primary visual cortex. The early responses were driven by contralateral (but not ipsilateral) sounds and were maximal at anterior calcarine sites, suggesting that auditory spatial information coarsely follows retinotopic mapping in visual cortex. This rapid transfer of peripheral spatial information from auditory to visual processing may play a role in initiating attentional and/or saccadic shifts. Whereas the present results showed both early (28–100 ms) and later (200–500 ms) activation of visual cortex by sounds, McDonald and colleagues (2013) observed only the late component, possibly due to the low sensitivity of scalp-recorded ERPs to calcarine activity originating several centimeters away from the surface of the scalp.
The current finding is consistent with well-documented perceptual benefits of sounds on visual perception (e.g., McDonald et al. 2000; Romei et al. 2007) particularly when sounds and visual targets are spatially coincident. It is also consistent with reports of early multisensory interactions in human visual cortex (Cappe et al. 2010; Raij et al. 2010) and the broadly spatiotopic responses of visual cortical neurons to sounds in the cat (Morrell 1972). Intriguingly, we found that the earliest auditory activation (28 ms) of primary visual cortex by contralateral sounds preceded the earliest visual activation (39 ms) of primary visual cortex by visual stimuli, raising the possibility that a contralateral auditory cue might spatiotopically bias visual processing toward the location of an upcoming visual object before the onset of visually evoked activity. Additional research is necessary to confirm this possibility as well as to investigate what additional types of auditory information are rapidly relayed to primary visual cortex and how they benefit perceptual processing.
The rapid auditory activation of visual cortex that we found could be mediated by direct connections between primary auditory and primary visual cortices (Falchier et al. 2002; Rockland and Ojima 2003) and/or by auditory subcortical inputs to visual cortex (Cappe et al. 2009). Whatever the neural connections that mediate the rapid auditory input to visual cortex, the current findings suggest that those connections map early auditory spatial coding in primary auditory cortex (Lee and Middlebrooks 2011) or in subcortical areas (Cappe et al. 2009) onto the retinotopic coding of visual space in primary visual cortex.
This study was supported by National Institute on Deafness and Other Communication Disorders Grant K99-DC-013828, National Eye Institute Grant R01-EY-021184, National Science Foundation Grant BCS-1029084, National Institute of Mental Health Grant 7P50-MH-086385, and National Institute of Neurological Disorders and Stroke Grant 2T32-NS-047987.
No conflicts of interest, financial or otherwise, are declared by the author(s).
D.B., S.S., S.A.H., and M.G. conception and design of research; D.B., V.L.T., Z.D., J.T., and S.W. performed experiments; D.B., S.D.T., and Z.D. analyzed data; D.B., S.S., S.A.H., and M.G. interpreted results of experiments; D.B. prepared figures; D.B. drafted manuscript; D.B., V.L.T., S.S., S.A.H., S.D.T., Z.D., J.T., S.W., and M.G. edited and revised manuscript; D.B., V.L.T., S.S., S.A.H., S.D.T., Z.D., J.T., S.W., and M.G. approved final version of manuscript.
- Copyright © 2015 the American Physiological Society