The receptive fields of complex cells in the early visual cortex are economically modeled by combining outputs of a quadrature pair of linear filters. For actual complex cells, such a minimal model may be insufficient because many more simple cells are thought to make up a complex cell receptive field. To examine the minimalist model physiologically, we analyzed spatial relationships between the internal structure (subunits) and the overall receptive fields of individual complex cells by a two-stimulus interaction technique. The receptive fields of complex cells are more circular and only slightly larger than their subunits in size. In addition, complex cell subunits occupy spatial extents similar to those of simple cell receptive fields. Therefore, in these respects, the minimalist schema is a fair approximation to actual complex cells. However, there are violations of the minimalist model. Simple cell receptive fields have significantly fewer subregions than complex cell subunits and, in general, simple cell receptive fields are elongated more horizontally than vertically. This bias is absent in complex cell subunits and receptive fields. Thus simple cells cannot be equated to individual complex cell subunits, and spatial pooling of simple cells may occur anisotropically to constitute a complex cell subunit. Moreover, when linear filters for complex cell subunits are examined separately for bright and dark responses, there are significant imbalances and position displacements between them. This suggests that actual complex cell receptive fields are constructed by a richer combination of linear filters than proposed by the minimalist model.
Two classes of neurons, simple and complex cells, were described in early explorations of neuronal stimulus selectivity in the primary visual cortex (Hubel and Wiesel 1959, 1962). After nearly 50 years, simple cells are now typically modeled as a linear spatiotemporal filter followed by half-wave rectification and squaring, as shown in Fig. 1A (DeAngelis et al. 1993; Jones and Palmer 1987; Movshon et al. 1978a), with additional properties such as gain control and other nonlinear interactions. It is now possible to predict most simple cell responses from their receptive field properties. For complex cell receptive fields, pairs of bars and dots have been used intensively to elucidate the functional internal structure, called subunits, which are responsible for producing tuning for orientation, direction, spatial, and temporal frequencies (Emerson et al. 1987; Gaska et al. 1994; Livingstone and Conway 2003; Movshon et al. 1978b). More sophisticated techniques have been developed recently to estimate key internal structures for complex cells from responses to dynamic dense noise stimuli, which are overwhelmingly multidimensional (Rust et al. 2005; Touryan et al. 2002).
Despite these recent advances, many details remain unclear regarding how the receptive fields of complex cells are constructed from the subunits. The minimalist energy model for the receptive fields of complex cells (Fig. 1B) posits two linear filters that are separated by 90° in spatial phase and proposes that each filters the visual stimulus and feeds its squared output to the complex cell (Adelson and Bergen 1985; Fleet et al. 1996). In reality, four linear filters, each followed by a half-squaring operation, are needed for an equivalent computation because spike output cannot transmit a negative signal (Ohzawa et al. 1990; Pollen et al. 1989). Because the energy model economically explains insensitivity to contrast polarity, i.e., spatial overlap between on and off subregions, by the smallest number of linear filters, the filters possess identical positions and spatial extents among themselves and share them with the overall complex cell receptive field. However, complex cell receptive fields can be substantially larger than the spatial extents of their constituent linear filters if these linear filters occupy different positions within the overall receptive fields, as proposed in Fig. 1C. In fact, previous studies suggest that many more than four linear filters are required for constructing complex cell receptive fields (Alonso and Martinez 1998; Sanada and Ohzawa 2006).
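The two forms of the energy model described above can be illustrated with a minimal one-dimensional sketch. The Gabor filter shape, the parameter values, and all function names below are illustrative assumptions, not taken from this study. The sketch shows that the summed squared outputs of a quadrature pair are nearly independent of grating phase, and that four half-squared filters (each filter and its sign-inverted copy) give the same result.

```python
import numpy as np

def gabor(x, sf, sigma, phase):
    """1D Gabor filter: Gaussian envelope times a sinusoidal carrier (assumed shape)."""
    return np.exp(-x**2 / (2.0 * sigma**2)) * np.cos(2.0 * np.pi * sf * x + phase)

def energy_two_filters(stim, even, odd):
    """Minimalist energy model: sum of squared outputs of a quadrature pair."""
    return np.dot(stim, even) ** 2 + np.dot(stim, odd) ** 2

def energy_four_half_squared(stim, even, odd):
    """Four filters (each filter and its sign-inverted copy), each followed by
    half-wave rectification and squaring, then summed."""
    total = 0.0
    for f in (even, -even, odd, -odd):
        total += max(np.dot(stim, f), 0.0) ** 2
    return total

# Illustrative parameters: position in deg, 0.5 cycles/deg, sigma of 1 deg
x = np.linspace(-4.0, 4.0, 401)
even = gabor(x, 0.5, 1.0, 0.0)
odd = gabor(x, 0.5, 1.0, -np.pi / 2.0)   # 90 deg apart in spatial phase

# Responses to gratings of the preferred frequency at eight spatial phases
phases = np.linspace(0.0, 2.0 * np.pi, 8, endpoint=False)
gratings = [np.cos(2.0 * np.pi * 0.5 * x + p) for p in phases]
resp_two = np.array([energy_two_filters(g, even, odd) for g in gratings])
resp_four = np.array([energy_four_half_squared(g, even, odd) for g in gratings])
```

Because the half-squared outputs of a filter and its inverted copy sum to the full square of that filter's output, `resp_four` matches `resp_two` at every phase.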
Are subunits pooled spatially to constitute the receptive fields of complex cells? What is the shape of the overall receptive fields of complex cells, and how is it related to those of the subunits? To answer these questions, we have taken advantage of a second-order interaction analysis for dynamic two-dimensional (2D) noise stimuli. Analysis for second-order interactions can reveal the shape and spatial extent of luminance signal integration, which we define as a subunit. Subunits are intimately related but not identical to the underlying individual linear filters shown in Fig. 1, B and C. In addition, comparisons of properties of these subunits with those of simple cells are also made to examine the details of the hierarchical model of complex cell receptive fields.
All animal care and experimental guidelines conformed to those established by the National Institutes of Health (Bethesda, MD) and were approved by the Osaka University Animal Care and Use Committee.
Surgical procedure and animal maintenance
After initial preanesthetic doses of hydroxyzine (Atarax, 2.5 mg) and atropine (0.05 mg), each cat was anesthetized with isoflurane (2–3.5% in O2) for the remainder of the surgical preparation. Lidocaine was injected subcutaneously or applied topically at all points of pressure and possible sources of pain. Electrocardiogram (ECG) electrodes were secured, a rectal temperature probe was inserted, and a femoral vein was catheterized. Body temperature was maintained near 38°C with the use of a feedback-controlled heating pad for the remainder of the experiment. Subsequently, a tracheostomy was performed and a glass tracheal tube was inserted for artificial respiration. The animal was then secured in a stereotaxic apparatus with the use of ear and mouth bars and clamps on the orbital rim. Anesthesia was then switched to sodium thiopental (Ravonal, 1.0 mg · kg−1 · h−1). After the stabilization of anesthesia, the animal was paralyzed with a loading dose of gallamine triethiodide (Flaxedil, 10 mg · kg−1 · h−1) to minimize eye movements during single-unit recording, and placed under artificial ventilation with a gas mixture of 70% N2O-30% O2. End-tidal CO2 was maintained at a constant level of 3.5–4.3% throughout the experiment. For the remainder of the experiment, the infusion fluid was delivered, containing sodium thiopental (Ravonal, 1.0 mg · kg−1 · h−1), gallamine triethiodide (Flaxedil, 10 mg · kg−1 · h−1), and glucose (40 mg · kg−1 · h−1) in lactated Ringer solution. In addition to body temperature and end-tidal CO2, heart rate, ECG, and intratracheal pressure were monitored and maintained within a normal range throughout the experiment. Pupils were dilated with atropine (1%) and nictitating membranes were retracted with phenylephrine hydrochloride (Neosynesin, 5%). Contact lenses with 4-mm artificial pupils were then placed on each cornea.
A craniotomy was carried out above the central representation of the visual field in visual area 17 approximately at the Horsley–Clarke coordinate, P4–L2.5. This corresponds to <10° in retinal eccentricity. The dura was removed to permit the insertion of tungsten microelectrodes for single-unit recording. After setting the electrodes close to the cortical surface, agar was applied over the cortex for protection and melted wax was applied over the agar to create a sealed chamber for stabilization. When the electrodes were retracted, electrolytic lesions (5 μA, 5 s) were made at 700- to 1,500-μm intervals for each electrode track. Typically, recordings from an animal lasted 4 days.
At the end of an experiment, the animal was administered an overdose of pentobarbital sodium (Nembutal) and perfused through the heart with formalin (4% in buffered saline). The recorded areas were frozen, sectioned into 40- to 60-μm slices, and stained with thionin. The locations of electrode tracks were identified.
Lacquer-coated tungsten microelectrodes (1–5 MΩ; A-M Systems, Sequim, WA) were used to record single-unit activities. Typically, two electrodes mounted in a single protective guide tube were driven in parallel with a common microelectrode drive to increase the chance of encountering neurons. Signals from the electrodes were amplified (×10,000), band-pass filtered (300–5,000 Hz), and fed into a custom-made data acquisition computer system (Ohzawa et al. 1996). The data acquisition system consisted of A/D converters and a spike-waveform discriminator that sorted signals from each electrode in real time. The time-stamped spike events (time resolution, 40 μs) were sent from the data acquisition system, along with their waveforms, to a separate computer that controlled the experiment. This second computer saved to files the spike data and other events, including the onset and the offset times of each trial and stimulus presentations, parameters and conditions of the spike sorter, and the entire set of experimental parameters. The stored data were analyzed on a third computer during and after the experiment.
Visual stimuli were generated using custom-built software on a fourth computer. At the request of the experiment control computer noted earlier, this computer generated the stimuli through a graphics card (Millennium G550; Matrox, Dorval, Quebec, Canada) and displayed them on a cathode ray tube monitor (76-Hz frame rate, 1,600 × 1,024 pixels; GDM-FW900; Sony, Tokyo, Japan) through the green channel only to avoid color misconvergence across channels. In each experiment, the luminance nonlinearity of the display was measured using a photometer (Minolta CS-100; Konica Minolta, Osaka, Japan) and linearized by gamma-corrected lookup tables. The animal saw the display through a custom-built haploscope, which allows visual stimuli to be presented to the left and right eyes separately (Sanada and Ohzawa 2006). A black separator was placed between the left and right visual fields to preclude the projection of stimuli to unintended eyes. The distance (total length of light paths) between the screen and the eyes was set to 57 cm, so that the display subtended a visual field of 23 × 30° for each eye.
When a single unit was isolated, preliminary observations were made to determine its optimal orientation, spatial frequency, and both the position and the size of its receptive field. In this “search” procedure, the orientation, spatial frequency, position, and size of a patch of drifting sinusoidal grating could be adjusted with the use of a pointing device (mouse). Having completed the above-cited procedure, tuning in the orientation and spatial frequency domain was measured for the cell by presenting flashed sinusoidal grating stimuli (Nishimoto et al. 2005) and/or drifting sinusoidal grating stimuli. The degree of response modulation was assessed by presenting sinusoidal grating with a combination of orientation and spatial frequency that elicited the largest number of spikes. The cell was then classified into simple or complex on the basis of the amplitude of the first harmonic component relative to the average firing rate (F1/F0 ratio; Li et al. 2003; Skottun et al. 1991).
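The F1/F0 classification can be sketched as follows, assuming a binned firing-rate histogram that spans an integer number of grating cycles. The function names are hypothetical, and the criterion of 1 (ratio above 1 classified as simple) is the conventional value from the cited literature rather than a value stated in this section.

```python
import numpy as np

def f1_f0_ratio(rate, bin_width_s, temporal_freq_hz):
    """Ratio of first-harmonic amplitude (F1) to mean firing rate (F0).
    Assumes the record covers an integer number of stimulus cycles."""
    rate = np.asarray(rate, dtype=float)
    t = (np.arange(rate.size) + 0.5) * bin_width_s      # bin centers (s)
    f0 = rate.mean()
    # Amplitude of the Fourier component at the grating temporal frequency
    f1 = 2.0 * np.abs(np.mean(rate * np.exp(-2j * np.pi * temporal_freq_hz * t)))
    return f1 / f0

def classify(ratio, criterion=1.0):
    """Assumed conventional criterion: ratio > 1 is simple, otherwise complex."""
    return "simple" if ratio > criterion else "complex"

# Synthetic examples: 1 s of response to a 2-Hz drifting grating, 1-ms bins
bin_width = 0.001
t = (np.arange(1000) + 0.5) * bin_width
simple_like = 50.0 * np.maximum(np.sin(2.0 * np.pi * 2.0 * t), 0.0)   # half-wave rectified
complex_like = 20.0 + 2.0 * np.sin(2.0 * np.pi * 2.0 * t)             # weakly modulated

ratio_simple = f1_f0_ratio(simple_like, bin_width, 2.0)
ratio_complex = f1_f0_ratio(complex_like, bin_width, 2.0)
```

A half-wave rectified sinusoid has a ratio of π/2, which is why strongly modulated responses fall above the criterion.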
To evaluate the receptive field structure, we presented dynamic 2D noise stimuli with 51 × 51 small dots. The noise stimuli covered an area typically two- to threefold larger than the receptive field of the neuron in both the horizontal and the vertical directions. Each dot was assigned a dark (∼3 cd · m−2), bright (∼90 cd · m−2), or gray (∼47 cd · m−2) luminance with equal probability. The gray dots had the same luminance value as the mean luminance of the display. The dot size was determined for each cell primarily based on optimal spatial frequency to achieve both sufficient spatial resolution and signal-to-noise ratio (0.12 × 0.12 to 0.67 × 0.67° for individual dots; 5.3 dots per subregion width on average). The noise pattern was refreshed every 26 or 13 ms, which corresponded to two video frames or one, respectively. Presentation of the dynamic noise stimuli lasted about 30 min to collect a sufficient number of spikes for the following data analysis.
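A ternary noise sequence of this kind can be generated as in the sketch below. The function name and the random seed are hypothetical; the coding of dark, gray, and bright as −1, 0, and 1 follows the convention used in the analysis, and the luminance values are the approximate figures quoted above.

```python
import numpy as np

# Approximate luminances (cd/m^2) for the three dot values
LUMINANCE = {-1: 3.0, 0: 47.0, 1: 90.0}

def ternary_noise(n_frames, n_dots=51, seed=0):
    """Frames of n_dots x n_dots dots, each dark (-1), gray (0), or bright (1)
    with equal probability. Returns an int8 array (n_frames, n_dots, n_dots)."""
    rng = np.random.default_rng(seed)
    return rng.integers(-1, 2, size=(n_frames, n_dots, n_dots)).astype(np.int8)

stim = ternary_noise(200)
fractions = {v: float(np.mean(stim == v)) for v in (-1, 0, 1)}
```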
First-order map and second-order interaction map
Monocular visual stimuli (S) can be fully characterized by two dimensions of space (x, y) and one of time (t), S(x, y, t). The 2D dynamic noise stimuli used in this study consisted of three luminance values, 0, −1, 1, which corresponded to gray, dark, and bright dots, respectively. Spike-triggered stimuli were of particular interest among S(x, y, t).
For simple cells, the first-order maps were calculated to analyze the receptive field structure.
The first-order map, h1(x, y, τ), which represents the linear component of neuronal responses, was obtained by the spike-triggered averaging of S(x, y, ti − τ), i.e., h1(x, y, τ) = (1/N) Σi S(x, y, ti − τ), where N is the total number of spikes collected during the stimulus presentation and i is an index for the ith spike, produced at time ti. The first-order map was obtained for correlation delays ranging from 0 to 289 ms in 13-ms (duration of one video frame) intervals.
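Spike-triggered averaging can be sketched as follows, assuming that the stimulus is stored as an array of frames and that each spike time has already been converted to a frame index. The function name and the toy cell are hypothetical. Spikes that occur too early to have a stimulus frame at a given delay are dropped in this sketch, so the divisor can be slightly smaller than the total spike count N.

```python
import numpy as np

def first_order_map(stim, spike_frames, n_delays):
    """Spike-triggered average of the stimulus.
    stim: (T, ny, nx) array with values in {-1, 0, 1}.
    spike_frames: frame index at which each spike occurred.
    Returns h1 with shape (n_delays, ny, nx); the delay is in frames."""
    spike_frames = np.asarray(spike_frames)
    h1 = np.zeros((n_delays,) + stim.shape[1:])
    for d in range(n_delays):
        valid = spike_frames[spike_frames - d >= 0]   # spikes with a frame at this delay
        h1[d] = stim[valid - d].mean(axis=0)
    return h1

# Toy cell: fires whenever dot (3, 4) was bright two frames earlier
rng = np.random.default_rng(1)
stim = rng.integers(-1, 2, size=(5000, 8, 8))
spikes = [t for t in range(2, 5000) if stim[t - 2, 3, 4] == 1]
h1 = first_order_map(stim, spikes, n_delays=4)
```

In this toy case the map at a delay of two frames equals 1 at dot (3, 4), and the maps at other delays stay near zero.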
For complex cells, the second-order interaction maps were computed to examine the internal structure of their receptive fields. This procedure is schematically illustrated in Fig. 2.
The second-order interaction map, h2(x1, x2, y1, y2, τ1, τ2), was calculated by accumulating the product of two spike-triggered stimuli, i.e., S(x1, y1, ti − τ1) and S(x2, y2, ti − τ2) for the ith spike, with x2 = x1 + dx, y2 = y1 + dy, and τ2 = τ1 + dτ, where dx and dy represent spatial displacements between the two stimuli and dτ corresponds to a temporal offset between them. Thus the second-order interactions are described by a six-dimensional function.
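A form consistent with this description is sketched below. The normalization by the number of spikes N is an assumption; the text specifies only that products are accumulated over spikes.

```latex
h_2(x_1, x_2, y_1, y_2, \tau_1, \tau_2)
  = \frac{1}{N} \sum_{i=1}^{N} S(x_1, y_1, t_i - \tau_1)\, S(x_2, y_2, t_i - \tau_2),
\qquad
x_2 = x_1 + dx,\quad y_2 = y_1 + dy,\quad \tau_2 = \tau_1 + d\tau
```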
This computation casts a positive vote for an interaction between the stimuli with the same contrast polarity (dark–dark and bright–bright) and a negative vote for an interaction between those with the opposite contrast polarity (dark–bright, bright–dark). When at least one member of a pair of stimuli is a gray dot, no vote is cast. This calculation is thus essentially identical to one used for sparse noise stimuli in previous studies (Gaska et al. 1994; Livingstone and Conway 2003).
To obtain a second-order interaction map, one stimulus location was selected as the reference and fixed at a particular spatial coordinate (x1, y1) and a particular correlation delay τ1 as shown in Fig. 2B. Then, the second-order interaction map for the reference (x1, y1, τ1) was calculated within the local neighborhood of this reference, as indicated by the dashed square in Fig. 2B. The spatial displacements for dx and dy ranged from −10 to 10 stimulus dots. Empirically, this extent for the second-order analysis was sufficiently wide to contain significant interactions. The second-order interaction with itself (i.e., dx = 0, dy = 0, and dτ = 0) cannot be measured with ternary dense noise stimuli because overlapping stimuli cannot be represented. For further analysis, the value at zero displacement was filled in by a spline interpolation from neighboring pixels in the interaction map. Because the two-stimulus interaction profiles are generally strongest and occupy the largest spatial extent between simultaneously presented stimuli (Anzai et al. 2001; Gaska et al. 1994), the temporal offset dτ was always set to 0 ms to evaluate the spatial structure and extent of second-order interaction maps. Accumulation of votes for all spikes produced the second-order interaction map for the reference (x1, y1, τ1). By changing the values of x1, y1, and τ1, second-order interaction maps were obtained for references at locations of every dot in visual stimuli for correlation delays from 0 to 197 ms in 13-ms (duration of one video frame) steps.
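The voting procedure for one reference location can be sketched as below, with dτ = 0. The function name, the array layout, the division by the spike count, and the toy cell are assumptions for illustration. The zero-displacement entry is left as NaN here; the spline interpolation used to fill it in is not reproduced.

```python
import numpy as np

def second_order_map(stim, spike_frames, ref_y, ref_x, delay, max_disp=10):
    """Second-order interaction map around one reference dot at one delay (d_tau = 0).
    stim: (T, ny, nx) array with values in {-1, 0, 1}; delay is in frames.
    Returns a (2*max_disp+1, 2*max_disp+1) array indexed by (dy, dx), with NaN
    at zero displacement and where the neighbor lies outside the stimulus."""
    frames = np.asarray(spike_frames) - delay
    frames = frames[frames >= 0]
    trig = stim[frames].astype(float)            # spike-triggered frames (N, ny, nx)
    ref = trig[:, ref_y, ref_x]                  # reference dot value for each spike
    # Same polarity gives +1, opposite polarity gives -1, any gray dot gives 0
    full = np.tensordot(ref, trig, axes=(0, 0)) / len(frames)
    ny, nx = full.shape
    size = 2 * max_disp + 1
    h2 = np.full((size, size), np.nan)
    for dy in range(-max_disp, max_disp + 1):
        for dx in range(-max_disp, max_disp + 1):
            y, x = ref_y + dy, ref_x + dx
            if 0 <= y < ny and 0 <= x < nx and (dy, dx) != (0, 0):
                h2[dy + max_disp, dx + max_disp] = full[y, x]
    return h2

# Toy cell: fires when dots (5, 5) and (5, 6) had the same non-gray polarity one frame earlier
rng = np.random.default_rng(2)
stim = rng.integers(-1, 2, size=(4000, 12, 12))
spikes = [t for t in range(1, 4000) if stim[t - 1, 5, 5] * stim[t - 1, 5, 6] == 1]
h2 = second_order_map(stim, spikes, ref_y=5, ref_x=5, delay=1, max_disp=3)
```

For this toy cell the map shows a strong positive interaction at dx = +1, dy = 0 and values near zero elsewhere.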
Second-order interaction map and subunit
The term subunit has been used extensively for referring to a functional or virtual unit that operates as a linear spatial (or spatiotemporal) filter internal to a complex cell (Livingstone and Conway 2003; Movshon et al. 1978b; Ohzawa and Freeman 1986; Rust et al. 2005). In this report, we define a subunit as a functional unit that is described by a second-order interaction profile as explained earlier. A subunit is closely related to, but is not identical to, linear filters depicted in models for complex cells (Fig. 1, B and C). As mentioned in the previous reports, a subunit contains contributions from multiple linear filters, presumably simple cells according to the hierarchical model (Hubel and Wiesel 1962) and thus does not represent a single anatomically identifiable cell.
First-order receptive fields and subunits in second-order maps are usually modulated along an axis perpendicular to the optimal orientation. We obtained their spatial extents by using a partial Hilbert transform (Hahn 1992). This method for obtaining an envelope is applicable for any multidimensional data modulated along a single axis because it does not assume a specific functional form. Alternatively, our maps could have been fitted by a model function selected a priori and its envelopes may have been extracted for subsequent analyses. A reasonable candidate for such a model function is a Gabor function, as used by most recent studies dealing with simple cell receptive fields. However, some second-order maps of complex cells may have broader flanking subregions than a central one (Szulborski and Palmer 1990). This characteristic is not captured well by a Gabor function and thus we sought to avoid fitting second-order maps by it.
Here, the partial Hilbert transform was carried out in the 2D spatial frequency domain along an axis perpendicular to the optimal orientation. First, the 2D Fourier transform of the map was computed. Then, the spatial frequency domain was divided into two mirror-symmetrical halves about the axis perpendicular to a vector pointing to the optimal orientation and spatial frequency from the origin. As a result, each half of the spatial frequency domain had spectral components that are identical in amplitude and are different in the sign of imaginary parts. The spatial frequency components in one half were shifted by 90° in phase; for the other half, they were shifted by −90° in phase. The inverse Fourier transform of the result yielded a complex signal where the real parts retained the original signal (Fig. 2E) and formed a quadrature pair with the imaginary parts (Fig. 2F). The envelope of the map (Fig. 2G) was obtained by calculating the absolute value (amplitude) of the complex signal. Because the second-order interaction maps were obtained for all dot locations in visual stimuli, some of the maps were outside the complex cell receptive field and contained no signal. For these maps, the axis for the partial Hilbert transform could not be determined. Likewise, for cells without significant orientation tuning or spatial antagonism, the axis could not be determined either. Therefore, for these cases, the absolute values of interaction maps themselves were used as their envelopes.
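The envelope extraction can be sketched with a 2D FFT as below. The function name, the angle convention (direction of modulation measured from the x axis), and the synthetic Gabor test map are assumptions. Doubling one half-plane of the spectrum and zeroing the other is equivalent to adding i times the ±90° phase-shifted signal to the original, so the real part of the result is the original map and the imaginary part is its quadrature partner. An odd map size (such as the 21 × 21 maps spanning displacements of −10 to 10 dots) avoids special handling of Nyquist components.

```python
import numpy as np

def partial_hilbert_envelope(m, axis_angle_rad):
    """Complex (analytic-like) signal and envelope of a 2D map modulated along
    the direction axis_angle_rad, via a partial Hilbert transform."""
    ny, nx = m.shape
    fy = np.fft.fftfreq(ny)[:, None]             # vertical spatial frequencies
    fx = np.fft.fftfreq(nx)[None, :]             # horizontal spatial frequencies
    # Signed projection onto the modulation axis splits the plane into two halves
    side = np.sign(fx * np.cos(axis_angle_rad) + fy * np.sin(axis_angle_rad))
    # (1 + side): doubles one half and removes the other, i.e. F + i * (-i * side * F)
    analytic = np.fft.ifft2(np.fft.fft2(m) * (1.0 + side))
    return analytic, np.abs(analytic)

# Synthetic interaction map: Gabor with a Gaussian envelope (sigma of 3 dots)
yy, xx = np.mgrid[-10:11, -10:11].astype(float)
angle = np.deg2rad(30.0)
u = xx * np.cos(angle) + yy * np.sin(angle)      # coordinate along the modulation axis
gauss = np.exp(-(xx**2 + yy**2) / (2.0 * 3.0**2))
gabor_map = gauss * np.cos(2.0 * np.pi * 0.15 * u)

analytic, envelope = partial_hilbert_envelope(gabor_map, angle)
```

For this synthetic map the recovered envelope closely follows the Gaussian used to build it, without any functional form being assumed by the method itself.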
In previous studies, the second-order interaction maps were often averaged across all reference positions within the receptive field of a complex cell to enhance signal-to-noise ratios for a single final map (Emerson et al. 1987; Gaska et al. 1994; Livingstone and Conway 2003). This means that the six-dimensional interaction maps, h2(x1, x1 + dx, y1, y1 + dy, τ1, τ1 + dτ), were reduced to just three dimensions of dx, dy, and dτ. In this approach, the original maps were assumed to be spatially homogeneous. This assumption is justified for some complex cells (Emerson et al. 1987), but not for others (Szulborski and Palmer 1990). Therefore we examined second-order interaction maps and their envelopes for all individual reference positions separately. We assume that the strength of a second-order subunit is proportional to the collective responses of underlying linear filters contributing to that subunit. With this assumption, the overall receptive field of a complex cell may be obtained by collecting the envelopes of the second-order interaction maps. A complex cell receptive field at a correlation delay of τ was obtained as follows. First, we squared the envelopes of second-order interaction maps at a delay τ at all reference locations. The squaring operation was incorporated to approximate the effects of power-law static nonlinearities at the outputs of subunits (Gaska et al. 1994). Then, the squared envelopes were summed into a larger map (Fig. 2H) at their respective positions. Finally, its square root was computed. For further analysis, an optimal correlation delay was determined at which the amplitude of the receptive field was at maximum. We found that >95% of interaction maps exhibited the strongest interactions at the optimal correlation delay. This constancy in timing across interaction maps supports the view that the receptive field size was estimated appropriately.
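The three steps (square, sum at respective positions, square root) can be sketched as follows. The function name and array layout are hypothetical, and the sketch assumes that each envelope is placed so that its zero-displacement pixel falls on its reference dot.

```python
import numpy as np

def receptive_field_from_envelopes(envelopes, max_disp):
    """Combine subunit envelopes into an overall receptive field map.
    envelopes: (ny, nx, 2*max_disp+1, 2*max_disp+1) array holding the envelope of
    the interaction map for each reference dot (max_disp >= 1).
    Returns an (ny, nx) map: square root of the sum of squared envelopes."""
    ny, nx = envelopes.shape[:2]
    size = 2 * max_disp + 1
    acc = np.zeros((ny + 2 * max_disp, nx + 2 * max_disp))   # padded canvas
    for ry in range(ny):
        for rx in range(nx):
            # Window centered on the reference dot in padded coordinates
            acc[ry:ry + size, rx:rx + size] += envelopes[ry, rx] ** 2
    return np.sqrt(acc[max_disp:-max_disp, max_disp:-max_disp])

# Toy example: two neighboring reference dots with identical 3 x 3 envelopes
max_disp = 1
e = np.array([[0.1, 0.2, 0.1],
              [0.3, 1.0, 0.3],
              [0.1, 0.2, 0.1]])
envelopes = np.zeros((9, 9, 3, 3))
envelopes[4, 4] = e
envelopes[4, 5] = e
rf = receptive_field_from_envelopes(envelopes, max_disp)
```

In the toy example the value at dot (4, 4) combines the center of one envelope with the flank of its neighbor, which is how pooling across reference positions enlarges the overall map.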
As described earlier, the receptive field envelopes of a simple cell were obtained in 13-ms intervals. These were then spline-interpolated along the time axis in 1-ms steps. For a correlation delay where the interpolated envelope had the maximum response, the receptive field and its envelope were computed for further analysis.
We sought to evaluate the spatial shape of the second-order interaction envelopes and the receptive fields of complex cells and of the receptive field envelopes of simple cells. To extract parameters for characterizing them, each was fitted by a 2D Gaussian function in which B, K, x0, y0, σx′, σy′, and θ are free parameters. Specifically, the spatial coordinate (x0, y0) corresponds to the center position and θ rotates the translated coordinate system to align x and y axes with the major and minor axes of the Gaussian function. σx′ and σy′ are measures of the spatial extent, B is a baseline parameter, and K is simply a scaling factor. The Gaussian fit accounted for 70% of variance in data on average.
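A Gaussian of the kind described, written with the parameters named above, might take the following form. The exact parameterization (for example, the factor of 2 in the denominators and the sign convention of the rotation) is an assumption.

```latex
G(x, y) = B + K \exp\!\left( -\frac{x'^2}{2\sigma_{x'}^2} - \frac{y'^2}{2\sigma_{y'}^2} \right),
\qquad
\begin{aligned}
x' &= (x - x_0)\cos\theta + (y - y_0)\sin\theta,\\
y' &= -(x - x_0)\sin\theta + (y - y_0)\cos\theta
\end{aligned}
```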
Spatial properties for the subunits and receptive fields were quantified on the basis of the parameters of the best-fitted Gaussian function. To evaluate size parameters such as length, width, and area, a bound was drawn at a criterion level that is 5% of the peak amplitude of the fitted Gaussian function. For this boundary, length and width were measured along the parallel and perpendicular axes to the optimal orientation, respectively. Figure 3 schematically shows the length and the width for the subunits and the receptive fields of complex cells. The length and the width for the receptive fields of complex cells were defined with respect to the optimal orientation of the subunits with the highest amplitude. For the receptive fields of simple cells, the length and the width were obtained for the optimal orientation. Optimal orientation, obtained by Fourier analysis, is a parameter independent of the major axis of the fitted Gaussian function.
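The 5% criterion can be worked through numerically as in the sketch below. It assumes a Gaussian of the form exp(−r²/2σ²) with the baseline ignored, and it assumes that length and width are taken as the full extent of the criterion ellipse projected onto the axes parallel and perpendicular to the optimal orientation. Both the function names and this reading of the measurement are hypothetical.

```python
import math

def criterion_radius(sigma, criterion=0.05):
    """Distance from the center at which a Gaussian exp(-r^2 / (2 sigma^2))
    falls to `criterion` of its peak."""
    return sigma * math.sqrt(2.0 * math.log(1.0 / criterion))

def extent_along(direction_rad, theta_rad, sigma_major, sigma_minor, criterion=0.05):
    """Full extent of the criterion ellipse projected onto an axis at direction_rad.
    theta_rad is the orientation of the major axis of the fitted Gaussian."""
    phi = direction_rad - theta_rad
    c = math.sqrt(2.0 * math.log(1.0 / criterion))
    return 2.0 * c * math.hypot(sigma_major * math.cos(phi),
                                sigma_minor * math.sin(phi))

# Example: major axis aligned with the optimal orientation (theta = 0)
length = extent_along(0.0, 0.0, sigma_major=2.0, sigma_minor=1.0)
width = extent_along(math.pi / 2.0, 0.0, sigma_major=2.0, sigma_minor=1.0)
```

At the 5% level the boundary lies about 2.45 σ from the center, so each full extent along a principal axis is about 4.9 σ.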
We analyzed data from 86 complex cells in area 17 of 37 adult cats. Of these, three neurons were recorded in penetrations in the area 17/18 border zones, but their tuning properties such as the receptive field size and optimal spatial frequency were close to those of average area 17 cells. Therefore these cells are included in the rest of the analyses. To compare spatial properties between complex cell subunits and simple cell receptive fields, we also analyzed data from 152 simple cells in area 17. Simple and complex cells were classified based on the ratio of the first harmonic amplitude to the mean discharge rate for responses to drifting sinusoidal gratings (Li et al. 2003; Skottun et al. 1991). For complex cells, the spatial structure of subunits and receptive fields were computed through the second-order interaction analysis. As described in the previous section, second-order interaction maps were calculated at a temporal offset (=0) between the reference and neighboring stimuli, i.e., between two simultaneously presented stimuli. To obtain second-order interaction maps with sufficient signal-to-noise ratios, the analysis requires more spikes than the first-order analysis. Thus our sample contained fewer complex cells than simple cells, for which receptive fields were calculated through the first-order analysis. The average numbers of spikes collected during one measurement with the dynamic noise stimuli were 9,867.7 ± 9,221.8 for complex cells and 4,454.2 ± 3,935.0 for simple cells (mean ± SD). The minimum numbers of spikes were 649 and 180 for complex and simple cells, respectively. Likewise, the maxima were 47,115 and 20,017.
One of the major purposes of this study was to examine spatial pooling of subunits that collectively constitute the receptive fields of single complex cells. To achieve this goal, spatial characteristics were compared between the subunits and the overall receptive fields for individual complex cells. If the hierarchical model of Hubel and Wiesel (1962) is correct (Alonso and Martinez 1998; Martinez and Alonso 2001; but see also Toyama et al. 1981), subunits of a complex cell should closely reflect properties of simple cells contributing inputs to it. Although simple cells in our population were recorded separately from complex cells and do not have direct connection to the complex cells, we were able to compare the two cell types as groups. Specifically, it is worth comparing corresponding parameters between the subunits of complex cells and the receptive fields of simple cells. The simple and complex cells were recorded from the same set of electrode tracks in the same animals using 2D dynamic noise stimuli, although the proportions of cell types varied from one track to another.
Second-order interaction maps
Figure 4 shows a representative example of the second-order interaction analysis for a complex cell. In Fig. 4A, interaction maps are tiled to reflect the spatial locations of their references. Although the references may be at every dot location of the stimuli, the results are displayed here for reference positions at every two stimulus dots. The interaction maps at the top left and center in Fig. 4A are partially overlapping in the actual stimulus domain, as indicated on a noise stimulus frame (Fig. 4B), and outlined by dashed and thick lines, respectively. The interaction maps vary in shape depending on reference positions (Szulborski and Palmer 1990). The central interaction map in A is also shown magnified in Fig. 4C. To determine the spatial extent of a subunit in the map, the partial Hilbert transform (Hahn 1992) was performed as shown in Fig. 4D and then the envelope was obtained as shown in Fig. 4E. Note that Fig. 4, C and D differ in spatial phase by 90° and thus form a quadrature pair. To illustrate the bounds of the interaction map, the dashed line is drawn in Fig. 4, C and E at a criterion level that is 5% of the peak height of the 2D Gaussian function fitted to the envelope of the interaction map. By taking the square root of the sum of the squared envelope maps for all reference positions, the receptive field of the complex cell was obtained as shown in Fig. 4F.
In principle, the size of the overall receptive field of a complex cell may be much larger than that of individual interaction envelopes if there is extensive spatial pooling. However, there was little spatial pooling for this cell. Therefore the region depicted in Fig. 4F is cropped to match the spatial position and scale of Fig. 4, C, D, and E for ease of comparison. The bounds for the overall receptive field were again determined using a criterion amplitude that was 5% of the peak height of the Gaussian function fitted to the receptive field. This area is marked by a solid contour in Fig. 4F, and also shown in Fig. 4, C and E for size comparison with the size of the subunit. Note that the solid and dashed contours in Fig. 4, C and E are superimposed nearly exactly, indicating that the subunit size was nearly the same as that of the overall receptive field for this complex cell. The reference position of the interaction map in Fig. 4C was at the center of the receptive field and therefore showed the strongest interactions among all reference locations.
The degree of spatial pooling of subunits may be quantified by the difference in spatial extent between the subunits and the overall receptive field. When the outputs of many subunits are pooled across different spatial positions to generate an overall receptive field, it should occupy a much larger spatial extent than that of individual subunit envelopes. However, when pooling is absent, the size of a receptive field is expected to be identical to that of subunit envelopes. Therefore the complex cell depicted in Fig. 4 exhibited hardly any spatial pooling. The absence of spatial pooling is in fact what is predicted by the minimalist energy model; therefore the energy model is an accurate representation of the cell in Fig. 4.
Figure 5 shows data from another complex cell in the same format as that for Fig. 4. For this cell, the size of the overall receptive field (solid ellipse in Fig. 5, C, E, and F) was significantly larger than that of the subunit envelope (dashed ellipse; 169% in terms of area; P < 0.05, resampling). Therefore spatial pooling of subunits did take place for this neuron.
Are there complex cells with concentric second-order interaction maps showing poor orientation selectivity (Szulborski and Palmer 1990)? Among 86 complex cells, we found only one neuron that exhibited such characteristics as shown in Fig. 6. The subunits were round, and appeared to have a weak antagonistic surround (Fig. 6C). However, because of its weak contribution, the subunit envelope was essentially determined by the central region alone (Fig. 6D). Because the optimal axis of the partial Hilbert transform could not be determined for this neuron, envelopes of the interaction maps were obtained by taking their absolute values. The size of the receptive field (solid ellipse in Fig. 6, C, D, and E) is much larger than that of the subunit envelope (dashed ellipse; 473% in terms of area; P < 0.05, resampling). This complex cell showed the largest degree of spatial pooling in our sample.
For five other complex cells, Fig. 7 shows the second-order interaction maps, their envelopes, and the overall receptive fields. For each example in this figure, the interaction map illustrated was obtained with the reference position at the center of the Gaussian function fitted to the receptive field of the cell. Using the same criteria as used in Figs. 4–6, solid and dashed contours are drawn to demarcate the bounds of receptive fields and second-order interactions, respectively. Figure 7C illustrates a cell that had the largest number of subregions within a subunit among all the complex cells we analyzed. Figure 7D depicts a complex cell that exhibited a substantial spatial pooling of subunits along the axis of the optimal orientation, but not along the width dimension.
For the vast majority of complex cells, the neuronal activities were recorded with dynamic noise stimulation to their dominant eye alone. However, responses to dynamic noise stimuli were measured for both eyes separately in a few binocular neurons. Figure 7, E and F shows results for the second-order interaction analysis for a single complex cell in the right and the left eyes, respectively. Both interaction maps and receptive field profiles were highly similar for the two eyes.
Length and width
Having examined interaction maps and the extent of pooling for representative complex cells, we now analyze these properties for the population of cells we have recorded. In the population analyses, complex cell subunits were sampled at the center of the receptive fields. First, what relationships are found between the length and the width for complex cell subunits, receptive fields, and simple cell receptive fields? Figure 8A shows a relationship between the length and the width for the subunits of complex cells. The length and the width were defined with respect to the optimal orientation (obtained by Fourier analysis) for the subunits (Fig. 3). To make a cell-by-cell comparison of the two values possible, subunit length is plotted against subunit width. Each circle in the scatterplot represents a datum from one complex cell. Note that the majority (63%) of circles lie above the identity line. The histograms show the distributions of the length (right) and the width (top) for subunits, with median values of 4.05 and 3.63°, respectively, as indicated by arrows. The subunit length was generally longer than the subunit width (P = 0.0042, Wilcoxon signed-rank test).
Are complex cell receptive fields circular in shape or are they elongated along a specific direction? Figure 8B shows a relationship between the length and the width for the receptive fields of complex cells. Again, the length and the width of receptive fields were measured with respect to the optimal orientation of their subunits (Fig. 3). On average, the receptive fields of complex cells were longer along the length axis than along the width axis (P = 0.015, Wilcoxon signed-rank test). However, the degree of elongation was not as significant as that for subunits. The median values of the receptive field length and the width were 4.72 and 4.40° (arrows), respectively.
Figure 8C illustrates a relationship between the length and the width for the receptive fields of simple cells. Each circle in the scatterplot denotes a datum from a simple cell. The histograms show the distributions of the length (right) and the width (top) for the receptive fields of simple cells, respectively. The majority (65%) of symbols are above the identity line. The median length was 4.40° and the median width was 4.00°. This difference was statistically significant (P < 0.001, Wilcoxon signed-rank test). This was exactly the same trend as observed for the subunits of complex cells as shown in Fig. 8A. Are the distributions of subunit length and width similar to those of receptive fields of simple cells? The answer is yes, and there was not a significant difference in either the length or the width (P > 0.05, Mann–Whitney U test).
Number of subregions
A chief functional characteristic of the early visual cortical neurons is selectivity for orientation and spatial frequency (Hubel and Wiesel 1959, 1962; Movshon et al. 1978a,b). Among various mechanisms contributing to the determination of tuning parameters (Benevento et al. 1972; Bredfeldt and Ringach 2002; Ringach et al. 2002), the number of subregions in complex cell subunits is inversely related to tuning bandwidths for orientation and spatial frequency (Gaska et al. 1994; Movshon et al. 1978b). The number of subregions for a subunit was calculated as follows

$$N_{\text{subregions}} = 2 \times \text{width} \times SF_{\text{opt}}$$

where width is the subunit width (see Fig. 3) and SFopt is the optimal spatial frequency obtained from the Fourier analysis of the second-order interaction map at the center of the complex cell receptive field. The product of the subunit width and the optimal frequency gives the number of cycles within the subunit. Therefore this value is doubled to convert it into number of subregions because one cycle of modulation contains two subregions. Figure 9A shows the distribution of the number of subregions in the subunits of complex cells. The mean of this distribution was 3.21 and the SD was 1.08. These values are roughly comparable to those reported by Movshon et al. (1978b).
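The calculation described above (cycles within the subunit, doubled) can be sketched in a few lines. This is only an illustration: the function name, argument names, and example values are hypothetical and are not taken from the recorded data.

```python
def number_of_subregions(width_deg, sf_opt_cpd):
    """Number of subregions in a subunit.

    width_deg  : subunit width in degrees of visual angle
    sf_opt_cpd : optimal spatial frequency in cycles per degree
    """
    if width_deg < 0 or sf_opt_cpd < 0:
        raise ValueError("width and spatial frequency must be non-negative")
    cycles = width_deg * sf_opt_cpd   # cycles of modulation within the subunit
    return 2.0 * cycles               # one cycle contains two subregions


# Hypothetical example values, chosen only for illustration.
example = number_of_subregions(width_deg=3.0, sf_opt_cpd=0.5)
print(example)
```

The same function applies unchanged to simple cell receptive fields, for which the number of subregions was calculated in the same manner.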
Does the number of subregions match between the subunits of complex cells and the receptive fields of simple cells? Figure 9B shows the distribution of the number of subregions within the receptive fields of simple cells. The number of subregions for simple cell receptive fields was calculated in the same manner as that for complex cell subunits. The median number of subregions within the receptive fields of simple cells was 2.53, which was significantly different from 3.00, the median number of subregions within the subunits of complex cells (P < 0.001, Mann–Whitney U test; see also Fig. 9A). Therefore the subunits of complex cells tended to have significantly more subregions than simple cells. This difference may be related to the fact that subunits are functionally defined units and cannot be equated to individual simple cells (Movshon et al. 1978b; Rust et al. 2005).
Length-to-width ratios and aspect ratios
Figure 10A compares length-to-width ratios between subunits and receptive fields. Note that length-to-width ratios are aspect ratios defined with respect to the preferred orientation. However, we reserve the term aspect ratios to refer only to those related to the elliptic elongation of subunit envelopes and complex cell receptive fields (see following text). There was a strong correlation between the two ratios (r = 0.81). The two distributions also had indistinguishable medians (1.12 and 1.04 for the length-to-width ratios of subunits and receptive fields, respectively, as indicated by arrows; P > 0.05, Wilcoxon signed-rank test). These results appear to indicate that the shapes of subunits are reflected directly in those of the overall receptive fields of complex cells.
With respect to length-to-width ratios, do simple cell receptive fields have distributions indistinguishable from those for complex cell subunits? Figure 10B shows the distribution of length-to-width ratios for the receptive fields of simple cells. The median of the distribution was 1.08. There was not a significant difference in the median values of length-to-width ratios between the receptive fields of simple cells (Fig. 10B) and the subunits of complex cells (Fig. 10A, top; P > 0.05, Mann–Whitney U test).
The direction of receptive field elongation, if any, can be different and may be independent of the preferred orientation of a neuron in general (Fig. 3). A similar distinction also applies for properties of subunits. Are complex cell receptive fields more elongated in a particular direction than their subunits? Is there any relationship between the direction and degree of subunit elongation and pooling? To address these questions, we quantified the degree of elongation by aspect ratios, which were defined by the length ratios of major and minor axes for subunit envelopes or complex cell receptive fields.
Figure 11A compares aspect ratios between the subunit envelopes and the receptive fields of complex cells. The two ratios were correlated (r = 0.68). The median value of the distribution of aspect ratios for receptive fields, 1.28, was significantly smaller than that for subunits, 1.32 (P < 0.001, Wilcoxon signed-rank test). This indicates that the receptive fields of complex cells are more circular in shape than the subunits. The reduction of aspect ratios may be caused by the spatial pooling of subunits, which will be subsequently examined in Pooling of subunits.
Figure 11B shows the distribution of aspect ratios for the receptive fields of simple cells. The median of the distribution was 1.34, which was not significantly different from that for the subunits of complex cells, 1.32 (P > 0.05, Mann–Whitney U test; see also the top histogram in Fig. 11A). Therefore both the length-to-width ratios and the aspect ratios were consistent between the receptive fields of simple and the subunits of complex cells, as a population.
Orientation of elongation
There may be a bias in the elongation of the subunit envelopes and the receptive fields of complex cells with respect to the absolute horizontal and vertical. Scatterplots in Fig. 12 show aspect ratios against the orientation of the major axis for subunit envelopes (left) and receptive fields (right). These two quantities show no systematic relationship for respective panels. The horizontal solid line is drawn at an arbitrary criterion level (aspect ratio = 1.2) to determine whether subunit envelopes and receptive fields are sufficiently elliptic for the subsequent analysis. For subunit envelopes and receptive fields with small aspect ratios (≤1.2; open symbols), their major axis cannot be determined reliably. On the other hand, those with large values (>1.2; filled symbols) were judged as sufficiently elliptic, and were counted in histograms to examine distributions of the orientation of the major axis (n = 66 for subunit envelopes, n = 56 for receptive fields). These distributions were not different from uniform distributions (P > 0.05, Kolmogorov–Smirnov test). Therefore the results show no special orientation bias in the aspect ratios for both subunits and overall receptive fields of complex cells.
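The selection and uniformity check described above can be illustrated with a short sketch. It applies the aspect-ratio criterion (>1.2) and then computes the Kolmogorov–Smirnov statistic against a uniform distribution of major-axis orientations. The 0–180° orientation range, the function names, and the example envelopes are assumptions made for illustration. Converting the statistic into a P value requires the Kolmogorov distribution and is not shown.

```python
def aspect_ratio(major_axis, minor_axis):
    """Ratio of major to minor axis length (>= 1 for a valid ellipse)."""
    return major_axis / minor_axis


def sufficiently_elliptic(ratio, criterion=1.2):
    """Apply the aspect-ratio criterion quoted in the text (> 1.2)."""
    return ratio > criterion


def ks_statistic_uniform(angles_deg, period=180.0):
    """Largest gap between the empirical CDF and a uniform CDF on [0, period)."""
    xs = sorted(a % period for a in angles_deg)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = x / period                          # uniform CDF at x
        d = max(d, (i + 1) / n - f, f - i / n)  # gaps just above and below the step
    return d


# Hypothetical envelopes: (major axis, minor axis, major-axis orientation in deg).
envelopes = [(3.0, 2.0, 9.0 + 18.0 * i) for i in range(10)]
envelopes += [(2.2, 2.0, 0.0), (2.0, 2.0, 90.0)]  # nearly circular: excluded below

selected = [angle for major, minor, angle in envelopes
            if sufficiently_elliptic(aspect_ratio(major, minor))]
d_stat = ks_statistic_uniform(selected)
print(len(selected), d_stat)
```

Envelopes with small aspect ratios are dropped before the test because their major axis cannot be determined reliably, which is the rationale given in the text for the open symbols.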
The prominent “oblique effect” is reported for the distribution of preferred orientations of simple cells (Li et al. 2003). Does the orientation of receptive field envelopes of simple cells exhibit any biases? Figure 13 shows a relationship between the orientation of the major axis for the receptive field envelopes for simple cells and their optimal orientation. These values were measured counterclockwise from the horizontal axis in the visual scene (0°). Based on the same criterion as that used in Fig. 12, the receptive field envelopes were classified as elongated (aspect ratio >1.2; filled symbols; n = 114) or circular (aspect ratio ≤1.2; open symbols; n = 38). Simple cells that prefer horizontal orientation (0 and 180°) appear less likely to have vertically elongated receptive field envelopes (90°). To investigate the biased elongation of receptive field envelopes regardless of optimal orientation, the number of simple cells was counted in the top histogram (for cells with sufficiently elongated envelopes). Its striking U shape demonstrates a general tendency in which there were more simple cells with horizontally elongated envelopes than those elongated vertically (P < 0.05, Kolmogorov–Smirnov test for uniform distribution).
Because optimal orientation seems related to the elongation angle of receptive field envelopes for simple cells, length-to-width ratios are evaluated in terms of optimal orientation in Fig. 14A. Circles, triangles, and square symbols denote neurons that prefer low, middle, and high spatial frequency, respectively. Most of the cells that prefer horizontal orientation (0 and 180°) appear to have length-to-width ratios >1. This means that these cells tended to be elongated along the length, or the horizontal, axis, which is consistent with the biased distribution in Fig. 13. A solid curve and error bars indicate the geometric mean and SD values of the length-to-width ratios for simple cells grouped based on the optimal orientation in 30° steps. Variances of the length-to-width ratios were significantly different across these groups (P < 0.05, Bartlett test). The receptive fields of simple cells pointed to by arrows in Fig. 14A are illustrated as representative examples in Fig. 14, C–J.
Optimal spatial frequency is also related to length-to-width ratios (Fig. 14B). Circles, triangles, and square symbols denote simple cells that prefer horizontal, oblique, and vertical orientations, respectively. Neurons that prefer low spatial frequency tended to exhibit low length-to-width ratios (i.e., elongated along the width axis), whereas those that prefer high spatial frequency were elongated along the length axis in general. A positive correlation is evident between optimal spatial frequency and length-to-width ratios (r = 0.31). The median values of the length-to-width ratios were significantly different across different spatial frequency groups (P < 0.01, Kruskal–Wallis test).
Pooling of subunits
To evaluate the degree of the spatial pooling of subunits that make up complex cells, a pooling ratio in terms of area was calculated for each neuron as

$$\text{areal pooling ratio} = \frac{\text{receptive field area}}{\text{subunit envelope area}}$$

The areas were those enclosed within the 5% contours defined previously (see Figs. 4–7). To achieve response invariance to contrast polarity, the simplest version of the energy model for complex cells posits a minimal number of linear filters, which forms a quadrature pair, and their outputs are squared and combined. In this model, there is no spatial pooling as defined in Fig. 1C because the subunit size matches the receptive field size of a complex cell. Therefore the areal pooling ratio is predicted to be ≈1. On the other hand, much larger values for the areal pooling ratio are expected when subunits are pooled spatially. The histogram of Fig. 15A shows the distribution of the areal pooling ratios observed for 86 complex cells. The median of areal pooling ratios was 1.21. In length or width terms, this amounts to a 10% difference on average. These values indicate that, if anything, the spatial extents of receptive fields of complex cells were only slightly larger than those of the subunits. For each complex cell, a bootstrap test was performed to assess whether the areal pooling ratio was significantly different from one. For 32 of 86 cells (37%), the areal pooling ratio was significantly >1 (P < 0.05, resampling; black bars).
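The step from an areal ratio to a difference in length or width terms is a square root, because area scales with the square of linear size. A minimal sketch of that arithmetic follows. The function names are hypothetical, and the ratio is taken as receptive field area divided by subunit envelope area, in line with the description above.

```python
import math


def areal_pooling_ratio(rf_area, subunit_area):
    """Receptive field area divided by subunit envelope area (same units)."""
    return rf_area / subunit_area


def linear_equivalent(areal_ratio):
    """Ratio of linear sizes implied by an areal ratio, assuming similar shapes."""
    return math.sqrt(areal_ratio)


# Median areal pooling ratio reported in the text.
linear = linear_equivalent(1.21)
print(round(linear, 3))  # about 1.1, i.e., a 10% difference in length or width
```

Under the minimalist energy model the areal ratio is about 1, so the linear equivalent is also about 1.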
Deep layers are thought to have neurons with large receptive fields (Gilbert 1977). Thus the laminar dependence of the areal pooling ratios was examined for complex cells having areal pooling ratios >1.5. Based on limited laminar analyses for tracks for which histological determinations of layers were carried out with confidence (29% of cells), cells with large areal pooling ratios were present in both deep layers and supragranular layers. For example, the cell with a circular subunit profile and the largest areal pooling ratio (4.73; Fig. 6) among our samples was recorded at the cortical depth of 180 μm, which must be in layer 2. This neuron was not likely to be a special complex cell because it had a receptive field size of <2 deg and had almost no spontaneous discharge. Unfortunately, therefore, we did not have a neuron in our sample that was positively identified as a special complex cell (Gilbert 1977). A complete picture of the laminar distribution of cells with large pooling ratios requires further investigation.
It is possible that our analyses might underestimate the receptive field sizes of complex cells, if weak subregions of subunits were to be missed in noisy maps. This could lead to incorrect estimation of the degree of spatial pooling. If this were the case, quality of data would have influenced the number of subregions in subunits and the areal pooling ratios. Thus a signal-to-noise ratio was calculated for each subunit. Signal was taken as the amplitude parameter of the Gaussian function fitted to the subunit envelope. Noise level was estimated as the SD of amplitudes of envelopes that were obtained by correlating the spike sequences we actually recorded with stimulus noise patterns that we did not present but that had otherwise identical statistical properties. Examination of the signal-to-noise ratios of subunits revealed no systematic relationships with the number of subregions in subunits or the areal pooling ratios.
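The signal-to-noise ratio described above reduces to a fitted amplitude divided by the SD of amplitudes from control maps. The sketch below shows only that final step. The function name and example numbers are hypothetical, and the use of the sample SD (n − 1 denominator) is an assumption, because the text does not specify the estimator.

```python
import statistics


def signal_to_noise(signal_amplitude, control_amplitudes):
    """Fitted envelope amplitude divided by the SD of control-map amplitudes.

    control_amplitudes: envelope amplitudes obtained by correlating the recorded
    spikes with noise patterns that were never presented (computed elsewhere).
    """
    noise_sd = statistics.stdev(control_amplitudes)  # sample SD (assumed estimator)
    return signal_amplitude / noise_sd


# Hypothetical numbers for illustration only.
snr = signal_to_noise(5.0, [-1.0, 0.0, 1.0])
print(snr)
```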
Figure 15, B and C compares the sizes of complex cell subunits, receptive fields, and simple cell receptive fields. By using a hand-plotting mapping protocol for the minimum response fields, Hubel and Wiesel (1962) reported that the receptive fields of complex cells were, on average, threefold larger than those of simple cells. The median of the receptive field areas of simple cells we recorded was 2.42 deg² and those of the subunit areas and the receptive field areas of complex cells were 1.93 and 2.89 deg², respectively. We found that the spatial extents of the receptive fields of simple cells were statistically indistinguishable from those of the subunits and the receptive fields of complex cells (P > 0.05, Mann–Whitney U test).
Is there any particular axis along which pooling of subunits tends to occur? As shown in Fig. 11A, aspect ratios are smaller for the receptive fields of complex cells than for the subunits. The decrease in aspect ratios can be readily explained if subunits are pooled more extensively along the minor axis of their envelopes. To examine the possible radial anisotropy of spatial pooling, the degree of pooling was evaluated directionally along radial axes that were at various angles with the major axis of the subunit envelope, as illustrated in Fig. 16C. Specifically, the sizes were measured along each radial axis for the subunit envelope and the receptive field. The directional pooling ratio was defined as the receptive field size divided by the subunit envelope size. Directions were then searched for which the directional pooling ratios are maximal and minimal.
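A minimal sketch of the directional analysis is given below. It assumes that both the subunit envelope and the receptive field are described by concentric elliptical contours and that "size" along a radial axis is the radius of the contour in that direction. These assumptions, the function names, and the example shapes are illustrative and are not taken from the Methods.

```python
import math


def ellipse_radius(semi_major, semi_minor, axis_deg, direction_deg):
    """Radius of an ellipse along a direction (angles in deg, same reference)."""
    delta = math.radians(direction_deg - axis_deg)
    return (semi_major * semi_minor) / math.hypot(
        semi_minor * math.cos(delta), semi_major * math.sin(delta))


def directional_pooling_ratios(subunit, rf, step_deg=15):
    """Map angle-from-subunit-major-axis -> receptive field size / subunit size.

    subunit, rf: (semi_major, semi_minor, major_axis_orientation_deg)
    """
    ratios = {}
    for angle in range(-90, 90, step_deg):   # +90 is the same axis as -90
        direction = subunit[2] + angle
        ratios[angle] = (ellipse_radius(*rf, direction)
                         / ellipse_radius(*subunit, direction))
    return ratios


# Hypothetical cell: elongated subunit envelope, circular receptive field.
ratios = directional_pooling_ratios(subunit=(2.0, 1.0, 0.0), rf=(2.0, 2.0, 0.0))
max_angle = max(ratios, key=ratios.get)
min_angle = min(ratios, key=ratios.get)
print(max_angle, min_angle)
```

In this constructed example the receptive field is circular, so the directional pooling ratio is largest orthogonal to the subunit major axis and smallest along it.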
Figure 16A shows distributions of the directional angle at which the maximal directional pooling occurs (left) and those for the minimal pooling (right). Directions for maximal and minimal pooling are not necessarily orthogonal to each other. The subunit envelopes of complex cells were grouped as sufficiently elliptic (aspect ratio >1.2; filled symbols) or not (open symbols) in the scatterplots. Only complex cells with sufficiently elliptic subunit envelopes were accumulated in the histograms above. For neurons for which the areal pooling ratio was significantly >1 (P < 0.05, resampling), the data are represented by black circles in the scatterplots and by black bars in the histograms. For cells without pooling, the data are shown by gray circles and by gray bars. Remarkably, the directional angle is clustered around 90 and −90° for the maximal directional pooling ratios and 0° for the minimal pooling ratios. This means that pooling tended to occur along the direction for which the subunits were skinnier. Therefore the result is consistent with the earlier hypothesis for reconciling the discrepancy of aspect ratios between subunit envelopes and receptive fields. Distributions of the directional angle for both the maximal and the minimal directional pooling ratios were significantly different from a uniform distribution (P < 0.001, Kolmogorov–Smirnov test for uniform distribution).
Figure 16B depicts the average ± SE values of the directional pooling ratios as a function of the angle from the subunit major axis. The difference in the pooling ratios across different directional angles was statistically significant (P < 0.001, Friedman test).
Subunit organizations of complex cells underlying contrast-sign invariance
How do complex cells acquire response invariance to the sign or polarity of contrast of visual stimuli? The minimalist scheme, often used in modeling, for response invariance to contrast polarity is a full-wave rectification of the outputs of two linear filters that differ in spatial phase by 90° (Adelson and Bergen 1985; Fleet et al. 1996). In reality, biological substrates for the linear filters feeding into a complex cell are selective for the contrast polarity and their outputs are half-wave rectified. Functionally, linear filters must thus be organized as a push–pull pair (Ohzawa et al. 1990; Pollen et al. 1989; Troyer et al. 1998). In theory, members of such push–pull pairs should have receptive fields that are inverted versions of each other, as assumed by various energy model schemes (Adelson and Bergen 1985; Emerson et al. 1992; Ohzawa et al. 1990). However, there is no guarantee that this is actually the case for real complex cells. Therefore we will examine positive and negative halves of subunit organizations by sorting separately the second-order response maps according to the contrast polarity of reference stimuli (Emerson et al. 1987; Livingstone and Conway 2003).
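The equivalence between the minimalist scheme and a push–pull arrangement of half-wave rectified filters can be checked numerically. The one-dimensional sketch below uses Gabor filters, a point-like bar stimulus, and half-squaring units. All of these choices, and the `pull_gain` parameter used to mimic an imbalance between the members of a pair, are assumptions made for illustration.

```python
import math


def gabor(x, sigma=1.0, freq=0.5, phase_deg=0.0):
    """1D Gabor profile; phase 0 and 90 deg form a quadrature pair."""
    envelope = math.exp(-x * x / (2.0 * sigma * sigma))
    return envelope * math.cos(2.0 * math.pi * freq * x - math.radians(phase_deg))


def half_square(value):
    """Half-wave rectification followed by squaring."""
    return max(0.0, value) ** 2


def energy_response(x_bar, contrast):
    """Minimalist scheme: square and sum the outputs of a quadrature pair."""
    even = contrast * gabor(x_bar, phase_deg=0.0)
    odd = contrast * gabor(x_bar, phase_deg=90.0)
    return even ** 2 + odd ** 2


def push_pull_response(x_bar, contrast, pull_gain=1.0):
    """Four half-squaring units: each filter plus its sign-inverted partner."""
    total = 0.0
    for phase in (0.0, 90.0):
        drive = contrast * gabor(x_bar, phase_deg=phase)
        total += half_square(drive) + half_square(-pull_gain * drive)
    return total


bright = push_pull_response(0.3, +1.0)                   # matched pair, bright bar
dark = push_pull_response(0.3, -1.0)                     # matched pair, dark bar
bright_unbalanced = push_pull_response(0.3, +1.0, 0.5)   # weaker inverted member
dark_unbalanced = push_pull_response(0.3, -1.0, 0.5)
print(bright, dark, bright_unbalanced, dark_unbalanced)
```

With matched members the response is identical for bright and dark bars and equals the energy response. Once the inverted member is weakened, the two polarities no longer give the same response.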
Together with second-order interaction maps, Fig. 17 depicts bright-minus-dark maps for dark- and bright-reference stimuli separately for four complex cells. These maps in a given row were obtained at the identical location at which the strongest second-order interactions were observed for each cell. Figure 17, left shows a bright-minus-dark map when a dark stimulus was presented as the reference and Fig. 17, center shows a bright-minus-dark map when a bright stimulus was presented as the reference. The gray scale is matched between the two maps. As described in Methods, subtraction of these bright-minus-dark maps yielded a second-order interaction map as shown in Fig. 17, right. [Multiplication by a negative (dark) reference corresponds to subtraction.] The reference location is denoted by the crosshair in each map. Different biological substrates are recruited by visual stimuli having opposite contrast polarities as the reference and, in ideal complex cells, the two bright-minus-dark maps should be inverted versions of each other. This appears to be the case for cells illustrated in Fig. 17, A and B.
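The relation between the two bright-minus-dark maps and the second-order interaction map can be sketched as an element-wise subtraction, with the dark-reference map entering with a negative sign. Any normalization is omitted here, and the function name and the toy one-row maps are hypothetical.

```python
def second_order_map(bmd_bright_ref, bmd_dark_ref):
    """Bright-reference map minus dark-reference map, element by element."""
    return [[b - d for b, d in zip(row_b, row_d)]
            for row_b, row_d in zip(bmd_bright_ref, bmd_dark_ref)]


# Toy maps for an "ideal" cell: the dark-reference map is the inverted
# version of the bright-reference map.
bright_ref = [[0.0, -1.0, 2.0, -1.0, 0.0]]
dark_ref = [[-v for v in row] for row in bright_ref]

interaction = second_order_map(bright_ref, dark_ref)
print(interaction)
```

For such an ideal pair the interaction map is simply twice the bright-reference map, so any departure from that relation reflects differences between the two halves.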
However, in general, the two bright-minus-dark maps could not be related by simply inverting the sign for other complex cells. For a cell shown in Fig. 17C, the center positions of the envelopes are different between the bright-minus-dark maps. Although these maps cannot be equated to the receptive fields of simple cells, it is tempting to consider that these responses are contributed by separate populations of antecedent simple cells for which receptive fields are displaced along the width direction (e.g., Rust et al. 2005). Another cell depicted in Fig. 17D shows imbalance in amplitude as well as the displacement of the center positions between the two bright-minus-dark maps.
All four complex cells presented here exhibit even-symmetric second-order interaction maps (Fig. 17, right column) as reported previously (Emerson et al. 1987; Gaska et al. 1994; Livingstone and Conway 2003). However, a distinction must be made between our second-order maps and those of previous studies. For previous results, due to simple methodological reasons, the interaction maps were guaranteed to be even-symmetric because the maps were averaged for all reference positions. Because each of our maps is for a single reference position, there is no such methodological basis for the symmetry. Therefore the even symmetry in our maps represents the true nature of the two-stimulus interaction sampled near the center of the complex cell receptive fields. In fact, examinations of interaction maps near the fringes of complex cell receptive fields often revealed asymmetric interaction maps (see maps in the fifth row in Fig. 4A).
In contrast to the second-order interaction maps, however, internal organizations of subunits exhibited various response patterns to dark- and bright-reference stimuli. The bright-minus-dark maps can show different center positions and unequal strength between dark and bright references.
The minimalist energy model predicts that, when the reference is located at the center of a complex cell receptive field, the bright-minus-dark maps for dark and bright references have 1) centers at the reference location and 2) identical amplitude. These predictions are obviously violated in some complex cells as shown in Fig. 17, C and D. Thus we sought to evaluate deviations from the model predictions for a population of complex cells actually recorded. First, the envelope for a bright-minus-dark map was computed by partial Hilbert transform. Then, a Gaussian function was fitted to the envelope, and the parameters for center position and amplitude were examined. The center position was measured from the reference location and converted into spatial phase angle with respect to the optimal spatial frequency when the center was projected onto the width axis (the axis orthogonal to the optimal orientation for the subunit).
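The conversion of a center position into a spatial phase angle can be sketched as follows. The offset of the fitted center from the reference location is projected onto the width axis and multiplied by the optimal spatial frequency and 360°. The sign convention for the width axis (optimal orientation minus 90°), the function name, and the example values are assumptions for illustration.

```python
import math


def center_phase_deg(dx_deg, dy_deg, optimal_orientation_deg, sf_opt_cpd):
    """Phase angle (deg) of an envelope center relative to the reference.

    dx_deg, dy_deg          : center offset from the reference location (deg)
    optimal_orientation_deg : subunit optimal orientation, counterclockwise
                              from horizontal
    sf_opt_cpd              : optimal spatial frequency (cycles/deg)
    """
    theta = math.radians(optimal_orientation_deg)
    # Unit vector of the width axis, taken as orientation minus 90 deg (assumed sign).
    wx, wy = math.sin(theta), -math.cos(theta)
    displacement = dx_deg * wx + dy_deg * wy   # projection onto the width axis
    return 360.0 * sf_opt_cpd * displacement   # not wrapped to +/-180 deg


# Hypothetical example: vertical optimal orientation, center 0.5 deg to the right.
phase = center_phase_deg(0.5, 0.0, 90.0, 0.5)
print(phase)
```

The value is deliberately left unwrapped because it represents a spatial offset rather than a cyclic phase.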
Figure 18 illustrates the distributions of center position and amplitude for the envelopes of bright-minus-dark maps for dark (left) and bright (right) references. Neurons shown in Fig. 17 are indicated by the corresponding labels. The center position expressed as a phase angle represents a spatial offset—thus it was not artificially wrapped into a range from −180 to 180°. However, the actual data points were scattered within the range. The amplitude was normalized by the SD of the amplitude values on the edges of the bright-minus-dark map envelope, which is considered to represent the noise level. Therefore the normalized amplitude may be thought of as a signal-to-noise ratio. Both of the distributions show an inverted-V shape centered about zero, indicating that any map with high amplitude tended to have its center at the reference location. This is consistent with the minimalist energy model. However, the center position could be distant from the reference location as the amplitude decreased.
It is still possible that centers are at an identical position between the dark- and bright-reference maps if their centers are shifted in the same direction by the same amount. Figure 19 shows the ratio of amplitudes against the difference in center positions between the two maps. For this analysis, 78 complex cells were selected for which the normalized amplitude exceeded three for both dark- and bright-reference maps. As noted earlier, the minimalist energy model predicts no difference in center positions and identical amplitudes between the two maps, as marked by the crosshairs. In contrast, actual complex cells exhibited a variety of combinations between these two parameters. The two bright-minus-dark map envelopes differ in center positions by 48.2 ± 41.7° (mean ± SD) and the stronger of the bright-minus-dark maps had an amplitude 1.53 ± 1.38-fold larger than that of the weaker one (mean ± SD). Therefore although the minimalist energy model is generally capable of explaining various properties of complex cells, the subunits constituting actual complex cells do not necessarily follow the “ideal” scheme, e.g., often showing amplitude differences >50%.
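The two quantities compared in this analysis can be sketched per cell as follows. The selection threshold (normalized amplitude above three for both maps) follows the text. Treating the center difference as an absolute value, as well as the function name and the example numbers, are assumptions.

```python
def compare_reference_maps(dark, bright, min_amplitude=3.0):
    """Compare the dark- and bright-reference map envelopes of one cell.

    dark, bright: (center_phase_deg, normalized_amplitude)
    Returns (center_difference_deg, amplitude_ratio), or None if either map
    fails the amplitude criterion.
    """
    if dark[1] <= min_amplitude or bright[1] <= min_amplitude:
        return None
    center_difference = abs(dark[0] - bright[0])   # absolute difference (assumed)
    amplitude_ratio = max(dark[1], bright[1]) / min(dark[1], bright[1])
    return center_difference, amplitude_ratio


# Hypothetical cells for illustration.
ideal = compare_reference_maps((0.0, 5.0), (0.0, 5.0))       # model prediction
displaced = compare_reference_maps((-20.0, 6.0), (30.0, 4.0))
excluded = compare_reference_maps((10.0, 2.0), (0.0, 8.0))
print(ideal, displaced, excluded)
```

The minimalist energy model corresponds to the point (0°, 1), so the distance of a cell from that point summarizes how far its subunit organization departs from the ideal scheme.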
We have studied the internal spatial organization of complex cells in the early visual cortex with the use of the second-order interaction analysis. Previous studies examined the spatiotemporal characteristics of the second-order interaction maps of complex cells in an attempt to account for selectivity to orientation, direction, and spatial and temporal frequencies (Emerson et al. 1987; Gaska et al. 1994; Livingstone and Conway 2003; Movshon et al. 1978b). Instead of focusing on tuning for these stimulus parameters, we have examined the spatial and structural relationships between the receptive field of a complex cell and their subunits as revealed through the second-order interaction maps. The results are summarized in Table 1.
Comparison of spatial properties between subunit envelopes and receptive fields for complex cells
A notion often held regarding complex cells is that their receptive fields tend to be large compared with simple cells, occupying an area threefold as large on average (Hubel and Wiesel 1962). A prediction based on this idea is that a large number of simple cells must be combined hierarchically for constructing a complex cell receptive field. To our surprise, in general this initial expectation was not fulfilled. When examined quantitatively, the sizes of receptive fields of complex cells on average were only slightly larger (1.21-fold in area) than those of their subunit envelopes (Fig. 15A). The sizes of receptive fields for simple cells were also comparable to those of subunits for complex cells. Therefore the casually held notion is not correct for typical complex cells.
What could be the sources of the discrepancy between Hubel and Wiesel's result and ours? Sampling biases between the cell types are unlikely to be the cause because our simple and complex cells were from the same set of electrode penetrations. Although individual tracks varied in their eccentricity and ratio of cell types, these factors should not influence the overall distributions for the entire population of cells. Furthermore, we have obtained about 10 pairs of simultaneously recorded simple and complex cells. The receptive field sizes of simple and complex cells within a given pair were always closely similar. Essentially the same results, obtained for the relationships between the complex cells and their subunits, and those between the simple and complex cells, also strengthen our results.
It is possible that the “minimum response field” plotting (Bishop and Henry 1972) used in typical hand mapping of receptive fields may miss weak subregions of simple cells, thereby underestimating the receptive field size. This problem would be more acute if only a bright bar is used in mapping (which is true for nearly all early studies including those by Hubel and Wiesel), because a dark-excitatory subregion of a simple cell may not generate any off response if the temporal response is monophasic (Fig. 5B of DeAngelis et al. 1995). In such cases, even a very strong dark-excitatory subregion would be missed, leading to an underestimation of receptive field area by a factor of ≥2. On the other hand, however, complex cells are not likely to be affected by this problem because, by definition, on and off subregions are overlapping for these cells and the use of a bright bar only is not an obstacle for correctly estimating the receptive field size.
Spatial pooling of subunits
In theory, if there is a substantial degree of spatial pooling, the second-order interaction for a particular reference location can be limited to a portion of the receptive field and does not necessarily cover the entire spatial extent of the receptive field of a complex cell. However, in the cat early visual cortex, we have demonstrated that subunits obtained by the second-order interaction analysis are pooled spatially to only a small degree for most cells. In other cortical areas, such a pooling does take place. For example, Pack et al. (2006) measured second-order interaction maps for middle temporal (MT) neurons in the spatiotemporal domain, to study the direction selectivity, and found that the spatial extent of two-stimulus interaction is limited to a small portion in the receptive field. Because neurons projecting from the primary visual cortex to MT are predominantly direction-selective complex cells (Movshon and Newsome 1996), complex cells having the receptive fields in various positions might serve as subunits that are pooled spatially to constitute the receptive field of MT cells. Note that our results do not prove or disprove pooling of simple cells in making up spatially larger subunits. This is because summation of multiple linear receptive fields cannot be distinguished from a single large linear receptive field.
Subunit organizations of complex cells underlying contrast-sign invariance
The seminal characteristic of complex cells is that on and off responses may be elicited everywhere within the receptive field (Hubel and Wiesel 1959, 1962). Such insensitivity to the contrast polarity is thought to be achieved by combining the outputs of multiple simple cells that are sensitive to the contrast sign. Internal structures of these complex cells have been studied with various nonlinear systems analysis techniques, most of which have assumed symmetrical structures for the contrast sign (e.g., Marmarelis and Marmarelis 1978). However, actual complex cells do not necessarily respond identically to dark and bright stimuli (e.g., Fig. 1B of Ohzawa et al. 1990). Therefore we have examined second-order interaction maps separately for the bright- and dark-reference stimuli. This analysis demonstrated that, contrary to the energy model assumption, the two maps could not be related to each other simply by inverting signs. The unmatched positions of bright-minus-dark maps could account for the fact that complex cell subunits have more subregions than simple cells. Together with the position displacement, the imbalanced amplitudes between these maps might contribute to produce “imperfect” complex cells, which exhibit a minor degree of response modulation (e.g., 0.2 < F1/F0 ratio < 0.8) when a drifting sinusoidal grating is presented.
Relationship to other studies on internal organizations of complex cell receptive fields
Second-order interaction maps represent how two visual stimuli interact to elicit the responses of complex cells. Thus they may be interpreted as one possible set of linear filters for explaining complex cell responses, although they do not necessarily correspond to biological entities such as the receptive fields of simple cells that feed into complex cells.
Several spike-triggered techniques are also suited for revealing the filter structure of complex cells. Recently developed spike-triggered covariance (STC) techniques discover multidimensional axes, or filters, relevant for explaining the variance structure of spike-triggered stimuli in the multidimensional space (Rust et al. 2005; Touryan et al. 2002). These methods rely on principal component analysis to minimize the number of filters by imposing mutually orthogonal relationships on them. Therefore a set of recovered filters can economically span the entire subspace in which neuronal responses are increased (excitatory filters) and decreased (suppressive filters). This has an advantage over second-order interaction analysis, which reveals only a single net excitatory filter for each reference position. However, filters recovered by the STC can be too large in spatial extent, in the sense that visual stimuli may not interact in the entire spatial extents of recovered filters to elicit neuronal responses and shape tuning for stimulus features (Rust et al. 2005). Therefore even though neither the map obtained from the second-order interaction analysis nor that from the STC technique represents responses of a single biological entity, the second-order interaction maps are closer to the biological reality, in that they accurately reflect the spatial limits within which linear summation of stimuli takes place.
Other promising methods examine the net excitatory and inhibitory responses of complex cells in the spatial frequency domain. With these techniques, spike-triggered stimuli are first converted to their spectra in the spatial frequency domain by the Fourier transform, and then they are averaged in the frequency domain separately for different phases (David et al. 2004) or without regard to phase (Nishimoto et al. 2006; Ringach et al. 1997). Spectral representation potentially enables suppressive responses, such as suppression at low spatial frequency and cross-orientation suppression, to be dissociated from excitatory responses (Bredfeldt and Ringach 2002; Nishimoto et al. 2006; Ringach et al. 2002). These suppressive responses are masked by excitatory responses in second-order interaction analysis because they are usually overlapped in the space domain.
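A phase-blind version of this spectral approach can be sketched in a few lines. The example below again uses a simulated energy unit rather than recorded data: each one-dimensional noise frame is converted to an amplitude spectrum by a discrete Fourier transform, spectra are averaged over spike-eliciting frames, and the average over all frames is subtracted to give a spatial frequency tuning estimate. The preferred frequency, envelope width, threshold rule, and all function names are assumptions chosen for illustration, and the sketch omits the suppressive components discussed above.

```python
import cmath
import math
import random

D = 16          # pixels per 1-D noise frame (illustrative)
N = 5000        # number of noise frames
PREFERRED = 3   # preferred spatial frequency of the model cell (cycles per frame)
rng = random.Random(1)

# Hypothetical Gaussian-windowed quadrature pair tuned to PREFERRED.
envelope = [math.exp(-((n - (D - 1) / 2.0) ** 2) / (2.0 * 3.0 ** 2)) for n in range(D)]
f_even = [e * math.cos(2.0 * math.pi * PREFERRED * n / D) for n, e in enumerate(envelope)]
f_odd = [e * math.sin(2.0 * math.pi * PREFERRED * n / D) for n, e in enumerate(envelope)]

def energy(frame):
    a = sum(f * x for f, x in zip(f_even, frame))
    b = sum(f * x for f, x in zip(f_odd, frame))
    return a * a + b * b

# Precomputed DFT kernels for frequencies 0 .. D/2.
kernels = [[cmath.exp(-2j * math.pi * k * n / D) for n in range(D)]
           for k in range(D // 2 + 1)]

def amplitude_spectrum(frame):
    """Magnitude of the DFT at each frequency; phase is discarded."""
    return [abs(sum(x * w for x, w in zip(frame, kernel))) for kernel in kernels]

def mean_spectrum(frames):
    total = [0.0] * len(kernels)
    for frame in frames:
        for k, amp in enumerate(amplitude_spectrum(frame)):
            total[k] += amp
    return [t / len(frames) for t in total]

# Simulated experiment: a frame elicits a spike when its energy is more than
# twice the mean energy across all frames (an arbitrary threshold).
frames = [[rng.gauss(0.0, 1.0) for _ in range(D)] for _ in range(N)]
energies = [energy(f) for f in frames]
threshold = 2.0 * sum(energies) / N
spike_frames = [f for f, e in zip(frames, energies) if e > threshold]

# Spike-triggered mean spectrum relative to the mean spectrum of all frames.
tuning = [s - a for s, a in zip(mean_spectrum(spike_frames), mean_spectrum(frames))]
peak_frequency = max(range(len(tuning)), key=lambda k: tuning[k])
print("spikes:", len(spike_frames), "peak frequency bin:", peak_frequency)
```

Negative entries in `tuning` would mark frequencies at which stimulus energy is under-represented before spikes. In a real cell with suppressive inputs, this is how suppression could appear separately from excitation in the frequency domain even when the two overlap in space.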
Comparison of spatial properties between complex cell subunits and simple cell receptive fields
As a population, the receptive fields of simple cells were comparable to complex cell subunits in width, length, width-to-length ratios, and aspect ratios. However, complex cell subunits contained more subregions than simple cell receptive fields. This is consistent with our finding that bright-minus-dark maps for dark- and bright-reference stimuli are often displaced, and that the underlying linear filters for complex cells therefore do not necessarily form a push–pull pair. Another notable finding is that the receptive field envelopes of simple cells tended to be elongated horizontally rather than vertically. Complex cells, on the other hand, exhibited no such bias either for the subunits or for the receptive fields. Therefore multiple simple cells could feed into a complex cell in an anisotropic manner such that the bias toward horizontal elongation of their receptive fields is weakened.
Herein we have examined spatial relationships between the subunits and receptive fields for individual complex cells. Our strategy is a logical and practical step toward understanding how the receptive fields of complex cells are built up from simple cells, as modeled by Hubel and Wiesel (1962). To obtain an unequivocal answer, future studies must compare subunit properties of complex cells to those of antecedent simple cells that are directly connected to them (Alonso and Martinez 1998).
This work was supported by Ministry of Education, Culture, Sports, Science and Technology Grant 18020017 and by the 21st Century Center of Excellence Program from the Japan Society for the Promotion of Science.
We thank laboratory members H. Tanaka, S. Nishimoto, T. Sanada, R. Kimura, M. Fukui, M. Iida, M. Arai, T. Ninomiya, T. Ishida, Y. Asada, and Y. Tabuchi for help in experiments and valuable discussions.
Copyright © 2007 by the American Physiological Society