JN Journal of Neurophysiology
J Neurophysiol 97: 2423-2438, 2007. First published January 24, 2007; doi:10.1152/jn.00713.2006
0022-3077/07 $8.00

Speed Dependence of Tuning to One-Dimensional Features in V1

Ferenc Mechler, Ifije E. Ohiorhenuan and Jonathan D. Victor

Department of Neurology and Neuroscience, Medical College of Cornell University, New York, New York

Submitted 12 July 2006; accepted in final form 15 January 2007


    ABSTRACT
 
Using drifting compound grating stimuli matched in energy and frequency spectrum, we previously showed that neurons in the primary visual cortex (V1) were tuned to line-like, edge-like, and intermediate one-dimensional features. Because these compound grating stimuli were drifting, allowing for potential interaction between shape and motion, we examine here the dependence of V1 feature tuning on drift speed. We find that the feature selectivity and specificity of individual V1 neurons strongly depend on speed. A simple model explains these observations in terms of an interaction between linear filtering by the receptive field and the static nonlinearity of spike threshold, embedded in a recurrent network. Although the speed-dependent behaviors in single V1 neurons preclude their acting as extractors of one-dimensional features, the population as a whole retains a representation of a full suite of features.


    INTRODUCTION
 
Lines and edges are salient features, and their detection and discrimination are implicated in processes fundamental to object vision, including image segmentation, contour continuation (Field et al. 1993, 2000), and completion (Kovacs and Julesz 1993). Various extrastriate visual cortical areas were previously physiologically identified as candidates to process assignment of boundary ownership, contour integration, and figure–ground segregation (for a thorough review, see, e.g., von der Heydt 2003), all of which depend on low-level local feature extraction and manipulation. The local image processing that takes place in the primary visual cortex (V1) appears to receive global context by top-down modulatory feedback from extrastriate areas that extract texture boundaries (Zipser et al. 1996) or collinear contours (Kourtzi et al. 2003; Polat et al. 1998). However, bottom-up feature processing might already begin in earnest at the earliest cortical stage of visual processing: we provided evidence in our first study of this subject (Mechler et al. 2002) that typical neurons of the primary visual cortex in the anesthetized primate already exhibit "feature tuning" to optimally oriented one-dimensional spatial profiles, including lines and edges. Although in that study we used drifting stimuli, we did not examine how those results might have depended on the drift velocity of the stimulus.

The possible dependence of feature tuning on velocity is important from several points of view. First, V1 neurons can be considered to signal the presence of these features only if they do so in a velocity-independent way. Second, psychophysical studies show various degrees of degradation of visual performance with increasing speed (Burr et al. 1986; Morgan and Castet 1995). Finally, an increasing number of neurophysiological studies suggest, contrary to previous assertions (Ungerleider and Haxby 1994) of parallel processing of shape and motion, that these two streams of scene analysis are not independent at various stages of extrastriate visual processing (Desimone and Schein 1987; Tolias et al. 2005). Our study fits in this context by seeking to elucidate the velocity dependence of how single V1 neurons and their ensembles represent the stimulus attributes that determine one-dimensional spatial features.

The view that single neurons function as feature detectors, which would imply speed invariance among other characteristics, enjoyed early but not uncontroversial popularity (Barlow 1972; Lettvin et al. 1959) and, when applied to the primary visual cortex, initially appeared to gain support from influential early experiments (Hubel and Wiesel 1962) on simple cells. However, decades of work consistently failed to turn up direct experimental evidence for the single-neuron-as-detector view in any cortical area examined. The evidence accumulated in V1, reviewed most recently by Carandini et al. (2005), instead favors the current consensus, according to which V1 neurons represent banks of variously tuned nonlinear filters that adapt to local contrast energy. The "adaptive filter" view is validated by results obtained mostly with stimuli confined to a narrow frequency band such as gratings and Gabor patches. However, salient features such as lines and edges are defined by phase coherences across a range of spatial frequencies (Morrone and Burr 1988). In fact, natural stimuli (which are natural because of, among other factors, their highly nonrandom local phase spectra) highlighted the weaknesses of the current adaptive filter model by pointing to the need for the incorporation of a pattern-selective modulatory influence (Felsen et al. 2005). In the absence of a vetted nonlinear model of sufficient accuracy (Rust and Movshon 2005; Wu et al. 2006), the sensitivity of cortical neurons to features defined by phase cannot be predicted from their sensitivity to sinusoidal gratings or Gabor patches, but rather, must be determined experimentally.

To this end, we use a family of compound gratings (whose spatial frequencies span a sevenfold range), parameterized by phase congruence (Morrone and Burr 1988). The stimuli are matched in spectrum and energy, to eliminate any confounding effects of spatiotemporal filtering on feature tuning. Using this stimulus set, we showed earlier (Mechler et al. 2002) that typical V1 neurons have nonlinearities that allow them to exhibit "feature tuning" to optimally oriented line-like, edge-like, and intermediate one-dimensional spatial profiles. Here we find that speed strongly influenced specificity and depth of feature tuning of individual neurons. These speed-induced changes in feature tuning were comparable in simple cells and complex cells. We also find that, although the feature tuning of individual V1 neurons is strongly speed dependent, the population as a whole retained a full suite of feature analyzers.

Finally, we analyze a simple model to see how well feature tuning is explained, in qualitative terms, by the known basic properties of V1. We consider a recurrent network model for V1 that was proposed to account for the range of behaviors across the simple–complex gamut observed in response to single gratings (Chance et al. 1999). In the model, feature selectivity essentially arises from the interaction between the phase-sensitive linear kernel and the static nonlinearity of the spike threshold. This "iceberg" effect can be either diluted by a phase-insensitive recurrent pooling or compounded by phase-biased recurrent pooling or inhomogeneity in the network. We show that this model accounts for several aspects of responses to compound gratings that we observed experimentally: at each speed, there is a full representation within the V1 population of the entire space of one-dimensional features; there is a comparable degree of feature tuning at different speeds and in simple and complex cells; moreover, this tuning has a comparable degree of speed dependence.

Our results are qualitatively consistent with the consensus view that V1 neurons are adapting nonlinear filters. Specifically, our experimental observations constitute direct evidence against the possibility that individual orientation-selective V1 simple cells function as detectors of oriented lines or edges. Rather, it appears V1 neurons provide an ensemble with selectivity and coding properties that depend dynamically on the stimulus.


    METHODS
 
Physiological preparation

Standard acute preparation techniques were used for electrophysiological recordings from single units in the primary visual cortex (V1) of the primate (cynomolgus monkeys, Macaca fascicularis) previously described in detail (Mechler et al. 1998, 2002). All procedures were in accordance with institutional and National Institutes of Health guidelines for the care and experimental use of animals.

In brief, extracellular recordings were made with tetrodes (quartz-coated platinum–tungsten fibers; Thomas Recording, Giessen, Germany) placed in the occipital cortex (near Horsley–Clark 14 mm posterior, 14 mm lateral) of 14 adult animals under general opiate (sufentanil) anesthesia and muscle paralysis. The analogue signal from each tetrode channel was amplified, filtered (0.6–6 kHz), and digitized (25 kHz). Multiple single units were isolated by cluster analysis of spike waveforms initially performed on-line (Autocut, DataWave Technologies) then off-line (custom software; Reich 2001). Isolation criteria included stability of principal components of spike waveforms and a 1.2-ms minimum interspike interval consistent with a physiologic refractory period. Spike times were identified to 0.1-ms precision. Recording tracks and the laminar position of recording sites were anatomically reconstructed using standard histological techniques (Mechler et al. 2002).

Visual stimulation

The pupils were dilated with topical atropine and covered with gas-permeable contact lenses (Metro Optics, Houston, TX). Artificial pupils (2 mm) and corrective lenses were used to focus the stimulus on the retina. Optical correction was optimized with the aid of responses of isolated single units to high spatial frequency visual stimuli.

Foveae and the receptive fields of isolated neurons were mapped on a tangent board. Visual stimuli were generated by a special-purpose stimulus generator (Milkman et al. 1978, 1980) under the control of a PDP-11/93 computer and displayed on a Tektronix 608 monochrome oscilloscope (green phosphor, 150 cd/m² mean luminance, 270.32 Hz frame refresh). Luminance of the display was linearized with lookup tables in the range 0 to 300 cd/m². At the 114-cm viewing distance of the animal, the stimuli appeared in a 4° circular aperture on dark background.

The receptive fields of isolated single units fell between 3 and 6° eccentricity and were always fully covered by the stimulus patch. The receptive fields were characterized in a standard way using drifting sine gratings: tuning was measured first for orientation, then for spatial frequency, and finally for temporal frequency, each parameter optimized for subsequent tuning measurements. The contrast response function was measured using the optimal sine grating. With tetrodes, simultaneous isolation of two to eight (on average, three) single units per site was routine. To keep experimental time within practical limits, receptive field characterization (i.e., finding the optimal grating) was limited to the most responsive one or two units.

In each trial of the main experiment, taken at a fixed stimulus drift velocity, each of eight compound gratings, each of the four component gratings, and one blank stimulus was presented for 4 s in a randomly interleaved sequence. Trials were rerandomized and repeated (typically 12 to 25 times) until a target signal-to-noise ratio was obtained for at least one isolated unit. The experiment was then repeated with fourfold increase in the drift speed (by changing the temporal but not the spatial fundamental frequency).

Compound gratings

Compound gratings were of near-optimal orientation and drifting in the optimal direction for the V1 neurons. As in our previous study (Mechler et al. 2002), each of our compound-grating stimuli was a superposition of the first four odd harmonics of a common fundamental, each with a contrast inversely proportional to the harmonic number. Here, a brief formal description of the stimuli follows.

Let ν denote the spatial frequency; f, the temporal frequency; and C, the Michelson contrast of the fundamental component. Thus formally, the spatiotemporal light intensity variation around its mean for the mth component grating is given by

$$W_m(x, t; \nu, f, \phi) = \frac{C}{2m-1}\cos\left[2\pi(2m-1)(\nu x - f t) + \phi\right], \quad m = 1, \ldots, 4 \tag{1}$$

and, for a compound grating, summing the above components, it is given by

$$W(x, t; \nu, f, \phi) = \sum_{m=1}^{4} W_m(x, t; \nu, f, \phi) \tag{2}$$

The parameter φ is the phase of each component grating at the origin.

CONGRUENCE PHASE. Across a stimulus set, with the spatial and temporal frequencies and the contrasts of the four components fixed, the phase φ was varied systematically to specify the shape of the compound waveform. With the spatial origin (x = 0) centered on the display, all component gratings share the same phase φ at the center of the display at time t = 0. If φ = 0, each component peaks at x = 0. Because they reinforce each other, they produce a line-like shape. If φ = π/2, the components’ sharpest rising parts coincide at x = 0 and, reinforcing each other, produce an edge-like shape, as expected because they constitute the truncated Fourier approximation of a square wave. Following Morrone and Burr (1988), we therefore designate φ the "congruence phase" of the compound grating.

The feature space, defined by the congruence phase, is periodic in π. Because compound gratings are sums of only odd harmonics, two stimuli whose congruence phases differ by π have identical spatial waveforms save for a half-cycle shift, which makes them equivalent as periodic stimuli. As shown in Fig. 1, we sampled the congruence phase in eight equal steps on the [0, π) phase interval to construct eight different rigidly drifting compound waveforms.
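The construction of the stimulus set can be sketched numerically. The following is a minimal illustration, not the stimulus-generator code used in the experiments; it assumes the cosine form with an additive phase term for each component, and the function name `compound_grating` and the sampling grid are hypothetical. It builds the eight waveforms and checks the half-cycle-shift property described above.

```python
import numpy as np

def compound_grating(x, t, phi, nu=0.25, f=0.78, C=0.5, n_components=4):
    """Sum of the first odd harmonics with amplitude C/(2m-1) and common phase phi.

    Assumed form of each component: (C/h) * cos(2*pi*h*(nu*x - f*t) + phi),
    where h = 2m - 1 is the harmonic number.
    """
    x = np.asarray(x, dtype=float)
    w = np.zeros_like(x)
    for m in range(1, n_components + 1):
        h = 2 * m - 1  # harmonic numbers 1, 3, 5, 7
        w += (C / h) * np.cos(2 * np.pi * h * (nu * x - f * t) + phi)
    return w

# One spatial period (4 deg at nu = 0.25 c/deg), sampled on a regular grid.
period = 1 / 0.25
x = np.linspace(0.0, period, 512, endpoint=False)

# Eight congruence phases in equal steps on [0, pi).
phases = np.arange(8) * np.pi / 8
waveforms = np.array([compound_grating(x, 0.0, p) for p in phases])

# Shifting the congruence phase by pi equals a half-cycle spatial shift.
line = compound_grating(x, 0.0, 0.0)
line_shifted_phase = compound_grating(x, 0.0, np.pi)
half_cycle = np.roll(line, len(x) // 2)
print(np.allclose(line_shifted_phase, half_cycle))
```

The check relies only on the components being odd harmonics, so it holds regardless of the sign convention chosen for the phase term.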


FIG. 1. Thick curves: one spatial period of the 8 equal-energy compound luminance gratings used in our experiments. Each stimulus had the same set of 4 sinusoidal components (thin lines), the first 4 nonzero components of an edge (f through 7f), but in different phase combinations. Identical relative phase of the components at the location of the spatial feature (indicated by vertical dotted lines for each compound grating) is called the congruence phase φ. We sampled φ in 8 equal steps counterclockwise around the phase circle [0, π). Spatial waveform of the compound gratings varies smoothly with φ, from line-like (φ = 0) through edge-like (φ = π/2) back to line-like (φ = π) through intermediate transient waveforms. Line-like waveform obtained with φ = π (not shown) is a half-cycle–shifted version of the waveform obtained with φ = 0. This is a consequence of a general property of these stimuli: shifting the congruence phase by π is equivalent to a half-cycle shift in the compound waveforms. Because all stimuli were presented as drifting waveforms, stimuli on the [π, 2π) phase interval duplicate those in the [0, π) phase interval.

 
EQUAL ENERGY. The compound gratings thus constructed constitute a set of equal-energy stimuli because the amplitudes of the components were the same for each stimulus. The root-mean-square contrast was 0.38 for each compound grating, corresponding to C = 0.5 in Eq. 2. Note that the Michelson contrast varies with the congruence phase ~ |cos (φ)|, with the maximum (0.84) realized by the line and the minimum (0.47) by the edge. The reader is referred to the preceding paper (Mechler et al. 2002) for a fuller discussion of the mathematical properties of these compound gratings.
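The contrast figures quoted above can be checked with a short calculation. This is a sketch under stated assumptions: the waveform is taken as a sum of cosines with amplitudes C/(2m-1) and a common additive phase, luminance is expressed relative to the mean, and the helper names (`waveform`, `rms_contrast`, `michelson_contrast`) are hypothetical.

```python
import numpy as np

C = 0.5                                # contrast of the fundamental
harmonics = np.array([1, 3, 5, 7])     # first four odd harmonics
theta = np.linspace(0.0, 2 * np.pi, 4096, endpoint=False)  # one spatial cycle

def waveform(phi):
    """Luminance modulation around the mean, in units of the mean luminance."""
    return sum((C / h) * np.cos(h * theta + phi) for h in harmonics)

def rms_contrast(w):
    return np.sqrt(np.mean(w ** 2))

def michelson_contrast(w):
    # Luminance is mean * (1 + w); the mean cancels in (Lmax - Lmin)/(Lmax + Lmin).
    return (w.max() - w.min()) / (2.0 + w.max() + w.min())

phases = np.arange(8) * np.pi / 8
rms = np.array([rms_contrast(waveform(p)) for p in phases])
mich = np.array([michelson_contrast(waveform(p)) for p in phases])

for p, r, m in zip(phases, rms, mich):
    print(f"phi = {p:.3f} rad  RMS = {r:.3f}  Michelson = {m:.3f}")
```

The RMS contrast is the same for every congruence phase because it depends only on the component amplitudes, whereas the Michelson contrast depends on how the components align.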

DRIFT VELOCITY. Two drift velocities were used to determine how stimulus speed interacted with a neuron's sensitivity to congruence phase. Drift velocity, V = f/ν, was changed from V = 3.1 deg/s ("low" speed) to V = 12.4 deg/s ("high" speed). This was done by increasing the temporal frequency of each component grating fourfold while keeping their spatial frequency fixed (the fundamental was at ν = 0.25 c/deg). The specific temporal frequencies used for the fundamental and the higher harmonics were (values in Hz) f = 0.78, 3f = 2.34, 5f = 3.90, and 7f = 5.46 at low speed; and f = 3.12, 3f = 9.36, 5f = 15.6, and 7f = 21.84 at high speed. Because all recordings were at approximately the same eccentricity, this choice allowed all four components of the compound grating to be within the spatiotemporal pass-band of each cell at the "low" speed. A "data set" denotes recordings of responses of one cell to the eight compound gratings at a single drift velocity.
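The relation between the component frequencies and the drift velocity is simple arithmetic, sketched below. The variable and function names are illustrative only. Because each harmonic scales the spatial and temporal frequency by the same factor, every component drifts at the same velocity V = f/ν, which is what makes the compound waveform drift rigidly.

```python
nu = 0.25                 # spatial frequency of the fundamental (c/deg)
harmonics = [1, 3, 5, 7]  # odd harmonics present in each compound grating
f_low = 0.78              # temporal frequency of the fundamental at low speed (Hz)
f_high = 4 * f_low        # fourfold increase at high speed

def component_frequencies(f_fundamental):
    """Temporal frequencies (Hz) of the four components, rounded to 2 decimals."""
    return [round(h * f_fundamental, 2) for h in harmonics]

def component_velocities(f_fundamental):
    """Drift velocity (deg/s) of each component: (h * f) / (h * nu)."""
    return [(h * f_fundamental) / (h * nu) for h in harmonics]

v_low = f_low / nu
v_high = f_high / nu

print(component_frequencies(f_low))
print(component_frequencies(f_high))
print(v_low, v_high)
```

The velocities computed from the listed frequencies are close to the rounded values quoted in the text.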

Selection and classification of neurons

The 63 cells with 100 data sets (out of a total of 226 data sets recorded in 137 cells) selected for analysis were those that 1) maintained good spike isolation throughout the experiment and 2) passed a signal-to-noise criterion in the compound-grating experiments. Signal variance was defined for each Fourier component as the squared Fourier amplitude of the trial-averaged response to each compound grating summed over all stimuli. Noise was defined as the trial-by-trial variance of the same component summed over all stimuli. The selection criterion required that the median ratio of signal over noise variance taken over the first eight Fourier components of the response be >0.3.
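The signal-to-noise criterion can be sketched as follows. This is an illustration under assumptions rather than the analysis code used in the study: responses are taken to be binned single-trial histograms over one stimulus cycle, "the first eight Fourier components" is read as harmonics 1 through 8 (excluding the DC), and the function name `snr_passes` and the array layout are hypothetical.

```python
import numpy as np

def snr_passes(responses, n_components=8, criterion=0.3):
    """Apply the median signal-to-noise criterion to one data set.

    responses: array of shape (n_stimuli, n_trials, n_bins), each row holding
    the binned response over one stimulus cycle.
    Returns (passes, ratio), where ratio has one entry per Fourier component.
    """
    n_bins = responses.shape[-1]
    spectrum = np.fft.rfft(responses, axis=-1) / n_bins
    comps = spectrum[..., 1:n_components + 1]       # (stimulus, trial, component)

    # Signal: squared amplitude of the trial-averaged response, summed over stimuli.
    mean_amp = comps.mean(axis=1)                    # (stimulus, component)
    signal = np.sum(np.abs(mean_amp) ** 2, axis=0)

    # Noise: trial-by-trial variance of the same component, summed over stimuli.
    deviations = comps - mean_amp[:, None, :]
    noise = np.sum(np.mean(np.abs(deviations) ** 2, axis=1), axis=0)

    ratio = signal / noise
    return bool(np.median(ratio) > criterion), ratio
```

Taking the median over components makes the criterion robust to a single harmonic that happens to be unusually strong or weak.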

This data set substantially overlaps with that presented earlier (Mechler et al. 2002), but the two are not identical. The earlier paper, which focused on analyzing single-response harmonics but did not look into the influence of speed, used a different signal-to-noise criterion (it was based on a d' threshold placed on the Fourier components in comparison to the blank) and also included data sets that were obtained with stimuli of different fundamental frequencies. As a result, the 100 data sets analyzed here included 78 of the 121 presented in the earlier paper and 22 from the same pool that were not analyzed earlier.

Cell classification is based on the modulation ratio (Skottun et al. 1991). According to this convention, the fundamental (F1) of the response to a single drifting grating of near-optimal spatial parameters was compared with the DC component after subtraction of the maintained rate of firing (F0) and a cell was labeled simple if F1/F0 > 1 and complex otherwise. Accordingly, there were 24 complex and 13 simple cells in the speed-paired sample. We analyze and report dependence on the modulation ratio F1/F0 both categorically and parametrically.
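The classification rule can be illustrated with a brief sketch. It assumes the response is available as a trial-averaged histogram over exactly one grating cycle and that F1 is the peak amplitude of the fundamental; the function names are hypothetical and the amplitude convention is an assumption.

```python
import numpy as np

def modulation_ratio(psth, maintained_rate):
    """F1/F0 for a trial-averaged response over one grating cycle (impulses/s)."""
    spectrum = np.fft.rfft(psth) / len(psth)
    f0 = spectrum[0].real - maintained_rate  # DC minus the maintained firing rate
    f1 = 2 * np.abs(spectrum[1])             # peak amplitude of the fundamental
    return f1 / f0

def classify(psth, maintained_rate):
    """Label a cell simple if F1/F0 > 1 and complex otherwise."""
    return "simple" if modulation_ratio(psth, maintained_rate) > 1 else "complex"

theta = np.linspace(0, 2 * np.pi, 256, endpoint=False)
half_rectified = 40 * np.maximum(0, np.sin(theta))  # strongly modulated response
elevated = 20 + 5 * np.cos(theta)                   # weakly modulated response
print(classify(half_rectified, 0.0), classify(elevated, 0.0))
```

A half-wave rectified sinusoid has a modulation ratio of π/2, which is why idealized simple cells fall above the criterion of 1.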

Recurrent network model

Chance et al. (1999) introduced a network model for V1 with variable recurrent gain. In response to drifting gratings, this model produces phase-modulated, simple-like responses at low gain and phase-invariant, complex-like responses at high gain. We asked whether this model could account for various aspects of the feature tuning we observed experimentally. As detailed below, only minor changes to this model were made: we changed the time constant of the feedforward impulse response and we varied the nonlinearity to include nonzero firing thresholds and half-squaring.

In this model, the continuous firing rate of the ith neuron, r_i, is instantaneously boosted by the sum of the input from its feedforward sources (I_i^ff) and those from its recurrent connections (I_i^rec) and relaxes with a time constant τ_r (set to 1 ms)

$$\tau_r \frac{dr_i}{dt} = -r_i + I_i^{\mathrm{ff}} + I_i^{\mathrm{rec}} \tag{3}$$

Note that there is no spontaneous activity in the model. The effect of including spontaneous activity would be to allow for negative thresholds, but would not alter the simulation results.

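A two-stage linear–nonlinear (LN) operator acting on the stimulus supplies the feedforward input I_i^ff

$$I_i^{\mathrm{ff}}(t) = A\left(\left[\iint G_i(x)\,H(t')\,W(x, t - t'; \nu, f, \phi)\,dx\,dt' - \theta\right]_+\right)^{n} \tag{4}$$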
Here the linear filter stage is represented by the convolution of the compound grating W(x, t; ν, f, φ) (Eq. 2), with the separable spatiotemporal kernel G_i(x)H(t). The scale factor A sets the absolute response magnitude. The nonlinear operator has two stages. The first is a static nonlinearity that consists of a threshold θ and a rectifier [x]_+ = max (0, x); the second is a power function with an exponent n ≥ 1. As an example, θ = 0 and n = 1 represent perfect half-wave rectification and θ = 0 and n = 2, half-squaring. The value of θ was chosen to be zero for some networks; for other networks, a nonzero θ was chosen such that the response of the neurons with the smallest receptive field to the fundamental component (presented alone) was half-maximal.
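The nonlinear stage described above amounts to a threshold, a rectifier, and a power law applied to the linearly filtered stimulus. A minimal sketch follows; the function name `static_nonlinearity` and the example drive values are hypothetical, and the linear filtering step is assumed to have been computed already.

```python
import numpy as np

def static_nonlinearity(drive, theta=0.0, n=1, A=1.0):
    """Threshold at theta, rectify, then raise to the power n and scale by A."""
    rectified = np.maximum(0.0, np.asarray(drive, dtype=float) - theta)
    return A * rectified ** n

drive = np.array([-1.0, 0.5, 2.0])              # example linear-filter outputs
half_wave = static_nonlinearity(drive, theta=0.0, n=1)
half_squared = static_nonlinearity(drive, theta=0.0, n=2)
thresholded = static_nonlinearity(drive, theta=1.0, n=1)
print(half_wave, half_squared, thresholded)
```

Raising the threshold passes only the largest excursions of the filtered signal, so the output becomes more selective for the stimulus phases that produce the highest peaks.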

G_i(x), the spatial filter of the ith cell, is a Gabor function

$$G_i(x) = \exp\left(-\frac{x^2}{2\sigma_i^2}\right)\cos\left(2\pi k_i x + \gamma_i\right) \tag{5}$$
whose shape is fully determined by the envelope size σ_i, the carrier (or Gabor) frequency k_i, and the carrier (or Gabor) phase under the envelope γ_i. The model included n_k = 7 spatial frequency channels, with the Gabor frequency k sampled in equal steps of 0.5 c/deg from 0.5 to 3.5 c/deg, a 3-octave range. For each Gabor frequency, the Gabor phase was evenly sampled in steps of π/32 radians from the entire [–π, π] interval (n_γ = 33). Thus the network size was N = n_k n_γ = 231.
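The parameter grid of the network can be enumerated in a few lines. This sketch follows the stated counts (7 frequencies, 33 phases, 231 units); the variable names are hypothetical. Note that 33 equally spaced samples that include both endpoints of [–π, π] are π/16 apart, so the sketch relies on the stated count rather than on the stated step size.

```python
import numpy as np

# Gabor frequencies: 0.5 to 3.5 c/deg in steps of 0.5 c/deg.
k = 0.5 * np.arange(1, 8)

# Gabor phases: 33 evenly spaced samples on [-pi, pi], both endpoints included.
gamma = np.linspace(-np.pi, np.pi, 33)

# One model unit per (frequency, phase) combination.
K, G = np.meshgrid(k, gamma, indexing="ij")
N = K.size

octave_range = np.log2(k[-1] / k[0])
print(len(k), len(gamma), N, round(octave_range, 2))
```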

If the shape of receptive field profiles were independent of their size, then σ_i would be proportional to 1/k_i. That is, the dimensionless combination σk, which measures the average number of cycles of the optimal grating "seen" by the neuron within the aperture of its receptive field, would be constant. An alternative to this picture (σ = const/k) would be that receptive field size is independent of the optimal spatial frequency, i.e., σ = const. Macaque V1 neurons apparently represent a compromise between these two possibilities. This is based on the observation of a weak negative correlation between size (σ) and optimum spatial frequency (k) (D Xing, MJ Hawken, and RM Shapley, personal communication). To endow the model with a bit of realism but keep its details simple, we implemented the compromise between constant shape and constant size by allowing two shape factors, a smaller one that held at large scales (2πσ_i k_i = 2.5, k_i ≤ 1.5 c/deg), and a slightly larger one that held for small scales (2πσ_i k_i = 2.7, k_i ≥ 2.0 c/deg). In equivalent terms, the high spatial frequency channels in this model have somewhat narrower frequency bandwidths than the low spatial frequency channels.
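The two shape factors translate into envelope sizes as sketched below. The function names are hypothetical, and the Gabor profile used here (a Gaussian envelope times a cosine carrier, with unit peak amplitude and an additive phase) is an assumed standard form, not necessarily the normalization used in the model.

```python
import numpy as np

def envelope_sigma(k):
    """Envelope size (deg) from the shape factor 2*pi*sigma*k.

    The factor is 2.5 for k <= 1.5 c/deg and 2.7 for k >= 2.0 c/deg.
    """
    k = np.asarray(k, dtype=float)
    shape_factor = np.where(k <= 1.5, 2.5, 2.7)
    return shape_factor / (2 * np.pi * k)

def gabor(x, k, gamma):
    """Assumed Gabor profile: Gaussian envelope times cosine carrier."""
    sigma = envelope_sigma(k)
    return np.exp(-np.asarray(x) ** 2 / (2 * sigma ** 2)) * np.cos(2 * np.pi * k * x + gamma)

k = 0.5 * np.arange(1, 8)   # 0.5 ... 3.5 c/deg
sigma = envelope_sigma(k)
for ki, si in zip(k, sigma):
    print(f"k = {ki:.1f} c/deg  sigma = {si:.3f} deg")
```

With these factors the envelope still shrinks as the preferred frequency rises, but the high-frequency channels contain slightly more carrier cycles under the envelope.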

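H(t), the temporal response, is a single-parameter biphasic function

$$H(t) = \alpha e^{-\alpha t}\left[\frac{(\alpha t)^5}{5!} - \frac{(\alpha t)^7}{7!}\right], \quad t \ge 0 \tag{6}$$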
scaled by the time constant α. The time constant was set identical for each unit (α = 66 s⁻¹) except as noted.

The recurrent input to each neuron is pooled from all other neurons in the entire network by a kernel defined as a difference of two Gaussians in the space of the Gabor frequencies k_i of the feedforward inputs

Formula 7(7)
This pooling kernel is shaped like a Mexican hat, with the excitatory center and inhibitory surround centered on each cell's own Gabor frequency. The characteristic widths of center and surround are identical for each unit (σ_c = 0.5 c/deg and σ_s = 1 c/deg, respectively). The bandwidth of the resulting spatial-frequency tuning curve is similar for all units because it is primarily determined together by σ_c and σ_s, and less dependent on σ_i, the width of the Gabor envelope of the feedforward input. The gain term g_i, normalized by the network size, sets the strength of the recurrent input that each neuron receives. In homogeneous-gain networks, all cells behave like ideal simple cells when g = 0, and increasingly like ideal complex cells as g → g_max, where g_max denotes the maximum gain attainable in homogeneous-gain networks. For gains g ≥ g_max, recurrent amplification makes the network unstable. Numerical values of recurrent gain are presented, even for inhomogeneous-gain ("mixed-gain") networks, as g/g_max, relative to g_max of the homogeneous-gain networks. However, in mixed-gain networks, g is not bounded by g_max. This is because the true maximum gain is a parameter that depends on other network parameters, including the distribution of gains. In particular, the true maximum gain in a mixed-gain network can be made arbitrarily large if the number of units with very high gain is kept sufficiently low, and this in turn permits some units to have gains g ≥ g_max.

Data analysis

Firing rate responses for each neuron in the network were analyzed in exactly the same way as the spiking responses collected experimentally from V1 neurons. Off-line data analysis and statistical tests were performed using Matlab (The MathWorks, Natick, MA) toolbox functions and custom software written in Matlab.


    RESULTS
 
The 100 data sets selected for analysis in this paper were obtained from 37 cells that provided data suitable for analysis at both drift velocities (74 data sets), 18 cells that provided data at the low drift velocity only (for two of which high-speed responses were measured but excluded by selection criteria), and eight cells that provided data at the high drift velocity only (for all of which low-speed responses were measured but excluded by selection criteria).

Feature tuning and its dependence on speed

Earlier we showed (Mechler et al. 2002) that V1 neurons are tuned to the congruence phase of compound gratings, and that response energy and other response measures based on harmonics beyond the DC are especially sensitive to this tuning. Here we demonstrate that in most V1 neurons, feature tuning is dependent on the drift velocity of the compound gratings.

It is tempting to analyze the responses to compound gratings in terms of the responses to their components and a nonlinear response model. However, as indicated in our earlier study (Mechler et al. 2002), the accounting for the compound-grating responses requires a highly nonlinear model; idealized rectifiers and energy mechanisms do not suffice. This is further illustrated in Fig. 2. It shows the time histograms of the responses of three representative V1 neurons to the compound gratings (arranged along the phase circle in the same way these stimuli were introduced in Fig. 1), as well as to the four component gratings presented alone (stacked in the center, as labeled in Fig. 2A). For each cell, the set of responses on the left correspond to the stimuli drifting at low speed, and the set on the right, to stimuli drifting at high speed. Other examples (not paired for speed) can be found in Mechler et al. (2002).


FIG. 2. Response histograms from 3 representative primary visual cortex (V1) neurons, recorded at different sites, obtained for stimuli drifting at low speed (v = 3.1 deg/s) on the left and a 4-fold higher speed (v = 12.4 deg/s) on the right. Responses to the compound stimuli are arranged around the phase circle exactly as the stimuli themselves in Fig. 1; those to component gratings are stacked in the center, as labeled in A. Spontaneous firing rate is indicated by the stand-alone histogram. Its vertical scale bar, common to both speeds, indicates response magnitude. Horizontal span (timescale) is 1,263 ms for the low-speed sets (left) and 316 ms for the high-speed sets (right). A: L450205.u, complex cell (F1/F0 = 0.57; vertical scale 60 impulses/s). B: L431103.s, complex cell (F1/F0 < 0.1; 65 impulses/s). C: L450101.t, borderline complex/simple cell (F1/F0 = 1; 90 impulses/s).

 
These examples, especially the complex cells (A and B), illustrate the difficulties that prevent a simple prediction of the responses to the compound stimuli from the responses to the single components. The responses to compound gratings are much more peaked than responses to the components and the magnitude of these peaks is selective for specific congruence phases. Qualitatively, simple thresholds would not account for this kind of behavior, in that peaks in the compound-grating responses occur even though all of the component-grating responses are characterized by a weakly modulated steady firing rate. As analyzed in detail in Mechler et al. (2002), such behavior is also qualitatively inconsistent with global energy models. Note also that overall gain controls cannot confer the observed response selectivity either because all compound grating stimuli are equated for power.

On the other hand, local squaring operations (Burr and Morrone 1992) can provide some feature selectivity. Additionally, the behavior of Fourier components of the response as a function of congruence phase implies the presence of high-order nonlinearities (order ≥3), for both complex and simple cells (Mechler et al. 2002). Another way to rescue a linear-static nonlinear model would be to add phase-sensitive (Felsen et al. 2005) or strongly dynamic nonlinearities. However, specific forms for such nonlinearities have not yet been proposed, so it is difficult to test models of this kind from the data of individual cells.

The example cells of Fig. 2 typify another feature of our data. They exhibit, to various degrees, a more low-pass spatial sensitivity at the (fourfold) higher temporal frequency, indicating that spatiotemporal sensitivity of these neurons is not separable in the two frequency domains. On the other hand, their spatial frequency optimum does not seem to decrease in inverse proportion to the temporal frequency change, indicating that these neurons were not exactly tuned to velocity either. Cells like these, whose sensitivity was neither separable in spatial and temporal frequency nor tuned to velocity when assayed with single gratings, were found to constitute a large fraction of cells in V1 (Priebe et al. 2006). This mixed behavior in the responses to single gratings (spatiotemporal inseparability) further complicates predictions of the responses to compound gratings when their drift velocity is varied.

In sum, a cell-by-cell approach to fitting the compound grating responses from the single-grating responses is insufficiently constrained by existing models that could conceivably work (spatiotemporally inseparable models with high-order phase-sensitive and/or dynamic nonlinearities). For this reason, our analytical approach will consist of an attempt to account for the range of behaviors across the population from a minimal network model, rather than the details of individual cells.

The first step is the extraction of indices that describe the responses to the compound gratings. Figure 3 shows the tuning to congruence phase (feature tuning) for the three cells in Fig. 2. The three illustrate the observed range of behavior and are ordered (from top to bottom) by increasing difference between the optimal phases at the two stimulus speeds. Each panel shows the response (total energy) of a single cell at low speed (open symbols) and high speed (filled symbols). Total response energy is defined as the summed squared amplitudes of the DC (after subtracting the baseline level) and the first eight Fourier components of the mean response. It is one of many alternative scalar response measures that were shown in our earlier paper to be consistent in identifying the feature optimum and comparable in their sensitivity (depth) of feature tuning.
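The total response energy defined above can be computed from a trial-averaged histogram as sketched below. This is an illustration under assumptions: the histogram is taken to span exactly one stimulus cycle, Fourier amplitudes are expressed as peak amplitudes (the factor of 2 is a convention assumed here), and the function name is hypothetical.

```python
import numpy as np

def total_response_energy(psth, baseline, n_components=8):
    """Sum of squared amplitudes of the baseline-subtracted DC and harmonics 1..n.

    psth: trial-averaged firing rate over one stimulus cycle (impulses/s).
    baseline: maintained firing rate to subtract from the DC (impulses/s).
    """
    spectrum = np.fft.rfft(psth) / len(psth)
    dc = spectrum[0].real - baseline
    amplitudes = 2 * np.abs(spectrum[1:n_components + 1])
    return dc ** 2 + np.sum(amplitudes ** 2)

theta = np.linspace(0, 2 * np.pi, 64, endpoint=False)
psth = 10 + 4 * np.cos(theta) + 3 * np.sin(2 * theta)
energy = total_response_energy(psth, baseline=2.0)
print(energy)
```

Evaluating this measure for each of the eight compound gratings yields one tuning curve per cell and speed.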


FIG. 3. Feature tuning, derived from the responses of the 3 V1 neurons shown in Fig. 2. Total response energy is plotted as a function of congruence phase. Tuning functions are paired for low velocity (open symbols) and high velocity (filled symbols). Continuous curves are the optimally fitting 4th-order harmonic functions of the form of Eq. 8. Error bars indicate 95% confidence limits. Arrowheads indicate the optimal stimulus, thin arrows for low speed, thick arrows for high speed.

 
As in Mechler et al. (2002), we fit these tuning curves with a family of even-harmonic functions of the congruence phase {phi}

$$R(\phi) = a_0 + a_1 \cos[2(\phi - \alpha_1)] + a_2 \cos[4(\phi - \alpha_2)] \tag{8}$$
by adjusting the five parameters—a0, a1, a2, {alpha}1, {alpha}2—to minimize the mean squared error of the fit. This family is a natural choice for the empirical description of feature tuning because it encompasses contributions of nonlinearities up to and including fourth-order and captures much of the variance in the tuning. The best-fitting function from Eq. 8 (thick continuous lines in Fig. 3) was used to extract objective measures of tuning curves and their change for further analysis.
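A least-squares fit of this kind can be sketched as follows. The sketch assumes that the even-harmonic family has the form R(phi) = a0 + a1 cos[2(phi - alpha1)] + a2 cos[4(phi - alpha2)]; under that assumption the model is linear in the cosine and sine coefficients of the second and fourth harmonics, so ordinary linear least squares minimizes the mean squared error. The function names, the eight sampled congruence phases, and the parameter values are hypothetical.

```python
import numpy as np

def harmonic_tuning(phi, a0, a1, a2, alpha1, alpha2):
    """Assumed form of the even-harmonic tuning function (period pi)."""
    return a0 + a1 * np.cos(2 * (phi - alpha1)) + a2 * np.cos(4 * (phi - alpha2))

def fit_even_harmonics(phi, response):
    """Least-squares estimate of (a0, a1, a2, alpha1, alpha2)."""
    phi = np.asarray(phi, dtype=float)
    design = np.column_stack([np.ones_like(phi),
                              np.cos(2 * phi), np.sin(2 * phi),
                              np.cos(4 * phi), np.sin(4 * phi)])
    coef, *_ = np.linalg.lstsq(design, np.asarray(response, dtype=float), rcond=None)
    a0, b1, c1, b2, c2 = coef
    # Convert cosine/sine coefficients back to amplitude and phase.
    return (a0, np.hypot(b1, c1), np.hypot(b2, c2),
            np.arctan2(c1, b1) / 2, np.arctan2(c2, b2) / 4)

def optimal_phase(params, n_grid=3600):
    """Peak position of the fitted curve, found on a dense grid over [0, pi)."""
    grid = np.linspace(0, np.pi, n_grid, endpoint=False)
    return grid[np.argmax(harmonic_tuning(grid, *params))]

# Hypothetical noise-free tuning curve sampled at 8 congruence phases.
phi_samples = np.linspace(0, np.pi, 8, endpoint=False)
true_params = (5.0, 2.0, 0.5, 0.3, 0.1)
responses = harmonic_tuning(phi_samples, *true_params)
fitted = fit_even_harmonics(phi_samples, responses)
phi_opt = optimal_phase(fitted)
```

Because the fourth-harmonic term can displace the peak, the optimal phase is read off the fitted curve rather than taken to be alpha1.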

We defined the optimal stimulus by its congruence phase, {phi}opt, at the peak position of the tuning curve (Fig. 3, thin arrows for low speed, thick arrows for high speed). The congruence phase, {phi}, which parameterizes the feature space, is periodic with period {pi}. {phi}opt = 0 corresponds to a line-like stimulus; {phi}opt = {pi}/2 corresponds to an edge-like stimulus; and intermediate values of the congruence phase correspond to intermediate one-dimensional features.

To quantify the change in the optimal stimulus, {phi}opt, induced by a change in the drift velocity, we determined

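$$\Delta\phi_{\mathrm{opt}} = \left[\left(\phi_{\mathrm{opt,high}} - \phi_{\mathrm{opt,low}} + \tfrac{\pi}{2}\right) \bmod \pi\right] - \tfrac{\pi}{2} \tag{9}$$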
the signed minimum phase-shift. {Delta}{phi}opt must lie between –{pi}/2 and {pi}/2. A value of {Delta}{phi}opt = 0 indicates no speed-dependent change in optimal congruence phase; values of {Delta}{phi}opt = ±{pi}/2 are the maximum possible changes. We also consider the unsigned quantity |{Delta}{phi}opt|, which indicates the change in feature selectivity independent of the direction of change (0 ≤ |{Delta}{phi}opt| ≤ {pi}/2).
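Because the congruence phase is periodic with period pi, the signed minimum shift has to be computed modulo pi. A minimal sketch, assuming the wrap-around convention shown in the comment (the function name is hypothetical):

```python
import numpy as np

def signed_phase_shift(phi_opt_high, phi_opt_low):
    """Signed minimum shift between two phases of period pi.

    Assumed convention: the result is wrapped into [-pi/2, pi/2), so a
    shift of exactly +pi/2 is reported as -pi/2 (the two are equivalent
    on a circle of circumference pi).
    """
    return (phi_opt_high - phi_opt_low + np.pi / 2) % np.pi - np.pi / 2

# Hypothetical optima: a small shift, and a shift that wraps around pi.
small_shift = signed_phase_shift(1.0, 0.5)
wrapped_shift = signed_phase_shift(0.1, 3.0)
unsigned_shift = abs(wrapped_shift)
```

In the second example the raw difference is -2.9, but the shortest path around the periodic feature space is a small positive shift.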

To quantify the overall similarity of two tuning curves measured at different velocities, we use the Pearson correlation coefficient, r, which is sensitive to the shape of the phase variation but not to the size of the untuned part (mean elevation) of the tuning curves. For a pair of sinusoidal tuning curves, maximum positive and negative correlation (r = ±1) correspond to minimum ({Delta}{phi}opt = 0) and maximum (|{Delta}{phi}opt| = {pi}/2) phase shifts, respectively, and minimum correlation (r = 0) corresponds to the intermediate shifts ({Delta}{phi}opt = ±{pi}/4). The latter are quarter-cycle shifts of tuning curves in this feature space, defining quadrature pairs. Although r = 1 implies that there is no change in the peak of the tuning curve ({Delta}{phi}opt = 0), the converse is not true because the tuning curve may peak in the same position ({Delta}{phi}opt = 0) yet change in shape (r < 1).
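The stated correspondence between r and the phase shift for sinusoidal tuning curves can be verified in a few lines. The symbols b0, b1, phi_low, and phi_high below are introduced only for this illustration; the two curves are assumed to be pure second-harmonic sinusoids with positive modulation amplitudes.

```latex
\begin{aligned}
f_{\mathrm{low}}(\phi)  &= a_0 + a_1 \cos[2(\phi - \phi_{\mathrm{low}})], \qquad
f_{\mathrm{high}}(\phi) = b_0 + b_1 \cos[2(\phi - \phi_{\mathrm{high}})], \qquad a_1, b_1 > 0, \\
\operatorname{cov}(f_{\mathrm{low}}, f_{\mathrm{high}})
  &= \frac{1}{\pi}\int_0^{\pi} a_1 b_1 \cos[2(\phi - \phi_{\mathrm{low}})]\cos[2(\phi - \phi_{\mathrm{high}})]\,d\phi
   = \frac{a_1 b_1}{2}\cos(2\,\Delta\phi_{\mathrm{opt}}), \\
\operatorname{var}(f_{\mathrm{low}}) &= \frac{a_1^2}{2}, \qquad
\operatorname{var}(f_{\mathrm{high}}) = \frac{b_1^2}{2}, \\
r &= \frac{\operatorname{cov}(f_{\mathrm{low}}, f_{\mathrm{high}})}
          {\sqrt{\operatorname{var}(f_{\mathrm{low}})\operatorname{var}(f_{\mathrm{high}})}}
   = \cos(2\,\Delta\phi_{\mathrm{opt}}).
\end{aligned}
```

The mean elevations a0 and b0 and the modulation amplitudes drop out, so r = 1 at zero shift, r = 0 at a quarter-cycle shift of ±pi/4, and r = -1 at the maximum shift of ±pi/2.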

For most neurons, {phi}opt depended on stimulus velocity, but the extent of this dependence varied widely across the population. The same was true for the relative size of the responses to a given spatial waveform. Exemplifying one extreme is the neuron shown in Fig. 3A. This cell responded about twice as vigorously at high velocity (filled symbols) as at low velocity (open symbols). Despite this overall change in responsiveness, the tuning curves at the two velocities were similar in shape (Pearson correlation coefficient, r > 0.8). Correspondingly, the optimal stimulus was line-like ({phi}opt {approx} 0), at both stimulus speeds (|{Delta}{phi}opt| < 0.11{pi}). Illustrating the other extreme, the neuron shown in Fig. 3C was tuned to almost perfectly opponent congruence phases at the two velocities (|{Delta}{phi}opt| {approx} 0.4{pi}). Its tuning curves at the two speeds were strongly anticorrelated (r < –0.6). This neuron decreased, rather than increased, its response magnitude from low speed (open) to high speed (filled). The neuron shown in Fig. 3B was approximately halfway between these extremes. Its phase preference at the two stimulus speeds approximated a quadrature pair (|{Delta}{phi}opt| {approx} 0.25{pi}), and the correlation coefficient (|r| < 0.3) was small, as expected for a quadrature shift. This neuron responded equally vigorously at both speeds.

The range of the speed-induced changes of the optimal phase and of the shape and size of tuning curves in the examples shown in Fig. 3 is representative of the range observed in the entire V1 sample. (The sign and magnitude of the velocity-induced change in response size were not correlated with the velocity-induced change in feature preference, although the three examples of Fig. 3 may give an impression of correlation.) These and other aspects of feature tuning are shown for the entire V1 sample in Fig. 4. The plot on the left (Fig. 4A) summarizes how the optimal congruence phase depends on the drift velocity of the compound gratings. Note that these scattergrams are periodic in {pi} on both dimensions, corresponding to the periodicity of the stimulus space. In these plots, speed invariance would correspond to a concentration of data points near the diagonal and a constant phase shift from low to high speed would correspond to a concentration of data points on a line that is parallel to the diagonal. The pair of dotted off-diagonal lines traces the locus of maximum phase offset (|{Delta}{phi}opt| = 0.5{pi}). In our sample, the optimal features obtained at low speed ({phi}opt,low) and high speed ({phi}opt,high) exhibited no significant (linear) circular association as measured by the circular correlation modulus (Fisher 1993) (|r| ≤ 0.1, P > 0.5). Because the modulus of the circular correlation is not significant, there is no observed tendency for an average speed-induced {Delta}{phi}opt. In sum, we find no evidence either for speed invariance or a net speed dependence of feature tuning in V1. Rather, we find a scattering of tuning at low and high velocities, which, from our finite data sample, is indistinguishable from random.


FIG. 4. Population analysis of the speed dependence of feature tuning in V1 (n = 37). Response magnitude was measured by total response energy. In each plot, the circles indicate complex cells (F1/F0 ≤ 1) and the triangles indicate simple cells (F1/F0 > 1). A: scattergram comparing the optimal congruence phase obtained for each neuron with compound gratings drifting at low speed (horizontal axis) and at high speed (vertical axis). Maximal phase offset is indicated by the pair of off-diagonal dotted lines, set off from the identity line by {Delta}{phi}opt = ±{pi}/2. Only paired ({phi}opt,low, {phi}opt,high) data sets with measurable tuning at both speeds were included in the circular association analysis; these were plotted inside the box of the coordinate frame. Paired data sets with measured but no significant response at one speed were not included in the analysis but were plotted outside the coordinate frame at a nominal negative coordinate for the speed to which the cell was not responsive. B: dependence of {Delta}{phi}opt = |{phi}opt,high – {phi}opt,low|, the speed-induced shift in the optimal congruence phase, on F1/F0, the cell-classifying index (horizontal axis). There were 24 complex and 13 simple cells in the sample. C: dependence of r(high,low), the Pearson correlation coefficient of the feature tuning curves obtained at low and high stimulus speed (vertical axis), on {Delta}{phi}opt = |{phi}opt,high – {phi}opt,low|, the speed-induced shift in the optimal congruence phase. A pure speed-induced shift that preserved the shape of the tuning would confine the scatter of the data within a sigmoid domain (see text).

 
Simple cells have traditionally been considered as better suited than complex cells for reliably signaling phase information. It is thus natural to ask whether simple cells signal these one-dimensional features (formalized as relative spatial phase) in a more speed-invariant manner than complex cells. The summary answer, derived with limited statistical power from evidence shown in the middle plot (Fig. 4B), is that simple cells’ phase preferences are not more speed invariant. This plot shows how the speed-induced change in feature preference (measured by 0 ≤ |{Delta}{phi}opt| ≤ {pi}/2 on the vertical axis) varies with the F1/F0 modulation ratio, a traditional index of nonlinearity and the simple–complex type (Skottun et al. 1991). F1/F0 is expected to form a bimodal distribution as the result of a nonlinear effect of the spike threshold (Mechler and Ringach 2002), and it does in our sample, too. However, both complex cells (n = 24) and simple cells (n = 13) were broadly scattered with respect to |{Delta}{phi}opt| and the negative correlation between |{Delta}{phi}opt| and F1/F0 was not significant (Pearson correlation coefficient –0.3 < r < 0 and P > 0.08). Moreover, the distributions of the speed-induced phase-shifts, both the signed and unsigned quantity, were statistically indistinguishable in simple and complex cells (Kolmogorov–Smirnov two-sample test, P > 0.05 for |{Delta}{phi}opt|, P > 0.2 for {Delta}{phi}opt). However, these statistical results are not robust given the rather small sample size. It is possible that with a larger sample size one would find a significantly stronger tendency among simple cells to maintain their phase preference or that the size of speed-induced change in feature preference negatively correlated with the index of cell type.
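For readers unfamiliar with the F1/F0 modulation ratio, the following sketch shows one way it could be computed from a cycle-averaged response to a drifting sine grating, together with the class boundary at F1/F0 = 1 used in Fig. 4. Whether a baseline is subtracted from F0, and the function names, are assumptions of this illustration.

```python
import numpy as np

def modulation_ratio(mean_response, baseline=0.0):
    """F1/F0 for a response sampled over exactly one grating cycle."""
    r = np.asarray(mean_response, dtype=float)
    coeffs = np.fft.rfft(r) / r.size
    f0 = coeffs[0].real - baseline   # mean elevation (assumed baseline-corrected)
    f1 = 2.0 * np.abs(coeffs[1])     # amplitude of the fundamental
    return f1 / f0

def classify(ratio):
    """Simple cells: F1/F0 > 1; complex cells: F1/F0 <= 1."""
    return "simple" if ratio > 1.0 else "complex"

# Hypothetical responses: a half-wave rectified sinusoid and a weakly
# modulated, elevated response.
t = np.arange(1024) / 1024
rectified = np.maximum(np.sin(2 * np.pi * t), 0.0)
elevated = 10.0 + 2.0 * np.cos(2 * np.pi * t)
ratio_rectified = modulation_ratio(rectified)
ratio_elevated = modulation_ratio(elevated)
```

A half-wave rectified sinusoid has a modulation ratio of pi/2, which places an ideal rectified linear unit on the simple-cell side of the boundary.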

The meaning of the optimal feature parameter depends on the selectivity of tuning. Therefore we also analyzed the selectivity of the tuning, as measured by the circular variance (CV) of the tuning curve. (Here CV denotes 1 minus the usual measure. For calibration, a delta function of a circular variable has CV = 1 and the CV of a cosine raised to a constant pedestal is about half the modulation depth measured by the Michelson contrast.) The CV indicated that at both speeds, most cells were broadly tuned: CV < 0.3 for all but two cells. Unlike the preferred feature, tuning selectivity as measured by the CV was highly correlated at the two speeds (r = 0.71). The median CV at low speed was 0.11 and increased to 0.13 at high speed, a slight and marginally significant change (paired sign-rank test, P < 0.1). Also unlike the preferred feature, both the CV and the speed-induced change in the CV were uncorrelated with F1/F0 and these measures were similarly distributed in simple and complex cells (Kolmogorov–Smirnov two-sample test, P > 0.5).
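The two calibration points quoted in parentheses can be checked numerically. In this sketch the selectivity index is taken to be the length of the response-weighted resultant vector, with the congruence phase doubled because the feature space has period pi; this reading of "1 minus the usual measure", as well as the function names and sample values, are assumptions.

```python
import numpy as np

def tuning_selectivity(phi, response):
    """Resultant length of a tuning curve over a variable of period pi."""
    phi = np.asarray(phi, dtype=float)
    response = np.asarray(response, dtype=float)
    resultant = np.sum(response * np.exp(2j * phi))  # angle doubling
    return np.abs(resultant) / np.sum(response)

def michelson_contrast(response):
    """Modulation depth (max - min) / (max + min)."""
    r = np.asarray(response, dtype=float)
    return (r.max() - r.min()) / (r.max() + r.min())

phi = np.linspace(0, np.pi, 8, endpoint=False)

# Cosine on a constant pedestal (hypothetical values).
pedestal_cosine = 5.0 + 2.0 * np.cos(2 * (phi - np.pi / 4))
cv_cosine = tuning_selectivity(phi, pedestal_cosine)
depth_cosine = michelson_contrast(pedestal_cosine)

# Response confined to a single congruence phase (discrete delta function).
delta = np.zeros_like(phi)
delta[3] = 1.0
cv_delta = tuning_selectivity(phi, delta)
```

With this definition the pedestal cosine gives an index of exactly half its Michelson contrast, and the delta function gives an index of 1.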

The CV, unlike the bandwidth or the depth of modulation, is a good measure of the overall shape of a tuning curve. The above results were not dependent on the measure of selectivity, though: the same conclusions were reached when the measure was the depth of modulation of the tuning curve. Thus the relative magnitude of the feature-independent and the feature-modulated components of the compound-grating responses of V1 neurons are essentially independent of stimulus speed.

In principle, a speed-induced change in feature tuning could be attributable to a shift in optimal phase, a change in the shape of the tuning curve, or both. The third plot (Fig. 4C) examines this issue. If the tuning curves at low and high velocities were related by a pure shift in optimal phase {Delta}{phi} (i.e., a translation, permitting a rescaling of the tuning curve), it follows that the correlation coefficient r of the two tuning curves is given by

$$r = \frac{a_1^2 \cos(2\Delta\phi) + a_2^2 \cos(4\Delta\phi)}{a_1^2 + a_2^2} \tag{10}$$
Here a1 and a2 are the parameters in Eq. 8 that describe the shape of the tuning curve. Because typically a1 > a2, Eq. 10 predicts that the relationship between r and |{Delta}{phi}opt| is dominated by a declining sigmoid. This accounts for the general shape of the scattergram in Fig. 4C. Thus a {Delta}{phi} shift accounts for a substantial component of the velocity-induced change in tuning. On the other hand, if a shift {Delta}{phi} were the sole cause of the velocity-induced change in tuning, then an appropriate translation of the tuning curve measured at high velocity should bring it into coincidence with the tuning curve measured at low velocity (permitting rescaling). We determined this "corrective" phase shift, {Delta}{phi}corr, as the phase shift that maximizes the correlation coefficient r between the tuning curve measured at low velocity and the tuning curve measured at high velocity after a translation by {Delta}{phi}corr. Not surprisingly, {Delta}{phi}corr is highly correlated with the speed-induced shift in the optimal congruence phase {Delta}{phi}opt (r > 0.9). However, this translation does not bring the low- and high-velocity tuning curves into coincidence. Rather, the median correlation coefficient between the speed-paired tuning functions was r = 0.73. Thus a translation of the tuning curve accounts for only about half of the variance (r^2 {approx} 0.5). A change in shape of the tuning curve, together with measurement error, accounts for the other half of the variance.
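The pure-shift prediction and the search for the corrective shift can be illustrated numerically. The sketch assumes tuning curves of the form a0 + a1 cos[2(phi - alpha1)] + a2 cos[4(phi - alpha2)], for which a pure translation gives r = [a1^2 cos(2 dphi) + a2^2 cos(4 dphi)] / (a1^2 + a2^2); the grid search, the function names, and the parameter values are hypothetical and are not taken from the original analysis.

```python
import numpy as np

def harmonic_tuning(phi, a0, a1, a2, alpha1, alpha2):
    """Assumed even-harmonic tuning function of period pi."""
    return a0 + a1 * np.cos(2 * (phi - alpha1)) + a2 * np.cos(4 * (phi - alpha2))

def pure_shift_correlation(dphi, a1, a2):
    """Correlation of a curve with a translated (and rescaled) copy of itself."""
    return (a1**2 * np.cos(2 * dphi) + a2**2 * np.cos(4 * dphi)) / (a1**2 + a2**2)

def corrective_shift(params_low, params_high, n_grid=720):
    """Translation of the high-speed curve that maximizes its correlation
    with the low-speed curve, found by exhaustive search on a grid."""
    phi = np.linspace(0, np.pi, n_grid, endpoint=False)
    low = harmonic_tuning(phi, *params_low)
    shifts = np.linspace(-np.pi / 2, np.pi / 2, n_grid, endpoint=False)
    r = np.array([np.corrcoef(low, harmonic_tuning(phi + s, *params_high))[0, 1]
                  for s in shifts])
    best = int(np.argmax(r))
    return shifts[best], r[best]

# Hypothetical pair: the high-speed curve is the low-speed curve shifted by
# 0.4 rad, with its mean and modulation rescaled.
params_low = (5.0, 2.0, 0.5, 0.3, 0.1)
params_high = (8.0, 3.0, 0.75, 0.7, 0.5)
phi = np.linspace(0, np.pi, 720, endpoint=False)
r_measured = np.corrcoef(harmonic_tuning(phi, *params_low),
                         harmonic_tuning(phi, *params_high))[0, 1]
r_predicted = pure_shift_correlation(0.4, a1=2.0, a2=0.5)
dphi_corr, r_after = corrective_shift(params_low, params_high)
```

For a pure shift, the corrective translation restores a correlation of essentially 1; a residual well below 1, as reported for the V1 sample, therefore indicates a change in shape (or measurement error) in addition to the shift.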

As a final point, we mention that feature preference or tuning depth did not correlate with relative cortical depth. Laminar location was identified histologically for most cells, but possible laminar variations could not be studied because of the small sample size.

A model of feature tuning

Many aspects of the behavior of real V1 neurons can be understood in terms of some variant of the "iceberg effect," i.e., in terms of the interaction between a linear filter (the spatiotemporal kernel of the receptive field) and a static nonlinearity (that of spike threshold). As we show later, this mechanism is also fundamental in endowing V1 neurons with feature tuning. We now examine to what extent this can account for our data.

A linear operator scales the amplitude and shifts the phase of the frequency components present in the stimulus but adds no new frequency components. Moreover, the amplitude in the output of a linear transform depends only on the frequency but not the phase of the input. Thus neither the amplitude nor its square (the energy), taken in any combination of output components, can exhibit feature tuning for the stimuli used here: feature tuning signifies nonlinearity.
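This argument can be illustrated with a toy simulation. The stimulus below is a stand-in, not the stimulus set of the experiments: it assumes odd harmonics 1, 3, 5, and 7 with 1/k amplitudes and a common congruence phase, an arbitrary set of complex filter gains, and a thresholded power-law nonlinearity. All names and values are hypothetical.

```python
import numpy as np

N = 256
t = np.arange(N) / N
HARMONICS = (1, 3, 5, 7)                      # hypothetical component set
GAINS = {1: 1.0, 3: 0.8 * np.exp(0.3j),       # hypothetical complex gains
         5: 0.5 * np.exp(0.6j), 7: 0.3 * np.exp(0.9j)}

def compound_grating(phi):
    """One cycle of a compound waveform whose components share phase phi."""
    return sum(np.cos(2 * np.pi * k * t + phi) / k for k in HARMONICS)

def apply_linear_filter(signal, gains):
    """Scale and phase-shift each harmonic; no new frequencies are created."""
    spectrum = np.fft.rfft(signal)
    filtered = np.zeros_like(spectrum)
    for k, gain in gains.items():
        filtered[k] = gain * spectrum[k]
    return np.fft.irfft(filtered, n=signal.size)

def rectify(signal, threshold=0.5, exponent=2):
    """Static nonlinearity: thresholded power law."""
    return np.maximum(signal - threshold, 0.0) ** exponent

def response_energy(signal, n_harmonics=8):
    """Squared DC plus squared amplitudes of the first n_harmonics."""
    c = np.fft.rfft(signal) / signal.size
    return c[0].real**2 + np.sum((2.0 * np.abs(c[1:n_harmonics + 1]))**2)

phis = np.linspace(0, np.pi, 8, endpoint=False)
linear_out = [apply_linear_filter(compound_grating(p), GAINS) for p in phis]
linear_energy = np.array([response_energy(x) for x in linear_out])
rectified_energy = np.array([response_energy(rectify(x)) for x in linear_out])
```

The energy of the linear output is the same at every congruence phase, whereas the energy after the static nonlinearity varies with phase, which is the sense in which feature tuning signifies nonlinearity.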

By general considerations similar to those laid out in Mechler et al. (2002), one can show that an isolated linear–nonlinear (rectified) simple cell receptive field model is expected to exhibit feature tuning, that the tuning is periodic in twice the congruence phase, and that the dominant term in its harmonic expansion in phase is ~ cos [2({phi} – {phi}opt)]. Furthermore, the energy model of complex cells that sums with equal weight the squared output of two quadrature pairs of simple cell (rectified linear) subunits (one even symmetric and one odd symmetric as well as their opposites in contrast polarity) will by design produce no phase tuning because the subunits’ outputs combine to a phase-independent constant DC elevation. The key premise necessary to reach these conclusions is that, by design, the congruence phase is the same in each component of a given compound grating. The key observation in the analysis is that for a nonlinear contribution of order n, the output phase is the sum of the phases of the interacting components.

However, simple LN models cannot account for the responses to compound gratings: for example, the peaking of the responses seen in Fig. 2 or the manner by which the response Fourier components depend on the congruence phase (Mechler et al. 2002). Adding phase-sensitive nonlinearities or dynamic gain controls might recover such features within the context of a feedforward model, but concisely parameterized models of this sort capable of predicting responses to moving stimuli are not yet in hand. An alternative approach to determine whether the critical features of our responses could be derived from a physiologically reasonable elaboration of idealized LN models is to incorporate idealized LN neurons into a simple recurrent network (Chance et al. 1999). This model departs from the Hubel and Wiesel (1962) hierarchical (feedforward) model of V1 in which complex cells pool their inputs from simple cells that have complementary receptive field profiles and reflects the growing consensus that corticocortical interactions are critical to understanding responses of individual cortical neurons. Chance et al. (1999) proposed that complex-cell responses arise through recurrent amplification of simple-cell responses and that simple and complex cells represent the weakly and highly coupled regimes of the same basic cortical circuit. We now ask whether the same basic network model can account for the characteristics of feature tuning that we observe.

Although the isolated linear–nonlinear receptive field model is tractable (as outlined earlier), interconnection of such units requires numerical simulation to determine the contributions from single-cell receptive fields and network mechanisms that shape feature tuning.

We implemented several variants of the above network model (as detailed in METHODS). Briefly, the network consists of interconnected rectified Gabor units whose receptive fields are identically centered and oriented. Gabor frequency and phase, representing the linear feedforward input to the network, tile the space of spatial frequency and phase. The recurrent gain relative to the strength of the linear kernel can be varied. Previously, we showed that this model could account for much of the diversity of feature preference and selectivity seen in V1 responses to compound gratings (Ohiorhenuan et al. 2004). Here we report that this model captures most of the qualitative behavior of V1 neurons to one-dimensional features and, specifically, the model can explain the pattern of speed dependence of V1 responses to this stimulus set.

To develop an intuition for how the recurrent model leads to feature tuning, we begin with homogeneous-gain models, in which the gain of recurrent feedback is the same for every cell. Figure 5 shows tuning to compound gratings drifting at low and high speeds for model neurons in three homogeneous-gain networks that differed only in the gain parameter. In each data set, neurons are organized in rows by k, their Gabor spatial frequency, and in columns by {gamma}, their Gabor phase. The network of neurons is evenly subsampled for display. For each model neuron, tuning curves are plotted analogously to Fig. 3. In the simulated experiments, the fundamental grating component's spatial frequency was 0.25 c/deg and its temporal frequency was 1 Hz at low speed, 4 Hz at high speed; each parameter value was chosen to be similar to those used in our V1 experiments.


FIG. 5. Tuning to compound gratings by a representative subset of model neurons from 3 simulated homogeneous-gain networks. A: network of isolated (noninteracting) ideal simple cells (g = 0). B: network of interconnected complex cells (g/gmax = 0.7). C: very strongly coupled ideal complex cells (g/gmax = 0.97). For each network, units selected for display are organized in rows by their Gabor frequency (k), indicated on the left of the rows, and in columns by Gabor phase ({gamma}), indicated on top. The {gamma} ∈ [{pi}, 2{pi}] range, not shown, repeats exactly the range shown. Tuning plots and curve fits follow the conventions used in Fig. 3. Note that the responses (total energy) plotted on the ordinate are in arbitrary units but their relative size is preserved within a network. Scale for each spatial frequency channel is indicated for the leftmost unit shown for that channel.

 
At the zero-gain extreme (Fig. 5A), the decoupled network becomes a set of isolated simple cells. The most important observation for these model units is that the nonlinear interaction between the spike threshold and the feedforward kernel results in feature tuning. There are several characteristics of feature tuning, detailed later, that are commonly observed in all our simulations and demonstrate the fundamental role of the rectified feedforward input in the genesis of feature tuning in V1. These key results are not particular to the choice of parameters used in the simulations. In the model units shown the threshold was set to a moderate level (defined in METHODS) and the exponent used for the static nonlinearity was n = 2. However, similar feature tuning resulted for other static nonlinearities (not shown). The notable exception is the piecewise linear perfect half-wave rectifier ({theta} = 0 and n = 1), which uniquely precludes tuning to equal-energy compound gratings because its output preserves the equal-energy property of the input. Next, we describe the characteristics of feature tuning that are common to all model networks studied.

First, at any given stimulus speed, feature sensitivity in each simple cell varies approximately as a ~ cos [2({phi} – {phi}opt)] function of congruence phase, with a distinct feature preference, {phi}opt. Thus the simulation of the zero-gain network affirms the qualitative inferences made earlier for the shape of feature tuning in an isolated rectified feedforward unit.

Second, at a given drift velocity, for any particular cell, the feature preference monotonically depends on the receptive field's Gabor phase, i.e., {phi}opt({gamma}) ~ ({gamma} + const) mod {pi}. This dependence on Gabor phase survives increased recurrent interactions and points to the critical role that the symmetry of the feedforward kernel plays in shaping feature preference in V1. Furthermore, although the form of this dependence does not change with a change in stimulus velocity, the constant offset and thus the tuning optimum itself depends on speed: changing the drift speed V of the stimulus results in a drift-dependent shift, {Delta}{phi}opt(V), in the preferred stimulus, i.e., {phi}opt({gamma}, V) ~ [{gamma} + {Delta}{phi}opt(V)] mod {pi}.

The dependence of the constant offset on speed is the signature of the complex multipliers of the spatiotemporal kernel. The kernel need not be separable in the frequency domain to have this effect. The phase offset depends on the complex amplitudes (and thus phases) of the spatial and temporal transfer functions of the feedforward kernel. In the simulations discussed so far, all units in a network had identical temporal integration property, which translates into identical complex multipliers in the time domain. Model neurons in different Gabor channels are expected to differ in their spatial complex multipliers, but because of the similar overall shape of their spatial tuning function this difference does not alter the phase dependence very much (its extent is reflected by the scatter in Fig. 7B); thus the phase offset is approximately constant at a fixed stimulus velocity.


FIG. 6. Representation of the simple–complex continuum in a mixed-gain network. Gain was randomly selected for each unit in the network from a uniform distribution spanning the range shown on the horizontal axis (normalized with the same gmax as used in Fig. 5). Simple–complex continuum is indexed by the F1/F0 ratio measured for the optimal grating for each model unit. Functional relationship is very different from a linear one (slanted dotted line) that holds in homogeneous-gain networks. Horizontal dotted line indicates the class boundary between simple cells (triangles) and complex cells (squares).

 

FIG. 7. A: feature tuning in units in the mixed-gain network introduced in Fig. 6 (presentation follows the format of Fig. 5). B–D: population analysis of feature tuning and its dependence on speed in the network model. Data presentation follows exactly the format used for V1 data in Fig. 4. For details about the model simulations, see main text.

 
The form of this dependence of preferred phase on velocity guarantees that, at any given speed, preferred features cover the entire feature space in a population of cells in which the Gabor phases sample the entire phase space.

Third, within each spatial frequency channel corresponding to a fixed Gabor frequency k, the magnitude of the response varies regularly with {gamma}, the Gabor phase, approximately as ~ cos (2{gamma}). Thus the units with the symmetric Gabor kernel (first column, labeled {gamma} = 0, in the plots shown) have the largest and the units with the asymmetric Gabor kernel (column labeled {gamma} = {pi}/2) have the smallest responses. This pattern arises through the feedforward input because the even-symmetric linear component, taken after rectification, is larger than the odd-symmetric one. A similar pattern would arise in any family of kernels that sample a mixture of odd and even functions.

Because it arises from an interaction between the linear kernel and the static nonlinearity, this pattern is enhanced by an increase in the threshold or in the acceleration (i.e., the exponent) of the power function. This mechanism is especially prominent in the high spatial frequency channels. This is explained as follows. Stimulus energy, by construction, declines with component frequency. Thus cells of the highest Gabor frequency (largest k values) respond to the compound gratings with the smallest magnitude in the entire network, which, assuming a networkwide constant threshold, makes them the most sensitive to clipping.

Our simulations also indicate that changing the drift speed does not affect the ~ cos (2{gamma}) dependence of the magnitude of the responses across units, but can affect the absolute magnitude of the responses as well as the selectivity of the feature-tuning curve in a spatial frequency-dependent manner.

Chance et al. (1999) showed that for homogeneous-gain networks, increasing the gain results in increasing phase-insensitive pooling and leads to single-grating responses that are progressively more complex-like. The same mechanism decreases the sensitivity (modulation depth) of feature tuning to compound gratings, as illustrated by Fig. 5, B and C for various (high) levels of gain. Thus when pooled phases are balanced, recurrent pooling acts against the static nonlinearity of the receptive field by making responses more complex-like. Underlining the importance of the role that the rectified feedforward component plays in setting up feature tuning is the fact that the recurrent gain must be quite high to generate a noticeable change in the shape of the feature tuning curves. Specifically, feature tuning remains stable while the recurrent gain is raised from zero (g/gmax = 0, all feedforward simple cells) all the way up to an intermediate level (g/gmax = 0.5, a value that results in interacting model neurons that are all borderline simple–complex by the measure of the modulation ratio; not shown). Thus a point of special emphasis here is that intermediate gains generate complex cells that exhibit significant feature (phase) tuning. This is all the more notable because the F1/F0 ratio, the index of the simple–complex continuum, is also a measure of phase sensitivity.

Notice that the preferred feature in each unit is independent of the choice of the static nonlinearity or the recurrent gain, but only if the latter is not too high. At very high homogeneous gains (Fig. 5C), feature tuning becomes homogeneous because all units begin to behave independently of their own afferent input and similarly to the units that respond the most strongly. That is, in the high homogeneous-gain regime, these strongly coupled networks exhibit winner-take-all behavior, which is expected from strongly coupled recurrent networks in general. For these networks, the "winner" among Gabor units of the same spatial frequency k is the one with a symmetric kernel (Gabor phase {gamma} = 0 or {gamma} = {pi}).

This winner-take-all behavior is more prominent when clipping by the rectifier is more severe. This accounts for the more prominent winner-take-all behavior in the higher spatial frequency channels (Fig. 5C, bottom row) because, in these channels (see above), the linearly filtered stimulus energy is smaller. The winner-take-all favoring of the symmetric Gabor is powerfully reinforced by the recurrent excitation from neighboring frequency channels, where this mechanism is similarly prominent.

Note that even though the high-gain regime of the model leads to cells with complex-like behavior in terms of F1/F0 (Chance et al. 1999), the high-gain regime does not lead to energy-like behavior in terms of feature tuning. This follows from the biases set up by the feedforward input as explained earlier, along with the winner-take-all behavior. The selectivity of tuning remains larger in the higher-frequency channels because of the relatively stronger effect of clipping in those channels.

INHOMOGENEOUS (MIXED-GAIN) NETWORKS. Homogeneous-gain networks illuminate the genesis of feature tuning in model neurons. However, a single homogeneous gain can produce only one kind of behavior, not a simple–complex continuum. Moreover, a well-documented observation about the primate V1 (Ringach et al. 2002) is that simple and complex cells are both present in every cortical layer, with slight variation of their relative abundance across layers but no obvious spatial segregation within layers. Thus by virtue of its ability to generate an arbitrary simple–complex continuum, a random-gain network is likely to be a more realistic model of the V1 population.

Before proceeding to the presentation of the mixed-gain network simulations, a technical point about the behavior of the gain parameter needs to be made. In the preceding analysis of homogeneous-gain networks, we (following Chance et al. 1999) have referenced values of the homogeneous gain g to the maximum stable value of the gain gmax. An inhomogeneous network can remain stable even if some cells have g > gmax, provided that there are not too many of them. Thus for inhomogeneous-gain networks g can be sampled in a wider range than the one limited by gmax of homogeneous networks of otherwise identical parameters.

To illustrate this point and to examine how gain determines the simple–complex character in the mixed-gain network, we plotted in Fig. 6 the F1/F0 modulation ratio for the optimal sine grating as a function of the gain. To facilitate comparison with results for homogeneous-gain networks, we normalized gain with gmax of homogeneous networks of otherwise identical parameters (thus g/gmax > 1 could be realized). Gains were randomly chosen from a uniform distribution over the g/gmax ∈ [0, 1.4] range. The functional relationship is a slowly decaying one, with F1/F0 → 0 at very large gains. (Thus complex cells with F1/F0 < 0.2 can be realized by recurrent gains greater than the range sampled in Fig. 6.) The dependence of F1/F0 on gain is parametric in the Gabor frequency, as indicated by the fine thread-like densities in the scatterplot, each of which is composed of data from units of a particular spatial frequency channel. The asymptotic dependence is very different from the linear relationship (slanted dotted line) known for the homogeneous-gain networks (Chance et al. 1999). This difference is reflected in the range of gains associated with simple cells (triangles) and complex cells (squares). In homogeneous networks, the class boundary (horizontal dotted line) intersects the linear regression of the data in a single point, sharply dividing the continuum of gain between simple cells (g/gmax < 0.41) and complex cells (g/gmax ≥ 0.41). In mixed-gain networks, simple cells are confined to a narrower range of gains and the boundary is not sharp (the scatters of triangles and squares along the abscissa overlap in Fig. 6). This is because the location of the intersection of the class boundary (horizontal dotted line) with the data depends on the Gabor frequency.

Figure 7A shows the tuning curves for model units in a "mixed-gain" network in an arrangement similar to that in Fig. 5. As may be expected from the observations already made, unit by unit, feature preference in the mixed-gain population closely resembles that observed in the homogeneous intermediate-gain network (Fig. 5B), although there are differences. Selectivity and especially response magnitude, the response parameters that depend more strongly on recurrent gain, are more varied in the mixed-gain network. A case in point is the lawful variation of tuning magnitude with Gabor phase observed in homogeneous networks. That pattern, which survived even in strongly coupled units in a network of homogeneous high gain (Fig. 5C), is diluted here. The pattern is expected to be fully eliminated in a sufficiently inhomogeneous network.

To compare the mixed-gain model with the V1 population, Fig. 7, B–D presents the same population analyses as in Fig. 4. In V1 (Fig. 4A), the scattergram of optimal feature at low speed ({phi}opt,low) versus high speed ({phi}opt,high) showed no statistical association by linear circular correlation statistics. However, the simulations (Fig. 7B) for the recurrent network model show prominent "tracks," indicating strong correlation between feature tuning at the two speeds. They signify the monotonic dependence of feature preference on Gabor phase, a legacy of the linear kernel. Thus not surprisingly, the tracks were also seen in homogeneous-gain networks and their pattern and position were preserved across gain levels (data not shown). The exact shape of that dependence, and thus the shape of the track (e.g., the degree of deviation of the data points from a line of unity slope), depends on the relative spatial frequency of the stimulus and the Gabor frequency; data from units of the same frequency channel form fine "fibers" within the track. We show later (Fig. 8) that the offset of the track along the axes s