J Neurophysiol 95: 379-400, 2006. First published September 7, 2005; doi:10.1152/jn.00498.2005
0022-3077/06 $8.00

Responses of V1 Neurons to Two-Dimensional Hermite Functions

Jonathan D. Victor1, Ferenc Mechler1, Michael A. Repucci1,3, Keith P. Purpura1 and Tatyana Sharpee2

1Department of Neurology and Neuroscience, Weill Medical College of Cornell University, New York, New York; 2Department of Physiology, University of California, San Francisco, San Francisco, California; and 3Center for Molecular & Behavioral Neuroscience, Rutgers University, Newark, New Jersey

Submitted 12 May 2005; accepted in final form 28 August 2005


    ABSTRACT
 
Neurons in primary visual cortex are widely considered to be oriented filters or energy detectors that perform one-dimensional feature analysis. The main deviations from this picture are generally thought to include gain controls and modulatory influences. Here we investigate receptive field (RF) properties of single neurons with localized two-dimensional stimuli, the two-dimensional Hermite functions (TDHs). TDHs can be grouped into distinct complete orthonormal bases that are matched in contrast energy, spatial extent, and spatial frequency content but differ in two-dimensional form, and thus can be used to probe spatially specific nonlinearities. Here we use two such bases: Cartesian TDHs, which resemble vignetted gratings and checkerboards, and polar TDHs, which resemble vignetted annuli and dartboards. Of 63 isolated units, 51 responded to TDH stimuli. In 37/51 units, we found significant differences in overall response size (21/51) or apparent RF shape (28/51) that depended on which basis set was used. Because of the properties of the TDH stimuli, these findings are inconsistent with simple feedforward nonlinearities and with many variants of energy models. Rather, they imply the presence of nonlinearities that are not local in either space or spatial frequency. Units showing these differences were present to a similar degree in cat and monkey, in simple and complex cells, and in supragranular, infragranular, and granular layers. We thus find a widely distributed neurophysiological substrate for two-dimensional spatial analysis at the earliest stages of cortical processing. Moreover, the population pattern of tuning to TDH functions suggests that V1 neurons sample not only orientations, but a larger space of two-dimensional form, in an even-handed manner.


    INTRODUCTION
 
It is remarkable that a predictively accurate account of the responses of primary visual cortex (V1) neurons remains elusive, despite several decades of quantitative study (Olshausen and Field 2004). These studies used a multitude of simple stimuli, including bars (Hubel and Wiesel 1959, 1968; Kagan et al. 2002; Movshon et al. 1978a,b; Sun and Bonds 1994), gratings (Anderson et al. 2001; Bonds 1989; De Valois et al. 1979; Jagadeesh et al. 1997; Kagan et al. 2002a; Movshon et al. 1978a,b; Ringach et al. 1997a), annuli (Jones et al. 2001), Gabor functions (Bauer and Heinze 2002), random or pseudorandom noise, both dense and sparse (Chen et al. 1993; Hirsch et al. 1998; Jones and Palmer 1987; Palmer and Davis 1981; Reid et al. 1997), other geometric stimuli (Conway and Livingstone 2003; De Valois et al. 1979; Hammond and MacKay 1975; Mechler et al. 2002; Pollen et al. 1988; Purpura et al. 1994; Skottun et al. 1991a; Smith et al. 2002), and natural scenes (David et al. 2004; Ringach et al. 2002; Smyth et al. 2003; Vinje and Gallant 2002; Willmore and Smyth 2003). It is generally thought that response properties of at least some V1 cells can be accounted for by a linear filter, perhaps followed by a static nonlinearity such as a firing threshold, as reviewed by Simoncelli et al. (2004). However, such a linear–nonlinear (LN) model is recognized to be incomplete even for classic simple cells. The LN model's failure to predict responses to stimuli outside the set used to specify the model is usually attributed to modulatory influences such as gain controls and other influences from the nonclassical receptive field (Freeman et al. 2001; Heeger 1992a; Ohzawa et al. 1982; Sceniak et al. 1999, 2002; Smyth et al. 2003). For complex cells, energy models (Adelson and Bergen 1985) and their variants (David et al. 2004; Rust et al. 2003, 2005; Touryan et al. 2005) have been proposed to account for the relative lack of phase dependency of responses and for their ON–OFF character.

Deviations between responses predicted from simple geometric stimuli and measured responses can be particularly prominent for natural scenes (David et al. 2004; Smyth et al. 2003). However, it is unclear whether these prediction failures are specific to natural scenes or, rather, reflect a more general failing of LN and energy models derived from simple stimuli. The latter might become apparent if neurons were examined with stimuli outside the usual analytic stimuli used to specify models. The usual analytic stimuli fall into three classes: uniform in space but localized in spatial frequency (e.g., gratings), localized in space but broadband (e.g., spots, bars, and edges), or uniform in space and broadband (e.g., spatiotemporal white noise). Additionally, standard analytic stimuli are typically unstructured in space (e.g., white noise) or structured along a single dimension, with a single dominant orientation (e.g., bars and gratings). In contrast, "features" are typically localized both in space and in spatial frequency (Morrone and Burr 1988). Moreover, some aspects of natural visual scenes, such as T-junctions, have two-dimensional structure and multiple orientations.

With these considerations in mind, we studied the responses of V1 neurons to another set of analytic visual stimuli, the two-dimensional Hermite functions (TDHs), shown in Fig. 1. These functions are localized in space and spatial frequency, in a manner that is precisely intermediate between the extremes of points (localized in space, uniform in spatial frequency) and gratings (uniform in space, localized in spatial frequency). The formal sense in which these functions achieve joint localization in space and spatial frequency (Victor and Knight 2003) is distinct from the sense of joint localization that leads to Gabor functions (Daugman 1985; Gabor 1946; Marcelja 1980), and does not require consideration of complex-valued profiles (Klein and Beutter 1992; Stork and Wilson 1990). Gabor functions optimize localization in space and spatial frequency in the sense that they minimize the product of the variances of the distribution of spatial sensitivity profile and its Fourier transform. TDH functions optimize localization in space and spatial frequency in the sense that their spatial profile is minimally altered by truncation of its power spectrum and windowing in space.



 
FIG. 1. Two-dimensional Hermite (TDH) functions used in these experiments. Each family (Cartesian, left; polar, right) forms an orthonormal basis for 2-dimensional patterns and increases gradually in spatial extent and bandwidth as rank (row) increases. For the Cartesian functions, the indices j and k specify the number of zero-crossings along the x- and y-coordinates. Each index is constant along a set of parallel lines, as indicated by the arrows. Rank of a Cartesian function is equal to j + k. For the polar functions, the index ν specifies the number of zero-crossings along each radius and is constant along the inverted "vees" that begin at the bottom right, peak along the middle of the array, and then continue to the bottom left. Index µ specifies the number of zero-crossings along concentric circles and is constant along vertical lines as indicated by the down-pointing arrows. Rank of a polar function is equal to µ + 2ν; the "cosine" and "sine" halves of the array contain the functions whose dependency on polar angle θ is given by cos (µθ) and sin (µθ), respectively, where θ is measured clockwise from the horizontal (x-) axis. Midline of the polar array contains the functions that are independent of θ.

 
One consequence of the difference between the defining characteristics of Gabor functions and TDHs is that the latter [and their one-dimensional analogs, used previously in psychophysical (Yang and Reeves 2001) and VEP (Yang and Reeves 1995) studies] are readily organized into discrete orthogonal basis sets. Although a continuum of basis sets exists, we focus on basis sets that have Cartesian or polar symmetry. Gabor functions do not form basis sets in any natural way.

The two-dimensional structure of the TDHs depends on the choice of the basis set, but all basis sets are equated in contrast, spatial spread, and power spectra. Thus, the TDHs share with standard stimulus sets the ability to reconstruct linear receptive fields (because they form basis sets), but also can distinguish between the effects of two-dimensional structure and the effects of context-dependent modulation because they are equated for contrast, spatial spread, and power spectrum, yet differ in two-dimensional structure.

We find that the linear-static nonlinear picture fails to account for responses to TDH functions in the majority of V1 neurons. Rather, the apparent shape and strength of the reconstructed receptive field depend on the choice of TDH basis set. As described below, these failures are often striking and qualitative and are present in all cortical laminae. The analytic properties of the TDH functions also allow our data to exclude a wide class of generalized energy models as the source of these discrepancies. Moreover, because of these analytic properties, it is difficult to account for these discrepancies on the basis of modulatory influences. Rather, the findings suggest that our current picture of V1 receptive fields may be limited by the relatively simple kinds of stimuli typically used to investigate them. Secondarily, the parameterization of visual form provided by the TDH functions provides a new insight into the diversity of spatial selectivities of V1 neurons: the uniform coverage of orientation space (Blasdel 1992; Dragoi et al. 2000; Sirovich and Uglesich 2004) may be part of a more general coverage of a larger space of local form. Portions of this material were presented at the annual meetings of the Vision Sciences Society (2004) and Society for Neuroscience (Victor et al. 2004a,b).


    METHODS
 
Our methods for animal preparation, visual stimulation, and recording have been previously described in detail (Aronov 2003; Mechler et al. 2002); we summarize them here. All animal procedures were performed in accordance with NIH and local IACUC standards.

Physiologic preparation

Recordings were made after initial atropine [0.04 mg, administered intramuscularly (im)], anesthesia with ketamine 10 mg/kg im (cats) or telazol 2–4 mg/kg im (macaques), and placement of an endotracheal tube and catheters in both femoral veins, one femoral artery, and the urethra. During recording, anesthesia was maintained with propofol and sufentanil (mixture containing 10 mg/ml of propofol and 0.25 µg/ml sufentanil, initially at 2 mg · kg⁻¹ · h⁻¹ propofol then titrated) and neuromuscular blockade was provided by vecuronium 0.25 mg/kg intravenous (iv) bolus, 0.25 mg · kg⁻¹ · h⁻¹ iv. Heart rate and rhythm, arterial blood pressure, body temperature, end-expiratory pCO2, arterial oxygen saturation, urine output, and EEG were monitored during the course of the experiment. Animal maintenance included intravenous fluids (lactated Ringer solution with 5% glucose, 2–3 cm³ · kg⁻¹ · h⁻¹), administration of supplemental O2 every 6 h, antibiotics (procaine penicillin G 75,000 U/kg im prophylactically, gentamicin 5 mg/kg im daily if evidence of infection), application of 0.5% bupivacaine to wounds, and ocular instillation of atropine 1% and flurbiprofen 2.5% (and, for cats, Neosynephrine eyedrops 10% to retract the nictitating membranes), dexamethasone (1 mg/kg im daily), and periodic cleaning of the contact lenses. With these measures, the preparation remained physiologically stable for 2 or 3 days (cats) and 4 or 5 days (macaques).

RECORDING. After a craniotomy near P3, L1 (cats) or P15, L14 (macaques), a tetrode (Thomas Recording, Giessen, Germany), coated with DiI (Molecular Probes, Eugene, OR) to aid subsequent localization of the track, is inserted through a small durotomy. Once spiking activity from one or more units is encountered, the region of the receptive field(s) is hand-mapped and then centered on the display of a Sony GDM-F500 19-in. monitor (displaying a 1,024 × 768 raster at 100 Hz, 35 cd/m²), typically at a distance of 114 cm, directly or by a mirror. Real-time spike-sorting software (Datawave Technologies) is engaged to provide TTL pulses corresponding to the time of spikes of tentatively identified single units. Rapid, qualitative characterization of these units' ocularity and grating responses is accomplished by keyboard or mouse control of the visual stimulator.

QUANTITATIVE CHARACTERIZATION. Among the multiple spikes simultaneously recorded by the tetrode, one well-isolated spike (signal-to-noise >2:1 and usually >3:1, distinctive shape by on-line spike sorting) is selected as the "target" neuron. Beginning with the parameters determined by the qualitative characterization, computer-controlled stimulation paradigms are used to characterize the target neuron quantitatively with sine gratings. Orientation tuning is determined by the mean response (F0) and the fundamental modulated response (F1) to drifting gratings at orientations spaced in steps of 22.5 deg (or, for narrowly tuned units, 11.25 deg), presented at a contrast c = (Lmax − Lmin)/(Lmax + Lmin) of 0.5 or 1.0, with spatial and temporal frequency determined by the initial assessment. Next, spatial frequency tuning is determined by responses to drifting gratings at an eight- to 16-fold range of spatial frequencies straddling the value determined by the auditory assessment, a contrast of 0.5 or 1.0, an orientation determined by the orientation tuning run, and a temporal frequency determined by the auditory assessment. Temporal tuning is then assessed by responses to 1-, 2-, 4-, 8-, and 16-Hz drifting gratings at the optimal orientation and spatial frequency. Finally, a contrast response function is determined by responses to drifting gratings at contrasts of 0, 0.0625, 0.125, 0.25, 0.5, and 1.0, with orientation, spatial frequency, and temporal frequency determined by the previous quantitative runs. The position of the receptive field (RF) of the target neuron is then assessed from online-generated poststimulus time histograms (PSTHs) of the response to either a bright or dark bar, moving slowly (≤1 deg/s) and symmetrically about the origin in both directions along the preferred axis. To center the RF along the preferred axis, the stimulus coordinate system origin is digitally adjusted so that the mean of the times of the peak responses (to stimuli swept in each direction) occurs when the bar traverses the origin of the coordinate system. To center the RF in the orthogonal direction, the origin is digitally adjusted so that it lies halfway between the upper and lower edges of the RF, as determined by the appearance of a response to slowly swept patches along multiple trajectories parallel to the preferred axis.

Once centered, the size of the classical RF is determined from responses to a drifting grating (all parameters optimized) presented in discs of increasing diameter and in a series of annuli with fixed outer radius and decreasing inner radii. In each case, stimuli and blanks are presented for 3-s runs, and four to eight randomized repeats are obtained for adequate statistics on the Fourier components of the responses. The effective diameter D of the RF of the target neuron (used below to determine the size of the TDH patterns) was taken to be the smallest inner diameter of an annulus that did not produce a measurable response, as assessed by t-statistics for F0 or T²circ statistics (Victor and Mast 1991) for F1 (as diagrammed in Fig. 2A, unit 1). The set of annuli were chosen so that D was determined to within deg or, for smaller receptive fields, deg.
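The rule for choosing D can be stated compactly in code. The following is only a minimal sketch: the function name, the input format, and the example diameters are hypothetical, and the significance test itself (t-statistics for F0 or T²circ for F1) is assumed to have been applied beforehand and summarized as a boolean per annulus.

```python
def rf_diameter(annulus_tests):
    """Return D, the smallest annulus inner diameter with no measurable response.

    annulus_tests: iterable of (inner_diameter_deg, responded) pairs, where
    `responded` is the outcome of a significance test done elsewhere
    (hypothetical input format, not the authors' data structure).
    """
    silent = [diameter for diameter, responded in annulus_tests if not responded]
    if not silent:
        raise ValueError("every annulus drove the unit; larger inner diameters are needed")
    return min(silent)


# Hypothetical example: annuli with inner diameters of 1-4 deg.
example_tests = [(1.0, True), (2.0, True), (3.0, False), (4.0, False)]
example_D = rf_diameter(example_tests)
```

The resolution of D is set by the spacing of the inner diameters tested, which is why the set of annuli matters.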



 
FIG. 2. Relationship of scaling of the TDH stimuli to the classical receptive field (RF) size. Left: diameter D of the classical RF for the target unit (diagrammed as unit 1) is taken to be the smallest inner diameter of an annulus that did not produce a measurable response (bottom left); other units (diagrammed as unit 2) might lead to a somewhat different choice of D and units might show increasing responses to patches of diameter >D (top). See METHODS for additional details. Parameter σ that defines the spatial spread of the TDH stimuli (see Eqs. A1, A2, A4, and A5) is then chosen as σ = D/10, which produces spatial profiles that are confined to a disk of radius D for low ranks, but extend beyond it for high ranks (right). h0 indicates the radial dependency of the TDH stimulus of rank 0 (common to Cartesian and polar separations); h7 indicates the dependency of the rank-7 Cartesian TDH C0,7 along its long axis.

 
The ratio of the Fourier component at the modulation frequency to the mean, F1/F0, was calculated from the response to a drifting grating, and units were classified as "simple" if F1/F0 > 1 and "complex" if F1/F0 ≤1 (Skottun et al. 1991b). A direction selectivity index DSI = (Rpref − Ranti)/(Rpref + Ranti) was calculated from grating responses F1 or F0, depending on which component dominated the response.
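As an illustration of these two indices, the sketch below computes F0 and F1 from a PSTH and applies the classification rule and the DSI formula given above. It is a minimal sketch under stated assumptions: the function names are hypothetical, the PSTH is assumed to span an integer number of grating cycles, F1 is taken as the amplitude of the fundamental, and the synthetic PSTHs are invented for the example.

```python
import numpy as np


def f0_f1(psth, bin_s, temporal_freq_hz):
    """Mean rate (F0) and amplitude of the fundamental (F1), in the PSTH's units.

    Assumes the PSTH covers an integer number of stimulus cycles.
    """
    psth = np.asarray(psth, dtype=float)
    t = np.arange(psth.size) * bin_s
    f0 = psth.mean()
    # Amplitude of the Fourier component at the grating's temporal frequency.
    f1 = 2.0 * abs(np.mean(psth * np.exp(-2j * np.pi * temporal_freq_hz * t)))
    return f0, f1


def classify(f0, f1):
    """'simple' if F1/F0 > 1, 'complex' otherwise (F0 must be nonzero)."""
    return "simple" if f1 / f0 > 1.0 else "complex"


def dsi(r_pref, r_anti):
    """Direction selectivity index (Rpref - Ranti) / (Rpref + Ranti)."""
    return (r_pref - r_anti) / (r_pref + r_anti)


# Synthetic 1-s PSTHs at 1-ms resolution for a 4-Hz grating (invented data).
t = np.arange(1000) * 0.001
rectified = np.maximum(0.0, 40.0 * np.sin(2 * np.pi * 4.0 * t))  # strongly modulated
elevated = 20.0 + 5.0 * np.sin(2 * np.pi * 4.0 * t)              # weakly modulated

label_rectified = classify(*f0_f1(rectified, 0.001, 4.0))
label_elevated = classify(*f0_f1(elevated, 0.001, 4.0))
```

A half-wave rectified sinusoid has F1/F0 = π/2, which is why a fully rectified linear response falls on the "simple" side of the criterion.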

Usually, there are two to four simultaneously recorded neurons whose spikes are well isolated by the above criteria, and whose spike shapes across the tetrode are reliably discriminated. At some recording sites, some of these neurons differed substantially from the target neuron in RF position, spatial frequency, and/or orientation tuning. At approximately one third of recording sites, we repeated the quantitative characterizations above for one of these additional neurons, so that they could also serve as the "target." Discriminated event pulses corresponding to the tentatively identified single units are logged by the PC that controls the visual stimulus (AS1b board on the VSG system, NI PCI-6602 on the OpenGL system) for on-line analysis. Timing pulses from the PC that controls the visual stimulus are also led to a PC that hosts the Datawave spike-sorting system and records event waveforms (32 samples at 0.04-ms resolution) for later analysis. Off-line spike sorting is performed with an in-house Matlab implementation (Reich 2000) of the methods of Fee (1996) and Sahani (1998). All the data below are derived from these off-line spike sorts. Because the stimulus lineup was performed on the basis of on-line discriminations and the definitive analysis was defined from an independent analysis of stored waveforms, the identification of the "target" neuron in the off-line analysis is only presumptive and plays no role in the quantitative analysis. Moreover, as will be shown, our main findings were present both for neurons in which the receptive field maps were well centered (a set that includes the presumptive target neurons and likely others) and also for neurons whose receptive field maps were off-center but still within the common envelope of the TDH functions.

STIMULATION WITH TWO-DIMENSIONAL HERMITE FUNCTIONS. After characterization and alignment of one or more target neurons, we recorded responses to patches whose spatial contrast was determined by a two-dimensional Hermite function (TDH) (see Fig. 1 and detailed description in the APPENDIX). Each TDH is a polynomial in the coordinates (x, y), multiplied by a Gaussian envelope. Stimuli were rotated so that the x-axis was along the target neuron's preferred orientation and the positive y-axis was the preferred direction for drifting gratings, if any. We set the spatial scale parameter σ of the Gaussian envelope (see APPENDIX Eqs. A1, A2, A4, and A5) at σ = D/10, where D is the diameter of the classical RF of the target neuron as determined by responses to disks and annuli containing the optimal drifting grating.
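To make "a polynomial multiplied by a Gaussian envelope" concrete, the following sketch builds Cartesian TDH patterns numerically. It is illustrative only and rests on assumptions: the polynomials are taken to be the probabilists' Hermite polynomials (orthogonal with respect to the square of the envelope exp[−(x² + y²)/4σ²]), each function is scaled to unit L2 norm rather than by the contrast scaling of the APPENDIX, and the function names and grid are hypothetical.

```python
import math

import numpy as np
from numpy.polynomial.hermite_e import hermeval


def hermite_1d(n, x, sigma):
    """Rank-n 1-D Hermite function: He_n(x/sigma) times exp[-x^2 / (4 sigma^2)].

    Assumed normalization: unit L2 norm (not the paper's contrast scaling).
    """
    u = np.asarray(x, dtype=float) / sigma
    coeffs = np.zeros(n + 1)
    coeffs[n] = 1.0                       # select the single polynomial He_n
    poly = hermeval(u, coeffs)
    envelope = np.exp(-u**2 / 4.0)
    norm = math.sqrt(sigma * math.sqrt(2.0 * math.pi) * math.factorial(n))
    return poly * envelope / norm


def cartesian_tdh(j, k, x, y, sigma):
    """Cartesian TDH with j zero-crossings along x and k along y.

    Returns a 2-D array indexed as [y, x].
    """
    return np.outer(hermite_1d(k, y, sigma), hermite_1d(j, x, sigma))


# Example grid for a unit with D = 3 deg, so sigma = D/10 = 0.3 deg.
sigma = 0.3
axis = np.linspace(-3.0, 3.0, 241)        # degrees of visual angle
pattern_c03 = cartesian_tdh(0, 3, axis, axis, sigma)
pattern_c12 = cartesian_tdh(1, 2, axis, axis, sigma)
```

Under these assumptions, patterns of the same rank (here C0,3 and C1,2) are orthogonal and have equal power, which is the property that lets different basis sets be compared on equal footing.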

The reasoning behind this choice is as follows. The choice of σ simultaneously sets the spatial extent of the two-dimensional Gaussian envelope common to all TDH functions {exp[−(x² + y²)/4σ²]}, and the range of spatial frequencies explored at each rank. As illustrated in Fig. 2, choosing σ = D/10 provides for stimuli that have one, two, or three oscillations within a region of space that covers the receptive field, well-matched to sample (in the Nyquist sense) the typical sensitivity profiles of cortical neurons (Ringach 2002), which have two or three lobes. Had we chosen a substantially larger value of σ, most of the stimuli would have been relatively uniform over the receptive field, and thus would not have examined the spatial frequencies to which the neuron was likely to be tuned. Had we chosen a substantially smaller value of σ, most of the stimuli would have been confined only to a subregion of the receptive field.

This choice is also supported by the sense in which linear combinations of TDH functions represent receptive fields. Linear combination of TDH functions converges to a target spatial profile in a least-squares sense as weighted by the square of the common envelope {i.e., exp[−(x² + y²)/2σ²]}. That is, if σ is large, the approximation will be inefficient in that it will be weighted by areas far removed from the receptive field. Conversely if σ is small, the convergence will not be valid across the entire receptive field until an unreasonably large number of terms have been added. A choice of σ for which TDH profiles have an envelope that is similar to that of the receptive fields to be approximated avoids these difficulties. The fact that sensitivity profiles have relatively stereotyped shapes (Ringach 2002) allows a common universal choice to be made.

We carried out pilot experiments in one cat (two sites, seven units) and one monkey (three sites, nine units) in which we used the standard choice of σ and one or two values that differed from the standard choice. For values that differed from the standard choice by a factor of 1.5 or 1.66, corresponding features of the derived "L" and "E"-filter profiles (see below) could be identified within the common range of convergence. For values that differed from the standard choice by a factor of 2 or 3, only the coarsest commonalities of the maps could be seen, consistent with the above theoretical considerations.

Finally, we also note (as illustrated in Fig. 2) that with this choice of σ = D/10, the contrast profiles of the lowest-rank stimuli lie within the classical RF, although the contrast profiles of the higher-rank stimuli (by design) extend beyond the classical receptive field.
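A rough numerical check of this point is sketched below: for a one-dimensional Hermite profile of a given rank, it estimates the fraction of the profile's power that falls within the classical RF, taken here as |x| ≤ D/2 = 5σ. The sketch assumes probabilists' Hermite polynomials with the envelope exp[−x²/4σ²]; the function name, the grid, and the use of a 1-D cut as a stand-in for the 2-D stimulus are simplifications introduced for illustration.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermeval


def power_fraction_inside(rank, half_width_in_sigma, span=20.0, n_grid=20001):
    """Fraction of a 1-D Hermite profile's power within |x| <= half_width.

    All lengths are in units of sigma. The profile is He_rank(u) * exp(-u^2 / 4),
    an assumed form; normalization cancels in the ratio.
    """
    u = np.linspace(-span, span, n_grid)
    coeffs = np.zeros(rank + 1)
    coeffs[rank] = 1.0
    power = (hermeval(u, coeffs) * np.exp(-u**2 / 4.0)) ** 2
    inside = np.abs(u) <= half_width_in_sigma
    return power[inside].sum() / power.sum()


# With sigma = D/10, the edge of the classical RF lies at D/2 = 5 sigma.
fraction_rank0 = power_fraction_inside(0, 5.0)
fraction_rank7 = power_fraction_inside(7, 5.0)
print(f"rank 0: {fraction_rank0:.4f}, rank 7: {fraction_rank7:.4f}")
```

The rank-0 profile is a plain Gaussian and sits almost entirely inside the RF, whereas the outer lobes of the rank-7 profile carry a visible share of its power past the RF edge.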

The TDH patterns each have the same total power, and contrast was scaled by setting (see APPENDIX Eqs. A1, A2, A4, and A5), so that the maximum contrast was 1.

Each pattern was presented with the polarity shown in Fig. 1, and in inverted contrast polarity. Up to rank 7, this amounted to 144 stimuli (36 Cartesian stimuli, 36 polar stimuli, and their contrast-inverses). Rank 0 and 1 Cartesian and polar stimuli (i.e., the first three stimuli of each set) were identical. These three stimuli and their contrast-inverted counterparts were not duplicated in the stimulus sequence, reducing the number of stimuli to 138 = 144 − (3 × 2). (There is also a single rank 2 duplication, C1,1,σ = A2,0,σ^sin, but both stimuli were presented.) In addition, four stimulus periods of the "blank" stimulus, in which the contrast was held at zero, were added to the sequence. These 142 stimuli were each presented for 250 ms, each followed by 250 ms of a blank, in randomized order, for eight to 16 blocks.
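The stimulus bookkeeping in this paragraph can be checked with a few lines of arithmetic. The sketch below uses only the numbers stated in the text; the variable names are ours, and the block duration is a derived figure that ignores any inter-block pauses.

```python
MAX_RANK = 7

# Each basis has (rank + 1) functions at each rank, for ranks 0..7.
functions_per_basis = sum(rank + 1 for rank in range(MAX_RANK + 1))

# Two bases (Cartesian, polar), each pattern in two contrast polarities.
patterns_with_polarities = functions_per_basis * 2 * 2

# Ranks 0 and 1 (1 + 2 = 3 functions) are identical in the two bases, so
# one copy of each, in both polarities, is dropped from the sequence.
shared_low_rank = sum(rank + 1 for rank in range(2))
unique_patterns = patterns_with_polarities - shared_low_rank * 2

# Four blank (zero-contrast) periods are added to the sequence.
BLANK_PERIODS = 4
stimuli_per_block = unique_patterns + BLANK_PERIODS

# Each stimulus: 250 ms on, followed by 250 ms of blank.
block_duration_s = stimuli_per_block * (0.250 + 0.250)

print(functions_per_basis, patterns_with_polarities, unique_patterns,
      stimuli_per_block, block_duration_s)
```

At 0.5 s per stimulus period, one randomized block of 142 stimuli takes 71 s of stimulation, so eight to 16 blocks correspond to roughly 9.5 to 19 min.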

Visual stimulus generation

Control signals for the CRT display are provided by a PC-hosted VSG2/5 (8 Mb) for grating stimuli and by a separate PC-hosted system optimized for OpenGL (NVidia GeForce3 chipset) for the bar and TDH stimuli, both programmed in Delphi. For presentation, TDH stimuli were discretized as limited by the display resolution. This typically meant at least 64 × 64 display pixels across the stimulus, with each display pixel subtending approximately 1 min. At the edge of each patch, stimulus contrast was reduced to less than -th of its peak value.
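The stated pixel size can be sanity-checked from the viewing geometry. In the sketch below, the 1,024-pixel raster width and the 114-cm viewing distance come from the text, but the visible screen width of 36 cm is an assumed, hypothetical value that is not given in the source; the function name is also ours.

```python
import math


def visual_angle_deg(size_cm, distance_cm):
    """Visual angle subtended by an object of a given size viewed head-on."""
    return math.degrees(2.0 * math.atan(size_cm / (2.0 * distance_cm)))


VIEWING_DISTANCE_CM = 114.0      # from the text
RASTER_WIDTH_PIXELS = 1024       # from the text
ASSUMED_SCREEN_WIDTH_CM = 36.0   # assumption: not stated in the source

pixel_cm = ASSUMED_SCREEN_WIDTH_CM / RASTER_WIDTH_PIXELS
pixel_arcmin = 60.0 * visual_angle_deg(pixel_cm, VIEWING_DISTANCE_CM)
patch_64_pixels_deg = visual_angle_deg(64 * pixel_cm, VIEWING_DISTANCE_CM)
one_cm_deg = visual_angle_deg(1.0, VIEWING_DISTANCE_CM)

print(f"pixel: {pixel_arcmin:.2f} arcmin, 64 pixels: {patch_64_pixels_deg:.2f} deg")
```

At 114 cm, 1 cm on the screen subtends close to 0.5 deg, which is a convenient property of that viewing distance.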

Intensity linearization is separately performed for each display controller by VSG software or in-house software of comparable function.

Histology

At three locations along the electrode track bracketing the recording sites, lesions are made by current passage (typically 3 µA × 3 s, electrode negative). After all recordings, the animal was killed and perfused (4% paraformaldehyde) in phosphate-buffered saline. Digital microphotographs are first taken of histologically unstained 40-µm cryostatic sections under the fluorescence microscope to capture the DiI trace of the track. Digital microphotographs of the same sections are retaken under light microscopy after Nissl staining (Hevner and Wong-Riley 1990), to highlight laminar organization perpendicular to the track as well as the location of lesions. Laminar location of the recording sites is recovered by digital overlay of the image pairs corresponding to a section. Typically two to six consecutive sections fully contain a single track.


    RESULTS
 
Waveform classification of the tetrode recordings yielded 45 units from 12 sites in three cats and 18 units from five sites in two macaques, whose spiking activity could be driven by drifting gratings or bars. All recordings were within 5 deg of the area centralis (cats) or fovea (macaques); 34/45 of the cat units and 17/18 of the macaque units had responses to TDH stimuli that were clearly distinguishable from their baseline activity. We restrict our further analysis to these 51 units.

Example responses to TDH stimuli

We begin by showing some example responses to TDH stimuli, initially describing them qualitatively, and then introducing quantitative approaches and illustrating their application.

SITE 1: RESPONSE HISTOGRAMS. Figure 3 shows PSTHs of responses of four simultaneously recorded units in upper layer III of cat V1. A fifth isolated neuron at this site was not responsive to TDH stimuli. Response histograms are laid out corresponding to the stimulus arrays of Fig. 1, with responses to Cartesian stimuli on the left and responses to polar stimuli on the right. For each stimulus, there is a pair of histograms: in the top histogram of each pair, the stimulus was presented as shown in Fig. 1; in the bottom histogram, the stimulus was presented with reversed polarity. Unit 3003t had a classical RF diameter of 3 deg; thus the size of the stimuli corresponded to σ = 0.3 deg (see METHODS). All units at this site had a similar preferred orientation (45 deg) and, with the exception of unit 3003s, were narrowly tuned. Stimulus coordinate axes were rotated to conform to this common orientation preference.



 
FIG. 3. Poststimulus time histograms (PSTHs) of responses of 4 simultaneously recorded neurons in layer III of cat V1 to TDH functions (left: Cartesian stimuli; right: polar stimuli), each presented for 250 ms and followed by 250 ms of mean illumination. In each pair of histograms, the top histogram is the response to the stimulus shown in Fig. 1, and the bottom histogram is the response to the contrast-inverse of that stimulus. Four pseudocolor maps represent the spatial filters Lcart, Lpolar, Ecart, and Epolar for the model of Fig. 4, derived as described by Eqs. 2 and 4. Circle on each color map is of diameter (D is the diameter of the circle in Fig. 2), which marks the point at which the Gaussian component of each Hermite function falls to e⁻² times its peak value. For each unit, a common linear pseudocolor scale (color bar as shown in top right) is used for the 4 filters, with green representing 0, red representing the highest positive value, and blue representing the lowest negative value. For the units of panels A and B, there is at least a qualitative similarity of the filters L and E deduced from the 2 basis sets. For the unit of panel C, the shapes of the filters differ substantially. For the unit of panel D, there is a difference in the relative strengths of the linear and nonlinear components (L < E for the Cartesian functions; L comparable to E for the polar functions). A, B, C, and D: units 3003t, s, u, and x. PSTH scale bar: 100 impulses/s in all panels. Range for pseudocolor maps of filters: ±10 impulses/s (A), ±37 impulses/s (B), ±10 impulses/s (C), and ±13 impulses/s (D).

 
When studied with gratings, unit 3003t was a nondirectional simple (F1/F0 = 1.8) cell with narrow orientation tuning. Among the Cartesian stimuli, the unit responded only to the stimuli C0,k, which are the only ones that have uninterrupted contrast bands along the preferred orientation. The other Cartesian TDH stimuli Cj,k (j > 0) have j contrast-inversions along the preferred axis.

For some Cartesian stimuli C0,k (e.g., C0,3 and C0,4), this unit responded at stimulus onset, and was quiet at stimulus offset (top histogram of the pair). When the polarity of these stimuli was reversed (bottom histogram of the pair), it was quiet at onset, but produced a burst at stimulus offset. The opposite pattern was seen for Cartesian stimuli C0,2 and C0,5: response at offset for the first polarity, with response at onset for the opposite polarity. Responses to the polar-separated stimuli generally had this temporal pattern as well. This kind of behavior is qualitatively consistent with a linear filter that accounts for spatial selectivity, followed by temporal high-pass filtering and half-wave rectification (e.g., a low maintained firing rate) that accounts for the pattern of responses to a stimulus and its contrast-inverse. The unit responded robustly to some polar TDH stimuli and not to others; as we will see below, this spatial selectivity is fully consistent with that of an oriented filter-then-rectify ("LN") model.

Unit 3003s was a simple (F1/F0 = 1.7) cell, more broadly tuned than unit 3003t, and also not directionally selective. It had a similar temporal pattern of responses to TDH stimuli of each polarity pair, both for the Cartesian and polar stimuli. In contrast to unit 3003t, however, there were also modest responses to stimuli C1,0, C2,0, and C3,0 (stimuli with contrast bands orthogonal to the preferred axis) and also to stimuli C1,2 and C1,3. The latter have contrast bands that run along the preferred axis, but contrast-reverse at the peak of the Gaussian and thus have no power in the preferred orientation. We will see below that these responses, and the selectivity for polar TDH stimuli, are also consistent with an LN model, but one with a broader orientation tuning associated with the initial linear stage.

Unit 3003u was a complex cell (F1/F0 = 0.6), and had a very different pattern of responses to TDH stimuli. Responses to the Cartesian stimuli were generally independent of stimulus polarity. Qualitatively consistent with a model consisting of an oriented filter followed by a mostly even nonlinearity, the largest responses in unit 3003u occurred for Cartesian stimuli C0,k, i.e., the stimuli whose contrast bands were aligned with the preferred orientation. Responses to polar stimuli, when present, were also independent of stimulus polarity. However, as we will see below, the pattern of selectivity for polar stimuli is inconsistent with the oriented filter that is implied by the selectivity for Cartesian stimuli.

Unit 3003x was also a complex cell (F1/F0 = 0.28), and, like unit 3003u, had responses to Cartesian stimuli that were generally independent of stimulus polarity and primarily responded to Cartesian stimuli C0,k. However, responses to polar stimuli, such as those in the middle of row 3 (A0,1, rank 2) and row 5 (A0,2, rank 4), were strongly dependent on stimulus polarity. Thus although this neuron's polarity-dependency for Cartesian stimuli conformed to the expectations of a complex cell, many responses to polar stimuli were polarity-dependent, like those of units 3003t and 3003s above.

AN EXTENDED LN MODEL. To make the above qualitative observations more precise, we introduce a modified filter-then-rectify model, as shown in Fig. 4. This model is not intended to correspond to anatomy or a wiring diagram, but rather to provide a means to compare the spatial selectivities of responses to Cartesian and polar stimuli. Below (see Energy models) we will also show that, as a consequence of some properties of the TDH functions, the measurements used to test the filter-then-rectify model can also be used to test several variants of energy models.



FIG. 4. Filter-then-rectify framework for analyzing responses to Cartesian and polar TDH stimuli. L and E represent spatial filters; E is followed by full-wave rectification. This model is used to deduce the filter maps presented in Figs. 3, 5, and 6. For further details, see text.

 
Model description.    One branch of the model, characterized by a linear filter L, encompasses "ON" and "OFF" inputs that behave in a linear fashion. A second branch, consisting of a linear filter E followed by full-wave rectification, generates ONOFF responses. The outputs of these branches are added together, along with a maintained firing rate Rm, to produce the neuron's output. The standard filter-then-rectify model makes specific predictions about the relationship between L and E, and conversely, combinations of L and E can be reinterpreted in terms of ON and OFF inputs (see below).

To determine L and E from our data, we make the simplifying assumption that the neural response to each stimulus can be characterized by a scalar "response measure." For this purpose, we will initially use the total spike count during stimulus presentation; dynamics will be considered later. With this initial simplification, the model response R(S) to a stimulus S is

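\[
R(S) = R_m + \iint L(x, y)\, S(x, y)\, dx\, dy + \left| \iint E(x, y)\, S(x, y)\, dx\, dy \right|
\tag{1}
\]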
where L(x, y) represents the spatial weighting of the filter L in the "linear" branch and E(x, y) represents the spatial weighting of the filter E that precedes a full-wave rectification.

To determine the filters L and E from the responses to a set of TDH functions fk and their negatives –fk, we use the fact that each set of TDH functions (either Cartesian or polar) is an orthogonal basis. We therefore can express L and E as a sum of TDH functions


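\[
\begin{aligned}
L(x, y) &= \sum_a L_a f_a(x, y) \\
E(x, y) &= \sum_a E_a f_a(x, y)
\end{aligned}
\tag{2}
\]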
where La and Ea are the scalar coefficients in these two orthogonal expansions. It follows from the orthonormality of the functions fk that the responses of the model (Eq. 1) to the inputs fk and –fk are given by


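\[
\begin{aligned}
R(f_k) &= R_m + L_k + |E_k| \\
R(-f_k) &= R_m - L_k + |E_k|
\end{aligned}
\tag{3}
\]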
From Eq. 3 it follows that


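\[
\begin{aligned}
L_k &= \frac{R(f_k) - R(-f_k)}{2} \\
|E_k| &= \frac{R(f_k) + R(-f_k)}{2} - R_m
\end{aligned}
\tag{4}
\]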
This strategy of separating linear and nonlinear components based on responses to stimuli of opposite parity is similar to an approach suggested for sparse noise stimulation (Nykamp 2003); here we exploit the fact that the strategy does not require nonoverlapping stimuli, but merely orthogonal stimuli. The determination of L by Eq. 4 can be viewed as a variant of a "subspace reverse-correlation" approach (Ringach et al. 1997b), where we have chosen the subspace to consist of functions limited in spatial extent and bandwidth. Because we have two basis sets for the same subspace, one test of the model (see below) is that the determination of L must be the same for each basis set (Ringach et al. 1997b).

Equation 4 shows how the responses to either basis set specify the coordinates Lk and |Ek|. Conversely, as shown by Eq. 3, the coordinates Lk and |Ek|, along with the maintained firing rate Rm, fully and exactly specify the responses to each stimulus within the basis set used. Thus our strategy for testing the model of Fig. 4 is not to check consistency of filters L and E as determined with one basis set with the raw responses, but rather to check consistency of these filters across basis sets.
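The separation into odd and even parts can be sketched in a few lines of Python. The sketch assumes, as the model description implies, that the response to ±fk is Rm ± Lk + |Ek|; the function names and the spike counts are hypothetical and serve only to illustrate the round trip between responses and coordinates.

```python
def extract_coefficients(resp_pos, resp_neg, r_m):
    """Split responses to f_k and -f_k into the odd part L_k and the even part |E_k|.

    resp_pos[k], resp_neg[k]: scalar response measures for f_k and -f_k.
    r_m: maintained firing rate (response to a blank).
    """
    lin = [(p - n) / 2.0 for p, n in zip(resp_pos, resp_neg)]
    even = [(p + n) / 2.0 - r_m for p, n in zip(resp_pos, resp_neg)]
    return lin, even


def predict_responses(lin, even, r_m):
    """Inverse mapping: rebuild the responses to f_k and -f_k from L_k and |E_k|."""
    pos = [r_m + l + e for l, e in zip(lin, even)]
    neg = [r_m - l + e for l, e in zip(lin, even)]
    return pos, neg


# Hypothetical spike counts for three basis functions and their contrast-inverses.
resp_pos = [12.0, 3.0, 7.5]
resp_neg = [4.0, 3.0, 1.5]
r_m = 2.0

lin, even = extract_coefficients(resp_pos, resp_neg, r_m)
rebuilt = predict_responses(lin, even, r_m)
```

Because the mapping is invertible, `rebuilt` reproduces the input responses exactly, which is why responses to a single basis set cannot by themselves falsify the model.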

Relation to notions of linearity, "simple" and "complex."    The model of Fig. 4 will behave in a linear fashion if, and only if, E = 0; in this case, the positive and negative lobes of L correspond to the ON and OFF subfields. Special cases of the model for E ≠ 0 correspond to idealized "simple" and "complex" cells, as characterized by subfield organization (Hubel and Wiesel 1959, 1968). [We do not mean to imply that this distinction is identical to the simple vs. complex distinction based on the response to drifting gratings (Kagan et al. 2002b; Skottun et al. 1991b); the relationship of our model to models that focus on phase dependency is discussed below.] When L = E the model behavior is that of a linear filter, followed by half-wave rectification (i.e., negative signals are set to 0, positive signals are unchanged). This corresponds to an idealized "simple" cell with nonoverlapping ON and OFF subfields, and linear combination of these signals before an output nonlinearity arising from the requisite nonnegativity of the firing rate. When L = 0 (but E ≠ 0) the model behavior is that of a linear filter, followed by full rectification (i.e., negative and positive signals are set to their absolute value). That is, the model produces ON and OFF responses in coextensive areas of space, and is thus an idealized "complex" cell. If L and E are both nonzero but have a similar shape, the model of Fig. 4 simplifies into a one-pathway (LN) model, in which the nonlinearity is partially or asymmetrically rectifying (i.e., intermediate between linear and "simple" or between "simple" and "complex"). Models in which L and E have different shapes, the general case of Eq. 1, correspond to cells with a mixture of spatially distinct ON, OFF, and ONOFF subfields.
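The three special cases can be checked numerically. The following minimal sketch assumes the two-branch form described above (maintained rate, plus the projection of the stimulus onto L, plus the absolute value of its projection onto E); `model_response` and the drive values are hypothetical. Note that when L = E the output is u + |u| = 2 max(u, 0), i.e., half-wave rectification up to a gain factor of 2.

```python
def model_response(l_drive, e_drive, r_m=0.0):
    """Two-branch model output.

    l_drive: projection of the stimulus onto L (linear branch).
    e_drive: projection of the stimulus onto E (rectified branch).
    r_m: maintained firing rate.
    """
    return r_m + l_drive + abs(e_drive)


drives = [-2.0, -0.5, 0.0, 1.5]

# L = E: both branches receive the same drive u, giving u + |u| = 2*max(u, 0).
simple_like = [model_response(u, u) for u in drives]

# L = 0: only the rectified branch contributes, giving |u|.
complex_like = [model_response(0.0, u) for u in drives]

# E = 0: only the linear branch contributes, giving u itself.
linear = [model_response(u, 0.0) for u in drives]
```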

Conversely, any feedforward neuron with a single nonlinearity consisting of half-wave, full-wave, or intermediate (asymmetric) rectification can be recast into the form of Eq. 1, by considering its responses to stimuli and their inverses. Once this has been done, the shapes and magnitudes of the filters L and E should be independent of the basis set used in Eqs. 2 and 4. [For L this is the argument of Ringach et al. (1997b); it extends to E by the symmetry-based separation of Eq. 4.] Because Cartesian and polar stimuli each constitute a basis set, the above model allows us to ask whether the responses to Cartesian and polar stimuli are consistent with a large category of feedforward models and, if not, the manner in which they deviate.

Some details.    The above procedure determines L uniquely (within the linear span of the fk), but the filter E is ambiguous because the data determine the magnitude of each coefficient in its orthogonal expansion, but not its sign (Eq. 4, bottom portion). Any assignment of signs to the coefficients for E will result in a filter that will lead to the same responses. For the purposes of graphical display, we choose the signs of the coefficient Ek to match that of Lk. This is a conservative choice, in that it leads to a visual rendition for E that is as similar as possible (within the constraints of the data) to that of L. Other strategies for fixing the signs of the coefficients Ek, such as minimizing the spatial extent of E or making it as sparse as possible, might also be considered. All of our statistical analyses related to E are based on the absolute values of its coefficients in the orthogonal expansion (Eq. 4, bottom portion) and are thus unaffected by the method chosen to resolve the sign ambiguity.
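The display convention for the signs of Ek can be sketched as follows. This is an illustration only: the function name is hypothetical, and the tie-break for Lk = 0 (a positive sign) is an assumption not stated in the text. The check at the end illustrates why statistics based on the absolute values of the coefficients are unaffected by the sign choice.

```python
import math


def assign_display_signs(lin, even_abs):
    """Give each |E_k| the sign of the matching L_k (used for display only).

    Assumption: when L_k is exactly 0, the positive sign is used.
    """
    return [math.copysign(e, l) for l, e in zip(lin, even_abs)]


# Hypothetical coefficients.
lin = [4.0, -1.0, 0.0]
even_abs = [6.0, 2.0, 1.0]

e_signed = assign_display_signs(lin, even_abs)

# The summed squared coefficients of E do not depend on the signs chosen.
energy_unsigned = sum(e * e for e in even_abs)
energy_signed = sum(e * e for e in e_signed)
```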

A second detail is that we set Ek = 0 if, on a trial-by-trial basis, the mean response [R(fk) + R(–fk)]/2 did not deviate from the response elicited by a blank, at a 95% confidence limit (by t-test). The implications of this manipulation are discussed below.

Calculations of the filters L and E were performed on a grid of 64 x 64 or larger with σ set equal to of the grid. On this grid, numerical approximations to orthogonality were better than one part in 10^5 and the largest values of the functions that lay beyond the grid were < of the peak. Thus the consequences of discretization, both in the display of the functions (see Visual stimulus generation) and subsequent analysis, are negligible. Because the basis functions are smooth, the finite linear combinations of these functions as specified in Eq. 2 are smooth as well and no further smoothing was applied.

The number of spikes used to calculate the maps ranged from 457 to 41,797, with a mean of 6,597 and a median of 4,135. Data sets with relatively few spikes were included only if the responses appeared reliable (e.g., PSTHs clearly modulated by stimulus appearance and disappearance).

Indices.    To determine the extent to which the estimated filters L and E correspond to certain idealized notions of simple and complex cells (and to mitigate the difficulties related to the ambiguity of E), we construct two kinds of indices, Isym and Ishape. Isym, which is calculated separately for each basis set (denoted Isymcart or Isympolar), compares the strengths of the filters L and E, but ignores their shapes. Generically

\[
I_{sym} = \frac{|E|^2 - |L|^2}{|E|^2 + |L|^2}
\tag{5}
\]
Here, |L|2 and |E|2 indicate spatial integrals of the squared response profiles


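\[
\begin{aligned}
|L|^2 &= \iint L(x, y)^2\, dx\, dy = \sum_a L_a^2 \\
|E|^2 &= \iint E(x, y)^2\, dx\, dy = \sum_a E_a^2
\end{aligned}
\tag{6}
\]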
where La and Ea are the coefficients determined by Eq. 4 from the basis set of interest. The second equality on each line is a consequence of the orthonormality of the basis functions. Note that this implies that |E|2 is independent of the signs assigned to each coefficient Ea.

For an idealized complex cell (in the sense of overlapping, equally strong, ON and OFF subregions), L = 0 and so Isym = 1. For an idealized simple cell (in the sense of separate ON and OFF subregions) consisting of a linear filter followed by half-wave rectification, L = E and so Isym = 0. For a cell that is truly linear (e.g., has a sufficiently high firing rate to avoid rectification), E = 0 and so Isym = –1. Intermediate values of Isym correspond to asymmetric rectification; overrectification for 1 > Isym > 0 (negative and positive signals both transformed to signals of the same sign, but with unequal gains) and underrectification for 0 > Isym > –1 (negative and positive signals unchanged in sign, but transmitted with unequal gains).
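A minimal sketch of the symmetry index follows. It assumes that Isym has the form (|E|² – |L|²)/(|E|² + |L|²), which is the form implied by the three limiting cases just described, with |L|² and |E|² computed as sums of squared expansion coefficients. The function name and coefficient values are hypothetical.

```python
def i_sym(lin, even_abs):
    """Symmetry index from expansion coefficients L_a and |E_a|.

    Returns +1 when L = 0, 0 when |L| = |E|, and -1 when E = 0.
    """
    l2 = sum(l * l for l in lin)
    e2 = sum(e * e for e in even_abs)
    if l2 + e2 == 0.0:
        raise ValueError("both filters are zero; the index is undefined")
    return (e2 - l2) / (e2 + l2)


complex_like = i_sym([0.0, 0.0], [1.0, 2.0])   # idealized complex cell
simple_like = i_sym([1.0, -2.0], [1.0, 2.0])   # idealized simple cell
linear = i_sym([1.0, -2.0], [0.0, 0.0])        # truly linear cell
```

Only the magnitudes of the coefficients of E enter, so the index is insensitive to the sign ambiguity in E.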

The model of Fig. 4 places no constraints on the shape of L, but requires that L is independent of basis set (Cartesian vs. polar). To test this prediction, we use an index Ishape, the spatial correlation coefficient of the estimates Lcart and Lpolar derived from the two basis sets. For Lcart and Lpolar expressed as maps

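\[
I_{shape}(L_{cart}, L_{polar}) = \frac{\iint L_{cart}(x, y)\, L_{polar}(x, y)\, dx\, dy}{|L_{cart}|\, |L_{polar}|}
\tag{7}
\]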
Equivalently, expressed in terms of the expansion coefficients of Eq. 2

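\[
I_{shape}(L_{cart}, L_{polar}) = \frac{\sum_{a} \sum_{b} L^{cart}_a\, c_{a,b}\, L^{polar}_b}{\sqrt{\sum_a \left(L^{cart}_a\right)^2 \sum_b \left(L^{polar}_b\right)^2}}
\tag{8}
\]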
where ca,b is the dot product of the ath Cartesian function and the bth polar function

\[
c_{a,b} = \iint f^{cart}_a(x, y)\, f^{polar}_b(x, y)\, dx\, dy
\tag{9}
\]
For filter shape to be independent of basis set, Ishape(Lcart, Lpolar) = 1.
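The coefficient form of the index can be sketched as below. The sketch assumes that Ishape is the normalized inner product of the two maps (no mean subtraction), written in terms of the expansion coefficients and the cross-basis dot products ca,b. The two-function "bases" related by a rotation are a hypothetical stand-in for the Cartesian and polar sets, chosen because any two orthonormal bases of the same subspace are related by an orthogonal matrix.

```python
import math


def i_shape(l_cart, l_polar, c):
    """Normalized inner product of two filters given in different bases.

    c[a][b]: dot product of the a-th Cartesian and b-th polar basis function.
    """
    num = sum(
        l_cart[a] * c[a][b] * l_polar[b]
        for a in range(len(l_cart))
        for b in range(len(l_polar))
    )
    norm = math.sqrt(sum(x * x for x in l_cart) * sum(x * x for x in l_polar))
    return num / norm


# Hypothetical cross-basis matrix: two orthonormal bases related by a rotation.
theta = 0.3
c = [[math.cos(theta), math.sin(theta)],
     [-math.sin(theta), math.cos(theta)]]

l_cart = [2.0, -1.0]
# The same filter re-expressed in the second basis: L_b = sum_a L_a * c[a][b].
l_polar_same = [sum(l_cart[a] * c[a][b] for a in range(2)) for b in range(2)]

same_shape = i_shape(l_cart, l_polar_same, c)          # close to 1
orthogonal_shape = i_shape([1.0, 2.0], l_polar_same, c)  # close to 0
```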

The model of Fig. 4 reduces to a single-pathway model when L and E have the same shape. Analogous indices Ishape (Lcart, Ecart) and Ishape (Lpolar, Epolar) express the similarity of the estimated shapes of these filters, as determined from each basis set. Because of the sign ambiguity in the determination of E, we choose the conservative definitions

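\[
I_{shape}(L_{cart}, E_{cart}) = \frac{\sum_a \left|L^{cart}_a\right| \left|E^{cart}_a\right|}{|L_{cart}|\, |E_{cart}|}
\tag{10}
\]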
and similarly for Ishape(Lpolar, Epolar). This definition is conservative in that it makes Ishape(L, E) as close to 1 as possible, consistent with the data.
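The sense in which this definition is conservative can be illustrated by brute force. The sketch below assumes the index is the normalized sum of products of coefficient magnitudes, and compares it with the normalized inner product obtained under every possible assignment of signs to the coefficients of E. The function names and coefficient values are hypothetical.

```python
import itertools
import math


def correlation(u, v):
    """Normalized inner product of two coefficient vectors."""
    num = sum(a * b for a, b in zip(u, v))
    den = math.sqrt(sum(a * a for a in u) * sum(b * b for b in v))
    return num / den


def i_shape_conservative(lin, even_abs):
    """Largest similarity of L and E compatible with the known magnitudes |E_a|."""
    return correlation([abs(l) for l in lin], [abs(e) for e in even_abs])


# Hypothetical coefficients.
lin = [3.0, -4.0, 1.0]
even_abs = [2.0, 1.0, 5.0]

conservative = i_shape_conservative(lin, even_abs)

# Similarity under every possible sign assignment for the coefficients of E.
all_signings = [
    correlation(lin, [s * e for s, e in zip(signs, even_abs)])
    for signs in itertools.product((1.0, -1.0), repeat=len(even_abs))
]
```

No sign assignment exceeds `conservative`, and the assignment that matches the signs of the coefficients of L attains it.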

Estimates of Isym and Ishape quoted below were calculated from Eqs. 5, 8, and 10 and debiased by a jackknife procedure (Efron and Tibshirani 1998) based on each block of trials; quoted SEs of measurement were determined in a similar fashion.
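For readers unfamiliar with the jackknife, the following is a minimal sketch of the standard delete-one version; whether the analysis used exactly this variant is an assumption. In the analysis each element of `blocks` would be one block of trials and `statistic` would recompute Isym or Ishape from the remaining blocks; here a plug-in variance on hypothetical numbers stands in for the index, because its bias is known in closed form.

```python
import math


def jackknife(blocks, statistic):
    """Delete-one-block jackknife: returns (debiased estimate, standard error)."""
    n = len(blocks)
    full = statistic(blocks)
    leave_one_out = [statistic(blocks[:i] + blocks[i + 1:]) for i in range(n)]
    loo_mean = sum(leave_one_out) / n
    debiased = n * full - (n - 1) * loo_mean
    se = math.sqrt((n - 1) / n * sum((t - loo_mean) ** 2 for t in leave_one_out))
    return debiased, se


def plug_in_variance(values):
    """Biased variance estimate (divides by n rather than n - 1)."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def mean(values):
    return sum(values) / len(values)


blocks = [1.0, 2.0, 3.0, 4.0]  # hypothetical per-block values

debiased_var, se_var = jackknife(blocks, plug_in_variance)
debiased_mean, se_mean = jackknife(blocks, mean)
```

For the plug-in variance the jackknife recovers the usual unbiased estimate, and for the mean it returns the familiar standard error of the mean.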

As described above, before the calculation of the maps and indices we set Ek = 0 if the raw estimates of Ek did not deviate significantly from zero. Nearly all the excluded values were slightly positive. This exclusion avoids the tendency of random spikes to bias the map of E toward that of L [i.e., removes a bias of Ishape(L, E) toward the null hypothesis value of 1] and also reduces random contributions to estimates of the overall size E (i.e., removes a bias of Isym away from the null hypothesis value of 1). Although such biases would also be removed (in the asymptotic limit) by the jackknifing procedure, we considered it more appropriate to remove them at the source. Moreover, because their responses are small, including or excluding them has only a small effect on the filter maps and the resulting statistics, as confirmed by reanalysis of the full data set from one animal without this exclusion.

SITE 1: QUANTITATIVE ANALYSIS. We now use the above model to analyze the responses at Site 1 (Fig. 3). For unit 3003t (Fig. 3A), the filter L extracted from the Cartesian responses had several parallel lobes oriented along the preferred orientation, consistent with a Gabor-like spatial filter. Shapes of the L filters determined from the Cartesian and polar responses were similar but statistically distinguishable: Ishape(Lcart, Lpolar) = 0.74 ± 0.09. The overall size of the L filter was somewhat larger than that of the E filter, indicating that the "linear" responses dominated the "ONOFF" responses (3003t had F1/F0 = 1.8). Correspondingly, Isymcart = –0.63 ± 0.22 and Isympolar = –0.34 ± 0.16, consistent with underrectification. Finally, the even-order pathway filter (E) and the L filter had similar shapes, [Ishape(Lcart, Ecart) = 0.79 ± 0.17, Ishape(Lpolar, Epolar) = 0.89 ± 0.13], consistent with a reduction of the model of Fig. 4 to a single-pathway LN model. In sum, the indices show that, although a single-pathway feedforward model with an underrectifying nonlinearity might be considered as a first approximation, a more quantitative analysis reveals clear deviations from this picture.

Unit 3003s (Fig. 3B, F1/F0 = 1.7, broad orientation tuning) showed little deviation from a simple LN model, even when analyzed quantitatively. Consistent with its broader orientation tuning, the sensitivity profile of the L filter had only two lobes, and the lobes were less elongated than those of unit 3003t. The shapes of these filters were similar as determined from either Cartesian or polar stimuli: Ishape(Lcart, Lpolar) = 0.94 ± 0.02. As with unit 3003t, the "linear" responses dominated the ONOFF components: Isymcart = –0.59 ± 0.02 and Isympolar = –0.60 ± 0.02, consistent with underrectification. The L filters and E filters were similar in shape [Ishape(Lcart, Ecart) = 0.99 ± 0.01 and Ishape(Lpolar, Epolar) = 0.97 ± 0.02], consistent with a reduction to an LN model.

The other two units at this location had very different behavior. For unit 3003u (Fig. 3C), although both Cartesian and polar stimuli elicited responses that led to oriented, Gabor-like maps, the orientation of these maps differed by approximately 37 deg. Correspondingly, Ishape(Lcart, Lpolar) = 0.37 ± 0.15, a substantial deviation from 1. Consistent with its low F1/F0 ratio of 0.6, the E filter dominated the L filter: Isymcart = 0.48 ± 0.12 and Isympolar = 0.32 ± 0.22. For the Cartesian stimuli, the shapes of the L and E filters were similar [Ishape(Lcart, Ecart) = 0.86 ± 0.15]; there was a moderate difference for the polar stimuli [Ishape(Lpolar, Epolar) = 0.57 ± 0.16]. Thus if only the responses to Cartesian stimuli, or only the responses to polar stimuli, are considered, this neuron's response is generally consistent with an oriented filter followed by overrectification (producing an ONOFF response). However, the full set of responses is qualitatively inconsistent with this picture: the apparent orientation of the initial filter depends substantially on the basis set used.

Unit 3003x (Fig. 3D, F1/F0 = 0.28) showed yet another kind of behavior. The spatial maps of the L and E filters were similar across basis set [Ishape(Lcart, Lpolar) = 0.97 ± 0.14] and similar to each other [Ishape(Lcart, Ecart) = 0.98 ± 0.12 and Ishape(Lpolar, Epolar) = 1.0 ± 0.07]. However, confirming the impression that responses to Cartesian stimuli were more symmetrically ONOFF than responses to polar stimuli, Isymcart = 0.56 ± 0.11 (overrectification) but Isympolar = 0.13 ± 0.18 (half-wave rectification). In sum, although an LN model can give a reasonable account of the responses to either stimulus set alone, the apparent degree of nonlinearity for this unit is substantially higher for Cartesian than for polar stimuli.

SITE 2. Figure 5 shows TDH responses from three neurons in a cluster located in upper layer VI/lower layer V of cat V1, and emphasizes the heterogeneity of behavior encountered. Unit 3301s (Fig. 5A) and 3301t (Fig. 5B) had similar orientation optima for drifting gratings (100 and 90 deg), whereas unit 3301u had an optimum orientation of 200 deg. Unit 3301t was strongly directionally selective; the other two neurons were not. The Cartesian responses of units 3301s and t, as characterized by L and E filters, were oriented along their preferred orientation, and similar in shape: Ishape(Lcart, Ecart) = 0.99 ± 0.01 for 3301s; Ishape(Lcart, Ecart) = 0.94 ± 0.03 for 3301t. The relative sizes of the L and the E filters were also consistent with the degree of nonlinearity seen in the grating responses. That is, the relative sizes of the L and the E filters for both units were in the "underrectification" range: unit 3301t had a larger contribution from the E filter than unit 3301s (Isymcart = –0.62 ± 0.03 for 3301s, Isymcart = –0.10 ± 0.08 for 3301t), corresponding to the difference in their F1/F0 ratios (F1/F0 = 1 for 3301s, F1/F0 = 0.1 for 3301t). However, both units were nearly unresponsive to polar stimuli. This behavior is qualitatively inconsistent with an LN picture: a broadly tuned front end could not account for the absence of responses to the polar stimuli because they overlap extensively with the Cartesian stimuli in spatial frequency content, whereas a narrowly tuned front end could not account for the presence of responses to the Cartesian stimuli across many ranks. In the 15 other neurons recorded at four other infragranular recording sites in cat, we encountered one additional neuron that responded well to Cartesian stimuli but not to polar stimuli. No such neurons were encountered in the single infragranular site in macaque (layer V, four neurons).



FIG. 5. PSTHs of responses of 3 simultaneously recorded neurons in upper layer VI/lower layer V of cat V1 to TDH functions. Data are displayed as in Fig. 3. Units of A and B respond nearly exclusively to the Cartesian stimuli; the unit of C responds in a similar fashion to both basis sets. A, B, and C: units 3303s, t, and u. PSTH scale bar: 75 impulses/s in A and B, 50 impulses/s in C. Range for pseudocolor maps of filters: ±6 impulses/s (A), ±8 impulses/s (B), and ±5 impulses/s (C).

 
Unit 3301u (Fig. 5C) had a somewhat smaller response, but was approximately equally responsive to Cartesian and polar stimuli. The maps of the L and E filters were consistent with an orientation preference nearly perpendicular to that of the other two units. Notably, this neuron, which would be classified as "complex" from its grating responses (F1/F0 = 0.3), had a predominantly linear response (Isymcart = –0.85 ± 0.15, Isympolar = –0.75 ± 0.10).

MACAQUE RECORDINGS. Figure 6A shows responses from unit 5013s, one of three units simultaneously recorded in layer IVb of macaque V1. All three neurons were poorly oriented simple cells (F1/F0 ≥ 1.6) and directionally biased (0.2 ≤ DSI ≤ 0.6), consistent with the preponderance of directional-selective neurons in layer IVb (Hawken et al. 1988). Responses were robust, reaching 150 impulses/s. For each pair of opposite-polarity stimuli, the neuron responded at onset to one stimulus, and at stimulus offset to the other. The other two neurons had largely overlapping receptive field profiles, but differed in the sizes of the spike waveform across the tetrode channels, and in response dynamics. As in unit 3003t of Fig. 3A, quantitative analysis indicated consistency of both the Cartesian and polar responses [Ishape(Lcart, Lpolar) = 0.99 ± 0.01] with a one-pathway simplification of Fig. 4 [Ishape(Lcart, Ecart) = 0.99 ± 0.01, Ishape(Lpolar, Epolar) = 0.98 ± 0.01] and an underrectifying nonlinearity: Isymcart = –0.71 ± 0.04, Isympolar = –0.69 ± 0.04.



FIG. 6. PSTHs of responses of 4 neurons at separate locations in macaque V1. Data are displayed as in Fig. 3. For the unit of A, but not for the other units, the filters L and E deduced from the 2 basis sets are similar. A, B, and C: units 5013s, 5007t, and 5008s. PSTH scale bar: 150 impulses/s in A and B, 75 impulses/s in C. Range for pseudocolor maps of filters: ±60 impulses/s (A), ±40 impulses/s (B), and ±50 impulses/s (C).

 
Along this penetration at the layer IVb/c border, we isolated two units, a nonoriented complex unit 5007s (F1/F0 = 0.1) and the narrowly tuned directionally selective (DSI = 0.8) unit 5007t (F1/F0 = 0.8) shown in Fig. 6B. Responses of 5007s to TDH stimuli were consistent with the standard picture of a complex cell (small L, large but nonoriented E, for both Cartesian and polar stimuli). Corresponding to its intermediate F1/F0 ratio, unit 5007t (Fig. 6B) responded in an excitatory fashion to the appearance of both members of a polarity pair, although the sizes of these responses were often not equal, qualitatively consistent with a mixture of quasilinear and ONOFF inputs. The filters Lcart and Ecart consisted of elongated domains consistent with the orientation preference for grating stimuli; the orientation domains were similar [Ishape(Lcart, Ecart) = 0.88 ± 0.04] but were more clearly delineated for Ecart than for Lcart. The filter Lpolar was similar to that of Lcart [Ishape(Lcart, Lpolar) = 0.94 ± 0.05] and consisted of small, minimally elongated blobs, although the dominant orientation of elongated components of Epolar was shifted approximately 20 deg with respect to that of Ecart. [We do not calculate an index Ishape(Ecart, Epolar) because of the ambiguities in the estimation of the E filters, as described above.] The symmetry index was shifted modestly in the direction of greater rectification for polar stimuli: Isymcart = 0.39 ± 0.06, Isympolar = 0.53 ± 0.06.

Near the border of layer IVcβ and layer V, we isolated three complex cells (F1/F0 ratio of 0.1 to 0.15), all of which were highly responsive to gratings and directionally biased or directionally selective (DSI ≥ 0.5). Histologically, this recording site was at the lower border of layer IVcβ. However, these response properties are more consistent with neurons in upper layer V, which would be within the likely recording sphere of the tetrode (Gray et al. 1995). One of these three units, 5008s (Fig. 6C), had predominantly even-order inputs for both basis sets: Isymcart = 0.78 ± 0.09, Isympolar = 0.88 ± 0.13. However, a clear oriented receptive field domain consistent with this neuron's orientation tuning for gratings was seen only for Ecart (the horizontal excitatory subregion). In contrast, Lcart and Lpolar were weak and nonoriented; Epolar was strong but its orientation was not consistent with the orientation tuning of the grating responses. Also at the same recording site, responses generated by unit 5008t (not shown) were partially consistent with the standard picture of a complex cell (small L, large E for both Cartesian and polar stimuli), but Ecart and Epolar differed substantially in shape. Unit 5008u (also not shown) was the only macaque unit that responded well to drifting gratings but not to TDH stimuli. This was the most directionally selective neuron we encountered (DSI ≈ 1.0).

Population summary    Above, we introduced indices derived from an extension of the standard linear–nonlinear model to analyze the responses to Cartesian and polar TDH stimuli. The index Ishape(Lcart, Lpolar) (Eq. 8) indicates the extent to which the linear filters that best account for the responses to the Cartesian and polar stimuli are similar. (An analogous index for the even-order responses is not straightforward to calculate because of the sign ambiguities described above in connection with Eq. 10). The indices Ishape(Lcart, Ecart) and Ishape(Lpolar, Epolar) (Eq. 10) determine for responses to each basis set, to what extent the two-pathway model of Fig. 4 reduces to a single-pathway model. The indices Isym(Lcart, Ecart) and Isym(Lpolar, Epolar) (Eq. 5) determine, for responses to each basis set, whether the response is primarily full-wave rectifying (Isym = 1), consistent with linearity (Isym = –1), or intermediate. We now examine the distribution of these indices and related quantities across the population.

RECEPTIVE FIELD SHAPE. A value of 1 for the index Ishape(Lcart, Lpolar) corresponds to equality of the estimated Cartesian and polar filter shapes, but measurement errors would tend to bias estimates of Ishape downward away from 1. Therefore, as described in METHODS, we used the jackknife procedure (Efron and Tibshirani 1998) to debias the estimates and to determine confidence limits on them. Across the 51 neurons (Fig. 7A), the debiased estimate of Ishape had a mean of 0.76 ± 0.26, with f0.05 = 28/51 and f0.01 = 14/51 (here and below, population statistics are summarized as mean ± SD of the debiased estimates, along with f0.05, the fraction significantly <1 at P < 0.05, and f0.01, the fraction significantly <1 at P < 0.01).



FIG. 7. Distribution of the index Ishape(Lcart, Lpolar) (Eq. 8). Values <1 indicate different effective filtering behavior for Cartesian and polar stimuli. Portions of the histograms shaded black represent units for which values were significantly (by jackknife) <1 at P < 0.01; portions shaded gray are significant at 0.01 < P < 0.05; unshaded portions correspond to P > 0.05. Each panel contains calculations based on a different response measure.

 
The cat subset (mean 0.73 ± 0.27, f0.05 = 21/34, f0.01 = 9/34) and the macaque subset (mean 0.81 ± 0.24, f0.05 = 9/17, f0.01 = 5/17) were similar to each other in this regard (P > 0.20 by Kruskal–Wallis test). The simple cell subset (mean 0.82 ± 0.19, f0.05 = 7/10, f0.01 = 4/10) and the complex cell subset (mean 0.75 ± 0.27, f0.05 = 21/41, f0.01 = 10/41) were also not statistically distinguishable (P > 0.20 by Kruskal–Wallis test). There was a suggestion (P = 0.07, Kruskal–Wallis test) that differences between the filters derived from Cartesian and polar stimuli were more prominent in the infragranular recordings (mean 0.66 ± 0.30, f0.05 = 14/22, f0.01 = 8/22) than in granular (mean 0.83 ± 0.19, f0.05 = 8/16, f0.01 = 3/16) or supragranular recordings (mean 0.84 ± 0.20, f0.05 = 6/13, f0.01 = 3/13).

Thus most neurons in cat and macaque V1 showed a difference in effective filtering behavior when tested with a Cartesian versus a polar stimulus set, and this phenomenon was not restricted to the input or output laminae.

The analysis of Fig. 7A and the subset analysis above used the total number of spikes during the stimulus presentation (0 to 250 ms) as a response measure (the "ON response"). This simple but rather gross response measure may overlook a possibly smaller or greater degree of similarity between the maps over the response time course. To test this, we recalculated the index Ishape(Lcart, Lpolar) for other response measures (Fig. 7, B–E): the number of spikes from 0 to 100 ms after stimulus onset (the "ON transient"), the number of spikes during the 250-ms OFF-period (the "OFF response"), the number of spikes during the first 100 ms of the OFF-period (the "OFF transient"), and the first principal component ("PC1"; see Response dynamics below).
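The four spike-count response measures can be sketched as simple time windows. In this illustration, spike times are assumed to be in milliseconds relative to stimulus onset, the windows are treated as half-open intervals (an assumption; the text does not say how boundary spikes were handled), and the spike train is hypothetical. PC1 is not covered because it is defined elsewhere.

```python
# Analysis windows in ms relative to stimulus onset (stimulus on 0-250, off 250-500).
WINDOWS_MS = {
    "ON response": (0.0, 250.0),
    "ON transient": (0.0, 100.0),
    "OFF response": (250.0, 500.0),
    "OFF transient": (250.0, 350.0),
}


def response_measures(spike_times_ms):
    """Count spikes in each window; each window is half-open, [start, stop)."""
    return {
        name: sum(1 for t in spike_times_ms if start <= t < stop)
        for name, (start, stop) in WINDOWS_MS.items()
    }


# Hypothetical spike times from a single trial.
spikes = [12.0, 80.0, 140.0, 260.0, 310.0, 420.0]
measures = response_measures(spikes)
```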

As seen from Fig. 7, the distribution of Ishape(Lcart, Lpolar) and the number of units in which this index was significantly <1 was similar for the first five response measures. The significant deviations tended to occur in the same units (not shown). The similarity across these response measures (Fig. 7, A–E) indicates that the discrepancy between the effective filtering properties for Cartesian and polar stimuli is not a consequence of the temporal weighting of the response measure. As we will see below, the temporal aspects of the responses to Cartesian and polar stimuli are nearly identical and are heavily dominated by the first principal component, which corroborates the essentially spatial nature of this result.

Orientation and spatial frequency.    Ishape(Lcart, Lpolar) is an omnibus index of the difference in the shapes of the maps Lcart and Lpolar. We used this nonparametric approach because many of the maps do not conform closely to Gabor profiles or other shapes that are well described by a small number of parameters. To seek systematic trends in how these maps change, we now examine two parametric descriptors of the maps Lcart and Lpolar: the orientations ORpeakcart and ORpeakpolar and spatial frequencies SFpeakcart and SFpeakpolar of the peak in their Fourier transforms (Fig. 8). As with Ishape(Lcart, Lpolar), 95% confidence limits on these parameters were determined by a jackknife applied to maps determined with single blocks of trials dropped.
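Locating the Fourier peak of a filter map can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually used: it evaluates a direct discrete Fourier transform on a small square map (a fast Fourier transform would be the practical choice), reports the orientation of the peak's frequency vector (the bars of the map run perpendicular to it), and gives spatial frequency in cycles per pixel, because the conversion to c/deg depends on display geometry not given here. The function name and test grating are hypothetical.

```python
import cmath
import math


def fourier_peak(map2d):
    """Return (orientation_deg, cycles_per_pixel) of the largest non-DC component.

    map2d: square list of lists, indexed map2d[y][x]. Because the map is real,
    its spectrum is symmetric, so only the half-plane kx >= 0 is searched.
    """
    n = len(map2d)
    best_amp, best_kx, best_ky = -1.0, 0, 0
    for kx in range(0, n // 2 + 1):
        for ky in range(-(n // 2), n // 2 + 1):
            # Skip DC and the redundant half of the kx = 0 axis.
            if kx == 0 and ky <= 0:
                continue
            coeff = sum(
                map2d[y][x] * cmath.exp(-2j * math.pi * (kx * x + ky * y) / n)
                for y in range(n)
                for x in range(n)
            )
            if abs(coeff) > best_amp:
                best_amp, best_kx, best_ky = abs(coeff), kx, ky
    orientation = math.degrees(math.atan2(best_ky, best_kx)) % 180.0
    frequency = math.hypot(best_kx, best_ky) / n
    return orientation, frequency


# Hypothetical test map: a grating with 2 cycles along x and 2 along y.
n = 16
grating = [
    [math.cos(2 * math.pi * (2 * x + 2 * y) / n) for x in range(n)]
    for y in range(n)
]
orientation, frequency = fourier_peak(grating)
```

Confidence limits of the kind described would then follow from repeating this calculation on maps recomputed with single blocks of trials dropped.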



FIG. 8. Best orientation (deg) (A) and spatial frequency (c/deg) (B) as determined by Fourier transformation of the maps of the spatial filters Lcart and Lpolar. Error bars are 95% confidence limits determined by jackknife, and data are plotted only for units in which there was a well-defined best orientation or spatial frequency. There was a modest (rcirc = 0.44, P < 0.02) correlation between the estimated best orientations and no correlation between the estimated best spatial frequencies. See text for details.

 
Values of ORpeak were considered significant if their confidence limits included less than the full range of 0 to 180 deg. By this criterion, 35 units (of 51 total) had a significant ORpeakcart; 25 units had a significant ORpeakpolar. In the 20 units in which ORpeakcart and ORpeakpolar were both signi