Regression-Based Identification of Behavior-Encoding Neurons During Large-Scale Optical Imaging of Neural Activity at Cellular Resolution

Andrew Miri, Kayvon Daie, Rebecca D. Burdine, Emre Aksay, David W. Tank


The advent of methods for optical imaging of large-scale neural activity at cellular resolution in behaving animals presents the problem of identifying behavior-encoding cells within the resulting image time series. Rapid and precise identification of cells with particular neural encoding would facilitate targeted activity measurements and perturbations useful in characterizing the operating principles of neural circuits. Here we report a regression-based approach for semiautomatically identifying neurons based on the correlation of fluorescence time series with quantitative measurements of behavior. The approach is illustrated with a novel preparation allowing synchronous eye tracking and two-photon laser scanning fluorescence imaging of calcium changes in populations of hindbrain neurons during spontaneous eye movement in the larval zebrafish. Putative velocity-to-position oculomotor integrator neurons were identified that showed a broad spatial distribution and diversity of encoding. Optical identification of integrator neurons was confirmed with targeted loose-patch electrical recording and laser ablation. The general regression-based approach we demonstrate should be widely applicable to calcium imaging time series in behaving animals.


The combination of two-photon laser scanning microscopy (TPLSM) with functional indicators, such as bulk loaded or genetically encoded calcium sensors, is now widely used to provide measurements of neural activity from large populations of neurons at cellular resolution (Hasan et al. 2004; Stosiek et al. 2003). It is increasingly being applied to neural circuits in awake, behaving animals in which the neural coding and dynamic properties of the neurons are studied in relation to presented stimuli (Andermann et al. 2010) or behavioral responses (Dombeck et al. 2009; Komiyama et al. 2010). In general, improved methods to identify individual neurons in fluorescence image time series are desired. Ideally, the identification would be rapid and precise, providing information on the functional properties of the individual cells with minimal human supervision. Cell identification can be used to delineate the spatial extent and boundaries of neural circuits in an anatomically complex brain area. If performed on-line, it could be used either to target rapid scanning systems, providing higher time resolution measurements, or to target cell-specific perturbation through ablation or optogenetic stimulation to examine the causal role of the neurons in circuit function. In addition, cells with specific properties could be targeted for subsequent neuroanatomical study through, for example, visually guided electrode-based electroporation with morphological probes or transsynaptic tracers.

Typically, the process of cell identification and the characterization of cell coding and functional properties are separately tackled. A common approach is to manually (Dombeck et al. 2007; Kerr et al. 2005; Niell and Smith 2005) or semiautomatically (Dorostkar et al. 2010; Ohki et al. 2005) identify regions of interest (ROIs) based on image contrast delineating the expected morphology of the soma (or dendrites). Fluorescence changes, spatially averaged across these ROIs, can then be correlated with behavioral state (Dombeck et al. 2009; Komiyama et al. 2010) or response to a stimulus ensemble (Ohki et al. 2005; Sato et al. 2007; Stettler and Axel 2009) to identify cells with defined neural correlates. In a second approach, independent component analysis (ICA) uses the statistical independence of temporal correlations in activity-dependent fluorescence between pixels to cluster the image into components that, in some cases, can then be assigned through morphological criteria as individual cells (Mukamel et al. 2009). Unfortunately, manual ROI identification and ICA-based segmentation are time consuming and have been applied primarily as post hoc data analysis methods (but see Valmianski et al. 2010).

The approach we describe here is based on a different principle: pixel-by-pixel regression of fluorescence time series against a set of time series of variables that quantify a behavior. The method was inspired by linear regression-based methods developed for the analysis of functional magnetic resonance imaging (fMRI) data (Friston et al. 1995; Worsley and Friston 1995). A standard method for identifying units of brain volume (“voxels”) that show blood oxygenation level-dependent (BOLD) signals correlating with variables such as sensory stimuli, cognitive state, or behavior is to fit a linear model to each voxel's BOLD time series in which some parameterization of such variables is among the set of regressors. Because the BOLD signal is modeled as a convolution of brain activity with a hemodynamic response function that approximates the mapping between brain activity and the BOLD signal, variables expected to be encoded within brain activity are convolved with this hemodynamic response function before regression coefficients are calculated. The significance of each voxel's correlation with a particular regressor can then be assessed from these coefficients. We recognized that image time series collected from tissue labeled with fluorescent calcium indicators could be analyzed in a similar manner. Fluorescence fluctuations can be modeled as a convolution of discrete impulses representing neuronal action potentials with a calcium impulse response function (CIRF) that approximates the rapid rise and slower decay of intracellular calcium induced by each spike (Helmchen and Tank 2005; Tank et al. 1995). By regressing image pixel fluorescence time series on behavioral variables convolved with the appropriate CIRF, our approach automatically identifies pixels from neurons whose firing rate is a neural correlate of these variables.
Although more sophisticated formulations of the mapping between firing and intracellular calcium have been devised that significantly aid spike detection from optical recordings (Vogelstein et al. 2009), our simplified model facilitates the rapid localization of image regions corresponding to cells that putatively encode these behavioral variables. These regions can then be broken down into ROIs representing single cells using either a manual or semiautomatic method based on shape and size.
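As a concrete illustration of this simplified model, the convolution of a behavioral variable with a single-exponential CIRF can be sketched in Python (the analyses in this study were implemented in Matlab; the function name and truncation choice here are illustrative assumptions):

```python
import numpy as np

def cirf_convolve(behavior, dt, tau):
    """Convolve a behavioral time series with a single-exponential
    calcium impulse response function (CIRF) of decay time tau.

    behavior : 1-D array sampled at interval dt (seconds)
    tau      : CIRF decay time constant (seconds)
    """
    t = np.arange(0, 5 * tau, dt)          # truncate the kernel at 5 time constants
    kernel = np.exp(-t / tau)
    # causal 'full' convolution, cropped to the original length;
    # the factor dt approximates the continuous-time integral
    return np.convolve(behavior, kernel)[: len(behavior)] * dt
```

Convolving a unit impulse recovers the kernel scaled by the sampling interval, a quick sanity check on the implementation.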

To illustrate our method, we apply it to the identification of neurons of the velocity-to-position oculomotor neural integrator for horizontal eye movements (hVPNI) in the larval zebrafish. The hVPNI is a model system for investigating the mechanisms of persistent neural activity, which is a sustained change in neuronal firing arising from a transient input (Aksay et al. 2000; Major and Tank 2004). Persistent neural activity is observed in a variety of brain areas, where it is thought to encode short-term memory (Fuster and Alexander 1971; Goldman-Rakic 1995; Prut and Fetz 1999; Romo et al. 1999). Although the hVPNI has historically been studied in monkeys (McFarland and Fuchs 1992), cats (Lopez-Barneo et al. 1982), and goldfish (Pastor et al. 1994), creating a larval zebrafish model should facilitate the use of emerging genetic and optical techniques for interrogating its underlying microcircuitry. In hVPNI neurons, persistent neural activity encodes an eye position signal, temporally integrated from the eye velocity-encoding burst signals that drive saccades, that is necessary for stabilizing gaze during fixations between saccades (Cohen and Komatsuzaki 1972; Skavenski and Robinson 1973). Previous work in the goldfish has localized this integrator to a region in the caudal brain stem termed Area I and examined the mechanisms of persistent neural activity therein (Aksay et al. 2000, 2001, 2007). However, numerous questions remain regarding these mechanisms—questions that may be readily addressed with imaging approaches.

Here we report that our regression-based method succeeds in rapid on-line identification of many putative hVPNI neurons in the hindbrain of the larval zebrafish, a species in which the location of the hVPNI was previously unknown. Our method was applied to Oregon Green BAPTA-1 (OGB-1) fluorescence image time series collected using a custom-built microscope that allows synchronous eye tracking and two-photon laser scanning imaging during spontaneous eye movement. Localization of hVPNI neurons enabled targeted electrical recording and laser ablation that verified the expected firing patterns and integrator functionality of cells within the imaged region. Our data also revealed a broad distribution of putative hVPNI neurons, suggesting that the hVPNI may be distributed in the developing hindbrain. Such a distribution would have been extremely laborious and highly disruptive to characterize using conventional electrical recording-based search methods. Our results directly demonstrate the utility of our regression-based cell identification method and establish a viable zebrafish preparation for studying the mechanisms of persistent activity. Given the broad utility of regression-based approaches in the analysis of fMRI data (Friston 2005), we expect our method and variations thereof to have many applications in optical imaging-based studies of microcircuitry underlying behavior.


Dye loading

mitfa−/− (nacre) mutant zebrafish larvae (Lister et al. 1999), aged 5–12 days postfertilization (dpf) and ranging in length from 4.0 to 4.8 mm, were used for all experiments. Experimental procedures were approved by Princeton University's and Weill Cornell Medical College's Institutional Animal Care and Use Committees. Embryos were collected and reared according to established procedures. Larvae were anesthetized with 0.01–0.02% (wt/vol) ethyl 3-aminobenzoate methanesulfonate (MS 222, Sigma) in Evans solution (in mM): 134 NaCl, 2.9 KCl, 2.1 CaCl2, 1.2 MgCl2, 10 HEPES, and 10 glucose (pH 7.8, 290 mOsm) and mounted in 1.7% low melting point agarose (SeaPlaque, Lonza). For functional imaging, Oregon Green BAPTA-1-AM (OGB-1; Invitrogen) was dissolved at 10 mM in DMSO + 20% (wt/vol) pluronic acid and diluted 10:1 into a modified Evans solution (Evans plus 0.1–0.3% (wt/vol) fast green, without divalent cations, diluted with water back to 290 mOsm). Injection pipettes were pulled (borosilicate, 1.0 mm OD, 0.5 mm ID, FHC; P-2000 puller, Sutter), so that the shaft diameter was no more than 12 μm at a distance 50 μm from the tip, and beveled (BV-10, Sutter) down to a resistance of 8–20 MΩ with Evans solution internally. Dye solution was pressure ejected (Picospritzer II, General Valve) under visual control into rhombomere 7/8 at the caudal edge of the otic vesicle ventral to the medial longitudinal fasciculus (MLF), at pressures ranging from 7 to 70 mbar for between 5 and 15 min to achieve a steady visible plume of dye significantly less optically dense than the dye in the electrode. Larvae were removed from the agarose and allowed to recover for >3 h before imaging and electrical recording.

For reticulospinal (RS) neuron labeling, larvae were embedded in agarose on their sides and an incision was made through the skin into the ventral spinal cord using a sharp tungsten needle. A drop of 50% (wt/vol) dextran-conjugated Alexa 488 (10 kD; Invitrogen) was dried on the tip of a borosilicate patch pipette, which was then briefly inserted into the incision to transfer much of the drop into the spinal cord. At >12 h after dye loading, fish were reembedded in agarose and imaged both on an epifluorescence dissecting microscope (MZ16FA, Leica) and on the custom two-photon microscope used for optical recording described in the following text. Sagittal images of the RS neurons in rhombomeres 4–7/8 were collected on the latter microscope with rostrocaudal image position measured from the dorsal peak of the otic vesicle in the sagittal plane in which the posterior cerebral vein bends caudally around the ear (“ear peak”; Fig. 2C). The centroid of neurons was measured relative to image boundaries off-line in Matlab, allowing calculation of the rostrocaudal distance of each neuron from this landmark.

Combined optical recording and eye tracking

A custom microscope was built to enable synchronous eye tracking and two-photon laser scanning imaging (Fig. 2, A and B). A ×60/0.9 numerical aperture (NA) water-immersion objective (Olympus) projected laterally through a flexible latex diaphragm into a cylindrical plastic chamber filled with water from the fish rearing facility. Larvae were embedded in a thin layer of 1.7% low melting point agarose, dorsal side up, and the agarose was removed from around the eyes. A rectangular agarose block containing the fish was cut out and mounted on the surface of a Sylgard block that sat on the floor of the chamber. Prior to image collection, the MLF in each hemisphere was found using transmitted light and the midpoint between them was defined as the midline separating hemispheres. Excitation light (880 or 920 nm) was provided by a Mira 900 Ti:sapphire laser pumped by a Verdi V-10, or a Chameleon Ultra II (Coherent). Beam power was controlled with a Pockels cell (model 302, Conoptics) and generally ranged from 20 to 40 mW after the objective. Collected emission passed through a dichroic mirror (Fig. 2A; DM1, FF670-SDi01, Semrock), two lenses (L1, aspheric f = 27 mm ARB2 vis, Linos; L2, f = −29 mm, Omega), a band-pass filter (F3, FF01-680/SP-25, Semrock), and a long-pass filter (F4, BLP01-488R-25, Semrock). Green and red collection channels were split using another dichroic mirror (DM2, XF1018/580DRLP, Omega) and passed through band-pass filters (green: F6, FF01-525/40-25; red: F5, FF01-670/70-25; Semrock) and lenses (L3, aspheric f = 18 mm ANR, Linos) before collection by GaAsP photomultiplier tubes (PMTs; 1077P-40; Hamamatsu). For imaging under visible light, the chamber was illuminated using a 460-nm light-emitting diode (LED) filtered by a short-pass filter (F1, FF01-460/60-25, Semrock). Fluorescence data acquisition and microscope control were performed using Cfnt v.1.529 (Michael U. Müller, MPI, Heidelberg, Germany).
Images were 256 × 256 pixels, acquired at 2 ms per line (∼2 Hz) in time series of 750 frames. Dye bleaching, quantified as the percentage change between the mean frame-averaged fluorescence (average over all pixels) for the first 50 frames and the mean for the last 50, was minimal. Over all image time series, the mean change in frame-averaged fluorescence over the approximately 384 s of acquisition was −1 ± 2% (mean ± SD) in the light (n = 42) and −3 ± 5% in the dark (n = 42). To locate the image window relative to anatomical landmarks measured in RS cell-labeled larvae, the image window was positioned with its rostral boundary at the ear peak then translated caudally to the position from which fluorescence image time series were collected.

A 945-nm LED illuminated the chamber from below through an opal glass diffuser. A mirror, long-pass filter (F2, FEL0950, Thorlabs), and charge-coupled device (CCD) camera (Hitachi Kokusai Electric) were positioned above the chamber to collect video images processed in real time to extract eye position as previously described (Beck et al. 2004). Briefly, ROIs were predefined for the eyes and a fixed segment of the body (typically the swim bladder) from a reference image. During data collection, the ROIs were thresholded. The two largest objects within the eye ROI were defined as the eyes and a pixel seeding operation was used to smooth them. A line connecting the center of the suprathreshold body segment and the midpoint between the center of the smoothed eyes defined the body axis. Horizontal eye positions were measured as the angle formed by the major axis of the eye and the body axis. Position data were converted to an analog voltage that was digitally recorded at the beginning of each image line (500 Hz) in Cfnt, providing direct synchronization of eye movements with fluorescence images. Values from all lines within a given image frame were averaged to calculate the frame-averaged eye position P for that frame's time point.
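The major-axis computation underlying this eye position measurement can be sketched via second central moments (a Python illustration; the real-time system used a dedicated video pipeline, and the helper name here is hypothetical):

```python
import numpy as np

def major_axis_angle(mask):
    """Angle (degrees) of the major axis of a binary blob, taken as
    the principal eigenvector of its second central moment matrix."""
    ys, xs = np.nonzero(mask)
    pts = np.column_stack([xs - xs.mean(), ys - ys.mean()]).astype(float)
    cov = pts.T @ pts / len(pts)           # 2x2 second-moment matrix
    vals, vecs = np.linalg.eigh(cov)       # eigenvalues ascending
    vx, vy = vecs[:, np.argmax(vals)]      # eigenvector of largest eigenvalue
    return np.degrees(np.arctan2(vy, vx))
```

The eye position itself would then be the difference between this angle and the body-axis angle; eigenvector sign is arbitrary, so angles are meaningful modulo 180°.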

Targeted electrical recording

Larvae aged 6–9 dpf were prepared for imaging as described earlier, but additionally a small incision was made in the skin above the hindbrain to facilitate penetration of a recording electrode; the chamber was filled with 0.1× Evans solution. Image time series were collected and analyzed (see following text) to identify eye position-sensitive neurons for recording. Patch pipettes (∼5–10 MΩ in Evans solution, borosilicate, 1.2 mm OD, 0.6 mm ID, FHC; P-2000 puller, Sutter) were filled with 500 μM Alexa 594 hydrazide (Invitrogen) dissolved in either 2 M NaCl or Evans solution. Pipettes were pulled with a long taper to minimize disturbance to the tissue (penetration depths reached 150 μm). Positive pressure (20–40 mbar) was applied to the lumen of the pipette during insertion of the electrode through the incision; 4–8 mbar of positive pressure was used while moving the electrode through the tissue toward the targeted cell. Comparison of the red (primarily Alexa 594 defining extracellular space) and green (OGB-1) fluorescence channels was used to guide the pipette onto the cell. A brief negative pressure pulse was used to form a loose seal. Membrane voltage fluctuations were recorded using a patch-clamp amplifier (AxoClamp 2B, Axon Instruments), band-pass filtered from 0.3 to 2 kHz (SR560, Stanford Research Systems; FLA-01, Cygnus Technologies), digitized at 5–10 kHz (Digidata 1440A, Molecular Devices), and recorded in Clampex (v.8–10, Molecular Devices) along with eye position signals. Spikes were identified in single-unit recordings using custom scripts written in Matlab (v.7.8 or 7.10, The MathWorks; Aksay et al. 2000). Voltage traces were first thresholded (typically at −200 μV) to identify negative-going transients. For all identified transients, peak-to-trough amplitude and, typically, the time separation between peak and trough were plotted against each other and a bounded region defining a single cluster of points was specified. 
In a few cases, the sum of the positive-going peaks surrounding the negative-going trough was used in place of the time separation to better distinguish action potentials from fast noise transients. Spike times for events within the region were used in subsequent analyses. Firing rate time series were calculated as the inverse of the instantaneous interspike interval time series.

Regression-based cell identification

All analysis of image time series was completed in Matlab. A two-step process was used to identify neurons displaying neural activity correlated with eye movement. In the first step, we used linear regression to identify pixels whose fluorescence was consistent with an underlying firing rate encoding eye position or eye velocity. In the second step, either manual or semiautomatic methods were used to cluster and segment identified pixels into soma of individual neurons.


Pixels with fluorescence changes correlated with eye position or velocity were identified in image time series using the following sequence: 1) motion correction of each frame in the time series; 2) elimination of dim and maximum intensity pixels; 3) linear regression of pixel time series against eye movement variables; 4) normalization of pixel Z scores for motion-induced scaling; and 5) application of a false discovery rate (FDR) correction algorithm to establish a P value threshold that is then used to identify highly correlated pixels. In the following text we consider each of these in greater detail. Matlab code to implement this method, together with example data files, is provided in the online Supplemental Material.

Motion correction.

A time projection image (i.e., an image where the value of each pixel is the average intensity of that pixel across the entire time series) was first calculated and used as a reference image. Then, for each frame of the time series, the two-dimensional (2D) cross-correlation with the reference image was determined as the frame was shifted along the x- and y-axes. Bilinear interpolation was used to produce shifts in 0.25-pixel increments. The motion-corrected position of the frame was chosen from a center-of-mass calculation using values of the correlation at different shifts. The motion-corrected image pixel values were calculated using bilinear interpolation of the original pixels relative to the shifted coordinates. Details of the algorithm are provided in the Supplemental Material. Frames acquired during large (>5 μm) translations accompanying occasional body twitches were easily identified as large-amplitude transients in plots of the value of the maximum 2D cross-correlation versus time and were deleted from the time series. The number of twitches per series ranged between 0 and 5.
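A minimal Python sketch of the shift estimation step (integer-shift cross-correlation refined by a center-of-mass calculation) is given below; the published Matlab implementation used bilinear interpolation in 0.25-pixel increments, which is omitted from this illustration:

```python
import numpy as np

def estimate_shift(frame, reference, max_shift=3):
    """Estimate the (dy, dx) shift that best aligns `frame` to
    `reference`: cross-correlate over a grid of integer shifts, then
    refine with a center of mass over the correlation values."""
    shifts = range(-max_shift, max_shift + 1)
    corr = np.empty((len(shifts), len(shifts)))
    for i, dy in enumerate(shifts):
        for j, dx in enumerate(shifts):
            shifted = np.roll(np.roll(frame, dy, axis=0), dx, axis=1)
            corr[i, j] = (shifted * reference).sum()
    corr = corr - corr.min()               # nonnegative weights for the centroid
    ys, xs = np.meshgrid(shifts, shifts, indexing="ij")
    return (corr * ys).sum() / corr.sum(), (corr * xs).sum() / corr.sum()
```

Frames would then be resampled at the estimated subpixel offset; twitch frames with anomalously low peak correlation would be dropped as described above.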

Elimination of pixels.

The mean and SD of an individual pixel's intensity changes over time (“pixel time series”) were calculated separately for every pixel in the motion-corrected time series. Pixels with very weak fluorescence were defined as those whose mean fluorescence intensity in the time series was <2 SDs above 0 (Supplemental Fig. S1). These pixels were omitted from further analysis because we determined that their residuals after regression tended to deviate from normality, complicating significance testing of correlations. They typically represented areas absent of cell soma or areas weakly labeled with calcium indicator that were located along the edge of the imaged area. A subset of image time series had a small fraction of pixels whose pixel time series were uniformly equal to the maximum pixel intensity throughout their entire extent. These pixels were also omitted from subsequent analysis.
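The two elimination criteria can be expressed compactly; the following Python sketch assumes the movie is arranged as a frames × height × width array (an assumption of this illustration, not a statement about the original Matlab code):

```python
import numpy as np

def valid_pixel_mask(stack):
    """Boolean mask of pixels retained for regression.

    stack : (n_frames, ny, nx) movie. A pixel is dropped if its mean
    intensity is less than 2 SDs (of its own time series) above zero,
    or if it sits at the maximum intensity in every frame.
    """
    mean = stack.mean(axis=0)
    sd = stack.std(axis=0)
    too_dim = mean < 2 * sd
    saturated = (stack == stack.max()).all(axis=0)
    return ~(too_dim | saturated)
```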

Linear regression of pixel time series.

We began the analysis by defining three regressors {p, v, f} (Fig. 3A), where p represents the expected calcium response for a cell with firing rate proportional to the frame-averaged eye position P. It was established by convolving P with a calcium impulse response function (CIRF), consisting of an exponential decay with a 1.61-s decay time (see following text for a description of how this CIRF was established). The second vector v is based on the calcium responses expected from firing rates proportional to eye velocity. It was calculated from the frame-averaged eye velocity (dP/dt calculated numerically as the derivative of measured eye position); v was thresholded to be 0 except during ipsiversive saccades (ipsilateral eye moves temporally; “ipsiversive velocity”) and then convolved with the CIRF. The third vector f is the frame-averaged total fluorescence intensity, meant to capture variation in laser intensity and/or bleaching. Orthonormalized regression vectors were determined by applying the Gram–Schmidt process to {p, v, f} [Matlab shared function: Gram_Schmidt_Process].
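A Python sketch of the regressor construction follows. The saccade gating of velocity is simplified here to half-wave rectification, and the Gram–Schmidt loop is the classical variant; both are assumptions of this illustration rather than details of the Matlab implementation:

```python
import numpy as np

TAU = 1.61  # CIRF decay time constant (s), as estimated in the text

def build_regressors(eye_pos, fluor_mean, dt, tau=TAU):
    """Build and orthonormalize the regressor matrix G = [p v f].

    eye_pos    : frame-averaged eye position P (1-D)
    fluor_mean : frame-averaged total fluorescence (1-D)
    """
    kernel = np.exp(-np.arange(0, 5 * tau, dt) / tau)
    p = np.convolve(eye_pos, kernel)[: len(eye_pos)]
    # keep only positive (ipsiversive) velocity before CIRF convolution
    vel = np.clip(np.gradient(eye_pos, dt), 0, None)
    v = np.convolve(vel, kernel)[: len(eye_pos)]
    G = np.column_stack([p, v, fluor_mean]).astype(float)
    # classical Gram-Schmidt, keeping p as the first (primary) regressor
    for k in range(G.shape[1]):
        for j in range(k):
            G[:, k] -= (G[:, j] @ G[:, k]) * G[:, j]
        G[:, k] /= np.linalg.norm(G[:, k])
    return G
```

Reordering the columns before orthonormalization yields the v-primary variant described below.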

The pixel time series X were then fit with the linear regression model X = Gβ + ε, where for n time points and m image pixels, X is an n × m matrix with individual mean-subtracted pixel time series as columns; G is an n × 3 matrix of orthonormal regressors, including the vectors p, v, and f as columns; β is a 3 × m matrix of regression coefficients; and ε is the residual matrix (Fig. 3A). Two separate regressions were performed for each image time series: one to measure the significance of each pixel time series' correlation with p and one to measure the significance of each pixel time series' correlation with v. In the first case, the matrix [p v f] was orthonormalized with p as the first regressor (so that its first column is p/‖p‖) and, in the second case, the matrix [v p f] was orthonormalized with v as the first regressor (so that its first column is v/‖v‖). Including orthonormalized regressors beyond the primary regressor (p or v) is a principled means of capturing variance in pixel time series beyond that produced by correlation with this regressor, thereby reducing residuals and providing more significance to correlation estimates (Fig. 3B).

The least-squares estimator (β̂) of the regression coefficients β was then calculated as β̂ = GᵀX (valid because the columns of G are orthonormal), with fit residuals ε̂ = X − Gβ̂. We found that the residuals following this regression were well described by a normal distribution. More precisely, Kolmogorov–Smirnov tests of residual normality (Matlab function kstest), performed on the residuals (the columns of ε̂, denoted ε̂*,i) for 1,000 randomly chosen pixels in each image time series, failed to show hypothesis rejection rates inconsistent with normally distributed residuals. For the 88 image time series, the mean rejection rate at a significance threshold of 0.05 was 0.036 ± 0.007 (mean ± SE).
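Because G has orthonormal columns, the least-squares fit reduces to a single matrix product, as this Python sketch shows:

```python
import numpy as np

def regress_pixels(X, G):
    """Least-squares fit of X = G @ beta + eps for orthonormal G.

    X : (n, m) mean-subtracted pixel time series, pixels as columns
    G : (n, 3) orthonormal regressors, primary regressor first
    Because G'G = I, the estimator reduces to beta_hat = G' X.
    """
    beta = G.T @ X
    resid = X - G @ beta
    return beta, resid
```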

Normalized Z scores.

T statistics (Hoel 1984) were calculated for every pixel time series to provide estimates of the significance of its correlation with p or v. In either case, the T score Ti for pixel i is given by Ti = β̂1,i/√(ε̂ᵀ*,i ε̂*,i/(n − 3)), where either β̂p,i or β̂v,i is used for β̂1,i. T scores for all pixels in a given image series were then converted to Z scores. This was accomplished by calculating the fraction of the T distribution falling below a given pixel's T score and then assigning this pixel the Z score below which falls an equivalent fraction of the standard normal distribution. Although the large number of time points in our pixel time series dictates that T and Z scores will be essentially equivalent, in general this conversion will be necessary for the SD calculation described next.
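The T score calculation can be sketched as follows; with roughly 750 time points per series the subsequent T-to-Z conversion is nearly the identity, so it is omitted from this illustration (the exact mapping would go through the t and normal CDFs):

```python
import numpy as np

def pixel_t_scores(beta, resid, n):
    """T score for each pixel's primary-regressor coefficient:
    T_i = beta_1i / sqrt(resid_i' resid_i / (n - 3)).

    beta  : (3, m) regression coefficients, primary regressor in row 0
    resid : (n, m) fit residuals
    """
    se = np.sqrt((resid ** 2).sum(axis=0) / (n - 3))
    return beta[0] / se
```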

The resulting distribution of Z scores can be used to identify correlated pixels as those with Z scores above a selected threshold. These distributions were asymmetric, with the negative portion resembling a normal distribution whereas the positive side exhibited an extended tail (Fig. 3C). Single-unit electrical recordings in adult goldfish suggest a near absence of neurons whose activity negatively correlates with ipsilateral eye position or velocity. However, the distribution of Z scores below zero was generally wider than that of a standard normal distribution (Fig. 3C), the distribution expected under a null hypothesis of no correlation between pixel time series and a primary regressor. As shown in Supplemental Fig. S2, numerical simulations suggest that the observed Z score distribution shapes can be accounted for by the fact that residual specimen motion (i.e., motion not corrected by the motion correction algorithm, particularly along the optical axis) that is coupled to eye movements would be expected to produce artifactual pixel correlations with eye movement variables yielding a symmetric distribution of Z scores with SD >1.0 (see Supplemental Material). We therefore adopted the following normalization strategy: First, we estimated the Z score distribution expected from the motion artifacts alone by taking the negative half of the experimental Z score distribution for a given image time series and combining it with a positive distribution that is its mirror image around Z = 0. Then we calculated the SD of this estimated distribution and divided all Z scores in the original distribution by this number. We can then reasonably assume in assessing significance that under a null hypothesis of no activity-induced correlation with a primary regressor, the Z scores are drawn from a standard normal distribution. The distributions corresponding to the steps in this normalization are illustrated in Fig. 3, D and E. 
An example of pixel Z scores plotted in image space is shown in Fig. 4A. P values were calculated from Z scores assuming a two-tailed test of significance.
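The mirrored-half normalization can be sketched in a few lines of Python:

```python
import numpy as np

def normalize_z(z):
    """Rescale Z scores so the motion-artifact null is standard normal.

    The negative half of the distribution is mirrored about zero to
    estimate the SD of the (symmetric) artifact distribution, and all
    Z scores are divided by that SD.
    """
    neg = z[z < 0]
    mirrored = np.concatenate([neg, -neg])
    return z / mirrored.std()
```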

False discovery rate thresholding.

Differences across experiments due to, for example, variability in calcium indicator labeling, specimen preparation, and motion could produce differences in the corresponding Z score distributions, even after the above-mentioned normalization. We reasoned that a principled way to combine and compare data across different experiments was to equalize across experiments the rate at which pixels are erroneously identified as showing firing rate-driven correlation with the primary regressor, that is, the false discovery rate (FDR). We used an FDR of α = 0.2 for thresholding correlations with p and α = 0.05 for correlations with v. These rates were set empirically to ensure that the average fluorescence time series of identified somata rarely had Pearson correlations with the primary regressor below 0.5. We accomplished this using an algorithm developed for bioinformatics applications (Storey 2002; Storey et al. 2004). The general strategy is to use the actual p value distribution for pixels from a given image time series to estimate the FDR for a successively more selective set of p value significance thresholds. One then selects and uses the threshold that produces the largest FDR that is less than the desired rate. The underlying assumption is that the least significant p values are produced by the uncorrelated data and the deviation in the distribution of low p values from a uniform distribution can be used to estimate the FDR. The specific steps we used are:

  1. Starting with γ = α, evaluate the FDR as FDR(γ) = #{pi > λ}·γ/[max{#{pi ≤ γ}, 1}·(1 − λ)] for some λ chosen according to the algorithm detailed in section 6 of Storey et al. (2004). λ values ranged from 0 to 0.80. Here, #{s} refers to the number of elements in the set s.

  2. Reevaluate the FDR with increasingly smaller values of γ from the set {α/3, α/10, α/30, α/100, …, α/30,000} and select the threshold level pthreshold as the first γ that produces an FDR < α.

  3. Using this threshold, identify pixels in the image series as significantly correlated if their corresponding p value is less than this value.
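The three steps above can be sketched in Python as follows; λ is fixed here for simplicity, whereas the analysis chooses it per dataset following Storey et al. (2004):

```python
import numpy as np

def fdr_threshold(pvals, alpha, lam=0.5):
    """Storey-style search for a p value threshold with FDR < alpha.

    FDR(g) = #{p > lam} * g / (max(#{p <= g}, 1) * (1 - lam)).
    g starts at alpha and shrinks through alpha/3, alpha/10, alpha/30,
    ... until the estimated FDR drops below alpha.
    """
    pvals = np.asarray(pvals)
    w = (pvals > lam).sum()                     # estimate of the null mass
    divisors = [1, 3, 10, 30, 100, 300, 1000, 3000, 10000, 30000]
    for d in divisors:
        g = alpha / d
        fdr = w * g / (max((pvals <= g).sum(), 1) * (1 - lam))
        if fdr < alpha:
            return g
    return None                                 # no threshold met the target
```

Pixels whose p value falls below the returned threshold are declared significantly correlated (step 3).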


The following contextual enhancement algorithm was then applied to maps of pixel p values in image space to smooth the boundaries of significantly correlated regions. The intent here is to account for a pixel's “context” by eliminating significant pixels from regions if they have too few significant neighbors and including subsignificant pixels if many neighboring pixels are significant. First, for all image pixels i having p values pi, we calculate qi = 1 − pi and the number of neighboring significant pixels ui (ui ≤ 8). Next, for all pixels an updated value q̃i is calculated as q̃i = qi + (L/6)(ui − 3.5), where L = 1 − pthreshold. The constants in this equation were chosen arbitrarily to achieve boundary smoothness akin to that observed in images of actual cells. They define how much relative weight is given to contextual information and determine a trade-off between region smoothness and the inclusion of subsignificant pixels. Each pixel is then reassessed by comparing q̃i to L; pixels for which q̃i > L were deemed significant. This method is repeated until the boundaries of contiguous regions either do not change or alternate between two states with successive iterations (Salli et al. 2001; Tillikainen et al. 2006).
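An iteration-based Python sketch of this contextual enhancement is given below; the alternating-state stopping condition is simplified here to a fixed iteration cap:

```python
import numpy as np

def contextual_enhance(pvals, p_threshold, max_iter=50):
    """Smooth a significance map by neighborhood context.

    pvals : 2-D map of pixel p values. With q = 1 - p, each pixel's
    updated score is q + (L/6)*(u - 3.5), where u counts significant
    8-neighbors and L = 1 - p_threshold. Iterate until the significant
    set stops changing (or max_iter is reached).
    """
    q = 1.0 - pvals
    L = 1.0 - p_threshold
    sig = q > L
    for _ in range(max_iter):
        padded = np.pad(sig, 1)                 # zero border: edges see no outside neighbors
        u = sum(np.roll(np.roll(padded, dy, 0), dx, 1)
                for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                if (dy, dx) != (0, 0))[1:-1, 1:-1]
        new_sig = (q + (L / 6.0) * (u - 3.5)) > L
        if np.array_equal(new_sig, sig):
            break
        sig = new_sig
    return sig
```

Note how an isolated significant pixel (u = 0) is penalized by 3.5·L/6 and pruned, while a subthreshold pixel surrounded by significant neighbors can be promoted.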

To define ROIs corresponding to individual somata, ROI boundaries could be manually traced around individual somata falling within contiguous regions. However, for on-line analysis to enable targeted electrical recording immediately following image time series collection, an automated segmentation method was used. Contiguous regions spanning areas <60% of the cross section of a typical soma (20 μm2) were ignored, regions between 60 and 120% of this cross section were assigned as individual somata, and regions >120% of this cross section were segmented into individual ROIs as follows (Supplemental Fig. S3). Z scores for pixels falling within a contiguous region following contextual enhancement were mapped onto image space (Supplemental Fig. S3A) and the resultant maps were spatially smoothed with a 2D Gaussian having a half-width of nearly 1 pixel (Supplemental Fig. S3B). The pixel corresponding to the highest Z score within a contiguous region was set as the center of an ROI (Supplemental Fig. S3C). Straight segments were traced leading outward from this pixel in 16 directions uniformly spaced at 22.5° increments. Each segment continued until either the boundary of the contiguous region was reached, a pixel's Z score was higher than that of the previous pixel in the segment, or the segment length reached 2.5 μm (one half a typical soma width), whichever came first, and a boundary point was drawn at the segment's last pixel. Line segments connecting the termination points of adjacent paths were drawn, generating an ROI outline. This process was repeated (Supplemental Fig. S3D), beginning each time at the pixel corresponding to the highest remaining Z score outside of an ROI until the remaining area within the contiguous region was <60% of the cross section of a typical soma. All ROIs smaller than this same threshold were discarded.
To verify that the method would perform as expected, we ran the cell identification script implementing it on synthetic data for which the regressor correlations were known (Supplemental Fig. S4; see Supplemental Material). No deviations from expectations were observed.

ROI time series were calculated as the average of the pixel time series for the included pixels. The CIRF-convolved eye position correlation cp and the CIRF-convolved ipsiversive velocity correlation cv for a given ROI were defined as the Pearson correlation coefficient between the ROI time series and p and v, respectively. We tested whether cp values were inflated by the inclusion of pixels in ROIs whose CIRF-convolved eye position correlation was increased by chance due to noise and the exclusion of pixels whose CIRF-convolved eye position correlation was decreased by chance due to noise. This was done by identifying ROIs from a set of image time series using only the odd-numbered frames, calculating cp separately from odd- and even-numbered frames, and then measuring the percentage difference between these two values. For all ROIs identified across seven image time series, the mean percentage difference was 0.02%. This difference was not significant (one-tailed t-test, P = 0.45). ROI-averaged Z scores were calculated as the mean of the Z scores corresponding to all included pixels. Separate averages 〈Zp〉 and 〈Zv〉 were calculated using pixel Z scores from regressions using p and v, respectively, as primary regressors. ROI position was defined using the ROI centroid. ROI positions were measured in distances caudal of the Mauthner cell soma, dorsal of the MLF, and lateral of the midline.
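The odd/even-frame control described above can be sketched as follows (a hypothetical helper; by construction a noise-free signal gives a 0% difference):

```python
import numpy as np

def split_half_percent_diff(roi_ts, regressor):
    """Percentage difference between the correlation computed from
    odd-numbered frames and from even-numbered frames (sketch)."""
    r_odd = np.corrcoef(roi_ts[1::2], regressor[1::2])[0, 1]
    r_even = np.corrcoef(roi_ts[0::2], regressor[0::2])[0, 1]
    return 100.0 * (r_odd - r_even) / r_even
```

A small percentage difference indicates that pixel selection on one half of the frames did not inflate the correlation measured on the other half.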

Firing rate/eye movement correlations

Firing rate, eye position, and ipsiversive eye velocity time series were convolved with a 200-ms box window. Position and ipsiversive velocity sensitivities for firing rates were calculated as the Pearson correlation between firing rate time series and eye position or ipsiversive velocity time series, respectively. Significance thresholds for these sensitivities were calculated using a bootstrap method. To preserve temporal autocorrelation, firing rate time series were circularly permuted (circularized and rotated) in time a random amount. Sensitivities were then calculated for these rotated time series as was done for the original series. The significance thresholds (P = 0.05) depicted in Fig. 6E are the 95th percentile value of the distribution of sensitivities for 1,000 rotated samples.
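The circular-permutation bootstrap can be sketched as below. The minimum-shift guard (avoiding near-zero rotations, which would leave the series nearly aligned) is our assumption; all names are ours:

```python
import numpy as np

def circular_bootstrap_threshold(rate, regressor, n_boot=1000, pct=95.0, seed=0):
    """P = 0.05 sensitivity threshold from circularly rotated firing-rate
    time series; rotation preserves temporal autocorrelation (sketch)."""
    rng = np.random.default_rng(seed)
    n = len(rate)
    null = np.empty(n_boot)
    for b in range(n_boot):
        shift = rng.integers(n // 10, 9 * n // 10)   # avoid near-zero shifts
        null[b] = np.corrcoef(np.roll(rate, shift), regressor)[0, 1]
    return np.percentile(null, pct)
```

A firing rate that genuinely tracks the regressor should correlate well above the 95th percentile of the rotated null distribution.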

Saccade identification and STA calculations

To identify saccades during an image or firing rate time series, eye velocity was calculated from eye position measurements recorded in image lines (500 Hz) or in Clampex (5–10 kHz) files, respectively. These time series were then convolved with a 50-ms box filter and plotted (Aksay et al. 2000). A velocity threshold above which all saccade-related transients but only saccade-related transients passed was manually defined, typically at roughly 10–20°/s. Saccade times were defined as the first time point at which a given transient passed above this threshold. Somatic saccade-triggered average (STA) fluorescence responses were assembled for a window stretching from 3 s before saccade time to 5 or 6 s following saccade time using ROI time series linearly interpolated up to 500 Hz. This interpolation allows fluorescence measurements collected at about 2 Hz to be aligned to saccade time with 2-ms resolution. STA firing rates were assembled from 1-kHz firing rate time series. Segments of the interpolated ROI or firing rate time series surrounding a saccade were included in STAs only if 1) the previous saccade was in the opposite direction and 2) no other saccades occurred within the segment. The first criterion ensured that included saccades had reasonably uniform sizes and preceding histories. STA fluorescence responses and firing rates were convolved with a 200-ms box filter before subsequent analysis.
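The threshold-crossing saccade detection and STA assembly can be sketched as follows (the interpolation step and the direction/isolation inclusion criteria are omitted; names are ours):

```python
import numpy as np

def detect_saccades(eye_pos, fs, thresh):
    """Saccade times from upward crossings of a velocity threshold,
    after 50-ms box smoothing of |velocity| (sketch)."""
    vel = np.abs(np.gradient(eye_pos) * fs)          # deg/s
    w = max(int(0.05 * fs), 1)
    vel = np.convolve(vel, np.ones(w) / w, mode="same")
    above = vel > thresh
    onsets = np.nonzero(above[1:] & ~above[:-1])[0] + 1
    return onsets / fs                               # seconds

def saccade_triggered_average(trace, fs, sac_times, pre=3.0, post=5.0):
    """Average of trace segments aligned on saccade times; segments running
    off either end of the recording are dropped (sketch)."""
    n_pre, n_post = int(pre * fs), int(post * fs)
    idx = (int(round(t * fs)) for t in sac_times)
    segs = [trace[i - n_pre:i + n_post] for i in idx
            if i - n_pre >= 0 and i + n_post <= len(trace)]
    return np.mean(segs, axis=0)
```

On a synthetic position trace with two step-like saccades, both onsets are recovered to within the smoothing window.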

CIRF estimation

CIRFs were estimated for a population of manually segmented cells for which cp >0.5 by fitting Ae^(−t/τ) (Matlab function fit; nonlinear least-squares method, Trust-Region algorithm) to the contraversive STA fluorescence response. The portion of the ipsiversive STA 1–2 s prior to saccade time was averaged to calculate a baseline fluorescence assumed in this fitting. The last second of the contraversive STA was omitted during fitting to avoid transients associated with the subsequent saccade. This method may systematically underestimate the CIRF τ for cells whose fluorescence has not completely decayed prior to subsequent saccades. However, we expect this underestimation to be minimal since three times the CIRF τ is less than the mean interval between saccades of about 10 s for 94% of cells for which estimates were made.
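For illustration, the exponential fit can be sketched with log-linear least squares standing in for the paper's nonlinear Trust-Region fit (adequate on clean data; names are ours):

```python
import numpy as np

def fit_cirf(t, sta_dff, baseline):
    """Estimate A and tau in F(t) = baseline + A*exp(-t/tau) by linear
    regression on log(F - baseline). Assumes F stays above baseline."""
    slope, intercept = np.polyfit(t, np.log(sta_dff - baseline), 1)
    return np.exp(intercept), -1.0 / slope          # A, tau
```

On noiseless data both parameters are recovered exactly; with real fluorescence noise the nonlinear fit is preferable because the log transform reweights errors.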

Bootstrap P values were calculated for the mean correlation between STA fluorescence and CIRF-convolved firing rate by first generating a synthetic distribution of 1,000 time constants having approximately the same mean and SD as the set of CIRF τ estimates calculated for cells from which electrical recordings were made. We used this distribution, rather than the distribution of CIRF τ estimates made from all optically recorded cells, because the mean CIRF τ for electrically recorded cells was somewhat higher than that of the latter distribution. This may be due to higher average dye concentration in electrically recorded cells, which made them easier to target but also slowed calcium extrusion. We did impose a minimum value on the synthetic distribution equal to that of the distribution of CIRF τ estimates made from all optically recorded cells, rather than use the equivalent minimum for the distribution from electrically recorded cells. This last distribution included only 8 values and so the minimum of its true underlying distribution was probably poorly estimated by the minimum measurement. Using CIRF τ values randomly drawn from the synthetic distribution in place of actual CIRF estimates, 1,000 mean correlations between STA fluorescence and CIRF-convolved firing rate, both ipsiversive and contraversive, were generated. The fraction of these values larger than the actual measured mean correlations defined a P value for the significance of the dependence of these actual means on cell-specific CIRF τ estimates.

Eye movement statistics

Eye movement statistics were calculated from >8 min of eye position recording during spontaneous eye movements in each larva. Control data (n = 8 larvae) were collected using the same microscope and embedding protocol as used for imaged larvae, although fish were not injected with dye and the laser was not scanning during data acquisition. The saccade rate was calculated as the number of saccades per second of recording. Peak saccade velocity was calculated as the maximum of the absolute value of eye velocity during a window from 300 ms before to 500 ms after each saccade. Saccade amplitude was calculated as the absolute value of the difference between the mean eye position 400 to 300 ms before the saccade and the mean eye position 500 to 600 ms after the saccade. Mean centripetal eye velocity was calculated as the mean rate of drift toward the center of gaze, omitting data points from 300 ms before to 500 ms after each saccade.
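The amplitude and peak-velocity definitions above translate directly into code; a minimal sketch (function names are ours):

```python
import numpy as np

def saccade_amplitude(eye_pos, fs, sac_idx):
    """|mean position 500-600 ms after minus mean position 400-300 ms before|."""
    pre = eye_pos[sac_idx - int(0.4 * fs):sac_idx - int(0.3 * fs)].mean()
    post = eye_pos[sac_idx + int(0.5 * fs):sac_idx + int(0.6 * fs)].mean()
    return abs(post - pre)

def peak_saccade_velocity(eye_pos, fs, sac_idx):
    """Max |velocity| in a window 300 ms before to 500 ms after the saccade."""
    vel = np.abs(np.gradient(eye_pos) * fs)
    return vel[sac_idx - int(0.3 * fs):sac_idx + int(0.5 * fs)].max()
```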

Laser ablation

Ablation experiments were carried out on a separate custom microscope with functionality similar to that described earlier, but with a vertically oriented objective (Supplemental Fig. S5A). This made both hemispheres of the hindbrain accessible for imaging and ablation. Larvae were embedded in 1.5% low melting point agarose (Sigma A-0701) and imaged from the dorsal surface. Excitation light (930 nm) was provided by a MaiTai HP laser (Newport) and scanned through two lenses (L1, f = 150 mm, AC508-150-B; L2, f = 75 mm, AC508-75-B; Thorlabs) into a ×40/0.8 NA water-immersion objective (OBJ1; Olympus LUMPLFL40XW/IR2). A continuously variable neutral density filter was used to control beam power (F1, NDC-25C-4M; Thorlabs). Epifluorescence signals were reflected by a long-pass dichroic mirror (DM1, 720dcxruv; Chroma Technology), then separated into red and green components by another dichroic mirror (DM2, 565dcxr; Chroma Technology), and passed through band-pass filters (F2 for green, ET525/50m-2p; F3 for red, ET605/70m-2p; both Chroma Technology) before collection by PMTs (PMTs, R3896; Hamamatsu). Contrast images were acquired using a switchable-gain, large-area Si detector to collect off-axis scattered light from the beam (CCD1, PDA36A, 350–1,100 nm; Thorlabs). Fluorescence and contrast image data acquisition and microscope control were performed using ScanImage via a DAQ board (NI PCI-6110; National Instruments). Eye tracking was performed with 850-nm illumination provided by an LED through the back of OBJ1 (LED, LED851L; Thorlabs), collected via a ×4/0.1 NA objective (OBJ2; WPI), a band-pass filter (F4, XF3308; Omega), and a FireWire CCD camera (CCD2, Guppy; Allied Vision Technologies) connected to a second PC running a custom Matlab program based on the principles described in Beck et al. (2004) (Supplemental Fig. S5B).
Fluorescence and eye tracking were synchronized using the TTL (transistor–transistor logic) pulses that trigger the onset of fluorescence acquisition in ScanImage to trigger onset of eye tracking on the second PC via a USB DAQ (NI USB 6008; National Instruments).

After collecting baseline eye position measurements, ablations were performed by positioning the laser beam focus within rhombomere 7/8, at a location that the regression-based analysis of image time series described earlier had shown to contain a high density of position-sensitive cells. Exposure of the tissue to a stationary beam at elevated powers (∼200 mW) produced lesions that were visible under both contrast and fluorescence imaging (Supplemental Fig. S5, B and D). These lesions were likely due to plasma generation, as previously reported (Galbraith and Terasaki 2003). Under fluorescence imaging these lesions appear as bright spheres whose size depends on the power and duration of exposure. Lesions <10 μm in diameter failed to produce apparent deficits in fixation stability. However, producing damage >10 μm using high powers usually also decreased saccade frequency and amplitude.

To produce integration deficits without disrupting saccadic behavior, we developed a two-step procedure in which high powers were used to produce initial damage smaller than 10 μm and subsequent lesion expansion was performed at lower powers. After the initial lesioning, fish were allowed to recover at 28.5°C for 10 min. Once created, the lesioned region became very sensitive to beam intensity. We took advantage of this sensitivity to gradually expand the lesion by scanning a square region, roughly 30 μm on a side, containing the lesion at 1 Hz for several seconds at about 50 mW. During this scanning, a bubble-like variation in the refractive index, visible under contrast imaging, formed at the center of the scan region and expanded to a steady-state size that was roughly inversely related to the size of the scan area. We allowed the diameter of the bubble to grow to ∼30 μm before turning off the beam. After zooming out or reducing the beam intensity, the bubble collapsed, and the diameter of the final lesion was only a few microns larger than the original. This procedure allowed us to produce damage with a large spatial extent using lower, less concentrated powers (presumably through absorption-induced thermal insult), from which the animals tended to recover more quickly. Furthermore, we observed eye-movement–correlated fluorescence fluctuations in cells within 20 μm of lesion borders, suggesting that nearby cells remained capable of firing action potentials in a manner consistent with the functionality of hVPNI neurons following lesions.

Eye position/eye velocity plots

Plots of eye position versus eye velocity (“P–V plots”) were generated to quantify the effects of ablation on gaze stability during fixations. Eye position time series collected before and after ablation were smoothed using a third-order Savitzky–Golay filter (Matlab function sgolayfilt) with a 200-ms window. To further minimize the effects of noise in calculating velocity, eye position was again smoothed by fitting a least-squares spline to individual fixations (Matlab function csaps; Fig. 8, red traces). One second of data before and after each saccade was excluded. During long fixations, the eyes tended to spend much of the time near the center of gaze, where drift is minimal; the resulting preponderance of near-center data points yields poor estimates of drift. Thus only the first 10 s of a fixation were included for fixations >10 s. For each data point on this smoothed eye position trace, a corresponding velocity was calculated as the slope of a line fit to the data from 300 ms before to 300 ms after that point, excluding any points <300 ms from the boundaries.
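The local-slope velocity estimate can be sketched as follows (the Savitzky–Golay and spline smoothing stages are omitted; names are ours):

```python
import numpy as np

def local_slope_velocity(pos, fs, half_win=0.3):
    """Velocity at each sample as the slope of a least-squares line fit to
    +/-300 ms of eye position; points within 300 ms of the boundaries are
    left as NaN, i.e., excluded (sketch)."""
    h = int(half_win * fs)
    t = np.arange(-h, h + 1) / fs
    vel = np.full(len(pos), np.nan)
    for i in range(h, len(pos) - h):
        vel[i] = np.polyfit(t, pos[i - h:i + h + 1], 1)[0]
    return vel
```

Pairing each position sample with this velocity estimate yields the points of a P–V plot.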


Prediction of fluorescence fluctuation in hVPNI neurons

Our regression-based identification method is founded on a single-compartment model of calcium accumulation that provides a way to approximate what the fluorescence time series should look like as a function of neural activity (Helmchen and Tank 2005; Tank et al. 1995). We assume that the intracellular calcium concentration change associated with individual action potentials is characterized by a CIRF taking the form of an instantaneous increase followed by an exponential return to a resting level and that these changes summate linearly when action potentials occur close together in time relative to the decay time of the CIRF (Fig. 1A). Firing-induced fluorescence changes can then be approximated by a convolution of firing rate with the CIRF. Discrete sampling of fluorescence as occurs with TPLSM will give rise to measured fluorescence time series that resemble smoothed versions of these summations.
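Under these assumptions, the predicted fluorescence is simply the spike train convolved with a unit-amplitude exponential CIRF; a minimal sketch (names are ours):

```python
import numpy as np

def predict_fluorescence(spike_times, tau, fs, duration):
    """Sum exponentially decaying calcium transients linearly over spikes,
    equivalent to convolving the firing rate with the CIRF exp(-t/tau)."""
    n = int(duration * fs)
    rate = np.zeros(n)
    for t in spike_times:
        rate[int(t * fs)] += 1.0        # one delta per spike
    kernel = np.exp(-np.arange(n) / fs / tau)
    return np.convolve(rate, kernel)[:n]
```

Because the model is linear, closely spaced spikes simply add their transients, which is the property the regression approach exploits.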

Fig. 1.

Predicting fluorescence responses for velocity-to-position oculomotor neural integrator for horizontal eye movements (hVPNI) neurons. A: we expect individual spikes (vertical dashes in top trace) to be associated with exponentially decaying transients in fluorescence (bottom, black) or a smoothed trace (red), representative of discretely sampled fluorescence, and for transients to sum linearly. B: simulated eye position showing saccades of different sizes. C: eye velocity computed from B. D–F: spike rasters (top traces), predicted fluorescence (bottom, black), and smoothed fluorescence (red) expected of neurons encoding eye position (D), ipsiversive burst velocity (E), and a sum of the two (F). G: convolved eye position and ipsiversive velocity (dashed traces) computed from B and C, respectively, and their sum (solid).

Our expectation for the fluorescence variations characterizing hVPNI neurons is established by the close correspondence seen in adult preparations between firing rate and eye movement (Aksay et al. 2000; Lopez-Barneo et al. 1982; McFarland and Fuchs 1992). During spontaneous eye movement, the firing of hVPNI neurons encodes eye position (Fig. 1B) and ipsiversive velocity (Fig. 1C). The eye position dependence is approximately threshold-linear, with zero firing during fixations at positions below a threshold, and the ipsiversive velocity dependence is approximately linear. Based on the model of calcium dynamics we assume, eye position sensitivity in firing should generate exponential rises and decays toward new steady-state fluorescence levels initiated by ipsiversive and contraversive saccades, respectively, with a time constant characteristic of the CIRF (Fig. 1D). Larger changes in firing associated with more extreme positions would lead to rises and decays of larger amplitude. Ipsiversive velocity sensitivity should give rise to calcium-sensitive fluorescence transients, again with a time constant characteristic of the CIRF (Fig. 1E). Longer bursts associated with larger saccades would give rise to larger transients. hVPNI neurons whose firing encodes eye position and ipsiversive velocity would exhibit fluorescence changes approximated by a linear combination of these two components (Fig. 1F). This suggests that linearly combined time series of CIRF-convolved eye position and ipsiversive velocity (Fig. 1G), measured simultaneously during image time series acquisition, would well approximate these fluorescence changes and could form the basis for a regression model that would capture much of their structure. Cells with different mixtures of velocity and position encoding in their firing would have fluorescence waveforms representing different mixtures of the two components that give rise to different regression coefficients.

Synchronous eye tracking and optical recording with cellular resolution during spontaneous eye movement

To implement such a strategy for hVPNI cell identification, we built a microscope allowing simultaneous and synchronous eye position measurement and two-photon laser scanning optical recording of calcium fluctuations during spontaneous eye movement (Fig. 2, A and B). For imaging, the calcium-sensitive fluorescent dye OGB-1 was bolus-loaded (Brustein et al. 2003) into the caudal hindbrain, where we anticipated finding neurons comprising the hVPNI based on observations in the goldfish (Aksay et al. 2000; Pastor et al. 1994). Larvae were mounted on the microscope embedded in agarose with their eyes free to move. In this preparation, larvae saccade spontaneously in the horizontal plane, generally in alternating directions (Fig. 2A, Eye position inset). This behavior did not appear to be appreciably altered by dye loading and fluorescence excitation. Saccade rate, mean amplitude, mean peak saccade velocity, and mean centripetal eye velocity during fixations showed no significant difference in bolus-loaded larvae during laser scanning compared with uninjected, unscanned controls (all P values >0.1, one-tailed Student's t-test; Supplemental Fig. S6). The rate of centripetal drift in the dark (<1°/s; Supplemental Fig. S6D) is consistent with the presence of a functional hVPNI stabilizing gaze during fixation in larvae at the stages imaged (6–12 dpf, 4–4.8 mm in length).

Fig. 2.

Optical recording of calcium fluctuations with cellular resolution in larval zebrafish during spontaneous eye movements. A: schematic of the custom-built microscope allowing simultaneous and synchronous two-photon laser scanning imaging and eye tracking. Lenses, filters, and dichroic mirror part numbers are given in methods. B: top view of the water-filled chamber in which agarose-embedded larvae are mounted. C: sagittal view two-photon image of the otic vesicle and posterior cerebral vein (PCeV). Short arrows indicate blood flow directionality. In C–F, the white dotted line marks the rostrocaudal position of the ear peak. D: lateral view transmitted light image of a 9 days postfertilization (dpf) larval zebrafish, with image regions from C and G outlined in white. E: dorsal view epifluorescence image of the hindbrain with reticulospinal (RS) neurons labeled with Alexa 488 dextran (green). Rhombomeres 4–8 (r4–r8) are indicated. In E and F, the gray dotted line marks the rostral edge of a typical imaging window. F: sagittal view two-photon image of Alexa 488 dextran labeled RS neurons. G: time projection of a two-photon sagittal image time series collected in the caudal hindbrain. The medial longitudinal fasciculus (MLF) and the hindbrain floor are outlined in dotted red. Dorsal (D), ventral (V), rostral (R), and caudal (C) directions are denoted. Images in C, D, and F are oriented similarly to G. The image in E instead has the mediolateral axis running vertically. All scale bars = 20 μm, except in E, where the scale bar = 40 μm.

To quantify spatial location in our search for hVPNI neurons, an anatomical coordinate system was established that enabled landmark-referenced localization of imaged somata in three dimensions within the hindbrain. RS neurons indicative of rhombomere location were fluorescently labeled to establish a rostrocaudal coordinate axis (Fig. 2, E and F; n = 12 larvae) (Ma et al. 2009). Their rostrocaudal distances (Supplemental Table S1) were then measured from a landmark readily observable using two-photon excitation, the ear peak (Fig. 2C; see methods). All optical recording was performed in a separate set of larvae in which the rostral image window boundary was referenced to the ear peak, allowing estimation of the rostrocaudal distance between RS neuron somata and ROIs identified in bolus-loaded larvae with, on average, ∼10 μm accuracy. Mediolateral ROI position was measured from the midline. The MLF in the dye-loaded hemisphere, observable in sagittal plane fluorescence images 20 μm from the midline (Fig. 2G), served as a landmark from which dorsoventral ROI position was measured.

Based on the anatomical location of goldfish Area I, we expected to find hVPNI neurons in zebrafish larvae in the ventral caudal hindbrain. To identify putative hVPNI neurons, we collected sagittal image time series from 100 × 100-μm windows in this hindbrain region in 10-μm mediolateral increments, ranging from 20 to 70 μm lateral of the midline. We set the rostral and caudal image window boundaries for this collection between ∼100–130 μm and ∼200–230 μm caudal of the Mauthner cell soma, respectively, locating our images in rhombomere 7/8 (Fig. 2G). The ventral boundary of the imaging window was always set such that the ventral floor of the hindbrain was visible. As a result, nearly two thirds of the dorsoventral extent of the hindbrain at this rostrocaudal position fell within the window. A total of 84 image time series of 750 frames (∼384 s) were collected either in the dark or in the light (n = 8 larvae).

Regression-based behavior-encoding cell identification

Based on the conceptual idea illustrated in Fig. 1, we developed a regression-based algorithm that could quickly find hVPNI neurons within these fluorescence image time series without supervision. Orthogonal regressors were generated from the measured variables CIRF-convolved eye position, CIRF-convolved ipsiversive velocity, and frame-averaged fluorescence (Fig. 3A). The last of these was included to capture fluctuation stemming from bleaching or other systemic variation in fluorescence measurements from frame to frame, although bleaching was minimal in our image time series (see methods). Our approach operated on a pixelwise basis, fitting each pixel time series with a linear model comprising the orthogonalized regressors. To begin identifying regions within the image window showing significant eye position or ipsiversive velocity dependence, T scores were calculated for each pixel time series from regression coefficients. These scores measure the strength of correlation between pixel time series and a primary regressor, either CIRF-convolved eye position or ipsiversive velocity. The inclusion of regressors beyond the primary regressor served to capture added structure in pixel time series, reducing the fit residuals and thereby increasing T scores (Fig. 3B). Before assessing the significance of correlations, we converted T scores to Z scores: Zp when using CIRF-convolved eye position as the primary regressor and Zv when using CIRF-convolved ipsiversive velocity. We then corrected for motion-induced correlation between pixel time series and the primary regressor (see methods). After correction, a significance threshold yielding a specified false discovery rate (FDR; Motulsky 2010) was calculated (Storey 2002; Storey et al. 2004) and pixels with corrected Z scores above this threshold were deemed significantly correlated with the primary regressor (Fig. 3E).
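The pixelwise fit can be sketched as follows: regressors are orthogonalized by Gram-Schmidt, each pixel time series is fit by least squares, and the T score for the primary regressor is its coefficient divided by its standard error. This is our reconstruction of the approach, not the authors' code, and all names are ours:

```python
import numpy as np

def primary_t_scores(pixels, primary, *others):
    """T scores for the primary regressor; pixels is (n_pixels, n_frames)."""
    n = primary.size
    cols = [np.ones(n), primary] + list(others)
    # Gram-Schmidt: orthogonalize each regressor against those before it
    Q = []
    for c in cols:
        for q in Q:
            c = c - (c @ q) / (q @ q) * q
        Q.append(c)
    Q = np.column_stack(Q)
    beta, ssr, *_ = np.linalg.lstsq(Q, pixels.T, rcond=None)
    sigma2 = ssr / (n - Q.shape[1])             # residual variance per pixel
    se = np.sqrt(sigma2 / (Q[:, 1] @ Q[:, 1]))  # SE of the primary coefficient
    return beta[1] / se
```

With an orthogonal design, adding nuisance regressors (such as frame-averaged fluorescence) leaves the primary coefficient unchanged but shrinks the residuals, raising T scores exactly as described.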

Fig. 3.

Regression model and corrected Z score calculation. A: illustration of collected data (left column) and the variables in the orthogonalized regression model (right column) fit to pixel fluorescence time series. The eye variables are convolved with a calcium impulse response function (CIRF), then all 3 variables are orthogonalized. β values represent regression coefficients. B: average median T score and average median sum squared error for regression fits using different sets of regressors. C: an exemplary histogram of Z scores calculated from T scores for all included pixels from one image time series. Red overlays in C and D show the best fit of a Gaussian to the histogram in D. The cyan overlay in C shows a standard normal distribution with an equivalent peak height. D: histogram of the negative Z scores from C and their additive inverses. E: histogram of Z scores from C divided by the SD of the distribution shown in D. Vertical red lines indicate the false discovery rate (FDR)–corrected significance threshold.

The mapping onto image space of Z scores from regressions using CIRF-convolved eye position as the primary regressor begins to reveal groups of strongly position correlated pixels (Fig. 4A). However, after thresholding the resulting maps (Fig. 4B), the edges of these groups remain poorly defined. To generate smoothly bounded regions of highly position correlated pixels, a contextual enhancement routine was applied to significance-thresholded maps (Fig. 4C) (Salli et al. 2001; Tillikainen et al. 2006). Comparison of smoothed regions with the time projection of image time series revealed that these regions corresponded to clusters of somata, individual somata, and portions of neuropil. The size and number of these regions depended on the specified FDR. This rate was set empirically so that some identified somata displayed average fluorescence time series with Pearson correlations with the primary regressor below 0.5, ensuring that most or all somata with such correlations >0.5 would be found. Based on analysis of somata for which both fluorescence and firing rates were recorded, a correlation >0.5 for fluorescence with CIRF-convolved eye position was always associated with robust position-sensitive firing. Regions corresponding to clusters of cells were common, as might be expected in the larval zebrafish hindbrain, where most somata still reside in a dorsomedial somata-dense mass emanating from the fourth ventricle. Even in the more neuropil-dense ventrolateral extent of the hindbrain at this stage, somata are often closely apposed. This introduced the problem of segmenting individual somata from such regions.

Fig. 4.

Segmentation of Z score maps from an image time series. A: pseudocolored plot in image space of the corrected Z scores for each pixel time series. B: time projection of the image time series with Z scores above the FDR-corrected significance threshold overlaid in a pseudocolor scale. C: time projection of the image time series with Z scores overlaid in a pseudocolor scale after contextual enhancement. D: manually segmented regions identified as putative eye position-dependent somata. E: automatically segmented regions identified as putative eye position-dependent somata. All scale bars = 20 μm.

For off-line data analysis following experiments, segmentation can be performed manually by tracing the edges of cells apparent within these regions in the time projection of an image time series (Fig. 4D). This saves substantial time at this analysis step, obviating the need to define ROIs for all cells visible in the time projection. However, to enable rapid cell identification during experiments, we developed an automated method that uses the change in correlation magnitude over image space to find somal boundaries (see methods; Fig. 4E). For each contiguous region noticeably larger than a single soma, ROIs were defined by tracing paths from pixels corresponding to local correlation maxima radially outward to either local correlation minima, the edge of the contiguous region, or the point at which the path length reached the average somal radius. Path ends were connected to define an ROI (Supplemental Fig. S3). This method succeeds in finding most of the same somata as manual segmentation, defining similar boundaries for them, although there is some disagreement between manual and automated segmentations. For comparison, we calculated the fraction of manually defined cells that were also found via automated segmentation. For a set of 344 manually defined cells from 44 image time series, 77% (266) were automatically identified. A match here required that automated segmentation identify a cell overlapping >50% of the pixels comprising a given manually defined cell. These 266 cells had a median Pearson correlation of 0.97 with their manually defined pair in terms of ROI-averaged fluorescence.
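The >50% overlap criterion used for this comparison can be sketched in a few lines (a hypothetical helper; ROIs are represented as sets of pixel coordinates):

```python
def match_fraction(manual_rois, auto_rois):
    """Fraction of manual ROIs for which some automated ROI overlaps >50%
    of their pixels (ROIs are iterables of (row, col) tuples). Sketch."""
    hits = 0
    for m in manual_rois:
        m = set(m)
        best = max((len(m & set(a)) / len(m) for a in auto_rois), default=0.0)
        hits += best > 0.5
    return hits / len(manual_rois)
```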

A fraction of automatically segmented regions do not appear to correspond to somata but rather to neuropil (∼9%; 15 of a sample of 174 automatically segmented regions with cp >0.5), and the boundaries of automatically segmented somata often differ noticeably from those manually defined. Nevertheless, since our automatic segmentation method succeeds in rapidly identifying many putative hVPNI cells, it would be sufficient for many conceivable targeted activity measurement or perturbation experiments. For our 256 × 256 pixel, 750 frame time series, the analysis script runs in <10 min on a 2-GHz Intel T2500 processor, most of which is devoted to the 2D cross-correlation–maximizing motion correction. The remainder of the script runs in <2 min. For certain applications, motion correction may be unnecessary.

A broad spatial distribution of putative hVPNI neurons

Figure 5A depicts representative results of the application of this automated cell localization method to an image time series collected 40 μm from the midline of an 8 dpf larva. For this time series, a total of 34 automatically segmented regions of interest (ROIs) with Pearson correlations with either CIRF-convolved eye position or ipsiversive velocity >0.5 were identified. Those with such correlation >0.60 (n = 24) are outlined. CIRF-convolved versions of simultaneously measured eye position and ipsiversive velocity (Fig. 5, C and E) show clear similarity to ROI-averaged fluorescence time series (Fig. 5F). Some of these time series (e.g., ROI 1) match convolved position quite well, whereas others strongly resemble convolved ipsiversive velocity (e.g., ROI 14). As expected based on measurements of hVPNI neuronal firing in other species (Aksay et al. 2000; Lopez-Barneo et al. 1982; McFarland and Fuchs 1992), many ROI time series appear as a linear combination of convolved position and ipsiversive velocity components (e.g., ROIs 7 and 13). In Fig. 5F, ROI-averaged regression model fits are overlaid for several example cells, demonstrating the ability of our three-parameter model to capture much of the structure in ROI time series.

Fig. 5.

Saccade-related calcium fluctuations in optically identified hindbrain somata. A: time projection of an image time series collected 40 μm from the midline. Outlined are automatically identified regions of interest (ROIs) having averaged fluorescence (ΔF/F) time series with cp >0.6 (red, cells 1–13 ranked by cp) or cv >0.6 (green, cells 14–24 ranked by cv). For A and G–J, scale bars = 20 μm. Simultaneously measured frame-averaged ipsilateral eye position (B) and ipsiversive velocity (D) are shown along with CIRF-convolved versions (C and E). F: calcium fluctuations in outlined ROIs. Two minutes of their somatic fluorescence time series are plotted in black, with the average regression model fit for included pixels overlaid in magenta for several examples. ROI-averaged Z scores (〈Zp〉 and 〈Zv〉), cp, and cv are indicated for these examples. G–J: time projection of image time series collected 20 μm (G), 30 μm (H), 50 μm (I), and 60 μm (J) from the midline with automatically identified ROIs outlined in red and green according to the same criteria used in A. Moving outward from 20 to 60 μm, image windows were translated 5 μm dorsal at each 10-μm step to capture more cells within the window.

ROIs with average fluorescence time series correlating strongly with either CIRF-convolved eye position or ipsiversive velocity were found throughout the mediolateral and dorsoventral extent of the volume from which image time series were collected in all larvae examined (n = 8) in both the dark and the light (e.g., Fig. 5, G–J). To eliminate the influence of any ROIs not corresponding to somata, we examined the distribution of manually segmented somata having Pearson correlation >0.5 with CIRF-convolved eye position or ipsiversive velocity. For these somata, the dorsoventral spread in cell location within sagittal planes was extensive. For all cells identified in the light, SDs for dorsoventral locations within sagittal planes ranged from 8.68 μm (20 μm lateral) to 19.50 μm (30 μm lateral), with a mean of 16.56 μm. Within individual image time series in which >5 cells were identified (n = 28 of 44), SDs for dorsoventral cell locations ranged from 7.06 to 25.26 μm, with a mean of 16.14 μm. Comparable results were obtained for data collected in the dark. Although the greatest abundance of cells was typically found 30–40 μm from the midline, cells were found in all sagittal planes whenever >20 somata were identified in an individual larva (n = 6 in the dark and the light). The finding of cells with activity patterns characteristic of hVPNI cells spread throughout a broad extent of rhombomere 7/8 contrasts starkly with observations in the adult goldfish, in which cells with these activity patterns are confined primarily to a discrete nucleus comprising a much smaller fraction of brain stem volume (Aksay et al. 2000).

Targeted electrical recording demonstrates expected firing patterns

One immediate application of our automated cell localization method is the identification of cells with specific activity patterns apparent in fluorescence data for targeted electrical recording. We optically identified cell somata with significant eye position and velocity correlation and then recorded from them using visually guided loose-seal patch electrodes (Fig. 6A). Cells showed the expected action potential firing properties, including persistent position-dependent firing during fixations and bursts associated with ipsiversive saccades (Fig. 6, B–D). Correlations with both eye position and ipsiversive velocity time series were used to quantify the sensitivity of firing rates to these variables (Escudero et al. 1992). Figure 6E plots these values for all recorded neurons (n = 9). A bootstrap method involving circular permutation of time series was used to test the significance of these sensitivities (see methods). All cells were found to be significantly sensitive to eye position, and 7 of 9 showed significant burst sensitivity (P < 0.05). Qualitatively, the variation in position and ipsiversive velocity sensitivities we observed is consistent with similar measurements for putative hVPNI neurons in monkeys (McFarland and Fuchs 1992), cats (Delgado-García et al. 1989; Escudero et al. 1992), and goldfish (Aksay et al. 2000).
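The circular permutation test referenced here can be sketched as follows; this is an illustrative reconstruction rather than the authors' code, and the function name and permutation count are our assumptions. Circularly shifting one series preserves its autocorrelation structure while destroying its temporal alignment with the other, yielding a null distribution for the correlation.

```python
import numpy as np

def circular_permutation_pvalue(rate, behavior, n_perm=1000, seed=0):
    """Nonparametric significance test for the correlation between a
    firing rate time series and a behavioral variable, using circular
    permutation to build a null distribution of correlations."""
    rng = np.random.default_rng(seed)
    observed = np.corrcoef(rate, behavior)[0, 1]
    null = np.empty(n_perm)
    for i in range(n_perm):
        shift = rng.integers(1, len(rate))     # random circular shift
        null[i] = np.corrcoef(np.roll(rate, shift), behavior)[0, 1]
    # two-sided P value with add-one correction
    return (np.sum(np.abs(null) >= abs(observed)) + 1) / (n_perm + 1)
```

A sensitivity is deemed significant when this P value falls below the chosen threshold (P < 0.05 in the analysis above).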

Fig. 6.

Targeted electrical recording of optically identified eye position-sensitive neurons. A: overlay of green and red channel fluorescence collected during an electrical recording. Scale bar = 5 μm. B–D: ipsilateral eye position (B), electrode voltage (C), and firing rate (D) for a representative cell during spontaneous eye movement. E: eye position (black dots) and ipsiversive velocity (red dots) sensitivity for 9 cells from which electrical recordings were made. Horizontal bars show corresponding significance thresholds (P = 0.05) for the dependence of firing rate on position (black) and ipsiversive velocity (red).

Calcium impulse response function estimation

Our regression-based method for identifying neurons showing behaviorally correlated activity makes use of an estimated CIRF characterizing the mapping between underlying firing rate and measured fluorescence. The nature of firing patterns in hVPNI neurons during spontaneous eye movements presented an opportunity to estimate CIRFs in a novel manner and to do so for most of the cells we identified from optical recording. As illustrated in Fig. 6 and previously characterized in other preparations, contraversive saccades are associated with a step down in the persistent firing of hVPNI neurons (Aksay et al. 2000; Lopez-Barneo et al. 1982; McFarland and Fuchs 1992). If intracellular calcium concentration can be described by the firing rate convolved with an exponentially decaying CIRF, steps down in persistent firing should be associated with an exponential decay in intracellular calcium to a new steady-state level, with a time constant equal to that of the CIRF (Helmchen and Tank 2005; Tank et al. 1995). We therefore assembled contraversive saccade-triggered average (STA) fluorescence responses for identified neurons and fit exponential functions to these averages over the interval immediately following the saccade time to estimate the time constant defining the CIRF (Fig. 7A). In this way, we generated cell-specific CIRF estimates, allowing us to examine the distribution of these values across cells. To ensure that we limited this analysis to cells with robust persistent activity, we restricted it to a population of 455 manually segmented cells having Pearson correlations with CIRF-convolved eye position of >0.5. Exponential fits to contraversive STA fluorescence responses were of reasonable quality (R2 >0.5) for 91% (416/455) of these cells. To characterize the distribution of time constants, we focused on the cells for which fits were of the highest quality (R2 >0.95; 171/455 = 38% of cells).
The minimum, 25th percentile, median, 75th percentile, and maximum of this distribution were 0.70, 1.23, 1.49, 1.91, and 3.98 s, respectively (Fig. 7B). The mean and SD of the distribution were 1.63 and 0.62 s, respectively. Previous estimates of this time constant for zebrafish neurons loaded with OGB-1 have ranged from 0.33 s (optic tectum; Ramdya et al. 2006) to 5.7 s (olfactory bulb interneurons; Yaksi and Friedrich 2006). To validate that cell-specific CIRF estimates accurately capture the mapping between firing and fluorescence, we calculated estimates from contraversive STA fluorescence responses for 8 position-correlated cells from which we subsequently made electrical recordings. We then assembled STA firing rates for these cells and convolved them with the estimated CIRF for comparison with both ipsiversive and contraversive STA fluorescence (Fig. 7, C and D). We found remarkably strong agreement: for ipsiversive STAs, the Pearson correlation between the two ranged from 0.91 to 0.98, with a mean of 0.96; for contraversive STAs, it ranged from 0.88 to 0.98, also with a mean of 0.96. To quantify the significance of the dependence of these correlations on the cell specificity of the CIRF estimates, we calculated bootstrap P values (see methods). For both the ipsiversive and contraversive STA mean correlations, P < 0.001.
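The CIRF time constant estimation described in this section reduces to a single-exponential fit. A minimal sketch follows, assuming a model of the form a·exp(−t/τ) + c fit over the post-saccade interval; the function name and initial-guess heuristic are ours, not taken from the authors' methods.

```python
import numpy as np
from scipy.optimize import curve_fit

def estimate_cirf_tau(t, sta_dff):
    """Fit a decaying exponential a*exp(-t/tau) + c to a contraversive
    saccade-triggered average (STA) fluorescence response, where t is
    time after the saccade. Returns the tau estimate and the fit R^2."""
    def model(t, a, tau, c):
        return a * np.exp(-t / tau) + c
    # crude initial guess: amplitude from endpoints, tau near 1.5 s
    p0 = (sta_dff[0] - sta_dff[-1], 1.5, sta_dff[-1])
    (a, tau, c), _ = curve_fit(model, t, sta_dff, p0=p0)
    resid = sta_dff - model(t, a, tau, c)
    r2 = 1.0 - resid.var() / sta_dff.var()
    return tau, r2
```

Cell-specific τ estimates can then be screened on fit quality, as in the analysis above (e.g., retaining fits with R2 > 0.95).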

Fig. 7.

Estimating CIRFs. A: example contraversive saccade-triggered average (STA) fluorescence response (green) with overlaid exponential fit (black dotted). Top traces show portions of the ipsilateral eye position (top) from which saccades were identified and the somatic fluorescence time series from which the STA was compiled. B: histogram of estimated CIRF τ values from exponential fits to contraversive STA responses for which R2 >0.95. A box plot for the distribution of values is shown on top. The mean of the distribution, 1.63 s, is indicated. C and D: ipsiversive (C, red) and contraversive (D, green) STA fluorescence responses and CIRF-convolved STA firing rates (black) for the same neuron. The convolved traces were normalized to have a maximum of 1. The fluorescence traces had the mean of the first 1.5 s subtracted off, then were scaled by a constant to minimize their mean squared difference from the convolved traces.

Optical ablation demonstrates expected functionality

To verify that the eye position-encoding neurons we identified were actually part of the hVPNI, we performed focal laser ablations in rhombomere 7/8, where a high density of these neurons was consistently identified, and probed for gaze-holding deficits (Fig. 8). The region chosen was at the dorsoventral level of the MLF, the rostrocaudal level of myotome 2, and midway between the midline and the lateral edge of the hindbrain (Supplemental Fig. S5D). Ablations lesioning ∼30-μm spheres of tissue were performed bilaterally in this region, partially damaging the hVPNI and resulting in decreased fixation stability in all experiments (n = 6 larvae; e.g., Fig. 8, A and B). Fixation stability was quantified by plotting eye velocity versus eye position, excluding data surrounding saccades, and measuring the slope k (Fig. 8, C and D). Less stable gaze would manifest as higher magnitude eye velocities and thus a larger magnitude k. The slope k is related to the time constant of the fixations, τ, via the relationship τ = −1/k. The average k before hindbrain ablation was −0.04 s−1 (τ ≃ 25 s) and the average slope postablation was −0.10 s−1 (τ ≃ 10 s) (Fig. 8E). Control ablations were performed in the spinal cord 30–50 μm caudal to the site of hindbrain ablations in 4 other larvae. The average k both before and after these control ablations was −0.07 s−1 (τ ≃ 15 s) (Fig. 8F). The distribution of slope changes induced by hindbrain ablations differed significantly from that of control ablations (P < 0.01, two-tailed t-test). This demonstrates the hVPNI functionality of cells within the region in which we identified eye position-encoding neural activity.
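The slope-based quantification of fixation stability can be illustrated with a short sketch (ours, with assumed function and parameter names): fit eye velocity against eye position during fixations and convert the slope k to a fixation time constant via τ = −1/k.

```python
import numpy as np

def fixation_time_constant(eye_pos, dt, saccade_mask=None):
    """Least-squares slope k of the eye velocity vs. eye position (P-V)
    relationship during fixations, and the implied fixation time constant
    tau = -1/k. saccade_mask optionally flags samples to exclude."""
    eye_vel = np.gradient(eye_pos, dt)
    if saccade_mask is not None:
        eye_pos, eye_vel = eye_pos[~saccade_mask], eye_vel[~saccade_mask]
    k, _ = np.polyfit(eye_pos, eye_vel, 1)
    return k, -1.0 / k
```

For a perfectly stable fixation k approaches 0 and τ diverges; a leaky integrator with a 25-s time constant yields a slope of magnitude 0.04 s−1, consistent with the ≃25-s preablation fixations reported above.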

Fig. 8.

Laser ablation of putative hVPNI region. A and B: representative eye position traces before (A) and after (B) bilateral ablation in rhombomere 7/8. Red overlay demarcates data points used for eye position vs. eye velocity (P–V) plots. C and D: representative P–V plots from eye measurements before (C) and after (D) hindbrain ablation. Black lines show least-squares linear fits. E and F: slope of P–V plots before and after individual ablations (gray) and the average of all trials (black) for hindbrain (E) and spinal cord (F) ablations.


Behavior emerges out of the coordinated activity of large neuronal populations. Understanding the neural basis of specific behaviors will require interpretation of the dynamics of this activity and, to address causality, real-time perturbation. Much effort is currently aimed at maximizing the number of neurons that can be optically recorded simultaneously in experimental preparations allowing concurrent behavioral measurement. As the size of data sets grows, so grows the need for methodology facilitating the localization within these sets of neurons relevant to measured behaviors, both during experiments and subsequent off-line data analysis. We have addressed this need by developing a regression-based method for rapidly identifying cells displaying behaviorally correlated neural activity within fluorescence image time series. We applied this method to data collected from a novel preparation allowing simultaneous tracking of eye position and two-photon imaging of optical probes measuring calcium concentration changes in awake, behaving zebrafish larvae. The identification of eye-movement–related activity with cellular resolution using our method enabled targeted electrical recordings and optical ablations that localized the hVPNI in these larvae. We found a broad distribution of cells encoding eye position and ipsiversive velocity that differs markedly from the highly confined distribution in adult goldfish (Aksay et al. 2000).

Our identification method succeeded because we were able to formulate a linear regression model that accounted for a substantial degree of the structure of fluorescence fluctuations in the cells we sought. This hinged on two main factors: 1) a sufficiently good prior expectation of the encoding of behavioral parameters in the firing patterns of our cells of interest and 2) a sufficiently good approximation of the mapping between firing and calcium-sensitive fluorescence fluctuations. The rapidity of our identification procedure stemmed from the use of a linear model, for which a computationally cheap closed-form solution to the regression problem exists, and from the legitimacy of parametric statistical testing, which allowed us to avoid bootstrap-based nonparametric tests. Ultimately, the sensitivity of an approach such as ours depends on how much of the structure of cellular fluorescence fluctuations the regression model is able to capture. Our results show that models assuming approximations to behavior–firing relations and firing–calcium concentration relations can perform well. The exponentially decaying CIRF we have used here may not, however, be sufficient for cell identification in other contexts. CIRFs of this form have been useful for firing rate estimation (Yaksi and Friedrich 2006), but in general CIRF form may need to be estimated separately for distinct cell types, and more complex functional forms may be necessary for certain applications. It is important to note that the ability to later infer action potential firing from optical measurements alone in identified neurons can be diminished by a lack of biophysical realism in assumed firing–calcium concentration relations.

The determinants of our method's strength also point to certain limitations. A linear model will likely not always capture essentially all of the structure in fluorescence fluctuations, particularly for data sets with high temporal resolution. This could result, for example, from a nonlinear dependence of underlying neuronal firing on measured behavioral variables or from temporally structured components of firing that are independent of these variables. Cells whose activity depends nonlinearly on behavior could still be identified if that nonlinearity has a significant linear correlation over the measured range of behavior. However, in these cases an approximately normal residual distribution is not guaranteed, and nonparametric statistical testing relying on bootstrap sampling could be used instead, although at a computational cost. Alternatively, methods that account for signal structure not captured by regressors, recovering the ability to parametrically test the significance of linear correlations (Leek and Storey 2008), could be implemented. The general approach taken here would be compatible with the use of nonlinear regression models in cases in which linear models fail to capture a sufficient fraction of the signal structure for the desired purpose. Following the initial application of linear regression models, nonlinear models have been used extensively in the analysis of fMRI time series (Buchel et al. 1996; Friston et al. 1998; Genovese 2000); we expect a similar analytical development as the problem of identifying behaviorally correlated neurons in cellular resolution optical recordings is progressively addressed. It is also important to note that our method relies on univariate statistics, testing behavioral correlation at each pixel separately.
Improved detection sensitivity could be achieved with multivariate statistical methods that account for the dependence of behavior on collective fluorescence changes across pixels. Such methods have been applied successfully in the analysis of fMRI data, in which researchers search for groups of voxels that carry information about cognitive or behavioral variables (Norman et al. 2006). One caveat to consider here is that when testing the significance of the relationship between groups of pixels and behavior, a null model characterizing the expected correlation between image pixels must be assumed.

Despite its limitations, our approach should fill an important niche among methods for image time series analysis, speeding both on-line and off-line identification of neurons encoding measured behaviors. To date, most reported analyses of in vivo optical recording data have relied on manual ROI definition (Dombeck et al. 2007; Kerr et al. 2005; Niell and Smith 2005), which can be laborious, particularly with imaging of large populations, and may not be tenable for cell identification during experiments. Automated methods for cell identification have been described that make use of cell morphology apparent in image contrast (Dorostkar et al. 2010; Ohki et al. 2005; Valmianski et al. 2010) and/or the correlation structure of image time series (Mukamel et al. 2009; Ozden et al. 2008). The former type has been applied to segment cells in cortex, but it remains unclear how well it would perform where cells are more closely packed, as in the larval zebrafish CNS, or where fluorescent labeling provides limited contrast between somata and neuropil. Correlation-based methods require a certain degree of independence between cells to distinguish them. For example, the ICA-based method described by Mukamel et al. (2009) begins to have trouble separating cells when the pairwise correlation of their calcium-sensitive fluorescence exceeds 0.8, yet in our data closely apposed cells commonly had correlations >0.9. Whereas Mukamel and colleagues turn to image segmentation to separate highly correlated cells, the dense packing of highly correlated cells in the zebrafish hindbrain drove us instead to use local maxima and minima in maps of behavior correlation to draw cellular boundaries.

Using our method, we were able to find many somata demonstrating fluorescence fluctuations consistent with an underlying firing rate encoding eye position and/or ipsiversive velocity. The firing of hVPNI neurons in adult vertebrate preparations encodes mixtures of these two variables. Strictly speaking, cells belonging to the hVPNI have at least some position-sensitive firing, yet some of the cells we have found could be exclusively velocity sensitive. These cells could represent developmentally immature neurons that will later encode eye position or saccadic burst neurons that encode ipsiversive velocity and drive saccades.

Our observation of a broad distribution of putative hVPNI neurons across space provides an important snapshot of hVPNI development. Based on the spatial extent of goldfish Area I (Aksay et al. 2000), we might expect to find such neurons in a more confined region of the hindbrain. The rostrocaudal extent of goldfish Area I represents roughly 10% of the length of the brain stem from the abducens nucleus to the rostral end of the spinal cord, whereas our imaging window represents about 50% of the equivalent extent in zebrafish larvae. Goldfish Area I is also confined to proportionally smaller extents along the mediolateral and dorsoventral axes. Because the gross morphology of the developing hindbrain at the stage we imaged differs from that of the mature brain stem, we expect that the broad distribution reflects the structure of the hVPNI early in development rather than an interspecies difference. The methods we demonstrate should provide a straightforward means to study the development of the spatial organization and dynamics of the hVPNI.

Our preparation and regression-based method will be useful for further study of the mechanisms underlying persistent neural activity within the hVPNI, complementing previous investigations using electrical recording. Two-photon excitation is minimally invasive, limiting not only damage to the microcircuitry under examination but also disturbance of behavior. The use of optical recording is free from certain biases inherent in searching for putative integrator cells with microelectrodes, such as the confinement to isolatable units and the experimenter's arbitration in selecting cells for recording. Emerging techniques for connectivity mapping and subcellular activation measurements among optically identified neuronal populations on short spatial scales should be compatible with the preparation. This would enable a more direct assessment of the extent to which recurrent connections among hVPNI neurons and bistability/multistability within their dendrites subserve firing persistence (Aksay et al. 2001, 2007; Goldman et al. 2003) than has been feasible to date.

Beyond our preparation, a regression-based cell identification approach should be broadly useful, facilitating experiments aimed at elucidating the logic of microcircuits subserving behavior. Rapid identification of behaviorally correlated neurons will enable subsequent cell-specific activity perturbation using laser ablation, juxtacellular stimulation, or the excitation of optogenetic probes. Although we did not use it here to demonstrate hVPNI functionality, we have successfully laser ablated individual identified position-encoding neurons (data not shown). Fast identification may at times be critical in testing ideas about microcircuit function and collective dynamics among component neurons, since this could require activity perturbation not just in arbitrary cells but in cells with specific activity patterns. The cells identified by our method could also be targeted for neuroanatomical assessment via, for example, electrode-based electroporation of fluorescent probes revealing morphology or synaptic connections, allowing activity measurements to be interpreted vis-à-vis network architecture. Since the subcellular spatial resolution that optical recording affords comes at the cost of reduced temporal resolution, the identification of cells of interest during an experiment would allow subsequent targeted scanning of these cells at high sampling frequencies or targeted electrical recording. This would speed the collection of high temporal resolution activity measurements from large numbers of behaviorally relevant neurons. Coupling our identification method with emerging techniques for fast random access scanning could allow such measurements to be made simultaneously.

A regression-based approach could also be useful for identifying other cell types functionally imaged simultaneously with behavioral measurement. For example, place cells and head direction cells typically show firing rate tuning curves relative to an animal's position in one dimension (McNaughton et al. 1983) or head direction (Taube and Bassett 2003), respectively, that could be approximated by piecewise linear functions. The same should hold for tuning curves in terms of calcium-sensitive fluorescence. Regression of pixel fluorescence on position or head direction using a piecewise linear model, with subsequent significance testing of regression coefficients, could be used to identify pixels comprising cells with distinct position or head direction sensitivity. A closed-form solution to the regression problem using a piecewise linear model can be derived for the case in which the x coordinates of connection points between linear pieces are known, and these connection points could be estimated from data prior to model fitting. For head direction cells, firing rate is well described as a piecewise linear function of head direction, and the same may be true for fluorescence, such that normally distributed residuals would result from the regression and significance testing would be fast. More generally, our method, when coupled with targeted recording to validate activity–parameter correlations and laser ablation to validate predicted functionality, represents a promising framework for optically guided mapping of neuronal populations underlying behavior.
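The closed-form piecewise linear regression described above can be sketched with hinge basis functions, assuming the knot (connection point) locations are known in advance; the function names here are hypothetical illustrations, not an established implementation.

```python
import numpy as np

def piecewise_linear_design(x, knots):
    """Design matrix for a continuous piecewise linear tuning curve:
    intercept, slope, and one hinge term max(0, x - knot) per knot."""
    cols = [np.ones_like(x), x] + [np.maximum(0.0, x - k) for k in knots]
    return np.column_stack(cols)

def fit_tuning_curve(x, dff, knots):
    """Closed-form least-squares fit of fluorescence to position or head
    direction under a piecewise linear model with known knots."""
    X = piecewise_linear_design(x, knots)
    beta, *_ = np.linalg.lstsq(X, dff, rcond=None)
    return beta, X @ beta
```

Because the model remains linear in its parameters, significance testing of the fitted coefficients proceeds exactly as in the linear case.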


This work was supported by a National Science Foundation predoctoral fellowship to A. Miri; a Burroughs Wellcome Fund Career Award at the Scientific Interface, a Searle Scholar award, and a Frueauff Foundation grant to E. Aksay; and National Institutes of Health Training Grant EY-007138-16 to K. Daie and Grant R01 MH-060651 to D. W. Tank.


No conflicts of interest, financial or otherwise, are declared by the authors.


We thank G. Gamkrelidze for assistance with experiments in the early phases of this work; F. Collman for motion-correction software; D. Dombeck, R. Baker, and L. Ma for technical advice; G. Detre and M. Chikina for helpful conversations; and D. Dombeck for helpful comments on the manuscript.


1 The online version of this article contains supplemental data.
