## Abstract

We previously demonstrated that there is an abrupt (rather than smooth) transition between reactive and predictive modes of eye-movement tracking of target lights (a phase transition). We also found evidence that the sequence of eye movements in the reactive mode was independent, whereas those in the predictive mode were correlated and possibly formed a random fractal sequence. Here we confirm the finding of fractal structure by quantifying the rate of decay of nonlinear forecasting when applied to these data. We also estimate the window over which consecutive trials are correlated and show that the duration of this window is fixed in time rather than number of trials. These results have implications for the neural mechanisms that drive predictive movements.

## INTRODUCTION

In previous work, we demonstrated that there is a “phase transition” between two different modes of saccadic tracking (Shelhamer and Joiner 2003). Subjects were asked to follow a visual target that periodically jumped between two fixed positions at various rates. When target pacing was slow (<0.5 Hz), saccade latency was ∼180 ms; subjects waited for each target jump and reacted with a saccade. When target pacing was fast (>0.6 Hz), saccade latency decreased dramatically to the point where subjects made predictive saccades that anticipated the target jumps. We call these two distinct behaviors *reactive* and *predictive* saccades. The transition from reactive to predictive as pacing frequency increases (and vice versa) is not smooth but abrupt, and occurs at ∼0.5 Hz. [This result is closely related to earlier work on phase transitions in finger tapping (Kelso et al. 1979).]

Our initial study also presented evidence for a difference in the statistical scaling properties of saccade latencies in the reactive and predictive modes. The power spectra of the series of latencies for reactive saccades was relatively flat, suggesting an uncorrelated or white noise process. The spectra of the predictive saccade latency series decayed as a function of frequency in a power-law fashion. The decay exponent was in a range that suggested that the latencies form a random fractal sequence, indicative of long-term correlations between latencies (see discussion for an explanation of random fractal sequences).

Here we confirm and extend this result through the use of nonlinear forecasting or prediction (Farmer and Sidorowich 1987). Nonlinear forecasting is a way to predict subsequent events in a time series based on a reconstruction of system trajectories in state space. It has been shown that forecasting quality decays as a function of forecasting horizon (time into the future over which the forecast is made) in a characteristic manner for random fractal processes. We show that forecasting confirms that predictive saccade latencies indeed form a random fractal sequence, and we use this technique along with autocorrelation functions to determine the time interval over which past saccades have a “significant” effect on the latency of the current saccade. [A study with similar goals but using different methodologies was carried out for human finger tapping (Roberts et al. 2000).]

## METHODS

All subjects gave their consent to participate as approved by the local Institutional Review Board. To obtain data on saccades at different pacing rates, subjects were seated in a dark room and presented with two red light-emitting diode (LED) targets spaced horizontally 15° to the left and right of center. The LEDs were illuminated alternately at selected rates from 0.3 to 1.2 Hz. The subject was instructed simply to look at each light when it came on. Eye movements and LED position were recorded, and a series of saccade latencies was extracted at each pacing rate by comparing the two time series. It is these series of saccade latency values, one series for each subject and tracking condition, that are analyzed throughout this study.

To obtain large data sets for the initial analysis of reactive and predictive behavior, 1,000 trials were recorded from each of four subjects, during pacing at 0.3 and 0.9 Hz. To obtain data on the characteristics of predictive saccades at different frequencies, saccades from four subjects were recorded during pacing at 1.2, 1.0, 0.8, and 0.6 Hz, in that order, for 150 trials at each frequency. The third subject in the first set of trials is the same as the fourth subject in the second set; all other subjects are unique. The first set of trials here (*n* = 1,000 saccades) is identical to *task 2* in our previous paper (Shelhamer and Joiner 2003), and the subjects are the same as in that study (although in a different order: *subjects 1–4* here are *subjects F, E, G,* and *H* from the previous paper).

Data are presented here in terms of both frequency of pacing and the interval between each target step (interstimulus interval, ISI, which is half the period). For frequencies 0.6, 0.8, 1.0, and 1.2 Hz, the corresponding ISI are 833, 625, 500, and 417 ms.

Nonlinear forecasting was carried out in a standard manner (Farmer and Sidorowich 1987) as used in our previous work on other aspects of eye-movement control (Shelhamer and Gross 1998; Trillenberg et al. 2001). Nonlinear forecasting is a means of predicting subsequent (latency) values based on local approximations to a reconstructed system trajectory in state space. Forecasting begins by creating the system time trajectory in a suitable (possibly high dimensional) state space. This is a space in which key variables are represented on each axis, and the system behavior over time traces out a trajectory in this space. Because it is not known what the key variables are, we use time-delay reconstruction (Packard et al. 1980; Ruelle 1990), which preserves essential properties of the actual state space but is more mathematically tractable. In this procedure, consecutively time-delayed values of a single measured quantity form points in an *M*-dimensional space. The reconstructed trajectory consists of the points *x*(*i*), generated from the original time series *y*(*i*) as follows *N* is the number of points on the trajectory, *M* is the dimension. (More general forms of the reconstruction exist, but this one is suitable for our purposes.) A reference point on the trajectory is chosen from which to make the forecast, and a set of *K* nearest neighbors is found. A linear regression is performed on these neighbors to see how they project into the future, and the resulting regression coefficients are used to forecast how the reference point will move in the future, over several steps. (A depiction of this procedure is described in results in association with Fig. 2.) Because the subsequent points are in fact already known (but not used in the forecasting procedure), we can determine the correlation between the forecast values and the actual data, averaged over all forecasts made. The system can be characterized by the behavior of the correlation coefficient as the forecasting is carried further into the future. A random system will have uniformly poor forecasting quality for all forecasting steps, whereas a (nonchaotic) deterministic system will have uniformly good forecasting quality. Chaotic systems (those that are nonlinear and deterministic but unpredictable in the long run) and certain correlated random systems can be forecast in the short term but the quality decays rapidly.

Forecasting was carried out in this study from 1 to 20 steps (trials) ahead. For each data set, forecasting parameters were adjusted so as to optimize the forecasting quality while forecasting one step ahead. Optimization was performed via exhaustive search over a suitable range of values of state-space dimension *M,* number of nearest neighbors *K,* and the number of values preceding the reference point from which to choose these neighbors (the “template” from which to draw points).

A significant advance in the interpretation of forecasting results was made by Tsonis and Elsner (1992), who demonstrated mathematically that forecasting quality decays in a certain manner [log(1 − *r*) vs. log(forecasting step) is a linear function] if the underlying data form a so-called “random fractal sequence” or “fractional Brownian motion” (fBm). We apply this test to our forecasting results. The consequences of this particular data structure will be elucidated in the discussion.

## RESULTS

### Large data sets—power spectrum, autocorrelation, forecasting

Power spectra for reactive and predictive saccade latencies for one subject (*n* = 1,000 trials) are shown in Fig. 1, *A* and *B* (based on Fig. 4 in Shelhamer and Joiner 2003). The reactive saccade spectrum (Fig. 1*A*) is approximately flat (slope: −0.00076, approximating white noise), indicating that consecutive saccades during reactive tracking are uncorrelated. The predictive saccade spectrum (Fig. 1*B*), on the other hand, decays as a power law with respect to frequency (slope: −0.92), producing a linear trend on the log-log scale. Power-law scaling indicates that consecutive saccades during predictive tracking are correlated in some manner.

Figure 1, *C* and *D*, shows the autocorrelation functions for the sequences of reactive and predictive latencies, corresponding to the power spectra in *A* and *B.* Autocorrelation shows how well a signal is similar to a time-shifted copy of itself. Here the abscissa is expressed in terms of number of trials rather than time per se. The autocorrelation for reactive saccades (Fig. 1*C*) has a peak at 0, as expected, and decays very quickly with intertrial interval. This approximates a mathematical “(Dirac) delta function” and again indicates that trials are uncorrelated. (The autocorrelation for uncorrelated or white noise is a single spike at 0 lag.) The autocorrelation for predictive saccades (Fig. 1*D*) likewise has a peak at zero but decays much more slowly, indicating that there is “significant” correlation (defined here as correlation >0.2) between saccade latencies even as far apart as 16 trials.

An example of forecasting for one subject tracking at 1.0 Hz is shown in Fig. 2. In *A,* a small subset of 16 actual and 15 forecast latency values is presented, drawn from the larger sequence. Actual and forecast data are shown, indicating the quality of the short-term forecasts in this case; the actual and forecast data diverge after 10 steps. In *B,* a two-dimensional reconstructed state-space trajectory is shown, for both actual and forecast latency values. Here, only the first five latency values from *A* are shown. This is a two-dimensional time-delay reconstruction: latency for a given trial is plotted along the abscissa while latency of the previous trial is plotted along the ordinate. As in *A,* the general trend of the forecast data (thin line, ×) follows that of the actual data (thick line, empty circle). The dotted line marked by squares shows the template data: latency data from early in the sequence that provide the basis for the forecasts. To visualize the forecasting process, begin at the point labeled “1,” from where the forecasting begins. The actual data trajectory moves down and to the left to get to point “2.” The forecast trajectory, however, moves down and to the right. This is because the forecast is based on a linear regression applied to the nearby template points—those marked by squares along a trajectory that parallels the forecast trajectory from “1” to “2.” In similar fashion, subsequent forecasts roughly parallel the nearest template trajectory, which generally also matches that of the actual data in this case, resulting in high-fidelity forecasts.

Forecasting quality from this same subject, and three others, is shown in Fig. 3. For reactive saccades (pacing at 0.3 Hz, *n* = 1,000), the correlation coefficients *r* between sets of forecast and actual values are plotted as a function of the number of steps (trials) into the future over which the forecasting is carried out, using linear scales (Fig. 3*A*). Forecasting quality is very poor; even forecasting one step ahead is not possible with great fidelity. (Forecasting would be perfect for *r* = 1 and completely nonexistent for *r* = 0.) This again is representative of uncorrelated trials because past behavior provides little information with which to forecast future behavior. A similar plot for predictive saccades (pacing at 0.9 Hz, *n* = 1,000) shows very good forecasting quality which decays gradually over the course of 10–15 trials (Fig. 3*B*). These same predictive-saccade forecasting results are plotted in Fig. 3*C* in the form of log(1 − *r*) versus forecasting step and in *D* in the form of log(1 − *r)* versus log(forecasting step). The decay appears as a linear function only in the case of Fig. 3*D*; this indicates that the data form a fractional Brownian process (Tsonis and Elsner 1992), the implications of which will be discussed in the following text. Thus both forecasting and autocorrelation demonstrate that latencies of predictive saccades are correlated across at least five trials (during pacing at 0.9 Hz).

### Small data sets—autocorrelation and forecasting define a correlation window

Intertrial correlations among predictive saccade latencies were noted in the preceding text. We now wish to determine if this “correlation window” is of constant duration (in seconds) during predictive saccade tracking at different pacing frequencies or if it is rather defined in terms of a constant number of trials regardless of the pacing frequency. To perform this analysis, we examined autocorrelation functions and forecasting quality, as in the preceding text, over a range of predictive tracking frequencies from 0.6 to 1.2 Hz in four subjects. These data sets were obtained in a single recording session for each subject and so were limited to *n* = 150 trials (saccades) at each frequency to reduce fatigue. Thus the quality of the results, especially in terms of forecasting, is not as good as in the case of the larger data sets just presented. Nevertheless the appropriate trends are clear. (Although forecasting with such small data sets yields insufficient statistics to confirm power-law scaling per se, it does allow for comparisons across tracking at different frequencies.)

Predictive tracking was verified for each of these four subjects at each frequency by finding the mean latency in each case, after rejecting the first 10 saccades to analyze only steady-state behavior. Mean latencies (±SD) for the first subject are −57.41 ± 52.1, −55.9 ± 38.3, −37.7 ± 33.8, and −38.1 ± 38.7 ms at 0.6, 0.8, 1.0, and 1.2 Hz, respectively. Values for the other subjects are: 96.5 ± 110.9, −17.6 ± 153.0, −90.4 ± 136.7, and −87.9 ± 90.4 ms (*subject 2*), 25.6 ± 111.7, −61.7 ± 100.7, −60.2 ± 77.5, and 78.8 ± 49.4 ms (*subject 3*), and −75.4 ± 87.9, −70.9 ± 75.1, −71.0 ± 56.3, and −7.3 ± 48.6 ms (*subject 4*). Because latency is typically ∼200 ms for random saccades (Becker 1989), a latency much smaller than this is deemed to be predictive. In particular, negative latencies correspond to responses that precede the stimulus (e.g., Fig. 7*G*). Even though latency is sometimes positive, all subjects exhibited predictive tracking at all frequencies.

Similarly to Fig. 1*D*, Fig. 4 shows the autocorrelation functions for sequences of *n* = 140 predictive saccade latencies (again the 1st 10 were rejected in each case to eliminate the starting transient), for pacing frequencies 0.6, 0.8, 1.0, and 1.2 Hz, for four subjects (1 along each row of graphs). The width of the central peak changes systematically with pacing frequency. The abscissa is expressed in terms of “trials” not “time,” and so the graphs depict correlations as a function of number of trials separation, not number of seconds separation. Taking the first points at which each autocorrelation magnitude decays to 0.2, we can define a “correlation window” as the corresponding width (actually the half-width, from 0 lag to the 0.2 point) of the central peak at these points. These window sizes are provided in Table 1 for each subject at each pacing frequency, in terms of trials and in terms of the corresponding time in seconds (which is equal to the window width expressed in trials multiplied by the ISI).

These data are also plotted in Fig. 6*A*, where it is clear that the duration of the correlation window remains relatively constant in terms of time, not number of trials, across these different rates of predictive saccade pacing. For the first subject, the correlation window spans a range of 2.2:1 expressed in terms of trials (2.3–5.0, for a ratio of 5.0/2.3 = 2.2:1; in this notation a value of *R* closer to 1 indicates a smaller range), but only 1.6:1 when expressed in terms of absolute time (1.3–2.1 s). Similarly for the other subjects (except *subject 4* where the values are close), the correlation window is approximately constant in terms of time (1.3:1, 1.3:1, 1.4:1) more so than trials (2.1:1, 2.3:1, 1.2:1). (Obvious outliers are excluded: 0.6 Hz for *subject 3* and 1.2 Hz for *subject 4.*) Each of these lines was tested for trend via linear regression of the window size on frequency. The regression slopes were tested for a significant difference from 0 (no trend across frequency) with *t*-test at a significance level of 0.1. In all cases but one, the slopes in terms of trials are significantly different from 0, and in all cases, the slopes in terms of time are not significantly different from 0. This confirms that the correlation window is constant in terms of time more so than in terms of trials.

Forecasting quality for different pacing frequencies is shown in Fig. 5 for the same four subjects. There is significant quality to the forecasting in each case, indicating that predictive saccades were being made. Again using 0.2 as the threshold, correlation windows are provided in Table 2, and plotted in Fig. 6*B*. As in the case of the autocorrelation functions, the correlation window of good-quality forecasting varies with pacing frequency so as to remain approximately constant in terms of time more so than in terms of number of trials. As for the correlation window determined from the autocorrelation function, the lengths of the correlation windows determined from forecasting for each subject are ∼1.5–2.5 s.

Another way to picture the operation of this correlation window, over which past saccades influence the current saccade, is by examining the properties of saccades that outlast the stimulus. In Fig. 7 we present, for the same subjects and conditions as in Figs. 4 and 5, the final 2 s of target pacing in each case. Target pacing ends at the 2-s point in each graph. It is very clear that the number of saccades that continue past the time of target presentation increases as the pacing rate increases. In accord with the results in the preceding text, these continuing saccades last for ∼1.5 s regardless of pacing rate (the exception being *E* and *I*).

## DISCUSSION

Our earlier study (Shelhamer and Joiner 2003) indicated that sequences of predictive saccade latencies are correlated, and the present study confirms and extends that finding. During saccadic tracking of targets paced at high rates (typically >0.5 Hz), target steps come often enough that there is insufficient time to react to each individual step when programming a saccade, and therefore the performance of previous saccades must be considered in triggering the current saccade. This leads to the correlations that we observe in the latency series of predictive saccades.

An obvious question is whether or not the repetitive rhythmic behavior addressed here has relevance for normal activities, which are often transient. Indeed these repetitive patterns may take the system out of its normal natural regime. However, we gain by stimulating the system in such a way that its behavior can be investigated with existing mathematical techniques. This is necessary at this stage of our investigations. In the following text, we make some comments as to differences between long- and short-duration repetitive stimulation, which have some bearing on this question as well.

### Saccade latency series as a fractional Brownian motion

The specific form of the correlations between predictive saccades results in what appears to be a random fractal sequence. Also known as “fractional Brownian motion” (fBm), this is a generalization of Brownian motion (Addison 1997; Bassingthwaighte et al. 1994; Mandelbrot 1983). Brownian motion is essentially the integral of Gaussian white noise. It has a power spectrum that decays as 1/*f*^{2} across all frequencies. It is also statistically self-similar, in this sense: taking a small segment of a larger Brownian motion signal and expanding its time scale by (for example) a factor of *b* = 4, the resulting signal will have the same statistics as the original if the amplitude scale is adjusted by a factor of *b*^{0.5} = 4^{0.5} = 2. This “expansion” exponent, 0.5, is called the Hurst exponent *H* and is related to the decay exponent α of the power spectrum. This self-similarity means that the signal is a fractal.

Fractional Brownian motion is a generalization of this, where the Hurst scaling exponent *H* can take on any value between 0 and 1. The associated power spectrum also decays with a value other than 2 (in the example in Fig. 1, the frequency-decay exponent is 0.92; that is, the spectrum has the form 1/*f*^{0.92}). Values of *H* other than 0.5 (regular Brownian motion) indicate that the signal exhibits long-term correlations (fBm). [For signals with long-term correlations such as fBm, the relationship *H* = (1+α)/2 should hold (Rangarajan and Ding 2000). The first example given above is regular Brownian motion, which does not exhibit long-term correlations and for which this relationship does not hold].

In theory, Brownian motions (conventional and fractional) have infinite memory—the current value depends on integration over the infinite past. (Here we interpret the series of saccade latencies for each trial as a series of samples from an fBm.) Obviously this is not possible for real neural systems, and characterization of latency series as an fBm is necessarily an approximation. We set an arbitrary threshold to define a correlation window over which past saccades have “significant” impact on the current saccade. Because evidence that latency series are fBm suggests that long-term correlations exist, setting, this strict threshold is subject to debate. It is more likely that the performance of past saccades is taken into account in a decaying manner, such that more recent saccades have more impact. This evidence also implies that, as more trials are added to the history of the system (i.e., as predictive saccade tracking continues for many seconds or minutes), the effective correlations will extend further into the past and the correlation window will increase in duration. We return to this point in the following text.

Another consequence of the fBm formulation (alluded to in the preceding text) is that there are interactions between trials over different time scales. One might gain an appreciation of this from Fig. 8, where we have plotted the latency series (*n* = 1,000) from one subject during reactive tracking at 0.3 Hz (*top*) and predictive tracking at 0.9 Hz (*bottom*). Whereas the *top graph* resembles an uncorrelated random sequence, the *bottom graph* exhibits features that are characteristic of fractional Brownian series, most notably fluctuations on different time scales. These observations suggest that there is distributed neural control of predictive saccades, operating over different time scales (Ding et al. 2002) and possibly coinciding with distributed neural saccade pathways (Gagnon et al. 2002; Gaymard et al. 1998).

### Extent of correlations between predictive saccades

An obvious question is over what extent “significant” correlations exist. We examined the autocorrelation structure of predictive saccades at different pacing frequencies. The correlation window, as defined by autocorrelations >0.2, was ∼2 s. In Fig. 1*D* for tracking at 0.9 Hz note that the correlation window is on the order of 16 trials or 8.9 s, much longer than the windows found in the shorter data sets in Figs. 4 and 5. We conjecture that, as predictive tracking continues for long periods, the correlation window may gradually increase as it becomes clear that it is safe to take into account performance further in the past since the stimulus conditions are likely to remain fixed. In the pursuit system, a similar neural store has been hypothesized to account for the ability to generate improved anticipatory pursuit after repeated viewing of target motion (Wells and Barnes 1998). The length of this store is apparently just a few seconds. As we noted in the preceding text for saccades, the duration of this pursuit store can be greatly increased under suitable conditions (Chakraborti et al. 2002).)

Another measure of correlation, which is sensitive to more general nonlinear interactions, is forecasting in the state space—how well can previous latency data be used to predict future latencies? Our findings here again show that correlations exist over ∼2 s. The fact that the autocorrelation and nonlinear forecasting methods give similar results suggests that the dominant relations between trials are linear.

In this study, we are faced with the typical conundrum of data set size. The quality of the autocorrelation functions, and especially of the forecasting results, would be improved by the use of larger data sets. However, our own data (autocorrelations in Figs. 1 vs. 4) demonstrate that there can be large differences in the correlation structure of predictive latencies when prediction is carried out for more than a few seconds at a time. One might argue that the smaller data sets are more representative of natural behavior, where predictive repetitive motions might occur over several seconds rather than several minutes. Thus our results may characterize prediction in natural circumstances more so than if we had used longer data sets with better correlation and forecasting features.

### Outline of a model for predictive saccade generation

There appears to be a relatively fixed correlation window over which past experience is incorporated into the programming of the current saccade. This window, on the order of 2 s, is fixed in absolute time rather than number of trials. We suggest that the saccadic system attempts, whenever possible, to make predictive saccades based on past performance. It does so by “integrating”—in some unknown manner—performance over the previous few seconds. When the rate of target pacing falls below ∼0.25 Hz, there are no longer any previous saccades within this correlation window at the time that a target moves. Thus there is no information on past performance available for adjustment of the current saccade, and prediction breaks down (see Fig. 9 for a schematic representation). The system wants to predict, but the information available for it to do so is too far in the past to be of value. An implication of this model is that predictive saccade pathways are always in some sense “active,” although they have nothing to process until target pacing reaches a critical value. This provides a natural explanation for the abrupt “phase transition” that we previously demonstrated between reactive and predictive saccades when the rate of target pacing reaches ∼0.5 Hz (Shelhamer and Joiner 2003). This formulation might also explain the observed preference for predictive over reactive behavior that may be a general feature of repetitive tasks (Joiner and Shelhamer 2004).

There is an apparent discrepancy between our previously observed phase-transition frequency of ∼0.5 and 0.25 Hz as predicted here by a correlation window of 2 s. This might be resolved in several ways. First, as noted, the correlation window is arbitrarily defined by thresholds of 0.2; higher values would lead to shorter windows and a higher predicted transition frequency. Second, it may be that more than one previous saccade must fall within the correlation window in order for predictive tracking to occur. Establishing the details of this prediction model is a subject of our current work.

## GRANTS

This work was supported by National Eye Institute Grant EY-015193.

## Acknowledgments

The assistance of W. Joiner and A. Zorn is appreciated.

## Footnotes

The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked “

*advertisement*” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

- Copyright © 2005 by the American Physiological Society