Subsecond Timing in Primates: Comparison of Interval Production Between Human Subjects and Rhesus Monkeys

Wilbert Zarco, Hugo Merchant, Luis Prado, Juan Carlos Mendez


This study describes the psychometric similarities and differences in motor timing performance between 20 human subjects and three rhesus monkeys during two timing production tasks. These tasks involved tapping on a push-button to produce the same set of intervals (range of 450 to 1,000 ms), but they differed in the number of intervals produced (single vs. multiple) and the modality of the stimuli (auditory vs. visual) used to define the time intervals. The data showed that for both primate species, variability increased as a function of the length of the produced target interval across tasks, a result in accordance with the scalar property. Interestingly, the temporal performance of rhesus monkeys was equivalent to that of human subjects during both the production of single intervals and the tapping synchronization to a metronome. Overall, however, human subjects were more accurate than monkeys and showed less timing variability. This was especially true during the self-pacing phase of the multiple interval production task, a behavior that may be related to complex temporal cognition, such as speech and music execution. In addition, the well-known human bias toward auditory as opposed to visual cues for the accurate execution of time intervals was not evident in rhesus monkeys. These findings validate the rhesus monkey as an appropriate model for the study of the neural basis of time production, but also suggest that the exquisite temporal abilities of humans, which peak in speech and music performance, are not all shared with macaques.


An essential component of primate cognitive function is the ability to extract and represent temporal information from the environment. The quantification of the passage of time, in turn, is crucial to coordinate motor behavior. Processing of temporal information is a key element during speech production and comprehension (Shannon et al. 1995), music performance (Janata and Grafton 2003; Mauk and Buonomano 2004; Shannon et al. 1995), and complex motor actions (Mauk and Buonomano 2004), such as target interception and collision avoidance (Merchant and Georgopoulos 2006; Merchant et al. 2009). For example, the ability to capture and interpret the beats in a rhythmic pattern allows people to move and dance in time to music; in fact there is evidence showing that how we move may influence our perception of musical rhythm (Phillips-Silver and Trainor 2005). As in music, there is a spectral and temporal structure in speech necessary for successful word articulation and recognition (Diehl et al. 2004). The importance of timing in speech is apparent in patients with cochlear implants that show nearly perfect speech recognition with a reduced amount of spectral information (Shannon et al. 1995). Thus auditory stimuli are efficiently processed and are associated with the extremely complex timing abilities in humans (Grondin 2001; Merchant et al. 2008c; Wearden et al. 1998). In general, all these behaviors unfold on the millisecond timescale, a range that seems to depend on a specific neural timing mechanism (Gibbon et al. 1997; Rammsayer 1999). Indeed, functional imaging studies have shown that the basal ganglia, cerebellum, and different cortical structures including the supplementary motor area (SMA), prefrontal, and posterior parietal cortex form a neural circuit engaged in temporal information processing (Bengtsson et al. 2005; Coull et al. 2004; Pouthas et al. 2005; Rao et al. 1997, 2001). 
Nevertheless, although subsecond time processing has been relatively well studied from the behavioral (Buhusi and Meck 2005; Grondin 2001) and functional imaging perspective (Janata and Grafton 2003), there are few neurophysiological studies of perceptual timing (Lebedev et al. 2008; Leon and Shadlen 2003). To our knowledge, there are no reports on motor timing neurophysiology, which demands an appropriate animal model to study the neural underpinnings of interval timing during voluntary motor performance (Patel et al. 2005).

The brain representations of time and space are considered supramodal, since no specific sensory organs are devoted to provide such complex information. However, when the senses deliver conflicting information, vision dominates spatial processing, whereas audition dominates temporal processing (Bertelson and Aschersleben 2003; Guttman et al. 2005; Repp and Penel 2002). It has been suggested that the human perceptual system abstracts the rhythmic-temporal structure of visual stimuli into an auditory representation that is automatic and obligatory (Brodsky et al. 2003; Guttman et al. 2005). To understand the neural basis of spatial processing, the most frequently studied nonhuman primate is the macaque monkey, due to its remarkable ability to deal with spatial information and its psychophysical similarity with human subjects at the perceptual (Britten et al. 1992; Parker and Newsome 1998; Romo et al. 2000), cognitive (Fortes et al. 2004; Janssen et al. 2000; Merchant et al. 2003, 2004b), and motor levels (Buneo et al. 2002; Georgopoulos et al. 1993; Velliste et al. 2008). A number of combined neurophysiological and psychophysical experiments in macaques have been designed to uncover, with notable success, the functional organization of the neural circuits that mediate spatial processing (Georgopoulos et al. 1986, 1989; Hubel and Wiesel 1968; Mountcastle et al. 1975; Rolls 1999). This neurophysiological information has been fundamental for understanding the human brain mechanisms of spatial behavior (Kourtzi et al. 2003; Vanduffel et al. 2002) because of the interspecies similarities in the visual system (Newsome and Stein-Aviles 1999; Nichols and Newsome 1999). Therefore the same experimental procedures could be valid to study the neural basis of time processing. 
Open questions, though, are 1) whether the macaque timing production shows the same properties of human psychophysical execution and 2) whether this type of primate constitutes a good animal model to study the neural basis of interval production.

The current study provides a detailed psychometric description of the similarities and differences in motor timing performance between human subjects and rhesus monkeys during two timing production tasks. Results indicate that the rhesus monkey is a suitable model for the study of the neural basis of time production, but they also suggest that not all of the exquisite temporal abilities of humans are shared with macaques.



Twenty human subjects (10 males, 10 females), mean (SD) age of 26.5 (2.5) years (range: 23–32 yr) were tested in this study. They were right-handed, had normal or corrected vision, and were naive about the task and purpose of the experiment. All subjects reported no systematic musical training for >1 yr. Each subject volunteered and gave informed consent for this study, which complied with the Declaration of Helsinki and was approved by the National University of Mexico Institutional Review Board.

Three naive male monkeys (Macaca mulatta, 5–7 kg, referred to as M1, M2, and M3) were used. The ages of the monkeys were 7, 6, and 5 yr, respectively. M2 and M3 were right-handed and M1 was left-handed. All experimental procedures with the animals were approved by the National University of Mexico Institutional Animal Care and Use Committee and conformed to the principles outlined in the Guide for Care and Use of Laboratory Animals (National Institutes of Health, publication number 85–23, revised 1985).


Human subjects were seated comfortably on a chair facing a computer screen (refresh rate: 60 Hz; Dell Optiplex 19-in.) in a quiet experimental room and tapped on a push-button (4-cm diameter, #7717, sampled at 200 Hz; Crest, Dassel, MN) during the production tasks. The button made an approximately 50-dB sound every time it was pushed (HER-400; Decibelimeter, Electrónica Steren, Mexico City, Mexico). The subjects could not see their own hand during tapping. Monkeys were seated in a primate chair in a sound-attenuated room facing a computer screen. The animals tapped on the same type of push-button with one hand, whereas their opposite arm was comfortably restrained during the task. The monkeys started each trial in the tasks by putting their working hand on a horizontal key (with infrared sensors) that was placed next to the push-button. Human subjects started each trial by placing their hand next to the push-button on a custom-made platform that prevented them from seeing their own hand during tapping. The stimulus presentation and the collection of behavioral responses were computer-controlled by a custom-made Visual Basic program (Microsoft Visual Basic 6.0, 1998). Auditory stimuli were presented through noise-canceling headphones (MDR-NC50, Sony) for humans or two equidistant front speakers for monkeys. The monitor was at a distance of 57 cm from the eyes in both species.

Task 1: multiple-interval task (MIT)


At the beginning of the trial, stimuli were presented with a constant interonset interval. Subjects were required to push a button each time a stimulus was presented, which resulted in a stimulus-movement cycle. The subjects started to press the button when they were ready to start the synchronization phase. After four consecutive synchronized movements the stimuli were eliminated and the subjects continued tapping with the same interval for three additional intervals. Monkeys received a reward if each of the intervals produced had an error <35% of the target interval. In addition, the monkey could receive a double reward if the error of the intertap interval was <20% of the target interval. The amount of monkey reward (fruit juice) was adjusted to be proportional to the trial duration (interval duration × 6 produced intervals), to decrease the bias for the production of short-duration intervals. For human subjects feedback was displayed on the screen as the mean intertap interval and SD for the continuation phase. Throughout the experiment, trials were separated by a variable intertrial interval (1.2 to 4 s).
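The reward rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the use of absolute relative error, the requirement that every interval meet the 20% criterion for a double reward, and the `ml_per_second` scaling constant are all hypothetical and are not taken from the original task software.

```python
def reward_level(produced_ms, target_ms, single_tol=0.35, double_tol=0.20):
    """Return 0 (no reward), 1 (single reward) or 2 (double reward) for one trial.

    Assumption: error is the absolute deviation from the target, expressed as a
    fraction of the target, and the double reward requires all intervals < 20%.
    """
    errors = [abs(p - target_ms) / target_ms for p in produced_ms]
    if any(e >= single_tol for e in errors):
        return 0  # at least one interval missed the 35% criterion
    if all(e < double_tol for e in errors):
        return 2  # every interval within the stricter 20% criterion
    return 1


def reward_amount(level, target_ms, n_intervals=6, ml_per_second=0.1):
    """Scale the reward with trial duration (target interval x produced intervals).

    ml_per_second is a made-up constant used only to illustrate proportionality.
    """
    trial_seconds = target_ms * n_intervals / 1000.0
    return level * ml_per_second * trial_seconds
```

With this scaling, a correct 1,000-ms trial earns more juice than a correct 450-ms trial, which is the stated purpose of making the reward proportional to trial duration.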


For both human subjects and monkeys the auditory stimuli were pure tones (33 ms, 2,000 Hz, 65 dB). Visual stimuli were 4-cm side squares presented in the center of a computer screen for 33 ms, with green color for human subjects and red for monkeys. The frame rate of the video board (60 Hz) was accurately calibrated and both the visual and auditory stimuli, although brief, were clearly detectable. The target intervals were 450, 550, 650, 850, and 1,000 ms and were chosen pseudorandomly within a repetition. Ten repetitions were collected for each interval for a total of 300 produced intervals (5 durations × 6 intervals [3 synchronization + 3 continuation] × 10 repetitions).

Task 2: single-interval task (SIT)


For each interval, there was a training and an execution period. In the training period, a target interval (two stimuli whose onsets were separated by a particular duration) was presented at the beginning of the trial. Then the subject tapped twice on the push-button to produce the same interval. This was repeated for five training trials, after which the subject entered the execution period, where he/she produced another 10 single intervals, each in response to a go signal that appeared on the screen. In the case of monkeys, each duration interval was associated with a particular stimulus feature (e.g., 450 ms with a blue square) so that during the execution period the go signal was a stimulus that had been linked to the production of a specific interval during the training period. Monkeys were rewarded following the same rules described in Task 1. Again, feedback was displayed on the screen for human subjects, indicating the mean intertap interval and SD across trials of the same target interval during the execution period. Throughout the experiment, trials were separated by a variable 1.2- to 4-s intertrial interval.


In this task we tested only four target intervals: 450, 650, 850, and 1,000 ms. The same auditory and visual stimuli as in the MIT were used in human subjects. For monkeys, the stimulus properties were associated with a particular target interval duration as follows: 4,400-Hz tone or blue square with 450 ms, 3,000-Hz tone or green square with 650 ms, 1,000-Hz tone or cyan square with 850 ms, and 650-Hz tone or yellow square with 1,000 ms. A block of five trials in the training period followed by ten trials during the execution period was collected for a particular interval duration before changing to another one. The target intervals were chosen pseudorandomly between blocks. Therefore a total of 60 trials (40 for the execution period) were collected.


Human subjects performed both tasks for each marker modality in random order in two sessions, each session of 1 h/day. Ten repetitions per session were collected for each marker modality and task. Before data collection, practice trials were given in the tasks until the subjects acknowledged that they understood the tasks and were comfortable with their performance.

Monkeys were trained following classical conditioning techniques; they received their normal food rations but were water deprived, except for the juice drops obtained during the training and testing sessions. The animals worked 6 days/wk, 4 h/day on average; they performed around 1,000 trials/day, with a total liquid intake of 120–220 ml. Weight was strictly controlled by giving supplementary fluids to the monkeys when they lost >20% of their initial weight. The monkeys were initially trained in the MIT using the following four steps. First, the monkey learned to place his dominant hand on the horizontal key. Second, after hand-on-key detection, stimuli were presented and the monkey had to push the button twice to produce a single interval with a duration similar to that of the interonset stimulus interval (ISI). Two of the monkeys (M1 and M3) performed wrist up-down tapping movements, as human subjects did, whereas monkey M2 performed the tapping with a forward–backward wrist movement. Third, the monkey was trained to produce several (five to seven) taps in response to each stimulus. Finally, after producing four or more synchronized taps, the monkeys learned to produce intervals in the continuation phase of the task. The animal started by producing one continuation interval and gradually increased the number of taps until he was able to produce three intervals in this phase of the task. This was an extremely difficult task to learn and execute for the monkeys. Monkey M1 was initially trained in the MIT using auditory markers to define the interval durations. However, we found that the monkeys showed a clear preference for visual stimuli. Thus monkeys M2 and M3 were first trained in the visual condition and then the auditory markers were introduced. This strategy considerably reduced the training period (see following text).

Once the monkeys learned the MIT, they were trained in the SIT. In this case, after the key activation the animals produced an interval after two stimuli were presented. Thus the monkey was required to associate a particular interval duration with a specific frequency (auditory condition) or color (visual condition). Once this association was achieved, the monkey could perform the task not only in the instruction but also in the execution phase, by producing one interval after a stimulus. This task was simpler for the monkeys to learn and execute, although during training the visual marker preference persisted.

Analysis of behavioral data


Standard statistical techniques were used for data analysis including repeated-measures ANOVA and linear regression. In most of the repeated-measures ANOVAs the between-subjects variable was species and the within-subjects variables included marker modality, task phase, and interval duration. The reported probability levels in the repeated-measures ANOVAs correspond to the Greenhouse–Geisser test, which corrects for possible deviations in sphericity (Mauchly test). The level of statistical significance to reject the null hypothesis was α = 0.05. Subroutines written in MATLAB (version; The MathWorks, Natick, MA) and the SPSS statistical package (version 12, SPSS, Chicago, IL) were used for the statistical analyses.


Two parameters were evaluated as a measure of subject performance: the variance and the constant error. The mean and SD of all intertap intervals for each subject were used to compute the constant error and the variance, respectively. This implies that for the MIT, the variance corresponded to a general measure of within- and between-trial variability without averaging across trials in the synchronization and continuation phases. Accordingly, in the SIT the variance corresponded to the between-trial variability, since only one interval per trial was produced. Finally, the constant error was defined as the mean produced interval minus the target interval.
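As a minimal sketch of the two performance measures just defined, the following Python function computes the constant error and the variability from a pooled list of intertap intervals for one subject and one target interval. The function name and the returned dictionary keys are hypothetical, and the use of the sample SD (n − 1 denominator) is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, stdev


def timing_measures(intertap_ms, target_ms):
    """Constant error and variability for one subject and one target interval.

    intertap_ms: all produced intervals (ms), pooled without averaging per trial.
    Assumption: the sample SD (n - 1 denominator) is used.
    """
    m = mean(intertap_ms)
    sd = stdev(intertap_ms)
    return {
        "constant_error": m - target_ms,  # negative values mean intervals were too short
        "sd": sd,
        "variance": sd ** 2,
    }
```

For example, intervals of 430, 440, and 450 ms produced for a 450-ms target give a constant error of −10 ms, that is, an underestimation of the target.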


We used the model reported by Getty (1975) and Ivry and Hazeltine (1995) to analyze the scalar property, a form of Weber's law that defines a linear increase in temporal variability (SD) as a function of mean subjective time (Getty 1975; Gibbon et al. 1997). A linear regression between the timing variance ($\sigma^2$) and the mean subjective duration squared ($D^2$) was performed as follows:

$$\sigma_{\mathrm{Total}}^{2} = k^{2}D^{2} + c \tag{1}$$

where $k^2$ is the slope, with $k$ approximating the Weber fraction, and the intercept $c$ is a constant representing the time-independent component of the variability. This model uses variance against squared durations as a generalized Weber equation because it is only with the variance that different sources of variability can be decomposed (Getty 1975). For this reason, this model has proved useful to dissociate the temporal component from the fraction of variance that remains similar across interval durations (e.g., sensory detection and motor implementation) and accurately predicts both the initial drop in Weber fraction for very short (<200-ms) durations and the observed constancy of the Weber fraction for durations ≤2 s (Getty 1975). In addition, Eq. 1 was a better model than the linear regression of SD against interval duration in a variety of perception and tapping tasks for the range of durations in the hundreds of milliseconds (Church et al. 1976; Ivry and Hazeltine 1995).
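The regression in Eq. 1 can be illustrated with a short, dependency-free Python sketch. It fits variance against squared mean duration by ordinary least squares and returns the slope, intercept, and R². The function names are hypothetical, and this is not the MATLAB or SPSS code used in the study; it only shows the form of the computation.

```python
import math


def slope_analysis(mean_durations_ms, variances_ms2):
    """Least-squares fit of variance = slope * D**2 + intercept (Eq. 1).

    Returns (slope, intercept, r2); the slope plays the role of k**2 and the
    intercept the role of c, the time-independent variance.
    """
    x = [d ** 2 for d in mean_durations_ms]
    y = list(variances_ms2)
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot
    return slope, intercept, r2


def weber_fraction(slope):
    """k = sqrt(slope); returns None when the fitted slope is not positive."""
    return math.sqrt(slope) if slope > 0 else None
```

Feeding the function synthetic variances generated with k = 0.05 and c = 100 ms² at the five MIT target intervals (made-up numbers, not data from the study) recovers a slope of about 0.0025 and an intercept of about 100, which is a quick sanity check on the fit.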


The monkeys' performance was analyzed once the animals reached an asymptotic level in their learning curve across tasks. We assumed that the monkeys' performance was stable when their daily performance was >70% of correct trials for >1 wk. The appropriate learning criterion was reached on the MIT after 25 mo in monkey M1, 12 mo in monkey M2, and 11 mo in monkey M3, which emphasizes how difficult this task was for the monkeys, particularly in the auditory condition. In contrast, all animals learned the SIT in both marker conditions in <4 mo. Once the monkeys learned the tasks, their performance was quite consistent across days, as shown in Fig. 1 for monkey M2 during the multiple interval task. Figure 1 also shows how well this monkey differentiated between target intervals across conditions.

Fig. 1.

Mean of the intertap intervals (ITIs) as a function of days in the multiple-interval task (MIT) for monkey M2. The series of mean ITIs are shown in grayscale (cf. middle) for the 5 target intervals, across the synchronization (top) and continuation (bottom) phases, and the auditory (left) and visual (right) marker modalities.

The following results are divided into three sections. First, we address the variability and accuracy of both species during MIT. Second, we characterize the temporal performance of human subjects and monkeys in the SIT, and finally we compare the performance between the two experimental paradigms.

Multiple interval task (MIT)

The multiple-interval tapping task has been a useful paradigm in experimental psychology to understand different aspects of temporal performance (Ivry and Hazeltine 1995; Merchant et al. 2008a,b,c; Wing and Kristofferson 1973). This task has at least four main components: a sensorimotor process during synchronization, an internal timing component during both synchronization and continuation, a cyclic element for the multiple-interval production, and a working-memory component used during the continuation. Thus in the following sections we intend to compare the performance of human subjects and monkeys in this task, making an effort to dissociate the temporal and nontemporal processes of the MIT across these comparisons.


As an initial step, we compared the mean asynchronies for humans and monkeys in the MIT. The asynchronies are defined as the time difference between the stimulus onset and the tap onset. In accordance with the literature (see Repp 2005 for a review), Fig. 2 shows that human subjects were able to synchronize their behavior to the actual metronome in the synchronization phase with negative mean asynchronies, particularly in the visual modality. In contrast, the asynchronies in monkeys were positive and around 300 ms across intervals. A repeated-measures ANOVA was carried out in which species (monkeys and humans) was the between-subjects variable, the interval duration and marker modality (auditory and visual) were defined as within-subjects variables, and mean asynchronies constituted the dependent variable. The results showed significant main effects only for species [F(1,21) = 464.2, P < 0.0001]. These findings indicate that monkeys were not able to synchronize their tapping behavior to the sensory metronome as human subjects do.
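The asynchrony measure can be sketched as follows. The sign convention (tap onset minus stimulus onset, so that a negative value means the tap preceded the stimulus) is an assumption chosen to match the negative values reported for humans and the positive values reported for monkeys. The function names and the pairing of taps with stimuli by order are also hypothetical.

```python
def asynchronies(stimulus_onsets_ms, tap_onsets_ms):
    """Tap onset minus stimulus onset for each stimulus-tap pair (ms).

    Assumption: the i-th tap is paired with the i-th stimulus.
    Negative values mean the tap anticipated the stimulus.
    """
    return [tap - stim for stim, tap in zip(stimulus_onsets_ms, tap_onsets_ms)]


def mean_asynchrony(stimulus_onsets_ms, tap_onsets_ms):
    """Average asynchrony across the synchronization taps of one trial."""
    values = asynchronies(stimulus_onsets_ms, tap_onsets_ms)
    return sum(values) / len(values)
```

Under this convention, taps that consistently lag the stimuli by about 300 ms yield a positive mean asynchrony close to 300 ms, whereas anticipatory taps yield a negative one.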

Fig. 2.

Mean asynchronies (mean ± SE) for the synchronization (black) phase of the MIT for human subjects (squares) and monkeys (circles), and reaction times (light gray) in the 5 consecutive movements of the SRT for monkeys. The auditory and visual interval marker conditions are depicted in the left and right panels, respectively.

We also compared in the monkeys the mean asynchronies of the MIT with the reaction times of a sequential reaction time task (SRT), in which the animals performed five tapping movements in response to five stimuli with random (600–1,400 ms) interonset intervals to receive a reward in each trial (Fig. 2). A repeated-measures ANOVA, using the asynchronies (MIT) or the reaction times (SRT) as dependent variable and task and modality as within-subjects variables, revealed significant main effects for task [F(1,2) = 143.02, P = 0.007], but not for modality [F(1,2) = 0.188, P = 0.707] or the task conditions × modality [F(1,2) = 10.9, P = 0.081] interaction. Therefore these findings suggest that although the monkeys were not able to synchronize their behavior to external cues, their tapping responses in the MIT were shorter than the reaction times in the SRT and thus showed some level of timing prediction during the synchronization phase.


The SD was computed from the within- and between-trial individual interresponse intervals (IRIs) for each target interval duration. A repeated-measures ANOVA was performed using marker modality, interval duration, and task phase (synchronization and continuation) as within-subjects variables, species as between-subjects variable, and SD as dependent variable. The results showed significant main effects for species [F(1,21) = 68.46, P = 0.007], modality [F(1,21) = 13.3, P = 0.002], interval duration [F(4,84) = 37.95, P < 0.0001], and phase [F(1,21) = 14.62, P = 0.001], as well as significant species × modality [F(1,21) = 9.84, P = 0.005], species × interval duration [F(4,84) = 4.84, P = 0.001], and species × phase [F(1,21) = 18.58, P < 0.0001] interactions. These results indicate that in both phases of the MIT, the overall variability was greater in monkeys than that in humans, it was also larger for visual than for auditory markers, and it increased as a function of the interval duration. In addition, the interaction effects demonstrated that only human subjects showed greater temporal variability in the visual than that in the auditory condition and only monkeys showed increased variability in continuation when compared with synchronization (see Supplemental Table S1).

The next step was to verify whether the variability of successive IRIs was stable across synchronization and continuation phases or whether there were systematic changes in performance at some point in the tapping sequence. In this case the SD was computed across trials for each sequence position. Supplemental Fig. S1 shows the mean SD as a function of the six intervals produced, the first three from synchronization and the last three from continuation. A repeated-measures ANOVA was carried out using SD as dependent variable, production sequence (one to six IRIs) and modality as the within-subjects variables, and species as the between-subjects variable. The ANOVA showed significant main effects on production sequence [F(5,105) = 5.12, P < 0.0001] and species [F(1,21) = 7.65, P = 0.012]. In addition, the production sequence × species [F(5,105) = 8.66, P < 0.0001] and the modality × species [F(1,21) = 11.08, P < 0.003] interactions showed significant effects. Overall, this analysis indicates that the temporal variability was not homogeneous across the six taps, with a systematic decrease after the first tap and an increase for the last tap in both species. Furthermore, there was a significant difference in SD production sequence between species, in which human subjects showed a more stable pattern of temporal performance across the synchronization and continuation, whereas monkeys showed a stepped increase in variability in the continuation phase. It is appropriate to mention here that due to the observed sequence effects on the performance variability, we eliminated the first interval of the synchronization and the last interval of the continuation phases from the computations of temporal variance for the following slope analysis.
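The sequence-position analysis and the trimming step described above can be sketched in Python. The function names are hypothetical, each trial is assumed to be a list of six IRIs (three from synchronization followed by three from continuation), and the sample SD is an assumption.

```python
from statistics import stdev


def sd_by_position(trials):
    """SD across trials at each of the six sequence positions.

    trials: list of per-trial lists, each holding six IRIs in production order.
    """
    return [stdev(position) for position in zip(*trials)]


def trimmed_intervals(trial):
    """Drop the first synchronization and the last continuation interval.

    Returns (synchronization_intervals, continuation_intervals) as kept for
    the slope analysis.
    """
    sync, cont = trial[:3], trial[3:]
    return sync[1:], cont[:-1]
```

After trimming, each trial contributes two synchronization and two continuation intervals to the variance estimates used in the slope analysis.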


This analysis assumes that the overall variability in a timing task can be decomposed into variability associated with timing mechanisms and variability resulting from duration-independent processes. The slope method uses a linear regression between the variability and the squared interval duration to arrive at a generalized form of Weber's law (Eq. 1). The resulting slope is associated with the time-dependent process, since it captures the scalar property of interval timing. The intercept is related to the time-independent component, which is the fraction of variance that remains similar across interval durations and is associated with sensory detection and processing, decision making, memory load, and/or motor execution (Ivry and Hazeltine 1995).

In Fig. 3 the variance (means ± SE) is plotted against the square of the intervals produced, where it can be noticed that the variance increased linearly as a function of the interval produced. Table 1 summarizes the regression results for individual participants. Large differences are evident across conditions in the slope, intercept, and proportion of variance accounted for by the models (R2). To characterize these differences, we performed a set of separate ANOVAs, where the dependent variables were the slope, the intercept, or the R2, and where species was used as between-subjects variable and modality and phase as within-subjects variables. In the case of the time-dependent component (slope), the ANOVA revealed significant main effects of species [F(1,21) = 25.75, P < 0.0001] and phase [F(1,21) = 22.36, P < 0.0001], but not of modality [F(1,21) = 2.01, P = 0.17]. In addition, a significant species × phase interaction was found [F(1,21) = 39.59, P < 0.0001]. This analysis revealed one of the most important findings of the present study—although the slope during synchronization was similar between species, the slope in the continuation phase decreased slightly in human subjects but increased dramatically in monkeys. Indeed, no significant species differences in slope were found during synchronization [F(1,44) = 1.16, P = 0.29], the slope in monkeys was significantly larger in continuation than synchronization [F(1,10) = 17.82, P = 0.002], and a marginal difference was found in the slope of human subjects between the two phases [F(1,78) = 2.83, P = 0.09]. Furthermore, these results show that the modality of the interval marker did not play an important role in modulating the slope. Thus when multiple intervals were internally timed and produced, it seems that the temporal information processing was much more efficient in human subjects than that in monkeys.

Fig. 3.

Variance (mean ± SE) plotted as a function of the produced duration squared (mean ± SE) in the MIT during synchronization (gray line, open circles) and continuation (black line, closed circles), for human subjects (top) and monkeys (bottom), and the auditory (left) and visual (right) modalities. The straight lines correspond to the best linear fittings.

Table 1.

Slope analysis statistics for the multiple-interval task

The ANOVA on the intercept did not show significant main effects. Only the species × phase [F(1,21) = 15.38, P = 0.001] interaction reached significance. A note of caution regarding the slope model is in order here. Although the time-dependent (slope) and time-independent (constant) components are theoretically independent in the regression model, noise generally produces covariation between these two measures. For example, if the variability for the longest duration is overestimated, this will likely increase the slope and decrease the intercept.

Finally, the ANOVA on the goodness of fit showed only a significant species × phase [F(1,21) = 8.6, P = 0.008] interaction. Monkeys showed greater R2 values for the continuation, whereas human subjects showed the inverse effect, i.e., greater R2 values in the synchronization.


The constant error can have positive and negative values, zero reflecting perfect performance. Thus over- and underestimation are associated with positive and negative constant errors, respectively. Figure 4 shows that this variable was negative across all conditions and species. The ANOVA results for constant error showed significant main effects for species [F(1,21) = 18.1, P < 0.0001], interval duration [F(4,84) = 16.2, P < 0.0001], and phase [F(1,21) = 12.6, P = 0.002], and marginal main effects on modality [F(1,21) = 3.62, P = 0.07]. In addition, significant effects were found on species × phase [F(1,21) = 12.8, P = 0.002], interval duration × phase [F(4,84) = 28.6, P < 0.0001], and interval duration × species [F(4,84) = 11.1, P < 0.0001] interactions. These results indicate the following: 1) human subjects were more accurate than monkeys across conditions; 2) constant error increased in the continuation phase, particularly in monkeys; 3) there was a decrease in constant error as a function of interval duration across species and task phases; and 4) both species had the tendency to be more accurate in the auditory condition, producing shorter intervals in the visual condition.

Fig. 4.

Constant error (mean ± SE) as a function of target intervals for all conditions in the MIT. The horizontal line at zero represents perfect accuracy. The straight lines correspond to the best linear fittings; however, for the continuation phase in monkeys the interval of 1,000 ms was excluded from the regression analysis. All the other conventions are the same as in Fig. 3.

Besides the scalar property of interval timing, psychophysicists have demonstrated that short durations are overestimated and long ones are underestimated (Jones and McAuley 2005; McAuley and Miller 2007). This finding, first noted by Vierordt (1868; see Jones and McAuley 2005; McAuley and Miller 2007), implies that there is an intermediate value with no constant error, also termed the indifference point. Figure 4 shows that the constant error (means ± SE) has a clear tendency to decrease as a function of the produced interval in the MIT across conditions. However, only in the auditory condition for human subjects did the constant error show positive values for short durations and negative values for long intervals in the continuation. Monkeys in both modalities and task phases, as well as human subjects in both task phases in the visual condition, underestimated intervals across the range of durations tested. Thus no indifference point could be determined from these data. In addition, the linear regression models for the constant error as a function of the interval produced, shown in Fig. 4, revealed larger negative slopes for continuation than those for synchronization for both species and marker modalities. It is important to clarify that the interval of 1,000 ms was considered an outlier in the continuation phase of monkeys and was not included in the regression models for the auditory and visual marker conditions (see Fig. 4). Furthermore, the synchronization slopes were similar between the two species, but were larger in the monkeys than those in human subjects during the continuation phase.

These findings suggest that monkeys show a predisposition to produce shorter intervals, particularly for longer durations, since they show clear difficulties in withholding their responses. Nevertheless, it seems that the mechanism for temporal processing shows similar “accuracy fingerprints” in both primates across task phases and modalities.

Single-interval task (SIT)

To properly perform the single-interval task, subjects needed to store in memory a representation of interval duration for a relatively long time. This representation, acquired during the instruction period, was used to produce two consecutive taps after a go signal in the execution period. Thus SIT has memory requirements different from those of the MIT; moreover, only one interval is produced, which eliminates the cyclical component of the previous task.


Figure 5 shows that the variance also increased linearly as a function of the square of the interval produced in the SIT. It is clear that the temporal variability was similar between species and modalities, although there was a decrease in SD for the auditory condition in human subjects. The corresponding ANOVA showed significant main effects only for interval duration [F(3,63) = 14.4, P < 0.0001].

Fig. 5.

Variance (mean ± SE) as a function of the square of the produced interval for all conditions in the single-interval task (SIT) for human subjects (top) and monkeys (bottom), and the auditory (left) and visual (right) modalities. The straight lines correspond to the best linear fittings.


Table 2 shows the slope, intercept, and R2 for human subjects and monkeys and for both sensory marker conditions in this task. Again, separate ANOVAs were carried out, using the slope, the intercept, or the R2 as dependent variables and the species and modality as factors. Remarkably, no significant effects were detected for any of the tested variables. These results indicate that the scalar property was followed in both species during the SIT. Importantly, the analysis showed that the time-dependent component involved in single-interval production was similar between human subjects and monkeys.
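
The slope analysis can be sketched as an ordinary least-squares fit of variance against the squared interval. The Python example below is a minimal illustration under that assumption and is not the analysis code used in the study; the function name slope_analysis and the variance values are hypothetical.

```python
def slope_analysis(intervals_ms, variances_ms2):
    """Fit variance = slope * interval**2 + intercept by least squares.

    Returns (slope, intercept, r_squared).
    """
    x = [t ** 2 for t in intervals_ms]   # regressor is the squared interval
    y = list(variances_ms2)
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared


# Hypothetical variances (ms^2) generated from a known slope and intercept,
# so the fit should recover them; these are NOT values from Table 2.
intervals = [450, 600, 800, 1000]
variances = [0.002 * t ** 2 + 400.0 for t in intervals]

slope, intercept, r2 = slope_analysis(intervals, variances)
print(slope, intercept, r2)
```

In this reading, the slope estimates the time-dependent component of variability, and the intercept is usually taken to collect variance that does not change with interval duration.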

Table 2.

Slope analysis statistics for the single-interval task


Just as in the MIT, the constant error was smaller in human subjects than that in monkeys during the SIT across marker modalities (Fig. 6). The corresponding ANOVA showed significant main effects on species [F(1,21) = 5.3, P = 0.032] and interval duration [F(3,63) = 10.57, P < 0.0001], as well as on the species × interval duration interaction [F(3,63) = 7.17, P < 0.0001]. Thus these findings support two of the results of the MIT—that there was a decrease in constant error as a function of the target interval and that human subjects were more accurate than monkeys.

Fig. 6.

Constant error (mean ± SE) as a function of the produced intervals for all conditions in the SIT. The horizontal line at zero represents perfect accuracy. The same conventions as in Fig. 5.

The negative slope in the linear fittings of Fig. 6 was larger in monkeys (auditory [−0.049] and visual [−0.64]) than that in human subjects (auditory [−0.019] and visual [−0.030]). As in the MIT, the 1,000-ms interval was excluded from the linear regression in both modality conditions in the SIT for monkeys because it was a clear outlier.

Comparing single- and multiple-interval production performance

The existence of a central timing mechanism implies that temporal performance should be similar in different behavioral contexts. Thus the main question was whether the slope analysis would show similar time-dependent components in the MIT and the SIT. An ANOVA was performed with the slope as dependent variable, species as between-subjects variable, and task (continuation in MIT vs. SIT) and modality as within-subjects variables. The results revealed no significant main effects on task or modality and only a marginal main effect on species [F(1,21) = 3.57, P = 0.073]. In addition, the task × species interaction was significant [F(1,21) = 5.64, P = 0.027]. Thus these results underscore the differences in temporal performance between the two primate species. The results in monkeys support the notion of a partially overlapping mechanism for temporal performance in the two production tasks, showing similar slopes in the MIT continuation and SIT. In contrast, human subjects showed a smaller slope in the MIT continuation than that in the SIT, confirming previous reports that the presentation of multiple intervals confers some advantages on timing precision (Hazeltine and Ivry 1995; Merchant et al. 2008a,c; Miller and McAuley 2005).

As a final point, it is interesting to note that the accuracy patterns in the SIT and MIT shared some common properties, the most important of which are 1) the overall underestimation of intervals in both species; 2) the larger negative slope in monkeys than that in humans, with large underestimation of 1,000 ms in monkeys; and 3) the production of shorter intervals in the visual than in the auditory condition.


This is the first detailed comparison of the psychometric performance between human subjects and rhesus monkeys during interval production. Six main results were obtained in the present study. 1) In both primate species, the variability in time production during the MIT and SIT increased as a function of the mean length of the produced interval, following the scalar property. 2) There was a general underestimation of time that increased as a function of the interval duration in both species. 3) During the production of single intervals or the production of multiple intervals cued by a sensory metronome, the monkeys' timing variability was similar to that of human subjects. 4) During the continuation phase of the MIT, human subjects showed a decrease, whereas monkeys showed an increase, in variability with respect to the synchronization phase. 5) In both tasks, human subjects showed greater accuracy and less temporal variability in the auditory than in the visual marker condition, an effect that was not evident in monkeys. 6) In contrast to human subjects, monkeys did not synchronize their tapping to the sensory metronome during the MIT.

The scalar property, which is a form of Weber's law, is a ubiquitous feature of interval timing. It has been observed in many timing tasks and species (Allan 1998; Church et al. 1994; Fetterman and Killeen 1992; Gibbon et al. 1997; Merchant et al. 2008c; Penney et al. 2008). In addition, the scalar property is not followed by subjects with timing deficiencies, such as those with Parkinson's disease or cerebellar patients (Artieda et al. 1992; Harrington et al. 1998; Merchant et al. 2008a; Pastor et al. 1992; Spencer et al. 2003). Therefore our results on the rhesus monkey indicate that the neural timing machinery possesses functional properties that are phylogenetically conserved. Indeed, due to the behavioral, anatomical, and functional similarities between humans and macaques, the present findings support the rhesus monkey as a good animal model for the study of time production neurophysiology. Nevertheless, as we discuss in the following text, some precautions should be followed when extrapolating the neural underpinnings of temporal processing from macaques to humans.
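
In symbols, and assuming the generalized form of Weber's law that is commonly used in slope analyses, the scalar property can be written as follows. Here T is the produced interval, \sigma_T its standard deviation, k the Weber fraction, and c a duration-independent variance term; these symbols are introduced here for illustration only and do not appear in the original text.

```latex
% Scalar property: the SD of produced intervals grows in proportion to T,
% so the coefficient of variation (Weber fraction) is constant.
\sigma_T = k\,T \quad\Longrightarrow\quad \frac{\sigma_T}{T} = k

% Generalized form: variance is linear in T^2, with slope k^2 and intercept c.
\sigma_T^{2} = k^{2}\,T^{2} + c
```

Under this form, a linear increase of variance with the squared interval is the signature of the scalar property.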

The slope analysis revealed that the time-dependent component of the total performance variability (the slope) was similar between species during the SIT and during MIT synchronization. These findings suggest, first, that both primate species have a similar internal timing mechanism when the passage of time needs to be quantified for only one interval. Indeed, rhesus monkeys have practically the same abilities as those of humans in a large number of sensorimotor tasks, such as reaching (Georgopoulos et al. 1982; Merchant et al. 2004a; Naselaris et al. 2006), categorizing and discriminating stimuli (Britten et al. 1992; Fortes et al. 2004; Hernandez et al. 1997; Merchant et al. 1997; Romo et al. 1996), and anticipatory pursuit (Heinen et al. 2005; Janssen and Shadlen 2005; Kowler 1989). Furthermore, the interception skills of monkeys are as good as, or even better than, those of human subjects (Merchant et al. 2003). This could explain in part the similar temporal performance of both species during the SIT, if we consider that subjects compute the time-to-contact of the target for a successful interception (Merchant and Perez 2009). The picture is more complex in the MIT synchronization, since both species showed similar slopes, although monkeys did not synchronize their tapping. These results suggest that some but not all of the neural processing involved in the stimulus–response cycles during synchronization may be shared between both species.

On the other hand, the slope in the MIT continuation phase decreased in human subjects but increased in monkeys compared with the synchronization phase. The slope decrease in humans corroborates previous studies in which corrective processes that maintain synchronization do so at the cost of increased variability of interresponse intervals (Kolers and Brewster 1985; Madison 2001; Semjen et al. 2000). This phenomenon not only suggests that the human timing mechanism benefits from the cyclical component of the MIT (Ivry and Hazeltine 1995; Merchant et al. 2008b,c), but it also suggests that this timing mechanism does not have to carry out phase corrections when working independently of external sensory cues (Repp 2005; Wing 1977). On the other hand, the fact that the variability of the time-dependent component is significantly larger during the continuation than in the synchronization phase in monkeys (see Table 1) suggests that the internal timing machinery in macaques is not built to produce multiple consecutive intervals. It is plausible to assume that in the rhesus' natural repertoire of temporal behaviors, there is no need to execute multiple and precisely timed intervals, even if their internal timing mechanism is quite capable of measuring and producing durations of individual events. On the contrary, human subjects often execute multiple intervals during speech, music, and dance (Janata and Grafton 2003; Phillips-Silver and Trainor 2007; Thomson and Goswami 2008). Most of these complex human behaviors include auditory cues to process temporal information, which could be associated with the smaller temporal variability and better accuracy during the MIT in the auditory than that in the visual marker condition, an effect that has been well documented in the literature (Grondin et al. 1996; Merchant et al. 2008c; Repp and Penel 2002; Wearden et al. 1998).
The fact that auditory signals are timed with greater precision and judged longer than equivalent duration visual signals is readily apparent in healthy children (5–8 yr old), as well as young and older adult human participants (Droit-Volet et al. 2007; Lustig and Meck 2001; Penney et al. 2000). In contrast, these auditory/visual modality differences are less pronounced and more dependent on the level of training and feedback in rodents (Cheng et al. 2008; Meck 2005).

The large monkey deficiencies in learning and executing the MIT during the continuation phase, the fact that they did not synchronize their tapping to the metronome, and the lack of preference for the auditory modality all strengthen the idea that temporal underpinnings in monkeys cannot deal primarily with the production of multiple intervals, in part because vocalizations in macaques do not have a complex temporal structure (Ghazanfar and Logothetis 2003). Indeed, it has been suggested that the ability to synchronize motor behavior to predictive auditory cues over a wide range of tempi is present not only in humans but also in parrots that are able to perform vocal mimicking behavior (Patel et al. 2009; Schachner et al. 2009). In contrast, nonhuman primates cannot entrain their motor behavior (Schachner et al. 2009). Thus synchronization may have played an important role in the evolution of music and even of language (Merker et al. 2009). Needless to say, the many months of monkey training in the MIT probably improved the temporal processing capabilities of the timing neural network, as recently reported in human auditory cortex after music training (Musacchia et al. 2007). Nevertheless, a general alternative interpretation is that the difference in performance during the MIT between human subjects and monkeys could be due to nontemporal factors, such as memory, attention, and reward expectancy. Indeed, the species differences observed in the continuation phase could be due to the more developed working memory and/or attention systems in humans.

Two alternative mechanisms have been proposed as the neural substrate of interval timing on the scale of hundreds of milliseconds (Ivry and Schlerf 2008; Mauk and Buonomano 2004): a centralized mechanism that processes temporal information in a multimodal fashion and across perceptual and motor timing tasks; and multiple mechanisms that involve a specific and independent neural circuit for different timing behaviors. Thus similar Weber fractions across timing contexts and durations (Getty 1975; Gibbon et al. 1997; Ivry and Hazeltine 1995), significant intersubject correlations in timing variability between temporal tasks (Keele et al. 1985; Merchant et al. 2008c; Robertson et al. 1999), and generalization of timing learning among modalities, stimulus locations, and between the perception and production of time intervals (Bartolo and Merchant 2009; Karmarkar and Buonomano 2003; Meegan et al. 2000; Wright et al. 1997) justify the view of a unified mechanism of temporal processing in the subsecond range. In contrast, psychophysical and modeling work (Karmarkar and Buonomano 2007; Staddon and Higa 1999) has supported the notion of multiple independent clock mechanisms.

Recent neuroimaging and psychophysical studies have led to an intermediate hypothesis—that interval timing depends on a partially overlapping, distributed mechanism, where main-core cortical and subcortical timing structures, such as SMA, prefrontal and posterior parietal cortices, as well as the basal ganglia and the cerebellum, can be influenced differently by context-dependent information that is processed by the corresponding brain areas (Grondin 2001; Lewis and Miall 2003; Merchant et al. 2008b,c). For example, using the slope analysis, different multidimensional analyses, and the correlation of intersubject timing variability, we found that the sensorimotor processing (perception vs. production), the modality of the stimuli used to define the intervals (auditory vs. visual), and the number of intervals (one vs. four) had important effects on the temporal performance of human subjects (Merchant et al. 2008b,c). However, these analyses did not support the notion of a completely multiple-independent timing system, since clear but complex relations in the temporal variability were observed between tasks (Merchant et al. 2008b,c). Of course, neurophysiological experiments are needed to confirm or refute this hypothesis, but at least two different functional modes of a partially overlapping timing network can be suggested: 1) a mechanism in which the interaction of main-core timing structures is similar across contexts, but where the information exchange with nontiming areas induces the performance differences across different timing tasks; or 2) a timing neural network in which the association main-core timing areas (such as the posterior parietal cortex and/or prefrontal cortex) that have access to multimodal sensory information and can process motor planning and intentionality signals, process temporal information depending on the behavioral contingencies of the task.

Following the latter line of reasoning, we could suggest that the reported similarities in temporal processing between human subjects and monkeys depend on a conserved main-core circuit constituted by similar cortical and subcortical structures. This timing circuit, with the same basic anatomofunctional organization, may be modulated by species-specific neural structures that cause the time production differences observed between human subjects and monkeys. Due to the important timing bias toward auditory signals for the triggering of temporal performance in human subjects (Guttman et al. 2005; Kolers and Brewster 1985; Repp and Penel 2002), we propose that auditory association areas of the temporal and parietal lobe, as well as frontal structures including Broca's area, may be important human cortical nodes conferring the enormous temporal capabilities to Homo sapiens observed during the MIT continuation and during speech and musical perception and production.

We reported that the temporal underestimation increased as a function of interval duration, particularly in the monkey. These results indicate that the range of intervals tested in the MIT and SIT was not short enough in either species to reveal the indifference point and the overestimation of short durations, as stated initially by Vierordt (1868). However, the notion of indifference point has been strongly questioned recently by McAuley and colleagues. They found that a standard interval that was relatively small in comparison with the global temporal context tended to be overestimated, whereas the same standard interval that was relatively long in comparison with the global temporal context tended to be underestimated (Jones and McAuley 2005; McAuley and Miller 2007). This explanation is consistent with duration categorization judgments, where the point of subjective equality falls near the geometric mean of the anchor boundaries of the durations tested (Grondin 2001; Meck and Church 1983; Penney et al. 2008). On the other hand, the larger underestimation in monkeys may be due to the emphasis that these animals put on obtaining more reward per unit of time (Kim et al. 2008; Watson and Platt 2008). The constant error of monkeys showed minimal underestimation at 450 ms, which could be the result of the animals' tendency to minimize the total trial time to obtain reward. Adapting their preferred internal periodicity at the shortest interval could decrease variability and increase accuracy for intervals produced around the fundamental (or harmonic) preferred period. Actually, monkeys executed the tasks based on liquid reward as a motivational drive, receiving double reward if their performance accuracy was greater (see methods). With the purpose of decreasing the bias for the production of short-duration intervals, we adjusted the amount of fruit juice to be proportional to the trial duration.
Nevertheless, it is quite possible that the monkeys placed more emphasis on the production of shorter intervals. In fact, the large constant error at 1,000 ms is evidence that monkeys could not withhold their responses for long interval durations.

In conclusion, the present study indicates that the rhesus monkey is a good animal model for studying the neurophysiological basis of time production, especially for single intervals. However, only after a long training period were the macaques able to execute the continuation phase of the MIT, and the variability of temporal performance in this phase was substantially larger than that in human subjects. These behavioral differences could be rooted in both the social experience and learning associated with speech and music and the evolution of neural structures devoted to these behaviors in the human auditory system. Such areas could confer the temporal abilities needed to produce multiple and complex interval sequences. In contrast, the well-known similar spatial abilities of both primates probably depend on the anatomofunctional commonalities of their visual system.


This work was supported in part by Programa de Apoyo a Proyectos de Investigación e Innovación Tecnológica Grant IN206508-19, Fogarty International Research Collaboration Award Grant TW007224-01A1, and Consejo Nacional de Ciencia y Tecnología de México Grant 47170.


We thank B. Repp and two other anonymous reviewers for enlightening comments on the initial versions of the manuscript, R. Ivry for fruitful observations on this work, D. Pless for proofreading the manuscript, R. Paulín for technical assistance, and the staff of the graduate program in biomedical sciences of the Universidad Nacional Autónoma de México.


  • 1 The online version of this article contains supplemental data.

