Physiological Insights Into the Social-Context-Dependent Changes in the Rhythm of the Song Motor Program

Brenton G. Cooper, Franz Goller


Precisely timed behaviors are central to the survival of almost all organisms. Song is an example of a learned behavior under exquisite temporal control. Song tempo in zebra finches (Taeniopygia guttata) is systematically modified depending on social context. When male zebra finches sing to females (directed song), the song is produced with a faster motor pattern than when they sing in isolation (undirected song). We measured heart rate and air sac pressure during directed and undirected singing to quantify motivation levels and respiratory timing. Heart rate was significantly higher when male birds sang to females and was negatively correlated with song duration. The change in song tempo between directed and undirected song was accounted for by varying the duration of vocal expiratory events, whereas the duration of silent inspirations was unchanged. Song duration increased with repeated singing during directed song bouts, which was caused by a uniform increase in the duration of both expirations and inspirations. These results illustrate the importance of motivational state in regulating song tempo and demonstrate that multiple timing oscillators are necessary to control the rhythm of song. At least two different neural oscillators are required to control context-dependent changes in song tempo. One oscillator controlling expiratory duration varies as a function of social context and another controlling inspiratory duration is fixed. In contrast, the song tempo change affecting expiratory and inspiratory duration within a directed bout of song could be achieved by slowing the output of a single oscillator.


Precise timing underlies many learned motor behaviors. Song in songbirds is an example of a highly stereotyped, learned behavior. In zebra finches (Taeniopygia guttata), song is reproduced with remarkable stereotypy. However, slight temporal variation occurs depending on the social context in which a bird sings (Jarvis et al. 1998; Sossinka and Böhner 1980; Zann 1996). Zebra finches sing faster when they sing to a female (directed song) than when they sing in isolation (undirected song). It has been suggested that temporal and acoustic variation allows for exploration of motor space, and such variability provides feedback necessary for reinforcement learning (Kao et al. 2005; Ölveczky et al. 2005). However, a physiological basis for the timing change has not been studied, and the precise nature of the timing change at the level of individual song elements has not been identified.

Song is a precopulatory behavior that is thought to signal male fitness (Nowicki et al. 2002). In zebra finches, song tempo and stereotypy could function to indicate male quality, and the speed of song delivery may also be related to the motivational state of the male (Jarvis et al. 1998). The coarse control of song tempo is dictated by the respiratory motor system. Sounds are produced during expiration, and silent periods of the song typically correspond to inspirations (Suthers et al. 1999). The detailed acoustic structure of the song is controlled by the bird's syrinx (avian vocal organ). Sound production is controlled by a discrete set of nuclei, which are collectively referred to as the motor pathway (Wild 1993, 1997). Song development, but not production, requires an intact anterior forebrain pathway (Bottjer et al. 1986; Scharff and Nottebohm 1991), which receives afferents from, and sends efferents to, the motor pathway (Iyengar and Bottjer 2002; Johnson et al. 1995; Luo et al. 2001).

Neural activity in the anterior forebrain pathway increases and is more variable when birds sing undirected songs compared with when they sing directed songs (Hessler and Doupe 1999; Jarvis et al. 1998; Kao et al. 2005). Additionally, removal of the output nucleus of the anterior forebrain pathway, the lateral magnocellular nucleus of the nidopallium (LMAN), abolishes context-dependent variability in the fundamental frequency of song syllables (Kao et al. 2005). It is likely that song tempo changes mediated by social context are controlled by the anterior forebrain connection from LMAN to the nucleus robustus archipallialis (RA) (Kao et al. 2005). Neural activity in RA controls the timing and acoustic structure of the song (Chi and Margoliash 2001). Neural projections from dorsal RA innervate respiratory centers and ventral RA innervates the syrinx via the tracheosyringeal branch of the hypoglossal nerve (Vicario 1991; Wild 1997). This distributed neural system controls sound production by timing and coordinating the activity of vocal and respiratory effectors, which in turn leads to the generation of the bird's song.

What is the physiological mechanism for context-dependent modifications of song tempo? If motivational state is related to song tempo, then autonomic responses should correlate with the timing of song. Neural changes in the anterior forebrain may then be induced by the autonomic response thereby leading to changes in song timing (Doupe et al. 2005). Motivational state, in this context, refers to the sexually motivated behavior of singing. To measure a physiological correlate of such a motivational state, we recorded heart rate in spontaneously singing zebra finches. Respiratory pressure patterns were recorded simultaneously to quantify duration of vocal expirations and silent inspiratory minibreaths during song. We found that motivational state was correlated with song tempo. Furthermore, a uniform duration change in all song elements was not observed. Context-dependent temporal variation occurred during vocal expirations but not during inspirations. These results link motivational systems with song tempo and illustrate the need for multiple timing patterns to control the different respiratory rhythms of song.


Animals and recording procedures

Eight male zebra finches, 150 to 250 days of age, were used in this study. They were given seed and water ad libitum and vegetables mixed with vitamins every other day. Animals were bred and housed in a flight aviary and maintained on a 14:10 h light:dark cycle. While birds were undergoing experimental manipulations, they were housed individually in small cages (32 × 23 × 30 cm) contained in a sound-attenuating box (61 × 49.5 × 49.5 cm). The back wall and two sides of the box were lined with acoustic foam. The front of the box remained open for presentation of female zebra finches. A Plexiglas top covered the box and contained a small opening that allowed wires from the bird to be routed to amplifiers and recording equipment adjacent to the bird's cage. A small elastic band was placed around the bird's thorax, and birds were tethered with a wire leading up and out of the box. The wire was attached to a counter-weighted balance arm positioned above the box, which allowed for free movement within the cage (Cooper and Goller 2004; Franz and Goller 2002). After birds were accustomed to being tethered and repeatedly sang directed song to a female presented in a separate cage, surgical procedures were initiated.

Surgical procedures

After a 1-h period of food and water deprivation, birds were anesthetized with isoflurane. Heart rate was recorded with custom constructed circular electrodes. One electrode was sutured to the bird's chest and one on the bird's back. These wires were routed subcutaneously to the center of the bird's back and then soldered to stronger wires that led up from the backpack on the bird and out of the bird's cage. Signals were amplified (500–1,000 times) and band-pass filtered (50 Hz high-pass and 3 kHz low-pass; Brownlee, Model 440; San Jose, CA). Air sac pressure was recorded by inserting one end of a small flexible cannula (silastic tubing, OD 1.65 mm, 6 cm in length) into a thoracic air sac through an opening in the body wall. The cannula was sutured to the rib cage, and glue (Nexaband) was applied to adhere the cannula to the skin and to seal the opening in the body wall. The opposite end of the cannula was attached to a pressure transducer (Fujikura FPM-02PG; Santa Clara, CA), which was mounted on the bird's back. The voltage output from the pressure transducer was amplified prior to recording (50 times, Hector Engineering). Song was recorded with an Audiotechnica AT8356 microphone placed ∼30 cm from the perch and centered in front of the bird's cage. The output from the microphone was amplified 100 times and high-pass filtered at 300 Hz (Brownlee, Model 440).

Experimental procedures

Heart rate, air sac pressure, and song were recorded simultaneously in three of the eight birds, and air sac pressure and the acoustic data were recorded from three additional birds. Data from the remaining two birds were excluded from the study because the voltage level of ambient pressure shifted between directed and undirected singing conditions. This shift in the DC voltage level can occur when fluid begins to fill the cannula or if the cannula moves in the body during singing. In these cases, the voltage readings become unstable and are not suitable for detailed analyses. All data presented were collected from the six birds. In three of the six birds, data were recorded onto tape using a multichannel digital recorder (20 kHz bandwidth, TEAC-135T), and later digitized (Data Translation 2821G, 40 kHz sample rate; Marlboro, MA) using Signal v 3.0 software (Engineering Design, Berkeley, CA). The data from the remaining three birds were recorded directly onto computer (PCI-MIO 6056E, 16 bit, 25-kHz sample rate; National Instruments, Austin, TX) using Avisoft Recorder software (Avisoft Bioacoustics, Germany). Data were saved to disk whenever air sac pressure exceeded a user-defined threshold for longer than 15 ms.
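The threshold-triggered saving rule described above (pressure above a user-defined threshold for longer than 15 ms) can be illustrated with a minimal sketch. This is not the Avisoft Recorder implementation; the function name `find_trigger_events` and all argument names are hypothetical, and the trace is assumed to be a plain list of voltage samples.

```python
def find_trigger_events(pressure, sample_rate_hz, threshold, min_duration_s=0.015):
    """Return (start, end) sample indices of runs in which the pressure
    stays above `threshold` for longer than `min_duration_s`.
    `end` is exclusive. Names and defaults are illustrative assumptions."""
    min_samples = min_duration_s * sample_rate_hz
    events = []
    run_start = None
    for i, value in enumerate(pressure):
        if value > threshold:
            if run_start is None:
                run_start = i  # a supra-threshold run begins here
        elif run_start is not None:
            if i - run_start > min_samples:
                events.append((run_start, i))
            run_start = None
    # Close a run that is still open at the end of the trace.
    if run_start is not None and len(pressure) - run_start > min_samples:
        events.append((run_start, len(pressure)))
    return events
```

In such a scheme, a brief pressure transient is ignored while a sustained pressurization, as occurs during song, would trigger a save.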

Birds sang 1–3 days after the cannula was inserted. At this time, birds were recorded for 2 h during directed singing and 3 h during undirected singing episodes. Because birds sing more bouts when in the directed song condition, we recorded undirected song for an additional hour to equalize the number of song bouts recorded in both conditions. Birds were tested between 10:00 and 15:00 h, with recording commencing 1 h after the morning feeding. In some birds, directed song bouts were recorded prior to the undirected session, and in other birds, this order was reversed. This procedure ensured that circadian changes did not influence song tempo. Further, previous work has also shown that song tempo can be modified independently of circadian factors (Botas et al. 2001). Directed song was defined as a song that occurred when a female bird was present in the front of the male bird's cage. We observed birds during segments of the recording session and confirmed that directed singing occurred. Directed song bouts were characterized by dance-like movements toward the female, erected feathers, as well as more frequent occurrences of beak wiping at the perch between song bouts (Williams 2001; Zann 1996). Directed songs are usually sung with a faster tempo than those recorded in the undirected condition (see Fig. 1). Undirected song occurred during the same day as the directed song, but with no females present in the room. In the undirected singing condition, the acoustic environment included other male zebra finches. Distinct differences in heart rate provided further confirmation of the differences between directed and undirected singing conditions (Fig. 2).

FIG. 1.

Example of the physiological measures recorded, song motor patterns and data analysis. A, top: spectrogram is displayed that shows the frequency/time characteristics of the zebra finch song bout. The motif variability that can be exhibited by zebra finches is displayed. The last 2 syllables of the motif (indicated by the asterisk) are sometimes excluded from the song. For analyses of motif duration as a function of sequence in the bout, these syllables were omitted from the analysis. Below the spectrogram is the microphone recording (M), the heart rate (HR) during the bout of song, and the air sac pressure (P). Heart rate data were analyzed as beats/s. The presong heart rate was measured during a 1-s interval before the onset of song (solid line, presong), and also during the 1st motif of the bout. The solid line running through the air sac pressure trace indicates ambient pressure. Supra-atmospheric pressurization corresponds to expiration and subatmospheric pressurization corresponds to inspiration. B: expanded view of the air sac pressure pattern for the bird's motif during directed (black) and undirected (red) song. Each expiratory pulse (EP), corresponding to 1 or 2 syllables, of the bird's motif is numbered inside the pressure trace. Below the air sac pressure trace is a scatter plot showing 2 of the 4 parameters measured. Duration is plotted on the ordinate and average amplitude is plotted on the abscissa (see methods). Each square corresponds to an individual repetition of an EP (bottom left) or an inspiratory pulse (IP; bottom right), and the EPs and IPs from all of the songs recorded are displayed. Black squares correspond to directed song, and red squares are EPs during undirected song. There is a decrease in the duration of most EPs during directed song but the same temporal shift is not present in the duration of IPs.

FIG. 2.

Heart rate is correlated with changes in social context. A: number of introductory notes in the directed singing condition was twice that of undirected song in this study, and the heart rate of the birds was increased by >2 beats/s (B). The heart rate prior to singing was negatively correlated with the duration of the first motif (data presented are standardized scores), and the average heart rate during the first motif was also negatively correlated with duration of the first motif (C and D). These data suggest that the enhanced motivational state that accompanies directed singing is related to the decrease in motif duration.

Twenty to 30 bouts of directed and undirected song were typically collected during each recording session. Bouts of song were visually identified from all of the files collected during the recording session and were selected for data analyses.

Data analysis

Heart rate.

Each heart beat was identified by the peak of the voltage change recorded on the electrodes. Heart rate (beats/s) was quantified during a 1-s period prior to the onset of introductory notes (presong heart rate) and during the first motif of the song bout (song heart rate). Heart rate during the subsequent motifs was not analyzed because motivational state might be confounded with respiratory demands of singing for an extended period of time (Franz and Goller 2003).
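As a rough illustration of this measurement, the sketch below marks each beat at a local voltage peak and counts beats in the presong and first-motif windows. The detection threshold, function names, and the representation of the trace as a list of samples are assumptions made for the example, not details taken from the study.

```python
def detect_beats(ecg, sample_rate_hz, threshold):
    """Return beat times (s) as local maxima of the voltage trace that
    exceed `threshold` (a hypothetical, user-chosen value)."""
    times = []
    for i in range(1, len(ecg) - 1):
        if ecg[i] > threshold and ecg[i] > ecg[i - 1] and ecg[i] >= ecg[i + 1]:
            times.append(i / sample_rate_hz)
    return times


def heart_rate(beat_times, start_s, end_s):
    """Beats per second within the window [start_s, end_s)."""
    n_beats = sum(1 for t in beat_times if start_s <= t < end_s)
    return n_beats / (end_s - start_s)


def presong_and_song_rate(beat_times, intro_onset_s, motif_onset_s, motif_offset_s):
    """Presong rate: the 1-s window before the first introductory note.
    Song rate: the duration of the first motif of the bout."""
    presong = heart_rate(beat_times, intro_onset_s - 1.0, intro_onset_s)
    song = heart_rate(beat_times, motif_onset_s, motif_offset_s)
    return presong, song
```

Because the first motif is usually shorter than 1 s, the song rate in this sketch is based on fewer beats than the presong rate and is correspondingly coarser.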

Quantification of all expiratory and inspiratory pulses (EPs and IPs) during song bouts.

Air sac pressure provides a distinctive signature for each syllable in the bird's motif; accordingly, we used air sac pressure to identify both bouts of song and the motifs within a song (Fig. 1). Each EP was defined as a continuous period of supra-atmospheric pressure. IPs were defined as a continuous period of subatmospheric pressurization. The IP preceding the first syllable of the song was omitted from analysis because it frequently was not as deep or as structured as the IPs occurring during song. Similarly, the last IP of the bout was omitted because this inspiration is highly variable, ranging from a typical minibreath, to quiet respiratory inspiration, or a period of apnea (Franz and Goller 2003). Thus in all analyses there are fewer IPs than EPs. The DC level of the voltage output of the transducer was shifted such that ambient pressure corresponded to 0 volts. In the data output and graphical display, all EPs are represented as positive voltage values and IPs are negative voltage values.

The onset and offset of every bout of song during directed and undirected conditions was visually identified and all of the EPs and IPs within the bout were automatically identified by a custom-written software program (LabView; Austin, TX). Prior to analysis, air sac pressure data were low-pass filtered (400 Hz). This filter removed acoustic oscillations that are transmitted through the body and recorded as pressure oscillations by the transducer. After filtering the pressure signal, each EP and IP within the song bout was segmented from the data stream, and four parameters were measured to quantify the pattern of EP pressurization: duration, average voltage, peak voltage, and coefficient of variation of amplitude (CV = SD * 100/mean). After birds reach phonatory pressure levels, each EP has a characteristic pattern of temporal modulation, which corresponds to the unique acoustic properties of the syllable. CV of amplitude was calculated to quantify this temporal variation in the pressure waveform. The first and last 5 ms of the EP were omitted from the calculation because the rising and falling phase of the EP are common across all EPs. CV, rather than SD, was used because the mean amplitude of air sac pressure changed as a function of directed/undirected singing conditions in some birds. For all IPs in the song, three parameters were measured: duration, average voltage, and peak voltage (maximally negative voltage value). CV of amplitude was not calculated for IPs because the temporal pattern of pressurization is similar across almost all IPs (Fig. 1B). All of the repetitions of the EPs and IPs that were recorded in the experiment are displayed in a single plot (hundreds of repetitions of individual EPs and IPs are displayed in Figs. 1B, 3, and 4). This method for display and analysis was inspired by the procedures used by Tchernichovski et al. (2000) to analyze acoustic data.
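The segmentation and the four EP parameters can be sketched as follows. This is not the custom LabView program; it assumes the trace has already been low-pass filtered and shifted so that ambient pressure is 0 V, it uses the sample standard deviation for the CV (the text does not say which SD was used), and all function and key names are hypothetical.

```python
from statistics import mean, stdev


def segment_pulses(pressure):
    """Split a zero-referenced pressure trace into EPs (values > 0) and
    IPs (values < 0). Returns (kind, start, end) tuples, end exclusive."""
    pulses = []
    start, sign = 0, 0
    for i, value in enumerate(list(pressure) + [0.0]):  # sentinel closes the last run
        current = (value > 0) - (value < 0)
        if current != sign:
            if sign != 0:
                pulses.append(("EP" if sign > 0 else "IP", start, i))
            start, sign = i, current
    return pulses


def pulse_parameters(samples, kind, sample_rate_hz, trim_s=0.005):
    """Duration, average voltage, and peak voltage for any pulse; for EPs,
    also the CV of amplitude (SD * 100 / mean) computed after dropping the
    first and last 5 ms (the rising and falling phases)."""
    params = {
        "duration_s": len(samples) / sample_rate_hz,
        "mean_v": mean(samples),
        # Peak is the most positive value for EPs, the most negative for IPs.
        "peak_v": max(samples) if kind == "EP" else min(samples),
    }
    if kind == "EP":
        trim = round(trim_s * sample_rate_hz)
        core = samples[trim:len(samples) - trim]
        if len(core) >= 2:
            params["cv_percent"] = stdev(core) * 100 / mean(core)
    return params
```

Each tuple from `segment_pulses` gives the slice of the trace to pass to `pulse_parameters`, for example `pulse_parameters(trace[start:end], kind, sample_rate_hz)`.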

Motif duration within bouts of song.

A song bout was defined as one or more introductory notes followed by one or more repetitions of the bird's motif. A new song bout was defined by an interval of ≥2 s between the end of a motif and the onset of introductory notes in the next song bout (Sossinka and Böhner 1980). If a new song bout was initiated after an interval of less than 2 s, these songs were not analyzed to quantify changes in motif duration within song bouts. The duration of each motif during each repetition within the bout was calculated by identifying the motif onset and offset from the pressure trace. The motif onset was identified by the first supra-atmospheric pressurization during the first EP of the motif. The end of the motif was identified as the return to ambient pressure after the last EP of the bird's motif. Three zebra finches had a consistent motif. The motif of the remaining three birds was more variable: one bird sang a different first syllable in the first motif of the bout, and in the remaining two birds, one or two syllables were not always sung at the end of the motif (Fig. 1A). In these three birds, the “canonical” motif was used to measure motif duration. The canonical motif was defined as the EPs and IPs that were always present at each of the motif repetitions within a song bout (Fig. 1A).

Duration of individual EPs and IPs.

During directed song, motif duration varies as a function of motif sequence within the bout, such that motifs sung later in the sequence tend to be longer in duration than earlier motifs (Chi and Margoliash 2001). To control for changes in duration as a function of motif sequence, the mean duration for each of the EPs and IPs in the first motif in the bout of undirected song was divided by the corresponding mean duration of the EP/IP in the first motif of directed song. This process was repeated for the second and, when possible, for the third motif of the bout. This ratio is referred to as the context-dependent ratio. A score >1.0 corresponds to a longer duration EP/IP during undirected song; a score <1.0 corresponds to a shorter duration EP/IP during undirected song.

How do individual EPs and IPs change in duration during the course of a song bout? For both directed and undirected song bouts, the average duration of an EP/IP in the second motif was divided by the duration of the same EP/IP in the first motif. This process was repeated for the third motif in the bout. This ratio is referred to as the within bout ratio score. Scores >1.0 indicate an increase in duration for an EP/IP during the second and third repetitions of the motif in a bout; conversely, scores <1.0 indicate a decrease in EP/IP duration.
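The two ratio scores defined above can be written out explicitly. In this sketch, durations are assumed to be stored in dictionaries keyed by a pulse label and the motif's position in the bout; that layout, the labels, and the function names are hypothetical choices for the example.

```python
from statistics import mean


def context_dependent_ratios(directed, undirected):
    """Undirected / directed mean duration, matched for pulse and motif
    position. Inputs map (pulse_id, motif_index) -> list of durations."""
    ratios = {}
    for key in directed:
        if key in undirected:
            ratios[key] = mean(undirected[key]) / mean(directed[key])
    return ratios


def within_bout_ratios(durations):
    """Mean duration in a later motif / mean duration in the first motif,
    computed separately for each pulse within one singing condition."""
    ratios = {}
    for (pulse_id, motif_index), values in durations.items():
        first = durations.get((pulse_id, 1))
        if motif_index > 1 and first:
            ratios[(pulse_id, motif_index)] = mean(values) / mean(first)
    return ratios
```

With made-up values, an EP averaging 100 ms in the first directed motif and 104 ms in the first undirected motif gives a context-dependent ratio of 1.04, that is, a 4% longer EP during undirected song.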

Statistical procedures

Heart Rate.

To determine if motivational state is related to song tempo change, the mean heart rate during presong was correlated with the duration of the first motif in a bout using Pearson's linear correlation (SPSS, v.11.0, Chicago, IL). Additionally, the mean heart rate during the first motif was correlated with the duration of the first motif. Motif duration varied substantially among the birds in this study. To allow comparisons between birds, the motif duration and heart rate from both directed and undirected songs for an individual bird were normalized using Z scores ([x_i − mean]/SD). The Z score transforms all the values such that the mean = 0, and a score of +1 or −1 corresponds to 1 SD above or below the mean, respectively. All correlation analyses were performed on these transformed scores. The number of introductory notes was counted for each directed and undirected song bout, and the heart rate was calculated during the first motif of directed and undirected song bouts. Between-groups t-tests were calculated to determine if the number of introductory notes and average heart rate changed significantly between directed and undirected singing bouts, with α ≤ 0.05 for all statistical comparisons.
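The standardization and correlation steps can be illustrated with a short sketch. The analysis in the study was run in SPSS; the code below is only an equivalent calculation under the assumption that the sample SD is used for the Z scores, and the function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def z_scores(values):
    """(x_i - mean) / SD for one bird's values (sample SD assumed)."""
    m, sd = mean(values), stdev(values)
    return [(x - m) / sd for x in values]


def pearson_r(x, y):
    """Pearson's linear correlation coefficient."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def pooled_correlation(birds):
    """`birds` is a list of (heart_rates, motif_durations) pairs, one per
    bird, each pooling that bird's directed and undirected bouts. Values
    are standardized within bird and then correlated across all birds."""
    hr_z, dur_z = [], []
    for heart_rates, motif_durations in birds:
        hr_z.extend(z_scores(heart_rates))
        dur_z.extend(z_scores(motif_durations))
    return pearson_r(hr_z, dur_z)
```

Standardizing within each bird removes between-bird differences in absolute motif duration and heart rate, so the pooled correlation reflects only how the two measures covary within individuals.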

Quantification of all EPs/IPs.

For all of the EPs in the data set, the mean for each parameter measured (duration, amplitude, peak amplitude, CV) for each bird was determined. The means of the three parameters measured for IPs were also calculated (duration, amplitude, peak amplitude). A paired t-test was calculated using the mean score from each bird for each parameter to determine if there was a significant change between directed and undirected song types.
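The paired comparison across birds reduces to a t statistic on the per-bird differences. The sketch below assumes one mean value per bird per condition and computes directed minus undirected, so a negative t indicates a smaller value during directed song; the function name is hypothetical, and the P value would still need to be read from a t distribution with the returned degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(directed_means, undirected_means):
    """Paired t statistic and degrees of freedom for per-bird means.
    Differences are taken as directed - undirected."""
    diffs = [d - u for d, u in zip(directed_means, undirected_means)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1
```

With six birds, the test has 5 degrees of freedom, which matches the t(5) reported for EP duration.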

Duration of Individual EP/IP.

A ratio of the EP and IP duration between undirected and directed song, the context-dependent ratio, was calculated (see preceding text). A score of 1.0 indicates no change in duration for the EP/IP between directed and undirected singing conditions. Similarly, a within bout ratio score was calculated. A score of 1.0 indicates that the EP/IP is the same duration in the second and third repetition of the motif as it is during the first motif. Values >1.0 indicate an increase in duration during undirected song (context-dependent ratio) or during subsequent repetitions of the motif (within bout ratio). A one-sample t-test with an expected score of 1.0 was calculated to determine if either the context-dependent ratio or the within bout ratio changed significantly.
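Testing a set of ratio scores against the expected value of 1.0 can be sketched as below. The function name is hypothetical and the example values are made up; converting t to a P value requires a t distribution with the returned degrees of freedom, which is not included here.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(ratios, expected=1.0):
    """One-sample t statistic and degrees of freedom for ratio scores
    tested against `expected` (1.0 means no change in duration)."""
    n = len(ratios)
    t = (mean(ratios) - expected) / (stdev(ratios) / sqrt(n))
    return t, n - 1


# Hypothetical ratio scores, all slightly above 1.0.
t_value, df = one_sample_t([1.02, 1.04, 1.06])
```

A positive t here means the ratios lie above 1.0 on average, i.e., longer durations during undirected song or in later motifs of a bout, depending on which ratio is tested.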

Are the context-dependent changes in song tempo related to the within bout changes in song tempo? The context-dependent ratio and the within bout ratio score are not directly comparable because the context-dependent ratio compares the EPs/IPs of the first motif of directed songs to the same EPs/IPs of the first motif of undirected songs. In contrast, the within bout ratio compares the duration of EPs/IPs in the first motif to the same EP/IPs produced during the second and third motifs of a bout. To account for this difference, an average context-dependent ratio was calculated for each EP/IP in the first and second motifs of directed and undirected song. This average context-dependent ratio was correlated with the within bout ratio score using a linear correlation analysis.

To determine if EPs/IPs within a bout changed duration similarly within directed and undirected song bouts, the within bout ratio for the EPs/IPs in the directed condition was correlated with the within bout ratio from the same EPs/IPs during the undirected condition. A linear correlation analysis was used for these comparisons.


Motivational state is correlated with song duration

Acoustic output, respiratory pressure, and heart rate were recorded simultaneously in three birds during directed and undirected song (Fig. 1A). The number of introductory notes was higher in directed song bouts (t(99) = −9.28, P < 0.001; Fig. 2A), and the mean presong heart rate was higher during the directed singing condition than it was during the undirected condition (t(99) = −7.72, P < 0.001; Fig. 2B). The mean presong heart rate in the directed condition was 12.66 beats/s (means for each bird: 13.80, 12.68, 11.58) and in the undirected condition it was 10.23 beats/s (means for each bird: 10.98, 9.55, 10.18). Presong heart rate was negatively correlated with the duration of the first motif (r = −0.46, n = 101, P < 0.001, Fig. 2C). Mean heart rate during the first motif was also negatively correlated with motif duration (r = −0.59, n = 101, P < 0.001, Fig. 2D). In sum, there are clear physiological differences between the directed and undirected singing conditions used in this study and level of motivation is associated with song tempo.

Social context selectively modifies the duration of vocal expirations

EPs produce one or more acoustic syllables and IPs typically correspond to intersyllable intervals. The EPs and IPs were analyzed for changes in duration, average amplitude, peak amplitude, and CV of amplitude between directed and undirected conditions (see methods). All EPs from the motifs of songs recorded during directed and undirected song for all birds are displayed in Fig. 3 (black squares, directed; red squares, undirected). The only consistent change across birds was a decrease in EP duration during directed song bouts (black squares shifted to the left, Fig. 3; Table 1; t(5) = −4.74, P < 0.005). Figure 3 also suggests that the duration decrease does not necessarily occur in every EP of a bird's motif (see individual EP/IP analysis in the following text).

FIG. 3.

EP duration decreases during directed song. Data from the 6 birds in this study are displayed. EPs form clusters that correspond to the syllables of each bird's motif. Following the convention of Fig. 1, each data point represents 1 occurrence of an EP. All of the EPs recorded during directed and undirected song are displayed (black = directed song; red = undirected song). Duration is plotted on the ordinate and CV of amplitude is plotted on the abscissa (CV Ampl). EP duration is shorter during directed song bouts compared with undirected song bouts.

TABLE 1.

Social context selectively affects the duration of expiratory pulses

The duration change between directed and undirected song was the only parameter that changed in every bird (Table 1). In some birds, average amplitude, peak amplitude, or CV of amplitude changed between the two conditions. Because these changes were not consistent for every bird, more detailed analyses were only performed for EP and IP changes in duration.

In contrast to the EPs, consistent changes were not observed for the duration of IPs when social context was manipulated (Fig. 4). Visual examination of the cluster plots for IPs does not reveal a similar duration shift to that observed for EPs (Fig. 4), and the mean duration for the IPs did not change consistently across all birds in the study (Table 2). Additionally, social context did not consistently affect the amplitude or peak voltage of the IPs (Table 2). Thus unlike expirations, temporal characteristics of inspirations are not systematically modified by social context.

FIG. 4.

Directed singing does not systematically change the duration of IPs. The inspirations recorded from the birds during directed and undirected song are displayed (symbols as in Figs. 1 and 3). Duration is plotted on the ordinate and average amplitude is plotted on the abscissa. In contrast to EPs, a change in duration of IPs as a function of different social contexts was not observed for any of the birds.

TABLE 2.

Social context does not affect the characteristics of inspiratory pulses

Individual EPs, but not IPs, increase in duration

Consistent with previous work (Chi and Margoliash 2001), the duration of a bird's motif increased with motif sequence (Fig. 5A). The increase in motif duration as a function of motif order in the bout was evident for both directed and undirected songs. How does an individual EP/IP change between directed and undirected song? The mean duration of each EP and IP for the first motif of undirected song was divided by the mean duration of the corresponding EP/IP produced during the first motif of directed song. This process was repeated for the second and third motif in the sequence of the song bout. The ratio ensured that EPs and IPs were matched for motif sequence in the bout before comparing duration changes between undirected and directed song. An example of the waveforms from an EP of the first motif in the bout of directed and undirected song is displayed in Fig. 5B. The duration of the EP decreased during directed song. An example of IP waveforms that did not change in mean duration between conditions is also displayed (Fig. 5B).

FIG. 5.

Motif duration increases with sequence in the bout, and EP duration is systematically modified by social context. A: similar to previous work, as birds continue to sing the motif during a bout the song tempo decreases (duration data are converted to standardized scores). Directed (open circles) and undirected conditions (gray triangles) show similar increases in duration during the bout. B: example waveforms from an EP in both directed (black) and undirected song (gray) show a decrease in duration during directed song (top), whereas the IP duration does not change (bottom). These waveforms were selected from the first motif of all the bouts of directed and undirected song. The longest duration EP and IP are indicated by the dashed line. C: frequency histogram of the ratio scores for undirected EPs divided by directed EPs with motif sequence held constant (top). The line in the histogram marks the bin centered around 1.0 with a bin size of 0.02. Note that some EPs fall into the bin encompassing 1.0, which indicates that not all EPs increase in duration during undirected song. The bottom panel shows the frequency histogram for the same ratio analysis for the IPs (gray). Unlike EPs, average ratios for IPs did not differ significantly from 1.0.

The mean context-dependent ratio for EPs was 1.036 (t(62) = 10.50, P < 0.001; Fig. 5C, top). The line in the histograms indicates the bin encompassing a mean of 1.0. The EPs falling within a bin with the mean of 1.0 came from multiple birds, confirming that not all EPs were longer during undirected song than they were during directed song bouts. Thus a global timing change that equally affects the entire motif is not observed. In contrast to the EPs, the individual IPs did not systematically increase in duration. The mean context-dependent ratio for IPs was 1.0 (t(47) = −0.31, P = 0.76; Fig. 5C, bottom).

How do EPs and IPs change as a function of repeated singing in the bout? Both EPs and IPs increased in duration during the subsequent repetition of the motif within bouts of directed song (EPs: t(47) = 7.18, P < 0.001; Fig. 6, A and B, top. IPs: t(36) = 2.80, P < 0.01; Fig. 6A, bottom). This same pattern was largely true for motif repetition during undirected song bouts. However, only EPs increased significantly in duration (t(35) = 7.59, P < 0.001; Fig. 6B, top). The duration of individual IPs did not change significantly during the course of an undirected song bout (t(25) = 0.88, n.s.; Fig. 6B, bottom).

FIG. 6.

Change and variance in tempo within bouts of song. As birds repeatedly sing the motif during bouts of directed song, expiratory (top) and inspiratory (bottom) duration increases significantly (A). During repeated song bouts in the undirected condition, expirations increase their duration significantly (B, top). Although the pattern is similar for inspirations, the increase is not significant (B, bottom). This is likely due to the increased variance of inspirations during undirected song bouts (see following text). For expirations (C) and inspirations (D), there is a strong relationship between the song tempo increase within bouts of directed and undirected songs. The duration of individual EPs (E) and IPs (F) is more variable during undirected song than it is during directed song.

During the course of a song bout, both EPs and IPs increase in duration. In contrast, only EPs change duration between directed and undirected singing conditions. This suggests that there are different mechanisms underlying these two types of timing change. To further examine this issue, a mean context-dependent ratio was calculated for the EPs and IPs in the first two motifs of directed and undirected songs. This mean context-dependent ratio was correlated with the within bout ratio score for the same EP/IP (see methods for further description of the procedures). For EPs and IPs, there was no relationship between the context-dependent ratio and the within bout ratio for directed or undirected song bouts (directed song bouts, EPs: r = 0.13, n.s., n = 26; IPs: r = −0.01, n.s., n = 20; undirected song bouts, EPs: r = 0.09, n.s., n = 26; IPs: r = 0.27, n.s., n = 20). This pattern suggests that the context-dependent tempo change for an EP/IP is unrelated to the tempo change within a song bout.

Is the tempo change of an EP/IP during directed song related to the tempo change during undirected song? There was a significant relationship between the within bout ratio score for directed and undirected songs (EPs: r = 0.60, n = 26, P = 0.001; IPs: r = 0.79, n = 20, P < 0.001, Fig. 6, C and D). This relationship shows that the duration change within a bout of directed song for an EP/IP is similar to the duration change observed within a bout of undirected song.
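The correlations reported in this and the preceding paragraph are Pearson coefficients computed across individual EPs or IPs. The sketch below shows the calculation and the standard conversion of r to a t statistic with n − 2 degrees of freedom. The ratio values are hypothetical and the function names are illustrative; only the final call uses numbers quoted in the text (r = 0.60, n = 26).

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def r_to_t(r, n):
    """Convert r to a t statistic; returns (t, df) with df = n - 2."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r), n - 2

# Hypothetical within bout ratios for the same five EPs in each context
within_directed = [1.01, 1.03, 1.02, 1.05, 1.04]
within_undirected = [1.02, 1.03, 1.03, 1.06, 1.04]
r = pearson_r(within_directed, within_undirected)
print(f"r = {r:.2f}, n = {len(within_directed)}")

# Values quoted in the text for EPs
t, df = r_to_t(0.60, 26)
print(f"t({df}) = {t:.2f}")
```

With the EP values from the text the conversion gives t(24) of about 3.67, which is in line with the reported P = 0.001.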

Motor production is less stereotyped during undirected song

Neural activity in the anterior forebrain is more variable during undirected song than it is during directed song (Hessler and Doupe 1999; Kao et al. 2005). Context-dependent variability in the peripheral motor system was quantified by calculating the coefficient of variation (CV) of the duration of each EP/IP in directed and undirected conditions. For EPs, the mean CV of duration was 1.90 during undirected song and decreased to 1.49 during directed song (t(133) = −2.54, P = 0.012). Similarly, for IPs the mean CV of duration was 4.65 during undirected song and decreased to 2.91 during directed song bouts (t(101) = −4.28, P < 0.001). Thus the duration of EPs and IPs is more variable during undirected song than it is during directed song (Fig. 6, E and F).
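The variability measure can be sketched as follows. The example assumes that CV is expressed as a percentage (100 × SD / mean) and compares contexts with a paired t statistic across elements; the exact form of the test behind the reported values is not stated in this section, so the paired design is an assumption. Element labels and durations are hypothetical.

```python
import math
import statistics

def cv_percent(durations_ms):
    """Coefficient of variation as a percentage: 100 * SD / mean."""
    return 100.0 * statistics.stdev(durations_ms) / statistics.fmean(durations_ms)

def paired_t(a, b):
    """Paired t statistic for a - b; returns (t, df)."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    sem = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.fmean(diffs) / sem, n - 1

# Hypothetical durations (ms) of three EPs over five renditions per context
undirected = {"EP1": [100, 104, 97, 102, 99],
              "EP2": [150, 158, 146, 153, 149],
              "EP3": [80, 84, 77, 82, 79]}
directed = {"EP1": [99, 101, 98, 100, 99],
            "EP2": [148, 151, 147, 149, 150],
            "EP3": [79, 80, 78, 80, 79]}

cv_undirected = [cv_percent(undirected[k]) for k in sorted(undirected)]
cv_directed = [cv_percent(directed[k]) for k in sorted(directed)]

# Directed minus undirected: a negative t means lower variability in directed song
t, df = paired_t(cv_directed, cv_undirected)
print(f"mean CV undirected = {statistics.fmean(cv_undirected):.2f}")
print(f"mean CV directed = {statistics.fmean(cv_directed):.2f}")
print(f"t({df}) = {t:.2f}")
```

Because CV normalizes the standard deviation by the mean, it allows short and long elements to be compared on the same scale.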

EP duration and sequence are not correlated with changes in song tempo

Why does social context change the duration of some EPs more than others? A Pearson's correlation was calculated between the duration of the EP and the context-dependent ratio score. Additionally, EP sequence in the motif was correlated with the context-dependent ratio score. Only EPs from the first motif were used in the analysis. The context-dependent ratio for an EP did not correlate with duration of the EP (r = −0.01, n = 26, n.s.), and it was not related to the sequence of the EP in the motif (r = 0.09, n = 26, n.s.). Thus neither the duration of an EP nor its position in the motif sequence explains why the tempo change varies among individual EPs.


The results of this study demonstrate that social contexts differentially affect the duration of EPs, but not the duration of IPs, during song bouts. Furthermore, the heart-rate data reveal an increased level of motivation that accompanies singing in the presence of a female zebra finch and show that motivational state is correlated with song tempo. The neural correlates of directed and undirected song, combined with the well-described neuroanatomical network for singing, make song production an ideal system for developing biologically plausible neural models of a precisely timed behavior. Our data suggest that social context-dependent changes in song timing require at least two neural oscillators. One oscillator that controls expiratory duration varies systematically with changing social context, and a second oscillator that controls inspiratory duration is not modified by social context. A conceptual model, presented in the following text, describes the proposed inter-relationships between motivation, neuromodulation, neuroanatomical connections in the song nuclei, and mechanisms for selective changes in respiratory timing. In contrast to the context-dependent change in song tempo, changes in song tempo within a directed song bout similarly changed EP and IP duration. Thus tempo changes within a directed song bout can be modeled using a simpler mechanism of a single oscillator controlling both expiratory and inspiratory duration.

Motivational state, as measured by heart rate, is inherently linked to movement and respiration. The dance of the zebra finch during directed song bouts likely increases heart rate. Furthermore, heart rate is influenced by the motor act of dancing. We sought to avoid the confound of both respiratory- and movement-induced increases in heart rate by using a 1-s period prior to singing (presong heart rate), during which movement patterns should be similar between directed and undirected conditions. Heart rate could be elevated by the anticipation of movement, as it is in humans during motor imagery tasks (Decety et al. 1993; Oishi et al. 2000). Because anticipation of movement and singing are interwoven with the motivational state of the bird, changes in heart rate resulting from anticipating the upcoming dance are likely to be tightly linked to the motivational state of the bird.

Function of directed and undirected song

Why do zebra finches change song tempo? Context-dependent changes in motif duration and variability may function as a form of motor practice; varying motor patterns allow birds to receive and monitor feedback necessary for acquiring and maintaining the song (Kao et al. 2005; Ölveczky et al. 2005). There is emerging evidence that the anterior forebrain pathway contributes to motor variability. First, LMAN lesions in juvenile birds lead to premature song crystallization (Bottjer et al. 1986; Scharff and Nottebohm 1991). Second, LMAN activity during song is critical for normal song variability in juvenile and adult birds (Kao et al. 2005; Ölveczky et al. 2005). The neural mechanisms of motor variability have recently been studied, but the underlying physiological mechanisms have not been addressed. Our data suggest that motivational state of the singing male affects song tempo. There was a negative correlation between heart rate and motif duration, consistent with the interpretation that faster songs are sung when the motivational state is high. Therefore song timing may communicate the motivational state of the singing bird. Perhaps faster songs signal copulatory readiness and quality of the male. Indeed, castration leads to a slowing of song tempo in adult zebra finches (Arnold 1975).

Links between motivation, neuromodulators, and song tempo

Motivation levels were higher during directed singing events, and this enhanced motivational state may influence the release of neuromodulators within the nuclei of the anterior forebrain pathway of the song control system (Doupe et al. 2005; Jarvis et al. 1998). The anterior forebrain pathway shares homologies with the mammalian basal ganglia (Doupe et al. 2005). In mammals, dopamine is released in the striatum during sexual behaviors (Becker et al. 2001). In addition, dopamine has been hypothesized to establish a “teaching signal” within the mammalian basal ganglia in a computational model of sequential learning (Suri and Schultz 1998). Last, dopamine is thought to modulate interval timing behaviors (Matell and Meck 2000). Directed song is a precopulatory behavior, which is an example of a precisely timed sequential learning task. Dopaminergic projections to the anterior forebrain in birds may play a critical role in mediating the timing changes observed in the current study and may have a more general role in song learning and song plasticity in adulthood (Abarbanel et al. 2004a,b; Bottjer 1993; Doupe et al. 2005).

In house sparrows (Passer domesticus), neural activity in a variety of areas indirectly associated with song learning and production correlates with sexually motivated singing. These brain areas include the medial preoptic nucleus and the ventral tegmental area (Riters et al. 2004). The medial preoptic nucleus projects to the ventral tegmental area (Riters and Alger 2004). The ventral tegmental area sends dopaminergic projections to the anterior forebrain, particularly Area X (Bottjer 1993). Thus, in combination with activity in the medial preoptic nucleus, the ventral tegmental area likely functions as part of the neural system signaling motivational state during singing. Perhaps dopamine input to the anterior forebrain pathway has the net effect of reducing the overall activity level of the neurons in this neural system.

Although other neuromodulators could change the timing of song, dopamine may be the dominant neuromodulator for regulating song timing because norepinephrine levels are low in Area X, and acetylcholine does not interact with dopamine receptors in Area X as it does in the mammalian basal ganglia (Gale and Perkel 2005). Acetylcholine could contribute to timing in the motor pathway. This possibility is discussed in the following text.

Neural control underlying changes in song tempo

How do changes in neural activity levels in the anterior forebrain lead to selective timing changes in vocal production? The projection of RA neurons onto brain stem respiratory and vocal premotor neurons makes RA ideally situated to directly control the timing of vocal production. Control of RA neurons is derived from two inputs, HVC (acronym used as proper name) and LMAN, the latter providing the neural output from the anterior forebrain pathway. RA-projecting neurons in HVC are active for ∼6 ms during an individual acoustic segment of the song (Hahnloser et al. 2002). This sparse code is relayed to RA neurons, which are active for ∼10 ms during specific periods of the song and correlate with the timing of acoustic production (Chi and Margoliash 2001; Leonardo and Fee 2005).

Song tempo of vocal respiratory elements could be controlled in part by the duration of RA bursts or overall activity level of RA projection neurons (Chi and Margoliash 2001). LMAN projection neurons provide excitatory input to RA neurons and synapse on the same neurons as HVC projection neurons (Mooney and Konishi 1991; Spiro et al. 1999; Stark and Perkel 1999). We speculate that the increase in song tempo during directed song may be caused by the decreased activity of LMAN projection neurons (Hessler and Doupe 1999; Jarvis et al. 1998; Kao et al. 2005). The decrease in excitatory input to RA decreases the activation of RA projection neurons, and the reduced activation of RA projection neurons functions to decrease the duration of vocal song elements. Consistent with this hypothesis, LMAN and RA activity levels vary depending on the social context in which song is produced (Jarvis et al. 1998).

Separate timing of vocal EPs and silent IPs and functional significance

How is the duration of an EP selectively modified? The topographic connections between the anterior forebrain and the motor pathway may contribute importantly to the selective timing changes observed in the current study. Medial LMAN projects to dorsal RA and lateral LMAN sends efferents to ventral RA (Iyengar and Bottjer 2002; Johnson et al. 1995). Dorsal RA projections innervate respiratory centers for expiration and inspiration, nucleus retroambigualis (RAm) and nucleus paraambigualis (PAm), respectively (Reinke and Wild 1998; Wild 1993). Perhaps medial LMAN plays a particularly important role in modulating the timing of song during directed singing. Decreased activation of medial LMAN neurons, which project to dorsal RA, may function to decrease the activation of neurons that innervate RAm.

RA interneurons may regulate the timing of vocalizations by synchronizing the activity of dorsal and ventral RA projection neurons (Spiro et al. 1999), and RA interneurons are critical for normal song (Vicario and Raskin 2000). Synchrony between dorsal and ventral RA is essential to ensure that the decreased respiratory duration of the EP corresponds with a temporal change in syringeal motor control of air flow and acoustic structure of the syllable.

The hypothesized increase in activity of dorsal RA, which may lead to increased song tempo, could also be modulated by acetylcholine input to HVC. Infusion of cholinergic agonists in HVC leads to an overall increase in activity of RA neurons (Shea and Margoliash 2003). Thus interactions among several neuromodulators in multiple brain areas could modify neural activity patterns controlling song timing.

It has been suggested that RAm may play a critical role in sensorimotor learning because of its bilateral projections, connections with brain areas involved in controlling the motor systems of the upper vocal tract, and its feedback connections to the song system (Wild 2004). Our data are consistent with this idea because they show that the duration of vocal expiratory events can be modified without a corresponding change in inspiratory duration. Inspirations largely serve the purpose of replenishing the air supply to sustain song, which suggests that the duration of IPs may be controlled by more reflexive motor programs. We suggest that RAm and PAm are mutually inhibitory and that this mutual inhibition ensures that inspiratory muscles are activated precisely out of phase with expiratory muscles (Wild et al. 1998). Brain stem inspiratory motor neurons could be released from inhibition at the conclusion of the expiratory-mediated vocalization, and then reflexive brain stem inspiratory circuits control the duration of inspiration.

Learning and maintenance of vocal song elements require coordination of multiple vocal motor systems (Suthers et al. 1999), combined with monitoring auditory feedback (Brainard and Doupe 2000). Thus temporal modifications to vocal respiratory events may be a result of the need for motor practice to maintain the EP motor program to produce the intended vocalizations. Such precise coordination and feedback may not be necessary to maintain silent inspirations. It is interesting that at the level of an individual inspiration, the average duration is more variable during undirected song than it is during directed song. This may be a reflection of the more variable motor commands because the duration of expirations is also more variable during undirected song. Thus some modulation of inspiratory networks from the song motor program must be occurring. However, the modulation is not sufficient to systematically shift the duration of these inspiratory periods of song production as a function of changing social contexts.

Separate timing patterns for social context and within bout tempo change

Although social context selectively changed the duration of vocal expirations, both expirations and inspirations became longer as the bird repeated its motif within a directed song bout. The timing change of an EP/IP caused by manipulating social context was not correlated with the timing change of the EP/IP within a song bout. This suggests that these timing mechanisms are unrelated. There was a strong correlation between the within bout tempo change for an EP/IP during directed song bouts and the within bout tempo change for the EP/IP within undirected song bouts. This provides further evidence that tempo changes within a song bout are a distinct timing process. The within bout tempo change could be mediated by intrinsic network properties of HVC (Solis and Perkel 2005) or by respiratory demands of singing extended bouts of song (Franz and Goller 2003).

Implications for neural models of song

Neural models of song have largely focused on mechanisms of song learning (Troyer and Doupe 2000a,b) and how the syrinx produces sound (Gardner et al. 2001; Laje et al. 2002). Recent efforts to model the neural system for song learning and production have been based on the increasing knowledge of the electrophysiological properties of cells in the motor pathway, and the anatomical connections between the motor pathway and the anterior forebrain pathway (Abarbanel et al. 2004a,b; Fiete et al. 2004). However, computational models do not yet encompass the complexity of mediating separate, and selectively modifiable, timing patterns for controlling expiratory and inspiratory events during song. There is an emerging view that brain stem respiratory centers are critical for timing and maintaining the neural song motor program (Ashmore et al. 2005; Schmidt 2003; Wild 2004). The data from the current study indicate that separate neural oscillators for expiratory and inspiratory respiratory events are required to accurately model the context-dependent changes in song tempo. The finding that only EP duration changes systematically, and that not all EPs change equally, provides strong evidence against a single “clock” controlling song tempo. Instead, at least two simultaneously operating and independently modifiable oscillators are required to control the respiratory timing of song under differing social contexts. In contrast, song tempo changes within a directed song bout can be modeled using a single timing pattern that similarly changes expiratory and inspiratory duration.
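The distinction between the two kinds of tempo change can be made concrete with a purely descriptive sketch. It is bookkeeping for the pattern of results, not a neural model: the context effect is represented by per-EP scale factors that leave IPs untouched, and the within bout effect by one factor applied to every element. The function name, all durations, and all scale factors are hypothetical.

```python
def motif_durations(eps_ms, ips_ms, ep_scales, ip_scale=1.0, bout_scale=1.0):
    """Scale each EP by its own context factor, IPs by a shared factor,
    and every element by a common within bout factor."""
    eps = [d * s * bout_scale for d, s in zip(eps_ms, ep_scales)]
    ips = [d * ip_scale * bout_scale for d in ips_ms]
    return eps, ips

# Hypothetical directed-song baseline (ms): three EPs and the two IPs between them
eps_directed = [100.0, 150.0, 80.0]
ips_directed = [40.0, 45.0]

# Context change: EPs lengthen unequally in undirected song, IPs stay fixed
eps_u, ips_u = motif_durations(eps_directed, ips_directed,
                               ep_scales=[1.05, 1.00, 1.04])

# Within bout change: one common factor lengthens EPs and IPs alike
eps_b, ips_b = motif_durations(eps_directed, ips_directed,
                               ep_scales=[1.0, 1.0, 1.0], bout_scale=1.02)

print("context ratios, EPs:", [round(u / d, 2) for u, d in zip(eps_u, eps_directed)])
print("context ratios, IPs:", [round(u / d, 2) for u, d in zip(ips_u, ips_directed)])
print("within bout ratios, EPs:", [round(b / d, 2) for b, d in zip(eps_b, eps_directed)])
print("within bout ratios, IPs:", [round(b / d, 2) for b, d in zip(ips_b, ips_directed)])
```

In this description the context effect needs separate parameters for expirations and inspirations, whereas the within bout effect is captured by a single shared parameter.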


This work was supported by National Institutes of Health Ruth R. Kirschstein Postdoctoral Fellowship 05722 to B. G. Cooper and Grant R01 04390 to F. Goller.


The authors thank S. Torti, J. Whittington, and C. Elemans for helpful comments on the manuscript. We also thank S. Torti, J. Langeland, and M. Franz for their contribution to data collection and analysis and L. Greene for animal care.


  • The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

