Vocal production requires complex planning and coordination of respiratory, laryngeal, and vocal tract movements, which are incompletely understood in most mammals. Rats produce a variety of whistles in the ultrasonic range that are of communicative relevance and of importance as a model system, but the sources of acoustic variability were mostly unknown. The goal was to identify sources of fundamental frequency variability. Subglottal pressure, tracheal airflow, and electromyographic (EMG) data from two intrinsic laryngeal muscles were measured during 22-kHz and 50-kHz call production in awake, spontaneously behaving adult male rats. During ultrasound vocalization, subglottal pressure ranged between 0.8 and 1.9 kPa. Pressure differences between call types were not significant. The relation between fundamental frequency and subglottal pressure within call types was inconsistent. Experimental manipulations of subglottal pressure had only small effects on fundamental frequency. Tracheal airflow patterns were also inconsistently associated with frequency. Pressure and flow seem to play a small role in regulation of fundamental frequency. Muscle activity, however, is precisely regulated and very sensitive to alterations, presumably because of effects on resonance properties in the vocal tract. EMG activity of cricothyroid and thyroarytenoid muscle was tonic in calls with slow or no fundamental frequency modulations, like 22-kHz and flat 50-kHz calls. Both muscles showed brief high-amplitude, alternating bursts at rates up to 150 Hz during production of frequency-modulated 50-kHz calls. A differentiated and fine regulation of intrinsic laryngeal muscles is critical for normal ultrasound vocalization. Many features of the laryngeal muscle activation pattern during ultrasound vocalization in rats are shared with other mammals.
- vocal folds
The laboratory rat (Rattus norvegicus), like other rodents, produces ultrasound vocalization (USV) in various contexts (e.g., Barfield and Thomas 1986; Burgdorf et al. 2007; Litvin et al. 2007; Wöhr and Schwarting 2007). Acoustic variability is high in USV. For example, fundamental frequency (F0), an important perceptual variable, varies between 18 and 90 kHz in some individuals (see, e.g., Kim et al. 2010). The sources of such acoustic variability are insufficiently understood but are critical to interpretation of vocal variability and evaluation of the complexity of the rat's vocal behavior. The rat's vocal behavior is often used as a model system, for example, to investigate emotional expressions (e.g., Brudzynski 2007; Panksepp 2007), the anatomy and pharmacology of neural processing (e.g., Ahrens et al. 2009; Feng et al. 2009; Oka et al. 2008), or human diseases associated with vocal changes (e.g., Ciucci et al. 2009; Nagai et al. 2005). Because our understanding of the nature of laryngeal and respiratory mechanisms during USV is incomplete (Brudzynski and Fletcher 2010), the present study was conducted to fill these gaps. The goal was to identify sources of F0 variability.
USVs in adult rats are categorized in two clusters, 22- and 50-kHz calls, according to the approximated F0 around which most calls in each cluster are produced; however, actual variability is much more complex (e.g., Portfors 2007; Wöhr and Schwarting 2010). Twenty-two-kilohertz calls are more often associated with contexts causing distress or fear (Kim et al. 2010; Litvin et al. 2007) and are uttered if an animal attempts to withdraw from a certain situation (Burgdorf et al. 2008). Fifty-kilohertz calls with modulated F0 are uttered in appetitive contexts (Wöhr and Schwarting 2007; Wright et al. 2010), while those with less-modulated F0 (“flat 50-kHz calls”) are more often associated with aggressive encounters (Burgdorf et al. 2008).
USV in rats can be evoked by stimulating various brain areas (Yajima et al. 1980), and the circuitry of laryngeal and vocal tract motoneurons to other brain areas is complex (Pascual-Font et al. 2011; van Daele and Cassell 2009; van Daele et al. 2011). Roberts (1975a) proposed that USV are produced by a whistle mechanism, in contrast to sounds in the audible range, which are produced by flow-induced vocal fold oscillations. He based this hypothesis on the observation that F0 of USV, but not that of audible sounds, increases if respiratory air is replaced by a light gas. USV is produced during expiration as airflow passes through larynx, pharynx, nose, and mouth (Roberts 1972a, 1975b). The larynx plays a critical role in sound generation because vocal folds seem to be positioned in a specific phonatory position during USV (Sanders et al. 2001). Furthermore, discharges from motoneurons of laryngeal muscles in the nucleus ambiguus are synchronized with USV (Yajima and Hayashi 1983), and nerve transections (recurrent or superior laryngeal nerve) alter the normal glottal function. Vocalizations change or are no longer produced after such operations (Roberts 1975b; Nunez et al. 1985; Wetzel et al. 1980).
In the present study subglottal pressure, tracheal airflow, and electromyographic (EMG) activity of two intrinsic laryngeal muscles were recorded during USV. Experiments were conducted in awake and spontaneously behaving adult male rats. The collected data helped identify motor patterns during USV. The data can also contribute to refining models of the physical mechanisms underlying sound production in the rat.
Procedures involving animals and their care were reviewed and approved by the Institutional Animal Care and Use (IACUC) committee of the University of Utah. Subglottal pressure, tracheal airflow, and EMG activity were recorded in 18 male Sprague-Dawley rats (body mass 400–450 g at time of experiment). Animals were housed in pairs from the age of 6 wk in rodent cages (46 × 30 × 15 cm), with ad libitum food and water supply and a 12:12-h light-dark cycle.
Each animal was anesthetized with an intraperitoneal injection of xylazine (8 mg/kg) and ketamine (80 mg/kg). Atropine was administered at 0.05 mg/kg (im). Surgical procedures were performed under aseptic conditions. Operated rats received fluids (Ringer, 10 ml sc), antibiotics (Baytril 20 mg/kg orally), and analgesics (buprenorphine, 0.3 mg/kg sc).
Subglottal pressure was measured through a stainless steel tube implanted in the upper third of the trachea. The trachea was exposed by using a midline ventral neck incision extending between 3 cm caudal from the pogonion and the cranial end of the sternum. Subcutaneous fat, glandula mandibularis, and sternohyoideus muscle were bluntly separated to expose the trachea. One end of a stainless steel L-shaped tube (1-mm outer diameter) was introduced between the 5th and 7th tracheal rings. The other end of the tube pointed caudally. The two branches of the tube were 6 and 8 mm in length. The tracheal end of the tube was spliced open for a safe anchor in the trachea. The tube was held in position by a suture (4-0 silk; Ethicon, New Brunswick, NJ), and tissue adhesive (Nexaband, Veterinary Products Laboratory, Phoenix, AZ) prevented air leaks. Great care was taken not to damage the left or right recurrent laryngeal nerve. The left and right aspects of the sternohyoideus muscle were reattached by sutures. The subcutaneous tissue and the skin were closed in two layers, and an additional suture was placed where the tube exited the overlying skin. A flexible Silastic tube connected the metal tube to a pressure transducer fixed on the rat's jacket. The animal was equipped with a rodent jacket and a rodent tether. The tether was fixed to a ring that could slide along a rod placed 10 cm above the cage, allowing the animal to move freely in the cage. An experiment ran for 3–5 days, during which time the tracheal tube remained open and clean. The tube and tracheal space were dissected and inspected at the end of each experiment.
The free end of the Silastic pressure tube was connected to a pressure transducer (model FHM-02PGR-02, Fujikura, Tokyo, Japan). The pressure signal was recorded synchronously to the sound signal in a second channel. In all animals the pressure transducer was calibrated at the beginning or the end of the experiment with a digital manometer (Omega HHP-90, Stamford, CT).
Subglottal pressure perturbations.
To test the effect of pressure variations on F0, small volumes of compressed air were injected into the trachea. Pressure perturbation in the trachea during vocalization provided a quantitative measure for F0 change per unit of subglottal pressure change (ΔF0/ΔPs).
A T-shaped tube connector was placed within the Silastic tube halfway between the pressure transducer and the stainless steel tracheal tube. The third end of the connector was connected to a pressure pulse generator (Parker Hannifin, Fairfield, NJ; Picospritzer II) by Silastic tubing. When the Picospritzer was triggered, a puff of compressed air was injected into the trachea. Air injection was irregularly timed so it could not be anticipated by the rat. The timing and amplitude of the air pulse were monitored by the pressure transducer.
The injected pulse of air was 40 ms in duration. Its amplitude was varied between 0.2 and 5 kPa. The amplitude was well within the range of subglottal pressure observed during certain rapid movements or pulling on the tether system.
The results depend on the assumption that the frequency changes during the air pulse are solely due to the pressure increases and no compensatory actions occur in respiratory muscles, or in larynx and other aspects of the vocal organ. Similar experiments in humans, rats, and other mammals, in which tracheal pressure variations were recorded in association with muscle activity, suggest that pressure variations in the trachea can lead to muscle responses in respiratory, laryngeal, and/or oral muscles but only 30–40 ms after the onset of a pressure variation (e.g., Baer 1979; Garrett and Luschei 1987; Horner et al. 1991; Testerman 1970; van den Berg 1957). To avoid compensatory/reflex movements, all frequency and pressure measurements were taken within 20 ms of the air pulse onset.
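The perturbation measure described above can be sketched in a few lines. This is a minimal illustration and not the analysis code used in the study: the function name, the sample layout (time in ms relative to air pulse onset, F0 in kHz, subglottal pressure in kPa), and the example values are all hypothetical. It averages F0 and pressure shortly before the pulse and within the first 20 ms after pulse onset, then returns the ratio of the two differences.

```python
from statistics import mean

def delta_f0_per_delta_ps(samples, window_ms=20.0):
    """Estimate dF0/dPs (kHz/kPa) around an air pulse.

    samples: list of (t_ms, f0_khz, ps_kpa) tuples, with t_ms measured
    relative to pulse onset (negative values are before the pulse).
    Only samples within window_ms of the onset are used, so that later
    reflex or compensatory responses are excluded.
    """
    pre = [(f, p) for t, f, p in samples if -window_ms <= t < 0]
    post = [(f, p) for t, f, p in samples if 0 < t <= window_ms]
    if not pre or not post:
        raise ValueError("need samples both before and after pulse onset")
    d_f0 = mean(f for f, _ in post) - mean(f for f, _ in pre)
    d_ps = mean(p for _, p in post) - mean(p for _, p in pre)
    if d_ps == 0:
        raise ValueError("no pressure change detected")
    return d_f0 / d_ps

# Hypothetical 22-kHz call: a 1-kPa pulse raises F0 by 0.2 kHz.
example = [
    (-10.0, 22.0, 1.0), (-5.0, 22.0, 1.0),                  # before the pulse
    (5.0, 22.2, 2.0), (10.0, 22.2, 2.0), (15.0, 22.2, 2.0),  # first 20 ms
    (30.0, 25.0, 2.0),                                       # ignored (> 20 ms)
]
slope = delta_f0_per_delta_ps(example)
```

The sample at 30 ms is deliberately discarded, mirroring the 20-ms limit chosen to stay ahead of the 30- to 40-ms reflex latency.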
Light gas in the vocal tract.
Roberts (1975a) performed experiments in which he held a small tube releasing a light gas (heliox) in front of the nostrils of rat pups. He observed an increase of fundamental frequency of distress calls proportional to the heliox concentration. These findings were tested here again in adult animals with a different heliox delivery method. In two rats that were connected to the pressure pulse generator, heliox (80% He, 20% O2) was injected into the trachea with low pressure (0.5–1.0 kPa) during the production of 22-kHz calls. In light gas the fundamental frequency of a whistle sound increases in proportion to the amount of light gas present. With our experimental gas, a click sound traveled between a speaker and a microphone, positioned in a Plexiglas box filled with heliox, 1.8 times faster than in normal air.
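As a rough plausibility check on the measured speed ratio, the expected speed of sound in an 80/20 helium-oxygen mixture relative to air can be estimated from ideal-gas theory. The sketch below is an idealized calculation, not part of the original methods: it assumes ideal gases at equal temperature, textbook molar masses and heat capacities, and dry air with a heat-capacity ratio of 1.40. The function name is hypothetical.

```python
import math

# Textbook molar masses in g/mol (assumed values, not from the study).
M_HE, M_O2, M_AIR = 4.0026, 31.998, 28.96
GAMMA_AIR = 1.40  # heat-capacity ratio of dry air

def heliox_sound_speed_ratio(x_he, x_o2):
    """Speed of sound in a He/O2 mixture divided by that in air.

    Uses c = sqrt(gamma * R * T / M); R and T cancel in the ratio.
    Molar heat capacities are in units of R: helium is monatomic
    (Cp = 5/2, Cv = 3/2), oxygen is diatomic (Cp = 7/2, Cv = 5/2).
    """
    m_mix = x_he * M_HE + x_o2 * M_O2
    cp = x_he * 2.5 + x_o2 * 3.5
    cv = x_he * 1.5 + x_o2 * 2.5
    gamma_mix = cp / cv
    return math.sqrt((gamma_mix / m_mix) / (GAMMA_AIR / M_AIR))

ratio = heliox_sound_speed_ratio(0.8, 0.2)
```

Under these assumptions the ratio comes out at roughly 1.85, in line with the measured factor of 1.8. Any dilution of the heliox by respiratory air in the vocal tract would lower the effective ratio.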
To measure tracheal airflow, a microbead thermistor (0.13-mm diameter; BB05JA202, Thermometrics) was implanted into the tracheal lumen between the 2nd and 3rd tracheal rings. A small hole was made with a 23-gauge hypodermic needle. The thermistor was guided through the hole so that the end came to a midlumen position. The hole was sealed with tissue adhesive. The wires of the probe were sutured to the 3rd or 4th tracheal ring.
The microbead thermistor was mounted on wires that were supported by a coat of epoxy. The wires from the flow probe were led through the skin and routed to connectors on the backpack. A Wheatstone bridge was used to measure resistance changes through the thermistor probe. Resistance changes were proportional to the rate of airflow. Temperature and humidity inside the trachea affect the voltage. The circuit was zeroed after the transducer was in position by interrupting airflow in the trachea for 3 s (manually blocking nostrils and mouth in the anesthetized animal). The airflow trace shows a positive signal regardless of flow direction.
The thermistor signal deteriorates as serous fluids accumulate on the bead. Inspection after the animal was killed showed that thermistors were covered with some material. It was also observed that signal quality decreased after 48 h (e.g., DC shift), indicating material accumulation. Since it was not possible to control accumulation or to clean the probe of deposits during the experiment, flow signals were only interpreted qualitatively, using signals recorded within 36 h after surgery.
EMG recordings were obtained from the cricothyroid (CT) and the thyroarytenoid (TA) muscle. The CT muscle is classified as a vocal fold adductor. It tenses the medial edge of the vocal fold. The TA muscle is also classified as a vocal fold adductor, but its deep position inside the vocal fold causes a shortening of the vocal fold and possibly a medial bulging of the tissue covering the TA muscle (lamina propria and epithelium). As in humans, the TA muscle possesses a lateral and a medial portion.
The CT muscle was exposed by the same approach as explained above for the tracheostomy. A small incision was made into the fascia of the thin muscle into which a bipolar silver electrode (Teflon insulated except at the tip of the 76.2-μm wires; A-M Systems) was inserted. A small drop of tissue adhesive was used to secure the electrode pair to the fascia. The TA muscle was accessed via a small hole in the lateral thyroid cartilage that was drilled with a 23-gauge hypodermic needle. The bipolar silver electrode was moved 1 mm through the hole and secured by a small drop of tissue adhesive. The exact location of the electrode tips was determined after the experiments.
At the end of the experiment, the animal was killed and electrolytic lesions were made in the muscle with the EMG electrodes (8- to 10-mA current for 40–60 s). The larynx was then excised and fixed for 24 h in 10% buffered formalin phosphate (Fisher Scientific, Fair Lawn, NJ; catalog no. SF100-4) and placed for 8 h in Decalcifier 1 (Surgipath Medical Industries, Richmond, IL; catalog no. 00400). The tissue was then embedded in paraffin, and 5-μm serial horizontal sections were made. Sections were stained with hematoxylin and eosin for a general histological evaluation and identification of the implantation location.
Wires were routed subcutaneously to the backpack, from which stronger wires led EMG signals out of the cage to signal conditioning and recording instruments. EMG recordings were differentially amplified (model EX4–400, Dagan) and band-pass filtered (100–3,000 Hz). EMG recordings were full-wave rectified and low-pass filtered (300 Hz) for better visualization. The rectified signals during phonation were normalized to maximum EMG activity. The largest EMG activities in both muscles were observed during swallowing when the animal was feeding or drinking.
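The rectification, smoothing, and normalization steps can be illustrated with a short sketch. The filter type used in the study is not specified, so a first-order (one-pole) low-pass filter is substituted here purely for illustration. The function names, the 10-kHz sampling rate, and the synthetic signals are assumptions.

```python
import math

def rectify_and_smooth(emg, fs_hz, cutoff_hz=300.0):
    """Full-wave rectify a band-passed EMG trace, then low-pass filter it.

    A one-pole recursive filter stands in for the (unspecified)
    low-pass filter; alpha sets its cutoff frequency.
    """
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / fs_hz)
    envelope, y = [], 0.0
    for x in emg:
        y += alpha * (abs(x) - y)  # abs() is the full-wave rectification
        envelope.append(y)
    return envelope

def normalize_to_reference(envelope, reference_max):
    """Express an envelope as a fraction of a reference maximum."""
    if reference_max <= 0:
        raise ValueError("reference maximum must be positive")
    return [v / reference_max for v in envelope]

# Synthetic example: 'swallow' activity is the reference maximum,
# 'call' activity has half its amplitude.
FS = 10_000.0
swallow = [math.sin(2.0 * math.pi * 1000.0 * n / FS) for n in range(200)]
call = [0.5 * s for s in swallow]

swallow_env = rectify_and_smooth(swallow, FS)
call_env = rectify_and_smooth(call, FS)
call_norm = normalize_to_reference(call_env, max(swallow_env))
```

Normalizing to the largest observed activity (here the synthetic swallow) puts recordings from different electrodes and animals on a common 0 to 1 scale.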
Animals were continuously acoustically monitored by a condenser ultrasound microphone [Avisoft-Bioacoustics CM16/CMPA-5V; 15–180 kHz, with a flat frequency response (±6 dB) between 25 and 140 kHz] placed 20 cm above the cage floor. Signals were acquired through a multichannel data acquisition device (NI USB-6212, National Instruments, Austin, TX), sampled at 100, 150, or 200 kHz, and saved as uncompressed files on a computer with Avisoft Recorder software (Avisoft-Bioacoustics).
Twenty-two- and fifty-kilohertz calls were either uttered without the experimenter in the room (no external trigger noted) or associated with one of the following stimuli: 1) touch by human hand with or without petting, 2) presentation of a second familiar male, 3) presentation of female odor. All males had previous experience with females.
All measurements were performed with the sound analysis software PRAAT (version 5.0.41; www.praat.org). One hundred thirty-kilohertz-bandwidth spectrograms were used for analysis. USV were divided into 22-kHz call and 50-kHz call categories. Fifty-kilohertz calls contained trills, up- or downmodulations, or flat call components with F0 above 40 kHz. Twenty-two-kilohertz calls were long calls with flat F0 between 19 and 30 kHz. Calls were analyzed for total duration and F0 contour. In 22-kHz calls, PRAAT's pitch tracking method was used to track F0 values at 5-ms intervals. Before final F0 outcomes were computed, each track was visually inspected by overlaying it on a corresponding narrowband spectrogram.
Three analyses were performed to study the effect of subglottal pressure. First, subglottal pressures were compared between 22-kHz and 50-kHz calls. Differences in subglottal pressure between call types were tested by paired t-test. Second, the F0 change per unit of subglottal pressure change (ΔF0/ΔPs) was determined in normal phonation as well as during perturbations of the subglottal pressure by means of a Picospritzer. In normal calls, ΔF0/ΔPs was determined during the first 10% of the call as well as during the plateau phase (between 10% and 90% time points). Third, subglottal pressure and fundamental frequency were correlated 1) in sets of 20 randomly selected 22-kHz and 50-kHz calls from different recordings, 2) in long bouts of 22-kHz calls, and 3) in 22-kHz calls during subglottal pressure perturbations. Correlations between subglottal pressure and F0 were tested by Spearman rank correlations. Flow data were described qualitatively. The relation between F0 and flow was also investigated. To identify the functional role of CT and TA muscle, average activity was compared between both muscles as well as within a muscle and between 22- and 50-kHz calls.
Pressure during ultrasound production.
Subglottal pressure during both call types was measured in six males. Twenty-two-kilohertz calls lasted on average 0.4–1.5 s (Table 1). Twenty-two-kilohertz calls stereotypically showed an initial fundamental frequency modulation (FM) followed by a long plateau in which F0 was less modulated. F0 either decreased (Fig. 1A) or increased (Fig. 1B) before the plateau phase. Average F0 at call onset ranged between 20.2 and 25.7 kHz and at midcall position between 20.2 and 24.3 kHz.
Fifty-kilohertz calls were on average 20–180 ms in duration. F0 in 50-kHz calls (measured at midcall position) ranged between 58.2 and 63.4 kHz. Fifty-kilohertz calls are uttered in longer bouts (Fig. 2) or as single vocalizations (Fig. 3).
During a 40- to 90-ms period before a 22-kHz call, subglottal pressure increased, rising to 0.4–0.9 kPa at phonation onset (Table 1, Fig. 1). At midcall, subglottal pressure ranged between 0.8 and 1.5 kPa in 22-kHz calls and between 0.9 and 1.9 kPa in 50-kHz calls (Table 1). The latter was slightly higher but not significantly different from pressure in 22-kHz calls (paired t-test, df = 5, t = −2.12, P = 0.09).
Fifty-kilohertz calls contain slowly modulated components (Fig. 3) and rapid modulations (Fig. 4). The fast FMs were associated with rapid small-amplitude modulations in the subglottal pressure signal (Fig. 4). Fifty-kilohertz calls sometimes contain large F0 jumps, which were not reflected in the pressure pattern (Fig. 5).
Pressure-frequency relation in normal calls.
ΔF0/ΔPs was 1.3–8.7 kHz/kPa during the initial phase of 22-kHz calls, and it was 4.7–29.5 kHz/kPa during the plateau phase in 22-kHz calls (Table 1).
The pressure-frequency relation was tested in randomly selected sets of 20 calls. At call onset of 22-kHz calls, subglottal pressure and F0 were significantly positively correlated in two rats (out of 6; rats 13 and 19; Fig. 6). At midcall, in 22-kHz and in 50-kHz calls, pressure and F0 were significantly correlated in three rats (out of 6; rats 4, 11, and 19 in 22-kHz calls and rats 4, 16, and 19 in 50-kHz calls; Fig. 6).
The pressure-frequency relation was also tested in long bouts of continuous 22-kHz calls. An example is shown in Fig. 7. The animal started spontaneously (segment 1) and then was gently touched by the experimenter (segment 2) and released again (segment 3). F0 tends to positively correlate with subglottal pressure (Fig. 7D); however, the regression coefficients, 0.69 (n = 57 calls) and 0.73 (n = 37 calls) in the first two segments, drop to 0.006 in segment 3. Significant positive correlations with high regression coefficients were found in long bouts from three other individuals (rat 11: n = 85 calls within 180 s, r2 = 0.42; rat 13: n = 22 calls within 41 s, r2 = 0.53; rat 19: n = 80 calls within 110 s, r2 = 0.75).
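A per-segment regression of this kind can be computed as sketched below. The sketch is illustrative only: the function name and the data are hypothetical, and it assumes an ordinary least-squares fit of F0 on subglottal pressure for the calls within one segment.

```python
def slope_and_r_squared(pressure_kpa, f0_khz):
    """Least-squares slope (kHz/kPa) and r^2 for F0 versus pressure."""
    n = len(pressure_kpa)
    mx = sum(pressure_kpa) / n
    my = sum(f0_khz) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(pressure_kpa, f0_khz))
    sxx = sum((x - mx) ** 2 for x in pressure_kpa)
    syy = sum((y - my) ** 2 for y in f0_khz)
    return sxy / sxx, (sxy * sxy) / (sxx * syy)

# Hypothetical segment in which F0 rises linearly with pressure.
slope, r2 = slope_and_r_squared([1.0, 1.1, 1.2, 1.3], [21.0, 21.5, 22.0, 22.5])

# Hypothetical segment in which F0 is unrelated to pressure.
_, r2_flat = slope_and_r_squared([1.0, 2.0, 3.0, 4.0], [1.0, -1.0, -1.0, 1.0])
```

Running the fit separately for each segment, rather than for the whole bout, is what reveals a pressure-frequency relation that changes over time.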
Pressure-frequency relation during small perturbations.
In three rats with a tracheostomal tube, a connection to a pressure transducer and a pulse generator was made. Subglottal pressure was perturbed by a brief injection of compressed air into the trachea during 22-kHz call production. The injection was associated with a small frequency jump (Fig. 8A) the size of which was dependent on the magnitude of the subglottal pressure change. The F0 change per subglottal pressure change (ΔF0/ΔPs) ranged between 0.1 and 0.2 kHz/kPa (Fig. 8B).
Effect of light gas.
Heliox injected into the trachea in two animals during the production of 22-kHz calls led to an increase in F0 by up to 17 kHz or ∼80% as soon as the gas arrived in the animal's vocal tract system (Fig. 9).
Tracheal airflow during ultrasound production.
Microbead thermistors were implanted in four animals. Two of them were also fitted with pressure tubes. Relative tracheal airflow in 22- and 50-kHz calls (Fig. 10 and Fig. 11, respectively) showed a similar general pattern in all four individuals. Airflow was maximal near the beginning of the call and decreased thereafter.
Flow decreased while subglottal pressure remained at an elevated level in 22-kHz calls. This could be achieved by increased resistance in the airway, for example, by continuously adducting the glottis throughout the call. However, a combination of resistance increase and decrease of expiratory effort is also possible.
In two animals (only flow signal, no pressure signal recorded), both call types were recorded (Fig. 12). Results were inconsistent. In one animal the flow at midcall was lower in the 22-kHz calls than in the 50-kHz calls. In the second animal there was a large overlap of flow ranges. Apparently, higher flow rates (in the trachea) are not always necessary to produce higher F0.
Laryngeal muscle activity.
EMG electrodes were implanted in five males. In four of these a pressure tube was also implanted. EMG activity was recorded successfully in CT muscles of five rats and in TA muscles of four rats. Histological images indicated that the EMG signals were recorded from the lateral portions of the TA muscle.
Great care was taken that electrodes did not produce tension on muscles, laryngeal framework, or surrounding structures. Nevertheless, EMG electrode implantation affected call production. F0 in 22-kHz calls was elevated between 1 and 15 kHz. In 50-kHz calls, some components were completely missing, although the pressure pattern and range were the same as in the animals that received only a tracheostomal tube. Apparently, both muscles are critical for USV production, and F0 is very sensitive to small experimental manipulations of these laryngeal muscles.
Both muscles demonstrated normal respiratory patterns. The TA EMG showed end-inspiratory activity and the CT muscle inspiratory activity (Fig. 13) (see, e.g., Sherry and Megirian 1980). EMG activity was present in the CT and the TA muscle during 22-kHz calls (Fig. 14). The onset of activity in both muscles coincided with the increase in subglottal pressure 40–90 ms prior to voice onset. The offset of activity in both muscles coincided with call offset and the sudden drop in subglottal pressure at the end of 22-kHz calls.
In 22-kHz calls, mean TA muscle EMG activity was 10–20% lower than that of the CT muscle (Table 2; paired t-test, df = 3, t = 8.8, P = 0.003). In 50-kHz calls, mean TA EMG activity was ∼10% larger than CT EMG activity (Table 2), which was, however, not quite significant (paired t-test, df = 3, t = −2.9, P = 0.063).
The TA muscle contributes differently to the two call types (Table 2). Mean TA EMG activity tends to be larger during 50-kHz than 22-kHz call production (paired t-test, df = 3, t = 2.4, P = 0.071). No differences in the EMG activity of the CT muscle were apparent between call types (paired t-test, df = 4, t = −0.62, P = 0.57).
In 22- and 50-kHz calls, EMG activity of both muscles is tonic during constant-frequency production (Fig. 14 and Fig. 15). The rapid frequency changes and the quick onsets and offsets of call segments in frequency-modulated 50-kHz calls were accompanied by short bursts of EMG activity. The burst amplitudes were always larger than the tonic EMG activity during constant-frequency segments. The bursts in EMG activity of CT and TA muscle were out of phase (Fig. 15). In three animals TA and CT EMGs were available during frequency-modulated 50-kHz calls. TA bursts followed CT bursts with a delay of 35–45% of interburst interval duration (Table 3). The short bursts are a demonstration of the fast kinetics of both muscles. The interburst intervals ranged between 6.0 and 12.3 ms in both muscles, and the average was ∼10 ms (Table 3), suggesting contraction rates ranging from 80 to 160 Hz and of 100 Hz on average.
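The conversion from interburst interval to contraction rate, and the expression of the TA delay as a fraction of the interburst interval, amount to simple arithmetic. The sketch below makes it explicit; the function names and the burst times in the delay example are hypothetical.

```python
def burst_rate_hz(interburst_interval_ms):
    """Burst repetition rate implied by an interburst interval."""
    return 1000.0 / interburst_interval_ms

def relative_delay(ct_burst_ms, ta_burst_ms, next_ct_burst_ms):
    """TA burst delay as a fraction of the CT interburst interval."""
    return (ta_burst_ms - ct_burst_ms) / (next_ct_burst_ms - ct_burst_ms)

# Interval values taken from the text above.
rate_slowest = burst_rate_hz(12.3)  # longest interval
rate_average = burst_rate_hz(10.0)  # approximate average interval
rate_fastest = burst_rate_hz(6.0)   # shortest interval

# Hypothetical burst times: CT at 0 ms, TA at 4 ms, next CT at 10 ms.
delay_fraction = relative_delay(0.0, 4.0, 10.0)
```

The longest interval corresponds to about 81 Hz, the average to 100 Hz, and the shortest to about 167 Hz, so the rates quoted in the text are rounded.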
The goal of this study was to identify sources of F0 variation. Ultrasound production in rats shows features of a whistle, i.e., F0 of a whistle is affected by the medium inside the respiratory tract. The F0 of pup calls (Roberts 1975a) and adult 22-kHz calls (this study) increases if the respiratory gas is replaced by heliox. A whistle is an acoustic excitation generated if an airflow becomes disturbed by an obstruction in the path of the air jet, provided that, first, the flow speed is in an unstable range and second, a feedback path allows certain frequencies in the disturbance to increase. The nature of the instability mechanism can be diverse, such as a simple sharp edge, a hole, or a side branch. Roberts (1975a) and Brudzynski and Fletcher (2010) favored the holetone whistle mechanism in which an airstream leaving a tube or hole (possibly trachea and glottis) hits a round aperture downstream (e.g., the space between the ventricular folds), generating an acoustic excitation. Whatever the mechanism is in rats, the primary sound of a whistle is generally not very loud. The coupling of the sound source to a resonator, however, can make it a powerful acoustic instrument (Chanaud 1970; Hall 1991).
F0 of a whistle is determined by boundary conditions of sound source and resonance cavity, i.e., their geometry and wall characteristics. The rat larynx possesses a ventral pouch, an air sac-like structure (Riede et al. 2008) that is described under different names (ventral pouch by Lewis and Prentice 1980; fovea centralis by Walander 1950; ventriculus laryngicus medius by Liebich 1975). Its function is unknown. The rat's pharyngeal and oral cavities demonstrate considerable variability and flexibility (Hiiemäe and Ardran 1968; Liebich 1975; Reidenberg and Laitman 1991), but the extent to which this is exploited during USV is little understood. That the upper vocal tract is somehow involved in F0 modulation is supported by two observations. First, lateral X-ray images of vocalizing rats showed stereotypic occurrence of a single (50-kHz calls) versus a dual (22-kHz calls) chamber expansion (Riede et al. 2011a). Fifty-kilohertz calls were accompanied by an expansion of the oral cavity, while during 22-kHz calls an expansion of an oral and a pharyngo-nasal cavity were seen. Second, Burgdorf et al. (2008) noted a greater degree of F0 modulation in 22-kHz calls if tongue movements accompanied call production. Some authors suggest that the larynx in rats resembles a scaled version of a typical mammalian larynx like that of humans (Roberts 1972b; Inagi et al. 1998). The vocal folds are indeed not special. Viscoelastic properties of rat vocal folds predict that oscillation frequencies are restricted below 10 kHz, which is typical for the range of the rat's audible sounds produced by flow-induced vocal fold oscillations (Riede et al. 2011b). However, it is likely that some aspects represent specific adaptations to produce ultrasound by a whistle mechanism, because why else would this mechanism only be used by rodents?
F0 of a whistle is also determined by flow velocity. Roberts (1975a) suggested, inferring from his findings with a physical model of a holetone whistle to rat USV, “…that the majority of upward jumps should occur with increases in respiratory pressure and the majority of downward jumps with decreases in respiratory pressure.” (Roberts 1975a, p. 86). The present findings do not fully support this hypothesis. The pressure-frequency relation was significant with high r2 values in some cases, and no relationship was found in others. The variability in the pressure-frequency relation and three additional observations suggest that subglottal pressure changes are not the main variable to regulate F0. First, the pressure and flow ranges were overlapping between the call types, which does not indicate a greater effort at the higher-frequency calls. Second, experimental subglottal pressure manipulations had only a small effect on F0 (0.1–0.3 kHz/kPa). Third, the range for ΔF0/ΔPs in normal 22-kHz calls was enormous (1–20 kHz/kPa).
The investigation of long bouts of 22-kHz calls showed that the relationship between pressure and frequency can change over time, possibly with context. Two aspects, not mutually exclusive, could contribute to this phenomenon. First, every time the animal moves, the laryngeal and vocal tract geometry changes passively a little. Provided that the range of laryngeal geometries in which a whistle can be produced is broad enough, F0 remains the same but the pressure-frequency relation changes. Second, the viscoelastic properties of larynx and upper vocal tract could cause variable geometries leading to variable pressure-frequency relation. The tissue in the larynx or upper vocal tract could increase resistance (by reducing cross-sectional area) due to elastic recoil if pressure decreases. Increased expiratory effort could lead to passive widening of laryngeal and vocal tract settings, simply by pushing tissue toward the periphery. This is unlike the setting in a musical wind instrument, in which boundary conditions are determined by rigid walls and are rather stable.
The relation between tracheal airflow and F0 currently cannot be fully explained. Besides lung pressure, flow depends also on the resistance in laryngeal and upper vocal tract. The present data indicated that flow was low during call production, confirming findings by others (Hegoburu et al. 2011; Roberts 1975b). Tracheal airflow decreased in 22-kHz calls, while pressure remained elevated. This could be achieved, for example, by an increase in resistance by adducting the vocal folds. In fact, TA and CT muscles were active during USV; however, there was no sign of increasing activity associated with the decrease in flow rate. This means that increased resistance must be achieved either by another not yet recorded laryngeal adductor muscle (e.g., the interarytenoideus muscle) or by other muscles in the vocal tract. Alternatively, a dependence of the cross-sectional area of the vocal tract on expiratory pressure due to viscoelastic tissue properties, as explained above, could play a role.
The flow pattern did not lead or follow the F0 contour, and flow ranges were overlapping between call types, suggesting that F0 variations are also independent from changes in tracheal airflow. However, changes in the laryngeal aperture could have created different flow velocities locally in the glottis and not be picked up by the intratracheal thermistor. Methodological problems with flow probes did not allow a clean determination of absolute flow, and therefore this question remains to be further investigated.
F0 changes must be controlled by a different mechanism than higher or lower respiratory effort. The primary process of the acoustic excitation occurs at the glottis, as nerve cut studies, investigations of motoneurons, and direct observations suggest (Roberts 1975b; Nunez et al. 1985; Sanders et al. 2001; Wetzel et al. 1980; Yajima and Hayashi 1983). In the present study, EMG recordings confirm the importance of the larynx as primary sound source. The fine regulation of intrinsic laryngeal muscles is apparently highly critical. During 22-kHz call and 50-kHz call production the EMG activity in CT and TA muscles was elevated compared with that of quiet respiration. Constant F0 elements were associated with tonic muscle activity (22-kHz calls and constant-frequency 50-kHz calls). FM components and on/offset of call elements, respectively, were associated with bursts of EMG activity (FM 50-kHz calls). The EMG activity of the two muscles showed a number of qualitative and quantitative differences, such as different activation amplitudes and alternating bursts during the FM elements. In many calls, the elevated EMG activity was initiated with an onset burst, and the amplitude of the bursts in the FM call segments was always higher than in the constant-frequency calls. In general, a more dynamic vocal behavior is often associated with higher activation levels of muscles of a vocal organ than constant more monotonic F0 vocalizations (see, e.g., Goller and Suthers 1996).
Comparison with other mammals.
Vocal sounds in many mammals [speech in humans (Titze 2000), vocalization in nonhuman primates (see, e.g., Brown et al. 2003)] are produced by flow-induced vocal fold oscillations. The tissue oscillations are self-sustained by an airstream passing between the vocal folds. Coordinated by laryngeal muscles, vocal folds are adducted into prephonatory position. Muscle activity determines adduction and vocal fold length (and thereby tension). The oscillation itself is a passive mechanism, and the vocal output is largely dependent on the viscoelastic properties of the vocal fold tissue (e.g., Riede 2010). The physical process of the whistle mechanism in rat USV is different, but how far do the differences extend to other aspects of sound production? Many features are in fact similar. The whistle (e.g., Chanaud 1970; Shadle 1983) as well as the sound production by flow-induced vocal fold oscillation (e.g., Titze 2008; Titze et al. 2008b) rely on a primary sound produced at a source and a feedback mechanism between source and vocal tract resonances. The combination of a larynx producing a source sound and a filter variably coupled to the source suggests that the source-filter theory (Fant 1960), which is rapidly gaining relevance for interpreting nonhuman vocalization (e.g., Riede et al. 2005; Taylor and Reby 2010), is important for both mechanisms. Furthermore, the driving power for both mechanisms is provided by the respiratory system. Subglottal pressure values during USV in rats overlap with those recorded during vocal production in other species, for example, human speech or singing (e.g., Bouhuys et al. 1968; Holmberg et al. 1988; Plant et al. 2004). Most interestingly, the EMG activity of TA and CT muscles during rat USV resembles the activity in other mammals during phonation (see Table 4 for examples). The stereotypic vocal repertoire in rats (see, e.g., Wright et al. 2010) suggests that vocal and respiratory motor patterns must be under precise control, affording the opportunity to further explore the neural mechanisms enabling this precise control.
The contraction rates of CT and TA muscles were as high as 150 Hz. Contraction rates of larynx muscles of other mammals rarely exceed 40 Hz (see, e.g., Alipour et al. 1987; Hast 1967; Hirose et al. 1969; Martensson and Skoglund 1964); only in bats are contraction rates comparable (Suthers and Fattu 1973). This confirms the fast-twitch kinetics suspected from histochemical and in vitro investigations (DelGaudio et al. 1995; Hinrichsen and Dulhunty 1981; McMullen and Andrade 2006). Reflex responses of rat laryngeal muscles in the range of 10–30 Hz, for example, to mechanical stimulation (see, e.g., Hinrichsen and Dulhunty 1981), were already known. The present study extends the data to the vocal function and shows that very rapid muscle activation patterns are important for voice quality. These fast kinetics place the rat's TA and CT muscles in a class with other superfast muscles involved in sound production (Elemans et al. 2008; Fine et al. 2001; Rome et al. 1996; Uchida et al. 2010).
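As a rough illustration of what these rates imply, a burst rate of 150 Hz corresponds to an inter-burst interval of about 6.7 ms, compared with 25 ms at 40 Hz. The following minimal Python sketch shows how a mean burst rate could be derived from EMG burst onset times. The function name `burst_rate_hz` and the evenly spaced onset times are hypothetical and serve illustration only; they are neither the analysis procedure nor the data of the present study.

```python
def burst_rate_hz(onsets_s):
    """Return the mean burst rate (Hz) from sorted burst onset times in seconds."""
    if len(onsets_s) < 2:
        raise ValueError("need at least two burst onsets")
    # Inter-burst intervals between consecutive onsets
    intervals = [b - a for a, b in zip(onsets_s, onsets_s[1:])]
    mean_interval = sum(intervals) / len(intervals)
    return 1.0 / mean_interval


# Hypothetical, evenly spaced onset times (not measured data)
fast_onsets = [i / 150 for i in range(10)]  # bursts spaced 1/150 s apart
slow_onsets = [i / 40 for i in range(10)]   # bursts spaced 1/40 s apart

fast_rate = burst_rate_hz(fast_onsets)
slow_rate = burst_rate_hz(slow_onsets)
```

In practice, onset times would first have to be extracted from the rectified EMG signal, for example with an amplitude threshold; that step is deliberately left out of this sketch.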
The consequences of the similarities and differences between USV and sound production by flow-induced vocal fold oscillation, for example, for somatosensory control, are little understood (Burgdorf et al. 2007; Jürgens 2002; Yajima et al. 1980). It is, for example, likely that, as in other species (Fenzl and Schuller 2005; Fine and Perini 1994; Jürgens 2000), differences in neural control circuits account for functionally and acoustically different calls. The rat's vocal production shares important features such as the precise control of vocal and respiratory muscles with other mammals, emphasizing the importance of the rat and its vocal behavior as a model system.
This work was supported by a University of Utah seed grant (VP528) and in part by National Institute on Deafness and Other Communication Disorders Grants R01-DC-006876 and R01-DC-008612.
No conflicts of interest, financial or otherwise, are declared by the author(s).
I am grateful to Franz Goller for support and encouragement.
Copyright © 2011 the American Physiological Society