EDITORIAL FOCUS
Consider the college freshman with an hour to kill between classes. He can go to either the coffee shop in the student union or the one in the library. On several occasions he has run into friends or struck up enjoyable conversations with strangers in the library canteen but rarely in the student union. Although he visits both venues throughout the quarter and often studies alone in both places, he is more likely to be found in the library coffee shop. On the Friday after Thanksgiving, he gets stuck on campus while all of his friends are away. He goes to the library that day, although he drinks his coffee alone.
Conditioned place preference is an animal version of such contextual reinforcement learning. By focusing on the "where" of reinforcement learning, rather than when or how, this paradigm allows investigators to examine how reward-related experience is associated with place, or context, and subsequently biases behavioral choices. In conditioned place preference, animals are placed in an apparatus divided into two (or three) chambers. Preference for one chamber over the other(s) is assessed by measuring the amount of time spent in each chamber. During the conditioning phase, the opening between the chambers is blocked, and animals are exposed to each chamber separately while receiving a particular experience in each chamber. For example, the subject may receive morphine in the left chamber and saline in the right. During the preference test, the animal is allowed to roam freely between the chambers, and no experience (e.g., morphine or saline) is provided. Thus the animal's preference during the test phase is based solely on previous associations with each chamber.
Conditioned place preference, as widely used, focuses on the acquisition of preference (frequently assessing the motivational value of drugs), while the discrete behaviors that comprise the expression of preference are generally unexamined (Tzschentke 1998). German and Fields reframe the conditioned place preference task as an exploratory decision-making process for the rodent. At any given moment the animal has to decide whether to stay put or to move, and if it moves, it must decide where to go. "Preference" arises from a series of behavioral choices that reflect an exploratory strategy. Examining the duration of visits in each chamber and transitions between chambers, the authors found that the animals do not favor the morphine-paired room with longer visits (which would be observed as a shifted Gaussian distribution of visit duration), but rather they exhibit an increased exit probability (a shifted exponential distribution of exits) from the saline-paired room. This is curious in the context of addiction and relapse. The rats are not merely associating the morphine-paired chamber with good times and lingering in the chamber but are actively decreasing their exploration of the saline-paired room and thereby, given the restricted choices, increasing their rate of return to the morphine-paired chamber. When the previous experience does not occur in the morphine-paired room, however, they continue to explore. As German and Fields put it, "they have reason to go, but no reason to stay." This reflects the experience of an addict during early sobriety: nondrug-related cues and contexts are not rewarding, but neither are drug-related cues and contexts in the absence of drug. What is counterintuitive is that the actual change in goal-directed, exploratory behavior occurs in the previously nonrewarded context. This finding is provocative and suggests an alternative to conventional thinking about relapse: an important component of drug-seeking behavior during abstinence may be an increased propensity to "give up" on nonrewarded contexts; that is, a decrease in tolerance for and exploration of previously nonrewarded environments may be a salient underlying behavioral process promoting relapse.
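The reasoning that a higher exit probability from one room is enough to produce a time-based "preference" for the other can be illustrated with a toy simulation. The following is a minimal sketch, not the model published by German and Fields: it assumes a two-chamber apparatus, discrete time steps, and a constant per-step exit probability in each room (so visit durations are geometric, the discrete analog of an exponential distribution). The function name and all probability values are hypothetical and chosen only for illustration.

```python
import random


def simulate_test_session(p_exit_morphine, p_exit_saline, n_steps=100_000, seed=0):
    """Toy two-chamber test session (illustrative assumption, not the published model).

    At every time step the animal leaves its current chamber with a fixed,
    chamber-specific probability and enters the other chamber.
    """
    rng = random.Random(seed)
    p_exit = {"morphine": p_exit_morphine, "saline": p_exit_saline}
    other = {"morphine": "saline", "saline": "morphine"}

    room = "saline"                      # arbitrary starting chamber
    time_in = {"morphine": 0, "saline": 0}
    visits = {"morphine": 0, "saline": 0}
    visits[room] = 1

    for _ in range(n_steps):
        time_in[room] += 1
        if rng.random() < p_exit[room]:  # decide whether to leave this step
            room = other[room]
            visits[room] += 1

    return {
        "fraction_morphine": time_in["morphine"] / n_steps,
        "mean_visit_morphine": time_in["morphine"] / max(visits["morphine"], 1),
        "mean_visit_saline": time_in["saline"] / max(visits["saline"], 1),
    }


if __name__ == "__main__":
    # Hypothetical values: before conditioning both rooms share one exit
    # probability; afterwards only the saline-paired room's exit probability rises.
    before = simulate_test_session(p_exit_morphine=0.05, p_exit_saline=0.05)
    after = simulate_test_session(p_exit_morphine=0.05, p_exit_saline=0.10)
    print("before conditioning:", before)
    print("after conditioning: ", after)
```

Under these assumptions the long-run fraction of time in the morphine-paired room is p_exit_saline / (p_exit_morphine + p_exit_saline), so raising only the saline-room exit probability shifts time toward the morphine-paired room while leaving the mean duration of visits to that room unchanged.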
In conditioned place preference, there are no distinct, temporally discrete cues predicting reward, nor are there any instrumental behaviors with which the animal brings about reward. In fact, during the testing phase, there is no reward. The measured behavior, preference, is an aggregate statistic rather than a discrete response. Therefore there is no cue, learned response, or reward consumption within which to frame a standard electrophysiological analysis. The model described by German and Fields overcomes this difficulty and offers a discrete behavior to be examined.
Conditioned place preference is often construed as a form of Pavlovian conditioning where the CS is a complex compound stimulus (i.e., context) and the CR is simply approach behavior. Although theoretically satisfying, this view obscures potentially interesting questions about place learning (see Bardo and Bevins 2000). How is the compound CS represented in the brain? Are the various cues comprising a "place" coalesced into a configuration of stimuli and encoded by specific cells representing context? In the absence of a discrete CR, such as salivating, what response is associated with the CS (see Konorski 1967)? In a companion paper in this issue of the Journal of Neurophysiology (p. 2094-2106), German and Fields enter into this largely uncharted territory. The nucleus accumbens (NAc) plays a critical role in conditioned place preference (Bardo 1998). By examining the correlation of NAc activity with transitions between rooms before and after conditioning, German and Fields provide a first glimpse of the mechanics of the accumbens contribution to place conditioning. They find a population of neurons that exhibit room-specific activity after conditioning, including neurons excited by the saline-paired room and inhibited by the morphine-paired room. The activities associated with each room are tonic, sustained throughout the duration of a room visit, and change immediately on transition to a different location. Thus this activity constitutes not a direct motor signal preceding transitions but a bias operating during a visit, which the authors argue determines the probability of exiting from the chamber: a "context-dependent preference signal." Far from resolving the mechanism of conditioned place preference, this first direct look at the underlying neural activity promises to open Pandora's box and stimulate both debate and future studies.
The apparent simplicity of the subjective experience of reward belies the complexity of reward and reward learning as neural processes. Recent computational models have enriched our theoretical understanding of reinforcement learning (Montague et al. 2004). German and Fields present a model of rats' behavior in conditioned place preference that accurately describes and simulates their actual behavior; it does not, however, explain the phenomenon it describes. Their model will stand as an invitation to bring the same theoretical rigor and formal computational investigation to conditioned place preference as has been afforded other learning tasks, potentially yielding a richer, more comprehensive understanding of reinforcement learning generally. Although the initial insight underlying these two papers appears humble enough (asking what specific behaviors comprise preference), this simple beginning promises to open fertile new areas of investigation.
Departments of Neurobiology, Pharmacology, and Physiology, University of Chicago, Chicago, Illinois
Address for reprint requests and other correspondence: J. Beeler (E-mail: jabeeler{at}uchicago.edu)
REFERENCES
Bardo M. Neuropharmacological mechanisms of drug reward: beyond dopamine in the nucleus accumbens. Crit Rev Neurobiol 12: 37-67, 1998.
Bardo M, Bevins R. Conditioned place preference: what does it add to our preclinical understanding of drug reward? Psychopharmacology 153: 31-43, 2000.
German P, Fields H. How prior reward experience biases exploratory movements: a probabilistic model. J Neurophysiol 97: 2083-2093, 2007a.
German P, Fields H. Rat nucleus accumbens neurons persistently encode locations associated with morphine reward. J Neurophysiol 97: 2094-2106, 2007b.
Konorski J. Integrative Activity of the Brain. Chicago: University of Chicago Press, 1967.
Montague P, Hyman S, Cohen J. Computational roles for dopamine in behavioral control. Nature 431: 761-767, 2004.
Tzschentke T. Measuring reward with the conditioned place preference paradigm: a comprehensive review of drug effects, recent progress and new issues. Prog Neurobiol 56: 613-672, 1998.