INNOVATIVE METHODOLOGY
1Neuroscience Statistics Research Laboratory, Department of Anesthesia and Critical Care, Massachusetts General Hospital, Boston; 2Division of Health Sciences and Technology, Harvard Medical School/Massachusetts Institute of Technology, Cambridge, Massachusetts; and 3Department of Neuroscience, University of Pittsburgh, Pittsburgh, Pennsylvania
| ABSTRACT |
|---|
| INTRODUCTION |
|---|
Second, significant between-subject variation in responses is typical in learning studies. As a consequence, learning experiments often require multiple subjects to execute the same task to characterize the features of the learning process common to the population (Dias et al. 1997; Eichenbaum et al. 1986; Fox et al. 2003; Jonasson et al. 2004; Maclean et al. 2001; Roman et al. 1993; Rondi-Reig et al. 2001; Stefani et al. 2003; Whishaw and Tomie 1991). Instead of formally characterizing between- and within-subject variation, current analyses of population learning compute only simple proportions of correct responses within a fixed number of trials, across multiple subjects. Furthermore, these analyses use definitions of learning that have been shown to be suboptimal (Smith et al. 2004). These shortcomings of current population analyses of learning have not been addressed.
Use of random effects models to estimate population and individual characteristics from the time series measurements of multiple subjects executing the same protocol is an established paradigm in statistics (Fahrmeir and Tutz 2001; Jones 1993; Laird and Ware 1982; Stiratelli et al. 1984). For learning studies, the random effects approach offers an efficient way to estimate the population learning curve, as well as the individual learning curve for each subject. Although random effects models have been widely applied in medical, epidemiological, and sample survey research, they have not been used to analyze population learning in behavioral experiments.
We introduced a state-space framework for conducting dynamic analyses of learning in behavioral experiments from time series of binary responses (Smith et al. 2004). The framework provided an estimate of the learning curve and its confidence intervals, gave a precise definition of the learning trial, and characterized learning more accurately and reliably in simulated and actual learning experiments than several currently accepted methods. To develop a dynamic approach to characterize simultaneously population and individual learning performance from time series of binary responses, we extend this framework by defining a state-space model with random effects. We present definitions of the learning curve, learning trial, and the ideal observer curve for the population and individuals, and dynamic estimates of between- and within-group differences in learning. We illustrate the new approach by analyzing learning in a group of control rats and a group of rats treated with an NMDA (N-methyl-D-aspartate) receptor antagonist in a set-shift task. We also show how the paradigm may be used to design learning experiments optimally.
| METHODS |
|---|
We assume that learning is a dynamic process that can be studied with the state-space framework (Kitagawa and Gersch 1996; Smith and Brown 2003). The state-space model consists of 2 equations: a state equation and an observation equation. The state equation defines an unobservable learning process whose evolution is tracked across the trials in the experiments. Such state models with unobservable processes are often referred to as hidden Markov or latent process models (Fahrmeir and Tutz 2001; Roweis and Ghahramani 1999; Smith and Brown 2003; Smith et al. 2004). Because our objective is to characterize learning for the population and the individual subjects in our study, we formulate a state-space random effects (SSRE) model. That is, we assume that there is a population state learning process and that the learning processes for the individual subjects are drawn from a probability distribution, which has the population learning process as its mean.
We formulate the population and individual state learning processes so that they increase as learning occurs and decrease when it does not occur. From the learning state processes we compute population and individual learning curves that define the probability of a correct response as a function of trial number. The observation equations complete the state-space model setup and define how the observed data relate to the unobservable learning state processes. The data we observe in the learning experiment are the series of correct and incorrect responses for each subject as a function of trial number. Therefore, the objective of the analysis is to estimate the population and individual learning state processes and thus the population and individual learning curves from the observed data.
We conduct our analysis of the experiment from the perspective of an ideal observer. That is, given the state and observation models, we estimate the learning state processes at each trial after seeing the outcomes of all the trials of each subject in the experiment. This approach is different from estimating learning from the perspective of the subject executing the task, in which case the inference about when learning occurs is based on the data up to the current trial (Kakade and Dayan 2002; Yu and Dayan 2003). Identifying when learning occurs is therefore a 2-step process. In the first step, we estimate from the observed data the learning state process and thus the learning curve. In the second step, we estimate when learning occurs by computing the confidence intervals for the population and individual learning curves or, equivalently, by computing for each trial the ideal observer's assessment of the probability that each subject and the population perform better than chance.
To define the SSRE model, we assume that $J$ subjects participate in a learning experiment with $K$ trials, where we index the trials by $k$ for $k = 1, \ldots, K$ and the subjects by $j$ for $j = 1, \ldots, J$. To define the observation equation, we let $n_k^j$ denote the response on trial $k$ from subject $j$, where $n_k^j = 1$ is a correct response and $n_k^j = 0$ is an incorrect response. We let $p_k^j$ denote the probability of a correct response on trial $k$ from subject $j$. We assume that the probability of a correct response on trial $k$ from subject $j$ is governed by an unobservable learning state process $x_k$, which characterizes the dynamics of learning as a function of trial number. At trial $k$, for subject $j$, the observation model defines the probability of observing $n_k^j$ (i.e., either a correct or incorrect response), given the value of the state process $x_k$. The observation model can be expressed as the Bernoulli probability mass function
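$$\Pr(n_k^j \mid x_k) = (p_k^j)^{n_k^j}\,(1 - p_k^j)^{1 - n_k^j} \tag{2.1}$$

where $p_k^j$ is defined by the logistic equation

$$p_k^j = \frac{\exp(\beta^j x_k)}{1 + \exp(\beta^j x_k)} \tag{2.2}$$

and $\beta^j$ is the learning modulation parameter for subject $j$. We define the random effect component of our state-space model by assuming that the modulation parameters $\beta^j$ are independent Gaussian random variables with mean $\beta_0$ and variance $\sigma_\beta^2$; that is, the vector $\beta = (\beta^1, \ldots, \beta^J)$ has covariance matrix $\sigma_\beta^2 I_{J \times J}$, where $I_{J \times J}$ is a $J \times J$ identity matrix. Therefore, we define the probability of a correct response for the population as

$$p_k = \frac{\exp(\beta_0 x_k)}{1 + \exp(\beta_0 x_k)} \tag{2.3}$$

The state equation is the Gaussian random walk

$$x_k = x_{k-1} + \varepsilon_k \tag{2.4}$$

where the $\varepsilon_k$ are independent Gaussian random variables with mean 0 and variance $\sigma_\varepsilon^2$.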
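As an illustration of how the state and observation equations fit together, the following minimal sketch simulates binary responses for a cohort. It assumes the logistic link $p_k^j = \exp(\beta^j x_k)/[1 + \exp(\beta^j x_k)]$, a Gaussian random-walk state, and an initial state of zero; the function name `simulate_ssre` and its arguments are hypothetical and are not taken from the Matlab code that accompanies the paper.

```python
import math
import random
from typing import List, Tuple


def logistic(z: float) -> float:
    """Map a real-valued quantity to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def simulate_ssre(
    num_subjects: int,
    num_trials: int,
    beta0: float,
    sigma_beta: float,
    sigma_eps: float,
    seed: int = 0,
) -> Tuple[List[float], List[float], List[List[int]]]:
    """Simulate an assumed SSRE model; returns (betas, states, responses).

    responses[k][j] is the 0/1 response of subject j on trial k + 1.
    """
    rng = random.Random(seed)
    # Random effects: one modulation parameter per subject, fixed over trials.
    betas = [rng.gauss(beta0, sigma_beta) for _ in range(num_subjects)]
    x = 0.0  # assumption: the state starts at 0, i.e. chance performance of 0.5
    states: List[float] = []
    responses: List[List[int]] = []
    for _ in range(num_trials):
        # State equation: Gaussian random walk shared by the whole cohort.
        x = x + rng.gauss(0.0, sigma_eps)
        states.append(x)
        # Observation equation: independent Bernoulli draw for each subject.
        responses.append(
            [1 if rng.random() < logistic(b * x) else 0 for b in betas]
        )
    return betas, states, responses
```

Every subject shares the same state trajectory, and only the modulation parameter differs between subjects, which is the sense in which the individual processes are tied to the population process.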
An important concept that underlies all SSRE analyses is exchangeability, which means that the response data from each subject in a cohort provide information about the performance of every other subject in the cohort (Gelman et al. 1995). Therefore, the response data from each subject can be used to estimate the population learning curve and to estimate the learning curves for every subject in that cohort. To use the SSRE model optimally, it is key to define subgroups in the experiment for which exchangeability is a reasonable assumption. We illustrate this point in our analyses in the RESULTS.
In the learning experiment, we set the number of trials $K$ and we observe $N_{1:K} = \{n_1, \ldots, n_K\}$, the responses for each of the $K$ trials, where $n_k = \{n_k^1, \ldots, n_k^J\}$ is the set of responses from the $J$ subjects on trial $k$. The objective of our analysis is to estimate $x = \{x_1, \ldots, x_K\}$, $\beta = \{\beta^1, \ldots, \beta^J\}$, and $\theta = (\beta_0, \sigma_\beta^2, \sigma_\varepsilon^2)$ from these data and, from them, to estimate $p_k^j$, the probability of a correct response for subject $j$, and $p_k$, the probability of a correct response for the population, for $j = 1, \ldots, J$ and $k = 1, \ldots, K$. If we can estimate $x$, $\beta$, and $\theta$ then, by Eq. 2.2, we can compute the probability of a correct response as a function of trial number given the data for each of the $J$ subjects and the population. Because $x$ and $\beta$ are unobservable and $\theta$ is an unknown parameter, we use the Expectation-Maximization (EM) algorithm to estimate them by maximum likelihood (Dempster et al. 1977). The EM algorithm is a well-known procedure for performing maximum-likelihood estimation when there is an unobservable process or missing observations. We used the EM algorithm to estimate state-space models from point process observations with linear Gaussian state processes (Smith and Brown 2003). Our EM algorithm is an extension of the algorithm in Smith et al. (2004), and its derivation is given in APPENDIX A. We denote the maximum-likelihood estimate of $\theta$ as $\hat{\theta} = (\hat{\beta}_0, \hat{\sigma}_\beta^2, \hat{\sigma}_\varepsilon^2)$.
Estimating individual and population learning curves
Given the maximum-likelihood estimates of $x$ and $\theta$, we can compute for each $x_k$ the quantity $x_{k|K}$, the smoothing algorithm (Eqs. A16–A18) estimate of the population learning state at trial $k$. It is the estimate of $x_k$ given $N_{1:K}$, all the data in the experiment, with the parameter $\theta$ replaced by its maximum-likelihood estimate, where the notation $x_{k|K}$ means the learning state process estimate at trial $k$ given the data up through trial $K$. Similarly, the smoothing algorithm estimate of the individual learning modulation parameters is the estimate of $\beta$ given $N_{1:K}$ with the parameter $\theta$ replaced by its maximum-likelihood estimate. We denote the estimate of the learning modulation parameters as $\beta_{K|K} = (\beta_{K|K}^1, \ldots, \beta_{K|K}^J)$, given in Eq. A16 of APPENDIX A. The smoothing algorithm gives the ideal observer estimate of the population learning states and the individual modulation parameters.
The smoothing algorithm estimate of the learning state at each trial $k$ is the Gaussian random variable with mean $x_{k|K}$ (Eq. A16) and variance $\sigma_{k|K}^2$ (Eq. A18). The smoothing algorithm estimate of $\beta$ is the Gaussian random variable with mean $\beta_{K|K}$ and covariance matrix computed from $W_{K|K}$, defined in Eq. A18 of APPENDIX A. The individual learning curve for subject $j$ is computed by Eq. 2.2 at the maximum-likelihood estimates of $x_k$, $\beta^j$, and $\theta$ and is defined as
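Individual learning curve estimate

$$p_{k|K}^j = \frac{\exp(\beta_{K|K}^j\, x_{k|K})}{1 + \exp(\beta_{K|K}^j\, x_{k|K})} \tag{2.5}$$

Population learning curve estimate

$$p_{k|K} = \frac{\exp(\hat{\beta}_0\, x_{k|K})}{1 + \exp(\hat{\beta}_0\, x_{k|K})} \tag{2.6}$$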
As Eqs. 2.5 and 2.6 show, our approach to estimating population learning curves does not simply compute the average of the state-space estimates of the individual learning curves. Instead, using the exchangeability assumption, we estimate the population and individual learning curves simultaneously by extending the EM algorithm we previously developed to estimate individual learning curves (Smith et al. 2004). The key technical point that makes this extension possible is the augmented state-space model in Eq. A8 (Eden et al. 2004; Jones 1993). This model represents the common learning state process and the individual learning modulation parameters in a single $(J+1)$-dimensional state equation so that the probability density of the current learning state depends on the value of the previous state (Eq. 2.4), whereas the modulation parameters have the same probability density (see text below Eq. 2.2) for the entire experiment. In other words, each learning modulation parameter is a random effect (variable) specific to each subject and each has the same probability density at each trial in the experiment. The probability density of the learning state variable changes from trial to trial depending on the value of the previous learning state variable. By using the augmented state-space to represent the properties of our model, we compute in the E-step of the EM algorithm (Eqs. A16–A18) both the best estimates of the state variable at each trial and the subject-specific modulation parameters given all the responses of the cohort recorded in the experiment (Jones 1993).
Estimating a common population learning state from the binary responses of subjects belonging to the same cohort is analogous to decoding a biological signal from the spiking activity of an ensemble of neurons using a state-space model to characterize the signal and point process models to represent the spiking activity. For this reason, the filter algorithm (Eqs. A9–A11) and smoothing algorithm (Eqs. A16–A18) used in the E-step of our EM algorithm are respectively the analogs of the Bayes' filter and the Bayes' smoother used in Brown et al. (1998) to decode the position of a rat in its environment from the ensemble spiking activity of place cell neurons in the CA1 region of the animal's hippocampus.
To construct confidence intervals for the learning curves, we must obtain their probability densities. For the population learning curve we can compute the probability density of any $p_{k|K}$ using Eq. 2.2 and the standard change of variables formula from elementary probability theory. That is, the smoothing algorithm estimates the state as the Gaussian random variable with mean $x_{k|K}$ (Eq. A16) and variance $\sigma_{k|K}^2$ (Eq. A18). Because the population learning curve estimate is a function of this random variable, applying the change of variables formula to the Gaussian probability density with mean $x_{k|K}$ and variance $\sigma_{k|K}^2$ yields (Smith et al. 2004)

$$f(p \mid \hat{\beta}_0, x_{k|K}, \sigma_{k|K}^2) = \left[(2\pi\sigma_{k|K}^2)^{1/2}\, \hat{\beta}_0\, p\,(1-p)\right]^{-1} \exp\left\{-\frac{1}{2\sigma_{k|K}^2}\left[\hat{\beta}_0^{-1}\log\left(\frac{p}{1-p}\right) - x_{k|K}\right]^2\right\} \tag{2.7}$$

The individual learning curve for subject $j$ is a function of the 2 random variables $\beta^j$ and $x_k$, and the joint distribution of these 2 random variables, given the data $N_{1:K}$, is given by the smoothing algorithm. Because this learning curve is a function of 2 random variables, it is more difficult to derive its probability density in closed form. Therefore, we compute it by the Monte Carlo algorithm in APPENDIX B.
The ideal observer curve and the ideal observer learning trial
Having estimated the learning curve, we compute for each trial the ideal observer's assessment of the probability that the subject or the population performs better than chance. We term this function the ideal observer curve. The ideal observer curve for individual subject $j$ is $\Pr(p_{k|K}^j > p_0)$, where $p_{k|K}^j$ is defined in Eq. 2.5, $p_0$ is the probability of a correct response by chance in the experiment, and $k = 1, \ldots, K$. We compute this curve for each of the $J$ subjects. The ideal observer curve for the population is $\Pr(p_{k|K} > p_0)$, which is the probability that the population performs better than chance for trials $k = 1, \ldots, K$. The probability that the population performs better than chance on trial $k$ is computed using the smoothing algorithm and Eq. 2.7, whereas the ideal observer curve for each individual is computed using the Monte Carlo algorithm in APPENDIX C. An important advantage of the ideal observer curve is that it provides, together with the learning curve, a dynamic assessment of learning in terms of how sure an ideal observer is that learning has occurred on each trial in the experiment.
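For the population, this probability has a closed form under two assumptions made here for illustration: the population curve is the logistic function of $\hat{\beta}_0 x_k$, and $\hat{\beta}_0 > 0$. Then $p_{k|K} > p_0$ exactly when $x_k > \log[p_0/(1-p_0)]/\hat{\beta}_0$, so the ideal observer curve is a Gaussian tail probability of the smoothed state. The sketch below uses hypothetical names and takes the smoothed means and standard deviations as given inputs.

```python
import math
from typing import List, Sequence


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def population_ideal_observer(
    state_means: Sequence[float],
    state_sds: Sequence[float],
    beta0: float,
    p0: float,
) -> List[float]:
    """Return Pr(p_k > p0) for each trial under an assumed logistic link.

    state_means[k] and state_sds[k] are the smoothed mean and standard
    deviation of the learning state on trial k + 1.
    """
    if beta0 <= 0.0:
        raise ValueError("this sketch assumes a positive beta0")
    if not 0.0 < p0 < 1.0:
        raise ValueError("p0 must lie strictly between 0 and 1")
    # State value at which the learning curve equals the chance level p0.
    threshold = math.log(p0 / (1.0 - p0)) / beta0
    return [
        1.0 - normal_cdf((threshold - mean) / sd)
        for mean, sd in zip(state_means, state_sds)
    ]
```

With $p_0 = 0.5$ the threshold is zero, so the curve reduces to the probability that the smoothed state is positive.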
Contrary to the approach taken by the current hypothesis-testing methods for analyzing learning, this analysis makes explicit the fact that learning is not a yes-no process (Smith et al. 2004). Nevertheless, for the purpose of making comparisons with these and other methods, it is important to define a learning trial. We define the population (individual) learning trial as the earliest trial in the experiment such that the ideal observer is reasonably certain that the performance of the population (individual) is better than chance from that trial through the balance of the experiment. Because we define learning as performance that is better than chance, identifying a learning trial indicates that learning has occurred. For our analyses we define a level of reasonable certainty as 0.95 and term this trial the ideal observer learning trial with level of certainty 0.95 [IO(0.95)].
In terms of the ideal observer learning curve, we define the learning trial as follows. Given a level of certainty of 0.95, the learning trial of subject $j$ is the earliest trial $r$ such that $\Pr(p_{k|K}^j > p_0) \geq 0.95$ for all trials $k \geq r$. Given a level of certainty of 0.95, the population learning trial is the earliest trial number $r$ such that $\Pr(p_{k|K} > p_0) \geq 0.95$ for all trials $k \geq r$.
For either the individual or the population learning curve, the ideal observer learning trial can be computed from the lower confidence bounds for $p_k^j$ and $p_k$, respectively. The ideal observer learning trial for the individual (population) is the first trial on which the lower 95% confidence bound for the probability of a correct response, $p_k^j$ ($p_k$), is greater than chance $p_0$ and remains above $p_0$ for the balance of the experiment.
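The definition of the learning trial translates directly into a backward scan over an ideal observer curve. The sketch below is illustrative only; the function name is hypothetical, trials are numbered from 1, and a threshold of "at least" the level of certainty is assumed.

```python
from typing import Optional, Sequence


def ideal_observer_learning_trial(
    io_curve: Sequence[float], certainty: float = 0.95
) -> Optional[int]:
    """Return the earliest 1-based trial r such that io_curve stays at or
    above `certainty` from trial r through the last trial, or None if the
    final trial itself falls below the level of certainty.
    """
    learning_trial: Optional[int] = None
    # Scan from the last trial backwards; stop at the first trial that
    # breaks the "remains above for the balance of the experiment" condition.
    for trial in range(len(io_curve), 0, -1):
        if io_curve[trial - 1] >= certainty:
            learning_trial = trial
        else:
            break
    return learning_trial
```

Returning `None` rather than the last trial number keeps "no learning trial identified" distinct from "learned on the final trial".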
Comparing learning between and within groups
An objective of population learning studies is to compare learning between 2 or more groups. This comparison can be carried out in a straightforward way in our paradigm because we have the probability distribution associated with each learning curve (Eq. 2.7). Therefore, given any 2 learning curves we can compute at each trial, the probability that curve one is greater than curve two, or vice versa, and plot this probability as a function of trial number. Therefore, we can state for each trial how sure we are that one curve is greater than the other, and test hypotheses about differences in learning between the 2 groups. We explain in APPENDIX C how we compute these comparison probabilities by Monte Carlo from the probability models for learning curves of 2 different groups.
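A between-group comparison at a single trial can be approximated by simulation. The following sketch is not the algorithm of APPENDIX C; it assumes that the two groups are independent, that each group's smoothed state is Gaussian, that the learning curve is the logistic function of $\hat{\beta}_0 x_k$, and that $\hat{\beta}_0$ is held fixed at its estimate. All names are hypothetical.

```python
import math
import random


def logistic(z: float) -> float:
    """Map a real-valued quantity to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def prob_curve_a_exceeds_b(
    mean_a: float,
    sd_a: float,
    beta0_a: float,
    mean_b: float,
    sd_b: float,
    beta0_b: float,
    num_draws: int = 10000,
    seed: int = 0,
) -> float:
    """Monte Carlo estimate of Pr(p_A > p_B) at one trial.

    (mean, sd) describe each group's smoothed Gaussian state on that trial.
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(num_draws):
        # Draw a state for each group, then map it to a response probability.
        p_a = logistic(beta0_a * rng.gauss(mean_a, sd_a))
        p_b = logistic(beta0_b * rng.gauss(mean_b, sd_b))
        if p_a > p_b:
            wins += 1
    return wins / num_draws
```

Evaluating this function at every trial and plotting the result against trial number gives the kind of trial-by-trial comparison curve described above.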
Another objective of population and individual learning studies is to compare learning within a group. This comparison can also be carried out in a straightforward way in our paradigm because we estimate the joint probability distribution associated with each learning curve (Eq. 2.7). Therefore, given any 2 trials we can compute the probability that the population (individual) performance at one trial is greater than the performance at any other trial. A plot of this 2-dimensional comparison for all trial pairs illustrates how sure we are that performance on one trial is greater than performance on any other trial. We explain in APPENDIX D how we compute these comparison probabilities by Monte Carlo from the joint probability distribution of the learning states for a given group.
The Matlab (MathWorks, Natick, MA) code for the algorithms we present here can be downloaded from our website: https://neurostat.mgh.harvard.edu/BehavioralLearning/Matlabcode.
Learning analysis using the 8-trial blocks and the 8 consecutive correct response methods
Stefani et al. (2003) estimated population learning curves from group responses in learning experiments by computing the fraction of correct responses across all animals in nonoverlapping blocks of 8 trials. We termed this method the 8-trial blocks (8TB) method. This gave a 10-point estimated learning curve for each group. Stefani and colleagues considered an animal to have learned the task when it gave 8 consecutive correct responses. We termed this method the 8 consecutive correct responses (8CCR) method. We compared the 8TB method with our state-space random-effects method for estimating the population learning curve and the 8CCR method with our IO(0.95) method for identifying the learning trial.
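Both comparison methods are simple enough to sketch. The function names below are hypothetical, and one detail is an assumption: the 8CCR learning trial is taken to be the trial on which the eighth consecutive correct response occurs, which the text does not specify.

```python
from typing import List, Optional, Sequence


def eight_trial_blocks(
    responses: Sequence[Sequence[int]], block_size: int = 8
) -> List[float]:
    """Fraction of correct responses across all animals in each
    nonoverlapping block of trials; responses[j][k] is the 0/1 response
    of animal j on trial k + 1.
    """
    num_trials = len(responses[0])
    curve: List[float] = []
    for start in range(0, num_trials - block_size + 1, block_size):
        block = [
            r for animal in responses for r in animal[start:start + block_size]
        ]
        curve.append(sum(block) / len(block))
    return curve


def consecutive_correct_trial(
    animal: Sequence[int], run_length: int = 8
) -> Optional[int]:
    """1-based trial on which the run of `run_length` consecutive correct
    responses is first completed, or None if the criterion is never met.
    """
    run = 0
    for trial, response in enumerate(animal, start=1):
        run = run + 1 if response == 1 else 0
        if run == run_length:
            return trial
    return None
```

With 80 trials and a block size of 8, `eight_trial_blocks` returns the 10-point curve described above.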
Experimental protocol for a set-shift task
To illustrate the performance of our method on actual experimental data, we analyzed the responses from 2 groups of rats performing a set-shift task. In the set-shift task the animal learned one task during the first phase (Set 1) and then, during a second phase (Set 2), had to shift and learn a second task, with the confound that the response options of the first task were present as the animal learned the second task (Stefani et al. 2003). The task consisted of 2 discriminations, performed on consecutive days in the same 4-arm maze. The arms of the maze differed along 2 stimulus dimensions: texture and brightness. Texture was either rough or smooth and brightness was either light or dark. For each trial, one arm was blocked so that the maze was in a T-configuration. Thus, from each start arm a rat had a choice of a left or right turn, and simultaneously by design, a choice between rough and smooth, and a choice between light and dark (Fig. 1). Each trial began from a different start arm, chosen pseudo-randomly so that in each block of 8 consecutive trials there were 2 starts from each of the 4 arms.
Twenty minutes before beginning training on Set 2, each rat received a bilateral microinjection into the medial prefrontal cortices of either a vehicle solution (145 mM NaCl, 2.7 mM KCl, 1.0 mM MgCl2, and 1.2 mM CaCl2) or the vehicle solution plus the NMDA-receptor antagonist MK801 at a dose of 3 µg per hemisphere. The hypothesis tested by Stefani and colleagues was that treatment with MK801 should alter the ability of the rats to execute the set-shift compared to the animals receiving the vehicle. We termed animals receiving only the vehicle solution the Vehicle group and animals receiving the vehicle solution with MK801 the Treatment group.
| RESULTS |
|---|
To illustrate application of our methods, we analyzed the learning behavior of the Vehicle and Treatment groups from Set 2 of the Brightness-Texture part of the set-shift experiment. The trial responses are shown in Fig. 2 as blue and red marks corresponding respectively to correct and incorrect responses. Figure 2, A and B (C and D) shows the responses from the Vehicle (Treatment) group. We subdivided each group according to the rewarded arm in Set 1. Figure 2A (2C) shows the Vehicle (Treatment) animals rewarded for the light arm in Set 1 and Fig. 2B (2D) shows the Vehicle (Treatment) animals rewarded for the dark arm in Set 1. We denote the subgroups Vehicle light, Vehicle dark, Treatment light, and Treatment dark.
The performance of the 9 Treatment animals began close to or slightly below chance with 4 of 9, 3 of 9, 4 of 9, and 3 of 9 correct responses in the first 4 trials (Fig. 2, C and D). At the end of the 80-trial experiment, the performance of the Treatment group was greater than chance, but with many more incorrect responses than the Vehicle group. Only animal 3 (Fig. 2C) in the Treatment light subgroup had an uninterrupted sequence of correct responses at the end of the experiment. This sequence began at trial 60, much later than those of animals 2, 3, 4, and 5 in the Vehicle light subgroup (Fig. 2A).
We performed 3 analyses using the state-space paradigm: 1) a state-space (SS) analysis in the Vehicle and the Treatment groups in which the response data are pooled across all the animals in each group (Smith et al. 2004); 2) a state-space analysis of the response data of each individual animal; and 3) a state-space random effects (SSRE) analysis on the response data within each of the 4 subgroups: Vehicle light, Vehicle dark, Treatment light, and Treatment dark. The state-space analysis of the pooled response data illustrated population learning curve estimation under the assumption that there was no between-subject variation. The state-space analysis of individual responses illustrated the estimation of individual learning curves under the assumption that there was no common or population feature shared by the members of any of the subgroups. The state-space random effects analysis illustrated characterization of between-subject variation in learning by estimating simultaneously population and individual learning curves within a subgroup.
We compared the learning curves estimated from our state-space methods with the learning curve estimated by the 8-trial nonoverlapping block method (8TB) (Stefani et al. 2003) and we compared our IO(0.95) learning trial estimates from the state-space analyses with those computed by the 8 consecutive correct responses criterion (8CCR) (Stefani et al. 2003).
Analysis of learning from the pooled responses within the vehicle and the treatment groups
We first analyzed the response data without taking into account the reward arm during Set 1. That is, we combined the responses across the Vehicle light (Fig. 2A) and the Vehicle dark (Fig. 2B) subgroups and analyzed the experimental data as the number of correct responses from the 13 animals by trial across the 80 trials. For the Treatment group, we combined the responses across the Treatment light (Fig. 2C) and the Treatment dark (Fig. 2D) subgroups and analyzed the experimental data as the number of correct responses from the 9 animals by trial across the 80 trials. This analysis thereby assumes that there is no between-subject variation within either the Vehicle or the Treatment group.
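To do this, we replaced the Bernoulli model in Eq. 2.1 with the binomial observation model

$$\Pr(n_k \mid x_k) = \binom{m}{n_k}\, p_k^{n_k}\,(1 - p_k)^{m - n_k} \tag{3.1}$$

where $n_k$ is the number of correct responses on trial $k$ from the $m$ animals in the group and

$$p_k = \frac{\exp(x_k)}{1 + \exp(x_k)} \tag{3.2}$$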
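To make the pooling step concrete, the sketch below counts correct responses per trial across animals and evaluates a binomial log likelihood for a given state trajectory. It assumes the standard binomial mass function with a logistic link, $p_k = \exp(x_k)/[1 + \exp(x_k)]$; the function names are hypothetical.

```python
import math
from typing import List, Sequence


def pool_responses(responses: Sequence[Sequence[int]]) -> List[int]:
    """Number of correct responses on each trial, summed over animals;
    responses[j][k] is the 0/1 response of animal j on trial k + 1.
    """
    return [sum(trial) for trial in zip(*responses)]


def binomial_log_likelihood(
    counts: Sequence[int], m: int, states: Sequence[float]
) -> float:
    """Log likelihood of pooled counts from m animals given states x_k,
    under an assumed logistic link between the state and p_k.
    """
    total = 0.0
    for n_k, x_k in zip(counts, states):
        p_k = 1.0 / (1.0 + math.exp(-x_k))
        total += (
            math.log(math.comb(m, n_k))
            + n_k * math.log(p_k)
            + (m - n_k) * math.log(1.0 - p_k)
        )
    return total
```

Setting `m = 1` recovers the Bernoulli log likelihood for a single animal.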
The SS learning curve estimated from the pooled responses of the Vehicle group provided a trial-by-trial estimate of the probability of a correct response that increased monotonically from 0.58 on trial 1 to 0.94 on trial 80 (Fig. 3A). This learning curve was >0.5, the probability of a correct response by chance (Fig. 3A, horizontal dashed line), for the entire experiment. The behavior of this learning curve is consistent with the performance apparent from the pattern of correct and incorrect responses seen in the Vehicle group (Fig. 2, A and B). The SS learning curve estimated from the pooled responses of the Treatment group began at 0.5, decreased slightly to 0.46 at trial 5, increased almost monotonically to a maximum of 0.79 at trial 70, and decreased to 0.77 at trial 80 (Fig. 3B). The behavior of this learning curve is also strongly consistent with the performance apparent from the pattern of correct and incorrect responses of the Treatment group (Fig. 2, C and D). In particular, the large number of incorrect responses in the early trials was the reason this learning curve initially fell to <0.5. Similarly, the several incorrect responses on trials 77 to 80, particularly in the Treatment dark subgroup, were responsible for the decline in the learning curve at the end of the experiment.
To compare our state-space model analysis of the pooled responses with the approach taken in Stefani et al. (2003), we estimated for both the Vehicle and the Treatment groups the learning curves using the 8TB method and we identified the learning trials for both groups using the 8CCR method. The population learning curve computed using the 8TB method provided only 10 estimates for the 80 trials for each of the 2 groups. For the Vehicle group, this curve (Fig. 3A, black SE error bars) increased from 0.64 in the first block to 0.92 in the 10th block. This learning curve was in close agreement with the SS learning curve for this group. Neither this curve nor any of its lower SE bars (which define an approximate 67% confidence interval in each block) dropped below 0.5 (dashed horizontal line), the probability of a correct response by chance. For the Treatment group (black SE error bars, Fig. 3B), the population learning curve began at 0.41 in the first block, increased to a maximum of 0.69 in the ninth block, and decreased slightly to 0.65 by the last block. This learning curve was also in close agreement with the corresponding SS learning curve. As was true for the SS learning curves for the Vehicle and Treatment groups, the 8TB learning curve for the Treatment group was below the 8TB learning curve for the Vehicle group at each of the 10 trial estimates.
The 67% confidence intervals for the 8TB learning curve for the Vehicle group were wider through trial 24 and became smaller for the trials beyond trial 24 (Fig. 3A). The 90% confidence intervals for the SS learning curve showed a similar change in width. Based on the width of the 67% confidence intervals for the 8TB learning curve, the corresponding 90% confidence intervals for the 8TB learning curve would be larger than the 90% confidence intervals for the SS learning curve. The larger intervals occurred because for the Vehicle group each 8TB confidence interval was based on 8 trials × 13 animals = 104 observations, whereas the SS confidence intervals were based on 80 trials × 13 animals = 1,040 observations. The 67% confidence intervals for the 8TB learning curve were slightly wider than the 90% confidence intervals for the SS Treatment group learning curve because the SS confidence intervals were based on all the 80 trials × 9 animals = 720 observations, whereas each 8TB interval was based on only 8 trials × 9 animals = 72 observations (Fig. 3B).
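The dependence of the 8TB interval width on the number of observations per block can be checked with a short calculation. The sketch assumes that the 8TB error bars are the usual binomial standard error, $\sqrt{p(1-p)/n}$, and that a 90% interval uses the normal multiplier 1.645; neither detail is stated in the text, and the function names are hypothetical. It applies only to the 8TB intervals, not to the state-space intervals.

```python
import math


def binomial_se(p: float, n: int) -> float:
    """Standard error of a proportion p estimated from n binary observations."""
    return math.sqrt(p * (1.0 - p) / n)


def interval_half_width(p: float, n: int, z: float) -> float:
    """Half-width of a normal-approximation confidence interval."""
    return z * binomial_se(p, n)


# Observations per 8-trial block in each group.
vehicle_block_n = 8 * 13    # 104 observations
treatment_block_n = 8 * 9   # 72 observations

# +/- 1 SE bar versus an approximate 90% interval for one Vehicle block.
se_bar = interval_half_width(0.64, vehicle_block_n, 1.0)
ninety = interval_half_width(0.64, vehicle_block_n, 1.645)
```

Under these assumptions the 90% half-width is 1.645 times the SE bar, and a block of 72 observations gives a wider interval than a block of 104 at the same proportion.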
The population learning trial in the analysis of Stefani et al. (2003) was computed by using the 8CCR method to compute the learning trial for each animal in the Vehicle (Treatment) group (Fig. 2, light blue squares) and then taking the population learning trial to be the mean of the individual Vehicle (Treatment) learning trial estimates. The mean (median) of the individual learning trials for the Vehicle group was trial 48.1 (51). As predicted by the analyses of actual and simulated data in Smith et al. (2004), this learning trial is much later than the IO(0.95) learning trial estimate of trial 3 for this group. The mean (median) of the individual learning trials for the Treatment group was trial 70.0 (80) under the assumption used by Stefani and colleagues that trial 80 was assigned as the learning trial to an animal that did not reach the criterion of 8 consecutive correct responses. This differed from the IO(0.95) learning trial for this group of trial 31. For both the Vehicle and the Treatment groups the population learning trial was later than what might have been expected by analyzing the 8TB learning curve. This discrepancy between the 8TB learning curve estimate and the 8CCR estimate of the learning trial arises because the two methods, unlike the SS learning curve estimate and the IO(0.95) learning trial, are not related. In particular, it is possible to have many more correct than incorrect responses yet not have 8 consecutive correct responses. The 8TB learning curves agreed closely with the SS learning curves in this pooled analysis and showed clearly the difference in learning between the Vehicle and Treatment groups. By construction the IO(0.95) learning trials gave estimates of the learning trial consistent with the SS learning curves. The 8CCR learning trials suggest that learning occurred much later in the Vehicle group and perhaps not at all in the Treatment group.
State-space analysis of individual learning within the vehicle and treatment groups
The pooled analysis treated all the responses within each group as if there was no subject-specific effect. To analyze the learning on a subject-specific basis, we estimated the SS learning curve for each animal using the state-space model for a single individual defined in Smith et al. (2004) (Fig. 4). This corresponded to using the state-space model in Eq. 2.4 and observation model in Eqs. 3.1 and 3.2 with m = 1. For the Vehicle group all the SS learning curves increased. In agreement with the responses (Fig. 2), the individual SS learning curves for the Vehicle light subgroup (Fig. 4A) increased more rapidly than the individual SS learning curves for the Vehicle dark subgroup (Fig. 4B). The IO(0.95) learning trials for the Vehicle light (dark) subgroup ranged from trial 9 (9) to 21 (54) with a median of trial 13.5 (35).
These analyses confirm the finding from the pooled analysis that learning in the Treatment group was impaired relative to the Vehicle group. They also show that learning differed between the Vehicle light and the Vehicle dark subgroups and that, even though the numbers were small, there was a difference in learning between the Treatment light and the Treatment dark subgroups. These analyses further suggest that, because the learning behavior was similar within the Vehicle and Treatment subgroups, the SSRE analysis should be carried out within these subgroups.
State-space random effects analysis of population and individual learning
We applied the SSRE analysis to the Vehicle subgroups (Fig. 5, A and B) and Treatment subgroups (Fig. 5, C and D). For the Vehicle light subgroup, the SSRE population learning curve (Fig. 5A, red line) increased monotonically from 0.5 at trial 1 to 0.99 by trial 45 and remained constant at this level for the balance of the experiment. The individual SSRE learning curves (Fig. 5A, green lines) were distributed evenly about this population learning curve. The 90% confidence intervals for the population learning curve (Fig. 5A, gray shaded region) were wide through trial 30 and began to decrease as the learning curve began to climb monotonically. The lowest individual learning curve, which was slightly below the lower 95% confidence bound for the population learning curve toward the end of the experiment, corresponded to animal 1. This animal continued to make errors throughout the experiment (Fig. 2A). The IO(0.95) learning trial identified from the SSRE population learning curve is trial 11 (Fig. 5A). The ideal observer curve (Fig. 5E) showed that the IO(0.95) learning trial would have occurred earlier were it not for a series of incorrect responses on trials 9 and 10 (Fig. 2A).
The Treatment light subgroup had only 3 animals (Fig. 2C). For this subgroup, the SSRE population learning curve (Fig. 5C, red line) decreased slightly from 0.5 at trial 1, to 0.45 at trial 5, and then increased monotonically to 0.75 at trial 80. The 90% confidence intervals for this population learning curve were broad across the entire experiment because the responses were pooled across only 3 animals. One of the individual SSRE learning curves (Fig. 5C, green lines) was above the population learning curve and one was almost indistinguishable from the population curve. The third learning curve, which was well below the population learning curve, corresponded to animal 2. This animal's individual learning curve increased only slightly above 0.50 (Fig. 4C) and its analysis did not identify an IO(0.95) learning trial. In this case, the good performance of the other 2 animals in this subgroup (with individual IO(0.95) learning trials of 18 and 44) pulled up the learning curve of this animal. The population ideal observer curve (Fig. 5G) mimicked the behavior of the population learning curve and identified the population IO(0.95) as trial 29.
For the Treatment dark subgroup, the SSRE population learning curve (Fig. 5D, red line) decreased slightly from 0.5 at trial 1 to 0.45 at trial 5, then increased slightly and remained constant at 0.5 from trial 11 to trial 27. From trial 27, the population learning curve increased monotonically to 0.70 at trial 70 and decreased to 0.65 at trial 80. The 90% confidence intervals for this population learning curve had a constant width across the entire experiment that was narrower than the widths of the 90% confidence intervals for the learning curve of the Treatment light subgroup. Between trials 11 and 27 the population and individual learning curves were indistinguishable because nearly all 6 of the animals in this subgroup made many incorrect responses in this interval. The population IO(0.95) learning trial for this group was trial 44 (Fig. 5D, 5H).
From the individual learning curve analyses (Fig. 4D), we concluded that 4 of the 6 animals did not learn by the IO(0.95) learning criterion, whereas the remaining 2 animals learned at trials 62 and 35. From the individual learning curves computed as part of the SSRE analysis, we found 5 of the 6 animals had learning trials that ranged from trial 44 to 49. As was true for the 3 animals in the Treatment light subgroup, by pooling the data to estimate the population and individual learning curves for the Treatment dark group, more animals showed learning than would be indicated by the individual analyses. Moreover, the population analysis showed that although the individual animals in this subgroup performed poorly at the outset of the experiment, this subgroup showed population learning. The 15-trial difference in the learning trial between the Treatment light and the Treatment dark group suggests that learning in these 2 subgroups was different.
For each subgroup, the 8TB learning curve agreed with the SSRE population learning curves (Fig. 5). For each subgroup, the 67% confidence intervals for the 8TB learning curves were close to the width of the 90% confidence intervals for the SSRE learning curves because the former were computed from only the points in the given 8-trial block, whereas all the SSRE confidence intervals are based on all the responses in each group. Whereas the IO(0.95) learning trials for the SSRE analysis for the Vehicle light, Vehicle dark, Treatment light, and the Treatment dark were trials 11, 30, 29, and 44, respectively, the 8CCR learning trial estimates for these groups were identified much later at trials 25, 63, 58, and 80, respectively.
The SSRE analysis computed, for each subgroup, the population learning curve and an individual learning curve for each member of the subgroup. In this way, the data from each group member contributed to the individual learning curve estimate of every other group member. For each animal, we also computed its SS individual learning curve (Fig. 4) (Smith et al. 2004). To show the difference between an individual learning curve estimated from the SSRE analysis and the individual SS learning curve, we compared these 2 learning curves for animal 2 from the Vehicle light group and animal 4 from the Treatment dark group. Animal 2 clearly performed better than chance in the task (Fig. 6A), with no incorrect responses from trial 36 to the end of the experiment. Based on the responses it was less apparent whether animal 4 performed better than chance. It had 47/80 correct responses with 9/11 correct responses in the last 11 trials of the experiment (Fig. 6B).
Comparing population learning between the vehicle and treatment subgroups
The aim of the experiment was to test the effect of MK801 on the ability of the rats to shift a learned strategy. As a result, we were interested in whether the learning curves for the treatment animals were different from the learning curves for the vehicle animals. Because we predicted that MK801 would impair learning, we estimated the trial-by-trial probability that the population performance in the Vehicle light (dark) subgroup was greater than the population performance in the Treatment light (dark) subgroup (Fig. 7A). That is, using the Monte Carlo algorithm in APPENDIX C, we computed $\Pr(p_k^{\text{Vehicle light}} > p_k^{\text{Treatment light}})$ and $\Pr(p_k^{\text{Vehicle dark}} > p_k^{\text{Treatment dark}})$ for trials k = 0, ..., K. We considered the performance in the Vehicle subgroup to be greater than the performance in the corresponding Treatment subgroup on trial k if this probability was ≥ 0.95.
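The trial-by-trial comparison can be sketched as a simple Monte Carlo estimate. This is not the algorithm of APPENDIX C: the sketch assumes, purely for illustration, that each group's performance at a trial is summarized by an independent Gaussian on the logit scale with a hypothetical mean and standard deviation, and the function names are invented here.

```python
import math
import random


def logistic(x):
    """Map a value on the logit scale to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def prob_greater(mean_a, sd_a, mean_b, sd_b, n_draws=10000, rng=None):
    """Estimate Pr(p_a > p_b) as the fraction of paired draws in which
    group A's probability of a correct response exceeds group B's.

    Assumption: each group's state is an independent Gaussian on the
    logit scale (hypothetical summary, not the fitted SSRE posterior).
    """
    if rng is None:
        rng = random.Random(0)
    hits = 0
    for _ in range(n_draws):
        p_a = logistic(rng.gauss(mean_a, sd_a))
        p_b = logistic(rng.gauss(mean_b, sd_b))
        if p_a > p_b:
            hits += 1
    return hits / n_draws


def trials_exceeding(means_a, sds_a, means_b, sds_b, threshold=0.95):
    """Return the trial indices at which Pr(p_a > p_b) reaches the threshold."""
    rng = random.Random(1)
    flagged = []
    for k, (ma, sa, mb, sb) in enumerate(zip(means_a, sds_a, means_b, sds_b)):
        if prob_greater(ma, sa, mb, sb, rng=rng) >= threshold:
            flagged.append(k)
    return flagged


# Two hypothetical trials: no separation at trial 0, clear separation at trial 1.
flagged_trials = trials_exceeding([0.0, 2.0], [0.3, 0.3], [0.0, 0.0], [0.3, 0.3])
```

With identical distributions the estimated probability is near 0.5, so only the second hypothetical trial is flagged.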
We concluded from this analysis that the rats injected with the NMDA-receptor antagonist MK801 were significantly impaired in their ability to learn compared with those injected only with the vehicle for most of the latter half of the experiment (trial 42 to trial 80). We also concluded that although performance in the Vehicle light subgroup was better than performance in the Vehicle dark subgroup, a difference in performance between the Treatment light and the Treatment dark subgroups was less apparent.
Comparing learning within the vehicle and treatment subgroups
The learning trial identifies the trial on which the ideal observer is 0.95 certain that the animal is performing better than chance from that trial through the balance of the experiment. This analysis compares the performance on trial 0 with performance on each of the 80 trials. Another frequently asked question is whether learning performance differs between trials within a group. In these analyses, learning in the later trials of the experiment is frequently compared with learning in the earlier trials. Using the Monte Carlo algorithm in APPENDIX D, we computed $\Pr(p_{k_2} > p_{k_1})$, the probability that the learning curve at trial $k_2$ was greater than the learning curve at trial $k_1$ for all $k_1 < k_2$. These results consist of K(K + 1)/2 within-subgroup comparisons (probabilities) for the Vehicle light subgroup (Fig. 7C), Vehicle dark subgroup (Fig. 7D), Treatment light subgroup (Fig. 7E), and Treatment dark subgroup (Fig. 7F). Comparisons on which $\Pr(p_{k_2} > p_{k_1})$ was ≥ 0.95 are shown in red. The algorithm in APPENDIX D shows that the computations involve evaluating comparisons between pairs of random variables from the K-dimensional joint probability density of the learning state process. This probability density was estimated by our model-fitting analysis in APPENDIX A. For this reason, there is no problem with multiple hypothesis tests in this analysis.
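The idea that every pairwise comparison is read off one joint distribution can be sketched as follows. This is not the algorithm of APPENDIX D: the trajectories here come from a hypothetical random walk with drift on the logit scale, standing in for draws from the fitted joint density, and all names and parameter values are assumptions.

```python
import math
import random
from itertools import combinations


def sample_trajectories(n_draws, n_trials, drift=0.1, sd=0.2, seed=0):
    """Draw whole learning curves p_0, ..., p_K jointly.

    Assumption: a random walk with drift on the logit scale, used only
    as a stand-in for samples from the estimated joint density.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        x, path = 0.0, []
        for _ in range(n_trials + 1):  # trials 0, 1, ..., K
            path.append(1.0 / (1.0 + math.exp(-x)))
            x += rng.gauss(drift, sd)
        draws.append(path)
    return draws


def pairwise_prob_later_greater(draws):
    """For every pair k1 < k2, estimate Pr(p_k2 > p_k1) as the fraction
    of jointly sampled curves in which the later value is larger."""
    n_points = len(draws[0])
    probs = {}
    for k1, k2 in combinations(range(n_points), 2):
        hits = sum(1 for path in draws if path[k2] > path[k1])
        probs[(k1, k2)] = hits / len(draws)
    return probs


K = 20
probs = pairwise_prob_later_greater(sample_trajectories(500, K))
n_comparisons = len(probs)
```

Because each sampled curve is compared with itself at two trials, all K(K + 1)/2 probabilities are summaries of the same joint sample rather than separate tests on separate data.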
For the Vehicle light subgroup (Fig. 7C), the learning curve at trial 40 onward was significantly greater than the learning curve from trials 1 to 37. The steplike structure in the probability surface resulted from the steplike increase in the learning curve around trial 40 (Figs. 7C and 5A). Because of the large increase in the learning curve at the start of the experiment, where the probability of a correct response is 0.5, there is a line of red along the top of the probability surface, indicating the animals' performances were significantly above chance early in the experiment. The Vehicle dark subgroup (Fig. 7D) also showed a significant increase around trial 40 but in this case, the improvement continued through the length of the experiment. For this subgroup, the learning curve for any trial greater than trial 40 was consistently larger than the learning curve 10 trials earlier or more.
For the Treatment light group (Fig. 7E), beginning at trial 30, performance on each trial was better than performance on all trials 20 or more trials earlier. This level of difference in between-trial performance was maintained for the balance of the experiment. A similar pattern held for the Treatment dark group (Fig. 7F). These analyses show that within each of the 4 subgroups there is substantial improvement in performance consistent with learning within each group.
Optimal design of a learning experiment
Two important questions that arise in the design of behavioral experiments that compare population learning are how many animals per group and how many trials per experiment are required to detect accurately between-group differences in learning. To study these questions, we used our SSRE model to conduct a theoretical study of how well we can distinguish differences in learning between 2 populations as a function of the true differences in their learning propensity, the number of animals per group J, and the number of trials in the experiment K. We assumed that learning in both groups (denoted by Control group and Treatment group) was dependent on the same unobservable learning process defined at trial k by the logistic equation
[equation image not recovered] (3.3)
where $\varepsilon_k$ is a zero-mean Gaussian random variable with variance $\sigma_\varepsilon^2 = 0.04$ for k = 1, ..., K. In the analyses we compared a Control group with 3 different Treatment groups. For each group we assumed that, given its learning modulation parameter $\beta_0$, its population probability of a correct response was given by evaluating the expected value of the state model (Eq. 3.3) in Eq. 2.2. We assumed that the Control and Treatment groups differ only in their learning modulation parameters. This analysis simulated the situation in which the ability to learn was modulated by treatment or previous experience.
We assumed that each group consists of J individuals and that $\beta_j$, the learning parameter for individual j, is drawn from a Gaussian probability distribution with mean $\beta_0$ and variance $\sigma_\beta^2$ for j = 1, ..., J. Therefore, for each individual in each group, we assumed that given its learning modulation parameter $\beta_j$, the individual's probability of a correct response was given by evaluating the state model in Eq. 3.3 using Eq. 2.2. For the Control group, we set $\beta_0 = 2.6$ and $\sigma_\beta^2 = 0.04$. To simulate a treatment effect that induced impaired learning propensity, we chose $\sigma_\beta^2 = 0.04$ and 3 different values of $\beta_0$: 1.8, 1.4, and 1. That is, the differences between the population learning parameters of the Control and Treatment groups in these cases were given by $\Delta\beta_0$, where $\Delta\beta_0$ = 0.8, 1.2, and 1.6.
The resulting population learning curves are shown in Fig. 8A. We chose this model because the resulting Control and Treatment group learning curves resembled, respectively, smoothed versions of the Vehicle and MK801 Treatment group learning curves that we estimated in our real data example. In addition, the parameter values are similar to those estimated from the analysis of the true data. As we did for the analysis shown in Fig. 6A, we computed by Monte Carlo the probability that the population learning curve for the Control group differed from the population learning curve of the Treatment group for each of the 3 values of the Treatment group parameters, assuming that there was a sample of 10,000 individuals per group (Fig. 8B) and 120 trials in the experiment. The differences between the learning curves of these groups are the between-group difference curves we would like to detect in our SSRE model analysis.
As the difference between the population learning parameters of the Control and Treatment groups increased from 0.8 to 1.2 to 1.6, the maximum difference between the Control and Treatment learning curves increased from 0.07, to 0.13, to 0.20 (Fig. 8B) and the earliest detectable learning trial decreased from trial 58, to 43, to 34 (Fig. 8C). This feature of the simulation was important because it indicated that a longer experiment might be needed to detect smaller differences between the learning curves. For each of the 3 differences in population learning curves between the Control and Treatment groups, we tested 6 different numbers of subjects per group J = 3, 5, 7, 11, 15, and 20, and 8 different numbers of trials per experiment K = 40, 50, 60, 70, 80, 100, 120, and 140. This represents a reasonable range of number of subjects and number of trials per experiment that might be used in a population learning study. For each of the 3 x 6 x 8 = 144 triplets of parameter values, we simulated a learning curve for each of the J subjects in each group and from each subject's learning curve we simulated experimental data that constituted a sequence of correct and incorrect responses of length K.
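The structure of this simulation study can be sketched as below. The design grid and the parameter values (Control mean 2.6, variance 0.04, differences 0.8, 1.2, 1.6) come from the text. Everything else is an assumption made for illustration: the state curve `hypothetical_state` is a placeholder and is not Eq. 3.3, the observation model is a plain logistic function of the subject parameter times the state and is not claimed to be Eq. 2.2, and the names `beta0` and `sigma_beta` are hypothetical labels for the population mean and standard deviation of the learning modulation parameter.

```python
import math
import random
from itertools import product

DELTAS = [0.8, 1.2, 1.6]                           # Control minus Treatment
GROUP_SIZES = [3, 5, 7, 11, 15, 20]                # J
TRIAL_COUNTS = [40, 50, 60, 70, 80, 100, 120, 140]  # K
CONTROL_BETA0 = 2.6
SIGMA_BETA = math.sqrt(0.04)                       # variance 0.04 -> sd 0.2


def hypothetical_state(k):
    """Placeholder learning state rising smoothly from 0 to 1.
    Assumption: this is NOT the paper's Eq. 3.3."""
    return 1.0 / (1.0 + math.exp(-(k - 30) / 8.0))


def simulate_group(beta0, n_subjects, n_trials, rng):
    """Simulate binary responses for one group.

    Each subject draws its own parameter from N(beta0, SIGMA_BETA^2);
    its probability of a correct response at trial k is assumed to be
    logistic(beta_j * state_k), which starts near chance (0.5).
    """
    group = []
    for _ in range(n_subjects):
        beta_j = rng.gauss(beta0, SIGMA_BETA)
        responses = []
        for k in range(1, n_trials + 1):
            p = 1.0 / (1.0 + math.exp(-beta_j * hypothetical_state(k)))
            responses.append(1 if rng.random() < p else 0)
        group.append(responses)
    return group


design = list(product(DELTAS, GROUP_SIZES, TRIAL_COUNTS))
rng = random.Random(0)
delta, J, K = design[0]
control = simulate_group(CONTROL_BETA0, J, K, rng)
treatment = simulate_group(CONTROL_BETA0 - delta, J, K, rng)
```

Looping over `design` and fitting a model to each simulated pair of groups would reproduce the shape of the study; only the first design point is simulated here to keep the sketch short.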
We used our SSRE model to estimate from the sample of simulated binary response data the