|
|
||||||||
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
1Department of Statistics, University of Connecticut, Storrs, Connecticut; 2Department of Statistics, Florida State University, Tallahassee, Florida; and 3Department of Organismal Biology and 4Anatomy, University of Chicago, Chicago, Illinois
Submitted 28 April 2006; accepted in final form 15 November 2006
| ABSTRACT |
|---|
|
|
|---|
| INTRODUCTION |
|---|
|
|
|---|
Template matching has been the basis of many pattern-identification procedures (Abeles et al. 1988
, 1993
; Chi et al. 2003
; Dave and Margoliash 2000
; Louie and Wilson 2001
; Nádasdy et al. 1999
). In these procedures, a template is constructed based on exemplar spike trains that exhibit a certain pattern of interest. During data analysis, each segment of the data is scored in terms of its similarity to the template. Segments with high similarity scores are interpreted as exhibiting the pattern of interest and are thus extracted for further analysis.
In most of the current procedures, the template is treated as nearly time locked to stimuli or behavioral events. Except for some local variability, such as random perturbation in spike timing, global variability is expressed as uniform timescaling, which multiplies all the interspike intervals (ISIs) by a single factor (Chi et al. 2003
; Louie and Wilson 2001
; Nádasdy et al. 1999
). However, in some systems spiking activity can exhibit "time warping," i.e., nonuniform time compression or expansion on a relatively large timescale (Dave and Margoliash 2000
; Nádasdy et al. 1999
). As far as we know, the procedure introduced by Victor and Purpura (1996)
is the only one that explicitly addresses time warping. The procedure was developed to analyze the precision of temporal coding during specific short time intervals of sensory stimulation. In the procedure, the ISIs are assumed to vary independently and similarity scores are computed with dynamic time warping (DTW). Such scores are known as "text edit distances" in the field of computer science (Wagner and Fischer 1974
). Similar procedures based on text edit distances are extremely useful in bioinformatics (Aach and Church 2001
; Bishop and Thompson 1986
; Needleman and Wunsch 1970
; Waterman et al. 1987
). Sequences of discrete behaviors such as syllables in birdsongs are also occasionally analyzed using the text edit distance (Margoliash et al. 1991
; Sankoff and Kruskal 1983
). DTW is also commonly used in speech recognition (Huang and Gray 1988
; OShaughnessy 1999
; Rabiner and Juang 1993
), birdsong recognition (Anderson et al. 1996
; Ito and Mori 1999
; Kogan and Margoliash 1998
), and other applications.
The variability of spiking activity at local versus relatively large timescales can be roughly described as follows (Fig. 1A). Think of the occurrences of a pattern of interest as "texts." The "codewords" of the texts are short intervals with spikes. The definition of codeword should be problem dependent. For different systems, a codeword may contain a single spike, multiple spikes, or a high-frequency burst. Within the codewords, the detectable variability can be reasonably characterized as random local perturbation in spike timing and spike count. Between the codewords, on the other hand, the gaps often have quite visible nonuniform variability, resulting in time warping. The gaps need not be activity free. Instead, they can contain "noise" spikes that do not repeat in different occurrences of the pattern.
|
The local comparison should be based on specific representations of spiking activity. We shall consider two types of representations. The point process representation registers spike times continuously. As a result, the neural activity is characterized as processes of points in time. The binary representation divides the neural activity into small time bins, and assigns 1 to a bin if there is at least one spike in it and 0 otherwise. As a result, the neural activity is characterized as sequences of 0 and 1 s. Although the local comparison algorithms for these two representations look quite different, they derive from nearly identical arguments based on likelihood tests.
By using DTW, the template-based approach automatically "parses" each identified data segment into codewords and gaps in between, thus allowing a more detailed interpretation of the identification results (Fig. 1A). Such parsing can yield additional insight. If the spiking activity pattern is preselected in association with a certain behavior, and each of its codewords is reliably associated with a landmark event during the behavior, then the parsing provides a way to infer from the neural activity when the landmark events occur within the overall behavior. Presently, the dominant procedures for inference of the time course of behavior are based on regression or state-space models (Brockwell et al. 2004
; Carmena et al. 2003
; Serruya et al. 2002
; Smith and Brown 2003
; Wu et al. 2004
, 2006
). They are designed to estimate spatial trajectories of motor output with minimized average error over time. The estimation of the timing of events during motor output can therefore provide a useful new perspective to the decoding of neuronal activity.
In what follows, under METHODS, we describe the identification procedures and present pseudocode for their implementations. The procedure for the point process representation was implemented in C++, and that for the binary representation in MATLAB (The MathWorks, Natick, MA). Under RESULTS, we illustrate the utility of the procedures with two brief neurobiological examples. The procedure for the point process representation is applied to a single-unit (SU) recording collected during sleep from the nucleus robustus arcopallium (RA) of zebra finches (Dave and Margoliash 2000
). The procedure for the binary representation is applied to multielectrode recordings from the primary motor cortex (MI) of a behaving macaque monkey during a target pursuit task (Hatsopoulos et al. 2004
; Serruya et al. 2002
). In this example, the times when a target was reached and the following target appeared in each successful trial can be estimated with error <0.2 s. This is somewhat surprising in view of the relatively high level of spike-timing variability in MI. For each procedure, we conduct a sensitivity analysis, demonstrating that the procedures are robust.
| METHODS |
|---|
|
|
|---|
Because of the nature of the recording technology and cortical activity, the levels of accuracy of these two recordings were very different. First, in terms of signal-to-noise ratio (SNR), the accuracy of separating action potentials from background was much higher for the recordings from RA. The SNRs of spikes from SU recordings in the RA of zebra finches are generally >10, whereas those from the array recordings in macaque monkeys can be as low as 3. The difference in SNRs result from the different experimental designs and thus different recording technologies are used: movable but small numbers (one to four) of electrodes in the bird versus a fixed array of large numbers (>100) of electrodes in the monkey. Second, in terms of neurophysiological characteristics, the spike-timing variability of zebra finch RA units is much lower as well (Yu and Margoliash 1996
; cf. Hatsopoulos et al. 2004
). The differences are likely to influence the performance of pattern identification. The sampling rate of recording is 20 kHz for the RA data and 30 kHz for the MI data.
Part one: point process representation
The procedure described in this part is suitable when 1) the pattern of interest consists of well-defined spike bursts; 2) across occurrences of the pattern, the bursts exhibit low variability; and 3) the data being analyzed constitute a single spike train, consisting of time registry of spikes accurately estimated from the SU recording. Under these conditions, an exemplar spike train expressing the pattern may be used as a template. Under RESULTS, more detail will be given on how templates are selected for SU activity in RA.
The procedure directly compares the spike times. Figure 1B illustrates how to compare a template with data registered around a time location. In general, the template need not start or end with a spike. Together with its bursts, the intervals at the beginning and at the end of the template and those between the bursts are all parts of the template and will be referred to as interburst intervals (IBIs). The bursts are used as the codewords of the template pattern. The IBIs can be thought of as independent "springs," and the bursts as relatively incompressible "blocks" interconnected by the springs. By compressing or stretching the IBIs, the template bursts are shifted accordingly and compared with the data registered in the corresponding time intervals. Given a set of modified IBIs, the score of similarity between the warped template and the data segment aligned with it is a sum of the similarity scores associated with the bursts, minus a weighted sum of the amount of changes in the IBIs and the number of spikes in the data segment that do not match any template spikes. Optimizing the weighted sum over a range of modified IBIs then yields the optimal score at the time location. We next consider appropriate similarity scores and their computation.
TEMPLATE.
A template spike train has to be registered in an interval [0, D] (Fig. 1B). If the template has n bursts, then in general it has n + 1 IBIs, including the interval between its onset (t = 0) and first spike, and the interval between its last spike and offset (t = D). A subset of template spikes is a burst if all its ISIs are less than a specified threshold. Fix parameter
> 0, which controls the precision for comparing spike times. Let Hn+1 = D, T0 = 0, and for 1
i
n
![]() |
LOCAL SCORES FOR TEMPLATE BURSTS.
First, we fix a kernel function K(x). This will be used to measure the temporal discrepancy between a data spike and a template spike. Some commonly used kernel functions defined on [1, 1] are constant 1 ("square" kernel), 1 |x| ("triangular"), 1 x2 ("Epanechnikov"), and (1 x2)2 ("biweight"). To evaluate the similarity between the ith template burst and a data burst s1, s2,... in [x, x + Bi], the idea is to translate both to [0, Bi] so they can be compared directly. If the ith template burst consists of spikes t1, t2,..., then the similarity score between the two bursts is
![]() | (1) |
0 the maximum absolute negative contribution a nonmatching data spike can make to the local score. From Chi et al. (2003)
i.
GLOBAL SCORE FOR THE ENTIRE TEMPLATE.
Denote by N(a, b) the number of data spikes between a and b. For the ith IBI, let Gi(x)
0 be a cost function for its warping. Suppose the warped template has its IBIs changed by v1, ..., vn+1. Denote
![]() |
![]() | (2) |
i
n + 1. Data spikes in these intervals are counted as "noise" and reduce the similarity. This gives rise to the third term in Eq. 3.
By the probabilistic interpretation given in the APPENDIX, the overall similarity score assigned to the time location x is defined as
![]() | (3) |
1(x), ...,
n+1(x) be a solution. By Fig. 1B, we can establish a one-to-one correspondence between the intervals I1, ..., In+1 in the template and I1, ..., In+1 in the data segment and identify the latter as a set of optimal matching IBIs. Then we can identify the intervals B1, ..., Bn between I1, ..., In+1 as a set of optimal matching burst intervals. The interlacing I1, B1, I2, ..., Bn, In+1 form a partition of the data segment between x and the right endpoint of In+1, which is an optimal match to the template that one can possibly find at time x under the similarity score. The optimal matching IBIs Ii and burst intervals Bi are computed as follows
![]() |
![]() | (4) |
DTW FOR THE GLOBAL COMPARISON.
To compute L(x) and
1(x), ...,
n+1(x), let
![]() |
![]() |
![]() |
![]() |
1(x), ...,
n+1(x) can be computed by DTW. Table 1 summarizes the procedure.
|
1n+1(x)] are extracted.
, where
is a given threshold;
r}, with r the radius for local comparison; and
r. This is to make sure that L is not a constant in [x r, x + r]. A reasonable choice for r is D, the duration of the template. When some of the extracted segments overlap in time, a disjoint subset of the segments may be selected.
The selection of the threshold value
in general is a difficult problem. One important reason is that the stochastic properties of the data can be difficult to grasp. In RESULTS we will describe an ad hoc method for the RA activity that tries to simulate the noise in the data. A different ad hoc method was used for the data collected from the MI of monkey, which will also be described in RESULTS.
ADAPTIVE SELECTION OF PARAMETER VALUES.
The parameters
and
can be set based on estimation. For
, we first compute the mean value d of the template ISIs, which can be regarded as the minimum resolution to distinguish template spikes. If the squarekernel is used in Eq. 1 to compute the local scores, then set
= d/2. To use a different kernel K,
has to be adjusted. The reason is that except for x = 0, K(x) is strictly less than the square kernel. Without adjusting
, the kernel puts less total weight to data spikes that match the template spikes, resulting in artificially lower similarity scores. We set
![]() | (5) |
From Eq. A1 in the APPENDIX,
log (
0/
)/log (
'/
0), where
' is the firing rate within template bursts,
is the firing rate within template IBI, and
0 is the mean firing rate in nonmatching spike trains. Because for a Poisson process the density is the reciprocal of the mean ISI, we set the value of
based on the approximation
![]() | (6) |
is adaptive.
NUMERICAL ISSUES.
In data analysis, L(x) is computed on a discrete time grid jd, with j = 0, ±1,..., and d > 0 a fixed step size. Also, the warping of each IBI has to be bounded. Table 2 contains a version of the algorithm that takes into account these constraints. It requires that Fi(x0 + jd) be computed and stored for a given x0. We apply the subroutine below to compute Fi(x0 + jd). Let
a
denote the smallest integer no less than a, and
a
the largest integer no greater than a.
i(s x0 jd), where j0 =
(s x0
)/d
and j1 =
(s x0)/d
.
|
The procedure described in this part is suitable when 1) the spiking activity is recorded from multiple single units and exhibits significant amount of variability in spike timing across trials and/or 2) the neural signals have lower SNRs compared with SU recordings; typically, the spike trains are obtained from multielectrode recordings by spike sorting. Under these conditions, the spike train of a single unit typically cannot represent the variability in a pattern of interest and therefore it alone cannot be directly used as a template for pattern identification. Instead, we shall construct a template as a linear filter that is made up of log-likelihood ratios estimated from sample spike trains of all the recorded single units. The statistical model involved in the construction is an integral part of the pattern identification procedure and is described below.
BASIC SETUP. Suppose the spiking activity pattern we are interested in is associated with a sequence of n landmark events during the course of a certain type of behavior. For example, if a behavioral task requires a monkey to reach n targets in sequence, then for i = 1, ..., n, the ith event may be defined as the cursor reaching the ith target (see more detail in RESULTS). The n 1 intervals between the times of consecutive events during an occurrence of the event sequence will be referred to as interevent intervals (IEIs).
We suppose that the multielectrode spike trains are recorded from the same set of C units and each data spike has been associated with one of the units by spike sorting. Fix parameter
> 0 as the duration of time bin. For c = 1, ..., C, the binary representation for thespiking activity of the cth unit is a binary sequence xc(1), xc(2), xc(3), ..., such that
![]() |
is small enough, almost all time bins can have at most one spike, and the binary representation induces little loss of information on spike count.
To start, one needs to collect a sample of multielectrode spike trains associated with multiple occurrences of the event sequence. For each sample spike train, if T1, ..., Tn are the associated event times, then the segments of the spike train in surrounding intervals [Ti A
, Ti + B
] are registered as codewords, where A, B
0 are fixed integers. As in the case of point process representation, the codewords cannot be warped in time.
STATISTICAL MODEL. The procedure is based on the following assumptions.
, t + B
] are independent, such that the activity of each unit forms a Poisson process. For i = 1, ..., n, c = 1, ..., C, and j = A, ..., B, let pi,c(j) be the probability that the cth unit generates spikes in the jth time bin away from the ith event. The probability can be estimated using a peristimulus time histogram method. That is, for the ith event and cth unit, we first align the sample spike trains of the unit at the event, then estimate pi,c(j) as the fraction of those sample spike trains that have spikes in the jth time bin relative to the time of the event.
For i = 1, ..., n 1, we use q(t,
i) to model the probability density of the duration of the ith IEI. The parameter
i can be estimated based on the sample as well.
LOCAL SCORES FOR SPIKING ACTIVITY PATTERNS AROUND EVENTS.
To identify the occurrences of the whole event sequence, the first step is to evaluate the likelihood of the occurrence of a single event in each time bin. The derivation of the likelihood is given in the APPENDIX. Here we consider the computational procedure only. For the ith event and the cth unit, we introduce
i,c(j), such that for j = A, ..., B, it is evaluated as
![]() |
![]() | (7) |
i,c*xc(t) is the tth entry of the convolution between
i,c and xc. The derivation of the formula is given in the APPENDIX. We therefore use
![]() |
i,c acts as a discrete linear filter. By convolving
i,c with the entire binary sequence xc, the score is obtained for all time bins.
GLOBAL SCORE FOR THE EVENT SEQUENCE.
For the ith IBI, i = 1, ..., n 1, and t
1, let Gi(t) = log q(t
,
i). Then for any time bin t and I1, ..., In1, the logarithm of the probability that the tth time bin contains the onset of an occurrence of the event sequence with IEIs I1
, ..., In1
can be expressed as l(t, I1, ..., In1) plus a constant, where
![]() | (8) |
![]() | (9) |
Denote by Î1(t), ..., În1(t) the maximizers associated with L(t). As in the case of the point process representation, L(t), Î1(t), ..., În1(t) can be computed using DTW. The algorithm is sketched in Table 3.
|
![]() |
. In practice, because the score function L(t) can be quite noisy, it may need to be smoothed to extract meaningful local maximum points.
To summarize, the numerical procedure for the binary representation starts with estimating pi,c(j) and q(x,
i). Next the procedure constructs filters
i,c for i = 1, ..., n, c = 1, ..., C, and also constructs Gi(x) for i = 1, ..., n 1. The scores and potential occurrences of the event sequence are then computed by the algorithm in Table 3.
| RESULTS |
|---|
|
|
|---|
We applied the procedure for point process representations to the SU spiking activity in RA of adult male zebra finch. The data were high signal-to-noise ratio recordings from chronically implanted movable electrodes in RA, collected in a previous study (Dave and Margoliash 2000
). Male birds directed their songs toward females housed in adjacent half-cages.
For each RA unit, the goal was to identify segments in its spontaneous spiking activity during sleep that exhibited temporal patterns similar to its premotor activities associated with daytime singing. Such segments have been interpreted as replay of premotor activity during sleep that may be involved in learning and maintaining song (Margoliash 2002
). The female-directed songs of adult male zebra finches are highly stereotyped intense behaviors (Sossinka and Böhner 1980
), acquired through learning (Brenowitz et al. 1997
), and are dynamically maintained in adults by auditory feedback (Nordeen and Nordeen 1992
). The songs constitute sequences of "syllables" repeated in a fixed pattern to form one or more repeated "motifs." Premotor spiking activity in RA associated with song renditions are highly stereotyped and exhibit precise temporal patterning within bursts, with the SD in spike timing as low as 0.2 ms (Chi and Margoliash 2001
; Yu and Margoliash 1996
). Therefore the point process representation is preferred to the binary representation for RA activity.
For these data, the spontaneous activity of a unit during sleep can be probed with one or more templates. Once a signature motif of the bird was identified, any spike train of the unit associated with a rendition of the motif could serve as a template (Fig. 2, A and B). To partition a template into bursts and IBIs, we aligned the spike trains associated with all the identified renditions of the motif by their onset times (Fig. 2C). The raster plot revealed common temporal structures of the spike trains that had a many-to-many correspondence to the acoustic structures of the motif. Based on the structures, we partitioned the spike trains into bursts and IBIs by the criteria that adjacent bursts be at least 20 ms apart. Within the bursts of these spike trains, the mean ISI was d' = 3.06 ± 2.68 ms, and within their IBIs the mean ISI was d = 73.02 ± 55.97 ms. The same partition was obtained when we applied a more refined alignment procedure which minimized the total L1 distances of the spike trains (Chi and Margoliash 2001
).
|
To implement the procedure in Table 2, we used the biweight kernel to compute local similarity scores by Eq. 1. The parameter to control the temporal precision for spike matching was
= d'/2 x 1.875 = 3.06/2 x 1.875 = 2.869 ms by Eq. 5. To calculate the global similarity scores by Eq. 2, we set Gi(x) = 0 and fixed the maximum proportion of warp at 1/5 for each template IBI. The score function was computed on a grid with step size 0.5 ms.
During the spontaneous activity of the unit in sleep, the mean ISI was d0 = 49.05 ± 20.85 ms. By Eq. 6, the penalty on each nonmatching spike was
= 0.1434. We then calculated the similarity scores across the data to identify segments that exhibited similar patterns to the template.
The results from one session of the data are shown in Fig. 3; plot A displays the similarity scores across. To identify matching segments, we set the threshold for similarity score
= N/3, where N = 41 was the total number of spikes in the template. Four above-threshold local peaks were identified in this session, with scores 16.6, 19.2, 15.9, and 15.4, respectively. The second and fourth peaks are marked by x and z in Fig. 3A. By Eq. 4, we extracted the optimal matching segments at the peak locations and partitioned them into burst intervals and IBIs that were mapped one to one to those in the template (Fig. 3, B and C). The identified segments exhibited a high degree of similarity to the template and also showed nonuniform time warping of the IBIs.
|
To see whether DTW was necessary in this example, we collected all the identified optimal matching segments from the data. There were 15 of them, with mean score 16.6 ± 1.9. To see how much warping of the template was involved to match these segments, we analyzed the two longest IBIs in the template, which were the fourth and fifth IBIs. The time compressions/expansions of the two IBIs, 4.2 ± 21.5 and 2.5 ± 9.6 ms per segment, showed no significant correlation with each other (r = 0.34, P = 0.22, 95% CI = [0.21, 0.73], t-test). For instance, to match the segment at x in Fig. 3A, the two IBIs were compressed by 33.5 ms and expanded by 6.5 ms, respectively. This suggests that nonuniform, wide-ranging time warping was present in the data, supporting the use of DTW for the pattern identification.
Another question is whether the probability distribution of the time warping in the premotor activity could be used in the pattern identification for the sleep data. The answer seems to be negative. There is evidence that the variability of IBI in the spontaneous activity during sleep differed from that during wakeful behavior. In the preceding data, the variance of the fourth IBI in the identified matching segments was significantly larger than the one in the exemplar spike trains associated with motif renditions. The ratio of the two was r = 18.61 (P < 105, 95% CI = [6.25, 55.44], F-test). The variances of the fifth IBI on the other hand did not exhibit significant difference (r = 2.20, P = 0.15, 95% CI = [0.74, 6.54]). The change in the statistical properties of neuronal activity across different behavioral states poses difficulty for the selection of parameter values, as we tried to use the templates obtained in one state to analyze the activity in another.
Sensitivity to temporal precision of spike matching
We had used
= 1.875
, where
= 1.53 ms was the half mean ISI within the bursts of the exemplar spike trains associated with the motif renditions, and the coefficient 1.875 "standardizes" the biweight kernel. To examine how sensitive the pattern identification was to
, we compared the results to those obtained with different preselected values of
= 1.875
while keeping the other parameters unchanged. For larger
, the temporal precision required for template matching was lower and more segments with above-threshold similarity scores were expected. The question is how sensitive the number of identified segments was to the change in
. Denote by n the number of segments that were identified using a preselected value of
but missed by the original procedure that used
= 1.53 ms and n' the number of those the other way around
![]() |
around the value selected based on the exemplar spike trains. Except for
= 0.5, all the extra segments identified using the preselected values of
were not completely overlooked by the original procedure. They actually showed up as local peaks with relatively high scores (>10). Therefore the increasing trend in n could be offset by lowering the threshold in the original procedure. The extra segment identified with
= 0.5 was also not completely overlooked by the original procedure. It had roughly 50% overlap in time with a segment identified by the procedure. Sensitivity to the penalty on mismatching spikes
In the procedure, the penalty
= 0.1434 was selected adaptively, based on the firing rates of the premotor activities associated with motif renditions and the spontaneous activity during sleep. We compared the results to those obtained with different fixed values of
while keeping the other parameters unchanged. Let n be the number of segments identified by using a preselected value of
but missed by using the adaptively selected value and n' the number of those the other way around:
![]() |
= 0.1434, 15 optimal matching segments were identified. The summary shows that for values of
around the adaptively selected value, the identification results were quite similar. Sensitivity to kernel
We evaluated how sensitive the pattern identification procedure was to the kernel. We implemented the procedure using the square, triangular, and Epanechnikov kernels, respectively. By Eq. 5,
had to be adjusted accordingly. Denote by n the number of segments that were identified using a different kernel but missed by using the biweight kernel and n' the number of those the other way around:
![]() |
= 41/3, with differences 0.54, 1.20, 1.47, and 0.50, respectively. Simulations for threshold selection and performance evaluation
For the pattern identification procedure, the issues of threshold selection and sensitivity to noise are complicated by the fact that the statistical properties of the sleep data are likely to be different in some unknown but important aspects from the premotor activity that generated the templates. The simulation method is based on the idea that by testing the pattern identification procedure on artificial data that incorporate certain statistical properties of the sleep data, some reasonable inference might obtain.
We generated simulated data using the exemplar spike trains associated with motif renditions (Fig. 2A) because they had unambiguous interpretation. The pattern identification procedure was then applied to the simulated data to yield score functions. Because the peak values in these score functions were associated with "true targets" (i.e., spike trains known to exhibit the pattern of interest), we could use their distributions to draw inferences on the ability of the procedure to find true targets corrupted by noise.
Given an exemplar spike train T registered in [0, D], a simulated spike train was generated as follows.
N(0,
).
was added to I as "noise" spikes.
.
Because DTW automatically accounts for time warping of IBIs, we did not randomly modify the IBIs. In all the simulations,
= 1 ms, which is a plausible lower bound for the ISI in RA. We also fixed B = 500 ms in all the simulations. The value of
was selected based on the average firing rate of the spontaneous activity during sleep, which was estimated at 21 Hz. We set
= 20 Hz. The parameters q and
are unobservable from the spontaneous activity during sleep. We set
around or above the half-mean
(=1.53 ms) of the ISI within the bursts of the exemplar spike trains, so that there was a good chance for a spike in the exemplars to have a large "jitter" relatively to the ISIs. We conducted four simulations, with q = 1/4, 1/3 and
= 1.5, 2 ms, respectively.
For each of the exemplar spike trains, 100 simulated spike trains were generated. The raster plot of a total of 1500 simulated spike trains with
= 1.5 ms and q = 1/3 is shown in Fig. 4. Despite the relatively high chance of spike deletion in the true targets, the simulated spike trains maintained the overall temporal structure of the exemplar spike trains. However, not all of them exhibited enough similarity to the exemplars. A small percentage a of the simulated spike trains either failed to yield similarity scores above
= N/4, with N the number of spikes in the template, or failed to have peak locations in score function within 50 ms from the onsets of the true targets. Excluding these spike trains, denote by S the peak value in the score function of any remaining spike train. The distribution of S was close to a Normal distribution. As
or q increased, a increased and the distribution of S shifted toward 0.
![]() |
S. For all the simulations, this value was a little less than the actually used threshold value N/3. Therefore a few more segments would have been identified, and the sum of a and the tail probability under N(µS,
S) (
2.5%) could serve as a nominal false negative rate.
|
= 50 Hz. In these simulations, the signal-noise ratio became much lower, greatly reducing the number of identified segments. For example, with all the other parameters the same as in the second simulation, the number of identified segments was only 87 (results not shown). This result, however, made no sense for the data we analyzed. Our suggestion is that to conduct meaningful simulations, the parameters in the simulations should be tuned as much as possible according to the data. Application to MI of macaque monkey
We applied the procedure for the binary representation of neuronal activity to recordings obtained from the arm area of MI in the left hemisphere of a trained adult macaque monkey. The data were collected using a chronically implanted silicon microelectrode array composed of 100 electrodes (1.0-mm electrode length; 400-µm interelectrode separation; Cyberkinetics Neurotechnology System, Foxboro, MA). Based on histological evidence from previous implants that we have performed, the electrode tips are likely to be in lower portions of layer 3 or in layer 5. To acquire extracellular action potentials, signals were amplified (gain = 5,000), band-pass filtered (250 Hz to 7.5 kHz) and sampled at 30 kHz per channel. Only waveforms that crossed a threshold were stored and spike-sorted using Off-line Sorter (Plexon, Dallas, TX). The monkey was trained to repeatedly perform a sequence of movements (see following text) to receive a juice reward. From a decoding perspective, it is of interest to have a means to estimate the times of certain landmark events in the behavior during each trial based on the associated neuronal activity. These estimates were compared with the actual event times recorded during each trial.
The monkey was operantly trained to perform a target pursuit task by moving a cursor to a target with its right arm. The cursor and the target were displayed on a horizontal projection surface in front of the monkey. The monkeys arm rested on cushioned arm troughs secured to links of a two-joint robotic arm (KINARM system; Scott 1999
) underneath the projection surface. The shoulder joint was abducted 90° such that shoulder and elbow flexion and extension movements were made in the horizontal plane. The target was programmed to appear on the four corners of a fixed square. A trial could start at any time after the end of the previous one. At the start of a trial, the target appeared at the upper right corner of the square and the monkey had to move the cursor to it. Once reached, the target jumped counterclockwise to the next corner and the monkey had to move the cursor again, and so on (Fig. 5). A trial was registered as successful only if the monkey reached the target at all of the four corners within 5 s. In the data we collected, there were 240 successful trials; see Fig. 5 for some sample traces of hand movement in successful trials. A total of up to C = 49 MI single units were simultaneously recorded.
|
3 s. We fitted the distribution by the Gamma distribution
![]() |
i = (ai, bi). We shall mainly report results attained by incorporating the fitted Gamma distributions in the scoring. Later, we will show that similar results were attained as well without exploiting these distributions.
The temporal pattern of the spiking activity in MI has more variability than in RA. Taking this into account, the time bin duration was selected so that 99.9% of the time bins each contained at most one spike. Under this condition, the largest time bin duration was about 10 ms. We therefore chose
= 10 ms. The binning reduced the size of the data while keeping most of its information. The hand position data were also down-sampled at 10-ms resolution to estimate the times when the corners were reached.
We used the movement sequences in the first 200 successful trials as the training data set to estimate the firing probabilities pi,c(j) within 1 s around each event, with i = 1, ..., 4 indexing the events, c = 1, ..., 49 indexing the units, and j = 100, ..., 100 indexing the time bins within 1 s around a time of event. The parameters
i for the Gamma distributions were estimated as well. The estimates of pi,c(j) and
i were then used to construct the template. The recording after the 200th successful trials was used to test the accuracy of the identification procedure.
The score function L(t) was computed according to Table 3 over the entire recording and then smoothed with a low-pass filter with cutoff frequency 0.5 Hz. Because successful trials occurred at a much lower frequency (
0.21 Hz) than the cutoff frequency, the smoothing had no aliasing effect on the estimation of the onset times of the movement sequences (i.e., the times when the first corner was reached) in successful trials.
Across the entire recording period, 622 time locations ti were identified where the smoothed score function
(t) peaked locally (Fig. 6). Based on the assumption that the scores were higher near the onsets of the movement sequences in successful trials than elsewhere, only ti with
(ti) >
(ti±1) were selected, resulting in 202 estimated onset times. For each of the selected ti, the corresponding event times were estimated as in Table 3 (Fig. 7).
|
|
1 s was defined as a false positive. The counts of true positives, false positives, selected ti, and events that actually occurred in the training data set, test data set, and the entire data set are as follows
![]() |
To see how sensitive the above evaluation of power was to the threshold for the mean error, we varied the threshold value from 0.5 to 1.5 s, causing the power to change from 62 to 73% (Fig. 8A). The slight change in the power indicates that the evaluation was robust to the threshold selection thus justifies using 1 s as the threshold value.
|
Finally, we tested the procedure without incorporating the estimated distributions of the IEIs in scoring. Equivalently, all Gi were set to 0. For the test data set, we again attained a high power (29/40) and a high rate of true positives (29/41). The mean error in the estimated event times was 0.226 ± 0.196 s and the errors associated with the four corners were 0.536 ± 0.607, 0.045 ± 0.062, 0.092 ± 0.189, and 0.231 ± 0.433 s. The results for the entire data set and those for the training data set were similar to these reported results. The results were only marginally inferior to those attained by exploiting the estimated distributions of the IEIs.
Comparisons with Kalman filter decoding method
A number of decoding methods have been proposed for MI of Macaque monkeys, including population vectors, linear filters, artificial neural networks, or Kalman filters (Carmena et al. 2003
; Georgopoulos et al. 1986
; Serruya et al. 2002
; Wu et al. 2006
). These methods have been demonstrated to be effective and accurate in estimating kinematic parameters such as hand movement direction or hand trajectory. However, none of them addresses the estimation of the timing of landmark events during a movement. This is largely attributed to the experiment paradigms these methods are designed for, which generate fairly straightforward hand movement of monkey, making it unnecessary to decode event timing during the movement. In contrast, in our squared movement paradigm, the monkey moved its hand following four corners of a square in each trial. In this paradigm, the times when the hand reached th