|
|
||||||||
INNOVATIVE METHODOLOGY
Picower Center for Learning and Memory, RIKEN-MIT Neuroscience Research Center, Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139
Submitted 24 October 2003; accepted in final form 3 June 2004
| ABSTRACT |
|---|
|
|
|---|
| INTRODUCTION |
|---|
|
|
|---|
To fully exploit these experimental advances, there is a corresponding need for new methods to analyze complex firing patterns across multiple simultaneous spike trains (i.e., N simultaneous continuous-time discrete stochastic processes). In practice, this may consist of 10100 simultaneous spike trains (1 train per neuron) each sampled every 0.1 ms, with the sample value "1" (a spike) occurring in each train with average frequency 0.110 Hz (the remaining sample values being "0"), and recorded for several hours. The cross-correlation function is a widely used method for detecting interactions between a pair of spike trains. A central peak indicates a tendency for coincident spikes, while an off-center peak indicates a tendency for one cell to fire after the other with a particular delay. However, only a few studies exist that describe ways to analyze interactions among the spike trains of more than 2 neurons (Abeles 1991
; Abeles and Gerstein 1988
; Abeles et al. 1993
; Aertsen et al. 1989
; Gerstein and Aertsen 1985
; Gerstein and Perkel 1969
; Gerstein et al. 1985
; Grün et al. 2001
; Gütig et al. 2003
; Lee and Wilson 2000
; Louie and Wilson 2001
; Martignon et al. 2000
; Nakahara and Amari 2002
; Palm et al. 1988
). Furthermore, most of these methods analyze firing patterns in terms of a set of precise interspike intervals (i.e., exact time delays between the successive spikes).
We develop a new approach for analyzing sequential firing patterns involving an arbitrary number of neurons based on relative firing order. Sequences are very important for neural processing. Examples include memories of a sequence of experienced events, a sequence of actions in an overall movement, or sequential recruitment of different brain areas during a task, as well as spike sequences in models of pattern recognition (Hopfield 1995
). Relative time order of (as opposed to exact time intervals between) events within a sequence is important for dealing with possibilities such as timescaling (both uniform and nonuniform), noise (resulting in variability in timing), and order-sensitive plasticity rules (Bi and Poo 1998
; Gustafsson et al. 1987
; Levy and Steward 1983
; Markram et al. 1997
). It can also help reduce the sample size problem associated with analyzing complex patterns based on exact time intervals.
Specifically, we develop a combinatorial method for quantifying the degree of matching between a "reference sequence" of N distinct "letters" (representing a particular target order of firing by N cells) and an arbitrarily long "word" composed of those letters (representing the relative time order of spikes in an arbitrary firing pattern). The method involves computing the probability that a random permutation of the word's letters would by chance alone match the reference sequence as well as or better than the actual word does, assuming all permutations were equally likely. Lower probabilities thus indicate better matching. For more complex words, we derive formulae for computing upper and lower bounds of this probability. Using the probabilities from across a heterogeneous set of words (such as those produced during the course of an experiment), the overall degree and statistical significance of sequence matching can be computed without the use of Monte Carlo techniques. Words may contain repeated letters and need not contain each letter in the reference sequence. Thus this method can sample whether traces of a particular sequence exist in words that do not necessarily contain every letter (i.e., it naturally handles missing spikes). Variations of our approach emphasizing different aspects of matching are described, as well as variations that modify the equally likely permutation assumption.
Although we have initially motivated our method in terms of single spikes from individual neurons, our approach is general and can be applied to other types of neural data, such as multispike bursts from multiple cells, EEG events from multiple locations, or suprathreshold signals across multiple pixels of an optical or functional magnetic resonance image (fMRI). That is, what we will often refer to as a single spike from a single cell could just as well refer to a burst of spikes from a single cell, a particular pattern from a subset of cells, an EEG event from a given location, or any other type of signal. Thus we describe our approach in abstract terms. Each type of event (e.g., a spike) is captured in the abstract term "letter," and a sequence of such events from multiple sources is represented by a "word" [i.e., a string of letters (with different sources represented by different letters)].
We have recently applied our method to quantify memory traces of sequential experience in the rodent hippocampus during subsequent slow wave sleep (SWS) (Lee and Wilson 2000
). We found recurring bursts of activity involving up to 6 neurons firing in an order that directly related to the rat's previous spatial experience. This case illustrates many of the conditions the method was designed to handle: a clear reference sequence (i.e., the sequence of place fields traversed by the rat) that we tested for traces of in sleep, missing letters, repeated letters, errors, and variable interletter and overall timescales.
| SUPPORTING METHODS |
|---|
|
|
|---|
The hippocampal data analyzed here consist of the same set of cells used in our previous work involving the relative-order analysis of hippocampal activity during slow wave sleep (SWS), and the experimental methods are described in detail there (Lee and Wilson 2000
). Briefly, an experiment consisted of a rat first sleeping (PRE), then repeatedly running back and forth along a narrow track for chocolate reward at each end (RUN), then sleeping again (POST). A rodent hippocampal "place" cell fires selectively when the animal is in a particular location (the cell's "place field") of an environment (Muller 1996
; O'Keefe and Dostrovsky 1971
). We extracted 2 sets of time periods from RUN, those in which the rat was running in one direction (POS laps), and those in the opposite direction (NEG laps). During each set of laps in a given direction, the rat repeatedly passed through a fixed sequence of place fields of 5 to 11 simultaneously recorded hippocampal CA1 place cells. For the relative-order analysis, each such sequence of cells formed a reference sequence (with cells labeled 1, 2, 3, 4,... representing the order in which the peaks of their place fields were traversed, 1 being traversed first), and we tested the activity of these same cells during PRE and POST SWS for traces of this RUN sequence (Lee and Wilson 2000
). (Note that for rats running along narrow tracks, many place cells fire only in one direction and those that fire in both directions do not necessarily have fields in the same location. Thus the sequences in each direction generally consist of different sets of cells and are not simply reversed versions of each other.) The results of both the relative-order and exact time interval analysis presented herein represent results pooled across the cells from both directions of 3 rats (1 experiment per rat). All of the hippocampal spike rasters displayed herein come from the POST SWS activity of the 10 simultaneously recorded place cells from one direction of one rat [with reference sequence 1234... , and corresponding to the "RAT1" POS direction cells reported in Lee and Wilson (2000)
].
Exact time interval analysis
For comparison with our relative-order method, we searched our hippocampal data for all exact time interval firing patterns that satisfied the following criteria: the pattern must consist of spikes from
4 distinct cells, must have duration (i.e., the time from the first to last spike)
500 ms, and must repeat at least once (i.e., occur at least twice). The first step in our procedure was to perform burst filtering on the spike train of each cell. This involved eliminating all spikes from a cell that followed another spike from that cell by less than max_isi (which was chosen to be 0, 15, or 50 ms), thus leaving only the first spike of each "burst," as well as isolated single spikes. We then reduced the times of all the remaining spikes to 1-ms precision (from the original 0.1-ms precision). With these remaining spikes, we used a modified version of the algorithm of Abeles and Gerstein (1988)
to search for exact time interval patterns that repeated at least once. To allow for some variability in timing, we binned the time shifts between corresponding spikes in each pair of candidate patterns into 5-ms bins (except in the case of max_isi = 0 ms, i.e., no burst filtering) before checking for pattern matching. Finally, for a firing pattern P2 to be considered to be a repeat of a firing pattern P1, either their first spikes had to be separated by
100 ms, or the last spike of P1 had to occur before the first spike of P2. This was done to eliminate repeats that were presumably attributed to longer stretches of continuing activity (up to 100 ms) than were handled by the burst filtering. The search was done separately for each set of cells from one direction of one experiment. For example, for the 10 cells from the POS direction laps of one rat, we searched for repeating patterns in the activity of these 10 cells during POST SWS and the RUN POS laps (but not the RUN NEG laps). Then for the 11 cells from the NEG direction laps of this rat, we searched for repeats in the activity of these 11 cells during POST SWS and the RUN NEG laps. We then pooled the number of repeats found across both directions of the 3 rats.
Simulated spike trains
To illustrate parsing schemes for other types of data (Fig. 5, A and B), spike trains were simulated as follows. A step function representing the strength of input (expressed in terms of the target average firing rate) into the cell as a function of time was created. A Gamma process of order 5, with its rate adjusted to match the current target firing rate, was used to determine the time of the next spike. Each spike was followed by an absolute refractory period of 1 ms, after which time the Gamma process was restarted for the next spike. Time was quantized into 0.01-ms bins.
|
| MAIN METHOD AND RESULTS |
|---|
|
|
|---|
Consider the simultaneous spike trains from N neurons over a given time period. How could one detect and quantify meaningful firing patterns that may occur in this data? One type of meaningful pattern is a sequence of spikes, each one from a different cell. We set out to test whether an arbitrary pattern of spikes from multiple cells (e.g., the spikes fired during a 500-ms window) contains significant matching to a particular sequence of this type.
First, we cast the problem in its most general form. Because (as mentioned above) what we call a single spike from a single cell could represent any of a number of different types of events, we denote each event from a particular source (e.g., a spike fired by cell 1) by a "letter" (e.g., "1"). Because we are concerned with the relative time order of events from multiple sources, the timing of events can be reduced to an ordered string of letters (with different letters representing different sources). Then the basic problem we address can be stated as follows. We start by defining a "reference sequence" as an ordered list of an arbitrary number (N) of distinct (i.e., nonrepeating) "letters." Then we ask: how can one quantify the degree of matching between a particular reference sequence and an arbitrary string consisting of only those letters? This arbitrary string, which we call a "word," may contain repeated letters (e.g., a cell may fire more than one spike) and also need not contain all the letters in the reference sequence (e.g., a particular cell may not fire a spike at all), but every letter in a word must appear in the reference sequence. For example, the word 4271 could represent a period in which cells 4, 2, 7, and 1 each fired a spike in that order (Fig. 1). Because the letters are just labels and because we restrict ourselves to reference sequences containing only one event from each source, a reference sequence of length N can be represented as 1234 N. We will assume this form for all reference sequences (unless explicitly stated otherwise). An example of an invalid reference sequence is 1234156789, given that no repeats are allowed.
|
Our approach to the above sequence matching problem is based on combinatorics and probability. First we describe the idea with simple cases. For instance, given a 2-letter word in which both letters are distinct (i.e., the 2 letters are different), there are 2 possible ways in which the letters could be ordered (e.g., 12 or 21), and thus a 1/2 chance that the letters are in the same order as in the reference sequence, assuming each ordering is equally likely. Given a 3-letter word in which all 3 letters are distinct, there are 3! = 6 ways in which the letters could be ordered (e.g., 123, 132, 213, 231, 312, or 321), and thus a 1/6 chance that the letters are in the same order as in the reference sequence, again assuming each of the possible orderings are equally likely. [Note that 3! = 3 x 2 x 1 and, in general, a! = a x (a 1) x (a 2) x x 3 x 2 x 1.]
Now we extend this process to words of arbitrary length and letter composition. Because longer words could contain imperfect matches to the reference sequence that are still highly improbable based on chance, we will include imperfect matches in our analysis. For instance, although it is highly unlikely to find the word 123456789 by chance, it is also highly unlikely to find a word such as 124563789 by chance. Just as in the simpler cases, chance is defined as the probability of getting a match as good as the best one actually found in that word or better (where in the simple 2- and 3-letter words above there were no matches better than having the 2 or all 3 letters in order), assuming all n! permutations of the letters of that length n word are equally likely. The lower this probability, the more unlikely, and thus the higher the degree of matching between the word and the reference sequence. Note that the probability is conditional on the specific letters that were actually observed in the word. Later we show that this is a reasonable approach.
The reason for computing the probability that a word would contain a match as good as or better than the best match found in that word, assuming all possible permutations of its constituent letters were equally likely, is that we are now dealing with imperfect matches. It is exactly analogous to the statistical procedure of determining a P value, such as in computing the probability that one would observe 80 heads in 100 tosses of a coin, assuming the coin is fair. To evaluate how unlikely this is by chance, one computes the probability of getting 80 or more heads, assuming the null hypothesis of a fair coin. Here "80 heads" corresponds to the best match found in a word, "more than 80 heads" corresponds to the better matches, and "fair coin" corresponds to the permutations being equally likely.
Thus we need 1) a precise definition of matching that covers perfect as well as various imperfect matches, and 2) a way to rank matches from best to worst so that the phrases "best match found" and "better match" make sense.
Match definition
We define matches with 2 numbers, x and y, in the following manner. A word W (which may have repeated letters) contains an (x, y) match to a given reference sequence S (which by definition cannot have repeated letters) if there exist x + y consecutive letters in W of which at least x letters are in strictly increasing order in S (but not necessarily consecutive in S). According to this definition, if S = 123456789, the word 11377 does not contain a (5, 0) match, but does contain a (3, 0) match [as well as (3, 1), (3, 2), (2, 0), and other matches], and the word 13436892 contains a (6, 1) match [as well as (6, 2), (5, 1), (5, 2), (5, 3), (4, 1), and other matches]. Essentially, our notation means that there are at least x letters in order, with at most y interruptions in between those matching letters.
Ranked match lists
Given a word W of length n and containing k distinct letters (n
k), the best possible match to the reference sequence S by optimally rearranging the letters of W is a (k, 0) match (i.e., k consecutive letters in W in the same order as in S and with no interruptions). In our approach, in the case that n > k, the ordering of the n k remaining letters does not matter. [Likewise, given any (x, y) match, the order of the n x remaining letters does not matter.] Thus there could be several different permutations of W containing the best possible match. For instance, if S = 123456789 and W = 42717, both 12477 and 71247 contain the best possible match. One reason we do not consider the specific arrangement of the remaining n k letters is that, as discussed later, we presume (but do not require) a preprocessing (i.e., "parsing") stage in which the letters and words are extracted from the raw data such that n is usually not much greater than k.
Now how do we rank the possible matches? Given a word W of length n and containing k distinct letters, the list of all possible matches is shown in Fig. 2. To begin with, we will not consider matches (1, y) for any y or (2, y) for y > 0. This is because every word contains a (1, y) match, and any permutation containing a (2, y) match with y > 0 also contains a (2, 0) match.
|
However, these constraints do not address how to rank (x1, y1) and (x2, y2) in the case that x1 > x2 but y1 > y2. This is partially addressed by the following. The intuition behind constraints (I) and (II) can be stated as the single constraint (c*): the ranking of matches must be such that a word that contains a worse match must not automatically also contain a better match. For instance, every word that contains an (x, y) match automatically contains an (x 1, y) match; thus an (x, y) match cannot be considered worse than an (x 1, y) match. However, (c*) also includes cases not covered by (I) and (II). For instance, according to (c*), (6, 1) > (4, 0), because the 1 interruption in the (6, 1) match will always still leave a (4, 0) match somewhere. Even if the interruption is placed in the middle, say in 123Z456, no matter what Z is, either 123Z or Z456 will be a (4, 0) match. Similarly, (6, 3) > (4, 1).
Still, these constraints do not force a unique ranking of all the possible matches listed above. That is, several different rankings obey the intuitive constraint (c*). Two such rankings are described next.
(H) THE "HORIZONTAL H RANKING".
Given any 2 possible matches [where an (x, y) match is possible if and only if x
k and x + y
n], assign (x1, y1) > (x2, y2) if either 1) x1 > x2, or 2) x1 = x2 and y1 < y2. That is, the better matches are the ones with more letters in strictly increasing order regardless of interruptions, or the ones with the least interruptions given an equal number of letters in strictly increasing order. Thus the best possible match is the (k, 0) match, followed by (k, 1), then (k, 2), and so on, until (k, n k), then (k 1, 0), followed by (k 1, 1), (k 1, 2), and so forth, then ending with (2, 0). This is called "Horizontal" because matches are ranked following left to right horizontal lines, starting with the top row, in Fig. 2.
(D) THE "DIAGONAL D RANKING".
Given any 2 possible matches, assign (x1, y1) > (x2, y2) if either 1) x1 y1 > x2 y2, or 2) x1 y1 = x2 y2 and x1 > x2. Furthermore, consider only matches (x, y) such that x y
2. That is, the better matches are the ones with the greater difference between the number of letters in strictly increasing order and the number of interruptions, or the ones with the greater number of letters in strictly increasing order if the differences are equal (i.e., this ranking balances rewarding longer matches with penalizing more interruptions). Thus the best possible match is the (k, 0) match, followed by (x, y) such that x y = k 1, ranked by decreasing x [i.e., (k, 1) (assuming k + 1
n; if not, skip it) then (k 1, 0)]. Then next are (x, y) such that x y = k 2 [i.e., (k, 2) (assuming k + 2
n; if not, skip it), then (k 1, 1), then (k 2, 0)]. Continue this as long as x y
2. Thus the last (worst) match we consider is (2, 0). This is called "Diagonal" because matches are ranked following upper-right to lower-left diagonals, starting with the upper leftmost such diagonal [i.e., that containing just (k, 0)], in Fig. 2.
We prove that both these rankings obey the constraint (c*) in APPENDIX B. In APPENDIX C we evaluate these and other possible rankings in terms of the probabilities that a random permutation contains each type of match. Intuitively, less-probable matches should generally be considered better. We have primarily used the D ranking scheme in our previous work, but we also showed that the H ranking gave similar results there (Lee and Wilson 2000
). Both schemes have merit, as does a simplified H ranking scheme in which 1) (x1, y1) > (x2, y2) if x1 > x2 and 2) (x1, y1) = (x2, y2) if x1 = x2 (i.e., the number of interruptions y does not matter). As discussed in APPENDIX C, it is a matter of which aspects of matching one wants to emphasize.
Match probability calculation
Given any match ranking scheme, our combinatorial method provides a straightforward way for computing probabilities (and thus the significance) of sequence matching in words of arbitrary length and letter composition (including words with repeated letters and words with matches that have interruptions), and for reference sequences of arbitrary length. The method can be described simply as follows. Given a particular reference sequence and an arbitrary word of length n, find the best match in the ranked match list that is actually found in the word. Then compute how rare such a match is by chance (i.e., assuming all n! permutations of the word's letters were equally likely) by determining the fraction of the n! permutations that contain a match that is as good as that match or better according to the ranked match list. We illustrate this procedure with examples.
Given reference sequence S = 123456789, say we observe the word 524679. For this word n = 6, k = 6. Using the D ranking scheme, the ranked match list (best to worst) is: (6, 0) > (5, 0) > (5, 1) > (4, 0) > (4, 1) > (3, 0) >... . The best match from this list actually found in the word is a (5, 0) match (i.e., 24679). Out of the 6! = 720 possible permutations of the 6 letters, there are 10 permutations whose best match is a (5, 0) match (not necessarily 24679), and 1 permutation with a better match, the perfect (6, 0) match 245679. Thus the probability of getting a (5, 0) match or better in a word of this letter composition by chance (assuming all permutations were equally likely) is very small: (10 + 1)/6! = 0.015 (Fig. 3).
|
As these examples show, our method allows us to quantify matching in words that do not necessarily contain every letter in the reference sequence, that may have repeated letters, that may have matches with interruptions, and that may have additional letters outside the matching segment. All of these situations are handled in the same way with ranked match lists and permutations.
The null hypothesis
To reiterate the null hypothesis, we assume that, given a word of particular length and letter composition, all possible orderings (i.e., permutations) of those letters (including any repeated letters) were equally likely. That is, we were assuming no particular ordering. We then compute the probability that such random ordering of these letters would by chance produce a match to a specific reference sequence that is at least as good as the best match observed in the actual ordering of letters in the word. As mentioned above, this is a conditional probability, that is, conditional on the letters actually observed in the word.
This approach gives us the same answers as other reasonable null hypotheses.
For instance, consider the following null hypothesis. Assume each of the N letters in the reference sequence occurs exactly once in each word, but that we observe only n < N of them (e.g., some spikes are missing because of neural variability). Note this also means that n = k (i.e., words do not have repeated letters). Further assume that the relative position of the unobserved letters could be anywhere. Consider the probability that we would observe a match as good as or better than the best match found among any n observed letters, assuming all N! underlying permutations of the N letters were equally likely. The answer is the same as the original probability computed above. For example, given the reference sequence 123456789 and the word 524679, we observe only 6 letters, whereas the letters 1, 3, and 8 are missing. For each permutation of any 6 observed letters, there are 9!/6! unique ways 3 unobserved letters could be located with respect to the positions of those 6 observed letters. The numerator in Fig. 3 thus becomes (11 x 9!)/6!, whereas the denominator becomes (6! x 9!)/6! = 9! (representing every permutation of the 9 letters in the reference sequence), yielding the same probability 0.015. That is, 0.015 is the chance that any 6 letters selected from a random permutation of 9 distinct letters would contain a (5, 0) match or better with respect to the reference sequence 123456789. However, if k < n (there are repeated letters), then the form this alternate null hypothesis should take is less clear, whereas our original approach handles repeated letters in a natural way.
Now consider a null hypothesis that is basically an inversion of our original one. As before, given a reference sequence S = 1234 N of length N and a word W of length n containing k = n distinct letters (n
N), find the best match in that word (according to the chosen match ranking scheme). Call that match F. However, this time treat the observed word as fixed. Instead, ask what is the probability that a random reference sequence (i.e., any permutation of S) would have yielded a match in W that is as good as or better than F (i.e., the best match that had been found in W with respect to the original reference sequence S). Assuming all N! possible reference sequences were equally likely, the answer is the same as the original probability above. This is proven in APPENDIX D. For example, this means that the word 524679 contains a (5, 0) match or better with respect to 1.5% (i.e., 0.015) of the 9! possible 9-letter reference sequences (see APPENDIX D). Although this result holds only for words without repeated letters, we will return to it when we consider the issue of sample size.
At a fundamental level, our conditional probability approach separates the issue of generating a word from that of the order within a word. One could imagine a null model that did not separate these issues and instead specified how each word was to be produced from among all possible words. This model might include the probability of producing a word of a given length, of getting a particular set of letters, of having certain letters repeat, of having these letters in a certain order, and all this perhaps as a function of previous words. For instance, consider a first-order Markov process with n + 1 states (one state for each letter plus a termination state), with the initial letter drawn from some probability distribution, and with transition probabilities for moving from each state to any other state (including itself) specified. This model would implicitly specify a null hypothesis of what letter orderings to expect (most likely not equally probable). It would also implicitly specify the expected probabilities of getting each type of (x, y) match, and from this we could similarly compute a probability of observing a particular match or better with respect to a chosen match ranking. That is, our relative-order and (x, y) match ranking approach could also be applied to such null models as these. However, our conditional probability approach sidesteps the complicated issue of how words are generated by taking the letters of each observed word as given. Instead we focus exclusively on the issue of order by assuming that every possible ordering of those observed letters was equally likely. This also means that the ordering of letters within each word is assumed to be independent of the ordering in every other wordan assumption used later in determining the significance of matching across a set of words. Our focus is only on the order of letters and not how they are generated because we are concerned with sequences, not word length, letter composition, or other issues.
Is it a good idea to not assume any particular model of word generation? A modified version of the coin-flip analogy can again be useful here. Say we observe 100 consecutive trials of an unknown experiment with 2 possible outcomes (called "heads" and "tails"). Unlike with a real coin, here we do not know whether there is a fixed probability of getting a head on any given trial, whether trials are independent, or whether any aspect of the system is constant in time. How would one analyze such a potentially complicated system? One of the first things to measure would be the observed fraction of heads. Say it is 0.6 (i.e., 60 heads). To get a sense of whether this fraction is unusually large, we could compare it to the fraction expected from the simplest possible model: 100 independent trials each with fixed probability 0.5 of getting heads. Based on this model, the probability of observing a fraction
0.6 is 0.03 (i.e., low). What can we conclude from this? We cannot conclude that the trials are independent, each with fixed probability 0.6 of getting heads. In fact, based on the fraction alone, we cannot conclude that any particular model of how outcomes are generated is likely to be correct. For instance, a deterministic sequence of "HHHTT" repeating forever would also produce the fraction 0.6. We can say only that it is unlikely for the simplest model100 independent flips of a fair cointo account for this observed fraction of heads. However, this knowledge is of value. This is exactly what we have done with respect to sequences. We have developed a way to measure the amount of sequence structure in an arbitrary word (and, as we shall show later, in a set of words, i.e., trials) by comparing it to the amount of sequence structure expected from the simplest possible model: the random model of ordering. It is analogous to measuring the fraction of heads. It answers the question of whether there is a bias in ordering at all, and how large the bias appears to be. This measure says nothing about how the observed amount of sequence structure comes about (i.e., how the words are generated). However, as we have seen with the coin-flip analogy, in the absence of constraints (such as those based on possible physical mechanisms), the space of possible models is vast. Thus here we avoid the complex issue of models of word generation.
Therefore if one asks what justifies comparing the data against such an unlikely null model (i.e., equally likely permutations), we can say the following. It is not because we believe the null model to be true, but because it is the first standard of comparison for ordering, just as a fair coin is the first standard for analyzing a series of coin flips. In fact, we have tested the equally likely assumption in real data using shuffles (and in some cases all possible permutations), as described below, and found the data consistent with such a model (Lee and Wilson 2000
). Thus this null model is not necessarily unlikely.
More efficient ways to compute exact, upper bound, and lower bound match probabilities
Although our combinatorial approach applies to words of arbitrary length and letter composition, this does not mean that it is always easy to calculate the corresponding probabilities, especially for longer words. Consider a word of length n with k distinct letters. Let m be an N-dimensional "multiplicity" vector whose entries (m1, m2, m3,... , mN) give the number of times each letter (1, 2, 3,... , N) occurs in this word. Thus k elements are nonzero, and i mi = n. A brute force probability calculation would require testing each of the n!/[
i (mi!)] distinct permutations to see whether they contained the best match found or better. Therefore we have developed more efficient ways to calculate the exact probability for some important cases, and upper and lower bounds of the probability for other cases. In particular, we have a formula for computing the exact probability of getting a (k, 0) match (i.e., the best possible match) in any word (i.e., given any m). Furthermore, we have a formula for computing an upper bound probability for getting any (x, y) match, and a lower bound probability for getting any (z, 0) match where z < k. (See APPENDIX E for the formulae and derivations.)
With these formulae we can compute both upper and lower bound probabilities of getting a given match or better in any word with respect to any ranking scheme. For example, consider again the reference sequence 123456789 and the word 51469784. For this word m = [1 0 0 2 1 1 1 1 1]. Using the D ranking scheme, the ranked match list (best to worst) is: (7, 0) > (7, 1) > (6, 0) > (6, 1) > (5, 0) > (6, 2) > (5, 1) > (4, 0) > (5, 2) >..., and (5, 1) is the best match found. We can use formula E2 to calculate an upper bound of the probability of getting a (5, 1) match or better according to this ranking scheme as follows. The matches we must count are (7, 0), (7, 1), (6, 0), (6, 1), (5, 0), (6, 2), and (5, 1). Because formula E2 gives the probability of getting any (u, v) match with u
x and v
y, we use it twiceonce with x = 5 and y = 1, and once with x = 6 and y = 2. The first gives an upper bound probability of getting a match inside A (Fig. 4, solid box), the second an upper bound probability of getting a match inside B (Fig. 4, dashed box). Adding these 2 upper bounds together gives us the desired upper bound. Here we get 0.1038 compared with the exact probability 0.0580. (Although not necessarily the tightest bound, it is good enough for practical use in many cases, e.g., in determining the significance of matching across many words, as shown below.) To compute a lower bound probability of getting a (5, 1) match or better, find the worst (z, 0) match as good as or better than (5, 1), here (5, 0), and compute a lower bound probability of getting this match using formula E3. This is the desired lower bound because it is a lower bound of getting a match that is as good as or better than (5, 1). Here we get 0.0195 compared with the exact probability 0.0580. (Although again not necessarily the tightest bound, it is of practical use in many cases.) An analogous approach will work for any ranking scheme.
|
In computing probabilities, we have so far used the null hypothesis that all permutations are equally likely, but our approach can also handle other null hypotheses in which permutations have unequal weightings. All that is necessary is a way to weight the permutations in a manner appropriate to test what is desired.
For instance, suppose it is known that pairs of letters tend to occur in the same order as in the reference sequence S with a probability B, where B is called a bias. If B > 1/2 then we would expect more sequence matching to S than based on equally likely permutations. To quantify the degree of sequence matching beyond that expected based on the bias, we weight more those permutations with more pairs of letters in order. The exact weighting is a matter of choice.
For example, we could weight each n letter permutation by Bf x (1 B)r/H, where f = the number of adjacent letter pairs in the permutation that are in the same order as in S, r = the number in the opposite order, and H is a normalization factor to make the weights of all permutations sum to 1. For 4271, f = 1 (from 27) and r = 2 (from 42 and 71). (Note repeated adjacent letters would be ignored; thus f + r
n 1.) If B > 1/2, this weighting would tend to increase the expected probability of getting a given match or better. [A more complicated version could involve bias values that are specific to each letter pair, e.g., B(3, 5) = the probability of observing the pair 35 given that one observes a 35 or 53.] This weighting scheme assumes a process in which each letter depends only on the letter immediately before it, with the resulting match probability indicating the amount of longer sequence matching that cannot be accounted for by the random chaining together of the shorter pair biases. However, in general there is no reason to believe such a process is the correct one.
Alternatively, we could weight each permutation by Bf x (1 B)r/H, where f = the number of all n x (n 1)/2 letter pairs in the permutation that are in the same order as in S, r = the number in the opposite order, and H is again for normalization. This treats each of the n x (n 1)/2 letter pairs as independent, but because they cannot all be independent, it gives an upper bound of the effect of a bias B > 1/2, at least as far as perfect matches [i.e., (n, 0) matches in words of length n] are concerned. We have used this latter weighting in our work on hippocampal sequences in SWS (Lee and Wilson 2000
). There we set B = the average bias observed across all 2-letter words with 2 distinct letters. One could also set B = the average bias observed across all words regardless of length [with each n-letter word contributing n2 letter pairs to the average, where n2 is the number of all its n x (n 1)/2 possible letter pairs in which the 2 letters are distinct].
Finally, consider weighting each permutation i by h x bf(i), where again f(i) = the number of all n x (n 1)/2 letter pairs in permutation i that are in the same order as in S, and h and b are fit to 2 equations
![]() | (1) |
![]() | (2) |
Table 1 shows the different permutation weights for the simple case of W = 123 under the second ("all pairs upper bound") and third ("all pairs") weighting schemes with B = 0.6. For W = 524679 and the D ranking, the match probability [i.e., the probability of getting a (5, 0) match, which is the best match found, or better] goes from 0.015 (equally likely) to 0.074 (all pairs upper bound) and 0.039 (all pairs) with B = 0.6. Thus such a match is considered that much less rare when a 0.6 pair bias is taken into account.
|
Significance of matching across many words
As mentioned above, the probability we calculate (whether with the equally likely permutation assumption or any other null hypothesis) is identical to a P value, thus giving us the significance of sequence matching for a single word. In a typical experiment there will be multiple observations, yielding a set of words, each of which will have a match probability associated with it. Some words may have low probability matches, whereas others may not, with a larger than expected proportion of low probability matches indicating general matching to the reference sequence.
An overall significance of sequence matching between such a set of words and the reference sequence can be determined in the following manner. First we define a "P
P' trial" as a word for which the best possible sequence match produced by optimally rearranging its letters [generally a (k, 0) match] has probability
P' of occurring (i.e., a word that could have contained a match of probability P' or rarer based on its constituent letters). We define a "P
P' match" as a word for which the probability of getting the best match actually found in the word or better is
P'. For example, given the reference sequence S = 123456789 and the D ranking, if we set P' = 1/24, then the word 2471 is a P
1/24 trial (given that a rearrangement of its letters to 1247 contains a match of probability
1/24, i.e., 1/24 exactly) but not a P
1/24 match [given that the best match actually found, a (3, 0) match, has probability >1/24, i.e., 7/24], the word 524679 is both a P
1/24 trial and a P
1/24 match (given that the best match actually found has probability
1/24, i.e., 0.015; Fig. 3), and the word 123 is neither a P
1/24 trial or P
1/24 match. The expected P
P' match/trial ratio for a set of words is P', assuming the ordering of letters within each word is independent of the ordering within all the other words. Given M matches out of T trials, the significance of the observed ratio M/T can be quantified with a Z score determined from the normal approximation to the binomial distribution
![]() | (3) |
![]() | (4) |
![]() | (5) |
One should note the difference between the ratio (matches/trials) and the significance (Z score and P value). The comparison between the observed and expected match/trial ratios reveals what the effect size is likely to be, whereas the Z score (and P value) gives an idea of the confidence we should have in that effect size based on the amount of data we have.
This approach is valid for any chosen P'. Because we are generally concerned with finding at least moderately long matching sequences, we set P' to values <1/2, such as 1/6, 1/24, or 0.01. Given that 1/3! = 1/6 is the probability of getting a (3, 0) match in an n = 3, k = 3 word, setting P' = 1/6 means that we are focusing on matching sequences that generally have
3 letters in order (i.e., "3rd-order interactions"). Likewise, setting P' = 1/4! = 1/24 focuses on matching sequences with generally 4 or more letters in order (i.e., "4th-order interactions"), and P' = 0.01, 5 or more letters. Thus by setting P' we can focus on sequence matches of any desired length (i.e., "nth-order interactions") within the same framework. Note that because P
P' matches can have probabilities <P', the match/trial ratio, Z score, and overall P value are actually lower bounds of the true amount of matching at that particular P' level. Thus a warning: to prevent a significant underestimate of the degree of matching, one should check the following 2 points. First, one should check that most of the P
P' matches do not have probabilities that are far lower than P'. Second, one should check that most of the P
P' trials could contain a match whose probability is simultaneously
P' and close to P'. As an extreme illustration of the second kind of underestimate, suppose one chooses P' = 1/24 and that all the P
P' trials are 5-letter words, each with 5 distinct letters. Assuming equally likely permutations, the best possible match (5, 0) has probability 1/120, whereas the second best match (4, 0) has probability 1/15. Thus the expected match/trial ratio is not 1/24 but rather 1/120, given that the only way for a word to be a P
P' match is for it to contain a (5, 0) match.
In practice, we can use the exact, upper, and lower bound match probability formulae to quantify matching over a set of words in the following way. To begin with, it is important to distinguish 2 approaches with different goals. First, to compute a lower bound of the match/trial ratio, count a word as a P
P' trial unless shown otherwise (i.e., unless some lower bound probability of the best possible match is >P'), and do not count a word as a P
P' match unless shown otherwise (i.e., unless some upper bound probability of the best match found in the word is
P'). Alternatively, to compute an upper bound of the match/trial ratio, count a word as a P
P' match unless shown otherwise (i.e., unless some lower bound probability of the best match found in the word is >P'), and do not count a word as a P
P' trial unless shown otherwise (i.e., unless some upper bound probability of the best possible match is
P', or unless it has already been counted as a P
P' match). In the case of the standard equally likely permutation assumption, formula E1 for the exact probability of a (k, 0) match allows an unambiguous determination of whether a word is a P
P' trial for both the upper and lower bound ratios. The lower bound ratio should be used as a conservative test to show that significant matching exists, and the upper bound ratio as a conservative test to show that significant matching is not present. In addition, in trying to show the latter, one should verify the 2 points mentioned above to prevent an underestimate. Note that it is not necessary to verify these points in computing a lower bound of matching because an underestimate is still a lower bound. In our hippocampal sequence learning work, we chose P' = 1/24, and we computed the lower bound ratio (and Z score) to demonstrate significant matching between the experienced sequence and the set of words from the period subsequent to that experience. We computed the upper bound ratio (and Z score) to demonstrate a lack of matching between the experienced sequence and the set of words from the preceding period. We also verified that most of the P
P' matches from that preceding period did not have probabilities that were far lower than P', and that most of the P
P' trials could have contained a match whose probability was simultaneously
P' and close to P' (Lee and Wilson 2000
). However, for the reasons mentioned above, it is generally easier to use this approach to demonstrate that significant matching exists than to demonstrate that it does not.
By considering matching over a set of words, our method allows one to detect traces of a particular reference sequence in a heterogeneous set of words that do not contain every letter and that may have few letters in common with each other. Such a situation may occur during spontaneous activity in which only a fraction of cells are active at a given time, as in the hippocampus during SWS (Lee and Wilson 2000
). For instance, consider the following set of 4 words made from 7 letters: 125, 2367, 146, and 3455. Each word contains no more than 4 out of the 7 possible letters, and no pair of words has more than one letter in common. However, together these 4 words indicate the strongest traces of 3 out of the 7! = 5,040 possible reference sequences: 1234567, 1234657, and 1234675.
Shuffle analysis can be used to yield a complementary estimate of match significance across a set of words. First, compute the match/trial ratio with respect to a randomly shuffled version of the reference sequence (e.g., S = 472936185). The significance of matching to the original reference sequence could then be estimated from the distribution of match/trial ratios for many different shuffled reference sequences. For instance, one could determine that the match/trial ratio with respect to S = 123456789 is Z' SDs above the mean match/trial ratio of the shuffle distribution. This allows a measure of significance across words that does not assume any particular null hypothesis for the distribution of permutations, but rather uses the distribution across the observed words themselves. However, this does not allow one to measure the amount of sequence structure in an absolute sense (i.e., with respect to a simple, known null hypothesis, such as equally likely permutations). We computed the shuffle significance for our hippocampal data and found it to be similar to that based on the equally likely assumption. Furthermore, the shuffle distribution itself was consistent with the equally likely permutation and independent trials model (Lee and Wilson 2000
).
Application to specific types of neural data
We have described our approach in terms of abstract symbols (letters and words) to focus only on the essential elements of relative order and sequence matching. The results are general and can be applied to a broad range of multiple-source neural data. To apply our results to a specific type of neural data one simply needs an appropriate (and preferably natural) "parsing" method to convert the data into letters and words.
For instance, in an experiment composed of separate trials, the activity during each trial could give a different word. Within each trial, the order of letters (e.g., with each letter derived from the activity of a particular neuron or the signal at a local EEG site) could be based on the relative time of occurrence of signals (e.g., spikes or particular EEG events) following a particular event, such as the presentation of a stimulus, the beginning of a delay period, the beginning of a motor movement, and so forth. The reference sequence would be a presumed order of activity that one wanted quantitative evidence of within a set of trials. In Fig. 5, A and B we illustrate possible parsing schemes for both low and high firing rate cases using simulated spike trains. In the low-rate case, letters are assigned to the first spike fired by each cell. In the high-rate case, letters are assigned to the first moment the instantaneous firing rate rises sufficiently above the baseline rate. (A similar parsing scheme could be used for imaging data in which the intensity in a given region is substituted for the instantaneous firing rate.) If a cell does not fire any spikes (low-rate case) or significantly raise its firing rate above baseline (high-rate case) during the trial period, that letter is considered missing for that trial.
One could also parse a continuous stretch of data into a set of words. In our previous study, we parsed a continuous stretch of spike activity from many rat hippocampal CA1 pyramidal cells during slow wave sleep (SWS) (Lee and Wilson 2000
). The characteristic activity of such cells during SWS consists of intermittent population bursts of nearly coincident (within 100 ms) activity from a subset of cells, separated by relatively long periods (from 200 ms to seconds) with little activity. Furthermore, each time a cell fires in such a population burst, it tends to produce not a single spike, but a burst of spikes with small (5-ms) interspike intervals (Ranck 1973
). Thus we parsed the data in the following manner (Fig. 5C). Each cell was identified with its own letter. Each time a cell fired a set of consecutive spikes with interspike intervals less than max_isi (here 50 ms), the activity was represented by a letter occurring at the time of the first spike. Each isolated spike also produced a letter. Then the letters from all the cells were merged to form a single time-ordered string. This string was then broken between every pair of adjacent letters separated by more than max_gap (here 100 ms) into a set of words. Each word thus represents the relative order of activity within a SWS population burst. This set of words was then tested for matching to a reference sequence representing the order of spatial activation of these same cells during running behavior.
Finally, as mentioned above, although our approach works for words of arbitrary length and letter composition, it gives the most useful answers if the length of each word is not significantly larger than the number of distinct letters in that word (i.e., if n is not much greater than k). For instance, a parsing scheme that produced many words such as 332557778998988A (i.e., assigning a letter to every spike in Fig. 5C, top panel) would not be as useful as one that produced 325789A instead (i.e., assigning a letter only to the first spike of each burst fired by a cell), where "A" represents cell 10. Thus one criterion for a good parsing method is that it produces a set of words with this property.
Sample size considerations
The significance of matching across many words depends on the amount of data available (i.e., the number of trials). An important aspect of our relative-order and probability approach is that it can greatly reduce the sample size problem associated with finding evidence for or against a particular complex firing pattern, especially compared with methods that classify patterns based on exact time intervals (Abeles 1991
; Abeles and Gerstein 1988
; Abeles et al. 1993
; Aertsen et al. 1989
; Gerstein and Perkel 1969
; Grün et al. 2001
; Gütig et al. 2003
; Martignon et al. 2000
; Nakahara and Amari 2002
; Palm et al. 1988
). That is, with larger n (i.e., more spikes in each pattern), it becomes harder to accurately estimate probability densities in the (n 1)-dimensional space of exact (binned) intervals (i.e., there are too many possible patterns for a given amount of data).
Say we have N neurons and are concerned with patterns that occur within J ms. For simplicity, assume that each neuron contributes exactly one letter per J ms. First consider patterns with exact time intervals. If we divide J into U time bins, then there are
![]() | (6) |
However, our method can do much better than N!. In fact, it effectively reduces this number to approximately 1/P', where P' is the probability we set that determines which words are matches and trials. This is attributed to the result described above and in APPENDIX D. For example, say n = 8, we choose the D ranking scheme, and we set P' = 1/24 = 0.042. Say we observe an 8-letter word W1 with 8 distinct letters. The worst match W1 could contain and still be a P
1/24 match is (6, 2), which has probability 0.041 (which is nearly 1/24). According to APPENDIX D, this also means that W1 is a P
1/24 match with respect to 0.041 x 8! out of the 8! possible reference sequences. That is, although there are 8! possible relative-order patterns, each word is evidence for approximately P' x 8! of them and evidence against (1 P') x 8! of them. This also applies to words with missing letters. Say a word contains only 4 out of the 8 letters (once each) (e.g., W2 = 1247). Only a (4, 0) match (which has probability 1/24) will make it a P
1/24 match. However, such a word will be evidence for 8!/4! = P' x 8! reference sequences, given that W2 contains a (4, 0) match with respect to any reference sequence with the letters 1, 2, 4, and 7 in that order, regardless of where the missing letters are placed around them (there being 8!/4! unique ways to do so, e.g., S = 12345678 and 61284735). Therefore an estimate of the amount of sequence structure with respect to each of the N! possible relative-order patterns requires a sample size of only a few times 1/P' trials (e.g., 100200 trials for P' = 1/24) regardless of the size of N, as opposed to a few times N! trials. Essentially, we have reduced the required sample size by allowing matches that are partial and imperfect to an extent determined by P'. (Note the approximate nature of 1/P' is ascribed to the presence of words with repeated letters and words that cannot contain a match with probability exactly =P'.)
To illustrate the sample size issue, we compare our previous relative-order analysis of hippocampal spike activity during SWS (Lee and Wilson 2000
) with an exact time interval analysis of the same data (see SUPPORTING METHODS). Consider the activity of the RUN sequence place cells from both directions of the 3 rats. Using our relative-order analysis (with parsing parameters max_isi = 50 ms, max_gap = 100 ms), in POST SWS we had found 35 P
1/24 matches to the RUN reference sequence out of 270 total P
1/24 trials (vs. 270/24 = 11 matches expected, P < 4 x 109), providing strong evidence of RUN sequence structure in subsequent SWS. Because P' was set to 1/24 = 1/4!, this analysis only considered matches with
4 distinct letters (cells) in order. We searched the same POST SWS data (98 min total) for exact time interval patterns consisting of spikes from
4 distinct cells firing within a maximum duration of 500 ms (which is longer than the matches found with relative-order analysis). Even when using 5-ms bins [compared with standard 1- to 3-ms bins (Abeles et al. 1993
)], we found no patterns that repeat within POST SWS using a burst filtering max_isi of 50 ms (as used in the relative-order analysis), and only 6 patterns that repeat using max_isi = 15 ms. None of these patterns consisted of spikes from more than 4 distinct cells or occurred more than twice. Although one of these patterns matched (in relative order) the RUN reference sequence, not only were both occurrences detected by our relative-order method, but our method also detected several more spikes firing in the correct order (Fig. 6A ). Furthermore, our method also detected 33 additional (P
1/24) matches to the RUN sequences.
|