## Abstract

Recently, the SPIKE-distance has been proposed as a parameter-free and timescale-independent measure of spike train synchrony. This measure is time resolved since it relies on instantaneous estimates of spike train dissimilarity. However, its original definition led to spuriously high instantaneous values for eventlike firing patterns. Here we present a substantial improvement of this measure that eliminates this shortcoming. The reliability gained allows us to track changes in instantaneous clustering, i.e., time-localized patterns of (dis)similarity among multiple spike trains. Additional new features include selective and triggered temporal averaging as well as the instantaneous comparison of spike train groups. In a second step, a causal SPIKE-distance is defined such that the instantaneous values of dissimilarity rely on past information only so that time-resolved spike train synchrony can be estimated in real time. We demonstrate that these methods are capable of extracting valuable information from field data by monitoring the synchrony between neuronal spike trains during an epileptic seizure. Finally, the applicability of both the regular and the real-time SPIKE-distance to continuous data is illustrated on model electroencephalographic (EEG) recordings.

- data analysis
- synchronization
- spike trains
- clustering
- SPIKE-distance

measures that estimate the degree of synchrony between two or more simultaneously recorded spike trains are important tools in many different areas of neuroscientific research (Kreuz 2011). Among many potential applications they can be used to evaluate the role of synchronous neuronal firing in signal propagation (Reyes 2003), to estimate the pairwise correlation within a neuronal population (Pillow et al. 2008), or to quantify the role of spike synchronization in feature binding (Singer 2009).

Some of the most widely used measures of spike train dissimilarity depend on a parameter that determines the temporal scale in the spike trains to which the measure is sensitive. Examples include the Victor-Purpura spike train distance (Victor and Purpura 1996) and the van Rossum distance (van Rossum 2001). Recently, the ISI-distance (Kreuz et al. 2007, 2009) and the SPIKE-distance (Kreuz et al. 2011) have been proposed as parameter-free and timescale-adaptive alternatives. These two measures are complementary to each other: While the ISI-distance quantifies local dissimilarities based on the neurons' firing rate profiles, the SPIKE-distance tracks dyssynchrony mediated by differences in spike timing. The latter kind of sensitivity can be very relevant since coincident spiking is found in many different neuronal circuits, e.g., in the visual cortex (Priebe and Ferster 2008; Usrey and Reid 1999) and in the retina (Meister and Berry 1995; Shlens et al. 2008).

In some situations it is sufficient to evaluate spike train synchrony at a rather low temporal resolution, e.g., by means of a moving-window analysis where the level of synchronization within a certain interval is compressed into a single value and then compared for successive intervals. On the other hand, many applications require a high temporal resolution, e.g., in order to detect replay of precisely timed sequential patterns of neural activity (Ji and Wilson 2007), to track spike train response variability within a neuronal population (Kreuz et al. 2009; Mitchell et al. 2007), or to understand the role of synchronous firing in the neuronal coding of time-dependent stimuli (Miller and Wilson 2008). In epilepsy research, a high temporal resolution could help to gain a deeper understanding of the neuronal spiking patterns involved in the different phases of seizure generation, propagation, and termination (Bower et al. 2012; Truccolo et al. 2011).

Previously, the Victor-Purpura and van Rossum distances (as well as some other spike train distances) were compared against the ISI-distance (Kreuz et al. 2007, 2009) and the SPIKE-distance (Kreuz et al. 2011) regarding their capability to evaluate the overall similarity of two or more spike trains. However, the maximum possible temporal resolution is achieved when one value of dissimilarity is calculated for each time instant and a continuous time profile is obtained. In contrast to the Victor-Purpura and van Rossum distances, the ISI-distance and the SPIKE-distance can be calculated and visualized in such a time-resolved manner. Yet, as already noted by Kreuz et al. (2011), while the original definition of the SPIKE-distance yields the expected results after temporal averaging and also correctly reflects long-term trends by means of a moving average, slightly unreliable spiking events lead to spurious high instantaneous values. In the first part of this study, we remedy this problem by considering spike time differences between nearest spikes instead of separating differences between preceding spikes and differences between following spikes.

After improving the SPIKE-distance, we extend its applicability to situations in which the degree of synchrony between two or more simultaneously recorded spike trains is monitored in real time. In the field of brain-machine interfaces this may be a promising approach to the rapid online decoding of neural signals needed to control prosthetics (Hochberg et al. 2006; Sanchez et al. 2008). In epileptic patients who are refractory to medical treatment, the method could be applied to large ensembles of single neurons close to or within the epileptic focus and then be integrated into a prospective algorithm aimed, e.g., at early seizure detection (Jouny et al. 2011).

Most spike train distances calculate one value of overall spike train synchrony for a time interval once this time interval has passed. There is a lack of measures that can estimate synchrony in a time-resolved way and, even more striking, in a causal way. The SPIKE-distance, similar to the ISI-distance, is calculated from instantaneous values of spike train dissimilarity for which at each time moment not only preceding spikes but also following spikes are taken into account. This noncausal dependence on future spiking does not allow for a real-time calculation. In the second part of this study we modify the SPIKE-distance such that the instantaneous value of dissimilarity for two or more spike trains relies on past information only and can be calculated in real time and in a causal manner.

We illustrate the new methods on three different applications. First we validate the improved definition of the SPIKE-distance and its real-time variant on artificially generated spike trains. We show that these measures are able not only to track the overall synchronization within a group of two or more spike trains but also, because of the reliability added by the revised definition of the distance, to track and visualize changes in instantaneous clustering, that is, to follow the evolution of (dis)similarity patterns within multiple spike trains. Furthermore, we demonstrate additional features such as selective and (internally and externally) triggered temporal averaging as well as the comparison of spike train groups. Subsequently, we present an application to real spike train data in which we analyze single-unit and multiunit activity in an epilepsy patient recorded in a time interval that includes the occurrence of an epileptic seizure. We can show that the new spike train distances are suitable measures to characterize the neuronal firing patterns involved in seizure generation, propagation, and termination.

Finally, as for spike trains, there is a lack of measures that are capable of monitoring time-resolved synchrony in continuous data. This issue is addressed in the third and last application, in which both variants of the SPIKE-distance are used to measure the time-resolved dissimilarity in continuous data that, in a preprocessing step, are first transformed into discrete data. We illustrate this approach on electroencephalogram (EEG) time series whose level of synchronization was modified post hoc in a controlled manner by means of linear mixing.

## METHODS

The different variants of the SPIKE-distance and the ISI-distance (see appendix a) are defined in terms of a function of time, a time profile for each pair of spike trains that gives an instantaneous measure of the (dis)similarity between the two spike trains. The distances are then defined as the temporal average of these dissimilarity profiles, e.g., for the bivariate SPIKE-distance,
*T* denotes the overall length of the spike trains, e.g., the duration of the recording in an experiment. In the following this equation is always omitted, and we restrict ourselves to showing how to derive the respective dissimilarity profiles, e.g., *S*(*t*).

Furthermore, for all distances there exists a straightforward extension to the case of more than two spike trains (number of spike trains *N* > 2), the averaged bivariate distance. This average over all pairs of neurons commutes with the average over time, so it is possible to achieve the same kind of time-resolved visualization as in the bivariate case by first calculating the instantaneous average, e.g., *S*^{a}(*t*) over all pairwise instantaneous values *S*^{mn}(*t*),
*Eq. 1*. All bivariate and averaged bivariate dissimilarity profiles and thus all distances are bounded in the interval [0,1]. The value 0 is only obtained for identical spike trains.

#### Original definition of the SPIKE-distance.

We first review the original definition of the time-resolved profile for the bivariate SPIKE-distance (Kreuz et al. 2011). This profile is denoted as *S*_{o}(*t*) with the subscript “o” standing for original. We then compare it against the improved profile, denoted as *S*(*t*), by which it will henceforth be replaced. The derivation consists of three steps: calculating the instantaneous time differences between spikes, taking the locally weighted average, and normalizing the result.

After labeling the times of the spikes in the spike trains *n* = 1,2 by *t*_{i}^{(n)}, *i* = 1, …, *M*_{n} (with *M*_{n} denoting the number of spikes of the *n*th spike train), we assign to each time instant between 0 and *T* (see Fig. 1*A*) the time of the preceding spikes
*t* = 0 and auxiliary trailing spikes at time *t* = *T* to each spike train.

We then denote the instantaneous absolute differences of preceding and following spike times as

To be more local in time, a weighted average of the two differences Δ*t*_{i}(*t*) with *j* = P,F is employed such that the difference of the spikes that are closer dominates. To this aim, we denote with
*n* = 1,2.

Inserting the inverse of the mean intervals as weights in the locally weighted average
*Eq. 2*.

#### Improved definition of the SPIKE-distance.

In its original definition in Kreuz et al. (2011), the SPIKE-distance proved to be a reliable indicator of the overall level of multineuron synchrony for both simulated and real data. Appropriate moving averaging also allowed us to correctly reflect long-term changes in spike train synchrony. However, as already noted in Kreuz et al. (2011), each individual instantaneous value in itself is less reliable since spuriously high values are obtained during slightly unreliable spiking events. This can be seen in the bivariate example of Fig. 2*A*, where we use a frequency mismatch to construct two spike trains with gradually varying spike matches. In the first half the spikes in the second spike train exhibit increasing distances to the preceding spikes of the first spike train, while in the second half they move closer and closer to its following spikes. Thus we expect a monotonic increase of the instantaneous values followed by a monotonic decrease. These two trends can indeed be recognized. However, they are interrupted by short intervals of spurious high values.

The same kind of spurious high values can be seen in Fig. 2*B* for the averaged bivariate variant *S*_{o}^{a}(*t*) in a multivariate example. Here in the first half we generated four spiking events with increasing jitter within a noisy background, whereas the second half consists of evenly spaced firing events with increasing precision, this time without any background noise. In both cases the spurious high values appear in small intervals during which some of the spikes are still following spikes while others are already preceding spikes. In these intervals the small differences between the spikes within the event are not taken into account. Instead, because of the separation of preceding and following spikes, the “wrong” spikes are compared to each other. Some of the differences among the preceding spikes as well as among the following spikes are very large, and this, according to *Eq. 13*, leads to the high instantaneous values.

A straightforward remedy for this mismatch is to compare the correct spikes to each other, in this case, to compare each spike to the nearest spike in the other spike train. To do so we first rewrite *S*_{o}(*t*). For the sake of brevity we now omit time dependencies. Using 2〈*x*_{j}^{(n)}〉* _{n}* =

*x*

_{j}

^{(1)}+

*x*

_{j}

^{(2)}with

*j*= P,F gives

*t*

_{P}and Δ

*t*

_{F}(

*Eqs. 6*and

*7*) yields

*S*

_{o}(

*t*) can be interpreted as the normalized sum of four weighted differences, one each for the preceding spike from the first spike train

*t*

_{P}

^{(1)}, the following spike from the first spike train

*t*

_{F}

^{(1)}, the preceding spike from the second spike train

*t*

_{P}

^{(2)}, and, finally, the following spike from the second spike train

*t*

_{F}

^{(2)}. However, as we have established above, in some instances these four corner spikes are compared against the “wrong” spikes. This happens because we always restrict the spikes of comparison to the respective other preceding and to the respective other following spike.

A way to resolve this restriction is to allow more flexibility and compare each of these four corner spikes to its most appropriate spike, i.e., the closest counterpart in the other spike train. To this aim, we define
*t*_{F}^{(1)}, *t*_{P}^{(2)}, and *t*_{F}^{(2)} (see Fig. 1*B*). In the improved dissimilarity profile these four terms replace the twofold contributions of |Δ*t*_{P}| and |Δ*t*_{F}|. Furthermore, instead of one local weighted average of the two differences between previous and following spikes with the mean intervals to the previous and the following spikes as weights (*Eq. 11*), the weighting is now carried out for each spike train separately. The local weighting for the spike time differences of the first spike train reads
*S*_{2}(*t*) is obtained for the second spike train. Averaging over the two spike train contributions and normalizing by the mean interspike interval yields
*S*_{o}(*t*) now the two preceding and the two following spikes are no longer compared to each other. Instead, if a following spike in one spike train is the closest spike to a preceding spike of the other spike train (or vice versa) these will now be compared to each other, which leads to lower instantaneous values in the dissimilarity profile.

This modification resolves the problem of spurious high values and, as can be seen in Fig. 2, the desired dissimilarity profiles are obtained. In Fig. 2*A* the observed monotonic increase of the instantaneous values followed by a monotonic decrease mirrors exactly the actual change in the match between the spike times of the two spike trains. For the multivariate example of Fig. 2*B* and the averaged bivariate variant (calculated according to *Eq. 2*) we find in the first half high instantaneous values that reflect the background noise. The presence of four spiking events with increasing jitter within this noisy background is indicated by less and less pronounced drops in the dissimilarity profile. In the second half, where there is no background noise, the evenly spaced firing events with increasing precision are correctly reflected by a rather monotonic decrease of the instantaneous synchrony level that reaches 0 at the perfectly synchronous event at 3,900 ms. Such perfect events for which the value 0 is obtained are marked by vertical dashed lines in the dissimilarity profile.

The firing rate correction introduced between *Eqs. 18* and *19* has the important and desirable consequence that the SPIKE-distance between Poisson spike trains increases with the difference in firing rate (results not shown).

#### Real-time SPIKE-distance.

Here we introduce the real-time SPIKE-distance *D*_{Sr}. This is a modification of the SPIKE-distance with the key difference that the corresponding time profile *S*_{r}(*t*) can be calculated online because it relies on past information only. From the perspective of an online measure, the information provided by the following spikes, both their position and the length of the interspike interval, is not yet available. Like the regular (improved) SPIKE-distance *D*_{S}, this causal variant is also based on local spike time differences, but now only two corner spikes are available and the spikes of comparison are restricted to past spikes, e.g., for the preceding spike of the first spike train
*Eq. 9*; see also Fig. 1*B*). This yields a causal indicator of local spike train dissimilarity:

In Fig. 3 we show the results of the real-time SPIKE-distance for the two cases already used in Fig. 2. As can be seen for the example of two spike trains with gradually varying spike mismatches (Fig. 3*A*), any spike time difference is considered most relevant right at the later of two spikes when *S*_{r}(*t*) goes back to a local maximum value. In the case where the two preceding spikes are closest to each other, it goes back to its maximum value of 1. At these points the mean time interval to the two preceding spikes is exactly half their difference. Any successive period of common nonspiking leads to a decrease of the instantaneous distance values. This is a desired property since common nonspiking is as much a sign of synchrony as common spiking. The decrease is hyperbolic, and its slope depends on the preceding spike time difference.

In addition to the regular trace we also show here a moving average that, in line with the real-time calculation, is causal. While the regular dissimilarity profile exhibits certain fluctuations, this moving average shows that the real-time SPIKE-distance in fact reflects the increasing spike shifts in the center of the interval and the better match of the spike times at the edges.

During irregular spiking in the multivariate example of Fig. 3*B*, the time intervals to the preceding spikes in different spike trains are very variable, and this leads to large fluctuating values of *S*_{r}^{a}(*t*). Within this irregularity, the perfect spiking event at 400 ms results in an abrupt drop to 0, which reflects the delta distribution of the time intervals to the preceding spikes. The successively more jittered events that follow are indicated by a very pronounced short-term increase, followed by a decrease. Here the distribution of time intervals to preceding spikes starts to develop a peak at very small intervals and thus becomes bimodal, which causes the increase. Then, after spikes have appeared in quick succession in all spike trains, the distribution becomes very narrow, which is reflected by the decrease. In the second half, when there is no background noise and the spiking events become less and less noisy, the succession of peaks denoting the events is becoming more and more prominent, with increasing amplitude range but narrower base. Just as in the bivariate case, the short timescale fluctuations in this multivariate example can be eliminated by an appropriate (causal) moving average. Here this moving average exhibits a gradual decrease, which reflects the consistent increase of the spike event reliability.

It is important to note that, in contrast to the noncausal SPIKE-distance (see Fig. 2), the observed peaks are not spurious. They occur at time instants when it is not yet known whether there will in fact be a reliable spiking event or not. This is illustrated in Fig. 4 with two spiking events that are identical (1 spike per spike train) except for the omitted second half of the second event. While for both events the instantaneous values in the first part necessarily have to be identical (and very high since only some of the neurons have recently spiked), the differences in the second part become evident as more and more information becomes available. For the first event, once all neurons have fired, a rapid decrease of *S*_{r}^{a}(*t*) can be observed, while the lower firing reliability in the second event leads to a slow decrease.

#### Practical considerations.

The improved dissimilarity profile *S*(*t*) of the SPIKE-distance is piecewise linear (with each linear interval running from one spike of the pooled spike train to the next) rather than piecewise constant as is the case for the ISI-distance. Therefore, when the localized visualization is desired, a new value has to be calculated for each sampling point and not just once per each interval in the pooled spike train. In cases where the distance value itself is sufficient, the short computation time can be even further decreased by representing each interval by the value of its center and weighting it by its length. Not only is this faster, but it actually gives the exact result, whereas the time-resolved calculation is a very good approximation only for sufficiently small sampling intervals d*t* (imagine the example of a rectangular function: at some point any sampled representation has to cut the right angle). The dissimilarity profile *S*_{r}(*t*) of the real-time SPIKE-distance is hyperbolic and not linear, but here also the exact result can be obtained by piecewise integration over all intervals of the pooled spike train.

The calculation of the SPIKE-distance consists of three steps: In a precalculation step, for each spike the distance to the nearest spike in all the other spike trains is calculated. Successively, for each time instant and each pair of spike trains, the distances of the four corner spikes are first locally weighted and then normalized. These latter steps involve matrices of the order “number of time instants” × “number of spike train pairs,” which for very long data sets with many spike trains can lead to memory problems. The solution to these problems is to make the calculation sequential, i.e., to cut the recording interval into smaller segments and to perform the averaging over all pairs of spike trains for each segment separately (no additional auxiliary spikes are needed except for very huge data sets, for which even the calculation of the first matrix is too memory demanding). In the end, the dissimilarity profiles for the different segments (already averaged over pairs of spike trains) are concatenated, and its temporal average yields the distance value for the whole recording interval.

The computational load scales with the number of spike trains *N* as *N*^{2}. With MATLAB on a notebook with a 2.53 GHz Intel Core 2 Duo processor the calculation of most of the examples in this article took just a few seconds, while the slowest example (the single-unit recordings analyzed in Fig. 9) took <5 min.

More information on the implementation as well as the MATLAB source code for calculating and visualizing both the ISI-distance and the SPIKE-distance (including the real-time variant) can be found at **www.fi.isc.cnr.it/users/thomas.kreuz/sourcecode.html**.

## RESULTS

#### Application to artificially generated data: instantaneous clustering.

By eliminating the spurious high values in *S*_{o}^{a}(*t*), we have gained reliability of *S*^{a}(*t*) for each time instant. This allows us to use the instantaneous matrices of pairwise spike train dissimilarities to divide the spike trains into clusters, i.e., groups of spike trains with low intragroup and high intergroup dissimilarity. There are no limits to the temporal resolution; such a clustering can, in principle, be obtained for each time instant.

In Fig. 5 we show examples of instantaneous clustering for both the regular and the real-time SPIKE-distance. Both examples depict artificially generated spike trains that fall into different clusters. However, in contrast to the overall level of spike train synchrony, which remains rather constant (results not shown), the cluster affiliation of the different spike trains changes every 500 ms. The spike trains in Fig. 5*A* contain four different compositions of two clusters, whereas in Fig. 6 the number of clusters increases until a state of random spiking is reached where each spike train forms its own cluster. As exemplified by four different time instants in each figure, this varying clustering structure is correctly reflected in the pairwise dissimilarity matrices of both the regular and the real-time SPIKE-distance, although the latter only uses information from past spiking.

Both methods not only can distinguish different clusters instantaneously but also are sensitive to the detailed structure within the clusters. An example can be seen in Fig. 5*A* for the fourth spike train within the second 500-ms interval (see arrows). The two methods are quite similar in the first two columns, but they differ considerably in the third and fourth columns. In the third column, while for the regular SPIKE-distance it does not matter whether the time instant is right within a spiking event or in between two events (compare against the first 2 columns), the real-time variant clearly separates neurons depending on whether they have already fired or not (see Fig. 4). In the fourth column, in contrast to the regular SPIKE-distance, the real-time SPIKE-distance is not yet aware of the irregular cluster affiliation in the last 500-ms interval.

The main difference between the two measures is clearly visible in the last column, where the regular SPIKE-distance averages over past and future behavior and thus superimposes the checkered pattern of the third interval with the more disordered clustering of the last interval. This last interval is not yet relevant for the real-time variant, which only reflects the checkered pattern of the past interval.

Similar differences can be observed in the second and third intervals of Fig. 5*B*. The matrices for the regular SPIKE-distance are quite symmetrical with respect to the dissimilarities to the different clusters since the increasing distances between the preceding spikes are balanced by the decreasing distances to the following spikes. In contrast, the real-time SPIKE-distance reflects the increasing distance between the preceding spikes only. Thus, although the spike time differences between adjacent groups are similar, because of the normalization lower dissimilarities are obtained for groups of spike trains whose spikes are further in the past.

So far we have shown clustering matrices obtained at certain time instants. Since such a matrix exists for each and every time instant *t*, it is possible to selectively average over certain time intervals. These intervals do not have to be continuous; selective averaging over separated intervals is possible as well. Four examples are shown in Fig. 6, an average over one individual 500-ms interval, averages over two consecutive as well as two separated 500-ms intervals, and finally the average over the whole time interval. In each case a linear superposition of the individual matrices can be observed. For the whole interval (Fig. 6*B*, 4th column) this means that the four groups of 10 spike trains that frequently belonged to the same cluster can still be identified (see arrow). Because there were two 500-ms intervals (the 2nd and the 5th) where the second and the third spike train group formed one big cluster, these two groups are more closely related than the other groups, which is correctly reflected by the lower values of dissimilarity.

Another option is triggered temporal averaging. Here the matrices are averaged over certain trigger time instants only. The idea is to check whether this triggered temporal average is significantly different from the global average, since this would indicate that something peculiar is happening at these trigger instants.

The trigger times can be obtained either from internal conditions (such as the spike times of a certain spike train) or from external influences (such as the occurrence of certain features in a stimulus).

An example of internal triggering can be seen in Fig. 7, *A–C*. In this artificially generated setup there are 20 simultaneously recorded neurons and almost all of them fire independently from each other, following a Poisson statistic (Fig. 7*A*). The exception is the first neuron, which fires at a lower rate and is assumed to have a strong excitatory synaptic coupling to five of the other neurons (*4*, *8*, *11*, *16*, and *19*). Correspondingly, the spike trains of these neurons were modified such that they contained slightly lagged and jittered copies of the spikes of the first spike train in addition to spikes generated independently. This represents a situation in which each spike in the first spike train causes (triggers) a spike in these spike trains, but there are also other spikes (which might have been caused by different neurons that were not recorded).

The task is to identify these five neurons. This is very difficult via visual inspection, and also the overall temporal average is unable to do so (Fig. 7*B*, *left*). However, these neurons can be identified by averaging over the pairwise instantaneous values obtained for the spike times of the first spike train (the internal trigger instants) only. The resulting dissimilarity matrix shown in Fig. 7*B*, *right*, includes an irregular grid of very small distance values.

Another representation of dissimilarity matrices is hierarchical cluster trees also known as dendrograms (Fig. 7*C*). They are constructed as follows: First, the closest pair of spike trains is identified and thereby linked by a Π-shaped line, where the height of the connection measures the mutual distance. These two time series are merged into a single element, and the next closest pair of elements is then identified and connected. The procedure is repeated iteratively until a single cluster remains. The implementation of this method requires introducing the distance between a pair of clusters. In the single linkage algorithm used here, this distance is defined as the minimum over all the distances between pairs of spike trains in the two clusters. In Fig. 7*C*, the dendrogram obtained from the average dissimilarity matrix (Fig. 7*C*, *left*) does not fall into separate clusters, whereas in the dendrogram of the triggered average (Fig. 7*C*, *right*) the five modified spike trains form one distinct cluster with the first spike train and can thus easily be identified.

In contrast to internal triggering, externally triggered averaging allows certain (external) stimulus features to be related with spike train synchrony and might thus be a promising tool for the investigation of neuronal coding. While the example for internal triggering assumes a simultaneously recorded population of neurons, in the setup of the external triggering example (Fig. 7, *D–F*) just one neuron is recorded for repeated stimulation with the same stimulus. This stimulus is a chirp function, representative of a nonperiodic time-varying stimulus. It is assumed that the neuron is sensitive to negative amplitudes and accordingly it exhibits higher (lower) reliability for local minima (maxima) of the chirp function (Fig. 7*D*). However, to better illustrate the gradual increase in clustering, half the trials were left unaffected from the stimulus.

As the amplitude of the chirp function varies, so does the spike train synchrony of half of the trials. Because of the nonperiodicity of the stimulus this is quite difficult to detect. Detection is facilitated by externally triggered averaging where the triggering is performed on common stimulus features (in this example the amplitude of the chirp function). As can be seen in the dissimilarity matrices (Fig. 7*E*) and even better in the dendrogram (Fig. 7*F*), a decrease of this stimulus amplitude leads to the emergence of a spike train cluster consisting of the modulated spike trains, which indicates their increase in reliability.

In the Supplemental Material we present Supplemental Movie S1, which uses the artificially generated spike trains from Figs. 5 and 6 and includes instantaneous clustering, selective temporal averaging of individual or combined intervals, several examples of triggered averaging, as well as the corresponding dendrograms.^{1} As can be seen in the screenshot of the movie shown in Fig. 8, we added one more feature, the comparison of spike train groups, where the spike trains are manually assigned to subgroups and a block matrix (and the corresponding dendrogram) is obtained by averaging over the respective submatrices of the original dissimilarity matrix. For both distances these spatial averages over groups of spike train pairs are denoted by <∙>_{G}.

#### Application to single-unit recordings from epilepsy patients.

In all previous examples we have used artificially generated spike trains for which the relative levels of spike train synchrony were known and could serve as validation for the spike train distances. Here we present an exemplary application of both the SPIKE-distance and its real-time variant to field data for which no a priori knowledge is available. As field data we chose recordings of neuronal spiking from the human medial temporal lobe. These recordings were performed at the University of Bonn in epilepsy patients undergoing seizure monitoring prior to epilepsy surgery. For a description of the data refer to appendix b.

In Fig. 9*A* we show the spike trains recorded from 42 single units and multiunits of an epilepsy patient during an epoch that contained an epileptic seizure. For this particular patient the epileptic focus was later confirmed to lie in the hippocampal formation of the left brain hemisphere. In this example the SPIKE-distance and its real-time variant exhibit a rather limited amount of variability before and after the epileptic seizure, while during the seizure both distances exhibit strong fluctuations, particularly during the second part of the seizure when the local field potentials exhibit high-amplitude rhythmic activity (data not shown). Remarkably, both distances show a pronounced drop at the beginning of the seizure, at a time when only subtle changes are discernible in the continuous local field potential (data not shown). This could be an indication that an increased level of synchrony among a population of neurons plays an important role in the generation of seizure activity.

Figure 9*B* shows the dissimilarity matrices for both spike train distances averaged over the periods before, during, and after the seizure. These matrices reflect the relationships between all the neurons recorded from different regions of both temporal lobes. In the nonfocal hemisphere, i.e., the side of the brain that does not contain the epileptic focus, *neurons 26–42* exhibit a constant pattern of synchrony before, during, and after the seizure. In the focal hemisphere the general level of spike distance is lower during the seizure than before or afterwards, indicating an elevated level of neuronal synchrony during the seizure. The spike train distances between the two brain hemispheres (Fig. 9*B*, *top right* and *bottom left* quadrants) show a relatively high level of synchrony for the sparsely firing neurons (*26*, *27*, *28*, *33*, *34*, and *37*) that is substantially diminished during the seizure.

These findings are in line with standard theories on seizure dynamics (Engel and Pedley 2008). The spike train distances thus appear to be suitable measures to describe the mechanisms of seizure generation, propagation, and termination at the neuronal level. Specifically, they provide an opportunity to extend our mechanistic understanding of spatiotemporal seizure dynamics by elucidating the functional role of synchronization and desynchronization processes. A separate analysis of spike train synchrony for different regions of the focal and nonfocal medial temporal lobe may provide additional insights about the evolution of a seizure (see Schevon et al. 2012).

The real-time variant could furthermore be integrated into prospective algorithms aimed, e.g., at an early online detection of seizure occurrence (Jouny et al. 2011).

#### Application to continuous data.

To our knowledge, there are no time series analysis methods that allow reliable tracking of instantaneous synchrony in continuous data, i.e., to map their local dissimilarity to one value for each time instant, either once the complete segment is available for analysis or online in real time. Now that we have provided such a method for discrete data, the question arises as to whether it is possible to extend their applicability to continuous data. Thus in the third application we use the SPIKE-distance and its real-time variant to measure time-resolved dissimilarity in continuous signals. For this purpose, a high temporal resolution is once again beneficial since changes in synchronization can occur on very small timescales. Examples include the transition to seizure observable in the EEG of epilepsy patients (Lopes da Silva et al. 2003).

The principal idea is to first transform the continuous time series into one or more sequences of discrete events that are chosen to capture the most relevant characteristics of the data. In the case of neuronal membrane potentials, these are the spikes. Under the assumption that both the shape of the spike and the background activity carry minimal information, neuronal responses are reduced to a spike train in which the only information maintained is the timing of the individual spikes. For the rather oscillatory EEG data that we analyze here, each continuous time series is transformed into two separate sequences of local maxima and local minima; similar choices were made, for example, by Quian Quiroga et al. (2002) and Kugiumtzis et al. (2004). The SPIKE-distance is applied to both kinds of sequences, and the two resulting dissimilarity profiles are averaged. As before, the temporal average of this combined profile yields the SPIKE-distance. For this type of data a causal calculation is also possible; the procedure is the same as used above for the regular SPIKE-distance.

To validate the instantaneous measures of synchronization on controlled continuous data, we used data freely available on the internet via the EEG time series download page (**www.meb.uni-bonn.de/epileptologie/science/physik/eegdata.html**) of the Department of Epileptology at the University of Bonn (Andrzejak et al. 2001). We randomly selected *N* = 10 independent channels (sampling rate 173.16 Hz) and then introduced a time-dependent mixing parameter *m*. For *m* = 0 the original independent channels are maintained, whereas for *m* = 1/*N* all channels are exactly identical. Intermediate values of *m* interpolate linearly between these two extremes. Although not relevant for the event detection, the variance, which is decreased for mixed signals, is adjusted to maintain the appearance of a regular EEG signal.

In Fig. 10*A* we show an illustration of the application of the averaged bivariate SPIKE-distance and its real-time variant to continuous EEG data using just three channels over a short interval of time that includes a transition from independent channels to perfectly synchronous channels. This transition is traced by both measures. Figure 10*B* depicts the dissimilarity profiles of the two SPIKE-distances for alternating piecewise constant variations of the mixing parameter, which converges stepwise toward an intermediate level. Both measures are capable of monitoring these transitions in the level of synchronization. Starting from zero values for identical synchronization and high values for independent channels, two gradual transitions can be observed. However, as the different gradients show, it is easier to trace deviations from identical signals than deviations from independent signals.

## DISCUSSION

There are three different contributions of this study. We have improved the dissimilarity profile of the SPIKE-distance (Kreuz et al. 2011) by eliminating the spurious high values that were previously obtained for eventlike firing patterns, we have added a variant of the SPIKE-distance that is causal and allows real-time calculation, and, finally, we have extended the applicability of both of these variants to continuous data (such as EEG).

The instantaneous reliability obtained by the elimination of the spurious high values has very important consequences. In addition to the improved capability to track the overall level of synchronization within a group of two or more spike trains (see Figs. 2 and 3), it is now possible to visualize instantaneous clustering (Fig. 5 and Supplemental Movie S1), i.e., to represent evolving patterns of (dis)similarity in multiple spike trains either as instantaneous matrices of pairwise spike train dissimilarities or as hierarchical cluster trees (dendrograms). Since such matrices and dendrograms exist for each and every time instant *t*, it is possible to selectively average over certain (continuous or noncontinuous) time intervals (Fig. 6), which do not have to have the same length (see also Fig. 9). In real data these intervals could be chosen to correspond to different external conditions such as normal versus pathological, asleep versus awake, target versus nontarget stimulus, or presence/absence of a certain channel blocker. The fact that there are no limits to the temporal resolution allows further analyses such as internally or externally triggered temporal averaging (see Fig. 7). In real multineuron data, internal triggering might help to uncover certain structures and patterns in neural networks or to detect converging or diverging patterns of firing propagation. External triggering might be used to address questions of neuronal coding, e.g., it could be used to evaluate the influence of localized stimulus features on the reliability of real neurons under repeated stimulation. Finally, another possibility is spatial averaging over spike train groups (see Fig. 8). In applications to real data, these groups could be different neuronal populations or responses to different stimuli, depending on whether the spike trains were recorded simultaneously or successively.

The SPIKE-distance appears to provide a reliable time-resolved measure of spike train similarity without the need to optimize a timescale parameter. This extends the range of possible neurophysiological applications of spike train similarity measures. In the past, methods aimed at measuring spike train similarity have been applied in very controlled situations, typically protocols in which the animal is anesthetized and the same stimulus is repeated over multiple trials. With the SPIKE-distance, it is envisaged that similarity-based approaches could be extended to experiments with awake, behaving animals: the identical trials typically required to select an appropriate timescale parameter are not required for the SPIKE-distance, and the time resolution allows the similarity to be monitored during behavior or in response to complex evolving stimuli; for example, the extent to which neurons in a population synchronize in response to stimuli could be analyzed.

In cases where the complete spike trains are known, the regular SPIKE-distance compares favorably against the real-time SPIKE-distance. It is locally more reliable because it has information about both the past and the future at its disposal. For the real-time SPIKE-distance the lack of information about spikes that occur in the future leads to high values during reliable spiking events. While these values are not spurious (see Fig. 4), they are not really informative since they reflect local uncertainty that can easily be resolved by incorporating all the information available. However, in situations that demand online monitoring, the real-time SPIKE-distance can be relied upon, as we could show both for simulated and field data. A good example for the latter is Fig. 9*B*, in which even with only half the information available the results achieved were very similar to those for the regular SPIKE-distance (compare Fig. 9*B*, *top* and *bottom*).

The capability to monitor the synchrony between two or more spike trains (or continuous recordings) in real time opens up new possibilities for several important applications. Synchronization has been hypothesized to play a pivotal role in neuronal coding (Miller and Wilson 2008), and thus a real-time tracker of spike train synchrony could be an essential tool for rapid online decoding with brain-machine interfaces in order to control prosthetic devices (Hochberg et al. 2006; Mulliken et al. 2008; Sanchez et al. 2008). In epilepsy patients, monitoring the spiking activity of large ensembles of single neurons could lead to a better understanding of the mechanisms of seizure generation. Furthermore, if neuronal spiking patterns turn out to be specific predictors of seizure occurrence as reported by a recent study (Truccolo et al. 2011), the real-time SPIKE-distance could be a viable tool for the implementation of a prospective seizure prediction algorithm (Mormann et al. 2007).

In a similar manner, the SPIKE-distance and its real-time variant can be applied to monitor synchrony in continuous data (which are first transformed into discrete data, see Fig. 10). Even the analysis of mixed continuous-discrete signals is possible (see also Andrzejak and Kreuz 2011). A potential application could be the combined analysis of discrete spike trains and continuous neuronal oscillations. In particular, it would be interesting to investigate neuronal synchronization patterns in dependence of the phase of the local field potential, a scenario reminiscent of the “neuronal communication through neuronal coherence” scenario (Fries 2005).

Note that there is a conceptual analogy between the improved definition of the SPIKE-distance and the similarity measure proposed by Hunter and Milton (2003), which basically sums the exponentially weighted distances from each spike to its nearest spike in the other spike train. However, the main difference, apart from the additional exponential weighting and the parameter (the decay constant of the exponential), is that the calculation is not done in a time-resolved manner by means of a local weighting of the terms of the four corner spikes. Instead, the term for each spike is considered just once. The SPIKE-distance is a rather elementary measure that can be regarded as complementary to cross correlation. Both measures rely on differences in spike timing. However, while the latter estimates spike synchrony in dependence on a time lag but is not sensitive to instantaneous synchrony, the SPIKE-distance estimates instantaneous synchrony but is not sensitive to time lags (these should be eliminated beforehand).

Below we provide an overview of the main properties of both the SPIKE-distance and its real-time variant.

#### Straightforward extension to multiple spike trains and normalization.

The SPIKE-distance directly addresses the lack of approaches to analyze multiple spike train data (Brown et al. 2004). It can readily be applied to very large data sets (some of the data sets analyzed contained >100 spike trains with a total of almost 250,000 spikes). Both the bivariate and averaged bivariate distance are normalized between 0 and 1, where 0 is obtained for identical spike trains only. The same limits hold for the underlying dissimilarity profiles.

#### Different levels of information reduction.

For the SPIKE-distance there are three different levels of information reduction. The starting point is the most detailed representation in which one instantaneous value is obtained for each pair of spike trains. The resulting matrix of size “number of sampled time instants” × “squared number of spike trains” [i.e., #(*t*_{n})*N*^{2}] can be appreciated best in a movie; an example can be found in the Supplemental Material. For the first step of information reduction there are two possible averages: the average over spike train pairs and the temporal average. These commute; either can be performed first. By averaging instantaneously over all pairs of spike trains a dissimilarity profile for the whole population (e.g., Fig. 2) is obtained, whereas the temporal averaging leads to a bivariate distance matrix (e.g., Fig. 5). Finally, in both cases the second average leads to one distance value that describes the overall level of synchrony for a group of spike trains over a given time interval. Note that here we restricted the analysis to average values, but the application of higher-order statistics (such as the variance) is also conceivable. Depending on the application in mind, the appropriate representation can be chosen. Mapping the similarity of a whole population onto one single value might allow for an easier comparison, but one might lose too much information since high and low spike time differences at different time instants or for different pairs of spike trains might cancel each other out, leading to a value that could also be obtained for a constant intermediate level of similarity. This is one example where a higher-order statistic such as the variance could be useful.

#### High temporal resolution: reliance on local information, then global averaging.

The SPIKE-distance relies on instantaneous values that only take into account local information from preceding and—with the exception of the real-time SPIKE-distance—following spikes. Temporal averaging (*Eq. 1*) then yields a more global picture by means of a single-value distance. The fact that the averaging window can be chosen arbitrarily allows for easy comparisons between the levels of spike train synchrony in different time intervals without the need for recalculations. The high temporal resolution that renders a sliding-window analysis obsolete compares them favorably to other widespread measures of spike train variability such as the Fano factor.

#### Dependence on spike train of origin.

The SPIKE-distance takes into account the spike train of origin for each individual spike, and it is not invariant to shuffling spikes among the spike trains. This is in contrast to measures that act on the pooled spike train, such as all measures based on the peristimulus time histogram (PSTH), which yield the same value regardless of how spikes are distributed among the different spike trains and could possibly fail to distinguish between qualitatively different behaviors such as high-reliability spiking and chain-fire bursting (see Kreuz et al. 2011).

#### Computational efficiency.

The SPIKE-distance is based on simple differences and ratios that furthermore can easily be parallelized. For a large set of very long spike trains for which computer memory might be a concern, the dissimilarity profile for successive intervals can be calculated sequentially.

#### Timescale independence and absence of parameters.

The SPIKE-distance is invariant to stretching and compressing of the spike trains. It is also timescale adaptive since the information used at each time instant is not contained within a window of fixed size but within a time interval whose size depends on the local rates of the spike trains. In contrast to timescale-dependent measures such as the Victor-Purpura distance (Victor and Purpura 1996) or the van Rossum distance (van Rossum 2001), parameter-free single-valued methods give a more objective and comparable estimate of neuronal variability. Other drawbacks of timescale-dependent measures are the computational cost and the time and effort that are needed to find the right parameter. Moreover, it is not at all guaranteed that a right parameter exists.

One of the main arguments for the use of timescale-dependent measures of spike train (dis)similarity is their potential insight into the precision of the neuronal code (Victor and Purpura 1996). This argument has recently been reevaluated by Chicharro et al. (2011). According to this study the optimal timescale obtained from a spike train discrimination analysis is far from being conclusive. Rather, it results in a nontrivial way from the interplay of many different factors such as the distribution of the information contained in different parts of the response and the degree of redundancy between them.

#### SPIKE-distance is complementary to ISI-distance.

The SPIKE-distance and its real-time variant share many properties with the ISI-distance (see appendix a), but there are also a few essential differences. The ISI-distance is based on interspike intervals and quantifies covariations in the local firing rate, while the SPIKE-distance tracks synchrony mediated by spike timing. This does not mean that the ISI-distance is sensitive to rate coding and the SPIKE-distance sensitive to temporal coding. Rather, it is the relative timing of interspike intervals and spikes, respectively, that matters.

Another difference is the range of values obtained. For the ISI-distance the piecewise-constant dissimilarity profile and the distance itself cover the same range. In particular, the distance value can come arbitrarily close to the maximum value of 1. This is not the case for the SPIKE-distance. Here only the piecewise linear dissimilarity profile can cover the whole range of values, whereas the value of the SPIKE-distance always seems to be below 0.5. For Poisson spike trains of equal rate the expectation values for the ISI-distance and the SPIKE-distance are 0.5 and 0.295 (numerical results), respectively.

The MATLAB source codes for the ISI-distance and the SPIKE-distance (including the real-time variant) as well as additional material (including a longer version of Supplemental Movie S1) can be found at **www.fi.isc.cnr.it/users/thomas.kreuz/sourcecode.html**.

## GRANTS

T. Kreuz acknowledges funding support from the European Commission through the Marie Curie Initial Training Network “Neural Engineering Transformative Technologies (NETT),” project 289146, as well as from the Italian Ministry of Foreign Affairs regarding the activity of the Joint Italian-Israeli Laboratory on Neuroscience. C. Houghton acknowledges support from the James S McDonnell Foundation through a Scholar Award in Human Cognition. R. G. Andrzejak acknowledges grant FIS-2010-18204 of the Spanish Ministry of Education and Science. F. Mormann acknowledges support from the Lichtenberg Program of the Volkswagen Foundation.

## DISCLOSURES

No conflicts of interest, financial or otherwise, are declared by the author(s).

## ENDNOTE

At the request of the author(s), readers are herein alerted to the fact that additional materials related to this manuscript may be found at the institutional website of one of the authors, which at the time of publication they indicate is **www.fi.isc.cnr.it/users/thomas.kreuz/sourcecode.html.** These materials are not a part of this manuscript, and have not undergone peer review by the American Physiological Society (APS). APS and the journal editors take no responsibility for these materials, for the website address, or for any links to or from it.

## AUTHOR CONTRIBUTIONS

Author contributions: T.K., D.C., C.H., R.G.A., and F.M. conception and design of research; T.K., R.G.A., and F.M. performed experiments; T.K. analyzed data; T.K., D.C., C.H., R.G.A., and F.M. interpreted results of experiments; T.K. prepared figures; T.K. and F.M. drafted manuscript; T.K., D.C., C.H., R.G.A., and F.M. edited and revised manuscript; T.K., D.C., C.H., R.G.A., and F.M. approved final version of manuscript.

## ACKNOWLEDGMENTS

We thank Nebojša Božanić, Stefano Luccioli, Antonio Politi, Alessandro Torcini, Heinz Beck, Martin Pofahl, Emily Caporello, and Timothy Q. Gentner for useful discussions. We also thank Christian E. Elger for patient access and Volker A. Coenen for electrode implantation.

## APPENDIX A: THE ISI-DISTANCE

While the dissimilarity profile of the SPIKE-distance is extracted from differences between the spike times of the two spike trains, the dissimilarity profile of the ISI-distance (Kreuz et al. 2007, 2009) is calculated as the instantaneous ratio between the interspike intervals *x*_{ISI}^{(1)} and *x*_{ISI}^{(2)} (*Eq. 5*) according to
*Eq. 1* is performed on the absolute value of the ISI-ratio; thus both kinds of deviations are treated equally. Since the ISI-distance relies on the instantaneous ISI-values and thus requires knowledge about the following spikes, no causal real-time extension is possible. The dissimilarity profile is condensed into a distance value by means of temporal averaging analogous to *Eq. 1*. In Andrzejak and Kreuz (2011) the ISI-distance has been integrated in a measure that can detect unidirectional coupling not only between spike trains but also between spike trains and time-continuous flows.

## APPENDIX B. FIELD DATA: SINGLE-UNIT RECORDINGS FROM EPILEPSY PATIENTS

All experimental recordings were performed prior to and independently from the design of this study and were reviewed and approved by the Medical Institutional Review Board at the University of Bonn; all subjects gave informed consent before participating. Electrode locations were based exclusively on clinical criteria and were verified by MRI and by computer tomography coregistered to preoperative MRI. Each electrode probe had eight high-impedance (typically 800–1,000 kΩ) microwires (platinum/iridium, 40-μm diameter) protruding from its tip (Ad-Tech, Racine, WI). The differential signal from bipolar montages of the microwires (1 wire was used as local ground) was amplified with a 128-channel Neuralynx system (Digital Lynx 10S, Neuralynx, Bozeman, MT), filtered between 0.1 and 9,000 Hz, and sampled continuously at 32 kHz for later processing and analysis. The Neuralynx headstages used had unity gain, very high input impedances (∼1 TΩ), and no phase shift. Spike detection and sorting were performed after band-pass filtering the signals between 300 and 3,000 Hz (Quian Quiroga et al. 2004).

## Footnotes

↵1 Supplemental Material for this article is available online at the Journal website.

- Copyright © 2013 the American Physiological Society