|
|
||||||||
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
INNOVATIVE METHODOLOGY
1Department of Physics, University at Albany, State University of New York, Albany; 2Department of Neuroscience, Albert Einstein College of Medicine, Bronx; 3Cognitive Neuroscience and Schizophrenia Program, Nathan Kline Institute, Orangeburg; 4Department of Psychiatry, Columbia College of Physicians and Surgeons, New York, New York; 5Department of Internal Medicine, Yale University School of Medicine, New Haven, Connecticut; 6Neuroscience Department, Brown University, Providence, Rhode Island; 7The J. Crayton Pruitt Family Department of Biomedical Engineering, University of Florida, Gainesville; and 8Center for Complex Systems and Brain Sciences, Florida Atlantic University, Boca Raton, Florida
Submitted 24 June 2005; accepted in final form 2 February 2006
|
|
ABSTRACT |
|---|
|
|
|
INTRODUCTION |
|---|
|
The last decade has seen great developments in blind source separation (BSS) and independent component analysis (ICA) techniques, such as Infomax ICA (Bell and Sejnowski 1995
), FastICA (Hyvärinen and Oja 1997
), and second-order blind identification (SOBI) (Belouchrani et al. 1993
). These algorithms have been useful in identifying sources in EEG and MEG signals using both ensemble-averaged data (Makeig et al. 1997
; Särelä et al. 1998
; Vigário et al. 1999
) and single trials (Cao et al. 2000
; Jung et al. 1999
; Makeig et al. 2002
; Tang et al. 2002
). However, along with its strengths, each technique has limitations, and often these limitations can be addressed. ICA, for example, allows reliable source (component) separation with minimal a priori assumptions and constraints. Its limitation is that although trial-to-trial variability can assist in separation, these effects are not explicitly considered and quantified, and these are substantial opportunities missed. Also, like many other techniques, ICA solves for maximal independence of components, despite the fact that components are often dynamically coupled. Thus although ICA may be reasonable for source separation per se, it is not explicitly designed to quantify the dynamical interactions between the neuronal ensembles that generate the components. This is one of the most interesting and important aspects of the behavior of brain function and a major focus of our efforts.
Here we describe a model of the sensory-evoked neural response that is more realistic than previous models in that it explicitly models trial-to-trial amplitude as well as latency variability in single-trial responses. Using this model, we derive the differentially variable component analysis (dVCA) algorithm, and demonstrate how different variability patterns in neural ensemble activity can be used to separate and identify their component signals. Using simulations, we evaluate not only the ability of dVCA to characterize single-trial responses, but also its robustness to noise. We also compare the specific capabilities of dVCA with two other popular decomposition techniques: principal component analysis (PCA) and ICA. Finally, we demonstrate how to apply dVCA by analyzing neuronal ensemble responses recorded intracortically in macaque V1. Throughout this paper we emphasize the ability of the dVCA algorithm to 1) more accurately account for observed trial-to-trial variability, 2) use differential variation in the amplitudes and latencies to separate and identify sources, 3) avoid enforcing statistical independence of the sources, and 4) more accurately estimate the ongoing background activity. It also merits emphasis that, in addition to its uses in ERP and ERF source separation, the dVCA algorithm is a generally useful instrument for defining and analyzing single-trial neural responses.
|
|
METHODS |
|---|
|
Trial-to-trial variability of evoked responses can conceivably take many forms: latency shifts, amplitude variation, and even waveshape changes. Our earlier findings have shown that the observed variability can be well described by amplitude and latency variations of stereotypic response waveshape components (Truccolo et al. 2002
). Of course, waveshape variability also exists, but robust single-trial amplitude and latency estimates are nonetheless obtainable with the assumption of fixed component waveshapes. For this reason, we model the response evoked from a single neural ensemble as a stationary stereotypic waveshape that can vary in amplitude and onset latency from trial to trial. We write the response evoked in a given trial mathematically as
s(t
), where the function s( · ) represents the waveshape of the response as a function of elapsed time t,
represents the trial-specific amplitude scaling factor of the response, and
represents the trial-specific onset latency shift. Furthermore, when multiple neural ensembles are engaged in the response to a stimulus, the activity of each ensemble is represented in the model as a separate waveform with a distinct trial-specific amplitude and latency shift. It is important to note that by modeling both the waveshape and its amplitude and latency, there is degeneracy in the model because an overall change in amplitude scale or latency shift can be described either by the amplitude and latency parameters or by the overall scale and shift of the waveshape. To eliminate this degeneracy, we take as a convention that the ensemble average amplitude scaling factor of the response over the recorded trials is unity, 

= 1, and the ensemble average latency shift is zero, 

= 0.
In many experimental paradigms, investigators use several detectors positioned at different locations to measure the evoked responses of neural ensembles. The degree to which a detector records a signal evoked by a particular source depends on many factors including the position and orientation of the source relative to the detector. To describe this sourcedetector coupling, we introduce a coupling matrix C, where the matrix element Cmn describes the degree to which the mth detector detects the nth source. This coupling matrix is known as the mixing matrix in the source separation literature and as the lead-field matrix in electrophysiology.
During the course of an experiment the investigator records responses to multiple presentations of a stimulus. Each presentation is typically called a trial. For the rth recorded trial, we model the data x(t) recorded by the mth detector in component form as
![]() | (1) |
nr is the amplitude scale of the nth source during the rth trial,
nr is the latency shift of the nth source during the rth trial, sn( · ) is the waveshape of the nth source, and
mr(t) is the unmodeled part of the data recorded in the mth detector during the rth trial. The unmodeled part of the data record is typically a combination of the recorded background activity along with any noise in the detector. For simplicity, we assume that
mr(t) has zero mean. Thus Eq. 1 describes the data recorded in a given detector during a given trial as a sum of the signals generated by each of the N neural sources appropriately scaled in amplitude and shifted in latency for that trial and also scaled according to the coupling between each source and that detector plus the signals that we do not yet understand or care to understand. We call Eq. 1 the multiple component event-related potential (mcERP) model of evoked activity.
By adopting the mcERP model of the evoked responses, we implicitly adopt a well-defined set of characteristics capable of describing a neural source. The term "component" refers to the waveshape describing the temporal activation pattern of a particular neural source. Because no information regarding the spatial locations or distributions of the neural sources has gone into the model, this model does not distinguish between two spatially distinct groups of neurons that produce the same waveshape varying identically with latency and proportionally in amplitude. However, in such a situation, examination of the estimated coupling matrix would reveal two spatially distinct sources if there are detectors positioned within the proximity of each source. The major advantage obtained by estimating the coupling matrix is that practical experience in conjunction with previously obtained physiological or anatomical data suggesting source distributions can be used to independently evaluate the solutions obtained with this technique. The disadvantage, aside from withholding information from the algorithm, is that there remain two degeneracies in the model. First, there is no specified order to the sources. Second, there is no specified scale for the coupling matrix; one could halve the coupling matrix elements while doubling the magnitude of the source waveshapes and obtain an equally valid solution. These degeneracies, which are present in every other linear source separation algorithm, including PCA and ICA, pose no difficulties to the interpretation of the results. In our implementation we have chosen to normalize the columns of the coupling matrix so that the maximum value is equal to one. However, it should be noted that this scaling degeneracy could be remedied by adopting a physical model of the source currents (Knuth and Vaughan 1998
). With these conventions defined, it is a straightforward matter to map the parameter values estimated by the dVCA algorithm back to a model of the single-trial response. A single-trial component is found by applying the single-trial estimates to the relevant portion of the mcERP model above, which is simply
nrsn(t
nr). The single-trial ERP measured by the mth detector can be found by summing the contributions from each component,
n=1N Cmn
nrsn(t
nr), which gives a noise-free estimate of the recorded evoked response.
One last notable strength of the mcERP model is that no component waveform is required to be present in every trial. In other words, a single-trial amplitude of zero is allowable for any component. This is important because it acknowledges the empirical fact that in a given site in the brain, the trial-to-trial response is not stereotypic. The general resemblance of most single-trial responses to the averaged waveform arises from the fact that in most experimental trials, similar configurations of neural elements are activated with trial-varying latency and amplitude; by the same token, the occasional extreme deviation from the "stereotypic" waveform shows that essentially different ensemble configurations can substitute on a subset of the trials. By not assuming a priori that identical neural processes are active during every trial of the experiment, one is able to investigate the possibility that different processing strategies are used by the brain during the course of the recordings.
The dVCA algorithm
Bayesian probability theory is used to derive equations that use the recorded data to identify the maximum probability estimates of the mcERP model parameters in Eq. 1. Like PCA and ICA, each source waveshape is modeled as a pointwise time series where each point of the time series is considered to be a model parameter. The entire set of model parameters consists of these time-series points along with the mcERP parameters described above. Equations are obtained for each model parameter and an iterative fixed-point method is used to solve them jointly. The implemented algorithm is hierarchical in the sense that components are added one by one as the data analysis progresses. Figure 1 provides a flowchart describing the analysis stages used by the algorithm. In the text that follows, we present a basic outline of the operation of the algorithm. A more detailed description, including derivations of the equations, can be found in APPENDIX A.
|
The dVCA algorithm improves on the average ERP in two ways: by extending our ability to address single-trial responses and trial-to-trial variability, and by generalizing the one-component stereotypic waveshape model of the average ERP to a multiple-component stereotypic waveshape model. The trial-to-trial variability addressed by dVCA includes only single-trial amplitude and latency variation. Although retaining the concept of a stereotypic component waveshape from the average ERP may be seen as a liability, the state-of-the-art in signal processing does not allow us to consider radical multiple-component waveshape variations because that approach could easily lead to an algorithm that is overly sensitive to noise. Below, we will apply the dVCA algorithm to field potential data recorded from cortical area V1 of the macaque and demonstrate how dVCA can be used to address significant waveshape variations.
The fact that this algorithm is derived from a signal model that describes the recordings as a mixture of components implies that dVCA has something in common with techniques such as factor analysis (FA), the more familiar principal component analysis (PCA), and the more recent independent component analysis (ICA). This is in fact the case, and by reverting to a more simplistic signal model, assigning the appropriate prior probabilities and marginalizing over the source waveshape parameters one can derive, using the same methodology described in APPENDIX A, these more familiar techniques (Knuth 1997
, 1999
, 2005
). PCA assumes that the source amplitudes are Gaussian-distributed and chooses a solution such that a single source will contribute as much of the signal variance as possible. The result is an unphysical separation, which is constrained only by this unnatural variance assumption. ICA allows for more general assumptions regarding the distribution of the source amplitudes. ICA basically assigns non-Gaussian priors to the source amplitude distributions. By assuming that the sources are independent (or equivalently that these priors are factorable) and that there are as many sources as detectors, an integration over all possible source waveshapes can be performed, resulting in a straightforward learning rule for the coupling (mixing) matrix. Mixed signals will tend to have more Gaussian amplitude distributions, so ICA strives to find a separation matrix that minimizes the Gaussianity of the results, thus optimally separating the signals.
dVCA relies on the prior information about the phenomenon of trial-to-trial variability. This information is incorporated by explicitly introducing new relevant model parameters into the signal model (Eq. 1). In the derivation, we assume that we are ignorant about the values of the model parameters and therefore assign uniform priors to the source waveshapes, single-trial amplitudes, and single-trial latencies. One can also view this as an assignment of a multidimensional uniform density, which is maximally ignorant, but also conveniently factorable, because the probability of each variable is independent of one another. However, because it is a multidimensional uniform density, any combination of values of amplitude and latency from any source is as equally probable as any other. So the algorithm is not biased toward any particular values of amplitude or latency given the amplitude or latency of the same source in another trial or another source altogether. In this sense, dVCA does not impose any condition of independence in the basic assumptions underlying the algorithm. All possibilities are equally probable.
Because dVCA makes no explicit assumptions regarding independence, this algorithm has the ability to identify distinct sources with similar waveshapes or similar amplitude and latency variability patterns. We have performed some early experiments demonstrating the ability of dVCA to identify coupled sources and sources with similar waveshapes (Shah et al. 2003
). Not surprisingly, if the sources have similar waveshapes, but different trial-to-trial variability patterns, they can be separated. However, sources with identical waveshapes and identical trial-to-trial variability are inseparable. Later, we demonstrate using field potential data recorded from cortical area V1 of the macaque that coupled sources can be identified. Specifically, we identify amplitudeamplitude coupling across components, latencylatency coupling across components, and amplitudelatency both within and across components.
|
|
RESULTS |
|---|
|
The dVCA algorithm was evaluated using synthetic data to demonstrate the utility of amplitude and latency variability in the identification of multiple evoked components and also to assess the robustness of the algorithm in the presence of noise. We simulated electric field potentials recorded from a linear-array multielectrode with 15 channels spanning the cortical laminae in V1. Specifically, we designed three synthetic ERP components (Fig. 2A) sampled at 2 kHz to approximate the neural ensemble response to diffuse red-light stimulation in macaque V1 (Givre et al. 1995
; Mehta et al. 2000
). Figure 2B shows the field potentials derived from the noise-free synthetic data as the detectors in the multielectrode would record them. Superimposing the field potentials from the three sources and approximating the second spatial derivative of this summed activity yields the current source density (CSD) profile shown in Fig. 2C. The value of the CSD technique (Freeman and Nicholson 1975
; Nicholson and Freeman 1975
; Mitzdorf 1985
; Tenke et al. 1993
) is evident as it both localizes the neural activity at the current sources and sinks, and eliminates volume-conducted, or far-field, activity (i.e., see component c3 below). Many of our results will be displayed using the CSD profile. We emphasize that dVCA works with the recorded field potentials only, and that displaying the CSD profile is used only to emphasize the quality of our results by retrieving accurate (in simulations) or physiologically reasonable (in real data) localizations of neural activity.
|
In the majority of simulations, uncorrelated Gaussian-distributed, additive noise was introduced, as in Eq. 1. The signal-to-noise ratio (SNR) is different for each component from trial to trial because the three simulated components have different single-trial amplitudes. For this reason, we specify the trial-average SNR for each component. Since the mean amplitude scale of the components is set to one, this is easily computed from the standard deviation (SD) of the detected component waveshape divided by the SD of the additive noise
![]() | (2) |
component = 0.876, 0.174, and 0.935 for c1, c2, and c3, respectively. Because we are using multiple detectors, the values of
component were found by multiplying the SD of the waveshape
wave by the L2 norm of the coupling coefficients for that component. For the nth component, this is
![]() | (3) |
The performance of the dVCA algorithm was evaluated in several ways. First, the ability of dVCA to separate the three mixed signals was measured using a unit-normalized Amari error (Amari et al. 1996
), which describes the degree of component mixing (details can be found in APPENDIX B, Eq. B5). A completely separated solution gives an Amari error of zero, whereas the worst case mixture gives an Amari error of one. For three signals of equal variance, an Amari error of 0.065 corresponds to 10% mixing.
Second, the ability of dVCA to estimate each component waveshape was evaluated by calculating the fractional root-mean-square (RMS) error of the estimated waveshape as compared with the correct waveshape. Note that before this comparison could be made the estimated components
(t) needed to be rescaled and permuted because of the indeterminacy described by Eq. B1. For the jth component
![]() | (4) |
Finally, the accuracies of the amplitude and onset latency estimates for the jth component were evaluated by computing the deviation from the correct values for the entire set of single-trial estimates using
![]() | (5) |
![]() | (6) |
), then the error one obtains by working with the average ERP necessarily must be greater than or equal to
because it does not account for trial-to-trial variability. Thus errors in the dVCA estimates below the SDs of the variability represent a significant improvement over the standard assumption that amplitudes and latencies do not vary. Effects of amplitude variability
We first demonstrate that amplitude variability aids in the separation process. Eleven experiments using synthetic data, each consisting of 50 simulated trials from the source component configuration described above, were performed where the degree of variability of the single-trial amplitudes was controlled. To generate the synthetic data, 50 single-trial amplitudes for each component were randomly sampled from a log-normal distribution with a sample mean µamp = 1.0 and sample SDs of
amp = {0, 0.0625, 0.125, 0.1875, 0.219, 0.25, 0.375, 0.5, 0.625, 0.75, 1.0} for each of the 11 sets of synthetic data. Because latency variability was not simulated in this set of experiments, the single-trial latency parameters were set to zero (
jr = 0). Given the component waveshapes, the coupling matrix, and the single-trial amplitudes and latencies, Eq. 1 was used to generate the synthetic data. The SD of
noise = 0.217 was used for the noise, resulting in SNRs of 12.1, 1.9, and 12.7 dB (4.04, 0.80, and 4.31 in terms of SD ratios) for c1, c2, and c3, respectively. The dVCA algorithm was used to estimate the coupling matrix, component waveshapes, and single-trial amplitudes and latencies from the synthetic data. Because the true parameters were known, the performance of the algorithm was evaluated as previously described.
Without amplitude variability, dVCA was unable to separate the components as demonstrated by the high Amari error of 0.219 (see Fig. 3A). A small degree of amplitude variability,
amp = 0.25, renders the problem solvable as demonstrated in Fig. 3A, where the Amari error drops below 0.05 corresponding to a fractional RMS waveshape error of <10% (SNR
23 dB) for three equal variance components, and remains relatively constant with increasing variability hovering about an average of 0.028. Figure 3B shows the dramatic improvement in the waveshape estimation as quantified by the RMS errors, which rapidly drop to levels commensurate with the SNRs of the individual components. The effect of amplitude variability on the single-trial amplitude estimates (not shown) remained relatively constant for
amp
0.25 with mean SDs of the errors in the single-trial amplitude estimates of 0.014, 0.076, and 0.010 and for c1, c2, and c3 respectively. This is well below the SD of the amplitude variability. It should be noted that even though the true onset latencies were set to zero in these simulations, dVCA still estimates these quantities. The errors of the onset latency estimates (not shown) also remained relatively constant with respective mean SDs of 0.417, 2.059, and 1.000 ms.
|
Effects of latency variability
Next we demonstrate that latency variability also aids in the separation process. Eight experiments using synthetic data, each consisting of 50 synthetic trials from the source component configuration described above, were performed where the variability of the single-trial latencies was controlled. In these experiments there was no amplitude variability (
jr = 1). To generate the simulated data, 50 single-trial latencies were randomly drawn from a Gaussian distribution with sample mean µlat = 0 and sample SDs of
lat = {0, 1.25, 2.5, 3.75, 5.0, 6.25, 7.5, 10.0} ms for each of the eight simulations. The same noise variance was used as in the amplitude variability case.
Amari error was found to decrease with increasing latency variability (Fig. 3C), dropping to <0.05 with
lat
7.5 ms. Component waveshape estimation was also found to improve with increasing onset latency variability, although the effect is not nearly so dramatic as in the amplitude variability case. Moreover, with the onset latency variability ranges considered, the accuracy of the algorithm's estimates never attained the levels found with amplitude variability. This difference between the effects of amplitude and latency variability was not attributable to the sampling distributions because we controlled for both the sample mean and variance. Instead, the most likely reason for this effect is the fact that in terms of signal amplitude, amplitude variability is a first-order effect, whereas latency variability, when written as a Taylor expansion, is a second-order effect dependent on the first derivative of the waveshape. Although the amplitudes were all set to one in the simulated data, dVCA still estimates these quantities. Again, these amplitude estimate errors were low for
lat
7.5 ms with the mean SD of the errors in the estimates equal to 0.017, 0.077, and 0.011 for c1, c2, and c3, respectively, which is on the order of the errors seen in the amplitude variability trials. Onset latency estimate errors also remained relatively constant with SDs of 0.250, 2.250, and 1.142 ms, respectively.
Sensitivity to additive white Gaussian noise
In this first noise study we examined the robustness of dVCA in the face of additive white Gaussian noise by simulating 12 different noise levels. Each simulation relied on 50 trials of synthetic data where variability in both the amplitudes and latencies of the components was modeled. Variability ranges were in accordance with those observed in our previous investigations (Truccolo et al. 2001
). Single-trial amplitudes were drawn from a log-normal distribution with sample mean µamp = 1.0 and sample SD
amp = 1.0. Single-trial latencies were drawn from a normal distribution with sample mean µlat = 0 and sample SD
lat = 10.0 ms. The synthetic signal from each electrode (detector) was contaminated with a unique white Gaussian noise waveform, which is specified by
mr(t) in Eq. 1.
The component-specific SNRs and resulting Amari errors, listed in Table 1 and shown in Fig. 4A, indicate a relatively smooth increase in Amari error with decrease in c1 SNR. However, a significant jump in error occurs as the SNR of c2 passes below 17 dB (note that this corresponds to a c1 SNR = 3 dB; see figure caption), suggesting that an SNR level of about 17 dB may denote a transition from a useful data set to a prohibitively noisy one. These results can be compared with the waveshape fractional RMS errors, which grow exponentially with decreasing SNR (Fig. 4B). The errors reach 1.0, signifying that the deviations in the estimates are on the scale of the waveshape itself, at about 15 to 17 dB for the localized components and about 17 dB for the far-field component. Figure 4E shows waveshape results for four different noise levels, providing a better idea of the quality of the separation under these conditions. Most noteworthy is the fact that the majority of the waveshape error arises from the high-frequency contamination of the Gaussian noise rather than mixing of the components. Much of this could easily be improved by incorporating a prior probability describing the expectation that components should be slowly varying with respect to typical sampling rates, which would effectively low-pass filter the results. As we will demonstrate, explicitly filtering a real data set can remove important signals, and alter others (Bogacz et al. 2002
; Mocks et al. 1986
). Regardless, some distortion in the positive phase of c1 can be seen at SNR1 = 3.0 dB, which is much more easily noticeable at SNR1 = 15 dB, where only c1 and c3 remain detectable. The improved accuracy in the estimation of the far-field component c3 (illustrated in Fig. 4B) is most likely explained by the fact that each electrode in the array provides information about the far-field source, whereas for local sources some detectors have small signals.
|
|
Figure 4D shows the behavior of the onset latency estimates, which are less well estimated than the amplitudes. In addition, the degradation of the quality of the far-field estimates was not noticeably different from those of the local sources with 95% of the estimates being within the range of variability down to an SNR of 3 to 1 dB, and 68% of the estimates within the range of variability down to 15 to 18 dB. Figure 4G shows scatterplots of the true onset latencies versus the estimates for c1. The diagonal pattern, which indicates a high level of predictability, is almost lost by 15 dB (r = 0.392). Note that as a result of the difference in variability levels of the amplitudes and onset latencies, the distribution of points in Fig. 4F should not be directly compared with those in Fig. 4G because they are effectively at different magnifications.
Sensitivity to correlated far-field noise
In an effort to more accurately simulate the conditions that may be experienced during a real experiment, we designed a second noise study to determine the effect of ongoing correlated far-field activity on dVCA performance. The ongoing far-field activity (noise) was modeled as a time series with a 1/f spectrum so that correlations would exist at all timescales. To simulate its far-field nature the ongoing activity was coupled identically to each detector, which can also be viewed as a spatial correlation across channels.
Again, 12 levels of the ongoing noise amplitude were tested; however, in this study Gaussian noise was not added to the individual electrodes, only the ongoing correlated far-field noise was used. Noise amplitude was computed from the sample variance of the time-series noise signal. The specific noise levels, SNRs of the three components and resulting Amari errors are listed in Table 2. Figure 5A shows the Amari error smoothly decreasing with increasing SNR. The Amari error reaches the same level of error as in the Gaussian noise case with 20 dB less noise, amounting to large errors around a c1 SNR of 3 to 8 dB. It is apparent that this type of noise more severely limits the ability of dVCA to separate signals. Figure 5B shows the component waveshape errors blowing up around 1 dB, with the effect on the far-field component c3 (red) being understandably catastrophic because both c3 and the noise are correlated across detectors. The amplitude errors reach unacceptable levels around 0 to 3 dB (Fig. 5C). However, most interesting are errors of the onset latencies (Fig. 5D). The errors for the local sources reach high levels (68% of estimates being within the range of variability) between 0 and 1 dB, whereas the errors for the far-field component c3 are large across the entire SNR range considered with 95% of the estimates never being within the range of variability. This is because the correlated far-field noise is severely interfering with the ability to identify the far-field component in any given trial.
|
|
Neuroscientists have several available techniques to analyze evoked responses, although each of these techniques is based on different assumptions and signal models. Here we briefly evaluate two popular techniques, PCA and ICA, and compare them to dVCA using our synthetic data, which as described above, is based on experimental findings regarding both the componentry and their laminar distribution (Givre et al. 1995
; Mehta et al. 2000
) and observed trial-to-trial variability (Truccolo et al. 2002
). ICA was performed by applying the Infomax ICA algorithm in the EEGLAB toolbox (Delorme and Makeig 2004
). The runica.m algorithm (Makeig et al. 1996
) was used both with and without the "extend" option (extended ICA), which allows ICA to model sub-Gaussian sources (Lee et al. 1999
). Because extended ICA consistently gave better results in the single-trial data sets exhibiting trial-to-trial variability, we will present those results here and in Fig. 6. PCA was performed using the EEGLAB routine runpca.m (Colin Humphries, CNL/Salk Institute, August 1997), which computes the singular value decomposition (SVD) of the data and identifies the resulting eigenvectors as the estimates of the source waveshapes. Because each data set has 15 channels, both PCA and ICA estimate 15 source waveshapes. Because it is known that the synthetic data are composed of three sources, the three estimated source waveshapes with highest correlation to the three original sources were chosen for analysis, whereas the remaining estimated sources were ignored. This avoids the difficult issues of estimating the number of sources in both PCA and ICA, and thus gives them a slight advantage.
|
amp = 1.0, normally distributed latency variability with sample mean µlat = 0 and sample SD
lat = 10.0 ms, and possessed additive Gaussian noise identical in SNR level to that of the no variability case. These two data sets were analyzed using PCA and ICA in two different ways: first by averaging the epochs and analyzing the average response (average analysis), and second by treating the data as a long string of concatenated single-trial responses (single-trial analysis). Figure 6A shows the source waveshape results for PCA, extended ICA, and dVCA in the single-trial analysis cases. Note that only the dVCA analysis of the variable responses results in accurate identification of the underlying source waveshapes. The Amari errors in Fig. 6B quantify the degree to which the components were correctly identified in the four cases. In this example, the performance of the original ICA algorithm was comparable to PCA. Because dVCA can analyze only single-trial responses, the dVCA analysis consisted of analyzing the single-trial cases, resulting in only two bars in the bar graph. Note that dVCA performs poorly when there is no trial-to-trial variability. However, the presence of trial-to-trial variability dramatically improves the performance of dVCA, enabling it to surpass both PCA and extended ICA in separation quality.
Finally, we compared the ability of extended ICA and dVCA to identify source waveshapes in the face of additive white Gaussian noise. These data sets are identical to those analyzed in the earlier section on Sensitivity to additive white Gaussian noise. Note also that the data set with the smallest noise level (SNR1 = 15 dB) is the same data set that was used in the variable single-trial case considered above. Figure 6C demonstrates that dVCA consistently outperforms extended ICA with respect to robustness to noise. It should be noted that the extended ICA algorithm outperformed the original ICA algorithm in all but two cases. At SNR1 = 9 dB, the Amari error of extended ICA was 0.240, whereas the Amari error of ICA was 0.177 compared with the Amari error of dVCA, which was 0.100. In the second case, the Amari error of extended ICA and ICA differed by only 0.001.
Application to cortical field potential data
To further demonstrate the utility of dVCA, we applied the algorithm to field potentials recorded from primary visual cortex of a male macaque monkey during the cuing period of an earlier selective attention study (Mehta et al. 2000
) for which all animal care and use procedures were approved by the Institutional Animal Care and Use Committee, and were in accordance with National Institutes of Health publication no. 8623 (rev. 1087). During data collection, the subject was required to discriminate between standard and target visual stimuli for a juice reward. The standard visual stimulus was a 10-µs red-light flash presented through a diffusing plate subtending a 20° visual angle, centered on the point of visual fixation, while the target was presented similarly, although differing slightly in intensity. Stimuli were presented at irregular interstimulus intervals (ISIs) (minimum of 350 ms, average of 2 stimuli/s). Intracortical field potential profiles were recorded using a linear-array multielectrode with 14 contacts (or channels), equally spaced at 150 µm, inserted into V1 and positioned so that the channels spanned all six laminae (see Fig. 2A, middle). The continuous field potential record from all channels, incorporating the stimulus tags, was sampled at 2 kHz and recorded on a PC-based data acquisition system (Neuroscan, El Paso, TX).
The signals examined with dVCA were recorded during 171 trial presentations of the standard visual stimulus and were epoched from 0 to 300 ms (0 ms indicates stimulus presentation). No other preprocessing of the data was performed. The average field potential signals in each electrode contact were calculated and used to determine the current source density (CSD) profile of these data (Fig. 7A). The CSD is an estimate of the second spatial derivative of the electric potential, which is proportional to the vector divergence of the current field. This profile was approximated using a three-point, second-order difference of the field potentials (Nicholson and Freeman 1975
; Schroeder et al. 1995
). Because the CSD represents the divergence of the electric current, it indexes the local regions of transmembrane current flow and eliminates volume-conducted activity generated at a distant site, which often contaminates field potential recordings. This makes the CSD a useful tool for studying the electric signals detected by an electrode array. Consistent with earlier studies reviewed by Schroeder et al. (1995
, 2001
), the visually evoked CSD profile sampled from V1 during this experimental session shows that the earliest activation occurs in the granular subdivisions of Layer 4, which constitute the major target of thalamic afferents. A second focus of activity localizes to the supragranular (laminae 2 and 3) layers and is thought to index a feedforward activation of pyramidal cells by interneurons in the granular layers (Schroeder et al. 1990
, 1991
). Note that because we are using a three-point, second-order difference to approximate the CSD, the top waveform in Fig. 7A is associated with electrode channel 2 rather than channel 1.
|
1 = 0.1044), which is on the order of 10% variation. In contrast, the distribution of single-trial latency shifts (Fig. 7D) shows something quite different and unexpectedit is bimodal with an early latency peak at 4.625 ms and a late latency peak at 2.125 ms with a difference of 6.75 ms. The ratio of late to early responses is 2.29:1 (119:52), as defined by a 1.25-ms cutoff between the two modes. We performed several different analyses to confirm the existence of the bimodal latency distribution displayed in Fig. 7D. First, Fig. 7E displays a colorized plot of the actual, single-trial field potentials (FPs) recorded from electrode channel 10, which is located in the granular layer. Time runs along the horizontal axis, chronological trial number runs along the vertical axis, and color indicates the amplitude of the FP. The red dashes on the left side of the plot indicate trials for which dVCA determined the latency to be early. It is easily seen that these highlighted trials have a response onset before the late trials, confirming that the bimodal latency distribution of the dVCA estimation concurs with observations from the actual data. It is also apparent that early and late responses are grouped over trials, which suggests a slow cyclical shift between different response states with one state dominated by faster and the other by slower responses. Second, we performed a cross-correlation analysis between each single-trial waveshape and the trial-averaged, FP waveform between 25 and 95 ms (this time period captures the initial negative deflection of the averaged FP waveform). This analysis also yielded a bimodal distribution (not shown). Finally, we selectively averaged the early and late FP recordings in channel 10 and showed them to be significantly different in both latency and waveshape (Fig. 7F). The trial-averaged FP waveshape (black) lies between the early (red) and late (blue) waveshapes. As expected, the minimum of the early averaged responses occurs before the minimum of the late averaged responses (5.5-ms difference). Although this time difference reflects the fact that these early responses precede the late responses, it is however not as accurate as the dVCA estimate because it is derived from a single time point rather than from the entire waveshape. It is also not surprising that the late responses more closely resemble the ensemble-averaged FP because they outnumber the early responses by more than 2 to 1.
Because these two types of granular responses exhibit distinct waveshapes and latencies this suggests that different physiological mechanisms are at work in these two processes. These mechanisms will be much easier to distinguish by separating the original data set into two subsetsan early subset and a late subsetand performing separate analyses on each and comparing. We then used dVCA to reestimate the first component in each subset. Figure 8, A and B shows the CSD of the first component from the early and late subsets, respectively. The onset latency of the prominent granular sink over source configuration was determined by descending down the left side of the peak and identifying the point in time where five consecutive previous data points were monotonically increasing. With this measure the response onset latency of the early subset (Fig. 8A) was 28 ms, which is clearly earlier than its counterpart in the late subset (Fig. 8B), which was 42.5 ms, as indicated by the drop lines. The waveshapes of the two types of Layer 4 responses differ significantly, indicating a difference in activation patterns between subsets. This is also confirmed by the different degrees to which this activation appears in the supragranular layers because the later responses seem to be more strongly coupled to the supragranular activation than the early responses.
|
To verify that these UHF oscillations were not present in the late responses, we applied a Morlet-based wavelet transformation to the two residual subsets of channel 11. To preserve information about timefrequency behavior of oscillations not precisely time-locked to the stimulus, we applied the wavelet transform to each single trial separately and the wavelet results were then averaged. These results show that there is a burst of UHF activity ranging from about 160 to 220 Hz and occurring between 42 and 57 ms in the early subset, which is represented as an "island of activity" in the timefrequency plot (Fig. 8D). Such an island of activity is completely absent in the late subset (Fig. 8E). This finding demonstrates the ability of dVCA to identify distinct physiological modes of activation using the single-trial characteristics of the responses, even in situations where the data are unfiltered.
To demonstrate how dVCA can be used to extract multiple components, we focus on the late subset. A complete analysis of the entire data set will be published elsewhere. In this example we model three components (Fig. 9, AC). Components 2 and 3 (c2 and c3) have been magnified by a factor of 2 to make them easily visible, and are thus not on the same scale as component 1 (c1), which is the largest response. To ensure that the SNR is within the acceptable range for dVCA, we used Eq. B4 to compute the SNR for each component in each channel. For a given component this is accomplished by estimating the SD of the component in that channel and the SD of the residual signal in that channel. The average SNR for each component was then computed by averaging its SNRs across channels. We found that all three components are well within the range of acceptable performance with average SNRs of 1.59, 1.68, and 0.98 dB for c1, c2, and c3, respectively.
|
A low-frequency biphasic response is described by c2 (Fig. 9B). Significant onset of this component occurs at 37 ms (see drop line), which is after c1 begins the biphasic activation in the granular layers. Note that the source and sink distribution of c2 are in superficial layers compared with the small supragranular response of c1. At first glance it appears that the spatial distribution of c3 is the same as that of c2. However, whereas these components may involve the same populations of supragranular neurons, c3 also involves others in the granular layers. The initial activation of c3 is at 30 ms, but it is not until 45 ms when the component begins to display a biphasic activation followed later by a slow wave.
Figure 9D shows the single-trial amplitude histograms along the diagonal, along with scatterplots showing the relationships between the single-trial amplitudes of the three components. The initial granular response, c1, has the lowest amplitude variability (
amp1 = 0.102), whereas the later supragranular responses display greater variability (
amp2 = 0.248,
amp3 = 0.362). Figure 9E shows the single-trial latency results along with scatterplots depicting the trial-by-trial relationships between component latencies. Component c1 again displays the least variability (
lat1 = 1.08 ms), but in this case, c2 displays greater variability than that of c3 (
lat2 = 9.85 ms,
lat3 = 1.58 ms).
Correlations in the scatterplots between component parameters indicate dynamic interactions among these components. Of the amplitudeamplitude interactions (Fig. 9D), we see that the amplitudes of c2 are correlated with c3 (r = 0.495, P < 107), so that when c2 is larger than average, c3 is also larger than average. Such correlation might suggest that these components have not been separated. However, note that the latency variability of each component is very different (Fig. 9E), and that there is little correlation between the latency of c2 and the latency of c3 (r = 0.171, P = 0.075).
Of the latencylatency interactions, the largest correlation is between the c1 and c2 latencies (r = 0.327, P < 0.001), so that when c1 is earlier than average, c2 is also earlier than average. More interesting are the amplitudelatency interactions. Figure 9F shows the two most probable relationships. First the amplitude and latency of c1 are correlated (r = 0.297, P < 0.002), so that when c1 is early its response is smaller, and when it is late its response is larger. Self-interactions, such as this one illustrated by c1, contain much information about the underlying dynamics of the neural response. More apparent is the relationship between the amplitude of c1 and the latency of c2 (r = 0.483, P < 107). In this case, when the c1 granular response is larger than average, c2 onsets later than average. This result is somewhat counterintuitive because generally we expect that if the c1 is driving the c2, a larger c1 might produce an earlier-onset c2. The difference between c1 and c2 is further highlighted by noting that the amplitude of c1 is anticorrelated to the latency of c3 (r = 0.190, P = 0.048) (not shown).
|
|
DISCUSSION |
|---|
|
Maximum-likelihood techniques have previously been used to approach the problem of trial-to-trial variability of evoked responses. These past works have relied on single-component signal models that incorporate latency variability (Pham et al. 1987
; Woody 1967
), and more recently both amplitude and latency variability (Jaskowski and Verleger 1999
) to characterize evoked responses. Multiple-component models were introduced by Lange et al. (1997)
by adopting a template model of the ERP waveshapes, and were used to characterize both the amplitude and latency of multiple components in single-channel EEG recordings. Early versions of the dVCA algorithm also dealt with multiple-component, single-channel estimation (Knuth et al. 2001
; Truccolo et al. 2001
, 2002
, 2003
), but instead used a completely general waveshape model that did not rely on a given functional form. The dVCA algorithm presented here allows one to use multiple-channel recordings to identify and characterize multiple ERP components in the single trial by taking advantage of the trial-to-trial variability.
The dVCA algorithm was derived by approaching the problem as an exercise in Bayesian parameter estimation. There are several advantages to a Bayesian approach. First, the strategy is model based in the sense that given a quantitative model of the phenomena of interest, probability theory can be used to estimate the values of the model parameters from the data. Any failure in the algorithm can be traced back to either inadequacies of the model in representing the physical situation, assumptions, or approximations made in its implementation, or a situation produced by an insufficiently informative data set. Second, inadequacies in the model, once identified, are readily remediedleading to improvements in the algorithm. Third, once the model components have been adequately estimated, the residual data can be examined to identify previously hidden phenomena. For example, herein we have discovered UHF oscillations associated with early initial granular layer responses and absent during the more frequent late responses. By accurately identifying the evoked components in the single-trial epochs, one can more accurately study the ongoing activity, which has been purported to contain signals important for communication among brain regions (Bressler et al. 1993
; Singer 1993
; Truccolo et al. 2003
), in perception, working memory, and sensorimotor integration (Engel et al. 2001
; Lee et al. 2003
). Finally, the Bayesian methodology allows one to incorporate additional prior information into the problem to improve one's inferences, which is a major advantage that we plan to capitalize on in future work.
dVCA has a number of technical strengths. The first is that it capitalizes on both amplitude and latency variability to aid in component separation. Second, neither the components nor their underlying neural sources are assumed to be independent of one another. This avoids the adoption of physiologically implausible assumptions and enables one to study the dynamical interactions among neural sources. A third strength is that by accounting for trial-to-trial variability in amplitude and latency, one is able not only to quantify the interactions among sources but also to study their time-dependent properties over the duration of the entire experiment. Single-trial analysis also allows dVCA to detect components, which are not present in every experimental trial, thus allowing an experimenter to study cognitive or sensorimotor processes, which may use different processing strategies at different times. Additionally, one can use dVCA to study how systemic factors such as attention, arousal, or varying disease states may modify neural responses on both a single-trial and a single-component basis. Finally, although no explicit prior information about the values of the model parameters was included in the development of the algorithm, much prior information went into the choice of the model. This is in contrast to other approaches such as ICA and other general BSS techniques, which strive to make very general assumptions about the model and the distributions of the parameter values (Knuth 1997
, 1999
).
dVCA is robust in the presence of noise, allowing accurate estimates of all parameters down to SNRs on the order of 15 dB for white Gaussian noise and down to around 01 dB for correlated far-field noise. The estimation of far-field signals in the presence of correlated far-field noise was difficult. This is undoubtedly explained by the fact that this noise possesses the same spatial distribution as the source, making the two difficult to distinguish. Although this is a problem for local field potential measurements, it will not be a problem for whole-head recordings because, in that case, a far-field source for one detector will be local for another. Also in most cases, the behavior of the background noise will fall somewhere between the pure white Gaussian noise and pure correlated far-field noise extremes. When using dVCA, we recommend that one calculate the SNR of the estimated components to ensure that the algorithm is operating in a regime where the quality of the parameter estimates is guaranteed. In addition, the spatial distribution of the sources across the array should be examined to ensure that they are physiologically reasonable. We have been able to construct cases where the sources remain inseparable by the algorithm. This can typically be detected by examining the CSD map of the estimated components across the array. In cases where two sources remain mixed, each is characterized by multiple sets of neural sources in identical locations. However, such a criterion is not necessarily conclusive in intracortical laminar recordings because the cortical architecture allows for only a small number of current sources and sinks, and tight couplings between sources is extremely likely (Shah et al. 2003
). One possible solution to this problem is to restart the algorithm using starting conditions consistent with the source locations indicated by the CSD. The value of the posterior probability of these solutions obtained from different starting points can also be computed and compared (when using the same model order) to ensure that one has the most probable solution and is not merely stuck at a local maximum. Searching such enormous solution spaces is a difficult problem faced by all source separation algorithms.
We are currently working to improve the dVCA algorithm along several lines.
First, there are situations where the experimenter has knowledge about the forward problem, which describes the propagation of the signals to the detectors. Such knowledge can be incorporated by adopting a more specific source model (e.g., current dipole model) and expressing the coupling coefficients in terms of the new ERP source parameters, detector coordinates, and head geometry (Knuth and Vaughan 1998
; Schmidt et al. 1999
), or by using information about the propagation law to derive prior probabilities for the coupling matrix elements (Knuth 1998
, 1999
, 2005
). Similarly, more detailed information about the correlation structure of the background noise can be used to derive more accurate likelihood functions (Sivia 2003
). After much experimentation in a specific cortical region, one can assign Gaussian distributions to the priors for the amplitude and latency variability of specific sources based on previously observed means and variances.
Second, by using a discrete model of the component waveshapes, we are restricted to estimating discrete values of onset latency shift. By adopting a waveshape model that relies on a set of continuous basis functions, continuous values of latency shift could be investigated. An example of a continuous time model is the frequency domain model of the ERP used by Pham et al. (1987)
and Jaskowski and Verleger (1999)
. They model the waveshape as a sum of a discrete set of sinusoids, which is continuous in the time domain, but discrete in the frequency domain. In principle, such a model allows one to describe latency shifts with arbitrary precision. It should also be noted that dVCA does not accommodate trial-dependent waveshape changes or spatiotemporal wave activity across cortex. Incorporating such activity into the signal model will require great care not to erroneously overfit the data.
Third, our algorithm represents a maximum a posteriori (MAP) estimate based on an iterative, or fixed-point, solution. Although each step in our algorithm has intuitive appeal, it is perhaps not the most effective means for obtaining a solution. Markov chain Monte Carlo (MCMC) with simulated annealing is well suited to performing simultaneous model selection calculations and parameter estimations. Several years ago, we briefly investigated this approach only to find it to be extremely time consuming given the large number of parameters in the model. It is expected that given the advances in computing power and in MCMC technology, such an approach would most likely outperform the algorithm presented here by automatically identifying the number of components warranted by the data, avoiding local optimal solutions, and readily providing error bars for the results.
Fourth, oscillatory responses of sufficiently low frequency pose great difficulties when it comes to accurate estimation of their onset latency. Furthermore, ongoing oscillations that are weakly time locked to stimuli are not well described by the multiple-component event-related model and suggest that modifications in the model to allow for ongoing oscillations or oscillatory bursts would lead to new results.
Finally, in principle this algorithm is equally applicable to human scalp EEG and MEG data. In fact, the Woody filter, which was the first method to estimate single-trial evoked responses, was used on human scalp EEG with success (Woody 1967
). The signal-to-noise levels are certainly well within the applicable range (0 to 2 dB, or 1.0 to 0.1 in terms of SD ratios) for many of the evoked responses currently studied. However, in practice, several challenges remain. The number of detectors used in these paradigms is typically an order of magnitude greater than those we have demonstrated here. It is certain that the algorithm in its present implementation will run more slowly. Also, whole-head studies expose the experimenter to an order-of-magnitude more sources than we work with in our intracortical recordings. This increases the possibility that the dVCA algorithm can become trapped in nonoptimal local solutions. This of course is a potential problem for all source separation and localization techniques, and dVCA is no exception. We are beginning to examine the application of dVCA in human scalp studies, and expect that if such pathological solutions are encountered, they can be avoided by using more sophisticated search algorithms.
|
|
Appendix A: Algorithm development |
|---|
|
![]() | (A1) |
To apply this theorem to our problem, we consider the change in our knowledge about the model with the acquisition of new data, which consists of the set of recorded data x(t) from M detectors over the course of R trials. In this case, Bayes' theorem can be written as
![]() | (A2) |
= {
1,
2, ...,
R}. The most probable set of model parameters maximizes the probability in Eq. A2, and thus in practice the equation can be expressed as a proportionality with the inverse of the evidence p[x(t)|I] as the implicit proportionality constant. Equation A2 then becomes
![]() | (A3) |
For simplicity, the joint prior can be factored into four terms, each describing what we know about the source-detector coupling, the source waveshapes, the single-trial amplitudes, and the single-trial latency offsets
![]() | (A4) |
|I) and p(
|I), respectively, we assign uniform densities with appropriate cutoffs denoting a range of physiologically realizable values. Note that a joint uniform density p(
,
|I) can always be factored in this way, and is not necessarily a statement about independence. Because the amplitude and latency priors are each represented by a uniform density, we can absorb those two factors into the implicit proportionality constant.
Our derivation continues by using the principle of maximum entropy to assign a Gaussian likelihood (e.g., Jaynes 2003
; Sivia 1996
) by introducing a parameter
reflecting the expected square deviation between our predictions and the mean
![]() | (A5) |
|I) is the prior probability for
. Q represents the sum of the square of the residuals between the data and our model in Eq. 1, written
![]() | (A6) |
2. Thus higher-order statistical structure in the noise is tolerated and the noise does not have to be Gaussian distributed.
The fact that we do not actually know the value of
is not a problem because we can obtain a conservative result by considering all possible noise levels by integrating the joint posterior over all possible values of
. Symmetry considerations require that we assign what is called a Jeffreys prior for
, p(
|I) =
1 (Sivia 1996
). Performing the integral of the joint posterior over all possible values of
, we obtain a marginal posterior probability for our original set of model parameters. To do this, it helps to make the change of variables by defining t =
1 and then integrating over t by iteratively integrating by parts. The proportionality constant one obtains depends on whether the product MRT is even or odd, so there are two cases to consider. However, the functional form of the result is independent of this fact
![]() | (A7) |
If we were more knowledgeable, prior information regarding the source waveforms could be used to improve our inferences. In addition, knowledge of the source-detector coupling, which is found by solving the electromagnetic forward problem, could be used to create an algorithm that simultaneously performs source separation and localization (Knuth 1998
; Knuth and Vaughan 1998
). For simplicity, in this development we choose to assign uniform priors to p(C|I) and p(s|I), and absorb the terms into the implicit proportionality constant. It is more convenient to work with the logarithm of the posterior probability (Eq. A5), which can be compactly written as
![]() | (A8) |
,
|x(t), I].
An iterative algorithm to identify mcERPs, which we call differentially variable component analysis (dVCA), is implemented by solving Eq. A8 for the most probable set of model parameters. Such a solution is called the maximum a posteriori (MAP) estimate. This most probable set of model parameters is represented as a peak in the space of posterior probabilities. At this maximum, the partial derivative of the posterior probability with respect to any one of the model parameter values is zero. For this reason, our first step in deriving an optimal estimate of the component waveshape by equating each of the partial derivatives of Eq. A8 with zero. In the current dVCA implementation, each source waveshape is modeled as a pointwise time series where each point of the time series is considered as a model parameter. To solve this we must look at the partial derivative of the log posterior with respect to the waveshape amplitude at each of the times over which the waveshape is defined. The partial derivative with respect to the jth component waveshape at time q gives
![]() | (A9) |
![]() | (A10) |
![]() | (A11) |
jr). From this, one subtracts all other components after each has been appropriately scaled and time-shifted, Cmn
nrsn(q
nr +
jr). The derivative of the log probability is zero when the scaled, estimated waveshape equals W. Thus one can obtain an expression for the optimal waveshape of the jth source at time q in terms of the other sources
![]() | (A12) |
Similarly for the source amplitudes, one obtains an optimal estimate by considering the derivative of the log probability with respect to the amplitude of the jth source during the pth trial. Setting this derivative equal to zero results in
![]() | (A13) |
![]() | (A14) |
![]() | (A15) |
jp) onto the data after removing the other scaled, time-shifted components. This can be viewed in terms of a dot product, which is related to a matching filter solution.
The optimal source-detector coupling coefficients are found similarly with
![]() | (A16) |
![]() | (A17) |
![]() | (A18) |
Estimating the latency shift of the jth source during the pth trial using the approach taken for the other parameters leads to a complex solution as the latency appears implicitly as the argument of the waveshape function. Instead we examine the necessary conditions for maximizing the quadratic form Q. Expanding the square in Eq. A6, one can see that as the latency shift
jp is varied, only the cross-terms corresponding to the jth source change. The optimal estimate of the latency shift
jp can be found by maximizing the cross-correlation between the estimated source and the data after the contributions from the other sources have been subtracted such that
![]() | (A19) |
![]() | (A20) |
Perhaps the greatest challenge is determining the number of sources warranted by the data. In our investigations to date we have been focused on understanding the major sources responsible for the recorded data. This typically entails first modeling a single component and examining the un-modeled residual signal in detail. We then attempt to model the data with two components and so on to greater numbers of sources. As we demonstrate, the responses we are examining are sufficiently complex that this approach rapidly reveals previously unknown characteristics of the neurophysiology.
There are many ways that one can use the equations above to implement the dVCA algorithm. A useful iterative method (Fig. 1) begins by modeling a single neural source with the single-trial amplitudes set to unity and the latency shifts set to zero.
This implementation represents only one possible approach to using these equations to obtain useful solutions.
|
|
Appendix B: Amari error |
|---|
|
can at best be determined to within a scaled permutation of the true coupling matrix
![]() | (B1) |
is a diagonal scaling matrix and
is a permutation matrix. Here the quantities decorated with a caret (also known as a "hat") denote estimated quantities and those without a caret denote the unknown true values. Ideally, the source estimates are therefore related to the original sources by a simple matrix transformation
![]() | (B2) |
is the N x T matrix of the estimated source waveshapes, and M is a transformation matrix found by
![]() | (B3) |
![]() | (B4) |
![]() | (B5) |
|
|
GRANTS |
|---|
|
|
|
|
ACKNOWLEDGMENTS |
|---|
|
|
|
FOOTNOTES |
|---|
Address for reprint requests and other correspondence: K. Knuth, Department of Physics, University at Albany (SUNY), Albany, NY 12222 (E-mail: kknuth{at}albany.edu)
|
|
REFERENCES |
|---|
|
Bell AJ and Sejnowski TJ. An information-maximization approach to blind source separation and deconvolution. Neural Comp 7: 11291159, 1995.[Web of Science][Medline]
Belouchrani A, Abed-Meraim K, Cardoso J-F, and Moulines E. Second-order blind separation of correlated sources. Proc Int Conf Digital Sig Proc Nicosia, Cyprus, 1993, p. 346351.
Bogacz R, Yeung N, and Holroyd CB. Detection of phase resetting in the electroencephalogram: an evaluation of methods. Soc Neurosci Abstr 28: 506.9, 2002.
Bressler SL, Coppola R, and Nakamura R. Episodic multiregional cortical coherence at multiple frequencies during visual task performance. Nature 366: 153156, 1993.[CrossRef][Medline]
Cao J, Murata N, Amari S, Cichocki A, Takeda T, Endo H, and Harada N. Single-trial magnetoencephalographic data decomposition and localization based on independent component analysis approach. IEICE T Fund Electr 9: 17571766, 2000.
Delorme A and Makeig S. EEGLAB: an open source toolbox for analysis of single-trial EEG dynamics including independent component analysis. J Neurosci Methods 134: 921, 2004.[CrossRef][Web of Science][Medline]
Engel AK, Fries P, and Singer W. Dynamic predictions: oscillations and synchrony in top-down processing. Nat Rev Neurosci 2: 704716, 2001.[CrossRef][Web of Science][Medline]
Freeman JA and Nicholson C. Experimental optimization of current source-density technique for anuran cerebellum. J Neurophysiol 38: 369382, 1975.
Givre SJ, Arezzo JC, and Schroeder CE. Effects of wavelength on the timing and laminar distribution of illuminance-evoked activity in macaque V1. Vis Neurosci 12: 229239, 1995.[Web of Science][Medline]
Hyvärinen A and Oja E. A fast fixed-point algorithm for independent component analysis. Neural Comput 9: 14831492, 1997.[CrossRef][Web of Science]
Jaskowski P and Verleger R. Amplitude and latencies of single-trial ERP's estimated by a maximum-likelihood method. IEEE Trans Biomed Eng 46: 987993, 1999.[CrossRef][Web of Science][Medline]
Jaynes ET. Probability TheoryThe Logic of Science. Cambridge, UK: Cambridge Univ. Press, 2003.
Jung T-P, Makeig S, Westerfeld M, Townsend J, Courchesne E, and Sejnowski TJ. Independent component analysis of single-trial event-related potentials. In: Proceedings of the First International Workshop on Independent Component Analysis and Signal Separation: ICA'99, edited by Cardoso J-F, Jutten Ch, and Loubaton Ph. Aussois, France: ICA'99, 1999, p. 173178.
Knuth KH. Difficulties applying recent blind source separation techniques to EEG and MEG. In: Maximum Entropy and Bayesian Methods, Boise 1997, edited by Erickson GJ, Rychert JT, and Smith CR. Dordrecht, The Netherlands: Kluwer Academic, 1997, p. 209222.
Knuth KH. Bayesian source separation and localization. In: Proceedings of SPIE: Bayesian Inference for Inverse Problems, edited by Mohammad-Djafari A. Bellingham, WA: SPIE, 1998, vol. 3459, p. 147158.
Knuth KH. A Bayesian approach to source separation. In: Proceedings of the First International Workshop on Independent Component Analysis and Signal Separation: ICA'99, edited by Cardoso J-F, Jutten Ch, Loubaton Ph, Aussois France: ICA'99, 1999, p. 283288.
Knuth KH. Informed source separation: a Bayesian tutorial (Invited paper). In: Proceedings of the 13th European Signal Processing Conference (EUSIPCO 2005), edited by Sankur B, Cetin E, Tekalp M, and Kuruo
lu E. Antalya, Turkey, 2005.
Knuth KH, Truccolo WA, Bressler SL, and Ding M. Separation of multiple evoked responses using differential amplitude and latency variability. In: Proceedings of the Third International Workshop on Independent Component Analysis and Blind Signal Separation: ICA2001, edited by Lee T-W, Jung T-P, Makeig S, and Sejnowski TJ. San Diego, CA: ICA2001, 2001, p. 463468.
Knuth KH and Vaughan HG Jr. Convergent Bayesian formulations of blind source separation and electromagnetic source estimation. In: Maximum Entropy and Bayesian Methods, Garching, Germany 1998, edited by von der Linden W, Dose V, Fischer R, and Preuss R. Dordrecht, The Netherlands: Kluwer Academic, 1998, p. 217226.
Lange DH, Pratt H, and Ingbar GF. Modeling and estimation of single evoked brain potential components. IEEE Trans Biomed Eng 44: 791799, 1997.[CrossRef][Web of Science][Medline]
Lee KH, Williams LM, Breakspear M, and Gordon E. Synchronous gamma activity: a review and contribution to an integrative neuroscience model of schizophrenia. Brain Res Brain Res Rev 41: 5778, 2003.[CrossRef][Medline]
Lee T-W, Girolami M, and Sejnowski TJ. Independent component analysis using an extended infomax algorithm for mixed sub-Gaussian and super-Gaussian sources. Neural Comput 11: 609633, 1999.
Makeig S, Bell AJ, Jung T-P, and Sejnowski TJ. Independent component analysis of electroencephalographic data. In: Advances in Neural Information Processing Systems, edited by Touretzky D, Mozer M, and Hasselmo M. Cambridge, MA: MIT Press, 1996, vol. 8, p. 145151.
Makeig S, Jung T-P, Bell A, Ghahremani D, and Sejnowski TJ. Blind separation of auditory event-related brain responses in independent components. Proc Natl Acad Sci USA 94: 1097910984, 1997.
Makeig S, Westerfield M, Jung T-P, Enghoff S, Townsend J, Courchesne E, and Sejnowski TJ. Dynamic brain sources of visual evoked responses. Science 295: 690694, 2002.
Mehta AD, Ulbert I, and Schroeder CE. Intermodal selective attention in monkeys. I. Distribution and timing of effects across visual areas. Cereb Cortex 10: 343358, 2000.
Mitzdorf U. Current source-density method and application in cat cerebral cortex: investigation of evoked potentials and EEG phenomena. Physiol Rev 65: 37100, 1985.
Mocks J, Gasser T, Kohler W, and De Weerd JP. Does filtering and smoothing of average evoked potentials really pay? A statistical comparison. Electroencephalogr Clin Neurophysiol 64: 469480, 1986.[CrossRef][Web of Science][Medline]
Nicholson C and Freeman JA. Theory of current source-density analysis and determination of conductivity tensor for anuran cerebellum. J Neurophysiol 38: 356368, 1975.
Pham DT, Mocks J, Kohler W, and Gasser T. Variable latencies of noisy signals: estimation and testing in brain potential data. Biometrika 74: 525533, 1987.
Särelä J, Vigário R, Jousmäki V, Hari R, and Oja E. ICA for the extraction of auditory evoked fields. In: 4th International Conference on Functional Mapping of the Human Brain. Montreal, Canada: HBM'98, 1998.
Schmidt DM, George JS, and Wood CC. Bayesian inference applied to the electromagnetic inverse problem. Hum Brain Mapp 7: 195212, 1999.[CrossRef][Web of Science][Medline]
Schroeder CE. Defining the neural bases of visual selective attention: conceptual and empirical issues. Int J Neurosci 80: 6578, 1995.[Medline]
Schroeder CE, Mehta AD, and Foxe JJ. Determinants and mechanisms of attentional modulation of neural processing. Front Biosci 6: D672D684, 2001.[Web of Science][Medline]
Schroeder CE, Mehta AD, and Givre SJ. A spatiotemporal profile of visual system activation revealed by current source density analysis in the awake macaque. Cereb Cortex 8: 575592, 1998.
Schroeder CE, Steinschneider M, Javitt DC, Tenke CE, Givre SJ, Mehta AD, Simpson GV, Arezzo JC, and Vaughan HG Jr. Localization of ERP generators and identification of underlying neural processes. Electroencephalogr Clin Neurophysiol Suppl 44: 5575, 1995.[Medline]
Schroeder CE, Tenke CE, Givre SJ, Arezzo JC, and Vaughan HG Jr. Laminar analysis of bicuculline-induced epileptiform activity in area 17 of the awake macaque. Brain Res 515: 326330, 1990.[CrossRef][Web of Science][Medline]
Schroeder CE, Tenke CE, Givre SJ, Arezzo JC, and Vaughan HG Jr. Striate cortical contribution to the surface-recorded pattern-reversal VEP in the alert monkey. Vision Res 31: 11431157, 1991.[CrossRef][Web of Science][Medline]
Shah AS, Knuth KH, Lakatos P, and Schroeder CE. Lessons from applying differentially variable component analysis (dVCA) to electroencephalographic activity. In: Bayesian Inference and Maximum Entropy Methods in Science and Engineering, Jackson Hole WY 2003, AIP Conference Proceedings 707, edited by Zhai Y and Erickson GJ. Melville, NY: American Institute of Physics, 2004, p. 167181.
Shah AS, Knuth KH, Truccolo WA, Ding M, Bressler SL, and CE. A Bayesian approach to estimating coupling between neural components: evaluation of the multiple component event related potential (mcERP) algorithm. In: Bayesian Inference and Maximum Entropy Methods in Science and Engineering, Moscow ID 2002, AIP Conference Proceedings 659, edited by Williams C. Melville, NY: American Institute of Physics, 2003, p. 2338.
Singer W. Synchronization of cortical activity and its putative role in information processing and learning. Annu Rev Physiol 55: 349374, 1993.[CrossRef][Web of Science][Medline]
Sivia DS. Data Analysis. A Bayesian Tutorial. Oxford, UK: Clarendon Press, 1996.
Sivia DS. Some thoughts on correlated noise. In: Bayesian Inference and Maximum Entropy Methods in Science and Engineering, Jackson Hole WY 2003, AIP Conference Proceedings 707, edited by Erickson GJ and Smith CR. Melville, NY: American Institute of Physics, 2003, p. 303313.
Student. The probable error of a mean. Biometrika 6: 124, 1908.
Tang AC, Pearlmutter BA, Malaszenko NA, and Phung DB. Independent components of magnetoencephalography: single-trial response onset times. Neuroimage 17: 17731789, 2002.[CrossRef][Web of Science][Medline]
Tenke CE, Schroeder CE, Arezzo JC, and Vaughan HG Jr. Interpretation of high-resolution current source density profiles: a simulation of sublaminar contributions to the visual evoked potential. Exp Brain Res 94: 183192, 1993.[Web of Science][Medline]
Truccolo WA, Ding M, Knuth KH, Nakamura R, and Bressler SL. Trial-to-trial variability of cortical evoked responses: implications for the analysis of functional connectivity. Clin Neurophysiol 113: 206226, 2002.[CrossRef][Web of Science][Medline]
Truccolo WA, Knuth KH, Ding M, and Bressler SL. Bayesian estimation of amplitude, latency and waveform of single trial cortical evoked components. In: Bayesian Inference and Maximum Entropy Methods in Science and Engineering, Baltimore MD 2001, AIP Conference Proceedings 617, edited by Fry RL and Bierbaum M. Melville, NY: American Institute of Physics, 2001, p. 6473.
Truccolo WA, Knuth KH, Shah AS, Bressler SL, Schroeder C, and Ding M. Estimation of single-trial multicomponent ERPs: differentially variable component analysis (dVCA). Biol Cybern 89: 426438, 2003.[CrossRef][Web of Science][Medline]
Vigário R, Särelä J, Jousmäki V, and Oja E. Independent component analysis in decomposition of auditory and somatosensory evoked fields. In: Proceedings of the First International Workshop on Independent Component Analysis and Signal Separation: ICA'99, edited by Cardoso J-F, Jutten Ch, and Loubaton Ph. Aussois, France: ICA'99, 1999, p. 167172.
Woody CD. Characterization of an adaptive filter for the analysis of variable latency neuroelectric signals. Med Biol Eng 5: 539553, 1967.[CrossRef][Web of Science]
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| HOME | HELP | FEEDBACK | SUBSCRIPTIONS | ARCHIVE | SEARCH | TABLE OF CONTENTS |
| Visit Other APS Journals Online |