Short Communication
Open Access

A computational approach to developing cost-efficient adaptive-threshold algorithms for EEG neurofeedback

Eddy J Davelaar

Department of Psychological Sciences, Birkbeck, University of London, Malet Street, WC1E 7HX London, UK

**Corresponding author's address:** Eddy J Davelaar, Department of Psychological Sciences, Birkbeck, University of London, Malet Street, WC1E 7HX London, UK, Tel: +44 207 0790807; Fax: +44 207 6316312; E-mail: e.davelaar@bbk.ac.uk

Received: 12 November, 2017; Accepted: 21 December, 2017; Published: 29 December, 2017

**Citation:** Eddy J Davelaar (2017) A computational approach to developing cost-efficient adaptive-threshold algorithms for EEG neurofeedback. Int J Struct Comput Biol 2(1): 1-4.

Abstract

In electroencephalography (EEG) neurofeedback protocols, trainees receive feedback about the spectral power of the target brain wave oscillation and are tasked to increase or decrease this feedback signal relative to a predetermined threshold. A recent computational analysis of a neurofeedback protocol showed that the placement of the threshold has a major impact on the learning rate, and that a threshold placed too low or too high leads to no learning or even unlearning, respectively. However, the optimal threshold placement is not known in real-life scenarios. Here, these analyses were extended to assess whether an adaptive-mean threshold procedure could lead to faster learning curves. The results indicate that such a procedure is indeed superior to a fixed-mean procedure and that the distribution of asymptotic EEG power values converges to that obtained with the optimal-threshold procedure. Surprisingly, the adaptive-mean procedure leads to thresholds that are higher than the optimal one without causing unlearning, which is explained by the increase in threshold lagging behind the increase in the likelihood of activation of the target neurons. To date, no computational model has been used to compute the cost-efficiency of EEG neurofeedback procedures. The current simulation (within the specific reinforcement schedule) demonstrated a 35% reduction in training time, which could translate into sizeable financial savings. This study demonstrates the utility of computational methods in neurofeedback research and opens up further developments that tackle specific neurofeedback protocols to assess their real-life cost-efficiency.

**Keywords:** EEG; Neurofeedback; Computational model; Adaptive threshold; Cost-efficiency

Introduction

Neurofeedback is a brain training procedure in which trainees receive information about their brain activation. Here, we consider electroencephalography (EEG) neurofeedback. The aim of the training is to gain voluntary control over the spectral power of the brain waves. To do this, trainees’ brain signals are processed in real time to compute the spectral power in a target frequency band, such as the alpha band (8-12 Hz). The power is then compared against a predetermined threshold, with positive or negative feedback given if the actual power is above or below that threshold, respectively. The feedback itself can be of any modality (visual, auditory, haptic) and could even involve parameters in a gaming environment, such as the speed of a car or the height of a levitating vase [1].

EEG neurofeedback has a long tradition with roots in clinical practice. Research has been dominated by assessing the validity and efficacy of EEG neurofeedback training in alleviating symptoms associated with substance abuse [2], epilepsy [3], attention-deficit/hyperactivity disorder (ADHD; [4]), depression [5,6], post-traumatic stress disorder (PTSD; [7,8]), and many more neuropsychiatric, neurological, and neurodevelopmental disorders. In recent years, the impact of EEG neurofeedback on peak mental performance [9,10] has led to discussions regarding methodological rigour (for review see [11,12]) and questions about the theoretical mechanisms underlying neurofeedback learning [13,14,15,16].

Davelaar [14] developed a computational model consisting of spiking neurons that produce an EEG signal and used it to address the neural mechanisms involved in the initiation of learning in the context of neurofeedback. The model implemented the first stage of a larger multi-stage framework, with each stage commencing at a later time over the course of neurofeedback training. The first stage involves the selection of the medium spiny neurons (MSNs) in the striatum that are critical in nudging the thalamus into producing the target brain wave oscillation. These target MSNs were active probabilistically at the millisecond scale, whereas the power of the target frequency, and thus the feedback, was updated at the second scale. This creates a credit-assignment problem: which neurons out of many were responsible for the reward? Davelaar incorporated recent advances in computational biology [17] to solve this problem and in doing so demonstrated the model’s ability to home in on the target MSNs.

The model was analysed using the distribution of power values before and after training, which were very similar to actual empirical data. The analysis assessed the optimal placement for the predetermined threshold. To summarise, the analytical simulation involved drawing a sample that represents alpha frequency power from one of two distributions: the baseline and the target distribution. The baseline distribution was obtained during a pre-training EEG recording and is available in real-life situations. The target distribution was obtained by setting the target MSNs to be continuously active. This scenario is unlikely in reality and thus forms a limiting case. Both distributions are shown in Figure 1 together with vertical lines indicating the mean of the baseline distribution and the optimal criterion that separates the two. Davelaar [14] showed that using the optimal criterion as the predetermined threshold for reward leads to the fastest learning curve with the highest asymptotic level of alpha frequency power. The simulation model provided insight into the temporal dynamics: optimal learning involves first rejecting (unlearning) non-target MSNs and then enhancing the likelihood of activating target MSNs.

In reality, information about the target distribution is absent and thus no optimal threshold can be defined before the training intervention commences. Here, a simulation study is presented that investigated the use of an adaptive threshold algorithm that is applicable in real-life scenarios.

**Figure 1:** Parent distributions for the simulation model. These distributions were obtained using a spiking neuron model [14], with the baseline distribution (in blue) obtained prior to learning and the target distribution (in red) obtained by setting the target MSNs to be active continuously. The blue vertical line represents the mean of the baseline distribution, which is used with the fixed-mean procedure. The red vertical line represents the optimal criterion that separates the two distributions and is used with the optimal-threshold procedure.

Methods

The model from the second simulation study of Davelaar [14] was used without modification; see that paper for details. The probability of the target MSN being active is initially 0.001 and changes during the training period based on feedback. When the target neuron is active, an EEG power sample is drawn from the target distribution; otherwise it is drawn from the baseline distribution. When the sample is larger or smaller than the threshold, the synaptic strength of the active neurons is incremented or decremented by 0.1, respectively. The values are normalised before assessing whether the target neuron will be active on the next iteration.
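
The update rule described above can be sketched in a few lines of Python. This is a minimal illustration and not the model of [14]: it assumes a pool of 1000 neurons with equal initial strengths (so that the target neuron starts with an activation probability of 0.001), exactly one active neuron per iteration, Gaussian power distributions, and a placeholder mean and standard deviation for the target distribution. The function names and the `baseline`/`target` parameters are hypothetical.

```python
import random

STEP = 0.1  # change in synaptic strength per feedback event (from the text)


def target_probability(strengths):
    """Normalised strength of the target MSN (index 0 by convention here)."""
    return strengths[0] / sum(strengths)


def training_step(strengths, threshold, rng,
                  baseline=(64.98, 8.0), target=(91.0, 8.0)):
    """Run one iteration in place and return (power_sample, active_index).

    Assumptions (not from the source): one neuron is active per iteration,
    chosen with probability proportional to its strength; both power
    distributions are Gaussian; the SDs and the target mean are placeholders.
    """
    # rng.choices normalises the weights internally.
    active = rng.choices(range(len(strengths)), weights=strengths)[0]
    mean, sd = target if active == 0 else baseline
    power = rng.gauss(mean, sd)
    # Mirrored binary schedule: reward above threshold, punish otherwise.
    delta = STEP if power > threshold else -STEP
    # A strength of zero means the neuron has been completely unlearned.
    strengths[active] = max(0.0, strengths[active] + delta)
    return power, active
```

Calling `training_step` repeatedly on `strengths = [1.0] * 1000` with a fixed `threshold` and tracking `target_probability(strengths)` gives a rough analogue of a neural learning curve under these assumptions.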

The current model simulated three threshold procedures. In the optimal-threshold procedure, the optimal criterion (= 78) was used throughout the training period; it was included here because it produces the learning curve that needs to be approximated or bettered. However, the optimal-threshold procedure requires information that is unknown in practice. A more realistic procedure is the fixed-mean procedure, which uses the mean of the baseline distribution (= 64.98). This procedure was included as it is the simplest threshold that can be implemented in neurofeedback software. The focus here is on the simplest adaptive-mean procedure, which sets the threshold to the mean of the baseline distribution in the first instance and then replaces it every 1000 iterations with the mean of the preceding 1000 EEG power values. This type of adaptive procedure incurs little memory overhead and lets the threshold track the EEG power across learning. All simulations were run for 10000 iterations and repeated 100 times. Epochs were created by averaging across 100 iterations to produce 100 epochs. Each block contained 1000 iterations (or 10 epochs).
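
As an illustration of the adaptive-mean procedure, the following Python sketch keeps only a running sum and a counter, which is one way to keep the memory overhead small. The class and attribute names are hypothetical, and treating a sample exactly equal to the threshold as unrewarded is an assumption.

```python
class AdaptiveMeanThreshold:
    """Block-wise adaptive-mean threshold (illustrative sketch)."""

    def __init__(self, baseline_mean, block_size=1000):
        self.value = baseline_mean   # initial threshold: baseline mean
        self.block_size = block_size
        self._sum = 0.0              # running sum of the current block
        self._count = 0              # samples seen in the current block

    def update(self, power):
        """Record one EEG power sample; return True if it is rewarded."""
        rewarded = power > self.value
        self._sum += power
        self._count += 1
        if self._count == self.block_size:
            # Replace the threshold with the mean of the block just completed.
            self.value = self._sum / self._count
            self._sum = 0.0
            self._count = 0
        return rewarded
```

With `AdaptiveMeanThreshold(64.98)` the threshold stays at the baseline mean for the first 1000 samples and is then replaced once per block, as in the procedure described above.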

Results and discussion

Figure 2 presents the results of the simulation. The top panel shows neural learning curves for the three algorithms averaged across 100 simulation repetitions. The optimal-threshold procedure led to the fastest learning curve with the highest asymptotic value, whereas the fixed-mean procedure produced the slowest learning curve with the lowest asymptote, replicating [14]. Although the adaptive-mean procedure led to an intermediate learning rate, the asymptotic level was identical to that of the optimal-threshold procedure. This indicates that the maximum attainable value is not compromised when using the adaptive-mean procedure. The middle panel presents the standard deviations of the EEG power values of the top panel. Of interest here is that the standard deviations towards the end of the simulations have converged. In other words, after 10000 iterations the model did indeed stabilise to the final asymptotic distribution for all threshold procedures. Finally, in the bottom panel, the actual thresholds are plotted for every block (i.e., every 1000 iterations). The adaptive-mean procedure led to thresholds that are much higher than the optimal threshold. A separate fixed-threshold simulation was run (not shown) using the higher threshold (= 84.73), but it did not converge: as shown by [14], at high threshold levels the model is more likely to decrease the synaptic connections to the target MSN. In 50% of the simulation runs, the target MSN was completely unlearned before the end of the run. In the adaptive-mean procedure, by contrast, the threshold update lags behind the increase in the likelihood of target activation, so the synaptic connections are not unlearned. The simulations were also run using an adaptive-median procedure (not shown). The results were qualitatively similar, with the difference that the lower value of the median compared to the mean made the adaptive-mean procedure superior.

What does this mean in practical terms? From the neural learning curves it is clear that there is a speed-up with the adaptive-mean over the fixed-mean procedure. To quantify this speed-up: the asymptotic value of the fixed-mean procedure was reached after 65 epochs with the adaptive-mean procedure (65 rather than 100 epochs), which represents a 35% reduction in time. Putting it in a real-world perspective, just over four people could be helped using adaptive-mean neurofeedback compared with three using the fixed-mean procedure. In financial terms, a 35% drop in costs may make the difference between an insurer covering the reduced costs and the trainee having to pay the full 100%. The threshold algorithm implemented in the neurofeedback software can thus be seen as a critical component in the cost-efficiency analysis.
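
The arithmetic behind these figures can be checked with a short Python sketch. It assumes that the fixed-mean procedure needs the full 100 epochs of the simulation and that total training time is the only cost; the function names are hypothetical.

```python
def time_reduction(epochs_adaptive, epochs_fixed):
    """Fractional reduction in training time."""
    return 1 - epochs_adaptive / epochs_fixed


def trainees_served(n_fixed, epochs_adaptive, epochs_fixed):
    """Trainees that fit in the time needed for n_fixed fixed-mean trainees."""
    return n_fixed * epochs_fixed / epochs_adaptive


reduction = time_reduction(65, 100)      # about 0.35, i.e. a 35% reduction
served = trainees_served(3, 65, 100)     # 300 / 65, about 4.6 trainees
```

Under these assumptions the time budget for three fixed-mean trainees covers four complete adaptive-mean trainings, with part of a fifth.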

**Figure 2:** Simulation results. Top panel: Neural learning curves over 100 epochs. With the adaptive-mean procedure the learning curve converges to that obtained using the optimal-threshold procedure. Middle panel: Standard deviations of the neural learning curves in the top panel. The curves show that with all three threshold procedures the simulation model has reached its asymptotic distribution. Bottom panel: Threshold setting per block. Using the adaptive-mean procedure leads to thresholds that are higher than the optimal one, without leading to unlearning. The black lines in the top panel show the speed-up obtained using the adaptive-mean over the fixed-mean procedure, which in this simulation (within this specific reinforcement schedule) is a 35% reduction in training time.

Concretely, a practical implementation of the above findings could be as follows. First, conduct a baseline EEG recording and set the initial threshold at the mean of the baseline distribution. Second, after each training block, change the threshold to the mean of the values of that block (note: this is for within-session threshold adjustment). Third, stop the training session when the allocated time has passed or when three consecutive blocks have produced similar means and standard deviations. The latter criterion can provide additional cost savings.
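
The stopping criterion in the third step could be sketched as follows in Python. The text does not define "similar", so the tolerances `mean_tol` and `sd_tol` are hypothetical parameters that would need to be chosen for a given setup, as is the use of the population standard deviation.

```python
from statistics import fmean, pstdev


def should_stop(blocks, n_consecutive=3, mean_tol=1.0, sd_tol=1.0):
    """Return True if the last n_consecutive blocks look stable.

    blocks: list of completed blocks, each a list of EEG power values.
    "Similar" is taken to mean that the block means (and the block SDs)
    differ by no more than the given tolerances; this is an assumption.
    """
    if len(blocks) < n_consecutive:
        return False
    recent = blocks[-n_consecutive:]
    means = [fmean(b) for b in recent]
    sds = [pstdev(b) for b in recent]
    return (max(means) - min(means) <= mean_tol
            and max(sds) - min(sds) <= sd_tol)
```

In a session loop this check would run once per completed block, alongside the check on the allocated time.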

The procedure as described here is discretised, as the values are updated after every block. Continuous algorithms can also be developed using either moving windows or running averages. It should be noted that the results presented apply only to the specific reinforcement schedule used here, i.e., mirrored binary reward values on either side of the threshold. Different schedules, such as reward proportional to distance from threshold, absence of negative rewards, or cumulative reward (e.g., a counter), will lead to different learning profiles. The present analysis demonstrated the utility of computational methods in evaluating threshold algorithms to optimise neurofeedback learning. Future work could tackle specific neurofeedback protocols with physiologically realistic computational models to evaluate the cost-efficiency of different thresholding procedures across different reinforcement schedules.
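
Two possible continuous variants are sketched below in Python: a moving-window mean and an exponentially weighted running average. Neither was evaluated in the simulations reported here, so their learning profiles are unknown; the class names, the choice to hold the initial threshold until the window is full, and the smoothing factor `alpha` are all assumptions.

```python
from collections import deque


class MovingWindowThreshold:
    """Threshold equal to the mean of the last `window` samples."""

    def __init__(self, initial, window=1000):
        self.value = initial
        self._window = deque(maxlen=window)
        self._sum = 0.0

    def update(self, power):
        rewarded = power > self.value
        if len(self._window) == self._window.maxlen:
            self._sum -= self._window[0]   # oldest sample is about to drop out
        self._window.append(power)
        self._sum += power
        if len(self._window) == self._window.maxlen:
            # Keep the initial threshold until a full window is available.
            self.value = self._sum / len(self._window)
        return rewarded


class RunningAverageThreshold:
    """Exponentially weighted running average; needs no sample buffer."""

    def __init__(self, initial, alpha=0.001):
        self.value = initial
        self.alpha = alpha   # hypothetical smoothing factor

    def update(self, power):
        rewarded = power > self.value
        self.value += self.alpha * (power - self.value)
        return rewarded
```

Both classes update the threshold after every sample, so the lag between learning and threshold change that the block-wise procedure exhibits would be shorter and may alter the outcome.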

References

- Berger AM, Davelaar EJ. Frontal alpha oscillations and attentional control: a virtual reality neurofeedback study. Neuroscience. 2017. Doi: 10.1016/j.neuroscience.2017.06.007
- Scott WC, Kaiser D, Othmer S, Sideroff SI. Effects of an EEG biofeedback protocol on a mixed substance abusing population. Am J Drug Alcohol Ab. 2005;31(3):455-469.
- Tan G, Thornby J, Hammond DC, Strehl U, Canady B, Arnemann K, Kaiser DA. Meta-analysis of EEG biofeedback in treating epilepsy. Clin EEG Neurosci. 2009;40(3):173-179. Doi: 10.1177/155005940904000310
- Arns M, de Ridder S, Strehl U, Breteler M, Coenen A. Efficacy of neurofeedback treatment in ADHD: the effects on inattention, impulsivity and hyperactivity: a meta-analysis. Clin EEG Neurosci. 2009;40(3):180-189. Doi: 10.1177/155005940904000311
- Walker JE, Lawson R. FP02 beta training for drug-resistant depression - a new protocol that usually reduces depression and keeps it reduced. J Neurotherapy. 2013;3:198-200. Doi: 10.1080/10874208.2013.785784
- Choi SW, Chi SE, Chung SY, Kim JW, Ahn CY, Kim HT. Is alpha wave neurofeedback effective with randomized clinical trials in depression? Neuropsychobiology. 2011;63(1):43-51. Doi: 10.1159/000322290
- Othmer S, Othmer SF. Post-traumatic stress disorder – the neurofeedback remedy. Biofeedback 2009;37(1):24-31.
- Peniston EG, Kulkovsky PJ. Alpha-theta brain wave neuro-feedback for Vietnam veterans with combat-related post-traumatic stress disorder. Med Psychotherapy 1991;4:47-60.
- Gruzelier JH. EEG-neurofeedback for optimising performance. I: a review of cognitive and affective outcome in healthy participants. Neurosci Biobehav R. 2014;44:124-141. Doi: 10.1016/j.neubiorev.2013.09.015
- Gruzelier JH. EEG-neurofeedback for optimising performance. II: creativity, the performing arts and ecological validity. Neurosci Biobehav R. 2014;44:142-158. Doi: 10.1016/j.neubiorev.2013.11.004
- Gruzelier JH. EEG-neurofeedback for optimising performance. III: a review of methodological and theoretical considerations. Neurosci Biobehav R. 2014;44:159-182. Doi: 10.1016/j.neubiorev.2014.03.015
- Vernon D, Dempster T, Bazanova O, Rutterford N, Pasqualini M, Andersen S. Alpha neurofeedback training for performance enhancement: reviewing the methodology. J Neurotherapy. 2009;13:214-227. Doi: 10.1080/10874200903334397
- Birbaumer N, Ruiz S, Sitaram R. Learned regulation of brain metabolism. Trends Cogn Sci. 2013;17(6):295-302. Doi: 10.1016/j.tics.2013.04.009
- Davelaar EJ. Mechanisms of neurofeedback: a computation theoretic approach. Neuroscience. 2017. Epub 2017 Jun 9. Doi: 10.1016/j.neuroscience.2017.05.052
- Niv S. Clinical efficacy and potential mechanisms of neurofeedback. Pers Ind Diff. 2013;54:676-686.
- Ros T, Baars BJ, Lanius RA, Vuilleumier P. Tuning pathological brain oscillations with neurofeedback: a systems neuroscience framework. Front Hum Neurosci. 2014;8. Doi: 10.3389/fnhum.2014.01008
- Legenstein R, Chase SM, Schwartz AB, Maass W. A reward-modulated hebbian learning rule can explain experimentally observed network reorganization in a brain control task. J Neurosci. 2010;30(25):8400-8410. Doi: 10.1523/JNEUROSCI.4284-09.2010
