Results

A total of 172 datasets entered the analysis. These were distributed as follows: 37 HC and 50 patients had baseline data; 15 HC and 34 patients (68%) had data from session 2; 36 patients (72%) had data from session 3. All cases of dropout were related to issues with treatment, e.g., the patient was, contrary to the initial evaluation, deemed unsuitable for psychotherapy or was rejected due to insufficient attendance. Note that for HC, session 2 took place not after 10 weeks but after > 8 weeks, as HC did not participate in treatment.
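As a quick arithmetic check on the figures above, the following minimal Python sketch tallies the datasets per session and recomputes the patient retention percentages. The dictionary layout and variable names are illustrative assumptions and not part of the analysis pipeline; the zero for HC at session 3 is inferred from the text reporting only patient data for that session.

```python
# Dataset counts per session as reported in the text
# (HC = healthy comparison subjects).
counts = {
    "session 1": {"HC": 37, "patients": 50},
    "session 2": {"HC": 15, "patients": 34},
    # Assumption: no HC data at session 3, since none are reported.
    "session 3": {"HC": 0, "patients": 36},
}

# Total number of datasets across groups and sessions.
total = sum(n for session in counts.values() for n in session.values())

# Percentage of the baseline patient sample with data at each session.
baseline_patients = counts["session 1"]["patients"]
retention = {
    name: round(100 * session["patients"] / baseline_patients)
    for name, session in counts.items()
}

print(total)
print(retention)
```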

Demographics and behavioral measures

All participants were right-handed except for 5 in the HC group and 1 in the patient group, and all participants reported normal hearing. Diagnoses were distributed as follows: agoraphobia: 4; depression: 20; GAD: 5; OCD: 12; PD: 8; and SAD: 1. Seven patients (14%) received no medication, 34 patients (68%) received one type of medication and the remaining nine patients (18%) received more than one medication. In total, 86% of patients received at least one psychiatric medication, all of whom received at least one type of selective serotonin reuptake inhibitor (SSRI). No patients were treated with antipsychotic medication.
Table 2 shows demographics and behavioral measures at baseline for the two groups. There was no difference between groups in sex or age, nor in the number of Correct and Error trials in the Flanker paradigm. The patient group had significantly longer reaction times (RT) in both Correct and Error trials in the Flanker paradigm and to Target stimuli in the AO paradigm. The mean number of Error trials for each group was high, 53.6 for HC and 47.4 for patients, on average well above the recommended minimum of 17 trials for reliable (traditional) estimation of the ERN (Clayson, 2020). However, four patients had fewer than 17 remaining Error trials (5, 8, 13 and 16, respectively). Note that in WLS, parameter estimation is based on the total number of trials, which in the Flanker paradigm was well above 250 trials for all subjects. Upon visual inspection of the grand average and beta coefficient plots for these subjects, the data and parameter estimation were deemed to be of sufficient quality, in all cases showing the characteristic response-locked Flanker triphasic waveform. Nevertheless, although no empirical minimum exists, it must be noted that the more trials are available, the more precise the estimation of variables (C. Pernet et al., 2022).
>> Table 2 here <<

Psychometrics

Table 3 shows McDonald’s Omega as a measure of internal consistency as well as results from group comparisons for all self-report questionnaires at baseline.
>> Table 3 here <<
All psychopathology measures showed good internal consistency (Omega > 0.7). The patient group scored significantly higher than HC on all measures with two exceptions: Positive temperament in MEDI, where the patient group scored significantly lower than HC, and the Antagonism personality trait in PID36, where there was no significant difference between the groups.
Table 4 shows change in psychopathology measures across sessions assessed using mixed linear models with subject as random factor and baseline (Session 1) as reference (e.g., Bates et al., 2015).
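For readers unfamiliar with McDonald's Omega, the sketch below computes omega total for a single-factor model from standardized factor loadings and unique variances. It is an illustration only: the function name is hypothetical, the loadings are invented values and not data from this study, and the text does not state which software or variant of omega was used.

```python
def mcdonalds_omega(loadings, uniquenesses):
    """Omega total for a one-factor model.

    omega = (sum of loadings)^2
            / ((sum of loadings)^2 + sum of unique variances)
    """
    if len(loadings) != len(uniquenesses):
        raise ValueError("need one unique variance per loading")
    common = sum(loadings) ** 2
    return common / (common + sum(uniquenesses))


# Invented standardized loadings for a hypothetical four-item scale.
example_loadings = [0.7, 0.6, 0.8, 0.5]
# With standardized items, each unique variance is 1 - loading^2.
example_uniquenesses = [1 - loading ** 2 for loading in example_loadings]

omega = mcdonalds_omega(example_loadings, example_uniquenesses)
print(round(omega, 3))
```

With these invented loadings the scale would just pass the 0.7 threshold used above.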
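One way to write the model described above is given below. This is a sketch under the assumption of a random intercept per subject and dummy-coded sessions; the exact specification is not given in the text, and the symbols are introduced here for illustration.

```latex
y_{ij} = \beta_0 + \beta_1\,\mathrm{S2}_{ij} + \beta_2\,\mathrm{S3}_{ij} + u_i + \varepsilon_{ij},
\qquad u_i \sim \mathcal{N}(0, \sigma_u^2), \quad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
```

Here y_ij is the questionnaire score of subject i at session j, S2 and S3 are indicator variables for Sessions 2 and 3, beta_0 is the mean score at baseline, and beta_1 and beta_2 are the changes relative to baseline. In lme4 formula notation this would correspond to something like `score ~ session + (1 | subject)`, where the variable names are hypothetical.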
>> Table 4 here <<
For MEDI, treatment significantly reduced the total score and almost all of the sub scale scores. For some dimensions, e.g., Neurotic Temperament, a reduction from baseline was only significant at week 14 after the end of treatment. For others, e.g., Intrusive Cognitions, a significant and lasting reduction could be measured already at week 10. Interestingly, Positive Temperament improved significantly at 10 weeks, but the effect was no longer significant compared to baseline at week 14. Traumatic Re-experiencing was the only symptom dimension in MEDI not significantly changed by treatment.
Somewhat surprisingly, treatment did not lead to an overall improvement in PID36 total score. Among the PID36 sub scales, a significant reduction was seen only in Detachment.
Finally, both K10 and LPFS showed significant improvement due to treatment already at 10 weeks.

ERP grand average waveforms

Figures 1 to 4 show the group-wise 20% trimmed mean of mean weighted subject-level single-trial ERP data across stimulus types for all paradigms at baseline. Shaded areas indicate the 95% HDI. Well-known ERP components are marked on each plot and appear in agreement with the literature. For the UO ERPs, it is interesting to note that cMMN, the MMN in the Combined difference wave, is a mixture of fMMN and dMMN in that both the earlier-detected frequency change as well as the later-detected duration change are captured in the waveform. As such, the cMMN has two peaks.
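To illustrate the 20% trimmed mean used for the group-level waveforms, the sketch below computes it for a single channel and time point. It assumes the usual convention of dropping that proportion of the sorted values from each tail; the function name and the amplitudes are invented for illustration and do not come from the study data.

```python
import math


def trimmed_mean(values, proportion=0.2):
    """Mean after dropping `proportion` of the sorted values from each tail."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 <= proportion < 0.5:
        raise ValueError("proportion must be in [0, 0.5)")
    ordered = sorted(values)
    # Number of observations removed at each end.
    cut = math.floor(proportion * len(ordered))
    kept = ordered[cut:len(ordered) - cut]
    return sum(kept) / len(kept)


# Invented amplitudes (microvolts) for ten subjects at one time point;
# the last value mimics an outlying subject.
amplitudes = [1.0, 2.0, 2.5, 3.0, 3.0, 3.5, 4.0, 4.5, 5.0, 40.0]

print(trimmed_mean(amplitudes))           # robust estimate
print(sum(amplitudes) / len(amplitudes))  # ordinary mean, pulled up by the outlier
```

The trimmed mean is far less affected by the single outlying subject than the ordinary mean, which is the usual motivation for using it in grand averages.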

Attended oddball

>> Figure 1 here <<

Response-locked Flanker

>> Figure 2 here <<

Stimulus-locked Flanker

>> Figure 3 here <<

Unattended oddball

>> Figure 4 here <<

ERP beta coefficients

Figure 5 shows beta time courses at FCz corresponding to stimuli in each paradigm.
>> Figure 5 here <<
The constant term, in LIMO EEG called the adjusted mean, is in itself not a measure of brain activity but depends on the other beta coefficients, which are modulations around this constant term. Note that the beta coefficient to Target stimuli eliciting the P3b is more suitably plotted at Pz, where it reaches its maximum (see Figure 7 below). Note also that the Correct trial beta coefficient is positive-going even though the CRN is a negative-going wave. Finally, note that plots for the UO paradigm are difference contrasts between deviant and standard beta coefficients.
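The relationship between the constant term and the stimulus betas can be sketched as follows. This assumes a simple categorical design in which each stimulus type has its own regressor; the notation (beta_0 for the constant term, beta_k for stimulus type k) is introduced here for illustration and is not taken from the LIMO EEG documentation.

```latex
\hat{y}_{k}(c, t) = \beta_{0}(c, t) + \beta_{k}(c, t)
```

Under this reading, the modeled ERP for stimulus type k at channel c and time t is the constant term plus the modulation beta_k. A beta_k that becomes more negative therefore pulls the modeled ERP in the negative direction, regardless of the sign of beta_k itself.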

ERP correlations with psychometrics

Due to the many correlation analyses (10 ERP models × 4 psychopathology measures), we present only those for which TFCE detected large significant regions at the corrected level of 0.1%. Full results at an uncorrected TFCE of 5% are available in Supplementary materials. Because TFCE often detects many overlapping significant regions, and since we cannot know which of the voxels in each cluster are significant, we use visual inspection to describe significant regions in terms of start and end times and general cortical regions. Throughout the analysis we interpret effects on ERPs in terms of changes in amplitudes even though we cannot rule out that some effects are due to differences in peak latency rather than peak amplitude. That being said, a correlation between higher scores and changes in amplitude within a given time interval is valid irrespective of whether the effect is due to increased peak latency or reduced peak amplitude within the significant region. In addition, most significant regions span at least 50 ms, making it unlikely that the observed effects are due to increases in peak latency.

K10 Distress Scale (K10)

Figure 6 shows results for K10 assessing general psychological distress. We found several correlations between K10 and ERPs from the AO and response-locked Flanker paradigms. There were no significant correlations between K10 and ERP models from the stimulus-locked Flanker or UO paradigms at the main analysis level.
>> Figure 6 here <<
For Target stimuli in the AO paradigm, we found a large significant region from 324 ms to 696 ms covering frontal, central and parietal channels. Figure 7 shows the Target stimulus beta along with the adjusted mean at Pz.
>> Figure 7 here <<
This region clearly corresponded to the P3b, and the negative t-values at central-parietal and parietal channels, where the Target stimulus beta is more positive-going than the adjusted mean, indicated that a reduced P3b is associated with higher scores in K10.
For Standard stimuli in the AO paradigm, we found a significant region from 304 to 372 ms centered at CPz, Pz and POz. This region immediately succeeds the N2, which is commonly analyzed at more frontal regions. As such, the significant region does not correspond to a known ERP. To aid in interpretation, Figure 8 shows the Standard stimulus grand average and beta time course at Pz.
>> Figure 8 here <<
It can be seen that the significant region corresponds to a negative peak starting just after 300 ms on the grand average plot (Figure 8, left). Because the Standard stimulus beta is more negative-going than the adjusted mean in this region (Figure 8, right), the positive t-values indicate that decreased or less negative-going amplitudes in this region correlate with higher scores in K10.
For Standard stimuli, we also found a large region, likewise not corresponding to any known ERP, starting at 396 ms and ending at 588 ms and covering frontal, central and parietal channels. Figure 8 shows a negative-going slow wave at Pz, and the positive t-values indicate that less negative-going amplitudes across this wave correlate with higher scores in K10.
For Correct trials, we observed a significant region from -16 to 64 ms at central electrodes, centered at FCz, Cz and CPz. The time range and involved regions clearly corresponded to the CRN. The negative t-values indicated that higher scores in K10 correlated with a more negative-going Correct trial beta. As can be deduced from Figure 5 (top right), and because the beta coefficients are modulations around the adjusted mean, a more negative Correct trial beta translates to an increased, or more negative-going, CRN. In other words, an increased CRN correlated with higher scores in K10.
For Error trials, we observed a significant region from -20 to 56 ms, more frontal than the corresponding region for the CRN, centered at Fz, FCz and Cz. This region clearly corresponded to the ERN, and the positive t-values indicated that higher scores in K10 are associated with a reduced (less negative-going) ERN.
Finally, also for Error trials, we observed a large significant region from 284 ms to 468 ms, mainly at central-parietal and parietal regions, and with opposite effects at frontal regions. However, no obvious peak or wave is visible in either the grand average plot (Figure 2, right) or the beta coefficient time course plot for Error trials and adjusted mean (Figure 5, top right). We interpreted this region as a continuation of the Pe, and the negative t-values indicated that a reduced late part of the Pe correlates with higher scores in K10.
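The sign logic used in the interpretations above can be summarized in a small sketch. It is a simplification that ignores latency shifts and assumes that the sign of the t-values describes the direction in which the modeled waveform moves with higher questionnaire scores; the function name and argument values are hypothetical.

```python
def amplitude_effect(component_polarity, t_sign):
    """Describe how a component changes with higher questionnaire scores.

    component_polarity: "negative" for negative-going components
                        (e.g. CRN, ERN), "positive" for positive-going
                        components (e.g. P3b, Pe).
    t_sign: "negative" or "positive", the sign of the t-values in the region.
    """
    valid = {"negative", "positive"}
    if component_polarity not in valid or t_sign not in valid:
        raise ValueError("arguments must be 'negative' or 'positive'")
    # t-values with the same sign as the component push the waveform further
    # in the direction of its peak, i.e. the component gets larger.
    return "increased" if component_polarity == t_sign else "reduced"


# Cases reported for K10 in the text above.
reported = {
    "P3b": amplitude_effect("positive", "negative"),
    "CRN": amplitude_effect("negative", "negative"),
    "ERN": amplitude_effect("negative", "positive"),
    "late Pe": amplitude_effect("positive", "negative"),
}
print(reported)
```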

Level of Personality Functioning Scale (LPFS)

Figure 9 shows results for LPFS assessing severity of personality pathology. We found correlations between LPFS and ERPs from the AO, response-locked Flanker and UO paradigms. There were no significant correlations with ERPs from the stimulus-locked Flanker at the main analysis level.
>> Figure 9 here <<
For ERPs from the AO paradigm, we found correlations similar to those for K10. For Target stimuli, we found a large region from 312 ms to 688 ms covering frontal, central and parietal channels, clearly corresponding to the P3b. As for K10, a reduced (less positive-going) P3b at parietal channels correlated with increased scores in LPFS. For Standard stimuli, we found that reduced amplitudes in the same wave, covering a large region from 312 to 680 ms at frontal, central and parietal channels, correlated with increased scores in LPFS. So did the negative peak just after 300 ms described for K10 above.
For the response-locked Flanker ERPs, as for K10, an increased CRN and a reduced ERN correlated with increased scores in LPFS. However, effects for the CRN were more frontally distributed this time, with centers at Fz, FCz and Cz.
We also noted a reversal of effects, with positive t-values correlating with increased scores in parietal regions. However, we also noted that the effect was absent at parietal channels along the midline, e.g., CPz and Pz. A plot of the Correct trial beta at P6, where the effect was strong, did not reveal a CRN-like waveform. As such, results were not straightforward to interpret. It is possible that these regional effects reflected activity from the brain processes generating the CRN, which were then picked up by lateral electrodes through volume conduction. A similar but weaker effect was observed for the ERN.
For the Combined deviant difference wave from the UO paradigm, we found a significant region from 156 to 200 ms centered around Fz. This region corresponded to the cMMN, and the negative t-values indicated that an increased or more negative-going cMMN correlated with higher scores in LPFS. In addition, we found a significant region from 212 ms to 256 ms, centered around Fz and FCz. The positive t-values indicated that an increased combined deviant dP3a correlated with increased LPFS scores. Note that the d in dP3a denotes the difference wave ERP component corresponding to P3a; a dP3a elicited to one of the three deviant types is spelled out as such, e.g., frequency deviant dP3a.

Multidimensional Emotional Disorder Inventory (MEDI)

Figure 10 shows results for MEDI assessing symptom dimensions within the Internalizing spectrum.
>> Figure 10 here <<
For MEDI total scores, in a similar fashion to results for K10 and LPFS, we found that an increased CRN, a reduced ERN and a reduced P3b correlated with higher scores in MEDI. Again we found a significant region corresponding to the above-described late wave in response to Standard stimuli in the AO paradigm, but the effect was weaker. In addition, for Error trials we found a significant region from 184 ms to 388 ms centered at Cz and CPz. This region corresponded to the Pe, and because the Error trial beta is above the adjusted mean in this time interval, the negative t-values indicated that a reduced Pe correlated with higher scores in MEDI.
Because we found significant correlations for MEDI total score, we also analysed the MEDI sub scales, each assessing a specific symptom dimension within the Internalizing spectrum, at the less conservative level of 5%. These results are available in Supplementary materials.
Not surprisingly, at this less conservative level, all of the regional effects for MEDI total score described above were present and more pronounced in covering more channels and longer time frames. In addition, higher scores correlated with a reduced stimulus-locked Flanker P3b and a reduced duration deviant dP3a from the UO paradigm.
Post-hoc, then, a reduced P3b elicited to Target stimuli in the AO paradigm correlated with all of the MEDI sub scales. Similarly, a reduced ERN correlated with higher scores in all MEDI sub scales except Traumatic Re-experiencing, albeit weakly for Somatic Anxiety and Avoidance. An increased CRN correlated with higher scores in Depression and less strongly with higher scores in Avoidance, Social Anxiety and Somatic Anxiety.
As perhaps could be expected, a reversal of effects was seen for Positive temperament, which is the only dimension where healthy comparison subjects scored higher than patients (Table 3). Here we observed that, e.g., a reduced CRN and, more weakly, an increased ERN correlated with higher scores.
For Error trials in the response-locked Flanker paradigm, in addition to the results already described above, we found that a reduced Pe correlated, more or less strongly, with higher scores in the sub scales Autonomic Arousal, Avoidance, Depression (the late part of Pe), Intrusive Cognitions, Somatic Anxiety, Social Anxiety as well as Traumatic Re-experiencing (weak and the late part of Pe). In addition, an increased Pc correlated with increased scores in Social Anxiety.
Post-hoc, we also found significant effects for the ERPs in the stimulus-locked Flanker paradigm, e.g., N2 and P3b, which did not survive the conservative level of 0.1%. However, because the P3b in the stimulus-locked Flanker paradigm falls immediately before button press, it cannot be reliably analysed due to some trials overlapping. Therefore, here we consider only results for the N2. Interestingly, for the MEDI sub scales Intrusive Cognitions and Traumatic Re-experiencing, a reduced N2 correlated with higher scores, whereas for Neurotic Temperament, an increased N2 correlated with higher scores.
Post-hoc, we also found that a reduced Novelty P300 at CPz and Pz correlated with increased scores in MEDI Depression.
For ERPs from the UO paradigm, the strongest result was a correlation between MEDI Avoidance and a reduced (less positive) duration deviant dP3a and, more weakly, a reduced combined deviant dP3a. A reduced dP3a also correlated with increased scores in Intrusive Cognitions (duration deviant) and with increased scores in Traumatic Re-experiencing (both combined and duration deviants). For the MMN, correlations were quite specific in that only Social Anxiety correlated with an increased (more negative) cMMN and more weakly with an increased fMMN.
Finally, for the N1-P2-N2 complex, which is elicited to Standard, Distractor and Target stimuli in varying degrees and with different overlap and proximity of peaks, results are more difficult to analyze. Given that little is known about the properties of this wave complex in terms of, e.g., polarity-reversal, we resort to reporting that, post-hoc, we found correlations for each of these peaks with several of the MEDI sub scales (Winkler et al., 2013). The interested reader is referred to Supplementary materials. Plots of the beta coefficients at relevant channels can be supplied on request.

Modified Personality Inventory for DSM-5 and ICD-11 (PID36)

Figure 11 shows results for PID36 indexing maladaptive personality trait dimensions.
>> Figure 11 here <<
Results for PID36 largely mimicked those already described above. We found that a reduced P3b and ERN and an increased CRN correlated with higher scores. For Standard stimuli in the AO paradigm, we found the same regional effects corresponding to reductions in the negative peak just after 300 ms and the following slow negative wave in parietal regions.
As for MEDI, since we found significant correlations for the PID36 total score, post-hoc we also analysed each PID36 sub scale at the uncorrected 5% level. Again, the correlations found for PID36 total score were stronger, involving more channels and longer time ranges. A reduced P3b correlated strongly with increased scores in all sub scales except for Psychoticism. A corresponding pattern was observed for the late parietal wave elicited to Standard stimuli in the AO paradigm. Of the response-locked Flanker ERPs, an increased CRN and a decreased ERN correlated with higher scores in the same three sub scales: Anankastia, Detachment and Negative Affect. In addition, an increased CRN correlated with increased scores in Psychoticism, but only at Cz and CPz.
For Negative Affect, a reduced Pc in frontal-central regions correlated with higher scores while a reduced Pe in the same regions correlated with higher scores in Anankastia.
Also for Negative Affect, we found correlations between a reduced Novelty P300 in frontal-central regions and higher scores. We noted that, along with the results for MEDI Depression described above, this was the only significant correlation for Novelty P300 in our entire analysis.
We also observed some interesting correlations for the UO paradigm. For Anankastia, increases in dMMN and the following duration deviant dP3a, both centered around Fz and FCz, correlated with higher scores, albeit weakly. For Antagonism, reductions in cMMN (although only at CPz and Pz) and dMMN and the following duration deviant dP3a correlated with higher scores. For Detachment, increased cMMN at Fz correlated with higher scores. For Disinhibition, increases in all three MMN measures as well as decreases in all three corresponding dP3a measures correlated with higher scores. Finally, of opposite direction, reductions in cMMN and fMMN correlated with higher scores in Negative Affect.
Beyond these main results, as in the post-hoc analysis of the MEDI sub scales described above, several correlations involving the N1-P2-N2 complex were observed. Again we invite the reader to study the results in the Supplementary materials.