Abstract
Alterations of mismatch responses (ie, neural activity evoked by unexpected stimuli) are often considered a potential biomarker of schizophrenia. Going beyond establishing the type of observed alterations found in diagnosed patients and related cohorts, computational methods can yield valuable insights into the underlying disruptions of neural mechanisms and cognitive function. Here, we adopt a typology of model-based approaches from computational cognitive neuroscience, providing an overview of the study of mismatch responses and their alterations in schizophrenia from four complementary perspectives: (a) connectivity models, (b) decoding models, (c) neural network models, and (d) cognitive models. Connectivity models aim at inferring the effective connectivity patterns between brain regions that may underlie mismatch responses measured at the sensor level. Decoding models use multivariate spatiotemporal mismatch response patterns to infer the type of sensory violations or to classify participants based on their diagnosis. Neural network models such as deep convolutional neural networks can be used for improved classification performance as well as for a systematic study of various aspects of empirical data. Finally, cognitive models quantify mismatch responses in terms of signaling and updating perceptual predictions over time. In addition to describing the available methodology and reviewing the results of recent computational psychiatry studies, we offer suggestions for future work applying model-based techniques to advance the study of mismatch responses in schizophrenia.
Introduction
Efficient perceptual processing depends on detecting unexpected changes in the sensory environment. For example, when a surprising stimulus follows a sequence of expected ones, macroscopic neural measurements like the electroencephalogram (EEG) yield discernible responses signaling the mismatch between previous sensory regularities and present stimuli. One well-studied neural correlate of this deviance detection in the auditory system is the mismatch negativity (MMN). It is quantified as the difference in response waveforms between deviants versus standards. MMN is observed in classic oddball paradigms 1 where standards are repeated many times, 2 and in complex stimulus sequences where stimulus expectation is based on context rather than repetition. 3 In cognitive and computational terms, it has been linked to prediction error (PE) signaling, resulting from a comparison between the brain's predictions and incoming stimuli.2,4 Besides MMN, which occurs around 100 to 200 ms after stimulus onset and is modulated by attention only in some conditions,5,6 another widely studied mismatch response (MMR) is the P3 component 7 of the event-related potential (ERP), occurring around 250 to 500 ms after stimulus onset and more influenced by attention. 8 P3 is traditionally linked to attentional switches in response to salient stimulus features but more recent studies have suggested that MMN and P3 are two stages of a predictive hierarchy 9 and cross-modal evidence of mismatch processing has been found in the P3 time range. 10
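As a minimal illustration of the difference-wave definition above, the following sketch averages single-trial epochs per condition and subtracts the standard ERP from the deviant ERP. The function names and the toy amplitudes are hypothetical and chosen only for illustration; real pipelines would additionally involve filtering, baseline correction, and artifact rejection.

```python
from statistics import mean

def average_erp(trials):
    """Average equal-length single-trial epochs sample by sample."""
    return [mean(samples) for samples in zip(*trials)]

def difference_wave(deviant_trials, standard_trials):
    """Deviant-minus-standard difference wave (one value per time sample)."""
    deviant_erp = average_erp(deviant_trials)
    standard_erp = average_erp(standard_trials)
    return [d - s for d, s in zip(deviant_erp, standard_erp)]

# Hypothetical toy epochs (microvolts) at one electrode, 4 time samples each.
standards = [[0.0, -1.0, -0.5, 0.2], [0.2, -1.2, -0.3, 0.0]]
deviants = [[0.1, -3.0, -1.5, 0.1], [0.1, -3.4, -1.3, 0.3]]

mmn = difference_wave(deviants, standards)
# Index of the most negative sample, a crude stand-in for MMN peak latency.
peak_index = min(range(len(mmn)), key=lambda i: mmn[i])
```

In this toy case, the most negative deflection of the difference wave falls on the second time sample.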
MMRs are considered potential schizophrenia biomarkers 11 mainly due to established MMN reduction in patients, 12 in addition to more nuanced effects on P3 amplitude. 13 Importantly, MMN reduction is thought to reflect underlying neurobiological impairments, such as glutamatergic N-methyl-
Interest in schizophrenia biomarkers has fueled the growth of computational psychiatry,26–28 which employs computational models to uncover the mechanisms of psychiatric disorders. These biomarkers can relate to diagnosis (indicating disease presence), prognosis (assessing the likely course of the disease), and risk (estimating the chances of being diagnosed), among other clinical aspects. 11 Beyond biomarker identification, computational psychiatry aims to elucidate schizophrenia pathophysiology and inter-individual differences for personalized treatments. Here, we adopt the typology of computational approaches to neuroscientific data outlined in a foundational review of computational cognitive neuroscience, 29 including two main types: “from data to theory” and “from theory to data” (Figure 1).

Overview of modeling techniques. In applying computational models to the study of MMRs in schizophrenia, two broadly complementary approaches can be used. Going from empirical data to theoretical/computational models, computational approaches include connectivity modeling (inferring effective connectivity patterns mediating mismatch signaling in patients and controls) and decoding techniques (eg, allowing the classification of study participants into patients and controls based on multivariate MMR features). Going from models to empirical data, computational approaches include neural networks (which, similar to decoding, can allow classifying study participants, but also encompass eg MMR simulation studies) and cognitive models (typically quantifying stimulus sequences in computational terms based on eg probability and surprise).
Approaches going from data to theory are largely data-driven, make few assumptions about the underlying cognitive processes (such as change detection), and often (but not always) rely on standard statistical techniques such as regression and classification. These approaches include (a) models of connectivity and dynamics, which in the context of MMRs aim at elucidating effective connectivity patterns between distinct (sub)-cortical regions underlying the measured signals; and (b) decoding and representational models, which use multivariate pattern analysis (MVPA) and classification techniques to uncover the information present in population responses such as MMRs. Models of connectivity and dynamics are primarily distinguished from decoding and representational models not in terms of mathematical or statistical techniques, but rather in terms of the research questions they are used to elucidate. Connectivity models can provide more mechanistic insights into putative neurophysiological impairments, related to impaired connectivity between brain regions or neuromodulation within distinct neural populations, which may underlie altered MMRs. Conversely, decoding models can be adapted to complex stimulus sequences and experimental paradigms, going beyond simple differences in MMRs between clinical populations and inferring more fine-grained representational contents of MMRs.
In contrast, approaches going from theory to data typically encompass explicit assumptions about the underlying cognitive processes, and often rely on more complex computational modeling techniques. These approaches include (c) neural network models, comprising neurobiologically inspired but relatively abstract connectionist models mapping inputs to outputs, which can take the form of mapping stimulus inputs to neural (eg, MMR) outputs, or neural inputs to clinically relevant (eg, diagnostic or prognostic) outputs; and (d) cognitive models, which aim at understanding the theoretical and algorithmic properties governing trial-by-trial MMR dynamics. While both aim at reproducing cognitive phenomena related to change detection, these models differ in terms of levels of abstraction and complexity: neural network models are composed of a large number of relatively simple building blocks (artificial “neurons”) performing basic operations, while cognitive models have fewer components which perform more abstract operations (eg, quantifying information-theoretical quantities related to surprise). They also differ in terms of the research questions they are typically used to answer. While neural network models can be used similarly to decoding models, they can also be used to simulate altered MMRs following different manipulations of network parameters. Conversely, cognitive models can be used to infer eg learning impairments which may subserve MMR alterations in clinical populations.
We will now turn to presenting a survey of each of these four types of model-based approaches—both in the broader context of investigating MMRs, and in the more specific context of the role that MMRs may play in schizophrenia.
Connectivity Models
As mentioned earlier, MMN reduction in schizophrenia has been putatively linked to underlying NMDAR hypofunction. 14 NMDAR hypofunction is also central to the influential “dysconnection” hypothesis of schizophrenia,30,31 suggesting abnormal interactions between NMDAR function and other neuromodulatory systems. These interactions lead to aberrant connectivity patterns, including extrinsic (long-range ascending and descending projections) and intrinsic (local synaptic gain control) connectivity. Importantly for the interpretation of MMN in terms of PE responses, 2 different types of connections have been linked to predictive processing. Descending connections are thought to mediate predictions, ascending connections subserve PEs, while gain control scales their precision. Identifying connectivity patterns which may be impaired in schizophrenia not only has neurobiological relevance, but can also relate to cognitive symptoms.
Among a myriad of connectivity estimation techniques applicable to EEG32,33 (for fMRI applications, see elsewhere34,35), we will focus primarily on model-based effective connectivity analysis (as opposed to model-free and/or functional connectivity such as phase locking value, in which connections are symmetric 36 ) based on time-domain data such as ERPs (as opposed to frequency-domain). In particular, dynamic causal modeling (DCM) is a hypothesis-driven, model-based effective connectivity approach 37 aspiring to biological realism by modeling interactions between excitatory and inhibitory cells within and between brain regions. Models are typically fitted to individual participants' data, yielding connectivity estimates which maximize model accuracy (goodness of fit) while minimizing complexity (preventing overfitting). Connectivity parameters are compared between experimental manipulations (eg, deviants vs standards) or participant groups (eg, patients vs controls) to identify connections sensitive to these factors.5,19 DCM has often been applied to EEG or MEG (magnetoencephalographic) data acquired in oddball sequences to uncover connections modulated by stimulus deviance.38,39 This approach established a standard model of auditory oddball processing, including bilateral primary auditory cortices (A1), superior temporal gyri (STG), and inferior frontal gyri (IFG), which has since been applied in many MMN studies,40–43 including psychopharmacological manipulations19,44 and clinical groups.45,46 Below, we review such applications in the context of schizophrenia (Figure 2). For more comprehensive connectivity-related results, please refer to recent reviews on early-stage schizophrenia, at-risk mental states, 47 and the ketamine model of NMDAR hypofunction. 48

Qualitative overview of effective connectivity results. The graph shows the reported modulatory effects of Sz-relevant groups versus controls, mapped onto the widely used dynamic causal model (DCM) of auditory mismatch responses. The most consistent effects include IFG and A1 disinhibition, right STG inhibition, as well as increased right-to-left STG connectivity in patients or related groups. While the reviewed studies show qualitatively heterogeneous results (largely dependent on the investigated cohort and paradigm), please note that this overview should not be interpreted as a direct comparison of posterior parameter estimates between studies, as different studies may select different winning models. Sz: schizophrenia diagnosis; FEP: first-episode psychosis; CAPE: community assessment of psychic experiences (quantifying psychotic-like experiences); inpat.: inpatients without psychosis. In case of multiple groups investigated, asterisk denotes the modulatory effect on connectivity associated with membership of a specific group.
Connectivity models applied to schizophrenia research cover various paradigms, such as classic and multiple oddball,49–51 roving oddball,52–56 and more complex sequences.57–59 In a classic duration oddball study, 60 patients showed increased temporal-to-prefrontal ascending connectivity and reduced prefrontal cross-hemispheric connectivity, quantified using partial directed coherence. The latter finding was linked to greater negative symptom severity. However, DCM studies on duration and frequency deviants61–63 revealed different results, indicating impaired intrinsic (self-)connectivity with putative links to abnormal neuromodulation. 19 This is likely due to unique properties of DCM, which can distinguish between extrinsic and intrinsic connectivity. The locus of intrinsic connectivity modulation varied across studies. An earlier report 62 suggested that patients and (to a lesser degree) unaffected relatives showed increased prefrontal self-inhibition (interpreted as reflecting NMDAR abnormalities). This finding was recently replicated, 63 but additionally positive auditory symptoms were linked to A1 disinhibition. However, another recent study 61 linked prefrontal connectivity to chronic psychosis, whereas first-episode patients showed localized connectivity changes in the left A1. Finally, a multiple-deviant study has compared children at familial risk of schizophrenia or bipolar disorder to age-matched controls. 51 Both at-risk cohorts showed impaired connectivity in A1 and stronger forward connectivity to the prefrontal cortex. Notably, children at risk of schizophrenia exhibited different connectivity patterns than those at risk of bipolar disorder, including impaired intrinsic STG and prefrontal connectivity, as well as weaker extrinsic connectivity to the STG. 
Based on these studies, MMN alterations in schizophrenia and at-risk populations may stem from disrupted intrinsic (and, to a lesser extent, extrinsic) connectivity in different cortical regions, influenced by disease progression stage and symptom severity.
DCM studies on the roving oddball paradigm showed diverse findings depending on the studied cohort. One MEG study in adolescents with schizophrenia 52 revealed altered cross-hemispheric connectivity, emphasizing stronger right-to-left connections between bilateral STG; however, intrinsic connectivity was not analyzed. Another MEG study 53 with a small sample of 14 schizophrenia patients found reduced self-connectivity in the (right) A1 and increased descending frontotemporal connectivity. The same subset of connections was found in a study involving young nonpsychotic 22q11.2 deletion carriers 54 at genetic risk for schizophrenia. Here, however, both connectivity parameters were nominally reduced in at-risk individuals, albeit the effects did not survive correction for multiple comparisons. Finally, a ketamine study 56 linked NMDAR blockade to reduced intrinsic inhibition in prefrontal regions. Overall, these studies predominantly identified changes to intrinsic and descending connectivity, although their specific pattern varied across groups (diagnosed schizophrenia, genetic risk, or pharmacologically induced NMDAR hypofunction).
In complex MMR-evoking sequences where stimulus probabilities change over time, studies have shown that neural responses in healthy volunteers can be influenced by whether sounds are initially perceived as standards or deviants, and how rapidly this assignment changes. 64 In research involving patients with schizophrenia, nonpsychotic inpatients, and healthy controls, it was found that schizophrenia diagnosis was associated with reduced MMN in stable sequences and P3 in volatile sequences. 58 DCM linked these reductions to decreased intrinsic connectivity in the left A1 and right IFG. Additionally, symptom severity correlated with changes in frontotemporal connectivity. However, in a study involving undiagnosed individuals with subclinical psychotic-like experiences, DCM of EEG data did not predict prodromal scores, unlike raw data features based on ERPs. 59 Finally, another study compared diagnosed schizophrenia patients with nonpsychotic inpatients and healthy controls, using a stochastic oddball paradigm. 57 Here, both inpatient groups showed decreased intrinsic connectivity in the left A1 and descending frontotemporal connectivity. However, schizophrenia patients differed from nonpsychotic inpatients primarily in cross-hemispheric connections across the cortical hierarchy, which also correlated with psychotic-like symptom severity. It should be noted that these connections were distinct from the connections found in a study from the same group using reversal sequences. 58 Given such heterogeneous findings across participant cohorts and stimulus sequences, the pattern of connectivity alterations in patients with schizophrenia and related groups remains elusive, and the field would benefit from independent replications of the reported findings.
Decoding Models
In contrast to connectivity models which characterize the interactions between regions, decoding models can reveal the information present in a region's population activity. As a type of MVPA, decoding models allow researchers to exploit the fine-grained multivariate information present in most neuroimaging and electrophysiological data. Decoding and other types of MVPA have helped uncover the content of regional representations of the brain, which adds a functional interpretation of brain activity. 29 Decoding and encoding models can be conceptualized as the inverse of one another. In short, decoding models take data and use it to build a model of representations, whereas encoding models begin with the model and use it to predict the data. Decoding models such as support vector machines (SVM), especially in combination with representational similarity analysis (RSA), have proved invaluable in other areas of research, such as perception and learning. 65 RSA stipulates that stimuli with more distinct neural representations are easier to decode; thus, representational similarity can be indexed by the degree of decodability. The decodability of all possible pairwise combinations of stimuli is compared in a representational dissimilarity matrix (RDM). 66 In contrast, SVM attempts to identify a reproducible hyperplane that maximizes the distance between two categories. 67 Despite its success in other areas, MVPA is severely underutilized in MMN and schizophrenia research, both separately and combined. One study assessed MMN in healthy participants using RSA on a roving oddball paradigm. 68 The researchers found that acoustic features were decodable from the topography of MMRs, although at later latencies than typical MMN. This approach could be highly useful for, and easily translated to, a population with schizophrenia.
By decoding stimulus features from the MMN and comparing the resulting decoding accuracies between patients with schizophrenia and healthy controls, it may be possible to tell if MMRs are sensitive to different stimulus features in schizophrenia than in healthy controls.
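To make the notion of an RDM concrete, the following sketch builds one from multivariate response patterns. It is only an illustration under stated assumptions: dissimilarity is computed as 1 minus the Pearson correlation between patterns (a common alternative to the pairwise decoding accuracy described above), and the condition labels and channel values are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length response patterns."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def rdm(patterns):
    """Map every pair of condition labels to 1 - r (correlation distance)."""
    labels = list(patterns)
    return {
        (a, b): 1.0 - pearson(patterns[a], patterns[b])
        for a in labels
        for b in labels
    }

# Hypothetical MMR topographies (one amplitude per channel) per condition.
patterns = {
    "standard": [1.0, 2.0, 3.0, 4.0],
    "frequency_deviant": [1.0, 2.5, 2.5, 4.5],
    "duration_deviant": [4.0, 3.0, 2.0, 1.0],
}
matrix = rdm(patterns)
```

An RDM computed in this way for each participant group could then be compared between groups, or against model-derived RDMs, using rank correlations over the off-diagonal entries.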
Alternatively, instead of decoding stimulus features from MMRs, MVPA can be applied to MMRs to decode schizophrenia diagnosis (Figure 3A). For instance, a previous study applied SVM to fMRI-derived MMR, comparing patients with schizophrenia to healthy controls in an auditory mismatch task. 49 SVM was applied to two types of data features: MMR-related brain activation patterns across multiple fMRI voxels, and functional connectivity measured by a correlation analysis across the whole brain. The activation-based features showed 83% participant classification accuracy already within four regions of interest (ROI), while the functional connectivity dataset performed similarly or worse for up to 10 ROIs. However, inclusion of up to 24 ROIs in the functional connectivity dataset reached a maximal accuracy of 90%. In short, the study found that the inclusion of functional connectivity measures across the distributed networks yielded higher classification accuracy. Thus, MVPA methods capitalized on altered functional connectivity to infer the presence of schizophrenia diagnosis, which is consistent with previous research and outlines MMR's potential as a promising diagnostic biomarker of core impairments in schizophrenia. 30 Another study 69 also used a decoding model to assess MMN as a diagnostic biomarker of schizophrenia. Using a similar experimental paradigm, this study measured EEG activity in patients with schizophrenia and healthy controls who listened to different types of oddball sequences. Oddball stimuli were based on three physical aspects: stimulus duration, aural gap, and interstimulus interval. Two separate MVPA models were tested: SVM and a Gaussian process classifier (GPC). These models achieved accuracies of up to 80%, with the best performing models being generated via the GPC in response to a gap stimulus paradigm.
Global functioning scores predicted by the model were shown to have a 73% correlation with true scores, providing additional evidence for MMN responses as a diagnostic biomarker of schizophrenia and symptom severity.

Possible applications of decoding and neural network models. (A) Example of SVM application to decode schizophrenia diagnosis. By applying SVM to a multivariate set of MMR features (eg, EEG amplitudes in the MMR window vs a later time window), it is possible to classify participants based on diagnosis. The SVM creates a hyperplane which separates the data into classes with up to 90-98% accuracy. (B) Possible application of DNN models in MMR/schizophrenia research. Two recurrent neural network (RNN) models are created, with their mechanisms altered in a fashion that represents a given hypothesis in schizophrenia research (In this example: schizophrenia patients exhibit impaired top-down feedback). Then, neuroconnectionist methods (here: RSA) are used to compare the dynamics or representations between the model variations and neurophysiological data from healthy and schizophrenia-diagnosed participants. If the representational dissimilarity matrix (RDM) of the hypothesis-altered model better fits the RDM of schizophrenia patients than the RDM of the standard model (and vice versa for healthy participants), the altered underlying mechanism can be taken as a better model for the corresponding neural mechanisms of schizophrenia.
Neural Network Models
As an alternative to SVM and other classical machine learning decoding models, deep neural network (DNN) models are gaining popularity for schizophrenia-related decoding tasks, especially for automatically classifying participants based on their diagnosis. As mentioned above, MMR-related features can be used as a practical basis for classification, with their employment usually increasing the distinguishability between patients and healthy individuals (relative to other data features).70,71 This holds for detection based on classical machine learning models69,72–78 (see decoding models) as well as DNN models.79,80 While DNN models might still rely on MMR-related variables of the underlying neuroimaging data, they do not necessarily rely on the heuristic preprocessing of MMR-related features, since they are usually capable of extracting classification-relevant information directly from the raw data.79–85 Accordingly, DNN models have the advantage of classifying based on intricate and complex markers of schizophrenia which generally yield a higher accuracy of detecting the underlying trait than classical machine learning models, 70 often scoring as high as 90% to 98%.71,80,86–91 However, this results in a decrease in interpretability (a consequence of high parameter count in most DNNs and the resulting “black box” effect). While this accuracy-interpretability tradeoff might be acceptable for classification, interpretability becomes challenging when using DNN as scientific models of the brain.
Using DNN as scientific models is a relatively new but quickly expanding field in neuroscience, which has led to the emergence of the neuroconnectionist research program. 92 Its main mode of operation is to implement different neural network models (altered in a way to account for a given neuroscientific hypothesis), which are used to encode, decode, or replicate various aspects of empirical neurophysiological responses. This approach can include comparing behavioral, activational, or representational data features, but can also include additional methods like in silico lesion studies, which investigate the influence of altering DNN parameters on simulated responses. An overview of the rationale and analysis techniques can be found elsewhere. 92 The neuroconnectionist research program has mostly gained traction in vision neuroscience.93–97 However, the same principles can be applied for the investigation of MMR98–101 and schizophrenia. 102 Cortez-Briones et al 71 give an extensive overview of previous studies using DNN models in schizophrenia research. While to date, there are few studies applying this approach to MMRs, the flexibility of DNN models makes them a very promising method for future research. A potential application of DNN models in MMR/schizophrenia research is illustrated in Figure 3B.
Beyond DNNs, other neural network models include attractor-based models, like Hopfield networks 103 or spiking neural networks (SNN). 104 These models aim to simulate certain physical attributes of the brain, and have been employed for neural simulations for several decades. Attractor-based networks have been used to model MMR in general,105,106 as well as in connection with schizophrenia research (for a review, see Ref. 107 ). As an example, SNNs have been used to simulate neural noise levels and link the alteration of NMDA, gamma-aminobutyric acid (GABA) and dopamine (DA) receptors to increased signal-to-noise ratio and impaired MMR, pointing out similarities to neurophysiological data of schizophrenia patients. 108 The main advantage of using attractor-based models is their depth of biological realism, which is often used to model cell-level dynamics such as excitability and neurotransmitter responses. While until recently, the strength of attractor-based neural networks was usually restricted to purely simulatory approaches, recent advances have enabled the training of these models for decision and perception tasks. This raises the possibility that these types of neural networks, similarly to DNNs, can be used as comprehensive models of cognitive functions and their disruption in neuropsychiatric disorders.109,110
Cognitive Models
In recent years, the study of MMRs has shifted from average-based analyses to single-trial modeling, capturing more intricate brain response dynamics. MMRs such as MMN and P3 have been proposed to reflect a neuronal expression of error or mismatch between current sensory inputs and those predicted under the brain's generative model.4,111,112 Thus, MMRs may provide information about the brain's inference process on the environment's statistical structure. 3 Given this sensitivity, MMRs should depend on gradual changes in input statistics, necessitating the analysis of trial-by-trial dynamics. Consequently, a growing area of research has employed computational models, particularly Bayesian observer models, to examine single-trial EEG dynamics of MMN in audition,113–117 vision118,119 and somatosensation,10,120,121 and P3 across different senses.116,120–128
Bayesian observer models can incorporate sequential information to infer the probability of new observations (Figure 4A). Given the evidence pointing towards a probabilistic inference-based model of psychosis,28,129,130 dissecting various aspects of sequential inference and learning could provide insights into differences between healthy and clinical populations. At the level of probabilistic inference, models might differentiate between quantifying (a) stimulus probability,115,122,128 typical for classic oddball paradigms where deviants are defined by low probability4,131; (b) repetition/alternation probability,132–134 given the sensitivity of MMN to (un)expected stimulus repetitions,2,135,136 which may be disrupted in schizophrenia137,138; and (c) transition probability,10,116,121,127,139 typical for Markovian stimulus sequences 140 and considered essential for probabilistic sequence processing. 139 These probabilistic quantities, estimated by the models, are used with different read-out functions to connect model dynamics to brain data, highlighting different aspects of the inference process reflected in MMRs. 115 The read-out functions include (a) novelty detection,123,124,127,128 indicating MMR sensitivity to the degree of surprise following new observations115,141–143 and quantified using information-theoretical surprise measures 144 ; (b) belief commitment,145–149 whereby probabilistic inference is weighted by prediction confidence, quantified as confidence-corrected surprise10,121,150,151; and (c) model update,10,115,120,121,123,128,152 indicating the degree to which new observations trigger adjustments of the generative model,115,153,154 quantified as Bayesian surprise.155,156 Recent work yields evidence that earlier MMRs (eg, MMN) may be more related to belief commitment, while later MMRs (eg, P3) rather reflect model updates.10,121
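Two of the read-out functions described above can be sketched for the simplest case of stimulus probability tracking. The example assumes a Beta-Bernoulli observer with integer pseudo-counts (an assumption made here for simplicity, not a description of any specific cited model) and computes, per trial, Shannon surprise (novelty detection) and Bayesian surprise (model update, taken as the Kullback-Leibler divergence from prior to posterior). Confidence-corrected surprise is not implemented, and all function and variable names are hypothetical.

```python
from math import log

def observe(sequence, prior_counts=(1, 1)):
    """Track P(deviant) with a Beta-Bernoulli observer.

    sequence: iterable of 0 (standard) or 1 (deviant).
    prior_counts: integer pseudo-counts (deviant, standard).
    Returns one (shannon_surprise, bayesian_surprise) pair per trial, in nats.
    """
    a, b = prior_counts
    results = []
    for stimulus in sequence:
        # Predictive probability of the stimulus that actually occurred.
        k = a if stimulus == 1 else b
        shannon = -log(k / (a + b))
        # KL(posterior || prior) between Beta beliefs after one observation:
        # log((a+b)/k) + psi(k+1) - psi(a+b+1). For integer counts the
        # digamma difference reduces to a finite harmonic sum.
        harmonic = sum(1.0 / j for j in range(k + 1, a + b + 1))
        bayesian = log((a + b) / k) - harmonic
        results.append((shannon, bayesian))
        # Update the belief with the new observation.
        if stimulus == 1:
            a += 1
        else:
            b += 1
    return results

# Toy oddball sequence: eight standards, one deviant, one standard.
trials = [0] * 8 + [1] + [0]
surprise = observe(trials)
```

In this toy sequence, both measures are larger for the deviant than for the preceding standard, but they scale differently with accumulated evidence, which is one reason the read-outs can be dissociated in single-trial analyses.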

Cognitive models. (A) Observation probabilities include probabilistic quantities related to stimulus occurrence, alternation/repetition, and transitions between stimuli. These probabilistic quantities are subject to different read-out functions based on surprise. (B) Modeling MMRs using the HGF indicated that the MMN and P3 can be mapped onto different hierarchical levels of predictions and PEs. The directed graph shows a typical HGF architecture, tracking probability estimates over time. The highest level relates to volatility estimates and has been linked to the P3, while the lower level relates to transition probability estimates and has been linked to the MMN.117,157 Both levels have been shown to be altered in schizophrenia.158,159
A widely used observer model in computational psychiatry117,159–165 is the hierarchical Gaussian filter (HGF),166,167 which models neural or behavioral responses as PEs scaled by prediction precision (inverse variance of belief distribution). These precision-weighted PEs are used to update predictions of the next observation, with multiple hierarchically organized levels where each level sends predictions to the lower level, and resulting PEs are sent back up the hierarchy. Lower-level PEs typically track stimulus transition probabilities, while higher-level PEs track changes in these probabilities over time. Thus, the HGF excels in modeling probabilistic inference and learning of dynamic stimulus sequences with changing statistical features.168,169 The HGF is also versatile, as it can capture behavioral and neural responses across a wide range of tasks,117,118,157,159,169–171 and aligns with normative theories of brain function based on probabilistic inference such as hierarchical predictive coding.172,173
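The core idea of scaling PEs by precision can be illustrated with a single Gaussian belief update. The sketch below is a deliberate simplification and not the HGF itself: it has one level, a fixed observation precision, and no volatility coupling between levels. All names and values are hypothetical.

```python
def precision_weighted_update(mu, pi, observation, pi_obs):
    """Update a Gaussian belief after one observation.

    mu, pi: mean and precision (inverse variance) of the current belief.
    pi_obs: precision of the observation.
    Returns (posterior mean, posterior precision, prediction error).
    """
    prediction_error = observation - mu
    pi_post = pi + pi_obs
    # The PE is weighted by the ratio of observation to posterior precision.
    learning_rate = pi_obs / pi_post
    mu_post = mu + learning_rate * prediction_error
    return mu_post, pi_post, prediction_error

# The same PE moves an uncertain belief more than a confident one.
uncertain = precision_weighted_update(mu=0.0, pi=1.0, observation=2.0, pi_obs=1.0)
confident = precision_weighted_update(mu=0.0, pi=9.0, observation=2.0, pi_obs=1.0)
```

In the full model, higher-level volatility estimates modulate the precision of lower-level beliefs, so the effective learning rate changes over time rather than shrinking monotonically as in this static case.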
The HGF has been used in studies investigating neurocomputational mechanisms mediating schizophrenia and related conditions. In a ketamine study of the roving oddball paradigm, 117 stimulus sequences were modeled to generate lower- versus higher-level PEs. Similar to findings from surprise-based studies mentioned earlier, this research indicated that MMN reflects lower-level PEs about stimulus transitions, while P3 reflects higher-level PEs regarding their volatility (ie, how fast these transitions change), used to update estimates of environmental statistics (Figure 4B). Importantly, the study found that ketamine reduced high-level PE signaling, suggesting that NMDAR antagonism impairs probabilistic inference related to abstract statistical regularities (but see a recent study showing low-level effects of ketamine 174 ). Another study, using cholinergic antagonist biperiden and DA antagonist amisulpride (and respective agonists), 157 replicated the finding that lower- versus higher-level PEs can be mapped onto MMN versus P3. Furthermore, the study found that biperiden (compared to placebo) decreased the correlation of EEG amplitudes with low-level PEs, but increased the correlation with high-level (volatility) PEs, suggesting a different effect of cholinergic antagonism than the NMDAR modulation described earlier. 117 No other drug effects were identified in this study.
In recent studies, the HGF has been directly applied to investigate schizophrenia. 175 One study of MMN responses 158 compared early-stage patients with individuals at clinical risk and with healthy controls. Both low-level (sensory) and high-level (volatility) PEs were altered in patients and those at risk, compared to controls. Furthermore, low-level PEs could predict the conversion to psychosis in at-risk individuals. Another study, using fMRI data to model PE signals, involved participants with diagnosed schizophrenia with varying levels of delusions and healthy participants with varying delusional-like ideation. 159 Participants engaged in a reversal learning task with volatile (ie, frequently changing) stimulus contingencies. Delusions (regardless of diagnosis) were associated with increased precision-weighted PE-related neural activation in fronto-striatal regions. In contrast, schizophrenia diagnosis (independent of delusion strength) was associated with overestimated environmental volatility and weaker neural correlates of volatility in the anterior insula, medial frontal, and angular gyrus. This suggests that schizophrenia may be associated with false beliefs regarding environmental volatility and their impaired neural encoding. However, because patients in the latter study were older on average than in the former study (mean >30 vs <25 years), 158 the differing results may indicate that model-based correlates of schizophrenia change during disease progression, moving from lower to higher levels of the processing hierarchy, as previously hypothesized. 129
Limitations of Statistical and Machine Learning Models in Computational Psychiatry
When using statistical and machine learning tools to infer brain function, it is important to acknowledge that these models are strongly driven by the underlying data and can overfit if not validated properly. This is especially true for highly flexible models, such as nonlinear SVMs or DNNs. When interpreting or devising studies employing such models, it is therefore crucial to keep in mind that their generalizability extends only as far as the underlying sample of subjects and the validation procedure allow. This is especially relevant in clinical group prediction, where the implied claim of a statistical or machine learning approach is to provide criteria that inform on the condition or treatment of new (ie, out-of-distribution) subjects or measurements. A good example is clinical versus non-clinical group classification: in many such studies, models are fitted and evaluated on a very narrow dataset, often on a subject-level basis or with both training and validation data recorded in the same measurement session. A (usually very high) average decoding accuracy is then reported. This average accuracy does not necessarily imply that the models can perform out-of-distribution classification (eg, on new subjects) with similar accuracy, since the neural and behavioral patterns on which they are based can differ vastly between subjects and measurements.
Indeed, several studies have shown that many proposed procedures in schizophrenia prediction176–178 and other clinical predictions179,180 fail to generalize across different contexts, such as across subjects or across time ranges. Accordingly, when using statistical and machine learning approaches for clinical modeling, it is important to ensure the validity of the approach by carefully considering factors such as sample size, sample heterogeneity, and temporal stability, and by employing additional methods such as feature reduction or regularization178,181,182 to avoid overfitting the models to limited datasets.
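One practical safeguard against the subject-level leakage described above is to split data by subject rather than by trial, so that no participant contributes to both the training and the validation set. The following is a minimal sketch under assumed conditions: each trial is labeled with a subject identifier, and the function name `subject_wise_folds` and the example labels are hypothetical.

```python
def subject_wise_folds(subject_ids, n_folds):
    """Yield (train_idx, test_idx) pairs in which no subject spans both sets.

    subject_ids: one subject label per trial (or per recording).
    """
    unique_subjects = sorted(set(subject_ids))
    if n_folds < 2 or n_folds > len(unique_subjects):
        raise ValueError("n_folds must be between 2 and the number of subjects")
    # Assign whole subjects (not single trials) to folds, round-robin.
    fold_of_subject = {s: i % n_folds for i, s in enumerate(unique_subjects)}
    for fold in range(n_folds):
        test_idx = [i for i, s in enumerate(subject_ids) if fold_of_subject[s] == fold]
        train_idx = [i for i, s in enumerate(subject_ids) if fold_of_subject[s] != fold]
        yield train_idx, test_idx


# Hypothetical layout: two trials from each of four subjects.
trial_subjects = ["s1", "s1", "s2", "s2", "s3", "s3", "s4", "s4"]
for train_idx, test_idx in subject_wise_folds(trial_subjects, n_folds=2):
    train_subjects = {trial_subjects[i] for i in train_idx}
    test_subjects = {trial_subjects[i] for i in test_idx}
    # Held-out subjects never appear in the training set.
    assert train_subjects.isdisjoint(test_subjects)
    print("train:", sorted(train_subjects), "test:", sorted(test_subjects))
```

Accuracy estimated on such held-out subjects speaks to generalization across individuals, but not across sites, sessions, or time ranges, which would require correspondingly grouped splits. Libraries such as scikit-learn provide group-aware splitters (eg, `GroupKFold`) that serve the same purpose.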
Conclusions and Future Directions
Applying computational models to study the role of MMR in schizophrenia can yield several types of insights. First, connectivity models can help infer the putative neurophysiological mechanisms of MMR reduction (eg, disrupted long-range connectivity and/or local gain control). Since connectivity parameters (based on memory-related fMRI activity) have been demonstrated to outperform direct measurements of behavioral or neural data in schizophrenia subgroup definition, 183 adopting a similar approach to MMRs could be equally promising. However, MMR-based connectivity studies so far have yielded heterogeneous results (Figure 2), and thus more systematic replications are required.
Second, decoding models and DNNs can help establish MMRs as diagnostic/risk biomarkers (via classification methods), but also simulate the mechanisms of MMR disruption (via neuroconnectionist approaches). These computational approaches may benefit from a higher sensitivity to subtle neural signatures of mismatch impairment, going beyond more traditional methods such as univariate analyses of MMRs. 184 While few studies have made use of decoding models for the purpose of inferring MMR representational content, 68 this approach could elucidate the type of sensory features whose processing may be selectively altered in schizophrenia. Furthermore, combining RSA with deep neural networks (Figure 3B) can help infer the network parameters whose disruption would approximate MMR effects found in schizophrenia.
Finally, cognitive models can quantify MMR attenuation in schizophrenia in terms of the underlying computational algorithms (eg, predictive processing), which may help disentangle related MMRs such as the MMN and P3 (Figure 4). Since this approach is relatively recent, it is yet to be extended to at-risk populations and related groups. In future work, cognitive models could also be combined with generative models of neural data to directly link disruptions of trial-by-trial predictive processing to the underlying neural mechanisms. In summary, applying complementary models to empirical data has the potential to elucidate both the pathophysiology and cognitive symptoms of schizophrenia.
Footnotes
Author Contributions
DCG, HHM, and MG contributed to design, drafted manuscript, and gave final approval. RA contributed to conception and design, drafted manuscript, and gave final approval. All authors agree to be accountable for all aspects of work ensuring integrity and accuracy.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: the Deutsche Forschungsgemeinschaft (grant number AU 423/2-1). We acknowledge support by the Open Access Publication Fund of Freie Universität Berlin.
