yen-lin.pan@univ-amu.fr
INTRODUCTION
Individuals tend to remember speech addressing emotional experiences, perhaps due to the use of emotionally charged words (Kensinger & Corkin, 2003). The emotion that a word elicits has frequently been measured and contrasted using vector models, which suggest that each emotion can be described in terms of two main dimensions: valence and arousal (Rubin & Talarico, 2009). Valence refers to the degree to which an emotion is positive or negative; for instance, words like “paradise” have high, positive valence, while those such as “earthquake” have low, negative valence. Arousal refers to the intensity of the activation that an emotion evokes: a word such as “death” evokes a high level of arousal, whereas a more neutral word such as “carousel” is associated with low arousal ratings. In vector models, words with higher arousal tend to be located at the two extreme ends of the valence scale, while words with lower arousal are rated in the middle of the valence scale. Additionally, negatively valenced words generally show a stronger correlation with arousal than positively valenced words. This arousal bias toward negative words, as well as the U-shaped distribution along the arousal and valence dimensions, has been observed across several European languages (Warriner et al. 2013; Söderholm et al. 2013; Monnier & Syssau 2014; Stadthagen-Gonzalez et al. 2017) as well as in Mandarin (Yu et al. 2015). These findings reflect a universal pattern of emotion classification, as revealed by questionnaires in which participants rate their perception of a word’s arousal and valence on a numeric scale.
The impact of emotional valence on word processing has been investigated through several physiological measures, such as heart rate (Iffland et al. 2020), skin conductance (Jankowiak et al. 2018) or facial muscle activity (Niedenthal et al. 2009). A number of studies using electroencephalography (EEG) have demonstrated specific ERP signatures evoked by the processing of valenced words presented in written format, at both initial and later stages (for a review of early studies, see Citron, 2012). Two early ERP components, the P2 and the Early Posterior Negativity (EPN), have been frequently observed in response to these stimuli. The P2 component, peaking at approximately 150-300 msec over centro-frontal sites, is characterized by more positive amplitudes for highly arousing stimuli compared to less arousing stimuli, reflecting the automatic allocation of attentional resources to words that elicit emotion (Hajcak et al. 2012). The EPN resembles the P2 in its sensitivity to the emotional content of verbal stimuli and in its time window. The two, however, differ in polarity and scalp distribution. Specifically, the EPN shows larger negative-going amplitudes for emotionally valenced words compared to neutral words, observed mostly over occipito-temporal sites. In contrast, the ERP components elicited by emotional words during the later stages of processing, notably the N400 and the Late Positivity Component (LPC), tend to be influenced by task demands. Some studies have reported reduced N400 effects for valenced words compared to neutral words when participants performed a lexical decision task (Kanske & Kotz, 2007; Schacht & Sommer, 2009; Pauligk et al., 2019), an emotion-color Stroop task (Sass et al., 2010) or a gender decision task (Kanske & Kotz, 2011). These findings thus suggest facilitated lexical or semantic processing of emotional stimuli.
The LPC is a positive deflection that occurs at a latency of around 500-800 msec post stimulus onset over parietal regions. Emotionally valenced (positive and/or negative) words generally elicit a greater response than neutral words (Carretié et al. 2008; Citron et al., 2013; Herbert et al., 2008; Hinojosa et al., 2010; Hofmann et al., 2009; Palazova et al., 2011), reflecting the sustained attention towards a more in-depth evaluation of the emotional features of a stimulus.
The auditory processing of valenced words has been less widely documented than that of written words. In a seminal study, Mittermeier and colleagues (2011) reported an early modulation, in the P2 time window, for valenced words in comparison to simple tones. Using the same materials as Mittermeier et al. (2011) in a combined fMRI/ERP design, Jaspers-Fayer and colleagues (2012) reported a similar early modulation of the P2 component, which was coupled with early activation of the anterior and orbito-frontal cortex specifically for emotionally laden auditory words. Importantly, however, in both studies these effects were obtained by contrasting auditory words with negative or positive valence to either simple tones or meaningless syllables. In contrast to these results, studies that compared the auditory processing of valenced words to neutral words rather than to non-linguistic stimuli found no evidence of such early modulations. Grass and colleagues (2016) reported effects of valence 370-530 msec post-stimulus onset, evidenced by an increased frontal positivity and parieto-occipital negativity, which the authors suggested was a mix of an N400 response and the auditory equivalent of the visual EPN. The modulation of these later components, linked to lexical-semantic processing, was not affected by manipulation of a physical characteristic of the auditory words (volume), which instead affected the N1-P2 complex. Grass et al. (2016) thus argued that the auditory response evoked by the emotional content of words is distinct from early auditory evoked potentials. It is important to note that none of the above studies manipulated the prosodic contours of the auditory words, which were produced in a neutral manner. Indeed, the emotion conveyed by spoken words can be transmitted both by semantic content and by the speaker’s prosody, and the latter can affect earlier components such as the P2 (cf. Kotz & Paulmann, 2007).
Hatzidaki and colleagues (2015) reported that valenced words evoked an increased late positivity, emerging after the offset of the auditory stimuli, which resembled an LPC. Rohr and Rahman (2015), in contrast, did not find a reliable effect of valence on EEG signatures of auditory word processing at either early or later stages in pre-defined time windows or electrode sites. Post hoc exploratory analyses revealed a small but reliable increase in negativity at a central ROI in response to negatively valenced words from 300-400 msec. To further explore how valence and arousal may affect processing in the auditory domain, Kanske and Kotz (2011) contrasted the cortical response to negatively valenced auditory words with the response to neutral auditory words in a task that involved response conflict. Both the ERP and fMRI results showed an interaction between response conflict and emotion, with an increased positivity at anterior sites between 420 and 550 msec and increased activation in the ventral anterior cingulate cortex when processing conflict in an emotional context compared to a neutral context. Taken together, these studies show that the processing of isolated auditory words pronounced in a neutral tone elicits a rather wide range of cortical responses, with none of the reported effects occurring earlier than 300 msec.
Valence not only has an immediate impact on processing, as indexed by changes in neural activity, but also has longer-lasting effects on cognitive functions, especially memory, which have been studied with behavioral measures (for a review, see Kensinger & Schacter, 2008) and ERPs. Indeed, people tend to remember high-arousal, emotionally valenced visual stimuli better than low-arousal, neutral ones, whether these are images (Jaeger et al., 2022), faces (Johansson et al., 2004), sentence contexts (Maratos et al., 2001) or isolated words (Leclerc & Kensinger, 2011). Several ERP studies have also been conducted to understand the neural underpinnings of how recognition memory is modulated by emotional valence. A common experimental design is the “study-test” paradigm, in which participants are first presented with a set of stimuli they are instructed to memorize, followed by a test phase during which they are asked to indicate whether items are old or new. Windmann and Kutas (2001) used this paradigm under the hypothesis that valence would bias participants’ recognition memory, leading them to both correctly identify and falsely recognize more negatively valenced words as old, which would in turn impact the ERP response. Their hypothesis was borne out behaviorally (see Inaba et al., 2005 for similar behavioral evidence). In contrast, valence produced no effect on ERPs prior to 450 msec and only a limited effect on the LPC. Using a similar design, Inaba and colleagues (2005) reported an “increased positivity” starting at 150 msec and continuing through 700 msec for correctly identified negative and positive words (new and old) compared to neutral words. The difference between the two studies thus lies in the ERP signature, which showed an early effect of valence, most likely related to the N400, in Inaba et al. (2005) but only a late effect, linked to the LPC, in Windmann and Kutas (2001).
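As an illustration only, old/new judgments from a study-test paradigm such as the one described above are often summarized with signal detection measures, which separate sensitivity (d′) from response bias (criterion c). The sketch below is not the analysis reported in the studies cited here; the function name, the log-linear correction, and the response counts are all assumptions introduced for the example.

```python
from statistics import NormalDist


def sdt_measures(hits, misses, false_alarms, correct_rejections):
    """Return (d_prime, criterion) from old/new recognition counts.

    Hypothetical helper for illustration; not taken from the cited studies.
    """
    # Log-linear correction (an assumption of this sketch): adding 0.5 to
    # each count keeps the rates away from 0 and 1, where z is undefined.
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)

    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    d_prime = z(hit_rate) - z(fa_rate)            # sensitivity
    criterion = -0.5 * (z(hit_rate) + z(fa_rate))  # negative = liberal bias
    return d_prime, criterion


# Invented counts, for demonstration only: more "old" responses overall
# (more hits and more false alarms) for negative than for neutral words.
neutral = sdt_measures(hits=30, misses=10, false_alarms=10, correct_rejections=30)
negative = sdt_measures(hits=35, misses=5, false_alarms=20, correct_rejections=20)
print("neutral  d', c:", neutral)
print("negative d', c:", negative)
```

With counts of this kind, a shift of the criterion toward negative values for negatively valenced words would correspond to the pattern described above, in which such words attract more "old" responses whether or not they were actually studied.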
Santaniello and colleagues (2018) employed a short- and long-lag repetition priming paradigm to examine the influence of valence both behaviorally and on ERPs. They demonstrated that, compared to neutral and positive words, repeated negative words elicited a reduced N400 in central-posterior regions, suggesting a stronger episodic trace for these words. Critically, the facilitation was short-lived, as the reduced N400 was significant only for very short-lag repetition. Auditory stimuli, both linguistic (Schirmer, 2010) and non-linguistic (Alonso et al. 2015), have also been used in behavioral research to investigate the effects of valence on recognition memory. Schirmer (2010) showed that the emotional prosody of auditory neutral words modulated the subsequent valence ratings of written words, but did not increase their recognition accuracy. In sum, previous ERP research on the effect of valence on the recognition of printed words has produced inconsistent results. The present study aimed to further address this question.
To our knowledge, no research to date has tested whether the valence of auditory words enhances the subsequent recognition of written words, nor how this may impact the underlying neural mechanisms. To address these questions, we conducted an ERP experiment in which participants were presented with positive, negative and neutral words in auditory format, and were later tested for their recognition of these words in written format. Based on previous literature, we hypothesized that participants would show enhanced behavioral recognition for valenced words compared to neutral words. With respect to ERPs, previous research on the auditory processing of valenced words has produced mixed results, such that no clear hypotheses could be made. For written words, we predicted that valenced stimuli would induce increased early attention, as indexed by the P2 or EPN. We also predicted facilitated processing of valenced words, based not only on pre-existing valence norms but also on the ratings of individual participants.
EXPERIMENT 1
The first experiment had three main objectives. The first was to expose participants to the set of stimuli, presented as individual spoken words in Mandarin. The second was to reexamine the effect of valence and arousal on the cortical processing of spoken words as evidenced by ERPs. Only valence and arousal were manipulated; prosody was neutral for all stimuli. The third was to provide a cross-linguistic validation of the valence of the auditory stimuli, originally rated in English, by asking participants to rate each item in Mandarin on a 5-point scale, from negative to positive.