Introduction
Misophonia, literally ‘hatred of sounds’, is a disorder characterized by
strong negative emotional, behavioral, and physiological responses to
specific sounds that are encountered frequently in
day-to-day life (Jastreboff and Jastreboff, 2001; Kumar et al., 2014;
Brout et al., 2018; Swedo et al., 2022). In typical cases of misophonia,
these ‘trigger sounds’ tend to be sounds of eating, drinking, breathing,
and chewing produced by other people. The emotional response to trigger
sounds includes feelings of anger, irritation, anxiety, and disgust
accompanied by a strong urge to escape from the situation in which
trigger sounds are produced. Since trigger sounds are common and almost
inescapable in social situations, misophonia has debilitating effects on
occupational, family, and social life. A person with misophonia, for
example, might avoid having meals or sharing a common space with family
members, avoid using public transport, and tend to avoid
social situations at their workplace. This social isolation,
particularly in severe cases of misophonia, has a high impact on mental
health, and cases of suicide or suicide attempts have been reported in
the academic literature (Siepsiak et al., 2022) and media (Nauman,
2017), with up to 20% of people with misophonia indicating they have
had thoughts of suicide (Rouw and Erfanian, 2018). Although the exact
prevalence of misophonia is not known due to a lack of any comprehensive
epidemiological study, a few studies (Wu et al., 2014; Zhou et al.,
2017; Naylor et al., 2021) targeting specific populations (i.e.,
students) have reported a prevalence of anywhere between 5 and 20%.
The mechanism of misophonia remains largely unknown. A currently dominant
paradigm for understanding misophonia is the ‘auditory processing
framework’, in which misophonia is considered a disorder of sound
processing (Jastreboff and Jastreboff, 2001, 2015). In this framework,
trigger sounds cause aberrant processing in
the auditory/emotion processing parts of the brain which subsequently
drives the emotional response in those with misophonia (via a currently
unknown mechanism). The auditory framework completely ignores the fact
that many typical trigger sounds are generated by other people, that the
context in which trigger sounds are produced is largely social, and that
people with misophonia can be triggered by particular individuals (e.g.,
close family members or someone familiar) but not by others. In other
words, the current
auditory-focussed model of misophonia employs a ‘detached’ framework
where sounds are isolated from the social context in which they
typically occur. There is now overwhelming evidence that the brain
processes social signals differently from non-social signals (for
reviews, see Adolphs, 2010; Molapour et al., 2021). For example, brain
imaging in social cognitive neuroscience has shown that specialized
cognitive processes implemented in dedicated brain structures are
employed to extract socially relevant information such as faces
(Kanwisher et al., 1997), voices (Belin et al., 2000), the body as a
whole (de Gelder et al., 2010), and other ‘higher order’ social
information such as mental states of other people (Frith and Frith,
2006). In other words, these brain structures become active only in
response to sensory stimulation in social situations and they work
differently or are ‘silent’ when dealing with non-social sensory input.
This has important implications for misophonia: emphasis only on the
auditory/sound dimension while completely ignoring or delinking the
‘social’ aspect of misophonia may hinder progress in understanding the
perceptual/cognitive processes and underlying brain mechanisms.
Using an integrated ‘framework’, where both trigger sounds and their
social source are considered together, a different picture of misophonia
emerges. Since the trigger sounds, for example someone eating/chewing,
are associated with an action (e.g., orofacial movement) of another
person, it could be the case that misophonic distress is due to the
perceived action of others and not due to the sound per se, which is a
by-product of that action. In social cognition and neuroscience, it is
well known that merely observing the actions of others, or hearing the
sounds of those actions, leads to ‘mirroring’ or ‘mimicking’ of the same
actions by the perceiver without any intention or awareness of doing so
(Chartrand and van
Baaren, 2009; Heyes, 2011; Chartrand and Lakin, 2013). The mechanism
behind mimicry is commonly understood within the framework of a
‘perception-action’ link, which posits that perceiving the action of
others automatically activates representations of that action in the
perceiver, who in turn executes movements that are congruent with the
perceived actions. With respect to brain function, the
‘perception-action’ link is instantiated as communication between
sensory areas and the motor areas of the brain. Given this emphasis on the
action of the trigger person, could it be that the perception-action link,
which is activated by the sight or sounds of action, is relatively stronger
in misophonia? Initial evidence for activation of the
perception-action link and ‘mirroring’ of actions in misophonia was
provided by a recent study (Kumar et al., 2021) from our group. The
study, using functional magnetic resonance imaging (fMRI), demonstrated
that in the resting state, when no explicit stimuli are presented,
people with misophonia, who had eating/chewing sounds as their dominant
triggers, show stronger connectivity (compared to control subjects) of
auditory and visual cortex to a part of the pre-motor cortex involved in
the movement of orofacial muscles (movement of mouth, lips, tongue, etc.).
An implication of the stronger resting state connectivity is that the
orofacial motor cortex may be ‘primed’ to respond strongly to auditory
and visual stimulation arising from the (orofacial) actions of others.
This was supported by the same study, which showed that activation of the
orofacial motor cortex in misophonia was stronger specifically for trigger
sounds (mostly orofacial in nature).
One implication of the evidence from our neuroimaging data is that
mimicry should be widely present in the misophonia population. With
respect to misophonia, there are anecdotal reports of people mimicking
the actions of the trigger person, but the effect has not attracted much
attention within misophonia research, except for a couple of case
reports. In a report of two misophonia subjects (Hadjipavlou et al.,
2008), one subject who had eating/chewing sounds as triggers had urges
to mimic the sounds by moving their lips and mouth. In another study,
Edelstein et al. (2013) reported 6 out of 11 subjects (55%) having a
tendency to mimic the sounds. Within the ‘auditory framework’, mimicry is
difficult to explain, whereas the social/action framework suggests
mimicry would be present in people with misophonia.
To address the relation between misophonia and mimicry, we asked more
than 600 participants to complete online questionnaires relating to
their misophonia severity and their tendency to mimic the sounds/actions
of the trigger person. More specifically, we explored the following: (i)
whether people become more likely to mimic as misophonia severity
increases, (ii) whether the act of mimicking relates to a particular
set of trigger sounds (i.e., social), and (iii) what effect mimicking
has on individuals with misophonia. Our data suggest that the tendency
to mimic is associated with misophonia severity and that the act of
mimicking provides relief. The current data, along with results from our
neuroimaging study, provide support for a social
perception/cognition-based model of misophonia.