15th Speech in Noise Workshop, 11-12 January 2024, Potsdam, Germany

P05 Session 1 (Thursday 11 January 2024, 15:35-18:00)
Unmasking attention: Investigating the competing acoustic and cognitive influences during spatial speech-on-speech listening

Georgie Maher, Sarah Knight, Sven Mattys
University of York, UK

Understanding speech-perception-in-noise (SpiN) requires modelling the interaction between bottom-up (acoustic) factors and top-down (cognitive) processes. However, the precise mechanisms by which cognition supports SpiN are unclear. In particular, the role of working memory (WM) remains debated. Some studies show that high-WM individuals have better SpiN. Although this link is broadly established for older and/or hearing-impaired listeners, it is much less clear for young, normal-hearing adults. This may be partly due to: (1) the failure of existing studies to assess the multiple components of WM and account for other relevant abilities, such as attentional control; (2) lack of power to exploit the narrow range of individual variability in many WM tests; (3) differing degrees of acoustic difficulty in existing paradigms, usually implemented by manipulating energetic masking (EM; interference between speech and noise at the auditory periphery).

In this study, participants completed a selective listening task in which they were asked to transcribe the speech of one of two simultaneously presented talkers. We filtered the speech into frequency bands that were either identical or non-overlapping between talkers (EM-present vs. EM-absent). We also manipulated perceived spatial distance between the talkers (collocated, i.e., diotic, vs. ±90° azimuth, i.e., dichotic). For EM-present stimuli, this resulted in maximal EM in the collocated condition and minimal EM in the dichotic condition, whereas spatial-attentional demands were maximal in the dichotic condition and minimal in the collocated condition. For EM-absent stimuli, only spatial-attentional demands varied across spatial distance. Participants also undertook a battery of cognitive tasks to assess three key components of WM and attention: phonological loop, executive function and selective/divided attentional control. The study is currently being run online with a target of N=240. Results will be reported at the conference.
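As an illustration only, the 2 x 2 stimulus design described above (EM-present vs. EM-absent, diotic vs. dichotic) could be sketched as follows. This is not the authors' stimulus-generation code. The function names, the choice of eight bands, the alternating-band scheme for non-overlapping spectra, and the assignment of the target to the left ear are all assumptions made for the example.

```python
from itertools import product

N_BANDS = 8  # hypothetical number of frequency bands; not stated in the abstract


def allocate_bands(n_bands, em_present):
    """Return (target_bands, masker_bands) as lists of band indices."""
    all_bands = list(range(n_bands))
    if em_present:
        # Identical bands for both talkers: the spectra overlap, so
        # energetic masking can occur.
        return all_bands, all_bands
    # Non-overlapping bands. Alternating bands between talkers is an
    # assumed scheme; the abstract only says the bands do not overlap.
    return all_bands[0::2], all_bands[1::2]


def route_talkers(spatial):
    """Return the ear(s) each talker is presented to."""
    if spatial == "diotic":
        # Collocated: both talkers are presented identically to both ears.
        return {"target": ("left", "right"), "masker": ("left", "right")}
    if spatial == "dichotic":
        # One talker per ear. Which ear receives the target is an assumption.
        return {"target": ("left",), "masker": ("right",)}
    raise ValueError(f"unknown spatial condition: {spatial!r}")


# Build the four cells of the design.
conditions = []
for em_present, spatial in product((True, False), ("diotic", "dichotic")):
    target_bands, masker_bands = allocate_bands(N_BANDS, em_present)
    conditions.append(
        {
            "em": "present" if em_present else "absent",
            "spatial": spatial,
            "target_bands": target_bands,
            "masker_bands": masker_bands,
            "routing": route_talkers(spatial),
        }
    )

for cond in conditions:
    print(cond["em"], cond["spatial"], cond["target_bands"], cond["masker_bands"])
```

In this sketch, the EM-absent cells differ from each other only in routing, which mirrors the statement that only spatial-attentional demands vary across spatial distance for those stimuli.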

For EM-present stimuli, we expect that performance will be better in the dichotic than in the diotic condition due to spatial release from energetic masking. For EM-absent stimuli, however, we predict that performance will be better in the diotic than in the dichotic condition due to the cognitive cost of spatial attentional control in the dichotic condition. This hypothesised cognitive cost also leads us to expect that cognitive task scores will be more strongly linked to performance in dichotic than in diotic conditions. When EM is severe, we expect that listeners will not be able to restore degraded speech via recruitment of cognitive resources, thus making the link between listening performance and cognitive task scores weakest in the collocated, EM-present condition.

Last modified 2024-01-16 10:49:05