Compensating for a shifting world: evolving reference frames of visual and auditory signals across three multimodal brain areas

Abstract

ABSTRACTStimulus locations are detected differently by different sensory systems, but ultimately they yield similar percepts and behavioral responses. How the brain transcends initial differences to compute similar codes is unclear. We quantitatively compared the reference frames of two sensory modalities, vision and audition, across three interconnected brain areas involved in generating saccades, namely the frontal eye fields (FEF), lateral and medial parietal cortex (M/LIP), and superior colliculus (SC). We recorded from single neurons in head-restrained monkeys performing auditory- and visually-guided saccades from variable initial fixation locations, and evaluated whether their receptive fields were better described as eye-centered, head-centered, or hybrid (i.e. not anchored uniquely to head- or eye-orientation). We found a progression of reference frames across areas and across time, with considerable hybrid-ness and persistent differences between modalities during most epochs/brain regions. For both modalities, the SC was more eye-centered than the FEF, which in turn was more eye-centered than the predominantly hybrid M/LIP. In all three areas and temporal epochs from stimulus onset to movement, visual signals were more eye-centered than auditory signals. In the SC and FEF, auditory signals became more eye-centered at the time of the saccade than they were initially after stimulus onset, but only in the SC at the time of the saccade did the auditory signals become predominantly eye-centered. The results indicate that visual and auditory signals both undergo transformations, ultimately reaching the same final reference frame but via different dynamics across brain regions and time.New and NoteworthyModels for visual-auditory integration posit that visual signals are eye-centered throughout the brain, while auditory signals are converted from head-centered to eye-centered coordinates. We show instead that both modalities largely employ hybrid reference frames: neither fully head-nor eye-centered. Across three hubs of the oculomotor network (intraparietal cortex, frontal eye field and superior colliculus) visual and auditory signals evolve from hybrid to a common eye-centered format via different dynamics across brain areas and time.

DOI
10.1101/669333
Year