Voice and face processing occur through convergent neural systems that enable speaker recognition. Neuroimaging studies suggest that familiar voice processing engages early visual cortex, including the bilateral fusiform gyri (FG) of the basal temporal lobe. However, what role the FG plays in familiar voice processing, and whether that role is driven by bottom-up or top-down mechanisms, has not been resolved.

According to a new study from the University of Pittsburgh, voice and facial recognition are even more closely linked than previously thought. The study raises the intriguing possibility that the visual and auditory information needed for identification is fed into a single brain region, enabling more robust, comprehensive recognition by merging different sensory modalities.

Using direct cortical recordings from epilepsy surgery patients, the researchers examined the neural responses of the human FG to familiar voices and faces. They characterized the timing of voice responses in the FG and tested the hypothesis that neuronal populations in the human FG respond to familiar voices. Five adult epilepsy surgery patients were recorded while they performed a task in which they identified familiar people from visual and auditory cues. Patients were shown pictures of presidents or played fragments of their voices and asked to identify the person pictured or speaking.

The fusiform gyri, or FG, are a region of the brain that processes visual signals. Recordings of electrical activity from this region revealed that when individuals heard familiar voices, the same region became active, albeit with a weaker and slightly delayed response.

Senior author Taylor Abel, MD, an associate professor of neurological surgery at the University of Pittsburgh School of Medicine, said: “We know from behavioral research that people recognize a familiar voice faster and more accurately if they can associate it with the face of the speaker, but we’ve never had a good explanation for why that happens. In the visual cortex, particularly in the part that typically processes faces, we also see electrical activity in response to famous people’s voices, highlighting how deeply the two systems are connected.”

“It shows that auditory and visual areas interact very early in identifying people and don’t work in isolation. In addition to enriching our understanding of basic brain function, our study explains the mechanisms behind conditions where voice or facial recognition is compromised, such as in some forms of dementia or related conditions.”

Journal reference:

  1. Ariane E. Rhone et al. Electrocorticography reveals the dynamics of famous voice responses in human fusiform gyrus. Journal of Neurophysiology. DOI: 10.1152/jn.00459.2022