Convergence refers to how data from different sensory modalities are combined to form a perception. Many perceptual processes require input from more than one sense.
Different areas of the brain receive the data and then synthesize the information that eventually results in a perceptual phenomenon.
This process is sometimes referred to as multisensory integration or multimodal integration.
Integrating various data streams from the five senses allows the organism to form a coherent perception of the environment. This is a highly adaptive mechanism necessary for survival.
While data from different senses travel from nerve receptors in the skin or eyes, for example, to specific locations in the cerebral cortex, those data are also processed and constructed in other brain regions.
This analysis and synthesis of data usually happens automatically and at incredibly rapid speeds.
In many scenarios, the process is completed mere milliseconds prior to conscious awareness. In many other scenarios, the process is completed but not transmitted to conscious awareness.
Instead, it is utilized in other aspects of functioning such as walking or in autonomic processes involved in maintaining heartbeat and breathing.
Origins of Convergence
Traditionally, research on sensory perception has focused on one modality at a time. In fact, many academic journals are specifically devoted to only one sensory modality such as vision or hearing.
However, there are exceptions. One of the earliest lines of research on multisensory processes was George Stratton’s (1896) research involving vision-distorting glasses.
In another example, Hartmann (1933) demonstrated that visual acuity could be improved through the simultaneous presentation of auditory, olfactory, or tactile stimuli.
Hartmann later produced a book on Gestalt psychology in 1935 which reviewed research on multisensory integration and how perception of one stimulus is affected by the perception of other stimuli.
London (1954) provided a review of the vast amount of research on sensory integration being conducted in the Soviet Union at that time (with a mere 506 references).
London noted that while research in the Soviet Union “adheres to standards of execution, reportage, and interpretation that would be quite unacceptable to the western researcher,” Western research was “scattered and desultory,” whereas in the Soviet Union it was “systematic and sustained” (p. 531).
A jump forward to the 21st century sees a renewed interest in convergence research in the form of multisensory integration (see Stevenson et al., 2014).
“In the last few decades, our views concerning sensory processing have been revolutionized to now consider this from the perspective of a highly interactive, multisensory network of closely interrelated functional brain regions and mechanisms” (Stevenson et al., 2014, p. 706).
Convergence Examples in Psychology
- Ducking from a Ball: When a ball approaches a person’s head with some velocity, it’s a good thing that convergence can perform its duty so rapidly. Instantaneously integrating visual input with a reflexive muscular contraction of the torso allows the person to duck and avoid a direct hit to the skull.
- When Reading: Reading is an interesting example of multisensory integration because it involves blocking one’s conscious awareness of so much sensory input. This allows for the focus of one’s attentional resources and the subsequent cognitive elaboration of concepts being digested without the interference and distraction of other stimuli.
- Riding a Bicycle: Riding a bicycle through the neighborhood streets or down a path in the forest requires the convergence of visual input, vestibular feedback, and the simultaneous coordination of pedaling and steering.
- In Infant Grasping: Even infants as young as 20 weeks old are very accurate when reaching for objects. This is due, in part, to convergence in the visual and muscular systems emerging early in development.
- The Car Honk: While driving, being able to locate the source of a car honk can help one avoid a collision. This ability is due in part to the integration of visual and auditory cues in the environment. The driver determines which car is spatially congruent with the auditory stimulus.
- Hitting a Baseball: Hitting a baseball might be one of the most difficult feats in all sports. The ball is coming incredibly fast and has to be struck by a round object that is not very wide at all. In addition to excellent visual acuity, players with good batting averages also have fast-acting convergence with their musculoskeletal system.
- The Vestibulo-Ocular Reflex: This refers to how the body stabilizes vision while engaged in a physical activity that involves body movement. The eyes are rotated slightly in the opposite direction to maintain the focal image being placed on the fovea in the retina. This process is conducted in coordination with vestibular feedback regarding positioning of the body.
- In the Culinary Arts: At first glance, cooking is all about taste. But to get to that point, a good chef needs to integrate several sensory modalities during the cooking process. Most chefs have a great sense of smell and a keen eye for when an item is ready to be served. And many of them will sample what they’re cooking along the way to assess the need for additional spices.
- Playing in a Band: Although it might seem that playing in a band is predominantly a function of the auditory sensory modality, it still requires the coordination of the visual system and the tactile feedback from playing the instrument itself.
- In Sculpting: Being a great sculptor, or just making a simple clay pot, requires the integration of at least two senses. Of course, the visual system is monitoring the shaping of the clay. At the same time, the tactile system is gauging how much pressure to apply to mold the clay into the desired shape.
Key Principles of Convergence
1. The Spatial Rule
The spatial rule was initially proposed by Meredith and Stein (1986). The rule states that multisensory integration is more likely to occur when the inputs from various modalities have spatial proximity.
For instance, the speakers of most televisions are located very near the screen. This enhances comprehension of the auditory and visual inputs. If the auditory stimuli were for some reason located in a much different location, it would make watching a show more difficult and maybe even irritating.
See Also: Gestalt Law of Proximity
2. The Temporal Rule
As proposed by Meredith et al. (1987), the temporal rule suggests that the closer together in time the inputs from various sensory modalities occur, the more effective the multisensory integration.
For example, the sound of thunder and the occurrence of lightning are more likely to be seen as a unified perception the more closely in time they both occur.
The further apart the timing, the more difficult it is to put the two together as being part of a single phenomenon.
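The thunder-and-lightning example can be put into rough numbers. The sketch below is only an illustration: it assumes sound travels at about 343 m/s (dry air at roughly 20 °C), treats the flash as arriving instantly, and uses a hypothetical `window_s` value as a stand-in for how close in time two inputs must be to be perceived as one event. That window is a placeholder, not an empirical figure from the studies cited here.

```python
SPEED_OF_SOUND_M_S = 343.0  # approximate speed of sound in dry air at ~20 °C


def thunder_delay_s(distance_m: float) -> float:
    """Seconds between seeing the flash and hearing the thunder.

    Light travel time is ignored because it is negligible at these distances.
    """
    if distance_m < 0:
        raise ValueError("distance_m must be non-negative")
    return distance_m / SPEED_OF_SOUND_M_S


def likely_perceived_as_one_event(delay_s: float, window_s: float = 0.2) -> bool:
    """Toy version of the temporal rule.

    window_s is a hypothetical illustration value, not a measured constant.
    """
    return abs(delay_s) <= window_s


if __name__ == "__main__":
    for distance_m in (30.0, 300.0, 3000.0):
        delay = thunder_delay_s(distance_m)
        print(distance_m, round(delay, 2), likely_perceived_as_one_event(delay))
```

With these assumptions, a nearby strike falls inside the window while a strike a few kilometers away does not, which mirrors the point that widely separated inputs are harder to experience as a single phenomenon.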
3. The Principle of Inverse Effectiveness
Multisensory integration is more likely to occur when sensory input from one modality is relatively weak when presented in isolation (Meredith & Stein, 1986). Not all stimuli are equal in terms of magnitude or intensity. Therefore, across modalities there will exist some variation in strength of stimuli.
When each modality is presented with a weaker form of a stimulus, then sensory integration is more likely. For example, when presented with an ambiguous facial expression and a muted tone of voice, an individual is likely to rely on the integration of the two sources to form an overall perception.
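In the multisensory literature, the size of this effect is often expressed as a percentage gain of the combined response over the strongest single-modality response. The sketch below shows that calculation under stated assumptions: the function name and the response values are hypothetical and chosen only to show the pattern, not taken from any of the studies cited here.

```python
def enhancement_percent(multisensory: float, unisensory: list[float]) -> float:
    """Percent gain of the combined response over the best unisensory response.

    Computes 100 * (CM - SMmax) / SMmax, where CM is the response to the
    combined stimulus and SMmax is the largest single-modality response.
    """
    best_single = max(unisensory)
    if best_single <= 0:
        raise ValueError("the best unisensory response must be positive")
    return 100.0 * (multisensory - best_single) / best_single


if __name__ == "__main__":
    # Hypothetical responses (arbitrary units) to weak stimuli.
    weak = enhancement_percent(6.0, [2.0, 3.0])
    # Hypothetical responses to strong stimuli.
    strong = enhancement_percent(24.0, [15.0, 20.0])
    print(weak, strong)  # 100.0 20.0
```

In this made-up case the weak stimuli yield a much larger proportional gain than the strong ones, which is the pattern the principle of inverse effectiveness describes.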
Applications of Convergence
1. Interpretation of Emotions and Cultural Differences
Most research on facial expressions and the emotions underlying them has focused on the universality of facial expressions across cultures (Ekman, 1972).
Some research has explored multiple signals of emotional cues within a single sensory modality. For instance, Masuda et al. (2008) found that East Asian individuals rely more on contextual cues (visual in nature) when interpreting facial expressions than Western individuals.
Most research on the interpretation of emotions has been primarily concerned with a single sensory modality, the visual system. However, as Tanaka et al. (2010) pointed out, in a natural environment, facial expressions often do not occur in isolation. Emotion is also expressed in the voice.
Collignon et al. (2008) found that interpretation of emotions was faster and more accurate when there was a congruence of information in the visual and auditory modalities compared to information only presented via one modality.
When information in the modalities was incongruent, participants relied more on the visual than the auditory modality.
Tanaka and colleagues were interested in exploring cross-cultural differences in interpretation across multiple modalities. They presented Japanese and Dutch participants with faces and voices that expressed either congruent or incongruent emotions, for example, a happy face paired with an angry voice.
Participants were asked to judge the emotion expressed in one modality, but ignore the other.
The results indicated that Japanese participants were more strongly influenced by the auditory modality (i.e., voice cues) than Dutch participants.
This was one of the first studies to describe how “culture modulates multisensory integration of affective information” (p. 1261).
Topic Focus: Binocular Vision and Convergence
Binocular vision refers to the ability to maintain focus on an object with both eyes working simultaneously. The term convergence in this context refers to the rotation of the eyes in the perception of distance.
When an object is distant, the eyes rotate outward. As the object comes nearer, the eyes rotate inward. That inward rotation is convergence.
Convergence occurs to keep the image focused on the fovea in the retina.
People are heavily dependent on convergence, as we spend a great deal of time looking at objects that are near, such as a computer or smartphone screen.
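The geometry behind this can be sketched with basic trigonometry. The example below assumes an object straight ahead on the midline between the eyes and a hypothetical interpupillary distance of 63 mm. Both are simplifying assumptions made for illustration, and the function name is invented for this sketch.

```python
import math


def vergence_angle_deg(distance_m: float, ipd_m: float = 0.063) -> float:
    """Angle between the two lines of sight when fixating a midline object.

    ipd_m is an assumed interpupillary distance (63 mm), used only for
    illustration. Simple geometry gives: angle = 2 * atan((ipd / 2) / distance).
    """
    if distance_m <= 0:
        raise ValueError("distance_m must be positive")
    return math.degrees(2.0 * math.atan((ipd_m / 2.0) / distance_m))


if __name__ == "__main__":
    # Viewing distances in meters: a phone, a computer screen, across a room.
    for distance_m in (0.3, 0.6, 6.0):
        print(distance_m, round(vergence_angle_deg(distance_m), 2))
```

Under these assumptions the required angle grows quickly as the object approaches, which is why near work such as screen use places the greatest demand on convergence.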
Rotation of the eyes requires seven muscles, called the extraocular muscles or extrinsic ocular muscles, to work simultaneously.
The muscles that move the eye outward are relaxed while the muscles that move the eye inward are contracted. At the same time, muscles that control the rotation of the eye up and down are also activated, along with one additional muscle that controls the eyelid.
Although convergence sounds simple enough, the perception of distance also involves another process called accommodation.
Accommodation refers to changing the optical power of the lens. Changing the shape of the lens adjusts the incoming light so that the image’s focus on the fovea is maintained.
The ciliary muscles are responsible for adjusting the shape of the lens. Accommodation happens like a reflex and works in concert with the rotation of the eyes. This is called the accommodation-vergence reflex.
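Optical power is conventionally measured in diopters, the reciprocal of a distance in meters. As a rough sketch, the focusing demand for an object at a given distance can be estimated as follows. This assumes an eye that is in focus for distant objects without any accommodation, and the function name is hypothetical.

```python
def accommodation_demand_diopters(distance_m: float) -> float:
    """Approximate focusing demand for an object at distance_m.

    Uses diopters = 1 / distance in meters, assuming an eye that needs
    no accommodation for objects at optical infinity.
    """
    if distance_m <= 0:
        raise ValueError("distance_m must be positive")
    return 1.0 / distance_m


if __name__ == "__main__":
    # Viewing distances in meters: reading distance, arm's length, across a room.
    for distance_m in (0.4, 1.0, 6.0):
        print(distance_m, round(accommodation_demand_diopters(distance_m), 2))
```

As with the rotation of the eyes, the demand rises as the object comes closer, which fits the description of the two processes working in concert.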
Conclusion
Another term for convergence is multisensory integration. This refers to how the mind synthesizes information from various sensory modalities to create an overall perception which allows for everyday functioning.
Sensory integration plays a vital role in nearly everything we do, from riding a bicycle to preparing a fine meal; from reading a book to hitting a baseball.
In the early days of research in the West, there was a tendency to study each modality in isolation. This produced a wealth of detail about each sensory system, but not a cohesive understanding of how the various modalities coordinate.
Fortunately, modern research has focused more on integration and has revealed several basic principles of how different sensory inputs are combined.
References
Collignon, O., Girard, S., Gosselin, F., Roy, S., Saint-Amour, D., Lassonde, M., & Lepore, F. (2008). Audio-visual integration of emotion expression. Brain Research, 1242, 126-135.
Feldman, A. G., & Zhang, L. (2020). Eye and head movements and vestibulo-ocular reflex in the context of indirect, referent control of motor actions. Journal of Neurophysiology, 124(1), 115-133.
Hartmann, G. W. (1935). Gestalt psychology: A survey of facts and principles. Ronald Press Company.
Hartmann, G. W. (1933). II. Changes in visual acuity through simultaneous stimulation of other sense organs. Journal of Experimental Psychology, 16(3), 393.
Hromas, G., & Woods, A. J. (2018). Visual convergence. In J. S. Kreutzer, J. DeLuca, & B. Caplan (Eds.), Encyclopedia of Clinical Neuropsychology. Springer, Cham. https://doi.org/10.1007/978-3-319-57111-9_9107
Laby, D. M., Davidson, J. L., Rosenbaum, L. J., Strasser, C., Mellman, M. F., Rosenbaum, A. L., & Kirschen, D. G. (1996). The visual function of professional baseball players. American Journal of Ophthalmology, 122(4), 476-485.
London, I. D. (1954). Research on sensory interaction in the Soviet Union. Psychological Bulletin, 51(6), 531.
Masuda, T., Ellsworth, P., Mesquita, B., Leu, J., Tanida, S., & van de Veerdonk, E. (2008). Placing the face in context: Cultural differences in the perception of facial emotion. Journal of Personality and Social Psychology, 94, 365–381.
Meredith, M. A., & Stein, B. E. (1986). Spatial factors determine the activity of multisensory neurons in cat superior colliculus. Brain Research, 365(2), 350-354.
Meredith, M. A., Nemitz, J. W., & Stein, B. E. (1987). Determinants of multisensory integration in superior colliculus neurons. I. Temporal factors. Journal of Neuroscience, 7(10), 3215-3229.
Stevenson, R. A., Ghose, D., Fister, J. K., Sarko, D. K., Altieri, N. A., Nidiffer, A. R., … & Wallace, M. T. (2014). Identifying and quantifying multisensory integration: a tutorial review. Brain Topography, 27, 707-730.
Stratton, G. M. (1896). Some preliminary experiments on vision without inversion of the retinal image. Psychological Review, 3(6), 611.
Stratton, G. M. (1897). Vision without inversion of the retinal image. Psychological Review, 4(4), 341.
Tanaka, A., Koizumi, A., Imai, H., Hiramatsu, S., Hiramoto, E., & De Gelder, B. (2010). I feel your voice: Cultural differences in the multisensory perception of emotion. Psychological Science, 21(9), 1259-1262.
von Hofsten, C. (1976). The role of convergence in visual space perception. Vision Research, 16(2), 193-198.
von Hofsten, C. (1977). Binocular convergence as a determinant of reaching behavior in infancy. Perception, 6(2), 139-144.