What is Sound Source Localization?

As the name suggests, sound source localization means determining where a sound of interest originates.

Sound source localization can be broken down further depending on the environment in which the sound originates. Imagine someone clapping their hands in an underground garage. After the clapping stops, reflections of the sound waves linger in the room for a short time. These acoustic reflections are characteristic of a reverberant environment, in which the reflections interfere with the direct sound arriving at the listener's ears and distort the spatial cues used for localization. The ear perceives the sound as farther away or closer than it really is, which adds another layer to the problem. Although humans can quickly localize sound sources in moderate reverberation, localization accuracy degrades as reverberation gets stronger. With this in mind, there is a clear need for technology that can localize sound reliably in such environments.

Where can we use sound source localization? Why is it important? Imagine being at a game or concert in Madison Square Garden. Everyone around you is yelling at the top of their lungs, except one person. Now, we want to find that one person. Sound source localization will help us isolate this person and determine where he or she is in the crowd. While this is a toy example, many real applications require sound source localization, including hearing aids, robotics, navigation for ships and self-driving cars, and surveillance.

Previous work in sound source localization has focused on the design of microphone arrays and the use of digital signal processing techniques.

These techniques can be broken up into four groups: time difference of arrival (TDOA) methods, beamforming methods, methods using high-resolution processing, and methods that need a training phase. TDOA is a technique that uses two or more receivers to locate a signal source from the different times at which the signal arrives at each receiver. In our case, the signal is sound. Popular techniques for estimating TDOA are Generalized Cross-Correlation (GCC) and its derivatives, such as Generalized Cross-Correlation with Phase Transform (GCC-PHAT) and the Cross-Power Spectrum Phase (CSP). However, these methods are defined for an environment without reverberation, so they do not help much in localizing sources in reverberant rooms.
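
To make the TDOA idea concrete, here is a minimal sketch of GCC-PHAT for two microphones, written with NumPy. The function name `gcc_phat`, the sampling rate, and the synthetic white-noise signal are assumptions chosen for illustration; this is not taken from any particular library or paper.

```python
import numpy as np

def gcc_phat(sig, ref, fs, max_tau=None):
    """Estimate the delay (seconds) of `sig` relative to `ref` with GCC-PHAT."""
    n = len(sig) + len(ref)              # zero-pad to avoid circular wrap-around
    SIG = np.fft.rfft(sig, n=n)
    REF = np.fft.rfft(ref, n=n)
    cross = SIG * np.conj(REF)           # cross-power spectrum
    cross /= np.abs(cross) + 1e-12       # PHAT weighting: keep phase, drop magnitude
    cc = np.fft.irfft(cross, n=n)        # generalized cross-correlation
    max_shift = n // 2
    if max_tau is not None:
        max_shift = max(1, min(int(fs * max_tau), max_shift))
    # Reorder so that lags run from -max_shift to +max_shift
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = int(np.argmax(np.abs(cc))) - max_shift
    return shift / fs

# Synthetic check (assumed values): the second microphone hears the same
# noise burst 12 samples later than the first.
fs = 16000
rng = np.random.default_rng(0)
ref = rng.standard_normal(4096)
true_delay = 12
sig = np.concatenate((np.zeros(true_delay), ref[:-true_delay]))

tau = gcc_phat(sig, ref, fs)
estimated_delay = int(round(tau * fs))
print(estimated_delay, "samples")
```

With a known microphone spacing and speed of sound, a delay like this can be converted into an arrival angle. In a reverberant room the correlation usually shows extra peaks from reflections, and those are what make the estimate unreliable.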

Beamforming, or spatial filtering, on the other hand, is a signal processing technique that combines the elements of a sensor array so that signals arriving from particular angles interfere constructively while signals from other angles interfere destructively. Applied to a microphone array, beamforming helps isolate the source of the sound. The best-known beamforming approaches are the Minimum Variance Distortionless Response (MVDR) and the Linearly Constrained Minimum Variance (LCMV) methods. However, when a microphone array is faced with multiple sound sources, the TDOA and beamforming approaches often fail to find the source. Hence, the other two groups of methods were developed.
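
As a rough illustration of beamforming, the sketch below computes narrowband MVDR weights, w = R^-1 a / (a^H R^-1 a), for a uniform linear array. The array geometry (8 microphones at half-wavelength spacing), the source angles, the power levels, and the helper names `steering_vector` and `mvdr_weights` are all assumptions made for this example.

```python
import numpy as np

def steering_vector(theta_deg, n_mics, spacing_wl=0.5):
    """Far-field narrowband steering vector of a uniform linear array.
    `spacing_wl` is the microphone spacing in wavelengths (assumed 0.5)."""
    m = np.arange(n_mics)
    return np.exp(-2j * np.pi * spacing_wl * m * np.sin(np.deg2rad(theta_deg)))

def mvdr_weights(R, a):
    """MVDR weights: minimize output power subject to w^H a = 1."""
    Ri_a = np.linalg.solve(R, a)          # R^-1 a without forming the inverse
    return Ri_a / (a.conj() @ Ri_a)

n_mics = 8
a_target = steering_vector(20.0, n_mics)   # direction we want to listen to
a_interf = steering_vector(-40.0, n_mics)  # direction of an unwanted source

# Assumed interference-plus-noise covariance: one strong interferer plus
# weak uncorrelated sensor noise.
R = 10.0 * np.outer(a_interf, a_interf.conj()) + 0.1 * np.eye(n_mics)

w = mvdr_weights(R, a_target)
gain_target = abs(np.vdot(w, a_target))    # |w^H a_target|
gain_interf = abs(np.vdot(w, a_interf))    # |w^H a_interf|
print(gain_target, gain_interf)
```

The target direction passes with unit gain (the "distortionless" constraint), while the interferer direction is strongly attenuated. In practice R has to be estimated from recorded data, which is where the difficulties with reverberation and multiple sources come in.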

Next, the methods using high-resolution processing, known as subspace localization methods, rely on spectral estimation and perform better than the TDOA and beamforming approaches. Common examples are Multiple Signal Classification (MUSIC), Estimation of Signal Parameters via Rotational Invariance Techniques (ESPRIT), and root-MUSIC. For reverberant environments, variants such as Recursively Applied and Projected MUSIC (RAP-MUSIC) and Self-Consistent MUSIC are further options, but they are not widely implemented.
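
The subspace idea can be sketched with a basic narrowband MUSIC pseudospectrum. The example below assumes a uniform linear array with half-wavelength spacing, a known number of sources, and synthetic data; the names `ula_steering` and `music_spectrum` are illustrative, not from a library.

```python
import numpy as np

def ula_steering(angles_deg, n_mics, spacing_wl=0.5):
    """Steering matrix (n_mics x n_angles) of a uniform linear array."""
    m = np.arange(n_mics)[:, None]
    s = np.sin(np.deg2rad(np.atleast_1d(angles_deg)))[None, :]
    return np.exp(-2j * np.pi * spacing_wl * m * s)

def music_spectrum(X, n_sources, angles_deg):
    """MUSIC pseudospectrum from snapshots X of shape (n_mics, n_snapshots)."""
    n_mics = X.shape[0]
    R = X @ X.conj().T / X.shape[1]           # sample covariance matrix
    _, eigvecs = np.linalg.eigh(R)            # eigenvalues in ascending order
    En = eigvecs[:, : n_mics - n_sources]     # noise subspace
    A = ula_steering(angles_deg, n_mics)
    proj = np.linalg.norm(En.conj().T @ A, axis=0) ** 2
    return 1.0 / (proj + 1e-12)               # peaks where A is orthogonal to En

# Synthetic scene (assumed values): two uncorrelated sources, 8 microphones.
rng = np.random.default_rng(0)
n_mics, n_snap = 8, 200
true_angles = np.array([-20.0, 30.0])
S = (rng.standard_normal((2, n_snap)) + 1j * rng.standard_normal((2, n_snap))) / np.sqrt(2)
N = 0.1 * (rng.standard_normal((n_mics, n_snap)) + 1j * rng.standard_normal((n_mics, n_snap))) / np.sqrt(2)
X = ula_steering(true_angles, n_mics) @ S + N

grid = np.arange(-90, 91)                     # 1-degree search grid
P = music_spectrum(X, 2, grid)
is_peak = (P[1:-1] > P[:-2]) & (P[1:-1] > P[2:])
peak_idx = np.flatnonzero(is_peak) + 1
top2 = peak_idx[np.argsort(P[peak_idx])[-2:]] # two strongest local maxima
estimates = np.sort(grid[top2])
print(estimates)
```

Unlike a single TDOA estimate, the pseudospectrum can show one peak per source, which is why subspace methods cope better with several simultaneous sources.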

Finally, the last approach is a fairly recent development. A new method, based on the phase information of the MUSIC spectrum, has been proposed in a journal paper for localizing very closely spaced sources with a limited number of sensors. However, because it is so new, there is not much more to report, and more work is needed before its usefulness can be assessed.

What can we expect in the future? Unlike humans, machines that use these techniques are not robust in all environments: they often fail to find the source because they assume that it is stationary or that the environment is non-reverberant. SONAR and RADAR are extremely useful navigation systems because finding vessels in a setting with low reverberation, such as underwater, is a comparatively simple procedure. However, if SONAR or RADAR were used to find an object in a glass room, the results would not be promising. These limitations need to be overcome before technology can accurately locate the origin of a sound. With the recent rise of voice assistants such as Google Assistant and Siri, there has been a lot of development in the field of speech and language processing, and that growth is bringing new methods to the source localization problem.

In this decade, machine learning is expected to ease the problem of sound source localization in almost all environments. In particular, deep learning, a subset of machine learning, has yielded some exciting results in detecting and localizing sources with networks such as SELDnet. However, at the moment, these advances are still very limited.

Written by Akhil Vasvani & Edited by Qilin Guo & Alexander Fleiss