Since the rise of the mobile phone, no public space has been safe from the sudden shock of some inconsiderate person yelling into their handset like a sailor struggling to be heard above a storm.
But new technology being developed by Facebook could finally consign such intrusions to the neglected voicemail inbox of history by giving people “auditory superpowers”.
In an upbeat blog post on Thursday, the giant social media company revealed prototype headphone software that uses artificial intelligence (AI) to amplify distant or muffled sounds while silencing background hubbub.
One system can interpret various sources of noise and adjust their volume on the fly, while another can virtually “place” sounds in different locations around the listener’s body.
An augmented reality (AR) headset using such methods would allow its wearer to hear people across long distances, make phone calls with a mere whisper or hold crystal-clear conversations on deafening dancefloors.
However, the technology currently requires specialised arrays of dozens of microphones or detailed measurements taken inside a soundless chamber, and would need to be drastically miniaturised to be viable in public.
Facebook said: “Imagine putting on a virtual reality [VR] headset or a pair of AR glasses and being transported thousands of miles away to attend class, go to work, or attend a relative’s birthday party – as if you were there in real life.
“Now imagine that same pair of AR glasses takes your hearing abilities to an entirely new level and lets you hear better in noisy places, like restaurants, coffee shops, and concerts.”
The prototypes are part of Facebook’s ongoing attempt to dominate the nascent industry of AR and VR, which chief executive Mark Zuckerberg believes to be the “next platform” of online social life.
The company boasts that it can use advanced hardware and AI to restore some of the sense of “presence” and “connection” that is lost in traditional video and phone calls.
While AR evangelists have traditionally focused on enhancing people’s sight, much of the medium’s current success has come via sound, which can transform someone’s experience of their surroundings without the difficult technical challenge of making AI understand physical space.
Facebook first built “spatialised audio” into its Oculus VR headsets in 2017, but its new system analyses the unique shape of users’ ears to create a personalised model of how they experience sound.
Another AI program separates out all the different sources of noise in a space and then tracks the user’s head and eye movements to guess which ones they want to focus on. That creates an auditory “spotlight” effect that boosts the volume of whatever the user is looking at, which Facebook described as “magic made real”.
The two technologies could work together: one person could unobtrusively murmur into their microphone in a crowded hospital waiting room, confident that the person on the other end can hear them perfectly well even while loudly mowing the lawn.
In theory, such AI could also be trained to treat specific voices differently, or dampen and enhance specific noises. That could be a boon to people with hearing problems or conditions such as autism and post-traumatic stress disorder, which make them sensitive to certain noises.
Perhaps it would also be of interest to anyone who feels obliged to nod along to a loud, boorish friend – or, more innocently, couples who have to work from home, Zoom meetings and all, in the same cramped flat.
Facebook admitted that the technology carries severe privacy and security risks, both to the person wearing the headset and to the people around them.
In the past, the company’s Portal smart home devices have been hobbled by its history of privacy violations, finding little purchase among consumers even when they are more secure than competing products.
Existing AI also frequently suffers from race and gender bias, such as face recognition systems that misidentify black people as criminals or smartphone cameras that don’t know how to process darker skin.
Super-hearing AI trained on the wrong data could suffer similar glitches: perhaps mistaking female voices for background noise more often than male voices, or erroneously dampening strong foreign accents because most of the user’s friends and colleagues speak with British ones.