Apple has published a new entry in its Machine Learning Journal offering in-depth technical detail on how Siri on the HomePod was designed to hear and understand a user's voice in the larger spaces where the HomePod is intended to operate. Titled "Optimizing Siri on HomePod in Far‑Field Settings," the paper explains how Siri on HomePod had to handle "challenging usage scenarios": users standing much farther away than they typically would be from an iPhone, loud music playback from the HomePod itself, and competing sound sources in the room such as a TV or household appliances. Apple goes on to outline how the HomePod's six microphones and the multichannel signal processing system built into its A8 chip work together to adapt to constantly changing conditions while still ensuring that Siri can hear the person speaking and respond appropriately. Machine learning is employed throughout the signal processing chain to improve common stages like echo cancellation and noise reduction, making Siri more reliable across a wide variety of environments.
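Apple doesn't spell out its microphone-array processing in this summary, but a classic way a six-microphone array can focus on a distant talker is delay-and-sum beamforming: align each channel by the delay the sound takes to reach that microphone, then average. The sketch below is a generic textbook illustration of that idea, not Apple's documented method; the function name and integer-sample delays are assumptions for clarity.

```python
import numpy as np

def delay_and_sum(signals, delays_samples):
    """Generic delay-and-sum beamformer sketch (not Apple's actual
    pipeline): time-align each microphone channel by its integer
    sample delay toward the target direction, then average.

    `signals` has shape (n_mics, n_samples); `delays_samples` gives
    each mic's arrival delay relative to the reference mic."""
    n_mics = signals.shape[0]
    out = np.zeros(signals.shape[1])
    for sig, d in zip(signals, delays_samples):
        out += np.roll(sig, -d)  # advance the channel to undo its delay
    return out / n_mics
```

After alignment, the target speech adds coherently across the six channels while uncorrelated room noise partially cancels, which is why averaging improves the signal-to-noise ratio in the steering direction.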
A Multichannel Echo Cancellation (MCEC) system effectively "cancels out" music playing from the HomePod itself, since the processor knows exactly what audio the speaker is producing and can subtract its estimate from the microphone signals. Because that cancellation is never perfect, the article also describes a mask-based residual echo suppressor (RES) that uses machine learning to model what MCEC misses, such as mechanical vibrations of the HomePod's enclosure and signal variations introduced by the beamforming speaker array. The paper further explains the techniques used to filter out reverberation when commands are issued from across the room, and mask-based noise reduction that suppresses background noise from sources like home appliances, heating and air conditioning systems, and outdoor sounds entering through windows. Additional technical details are provided on how Apple implemented each of these strategies, and the paper concludes by outlining the real-world configurations and conditions in which the HomePod was tested, along with figures on system performance and audio samples recorded under various conditions, both before and after processing by the various machine-learning algorithms. [via MacRumors]
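The core idea behind echo cancellation with a known playback signal can be illustrated with a textbook normalized-LMS (NLMS) adaptive filter: the filter learns the speaker-to-microphone echo path and subtracts its estimated echo from the mic signal. This single-channel sketch is a simplified stand-in, assuming names and parameters of my own choosing; Apple's MCEC is multichannel and considerably more sophisticated.

```python
import numpy as np

def nlms_echo_cancel(mic, ref, taps=64, mu=0.5, eps=1e-8):
    """Single-channel NLMS echo canceller sketch (a simplified
    illustration, not Apple's multichannel MCEC).

    `ref` is the known playback signal; the adaptive FIR filter `w`
    learns the speaker-to-mic echo path and the function returns the
    residual after subtracting the estimated echo from `mic`."""
    w = np.zeros(taps)        # adaptive estimate of the echo path
    x = np.zeros(taps)        # sliding window of recent ref samples
    out = np.zeros(len(mic))  # residual: mic minus estimated echo
    for n in range(len(mic)):
        x = np.roll(x, 1)
        x[0] = ref[n]
        e = mic[n] - w @ x                # error = mic minus predicted echo
        w += mu * e * x / (x @ x + eps)   # normalized LMS update
        out[n] = e
    return out
```

In a noiseless simulation where the mic signal is just the playback passed through a short echo path, the residual power drops by orders of magnitude once the filter converges; the mask-based RES step described in the article then deals with whatever nonlinear residue (vibration, beamforming variation) a linear filter like this cannot model.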