Apple’s Vision Pro headset enables spatial FaceTime calls with scanned avatars that use AI to mimic the appearance and body language of the wearer.
When Apple announced its new VR/AR headset, the Apple Vision Pro, the company showed off not only entertainment and productivity applications but also a host of social apps. FaceTime calls are likely to become a major application, just as they are elsewhere in Apple’s ecosystem.
Apple has developed special spatial avatars for the Vision Pro. A user who wears the headset during a FaceTime call appears to others as a three-dimensional, animated avatar. Users scan their face and hands beforehand so that this “persona” convincingly mimics their appearance, facial expressions, and gestures.
FaceTime calling in mixed reality
Using a technology Apple calls neural scanning, an AI algorithm identifies and reproduces the characteristic movements of the “persona” avatar during a meeting, giving participants the impression that they are interacting with the 3D avatar in a relatively authentic, natural way. To the headset user, the other participants appear on tiles floating in the space around them. Because the Vision Pro is a mixed reality headset with video passthrough, it can also integrate the outside world into virtual conversations, and vice versa.
Apple also designed the video tiles to make conversation feel as natural as possible: they are life-size, so a call feels more like a conversation in a room than on a small screen. Meanwhile, 3D audio ensures that each participant’s voice comes from where that participant is standing and talking.
The headset also has integrated earphones with side-firing slits, intended to make spatial sound more realistic. It even adapts the spatial sound to the current room: its numerous sensors detect the room’s conditions and adjust the surround mix accordingly.
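Conceptually, spatial audio of this kind maps each participant’s position relative to the listener onto per-ear volume levels. The following is a minimal, hypothetical Python sketch of that idea (inverse-distance attenuation plus an equal-power stereo pan); it is an illustration of the general technique, not Apple’s actual audio pipeline, and all names in it are invented for this example:

```python
import math

def spatial_gains(listener, source, ref_dist=1.0):
    """Toy spatial-audio model: inverse-distance attenuation plus
    simple left/right panning from the source's horizontal angle.
    Positions are (x, y) in meters; +x is to the listener's right,
    +y is straight ahead."""
    dx = source[0] - listener[0]
    dy = source[1] - listener[1]
    dist = max(math.hypot(dx, dy), ref_dist)
    attenuation = ref_dist / dist        # half the level at 2 m, etc.
    angle = math.atan2(dx, dy)           # 0 = ahead, +pi/2 = hard right
    pan = math.sin(angle)                # -1 = full left, +1 = full right
    left = attenuation * math.sqrt((1 - pan) / 2)   # equal-power pan law
    right = attenuation * math.sqrt((1 + pan) / 2)
    return left, right

# A participant 2 m straight ahead: equal gain in both ears, attenuated.
print(spatial_gains((0, 0), (0, 2)))
# A participant 2 m to the right: sound comes almost entirely from the right.
print(spatial_gains((0, 0), (2, 0)))
```

A real renderer would additionally apply head-related transfer functions and, as described above, room-dependent reverberation, but the core mapping from position to per-ear level is the same.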
Source: Apple WWDC 2023