Ego-Centric Visual-Inertial Body Tracking, Scene Capturing, and XR Display


This work package (WP3) creates the interface between the human, in their natural environment, and the virtual world of the Shared Hybrid Space (SHS) by developing new approaches for capturing full-body pose, body motion, hand motion, gaze, and voice, as well as the 3D surroundings of the person. Moreover, it develops innovative ways to display the digital world to each user connected to the SHS.


Overall, the technology researched and developed in this work package aims to push the state of the art in mobile, easy-to-use capturing approaches. These should deliver accurate and reliable capturing results of sufficiently high fidelity. The aim is to capture all body signals that are relevant for virtual social interaction, to reconstruct and represent the natural environment quickly and with sufficient quality, and to display the digital world such that it integrates naturally with the physical surroundings of each user of the SHS technology.
 
The developments can be categorized into three aspects: 1) capturing of the human body, 2) capturing of the scene, and 3) display via XR glasses.

The first aspect is dedicated to the fusion of single-camera and wearable-sensor-based human pose capturing. The camera can either be mounted on the body (ego-centric perspective) or in the environment (third-person perspective). The focus is on high usability for non-expert users, while optimizing the reliability of the results in different surroundings and situations (e.g., under severe occlusion). The approach targets applications in sports, health (therapy), and the arts. The captured information will be used to obtain sensorimotor primitives of social interaction (WP2), as input to the interaction with other users and avatars via the cognitive architecture (WP5), and to render the human motion in a virtual scene (WP4). The overall technology will be exploited in the defined application scenarios (WP6).
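To illustrate the kind of camera/wearable-sensor fusion described above, the following is a minimal sketch of a complementary filter that blends an integrated gyroscope rate (smooth but drifting) with an absolute camera-based angle estimate (drift-free but noisy or intermittently occluded). The function name and the blending coefficient `alpha` are illustrative assumptions, not the project's actual method.

```python
def fuse_orientation(cam_angle, gyro_rate, prev_angle, dt, alpha=0.98):
    """Complementary filter for one joint angle (radians).

    cam_angle:  absolute angle estimate from the camera (drift-free, noisy)
    gyro_rate:  angular velocity from the wearable IMU (rad/s)
    prev_angle: fused estimate from the previous time step
    dt:         time step in seconds
    alpha:      trust in the gyro integration vs. the camera (illustrative value)
    """
    # Integrating the gyro gives a smooth short-term estimate that drifts.
    gyro_angle = prev_angle + gyro_rate * dt
    # The camera measurement slowly pulls the estimate back, correcting drift.
    return alpha * gyro_angle + (1.0 - alpha) * cam_angle
```

During severe occlusion, the same structure degrades gracefully: the camera term can be dropped (alpha set to 1.0) so the estimate coasts on the IMU until the camera reacquires the body.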

The second aspect revolves around fast 3D scene capturing via 360-degree cameras. Two approaches are followed: 1) 3D digitalization of large environments via multiple cameras that together deliver a 360-degree view, with capturing optimized for VR scene rendering; 2) a new 360-degree depth camera exploited for quick and precise indoor reconstruction. Both approaches will be exploited for rendering in WP4 and the scenarios in WP6.
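As a rough illustration of how a 360-degree depth camera supports quick indoor reconstruction, the sketch below back-projects an equirectangular depth map into a 3D point cloud. The equirectangular parameterization and the function name are assumptions for illustration; the actual camera model and pipeline are determined by the hardware used in the project.

```python
import numpy as np

def equirect_depth_to_points(depth):
    """Back-project an equirectangular depth map to 3D points.

    depth: (H, W) array of metric distances from the camera center,
           covering 360 deg horizontally and 180 deg vertically.
    Returns an (H, W, 3) array of XYZ coordinates.
    """
    h, w = depth.shape
    # Pixel centers mapped to longitude [-pi, pi) and latitude (-pi/2, pi/2).
    lon = (np.arange(w) + 0.5) / w * 2.0 * np.pi - np.pi
    lat = np.pi / 2.0 - (np.arange(h) + 0.5) / h * np.pi
    lon, lat = np.meshgrid(lon, lat)
    # Spherical-to-Cartesian conversion scaled by the measured distance.
    x = depth * np.cos(lat) * np.cos(lon)
    y = depth * np.cos(lat) * np.sin(lon)
    z = depth * np.sin(lat)
    return np.stack([x, y, z], axis=-1)
```

A single such capture yields a full panoramic point cloud of a room; registering a handful of captures is what makes the "quick" indoor reconstruction plausible.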

The third aspect focuses on XR glasses with integrated eye tracking and hand capturing. The XR glasses offer advanced spatial awareness, since the so-called vergence-accommodation conflict is removed. This makes it possible to augment the physical surroundings with digital 3D content in a natural way. The XR glasses will be combined with the mobile human body tracking to enable consistent whole-body and hand tracking, together with eye tracking and voice capturing. These tasks are combined with the rendering from WP4 and the application scenarios (WP6).