Cognitive Architectures for Virtual Avatars


SHARESPACE utilizes Virtual Humans (hereafter referred to as avatars) to enable embodied collaboration and interaction in Social Hybrid Spaces (SHS). Up to this point, interaction in digital spaces is nowhere near as seamless as real-life communication due to the heavy loss of many physical subtleties. The academic research in SHARESPACE aims to identify social cues in movements (sensorimotor primitives), and wishes to explore if digital communication can be improved if we amplify these sensorimotor primitives by utilizing AI technology.

SHARESPACE plans to utilize three different categories of avatars (L1, L2, L3), each with their own AI-driven Cognitive Architecture (CA). The exact category is determined by the properties of the CA, which include the level of intelligence and autonomy of the avatar and the level of amplification of its movements. The three categories are:

  • L0: A human in a physical space.
  • L1: These avatars are simple representations of the user’s body in the virtual world (L0). They directly replicate the user’s movement without autonomy or amplification.
  • L2: These avatars are more intelligent than L1 avatars. They can readout and understand their users’ movements and perform amplification of motion when they identify relevant sensorimotor primitives. The goal of the amplification is to make the social cues more readable for others in the SHS. These avatars still lack autonomy, and they rely on the signals given by the user. 
  • L3: These avatars are the most intelligent and autonomous. They can readout and understand other avatars’ movements. They can also learn and adapt to their environment to make their own decisions. The key objective of these kinds of avatar is to perform a given task to improve human interactions in VR spaces. 

The level of intelligence and autonomy of a virtual reality avatar can have a significant impact on the user’s experience. L1 avatars are the most basic, but they can be useful for simple tasks. L2 avatars are more engaging, and are better suited to achieve a better transmission of intentions through movements in VR. Eventually,  L3 avatars offer the potential for a truly lifelike virtual experience with completely AI driven characters which can engage in group tasks and improve synchronization between individuals. 

Humans and Avatars interacting in SHS, Credit: DFKI

Cognitive Architectures

Work package 5 has three main objectives. The first one is to design the cognitive architectures of the avatars using a combination of physics-informed and control-based deep reinforcement learning algorithms. These architectures will drive the virtual avatars with increasing levels of autonomy, optimizing different aspects in specific scenarios. The second goal is to develop a cloud collaborative platform that can host and manage avatars driven by the SHARESPACE architecture. Finally, the cognitive architectures will be implemented and tested in various application scenarios to refine their designs. 

The communication and management platform to support and drive the avatars in SHARESPACE will be based on the technologies and functionalities of Rainbow. This platform will be capable of hosting and managing avatars as multimedia objects for communication and interaction. The platform will handle different modes and support VR and AR devices, making it suitable for future industrial and commercial use.