This Cool Robot Gives a Warm Welcome
The FurChat social robot leverages LLMs and an expressive robot face to act as a receptionist and naturally interact with people.
Advances in artificial intelligence (AI) and robotics have ushered in a new era of automation, with robots taking on roles and functions that extend far beyond their traditional applications in industry. These technological breakthroughs are paving the way for a new generation of robots that are more versatile, adaptable, and capable of contributing to various aspects of our daily lives.
One of the most notable developments is the rise of social robots. These robots are designed to interact with humans in natural and meaningful ways. They can serve as companions for the elderly, providing emotional support and assistance with daily tasks. Social robots are also being used in education, helping children with learning disabilities or language development. Moreover, they have found applications in customer service, enhancing the efficiency and personalization of interactions in sectors such as retail and hospitality. By leveraging AI-driven natural language processing and computer vision, social robots are becoming increasingly sophisticated in their ability to understand and respond to human emotions and intentions.
To be useful in specific applications, social robots must combine the open-domain dialogue of modern large language models (LLMs) for general conversation with special knowledge relevant to their application area. Moreover, to go beyond simply providing information and create a more natural conversational experience, these robots should be capable of emotive facial expressions to help convey ideas and better connect with users.
All of these capabilities already exist individually and are quite advanced, but they had not been combined into a single, polished system until FurChat came along. Developed by engineers at Heriot-Watt University and Alana AI, the FurChat system leverages a Furhat social robot and the GPT-3.5 LLM to create a robotic receptionist that can naturally interact with humans, answering their questions while exhibiting appropriate facial expressions.
The Furhat robot projects an image onto a three-dimensional mask to mimic a human face. Head and neck movements are achieved through the integration of a motorized platform. Both a microphone array and speakers are included to allow for interactions with humans.
When a person interacts with this robot, a speech-to-text service converts their spoken words into a string of text. The text is then forwarded to a dialogue manager, which uses a natural language understanding model to determine the user’s intent. That intent is used to generate an appropriate text prompt, which is sent to an LLM. Since LLMs can sometimes hallucinate and provide inaccurate information, domain-specific information stored in a database is first matched with the user’s intent and included in the LLM prompt.
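To make the prompt-building step more concrete, here is a minimal Python sketch of how a recognized intent might be matched against stored facts before the prompt goes to the LLM. It is not FurChat's actual code: the keyword matcher stands in for a trained natural language understanding model, and the names `FACTS`, `INTENT_KEYWORDS`, `detect_intent`, and `build_prompt`, along with the placeholder database entries, are all assumptions made for illustration.

```python
import re
from typing import Optional

# Hypothetical stand-in for the domain database. The entries are
# placeholders, not real facility data.
FACTS = {
    "events": "PLACEHOLDER: list of upcoming events.",
    "facility": "PLACEHOLDER: description of the facility.",
}

# Hypothetical keyword sets standing in for a trained NLU model.
INTENT_KEYWORDS = {
    "events": {"event", "events", "schedule"},
    "facility": {"building", "facility", "labs"},
}


def detect_intent(utterance: str) -> Optional[str]:
    """Return the first intent whose keywords appear as whole words."""
    words = set(re.findall(r"[a-z]+", utterance.lower()))
    for intent, keywords in INTENT_KEYWORDS.items():
        if words & keywords:
            return intent
    return None


def build_prompt(utterance: str) -> str:
    """Assemble an LLM prompt, adding stored facts when an intent matches."""
    lines = ["You are a friendly robot receptionist."]
    intent = detect_intent(utterance)
    if intent is not None:
        # Grounding the prompt in stored facts is meant to reduce hallucination.
        lines.append(f"Use only this information when relevant: {FACTS[intent]}")
    lines.append(f"Visitor: {utterance}")
    lines.append("Receptionist:")
    return "\n".join(lines)
```

Calling `build_prompt("What events are on today?")` would produce a prompt that carries the stored events entry, while a question that matches no intent falls through to open-domain conversation with no extra facts attached.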
The response from the LLM is processed by a text-to-speech service, then played on the Furhat robot’s speakers. Simultaneously, the ability of recent GPT models to recognize emotions and sentiments is used to select appropriate expressions, from a pre-developed set of gestures, for the robot face to display.
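One way the expression step could work is sketched below in Python. The article does not say how the emotion reaches the robot, so this assumes the LLM is prompted to begin each reply with a bracketed emotion tag such as `[happy]`. The tag format, the gesture names in `GESTURES`, and the `split_reply` function are all hypothetical and are not taken from FurChat or the Furhat SDK.

```python
import re
from typing import Tuple

# Hypothetical fixed gesture set; the names are illustrative only.
GESTURES = {
    "happy": "smile",
    "sad": "frown",
    "surprised": "raise_brows",
    "neutral": "neutral_face",
}

# Assumed convention: the reply starts with a tag like "[happy]".
TAG = re.compile(r"^\s*\[(\w+)\]\s*(.*)$", re.DOTALL)


def split_reply(reply: str) -> Tuple[str, str]:
    """Return (gesture, spoken_text) for a possibly tagged LLM reply."""
    match = TAG.match(reply)
    if match is None:
        # No tag found: speak the whole reply with a neutral face.
        return GESTURES["neutral"], reply.strip()
    label = match.group(1).lower()
    # Unknown labels fall back to neutral, so only gestures from the
    # fixed set are ever requested.
    gesture = GESTURES.get(label, GESTURES["neutral"])
    return gesture, match.group(2).strip()
```

Here the spoken text would go to the text-to-speech service while the gesture name is sent to the face at the same time. Restricting output to a fixed lookup table keeps the robot from being asked for an expression it cannot perform.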
The team tested their system, appropriately enough, as a receptionist at the UK National Robotarium in Scotland. There, FurChat was tasked with giving visitors information about the facility and upcoming events, and with answering any other questions that might come up. Observations of the interactions showed that the robot seemed to be quite capable of providing accurate information and communicating naturally with visitors.
At present, FurChat is only capable of one-on-one interactions with humans, but the team is exploring the possibility of supporting multi-party conversations. They are also looking into other venues where FurChat could be deployed, such as museums or festivals.