NTT realizes the world's first interactive AI that can talk to drivers: Able to hold natural conversations involving topics like changing scenery

NTT has developed an AI that can have natural conversations with the driver regarding topics like the changing scenery outside a car window. Rather than simply answering the driver's questions, the AI provides an entirely new experience, acting as a driving partner that naturally responds to what the driver is saying and provides information that may interest them at that time. NTT presented the results of its R&D at the Communication Science Laboratories OPEN HOUSE 2022, which was held online on June 2 and 3.

Demonstration of NTT's driving conversation AI that talks with the driver
Provided by NTT

Most conventional dialogue-based systems only use text-based information obtained from what the user is saying and cannot incorporate real-time imagery and information about their surroundings. The dialogue-based AI that NTT developed uses a deep learning-based large-scale text dialogue model built in-house. The model was additionally trained using dialogue and location data from driving. This innovation enables a more natural conversation based on visible imagery such as scenery and related external knowledge.

For example, if the AI notices a café, it could say, "This café is so stylish," to which the driver might reply, "It sure is." The AI may then respond, "It's so romantic to drink coffee while looking at the ocean." The driver may then reply, "You sure have cool taste," to which the AI might respond, "Haha, you think so?" and so on. NTT describes it as the world's first dialogue system of its kind: unlike conventional systems, it responds with knowledge and empathy, as a human would.

To realize this new dialogue-based AI, the company used a deep learning-based dialogue model trained on what it describes as the world's most extensive dialogue data set. This large-scale text dialogue model delivers exceptionally high performance and was built using 2.1 billion conversation pairs collected from social media. According to NTT, its size makes it the largest dialogue model in the world. It achieves complex contextual understanding and natural speech generation at a radically different level from conventional dialogue models, which are based on rules, dependency relations, and other statistical information.

However, because such dialogue models take only text as input, they struggled to hold conversations that incorporate the surrounding situation, even though they achieve very natural text-based conversations. NTT therefore developed technology to feed information about objects in images and about the area around the user into the large-scale dialogue model, enabling it to produce dialogue that incorporates such information. By training this model with dialogue data collected while driving, the company was able to generate speech based on scenery and location information.
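The article does not say how the object and location information is actually fed into the model. As a rough illustration only, one simple way to give a text-only dialogue model access to such information is to serialize it into text ahead of the recent dialogue turns. The sketch below assumes that approach; `SceneContext`, `build_model_input`, the `[objects]` and `[places]` tags, and the place name are all made up for this example and are not NTT's method.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SceneContext:
    """Hypothetical container for what is around the car right now."""
    objects: List[str]  # labels from an object detector run on the camera image
    places: List[str]   # names of nearby points of interest from location data


def build_model_input(scene: SceneContext, history: List[str], max_turns: int = 4) -> str:
    """Serialize scene information and recent dialogue turns into one text prompt."""
    recent = history[-max_turns:] if max_turns > 0 else []
    lines = [
        "[objects] " + ", ".join(scene.objects),  # assumed tag format
        "[places] " + ", ".join(scene.places),    # assumed tag format
    ]
    lines.extend(recent)
    return "\n".join(lines)


# Usage with invented data: the prompt would then be passed to a text dialogue model.
scene = SceneContext(objects=["cafe", "ocean"], places=["Seaside Cafe"])
history = ["AI: This cafe is so stylish.", "Driver: It sure is."]
prompt = build_model_input(scene, history)
print(prompt)
```

A model additionally trained on driving dialogues formatted this way could, in principle, learn to mention the listed objects and places in its replies.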

Furthermore, since the driver's position changes continuously while driving, the AI needs to understand which imagery and location information the driver is talking about and to bring up newly input information at an appropriate time.

To enable this, NTT developed technologies that, for sequentially input images, estimate which image is being discussed based on the dialogue context and the speech intensity regarding the topic. By appropriately incorporating timing controls into these technologies, NTT was able to meet this requirement and realize an AI that can chat about scenery and its surroundings.
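The two requirements described here, working out which recent image the driver is referring to and deciding when to raise something new, can be illustrated with a deliberately simple sketch. It substitutes plain word overlap for NTT's estimation technology and a fixed silence threshold for its timing control, neither of which is described in the article; `Frame`, `pick_discussed_frame`, `should_start_topic`, and the 3-second default are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Frame:
    timestamp: float  # seconds since the drive started
    labels: Set[str]  # hypothetical object labels detected in this frame


def pick_discussed_frame(frames: List[Frame], utterance: str) -> Optional[Frame]:
    """Return the frame whose labels best overlap the utterance, newest on ties."""
    words = set(utterance.lower().replace(",", " ").replace(".", " ").split())
    best: Optional[Frame] = None
    best_score = 0
    for frame in frames:  # frames are assumed to be ordered oldest to newest
        score = len(frame.labels & words)
        if score > 0 and score >= best_score:
            best, best_score = frame, score
    return best  # None when the utterance matches no recent frame


def should_start_topic(now: float, last_speech_end: float, min_silence: float = 3.0) -> bool:
    """Timing control: only raise new scenery after a pause in the conversation."""
    return now - last_speech_end >= min_silence


# Usage with invented data.
frames = [Frame(10.0, {"cafe", "ocean"}), Frame(20.0, {"bridge"})]
discussed = pick_discussed_frame(frames, "That cafe by the ocean looks nice.")
print(discussed.timestamp if discussed else None)
print(should_start_topic(now=24.0, last_speech_end=20.0))
```

A real system would likely replace the word-overlap score with a learned model over the dialogue context, but the overall flow, scoring recent frames and gating new topics on timing, is the part this sketch is meant to show.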

In the future, NTT intends for the AI to become a daily driving partner. The company will continue to conduct demonstration tests using vehicles and solutions such as VR, with the aim of applying the technology to daily conversations, helping prevent drowsy and distracted driving, and realizing a voice navigator that can search for information through natural conversation.

This article has been translated by JST with permission from The Science News Ltd. (https://sci-news.co.jp/). Unauthorized reproduction of the article and photographs is prohibited.
