As robots take over our world, they must learn to communicate not only with us but also with each other. Recent research has shown that two robots can communicate in a shared language made up of binary vectors. These conversations between the sender (Robot A) and the receiver (Robot B) are typically one-directional and limited to a fixed number of yes/no answers.
But CDS Master’s student Katrina Evtimova, professor Kyunghyun Cho, Andrew Drozdov (NYU Computer Science), and Douwe Kiela (Facebook) believe that our robots can do better. In “Emergent Language in a Multi-Modal, Multi-Step Referential Game,” they introduce a new conversational game for robots that mimics human communication more closely.
First, the robots in their new game are prompted to have bi-directional conversations: both robots can send messages, and they may exchange as many as they would like. Second, they must use their shared language to exchange two different modes of information: text and image.
The game unfolds something like this. The sender robot is shown a photograph of a mammal and instructed to communicate what it saw to the receiving robot. The receiver is prompted to guess which animal the sender saw by reading the textual data and then asking the sender additional questions about the photograph. This means that the conversation is not merely about moving information from one robot to another, but about exchanging different modes of information (e.g., textual and pictorial) between them. To help the robots, the researchers used neural networks and built techniques such as visual and textual attention into the game.
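To make the back-and-forth concrete, here is a minimal toy sketch of such a multi-step game in Python. It is not the researchers' model: the weights are random and untrained, attention is left out, and every name in it (`play_game`, `MSG_BITS`, the weight matrices) is a hypothetical choice for illustration. It only shows the shape of the exchange: the sender sees image features, the receiver sees text features for each candidate animal, the two trade binary messages, and the receiver decides when to stop and guess.

```python
import math
import random

MSG_BITS = 8  # length of each binary message (an arbitrary choice for this sketch)


def random_matrix(rows, cols, rng):
    """Return a rows x cols matrix of random, untrained weights."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (a list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def binarize(scores):
    """Turn real-valued scores into a binary message (1 where the score is positive)."""
    return [1 if s > 0 else 0 for s in scores]


def play_game(image, descriptions, max_steps=5, seed=0):
    """Run one toy conversation and return (guess, steps_used, transcript).

    image:        list of features seen only by the sender
    descriptions: one list of text features per candidate class, seen only by the receiver
    max_steps:    upper bound on the number of turns (must be at least 1)
    """
    rng = random.Random(seed)
    # Hypothetical parameters; a real system would learn these by playing the game.
    w_sender = random_matrix(MSG_BITS, len(image) + MSG_BITS, rng)
    w_state = random_matrix(MSG_BITS, 2 * MSG_BITS, rng)
    w_question = random_matrix(MSG_BITS, MSG_BITS, rng)
    w_text = random_matrix(MSG_BITS, len(descriptions[0]), rng)
    w_stop = [rng.uniform(-1.0, 1.0) for _ in range(MSG_BITS)]

    state = [0.0] * MSG_BITS   # receiver's running memory of the conversation
    question = [0] * MSG_BITS  # receiver's latest message; all zeros before the first turn
    transcript = []
    for step in range(1, max_steps + 1):
        # Sender: look at the image and the receiver's last message, then reply.
        answer = binarize(matvec(w_sender, image + question))
        # Receiver: fold the reply into its state and score each candidate description.
        state = [math.tanh(s) for s in matvec(w_state, answer + state)]
        scores = [dot(matvec(w_text, d), state) for d in descriptions]
        question = binarize(matvec(w_question, state))
        transcript.append((answer, question))
        # Receiver decides whether it has heard enough to stop and guess.
        if dot(w_stop, state) > 0:
            break
    guess = max(range(len(scores)), key=scores.__getitem__)
    return guess, step, transcript


# Example: three candidate animals, each with a two-number text feature vector.
image = [0.9, 0.1, 0.4]
descriptions = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
guess, steps, transcript = play_game(image, descriptions, max_steps=4, seed=1)
print("guessed class", guess, "after", steps, "turn(s)")
```

Because nothing here is trained, the guess is essentially arbitrary. The point is the structure: the conversation length is not fixed in advance, and the two agents work from different kinds of input.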
The researchers found that the multi-modal, multi-step game not only improved the robots’ predictive accuracy but also made their conversations more human-like. Will these developments bring humans and machines closer together, or will we become too close for comfort?
by Cherrie Kwok