How close is humanity to creating AI? Machine learning’s remarkable progress in object and speech recognition, video games, and other areas has been broadly covered by news media, and it would seem the AI dream is fast becoming reality. However, are these machines learning to truly think like humans?
Brenden Lake, Moore-Sloan Data Science Fellow at NYU’s Center for Data Science, published the paper “Building Machines That Learn and Think Like People” in May. He said that while machines have equaled or surpassed humans on many important benchmarks, current systems like deep neural networks differ from human intelligence in crucial ways. Many contemporary deep learning models take a pattern-recognition approach, finding patterns in data to achieve an objective. Building machines that use a model-based approach to learning, which Lake said is a cornerstone of human thought, is the key to more human-like machines that not only predict and solve problems, but also understand and explain their solutions.
Lake’s paper put forward several crucial ingredients for human-like thinking that could be integrated into machines. The first set of ingredients was what he termed “developmental start-up software,” like intuitive physics and psychology present in human infants; the second set comprised tools for model-building, such as compositionality and “learning-to-learn.” Lake used two examples to demonstrate the current differences in human and machine learning. The “Characters Challenge” tests a subject’s recognition of handwritten characters, while the “Frostbite Challenge” tests a subject’s ability to learn and master a game – in this case, the Atari game “Frostbite.”
The “Characters Challenge,” Lake said, demonstrated the vast difference between pattern recognition and model building. Given the task of classifying images of digits into the categories “0” to “9,” humans and neural networks performed equally well. But Lake pointed to two crucial differences: people learned from fewer examples, and they learned what he termed “richer representations.” Humans could recognize a new character from a single example. Beyond pattern recognition, they also learned a model that let them apply their knowledge in new ways, such as generating new examples of the character, breaking it down into its most important parts, and even generating new characters based on a set of related characters. None of this would be possible for a neural network using a pattern-recognition approach.
“When people learn a new letter, such as a letter in a foreign alphabet, they gain many abilities at the same time. People learn a model for a new character that is flexible enough to support all of these tasks,” Lake said.
Lake said the “Frostbite Challenge” also exposed the gulf that still exists between natural intelligence and machine intelligence. Google DeepMind’s “Deep Q-Network” (DQN) was the system trained to play “Frostbite,” in which the player’s avatar must construct an igloo by jumping on ice floes in the water. The player earns extra points by gathering fish while avoiding fatal hazards like polar bears.
While the DQN learned to play “Frostbite” at a human level of performance, it required far more experience to get there. The DQN was compared to a professional gamer who received two hours of practice on the game; the DQN was trained on 200 million frames of the game – approximately 924 hours, or almost 500 times as much as the human gamer. The DQN initially achieved less than 10% of human-level performance; revised variants eventually reached 96% of human performance, but still required a disproportionate amount of experience compared to the human gamer.
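The comparison above comes down to simple arithmetic, which can be sketched as follows. The article gives the frame count, the 924-hour figure, and the two hours of human practice; the frame rate of 60 frames per second is an assumption made here for illustration and is not stated in the article.

```python
# Rough check of the experience gap described above.
FRAMES_TRAINED = 200_000_000      # from the article
ARTICLE_HOURS = 924               # from the article ("approximately 924 hours")
HUMAN_PRACTICE_HOURS = 2          # from the article
ASSUMED_FPS = 60                  # assumption: not stated in the article


def frames_to_hours(frames: int, fps: int) -> float:
    """Convert a number of game frames to hours of play at a given frame rate."""
    seconds = frames / fps
    return seconds / 3600


# Hours of play implied by the frame count, under the assumed frame rate.
dqn_hours = frames_to_hours(FRAMES_TRAINED, ASSUMED_FPS)

# How many times more experience the DQN had than the human gamer.
ratio_from_frames = dqn_hours / HUMAN_PRACTICE_HOURS
ratio_from_article = ARTICLE_HOURS / HUMAN_PRACTICE_HOURS

print(f"Hours implied by frame count: {dqn_hours:.0f}")
print(f"Ratio using implied hours:    {ratio_from_frames:.0f}x")
print(f"Ratio using article's hours:  {ratio_from_article:.0f}x")
```

Under that assumed frame rate, the frame count works out to roughly 926 hours, close to the article's figure, and either number gives a ratio of a little over 460 to 1, consistent with “almost 500 times.”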
“Learning a new video game also speaks to the flexibility of our model-building capabilities. When you learn to play a new video game, you usually aim to get the highest score. But you can also change your goal and still play effectively,” Lake said. He listed examples such as collecting as many fish as possible, or teaching a friend how to play, as ways of playing that differ from simply pursuing the highest score. In contrast, Lake said, the DQN was inflexible to changes in its inputs and goals; it would not be able to play the game with a different objective without extensive retraining.
Lake said he was intrigued by the differences in performance between humans and machines on the two challenges, and that they hinted at fundamental differences in how humans and machines learn.
Lake said integrating his recommended developmental start-up software, like intuitive physics and psychology, could produce more powerful learning and thinking abilities in machines. Intuitive physics refers to physical concepts present in human infants as young as 2 months; for example, infants expect objects to follow principles of persistence (objects exist continuously and do not wink in and out of existence) and solidity (objects do not inter-penetrate), among others. Lake said these concepts may be used to solve everyday physics-related tasks, including games like “Frostbite.”
Lake said integrating an intuitive physics engine could enable a machine intelligence to adapt to a spectrum of scenarios; a physics-engine reconstruction of a game of “Jenga” might be used to predict whether and how a tower will fall, as well as to capture human-like hypothetical scenarios such as “what would happen if certain blocks are taken away or more blocks are added?”
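The idea of asking “what if” questions of a physical model can be illustrated with a toy sketch. This is not the physics engine discussed in Lake's paper, and it is not a faithful model of “Jenga.” It assumes a hypothetical single-column stack of uniform blocks described only by their left and right edges, and uses a static center-of-mass test in place of real dynamics. All names and values are illustrative.

```python
from typing import List, Tuple

# Hypothetical representation: one block per level, listed bottom first,
# each given as (left edge, right edge). Mass is proportional to width.
Block = Tuple[float, float]


def center_of_mass(blocks: List[Block]) -> float:
    """Horizontal center of mass of a group of uniform blocks."""
    total_mass = sum(right - left for left, right in blocks)
    weighted = sum((right - left) * (left + right) / 2 for left, right in blocks)
    return weighted / total_mass


def is_stable(tower: List[Block]) -> bool:
    """Static check: everything above each block must balance over that block."""
    for level in range(len(tower) - 1):
        left, right = tower[level]
        com_above = center_of_mass(tower[level + 1:])
        if not (left <= com_above <= right):
            return False
    return True


# A small leaning tower: each block is shifted 0.5 to the right of the one below.
tower = [(0.0, 3.0), (0.5, 3.5), (1.0, 4.0)]

# "What if more blocks are added?" Keep leaning the stack the same way.
what_if_added = tower + [(1.5, 4.5), (2.0, 5.0), (2.5, 5.5), (3.0, 6.0)]

# "What if certain blocks are taken away?" Remove the top two again.
what_if_removed = what_if_added[:-2]

print("original tower stable:   ", is_stable(tower))
print("after adding four blocks:", is_stable(what_if_added))
print("after removing top two:  ", is_stable(what_if_removed))
```

The point of the sketch is that one model answers several different questions. The same stability check handles the original tower and each hypothetical variation, with no retraining between them.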
Lake said in his paper that intuitive psychology is an expectation in humans, whether learned or innate, that entities will act in a goal-oriented and efficient manner. He used the example of people learning to play “Frostbite” by first watching an experienced player and then playing themselves; he said that “intuitive psychology provides a basis for efficient learning from others… in the case of watching an expert play ‘Frostbite,’ intuitive psychology lets us infer the beliefs, desires, and intentions of the experienced player.” Lake said that incorporating intuitive psychology into deep learning systems could change the way they learn to play games, bringing their learning curve closer to that of humans.
In conclusion, Lake said that deep learning and other machine intelligences still have a long way to go before matching natural intelligence, which he said is still the best example of intelligence.
“This is a higher bar worth reaching for, potentially leading to more powerful algorithms while also helping unlock the mysteries of the human mind,” Lake said.