At last Thursday’s Text as Data seminar, Professor Hong Yu of the University of Massachusetts Medical School described the work that she and her colleagues have done to improve deep learning by building new models that incorporate human intelligence and cognitive functions.
For instance, the published paper “Building an Evaluation Scale Using Item Response Theory,” by Hong Yu, John P. Lalor and Hao Wu, presents an Item Response Theory (IRT) model that compares the performance of NLP systems with that of a human population.
Once the new deep learning models are built, Yu evaluates whether they learn the way the human brain does. Humans learn gradually, beginning with easy content and working up to more difficult content. “No human first understands an Albert Einstein equation before understanding basic physics,” said Yu. A traditional machine learning model can solve difficult problems before it can solve easy ones, but that is a sign of memorization or random chance rather than of machine intelligence. Yu and her colleagues found that deep learning models learn like humans, taking less time on easy problems than on difficult ones, and are therefore considered “intelligent” systems.
IRT uses item difficulty as a gauge for comparing system performance. The method is widely used in psychometrics and psycholinguistics to construct tests such as the much-dreaded SAT, GRE and other aptitude tests taken from childhood onward, and it has been thoroughly tested on human populations. The same methodology is applied to evaluate machine learning models. Using a thousand humans to generate intelligent question sets that separate NLP systems by performance, the researchers found that a computer system with an impressive 97% accuracy in fact ranked in only the 44th percentile when IRT was used as the evaluation metric, meaning there is a 56% chance that a human would outperform the computer.
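To make the idea of difficulty-based scoring concrete, here is a minimal Python sketch of one common IRT formulation, the two-parameter logistic (2PL) model. It is an illustration only: the talk summary does not say which IRT variant Yu's group used, the function names and item parameters are hypothetical, and the percentile conversion assumes that ability scores in the human population follow a standard normal distribution.

```python
import math


def p_correct(theta, difficulty, discrimination=1.0):
    """2PL item response function.

    Returns the modeled probability that a respondent with ability
    `theta` answers an item with the given difficulty correctly.
    """
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))


def ability_percentile(theta):
    """Percentile rank of `theta` in the population.

    Assumes abilities are standard normal (an assumption of this sketch),
    so the percentile is 100 times the normal CDF evaluated at theta.
    """
    return 100.0 * 0.5 * (1.0 + math.erf(theta / math.sqrt(2.0)))


# Hypothetical items, for illustration only.
EASY_ITEM_DIFFICULTY = -2.0
HARD_ITEM_DIFFICULTY = 2.0

if __name__ == "__main__":
    average_ability = 0.0
    # An average respondent is likely to get the easy item right
    # and unlikely to get the hard item right.
    print(p_correct(average_ability, EASY_ITEM_DIFFICULTY))
    print(p_correct(average_ability, HARD_ITEM_DIFFICULTY))
    # An ability slightly below the population mean falls below the 50th percentile.
    print(ability_percentile(-0.15))
```

Under this kind of model, raw accuracy and percentile rank can diverge: a system may answer most items correctly and still sit below the median if the items it misses are ones that most humans find easy.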
Memory-augmented neural networks were also discussed. Unlike other deep learning models, Yu’s memory-augmented network tolerates label noise, even when up to 35% of the sentence pairs in the training set have their labels randomly replaced with incorrect ones.
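The kind of label noise described here can be simulated with a short Python sketch. This is a hypothetical setup, not the experimental code from Yu's group: the three-way label set is an assumption (the summary only mentions labeled sentence pairs), and `corrupt_labels` is a name introduced for illustration.

```python
import random

# Assumed label set for sentence-pair classification (hypothetical).
LABELS = ("entailment", "contradiction", "neutral")


def corrupt_labels(labels, noise_rate, label_set=LABELS, seed=0):
    """Return a copy of `labels` in which a fraction `noise_rate` of the
    entries is replaced by a randomly chosen *incorrect* label."""
    rng = random.Random(seed)  # fixed seed so the corruption is reproducible
    n_noisy = int(round(noise_rate * len(labels)))
    noisy_positions = set(rng.sample(range(len(labels)), n_noisy))

    corrupted = []
    for position, label in enumerate(labels):
        if position in noisy_positions:
            # Pick any label other than the true one.
            wrong_choices = [other for other in label_set if other != label]
            corrupted.append(rng.choice(wrong_choices))
        else:
            corrupted.append(label)
    return corrupted


if __name__ == "__main__":
    clean = ["entailment"] * 100
    noisy = corrupt_labels(clean, noise_rate=0.35)
    changed = sum(1 for a, b in zip(clean, noisy) if a != b)
    print(changed)  # number of labels that were flipped
```

A model would then be trained on the corrupted labels and evaluated on clean test data to measure how much accuracy it retains as `noise_rate` grows.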
Deep learning models are classified as intelligent if they learn problems in order of difficulty, from easy to hard, and can tolerate noise to a degree. Ultimately, these intelligent deep learning models would be applied to adverse drug events, or adverse drug reactions, which are a leading cause of death in the U.S. Moving forward, Yu’s goal is to use electronic health records to develop a sophisticated competition system that improves drug safety for patients.
by Nayla Al-Mamlouk