Postdoctoral researcher, University of Illinois, Urbana-Champaign, USA
Natural Language Understanding
Despite recent advances in Artificial Intelligence (AI), computers today cannot understand text the way humans can. For example, interactive agents like Siri can follow predefined instructions (such as reporting the current weather) but cannot engage in meaningful conversation like a human assistant. This is because existing methods are designed to process text but do not necessarily ‘understand’ it. My research aims to create computational models and algorithms that not only ‘read’ text but also interpret and reason about it. These models are ‘structured’ in the sense that, unlike previous methods, they view each sentence in light of the other sentences in the text. These structured models are also aware of human social dynamics, linguistic structure and semantic knowledge. Such systems can find use in many domains: social networks like Facebook and Twitter, discussion forums like Quora, intelligent virtual assistants like Siri and Alexa, artificial tutors, and more.
Over the past few years, I have developed models for understanding relationships between people in text (such as novels, news articles and discussion forum posts). Understanding these relationships is essential to text understanding because they help explain people's desires, goals, actions and expected behaviors. For example, consider ‘The duke asked the king to surrender, and he refused’ and ‘Tom asked his mother for another cookie, and she refused’. Despite their syntactic similarity, the relationship between the people is one of mutual hostility in one case and asymmetric authority in the other. Automatically inferring such relationships allows a computer to infer what is not explicitly stated and what is expected in the story: ‘The duke then imprisoned the king’ is something a reader might expect, but ‘Tom then imprisoned his mother’ would be surprising. A computer can have this capability only if it understands the dynamics of social relationships.
My research also explores models for general understanding of story-like texts such as news articles, which can enable computers to grasp social norms, human behavior and commonsense. I develop models that attempt to understand a story along three semantic axes: (i) the sequence of events described in the text, (ii) its emotional trajectory, and (iii) its plot consistency. We judge a model’s understanding by asking whether, like a human, it can develop an expectation of what will happen next in a given story.
Abstract: In this paper, we describe the Lithium Natural Language Processing (NLP) system, a resource-constrained, high-throughput and language-agnostic system for information extraction from noisy user-generated text on social media. Lithium NLP extracts a rich set of information, including entities, topics, hashtags and sentiment, from text. We discuss several real-world applications of the system currently incorporated in Lithium products. We also compare our system with existing commercial and academic NLP systems in terms of performance, information extracted and languages supported. We show that Lithium NLP is on par with, and in some cases outperforms, state-of-the-art commercial NLP systems.
Pub.: 13 Jul '17, Pinned: 04 Aug '17
Abstract: Machine comprehension (MC) style question answering is a representative problem in natural language processing. Previous methods rarely invest in improving the encoding layer, especially in embedding the syntactic information and named entities of the words, which are crucial to the quality of encoding. Moreover, existing attention methods represent each query word as a vector or use a single vector to represent the whole query sentence; neither can properly weight the key words in the query. In this paper, we introduce a novel neural network architecture called Multi-layer Embedding with Memory Network (MEMEN) for the machine reading task. In the encoding layer, we apply the classic skip-gram model to the syntactic and semantic information of the words to train a new kind of embedding layer. We also propose a memory network with full-orientation matching of the query and passage to capture more pivotal information. Experiments show that our model achieves competitive results in both precision and efficiency on the Stanford Question Answering Dataset (SQuAD) among all published results, and achieves state-of-the-art results on the TriviaQA dataset.
Pub.: 27 Jul '17, Pinned: 04 Aug '17
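The query-weighting idea this abstract criticizes single-vector methods for missing can be sketched in a few lines: score each query word for relevance, normalize the scores with a softmax, and form a weighted query representation so that key words dominate. This is only an illustrative toy, not the MEMEN architecture; the embeddings, scores and dimensions below are all invented.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

# Invented 3-d embeddings and relevance scores for the query
# "who invented the telephone" (values are illustrative only).
query = {
    "who":       ([0.1, 0.0, 0.2], 0.2),
    "invented":  ([0.7, 0.3, 0.1], 1.5),
    "telephone": ([0.2, 0.9, 0.4], 2.0),
}

# Per-word attention weights: content words get most of the mass.
weights = softmax([score for _, score in query.values()])

# Weighted query vector, instead of one undifferentiated sentence vector.
qvec = [sum(w * vec[d] for w, (vec, _) in zip(weights, query.values()))
        for d in range(3)]

print(round(weights[2], 3))  # 'telephone' receives the largest weight
```

In a real model the relevance scores would themselves be learned; here they are fixed constants chosen to make the weighting visible.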
Abstract: Unsupervised learned word embeddings have seen tremendous success in numerous Natural Language Processing (NLP) tasks in recent years. The main contribution of this paper is Skill2vec, a technique that applies machine learning to recruitment, enhancing the search strategy for finding candidates who possess the right skills. Skill2vec is a neural network architecture, inspired by Word2vec (Mikolov et al., 2013), that maps each skill into a new vector space. This vector space supports vector arithmetic and captures the relationships between skills. We conducted an experiment using A/B testing at a recruitment company to demonstrate the effectiveness of our approach.
Pub.: 31 Jul '17, Pinned: 04 Aug '17
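The intuition behind Skill2vec, that skills appearing in the same job postings should end up close together in vector space, can be sketched without a neural network. The toy below substitutes a plain co-occurrence representation with cosine similarity for the Word2vec-style training the paper actually uses, and the "job postings" are invented; it only illustrates why the geometry works.

```python
from collections import Counter
from itertools import combinations
import math

# Toy "job postings": each lists required skills (invented data, standing
# in for the real recruitment corpus used to train Skill2vec).
postings = [
    ["python", "pandas", "sql"],
    ["python", "pandas", "numpy"],
    ["java", "spring", "sql"],
    ["java", "spring", "maven"],
    ["python", "numpy", "sql"],
]

# Count skill co-occurrences within each posting (the skip-gram context
# window collapses to the whole posting in this sketch).
pair_counts = Counter()
skill_counts = Counter()
for skills in postings:
    skill_counts.update(skills)
    for a, b in combinations(sorted(set(skills)), 2):
        pair_counts[(a, b)] += 1

vocab = sorted(skill_counts)

def vector(skill):
    """Represent a skill by its co-occurrence counts with every other skill."""
    return [pair_counts.get(tuple(sorted((skill, other))), 0)
            if other != skill else 0
            for other in vocab]

def cosine(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    return dot / (nu * nv) if nu and nv else 0.0

# "pandas" sits closer to "numpy" (a Python-stack skill) than to "maven".
sim_numpy = cosine(vector("pandas"), vector("numpy"))
sim_maven = cosine(vector("pandas"), vector("maven"))
print(sim_numpy > sim_maven)  # True
```

A production system would learn dense embeddings instead (e.g. with skip-gram training over millions of postings), but the nearest-neighbor search over skill vectors works the same way.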
Abstract: Machine learning has become pervasive across domains, impacting a wide variety of applications such as knowledge discovery and data mining, natural language processing, information retrieval, computer vision, social and health informatics, and ubiquitous computing. Two essential problems of machine learning are how to generate features and how to acquire labels for machines to learn from. In particular, labeling large amounts of data for each domain-specific problem can be very time-consuming and costly, and has become a key obstacle to making learning protocols practical in applications. In this paper, we discuss how to use existing general-purpose world knowledge to enhance machine learning processes by enriching the features or reducing the labeling work. We start from a comparison of world knowledge with domain-specific knowledge, and then introduce three key problems in using world knowledge in learning processes: explicit and implicit feature representation, inference for knowledge linking and disambiguation, and learning with direct or indirect supervision. Finally, we discuss future directions for this research topic.
Pub.: 08 May '17, Pinned: 04 Aug '17
Abstract: Many real-world systems need to operate on heterogeneous information networks that consist of numerous interacting components of different types. Examples include systems that perform data analysis on biological information networks and social networks, and information extraction systems that process unstructured data to convert raw text into knowledge graphs. Much previous work describes specialized approaches to performing specific types of analysis, mining and learning on such networks. In this work, we propose a unified framework consisting of a data model (a graph with a first-order schema) along with a declarative language for constructing, querying and manipulating such networks in ways that facilitate relational and structured machine learning. In particular, we provide an initial prototype of a relational and graph traversal query language in which queries are used directly as relational features for structured machine learning models. Feature extraction is performed by issuing declarative graph traversal queries. Learning and inference models can operate directly on this relational representation and augment it with new data and knowledge that, in turn, is integrated seamlessly into the relational structure to support new predictions. We demonstrate the system's capabilities on tasks in the natural language processing and computational biology domains.
Pub.: 24 Jul '17, Pinned: 04 Aug '17
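The core idea of this last abstract, graph traversal queries that double as relational features, can be made concrete with a small sketch. All names below (the Graph class, the edge labels, the feature function) are invented for illustration and are not the authors' actual data model or query language.

```python
class Graph:
    """A toy typed graph: nodes carry a type, edges carry a label."""

    def __init__(self):
        self.nodes = {}   # node id -> node type
        self.edges = {}   # (source id, edge label) -> set of target ids

    def add_node(self, nid, ntype):
        self.nodes[nid] = ntype

    def add_edge(self, src, label, dst):
        self.edges.setdefault((src, label), set()).add(dst)

    def traverse(self, start, *labels):
        """Follow a sequence of edge labels from `start`: a tiny query language."""
        frontier = {start}
        for label in labels:
            nxt = set()
            for node in frontier:
                nxt |= self.edges.get((node, label), set())
            frontier = nxt
        return frontier

# Build a toy heterogeneous network mixing sentences, mentions and entities.
g = Graph()
g.add_node("s1", "Sentence")
g.add_node("m1", "Mention")
g.add_node("e1", "Entity")
g.add_edge("s1", "contains", "m1")
g.add_edge("m1", "refers_to", "e1")

# A traversal query used directly as a relational feature for a learner:
# "does this sentence contain a mention that resolves to an entity?"
def mentions_entity(graph, sentence):
    return int(bool(graph.traverse(sentence, "contains", "refers_to")))

feature = mentions_entity(g, "s1")
print(feature)  # 1
```

The point of the sketch is the shape of the interface: a structured learning model never touches the raw network, it just evaluates declarative traversal queries and consumes their results as features.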