Knowledgeable Machine Learning for Natural Language Processing

Communications of the ACM 

Over the past decades, one thread has run through the entire research spectrum of natural language processing (NLP): knowledge. With various kinds of knowledge, such as linguistic knowledge, world knowledge, and commonsense knowledge, machines can understand complex semantics at different levels. In this article, we introduce a framework named "knowledgeable machine learning" to revisit existing efforts to incorporate knowledge in NLP, especially the recent breakthroughs in the Chinese NLP community. Since knowledge is closely related to human languages, the ability to capture and utilize knowledge is crucial for making machines understand language. As shown in the accompanying figure, symbolic knowledge formalized by human beings was widely used by NLP researchers before 1990, for example in applying grammar rules for linguistic theories [3] and building knowledge bases for expert systems [1].