Collaborating Authors

 Thomson Reuters


TipMaster: A Knowledge Base of Authoritative Local News Sources on Social Media

AAAI Conferences

Twitter has become an important online source for real-time news dissemination. In particular, official accounts of local governments and media outlets provide newsworthy and authoritative information, revealing local trends and breaking news. In this paper, we describe TipMaster, an automatically constructed knowledge base of Twitter accounts that are likely to report local news, from government agencies to local media outlets. First, we implement classifiers for detecting these accounts by integrating heterogeneous information from the accounts' textual metadata, profile images, and tweet messages. Next, we demonstrate two use cases for TipMaster: 1) as a platform that monitors real-time social media messages for local breaking news, and 2) as an authoritative source for verifying nascent rumors. Experimental results show that our account classification algorithms achieve both high precision and high recall (around 90%). The case studies show that our platform can detect local breaking news or debunk emergent rumors faster than mainstream media sources.


Data Sets: Word Embeddings Learned from Tweets and General Data

AAAI Conferences

A word embedding is a low-dimensional, dense, real-valued vector representation of a word. Word embeddings have been used in many NLP tasks and are usually generated from a large text corpus. The embedding of a word captures both its syntactic and semantic aspects. Tweets are short and noisy, and they have unique lexical and semantic features that differ from other types of text. Therefore, it is necessary to have word embeddings learned specifically from tweets. In this paper, we present ten word embedding data sets. In addition to the data sets learned from tweet data alone, we also built embedding sets from general data and from the combination of tweets and general data. The general data consist of news articles, Wikipedia data, and other web data. These ten embedding models were learned from about 400 million tweets and 7 billion words from the general data. We also present two experiments demonstrating how to use the data sets in NLP tasks such as tweet sentiment analysis and tweet topic classification.
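As a minimal illustration of what a dense vector representation makes possible, the sketch below compares words by cosine similarity. The three-dimensional vectors and the names `toy_embeddings`, `cosine_similarity`, and `most_similar` are invented for this example; they are not taken from the released data sets, whose dimensionality and file format are not described here.

```python
import math

# Toy 3-dimensional vectors, invented purely for illustration.
# Real embeddings have many more dimensions and are learned from a corpus.
toy_embeddings = {
    "good": [0.9, 0.1, 0.0],
    "great": [0.8, 0.2, 0.1],
    "bus": [0.0, 0.1, 0.9],
}


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(word, embeddings):
    """Return the other word whose vector is closest to `word` by cosine."""
    query = embeddings[word]
    candidates = [
        (other, cosine_similarity(query, vec))
        for other, vec in embeddings.items()
        if other != word
    ]
    return max(candidates, key=lambda pair: pair[1])[0]


if __name__ == "__main__":
    print(most_similar("good", toy_embeddings))
```

With these toy values, "good" and "great" point in nearly the same direction while "bus" does not, which is the kind of relationship a learned embedding is meant to capture.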


Collaborative Language Grounding Toward Situated Human-Robot Dialogue

AI Magazine

To enable situated human-robot dialogue, techniques to support grounded language communication are essential. One particular challenge is to ground human language to the robot's internal representation of the physical world. Although copresent in a shared environment, humans and robots have mismatched capabilities in reasoning, perception, and action. Their representations of the shared environment and joint tasks are significantly misaligned. Humans and robots will need to make extra effort to bridge the gap and strive for a common ground of the shared world. Only then is the robot able to engage in language communication and joint tasks. Thus, computational models for language grounding will need to take collaboration into consideration. A robot not only needs to incorporate collaborative effort from human partners to better connect human language to its own representation, but also needs to make extra collaborative effort to communicate its representation in language that humans can understand. To address these issues, the Language and Interaction Research group (LAIR) at Michigan State University has investigated multiple aspects of collaborative language grounding. This article gives a brief introduction to this research effort and discusses several collaborative approaches to grounding language to perception and action.


Ontology Instance Linking: Towards Interlinked Knowledge Graphs

AAAI Conferences

Due to the decentralized nature of the Semantic Web, the same real-world entity may be described in various data sources with different ontologies and assigned syntactically distinct identifiers. To facilitate data utilization and consumption in the Semantic Web without compromising people's freedom to publish their data, one critical problem is to appropriately interlink such heterogeneous data. This interlinking process is sometimes referred to as Entity Coreference, i.e., finding which identifiers refer to the same real-world entity. In this paper, we first summarize state-of-the-art algorithms for detecting such coreference relationships between ontology instances. We then discuss various techniques for scaling entity coreference to large-scale datasets. Finally, we present widely adopted evaluation datasets and metrics, and compare the performance of the state-of-the-art algorithms on such datasets.


Fifteenth International Conference on Artificial Intelligence and Law (ICAIL 2015)

AI Magazine

The 15th International Conference on AI and Law (ICAIL 2015) will be held in San Diego, California, USA, June 8-12, 2015, at the University of San Diego, at the Kroc Institute, under the auspices of the International Association for Artificial Intelligence and Law (IAAIL), an organization devoted to promoting research and development in the field of AI and law with members throughout the world. The conference is held in cooperation with the Association for the Advancement of Artificial Intelligence (AAAI) and with ACM SIGAI (the Special Interest Group on Artificial Intelligence of the Association for Computing Machinery).

