

EmTech India 2016: The digital future

#artificialintelligence

Global technology leaders and senior executives from around the world spoke on a range of topics, including Digital India, Smart Cities, Make in India, Skill India and cutting-edge technologies such as artificial intelligence, machine learning, 3D printing, drones, robotics, robotic surgery and genomics, at the two-day EmTech India 2016 event, held in New Delhi on 18 and 19 March. The event was organized by Mint and MIT Technology Review, published by the Massachusetts Institute of Technology (MIT). The speakers included Jack Hidary, senior adviser at Google X Labs; Bhaskar Pramanik, chairman of Microsoft India; and Sharad Sharma, co-founder of think tank iSPIRT. The full list can be accessed at emtech.livemint.com/speakers. Here are edited excerpts from their speeches.

A moonshot is an initiative built around a goal that was previously thought to be near impossible. The moonshot philosophy sounds quite radical and risky, but it is actually low-risk, because it attracts the best human capital and finance. Moonshot approaches do a few things. First, they attract the best human capital, which is a key driver of growth. They also attract the best financial capital, capital from big and long-term thinkers. India can be described as a moonshot nation. India itself is going through a radical transformation, the likes of which we have never seen. This is very different from what is happening in China or any other country in the world. It is a combination of smartphones, digital payments, broadband and energy storage coming together. Smartphones ease access to the Internet and open users up to mobile apps, and that really changes the game.


SpeechTEK agenda for Monday, May 23, 2016

#artificialintelligence

The field of intellectual property is rapidly evolving, both with respect to the law and the technologies being considered for protection. This session provides a primer on what a patent is, current best practices for protecting speech technologies and defending against assertion, and the recent evolution of intellectual property law in the United States, with emphasis on speech, software user interfaces, and mobile technologies.

Fraudsters are using robodialing and ANI spoofing to wreak havoc on call centers. From the illegal practice of toll-free traffic pumping and international revenue-sharing fraud to the more villainous acts of financial account fraud, identity theft, and drug trafficking, this seminar explores the unusual ways criminals are hacking our businesses. We also examine simple and cost-effective practices to protect our businesses and our customers.


Vertex nomination schemes for membership prediction

arXiv.org Machine Learning

Suppose that a graph is realized from a stochastic block model where one of the blocks is of interest, but many or all of the vertices' block labels are unobserved. The task is to order the vertices with unobserved block labels into a "nomination list" such that, with high probability, vertices from the interesting block are concentrated near the list's beginning. We propose several vertex nomination schemes. Our basic but principled setting and development yields a best nomination scheme (a Bayes-optimal analogue), as well as a likelihood maximization nomination scheme that is practical to implement when there are a thousand vertices, and which is empirically near-optimal when the number of vertices is small enough to allow comparison to the best nomination scheme. We then illustrate the robustness of the likelihood maximization nomination scheme to the modeling challenges inherent in real data, using examples that include a social network involving human trafficking, the Enron graph, a worm brain connectome, and a political blog network.
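
As a rough, hedged illustration of what a nomination list is (not the paper's Bayes-optimal or likelihood-maximization schemes), the Python sketch below simply ranks unlabeled vertices by how many edges they have to vertices known to belong to the block of interest; the toy graph and the function name are hypothetical.

```python
import numpy as np

def nominate_by_seed_connectivity(adj, seed_indices, unlabeled_indices):
    """Order the unlabeled vertices into a nomination list by how many
    edges each has to the known members of the block of interest."""
    # Count, for every unlabeled vertex, its edges into the seed set.
    scores = adj[np.ix_(unlabeled_indices, seed_indices)].sum(axis=1)
    order = np.argsort(-scores)  # most promising first
    return [unlabeled_indices[i] for i in order]

# Toy graph: vertices 0-2 are known members of the interesting block,
# vertices 3-5 have unobserved block labels.
adj = np.array([
    [0, 1, 1, 1, 0, 0],
    [1, 0, 1, 1, 1, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 1],
    [0, 1, 0, 0, 0, 1],
    [0, 0, 0, 1, 1, 0],
])
print(nominate_by_seed_connectivity(adj, [0, 1, 2], [3, 4, 5]))  # [3, 4, 5]
```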


A Review of Relational Machine Learning for Knowledge Graphs

arXiv.org Machine Learning

Relational machine learning studies methods for the statistical analysis of relational, or graph-structured, data. In this paper, we provide a review of how such statistical models can be "trained" on large knowledge graphs, and then used to predict new facts about the world (which is equivalent to predicting new edges in the graph). In particular, we discuss two fundamentally different kinds of statistical relational models, both of which can scale to massive datasets. The first is based on latent feature models such as tensor factorization and multiway neural networks. The second is based on mining observable patterns in the graph. We also show how to combine these latent and observable models to get improved modeling power at decreased computational cost. Finally, we discuss how such statistical models of graphs can be combined with text-based information extraction methods for automatically constructing knowledge graphs from the Web. To this end, we also discuss Google's Knowledge Vault project as an example of such a combination.
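
To make the latent-feature idea concrete, here is a minimal, hedged sketch of a DistMult-style scoring function that rates candidate edges with a trilinear product of embeddings. The entities, relations, and random (untrained) embeddings are purely illustrative; the models surveyed in the paper, such as tensor factorization and multiway neural networks, are considerably richer.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy vocabulary; a real knowledge graph has millions of entities.
entities = ["Obama", "USA", "Honolulu"]
relations = ["born_in", "citizen_of"]

dim = 8                                       # latent feature dimension
E = rng.normal(size=(len(entities), dim))     # one embedding per entity
R = rng.normal(size=(len(relations), dim))    # one vector per relation

def score(head, rel, tail):
    """Trilinear (diagonal bilinear) plausibility score for a triple.
    In practice E and R would be learned from the observed triples, e.g.
    by minimizing a ranking loss; here they are random for illustration."""
    h = E[entities.index(head)]
    r = R[relations.index(rel)]
    t = E[entities.index(tail)]
    return float(np.sum(h * r * t))

# Rank candidate tails for the query ("Obama", "citizen_of", ?).
print(sorted(entities, key=lambda t: -score("Obama", "citizen_of", t)))
```

In a trained model, the highest-scoring unseen triples would be proposed as new facts, i.e., new edges in the graph.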


An End-to-End Conversational Second Screen Application for TV Program Discovery

AI Magazine

In this article, we report on a multiphase R&D effort to develop a conversational second screen application for TV program discovery. Our goal is to share with the community the breadth of artificial intelligence (AI) and natural language (NL) technologies required to develop such an application along with learnings from target end-users. We first give an overview of our application from the perspective of the end-user. We then present the architecture of our application along with the main AI and NL components, which were developed over multiple phases. The first phase focuses on enabling core functionality such as effectively finding programs matching the user’s intent. The second phase focuses on enabling dialog with the user. Finally, we present two user studies, corresponding to these two phases. The results from both studies demonstrate the effectiveness of our application in the target domain.
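
As a toy illustration of the "find programs matching the user's intent" step only (the article's actual NL components, catalog schema, and dialog handling are not reproduced here), the sketch below fills a couple of slots from an utterance and filters a hypothetical program catalog.

```python
import re

# Hypothetical toy catalog; the real application's catalog, schema, and
# NL pipeline are far richer than this sketch.
CATALOG = [
    {"title": "Cosmos", "genre": "documentary", "channel": "NatGeo"},
    {"title": "The Office", "genre": "comedy", "channel": "NBC"},
    {"title": "Planet Earth", "genre": "documentary", "channel": "BBC"},
]
GENRES = {"comedy", "documentary", "drama"}

def parse_intent(utterance):
    """Tiny slot-filling step: pull a genre and an optional channel
    constraint out of the user's request."""
    slots = {}
    for tok in utterance.lower().split():
        if tok in GENRES:
            slots["genre"] = tok
    m = re.search(r"\bon (\w+)", utterance.lower())
    if m:
        slots["channel"] = m.group(1)
    return slots

def find_programs(utterance):
    """Return catalog entries whose fields satisfy every extracted slot."""
    slots = parse_intent(utterance)
    return [p for p in CATALOG
            if all(p[k].lower() == v for k, v in slots.items())]

print(find_programs("show me a documentary on BBC"))  # Planet Earth
```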


An Unsupervised Framework of Exploring Events on Twitter: Filtering, Extraction and Categorization

AAAI Conferences

Twitter, as a popular microblogging service, has become a new information channel for users to receive and exchange the most up-to-date information on current events. However, since there is no control on how users can publish messages on Twitter, finding newsworthy events from Twitter becomes a difficult task, like "finding a needle in a haystack". In this paper we propose a general unsupervised framework to explore events from tweets, which consists of a pipeline process of filtering, extraction and categorization. To filter out noisy tweets, the filtering step exploits a lexicon-based approach to separate tweets that are event-related from those that are not. Then, based on these event-related tweets, the structured representations of events are extracted and categorized automatically using an unsupervised Bayesian model without the use of any labelled data. Moreover, the categorized events are assigned event type labels without human intervention. The proposed framework has been evaluated on over 60 million tweets collected over one month in December 2010. A precision of 70.49% is achieved in event extraction, outperforming a competitive baseline by nearly 6%. Events are also clustered into coherent groups with the automatically assigned event type labels.
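
A minimal sketch of the lexicon-based filtering step might look like the following; the lexicon, tokenization, and everything downstream (the unsupervised Bayesian extraction and categorization model) are stand-ins assumed for illustration, not the authors' implementation.

```python
# Hypothetical event-indicating lexicon; the authors' actual lexicon and
# the downstream unsupervised Bayesian extraction model are not shown.
EVENT_LEXICON = {"earthquake", "protest", "crash", "election", "explosion", "strike"}

def is_event_related(tweet):
    """Keep a tweet if it mentions at least one event-indicating word."""
    tokens = {tok.strip("#@.,!?").lower() for tok in tweet.split()}
    return bool(tokens & EVENT_LEXICON)

tweets = [
    "Magnitude 6.1 earthquake reported near the coast",
    "just had the best coffee ever",
    "Thousands join the protest downtown #breaking",
]
print([t for t in tweets if is_event_related(t)])  # keeps the 1st and 3rd tweet
```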


Efficient Task Sub-Delegation for Crowdsourcing

AAAI Conferences

Reputation-based approaches allow a crowdsourcing system to identify reliable workers to whom tasks can be delegated. In crowdsourcing systems that can be modeled as multi-agent trust networks consisting of resource-constrained trustee agents (i.e., workers), workers may need to further sub-delegate tasks to others if they determine that they cannot complete all pending tasks before the stipulated deadlines. Existing reputation-based decision-making models cannot help workers decide when and to whom to sub-delegate tasks. In this paper, we propose a reputation-aware task sub-delegation (RTS) approach to bridge this gap. By jointly considering a worker's reputation, workload, the price of its effort, and its trust relationships with others, RTS can be implemented as an intelligent agent that helps workers make sub-delegation decisions in a distributed manner. The resulting task allocation maximizes social welfare through efficient utilization of the collective capacity of a crowd, and provides provable performance guarantees. Experimental comparisons with state-of-the-art approaches based on the Epinions trust network demonstrate significant advantages of RTS under high workload conditions.
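
The sketch below conveys the flavor of such a sub-delegation decision: a worker with more pending tasks than it can finish before the deadline greedily hands the excess to trusted neighbours, ranked by a score that combines trust, reputation, spare capacity, and price. The scoring function and its weights are illustrative assumptions, not the paper's RTS formulation or its welfare guarantees.

```python
from dataclasses import dataclass

@dataclass
class Worker:
    name: str
    reputation: float   # in [0, 1], higher is more reliable
    load: int           # tasks already pending
    capacity: int       # tasks completable before the deadline
    price: float        # cost per task

def delegation_score(w, trust):
    """Heuristic desirability of sub-delegating one task to w, combining
    the trustor's trust in w, w's reputation, spare capacity, and price.
    The functional form and weights are illustrative assumptions."""
    spare = max(w.capacity - w.load, 0)
    if spare == 0:
        return float("-inf")        # w cannot take on more work in time
    return trust * w.reputation * spare / (1.0 + w.price)

def sub_delegate(excess_tasks, neighbours):
    """Greedily assign excess tasks to the highest-scoring neighbours."""
    plan = []
    for worker, trust in sorted(neighbours, key=lambda wt: -delegation_score(*wt)):
        if excess_tasks == 0:
            break
        take = min(max(worker.capacity - worker.load, 0), excess_tasks)
        if take > 0:
            plan.append((worker.name, take))
            excess_tasks -= take
    return plan

neighbours = [
    (Worker("alice", reputation=0.9, load=2, capacity=5, price=1.0), 0.8),
    (Worker("bob",   reputation=0.6, load=4, capacity=5, price=0.5), 0.9),
]
print(sub_delegate(4, neighbours))  # [('alice', 3), ('bob', 1)]
```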


A Survey of Point-of-Interest Recommendation in Location-Based Social Networks

AAAI Conferences

With the rapid development of mobile devices, the global positioning system (GPS), and Web 2.0 technologies, location-based social networks (LBSNs) have attracted millions of users to share rich information, such as experiences and tips. Point-of-Interest (POI) recommender systems play an important role in LBSNs since they can help users explore attractive locations as well as help social network service providers design location-aware advertisements for Points-of-Interest. In this paper, we present a brief survey of the task of Point-of-Interest recommendation in LBSNs and discuss some research directions for Point-of-Interest recommendation. We first describe the unique characteristics of Point-of-Interest recommendation, which distinguish Point-of-Interest recommendation approaches from traditional recommendation approaches. Then, according to what type of additional information is integrated with check-in data by POI recommendation algorithms, we classify POI recommendation algorithms into four categories: pure check-in data based POI recommendation approaches, geographical influence enhanced POI recommendation approaches, social influence enhanced POI recommendation approaches, and temporal influence enhanced POI recommendation approaches. Finally, we discuss future research directions for Point-of-Interest recommendation.
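
As a concrete, hedged example of the first category (pure check-in data based approaches), the sketch below performs user-based collaborative filtering on a toy user-by-POI check-in matrix; the data and the cosine-similarity choice are assumptions for illustration only.

```python
import numpy as np

# Toy user-by-POI check-in count matrix; a pure check-in data based
# recommender uses only this matrix (no geography, ties, or timestamps).
checkins = np.array([
    [3, 0, 1, 0],
    [2, 1, 0, 0],
    [0, 0, 4, 2],
], dtype=float)

def recommend(user, k=2):
    """User-based collaborative filtering: score the user's unvisited POIs
    by the check-ins of users with similar check-in histories."""
    norms = np.linalg.norm(checkins, axis=1) + 1e-9
    sims = (checkins @ checkins[user]) / (norms * norms[user])  # cosine similarity
    sims[user] = 0.0                       # ignore self-similarity
    scores = sims @ checkins               # weighted sum of others' check-ins
    scores[checkins[user] > 0] = -np.inf   # recommend only unvisited POIs
    return np.argsort(-scores)[:k]

print(recommend(0))  # e.g. POIs 1 and 3 for user 0
```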


Computational Urban Modeling: From Mainframes to Data Streams

AAAI Conferences

Taking computational technologies as a dominant factor in forming new scientific methods during the last century, we review the field of computational urban modeling based on the ways different approaches deal with evolving computational and informational capacities. We claim that during the last few years, due to advancements in ubiquitous computing, the flow of unstructured data streams has changed the landscape of empirical modeling and simulation. However, there is a conceptual mismatch between the state of the art in urban modeling paradigms and the capacities offered by these urban data streams. We discuss some alternative mathematical methodologies that introduce an abstraction from the traditional urban modeling methodologies.


Overview and Bibliography of Distributed Data Bases (Stanford KSL Report 77-27)

AI Classics

Because of the recent technological advances in computer networks and communications, and because of the cost reduction of computer hardware, there has been a great interest in distributed data bases, including some attempts at actual implementations. In this paper, we will first define what we mean by a distributed data base. Then we will give some of the reasons why people are so interested in this new field. After classifying the different types of distributed data bases, we will describe the current areas of research. Finally, we will give an annotated bibliography that lists the most important papers in this area.