AITopics | Yahoo Labs

Collaborating Authors

Yahoo Labs

Graph Analysis for Detecting Fraud, Waste, and Abuse in Healthcare Data

AI Magazine, Jul-4-2016

Detection of fraud, waste, and abuse (FWA) is an important yet challenging problem. In this article, we describe a system to detect suspicious activities in large healthcare datasets. Each healthcare dataset is viewed as a heterogeneous network consisting of millions of patients, hundreds of thousands of doctors, tens of thousands of pharmacies, and other entities. Graph analysis techniques are developed to find suspicious individuals, suspicious relationships between individuals, unusual changes over time, unusual geospatial dispersion, and anomalous network structure.
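The network view described in this abstract can be illustrated with a small sketch. It is not the system from the article: the claim tuple layout, the function names `build_network` and `flag_high_patient_count`, and the "more than three times the median patient count" rule are all assumptions chosen for illustration only.

```python
from collections import defaultdict
from statistics import median


def build_network(claims):
    """Build adjacency sets over typed nodes such as ("doctor", "D1").

    Each claim is a hypothetical (patient_id, doctor_id, pharmacy_id) tuple.
    """
    adjacency = defaultdict(set)
    for patient_id, doctor_id, pharmacy_id in claims:
        patient = ("patient", patient_id)
        doctor = ("doctor", doctor_id)
        pharmacy = ("pharmacy", pharmacy_id)
        adjacency[patient].update((doctor, pharmacy))
        adjacency[doctor].update((patient, pharmacy))
        adjacency[pharmacy].update((patient, doctor))
    return adjacency


def flag_high_patient_count(adjacency, node_type, factor=3.0):
    """Return nodes of node_type linked to more than factor * median patients.

    This threshold is an illustrative heuristic, not a rule from the article.
    """
    patient_counts = {
        node: sum(1 for neighbor in neighbors if neighbor[0] == "patient")
        for node, neighbors in adjacency.items()
        if node[0] == node_type
    }
    if not patient_counts:
        return []
    typical = median(patient_counts.values())
    return sorted(n for n, c in patient_counts.items() if c > factor * typical)


# Toy data: doctors D1 and D2 see two patients each, D3 sees ten.
claims = (
    [("P1", "D1", "X"), ("P2", "D1", "X")]
    + [("P3", "D2", "X"), ("P4", "D2", "X")]
    + [(f"P{i}", "D3", "X") for i in range(1, 11)]
)
network = build_network(claims)
print(flag_high_patient_count(network, "doctor"))
```

Typed node keys keep patients, doctors, and pharmacies in one structure while still letting a query filter by entity type, which is one simple way to represent a heterogeneous network.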

graph analysis, law enforcement, public safety, (10 more...)

AI Magazine

Industry:

Law Enforcement & Public Safety > Fraud (1.00)
Health & Medicine > Consumer Health (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)

Graph Analysis for Detecting Fraud, Waste, and Abuse in Healthcare Data

AI Magazine, Jul-4-2016

Healthcare-related programs include federal and state government programs such as Medicaid, Medicare Advantage (Part C), Medicare FFS, and Medicare Prescription Drug Benefit (Part D). Non-healthcare programs include Earned Income Tax Credit (EITC), Pell Grants, Public Housing/Rental Assistance, Retirement, Survivors and Disability Insurance (RSDI), School Lunch, Supplemental Nutrition Assistance Program (SNAP), Supplemental Security Income (SSI), Unemployment Insurance (UI), and …

While healthcare data (insurance claims, health records, clinical data, provider information, and others) offers tantalizing opportunities, it also poses a series of technical challenges. From a data representation view, healthcare data sets are often large and diverse. It is common to see a state's Medicaid program or a private healthcare insurance program having hundreds of millions of claims per year, involving millions of patients and hundreds of thousands of providers of various types, for example, physicians, pharmacies, clinics and hospitals, and laboratories. Any fraud-detection system needs to be able to handle the large data volume and data diversity. … Data patterns from both sides are dynamic. The complexity of the problem calls for a rich set of techniques to examine healthcare data.

Healthcare financials are complex, involving a multitude of providers (physicians, pharmacies, clinics and hospitals, and laboratories), payers (insurance plans), and patients. To design a good fraud-detection system, one must have a deep understanding of the financial incentives of all parties. Starting from domain knowledge, auditors and investigators have …

… from a suspicious individual or activity (as singled out by the automated screening components) and interacts with the system to navigate through data items and collect evidence to build an investigation case. The two categories have quite different technical … database indexing/caching for fast data retrieval and user interface design for intuitive user-system interaction.

law enforcement, provider, us government, (20 more...)

AI Magazine

Country: North America > United States > California (0.28)

Genre: Research Report (0.34)

Industry:

Law Enforcement & Public Safety > Fraud (1.00)
Health & Medicine > Health Care Providers & Services > Reimbursement (1.00)
Health & Medicine > Government Relations & Public Policy (1.00)
Government > Regional Government > North America Government > United States Government (1.00)

Technology:

Information Technology > Human Computer Interaction (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Biomedical Informatics > Clinical Informatics (1.00)
Information Technology > Artificial Intelligence (1.00)

Recommendation with Social Dimensions

AAAI Conferences, Apr-19-2016

The pervasive presence of social media greatly enriches online users' social activities, resulting in abundant social relations. Social relations provide an independent source for recommendation, bringing about new opportunities for recommender systems. Exploiting social relations to improve recommendation performance has attracted a great deal of attention in recent years. Most existing social recommender systems treat social relations homogeneously and make use of direct connections (or strong dependency connections). However, connections in online social networks are intrinsically heterogeneous and are a composite of various relations. While connected users in online social networks form groups, and users in a group share similar interests, weak dependency connections are established among these users when they are not directly connected. In this paper, we investigate how to exploit the heterogeneity of social relations and weak dependency connections for recommendation. In particular, we employ social dimensions to simultaneously capture the heterogeneity of social relations and weak dependency connections, provide principled ways to model social dimensions, and propose a recommendation framework, SoDimRec, which incorporates heterogeneity of social relations and weak dependency connections based on social dimensions. Experimental results on real-world data sets demonstrate the effectiveness of the proposed framework. We conduct further experiments to understand the important role of social dimensions in the proposed framework.

artificial intelligence, social dimension, social media, (19 more...)

AAAI Conferences

Thirtieth AAAI Conference on Artificial Intelligence

Country: North America > United States > Texas (0.14)

Genre: Research Report (0.46)

Industry:

Information Technology > Services (0.56)
Government (0.48)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Evaluation of Semantic Dependency Labeling Across Domains

Stoyanchev, Svetlana (Interactions Corporation) | Stent, Amanda (Yahoo Labs) | Bangalore, Srinivas (Interactions Corporation)

AAAI Conferences, Apr-19-2016

One of the key concerns in computational semantics is to construct a domain-independent semantic representation that captures the richness of natural language, yet can be quickly customized to a specific domain for practical applications. We propose to use generic semantic frames defined in FrameNet, a domain-independent semantic resource, as an intermediate semantic representation for language understanding in dialog systems. In this paper we: (a) outline a novel method for FrameNet-style semantic dependency labeling that builds on a syntactic dependency parse; and (b) compare the accuracy of domain-adapted and generic approaches to semantic parsing for dialog tasks, using a frame-annotated corpus of human-computer dialogs in an airline reservation domain.

air transportation, argument, text processing, (19 more...)

AAAI Conferences

Thirtieth AAAI Conference on Artificial Intelligence

Country: North America > United States > New York (0.14)

Genre: Research Report (0.48)

Industry:

Transportation > Passenger (0.48)
Transportation > Air (0.34)
Consumer Products & Services > Travel (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)

DECT: Distributed Evolving Context Tree for Understanding User Behavior Pattern Evolution

Shu, Xiaokui (Virginia Polytechnic Institute and State University) | Laptev, Nikolay (Yahoo Labs) | Yao, Danfeng (Daphne) (Virginia Polytechnic Institute and State University)

AAAI Conferences, Apr-19-2016

Internet user behavior models characterize user browsing dynamics or the transitions among web pages. The models help Internet companies improve their services by accurately targeting customers and providing them with the information they want. For instance, specific web pages can be customized and prefetched for individuals based on sequences of web pages they have visited. Existing user behavior models abstracted as time-homogeneous Markov models cannot efficiently model user behavior variation through time. This demo presents DECT, a scalable time-variant variable-order Markov model. DECT digests terabytes of user session data and yields user behavior patterns through time. We realize DECT using Apache Spark and deploy it on top of Yahoo! infrastructure. We demonstrate the benefits of DECT with anomaly detection and ad click rate prediction applications. DECT enables the detection of higher-order path anomalies and provides deep insights into ad click rates with respect to user visiting paths.
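To make the idea of a time-variant, variable-order Markov model concrete, here is a toy single-machine sketch. It is not DECT itself and does not use Apache Spark; the function names, the `(window, pages)` input layout, and the back-off rule (use the longest context seen in training) are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def build_context_counts(sessions, max_order=2):
    """Count which page follows every context of length 0..max_order."""
    counts = defaultdict(Counter)
    for pages in sessions:
        for i, next_page in enumerate(pages):
            for k in range(max_order + 1):
                if i - k < 0:
                    break
                context = tuple(pages[i - k:i])
                counts[context][next_page] += 1
    return counts


def predict_next(counts, history, max_order=2):
    """Predict the next page, backing off from the longest known context."""
    for k in range(min(max_order, len(history)), -1, -1):
        context = tuple(history[len(history) - k:])
        if context in counts:
            return counts[context].most_common(1)[0][0]
    return None


def build_models_by_window(timed_sessions, max_order=2):
    """Fit one model per time window so that patterns may differ over time."""
    by_window = defaultdict(list)
    for window, pages in timed_sessions:
        by_window[window].append(pages)
    return {w: build_context_counts(s, max_order) for w, s in by_window.items()}


timed_sessions = [
    ("mon", ["home", "news", "sports"]),
    ("mon", ["home", "news", "sports"]),
    ("mon", ["mail", "news", "finance"]),
    ("tue", ["home", "news", "finance"]),
]
models = build_models_by_window(timed_sessions)
print(predict_next(models["mon"], ["home", "news"]))
print(predict_next(models["tue"], ["home", "news"]))
```

The second-order context is what separates the two "mon" paths through "news"; a first-order model would predict the same next page for both. Keeping a separate model per window is the simplest way to let the same path lead to different predictions at different times.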

artificial intelligence, data mining, dect, (13 more...)

AAAI Conferences

Thirtieth AAAI Conference on Artificial Intelligence

Country: North America > United States > Virginia (0.16)

Industry: Information Technology (0.35)

Technology:

Information Technology > Communications > Web (0.77)
Information Technology > Data Science > Data Mining (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.62)
Information Technology > Communications > Networks (0.52)

An Image Is Worth More than a Thousand Favorites: Surfacing the Hidden Beauty of Flickr Pictures

Schifanella, Rossano (University of Turin) | Redi, Miriam (Yahoo Labs) | Aiello, Luca Maria (Yahoo Labs)

AAAI Conferences, Apr-4-2015

The dynamics of attention in social media tend to obey power laws. Attention concentrates on a relatively small number of popular items and neglects the vast majority of content produced by the crowd. Although popularity can be an indication of the perceived value of an item within its community, previous research has hinted at the fact that popularity is distinct from intrinsic quality. As a result, content with low visibility but high quality lurks in the tail of the popularity distribution. This phenomenon can be particularly evident in the case of photo-sharing communities, where valuable photographers who are not highly engaged in online social interactions contribute high-quality pictures that remain unseen. We propose to use a computer vision method to surface beautiful pictures from the immense pool of near-zero-popularity items, and we test it on a large dataset of Creative Commons photos on Flickr. By gathering a large crowdsourced ground truth of aesthetics scores for Flickr images, we show that our method retrieves photos whose median perceived beauty score is equal to that of the most popular ones, and whose average is lower by only 1.5%.

flickr picture, hidden beauty, surfacing

AAAI Conferences

Ninth International AAAI Conference on Web and Social Media

Industry: Information Technology > Services (0.80)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Vision (0.87)

Taxonomy-Based Discovery and Annotation of Functional Areas in the City

Vaca, Carmen Karina (Escuela Superior Politecnica del Litoral, ESPOL, Facultad de Ingeniería en Electricidad y Computacion) | Quercia, Daniele (University of Cambridge) | Bonchi, Francesco (Yahoo Labs) | Fraternali, Piero (Politecnico di Milano)

AAAI Conferences, Apr-4-2015

Mapping the functional use of city areas (e.g., mapping clusters of hotels or of electronics shops) enables a variety of applications (e.g., innovative way-finding tools). To do that mapping, researchers have recently processed geo-referenced data with spatial clustering algorithms. These algorithms usually perform two consecutive steps: they cluster nearby points on the map, and then assign labels (e.g., 'electronics') to the resulting clusters. When applied in the city context, these algorithms do not fully work, not least because they consider the two steps of clustering and labeling as separate. Since there is no reason to keep those two steps separate, we propose a framework that clusters points based not only on their density but also on their semantic relatedness. We evaluate this framework on Foursquare data in the cities of Barcelona, Milan, and London. We find that it is more effective than the baseline method of DBSCAN in discovering functional areas. We complement that quantitative evaluation with a user study involving 111 participants in the three cities. Finally, to illustrate the generalizability of our framework, we process temporal data with it and successfully discover seasonal uses of the city.
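One way to picture clustering on both density and semantic relatedness is a DBSCAN-style procedure whose neighborhood test also requires category similarity. The sketch below is a simplification under stated assumptions, not the authors' framework: the `(x, y, category_path)` point layout, the prefix-overlap `taxonomy_similarity` measure, and all parameter values are hypothetical.

```python
import math


def taxonomy_similarity(path_a, path_b):
    """Fraction of the taxonomy path shared from the root (0.0 to 1.0)."""
    shared = 0
    for a, b in zip(path_a, path_b):
        if a != b:
            break
        shared += 1
    return shared / max(len(path_a), len(path_b))


def neighbors(points, i, eps, min_sim):
    """Indices of points that are both spatially close and semantically related."""
    xi, yi, ci = points[i]
    return [
        j
        for j, (xj, yj, cj) in enumerate(points)
        if math.hypot(xi - xj, yi - yj) <= eps
        and taxonomy_similarity(ci, cj) >= min_sim
    ]


def semantic_dbscan(points, eps, min_pts, min_sim):
    """DBSCAN-style labeling; -1 marks noise, 0..k-1 mark clusters."""
    noise = -1
    labels = [None] * len(points)  # None means not yet visited
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(points, i, eps, min_sim)
        if len(seeds) < min_pts:
            labels[i] = noise
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == noise:
                labels[j] = cluster  # former noise becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(points, j, eps, min_sim)
            if len(reachable) >= min_pts:
                queue.extend(reachable)
        cluster += 1
    return labels


shop = ("Shop", "Electronics")
hotel = ("Travel", "Hotel")
points = [
    (0.0, 0.0, shop), (0.0, 1.0, shop), (1.0, 0.0, shop),
    (0.5, 0.5, hotel), (0.5, 1.5, hotel), (1.5, 0.5, hotel),
]
print(semantic_dbscan(points, eps=1.5, min_pts=3, min_sim=0.5))
print(semantic_dbscan(points, eps=1.5, min_pts=3, min_sim=0.0))
```

With `min_sim=0.0` the semantic test is disabled and the procedure behaves like plain density clustering, so the spatially interleaved shops and hotels merge into one cluster; with `min_sim=0.5` they separate into two.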

functional area, taxonomy-based discovery and annotation

AAAI Conferences

Ninth International AAAI Conference on Web and Social Media

Technology: Information Technology > Artificial Intelligence (0.73)

Inertial Hidden Markov Models: Modeling Change in Multivariate Time Series

Montanez, George D. (Carnegie Mellon University) | Amizadeh, Saeed (Yahoo Labs) | Laptev, Nikolay (Yahoo Labs)

AAAI Conferences, Mar-6-2015

Faced with the problem of characterizing systematic changes in multivariate time series in an unsupervised manner, we derive and test two methods of regularizing hidden Markov models for this task. Regularization on state transitions provides smooth transitioning among states, such that the sequences are split into broad, contiguous segments. Our methods are compared with a recent hierarchical Dirichlet process hidden Markov model (HDP-HMM) and a baseline standard hidden Markov model, of which the former suffers from poor performance on moderate-dimensional data and sensitivity to parameter settings, while the latter suffers from rapid state transitioning, over-segmentation and poor performance on a segmentation task involving human activity accelerometer data from the UCI Repository. The regularized methods developed here are able to perfectly characterize change of behavior in the human activity data for roughly half of the real-data test cases, with accuracy of 94% and low variation of information. In contrast to the HDP-HMM, our methods provide simple, drop-in replacements for standard hidden Markov model update rules, allowing standard expectation maximization (EM) algorithms to be used for learning.

artificial intelligence, health & medicine, regularization, (19 more...)

AAAI Conferences

Twenty-Ninth AAAI Conference on Artificial Intelligence

Country: North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)

Genre: Research Report (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Exploiting Task-Feature Co-Clusters in Multi-Task Learning

Xu, Linli (University of Science and Technology of China) | Huang, Aiqing (University of Science and Technology of China) | Chen, Jianhui (Yahoo Labs) | Chen, Enhong (University of Science and Technology of China)

AAAI Conferences, Mar-6-2015

In multi-task learning, multiple related tasks are considered simultaneously, with the goal of improving the generalization performance by utilizing the intrinsic sharing of information across tasks. This paper presents a multi-task learning approach by modeling the task-feature relationships. Specifically, instead of assuming that similar tasks have similar weights on all the features, we start with the motivation that the tasks should be related in terms of subsets of features, which implies a co-cluster structure. We design a novel regularization term to capture this task-feature co-cluster structure. A proximal algorithm is adopted to solve the optimization problem. Convincing experimental results demonstrate the effectiveness of the proposed algorithm and justify the idea of exploiting the task-feature relationships.
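The abstract does not spell out the regularizer or the proximal algorithm, so the following is only a generic illustration of how a proximal gradient method works. It handles a single task and swaps in a plain L1 penalty, whose proximal operator is soft-thresholding, in place of the paper's co-cluster regularizer; the function names and the toy data are assumptions.

```python
def soft_threshold(value, threshold):
    """Proximal operator of threshold * |value| (the L1 penalty)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def proximal_gradient(X, y, lam, step, iterations=100):
    """Minimize 0.5 * ||Xw - y||^2 + lam * ||w||_1 by proximal gradient steps."""
    n_features = len(X[0])
    w = [0.0] * n_features
    for _ in range(iterations):
        # Gradient step on the smooth least-squares term.
        residuals = [
            sum(x_ij * w_j for x_ij, w_j in zip(row, w)) - target
            for row, target in zip(X, y)
        ]
        gradient = [
            sum(row[j] * r for row, r in zip(X, residuals))
            for j in range(n_features)
        ]
        # Proximal step on the non-smooth penalty.
        w = [
            soft_threshold(w_j - step * g_j, step * lam)
            for w_j, g_j in zip(w, gradient)
        ]
    return w


# With an identity design matrix the solution is soft_threshold(y, lam).
X = [[1.0, 0.0], [0.0, 1.0]]
y = [3.0, 0.5]
print(proximal_gradient(X, y, lam=1.0, step=1.0))
```

The split into a gradient step and a proximal step is what lets such methods handle regularizers that are not differentiable; a different regularizer changes only the proximal operator, not the outer loop.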

artificial intelligence, machine learning, multi-task learning, (16 more...)

AAAI Conferences

Twenty-Ninth AAAI Conference on Artificial Intelligence

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.34)

Industry: Education (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Delivering Guaranteed Display Ads under Reach and Frequency Requirements

Hojjat, Ali (University of California, Irvine) | Turner, John (University of California, Irvine) | Cetintas, Suleyman (Yahoo Labs) | Yang, Jian (Yahoo Labs)

AAAI Conferences, Jul-14-2014

We propose a novel idea in the allocation and serving of online advertising. We show that by using predetermined fixed-length streams of ads (which we call patterns) to serve advertising, we can incorporate a variety of interesting features into the ad allocation optimization problem. In particular, our formulation optimizes for representativeness as well as user-level diversity and pacing of ads, under reach and frequency requirements. We show how the problem can be solved efficiently using a column generation scheme in which only a small set of the best patterns is kept in the optimization problem. Our numerical tests suggest that with parallelization of the pattern generation process, the algorithm has a promising run time and memory usage.

artificial intelligence, impression, optimization problem, (18 more...)

AAAI Conferences

Twenty-Eighth AAAI Conference on Artificial Intelligence

Country: North America > United States > California (0.28)

Industry:

Marketing (1.00)
Information Technology > Services (0.49)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
