Yahoo Labs
Graph Analysis for Detecting Fraud, Waste, and Abuse in Healthcare Data
Liu, Juan (Medallia) | Bier, Eric (Palo Alto Research Center) | Wilson, Aaron (Palo Alto Research Center) | Guerra-Gomez, John Alexis (Yahoo Labs) | Honda, Tomonori (Inflection.com) | Sricharan, Kumar (Palo Alto Research Center) | Gilpin, Leilani (Massachusetts Institute of Technology) | Davies, Daniel (Palo Alto Research Center)
Detection of fraud, waste, and abuse (FWA) is an important yet challenging problem. In this article, we describe a system to detect suspicious activities in large healthcare datasets. Each healthcare dataset is viewed as a heterogeneous network consisting of millions of patients, hundreds of thousands of doctors, tens of thousands of pharmacies, and other entities. Graph analysis techniques are developed to find suspicious individuals, suspicious relationships between individuals, unusual changes over time, unusual geospatial dispersion, and anomalous network structure.
Healthcare-related programs include federal and state government programs such as Medicaid, Medicare Advantage (Part C), Medicare FFS, and Medicare Prescription Drug Benefit (Part D). Non-healthcare programs include Earned Income Tax Credit (EITC), Pell Grants, Public Housing/Rental Assistance, Retirement, Survivors and Disability Insurance (RSDI), School Lunch, Supplemental Nutrition Assistance Program (SNAP), Supplemental Security Income (SSI), Unemployment Insurance (UI), and ...
While healthcare data (insurance claims, health records, clinical data, provider information, and others) offers tantalizing opportunities, it also poses a series of technical challenges. From a data representation view, healthcare data sets are often large and diverse. It is common to see a state's Medicaid program or a private healthcare insurance program having hundreds of millions of claims per year, involving millions of patients and hundreds of thousands of providers of various types, for example, physicians, pharmacies, clinics and hospitals, and laboratories. Any fraud-detection system needs to be able to handle the large data volume and data diversity.
... Data patterns from both sides are dynamic. The complexity of the problem calls for a rich set of techniques to examine healthcare data. Healthcare financials are complex, involving a multitude of providers (physicians, pharmacies, clinics and hospitals, and laboratories), payers (insurance plans), and patients. To design a good fraud-detection system, one must have a deep understanding of the financial incentives of all parties.
... from a suspicious individual or activity (as singled out by the automated screening components) and interacts with the system to navigate through data items and collect evidence to build an investigation case. The two categories have quite different technical ... database indexing/caching for fast data retrieval and user interface design for intuitive user-system interaction.
Starting from domain knowledge, auditors and investigators have ...
Recommendation with Social Dimensions
Tang, Jiliang (Yahoo Labs) | Wang, Suhang (Arizona State University) | Hu, Xia (Texas A&M University) | Yin, Dawei (Yahoo Labs) | Bi, Yingzhou (Guangxi Teachers Education University) | Chang, Yi (Yahoo Labs) | Liu, Huan (Arizona State University)
The pervasive presence of social media greatly enriches online users' social activities, resulting in abundant social relations. Social relations provide an independent source for recommendation, bringing about new opportunities for recommender systems. Exploiting social relations to improve recommendation performance has attracted a great deal of attention in recent years. Most existing social recommender systems treat social relations homogeneously and make use of direct connections (or strong dependency connections). However, connections in online social networks are intrinsically heterogeneous and are a composite of various relations. Moreover, connected users in online social networks form groups whose members share similar interests, so weak dependency connections exist among group members even when they are not directly connected. In this paper, we investigate how to exploit the heterogeneity of social relations and weak dependency connections for recommendation. In particular, we employ social dimensions to simultaneously capture the heterogeneity of social relations and weak dependency connections, provide principled ways to model social dimensions, and propose a recommendation framework, SoDimRec, which incorporates both through social dimensions. Experimental results on real-world data sets demonstrate the effectiveness of the proposed framework. We conduct further experiments to understand the important role of social dimensions in the proposed framework.
Evaluation of Semantic Dependency Labeling Across Domains
Stoyanchev, Svetlana (Interactions Corporation) | Stent, Amanda (Yahoo Labs) | Bangalore, Srinivas (Interactions Corporation)
One of the key concerns in computational semantics is to construct a domain-independent semantic representation that captures the richness of natural language yet can be quickly customized to a specific domain for practical applications. We propose to use generic semantic frames defined in FrameNet, a domain-independent semantic resource, as an intermediate semantic representation for language understanding in dialog systems. In this paper we: (a) outline a novel method for FrameNet-style semantic dependency labeling that builds on a syntactic dependency parse; and (b) compare the accuracy of domain-adapted and generic approaches to semantic parsing for dialog tasks, using a frame-annotated corpus of human-computer dialogs in an airline reservation domain.
DECT: Distributed Evolving Context Tree for Understanding User Behavior Pattern Evolution
Shu, Xiaokui (Virginia Polytechnic Institute and State University) | Laptev, Nikolay (Yahoo Labs) | Yao, Danfeng (Daphne) (Virginia Polytechnic Institute and State University)
Internet user behavior models characterize user browsing dynamics, that is, the transitions among web pages. The models help Internet companies improve their services by accurately targeting customers and providing them with the information they want. For instance, specific web pages can be customized and prefetched for individuals based on sequences of web pages they have visited. Existing user behavior models, abstracted as time-homogeneous Markov models, cannot efficiently model variation in user behavior over time. This demo presents DECT, a scalable time-variant variable-order Markov model. DECT digests terabytes of user session data and yields user behavior patterns through time. We realize DECT using Apache Spark and deploy it on top of Yahoo! infrastructure. We demonstrate the benefits of DECT with anomaly detection and ad click rate prediction applications. DECT enables the detection of higher-order path anomalies and provides deep insights into ad click rates with respect to user visiting paths.
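To make the idea of a variable-order Markov model over page visits concrete, here is a minimal single-machine sketch in Python. It is not DECT itself: it ignores time variation and distribution over Spark, and the names `build_context_counts`, `predict_next`, and `max_order` are hypothetical, chosen only for illustration. It counts which page follows each recent-history context of length 0 to `max_order`, and predicts by backing off from the longest observed context to shorter ones.

```python
from collections import defaultdict


def build_context_counts(sessions, max_order=2):
    """Count next-page frequencies for every context of length 0..max_order."""
    counts = defaultdict(lambda: defaultdict(int))
    for session in sessions:
        for i, page in enumerate(session):
            for k in range(max_order + 1):
                if i - k < 0:
                    break  # not enough history for a context of length k
                context = tuple(session[i - k:i])
                counts[context][page] += 1
    return counts


def predict_next(counts, history, max_order=2):
    """Return next-page probabilities, backing off to shorter contexts."""
    for k in range(min(max_order, len(history)), -1, -1):
        context = tuple(history[len(history) - k:])
        if context in counts:
            following = counts[context]
            total = sum(following.values())
            return {page: n / total for page, n in following.items()}
    return {}


if __name__ == "__main__":
    # Toy sessions (made-up page names, illustration only).
    sessions = [
        ["home", "news", "sports"],
        ["home", "news", "finance"],
        ["mail", "news", "sports"],
    ]
    counts = build_context_counts(sessions)
    print(predict_next(counts, ["mail", "news"]))
    print(predict_next(counts, ["unseen", "news"]))
```

A time-variant version could, for example, keep one such table per time window and compare tables across windows; that choice is an assumption of this sketch, not a description of the system in the abstract.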
An Image Is Worth More than a Thousand Favorites: Surfacing the Hidden Beauty of Flickr Pictures
Schifanella, Rossano (University of Turin) | Redi, Miriam (Yahoo Labs) | Aiello, Luca Maria (Yahoo Labs)
The dynamics of attention in social media tend to obey power laws. Attention concentrates on a relatively small number of popular items and neglects the vast majority of content produced by the crowd. Although popularity can be an indication of the perceived value of an item within its community, previous research has hinted that popularity is distinct from intrinsic quality. As a result, content with low visibility but high quality lurks in the tail of the popularity distribution. This phenomenon can be particularly evident in photo-sharing communities, where valuable photographers who are not highly engaged in online social interactions contribute high-quality pictures that remain unseen. We propose to use a computer vision method to surface beautiful pictures from the immense pool of near-zero-popularity items, and we test it on a large dataset of Creative Commons photos on Flickr. By gathering a large crowdsourced ground truth of aesthetics scores for Flickr images, we show that our method retrieves photos whose median perceived beauty score is equal to that of the most popular ones, and whose average is lower by only 1.5%.
Taxonomy-Based Discovery and Annotation of Functional Areas in the City
Vaca, Carmen Karina (Escuela Superior Politecnica del Litoral, ESPOL, Facultad de Ingeniería en Electricidad y Computacion) | Quercia, Daniele (University of Cambridge) | Bonchi, Francesco (Yahoo Labs) | Fraternali, Piero (Politecnico di Milano)
Mapping the functional use of city areas (e.g., mapping clusters of hotels or of electronics shops) enables a variety of applications (e.g., innovative way-finding tools). To do that mapping, researchers have recently processed geo-referenced data with spatial clustering algorithms. These algorithms usually perform two consecutive steps: they cluster nearby points on the map, and then assign labels (e.g., 'electronics') to the resulting clusters. When applied in the city context, these algorithms do not fully work, not least because they treat the two steps of clustering and labeling as separate. Since there is no reason to keep those two steps separate, we propose a framework that clusters points based not only on their density but also on their semantic relatedness. We evaluate this framework on Foursquare data in the cities of Barcelona, Milan, and London. We find that it is more effective than the baseline method of DBSCAN in discovering functional areas. We complement that quantitative evaluation with a user study involving 111 participants in the three cities. Finally, to illustrate the generalizability of our framework, we process temporal data with it and successfully discover seasonal uses of the city.
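One simple way to picture clustering on both density and semantic relatedness is a distance that blends spatial separation with dissimilarity in a category taxonomy. The Python sketch below is an illustration under stated assumptions, not the framework from the abstract: the venue layout `(x, y, category_path)`, the prefix-based `taxonomy_similarity`, and the `alpha` and `scale` parameters are all hypothetical.

```python
import math


def taxonomy_similarity(path_a, path_b):
    """Fraction of the longer category path shared as a common prefix."""
    shared = 0
    for a, b in zip(path_a, path_b):
        if a != b:
            break
        shared += 1
    longest = max(len(path_a), len(path_b))
    return shared / longest if longest else 0.0


def combined_distance(venue_a, venue_b, alpha=0.5, scale=1.0):
    """Blend spatial distance with semantic dissimilarity.

    Each venue is (x, y, category_path). `alpha` weights the semantic part
    and `scale` normalizes the spatial part; both are assumed parameters.
    """
    spatial = math.dist(venue_a[:2], venue_b[:2]) / scale
    semantic = 1.0 - taxonomy_similarity(venue_a[2], venue_b[2])
    return (1.0 - alpha) * spatial + alpha * semantic


if __name__ == "__main__":
    # Made-up venues on a planar grid, illustration only.
    shop_1 = (0.0, 0.0, ("Shops", "Electronics"))
    shop_2 = (3.0, 4.0, ("Shops", "Electronics"))
    pizzeria = (0.0, 0.0, ("Food", "Pizza"))
    print(combined_distance(shop_1, shop_2))    # far apart, same category
    print(combined_distance(shop_1, pizzeria))  # same spot, unrelated category
```

A distance of this kind could be handed to any density-based clustering routine that accepts a custom or precomputed metric, so that nearby but semantically unrelated venues are less likely to fall into the same cluster.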
Inertial Hidden Markov Models: Modeling Change in Multivariate Time Series
Montanez, George D. (Carnegie Mellon University) | Amizadeh, Saeed (Yahoo Labs) | Laptev, Nikolay (Yahoo Labs)
Faced with the problem of characterizing systematic changes in multivariate time series in an unsupervised manner, we derive and test two methods of regularizing hidden Markov models for this task. Regularization on state transitions provides smooth transitioning among states, such that the sequences are split into broad, contiguous segments. Our methods are compared with a recent hierarchical Dirichlet process hidden Markov model (HDP-HMM) and a baseline standard hidden Markov model, of which the former suffers from poor performance on moderate-dimensional data and sensitivity to parameter settings, while the latter suffers from rapid state transitioning, over-segmentation and poor performance on a segmentation task involving human activity accelerometer data from the UCI Repository. The regularized methods developed here are able to perfectly characterize change of behavior in the human activity data for roughly half of the real-data test cases, with accuracy of 94% and low variation of information. In contrast to the HDP-HMM, our methods provide simple, drop-in replacements for standard hidden Markov model update rules, allowing standard expectation maximization (EM) algorithms to be used for learning.
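As a rough illustration of how regularizing state transitions can act as a drop-in change to an EM update, the Python sketch below adds a pseudo-count to the diagonal of the expected transition counts before row normalization. This is an assumed, simplified form and not the update rules derived in the paper; the function name `inertial_transition_update` and the `inertia` parameter are hypothetical.

```python
import numpy as np


def inertial_transition_update(expected_counts, inertia=0.0):
    """M-step for the transition matrix with extra weight on self-transitions.

    expected_counts[i, j] is the expected number of i -> j transitions from
    the E-step. `inertia` is an assumed non-negative pseudo-count added to
    the diagonal; inertia=0 gives the standard row-normalized update.
    Rows must have a positive total after the pseudo-count is added.
    """
    counts = np.asarray(expected_counts, dtype=float)
    boosted = counts + inertia * np.eye(counts.shape[0])
    return boosted / boosted.sum(axis=1, keepdims=True)


if __name__ == "__main__":
    # Made-up expected counts for a two-state model, illustration only.
    counts = [[6.0, 4.0], [3.0, 7.0]]
    print(inertial_transition_update(counts))                # standard update
    print(inertial_transition_update(counts, inertia=10.0))  # stickier states
```

Larger values of the pseudo-count push probability mass toward staying in the current state, which is one way to obtain broad, contiguous segments instead of rapid state switching.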
Exploiting Task-Feature Co-Clusters in Multi-Task Learning
Xu, Linli (University of Science and Technology of China) | Huang, Aiqing (University of Science and Technology of China) | Chen, Jianhui (Yahoo Labs) | Chen, Enhong (University of Science and Technology of China)
In multi-task learning, multiple related tasks are considered simultaneously, with the goal of improving generalization performance by utilizing the intrinsic sharing of information across tasks. This paper presents a multi-task learning approach that models the task-feature relationships. Specifically, instead of assuming that similar tasks have similar weights on all the features, we start from the motivation that tasks should be related in terms of subsets of features, which implies a co-cluster structure. We design a novel regularization term to capture this task-feature co-cluster structure. A proximal algorithm is adopted to solve the optimization problem. Experimental results demonstrate the effectiveness of the proposed algorithm and justify the idea of exploiting the task-feature relationships.
Delivering Guaranteed Display Ads under Reach and Frequency Requirements
Hojjat, Ali (University of California, Irvine) | Turner, John (University of California, Irvine) | Cetintas, Suleyman (Yahoo Labs) | Yang, Jian (Yahoo Labs)
We propose a novel idea in the allocation and serving of online advertising. We show that by using predetermined fixed-length streams of ads (which we call patterns) to serve advertising, we can incorporate a variety of interesting features into the ad allocation optimization problem. In particular, our formulation optimizes for representativeness as well as user-level diversity and pacing of ads, under reach and frequency requirements. We show how the problem can be solved efficiently using a column generation scheme in which only a small set of the best patterns is kept in the optimization problem. Our numerical tests suggest that with parallelization of the pattern generation process, the algorithm has a promising run time and memory usage.