Xerox Research Centre India
ICU Mortality Prediction: A Classification Algorithm for Imbalanced Datasets
Bhattacharya, Sakyajit (Xerox Research Centre India) | Rajan, Vaibhav (Xerox Research Centre India) | Shrivastava, Harsh (Xerox Research Centre India)
Determining mortality risk is important for critical decisions in Intensive Care Units (ICUs). The need for machine learning models that provide accurate patient-specific prediction of mortality is well recognized. We present a new algorithm for ICU mortality prediction that is designed to address the problem of imbalance, which occurs, in the context of binary classification, when one of the two classes is significantly under-represented in the data. We take a fundamentally new approach by exploiting the class imbalance through a feature transformation such that the transformed features are easier to classify. Hypothesis testing is used for classification, with a test statistic that follows the distribution of the difference of two chi-squared random variables. There are no analytic expressions for this distribution, so we derive an accurate approximation. Experiments on a benchmark dataset of 4000 ICU patients show that our algorithm surpasses the best competing methods for mortality prediction.
Inductive Pairwise Ranking: Going Beyond the n log(n) Barrier
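As a rough illustration of the test statistic mentioned above, the distribution of the difference of two independent chi-squared variables can be explored by simulation. The sketch below is a plain Monte Carlo estimate, not the approximation derived in the paper; the function names, degrees of freedom, and sample size are illustrative assumptions.

```python
import random

def sample_chi2_difference(k1, k2, n_samples, seed=0):
    """Draw samples of D = X1 - X2 with X1 ~ chi2(k1), X2 ~ chi2(k2), independent.

    A chi-squared variable with k degrees of freedom is a Gamma variable
    with shape k/2 and scale 2, which the standard library can sample.
    """
    rng = random.Random(seed)
    return [
        rng.gammavariate(k1 / 2.0, 2.0) - rng.gammavariate(k2 / 2.0, 2.0)
        for _ in range(n_samples)
    ]

def empirical_tail_prob(samples, threshold):
    """Estimate P(D > threshold) as the fraction of samples above it."""
    return sum(1 for s in samples if s > threshold) / len(samples)

# Hypothetical degrees of freedom, chosen only for illustration.
samples = sample_chi2_difference(k1=5, k2=3, n_samples=20000)
mean_diff = sum(samples) / len(samples)   # should be close to k1 - k2 = 2
p_value_like = empirical_tail_prob(samples, threshold=5.0)
```

A simulated tail probability of this kind could serve as a sanity check for any closed-form approximation, at the cost of many samples per query.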
Niranjan, U.N. (University of California Irvine) | Rajkumar, Arun (Xerox Research Centre India)
We study the problem of ranking a set of items from nonactively chosen pairwise preferences, where each item has associated feature information. We propose and characterize a very broad class of preference matrices giving rise to the Feature Low Rank (FLR) model, which subsumes several models ranging from the classic Bradley–Terry–Luce (BTL) (Bradley and Terry 1952) and Thurstone (Thurstone 1927) models to the recently proposed blade-chest (Chen and Joachims 2016) and generic low-rank preference (Rajkumar and Agarwal 2016) models. We use the technique of matrix completion in the presence of side information to develop the Inductive Pairwise Ranking (IPR) algorithm, which provably learns a good ranking under the FLR model in a sample-efficient manner. In practice, through systematic synthetic simulations, we confirm our theoretical findings regarding improvements in the sample complexity due to the use of feature information. Moreover, on popular real-world preference learning datasets, with as little as 10% sampling of the pairwise comparisons, our method recovers a good ranking.
QART: A System for Real-Time Holistic Quality Assurance for Contact Center Dialogues
Roy, Shourya (Xerox Research Centre India) | Mariappan, Ragunathan (Xerox Research Centre India) | Dandapat, Sandipan (Xerox Research Centre India) | Srivastava, Saurabh (Xerox Research Centre India) | Galhotra, Sainyam (University of Massachusetts, Amherst) | Peddamuthu, Balaji (Xerox Research Centre India)
Quality assurance (QA) and customer satisfaction (C-Sat) analysis are two commonly used practices for measuring the quality of dialogues between agents and customers in contact centers. These practices, however, have a few shortcomings. First, QA puts sole emphasis on agents' organizational compliance, whereas C-Sat attempts to measure customers' satisfaction based only on post-dialogue surveys. As a result, the outcomes of independent QA and C-Sat analyses may not always correspond. Second, both processes are retrospective in nature, and hence evidence of bad past dialogues (and consequently bad customer experiences) can only be found after hours, days, or weeks, depending on their periodicity. Finally, the human-intensive nature of these practices leads to time and cost overhead while allowing only a small fraction of dialogues to be analyzed. In this paper, we introduce QART (pronounced cart), an automatic real-time quality assurance system for contact centers. QART performs multi-faceted analysis on dialogue utterances, as they happen, using sophisticated statistical and rule-based natural language processing (NLP) techniques. It covers various aspects inspired by today's QA and C-Sat practices and introduces a novel incremental dialogue summarization capability. The QART front-end is an interactive dashboard providing views of ongoing dialogues at different granularities, enabling agents' supervisors to monitor them and take corrective actions as needed. We demonstrate the effectiveness of the different back-end modules, as well as of the overall system, through experimental results on a real-life contact center chat dataset.
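For readers unfamiliar with the BTL model named above, the following minimal sketch shows how a BTL preference matrix arises from per-item scores and how a ranking follows from them. It illustrates only the classic BTL special case, not the FLR model or the IPR algorithm; the score values and function names are assumptions made for the example.

```python
def btl_win_probability(score_i, score_j):
    """BTL model: probability that item i is preferred over item j,
    given positive scores for both items."""
    return score_i / (score_i + score_j)

def rank_items(scores):
    """Return item indices ordered from highest to lowest score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

# Hypothetical positive scores for three items.
scores = [1.0, 3.0, 2.0]

# Full preference matrix: entry [i][j] is P(item i beats item j).
preference_matrix = [
    [btl_win_probability(si, sj) for sj in scores] for si in scores
]
ranking = rank_items(scores)
```

In the setting of the abstract, only a small fraction of the entries of such a matrix would be observed (and observed noisily), and the ranking would have to be recovered from those entries together with the item features.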
PISCES: Participatory Incentive Strategies for Effective Community Engagement in Smart Cities
Biswas, Arpita (Xerox Research Centre India) | Chander, Deepthi (Xerox Research Centre India) | Dasgupta, Koustuv (Xerox Research Centre India) | Mukherjee, Koyel (Xerox Research Centre India) | Singh, Mridula (Xerox Research Centre India) | Mukherjee, Tridib (Xerox Research Centre India)
A key challenge in participatory sensing systems has been the design of incentive mechanisms that motivate individuals to contribute data to consuming applications. Emerging trends in urban development and smart city planning indicate the use of citizen reports to gather insights and identify areas for transformation. Consumers of these reports (e.g. city agencies) typically associate non-uniform utility (or values) with different reports based on the spatio-temporal context of the reports. For example, a report indicating traffic congestion near an airport, in early morning hours, would tend to have much higher utility than a similar report from a sparse residential area. In such cases, the design of an incentive mechanism must motivate participants, via appropriate rewards (or payments), to provide higher utility reports when compared to less valued ones. The main challenge in designing such an incentive scheme is two-fold: (i) lack of prior knowledge of participants in terms of their availability (i.e. who is in the vicinity) and reporting behaviour (i.e. what rewards are expected); and (ii) minimizing payments to the reporters while ensuring that the desired number of reports is collected. In this paper, we propose STOC-PISCES, an algorithm that guarantees a stochastic optimal solution in the generalized setting of an unknown set of participants, with non-deterministic availabilities and stochastically rational reporting behaviour. The superior performance of STOC-PISCES in experimental settings, based on real-world data, endorses its adoption as an incentive strategy in participatory sensing applications like smart city management.
Post It or Not: Viewership Based Posting of Crowdsourced Tasks
Manohar, Pallavi (Xerox Research Centre India) | Chander, Deepthi (Xerox Research Centre India) | Celis, Elisa (Ecole Polytechnique Fédérale de Lausanne (EPFL)) | Dasgupta, Koustuv (Xerox Research Centre India) | Bhattacharya, Sakyajit (Xerox Research Centre India)
We propose an online scheduling algorithm for posting crowdsourcing tasks that maximizes a novel metric called task viewership. This metric is computed using a stochastic model based on a coverage process, and it measures the likelihood that a task is viewed by multiple crowd workers, which is correlated with the likelihood that it will be selected and completed.
TRACCS: A Framework for Trajectory-Aware Coordinated Urban Crowd-Sourcing
Chen, Cen (Singapore Management University) | Cheng, Shih-Fen (Singapore Management University) | Gunawan, Aldy (Singapore Management University) | Misra, Archan (Singapore Management University) | Dasgupta, Koustuv (Xerox Research Centre India) | Chander, Deepthi (Xerox Research Centre India)
We investigate the problem of large-scale mobile crowd-tasking, where a large pool of citizen crowd-workers is used to perform a variety of location-specific urban logistics tasks. Current approaches to such mobile crowd-tasking are very decentralized: a crowd-tasking platform usually provides each worker a set of available tasks close to the worker's current location; each worker then independently chooses which tasks she wants to accept and perform. In contrast, we propose TRACCS, a more coordinated task assignment approach, where the crowd-tasking platform assigns a sequence of tasks to each worker, taking into account her expected location trajectory over a wider time horizon, as opposed to just her instantaneous location. We formulate such task assignment as an optimization problem that seeks to maximize the total payoff from all assigned tasks, subject to a maximum bound on the detour (from the expected path) that a worker will experience to complete her assigned tasks. We develop credible, computationally efficient heuristics to address this optimization problem (whose exact solution requires solving a complex integer linear program), and show, via simulations with realistic topologies and commuting patterns, that a specific heuristic (called Greedy-ILS) increases the fraction of assigned tasks by more than 20%, and reduces the average detour overhead by more than 60%, compared to the current decentralized approach.
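The optimization problem described above (maximize total payoff subject to a per-worker detour bound) can be illustrated with a very simple greedy heuristic. This sketch is not Greedy-ILS or any heuristic from the paper: it assumes straight-line routes in the plane, treats each task's detour independently and sums them per worker, and uses hypothetical names and data throughout.

```python
import math

def detour(start, end, task_loc):
    """Extra distance incurred by visiting task_loc on a straight trip
    from start to end (Euclidean distances, a simplifying assumption)."""
    return (math.dist(start, task_loc) + math.dist(task_loc, end)
            - math.dist(start, end))

def greedy_assign(workers, tasks, max_detour):
    """Assign tasks in decreasing payoff order to the worker with the
    smallest detour, as long as that worker's detour budget allows it."""
    remaining = {name: max_detour for name in workers}
    assignment = {name: [] for name in workers}
    by_payoff = sorted(tasks.items(), key=lambda kv: kv[1][1], reverse=True)
    for task_id, (loc, _payoff) in by_payoff:
        best = None  # (worker name, detour) of the cheapest feasible worker
        for name, (start, end) in workers.items():
            d = detour(start, end, loc)
            if d <= remaining[name] and (best is None or d < best[1]):
                best = (name, d)
        if best is not None:
            assignment[best[0]].append(task_id)
            remaining[best[0]] -= best[1]
    return assignment

# Hypothetical commuters (start, end) and tasks (location, payoff).
workers = {"w1": ((0, 0), (10, 0)), "w2": ((0, 5), (10, 5))}
tasks = {"t1": ((5, 0), 4.0), "t2": ((5, 5), 3.0), "t3": ((5, 50), 10.0)}
assignment = greedy_assign(workers, tasks, max_detour=2.0)
```

Here the high-payoff task `t3` stays unassigned because it lies far from both routes, which is the kind of trade-off between payoff and detour that the abstract's formulation captures.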
CrowdUtility: A Recommendation System for Crowdsourcing Platforms
Chander, Deepthi (Xerox Research Centre India) | Bhattacharya, Sakyajit (Xerox Research Centre India) | Celis, Elisa (EPFL Lausanne) | Dasgupta, Koustuv (Xerox Research Centre India) | Karanam, Saraschandra (Xerox Research Centre India) | Rajan, Vaibhav (Xerox Research Centre India) | Gupta, Avantika (Xerox Research Centre India)
Crowd workers exhibit varying work patterns, expertise, and quality, leading to wide variability in the performance of crowdsourcing platforms. The onus of choosing a suitable platform on which to post tasks is mostly on the requester, often leading to poor guarantees and unmet requirements due to the dynamism in the performance of crowd platforms. To address this, we demonstrate CrowdUtility, a tool based on statistical modelling for evaluating multiple crowdsourcing platforms and recommending the platform that best suits the requirements of the requester. CrowdUtility uses an online Multi-Armed Bandit framework to schedule tasks while optimizing platform performance. We demonstrate an end-to-end system, from requirements specification to platform recommendation to real-time monitoring.
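To make the Multi-Armed Bandit idea concrete, the sketch below runs the standard UCB1 rule over a few simulated platforms, each modelled as an arm with an unknown task success probability. The abstract does not say which bandit algorithm CrowdUtility uses, so UCB1, the Bernoulli reward model, and the probabilities below are all assumptions chosen for illustration.

```python
import math
import random

def ucb1(success_probs, rounds, seed=0):
    """Simulate UCB1 over Bernoulli arms; return how often each arm was chosen.

    success_probs are hidden from the algorithm and used only to
    simulate whether a task posted to that platform succeeds.
    """
    rng = random.Random(seed)
    n_arms = len(success_probs)
    counts = [0] * n_arms    # number of tasks posted to each platform
    totals = [0.0] * n_arms  # number of successful tasks per platform
    for t in range(1, rounds + 1):
        if t <= n_arms:
            arm = t - 1  # try every platform once before using the index
        else:
            arm = max(
                range(n_arms),
                key=lambda a: totals[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        reward = 1.0 if rng.random() < success_probs[arm] else 0.0
        counts[arm] += 1
        totals[arm] += reward
    return counts

# Three hypothetical platforms with different success rates.
counts = ucb1(success_probs=[0.3, 0.5, 0.8], rounds=5000)
recommended = max(range(len(counts)), key=lambda a: counts[a])
```

Over time the rule concentrates tasks on the platform with the best observed performance while still occasionally probing the others, which matters when platform performance is not known in advance.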
A Markov Decision Process Framework for Predictable Job Completion Times on Crowdsourcing Platforms
Lakshminarayanan, Chandrashekar (Indian Institute of Science) | Dubey, Ayush (Indian Institute of Science) | Bhatnagar, Shalabh (Indian Institute of Science) | Balamurugan, Chithralekha (Xerox Research Centre India)
Task starvation leads to huge variation in the completion times of tasks posted to the crowd. The price offered for a given task, together with the dynamics of the crowd at the time of posting, affects its completion time. Large organizations and requesters who use the crowd at regular intervals to get their tasks done desire predictability in the completion times of their tasks. Such requesters therefore have to take into account the crowd dynamics at the time of posting the tasks and price them accordingly. In this work, we study an instance of the pricing problem and propose a solution based on the framework of Markov Decision Processes (MDPs).
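The kind of pricing trade-off described above can be sketched as a toy MDP solved by value iteration. This is not the model from the paper: the state (number of pending tasks), the per-step delay cost, and the price-to-completion-probability table are all hypothetical, chosen so the result is easy to check by hand.

```python
def solve_pricing_mdp(n_tasks, completion_prob, delay_cost, tol=1e-10):
    """Value iteration for a toy pricing MDP.

    State s: number of pending tasks. Action: a price p. In each time
    step a delay cost is paid; with probability completion_prob[p] one
    task completes (and p is paid), otherwise the state is unchanged.
    Returns expected total costs V and the cost-minimizing price per state.
    """
    def q_value(V, s, p):
        q = completion_prob[p]
        return delay_cost + q * (p + V[s - 1]) + (1.0 - q) * V[s]

    V = [0.0] * (n_tasks + 1)  # V[0] = 0: nothing left to do
    while True:
        new_V = [0.0] + [
            min(q_value(V, s, p) for p in completion_prob)
            for s in range(1, n_tasks + 1)
        ]
        delta = max(abs(a - b) for a, b in zip(new_V, V))
        V = new_V
        if delta < tol:
            break
    policy = {
        s: min(completion_prob, key=lambda p: q_value(V, s, p))
        for s in range(1, n_tasks + 1)
    }
    return V, policy

# Hypothetical prices and the chance that a task completes per time step.
completion_prob = {1: 0.2, 2: 0.5, 3: 0.9}
V, policy = solve_pricing_mdp(n_tasks=3, completion_prob=completion_prob,
                              delay_cost=1.0)
```

In this toy instance the expected cost per task at price p is delay_cost / completion_prob[p] + p, so the middle price wins: the cheapest price waits too long and the highest price overpays.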