Evaluating and Characterizing Human Rationales
Carton, Samuel, Rathore, Anirudh, Tan, Chenhao
Two main approaches for evaluating the quality of machine-generated rationales are: 1) using human rationales as a gold standard; and 2) automated metrics based on how rationales affect model behavior. An open question, however, is how human rationales fare under these automated metrics. Analyzing a variety of datasets and models, we find that human rationales do not necessarily perform well on these metrics. To unpack this finding, we propose improved metrics that account for model-dependent baseline performance. We then propose two methods to further characterize rationale quality, one based on model retraining and one based on "fidelity curves" that reveal properties such as irrelevance and redundancy. Our work leads to actionable suggestions for evaluating and characterizing rationales.
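As an illustration of the kind of metric this abstract describes, the sketch below computes a sufficiency-style fidelity score normalized by the model's no-input baseline, plus a simple fidelity curve. The `model.predict_proba` interface over token lists and the token-level importances are assumptions for the example, not the paper's actual API.

```python
# A minimal sketch, assuming a classifier exposing `predict_proba` over
# token lists; illustrative only, not the paper's implementation.
import numpy as np

def sufficiency(model, tokens, rationale_mask, label):
    """Model confidence in `label` when shown only the rationale tokens."""
    rationale_only = [t for t, keep in zip(tokens, rationale_mask) if keep]
    return model.predict_proba(rationale_only)[label]

def normalized_fidelity(model, tokens, rationale_mask, label):
    """Rescale sufficiency so 0 = empty input and 1 = full input.

    Normalizing by the empty-input baseline keeps a model that is already
    confident with no input at all from crediting the rationale.
    """
    full = model.predict_proba(tokens)[label]
    empty = model.predict_proba([])[label]        # model-dependent baseline
    suff = sufficiency(model, tokens, rationale_mask, label)
    return float(np.clip((suff - empty) / max(full - empty, 1e-8), 0.0, 1.0))

def fidelity_curve(model, tokens, importance, label, steps=10):
    """Sufficiency as more top-ranked rationale tokens are revealed.

    A curve that saturates early suggests redundancy in the rationale;
    one that stays flat suggests irrelevant tokens.
    """
    order = np.argsort(importance)[::-1]
    curve = []
    for k in np.linspace(0, len(tokens), steps, dtype=int):
        mask = np.zeros(len(tokens), dtype=bool)
        mask[order[:k]] = True
        curve.append(sufficiency(model, tokens, mask, label))
    return curve
```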
Characterizing the Value of Information in Medical Notes
Hsu, Chao-Chun, Karnwal, Shantanu, Mullainathan, Sendhil, Obermeyer, Ziad, Tan, Chenhao
Machine learning models depend on the quality of input data. As electronic health records are widely adopted, the amount of data in health care is growing, along with complaints about the quality of medical notes. We use two prediction tasks, readmission prediction and in-hospital mortality prediction, to characterize the value of information in medical notes. We show that, as a whole, medical notes provide additional predictive power over structured information only in readmission prediction. We further propose a probing framework to select parts of notes that enable more accurate predictions than using all notes, even though the selected information leads to a distribution shift from the training data ("all notes"). Finally, we demonstrate that models trained on the selected valuable information achieve even better predictive performance, with only 6.8% of all the tokens for readmission prediction.
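A minimal sketch of the probing idea follows: score each note segment by how much it moves a trained predictor toward the observed outcome, then keep only the top-scoring segments. `predictor`, `segment`, and the budget are illustrative assumptions, not the paper's framework.

```python
# Oracle-style probing sketch: the observed outcome is used deliberately,
# to ask which parts of a note *could* support a more accurate prediction.
def select_valuable_segments(predictor, note, outcome, budget=0.1):
    segments = segment(note)                      # hypothetical splitter,
                                                  # e.g., sentences/sections
    base = predictor.predict_proba("")[outcome]   # no-note baseline
    scored = sorted(segments,
                    key=lambda s: predictor.predict_proba(s)[outcome] - base,
                    reverse=True)
    k = max(1, int(budget * len(segments)))       # small token budget
    return scored[:k]
```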
Explaining Machine Learning Classifiers through Diverse Counterfactual Explanations
Mothilal, Ramaravind Kommiya, Sharma, Amit, Tan, Chenhao
Post-hoc explanations of machine learning models are crucial for people to understand and act on algorithmic predictions. An intriguing class of explanations is through counterfactuals, hypothetical examples that show people how to obtain a different prediction. We posit that effective counterfactual explanations should satisfy two properties: feasibility of the counterfactual actions given user context and constraints, and diversity among the counterfactuals presented. To this end, we propose a framework for generating and evaluating a diverse set of counterfactual explanations based on average distance and determinantal point processes. To evaluate the actionability of counterfactuals, we provide metrics that enable comparison of counterfactual-based methods to other local explanation methods. We further address necessary tradeoffs and point to causal implications in optimizing for counterfactuals. Our experiments on three real-world datasets show that our framework can generate sets of counterfactuals that are diverse and that closely approximate local decision boundaries.
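The two quantities the abstract names, average distance and a determinantal point process over the candidate set, can be sketched directly. The L1 distance, the loss weights, and the `model.predict_proba` interface below are assumptions for illustration, not the authors' released implementation.

```python
# Hedged sketch of a diversity-aware counterfactual objective.
import numpy as np

def dist(a, b):
    return np.linalg.norm(a - b, ord=1)           # L1 distance as a stand-in

def proximity(x, cfs):
    """Negative average distance to the input: higher means closer."""
    return -np.mean([dist(x, c) for c in cfs])

def dpp_diversity(cfs):
    """det(K) with K_ij = 1 / (1 + dist(c_i, c_j)); larger = more spread."""
    k = len(cfs)
    K = np.array([[1.0 / (1.0 + dist(cfs[i], cfs[j])) for j in range(k)]
                  for i in range(k)])
    return np.linalg.det(K)

def cf_set_loss(model, x, cfs, target, w_prox=0.5, w_div=1.0):
    """Lower is better: reach the target class, stay close, stay diverse."""
    validity = np.mean([model.predict_proba(c)[target] for c in cfs])
    return -(validity + w_prox * proximity(x, cfs) + w_div * dpp_diversity(cfs))
```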
Learning Fair Representations via an Adversarial Framework
Feng, Rui, Yang, Yang, Lyu, Yuehan, Tan, Chenhao, Sun, Yizhou, Wang, Chunping
Fairness has become a central issue for our research community as classification algorithms are adopted in societally critical domains such as recidivism prediction and loan approval. In this work, we consider the potential bias based on protected attributes (e.g., race and gender), and tackle this problem by learning latent representations of individuals that are statistically indistinguishable between protected groups while sufficiently preserving other information for classification. To this end, we develop a minimax adversarial framework with a generator to capture the data distribution and generate latent representations, and a critic to ensure that the distributions across different protected groups are similar. Our framework provides a theoretical guarantee with respect to statistical parity and individual fairness. Empirical results on four real-world datasets also show that the learned representation can effectively be used for classification tasks such as credit risk prediction while obstructing information related to protected groups, especially when removing protected attributes is not sufficient for fair classification.
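A minimal PyTorch sketch of the minimax setup follows: an encoder produces representations, a critic tries to tell protected groups apart from them, and the encoder is trained to stay predictive of the task label while fooling the critic. Dimensions, architectures, and the logistic critic are assumptions (the paper's critic enforces distributional similarity; a Wasserstein-style critic is a common alternative).

```python
# Hedged sketch of adversarial fair-representation learning; not the
# authors' implementation.
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 16))
task_head = nn.Linear(16, 2)                       # e.g., credit risk
critic = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 1))

opt_enc = torch.optim.Adam(
    list(encoder.parameters()) + list(task_head.parameters()), lr=1e-3)
opt_cri = torch.optim.Adam(critic.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()
ce = nn.CrossEntropyLoss()

def train_step(x, y, a, fairness_weight=1.0):
    """x: features, y: task labels, a: protected attribute in {0, 1}."""
    # 1) Train the critic to distinguish protected groups from z.
    z = encoder(x).detach()
    opt_cri.zero_grad()
    critic_loss = bce(critic(z).squeeze(1), a.float())
    critic_loss.backward()
    opt_cri.step()
    # 2) Train encoder + task head: predict y while fooling the critic.
    z = encoder(x)
    opt_enc.zero_grad()
    task_loss = ce(task_head(z), y)
    adv_loss = -bce(critic(z).squeeze(1), a.float())  # maximize critic error
    (task_loss + fairness_weight * adv_loss).backward()
    opt_enc.step()
```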
No Permanent Friends or Enemies: Tracking Relationships between Nations from News
Han, Xiaochuang, Choi, Eunsol, Tan, Chenhao
Understanding the dynamics of international politics is important yet challenging for civilians. In this work, we explore unsupervised neural models to infer relations between nations from news articles. We extend existing models by incorporating shallow linguistic information and propose a new automatic evaluation metric that aligns relationship dynamics with manually annotated key events. As understanding international relations requires carefully analyzing complex relationships, we conduct in-person human evaluations with three groups of participants. Overall, humans prefer the outputs of our model and give insightful feedback that suggests future directions for human-centered models. Furthermore, our model reveals interesting regional differences in news coverage. For instance, with respect to US-China relations, Singaporean media focus more on "strengthening" and "purchasing", while US media focus more on "criticizing" and "denouncing".
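One way to read the automatic evaluation the abstract mentions is to check whether the largest shifts in a nation-pair's relation time series land near manually annotated key events. The window, top-k cutoff, and day-indexed inputs below are assumptions for illustration, not the paper's exact metric.

```python
# Hedged sketch: do the model's biggest relation-score shifts align with
# annotated key events?
import numpy as np

def aligned_changes(scores, event_days, window=7, top_k=10):
    """Fraction of the top-k largest day-to-day shifts in `scores` that
    fall within `window` days of an annotated key event."""
    deltas = np.abs(np.diff(scores))
    change_days = np.argsort(deltas)[::-1][:top_k] + 1
    hits = sum(any(abs(int(d) - e) <= window for e in event_days)
               for d in change_days)
    return hits / top_k
```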
Ask Not What AI Can Do, But What AI Should Do: Towards a Framework of Task Delegability
Lubars, Brian, Tan, Chenhao
Although artificial intelligence holds promise for addressing societal challenges, issues of exactly which tasks to automate and to what extent to do so remain understudied. We approach the problem of task delegability from a human-centered perspective by developing a framework on human perception of task delegation to artificial intelligence. We consider four high-level factors that can contribute to a delegation decision: motivation, difficulty, risk, and trust. To obtain an empirical understanding of human preferences in different tasks, we build a dataset of 100 tasks from academic papers, popular media portrayal of AI, and everyday life. For each task, we administer a survey to collect judgments of each factor and ask subjects to pick the extent to which they prefer AI involvement. We find little preference for full AI control and a strong preference for machine-in-the-loop designs, in which humans play the leading role. Our framework can effectively predict human preferences in degrees of AI assistance. Among the four factors, trust is the most predictive of human preferences for optimal human-machine delegation. This framework represents a first step towards characterizing human preferences for automation across tasks. We hope this work may encourage and aid future efforts towards understanding such individual attitudes; our goal is to inform the public and the AI research community rather than to dictate any direction in technology development.
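As an illustration of predicting delegation preferences from the four factors, the sketch below fits a simple logistic model to per-task factor ratings. The ratings and labels are fabricated placeholders, and the binary target (human-leading vs. open to AI control) is an assumption, not the paper's exact setup.

```python
# Hedged sketch: which factor best predicts preferred AI involvement?
import numpy as np
from sklearn.linear_model import LogisticRegression

# Rows: per-task mean survey ratings [motivation, difficulty, risk, trust].
X = np.array([[3.1, 4.2, 4.5, 2.0],
              [4.0, 2.1, 1.2, 4.3],
              [2.5, 3.3, 3.9, 2.8],
              [4.4, 1.8, 1.0, 4.6]])
y = np.array([0, 1, 0, 1])   # 0 = human should lead, 1 = open to AI control

clf = LogisticRegression(max_iter=1000).fit(X, y)
# Per-factor coefficients hint at which factor drives predictions
# (in the paper's data, trust dominates).
print(dict(zip(["motivation", "difficulty", "risk", "trust"], clf.coef_[0])))
```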
Reports of the Workshops Held at the 2018 International AAAI Conference on Web and Social Media
Managing Editor (AAAI) | An, Jisun (Qatar Computing Research Institute) | Chunara, Rumi (New York University) | Crandall, David J. (Indiana University) | Frajberg, Darian (Politecnico di Milano) | French, Megan (Stanford University) | Jansen, Bernard J. (Qatar Computing Research Institute) | Kulshrestha, Juhi (GESIS - Leibniz Institute for the Social Sciences) | Mejova, Yelena (Qatar Computing Research Institute) | Romero, Daniel M. (University of Michigan) | Salminen, Joni (Qatar Computing Research Institute) | Sharma, Amit (Microsoft Research India) | Sheth, Amit (Wright State University) | Tan, Chenhao (University of Colorado Boulder) | Taylor, Samuel Hardman (Cornell University) | Wijeratne, Sanjaya (Wright State University)
The Workshop Program of the Association for the Advancement of Artificial Intelligence’s 12th International Conference on Web and Social Media (ICWSM-18) was held at Stanford University, Stanford, California USA, on Monday, June 25, 2018. There were fourteen workshops in the program: Algorithmic Personalization and News: Risks and Opportunities; Beyond Online Data: Tackling Challenging Social Science Questions; Bridging the Gaps: Social Media, Use and Well-Being; Chatbot; Data-Driven Personas and Human-Driven Analytics: Automating Customer Insights in the Era of Social Media; Designed Data for Bridging the Lab and the Field: Tools, Methods, and Challenges in Social Media Experiments; Emoji Understanding and Applications in Social Media; Event Analytics Using Social Media Data; Exploring Ethical Trade-Offs in Social Media Research; Making Sense of Online Data for Population Research; News and Public Opinion; Social Media and Health: A Focus on Methods for Linking Online and Offline Data; Social Web for Environmental and Ecological Monitoring; and the ICWSM Science Slam. Workshops were held on the first day of the conference. Workshop participants met and discussed issues with a selected focus, providing an informal setting for active exchange among researchers, developers, and users on topics of current interest. Organizers from nine of the workshops submitted reports, which are reproduced in this report. Brief summaries of the other five workshops have been reproduced from their website descriptions.
On Human Predictions with Explanations and Predictions of Machine Learning Models: A Case Study on Deception Detection
Lai, Vivian, Tan, Chenhao
Humans are the final decision makers in critical tasks that involve ethical and legal concerns, ranging from recidivism prediction to medical diagnosis to fighting fake news. Although machine learning models can sometimes achieve impressive performance in these tasks, these tasks are not amenable to full automation. To realize the potential of machine learning for improving human decisions, it is important to understand how assistance from machine learning models affects human performance and human agency. In this paper, we use deception detection as a testbed and investigate how we can harness explanations and predictions of machine learning models to improve human performance while retaining human agency. We propose a spectrum between full human agency and full automation, and develop varying levels of machine assistance along the spectrum that gradually increase the influence of machine predictions. We find that without showing predicted labels, explanations alone do not statistically significantly improve human performance on the end task. In comparison, human performance is greatly improved by showing predicted labels (>20% relative improvement) and can be further improved by explicitly suggesting strong machine performance. Interestingly, when predicted labels are shown, explanations of machine predictions induce a level of accuracy similar to that of an explicit statement of strong machine performance. Our results demonstrate a tradeoff between human performance and human agency and show that explanations of machine predictions can moderate this tradeoff.
Urban Dreams of Migrants: A Case Study of Migrant Integration in Shanghai
Yang, Yang (Zhejiang University) | Tan, Chenhao (University of Colorado Boulder) | Liu, Zongtao (Zhejiang University) | Wu, Fei (Zhejiang University) | Zhuang, Yueting (Zhejiang University)
Unprecedented human mobility has driven rapid urbanization around the world. In China, the fraction of the population dwelling in cities increased from 17.9% to 52.6% between 1978 and 2012. Such large-scale migration poses challenges for policymakers and important questions for researchers. To investigate the process of migrant integration, we employ a complete one-month dataset of telecommunication metadata in Shanghai, with 54 million users and 698 million call logs. We find systematic differences between locals and migrants in their mobile communication networks and geographical locations. For instance, migrants have more diverse contacts and move around the city with a larger radius than locals after they settle down. By distinguishing new migrants (who recently moved to Shanghai) from settled migrants (who have been in Shanghai for a while), we demonstrate the integration process of new migrants in their first three weeks. Moreover, we formulate classification problems to predict whether a person is a migrant. Our classifier is able to achieve an F1-score of 0.82 when distinguishing settled migrants from locals, but it remains challenging to identify new migrants because of class imbalance. This classification setup holds promise for identifying new migrants who will successfully integrate with locals (i.e., new migrants who are misclassified as locals).
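A minimal sketch of this kind of imbalanced classification setup follows, with fabricated placeholder features (e.g., contact diversity, movement radius) rather than the study's telecommunication data.

```python
# Hedged sketch: rare-class prediction with class weighting and F1 scoring.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_locals, n_new = 5000, 150                  # new migrants are rare
X = np.vstack([rng.normal(0.0, 1.0, (n_locals, 4)),
               rng.normal(0.6, 1.0, (n_new, 4))])
y = np.array([0] * n_locals + [1] * n_new)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
clf = RandomForestClassifier(class_weight="balanced",
                             random_state=0).fit(X_tr, y_tr)
print("F1 on the rare class:", f1_score(y_te, clf.predict(X_te)))
```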