Accuracy
RelaxLoss: Defending Membership Inference Attacks without Losing Utility
Chen, Dingfan, Yu, Ning, Fritz, Mario
As a long-term threat to the privacy of training data, membership inference attacks (MIAs) emerge ubiquitously in machine learning models. Existing work shows a strong connection between the distinguishability of the training and testing loss distributions and the model's vulnerability to MIAs. Motivated by these results, we propose a novel training framework based on a relaxed loss with a more achievable learning target, which leads to a narrower generalization gap and reduced privacy leakage. RelaxLoss is applicable to any classification model, with the added benefits of easy implementation and negligible overhead. Through extensive evaluations on five datasets with diverse modalities (images, medical data, transaction records), our approach consistently outperforms state-of-the-art defense mechanisms in terms of resilience against MIAs as well as model utility. Our defense is the first that can withstand a wide range of attacks while preserving (or even improving) the target model's utility. Source code is available at https://github.com/DingfanChen/RelaxLoss
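The link between loss distinguishability and MIA vulnerability mentioned in the abstract can be illustrated with a simple loss-threshold attack. The sketch below is not the RelaxLoss training procedure. It uses synthetic exponential loss samples, and every name and number in it (the mean losses, the sample size, the helper functions) is an assumption chosen for illustration only.

```python
import random

def threshold_attack_accuracy(member_losses, nonmember_losses, threshold):
    """Balanced accuracy of the rule 'loss < threshold => training member'."""
    tpr = sum(l < threshold for l in member_losses) / len(member_losses)
    tnr = sum(l >= threshold for l in nonmember_losses) / len(nonmember_losses)
    return 0.5 * (tpr + tnr)

def best_attack_accuracy(member_losses, nonmember_losses):
    """Best balanced accuracy over all thresholds taken from the observed losses."""
    candidates = sorted(set(member_losses) | set(nonmember_losses))
    return max(
        threshold_attack_accuracy(member_losses, nonmember_losses, t)
        for t in candidates
    )

def sample_losses(rng, mean, n):
    """Draw n synthetic per-example losses with the given mean (assumed exponential)."""
    return [rng.expovariate(1.0 / mean) for _ in range(n)]

rng = random.Random(0)
n = 500

# Hypothetical overfit model: training losses near zero, test losses much larger.
overfit_acc = best_attack_accuracy(
    sample_losses(rng, 0.05, n), sample_losses(rng, 1.0, n)
)

# Hypothetical model whose training loss stays close to the test loss.
relaxed_acc = best_attack_accuracy(
    sample_losses(rng, 0.8, n), sample_losses(rng, 1.0, n)
)

print(f"attack accuracy, large loss gap: {overfit_acc:.3f}")
print(f"attack accuracy, small loss gap: {relaxed_acc:.3f}")
```

When the two loss distributions overlap heavily, the threshold rule does little better than guessing. That is the intuition behind training toward a loss target that narrows the gap between them.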
A Conceptual Framework for Using Machine Learning to Support Child Welfare Decisions
Chor, Ka Ho Brian, Rodolfa, Kit T., Ghani, Rayid
Human services systems make key decisions that impact individuals in society. The U.S. child welfare system makes such decisions, from screening in hotline reports of suspected abuse or neglect for child protective investigations, to placing children in foster care, to returning children to permanent home settings. These complex and impactful decisions on children's lives rely on the judgment of child welfare decisionmakers. Child welfare agencies have been exploring ways to support these decisions with empirical, data-informed methods that include machine learning (ML). This paper describes a conceptual framework for ML to support child welfare decisions. The ML framework guides how child welfare agencies might conceptualize a target problem that ML can solve; vet available administrative data for building ML; formulate and develop ML specifications that mirror relevant populations and interventions the agencies are undertaking; and deploy, evaluate, and monitor ML as child welfare context, policy, and practice change over time. Ethical considerations, stakeholder engagement, and avoidance of common pitfalls underpin the framework's impact and success. From abstract to concrete, we describe one application of this framework to support a child welfare decision. This ML framework, though child welfare-focused, is generalizable to solving other public policy problems.
Causal Conceptions of Fairness and their Consequences
Nilforoshan, Hamed, Gaebler, Johann, Shroff, Ravi, Goel, Sharad
Recent work highlights the role of causality in designing equitable decision-making algorithms. It is not immediately clear, however, how existing causal conceptions of fairness relate to one another, or what the consequences are of using these definitions as design principles. Here, we first assemble and categorize popular causal definitions of algorithmic fairness into two broad families: (1) those that constrain the effects of decisions on counterfactual disparities; and (2) those that constrain the effects of legally protected characteristics, like race and gender, on decisions. We then show, analytically and empirically, that both families of definitions almost always (in a measure-theoretic sense) result in strongly Pareto dominated decision policies, meaning there is an alternative, unconstrained policy favored by every stakeholder with preferences drawn from a large, natural class. For example, in the case of college admissions decisions, policies constrained to satisfy causal fairness definitions would be disfavored by every stakeholder with neutral or positive preferences for both academic preparedness and diversity. Indeed, under a prominent definition of causal fairness, we prove the resulting policies require admitting all students with the same probability, regardless of academic qualifications or group membership. Our results highlight formal limitations and potential adverse consequences of common mathematical notions of causal fairness.
Revealing Unfair Models by Mining Interpretable Evidence
Bajaj, Mohit, Chu, Lingyang, Romaniello, Vittorio, Singh, Gursimran, Pei, Jian, Zhou, Zirui, Wang, Lanjun, Zhang, Yong
The popularity of machine learning has increased the risk of unfair models being deployed in high-stakes applications, such as the justice system, drug/vaccination design, and medical diagnosis. Although there are effective methods to train fair models from scratch, how to automatically reveal and explain the unfairness of a trained model remains a challenging task. Revealing the unfairness of machine learning models in an interpretable fashion is a critical step towards fair and trustworthy AI. In this paper, we systematically tackle the novel task of revealing unfair models by mining interpretable evidence (RUMIE). The key idea is to find solid evidence in the form of a group of data instances discriminated against most by the model. To make the evidence interpretable, we also find a set of human-understandable key attributes and decision rules that characterize the discriminated data instances and distinguish them from the other non-discriminated data. As demonstrated by extensive experiments on many real-world data sets, our method finds highly interpretable and solid evidence to effectively reveal the unfairness of trained models. Moreover, it is much more scalable than all of the baseline methods.
Inner Monologue: Embodied Reasoning through Planning with Language Models
Huang, Wenlong, Xia, Fei, Xiao, Ted, Chan, Harris, Liang, Jacky, Florence, Pete, Zeng, Andy, Tompson, Jonathan, Mordatch, Igor, Chebotar, Yevgen, Sermanet, Pierre, Brown, Noah, Jackson, Tomas, Luu, Linda, Levine, Sergey, Hausman, Karol, Ichter, Brian
Recent works have shown how the reasoning capabilities of Large Language Models (LLMs) can be applied to domains beyond natural language processing, such as planning and interaction for robots. These embodied problems require an agent to understand many semantic aspects of the world: the repertoire of skills available, how these skills influence the world, and how changes to the world map back to the language. LLMs planning in embodied environments need to consider not just what skills to perform, but also how and when to perform them; these answers change over time in response to the agent's own choices. In this work, we investigate to what extent LLMs used in such embodied contexts can reason over sources of feedback provided through natural language, without any additional training. We propose that by leveraging environment feedback, LLMs are able to form an inner monologue that allows them to more richly process and plan in robotic control scenarios. We investigate a variety of sources of feedback, such as success detection, scene description, and human interaction. We find that closed-loop language feedback significantly improves high-level instruction completion on three domains, including simulated and real tabletop rearrangement tasks and long-horizon mobile manipulation tasks in a real-world kitchen environment.
The Three Principles Of Responsible AI And How They'll Make Us Better Humans
Seventeenth-century Amsterdam is known for three things: Rembrandt, the Bubonic Plague and Tulip Mania. It was 1637, the height of the Dutch Golden Age. Tulip bulbs were scarce and demand for them soared. They were also a symbol of status. Acres of land were swapped for seeds that would yield no more than a few flowers.
Learning Mutual Fund Categorization using Natural Language Processing
Vamvourellis, Dimitrios, Toth, Mate Attila, Desai, Dhruv, Mehta, Dhagash, Pasquali, Stefano
Categorization of mutual funds or exchange-traded funds (ETFs) has long served financial analysts to perform peer analysis for various purposes, from competitor analysis to quantifying portfolio diversification. The categorization methodology usually relies on fund composition data in the structured format extracted from the Form N-1A. Here, we initiate a study to learn the categorization system directly from the unstructured data as depicted in the forms using natural language processing (NLP). These categorization systems go deeper than the broader asset class based classification (equity, fixed income, etc.) and provide further granular categories based on the portfolio breakdown. They have been used to identify the top performing as well as worst performing funds within their peer groups, called peer analysis of funds; to identify a home-grown fund to recommend against a competitor's fund; to explain similarities and advantages of homegrown products compared to competitors' products for marketing purposes; to quantify portfolio diversification of a given fund of ...
Patch-level instance-group discrimination with pretext-invariant learning for colitis scoring
Xu, Ziang, Ali, Sharib, Gupta, Soumya, Leedham, Simon, East, James E, Rittscher, Jens
Inflammatory bowel disease (IBD), in particular ulcerative colitis (UC), is graded by endoscopists, and this assessment is the basis for risk stratification and therapy monitoring. Presently, endoscopic characterisation is largely operator-dependent, sometimes leading to undesirable clinical outcomes for patients with IBD. We focus on the Mayo Endoscopic Scoring (MES) system, which is widely used but requires the reliable identification of subtle changes in mucosal inflammation. Most existing deep learning classification methods cannot detect these fine-grained changes, which makes UC grading such a challenging task. In this work, we introduce a novel patch-level instance-group discrimination with pretext-invariant representation learning (PLD-PIRL) for self-supervised learning (SSL). Our experiments demonstrate both improved accuracy and robustness compared to the baseline supervised network and several state-of-the-art SSL methods. Compared to the baseline (ResNet50) supervised classification, our proposed PLD-PIRL obtained an improvement of 4.75% on hold-out test data and 6.64% on unseen center test data for top-1 accuracy.
Big Data in soccer: Creating an xG model - Damavis Blog
The ability to collect and process large amounts of data represents additional value for many companies in today's market. The world of sports has been no exception, starting with baseball with the emergence of SABRmetrics in the 1980s, through motor racing to sports such as basketball and soccer more recently. The creation of models and metrics through artificial intelligence allows sports fans to analyze the game from another perspective and, for their professionals, to gain a competitive advantage over their rivals. In the case of soccer, probably the most popular metric is the one known as expected goal (xG). The xG is intended to measure the probability that a shot will result in a goal, taking into account variables such as the position of the shot, the position of the goalkeeper or the part of the body with which the shot is taken. As it is a probability, it should take values between 0 and 1, so that for the clearest opportunities (for example, a shot inside the small area without a goalkeeper) it takes values close to 1, and for shots further away or with greater difficulty it tends to 0. This metric is very useful for coaching staffs and scouting teams to evaluate the finishing or chance-creating ability of different players.
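To make the idea of xG as a probability concrete, here is a minimal sketch of a logistic model that maps two shot features (distance to goal and the angle subtended by the goalposts) to a value between 0 and 1. It is not the model built in the article. The pitch dimensions, the feature choice and especially the weights are illustrative assumptions, and the weights are not fitted to any data.

```python
import math

# Assumed pitch geometry in metres: 105 x 68 pitch, goal centred on the end line.
GOAL_X = 105.0
GOAL_Y = 34.0
GOAL_WIDTH = 7.32

def shot_features(x, y):
    """Return (distance to goal centre, angle subtended by the two posts)."""
    dx = GOAL_X - x
    distance = math.hypot(dx, GOAL_Y - y)
    # Angle of each post as seen from the shot location.
    upper = math.atan2(GOAL_Y + GOAL_WIDTH / 2 - y, dx)
    lower = math.atan2(GOAL_Y - GOAL_WIDTH / 2 - y, dx)
    return distance, abs(upper - lower)

def expected_goal(x, y, weights=(-1.0, -0.1, 1.5)):
    """Logistic xG estimate; the weights are hypothetical, not learned values."""
    intercept, w_distance, w_angle = weights
    distance, angle = shot_features(x, y)
    z = intercept + w_distance * distance + w_angle * angle
    return 1.0 / (1.0 + math.exp(-z))

close_shot = expected_goal(103.0, 34.0)  # two metres out, central
far_shot = expected_goal(75.0, 34.0)     # thirty metres out, central

print(f"close shot xG: {close_shot:.3f}")
print(f"far shot xG:   {far_shot:.3f}")
```

In a real model the weights would be learned from historical shot data, and further variables such as goalkeeper position or body part would enter as additional features.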
Data Science Essentials -- AI Ethics (III)
Originally published on Towards AI. This article is the third part of the AI Ethics for Data Science essential series.