AITopics

This paper introduces a novel dataset to help researchers evaluate their computer vision and audio models for accuracy across a diverse set of age, genders, apparent skin tones and ambient lighting conditions. Our dataset is composed of 3,011 subjects and contains over 45,000 videos, with an average of 15 videos per person. The videos were recorded in multiple U.S. states with a diverse set of adults in various age, gender and apparent skin tone groups. A key feature is that each subject agreed to participate for their likenesses to be used. Additionally, our age and gender annotations are provided by the subjects themselves. A group of trained annotators labeled the subjects' apparent skin tone using the Fitzpatrick skin type scale. Moreover, annotations for videos recorded in low ambient lighting are also provided. As an application to measure robustness of predictions across certain attributes, we provide a comprehensive study on the top five winners of the DeepFake Detection Challenge (DFDC). Experimental evaluation shows that the winning models are less performant on some specific groups of people, such as subjects with darker skin tones and thus may not generalize to all people. In addition, we also evaluate the state-of-the-art apparent age and gender classification methods. Our experiments provides a through analysis on these models in terms of fair treatment of people from various backgrounds.

auc 0, dataset, positive rate, (16 more...)

2104.02821

Country:

North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Asia > India (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.86)

Robust Semantic Interpretability: Revisiting Concept Activation Vectors

Pfau, Jacob, Young, Albert T., Wei, Jerome, Wei, Maria L., Keiser, Michael J.

Interpretability methods for image classification assess model trustworthiness by attempting to expose whether the model is systematically biased or attending to the same cues as a human would. Saliency methods for feature attribution dominate the interpretability literature, but these methods do not address semantic concepts such as the textures, colors, or genders of objects within an image. Our proposed Robust Concept Activation Vectors (RCAV) quantifies the effects of semantic concepts on individual model predictions and on model behavior as a whole. RCAV calculates a concept gradient and takes a gradient ascent step to assess model sensitivity to the given concept. By generalizing previous work on concept activation vectors to account for model non-linearity, and by introducing stricter hypothesis testing, we show that RCAV yields interpretations which are both more accurate at the image level and robust at the dataset level. RCAV, like saliency methods, supports the interpretation of individual predictions. To evaluate the practical use of interpretability methods as debugging tools, and the scientific use of interpretability methods for identifying inductive biases (e.g. texture over shape), we construct two datasets and accompanying metrics for realistic benchmarking of semantic interpretability methods. Our benchmarks expose the importance of counterfactual augmentation and negative controls for quantifying the practical usability of interpretability methods.

concept sensitivity, interpretability method, semantic interpretability method, (13 more...)

2104.02768

Country:

North America > United States (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)

Genre: Research Report > Experimental Study (0.49)

Industry:

Health & Medicine > Therapeutic Area (0.68)
Health & Medicine > Diagnostic Medicine > Imaging (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.51)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Feldman, Tal, Peake, Ashley

On the Basis of Sex: A Review of Gender Bias in Machine Learning Applications

Machine Learning models have been deployed across almost every aspect of society, often in situations that affect the social welfare of many individuals. Although these models offer streamlined solutions to large problems, they may contain biases and treat groups or individuals unfairly. To our knowledge, this review is one of the first to focus specifically on gender bias in applications of machine learning. We first introduce several examples of machine learning gender bias in practice. We then detail the most widely used formalizations of fairness in order to address how to make machine learning models fairer. Specifically, we discuss the most influential bias mitigation algorithms as applied to domains in which models have a high propensity for gender discrimination. We group these algorithms into two overarching approaches -- removing bias from the data directly and removing bias from the model through training -- and we present representative examples of each. As society increasingly relies on artificial intelligence to help in decision-making, addressing gender biases present in these models is imperative. To provide readers with the tools to assess the fairness of machine learning models and mitigate the biases present in them, we discuss multiple open source packages for fairness in AI.

discrimination, fairness, proceedings, (15 more...)

2104.02532

Country:

North America > United States > New York > New York County > New York City (0.05)
Europe > Sweden > Stockholm > Stockholm (0.04)
North America > United States > Florida > Broward County > Fort Lauderdale (0.04)

Genre: Research Report (1.00)

Industry: Law > Civil Rights & Constitutional Law (0.88)

Technology:

Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

Marek, Petr, Naik, Vishal Ishwar, Auvray, Vincent, Goyal, Anuj

OodGAN: Generative Adversarial Network for Out-of-Domain Data Generation

Detecting an Out-of-Domain (OOD) utterance is crucial for a robust dialog system. Most dialog systems are trained on a pool of annotated OOD data to achieve this goal. However, collecting the annotated OOD data for a given domain is an expensive process. To mitigate this issue, previous works have proposed generative adversarial networks (GAN) based models to generate OOD data for a given domain automatically. However, these proposed models do not work directly with the text. They work with the text's latent space instead, enforcing these models to include components responsible for encoding text into latent space and decoding it back, such as auto-encoder. These components increase the model complexity, making it difficult to train. We propose OodGAN, a sequential generative adversarial network (SeqGAN) based model for OOD data generation. Our proposed model works directly on the text and hence eliminates the need to include an auto-encoder. OOD data generated using OodGAN model outperforms state-of-the-art in OOD detection metrics for ROSTD (67% relative improvement in FPR 0.95) and OSQ datasets (28% relative improvement in FPR 0.95) (Zheng et al., 2020).

classifier, ood data, ood example, (11 more...)

2104.02484

Country:

North America > United States > California > Santa Clara County > Sunnyvale (0.05)
Europe > Czechia > Prague (0.04)
North America > United States > Florida > Sarasota County > Sarasota (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)

Tamburri, Damian Andrew, Heuvel, Willem-Jan Van den, Garriga, Martin

DataOps for Societal Intelligence: a Data Pipeline for Labor Market Skills Extraction and Matching

arXiv.org Artificial IntelligenceApr-5-2021

Big Data analytics supported by AI algorithms can support skills localization and retrieval in the context of a labor market intelligence problem. We formulate and solve this problem through specific DataOps models, blending data sources from administrative and technical partners in several countries into cooperation, creating shared knowledge to support policy and decision-making. We then focus on the critical task of skills extraction from resumes and vacancies featuring state-of-the-art machine learning models. We showcase preliminary results with applied machine learning on real data from the employment agencies of the Netherlands and the Flemish region in Belgium. The final goal is to match these skills to standard ontologies of skills, jobs and occupations.

intelligence, pipeline, skill extraction, (12 more...)

doi: 10.1109/IRI49571.2020.00063

2104.01966

Country:

Europe > Belgium (0.25)
Europe > Netherlands > North Brabant > 's-Hertogenbosch (0.05)
North America > United States (0.04)
(3 more...)

Genre: Research Report (0.64)

Industry:

Banking & Finance > Economy (0.73)
Information Technology > Security & Privacy (0.69)

Technology:

Information Technology > Data Science > Data Quality (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.57)
(2 more...)

arXiv.org Machine LearningApr-5-2021

Revisiting Rashomon: A Comment on "The Two Cultures"

D'Amour, Alexander

Here, I provide some reflections on Prof. Leo Breiman's "The Two Cultures" paper. I focus specifically on the phenomenon that Breiman dubbed the "Rashomon Effect", describing the situation in which there are many models that satisfy predictive accuracy criteria equally well, but process information in the data in substantially different ways. This phenomenon can make it difficult to draw conclusions or automate decisions based on a model fit to data. I make connections to recent work in the Machine Learning literature that explore the implications of this issue, and note that grappling with it can be a fruitful area of collaboration between the algorithmic and data modeling cultures.

breiman, machine learning, rashomon, (14 more...)

arXiv.org Machine Learning

2104.0215

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > San Diego County > San Diego (0.04)

Genre: Research Report (0.65)

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

Collins, Jack H., Martín-Ramiro, Pablo, Nachman, Benjamin, Shih, David

Comparing Weak- and Unsupervised Methods for Resonant Anomaly Detection

arXiv.org Machine LearningApr-5-2021

Anomaly detection techniques are growing in importance at the Large Hadron Collider (LHC), motivated by the increasing need to search for new physics in a model-agnostic way. In this work, we provide a detailed comparative study between a well-studied unsupervised method called the autoencoder (AE) and a weakly-supervised approach based on the Classification Without Labels (CWoLa) technique. We examine the ability of the two methods to identify a new physics signal at different cross sections in a fully hadronic resonance search. By construction, the AE classification performance is independent of the amount of injected signal. In contrast, the CWoLa performance improves with increasing signal abundance. When integrating these approaches with a complete background estimate, we find that the two methods have complementary sensitivity. In particular, CWoLa is effective at finding diverse and moderately rare signals while the AE can provide sensitivity to very rare signals, but only with certain topologies. We therefore demonstrate that both techniques are complementary and can be used together for anomaly detection at the LHC.

arxiv, cwola hunting, signal region, (15 more...)

arXiv.org Machine Learning

2104.02092

Country:

North America > United States > California > Alameda County > Berkeley (0.14)
Europe > Spain > Galicia > Madrid (0.04)
North America > United States > Wisconsin (0.04)
(4 more...)

Genre: Research Report > New Finding (0.67)

Industry:

Energy (0.93)
Government > Regional Government (0.46)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

arXiv.org Artificial IntelligenceApr-4-2021

Drug Recommendation System based on Sentiment Analysis of Drug Reviews using Machine Learning

Garg, Satvik

Since coronavirus has shown up, inaccessibility of legitimate clinical resources is at its peak, like the shortage of specialists, healthcare workers, lack of proper equipment and medicines. The entire medical fraternity is in distress, which results in numerous individuals demise. Due to unavailability, people started taking medication independently without appropriate consultation, making the health condition worse than usual. As of late, machine learning has been valuable in numerous applications, and there is an increase in innovative work for automation. This paper intends to present a drug recommender system that can drastically reduce specialists heap. In this research, we build a medicine recommendation system that uses patient reviews to predict the sentiment using various vectorization processes like Bow, TFIDF, Word2Vec, and Manual Feature Analysis, which can help recommend the top drug for a given disease by different classification algorithms. The predicted sentiments were evaluated by precision, recall, f1score, accuracy, and AUC score. The results show that classifier LinearSVC using TFIDF vectorization outperforms all other models with 93% accuracy.

artificial intelligence, machine learning, natural language, (15 more...)

doi: 10.1109/Confluence51648.2021.9377188

2104.01113

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > New Jersey > Middlesex County > Piscataway (0.04)
Asia > India > Madhya Pradesh > Bhopal (0.04)
(2 more...)

Genre: Research Report > New Finding (0.49)

Industry: Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

Kamani, Mohammad Mahdi, Forsati, Rana, Wang, James Z., Mahdavi, Mehrdad

Pareto Efficient Fairness in Supervised Learning: From Extraction to Tracing

arXiv.org Artificial IntelligenceApr-4-2021

As algorithmic decision-making systems are becoming more pervasive, it is crucial to ensure such systems do not become mechanisms of unfair discrimination on the basis of gender, race, ethnicity, religion, etc. Moreover, due to the inherent trade-off between fairness measures and accuracy, it is desirable to learn fairness-enhanced models without significantly compromising the accuracy. In this paper, we propose Pareto efficient Fairness (PEF) as a suitable fairness notion for supervised learning, that can ensure the optimal trade-off between overall loss and other fairness criteria. The proposed PEF notion is definition-agnostic, meaning that any well-defined notion of fairness can be reduced to the PEF notion. To efficiently find a PEF classifier, we cast the fairness-enhanced classification as a bilevel optimization problem and propose a gradient-based method that can guarantee the solution belongs to the Pareto frontier with provable guarantees for convex and non-convex objectives. We also generalize the proposed algorithmic solution to extract and trace arbitrary solutions from the Pareto frontier for a given preference over accuracy and fairness measures. This approach is generic and can be generalized to any multicriteria optimization problem to trace points on the Pareto frontier curve, which is interesting by its own right. We empirically demonstrate the effectiveness of the PEF solution and the extracted Pareto frontier on real-world datasets compared to state-of-the-art methods.

algorithm, objective, pareto frontier, (16 more...)

2104.01634

Country:

North America > United States > Pennsylvania (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(3 more...)

Genre:

Research Report > New Finding (0.46)
Research Report > Promising Solution (0.34)

Industry: Law > Labor & Employment Law (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

#artificialintelligenceApr-3-2021, 05:00:24 GMT

How To Measure ML Model Accuracy

Machine learning (ML) is about making predictions about new data based on old data. The quality of any machine-learning algorithm is ultimately determined by the quality of those predictions. However, there is no one universal way to measure that quality across all ML applications, and that has broad implications for the value and usefulness of machine learning. "Every industry, every domain, every application has different care-abouts," said Nick Ni, director of product marketing, AI and software at Xilinx. "And you have to measure that care-about." Classification is the most familiar application, and "accuracy" is the measure used for it. But even so, there remain disagreements about exactly how accuracy should be measured or what it should mean. With other applications, it's much less clear how to measure the quality of results.

accuracy, application, segmentation, (14 more...)

#artificialintelligence

Industry: Health & Medicine (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.32)