 Accuracy


Margin Calibration for Long-Tailed Visual Recognition

arXiv.org Artificial Intelligence

The long-tailed class distribution in visual recognition tasks poses great challenges for neural networks on how to handle the biased predictions between head and tail classes, i.e., the model tends to classify tail classes as head classes. While existing research focused on data resampling and loss function engineering, in this paper, we take a different perspective: the classification margins. We study the relationship between the margins and logits (classification scores) and empirically observe the biased margins and the biased logits are positively correlated. We propose MARC, a simple yet effective MARgin Calibration function to dynamically calibrate the biased margins for unbiased logits. We validate MARC through extensive experiments on common long-tailed benchmarks including CIFAR-LT, ImageNet-LT, Places-LT, and iNaturalist-LT. Experimental results demonstrate that our MARC achieves favorable results on these benchmarks. In addition, MARC is extremely easy to implement with just three lines of code. We hope this simple method will motivate people to rethink the biased margins and biased logits in long-tailed visual recognition.
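
The abstract does not state the calibration function itself, so the following is only a minimal sketch of what a per-class margin calibration could look like. It assumes an affine form in which each class logit is rescaled by a learnable factor and shifted by a learnable offset times the norm of that class's classifier weight. The names `calibrate_logits`, `omega`, and `beta` are illustrative assumptions, not the authors' code.

```python
import math

def calibrate_logits(logits, weight_rows, omega, beta):
    """Apply a per-class affine calibration to fixed logits.

    Assumed form (not given in the abstract):
        calibrated[j] = omega[j] * logits[j] + beta[j] * ||W_j||
    where W_j is the classifier weight row for class j.
    """
    calibrated = []
    for z, w, om, be in zip(logits, weight_rows, omega, beta):
        norm = math.sqrt(sum(v * v for v in w))  # L2 norm of the class weight row
        calibrated.append(om * z + be * norm)
    return calibrated

# Toy example: a head class (index 0) and a tail class (index 1).
logits = [2.0, 1.5]
weights = [[3.0, 4.0], [0.0, 1.0]]  # hypothetical weight rows
omega = [1.0, 1.0]                  # hypothetical learned scales
beta = [0.0, 1.0]                   # hypothetical learned offsets
print(calibrate_logits(logits, weights, omega, beta))
```

In this toy case the head class wins before calibration and the tail class wins after it, which is the kind of correction the abstract describes.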


Role of Human-AI Interaction in Selective Prediction

arXiv.org Artificial Intelligence

Recent work has shown the potential benefit of selective prediction systems that can learn to defer to a human when the predictions of the AI are unreliable, particularly to improve the reliability of AI systems in high-stakes applications like healthcare or conservation. However, most prior work assumes that human behavior remains unchanged when they solve a prediction task as part of a human-AI team as opposed to by themselves. We show that this is not the case by performing experiments to quantify human-AI interaction in the context of selective prediction. In particular, we study the impact of communicating different types of information to humans about the AI system's decision to defer. Using real-world conservation data and a selective prediction system that improves expected accuracy over that of the human or AI system working individually, we show that this messaging has a significant impact on the accuracy of human judgements. Our results study two components of the messaging strategy: 1) Whether humans are informed about the prediction of the AI system and 2) Whether they are informed about the decision of the selective prediction system to defer. By manipulating these messaging components, we show that it is possible to significantly boost human performance by informing the human of the decision to defer, but not revealing the prediction of the AI. We therefore show that it is vital to consider how the decision to defer is communicated to a human when designing selective prediction systems, and that the composite accuracy of a human-AI team must be carefully evaluated using a human-in-the-loop framework.
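
As a rough illustration of selective prediction, the sketch below defers to a human whenever the model's top class probability falls below a threshold, then scores the combined human-AI team. The confidence-threshold rule, the function names, and the default of 0.8 are assumptions for illustration; the abstract does not specify the deferral mechanism. The sketch also treats human labels as fixed, which is exactly the assumption the paper argues does not hold in practice.

```python
def selective_predict(probs, threshold=0.8):
    """Return (label, deferred). Defer when the top probability is below threshold."""
    confidence = max(probs)
    if confidence < threshold:
        return None, True
    return probs.index(confidence), False

def team_accuracy(prob_rows, human_labels, true_labels, threshold=0.8):
    """Accuracy of the composite system: AI label if kept, human label if deferred."""
    correct = 0
    for probs, human, truth in zip(prob_rows, human_labels, true_labels):
        label, deferred = selective_predict(probs, threshold)
        final = human if deferred else label
        correct += int(final == truth)
    return correct / len(true_labels)

# Hypothetical data: three examples, two classes.
prob_rows = [[0.9, 0.1], [0.55, 0.45], [0.2, 0.8]]
human_labels = [1, 1, 0]
true_labels = [0, 1, 1]
print(team_accuracy(prob_rows, human_labels, true_labels))
```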


Linear Discriminant Analysis with High-dimensional Mixed Variables

arXiv.org Machine Learning

Datasets containing both categorical and continuous variables are frequently encountered in many areas, and with the rapid development of modern measurement technologies, the dimensions of these variables can be very high. Despite the recent progress made in modelling high-dimensional data for continuous variables, there is a scarcity of methods that can deal with a mixed set of variables. To fill this gap, this paper develops a novel approach for classifying high-dimensional observations with mixed variables. Our framework builds on a location model, in which the distributions of the continuous variables conditional on categorical ones are assumed Gaussian. We overcome the challenge of having to split data into exponentially many cells, or combinations of the categorical variables, by kernel smoothing, and provide new perspectives for its bandwidth choice to ensure an analogue of Bochner's Lemma, which is different to the usual bias-variance tradeoff. We show that the two sets of parameters in our model can be separately estimated and provide penalized likelihood for their estimation. Results on the estimation accuracy and the misclassification rates are established, and the competitive performance of the proposed classifier is illustrated by extensive simulation and real data studies.


Attentive Contextual Carryover for Multi-Turn End-to-End Spoken Language Understanding

arXiv.org Artificial Intelligence

Recent years have seen significant advances in end-to-end (E2E) spoken language understanding (SLU) systems, which directly predict intents and slots from spoken audio. While dialogue history has been exploited to improve conventional text-based natural language understanding systems, current E2E SLU approaches have not yet incorporated such critical contextual signals in multi-turn and task-oriented dialogues. In this work, we propose a contextual E2E SLU model architecture that uses a multi-head attention mechanism over encoded previous utterances and dialogue acts (actions taken by the voice assistant) of a multi-turn dialogue. We detail alternative methods to integrate these contexts into the state-of-the-art recurrent and transformer-based models. When applied to a large de-identified dataset of utterances collected by a voice assistant, our method reduces average word and semantic error rates by 10.8% and 12.6%, respectively. We also present results on a publicly available dataset and show that our method significantly improves performance over a non-contextual baseline.


Why My Model with 90% Accuracy Doesn't Work

#artificialintelligence

When you're dealing with marketing problems like customer churn prediction (churn is when a customer stops using a company's product over a certain period of time), the raw dataset is often imbalanced, meaning that the classes are not equally represented. Basically, the percentage of your customers who churn might be a lot lower than the percentage who don't. In this example, the binary classification problem might have an 80–20 split, with only 20% of customers discontinuing their engagement with the company and 80% continuing to make purchases. The problem is, that 20% could be VERY important to the business's bottom line. Think about it: a gifting company has 100,000 customers with an average value of $50 per person.
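
To make the 80–20 point concrete, here is a small sketch with made-up labels: a model that always predicts "no churn" scores 80% accuracy while catching none of the churners. The helper names and the data are illustrative assumptions.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def recall(y_true, y_pred, positive=1):
    """Fraction of actual positives that the model identified."""
    positives = [p for t, p in zip(y_true, y_pred) if t == positive]
    if not positives:
        return 0.0
    return sum(p == positive for p in positives) / len(positives)

# Hypothetical 80-20 split: 1 = churned, 0 = stayed.
y_true = [1] * 20 + [0] * 80
y_pred = [0] * 100  # a "model" that always predicts the majority class

print(accuracy(y_true, y_pred))  # 0.8
print(recall(y_true, y_pred))    # 0.0
```

High accuracy here says nothing about the class the business cares about, which is why recall (or a similar per-class metric) is worth reporting alongside it.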


WOOD: Wasserstein-based Out-of-Distribution Detection

arXiv.org Machine Learning

The training and test data for deep-neural-network-based classifiers are usually assumed to be sampled from the same distribution. When part of the test samples are drawn from a distribution that is sufficiently far away from that of the training samples (a.k.a. out-of-distribution (OOD) samples), the trained neural network has a tendency to make high confidence predictions for these OOD samples. Detection of the OOD samples is critical when training a neural network used for image classification, object detection, etc. It can enhance the classifier's robustness to irrelevant inputs, and improve the system resilience and security under different forms of attacks. Detection of OOD samples has three main challenges: (i) the proposed OOD detection method should be compatible with various architectures of classifiers (e.g., DenseNet, ResNet), without significantly increasing the model complexity and requirements on computational resources; (ii) the OOD samples may come from multiple distributions, whose class labels are commonly unavailable; (iii) a score function needs to be defined to effectively separate OOD samples from in-distribution (InD) samples. To overcome these challenges, we propose a Wasserstein-based out-of-distribution detection (WOOD) method. The basic idea is to define a Wasserstein-distance-based score that evaluates the dissimilarity between a test sample and the distribution of InD samples. An optimization problem is then formulated and solved based on the proposed score function. The statistical learning bound of the proposed method is investigated to guarantee that the loss value achieved by the empirical optimizer approximates the global optimum. The comparison study results demonstrate that the proposed WOOD consistently outperforms other existing OOD detection methods.
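
The abstract does not give the score function, so the sketch below shows just one possible reading: score a test sample by the smallest Wasserstein distance between the classifier's predicted class distribution and any one-hot label. Because a one-hot target is a point mass, the optimal transport plan simply moves all probability to that class, so the distance is a cost-weighted sum. The 0/1 ground cost and all function names are assumptions for illustration.

```python
def zero_one_cost(n):
    """Ground cost between classes: 0 on the diagonal, 1 elsewhere (an assumption)."""
    return [[0.0 if i == j else 1.0 for j in range(n)] for i in range(n)]

def wasserstein_to_one_hot(probs, k, cost):
    """Wasserstein distance from `probs` to the point mass on class k.

    With a point-mass target, all mass is transported to class k,
    so the distance is sum_i probs[i] * cost[i][k].
    """
    return sum(p * cost[i][k] for i, p in enumerate(probs))

def ood_score(probs, cost):
    """Smallest distance to any one-hot label; larger values suggest OOD."""
    return min(wasserstein_to_one_hot(probs, k, cost) for k in range(len(probs)))

cost = zero_one_cost(4)
confident = [0.97, 0.01, 0.01, 0.01]  # hypothetical in-distribution prediction
uncertain = [0.25, 0.25, 0.25, 0.25]  # hypothetical OOD-like prediction
print(ood_score(confident, cost) < ood_score(uncertain, cost))  # True
```

With the 0/1 cost this reduces to one minus the maximum class probability; a different ground cost would weight confusions between classes differently.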


Programming 'Fairness' into Your Machine Learning Model

#artificialintelligence

To combat ethical risk proactively without sacrificing model performance, we must first define 'fairness'. A model is considered 'fair' if it gives similar predictions to similar groups or individuals. In more detail, a model is 'fair' if the predictor has equal true positive rates across groups for the positive outcome and equal false positive rates across groups for the negative outcome. Next, we can break up our bias detection and mitigation techniques into phases, the same phases that govern the development of an AI model: data understanding & preparation; model development & post-processing; and model evaluation & auditing.
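
The rate-based definition above (often called equalized odds) can be checked directly. The sketch below computes the true positive rate and false positive rate per group and reports the largest gap in each; the function names and the toy data are illustrative assumptions.

```python
def rates(y_true, y_pred):
    """Return (true positive rate, false positive rate) for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr

def equalized_odds_gaps(y_true, y_pred, groups):
    """Return (TPR gap, FPR gap): the max minus min rate across groups."""
    per_group = {}
    for g in set(groups):
        idx = [i for i, gi in enumerate(groups) if gi == g]
        per_group[g] = rates([y_true[i] for i in idx], [y_pred[i] for i in idx])
    tprs = [r[0] for r in per_group.values()]
    fprs = [r[1] for r in per_group.values()]
    return max(tprs) - min(tprs), max(fprs) - min(fprs)

# Hypothetical predictions for two groups, "a" and "b".
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0]
groups = ["a"] * 4 + ["b"] * 4
print(equalized_odds_gaps(y_true, y_pred, groups))  # (0.5, 0.5)
```

Gaps near zero indicate the predictor treats the groups similarly under this definition; how small is small enough is a judgment call that the snippet does not settle.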


Bootstrapping an Artificial Intelligence Startup with Services: Nitesh Chawla, Founder, Aunalytics (Part 1)

#artificialintelligence

Where were you born and raised, and in what kind of background? Nitesh Chawla: I'm from New Delhi although I was born in Calcutta. I did my schooling from Delhi Public School. Then I did my engineering from Pune. Then I came to the United States in 1993 right after finishing my undergrad.


Hyperdimensional Feature Fusion for Out-Of-Distribution Detection

arXiv.org Artificial Intelligence

We introduce powerful ideas from Hyperdimensional Computing into the challenging field of Out-of-Distribution (OOD) detection. In contrast to most existing work that performs OOD detection based on only a single layer of a neural network, we use similarity-preserving semi-orthogonal projection matrices to project the feature maps from multiple layers into a common vector space. By repeatedly applying the bundling operation $\oplus$, we create expressive class-specific descriptor vectors for all in-distribution classes. At test time, a simple and efficient cosine similarity calculation between descriptor vectors consistently identifies OOD samples with better performance than the current state-of-the-art. We show that the hyperdimensional fusion of multiple network layers is critical to achieve best general performance.
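
A minimal sketch of the bundling and cosine-similarity steps described above, assuming bundling is an element-wise sum and that feature vectors have already been projected into a common space (the semi-orthogonal projection step is omitted here). Function names, class names, and the toy vectors are illustrative assumptions, not the authors' implementation.

```python
import math

def bundle(vectors):
    """Bundle vectors by element-wise summation (one common choice, assumed here)."""
    return [sum(col) for col in zip(*vectors)]

def cosine(a, b):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def max_similarity(query, descriptors):
    """Highest similarity to any class descriptor; low values suggest OOD."""
    return max(cosine(query, d) for d in descriptors.values())

# Hypothetical, already-projected training features for two classes.
descriptors = {
    "cat": bundle([[1, 0, 0], [1, 0, 0]]),
    "dog": bundle([[0, 1, 0], [0, 1, 0]]),
}
print(max_similarity([3, 0, 0], descriptors))  # aligned with "cat"
print(max_similarity([0, 0, 1], descriptors))  # orthogonal to both classes
```

A sample would be flagged as OOD when its best similarity falls below a chosen threshold; the threshold itself is not specified in the abstract.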


Researchers explain why they believe Facebook mishandles political ads

NPR Technology

Facebook has worked for years to revamp its handling of political ads -- but researchers who conducted a comprehensive audit of millions of ads say the social media company's efforts have had uneven results. The problems, they say, include overcounting political ads in the U.S. -- and undercounting them in other countries. And despite Facebook's ban on political ads around the time of last year's U.S. elections, the platform allowed more than 70,000 political ads to run anyway, according to the research team that is based at the NYU Cybersecurity for Democracy and at the Belgian university KU Leuven. Their research study was released early Thursday. They also plan to present their findings at a security conference next August.