 Accuracy


ML for Security Is Dead. Long Live ML for Security

#artificialintelligence

When it comes to staying on top of security threats, machine learning, unquestionably, must be part of the equation. The volume of data is simply too great to cope without it. But as it's currently being used, ML may be doing more harm than good, particularly when it comes to alarm fatigue. Alarm fatigue is a condition that occurs when an operator is overloaded with alarms; in many of these cases, the majority of alarms turn out to be false positives. With too many alarms to investigate in a limited amount of time, and the knowledge that most of them are false positives, the operator begins to ignore some alarms, which invariably leads to bad outcomes.


E-Commerce Dispute Resolution Prediction

arXiv.org Artificial Intelligence

E-Commerce marketplaces support millions of daily transactions, and some disagreements between buyers and sellers are unavoidable. Resolving disputes in an accurate, fast, and fair manner is of great importance for maintaining a trustworthy platform. Simple cases can be automated, but intricate cases are not sufficiently addressed by hard-coded rules, and therefore most disputes are currently resolved by people. In this work we take a first step towards automatically assisting human agents in dispute resolution at scale. We construct a large dataset of disputes from the eBay online marketplace, and identify several interesting behavioral and linguistic patterns. We then train classifiers to predict dispute outcomes with high accuracy. We explore the model and the dataset, reporting interesting correlations, important features, and insights.


Ego4D: Around the World in 3,000 Hours of Egocentric Video

arXiv.org Artificial Intelligence

We introduce Ego4D, a massive-scale egocentric video dataset and benchmark suite. It offers 3,025 hours of daily-life activity video spanning hundreds of scenarios (household, outdoor, workplace, leisure, etc.) captured by 855 unique camera wearers from 74 worldwide locations and 9 different countries. The approach to collection is designed to uphold rigorous privacy and ethics standards with consenting participants and robust de-identification procedures where relevant. Ego4D dramatically expands the volume of diverse egocentric video footage publicly available to the research community. Portions of the video are accompanied by audio, 3D meshes of the environment, eye gaze, stereo, and/or synchronized videos from multiple egocentric cameras at the same event. Furthermore, we present a host of new benchmark challenges centered around understanding the first-person visual experience in the past (querying an episodic memory), present (analyzing hand-object manipulation, audio-visual conversation, and social interactions), and future (forecasting activities). By publicly sharing this massive annotated dataset and benchmark suite, we aim to push the frontier of first-person perception. Project page: https://ego4d-data.org/


HEDP: A Method for Early Forecasting Software Defects based on Human Error Mechanisms

arXiv.org Artificial Intelligence

As the primary cause of software defects, human error is the key to understanding, and perhaps to predicting and avoiding them. Little research has been done to predict defects on the basis of the cognitive errors that cause them. This paper proposes an approach to predicting software defects through knowledge about the cognitive mechanisms of human errors. Our theory is that the main process behind a software defect is that an error-prone scenario triggers human error modes, which psychologists have observed to recur across diverse activities. Software defects can then be predicted by identifying such scenarios, guided by this knowledge of typical error modes. The proposed idea emphasizes predicting the exact location and form of a possible defect. We conducted two case studies to demonstrate and validate this approach, with 55 programmers in a programming competition and 5 analysts serving as the users of the approach. We found it impressive that the approach was able to predict, at the requirement phase, the exact locations and forms of 7 out of the 22 (31.8%) specific types of defects that were found in the code. The defects predicted tended to be common defects: their occurrences constituted 75.7% of the total number of defects in the 55 developed programs; each of them was introduced by at least two persons. The fraction of the defects introduced by a programmer that were predicted was on average (over all programmers) 75%. Furthermore, these predicted defects were highly persistent through the debugging process. If the prediction had been used to successfully prevent these defects, this could have saved 46.2% of the debugging iterations. This excellent capability of forecasting the exact locations and forms of possible defects at the early phases of software development recommends the approach for substantial benefits to defect prevention and early detection.


Logic Constraints to Feature Importances

arXiv.org Machine Learning

In recent years, Artificial Intelligence (AI) algorithms have been proven to outperform traditional statistical methods in terms of predictivity, especially when a large amount of data was available. Nevertheless, the "black box" nature of AI models is often a limit for a reliable application in high-stakes fields like diagnostic techniques, autonomous guide, etc. Recent works have shown that an adequate level of interpretability could enforce the more general concept of model trustworthiness. The basic idea of this paper is to exploit the human prior knowledge of the features' importance for a specific task, in order to coherently aid the phase of the model's fitting. This sort of "weighted" AI is obtained by extending the empirical loss with a regularization term encouraging the importance of the features to follow predetermined constraints. This procedure relies on local methods for the feature importance computation, e.g. LRP, LIME, etc. that are the link between the model weights to be optimized and the user-defined constraints on feature importance. In the fairness area, promising experimental results have been obtained for the Adult dataset. Many other possible applications of this model agnostic theoretical framework are described.
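
The regularized objective described above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a logistic-regression model, uses gradient-times-input (`w[j] * x[j]`) as a simple stand-in for LRP or LIME, and encodes one hypothetical constraint, namely that the importance of the features listed in `constrained` should stay near zero. The names `fit`, `constrained`, and `lam` are illustrative only.

```python
import numpy as np

def fit(X, y, constrained, lam, lr=0.1, steps=2000):
    """Gradient descent on: mean logistic loss + lam * importance penalty.

    The penalty is mean_i (w_j * x_ij)^2 summed over constrained features j,
    i.e. the squared gradient-times-input importance (an assumption here).
    """
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))   # predicted probabilities
        grad = X.T @ (p - y) / n             # gradient of the empirical loss
        for j in constrained:
            # derivative of lam * mean_i (w_j * x_ij)^2 with respect to w_j
            grad[j] += lam * 2.0 * w[j] * np.mean(X[:, j] ** 2)
        w -= lr * grad
    return w

# Synthetic data in which both features carry signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
prob = 1.0 / (1.0 + np.exp(-(2.0 * X[:, 0] + 2.0 * X[:, 1])))
y = (rng.random(500) < prob).astype(float)

w_free = fit(X, y, constrained=[], lam=0.0)   # plain empirical loss
w_pen = fit(X, y, constrained=[1], lam=1.0)   # feature 1 importance penalized
print("unconstrained weights:", w_free)
print("constrained weights:  ", w_pen)
```

With the penalty active, the weight on feature 1 shrinks toward zero, so the model relies on it less; in a fairness setting such as the one mentioned above, the constrained feature would be a sensitive attribute.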


COVID-19 rapid test national shortage mobilizes White House, leaves experts cautiously optimistic

FOX News

Last week's White House report reiterated President Biden's employer mandate that businesses with 100 or more employees require every worker to be fully vaccinated for COVID-19 or tested weekly. Jeffrey Zients, the White House COVID-19 response coordinator, summarized in last week's press briefing that, "We are on track to quadruple the supply of rapid, at-home tests available to Americans by December to more than 200 million a month and to increase the number of places Americans can access free testing in the United States to 30,000 community-based locations." He emphasized the president's commitment, noting that $1 billion of extra funding has already been added to the recent $2 billion investment to increase supply.


Tracking the risk of a deployed model and detecting harmful distribution shifts

arXiv.org Machine Learning

When deployed in the real world, machine learning models inevitably encounter changes in the data distribution, and certain -- but not all -- distribution shifts could result in significant performance degradation. In practice, it may make sense to ignore benign shifts, under which the performance of a deployed model does not degrade substantially, making interventions by a human expert (or model retraining) unnecessary. While several works have developed tests for distribution shifts, these typically either use non-sequential methods, or detect arbitrary shifts (benign or harmful), or both. We argue that a sensible method for firing off a warning has to both (a) detect harmful shifts while ignoring benign ones, and (b) allow continuous monitoring of model performance without increasing the false alarm rate. In this work, we design simple sequential tools for testing if the difference between source (training) and target (test) distributions leads to a significant drop in a risk function of interest, like accuracy or calibration. Recent advances in constructing time-uniform confidence sequences allow efficient aggregation of statistical evidence accumulated during the tracking process. The designed framework is applicable in settings where (some) true labels are revealed after the prediction is performed, or when batches of labels become available in a delayed fashion. We demonstrate the efficacy of the proposed framework through an extensive empirical study on a collection of simulated and real datasets.
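
Requirements (a) and (b) above can be sketched as a small monitor. This is a deliberately crude stand-in, not the paper's method: instead of the time-uniform confidence sequences the abstract refers to, it uses a Hoeffding bound with a union bound over time steps, and it assumes losses lie in [0, 1] and are independent with a fixed mean on the target. The names `monitor`, `source_risk`, and `tolerance` are hypothetical.

```python
import math

def lower_bound(mean, t, alpha):
    """Lower confidence bound on the target risk after t observed losses.

    Spending alpha_t = alpha / (t * (t + 1)) at step t keeps the total
    false alarm probability over all steps at most alpha, because the
    alpha_t sum to alpha.
    """
    alpha_t = alpha / (t * (t + 1))
    return mean - math.sqrt(math.log(1.0 / alpha_t) / (2.0 * t))

def monitor(losses, source_risk, tolerance, alpha=0.05):
    """Return the first step at which a harmful shift is flagged, else None.

    A shift counts as harmful only when the target risk provably exceeds
    source_risk + tolerance; smaller (benign) changes are ignored.
    """
    total = 0.0
    for t, loss in enumerate(losses, start=1):
        total += loss
        if lower_bound(total / t, t, alpha) > source_risk + tolerance:
            return t
    return None

# A model that is suddenly always wrong is flagged after a few labels;
# a model whose loss stays at zero is never flagged.
print(monitor([1.0] * 50, source_risk=0.1, tolerance=0.05))
print(monitor([0.0] * 200, source_risk=0.1, tolerance=0.05))
```

Because the bound is valid at every step simultaneously, the monitor can be checked after each new label without inflating the false alarm rate. Proper confidence sequences would be considerably tighter than this union bound.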


The Terminating-Knockoff Filter: Fast High-Dimensional Variable Selection with False Discovery Rate Control

arXiv.org Machine Learning

We propose the Terminating-Knockoff (T-Knock) filter, a fast variable selection method for high-dimensional data. The T-Knock filter controls a user-defined target false discovery rate (FDR) while maximizing the number of selected true positives. This is achieved by fusing the solutions of multiple early terminated random experiments. The experiments are conducted on a combination of the original data and multiple sets of randomly generated knockoff variables. A finite sample proof based on martingale theory for the FDR control property is provided. Numerical simulations show that the FDR is controlled at the target level while allowing for a high power. We prove under mild conditions that the knockoffs can be sampled from any univariate distribution. The computational complexity of the proposed method is derived and it is demonstrated via numerical simulations that the sequential computation time is multiple orders of magnitude lower than that of the strongest benchmark methods in sparse high-dimensional settings. The T-Knock filter outperforms state-of-the-art methods for FDR control on a simulated genome-wide association study (GWAS), while its computation time is more than two orders of magnitude lower than that of the strongest benchmark methods.
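
For readers unfamiliar with the quantities being controlled, the following sketch computes the false discovery proportion (FDP) and true positive proportion (TPP) of a single selection. The FDR is the expectation of the FDP, and power is the expectation of the TPP. This illustrates the standard definitions only and is not part of the T-Knock filter; the function name and the example index sets are hypothetical.

```python
def fdp_and_tpp(selected, true_support):
    """Return (FDP, TPP) for one set of selected variable indices."""
    selected, true_support = set(selected), set(true_support)
    false_selected = len(selected - true_support)   # selected but not active
    true_selected = len(selected & true_support)    # selected and active
    fdp = false_selected / max(len(selected), 1)    # defined as 0 if nothing selected
    tpp = true_selected / max(len(true_support), 1)
    return fdp, tpp

# Four variables selected, three of which are among the five truly active ones.
fdp, tpp = fdp_and_tpp(selected=[0, 1, 2, 7], true_support=[0, 1, 2, 3, 4])
print(fdp, tpp)
```

Averaging the FDP over many simulated datasets gives an estimate of the FDR that can be compared against the target level.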


Domain Generalization via Domain-based Covariance Minimization

arXiv.org Machine Learning

Researchers face a difficult problem: data generation mechanisms can be influenced by internal or external factors, so that training and test data follow quite different distributions. Consequently, traditional classification or regression fitted on the training set is unable to achieve satisfying results on test data. In this paper, we address this nontrivial domain generalization problem by finding a central subspace in which domain-based covariance is minimized while the functional relationship is simultaneously maximally preserved. We propose a novel variance measurement for multiple domains so as to minimize the difference between conditional distributions across domains, with solid theoretical demonstration and support. Meanwhile, the algorithm preserves the functional relationship by maximizing the variance of conditional expectations given the output. Furthermore, we also provide a fast implementation that requires much less computation and smaller memory for large-scale matrix operations, suitable for not only domain generalization but also other kernel-based eigenvalue decompositions. To show the practicality of the proposed method, we compare our methods against some well-known dimension reduction and domain generalization techniques on both synthetic data and real-world applications. We show that for small-scale datasets, we are able to achieve better quantitative results indicating better generalization performance over unseen test datasets. For large-scale problems, the proposed fast implementation maintains the quantitative performance but at a substantially lower computational cost.


Gated Information Bottleneck for Generalization in Sequential Environments

arXiv.org Machine Learning

Deep neural networks suffer from poor generalization to unseen environments when the underlying data distribution is different from that in the training set. By learning minimum sufficient representations from training data, the information bottleneck (IB) approach has demonstrated its effectiveness in improving generalization in different AI applications. In this work, we propose a new neural network-based IB approach, termed gated information bottleneck (GIB), that dynamically drops spurious correlations and progressively selects the most task-relevant features across different environments by a trainable soft mask (on raw features). GIB enjoys a simple and tractable objective, without any variational approximation or distributional assumption. We empirically demonstrate the superiority of GIB over other popular neural network-based IB approaches in adversarial robustness and out-of-distribution (OOD) detection. Meanwhile, we also establish the connection between IB theory and invariant causal representation learning, and observe that GIB demonstrates appealing performance when different environments arrive sequentially, a more practical scenario where invariant risk minimization (IRM) fails. Code of GIB is available at https://github.com/falesiani/GIB