AITopics

2103.14224

Country: North America > United States (0.14)

Genre: Research Report (0.82)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(3 more...)

Burruss, Matthew, Ramakrishna, Shreyas, Dubey, Abhishek

Deep-RBF Networks for Anomaly Detection in Automotive Cyber-Physical Systems

arXiv.org Artificial IntelligenceMar-25-2021

Deep Neural Networks (DNNs) are popularly used for implementing autonomy related tasks in automotive Cyber-Physical Systems (CPSs). However, these networks have been shown to make erroneous predictions to anomalous inputs, which manifests either due to Out-of-Distribution (OOD) data or adversarial attacks. To detect these anomalies, a separate DNN called assurance monitor is often trained and used in parallel to the controller DNN, increasing the resource burden and latency. We hypothesize that a single network that can perform controller predictions and anomaly detection is necessary to reduce the resource requirements. Deep-Radial Basis Function (RBF) networks provide a rejection class alongside the class predictions, which can be utilized for detecting anomalies at runtime. However, the use of RBF activation functions limits the applicability of these networks to only classification tasks. In this paper, we show how the deep-RBF network can be used for detecting anomalies in CPS regression tasks such as continuous steering predictions. Further, we design deep-RBF networks using popular DNNs such as NVIDIA DAVE-II, and ResNet20, and then use the resulting rejection class for detecting adversarial attacks such as a physical attack and data poison attack. Finally, we evaluate these attacks and the trained deep-RBF networks using a hardware CPS testbed called DeepNNCar and a real-world German Traffic Sign Benchmark (GTSB) dataset. Our results show that the deep-RBF networks can robustly detect these attacks in a short time without additional resource requirements.

anomaly detection, deep-rbf network, prediction, (11 more...)

2103.14172

Country:

North America > United States > New Jersey (0.04)
Asia > Middle East > Republic of Türkiye > Karaman Province > Karaman (0.04)

Genre: Research Report > New Finding (0.68)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military (0.87)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

Waites, Chris, Cummings, Rachel

Differentially Private Normalizing Flows for Privacy-Preserving Density Estimation

arXiv.org Artificial IntelligenceMar-25-2021

Normalizing flow models have risen as a popular solution to the problem of density estimation, enabling high-quality synthetic data generation as well as exact probability density evaluation. However, in contexts where individuals are directly associated with the training data, releasing such a model raises privacy concerns. In this work, we propose the use of normalizing flow models that provide explicit differential privacy guarantees as a novel approach to the problem of privacy-preserving density estimation. We evaluate the efficacy of our approach empirically using benchmark datasets, and we demonstrate that our method substantially outperforms previous state-of-the-art approaches. We additionally show how our algorithm can be applied to the task of differentially private anomaly detection.

dataset, density estimation, proceedings, (16 more...)

2103.14068

Country: North America > United States > California > Santa Clara County > Palo Alto (0.04)

Genre:

Research Report > Promising Solution (0.68)
Overview > Innovation (0.68)
Research Report > New Finding (0.46)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.93)
Information Technology > Data Science > Data Mining > Big Data (0.71)
(2 more...)

arXiv.org Artificial IntelligenceMar-25-2021

Frame-rate Up-conversion Detection Based on Convolutional Neural Network for Learning Spatiotemporal Features

Yoon, Minseok, Nam, Seung-Hun, Yu, In-Jae, Ahn, Wonhyuk, Kwon, Myung-Joon, Lee, Heung-Kyu

With the advance in user-friendly and powerful video editing tools, anyone can easily manipulate videos without leaving prominent visual traces. Frame-rate up-conversion (FRUC), a representative temporal-domain operation, increases the motion continuity of videos with a lower frame-rate and is used by malicious counterfeiters in video tampering such as generating fake frame-rate video without improving the quality or mixing temporally spliced videos. FRUC is based on frame interpolation schemes and subtle artifacts that remain in interpolated frames are often difficult to distinguish. Hence, detecting such forgery traces is a critical issue in video forensics. This paper proposes a frame-rate conversion detection network (FCDNet) that learns forensic features caused by FRUC in an end-to-end fashion. The proposed network uses a stack of consecutive frames as the input and effectively learns interpolation artifacts using network blocks to learn spatiotemporal features. This study is the first attempt to apply a neural network to the detection of FRUC. Moreover, it can cover the following three types of frame interpolation schemes: nearest neighbor interpolation, bilinear interpolation, and motion-compensated interpolation. In contrast to existing methods that exploit all frames to verify integrity, the proposed approach achieves a high detection speed because it observes only six frames to test its authenticity. Extensive experiments were conducted with conventional forensic methods and neural networks for video forensic tasks to validate our research. The proposed network achieved state-of-the-art performance in terms of detecting the interpolated artifacts of FRUC. The experimental results also demonstrate that our trained model is robust for an unseen dataset, unlearned frame-rate, and unlearned quality factor.

fcdnet, interpolation scheme, video, (15 more...)

2103.13674

Country:

Asia > South Korea > Daejeon > Daejeon (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)

Song, Hyebin, Raskutti, Garvesh, Willett, Rebecca

Prediction in the presence of response-dependent missing labels

arXiv.org Machine LearningMar-24-2021

In a variety of settings, limitations of sensing technologies or other sampling mechanisms result in missing labels, where the likelihood of a missing label in the training set is an unknown function of the data. For example, satellites used to detect forest fires cannot sense fires below a certain size threshold. In such cases, training datasets consist of positive and pseudo-negative observations where pseudo-negative observations can be either true negatives or undetected positives with small magnitudes. We develop a new methodology and non-convex algorithm P(ositive) U(nlabeled) - O(ccurrence) M(agnitude) M(ixture) which jointly estimates the occurrence and detection likelihood of positive samples, utilizing prior knowledge of the detection mechanism. Our approach uses ideas from positive-unlabeled (PU)-learning and zero-inflated models that jointly estimate the magnitude and occurrence of events. We provide conditions under which our model is identifiable and prove that even though our approach leads to a non-convex objective, any local minimizer has optimal statistical error (up to a log term) and projected gradient descent has geometric convergence rates. We demonstrate on both synthetic data and a California wildfire dataset that our method out-performs existing state-of-the-art approaches.

dataset, exp, inequality, (16 more...)

2103.13555

Country:

North America > United States > California (0.25)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)
(3 more...)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

arXiv.org Machine LearningMar-24-2021

A Two-Stage Variable Selection Approach for Correlated High Dimensional Predictors

Li, Zhiyuan

When fitting statistical models, some predictors are often found to be correlated with each other, and functioning together. Many group variable selection methods are developed to select the groups of predictors that are closely related to the continuous or categorical response. These existing methods usually assume the group structures are well known. For example, variables with similar practical meaning, or dummy variables created by categorical data. However, in practice, it is impractical to know the exact group structure, especially when the variable dimensional is large. As a result, the group variable selection results may be selected. To solve the challenge, we propose a two-stage approach that combines a variable clustering stage and a group variable stage for the group variable selection problem. The variable clustering stage uses information from the data to find a group structure, which improves the performance of the existing group variable selection methods. For ultrahigh dimensional data, where the predictors are much larger than observations, we incorporated a variable screening method in the first stage and shows the advantages of such an approach. In this article, we compared and discussed the performance of four existing group variable selection methods under different simulation models, with and without the variable clustering stage. The two-stage method shows a better performance, in terms of the prediction accuracy, as well as in the accuracy to select active predictors. An athlete's data is also used to show the advantages of the proposed method.

group lasso, group structure, lasso, (15 more...)

2103.13357

Country: North America > United States > New Jersey (0.04)

Genre: Research Report (0.82)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)
(2 more...)

Hammerbacher, Tom, Lange-Hegermann, Markus, Platz, Gorden

Including Sparse Production Knowledge into Variational Autoencoders to Increase Anomaly Detection Reliability

arXiv.org Artificial IntelligenceMar-24-2021

Digitalization leads to data transparency for production systems that we can benefit from with data-driven analysis methods like neural networks. For example, automated anomaly detection enables saving resources and optimizing the production. We study using rarely occurring information about labeled anomalies into Variational Autoencoder neural network structures to overcome information deficits of supervised and unsupervised approaches. This method outperforms all other models in terms of accuracy, precision, and recall. We evaluate the following methods: Principal Component Analysis, Isolation Forest, Classifying Neural Networks, and Variational Autoencoders on seven time series datasets to find the best performing detection methods. We extend this idea to include more infrequently occurring meta information about production processes. This use of sparse labels, both of anomalies or production data, allows to harness any additional information available for increasing anomaly detection performance.

anomaly, anomaly label, dataset, (15 more...)

2103.12998

Country:

North America > United States > New Jersey > Middlesex County > Piscataway (0.04)
Europe > Netherlands > South Holland > Dordrecht (0.04)
Europe > Germany > North Rhine-Westphalia > Düsseldorf Region > Düsseldorf (0.04)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.50)

Cruz, André F., Saleiro, Pedro, Belém, Catarina, Soares, Carlos, Bizarro, Pedro

Promoting Fairness through Hyperparameter Optimization

arXiv.org Artificial IntelligenceMar-23-2021

Considerable research effort has been guided towards algorithmic fairness but real-world adoption of bias reduction techniques is still scarce. Existing methods are either metric- or model-specific, require access to sensitive attributes at inference time, or carry high development and deployment costs. This work explores, in the context of a real-world fraud detection application, the unfairness that emerges from traditional ML model development, and how to mitigate it with a simple and easily deployed intervention: fairness-aware hyperparameter optimization (HO). We propose and evaluate fairness-aware variants of three popular HO algorithms: Fair Random Search, Fair TPE, and Fairband. Our method enables practitioners to adapt pre-existing business operations to accommodate fairness objectives in a frictionless way and with controllable fairness-accuracy trade-offs. Additionally, it can be coupled with existing bias reduction techniques to tune their hyperparameters. We validate our approach on a real-world bank account opening fraud use case, as well as on three datasets from the fairness literature. Results show that, without extra training cost, it is feasible to find models with 111% average fairness increase and just 6% decrease in predictive accuracy, when compared to standard fairness-blind HO.

artificial intelligence, configuration, machine learning, (18 more...)

2103.12715

Country:

North America > United States > New York > New York County > New York City (0.14)
Europe > Portugal (0.04)
North America > United States > Virginia > Arlington County > Arlington (0.04)
(4 more...)

Genre: Research Report > New Finding (0.66)

Industry:

Banking & Finance (1.00)
Law Enforcement & Public Safety > Fraud (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

arXiv.org Artificial IntelligenceMar-23-2021

Can I Solve It? Identifying APIs Required to Complete OSS Task

Santos, Fabio, Wiese, Igor, Trinkenreich, Bianca, Steinmacher, Igor, Sarma, Anita, Gerosa, Marco

Open Source Software projects add labels to open issues to help contributors choose tasks. However, manually labeling issues is time-consuming and error-prone. Current automatic approaches for creating labels are mostly limited to classifying issues as a bug/non-bug. In this paper, we investigate the feasibility and relevance of labeling issues with the domain of the APIs required to complete the tasks. We leverage the issues' description and the project history to build prediction models, which resulted in precision up to 82% and recall up to 97.8%. We also ran a user study (n=74) to assess these labels' relevancy to potential contributors. The results show that the labels were useful to participants in choosing tasks, and the API-domain labels were selected more often than the existing architecture-based labels. Our results can inspire the creation of tools to automatically label issues, helping developers to find tasks that better match their skills.

artificial intelligence, machine learning, natural language, (18 more...)

doi: 10.1109/MSR52588.2021.00047

2103.12653

Country:

North America > United States > New York > New York County > New York City (0.04)
South America > Brazil > Paraná (0.04)
North America > United States > Oregon (0.04)
(4 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Education (0.68)

Technology:

Information Technology > Software Engineering (1.00)
Information Technology > Software (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(4 more...)

arXiv.org Machine LearningMar-23-2021

Anomaly detection using principles of human perception

Mohammad, Nassir

In the fields of statistics and unsupervised machine learning a fundamental and well-studied problem is anomaly detection. Although anomalies are difficult to define, many algorithms have been proposed. Underlying the approaches is the nebulous understanding that anomalies are rare, unusual or inconsistent with the majority of data. The present work gives a philosophical approach to clearly define anomalies and to develop an algorithm for their efficient detection with minimal user intervention. Inspired by the Gestalt School of Psychology and the Helmholtz principle of human perception, the idea is to assume anomalies are observations that are unexpected to occur with respect to certain groupings made by the majority of the data. Thus, under appropriate random variable modelling anomalies are directly found in a set of data under a uniform and independent random assumption of the distribution of constituent elements of the observations; anomalies correspond to those observations where the expectation of occurrence of the elements in a given view is $<1$. Starting from fundamental principles of human perception an unsupervised anomaly detection algorithm is developed that is simple, real-time and parameter-free. Experiments suggest it as the prime choice for univariate data and it shows promising performance on the detection of global anomalies in multivariate data.

algorithm, anomaly, dataset, (13 more...)

2103.12323

Country:

North America > United States > New York > Suffolk County > Stony Brook (0.04)
North America > United States > District of Columbia > Washington (0.04)
North America > United States > California (0.04)
Europe > United Kingdom > Wales (0.04)

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)