AITopics | Accuracy


Accuracy


Real-Time Sensor Anomaly Detection and Recovery in Connected Automated Vehicle Sensors

Wang, Yiyang, Masoud, Neda, Khojandi, Anahita

arXiv.org Machine Learning, Nov-4-2019

In this paper we propose a novel observer-based method to improve the safety and security of connected and automated vehicle (CAV) transportation. The proposed method combines model-based signal filtering and anomaly detection methods. Specifically, we use an adaptive extended Kalman filter (AEKF) to smooth sensor readings of a CAV based on a nonlinear car-following model. Using the car-following model, the subject vehicle (i.e., the following vehicle) utilizes the leading vehicle's information to detect sensor anomalies by employing previously trained One Class Support Vector Machine (OCSVM) models. This approach allows the AEKF to estimate the state of a vehicle not only based on the vehicle's location and speed, but also by taking into account the state of the surrounding traffic. A communication time delay factor is considered in the car-following model to make it more suitable for real-world applications. Our experiments show that, compared with the AEKF with a traditional $\chi^2$-detector, our proposed method achieves better anomaly detection performance. We also demonstrate that a larger time delay factor has a negative impact on the overall detection performance.
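As a rough illustration of the baseline mentioned in the abstract (a Kalman filter paired with a $\chi^2$-detector), the sketch below runs a scalar linear Kalman filter over a stream of readings and flags any reading whose normalized squared innovation exceeds a chi-square threshold. It is not the paper's AEKF, car-following model, or OCSVM detector: the random-walk state model, the noise variances, and the function name `kalman_chi2_detect` are all assumptions made for illustration.

```python
from typing import List, Tuple

# 95th percentile of the chi-square distribution with 1 degree of freedom.
CHI2_95_1DOF = 3.841


def kalman_chi2_detect(
    readings: List[float],
    process_var: float = 0.01,   # assumed process noise variance
    sensor_var: float = 0.25,    # assumed measurement noise variance
    threshold: float = CHI2_95_1DOF,
) -> Tuple[List[float], List[bool]]:
    """Filter a scalar signal and flag readings that fail the innovation test."""
    x = readings[0]   # state estimate, initialised from the first reading
    p = sensor_var    # variance of the state estimate
    estimates: List[float] = []
    flags: List[bool] = []
    for z in readings:
        # Predict step: random-walk model, so the state is unchanged
        # and only the uncertainty grows.
        p_pred = p + process_var
        # Innovation (measurement residual) and its variance.
        nu = z - x
        s = p_pred + sensor_var
        # Chi-square test on the normalized squared innovation.
        is_anomaly = (nu * nu) / s > threshold
        if is_anomaly:
            # Reject the reading: keep the prediction, skip the update.
            p = p_pred
        else:
            k = p_pred / s            # Kalman gain
            x = x + k * nu
            p = (1.0 - k) * p_pred
        estimates.append(x)
        flags.append(is_anomaly)
    return estimates, flags
```

Called on a short series such as `[10.0, 10.1, 9.9, 10.2, 25.0, 10.0, 9.8]`, the sketch flags only the 25.0 spike and carries the previous estimate forward for that step.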

anomaly, detection, vehicle, (15 more...)

arXiv.org Machine Learning

1911.01531

Country:

North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)
North America > United States > Tennessee > Knox County > Knoxville (0.04)
North America > United States > Connecticut > Tolland County > Storrs (0.04)
(2 more...)

Genre: Research Report > New Finding (0.68)

Industry:

Information Technology > Security & Privacy (1.00)
Government (1.00)
Transportation > Ground > Road (0.46)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)


Online Debiasing for Adaptively Collected High-dimensional Data

Deshpande, Yash, Javanmard, Adel, Mehrabi, Mohammad

arXiv.org Machine Learning, Nov-4-2019

Adaptive collection of data is increasingly commonplace in many applications. From the point of view of statistical inference, however, adaptive collection induces memory and correlation in the samples, and poses a significant challenge. We consider high-dimensional linear regression, where the samples are collected adaptively and the sample size $n$ can be smaller than $p$, the number of covariates. In this setting, there are two distinct sources of bias: the first due to regularization imposed for estimation, e.g. using the LASSO, and the second due to adaptivity in collecting the samples. We propose "online debiasing", a general procedure for estimators such as the LASSO, which addresses both sources of bias. In two concrete contexts, $(i)$ batched data collection and $(ii)$ high-dimensional time series analysis, we demonstrate that online debiasing optimally debiases the LASSO estimate when the underlying parameter $\theta_0$ has sparsity of order $o(\sqrt{n}/\log p)$. In this regime, the debiased estimator can be used to compute $p$-values and confidence intervals of optimal size.

matrix, probability, theorem 2, (16 more...)

arXiv.org Machine Learning

1911.0104

Country:

North America > United States > California (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area (0.67)
Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science > Data Mining (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)


Auditing and Achieving Intersectional Fairness in Classification Problems

Morina, Giulio, Oliinyk, Viktoriia, Waton, Julian, Marusic, Ines, Georgatzis, Konstantinos

arXiv.org Artificial Intelligence, Nov-4-2019

Machine learning algorithms are extensively used to make increasingly consequential decisions, so achieving optimal predictive performance can no longer be the only focus. This paper explores intersectional fairness, that is, fairness when intersections of multiple sensitive attributes (such as race, age, nationality, etc.) are considered. Previous research has mainly focused on fairness with respect to a single sensitive attribute, with intersectional fairness being comparatively less studied despite its critical importance for modern machine learning applications. First, we introduce intersectional fairness metrics by extending prior work, and provide different methodologies to audit discrimination in a given dataset or model outputs. Second, we develop novel post-processing techniques to mitigate any detected bias in a classification model. Our proposed methodology does not rely on any assumptions regarding the underlying model and aims at guaranteeing fairness while preserving good predictive performance. Finally, we give guidance on a practical implementation, showing how the proposed methods perform on a real-world dataset.
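To make the idea of auditing across intersections concrete, the sketch below groups predictions by the full tuple of sensitive attributes and compares positive-prediction rates across those subgroups. The min/max rate ratio used here is a generic demographic-parity-style summary chosen for illustration; it is an assumption and not necessarily one of the metrics defined in the paper. The function names and the toy attribute values are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Sequence, Tuple

Record = Tuple[Sequence[str], int]  # (sensitive attribute values, predicted label 0/1)


def subgroup_positive_rates(records: Iterable[Record]) -> Dict[Tuple[str, ...], float]:
    """Positive-prediction rate for every intersection of sensitive attributes."""
    counts = defaultdict(lambda: [0, 0])  # subgroup -> [positives, total]
    for attrs, pred in records:
        key = tuple(attrs)
        counts[key][0] += pred
        counts[key][1] += 1
    return {key: pos / total for key, (pos, total) in counts.items()}


def min_max_ratio(rates: Dict[Tuple[str, ...], float]) -> float:
    """Ratio of the lowest to the highest subgroup rate (1.0 means parity)."""
    highest = max(rates.values())
    return min(rates.values()) / highest if highest > 0 else 1.0


# Toy data: two sensitive attributes with two values each (hypothetical).
records = [
    (("a", "x"), 1), (("a", "x"), 0),
    (("a", "y"), 1), (("a", "y"), 1),
    (("b", "x"), 1), (("b", "x"), 0), (("b", "x"), 0), (("b", "x"), 0),
    (("b", "y"), 1), (("b", "y"), 0),
]
rates = subgroup_positive_rates(records)
ratio = min_max_ratio(rates)
```

In this toy data the marginal rates for the first attribute are 0.75 and about 0.33, while the intersectional subgroups range from 0.25 to 1.0, which shows how a single-attribute audit can understate the gap between subgroups.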

fairness, fairness metric, intersectional fairness, (14 more...)

arXiv.org Artificial Intelligence

1911.01468

Country:

North America > United States > New York > New York County > New York City (0.05)
Europe > United Kingdom > England > Greater London > London (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)
(2 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)


Smaller Is Better: Lightweight Face Detection For Smartphones

#artificialintelligence, Nov-3-2019, 17:06:47 GMT

Although mobile devices were not designed to run compute-heavy AI models, in recent years AI-powered features like face detection, eye tracking, and voice recognition have all been added to smartphones. Much of the compute for such services is done on the cloud, but ideally these applications would be light enough to run directly on devices without an Internet connection. In this spirit of "smaller is better," Shanghai-based developer "Linzai" (GitHub user name @Linzaer) recently shared a new lightweight model that enables real-time face detection for smartphones. The project has garnered a whopping 3.3k Stars and over 600 forks on GitHub. Facial recognition technology is widely applied in security monitoring, surveillance, human-computer interaction, entertainment, etc. Detecting human faces in digital images is the first step in facial recognition, and an ideal face detection model can be evaluated by how quickly and accurately it performs.

face detection, lightweight face detection, retinaface-mobilenet-0, (12 more...)

#artificialintelligence

Country: Asia > China > Shanghai > Shanghai (0.26)

Industry: Media (0.37)

Technology:

Information Technology > Communications > Mobile (1.00)
Information Technology > Artificial Intelligence > Vision > Face Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.33)


Detecting random filenames using (un)supervised machine learning

#artificialintelligence, Nov-3-2019, 04:58:06 GMT

Combining both n-grams and random forest models to detect malicious activity. An essential part of Managed Detection and Response at Fox-IT is the Security Operations Center. This is our frontline for detecting and analyzing possible threats. Our Security Operations Center brings together the best in human and machine analysis and we continually strive to improve both. For instance, we develop machine learning techniques for detecting malicious content such as DGA domains or unusual SMB traffic.
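The n-gram half of this idea can be sketched as follows: character bigrams are counted over a set of known-benign filenames, and a new filename is scored by its average smoothed log-probability, so that names built from rarely seen character pairs score lower. This is only an illustration under stated assumptions. The article pairs n-grams with random forest models, whereas this sketch uses a plain likelihood score and no classifier; the class `NgramScorer`, the padding markers, and the sample filenames are hypothetical.

```python
import math
from collections import Counter
from typing import Iterable, List


def char_ngrams(name: str, n: int = 2) -> List[str]:
    """Character n-grams of a lowercased name, padded with start/end markers."""
    padded = f"^{name.lower()}$"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


class NgramScorer:
    """Scores names by average add-one-smoothed log-probability of their n-grams."""

    def __init__(self, benign_names: Iterable[str], n: int = 2) -> None:
        self.n = n
        self.counts = Counter(
            gram for name in benign_names for gram in char_ngrams(name, n)
        )
        self.total = sum(self.counts.values())
        # One extra vocabulary slot stands in for all unseen n-grams.
        self.vocab = len(self.counts) + 1

    def score(self, name: str) -> float:
        grams = char_ngrams(name, self.n)
        denom = self.total + self.vocab
        log_prob = sum(math.log((self.counts[g] + 1) / denom) for g in grams)
        return log_prob / len(grams)  # higher means "more like the benign names"


scorer = NgramScorer(
    ["report.docx", "notes.txt", "setup.exe", "readme.txt", "invoice.pdf"]
)
ordinary = scorer.score("report.txt")
random_looking = scorer.score("xqzvkjw.exe")
```

Here `ordinary` is greater than `random_looking`. In practice such scores, or the raw n-gram counts, would be one feature among several fed to a classifier trained on far more data.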

filename, random filename, random forest, (12 more...)

#artificialintelligence

Industry:

Information Technology > Security & Privacy (0.60)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.39)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.85)


Regularized Adversarial Sampling and Deep Time-aware Attention for Click-Through Rate Prediction

Wang, Yikai, Zhang, Liang, Dai, Quanyu, Sun, Fuchun, Zhang, Bo, He, Yang, Yan, Weipeng, Bao, Yongjun

arXiv.org Machine Learning, Nov-3-2019

Improving the performance of click-through rate (CTR) prediction remains one of the core tasks in online advertising systems. With the rise of deep learning, CTR prediction models with deep networks remarkably enhance model capacities. In deep CTR models, exploiting users' historical data is essential for learning users' behaviors and interests. As existing CTR prediction works neglect the importance of temporal signals when embedding users' historical clicking records, we propose a time-aware attention model which explicitly uses absolute temporal signals for expressing the users' periodic behaviors and relative temporal signals for expressing the temporal relation between items. In addition, we propose a regularized adversarial sampling strategy for negative sampling which eases the classification imbalance of CTR data and can make use of the strong guidance provided by the observed negative CTR samples. The adversarial sampling strategy significantly improves the training efficiency, and can be co-trained with the time-aware attention model seamlessly. Experiments are conducted on real-world CTR datasets from both in-station and out-station advertising places.

artificial intelligence, deep learning, machine learning, (15 more...)

arXiv.org Machine Learning

doi: 10.1145/3357384.3357936

1911.00886

Country:

Asia > China > Beijing > Beijing (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report (1.00)

Industry:

Marketing (0.48)
Information Technology > Services (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.47)


A Study of Data Pre-processing Techniques for Imbalanced Biomedical Data Classification

Liu, Shigang, Zhang, Jun, Xiang, Yang, Zhou, Wanlei, Xiang, Dongxi

arXiv.org Machine Learning, Nov-3-2019

Biomedical data are widely used in developing prediction models for identifying a specific tumor, drug discovery, and classification of human cancers. However, previous studies have usually focused on different classifiers and overlooked the class imbalance problem in real-world biomedical datasets. There is a lack of studies evaluating data pre-processing techniques, such as resampling and feature selection, for imbalanced biomedical data learning. The relationship between data pre-processing techniques and data distributions has never been analysed in previous studies. This article mainly focuses on reviewing and evaluating some popular and recently developed resampling and feature selection methods for class imbalance learning. We analyse the effectiveness of each technique from a data distribution perspective. Extensive experiments have been conducted with five classifiers, four performance measures, and eight learning techniques across twenty real-world datasets. Experimental results show that: (1) resampling and feature selection techniques exhibit better performance with the support vector machine (SVM) classifier, but perform poorly with the C4.5 decision tree and linear discriminant analysis classifiers; (2) for datasets with different distributions, techniques such as random undersampling and feature selection perform better than other data pre-processing methods on the T location-scale distribution when using SVM and KNN (K-nearest neighbours) classifiers, while random oversampling outperforms other methods on the negative binomial distribution when using the random forest classifier at lower imbalance ratios; (3) feature selection outperforms other data pre-processing methods in most cases; thus, feature selection with the SVM classifier is the best choice for imbalanced biomedical data learning.
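Two of the resampling techniques named in the abstract, random undersampling and random oversampling, can be sketched in a few lines. This is a minimal illustration using only the standard library, not the study's experimental code; the function names, the `(features, label)` sample layout, and the fixed seed are assumptions.

```python
import random
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence, Tuple

Sample = Tuple[Sequence[float], Hashable]  # (feature vector, class label)


def _group_by_label(data: List[Sample]) -> Dict[Hashable, List[Sample]]:
    groups: Dict[Hashable, List[Sample]] = defaultdict(list)
    for sample in data:
        groups[sample[1]].append(sample)
    return groups


def random_undersample(data: List[Sample], seed: int = 0) -> List[Sample]:
    """Drop random samples from larger classes until all match the smallest."""
    rng = random.Random(seed)
    groups = _group_by_label(data)
    target = min(len(group) for group in groups.values())
    balanced: List[Sample] = []
    for group in groups.values():
        balanced.extend(rng.sample(group, target))  # sampling without replacement
    return balanced


def random_oversample(data: List[Sample], seed: int = 0) -> List[Sample]:
    """Duplicate random samples from smaller classes until all match the largest."""
    rng = random.Random(seed)
    groups = _group_by_label(data)
    target = max(len(group) for group in groups.values())
    balanced: List[Sample] = []
    for group in groups.values():
        balanced.extend(group)
        # Draw the missing samples with replacement from the same class.
        balanced.extend(rng.choices(group, k=target - len(group)))
    return balanced


# Toy imbalanced dataset: 8 samples of class 0 and 2 samples of class 1.
data = [([float(i)], 0) for i in range(8)] + [([float(i)], 1) for i in range(2)]
under = random_undersample(data)
over = random_oversample(data)
```

With this toy dataset, undersampling yields 2 samples per class and oversampling yields 8 per class. Either step would normally be applied to the training split only, so that evaluation data keeps its original class distribution.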

classification, classifier, dataset, (15 more...)

arXiv.org Machine Learning

1911.00996

Country:

Oceania > Australia (0.05)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
Asia (0.04)

Genre: Research Report > New Finding (0.88)

Industry:

Health & Medicine > Therapeutic Area > Oncology > Leukemia (1.00)
Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Therapeutic Area > Hematology (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)


Ten-year Survival Prediction for Breast Cancer Patients

Li, Changmao, He, Han, Hao, Yunze, Ziems, Caleb

arXiv.org Machine Learning, Nov-2-2019

Different stages of breast cancer require different treatments.

algorithm, clinical data, genomic data, (16 more...)

arXiv.org Machine Learning

1911.00776

Genre: Research Report > Experimental Study (0.34)

Industry: Health & Medicine > Therapeutic Area > Oncology > Breast Cancer (0.62)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)


Fair Predictors under Distribution Shift

Singh, Harvineet, Singh, Rina, Mhasawade, Vishwali, Chunara, Rumi

arXiv.org Machine Learning, Nov-2-2019

Recent work on fair machine learning adds to a growing set of algorithmic safeguards required for deployment in high societal impact areas. A fundamental concern with model deployment is to guarantee stable performance under changes in data distribution. Extensive work in domain adaptation addresses this concern, albeit with the notion of stability limited to that of predictive performance. We provide conditions under which a stable model both in terms of prediction and fairness performance can be trained. Building on the problem setup of causal domain adaptation, we select a subset of features for training predictors with fairness constraints such that risk with respect to an unseen target data distribution is minimized. Advantages of the approach are demonstrated on synthetic datasets and on the task of diagnosing acute kidney injury in a real-world dataset under an instance of measurement policy shift and selection bias.

distribution shift, fairness constraint, predictor, (13 more...)

arXiv.org Machine Learning

1911.00677

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > Kansas (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Nephrology (0.89)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)
Information Technology > Data Science > Data Mining (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)


Improving Cross-Lingual Transfer Learning by Filtering Training Data : Alexa Blogs

#artificialintelligence, Nov-1-2019, 18:46:59 GMT

This type of cross-lingual transfer learning can make it easier to bootstrap a model in a language for which training data is scarce, by taking advantage of more abundant data in a source language. But sometimes the data in the source language is so abundant that using all of it to train a transfer model would be impractically time-consuming. Moreover, linguistic differences between source and target languages mean that pruning the training data in the source language, so that its statistical patterns better match those of the target language, can actually improve the performance of the transferred model. In a paper we're presenting at this year's Conference on Empirical Methods in Natural Language Processing, we describe experiments with a new data selection technique that lets us halve the amount of training data required in the source language, while actually improving a transfer model's performance in a target language. For evaluation purposes, we used two techniques to cut the source-language data set in half: one was our data selection technique, and the other was random sampling.

data selection technique, target language, transfer model, (10 more...)

#artificialintelligence

Industry: Retail > Online (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.64)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.51)
