AITopics | Performance Analysis

Performance Analysis

Detecting Cyberattacks in Industrial Control Systems Using Online Learning Algorithms

Li, Guangxia, Shen, Yulong, Zhao, Peilin, Lu, Xiao, Liu, Jia, Liu, Yangyang, Hoi, Steven C. H.

arXiv.org Machine Learning Dec-7-2019

Industrial control systems are critical to the operation of industrial facilities, especially for critical infrastructures, such as refineries, power grids, and transportation systems. Similar to other information systems, a significant threat to industrial control systems is the attack from cyberspace--the offensive maneuvers launched by "anonymous" in the digital world that target computer-based assets with the goal of compromising a system's functions or probing for information. Owing to the importance of industrial control systems, and the possibly devastating consequences of being attacked, significant endeavors have been attempted to secure industrial control systems from cyberattacks. Among them are intrusion detection systems that serve as the first line of defense by monitoring and reporting potentially malicious activities. Classical machine-learning-based intrusion detection methods usually generate prediction models by learning modest-sized training samples all at once. Such an approach is not always applicable to industrial control systems, as industrial control systems must process continuous control commands with limited computational resources in a nonstop way. To satisfy such requirements, we propose using online learning to learn prediction models from the controlling data stream. We introduce several state-of-the-art online learning algorithms categorically, and illustrate their efficacies on two typically used testbeds--power system and gas pipeline. Further, we explore a new cost-sensitive online learning algorithm to solve the class-imbalance problem that is pervasive in industrial intrusion detection systems. Our experimental results indicate that the proposed algorithm can achieve an overall improvement in the detection rate of cyberattacks in industrial control systems.
Modern industrial control systems are microprocessor-equipped devices and associated communication networks used to monitor and operate physical equipment in the industrial environment.
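
As a rough illustration of what cost-sensitive online learning can look like, the sketch below processes a stream one sample at a time and scales each mistake-driven update by a per-class cost, so that errors on the rare attack class move the model further. It is a generic cost-weighted perceptron, not the algorithm proposed in the paper; the function names, the cost values, and the toy stream are all assumptions made for the example.

```python
def cost_sensitive_perceptron(stream, n_features, pos_cost=5.0, neg_cost=1.0, lr=0.1):
    """Online perceptron whose update size depends on the class of the mistake.

    stream yields (x, y) pairs with y in {-1, +1}; +1 is the rare class.
    pos_cost, neg_cost and lr are illustrative values, not tuned settings.
    """
    w = [0.0] * n_features
    b = 0.0
    for x, y in stream:
        score = sum(wi * xi for wi, xi in zip(w, x)) + b
        if y * score <= 0:  # update only on a mistake (or a zero margin)
            step = lr * (pos_cost if y == 1 else neg_cost)
            w = [wi + step * y * xi for wi, xi in zip(w, x)]
            b += step * y
    return w, b


def predict(w, b, x):
    """Return +1 or -1 for a single feature vector."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1


# Toy imbalanced stream: eight "normal" samples and a single "attack" sample.
normal, attack = [0.0, 1.0], [1.0, 0.0]
stream = [(normal, -1)] * 4 + [(attack, 1)] + [(normal, -1)] * 4
w, b = cost_sensitive_perceptron(stream, n_features=2)
```

Each sample is seen once and then discarded, which is the property that makes this style of learner suitable for a continuous command stream with limited memory.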

algorithm, control system, industrial control system, (17 more...)

arXiv.org Machine Learning

1912.03589

Country:

North America > United States > Mississippi (0.04)
Asia > Middle East > Iran (0.04)
Oceania > Australia (0.04)
(13 more...)

Genre: Research Report (0.50)

Industry:

Information Technology > Security & Privacy (1.00)
Government (1.00)
Electrical Industrial Apparatus (1.00)
Education > Educational Setting > Online (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Enterprise Applications > Human Resources > Learning Management (1.00)
Information Technology > Communications > Networks > Sensor Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

PIDForest: Anomaly Detection via Partial Identification

Gopalan, Parikshit, Sharan, Vatsal, Wieder, Udi

arXiv.org Machine Learning Dec-7-2019

We consider the problem of detecting anomalies in a large dataset. We propose a framework called Partial Identification which captures the intuition that anomalies are easy to distinguish from the overwhelming majority of points by relatively few attribute values. Formalizing this intuition, we propose a geometric anomaly measure for a point that we call PIDScore, which measures the minimum density of data points over all subcubes containing the point. We present PIDForest: a random forest based algorithm that finds anomalies based on this definition. We show that it performs favorably in comparison to several popular anomaly detection methods, across a broad range of benchmarks. PIDForest also provides a succinct explanation for why a point is labelled anomalous, by providing a set of features and ranges for them which are relatively uncommon in the dataset.
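
The idea of scoring a point by the minimum density over all subcubes that contain it can be sketched by brute force in one dimension. The example below is a toy under stated assumptions: subcubes become half-open intervals whose endpoints lie on a fixed grid, and density is the fraction of points in the interval divided by its length. It is not the PIDForest algorithm, which the abstract describes as a random-forest-based method, and the function name and grid are hypothetical.

```python
def min_density_score(points, x, grid):
    """Smallest density over grid intervals [lo, hi) that contain x.

    Density is (fraction of points inside the interval) / (interval length).
    A low score means some interval around x is sparsely populated.
    """
    n = len(points)
    best = float("inf")
    for i, lo in enumerate(grid):
        for hi in grid[i + 1:]:
            if lo <= x < hi:
                inside = sum(1 for p in points if lo <= p < hi)
                best = min(best, inside / (n * (hi - lo)))
    return best


data = [0.5, 0.6, 0.7, 0.8, 0.9, 5.5]  # five clustered points and one isolated point
grid = [0, 1, 2, 3, 4, 5, 6]           # assumed candidate interval endpoints
scores = [min_density_score(data, x, grid) for x in data]
most_anomalous = data[scores.index(min(scores))]
```

The interval that attains the minimum also serves as a simple explanation: for the isolated point it is a wide range that contains almost no other data.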

algorithm, dataset, pidforest, (17 more...)

arXiv.org Machine Learning

1912.03582

Country:

South America > Paraguay > Asunción > Asunción (0.04)
South America > Brazil (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
(2 more...)

Genre: Research Report (0.50)

Industry: Information Technology (0.46)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.69)

Technical Perspective: Bootstrapping a Future of Open Source, Specialized Hardware

Communications of the ACM Dec-6-2019, 03:30:58 GMT

Computer architecture is currently undergoing a radical and exciting transition as the end of Moore's Law nears, and the burden of increasing humanity's ability to compute falls to the creativity of computer architects and their ability to fuse together the application and the silicon. A case in point is the recent explosion of deep neural networks, which occurred as a result of a drop in the cost of compute because of successful parallelization with GPGPUs (general-purpose graphics processing units) and the ability of cloud companies to gather massive amounts of data to feed the algorithms. As improvements in general-purpose architecture slow to a standstill, we must specialize the architecture for the application in order to overcome fundamental energy efficiency limits that prevent humanity's progress. This drive to specialize will bring another wave of chips with neural-network specific accelerators currently in development worldwide, but also a host of other kinds of accelerators, each specialized for a particular planet-scale purpose. Organizations like Google, Microsoft, and Amazon are increasingly finding reasons to bypass the confines imposed by traditional silicon companies by rolling their own silicon that is tailored to their own datacenter needs.

bootstrapping, specialized hardware, technical perspective, (4 more...)

Communications of the ACM

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.40)

Differentially Private Mixed-Type Data Generation For Unsupervised Learning

Tantipongpipat, Uthaipon, Waites, Chris, Boob, Digvijay, Siva, Amaresh Ankit, Cummings, Rachel

arXiv.org Machine Learning Dec-6-2019

In this work we introduce the DP-auto-GAN framework for synthetic data generation, which combines the low dimensional representation of autoencoders with the flexibility of Generative Adversarial Networks (GANs). This framework can be used to take in raw sensitive data, and privately train a model for generating synthetic data that will satisfy the same statistical properties as the original data. This learned model can be used to generate arbitrary amounts of publicly available synthetic data, which can then be freely shared due to the post-processing guarantees of differential privacy. Our framework is applicable to unlabeled mixed-type data, that may include binary, categorical, and real-valued data. We implement this framework on both unlabeled binary data (MIMIC-III) and unlabeled mixed-type data (ADULT). We also introduce new metrics for evaluating the quality of synthetic mixed-type data, particularly in unsupervised settings.

dataset, privacy, synthetic data, (14 more...)

arXiv.org Machine Learning

1912.0325

Country: North America > United States > Massachusetts (0.04)

Genre: Research Report (0.82)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)
Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)

Make Thunderbolts Less Frightening -- Predicting Extreme Weather Using Deep Learning

Schön, Christian, Dittrich, Jens

arXiv.org Machine Learning Dec-6-2019

Forecasting severe weather conditions is still a very challenging and computationally expensive task due to the enormous amount of data and the complexity of the underlying physics. Machine learning approaches and especially deep learning have however shown huge improvements in many research areas dealing with large datasets in recent years. In this work, we tackle one specific sub-problem of weather forecasting, namely the prediction of thunderstorms and lightning. We propose the use of a convolutional neural network architecture inspired by UNet++ and ResNet to predict thunderstorms as a binary classification problem based on satellite images and lightnings recorded in the past. We achieve a probability of detection of more than 94% for lightnings within the next 15 minutes while at the same time minimizing the false alarm ratio compared to previous approaches.
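
The two scores quoted here have standard definitions in forecast verification: probability of detection is hits / (hits + misses), and false alarm ratio is false alarms / (hits + false alarms). The sketch below computes both from binary labels. The function name and the sample labels are made up for illustration and are unrelated to the paper's data.

```python
def detection_scores(y_true, y_pred):
    """Return (probability of detection, false alarm ratio) for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    hits = sum(1 for t, p in pairs if t == 1 and p == 1)
    misses = sum(1 for t, p in pairs if t == 1 and p == 0)
    false_alarms = sum(1 for t, p in pairs if t == 0 and p == 1)
    # Guard against empty denominators (no events, or no positive forecasts).
    pod = hits / (hits + misses) if hits + misses else float("nan")
    far = false_alarms / (hits + false_alarms) if hits + false_alarms else float("nan")
    return pod, far


observed = [1, 1, 1, 0, 0, 0, 1, 0]  # 1 = lightning occurred
forecast = [1, 1, 0, 1, 0, 0, 1, 0]  # 1 = lightning predicted
pod, far = detection_scores(observed, forecast)
```

Note that the false alarm ratio is conditioned on the forecasts that were issued, which makes it different from the false positive rate used in ROC analysis.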

architecture, prediction, residual block, (16 more...)

arXiv.org Machine Learning

1912.01277

Country:

North America > United States (0.29)
Europe > Germany > Saarland (0.05)
North America > Canada (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.79)

How to Use Out-of-Fold Predictions in Machine Learning

#artificialintelligence Dec-5-2019, 22:32:50 GMT

Machine learning algorithms are typically evaluated using resampling techniques such as k-fold cross-validation. During the k-fold cross-validation process, predictions are made on test sets composed of data not used to train the model. These predictions are referred to as out-of-fold predictions, a type of out-of-sample prediction. Out-of-fold predictions play an important role in machine learning, both in estimating the performance of a model when making predictions on new data in the future (the so-called generalization performance of the model) and in the development of ensemble models. In this tutorial, you will discover a gentle introduction to out-of-fold predictions in machine learning.
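
A minimal sketch of the mechanism, using only the standard library: the data is split into k folds, a model is fit on all folds but one, and its predictions for the held-out fold are stored at the matching positions. The "model" here is a deliberately trivial stand-in that predicts the mean of its training targets, and the helper names are hypothetical; the tutorial itself may use a different library and model.

```python
from statistics import mean


def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def out_of_fold_predictions(y, k):
    """Return one prediction per sample, each from a model that never saw it."""
    oof = [None] * len(y)
    for held_out in k_fold_indices(len(y), k):
        held = set(held_out)
        train_targets = [y[i] for i in range(len(y)) if i not in held]
        model = mean(train_targets)  # stand-in model: predicts the training mean
        for i in held_out:
            oof[i] = model
    return oof


y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
oof = out_of_fold_predictions(y, k=3)
# Scoring the pooled out-of-fold predictions estimates generalization error.
mae = mean(abs(p - t) for p, t in zip(oof, y))
```

The same `oof` list could also be used as an input feature for a second-stage model, which is the usual role of out-of-fold predictions in stacking ensembles.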

dataset, out-of-fold prediction, prediction, (16 more...)

#artificialintelligence

Genre: Instructional Material > Course Syllabus & Notes (0.49)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.59)

Customer Churn Modeling using Machine Learning with parsnip

#artificialintelligence Dec-5-2019, 19:09:30 GMT

This article comes from Diego Usai, a student at Business Science University. Diego has completed both 101 (Data Science Foundations) and 201 (Advanced Machine Learning & Business Consulting) courses. Diego shows off his progress in this Customer Churn Tutorial using Machine Learning with parsnip. Diego originally posted the article on his personal website, diegousai.io. Recently I have completed the online course Business Analysis With R focused on applied data and business science with R, which introduced me to a couple of new modelling concepts and approaches.

customer, machine learning, parsnip, (13 more...)

#artificialintelligence

Genre: Instructional Material (0.47)

Industry:

Education > Educational Setting > Online (0.75)
Education > Educational Technology > Educational Software > Computer Based Training (0.35)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

Data Science Certification Program (Course Lab) by E&ICT, IIT Roorkee

#artificialintelligence Dec-5-2019, 13:07:27 GMT

classification, regression, svm classification, (1 more...)

#artificialintelligence

Country: Asia > India > Uttarakhand > Roorkee (0.40)

Genre: Instructional Material > Course Syllabus & Notes (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.44)

Towards Robust Relational Causal Discovery

Lee, Sanghack, Honavar, Vasant

arXiv.org Artificial Intelligence Dec-5-2019

We consider the problem of learning causal relationships from relational data. Existing approaches rely on queries to a relational conditional independence (RCI) oracle to establish and orient causal relations in such a setting. In practice, queries to an RCI oracle have to be replaced by reliable tests for RCI against available data. Relational data present several unique challenges in testing for RCI. We study the conditions under which traditional iid-based conditional independence (CI) tests yield reliable answers to RCI queries against relational data. We show how to conduct CI tests against relational data to robustly recover the underlying relational causal structure. Results of our experiments demonstrate the effectiveness of our proposed approach.

artificial intelligence, machine learning, relational data, (15 more...)

arXiv.org Artificial Intelligence

1912.0239

Country:

North America > United States > Oregon > Benton County > Corvallis (0.04)
North America > United States > Pennsylvania (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(7 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Databases (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.70)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.69)

Causal structure based root cause analysis of outliers

Janzing, Dominik, Budhathoki, Kailash, Minorics, Lenon, Blöbaum, Patrick

arXiv.org Machine Learning Dec-5-2019

We describe a formal approach to identify 'root causes' of outliers observed in $n$ variables $X_1,\dots,X_n$ in a scenario where the causal relation between the variables is a known directed acyclic graph (DAG). To this end, we first introduce a systematic way to define outlier scores. Further, we introduce the concept of 'conditional outlier score' which measures whether a value of some variable is unexpected *given the value of its parents* in the DAG, if one were to assume that the causal structure and the corresponding conditional distributions are also valid for the anomaly. Finally, we quantify to what extent the high outlier score of some target variable can be attributed to outliers of its ancestors. This quantification is defined via Shapley values from cooperative game theory.
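
To make the Shapley-value step concrete, the sketch below computes exact Shapley values for a small cooperative game by averaging each player's marginal contribution over all orderings. The players stand in for ancestor variables, and the value function is an invented toy that says how much of a target's outlier score a set of ancestors accounts for. Neither the function names nor the toy game come from the paper, and the paper's conditional outlier scores are not implemented here.

```python
from itertools import permutations


def shapley_values(players, value):
    """Exact Shapley values; value maps a frozenset of players to a number."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += value(with_p) - value(coalition)  # marginal contribution
            coalition = with_p
    return {p: total / len(orderings) for p, total in totals.items()}


def toy_score(coalition):
    """Hypothetical game: X1 alone explains 3.0; X2 and X3 explain 1.0 only jointly."""
    score = 3.0 if "X1" in coalition else 0.0
    if "X2" in coalition and "X3" in coalition:
        score += 1.0
    return score


attribution = shapley_values(["X1", "X2", "X3"], toy_score)
```

The attributions sum to the value of the full coalition, which is the property that lets a target's outlier score be divided among its ancestors without remainder. Enumerating all orderings is only feasible for a handful of variables.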

data mining, machine learning, outlier score, (18 more...)

arXiv.org Machine Learning

1912.02724

Country:

North America > Canada (0.04)
Europe > United Kingdom > England > West Yorkshire (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Germany (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
Information Technology > Game Theory (0.88)
Information Technology > Data Science > Data Mining (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.69)
