AITopics

2011.05791

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > Minnesota (0.04)
North America > United States > New York (0.04)
Asia > Middle East > Jordan (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.89)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Building an Automated and Self-Aware Anomaly Detection System

Chakraborty, Sayan, Shah, Smit, Soltani, Kiumars, Swigart, Anna, Yang, Luyao, Buckingham, Kyle

Organizations rely heavily on time series metrics to measure and model key aspects of operational and business performance. The ability to reliably detect issues with these metrics is imperative to identifying early indicators of major problems before they become pervasive. It can be very challenging to proactively monitor a large number of diverse and constantly changing time series for anomalies, so there are often gaps in monitoring coverage, disabled or ignored monitors due to false positive alarms, and teams resorting to manual inspection of charts to catch problems. Traditionally, variations in the data generation processes and patterns have required strong modeling expertise to create models that accurately flag anomalies. In this paper, we describe an anomaly detection system that overcomes this common challenge by keeping track of its own performance and making changes as necessary to each model without requiring manual intervention. We demonstrate that this novel approach outperforms available alternatives on benchmark datasets in many scenarios.

artificial intelligence, data mining, machine learning, (17 more...)

2011.05047

Country:

North America > United States (0.05)
North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.05)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Weeraddana, Dilusha, MallawaArachchi, Sudaraka, Warnakula, Tharindu, Li, Zhidong, Wang, Yang

Long-Term Pipeline Failure Prediction Using Nonparametric Survival Analysis

Australian water infrastructure is more than a hundred years old and has begun to show its age through water main failures. Our work concerns approximately half a million pipelines across major Australian cities that deliver water to houses and businesses, serving over five million customers. Failures on these buried assets cause damage to properties and water supply disruptions. We applied Machine Learning techniques to find a cost-effective solution to the pipe failure problem in these Australian cities, where on average 1500 water main failures occur each year. To achieve this objective, we construct a detailed picture and understanding of the behaviour of the water pipe network by developing a Machine Learning model to assess and predict the likelihood of water main failure using historical failure records, descriptors of pipes and other environmental factors. Our results indicate that our system incorporating a nonparametric survival analysis technique called "Random Survival Forest" outperforms several popular algorithms and expert heuristics in long-term prediction. In addition, we construct a statistical inference technique to quantify the uncertainty associated with the long-term predictions.
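To make "nonparametric survival analysis" concrete, the sketch below computes a Kaplan-Meier survival curve, the simplest nonparametric survival estimator. It is not the Random Survival Forest the abstract describes; it is a swapped-in illustration of the underlying idea (estimating survival from failure times and censored observations without assuming a distribution). The function name `kaplan_meier` and the pipe ages are hypothetical.

```python
def kaplan_meier(durations, observed):
    """Return [(time, survival_probability)] at each distinct failure time.

    durations: age at failure, or age when observation stopped (censored).
    observed:  True if the pipe failed at that age, False if censored.
    """
    records = sorted(zip(durations, observed))
    at_risk = len(records)   # pipes still under observation
    survival = 1.0
    curve = []
    i = 0
    while i < len(records):
        t = records[i][0]
        failures = 0
        leaving = 0
        # Group all records that share the same time t.
        while i < len(records) and records[i][0] == t:
            if records[i][1]:
                failures += 1
            leaving += 1
            i += 1
        if failures:
            survival *= 1.0 - failures / at_risk
            curve.append((t, survival))
        at_risk -= leaving   # failed and censored pipes both leave the risk set
    return curve


# Hypothetical pipe ages in years (illustrative data, not from the paper).
ages = [10, 20, 20, 35, 40]
failed = [True, True, False, True, False]
curve = kaplan_meier(ages, failed)
```

Censored pipes (those that had not failed when observation ended) still count toward the at-risk population until they leave it, which is what separates survival analysis from a plain failure-rate calculation.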

pipe, pipe failure, prediction, (12 more...)

2011.08671

Country:

Oceania > Australia > Queensland (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > Utah (0.04)
(3 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine (0.93)
Water & Waste Management > Water Management > Water Supplies & Services (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.95)

Ritchie, Alexander, Balzano, Laura, Scott, Clayton

Supervised PCA: A Multiobjective Approach

arXiv.org Machine Learning, Nov-10-2020

Methods for supervised principal component analysis (SPCA) aim to incorporate label information into principal component analysis (PCA), so that the extracted features are more useful for a prediction task of interest. Prior work on SPCA has focused primarily on optimizing prediction error, and has neglected the value of maximizing variance explained by the extracted features. We propose a new method for SPCA that addresses both of these objectives jointly, and demonstrate empirically that our approach dominates existing approaches, i.e., outperforms them with respect to both prediction error and variation explained. Our approach accommodates arbitrary supervised learning losses and, through a statistical reformulation, provides a novel low-rank extension of generalized linear models.

dimension, principal component, variation, (15 more...)

2011.05309

Country:

North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)
Asia > Middle East > Jordan (0.04)
North America > United States > New York (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)
(2 more...)

arXiv.org Machine Learning, Nov-10-2020

Automatic Detection of Influential Actors in Disinformation Networks

Smith, Steven T., Kao, Edward K., Mackin, Erika D., Shah, Danelle C., Simek, Olga, Rubin, Donald B.

The weaponization of digital communications and social media to conduct disinformation campaigns at immense scale, speed, and reach presents new challenges to identify and counter hostile influence operations (IO). This paper presents an end-to-end framework to automate detection of disinformation narratives, networks, and influential actors. The framework integrates natural language processing, machine learning, graph analytics, and a novel network causal inference approach to quantify the impact of individual actors in spreading IO narratives. We demonstrate its capability on real-world hostile IO campaigns with Twitter datasets collected during the 2017 French presidential elections, and known IO accounts disclosed by Twitter over a broad range of IO campaigns (May 2007-February 2020), over 50 thousand accounts, 17 countries, and different account types including both trolls and bots. Our system detects IO accounts with 96% precision, 79% recall, and 96% area-under-the-PR-curve, maps out salient network communities, and discovers high-impact accounts that escape the lens of traditional impact statistics based on activity counts and network centrality. Results are corroborated with independent sources of known IO accounts from U.S. Congressional reports, investigative journalism, and IO datasets provided by Twitter.

io account, narrative, potential outcome, (14 more...)

2005.10879

Country:

Europe > France (1.00)
Asia > Russia (0.68)
Europe > Russia (0.14)
(26 more...)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Industry:

Media > News (1.00)
Government > Voting & Elections (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Government > Regional Government > Europe Government > France Government (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

He, Yang-Hui, Hirst, Edward, Peterken, Toby

Machine-Learning Dessins d'Enfants: Explorations via Modular and Seiberg-Witten Curves

arXiv.org Machine Learning, Nov-10-2020

Having learnt of the remarkable theorem of Belyĭ [1], which relates the existence of algebraic models of Riemann surfaces to that of analytic properties of rational functions thereon, Grothendieck launched an entire programme [2] by pictorially representing this structure as bipartite graphs (the dessin) drawn on the Riemann surface. He hypothesised dessins d'enfants in their current form as a conceptual representation of the absolute Galois group over the rationals, one of the most mysterious and least understood objects in number theory. Subsequently, he developed a generalisation of Belyĭ's theorem which extends the surfaces considered in the mapping to more general Riemann surfaces. Properties of the mapping are identified with combinatorial invariants of the dessin d'enfant graphs [2] (q.v.

dessin, extension, node, (17 more...)

doi: 10.1088/1751-8121/abbc4f

2004.05218

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
North America > United States > New Jersey > Hudson County > Hoboken (0.04)
North America > Canada > Quebec > Montreal (0.04)
(4 more...)

Genre:

Research Report (0.81)
Instructional Material > Course Syllabus & Notes (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.67)

Differentially Private Synthetic Data: Applied Evaluations and Enhancements

Rosenblatt, Lucas, Liu, Xiaoyan, Pouyanfar, Samira, de Leon, Eduardo, Desai, Anuj, Allen, Joshua

Machine learning practitioners frequently seek to leverage the most informative available data, without violating the data owner's privacy, when building predictive models. Differentially private data synthesis protects personal details from exposure, and allows for the training of differentially private machine learning models on privately generated datasets. But how can we effectively assess the efficacy of differentially private synthetic data? In this paper, we survey four differentially private generative adversarial networks for data synthesis. We evaluate each of them at scale on five standard tabular datasets, and in two applied industry scenarios. Our results suggest some synthesizers are more applicable for different privacy budgets, and we further demonstrate complicating domain-based tradeoffs in selecting an approach. We offer experimental learning on applied machine learning scenarios with private internal data to researchers and practitioners alike. In addition, we propose QUAIL, an ensemble-based modeling approach to generating synthetic data. We examine QUAIL's tradeoffs, and note circumstances in which it outperforms baseline differentially private supervised learning models under the same budget constraint. Maintaining an individual's privacy is a major concern when collecting sensitive information from groups or organizations. A formalization of privacy, known as differential privacy, has become the gold standard with which to protect information from malicious agents (Dwork et al., TAMC 2008).
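For reference, the standard definition of ε-differential privacy that the "privacy budget" refers to can be written as follows. The symbols are notation chosen here, not taken from the abstract: $\mathcal{M}$ is a randomized mechanism, $D$ and $D'$ are any two datasets differing in one individual's record, and $S$ is any set of outputs.

```latex
\Pr\bigl[\mathcal{M}(D) \in S\bigr] \;\le\; e^{\varepsilon} \, \Pr\bigl[\mathcal{M}(D') \in S\bigr]
```

A smaller budget $\varepsilon$ forces the two output distributions closer together, which gives stronger privacy but usually costs utility in the released or synthesized data.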

dataset, synthesizer, synthetic data, (13 more...)

2011.05537

Genre: Research Report > New Finding (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(2 more...)

Resource Constrained Dialog Policy Learning via Differentiable Inductive Logic Programming

Zhou, Zhenpeng, Beirami, Ahmad, Crook, Paul, Shah, Pararth, Subba, Rajen, Geramifard, Alborz

Motivated by the needs of resource constrained dialog policy learning, we introduce dialog policy via differentiable inductive logic (DILOG). We explore the tasks of one-shot learning and zero-shot domain transfer with DILOG on SimDial and MultiWoZ. Using a single representative dialog from the restaurant domain, we train DILOG on the SimDial dataset and obtain 99% in-domain test accuracy. We also show that the trained DILOG zero-shot transfers to all other domains with 99% accuracy, proving the suitability of DILOG to slot-filling dialogs. We further extend our study to the MultiWoZ dataset achieving 90% inform and success metrics. We also observe that these metrics are not capturing some of the shortcomings of DILOG in terms of false positives, prompting us to measure an auxiliary Action F1 score. We show that DILOG is 100x more data efficient than state-of-the-art neural approaches on MultiWoZ while achieving similar performance metrics. We conclude with a discussion on the strengths and weaknesses of DILOG.

arxiv preprint arxiv, dilog, food pref, (13 more...)

2011.05457

Genre: Research Report (0.50)

Industry: Consumer Products & Services > Restaurants (0.51)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

Fazli, Mehrdad, Kowsari, Kamran, Gharavi, Erfaneh, Barnes, Laura, Doryab, Afsaneh

HHAR-net: Hierarchical Human Activity Recognition using Neural Networks

Activity recognition using built-in sensors in smart and wearable devices provides great opportunities to understand and detect human behavior in the wild and gives a more holistic view of individuals' health and well-being. Numerous computational methods have been applied to sensor streams to recognize different daily activities. However, most methods are unable to capture different layers of activities concealed in human behavior. Also, the performance of the models starts to decrease as the number of activities increases. This research aims at building a hierarchical classification with Neural Networks to recognize human activities based on different levels of abstraction. We evaluate our model on the Extrasensory dataset, a dataset collected in the wild and containing data from smartphones and smartwatches. We use a two-level hierarchy with a total of six mutually exclusive labels, namely "lying down", "sitting", "standing in place", "walking", "running", and "bicycling", divided into "stationary" and "non-stationary". The results show that our model can recognize low-level activities (stationary/non-stationary) with 95.8% accuracy and overall accuracy of 92.8% over six labels. This is 3% above our best performing baseline.
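The two-level decision described above can be sketched as a lookup followed by two argmax steps. This is a minimal illustration, not the authors' network: the assignment of the six labels to the two groups is an assumption inferred from the label names, and `predict_hierarchical` together with the probability values are hypothetical stand-ins for the outputs of trained classifiers.

```python
# Assumed grouping of the six labels under the two top-level classes.
HIERARCHY = {
    "stationary": ["lying down", "sitting", "standing in place"],
    "non-stationary": ["walking", "running", "bicycling"],
}


def predict_hierarchical(parent_probs, child_probs):
    """Pick the top-level class first, then the best label within that class."""
    parent = max(parent_probs, key=parent_probs.get)
    candidates = HIERARCHY[parent]
    label = max(candidates, key=lambda name: child_probs[parent][name])
    return parent, label


# Hypothetical classifier outputs for a single sensor window.
parent_probs = {"stationary": 0.2, "non-stationary": 0.8}
child_probs = {
    "stationary": {"lying down": 0.7, "sitting": 0.2, "standing in place": 0.1},
    "non-stationary": {"walking": 0.3, "running": 0.6, "bicycling": 0.1},
}
result = predict_hierarchical(parent_probs, child_probs)
```

Because the second-level classifier only chooses among the labels of the selected group, each model faces a smaller label set than a flat six-way classifier would.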

activity recognition, misclassification, recognition, (13 more...)

2010.16052

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.28)
North America > United States > Virginia > Albemarle County > Charlottesville (0.14)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
(2 more...)

Genre: Research Report > New Finding (0.48)

Industry: Health & Medicine > Consumer Health (0.89)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Energy-based Out-of-distribution Detection

Liu, Weitang, Wang, Xiaoyun, Owens, John D., Li, Yixuan

Determining whether inputs are out-of-distribution (OOD) is an essential building block for safely deploying machine learning models in the open world. However, previous methods relying on the softmax confidence score suffer from overconfident posterior distributions for OOD data. We propose a unified framework for OOD detection that uses an energy score. We show that energy scores better distinguish in- and out-of-distribution samples than the traditional approach using the softmax scores. Unlike softmax confidence scores, energy scores are theoretically aligned with the probability density of the inputs and are less susceptible to the overconfidence issue. Within this framework, energy can be flexibly used as a scoring function for any pre-trained neural classifier as well as a trainable cost function to shape the energy surface explicitly for OOD detection. On a CIFAR-10 pre-trained WideResNet, using the energy score reduces the average FPR (at TPR 95%) by 18.03% compared to the softmax confidence score. With energy-based training, our method outperforms the state-of-the-art on common benchmarks.
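As a rough illustration of the scoring function, the energy of an input is commonly written as E(x) = -T · log Σ_k exp(f_k(x) / T), where f_k(x) are the classifier's logits and T is a temperature. The sketch below computes it next to the maximum softmax probability. The function names and the example logits are illustrative assumptions, not code from the paper.

```python
import math


def energy_score(logits, temperature=1.0):
    """Energy E(x) = -T * log(sum_k exp(f_k(x) / T)) for one vector of logits."""
    scaled = [v / temperature for v in logits]
    m = max(scaled)  # subtract the max before exponentiating, for stability
    lse = m + math.log(sum(math.exp(v - m) for v in scaled))
    return -temperature * lse


def max_softmax(logits):
    """Maximum softmax probability, the baseline confidence score."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    return max(exps) / sum(exps)


# Illustrative logits: one peaked, large-magnitude vector and one flat vector.
peaked = [10.0, 0.0, 0.0]
flat = [1.0, 0.9, 1.1]

energy_peaked = energy_score(peaked)
energy_flat = energy_score(flat)
confidence_flat = max_softmax(flat)
```

Lower energy corresponds to inputs the classifier treats as more in-distribution, so a detector can flag an input as OOD when its energy exceeds a threshold chosen on validation data. Unlike the softmax score, the energy depends on the overall magnitude of the logits and not only on their relative differences.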

detection, energy score, fine-tuning, (11 more...)

2010.03759

Country:

North America > United States > California > Yolo County > Davis (0.14)
North America > United States > Wisconsin > Dane County > Madison (0.14)
North America > Canada > Ontario > Toronto (0.14)
(3 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)