AITopics | Overview

Overview

Embracing the age of artificial intelligence in the latest ISOfocus

#artificialintelligence | Nov-14-2019, 14:36:52 GMT

Artificial intelligence (AI) is a game-changing technology that is affecting all our lives and shaping our future. In the latest ISOfocus issue, we debunk the AI myths, explore the opportunities and explain why globally relevant standards are key. Are killer robots about to take over the world? Mention artificial intelligence to the average person today and this is one of the many scary scenarios that spring to mind. Perhaps this is no surprise when you consider how AI is the technology that enables computers to think and act like human beings.

artificial intelligence, embracing, latest isofocus

#artificialintelligence

Country: North America > United States (0.07)

Genre: Overview (0.39)

Technology: Information Technology > Artificial Intelligence > Robots (0.75)

The frontier of simulation-based inference

Cranmer, Kyle, Brehmer, Johann, Louppe, Gilles

arXiv.org Machine Learning | Nov-14-2019

Many domains of science have developed complex simulations to describe phenomena of interest. While these simulations provide high-fidelity models, they are poorly suited for inference and lead to challenging inverse problems. We review the rapidly developing field of simulation-based inference and identify the forces giving new momentum to the field. Finally, we describe how the frontier is expanding so that a broad audience can appreciate the profound change these developments may have on science.

inference, likelihood, simulator, (14 more...)

arXiv.org Machine Learning

1911.01429

Country:

North America > United States > Virginia > Arlington County > Arlington (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Florida > Broward County > Fort Lauderdale (0.04)
(2 more...)

Genre:

Research Report (1.00)
Overview (0.66)

Industry:

Government > Regional Government > North America Government > United States Government (0.46)
Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Neural networks for option pricing and hedging: a literature review

Ruf, Johannes, Wang, Weiguan

arXiv.org Machine Learning | Nov-13-2019

This work provides a review of this literature. The motivation for this summary arose from our companion paper Ruf and Wang [2019]. There we continue the discussions of this note; in particular, of potentially problematic data leakage when training ANNs to historic financial data. This paper is organised in the following way. Section 2 features Table 1, a summary of the literature that concerns the use of ANNs for nonparametric pricing (and hedging) of options. Section 3 provides a list of recommended papers from Table 1. Section 4 provides an overview of related work where ANNs are applied in the context of option pricing and hedging, but not necessarily as nonparametric estimation tools. Section 5 briefly discusses various regularisation techniques used in the reviewed literature.

deep learning, neural network, option pricing, (18 more...)

arXiv.org Machine Learning

1911.0562

Country:

Asia (0.14)
Africa > South Africa (0.14)
North America > United States (0.14)
Europe > Italy (0.14)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Energy > Oil & Gas (1.00)
Banking & Finance > Trading (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Recent Advances in Algorithmic High-Dimensional Robust Statistics

Diakonikolas, Ilias, Kane, Daniel M.

arXiv.org Machine Learning | Nov-13-2019

Learning in the presence of outliers is a fundamental problem in statistics. Until recently, all known efficient unsupervised learning algorithms were very sensitive to outliers in high dimensions. In particular, even for the task of robust mean estimation under natural distributional assumptions, no efficient algorithm was known. Recent work in theoretical computer science gave the first efficient robust estimators for a number of fundamental statistical tasks, including mean and covariance estimation. Since then, there has been a flurry of research activity on algorithmic high-dimensional robust estimation in a range of settings. In this survey article, we introduce the core ideas and algorithmic techniques in the emerging area of algorithmic high-dimensional robust statistics with a focus on robust mean estimation. We also provide an overview of the approaches that have led to computationally efficient robust estimators for a range of broader statistical tasks and discuss new directions and opportunities for future work.

algorithm, estimation, mean estimation, (15 more...)

arXiv.org Machine Learning

1911.05911

Country:

North America > United States > New York (0.04)
Asia > Afghanistan > Parwan Province > Charikar (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)
(3 more...)

Genre: Overview (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science (0.92)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.67)

Convergence to minima for the continuous version of Backtracking Gradient Descent

Truong, Tuyen Trung

arXiv.org Machine Learning | Nov-13-2019

The main result of this paper is: {\bf Theorem.} Let $f:\mathbb{R}^k\rightarrow \mathbb{R}$ be a $C^{1}$ function, so that $\nabla f$ is locally Lipschitz continuous. Assume moreover that $f$ is $C^2$ near its generalised saddle points. Fix real numbers $\delta_0>0$ and $0<\alpha <1$. Then there is a smooth function $h:\mathbb{R}^k\rightarrow (0,\delta_0]$ so that the map $H:\mathbb{R}^k\rightarrow \mathbb{R}^k$ defined by $H(x)=x-h(x)\nabla f(x)$ has the following property: (i) For all $x\in \mathbb{R}^k$, we have $f(H(x))-f(x)\leq -\alpha h(x)||\nabla f(x)||^2$. (ii) For every $x_0\in \mathbb{R}^k$, the sequence $x_{n+1}=H(x_n)$ either satisfies $\lim_{n\rightarrow\infty}||x_{n+1}-x_n||=0$ or $\lim_{n\rightarrow\infty}||x_n||=\infty$. Each cluster point of $\{x_n\}$ is a critical point of $f$. If moreover $f$ has at most countably many critical points, then $\{x_n\}$ either converges to a critical point of $f$ or $\lim_{n\rightarrow\infty}||x_n||=\infty$. (iii) There is a set $\mathcal{E}_1\subset \mathbb{R}^k$ of Lebesgue measure $0$ so that for all $x_0\in \mathbb{R}^k\backslash \mathcal{E}_1$, the sequence $x_{n+1}=H(x_n)$, {\bf if it converges}, cannot converge to a {\bf generalised} saddle point. (iv) There is a set $\mathcal{E}_2\subset \mathbb{R}^k$ of Lebesgue measure $0$ so that for all $x_0\in \mathbb{R}^k\backslash \mathcal{E}_2$, any cluster point of the sequence $x_{n+1}=H(x_n)$ is not a saddle point, and more generally cannot be an isolated generalised saddle point. Some other results are proven.
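Property (i) above is the Armijo sufficient-decrease condition. As an illustration only, the sketch below shows the standard discrete backtracking rule, which starts from the step size $\delta_0$ and shrinks it until that inequality holds. It is not the smooth step-size function $h$ constructed in the paper. The shrink factor `beta`, the iteration caps, and the quadratic test function are assumptions introduced here.

```python
import numpy as np

def backtracking_step(f, grad_f, x, delta0=1.0, alpha=0.5, beta=0.5, max_halvings=60):
    """One gradient step whose size h is shrunk until the Armijo condition
    f(x - h*g) - f(x) <= -alpha * h * ||g||^2 holds (beta is an assumed shrink factor)."""
    g = grad_f(x)
    g_norm_sq = float(np.dot(g, g))
    h = delta0
    for _ in range(max_halvings):
        if f(x - h * g) - f(x) <= -alpha * h * g_norm_sq:
            break
        h *= beta
    return x - h * g, h

def backtracking_gd(f, grad_f, x0, steps=500, **kwargs):
    """Iterate x_{n+1} = x_n - h_n * grad f(x_n) with backtracked step sizes."""
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        x, _ = backtracking_step(f, grad_f, x, **kwargs)
    return x

# Hypothetical test function: an ill-conditioned quadratic with its minimum at the origin.
def f_quad(x):
    return float(x[0] ** 2 + 10.0 * x[1] ** 2)

def grad_quad(x):
    return np.array([2.0 * x[0], 20.0 * x[1]])

x_min = backtracking_gd(f_quad, grad_quad, [3.0, -2.0])
```

Each accepted step decreases $f$ by at least $\alpha h \|\nabla f(x)\|^2$, which is the quantity the convergence statements in (ii) rely on.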

critical point, generalised saddle point, saddle point, (15 more...)

arXiv.org Machine Learning

1911.04221

Country:

Europe > Norway > Eastern Norway > Oslo (0.04)
North America > United States > New York (0.04)
North America > United States > Massachusetts > Middlesex County > Belmont (0.04)
(4 more...)

Genre:

Research Report (0.50)
Overview (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.41)

MIDAS: Microcluster-Based Detector of Anomalies in Edge Streams

Bhatia, Siddharth, Hooi, Bryan, Yoon, Minji, Shin, Kijung, Faloutsos, Christos

arXiv.org Artificial Intelligence | Nov-13-2019

Given a stream of graph edges from a dynamic graph, how can we assign anomaly scores to edges in an online manner, for the purpose of detecting unusual behavior, using constant time and memory? Existing approaches aim to detect individually surprising edges. In this work, we propose MIDAS, which focuses on detecting microcluster anomalies, or suddenly arriving groups of suspiciously similar edges, such as lockstep behavior, including denial of service attacks in network traffic data. MIDAS has the following properties: (a) it detects microcluster anomalies while providing theoretical guarantees about its false positive probability; (b) it is online, thus processing each edge in constant time and constant memory, and also processes the data 108-505 times faster than state-of-the-art approaches; (c) it provides 46%-52% higher accuracy (in terms of AUC) than state-of-the-art approaches.

anomaly, cms data structure, detection, (13 more...)

arXiv.org Artificial Intelligence

1911.04464

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
South America > Brazil (0.04)
Europe > Ukraine (0.04)
(4 more...)

Genre:

Research Report (1.00)
Overview (0.86)

Industry:

Law Enforcement & Public Safety (1.00)
Information Technology > Security & Privacy (1.00)
Government > Regional Government (0.68)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.91)

Iteratively Training Look-Up Tables for Network Quantization

Cardinaux, Fabien, Uhlich, Stefan, Yoshiyama, Kazuki, Garcia, Javier Alonso, Mauch, Lukas, Tiedemann, Stephen, Kemp, Thomas, Nakamura, Akira

arXiv.org Machine Learning | Nov-12-2019

Operating deep neural networks (DNNs) on devices with limited resources requires the reduction of their memory as well as computational footprint. Popular reduction methods are network quantization or pruning, which either reduce the word length of the network parameters or remove weights from the network if they are not needed. In this article we discuss a general framework for network reduction which we call Look-Up Table Quantization (LUT-Q). For each layer, we learn a value dictionary and an assignment matrix to represent the network weights. We propose a special solver which combines gradient descent and a one-step k-means update to learn both the value dictionaries and assignment matrices iteratively. This method is very flexible: by constraining the value dictionary, many different reduction problems such as nonuniform network quantization, training of multiplierless networks, network pruning or simultaneous quantization and pruning can be implemented without changing the solver. This flexibility of the LUT-Q method allows us to use the same method to train networks for different hardware capabilities. Deep neural networks (DNNs) are currently used in many machine learning and signal processing applications with great success as their performance often beats the previous state-of-the-art approaches by a large margin, e.g., see [2] for an overview of deep learning. DNN approaches have become standard practice in computer vision, automatic speech recognition and partially in natural language processing. They are also extensively investigated to support other domains like medicine, robotics and finance forecasting. Recently, there has been a lot of interest in the research community in reducing the memory/computational footprint of neural networks.
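To make the value dictionary and assignment matrix concrete, here is a minimal sketch of a single k-means update on one layer's weights. It covers only the k-means half of the solver described above and leaves out the gradient descent step. The function names and the example weights are hypothetical and are not taken from the paper's implementation.

```python
import numpy as np

def lutq_kmeans_step(weights, dictionary):
    """One k-means update: assign each weight to its nearest dictionary value,
    then move each dictionary value to the mean of the weights assigned to it."""
    flat = weights.ravel()
    # Assignment: index of the closest dictionary entry for every weight.
    assign = np.argmin(np.abs(flat[:, None] - dictionary[None, :]), axis=1)
    new_dictionary = dictionary.astype(float)  # astype returns a copy
    for k in range(dictionary.size):
        members = flat[assign == k]
        if members.size > 0:  # leave unused entries unchanged
            new_dictionary[k] = members.mean()
    return new_dictionary, assign.reshape(weights.shape)

def dequantize(dictionary, assign):
    """Rebuild the quantized weight tensor by table look-up."""
    return dictionary[assign]

# Hypothetical 2x2 layer quantized with a three-entry dictionary.
w = np.array([[-1.1, -0.9], [0.1, 1.0]])
d0 = np.array([-1.0, 0.0, 1.0])
d1, a = lutq_kmeans_step(w, d0)
w_q = dequantize(d1, a)
```

Under this representation only the small dictionary and the integer assignment matrix need to be stored. Constraining the dictionary, for example by holding one entry at zero, is one way the pruning variant mentioned in the abstract could be expressed, though that detail is an assumption here.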

multiplication, neural network, quantization, (11 more...)

arXiv.org Machine Learning

1911.04951

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Europe > Switzerland > Basel-City > Basel (0.04)
Europe > Germany > Baden-Württemberg > Stuttgart Region > Stuttgart (0.04)

Genre:

Research Report > New Finding (0.46)
Research Report > Promising Solution (0.34)
Overview > Innovation (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

HyPar-Flow: Exploiting MPI and Keras for Scalable Hybrid-Parallel DNN Training using TensorFlow

Awan, Ammar Ahmad, Jain, Arpan, Anthony, Quentin, Subramoni, Hari, Panda, Dhabaleswar K.

arXiv.org Artificial Intelligence | Nov-12-2019

The enormous amount of data and computation required to train DNNs has led to the rise of various parallelization strategies. Broadly, there are two strategies: 1) Data-Parallelism -- replicating the DNN on multiple processes and training on different training samples, and 2) Model-Parallelism -- dividing elements of the DNN itself into partitions across different processes. While data-parallelism has been extensively studied and developed, model-parallelism has received less attention as it is non-trivial to split the model across processes. In this paper, we propose HyPar-Flow: a framework for scalable and user-transparent parallel training of very large DNNs (up to 5,000 layers). We exploit TensorFlow's Eager Execution features and Keras APIs for model definition and distribution. HyPar-Flow exposes a simple API to offer data, model, and hybrid (model + data) parallel training for models defined using the Keras API. Under the hood, we introduce MPI communication primitives like send and recv on layer boundaries for data exchange between model-partitions and allreduce for gradient exchange across model-replicas. Our proposed designs in HyPar-Flow offer up to 3.1x speedup over sequential training for ResNet-110 and up to 1.6x speedup over Horovod-based data-parallel training for ResNet-1001, a model that has 1,001 layers and 30 million parameters. We provide an in-depth performance characterization of the HyPar-Flow framework on multiple HPC systems with diverse CPU architectures including Intel Xeon(s) and AMD EPYC. HyPar-Flow provides 110x speedup on 128 nodes of the Stampede2 cluster at TACC for hybrid-parallel training of ResNet-1001.

gradient, hypar-flow, partition, (17 more...)

arXiv.org Artificial Intelligence

1911.05146

Country:

North America > United States > Ohio > Franklin County > Columbus (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > New Jersey > Middlesex County > Piscataway (0.04)
North America > United States > Texas (0.04)

Genre:

Research Report (0.65)
Overview (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

OpenAI forms exclusive computing partnership with Microsoft to build new Azure AI supercomputing technologies

#artificialintelligence | Nov-11-2019, 03:16:55 GMT

Through this partnership, the companies will accelerate breakthroughs in AI and power OpenAI's efforts to create artificial general intelligence (AGI). The resulting enhancements to the Azure platform will also help developers build the next generation of AI applications. The companies will focus on building a computational platform in Azure of unprecedented scale, which will train and run increasingly advanced AI models, include hardware technologies that build on Microsoft's supercomputing technology, and adhere to the two companies' shared principles on ethics and trust. This will create the foundation for advancements in AI to be implemented in a safe, secure and trustworthy way and is a critical reason the companies chose to partner together. Over the past decade, innovative applications of deep neural networks coupled with increasing computational power have led to continuous AI breakthroughs in areas such as vision, speech, language processing, translation, robotic control and even gaming.

artificial intelligence, machine learning, natural language, (11 more...)

#artificialintelligence

Country:

North America > United States > Washington > King County > Redmond (0.06)
North America > United States > California > San Francisco County > San Francisco (0.06)

Genre:

Press Release (0.79)
Overview > Innovation (0.37)

Industry: Information Technology (0.52)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.75)

Location Attention for Extrapolation to Longer Sequences

Dubois, Yann, Dagan, Gautier, Hupkes, Dieuwke, Bruni, Elia

arXiv.org Machine Learning | Nov-10-2019

Neural networks are surprisingly good at interpolating and perform remarkably well when the training set examples resemble those in the test set. However, they are often unable to extrapolate patterns beyond the seen data, even when the abstractions required for such patterns are simple. In this paper, we first review the notion of extrapolation, why it is important and how one could hope to tackle it. We then focus on a specific type of extrapolation which is especially useful for natural language processing: generalization to sequences that are longer than the training ones. We hypothesize that models with separate content- and location-based attention are more likely to extrapolate than those with common attention mechanisms. We empirically support our claim for recurrent seq2seq models with our proposed attention on variants of the Lookup Table task. This sheds light on some striking failures of neural models for sequences and on possible methods for approaching such issues.

arxiv preprint arxiv, attention mechanism, extrapolation, (12 more...)

arXiv.org Machine Learning

1911.03872

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)

Genre:

Research Report (0.50)
Overview (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
