AITopics | SPE

Collaborating Authors

SPE

Memory and attention in deep learning

arXiv.org Artificial IntelligenceJul-3-2021

Intelligence necessitates memory. Without memory, humans fail to perform various nontrivial tasks such as reading novels, playing games or solving maths. As the ultimate goal of machine learning is to derive intelligent systems that learn and act automatically just like human, memory construction for machine is inevitable. Artificial neural networks model neurons and synapses in the brain by interconnecting computational units via weights, which is a typical class of machine learning algorithms that resembles memory structure. Their descendants with more complicated modeling techniques (a.k.a deep learning) have been successfully applied to many practical problems and demonstrated the importance of memory in the learning process of machinery systems. Recent progresses on modeling memory in deep learning have revolved around external memory constructions, which are highly inspired by computational Turing models and biological neuronal systems. Attention mechanisms are derived to support acquisition and retention operations on the external memory. Despite the lack of theoretical foundations, these approaches have shown promises to help machinery systems reach a higher level of intelligence. The aim of this thesis is to advance the understanding on memory and attention in deep learning. Its contributions include: (i) presenting a collection of taxonomies for memory, (ii) constructing new memory-augmented neural networks (MANNs) that support multiple control and memory units, (iii) introducing variability via memory in sequential generative models, (iv) searching for optimal writing operations to maximise the memorisation capacity in slot-based memory networks, and (v) simulating the Universal Turing Machine via Neural Stored-program Memory-a new kind of external memory for neural networks.

knowledge discovery and data mining, procedure prediction and drug prescription, test-set classification accuracy, (15 more...)

arXiv.org Artificial Intelligence

2107.0139

Country:

Asia > Middle East > Jordan (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Oceania > Australia (0.04)
(6 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Promising Solution (0.92)
Workflow (0.92)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Education (0.92)
Health & Medicine > Consumer Health (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Applications of the Free Energy Principle to Machine Learning and Neuroscience

Millidge, Beren

arXiv.org Artificial IntelligenceJun-30-2021

In this thesis, we explore and apply methods inspired by the free energy principle to two important areas in machine learning and neuroscience. The free energy principle is a general mathematical theory of the necessary information-theoretic behaviours of systems which maintain a separation from their environment. A core postulate of the theory is that complex systems can be seen as performing variational Bayesian inference and minimizing an information-theoretic quantity called the variational free energy. The free energy principle originated in, and has been extremely influential in theoretical neuroscience, having spawned a number of neurophysiologically realistic process theories, and maintaining close links with Bayesian Brain viewpoints. The thesis is split into three main parts where we apply methods and insights from the free energy principle to understand questions first in perception, then action, and finally learning. Specifically, in the first section, we focus on the theory of predictive coding, a neurobiologically plausible process theory derived from the free energy principle under certain assumptions, which argues that the primary function of the brain is to minimize prediction errors. We focus on scaling up predictive coding architectures and simulate large-scale predictive coding networks for perception on machine learning benchmarks; we investigate predictive coding's relationship to other classical filtering algorithms, and we demonstrate that many biologically implausible aspects of current models of predictive coding can be relaxed without unduly harming the performance of predictive coding models which allows for a potentially more literal translation of predictive coding theory into cortical microcircuits. In the second part of the thesis, we focus on the application of methods deriving from the free energy principle to action. We study the extension of methods of'active inference', a neurobiologically grounded account of action through variational message passing, to utilize deep artificial neural networks, allowing these methods to'scale up' to be competitive with state of the art deep reinforcement learning methods.

biologically implausible one-to-one connectivity, crucial approximate bayesian inference lemma, non-symmetric forward and backward connectivity, (17 more...)

arXiv.org Artificial Intelligence

2107.0014

Country:

Asia > Middle East > Jordan (0.04)
Africa > Mali (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(6 more...)

Genre:

Research Report > Promising Solution (1.00)
Research Report > New Finding (1.00)
Overview (1.00)
Instructional Material (0.92)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (1.00)
(4 more...)

Add feedback

The Threat of Offensive AI to Organizations

Mirsky, Yisroel, Demontis, Ambra, Kotak, Jaidip, Shankar, Ram, Gelei, Deng, Yang, Liu, Zhang, Xiangyu, Lee, Wenke, Elovici, Yuval, Biggio, Battista

arXiv.org Artificial IntelligenceJun-29-2021

AI has provided us with the ability to automate tasks, extract information from vast amounts of data, and synthesize media that is nearly indistinguishable from the real thing. However, positive tools can also be used for negative purposes. In particular, cyber adversaries can use AI (such as machine learning) to enhance their attacks and expand their campaigns. Although offensive AI has been discussed in the past, there is a need to analyze and understand the threat in the context of organizations. For example, how does an AI-capable adversary impact the cyber kill chain? Does AI benefit the attacker more than the defender? What are the most significant AI threats facing organizations today and what will be their impact on the future? In this survey, we explore the threat of offensive AI on organizations. First, we present the background and discuss how AI changes the adversary's methods, strategies, goals, and overall attack model. Then, through a literature review, we identify 33 offensive AI capabilities which adversaries can use to enhance their attacks. Finally, through a user study spanning industry and academia, we rank the AI threats and provide insights on the adversaries.

adversary, international conference, publication date, (14 more...)

arXiv.org Artificial Intelligence

2106.15764

Country:

Europe > Italy > Sardinia > Cagliari (0.04)
North America > United States > New York > New York County > New York City (0.04)
Asia > Singapore (0.04)
(20 more...)

Genre:

Questionnaire & Opinion Survey (1.00)
Overview (1.00)
Research Report > New Finding (0.67)

Industry:

Law Enforcement & Public Safety (1.00)
Law (1.00)
Information Technology > Security & Privacy (1.00)
(2 more...)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Communications > Networks (1.00)
(6 more...)

Add feedback

Advancing Methodology for Social Science Research Using Alternate Reality Games: Proof-of-Concept Through Measuring Individual Differences and Adaptability and their impact on Team Performance

El-Nasr, Magy Seif, Harteveld, Casper, Fombelle, Paul, Nguyen, Truong-Huy, Rizzo, Paola, Schouten, Dylan, Madkour, Abdelrahman, Jemmali, Chaima, Kleinman, Erica, Javvaji, Nithesh, Teng, Zhaoqing, Inc, Extra Ludic

arXiv.org Artificial IntelligenceJun-25-2021

While work in fields of CSCW (Computer Supported Collaborative Work), Psychology and Social Sciences have progressed our understanding of team processes and their effect performance and effectiveness, current methods rely on observations or self-report, with little work directed towards studying team processes with quantifiable measures based on behavioral data. In this report we discuss work tackling this open problem with a focus on understanding individual differences and its effect on team adaptation, and further explore the effect of these factors on team performance as both an outcome and a process. We specifically discuss our contribution in terms of methods that augment survey data and behavioral data that allow us to gain more insight on team performance as well as develop a method to evaluate adaptation and performance across and within a group. To make this problem more tractable we chose to focus on specific types of environments, Alternate Reality Games (ARGs), and for several reasons. First, these types of games involve setups that are similar to a real-world setup, e.g., communication through slack or email. Second, they are more controllable than real environments allowing us to embed stimuli if needed. Lastly, they allow us to collect data needed to understand decisions and communications made through the entire duration of the experience, which makes team processes more transparent than otherwise possible. In this report we discuss the work we did so far and demonstrate the efficacy of the approach.

defense advanced research project agency, distribution statement, distribution unlimited, (13 more...)

arXiv.org Artificial Intelligence

2106.1374

Country: North America > United States > Massachusetts > Suffolk County > Boston (0.04)

Genre:

Research Report > New Finding (1.00)
Questionnaire & Opinion Survey (1.00)
Research Report > Experimental Study (0.93)

Industry:

Education (1.00)
Leisure & Entertainment > Games > Computer Games (0.67)
Government > Regional Government > North America Government > United States Government (0.56)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.92)
Information Technology > Data Science > Data Quality (0.67)

Add feedback

Online Handbook of Argumentation for AI: Volume 2

OHAAI Collaboration, null, Brannstrom, Andreas, Castagna, Federico, Duchatelle, Theo, Foulis, Matt, Kampik, Timotheus, Kuhlmann, Isabelle, Malmqvist, Lars, Morveli-Espinoza, Mariela, Mumford, Jack, Pandzic, Stipe, Schaefer, Robin, Thorburn, Luke, Xydis, Andreas, Yuste-Ginel, Antonio, Zheng, Heng

arXiv.org Artificial IntelligenceJun-23-2021

This volume contains revised versions of the papers selected for the second volume of the Online Handbook of Argumentation for AI (OHAAI). Previously, formal theories of argument and argument interaction have been proposed and studied, and this has led to the more recent study of computational models of argument. Argumentation, as a field within artificial intelligence (AI), is highly relevant for researchers interested in symbolic representations of knowledge and defeasible reasoning. The purpose of this handbook is to provide an open access and curated anthology for the argumentation research community. OHAAI is designed to serve as a research hub to keep track of the latest and upcoming PhD-driven research on the theory and application of argumentation in all areas related to AI.

argument, argumentation, online handbook, (14 more...)

arXiv.org Artificial Intelligence

2106.10832

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > United Kingdom > England > Greater London > London (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(26 more...)

Genre:

Overview (1.00)
Research Report > New Finding (0.92)

Industry:

Law (1.00)
Government (1.00)
Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.45)

Add feedback

Heterogeneous Treatment Effects in Regression Discontinuity Designs

Reguly, Ágoston

arXiv.org Machine LearningJun-22-2021

The paper proposes a supervised machine learning algorithm to uncover treatment effect heterogeneity in classical regression discontinuity (RD) designs. Extending Athey and Imbens (2016), I develop a criterion for building an honest ``regression discontinuity tree'', where each leaf of the tree contains the RD estimate of a treatment (assigned by a common cutoff rule) conditional on the values of some pre-treatment covariates. It is a priori unknown which covariates are relevant for capturing treatment effect heterogeneity, and it is the task of the algorithm to discover them, without invalidating inference. I study the performance of the method through Monte Carlo simulations and apply it to the data set compiled by Pop-Eleches and Urquiola (2013) to uncover various sources of heterogeneity in the impact of attending a better secondary school in Romania.

algorithm, criterion, treatment effect, (15 more...)

arXiv.org Machine Learning

2106.1164

Country:

Europe > Romania (0.24)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > Experimental Study (0.46)

Industry: Education > Educational Setting > K-12 Education > Secondary School (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

Dive into Deep Learning

Zhang, Aston, Lipton, Zachary C., Li, Mu, Smola, Alexander J.

arXiv.org Artificial IntelligenceJun-21-2021

Just a few years ago, there were no legions of deep learning scientists developing intelligent products and services at major companies and startups. When the youngest among us (the authors) entered the field, machine learning did not command headlines in daily newspapers. Our parents had no idea what machine learning was, let alone why we might prefer it to a career in medicine or law. Machine learning was a forward-looking academic discipline with a narrow set of real-world applications. And those applications, e.g., speech recognition and computer vision, required so much domain knowledge that they were often regarded as separate areas entirely for which machine learning was one small component. Neural networks then, the antecedents of the deep learning models that we focus on in this book, were regarded as outmoded tools. In just the past five years, deep learning has taken the world by surprise, driving rapid progress in fields as diverse as computer vision, natural language processing, automatic speech recognition, reinforcement learning, and statistical modeling. With these advances in hand, we can now build cars that drive themselves with more autonomy than ever before (and less autonomy than some companies might have you believe), smart reply systems that automatically draft the most mundane emails, helping people dig out from oppressively large inboxes, and software agents that dominate the worldʼs best humans at board games like Go, a feat once thought to be decades away. Already, these tools exert ever-wider impacts on industry and society, changing the way movies are made, diseases are diagnosed, and playing a growing role in basic sciences--from astrophysics to biology.

chess, sequence length make self-attention prohibitively, vascular disease, (33 more...)

arXiv.org Artificial Intelligence

2106.11342

Country:

North America > United States > Texas (0.27)
North America > United States > California > Santa Clara County > Palo Alto (0.13)
Asia > Japan (0.13)
(6 more...)

Genre:

Workflow (1.00)
Summary/Review (1.00)
Research Report > Promising Solution (1.00)
(4 more...)

Industry:

Media > Film (1.00)
Leisure & Entertainment > Games > Computer Games (1.00)
Information Technology > Services (1.00)
(13 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.67)

Add feedback

Importance measures derived from random forests: characterisation and extension

Sutera, Antonio

arXiv.org Machine LearningJun-17-2021

Nowadays new technologies, and especially artificial intelligence, are more and more established in our society. Big data analysis and machine learning, two sub-fields of artificial intelligence, are at the core of many recent breakthroughs in many application fields (e.g., medicine, communication, finance, ...), including some that are strongly related to our day-to-day life (e.g., social networks, computers, smartphones, ...). In machine learning, significant improvements are usually achieved at the price of an increasing computational complexity and thanks to bigger datasets. Currently, cutting-edge models built by the most advanced machine learning algorithms typically became simultaneously very efficient and profitable but also extremely complex. Their complexity is to such an extent that these models are commonly seen as black-boxes providing a prediction or a decision which can not be interpreted or justified. Nevertheless, whether these models are used autonomously or as a simple decision-making support tool, they are already being used in machine learning applications where health and human life are at stake. Therefore, it appears to be an obvious necessity not to blindly believe everything coming out of those models without a detailed understanding of their predictions or decisions. Accordingly, this thesis aims at improving the interpretability of models built by a specific family of machine learning algorithms, the so-called tree-based methods. Several mechanisms have been proposed to interpret these models and we aim along this thesis to improve their understanding, study their properties, and define their limitations.

computational statistic & data analysis, predictive feature, random forest variable importance measure, (16 more...)

arXiv.org Machine Learning

2106.09473

Country:

North America > United States > California > San Francisco County > San Francisco (0.13)
North America > United States > California > Alameda County > Berkeley (0.13)
Europe > Spain > Canary Islands (0.04)
(16 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Information Technology (1.00)
Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
(3 more...)

Add feedback

Mobile Augmented Reality: User Interfaces, Frameworks, and Intelligence

Cao, Jacky, Lam, Kit-Yung, Lee, Lik-Hang, Liu, Xiaoli, Hui, Pan, Su, Xiang

arXiv.org Artificial IntelligenceJun-16-2021

Mobile Augmented Reality (MAR) integrates computer-generated virtual objects with physical environments for mobile devices. MAR systems enable users to interact with MAR devices, such as smartphones and head-worn wearables, and performs seamless transitions from the physical world to a mixed world with digital entities. These MAR systems support user experiences by using MAR devices to provide universal accessibility to digital contents. Over the past 20 years, a number of MAR systems have been developed, however, the studies and design of MAR frameworks have not yet been systematically reviewed from the perspective of user-centric design. This article presents the first effort of surveying existing MAR frameworks (count: 37) and further discusses the latest studies on MAR through a top-down approach: 1) MAR applications; 2) MAR visualisation techniques adaptive to user mobility and contexts; 3) systematic evaluation of MAR frameworks including supported platforms and corresponding features such as tracking, feature extraction plus sensing capabilities; and 4) underlying machine learning approaches supporting intelligent operations within MAR systems. Finally, we summarise the development of emerging research fields, current state-of-the-art, and discuss the important open challenges and possible theoretical and technical directions. This survey aims to benefit both researchers and MAR system developers alike.

application, augmented reality, information, (14 more...)

arXiv.org Artificial Intelligence

2106.0871

Country:

Asia > China > Hong Kong (0.04)
Europe > Finland > Uusimaa > Helsinki (0.04)
Europe > Finland > Northern Ostrobothnia > Oulu (0.04)
(8 more...)

Genre:

Overview (1.00)
Research Report > New Finding (0.48)

Industry:

Leisure & Entertainment (1.00)
Health & Medicine > Surgery (1.00)
Information Technology > Services (0.93)
(5 more...)

Technology:

Information Technology > Human Computer Interaction > Interfaces > Virtual Reality (1.00)
Information Technology > Communications > Mobile (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Semantic Representation and Inference for NLP

Wang, Dongsheng

arXiv.org Artificial IntelligenceJun-15-2021

Semantic representation and inference is essential for Natural Language Processing (NLP). The state of the art for semantic representation and inference is deep learning, and particularly Recurrent Neural Networks (RNNs), Convolutional Neural Networks (CNNs), and transformer Self-Attention models. This thesis investigates the use of deep learning for novel semantic representation and inference, and makes contributions in the following three areas: creating training data, improving semantic representations and extending inference learning. In terms of creating training data, we contribute the largest publicly available dataset of real-life factual claims for the purpose of automatic claim verification (MultiFC), and we present a novel inference model composed of multi-scale CNNs with different kernel sizes that learn from external sources to infer fact checking labels. In terms of improving semantic representations, we contribute a novel model that captures non-compositional semantic indicators. By definition, the meaning of a non-compositional phrase cannot be inferred from the individual meanings of its composing words (e.g., hot dog). Motivated by this, we operationalize the compositionality of a phrase contextually by enriching the phrase representation with external word embeddings and knowledge graphs. Finally, in terms of inference learning, we propose a series of novel deep learning architectures that improve inference by using syntactic dependencies, by ensembling role guided attention heads, incorporating gating layers, and concatenating multiple heads in novel and effective ways. This thesis consists of seven publications (five published and two under review).

bidirectional transformer, convolutional neural network, inter-block representation, (16 more...)

arXiv.org Artificial Intelligence

2106.08117

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Mexico (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
(32 more...)

Genre:

Research Report > Promising Solution (1.00)
Research Report > New Finding (1.00)

Industry:

Information Technology (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Media > News (0.92)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.45)

Add feedback