AITopics

1910.00189

Country:

North America > Canada > Ontario > Toronto (0.14)
Oceania > Australia (0.04)
North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
(3 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Machine LearningSep-30-2019

Automated design of error-resilient and hardware-efficient deep neural networks

Schorn, Christoph, Elsken, Thomas, Vogel, Sebastian, Runge, Armin, Guntoro, Andre, Ascheid, Gerd

Applying deep neural networks (DNNs) in mobile and safety-critical systems, such as autonomous vehicles, demands a reliable and efficient execution on hardware. Optimized dedicated hardware accelerators are being developed to achieve this. However, the design of efficient and reliable hardware has become increasingly difficult, due to the increased complexity of modern integrated circuit technology and its sensitivity against hardware faults, such as random bit-flips. It is thus desirable to exploit optimization potential for error resilience and efficiency also at the algorithmic side, e.g. by optimizing the architecture of the DNN. Since there are numerous design choices for the architecture of DNNs, with partially opposing effects on the preferred characteristics (such as small error rates at low latency), multi-objective optimization strategies are necessary. In this paper, we develop an evolutionary optimization technique for the automated design of hardware-optimized DNN architectures. For this purpose, we derive a set of easily computable objective functions, which enable the fast evaluation of DNN architectures with respect to their hardware efficiency and error resilience solely based on the network topology. We observe a strong correlation between predicted error resilience and actual measurements obtained from fault injection simulations. Keywords Neural Network Hardware · Error Resilience · Hardware Faults · Neural Architecture Search · Multi-Objective Optimization · AutoML 1 Introduction The application of deep neural networks (DNNs) in safety-critical perception systems, for example autonomous vehicles (A Vs), poses some challenges on the design of the underlying hardware platforms. On the one hand, efficient and fast accelerators are needed, since DNNs for computer vision exhibit massive computational requirements [55]. On the other hand, resilience against random hardware faults has to be ensured. In many driving scenarios, entering a fail-safe state is not sufficient, but fail-operational behavior and fault tolerance are required [48]. However, fault tolerance techniques at the hardware level often entail large redundancy overheads in silicon area, latency, and power consumption. These overheads stand in contrast to the low-power and low-latency requirements of embedded real-time DNN accelerators. Reliability concerns in nanoscale integrated circuits, for instance soft errors in memory and logic, represent an additional challenge for the realization of fault tolerance mechanisms at the hardware level [2, 33, 36, 68, 83]. Moreover, techniques such as near-threshold computing [26] and approximate computing [65] are desirable to meet power constraints, but can further increase error rates.

architecture, international conference, neural network, (16 more...)

1909.13844

Country:

North America > United States > New York > New York County > New York City (0.14)
Europe > Germany > Baden-Württemberg > Freiburg (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
(2 more...)

Genre: Research Report > New Finding (0.67)

Industry: Semiconductors & Electronics (0.86)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Machine LearningSep-30-2019

Imagine That! Leveraging Emergent Affordances for Tool Synthesis in Reaching Tasks

Wu, Yizhe, Kasewa, Sudhanshu, Groth, Oliver, Salter, Sasha, Sun, Li, Jones, Oiwi Parker, Posner, Ingmar

A BSTRACT In this paper we investigate an artificial agent's ability to perform task-focused tool synthesis via imagination. Our motivation is to explore the richness of information captured by the latent space of an object-centric generative model - and how to exploit it. In particular, our approach employs activation maximisation of a task-based performance predictor to optimise the latent variable of a structured latent-space model in order to generate tool geometries appropriate for the task at hand. We evaluate our model using a novel dataset of synthetic reaching tasks inspired by the cognitive sciences and behavioural ecology. In doing so we examine the model's ability to imagine tools for increasingly complex scenario types, beyond those seen during training. Our experiments demonstrate that the synthesis process modifies emergent, task-relevant object affordances in a targeted and deliberate way: the agents often specifically modify aspects of the tools which relate to meaningful (yet implicitly learned) concepts such as a tool's length, width and configuration. Our results therefore suggest that task relevant object affordances are implicitly encoded as directions in a structured latent space shaped by experience. 1 I NTRODUCTION Deep generative models are gaining in popularity for unsupervised representation learning. In particular, recent models like MONet (Burgess et al., 2019) have been proposed to decompose scenes into object-centric latent representations (cf. The notion of such an object-centric latent representation, trained from examples in an unsupervised way, holds a tantalising prospect: as generative models naturally capture factors of variation, could they also be used to expose these factors such that they can be modified in a task-driven way? We posit that a task-driven traversal of a structured latent space leads to affordances emerging naturally as directions in this space. This is in stark contrast to more common approaches to affordance learning where it is commonly achieved via direct supervision or implicitly via imitation (e.g. Tikhanoff et al., 2013; Myers et al., 2015; Liu et al., 2018; Grabner et al., 2011; Do et al., 2018).

international conference, latent space, scenario type, (15 more...)

1909.13561

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
Oceania > Australia > Victoria > Melbourne (0.04)
Oceania > Australia > Queensland > Brisbane (0.04)
(3 more...)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

Larsson, Daniel T., Maity, Dipankar, Tsiotras, Panagiotis

Q-Search Trees: An Information-Theoretic Approach Towards Hierarchical Abstractions for Agents with Computational Limitations

arXiv.org Artificial IntelligenceSep-30-2019

In this paper, we develop a framework to obtain graph abstractions for decision-making by an agent where the abstractions emerge as a function of the agent's limited computational resources. We discuss the connection of the proposed approach with information-theoretic signal compression, and formulate a novel optimization problem to obtain tree-based abstractions as a function of the agent's computational resources. The structural properties of the new problem are discussed in detail, and two algorithmic approaches are proposed to obtain solutions to this optimization problem. We discuss the quality of, and prove relationships between, solutions obtained by the two proposed algorithms. The framework is demonstrated to generate a hierarchy of abstractions for a non-trivial environment.

algorithm, q-tree search algorithm, representation, (16 more...)

1910.00063

Country:

North America > United States > Georgia > Fulton County > Atlanta (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
(5 more...)

Genre:

Research Report (0.50)
Overview (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.87)

Eiffert, Stuart, Sukkarieh, Salah

Predicting Responses to a Robot's Future Motion using Generative Recurrent Neural Networks

arXiv.org Artificial IntelligenceSep-30-2019

Robotic navigation through crowds or herds requires the ability to both predict the future motion of nearby individuals and understand how these predictions might change in response to a robot's future action. State of the art trajectory prediction models using Recurrent Neural Networks (RNNs) do not currently account for a planned future action of a robot, and so cannot predict how an individual will move in response to a robot's planned path. We propose an approach that adapts RNNs to use a robot's next planned action as an input alongside the current position of nearby individuals. This allows the model to learn the response of individuals with regards to a robot's motion from real world observations. By linking a robot's actions to the response of those around it in training, we show that we are able to not only improve prediction accuracy in close range interactions, but also to predict the likely response of surrounding individuals to simulated actions. This allows the use of the model to simulate state transitions, without requiring any assumptions on agent interaction. We apply this model to varied datasets, including crowds of pedestrians interacting with vehicles and bicycles, and livestock interacting with a robotic vehicle.

agent, interaction, robot, (16 more...)

1909.13486

Country: Oceania > New Zealand (0.04)

Genre: Research Report > New Finding (0.68)

Industry: Transportation (0.47)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Pang, Richard Yuanzhe, Gimpel, Kevin

Unsupervised Evaluation Metrics and Learning Criteria for Non-Parallel Textual Transfer

arXiv.org Artificial IntelligenceSep-30-2019

We consider the problem of automatically generating textual paraphrases with modified attributes or properties, focusing on the setting without parallel data (Hu et al., 2017; Shen et al., 2017). This setting poses challenges for evaluation. We show that the metric of post-transfer classification accuracy is insufficient on its own, and propose additional metrics based on semantic preservation and fluency as well as a way to combine them into a single overall score. We contribute new loss functions and training strategies to address the different metrics. Semantic preservation is addressed by adding a cyclic consistency loss and a loss based on paraphrase pairs, while fluency is improved by integrating losses based on style-specific language models. We experiment with a Yelp sentiment dataset and a new literature dataset that we propose, using multiple models that extend prior work (Shen et al., 2017). We demonstrate that our metrics correlate well with human judgments, at both the sentence-level and system-level. Automatic and manual evaluation also show large improvements over the baseline method of Shen et al. (2017). We hope that our proposed metrics can speed up system development for new textual transfer tasks while also encouraging the community to address our three complementary aspects of transfer quality.

computational linguistic, metric, proceedings, (12 more...)

1810.11878

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Illinois > Cook County > Chicago (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
(5 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.35)

arXiv.org Machine LearningSep-29-2019

Limit theorems for out-of-sample extensions of the adjacency and Laplacian spectral embeddings

Levin, Keith, Roosta, Fred, Tang, Minh, Mahoney, Michael W., Priebe, Carey E.

Graph embeddings, a class of dimensionality reduction techniques designed for relational data, have proven useful in exploring and modeling network structure. Most dimensionality reduction methods allow out-of-sample extensions, by which an embedding can be applied to observations not present in the training set. Applied to graphs, the out-of-sample extension problem concerns how to compute the embedding of a vertex that is added to the graph after an embedding has already been computed. In this paper, we consider the out-of-sample extension problem for two graph embedding procedures: the adjacency spectral embedding and the Laplacian spectral embedding. In both cases, we prove that when the underlying graph is generated according to a latent space model called the random dot product graph, which includes the popular stochastic block model as a special case, an out-of-sample extension based on a least-squares objective obeys a central limit theorem about the true latent position of the out-of-sample vertex. In addition, we prove a concentration inequality for the out-of-sample extension of the adjacency spectral embedding based on a maximum-likelihood objective. Our results also yield a convenient framework in which to analyze trade-offs between estimation accuracy and computational expense, which we explore briefly.

extension, latent position, vertex, (17 more...)

1910.00423

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Oceania > Australia > Queensland (0.04)
North America > United States > North Carolina (0.04)
(3 more...)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

arXiv.org Machine LearningSep-29-2019

Gradient Descent: The Ultimate Optimizer

Chandra, Kartik, Meijer, Erik, Andow, Samantha, Arroyo-Fang, Emilio, Dea, Irene, George, Johann, Grueter, Melissa, Hosmer, Basil, Stumpos, Steffi, Tempest, Alanna, Yang, Shannon

Working with any gradient-based machine learning algorithm involves the tedious task of tuning the optimizer's hyperparameters, such as the learning rate. There exist many techniques for automated hyperparameter optimization, but they typically introduce even more hyperparameters to control the hyperparameter optimization process. We propose to instead learn the hyperparameters themselves by gradient descent, and furthermore to learn the hyper-hyperparameters by gradient descent as well, and so on ad infinitum. As these towers of gradient-based optimizers grow, they become significantly less sensitive to the choice of top-level hyperparameters, hence decreasing the burden on the user to search for optimal values.

hyperparameter, optimizer, sgd, (13 more...)

1909.13371

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.14)
Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > California > San Mateo County > Menlo Park (0.04)
Europe > Spain > Canary Islands (0.04)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)

Psomakelis, Evangelos, Tserpes, Konstantinos, Zissisc, Dimitris, Anagnostopoulos, Dimosthenis, Varvarigou, Theodora

Context agnostic trajectory prediction based on $\lambda$-architecture

arXiv.org Machine LearningSep-29-2019

Predicting the next position of movable objects has been a problem for at least the last three decades, referred to as trajectory prediction. In our days, the vast amounts of data being continuously produced add the big data dimension to the trajectory prediction problem, which we are trying to tackle by creating a {\lambda}-Architecture based analytics platform. This platform performs both batch and stream analytics tasks and then combines them to perform analytical tasks that cannot be performed by analyzing any of these layers by itself. The biggest benefit of this platform is its context agnostic trait, which allows us to use it for any use case, as long as a time-stamped geolocation stream is provided. The experimental results presented prove that each part of the {\lambda}-Architecture performs well at certain targets, making a combination of these parts a necessity in order to improve the overall accuracy and performance of the platform.

algorithm, platform, prediction, (16 more...)

doi: 10.1016/j.future.2019.09.046

1909.13241

Country:

Oceania > Samoa (0.04)
Atlantic Ocean > Mediterranean Sea (0.04)
Pacific Ocean (0.04)
(5 more...)

Genre: Research Report > New Finding (1.00)

Industry: Transportation > Marine (0.68)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)

arXiv.org Artificial IntelligenceSep-29-2019

Generating Diverse Story Continuations with Controllable Semantics

Tu, Lifu, Ding, Xiaoan, Yu, Dong, Gimpel, Kevin

We propose a simple and effective modeling framework for controlled generation of multiple, diverse outputs. We focus on the setting of generating the next sentence of a story given its context. As controllable dimensions, we consider several sentence attributes, including sentiment, length, predicates, frames, and automatically-induced clusters. Our empirical results demonstrate: (1) our framework is accurate in terms of generating outputs that match the target control values; (2) our model yields increased maximum metric scores compared to standard n -best list generation via beam search; (3) controlling generation with semantic frames leads to a stronger combination of diversity and quality than other control variables as measured by automatic metrics. We also conduct a human evaluation to assess the utility of providing multiple suggestions for creative writing, demonstrating promising results for the potential of controllable, diverse generation in a collaborative writing system.

computational linguistic, continuation, proceedings, (14 more...)

1909.13434

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
Europe > Italy > Tuscany > Florence (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
(12 more...)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)