AITopics

Deep learning natural language processing models often use vector word embeddings, such as word2vec or GloVe, to represent words. A discrete sequence of words can be much more easily integrated with downstream neural layers if it is represented as a sequence of continuous vectors. Also, semantic relationships between words, learned from a text corpus, can be encoded in the relative configurations of the embedding vectors. However, storing and accessing embedding vectors for all words in a dictionary requires large amount of space, and may stain systems with limited GPU memory. Here, we used approaches inspired by quantum computing to propose two related methods, {\em word2ket} and {\em word2ketXS}, for storing word embedding matrix during training and inference in a highly efficient way. Our approach achieves a hundred-fold or more reduction in the space required to store the embeddings with almost no relative drop in accuracy in practical natural language processing tasks.

artificial intelligence, machine learning, natural language, (14 more...)

1911.04975

Country: North America > United States > Virginia (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Beale, Nicholas, Battey, Heather, Davison, Anthony C., MacKay, Robert S.

An Unethical Optimization Principle

If an artificial intelligence aims to maximise risk-adjusted return, then under mild conditions it is disproportionately likely to pick an unethical strategy unless the objective function allows sufficiently for this risk. Even if the proportion ${\eta}$ of available unethical strategies is small, the probability ${p_U}$ of picking an unethical strategy can become large; indeed unless returns are fat-tailed ${p_U}$ tends to unity as the strategy space becomes large. We define an Unethical Odds Ratio Upsilon (${\Upsilon}$) that allows us to calculate ${p_U}$ from ${\eta}$, and we derive a simple formula for the limit of ${\Upsilon}$ as the strategy space becomes large. We give an algorithm for estimating ${\Upsilon}$ and ${p_U}$ in finite cases and discuss how to deal with infinite strategy spaces. We show how this principle can be used to help detect unethical strategies and to estimate ${\eta}$. Finally we sketch some policy implications of this work.

artificial intelligence, strategy space, unethical strategy, (16 more...)

1911.05116

Country:

Europe > Austria > Vienna (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Switzerland > Vaud > Lausanne (0.04)

Genre: Research Report (1.00)

Industry:

Banking & Finance (0.68)
Government (0.46)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.46)

arXiv.org Artificial IntelligenceNov-12-2019

A Pre-training Based Personalized Dialogue Generation Model with Persona-sparse Data

Zheng, Yinhe, Zhang, Rongsheng, Mao, Xiaoxi, Huang, Minlie

Endowing dialogue systems with personas is essential to deliver more human-like conversations. However, this problem is still far from well explored due to the difficulties of both embodying personalities in natural languages and the persona sparsity issue observed in most dialogue corpora. This paper proposes a pre-training based personalized dialogue model that can generate coherent responses using persona-sparse dialogue data. In this method, a pre-trained language model is used to initialize an encoder and decoder, and personal attribute embeddings are devised to model richer dialogue contexts by encoding speakers' personas together with dialogue histories. Further, to incorporate the target persona in the decoding process and to balance its contribution, an attention routing structure is devised in the decoder to merge features extracted from the target persona and dialogue contexts using dynamically predicted weights. Our model can utilize persona-sparse dialogues in a unified manner during the training process, and can also control the amount of persona-related features to exhibit during the inference process. Both automatic and manual evaluation demonstrates that the proposed model outperforms state-of-the-art methods for generating more coherent and persona consistent responses with persona-sparse data.

large language model, machine learning, persona, (21 more...)

arXiv.org Artificial Intelligence

1911.047

Country:

Asia > China > Beijing > Beijing (0.05)
Asia > China > Zhejiang Province > Hangzhou (0.04)

Genre: Research Report > New Finding (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.68)

arXiv.org Artificial IntelligenceNov-12-2019

Aplib: Tactical Programming of Intelligent Agents

Prasetya, I. S. W. B.

This paper presents aplib, a Java library for programming intelligent agents, featuring BDI and multi agency, but adding on top of it a novel layer of tactical programming inspired by the domain of theorem proving. Aplib is also implemented in such a way to provide the fluency of a Domain Specific Language (DSL). Compared to dedicated BDI agent programming languages such as JASON, 2APL, or GOAL,aplib's embedded DSL approach does mean that \aplib\ programmers will still be limited by Java syntax, but on other hand they get all the advantages that Java programmers get: rich language features (object orientation, static type checking, $\lambda$-expression, libraries, etc), a whole array of development tools, integration with other technologies, large community, etc.

agent, artificial intelligence, programming language, (17 more...)

arXiv.org Artificial Intelligence

1911.0471

Country: Europe > Netherlands (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

Journal of Artificial Intelligence ResearchNov-12-2019

On the Time and Space Complexity of Genetic Programming for Evolving Boolean Conjunctions

Lissovoi, Andrei (University of Sheffield) | Oliveto, Pietro S. (University of Sheffield)

Genetic programming (GP) is a general purpose bio-inspired meta-heuristic for the evolution of computer programs. In contrast to the several successful applications, there is little understanding of the working principles behind GP. In this paper we present a performance analysis that sheds light on the behaviour of simple GP systems for evolving conjunctions of n variables (ANDn). The analysis of a random local search GP system with minimal terminal and function sets reveals the relationship between the number of iterations and the progress the GP makes toward finding the target function. Afterwards we consider a more realistic GP system equipped with a global mutation operator and prove that it can efficiently solve ANDn by producing programs of linear size that fit a training set to optimality and with high probability generalise well. Additionally, we consider more general problems which extend the terminal set with undesired variables or negated variables. In the presence of undesired variables, we prove that, if non-strict selection is used, then the algorithm fits the complete training set efficiently while the strict selection algorithm may fail with high probability unless the substitution operator is switched off. If negations are allowed, we show that while the algorithms fail to fit the complete training set, the constructed solutions generalise well. Finally, from a problem hardness perspective, we reveal the existence of small training sets that allow the evolution of the exact conjunctions even with access to negations or undesired variables.

algorithm, iteration, probability, (15 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.1.11821

AI Access Foundation

11821

Journal of Artificial Intelligence Research

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)

Tactics of Adversarial Attack on Deep Reinforcement Learning Agents

Lin, Yen-Chen, Hong, Zhang-Wei, Liao, Yuan-Hong, Shih, Meng-Li, Liu, Ming-Yu, Sun, Min

We introduce two tactics to attack agents trained by deep reinforcement learning algorithms using adversarial examples, namely the strategically-timed attack and the enchanting attack. In the strategically-timed attack, the adversary aims at minimizing the agent's reward by only attacking the agent at a small subset of time steps in an episode. Limiting the attack activity to this subset helps prevent detection of the attack by the agent. We propose a novel method to determine when an adversarial example should be crafted and applied. In the enchanting attack, the adversary aims at luring the agent to a designated target state. This is achieved by combining a generative model and a planning algorithm: while the generative model predicts the future states, the planning algorithm generates a preferred sequence of actions for luring the agent. A sequence of adversarial examples is then crafted to lure the agent to take the preferred sequence of actions. We apply the two tactics to the agents trained by the state-of-the-art deep reinforcement learning algorithm including DQN and A3C. In 5 Atari games, our strategically timed attack reduces as much reward as the uniform attack (i.e., attacking at every time step) does by attacking the agent 4 times less often. Our enchanting attack lures the agent toward designated target states with a more than 70% success rate. Videos are available at http://yenchenlin.me/adversarial_attack_RL/

adversarial example, agent, deep rl agent, (15 more...)

1703.06748

Country:

Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
North America > United States > California > Santa Clara County > Santa Clara (0.04)
Asia > Taiwan (0.04)

Genre: Research Report (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military (0.95)
Leisure & Entertainment > Games > Computer Games (0.55)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

van Harten, Louis D., Wolterink, Jelmer M., Verhoeff, Joost J. C., Išgum, Ivana

Exploiting Clinically Available Delineations for CNN-based Segmentation in Radiotherapy Treatment Planning

Convolutional neural networks (CNNs) have been widely and successfully used for medical image segmentation. However, CNNs are typically considered to require large numbers of dedicated expert-segmented training volumes, which may be limiting in practice. This work investigates whether clinically obtained segmentations which are readily available in picture archiving and communication systems (PACS) could provide a possible source of data to train a CNN for segmentation of organs-at-risk (OARs) in radiotherapy treatment planning. In such data, delineations of structures deemed irrelevant to the target clinical use may be lacking. To overcome this issue, we use multi-label instead of multi-class segmentation. We empirically assess how many clinical delineations would be sufficient to train a CNN for the segmentation of OARs and find that increasing the training set size beyond a limited number of images leads to sharply diminishing returns. Moreover, we find that by using multi-label segmentation, missing structures in the reference standard do not have a negative effect on overall segmentation accuracy. These results indicate that segmentations obtained in a clinical workflow can be used to train an accurate OAR segmentation model.

delineation, segmentation, subset, (12 more...)

1911.04967

Country: Europe > Netherlands > North Holland > Amsterdam (0.05)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.69)

Industry:

Health & Medicine > Nuclear Medicine (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (0.90)
Health & Medicine > Therapeutic Area > Oncology (0.62)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.90)

van Harten, Louis D., Wolterink, Jelmer M., Verhoeff, Joost J. C., Išgum, Ivana

Automatic Online Quality Control of Synthetic CTs

Accurate MR-to-CT synthesis is a requirement for MR-only workflows in radiotherapy (RT) treatment planning. In recent years, deep learning-based approaches have shown impressive results in this field. However, to prevent downstream errors in RT treatment planning, it is important that deep learning models are only applied to data for which they are trained and that generated synthetic CT (sCT) images do not contain severe errors. For this, a mechanism for online quality control should be in place. In this work, we use an ensemble of sCT generators and assess their disagreement as a measure of uncertainty of the results. We show that this uncertainty measure can be used for two kinds of online quality control. First, to detect input images that are outside the expected distribution of MR images. Second, to identify sCT images that were generated from suitable MR images but potentially contain errors. Such automatic online quality control for sCT generation is likely to become an integral part of MR-only RT workflows.

contrast agent, generator, workflow, (13 more...)

1911.04986

Country: Europe > Netherlands > North Holland > Amsterdam (0.05)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Nuclear Medicine (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Health & Medicine > Therapeutic Area > Oncology (0.70)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Hong, Seung Hwan, Lim, Jaechang, Ryu, Seongok, Kim, Woo Youn

Molecular Generative Model Based On Adversarially Regularized Autoencoder

Deep generative models are attracting great attention as a new promising approach for molecular design. All models reported so far are based on either variational autoencoder (VAE) or generative adversarial network (GAN). Here we propose a new type model based on an adversarially regularized autoencoder (ARAE). It basically uses latent variables like VAE, but the distribution of the latent variables is obtained by adversarial training like in GAN. The latter is intended to avoid both inappropriate approximation of posterior distribution in VAE and difficulty in handling discrete variables in GAN. Our benchmark study showed that ARAE indeed outperformed conventional models in terms of validity, uniqueness, and novelty per generated molecule. We also demonstrated successful conditional generation of drug-like molecules with ARAE for both cases of single and multiple properties control. As a potential real-world application, we could generate EGFR inhibitors sharing the scaffolds of known active molecules while satisfying drug-like conditions simultaneously.

adversarial training, latent vector, molecule, (14 more...)

1912.05617

Country: Asia > South Korea > Daejeon > Daejeon (0.04)

Genre: Research Report > Promising Solution (0.34)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

Zhu, Hufei, Wei, Chenghao

Efficient Ridge Solutions for the Incremental Broad Learning System on Added Inputs by Updating the Inverse or the Inverse Cholesky Factor of the Hermitian matrix in the Ridge Inverse

This brief proposes two BLS algorithms to improve the existing BLS for new added inputs in [7]. The proposed BLS algorithms avoid computing the ridge inverse, by computing the ridge solution (i.e., the output weights) from the inverse or the inverse Cholesky factor of the Hermitian matrix in the ridge inverse. The proposed BLS algorithm 1 updates the inverse of the Hermitian matrix by the matrix inversion lemma [12]. To update the upper-triangular inverse Cholesky factor of the Hermitian matrix, the proposed BLS algorithm 2 multiplies the inverse Cholesky factor with an upper-triangular intermediate matrix, which is computed by a Cholesky factorization or an inverse Cholesky factorization. Assume that the newly added input matrix corresponding to the added inputs is p * k, where p and k are the number of added training samples and the total node number, respectively. When p > k, the inverse of a sum of matrices [11] is utilized to compute the intermediate variables by a smaller matrix inverse in the proposed algorithm 1, or by a smaller inverse Cholesky factorization in the proposed algorithm 2. Usually the Hermitian matrix in the ridge inverse is smaller than the ridge inverse. Thus the proposed algorithms 1 and 2 require less flops (floating-point operations) than the existing BLS algorithm, which is verified by the theoretical flops calculation. In numerical experiments, the speedups for the case of p > k in each additional training time of the proposed BLS algorithms 1 and 2 over the existing algorithm are 1.95 - 5.43 and 2.29 - 6.34, respectively, and the speedups for the case of p < k are 8.83 - 10.21 and 2.28 - 2.58, respectively.

algorithm, algorithm 1, inverse, (15 more...)

1911.07292

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Asia > China > Guangdong Province > Shenzhen (0.04)
North America > United States > Maryland > Baltimore (0.04)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)