AITopics

Summarization of opinions is the process of automatically creating text summaries that reflect subjective information expressed in input documents, such as product reviews. While most previous research in opinion summarization has focused on the extractive setting, i.e. selecting fragments of the input documents to produce a summary, we let the model generate novel sentences and hence produce fluent text. Supervised abstractive summarization methods typically rely on large quantities of document-summary pairs, which are expensive to acquire. In contrast, we consider the unsupervised setting; in other words, we do not use any summaries in training. We define a generative model for a multi-product review collection. Intuitively, we want to design a model such that, when generating a new review given a set of other reviews of the product, we can control the 'amount of novelty' going into the new review or, equivalently, vary the degree of deviation from the input reviews. At test time, when generating summaries, we force the novelty to be minimal and produce a text reflecting consensus opinions. We capture this intuition by defining a hierarchical variational autoencoder model. Both individual reviews and the products they correspond to are associated with stochastic latent codes, and the review generator ('decoder') has direct access to the text of input reviews through the pointer-generator mechanism. In experiments on Amazon and Yelp data, we show that setting the review's latent code to its mean at test time produces fluent and coherent summaries.

information, proceedings, summarization, (15 more...)

1911.02247

Country:

North America > United States (0.28)
North America > Canada > Alberta (0.14)
North America > Canada > Ontario > Toronto (0.04)
(2 more...)

Genre: Research Report (0.50)

Industry:

Health & Medicine (1.00)
Consumer Products & Services > Restaurants (1.00)
Consumer Products & Services > Food, Beverage, Tobacco & Cannabis (0.93)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Braverman, Mark, Hazan, Elad, Simchowitz, Max, Woodworth, Blake

The gradient complexity of linear regression

We investigate the computational complexity of several basic linear algebra primitives, including largest eigenvector computation and linear regression, in the computational model that allows access to the data via a matrix-vector product oracle. We show that for polynomial accuracy, $\Theta(d)$ calls to the oracle are necessary and sufficient even for a randomized algorithm. Our lower bound is based on a reduction to estimating the least eigenvalue of a random Wishart matrix. This simple distribution enables a concise proof, leveraging a few key properties of the random Wishart ensemble.
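To make the query model concrete, the following is a minimal sketch of an algorithm that touches the data only through a matrix-vector product oracle. It uses plain power iteration, which is a familiar method in this access model and is not claimed to be the algorithm analysed in the paper. The class `MatVecOracle`, the function `power_iteration`, and the small diagonal test matrix are all hypothetical names and values chosen for illustration.

```python
import math
import random


class MatVecOracle:
    """Wraps a symmetric matrix so callers can only request products A v."""

    def __init__(self, rows):
        self._rows = rows   # the algorithm never reads this directly
        self.calls = 0      # number of oracle queries made so far

    def __call__(self, v):
        self.calls += 1
        return [sum(a * x for a, x in zip(row, v)) for row in self._rows]


def power_iteration(oracle, dim, num_queries, seed=0):
    """Estimate the largest eigenvalue and its eigenvector via oracle queries."""
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    eigenvalue = 0.0
    for _ in range(num_queries):
        w = oracle(v)                                  # exactly one query
        eigenvalue = sum(a * b for a, b in zip(v, w))  # Rayleigh quotient v^T A v
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return eigenvalue, v


# Illustrative 3 x 3 matrix with eigenvalues 2, 1 and 0.5 (an assumption).
A = [[2.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 0.5]]
oracle = MatVecOracle(A)
top_eigenvalue, top_vector = power_iteration(oracle, dim=3, num_queries=50)
```

The counter `oracle.calls` is the quantity that a query-complexity bound such as the $\Theta(d)$ result above refers to: the cost of an algorithm is measured in oracle calls, not in arithmetic operations.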

algorithm, matrix, query, (14 more...)

1911.02212

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.60)

Machine Learning using the Variational Predictive Information Bottleneck with a Validation Set

Mukherjee, Sayandev

Zellner (1988) modeled statistical inference in terms of information processing and postulated the Information Conservation Principle (ICP) between the input and output of the information processing block, showing that this yielded Bayesian inference as the optimum information processing rule. Recently, Alemi (2019) reviewed Zellner's work in the context of machine learning and showed that the ICP could be seen as a special case of a more general optimum information processing criterion, namely the Predictive Information Bottleneck Objective. However, Alemi modeled the model training step in machine learning as using training and test data sets only, and did not account for the use of a validation data set during training. The present note is an attempt to extend Alemi's information processing formulation of machine learning, and the predictive information bottleneck objective for model training, to the widely-used scenario where training utilizes not only a training but also a validation data set.

information, mutual information, validation, (7 more...)

1911.0221

Country:

North America > United States > California > Santa Clara County > Sunnyvale (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)

Odetola, Tolulope A., Oderhohwo, Ogheneuriri, Hasan, Syed Rafay

A Scalable Multilabel Classification to Deploy Deep Learning Architectures For Edge Devices

Convolutional Neural Networks (CNNs) have performed well in many applications such as object detection, pattern recognition, and video surveillance. CNNs carry out feature extraction on labelled data to perform classification. Multi-label classification assigns more than one label to a particular data sample in a data set. In multi-label classification, properties of a data point that are considered to be mutually exclusive are classified. However, existing multi-label classification requires some form of data pre-processing that involves cropping or tiling the image training data. The computation and memory requirements of these multi-label CNN models make their deployment on edge devices challenging. In this paper, we propose a methodology that solves this problem by extending the capability of existing multi-label classification and provides models with lower latency that require less memory when deployed on edge devices. We make use of a single CNN model designed with multiple loss layers and multiple accuracy layers. This methodology is tested on state-of-the-art deep learning algorithms such as AlexNet, GoogleNet and SqueezeNet using the Stanford Cars Dataset and deployed on a Raspberry Pi 3. The results show that the proposed methodology achieves comparable accuracy with 1.8x fewer MACC operations, a 0.97x reduction in latency, and 0.5x, 0.84x and 0.97x reductions in size for the generated AlexNet, GoogleNet and SqueezeNet CNN models respectively, when compared to conventional ways of achieving multi-label classification such as hard-coding multi-label instances into single labels. The methodology also yields CNN models with 50% fewer MACC operations and a 50% reduction in latency and size for the generated versions of AlexNet, GoogleNet and SqueezeNet, when compared to the conventional approach of using two different single-labelled models to achieve multi-label classification.
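The contrast between one model with several loss layers and hard-coding label combinations into single labels can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes each label category gets its own softmax head, that the per-head cross-entropy losses are simply summed, and that the two categories have 3 and 2 classes. The function names and the example logits are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Negative log-probability assigned to the target class."""
    return -math.log(softmax(logits)[target])


def multi_head_loss(head_logits, targets):
    """Sum of one cross-entropy term per label category (one 'loss layer' each)."""
    return sum(cross_entropy(l, t) for l, t in zip(head_logits, targets))


def output_units(label_set_sizes, hard_coded):
    """Output units needed: product of sizes if label combinations are
    hard-coded into single labels, sum of sizes with one head per category."""
    if hard_coded:
        return math.prod(label_set_sizes)
    return sum(label_set_sizes)


# Hypothetical example: two label categories with 3 and 2 classes.
head_logits = [[2.0, 0.0, 0.0], [0.0, 1.0]]
targets = [0, 1]
loss = multi_head_loss(head_logits, targets)
combined_units = output_units([3, 2], hard_coded=True)    # 3 * 2 = 6
per_head_units = output_units([3, 2], hard_coded=False)   # 3 + 2 = 5
```

With more categories or more classes per category, the hard-coded output layer grows multiplicatively while the multi-head layer grows additively, which is one plausible reason a shared model with several heads can be smaller.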

classification, label category, multi-label classification, (15 more...)

1911.02098

Country:

North America > United States > Tennessee > Putnam County > Cookeville (0.05)
North America > United States > Nevada > Clark County > Las Vegas (0.04)
Africa > Middle East > Morocco (0.04)
Africa > Middle East > Egypt (0.04)

Genre: Research Report (0.64)

Industry:

Energy (0.47)
Commercial Services & Supplies (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Pavlov, Sergey, Artemov, Alexey, Sharaev, Maksim, Bernstein, Alexander, Burnaev, Evgeny

Weakly Supervised Fine Tuning Approach for Brain Tumor Segmentation Problem

Segmentation of tumors in brain MRI images is a challenging task, where most recent methods demand large volumes of data with pixel-level annotations, which are generally costly to obtain. In contrast, image-level annotations, where only the presence of a lesion is marked, are generally cheap, generated in far larger volumes compared to pixel-level labels, and contain less labeling noise. In the context of brain tumor segmentation, both pixel-level and image-level annotations are commonly available; thus, a natural question arises whether a segmentation procedure could take advantage of both. In the present work we: 1) propose a learning-based framework that allows simultaneous usage of both pixel- and image-level annotations in MRI images to learn a segmentation model for brain tumors; 2) study the influence of the relative amounts of pixel- and image-level annotations on the quality of brain tumor segmentation; 3) compare our approach to the traditional fully-supervised approach and show that the performance of our method in terms of segmentation quality may be competitive.

annotation, segmentation, supervised learning step, (12 more...)

1911.01738

Country:

Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.05)
Asia > Russia (0.05)

Genre:

Research Report (0.67)
Workflow (0.47)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Global Adaptive Generative Adjustment

Wang, Bin, Wang, Xiaofei, Guo, Jianhua

Many traditional signal recovery approaches perform well when based on the penalized likelihood. However, they face the difficulty of selecting the hyperparameters or tuning parameters in the penalties. In this article, we propose a global adaptive generative adjustment (GAGA) algorithm for signal recovery, in which multiple hyperparameters are automatically learned and alternately updated with the signal. We further prove that the output of our algorithm directly guarantees the consistency of model selection and the asymptotic normality of the signal estimate. Moreover, we also propose a variant of the GAGA algorithm that improves computational efficiency in high-dimensional data analysis. Finally, in simulated experiments, we consider the consistency of the outputs of our algorithms, and compare our algorithms to other penalized likelihood methods: the Adaptive LASSO, SCAD and MCP. The simulation results support the efficiency of our algorithms for signal recovery, and demonstrate that our algorithms outperform the other algorithms.

algorithm, global adaptive generative adjustment, selection, (11 more...)

1911.00658

Country: Asia > China > Jilin Province > Changchun (0.14)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.34)

Structure of Deep Neural Networks with a Priori Information in Wireless Tasks

Guo, Jia, Yang, Chenyang

Deep neural networks (DNNs) have been employed for designing wireless networks in many aspects, such as transceiver optimization, resource allocation, and information prediction. Existing works use either fully-connected DNNs or DNNs with specific structures that were designed in other domains. In this paper, we show that the a priori information that widely exists in wireless tasks is permutation invariant. For these tasks, we propose a DNN with a special structure, where the weight matrices between layers of the DNN consist of only two smaller sub-matrices. With this form of parameter sharing, the number of model parameters is reduced, giving rise to low sample and computational complexity for training a DNN. We take predictive resource allocation as an example to show how the designed DNN can be applied for learning the optimal policy with unsupervised learning. Simulation results validate our analysis and show a dramatic gain of the proposed structure in terms of reducing training complexity.

I. INTRODUCTION. Deep neural networks (DNNs) have recently been introduced to design wireless networks in various aspects, including signal detection and channel estimation [1], multi-cell coordinated beamforming [2], inter-cell interference management [3], resource allocation [4]-[7], traffic load prediction [8], and uplink/downlink channel calibration [9], etc.
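A weight matrix built from only two sub-matrices can be sketched as below. This is an illustration under an assumption the abstract does not spell out: that one sub-matrix `U` fills every diagonal block and the other sub-matrix `V` fills every off-diagonal block, so that each item is transformed by `U` and the sum of all other items by `V`. The helper names and the numeric values are hypothetical.

```python
def matvec(matrix, vector):
    """Plain matrix-vector product on nested lists."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def shared_layer(items, U, V):
    """Apply U to each item and V to the sum of all other items.

    Equivalent to multiplying the stacked input by a K x K block matrix
    with U in every diagonal block and V in every off-diagonal block.
    """
    dim = len(items[0])
    total = [sum(x[i] for x in items) for i in range(dim)]
    outputs = []
    for x in items:
        others = [t - xi for t, xi in zip(total, x)]  # sum over the other items
        own = matvec(U, x)
        cross = matvec(V, others)
        outputs.append([a + b for a, b in zip(own, cross)])
    return outputs


def parameter_counts(num_items, d_in, d_out):
    """Weights in an unconstrained dense layer versus the two shared blocks."""
    dense = (num_items * d_out) * (num_items * d_in)
    shared = 2 * d_out * d_in
    return dense, shared


# Hypothetical values: 3 items with 2 features each.
U = [[1.0, 0.5], [0.0, 2.0]]
V = [[0.5, 0.0], [1.0, 0.5]]
items = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
perm = [2, 0, 1]

out = shared_layer(items, U, V)
out_permuted_input = shared_layer([items[p] for p in perm], U, V)
dense_count, shared_count = parameter_counts(num_items=3, d_in=2, d_out=2)
```

Reordering the input items reorders the outputs in the same way, and the number of weights no longer depends on the number of items, which is consistent with the reduction in model parameters described above.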

complexity, dnn, permutation invariant, (11 more...)

1910.13728

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial Intelligence, Nov-6-2019

Probabilistic Similarity Networks

Heckerman, David

Normative expert systems have not become commonplace because they have been difficult to build and use. Over the past decade, however, researchers have developed the influence diagram, a graphical representation of a decision maker's beliefs, alternatives, and preferences that serves as the knowledge base of a normative expert system. Most people who have seen the representation find it intuitive and easy to use. Consequently, the influence diagram has overcome significantly the barriers to constructing normative expert systems. Nevertheless, building influence diagrams is not practical for extremely large and complex domains. In this book, I address the difficulties associated with the construction of the probabilistic portion of an influence diagram, called a knowledge map, belief network, or Bayesian network. I introduce two representations that facilitate the generation of large knowledge maps. In particular, I introduce the similarity network, a tool for building the network structure of a knowledge map, and the partition, a tool for assessing the probabilities associated with a knowledge map. I then use these representations to build Pathfinder, a large normative expert system for the diagnosis of lymph-node diseases (the domain contains over 60 diseases and over 100 disease findings). In an early version of the system, I encoded the knowledge of the expert using an erroneous assumption that all disease findings were independent, given each disease. When the expert and I attempted to build a more accurate knowledge map for the domain that would capture the dependencies among the disease findings, we failed. Using a similarity network, however, we built the knowledge-map structure for the entire domain in approximately 40 hours. Furthermore, the partition representation reduced the number of probability assessments required by the expert from 75,000 to 14,000.

medical computer science group, comprehensive similarity network, ordinary global knowledge map, (12 more...)

1911.06263

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > California > Santa Clara County > Stanford (0.04)
North America > United States > Minnesota (0.04)
(12 more...)

Genre: Research Report (0.81)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Diagnostic Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

arXiv.org Artificial Intelligence, Nov-6-2019

Conversation Generation with Concept Flow

Zhang, Houyu, Liu, Zhenghao, Xiong, Chenyan, Liu, Zhiyuan

Human conversations naturally evolve around related entities and connected concepts, but may also shift from topic to topic. This paper presents ConceptFlow, which leverages commonsense knowledge graphs to explicitly model such conversation flows for better conversation response generation. ConceptFlow grounds the conversation inputs to the latent concept space and represents the potential conversation flow as a concept flow along the commonsense relations. The concept is guided by a graph attention mechanism that models the possibility of the conversation evolving towards different concepts. The conversation response is then decoded using the encodings of both utterance texts and concept flows, integrating the learned conversation structure in the concept space. Our experiments on Reddit conversations demonstrate the advantage of ConceptFlow over previous commonsense-aware dialog models and fine-tuned GPT-2 models, while using far fewer parameters but with explicit modeling of conversation structures. The rapid advancements of language modeling and natural language generation (NLG) techniques have enabled fully data-driven conversation models, which take user inputs (utterances) and directly generate natural language responses (Shang et al., 2015; Vinyals & Le, 2015; Li et al., 2016). On the other hand, current generation models may still degenerate into dull and repetitive content (Holtzman et al., 2019; Welleck et al., 2019), which, in conversation assistants, leads to irrelevant, off-topic, and unhelpful responses that damage the user experience (Tang et al., 2019; Zhang et al., 2018; Gao et al., 2019).

concept flow, conceptflow, representation, (14 more...)

1911.02707

Country: Asia > Myanmar > Tanintharyi Region > Dawei (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.92)

Fang, Zhou, Paliyawan, Pujana, Thawonmas, Ruck, Harada, Tomohiro

Towards An Angry-Birds-like Game System for Promoting Mental Well-being of Players Using Art-Therapy-embedded PCG

arXiv.org Artificial Intelligence, Nov-6-2019

Towards an Angry-Birds-Like Game System for Promoting Mental Well-Being of Players Using Art-Therapy-Embedded Procedural Content Generation

Zhou Fang (1), Pujana Paliyawan (2), Ruck Thawonmas (1) and Tomohiro Harada (1); (1) College of Information Science and Engineering, (2) Research Organization of Science and Technology, Ritsumeikan University, Japan; ruck@is.ritsumei.ac.jp

Abstract: This paper presents an integration of a game system and the art therapy concept for promoting the mental wellbeing of video game players. In the proposed game system, the player plays an Angry-Birds-like game in which levels in the game are generated based on images they draw. Upon finishing a game level, the player also receives positive feedback (praising words) toward their drawing and the generated level from an Art Therapy AI. The proposed system is composed of three major parts: (1) a drawing recognizer that identifies what object is drawn by the player (Sketcher), (2) a level generator that converts the drawing image into a pixel image, then a set of blocks representing a game level (PCG AI), and (3) the Art Therapy AI that encourages the player and improves their emotion. This paper describes an overview of the system and explains how its major components function.

art therapy, game level, therapy, (10 more...)

1911.02695

Country: Asia > Japan (0.26)

Genre: Research Report (0.50)

Industry:

Leisure & Entertainment > Games > Computer Games (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.52)