AITopics

This report on our stage 1 submission to the NeurIPS 2019 disentanglement challenge presents a simple image preprocessing method for training VAEs that leads to improved disentanglement compared to training directly on the images. In particular, we propose to use regionally aggregated feature maps extracted from CNNs pretrained on ImageNet. Our method achieved 2nd place in stage 1 of the challenge (AIcrowd, 2019).

feature map, feature vector, improved disentanglement, (13 more...)

2002.10003

Country: Europe > Germany (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

Boustati, Ayman, Akyildiz, Ömer Deniz, Damoulas, Theodoros, Johansen, Adam

Generalized Bayesian Filtering via Sequential Monte Carlo

We introduce a framework for inference in general state-space hidden Markov models (HMMs) under likelihood misspecification. In particular, we leverage the loss-theoretic perspective of generalized Bayesian inference (GBI) to define generalized filtering recursions in HMMs that can tackle the problem of inference under model misspecification. In doing so, we arrive at principled procedures for robust inference against observation contamination through the $\beta$-divergence. Operationalizing the proposed framework is made possible via sequential Monte Carlo (SMC) methods. The standard particle methods, and their associated convergence results, are readily generalized to the new setting. We demonstrate our approach on object tracking and Gaussian process regression problems, and observe improved performance over standard filtering algorithms.

coverage 0, inference, likelihood, (15 more...)

2002.09998

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.88)

Rapidly Personalizing Mobile Health Treatment Policies with Limited Data

Tomkins, Sabina, Liao, Peng, Klasnja, Predrag, Yeung, Serena, Murphy, Susan

Mobile health (mHealth) interventions deliver treatments to users to support healthy behaviors. These interventions offer an opportunity for social impact in a diverse range of domains, from substance abuse (Rabbi et al., 2017) to disease management (Hamine et al., 2015) to physical inactivity (Consolvo et al., 2008). For example, to help users increase their physical activity, an mHealth application might send walking suggestions at times and in locations where a user is likely to be able to pursue them. The promise of mHealth hinges on the ability to provide interventions at times when users need the support and are receptive to it (Nahum-Shani et al., 2017). Consequently, in developing reinforcement learning (RL) algorithms for mHealth, our goal is to learn an optimal policy of when and how to intervene for a given user and context.

algorithm, decision time, intelligentpooling, (16 more...)

2002.09971

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > Michigan (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.89)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.86)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.69)
Information Technology > Data Science > Data Mining > Big Data (0.47)

Mitigating Class Boundary Label Uncertainty to Reduce Both Model Bias and Variance

Almeida, Matthew, Ding, Wei, Crouter, Scott, Chen, Ping

The study of model bias and variance with respect to decision boundaries is critically important in supervised classification. There is generally a tradeoff between the two, as fine-tuning of the decision boundary of a classification model to accommodate more boundary training samples (i.e., higher model complexity) may improve training accuracy (i.e., lower bias) but hurt generalization against unseen data (i.e., higher variance). By focusing on just classification boundary fine-tuning and model complexity, it is difficult to reduce both bias and variance. To overcome this dilemma, we take a different perspective and investigate a new approach to handle inaccuracy and uncertainty in the training data labels, which are inevitable in many applications where labels are conceptual and labeling is performed by human annotators. The process of classification can be undermined by uncertainty in the labels of the training data; extending a boundary to accommodate an inaccurately labeled point will increase both bias and variance. Our novel method can reduce both bias and variance by estimating the pointwise label uncertainty of the training set and accordingly adjusting the training sample weights such that those samples with high uncertainty are weighted down and those with low uncertainty are weighted up. In this way, uncertain samples have a smaller contribution to the objective function of the model's learning algorithm and exert less pull on the decision boundary. In a real-world physical activity recognition case study, the data presents many labeling challenges, and we show that this new approach improves model performance and reduces model variance.
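The reweighting idea in this abstract can be illustrated with a minimal sketch. The abstract does not say how pointwise label uncertainty is estimated, so the k-nearest-neighbour disagreement rate used here is an assumption, as are the function names, the `floor` parameter, and the toy data.

```python
from math import dist

def knn_label_uncertainty(points, labels, k=3):
    """Assumed estimator: fraction of a point's k nearest neighbours with a different label."""
    uncertainties = []
    for i, p in enumerate(points):
        # Rank all other points by distance to p and keep the k closest.
        neighbours = sorted((j for j in range(len(points)) if j != i),
                            key=lambda j: dist(p, points[j]))[:k]
        disagree = sum(labels[j] != labels[i] for j in neighbours)
        uncertainties.append(disagree / len(neighbours))
    return uncertainties

def uncertainty_to_weights(uncertainties, floor=0.1):
    """High uncertainty -> low weight; weights are rescaled to have mean 1."""
    raw = [max(1.0 - u, floor) for u in uncertainties]
    mean = sum(raw) / len(raw)
    return [r / mean for r in raw]

def weighted_loss(per_sample_losses, weights):
    """Weighted average of per-sample losses, as used in a weighted training objective."""
    return sum(w * l for w, l in zip(weights, per_sample_losses)) / len(weights)

# Toy data: two clusters; the last point sits in cluster 0 but is labelled 1.
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5), (0.5, 0.5)]
labels = [0, 0, 0, 1, 1, 1, 1]
weights = uncertainty_to_weights(knn_label_uncertainty(points, labels))
```

In this toy set the suspicious point receives the smallest weight, so it contributes least to the weighted objective and pulls least on the decision boundary.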

decision boundary, neighborhood, variance, (13 more...)

2002.09963

Country:

North America > United States > Massachusetts > Suffolk County > Boston (0.15)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Tennessee > Knox County > Knoxville (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report > Promising Solution (0.34)

Industry: Health & Medicine > Consumer Health (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Aketi, Sai Aparna, Roy, Sourjya, Raghunathan, Anand, Roy, Kaushik

Gradual Channel Pruning while Training using Feature Relevance Scores for Convolutional Neural Networks

The enormous inference cost of deep neural networks can be scaled down by network compression. Pruning is one of the predominant approaches used for deep network compression. However, existing pruning techniques have one or more of the following limitations: 1) additional energy cost on top of the compute-heavy training stage due to pruning and fine-tuning stages, 2) layer-wise pruning based on the statistics of a particular layer, ignoring the effect of error propagation in the network, 3) lack of an efficient estimate for determining the important channels globally, 4) unstructured pruning that requires specialized hardware for effective use. To address all of the above issues, we present a simple yet effective methodology for gradual channel pruning while training, using a novel data-driven metric referred to as the feature relevance score. The proposed technique eliminates the additional retraining cycles by pruning the least important channels in a structured fashion at fixed intervals during the actual training phase. Feature relevance scores help in efficiently evaluating the contribution of each channel towards the discriminative power of the network. We demonstrate the effectiveness of the proposed methodology on architectures such as VGG and ResNet using datasets such as CIFAR-10, CIFAR-100 and ImageNet, and successfully achieve significant model compression while trading off less than $1\%$ accuracy. Notably, for ResNet-110 trained on the CIFAR-10 dataset, our approach achieves $2.4\times$ compression and a $56\%$ reduction in FLOPs with an accuracy drop of $0.01\%$ compared to the unpruned network.
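The schedule described in this abstract (pruning the least important channels at fixed intervals during training, ranked globally) can be sketched as follows. This is only an illustration of the schedule: the scores are supplied by a placeholder `score_fn`, which is hypothetical and does not compute the paper's feature relevance score, and the training step itself is omitted.

```python
def prune_step(scores, active, n_prune):
    """Drop the n_prune active channels with the lowest score, ranked across all layers."""
    ranked = sorted(active, key=lambda ch: scores[ch])
    return active - set(ranked[:n_prune])

def train_with_gradual_pruning(num_epochs, interval, n_prune, score_fn, channels):
    """Interleave (omitted) training epochs with pruning every `interval` epochs."""
    active = set(channels)
    history = []
    for epoch in range(1, num_epochs + 1):
        # A real training epoch over the currently active channels would run here.
        if epoch % interval == 0 and len(active) > n_prune:
            active = prune_step(score_fn(active), active, n_prune)
        history.append(len(active))
    return active, history

# Toy setup: channels are (layer, index) pairs with made-up scores.
toy_scores = {(0, 0): 0.9, (0, 1): 0.1, (0, 2): 0.5, (0, 3): 0.3,
              (1, 0): 0.2, (1, 1): 0.8, (1, 2): 0.4, (1, 3): 0.7}
remaining, history = train_with_gradual_pruning(
    num_epochs=6, interval=2, n_prune=2,
    score_fn=lambda active: toy_scores, channels=toy_scores.keys())
```

Because pruning happens inside the training loop, no separate fine-tuning phase appears in the sketch; whole channels are removed, which is what makes the pruning structured.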

feature relevance score, pruning, relevance score, (13 more...)

2002.09958

Country:

North America > United States > Indiana > Tippecanoe County > West Lafayette (0.04)
North America > United States > Indiana > Tippecanoe County > Lafayette (0.04)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Banerjee, Arindam, Chen, Tiancong, Zhou, Yingxue

De-randomized PAC-Bayes Margin Bounds: Applications to Non-convex and Non-smooth Predictors

In spite of several notable efforts, explaining the generalization of deterministic deep nets, e.g., ReLU-nets, has remained challenging. Existing approaches usually need to bound the Lipschitz constant of such deep nets, but such bounds have been shown to increase substantially with the number of training samples, yielding vacuous generalization bounds [Nagarajan and Kolter, 2019a]. In this paper, we present new de-randomized PAC-Bayes margin bounds for deterministic non-convex and non-smooth predictors, e.g., ReLU-nets. The bounds depend on a trade-off between the $L_2$-norm of the weights and the effective curvature ('flatness') of the predictor, avoid any dependency on the Lipschitz constant, and yield meaningful (decreasing) bounds as the training set size increases. Our analysis first develops a de-randomization argument for non-convex but smooth predictors, e.g., linear deep networks (LDNs). We then consider non-smooth predictors which for any given input are realized as a smooth predictor, e.g., ReLU-nets become some LDN for a given input, but the realized smooth predictor can be different for different inputs. For such non-smooth predictors, we introduce a new PAC-Bayes analysis that maintains distributions over the structure as well as the parameters of smooth predictors, e.g., LDNs corresponding to ReLU-nets, which after de-randomization yields a bound for the deterministic non-smooth predictor. We present empirical results to illustrate the efficacy of our bounds over changing training set size and randomness in labels.

assumption 1, generalization, predictor, (15 more...)

2002.09956

Country: North America > United States > Minnesota (0.04)

Genre: Research Report (0.49)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Óskarsdóttir, María, Bravo, Cristián, Sarraute, Carlos, Vanthienen, Jan, Baesens, Bart

The Value of Big Data for Credit Scoring: Enhancing Financial Inclusion using Mobile Phone Data and Social Network Analytics

Credit scoring is without a doubt one of the oldest applications of analytics. In recent years, a multitude of sophisticated classification techniques have been developed to improve the statistical performance of credit scoring models. Instead of focusing on the techniques themselves, this paper leverages alternative data sources to enhance both statistical and economic model performance. The study demonstrates how including call networks, in the context of positive credit information, as a new Big Data source adds value in terms of profit by applying a profit measure and profit-based feature selection. A unique combination of datasets, including call-detail records and credit and debit account information of customers, is used to create scorecards for credit card applicants. Call-detail records are used to build call networks, and advanced social network analytics techniques are applied to propagate influence from prior defaulters throughout the network to produce influence scores. The results show that combining call-detail records with traditional data in credit scoring models significantly increases their performance when measured in AUC. In terms of profit, the best model is the one built with only calling behavior features. In addition, the calling behavior features are the most predictive in other models, both in terms of statistical and economic performance. The results have an impact in terms of ethical use of call-detail records, regulatory implications, financial inclusion, as well as data sharing and privacy.

customer, delinquent customer, information, (16 more...)

doi: 10.1016/j.asoc.2018.10.004

2002.09931

Country:

Asia > China (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Switzerland > Basel-City > Basel (0.05)
(6 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology (1.00)
Banking & Finance > Credit (1.00)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Communications > Mobile (1.00)
(3 more...)

Wiggers, Auke J., Hoogeboom, Emiel

Predictive Sampling with Forecasting Autoregressive Models

Autoregressive models (ARMs) currently hold state-of-the-art performance in likelihood-based modeling of image and audio data. Generally, neural network based ARMs are designed to allow fast inference, but sampling from these models is impractically slow. In this paper, we introduce the predictive sampling algorithm: a procedure that exploits the fast inference property of ARMs in order to speed up sampling, while keeping the model intact. We propose two variations of predictive sampling, namely sampling with ARM fixed-point iteration and learned forecasting modules. Their effectiveness is demonstrated in two settings: i) explicit likelihood modeling on binary MNIST, SVHN and CIFAR10, and ii) discrete latent modeling in an autoencoder trained on SVHN, CIFAR10 and Imagenet32. Empirically, we show considerable improvements over baselines in number of ARM inference calls and sampling speed.
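The fixed-point variant mentioned in this abstract can be illustrated with a simplified sketch. It assumes sampling is reparametrized with fixed Gumbel noise, so that each position is a deterministic function of its prefix. `toy_logits` is a hypothetical stand-in for a trained ARM, and the sizes `K` and `D` are arbitrary toy values. Each pass of `fixed_point_sample` recomputes all positions from the current guess, which corresponds to one parallel ARM inference call.

```python
import math
import random

K = 4  # categories per position (toy value)
D = 8  # sequence length (toy value)

def toy_logits(prefix, i):
    """Hypothetical stand-in for an ARM: logits for position i given x[:i]."""
    s = sum((j + 1) * v for j, v in enumerate(prefix))
    return [math.sin(0.7 * s + k + i) for k in range(K)]

def sample_noise(rng):
    """Draw Gumbel noise once, so sampling is deterministic given the noise."""
    return [[-math.log(-math.log(rng.uniform(1e-12, 1 - 1e-12))) for _ in range(K)]
            for _ in range(D)]

def step(x, i, noise):
    """Gumbel-max sample of position i conditioned on x[:i]."""
    logits = toy_logits(x[:i], i)
    return max(range(K), key=lambda k: logits[k] + noise[i][k])

def ancestral_sample(noise):
    """Baseline: one sequential model call per position."""
    x = []
    for i in range(D):
        x.append(step(x, i, noise))
    return x

def fixed_point_sample(noise):
    """Iterate x <- g(x) over all positions at once until nothing changes."""
    x, calls = [0] * D, 0
    while True:
        new_x = [step(x, i, noise) for i in range(D)]  # one parallel pass
        calls += 1
        if new_x == x:
            return x, calls
        x = new_x

noise = sample_noise(random.Random(0))
sample, calls = fixed_point_sample(noise)
```

After t passes the first t positions are guaranteed correct, so the loop stops within D + 1 passes and returns the same sample as ancestral sampling; any saving comes from runs where later positions happen to be right early.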

forecasting mistake, forecasting module, iteration, (13 more...)

2002.09928

Country: Europe > Netherlands > North Holland > Amsterdam (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

ChemGrapher: Optical Graph Recognition of Chemical Compounds by Deep Learning

Oldenhof, Martijn, Arany, Adam, Moreau, Yves, Simm, Jaak

In drug discovery, knowledge of the graph structure of chemical compounds is essential. Many thousands of scientific articles in chemistry and pharmaceutical sciences have investigated chemical compounds, but in some cases the details of the structure of these chemical compounds are published only as images. A tool to analyze these images automatically and convert them into a chemical graph structure would be useful for many applications, such as drug discovery. A few such tools are available, and they are mostly derived from optical character recognition. However, our evaluation of the performance of those tools reveals that they often make mistakes in detecting the correct bond multiplicity and stereochemical information. In addition, errors sometimes even lead to missing atoms in the resulting graph. In our work, we address these issues by developing a compound recognition method based on machine learning. More specifically, we develop a deep neural network model for optical compound recognition. The deep learning solution presented here consists of a segmentation model, followed by three classification models that predict atom locations, bonds and charges. Furthermore, this model not only predicts the graph structure of the molecule but also produces all information necessary to relate each component of the resulting graph to the source image. This solution is scalable and could rapidly process thousands of images. Finally, we empirically compare the proposed method to a well-established tool and observe significant error reductions.

chemical structure, classification network, segmentation network, (13 more...)

2002.09914

Country:

Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.05)
North America > United States > New York > New York County > New York City (0.04)
Europe > Germany > Baden-Württemberg > Freiburg (0.04)

Genre: Research Report (0.83)

Industry:

Materials > Chemicals (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

End-To-End Graph-based Deep Semi-Supervised Learning

Wang, Zihao, Tu, Enmei, Meng, Zhou

The quality of a graph is determined jointly by three key factors of the graph: nodes, edges and similarity measure (or edge weights), and is crucial to the success of graph-based semi-supervised learning (SSL) approaches. Recently, dynamic graphs, in which some or all of these factors are updated during the training process, have been shown to be promising for graph-based semi-supervised learning. However, existing approaches update only some of the three factors and keep the rest manually specified during the learning stage. In this paper, we propose a novel graph-based semi-supervised learning approach that optimizes all three factors simultaneously in an end-to-end learning fashion. To this end, we concatenate two neural networks (a feature network and a similarity network) to learn the categorical label and semantic similarity, respectively, and train the networks to minimize a unified SSL objective function. We also introduce an extended graph Laplacian regularization term to increase training efficiency. Extensive experiments on several benchmark datasets demonstrate the effectiveness of our approach.
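For orientation, the standard graph Laplacian regularizer that such methods build on can be sketched as below. This is the textbook term, not the extended version proposed in the paper. The names `W` (symmetric edge weights) and `F` (per-node predictions) and the toy values are assumptions. The sketch checks numerically that the pairwise form, half the sum of W_ij * ||f_i - f_j||^2, equals the trace form tr(F^T L F) with L = D - W.

```python
def laplacian(W):
    """Unnormalized graph Laplacian L = D - W, with D the diagonal degree matrix."""
    n = len(W)
    return [[(sum(W[i]) if i == j else 0.0) - W[i][j] for j in range(n)]
            for i in range(n)]

def pairwise_penalty(W, F):
    """0.5 * sum_ij W_ij * ||f_i - f_j||^2: penalizes differing outputs on similar nodes."""
    n = len(W)
    total = 0.0
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(F[i], F[j]))
            total += W[i][j] * sq_dist
    return 0.5 * total

def trace_penalty(W, F):
    """Equivalent matrix form tr(F^T L F)."""
    L = laplacian(W)
    n, d = len(F), len(F[0])
    return sum(F[i][c] * L[i][j] * F[j][c]
               for i in range(n) for j in range(n) for c in range(d))

# Toy graph: node 1 is weakly tied to node 0 and strongly tied to node 2.
W = [[0.0, 1.0, 0.0],
     [1.0, 0.0, 2.0],
     [0.0, 2.0, 0.0]]
F = [[1.0, 0.0],
     [0.0, 1.0],
     [0.0, 1.0]]
penalty = pairwise_penalty(W, F)
```

In the toy graph only the weak edge between nodes 0 and 1 connects differing predictions, so it alone contributes to the penalty; the strongly connected nodes 1 and 2 agree and add nothing.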

graph, learning, similarity, (13 more...)

2002.09891

Country:

Asia > China > Shanghai > Shanghai (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)

Genre: Research Report (0.50)

Industry: Information Technology (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)