AITopics

2007.06776

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(4 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Achddou, Raphaël, di Martino, J. Matias, Sapiro, Guillermo

Nested Learning For Multi-Granular Tasks

arXiv.org Machine LearningJul-13-2020

Standard deep neural networks (DNNs) are commonly trained in an end-to-end fashion for specific tasks such as object recognition, face identification, or character recognition, among many examples. This specificity often leads to overconfident models that generalize poorly to samples that are not from the original training distribution. Moreover, such standard DNNs do not allow to leverage information from heterogeneously annotated training data, where for example, labels may be provided with different levels of granularity. Furthermore, DNNs do not produce results with simultaneous different levels of confidence for different levels of detail, they are most commonly an all or nothing approach. To address these challenges, we introduce the concept of nested learning: how to obtain a hierarchical representation of the input such that a coarse label can be extracted first, and sequentially refine this representation, if the sample permits, to obtain successively refined predictions, all of them with the corresponding confidence. We explicitly enforce this behavior by creating a sequence of nested information bottlenecks. Looking at the problem of nested learning from an information theory perspective, we design a network topology with two important properties. First, a sequence of low dimensional (nested) feature embeddings are enforced. Then we show how the explicit combination of nested outputs can improve both the robustness and the accuracy of finer predictions. Experimental results on Cifar-10, Cifar-100, MNIST, Fashion-MNIST, Dbpedia, and Plantvillage demonstrate that nested learning outperforms the same network trained in the standard end-to-end fashion.

artificial intelligence, machine learning, prediction, (17 more...)

2007.06402

Country:

North America > Canada > Ontario > Toronto (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)
(2 more...)

Genre: Research Report > New Finding (0.67)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Kim, Byung-Hak, Sridharan, Seshadri, Atwal, Andy, Ganapathi, Varun

Deep Claim: Payer Response Prediction from Claims Data with Deep Learning

arXiv.org Machine LearningJul-13-2020

Each year, almost 10% of claims are denied by payers (i.e., health insurance plans). With the cost to recover these denials and underpayments, predicting payer response (likelihood of payment) from claims data with a high degree of accuracy and precision is anticipated to improve healthcare staffs' performance productivity and drive better patient financial experience and satisfaction in the revenue cycle (Barkholz, 2017). However, constructing advanced predictive analytics models has been considered challenging in the last twenty years. That said, we propose a (low-level) context-dependent compact representation of patients' historical claim records by effectively learning complicated dependencies in the (high-level) claim inputs. Built on this new latent representation, we demonstrate that a deep learning-based framework, Deep Claim, can accurately predict various responses from multiple payers using 2,905,026 de-identified claims data from two US health systems. Deep Claim's improvements over carefully chosen baselines in predicting claim denials are most pronounced as 22.21% relative recall gain (at 95% precision) on Health System A, which implies Deep Claim can find 22.21% more denials than the best baseline system.

artificial intelligence, data mining, machine learning, (14 more...)

2007.06229

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Austria > Vienna (0.14)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
(11 more...)

Genre:

Research Report > New Finding (0.46)
Research Report > Experimental Study (0.34)

Industry:

Health & Medicine > Health Care Providers & Services (1.00)
Banking & Finance > Insurance (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Machine LearningJul-10-2020

Population-Based Black-Box Optimization for Biological Sequence Design

Angermueller, Christof, Belanger, David, Gane, Andreea, Mariet, Zelda, Dohan, David, Murphy, Kevin, Colwell, Lucy, Sculley, D

The use of black-box optimization for the design of new biological sequences is an emerging research area with potentially revolutionary impact. The cost and latency of wet-lab experiments requires methods that find good sequences in few experimental rounds of large batches of sequences--a setting that off-the-shelf black-box optimization methods are ill-equipped to handle. We find that the performance of existing methods varies drastically across optimization tasks, posing a significant obstacle to real-world applications. To improve robustness, we propose Population-Based Black-Box Optimization (P3BO), which generates batches of sequences by sampling from an ensemble of methods. The number of sequences sampled from any method is proportional to the quality of sequences it previously proposed, allowing P3BO to combine the strengths of individual methods while hedging against their innate brittleness. Adapting the hyper-parameters of each of the methods online using evolutionary optimization further improves performance. Through extensive experiments on in-silico optimization tasks, we show that P3BO outperforms any single method in its population, proposing higher quality sequences as well as more diverse batches. As such, P3BO and Adaptive-P3BO are a crucial step towards deploying ML to real-world sequence design.

artificial intelligence, evolutionary algorithm, machine learning, (15 more...)

doi: 10.1111/j.1365-246X.2006.03227.x

2006.03227

Country:

Europe > Austria > Vienna (0.14)
North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre: Research Report (0.82)

Industry:

Transportation > Air (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

arXiv.org Machine LearningJul-7-2020

README: REpresentation learning by fairness-Aware Disentangling MEthod

Park, Sungho, Kim, Dohyung, Hwang, Sunhee, Byun, Hyeran

Fair representation learning aims to encode invariant representation with respect to the protected attribute, such as gender or age. In this paper, we design Fairness-aware Disentangling Variational AutoEncoder (FD-VAE) for fair representation learning. This network disentangles latent space into three subspaces with a decorrelation loss that encourages each subspace to contain independent information: 1) target attribute information, 2) protected attribute information, 3) mutual attribute information. After the representation learning, this disentangled representation is leveraged for fairer downstream classification by excluding the subspace with the protected attribute information. We demonstrate the effectiveness of our model through extensive experiments on CelebA and UTK Face datasets. Our method outperforms the previous state-of-the-art method by large margins in terms of equal opportunity and equalized odds.

artificial intelligence, information, machine learning, (14 more...)

2007.03775

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
North America > Puerto Rico > San Juan > San Juan (0.04)
(3 more...)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Vision (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.69)

arXiv.org Artificial IntelligenceJul-6-2020

Exploring Heterogeneous Information Networks via Pre-Training

Fang, Yang, Zhao, Xiang, Xiao, Weidong

To explore heterogeneous information networks (HINs), network representation learning (NRL) is proposed, which represents a network in a low-dimension space. Recently, graph neural networks (GNNs) have drawn a lot of attention which are very expressive for mining a HIN, while they suffer from low efficiency issue. In this paper, we propose a pre-training and fine-tuning framework PF-HIN to capture the features of a HIN. Unlike traditional GNNs that have to train the whole model for each downstream task, PF-HIN only needs to fine-tune the model using the pre-trained parameters and minimal extra task-specific parameters, thus improving the model efficiency and effectiveness. Specifically, in pre-training phase, we first use a ranking-based BFS strategy to form the input node sequence. Then inspired by BERT, we adopt deep bi-directional transformer encoders to train the model, which is a variant of GNN aggregator that is more powerful than traditional deep neural networks like CNN and LSTM. The model is pre-trained based on two tasks, i.e., masked node modeling (MNM) and adjacent node prediction (ANP). Additionally, we leverage factorized embedding parameterization and cross-layer parameter sharing to reduce the parameters. In fine-tuning stage, we choose four benchmark downstream tasks, i.e., link prediction, similarity search, node classification and node clustering. We use node sequence pairs as input for link prediction and similarity search, and a single node sequence as input for node classification and clustering. The experimental results of the above tasks on four real-world datasets verify the advancement of PF-HIN, as it outperforms state-of-the-art alternatives consistently and significantly.

artificial intelligence, machine learning, node, (17 more...)

2007.03184

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.28)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > New York > New York County > New York City (0.14)
(25 more...)

Genre: Research Report > Experimental Study (0.67)

Industry: Leisure & Entertainment (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial IntelligenceJul-3-2020

Transfer Learning for EEG-Based Brain-Computer Interfaces: A Review of Progress Made Since 2016

Wu, Dongrui, Xu, Yifan, Lu, Bao-Liang

A brain-computer interface (BCI) enables a user to communicate with a computer directly using brain signals. The most common non-invasive BCI modality, electroencephalogram (EEG), is sensitive to noise/artifact and suffers between-subject/within-subject non-stationarity. Therefore, it is difficult to build a generic pattern recognition model in an EEG-based BCI system that is optimal for different subjects, during different sessions, for different devices and tasks. Usually, a calibration session is needed to collect some training data for a new subject, which is time-consuming and user unfriendly. Transfer learning (TL), which utilizes data or knowledge from similar or relevant subjects/sessions/devices/tasks to facilitate learning for a new subject/session/device/task, is frequently used to reduce the amount of calibration effort. This paper reviews journal publications on TL approaches in EEG-based BCIs in the last few years, i.e., since 2016. Six paradigms and applications -- motor imagery, event-related potentials, steady-state visual evoked potentials, affective BCIs, regression problems, and adversarial attacks -- are considered. For each paradigm/application, we group the TL approaches into cross-subject/session, cross-device, and cross-task settings and review them separately. Observations and conclusions are made at the end of the paper, which may point to future research directions.

artificial intelligence, classification, machine learning, (17 more...)

2004.06286

Country:

North America > United States > New York > New York County > New York City (0.14)
Asia > China > Shanghai > Shanghai (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
(19 more...)

Genre:

Research Report (1.00)
Overview (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Neuroscience (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.71)
(2 more...)

arXiv.org Machine LearningJun-30-2020

A Tutorial on VAEs: From Bayes' Rule to Lossless Compression

Yu, Ronald

The Variational Auto-Encoder (VAE) belongs to a class of models, which we will refer to as deep maximum likelihood models, that uses a deep neural network to learn a maximum likelihood model for some input data. They are perhaps the most simple and efficient deep maximum likelihood model available, and have thus gained popularity in representation learning and generative image modeling. Unfortunately, in my opinion, in some circles the term "VAE" has become somewhat synonymous with "an auto-encoder with stochastic regularization that generates useful or beautiful samples", which has led to various misconceptions about VAEs. In this tutorial, we will return to the probabilistic and information theoretic roots of VAEs, clarify common misconceptions about VAEs, and look at a toy example on 2D data that will illustrate the capabilities and limitations of VAEs. In Section 2, we will give an overview of what is a maximum likelihood model and what a VAE looks like.

artificial intelligence, machine learning, vae, (15 more...)

2006.10273

Country:

Europe > France (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)

Genre: Overview (1.00)

Industry: Education (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

arXiv.org Machine LearningJun-29-2020

Deep Learning as a Competitive Feature-Free Approach for Automated Algorithm Selection on the Traveling Salesperson Problem

Seiler, Moritz, Pohl, Janina, Bossek, Jakob, Kerschke, Pascal, Trautmann, Heike

The Traveling Salesperson Problem (TSP) is a classical N P-hard optimization problem of utmost relevance, e.g., in transportation logistics, bioinformatics or circuit board fabrication. The goal is to route a salesperson through a set of cities such that each city is visited exactly once and the tour is of minimal length. In the past decades tremendous progress has been made in the development of high-performing heuristic TSP solvers. The local search-based Lin-Kernigham Heuristic (LKH) [14] and the genetic algorithm Edge-Assembly-Crossover (EAX) [35], along with their respective restart versions introduced in Kotthoff et al. [25], undeniably pose the state-of-the-art in inexact TSP solving. Automated Algorithm Selection (AS), originally proposed by Rice [39] back in 1976, is a powerful framework to predict the best-performing solver(s) from a portfolio of candidate solvers by means of machine learning. It has been successfully applied to a wide spectrum of challenging optimization problems in both the combinatorial [24, 29, 30, 40, 48] and continuous domain [21, 4] with partly astonishing performance gains - see the recent survey by Kerschke et al. [19] for a comprehensive overview. In particular, the TSP was subject to several successful ASstudies [25, 20, 33, 34, 37] which exploited the complementary performance profiles of simple heuristics on the one hand and the state-of-the-art solvers LKH and EAX on classical TSP benchmark sets on the other hand.

artificial intelligence, machine learning, selection, (15 more...)

2006.15968

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Germany > North Rhine-Westphalia > Münster Region > Münster (0.05)
Europe > Italy (0.04)
(10 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

O'Shaughnessy, Matthew, Canal, Gregory, Connor, Marissa, Davenport, Mark, Rozell, Christopher

Generative causal explanations of black-box classifiers

arXiv.org Artificial IntelligenceJun-24-2020

We develop a method for generating causal post-hoc explanations of black-box classifiers based on a learned low-dimensional representation of the data. The explanation is causal in the sense that changing learned latent factors produces a change in the classifier output statistics. To construct these explanations, we design a learning framework that leverages a generative model and information-theoretic measures of causal influence. Our objective function encourages both the generative model to faithfully represent the data distribution and the latent factors to have a large causal influence on the classifier output. Our method learns both global and local explanations, is compatible with any classifier that admits class probabilities and a gradient, and does not require labeled attributes or knowledge of causal structure. Using carefully controlled test cases, we provide intuition that illuminates the function of our causal objective. We then demonstrate the practical utility of our method on image recognition tasks.

explanation, machine learning, natural language, (17 more...)

2006.13913

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > California > Los Angeles County > Long Beach (0.04)
(14 more...)

Genre: Research Report (0.63)

Industry: Transportation > Air (0.61)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)