AITopics

We propose a novel graph-driven generative model that unifies multiple heterogeneous learning tasks into the same framework. The proposed model is based on the fact that heterogeneous learning tasks, which correspond to different generative processes, often rely on data with a shared graph structure. Accordingly, our model combines a graph convolutional network (GCN) with multiple variational autoencoders, thus embedding the nodes of the graph (i.e., samples for the tasks) in a uniform manner while specializing their organization and usage to different tasks. With a focus on healthcare applications (tasks), including clinical topic modeling, procedure recommendation, and admission-type prediction, we demonstrate that our method successfully leverages information across different tasks, boosting performance in all tasks and outperforming existing state-of-the-art approaches.

admission, icd code, procedure recommendation, (14 more...)

1911.08709

Country: Asia > Middle East > Jordan (0.05)

Genre: Research Report (0.84)

Industry: Health & Medicine > Therapeutic Area (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

DLBricks: Composable Benchmark Generation to Reduce Deep Learning Benchmarking Effort on CPUs

Li, Cheng, Dakkak, Abdul, Xiong, Jinjun, Hwu, Wen-mei

The past few years have seen a surge in the application of Deep Learning (DL) models to a wide array of tasks such as image classification, object detection, and machine translation. While DL models provide an opportunity to solve otherwise intractable tasks, their adoption relies on them being optimized to meet latency and resource requirements. Benchmarking is a key step in this process but has been hampered in part by the lack of representative and up-to-date benchmarking suites. This is exacerbated by the fast-evolving pace of DL models. This paper proposes DLBricks, a composable benchmark generation design that reduces the effort of developing, maintaining, and running DL benchmarks on CPUs. DLBricks decomposes DL models into a set of unique runnable networks and reconstructs the original model's performance from the performance of the generated benchmarks. DLBricks leverages two key observations: DL layers are the performance building blocks of DL models, and layers are extensively repeated within and across DL models. Since benchmarks are generated automatically and the benchmarking time is minimized, DLBricks can keep up to date with the latest proposed models, relieving the pressure of selecting representative DL models. Moreover, DLBricks allows users to represent proprietary models within benchmark suites. We evaluate DLBricks using 50 MXNet models spanning 5 DL tasks on 4 representative CPU systems. We show that DLBricks provides an accurate performance estimate for the DL models and reduces the benchmarking time across systems (e.g., within 95% accuracy and up to a 4.4× benchmarking time speedup on Amazon EC2 c5.xlarge).
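The composition idea in this abstract can be illustrated with a small sketch. It is not the DLBricks implementation: it assumes, purely for illustration, that a model is a flat sequence of layers, that each distinct layer is benchmarked once, and that layer latencies add up sequentially. The function `estimate_model_latency`, the layer tuples, and the latency table are all hypothetical.

```python
from collections import Counter


def estimate_model_latency(layers, benchmark_layer):
    """Estimate a model's latency from benchmarks of its unique layers.

    layers: sequence of hashable layer descriptions (hypothetical format).
    benchmark_layer: callable returning the measured latency of one layer.
    Returns (estimated_latency, number_of_benchmarks_run).
    """
    counts = Counter(layers)  # how often each distinct layer occurs
    # Benchmark every distinct layer exactly once.
    latency = {spec: benchmark_layer(spec) for spec in counts}
    # Assumption: layers run one after another, so latencies simply add up.
    total = sum(latency[spec] * n for spec, n in counts.items())
    return total, len(counts)


# Made-up latencies (milliseconds) standing in for real measurements.
fake_latency_ms = {("conv", 64): 2.0, ("relu",): 0.5, ("fc", 10): 1.0}
model = [("conv", 64), ("relu",), ("conv", 64), ("relu",), ("fc", 10)]

total_ms, runs = estimate_model_latency(model, fake_latency_ms.__getitem__)
print(total_ms, runs)
```

Here five layers require only three benchmark runs, which is the kind of saving the abstract attributes to layer repetition within and across models.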

benchmark, dl model, dlbrick, (10 more...)

1911.07967

Country: North America > United States > Illinois > Champaign County > Urbana (0.14)

Genre: Research Report > Promising Solution (0.34)

Industry: Information Technology (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Invariant Representations through Adversarial Forgetting

Jaiswal, Ayush, Moyer, Daniel, Steeg, Greg Ver, AbdAlmageed, Wael, Natarajan, Premkumar

We propose a novel approach to achieving invariance for deep neural networks by inducing amnesia to unwanted factors of data through a new adversarial forgetting mechanism. We show that the forgetting mechanism serves as an information bottleneck, which is manipulated by the adversarial training to learn invariance to unwanted factors. Empirical results show that the proposed framework achieves state-of-the-art performance at learning invariance in both nuisance and bias settings on a diverse collection of datasets and tasks.

discriminator, information, invariance, (13 more...)

1911.0406

Country:

North America > United States > California (0.14)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.34)

Industry: Government (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Macro F1 and Macro F1

Opitz, Juri, Burst, Sebastian

The 'macro F1' metric is frequently used to evaluate binary, multi-class and multi-label classification problems. Yet, we find that there exist two different formulas to calculate this quantity. In this note, we show that the two computations can be considered equivalent only under rare circumstances. More specifically, one formula strongly 'rewards' classifiers which produce a skewed error type distribution. In fact, the difference in outcome of the two computations can be as high as 0.5. Finally, we show that the two computations may not only diverge in their scalar result but also lead to different classifier rankings.
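The abstract does not spell out the two formulas. The sketch below assumes they are the two definitions commonly found in practice: the arithmetic mean of per-class F1 scores, and the harmonic mean of macro-averaged precision and macro-averaged recall. The function names and the toy labels are made up for illustration.

```python
def per_class_prf(y_true, y_pred, labels):
    """Return a (precision, recall, f1) tuple per class; 0.0 when undefined."""
    stats = []
    for c in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        stats.append((precision, recall, f1))
    return stats


def averaged_f1(y_true, y_pred, labels):
    """Variant 1: arithmetic mean of the per-class F1 scores."""
    stats = per_class_prf(y_true, y_pred, labels)
    return sum(f1 for _, _, f1 in stats) / len(stats)


def f1_of_averages(y_true, y_pred, labels):
    """Variant 2: harmonic mean of macro precision and macro recall."""
    stats = per_class_prf(y_true, y_pred, labels)
    macro_p = sum(p for p, _, _ in stats) / len(stats)
    macro_r = sum(r for _, r, _ in stats) / len(stats)
    denom = macro_p + macro_r
    return 2 * macro_p * macro_r / denom if denom else 0.0


# Toy data with a skewed error distribution: every mistake is an "a"
# that was predicted as "b".
y_true = ["a"] * 8 + ["b"] * 2
y_pred = ["a"] * 4 + ["b"] * 4 + ["b"] * 2

score_1 = averaged_f1(y_true, y_pred, ["a", "b"])
score_2 = f1_of_averages(y_true, y_pred, ["a", "b"])
print(round(score_1, 4), round(score_2, 4))
```

On this toy input the first variant gives 7/12 and the second 12/17, so the same predictions receive noticeably different "macro F1" scores depending on which definition a toolkit uses.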

classifier, fractional, xss, (13 more...)

1911.03347

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)

Genre: Research Report (0.65)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.95)

Apprenticeship Learning via Frank-Wolfe

Zahavy, Tom, Cohen, Alon, Kaplan, Haim, Mansour, Yishay

Tom Zahavy, Alon Cohen, Haim Kaplan and Yishay Mansour, Google Research, Tel Aviv

Abstract

We consider the applications of the Frank-Wolfe (FW) algorithm to Apprenticeship Learning (AL). In this setting, we are given a Markov Decision Process (MDP) without an explicit reward function. Instead, we observe an expert that acts according to some policy, and the goal is to find a policy whose feature expectations are closest to those of the expert policy. We formulate this problem as finding the projection of the feature expectations of the expert on the feature expectations polytope - the convex hull of the feature expectations of all the deterministic policies in the MDP. We show that this formulation is equivalent to the AL objective and that solving this problem using the FW algorithm is equivalent to the well-known Projection method of Abbeel and Ng (2004). This insight allows us to analyze AL with tools from the convex optimization literature and derive tighter convergence bounds on AL. Specifically, we show that a variation of the FW method that is based on taking "away steps" achieves a linear rate of convergence when applied to AL and that a stochastic version of the FW algorithm can be used to avoid precise estimation of feature expectations. We also experimentally show that this version outperforms the FW baseline. To the best of our knowledge, this is the first work that shows linear convergence rates for AL.

1 Introduction

We consider sequential decision making in the Markov decision process (MDP) formalism. Given an MDP, the optimal policy and its value function are characterized by the Bellman equations and can be computed via value or policy iteration. This makes the MDP model useful in problems where we can specify the MDP model (states, actions, reward, transitions) appropriately. However, in many real-world problems, it is often hard to define a reward function such that the optimal policy with respect to this reward produces the desired behavior.
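The projection formulation in the abstract can be illustrated with a minimal Frank-Wolfe sketch. It is a simplification under stated assumptions, not the authors' algorithm: the polytope's vertices are given as an explicit list (in AL they would be the feature expectations of deterministic policies, and the linear minimization step would be solved by planning in the MDP), the objective is half the squared Euclidean distance to the target, and the step size comes from exact line search. The function name `frank_wolfe_projection` and the tolerance parameter are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def frank_wolfe_projection(target, vertices, iterations=200, tol=1e-12):
    """Approximately project `target` onto the convex hull of `vertices`.

    Minimizes 0.5 * ||x - target||^2 with vanilla Frank-Wolfe steps.
    """
    x = list(vertices[0])  # start from any vertex of the polytope
    for _ in range(iterations):
        grad = [xi - ti for xi, ti in zip(x, target)]
        # Linear minimization oracle: vertex most aligned with -grad.
        s = min(vertices, key=lambda v: dot(grad, v))
        d = [si - xi for si, xi in zip(s, x)]
        gap = -dot(grad, d)  # Frank-Wolfe duality gap
        if gap <= tol:
            break  # no vertex offers further descent
        # Exact line search for a quadratic objective, clipped to [0, 1].
        gamma = min(1.0, gap / dot(d, d))
        x = [xi + gamma * di for xi, di in zip(x, d)]
    return x


# Toy polytope: a triangle in the plane; the target lies outside it.
triangle = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
projection = frank_wolfe_projection((1.0, 1.0), triangle)
print(projection)
```

The iterate always stays a convex combination of vertices, which in the AL setting corresponds to a mixture of deterministic policies. The "away steps" variant mentioned in the abstract is not shown here.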

algorithm, convergence, feature expectation, (17 more...)

1911.01679

Country:

Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.24)
Europe > Russia (0.04)
Asia > Russia (0.04)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

RankML: a Meta Learning-Based Approach for Pre-Ranking Machine Learning Pipelines

Laadan, Doron, Vainshtein, Roman, Curiel, Yarden, Katz, Gilad, Rokach, Lior

The explosion of digital data has created multiple opportunities for organizations and individuals to leverage machine learning (ML) to transform the way they operate. However, the shortage of experts in the field of machine learning -- data scientists -- is often a setback to the use of ML. In an attempt to alleviate this shortage, multiple approaches for the automation of machine learning have been proposed in recent years. While these approaches are effective, they often require a great deal of time and computing resources. In this study, we propose RankML, a meta-learning based approach for predicting the performance of whole machine learning pipelines. Given a previously-unseen dataset, a performance metric, and a set of candidate pipelines, RankML immediately produces a ranked list of all pipelines based on their predicted performance. Extensive evaluation on 244 datasets, both in regression and classification tasks, shows that our approach either outperforms or is comparable to state-of-the-art, computationally heavy approaches while requiring a fraction of the time and computational cost.

dataset, pipeline, rankml, (15 more...)

1911.00108

Country: Asia > China (0.04)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Consistent Robust Adversarial Prediction for General Multiclass Classification

Fathony, Rizal, Asif, Kaiser, Liu, Anqi, Bashiri, Mohammad Ali, Xing, Wei, Behpour, Sima, Zhang, Xinhua, Ziebart, Brian D.

Some examples of the task are zero-one loss classification, where the predictor suffers a loss of one when making an incorrect prediction and zero otherwise, and ordinal classification (also known as ordinal regression), where the predictor suffers a loss that increases as the prediction moves away from the true label. Empirical risk minimization (ERM) (Vapnik, 1992) is a standard approach for solving general multiclass classification problems by finding the classifier that minimizes a loss metric over the training data. However, since directly minimizing this loss over training data within the ERM framework is generally NP-hard (Steinwart and Christmann, 2008), convex surrogate losses that can be efficiently optimized are employed to approximate the loss. Constructing surrogate losses for binary classification has been well studied, resulting in surrogate losses that enjoy desirable theoretical properties and good performance in practice. Among the popular examples are the logarithmic loss, which is minimized by the logistic regression classifier (McCullagh and Nelder, 1989), and the hinge loss, which is minimized by the support vector machine (SVM) (Boser et al., 1992; Cortes and Vapnik, 1995).
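The loss metrics and surrogates named in this excerpt can be written down in a few lines. This is an illustrative sketch only: the absolute difference is assumed as one possible ordinal loss (the excerpt only says the loss grows with distance from the true label), the binary surrogates are expressed in terms of the margin y * f(x) with y in {-1, +1}, and all function names are made up.

```python
import math


def zero_one_loss(pred, true):
    """Loss of one for an incorrect prediction, zero otherwise."""
    return 0 if pred == true else 1


def ordinal_absolute_loss(pred, true):
    """Assumed ordinal loss: grows with the distance between class indices."""
    return abs(pred - true)


def empirical_risk(loss, preds, labels):
    """Average loss over a sample, the quantity ERM tries to minimize."""
    return sum(loss(p, y) for p, y in zip(preds, labels)) / len(labels)


def logistic_loss(margin):
    """Convex surrogate used by logistic regression: log(1 + exp(-margin))."""
    return math.log1p(math.exp(-margin))


def hinge_loss(margin):
    """Convex surrogate used by the SVM: max(0, 1 - margin)."""
    return max(0.0, 1.0 - margin)


preds = [1, 3, 2, 5]
labels = [1, 2, 2, 2]
print(empirical_risk(zero_one_loss, preds, labels))
print(empirical_risk(ordinal_absolute_loss, preds, labels))
```

The two risks differ on the same predictions because the ordinal loss penalizes the prediction 5 for true label 2 more heavily than the prediction 3, while the zero-one loss treats both mistakes alike.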

classification, loss metric, surrogate loss, (13 more...)

1812.07526

Country:

North America > United States > Wisconsin (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > California (0.04)
(5 more...)

Genre:

Research Report > New Finding (0.48)
Research Report > Experimental Study (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

arXiv.org Artificial Intelligence, Nov-20-2019

Towards FAIR protocols and workflows: The OpenPREDICT case study

Celebi, Remzi, Moreira, Joao Rebelo, Hassan, Ahmed A., Ayyar, Sandeep, Ridder, Lars, Kuhn, Tobias, Dumontier, Michel

It is essential for the advancement of science that scientists and researchers share, reuse and reproduce workflows and protocols used by others. The FAIR principles are a set of guidelines that aim to maximize the value and usefulness of research data, and emphasize a number of important points regarding the means by which digital objects are found and reused by others. The question of how to apply these principles not just to the static input and output data but also to the dynamic workflows and protocols that consume and produce them is still under debate and poses a number of challenges. In this paper we describe our inclusive and overarching approach to applying the FAIR principles to workflows and protocols and demonstrate its benefits. We apply and evaluate our approach on a case study that consists of making the PREDICT workflow, a highly cited drug repurposing workflow, open and FAIR. This includes FAIRification of the involved datasets, as well as applying semantic technologies to represent and store data about the detailed versions of the general protocol, of the concrete workflow instructions, and of their execution traces. A semantic model was proposed to better address these specific requirements and was evaluated by answering competency questions. This semantic model consists of classes and relations from a number of existing ontologies, including Workflow4ever, PROV, EDAM, and BPMN. This then allowed us to formulate and answer new kinds of competency questions. Our evaluation shows the high degree to which our FAIRified OpenPREDICT workflow now adheres to the FAIR principles and the practicality and usefulness of being able to answer our new competency questions.

instruction, opredict, workflow, (15 more...)

1911.09531

Country:

Europe > Netherlands > Limburg > Maastricht (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
(2 more...)

Genre:

Workflow (1.00)
Research Report > New Finding (0.93)
Research Report > Experimental Study (0.66)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Oncology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)

arXiv.org Artificial Intelligence, Nov-20-2019

Integrating Automated Play in Level Co-Creation

Hoyt, Andrew, Guzdial, Matthew, Kumar, Yalini, Smith, Gillian, Riedl, Mark O.

In level co-creation, an AI and a human work together to create a video game level. One open challenge in level co-creation is how to empower human users to ensure particular qualities of the final level, such as challenge. There has been significant prior research into automated pathing and automated playtesting for video game levels, but not into how to incorporate these into tools. In this demonstration, we present an improvement of the Morai Maker mixed-initiative level editor for Super Mario Bros. that includes automated pathing and challenge approximation features.

agent, computational intelligence, morai maker, (14 more...)

1911.09219

Country: North America > Canada > Alberta (0.14)

Genre: Research Report (0.64)

Industry: Leisure & Entertainment > Games > Computer Games (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.35)

arXiv.org Artificial Intelligence, Nov-20-2019

Discovering Subdimensional Motifs of Different Lengths in Large-Scale Multivariate Time Series

Gao, Yifeng, Lin, Jessica

Detecting repeating patterns of different lengths in time series, also called variable-length motifs, has received a great amount of attention from researchers and practitioners. Despite the significant progress that has been made in recent single-dimensional variable-length motif discovery work, detecting variable-length subdimensional motifs -- patterns that occur simultaneously in only a subset of dimensions in multivariate time series -- remains a difficult task. The main challenge is scalability. On the one hand, the brute-force enumeration solution, which searches for motifs of all possible lengths, is very time consuming even in single-dimensional time series. On the other hand, previous work shows that index-based fixed-length approximate motif discovery algorithms such as random projection are not suitable for detecting variable-length motifs due to their memory requirements. In this paper, we introduce an approximate variable-length subdimensional motif discovery algorithm called Collaborative HIerarchy based Motif Enumeration (CHIME) to efficiently detect variable-length subdimensional motifs given a minimum motif length in large-scale multivariate time series. We show that the memory cost of the approach is significantly smaller than that of random projection. Moreover, the proposed algorithm is significantly faster than the state-of-the-art algorithms. We demonstrate that CHIME can efficiently detect meaningful variable-length subdimensional motifs in large real-world multivariate time series datasets.

INTRODUCTION

Detecting repeating patterns of various lengths, also called variable-length motifs, in time series has received a great amount of attention [1] [2] [3] [4]. Since motifs of different lengths can naturally coexist in a time series, detecting variable-length motifs is often a necessary step for many real-world applications such as classification [2], anomaly detection [5] and data visualization [6].

In contrast to the significant progress that has been made in recent single-dimensional variable-length motif discovery work [2] [3] [7], little progress has been made in detecting variable-length subdimensional motifs [8] [9] -- patterns that occur simultaneously in only a subset of all dimensions in multivariate time series. Existing approaches [10] [8] [9] in subdimensional motif discovery still only detect motifs of a specified length, possibly suggested by domain experts. While these approaches may fit well in some applications, where domain knowledge is available and a good motif length can be specified by the user, we aim at solving the problem in a more general case -- when the correct motif length is not known, or motifs of various lengths coexist in the data. One is marked with a red line and occurs in dimensions {D1, D2} with length 200.

motif, subdimensional motif, subsequence, (15 more...)

1911.09218

Country:

North America > United States > California (0.14)
North America > United States > Virginia (0.04)

Genre: Research Report (0.82)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining > Anomaly Detection (0.54)