Results


A Comprehensive Survey on Deep Graph Representation Learning Methods

Journal of Artificial Intelligence Research

Graph representation learning has seen a great deal of activity in recent years. Its aim is to produce representation vectors that accurately capture the structure and characteristics of large graphs. This is crucial because the quality of these vectors determines how well they perform in downstream tasks such as anomaly detection, link prediction, and node classification. Recently, deep-learning advances have increasingly been applied to graph-structured data. This study presents a taxonomy of approaches to graph-based learning and reviews their learning settings, exploring the learning problem both theoretically and empirically. It also briefly introduces and summarizes Graph Neural Architecture Search (G-NAS), outlines several drawbacks of Graph Neural Networks, and suggests strategies to mitigate these challenges. Lastly, the study discusses several potential future research avenues yet to be explored.
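As a point of reference for the model family the survey covers, a generic GNN layer can be written in the standard message-passing form below; the notation is assumed here and is not taken from the survey itself.

```latex
% Generic neural message-passing layer (standard form; notation assumed).
% h_v^{(k)}: representation of node v after layer k;  N(v): neighbors of v.
h_v^{(k)} = \mathrm{UPDATE}^{(k)}\!\left( h_v^{(k-1)},\;
    \mathrm{AGGREGATE}^{(k)}\!\left( \left\{ h_u^{(k-1)} : u \in \mathcal{N}(v) \right\} \right) \right)
```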


Amortized Variational Inference: A Systematic Review

Journal of Artificial Intelligence Research

The core principle of Variational Inference (VI) is to convert the statistical inference problem of computing complex posterior probability densities into a tractable optimization problem. This property enables VI to be faster than several sampling-based techniques. However, the traditional VI algorithm is not scalable to large data sets and cannot readily infer the posterior for new, out-of-sample data points without re-running the optimization process. Recent developments in the field, like stochastic, black-box, and amortized VI, have helped address these issues. Amortized VI is now widely used in generative modeling for its efficiency and scalability, as it utilizes a parameterized function to map each observation to the parameters of its approximate posterior density. In this paper, we review the mathematical foundations of various VI techniques to form the basis for understanding amortized VI. Additionally, we provide an overview of the recent trends that address several issues of amortized VI, such as the amortization gap, generalization issues, inconsistent representation learning, and posterior collapse. Finally, we analyze alternate divergence measures that improve VI optimization.
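For readers who want the optimization problem stated explicitly, the standard evidence lower bound (ELBO) that VI maximizes is given below; this is textbook notation, not a reproduction of the paper's formulas.

```latex
% VI turns posterior inference into maximization of the ELBO:
\log p_\theta(x) \;\ge\; \mathcal{L}(\theta,\phi;x)
  \;=\; \mathbb{E}_{q_\phi(z \mid x)}\!\left[ \log p_\theta(x \mid z) \right]
  \;-\; \mathrm{KL}\!\left( q_\phi(z \mid x) \,\|\, p(z) \right)
% Amortization: a single inference network with parameters \phi maps each x
% to the parameters of q_\phi(z | x), instead of fitting separate variational
% parameters per data point.
```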


Prediction of Social Dynamic Agents and Long-Tailed Learning Challenges: A Survey

Journal of Artificial Intelligence Research

Autonomous robots that can perform common tasks like driving, surveillance, and chores have the biggest potential for impact due to frequency of use, and the biggest potential for risk due to direct interaction with humans. These tasks take place in open-ended environments where humans socially interact and pursue their goals in complex and diverse ways. To operate in such environments, such systems must predict this behavior, especially when it is unexpected and potentially dangerous. Therefore, we summarize trends in various types of tasks, modeling methods, datasets, and social interaction modules aimed at predicting the future location of dynamic, socially interactive agents. Furthermore, we describe long-tailed learning techniques from classification and regression problems that can be applied to prediction problems. To our knowledge, this is the first work that reviews social interaction modeling within prediction, and long-tailed learning techniques within regression and prediction.
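To make one of the surveyed ideas concrete, the sketch below shows inverse-frequency class reweighting, a common long-tailed classification technique; the labels and function are illustrative and not taken from any specific method in the survey.

```python
import numpy as np

# Inverse-frequency class reweighting: rare (tail) classes receive larger
# loss weights. Labels here are synthetic; the scheme itself is standard.
def class_weights(labels: np.ndarray) -> np.ndarray:
    counts = np.bincount(labels)
    return counts.sum() / (len(counts) * counts)  # weight_c proportional to 1 / count_c

labels = np.array([0] * 90 + [1] * 9 + [2] * 1)   # long-tailed label distribution
print(class_weights(labels))                      # ~[0.37, 3.70, 33.33]
```

In training, these per-class weights would scale each example's loss term so the tail contributes comparably to the head.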


How to DP-fy ML: A Practical Guide to Machine Learning with Differential Privacy

Journal of Artificial Intelligence Research

Machine Learning (ML) models are ubiquitous in real-world applications and are a constant focus of research. Modern ML models have become more complex, deeper, and harder to reason about. At the same time, the community has started to realize the importance of protecting the privacy of the training data that goes into these models. Differential Privacy (DP) has become a gold standard for making formal statements about data anonymization. However, while some adoption of DP has happened in industry, attempts to apply DP to real-world, complex ML models are still few and far between. The adoption of DP is hindered by limited practical guidance of what DP protection entails, what privacy guarantees to aim for, and the difficulty of achieving good privacy-utility-computation trade-offs for ML models. Tricks for tuning and maximizing performance are scattered among papers or stored in the heads of practitioners, particularly with respect to the challenging task of hyperparameter tuning. Furthermore, the literature seems to present conflicting evidence on how and whether to apply architectural adjustments and which components are “safe” to use with DP. In this survey paper, we attempt to create a self-contained guide that gives an in-depth overview of the field of DP ML. We aim to assemble information about achieving the best possible DP ML model with rigorous privacy guarantees. Our target audience is both researchers and practitioners. Researchers interested in DP for ML will benefit from a clear overview of current advances and areas for improvement. We also include theory-focused sections that highlight important topics such as privacy accounting and convergence. For a practitioner, this survey provides a background in DP theory and a clear step-by-step guide for choosing an appropriate privacy definition and approach, implementing DP training, potentially updating the model architecture, and tuning hyperparameters. For both researchers and practitioners, consistently and fully reporting privacy guarantees is critical, so we propose a set of specific best practices for stating guarantees. With sufficient computation and a sufficiently large training set or supplemental non-private data, both good accuracy (that is, almost as good as a non-private model) and good privacy are often achievable. And even when computation and dataset size are limited, there are advantages to training with even a weak (but still finite) formal DP guarantee. Hence, we hope this work will facilitate more widespread deployments of DP ML models.
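As a concrete companion to the training guidance described above, here is a minimal NumPy sketch of DP-SGD (per-example gradient clipping plus Gaussian noise), the canonical DP training algorithm; the model, data, and hyperparameter values are illustrative assumptions, and a real deployment would also require formal privacy accounting.

```python
import numpy as np

# Minimal DP-SGD sketch for logistic regression. Illustrative values only.
rng = np.random.default_rng(0)

CLIP_NORM = 1.0    # C: l2 bound on each per-example gradient
NOISE_MULT = 1.1   # sigma: noise standard deviation is sigma * C
LR, STEPS, BATCH = 0.1, 200, 32

X = rng.normal(size=(1024, 5))
y = (X @ rng.normal(size=5) > 0).astype(float)   # synthetic labels
w = np.zeros(5)

for _ in range(STEPS):
    idx = rng.choice(len(X), size=BATCH, replace=False)
    xb, yb = X[idx], y[idx]
    preds = 1.0 / (1.0 + np.exp(-(xb @ w)))                  # sigmoid
    grads = (preds - yb)[:, None] * xb                       # one gradient per example
    norms = np.linalg.norm(grads, axis=1, keepdims=True)
    clipped = grads / np.maximum(1.0, norms / CLIP_NORM)     # clip each to norm <= C
    noise = rng.normal(scale=NOISE_MULT * CLIP_NORM, size=w.shape)
    w -= LR * (clipped.sum(axis=0) + noise) / BATCH          # noisy mean gradient step

# The (epsilon, delta) guarantee must then be computed by a privacy accountant
# from NOISE_MULT, the sampling rate BATCH / len(X), and STEPS.
```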


Resources - Second Edition -- An Introduction to Statistical Learning

#artificialintelligence

The original Chapter 10 lab made use of keras, an R package for deep learning that relies on Python. Getting keras to work on your computer can be a bit of a challenge. Installation instructions are available here. RStudio has recently released a new R package for deep learning, called torch, that does not require a Python installation. Daniel Falbel and Sigrid Keydana, two of the torch developers, translated our keras version of the Chapter 10 lab to torch.


IJMS

#artificialintelligence

Over the past few decades, advances in computational resources and computer science, combined with next-generation sequencing and other emerging omics techniques, have ushered in a new era of biology, allowing for sophisticated analysis of complex biological data. Bioinformatics is evolving as an integrative field between computer science and biology that allows the representation, storage, management, analysis, and investigation of numerous data types with diverse algorithms and computational tools. Bioinformatics approaches include sequence analysis, comparative genomics, molecular evolution studies and phylogenetics, protein and RNA structure prediction, gene expression and regulation analysis, and biological network analysis, as well as the genetics of human diseases, in particular cancer, and medical image analysis [1,2,3]. Machine learning (ML) is a field in computer science that studies the use of computers to simulate human learning by exploring patterns in data and applying self-improvement to continually enhance performance on learning tasks. ML algorithms can be roughly divided into supervised learning algorithms, which learn to map input examples to their respective outputs, and unsupervised learning algorithms, which identify hidden patterns in unlabeled data. The advances made in machine learning over the past decade have transformed the landscape of data analysis [4,5,6].
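A minimal illustration of the supervised/unsupervised split described above, using scikit-learn on synthetic data; the data, feature count, and model choices are assumptions made for the example.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))        # 100 samples x 20 features (synthetic)
y = (X[:, 0] > 0).astype(int)         # labels available -> supervised setting

clf = LogisticRegression().fit(X, y)                        # supervised: map inputs to known outputs
clusters = KMeans(n_clusters=2, n_init=10).fit_predict(X)   # unsupervised: find hidden groups
```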


Skip-Thought Vectors

Neural Information Processing Systems

We describe an approach for unsupervised learning of a generic, distributed sentence encoder. Using the continuity of text from books, we train an encoder-decoder model that tries to reconstruct the surrounding sentences of an encoded passage. Sentences that share semantic and syntactic properties are thus mapped to similar vector representations. We next introduce a simple vocabulary expansion method to encode words that were not seen as part of training, allowing us to expand our vocabulary to a million words. After training our model, we extract and evaluate our vectors with linear models on 8 tasks: semantic relatedness, paraphrase detection, image-sentence ranking, question-type classification, and 4 benchmark sentiment and subjectivity datasets. The end result is an off-the-shelf encoder that can produce highly generic sentence representations that are robust and perform well in practice.
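The vocabulary expansion mentioned above amounts to fitting a linear map from a large pretrained word-vector space into the trained encoder's word-embedding space, using the words the two vocabularies share; below is a NumPy sketch of that idea with synthetic shapes and data (the dimensions are assumptions, not the paper's exact configuration).

```python
import numpy as np

# Fit a linear map W from a pretrained word-vector space (e.g. word2vec) into
# the encoder's embedding space, using the shared vocabulary; W then induces
# encoder embeddings for words never seen in training. Synthetic placeholders.
rng = np.random.default_rng(0)
V_shared, d_w2v, d_enc = 5000, 300, 620
X_w2v = rng.normal(size=(V_shared, d_w2v))   # pretrained vectors, shared vocab
Y_enc = rng.normal(size=(V_shared, d_enc))   # encoder embeddings, shared vocab

W, *_ = np.linalg.lstsq(X_w2v, Y_enc, rcond=None)   # least squares: Y ~ X @ W

x_unseen = rng.normal(size=d_w2v)            # word2vec vector of an unseen word
emb_unseen = x_unseen @ W                    # its induced encoder embedding
```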


Rectified Factor Networks

Neural Information Processing Systems

We propose rectified factor networks (RFNs) to efficiently construct very sparse, non-linear, high-dimensional representations of the input. RFN models identify rare and small events in the input, have a low interference between code units, have a small reconstruction error, and explain the data covariance structure. RFN learning is a generalized alternating minimization algorithm derived from the posterior regularization method which enforces non-negative and normalized posterior means.
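As one illustrative reading of the constraint named in the abstract, the snippet below rectifies and rescales a matrix of posterior means so the code units are non-negative and normalized; it is a toy sketch of that single constraint, not the paper's full generalized alternating minimization procedure, and the particular normalization used here is an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)
mu = rng.normal(size=(1000, 50))   # posterior means: 1000 samples x 50 code units (synthetic)

mu = np.maximum(mu, 0.0)           # non-negativity: rectify posterior means at zero
scale = np.sqrt((mu ** 2).mean(axis=0, keepdims=True)) + 1e-8
mu = mu / scale                    # normalization (assumed form): unit second moment per unit
```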


Deeply Learning the Messages in Message Passing Inference

Neural Information Processing Systems

Deep structured output learning shows great promise in tasks like semantic image segmentation. We propose a new, efficient deep structured model learning scheme, in which we show how deep Convolutional Neural Networks (CNNs) can be used to directly estimate the messages in message passing inference for structured prediction with Conditional Random Fields (CRFs). With such CNN message estimators, we obviate the need to learn or evaluate potential functions for message calculation. This yields significant efficiency gains in learning, since otherwise, when performing structured learning for a CRF with CNN potentials, it is necessary to undertake expensive inference for every stochastic gradient iteration. The network output dimension of message estimators is the same as the number of classes, rather than growing exponentially in the order of the potentials. Hence it is more scalable for cases that involve a large number of classes. We apply our method to semantic image segmentation and achieve impressive performance, demonstrating the effectiveness and usefulness of our CNN message learning method.
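For reference, the standard sum-product message from node i to node j in a pairwise CRF is shown below (textbook notation, not reproduced from the paper); the paper's CNNs estimate such messages directly rather than computing them from learned potentials.

```latex
% Sum-product message update in a pairwise CRF (standard form):
m_{i \to j}(x_j) \;=\; \sum_{x_i} \psi_i(x_i)\, \psi_{ij}(x_i, x_j)
    \prod_{k \in \mathcal{N}(i) \setminus \{j\}} m_{k \to i}(x_i)
```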


Deep Poisson Factor Modeling

Neural Information Processing Systems

We propose a new deep architecture for topic modeling, based on Poisson Factor Analysis (PFA) modules. The model is composed of a Poisson distribution to model observed vectors of counts, as well as a deep hierarchy of hidden binary units. Rather than using logistic functions to characterize the probability that a latent binary unit is on, we employ a Bernoulli-Poisson link, which allows PFA modules to be used repeatedly in the deep architecture. We also describe an approach to building discriminative topic models by adapting PFA modules. We derive efficient inference via MCMC and stochastic variational methods that scale with the number of non-zeros in the data and binary units, yielding significant efficiency relative to models based on logistic links. Experiments on several corpora demonstrate the advantages of our model when compared to related deep models.
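For readers unfamiliar with the Bernoulli-Poisson link used above, its standard form is given below: a binary unit is "on" exactly when a latent Poisson count is nonzero, which keeps each layer's variables count-valued and lets PFA modules stack.

```latex
% Bernoulli--Poisson link (standard form):
n \sim \mathrm{Poisson}(\lambda), \qquad b = \mathbf{1}(n \ge 1)
\quad\Longrightarrow\quad p(b = 1 \mid \lambda) = 1 - e^{-\lambda}
```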