AITopics | Chang, Oscar

Collaborating Authors

Chang, Oscar

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Assessing SATNet's Ability to Solve the Symbol Grounding Problem

Chang, Oscar, Flokas, Lampros, Lipson, Hod, Spranger, Michael

arXiv.org Artificial IntelligenceDec-12-2023

SATNet is an award-winning MAXSAT solver that can be used to infer logical rules and integrated as a differentiable layer in a deep neural network. It had been shown to solve Sudoku puzzles visually from examples of puzzle digit images, and was heralded as an impressive achievement towards the longstanding AI goal of combining pattern recognition with logical reasoning. In this paper, we clarify SATNet's capabilities by showing that in the absence of intermediate labels that identify individual Sudoku digit images with their logical representations, SATNet completely fails at visual Sudoku (0% test accuracy). More generally, the failure can be pinpointed to its inability to learn to assign symbols to perceptual phenomena, also known as the symbol grounding problem, which has long been thought to be a prerequisite for intelligent agents to perform real-world logical reasoning. We propose an MNIST based test as an easy instance of the symbol grounding problem that can serve as a sanity check for differentiable symbolic solvers in general. Naive applications of SATNet on this test lead to performance worse than that of models without logical reasoning capabilities. We report on the causes of SATNet's failure and how to prevent them.

artificial intelligence, machine learning, satnet, (14 more...)

arXiv.org Artificial Intelligence

2312.11522

Country: North America > Canada (0.14)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment > Games > Sudoku (0.71)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)

Add feedback

Accelerating Meta-Learning by Sharing Gradients

Chang, Oscar, Lipson, Hod

arXiv.org Artificial IntelligenceDec-12-2023

The success of gradient-based meta-learning is primarily attributed to its ability to leverage related tasks to learn task-invariant information. However, the absence of interactions between different tasks in the inner loop leads to task-specific over-fitting in the initial phase of meta-training. While this is eventually corrected by the presence of these interactions in the outer loop, it comes at a significant cost of slower meta-learning. To address this limitation, we explicitly encode task relatedness via an inner loop regularization mechanism inspired by multi-task learning. Our algorithm shares gradient information from previously encountered tasks as well as concurrent tasks in the same task batch, and scales their contribution with meta-learned parameters. We show using two popular few-shot classification datasets that gradient sharing enables meta-learning under bigger inner loop learning rates and can accelerate the meta-training process by up to 134%.

artificial intelligence, denote, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2312.08398

Genre: Research Report (0.65)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Add feedback

Ensemble Model Patching: A Parameter-Efficient Variational Bayesian Neural Network

Chang, Oscar, Yao, Yuling, Williams-King, David, Lipson, Hod

arXiv.org Machine LearningMay-22-2019

Two main obstacles preventing the widespread adoption of variational Bayesian neural networks are the high parameter overhead that makes them infeasible on large networks, and the difficulty of implementation, which can be thought of as "programming overhead." MC dropout [Gal and Ghahramani, 2016] is popular because it sidesteps these obstacles. Nevertheless, dropout is often harmful to model performance when used in networks with batch normalization layers [Li et al., 2018], which are an indispensable part of modern neural networks. We construct a general variational family for ensemble-based Bayesian neural networks that encompasses dropout as a special case. We further present two specific members of this family that work well with batch normalization layers, while retaining the benefits of low parameter and programming overhead, comparable to non-Bayesian training. Our proposed methods improve predictive accuracy and achieve almost perfect calibration on a ResNet-18 trained with ImageNet.

arxiv preprint arxiv, deep learning, neural network, (17 more...)

arXiv.org Machine Learning

1905.09453

Country:

North America > United States (0.14)
North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)

Add feedback

Seven Myths in Machine Learning Research

Chang, Oscar, Lipson, Hod

arXiv.org Machine LearningFeb-18-2019

While neural networks are commonly believed to be black boxes, there have been many, many attempts made to interpret them. Saliency maps, or other similar methods that assign importance scores to features or training examples, are the most popular form of interpretation. It is tempting to be able to conclude that the reason why a given image is classified a certain way is due to particular parts of the image that are salient to the neural network's decision in making the classification. There are several ways to compute this saliency map, often making use of a neural network's activations on a given image and the gradients that flow through the network. In Ghorbani et al. [2017], the authors show that they can introduce an imperceptible perturbation to a given image to distort its saliency map. A monarch butterfly is thus classified as a monarch butterfly, not on account of the patterns on its wings, but because of some unimportant green leaves in the background.

dataset, deep learning, neural network, (16 more...)

arXiv.org Machine Learning

1902.06789

Genre: Research Report > New Finding (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.96)

Add feedback

PepCVAE: Semi-Supervised Targeted Design of Antimicrobial Peptide Sequences

Das, Payel, Wadhawan, Kahini, Chang, Oscar, Sercu, Tom, Santos, Cicero Dos, Riemer, Matthew, Chenthamarakshan, Vijil, Padhi, Inkit, Mojsilovic, Aleksandra

arXiv.org Machine LearningNov-13-2018

Given the emerging global threat of antimicrobial resistance, new methods for next-generation antimicrobial design are urgently needed. We report a peptide generation framework PepCVAE, based on a semi-supervised variational autoencoder (VAE) model, for designing novel antimicrobial peptide (AMP) sequences. Our model learns a rich latent space of the biological peptide context by taking advantage of abundant, unlabeled peptide sequences. The model further learns a disentangled antimicrobial attribute space by using the feedback from a jointly trained AMP classifier that uses limited labeled instances. The disentangled representation allows for controllable generation of AMPs. Extensive analysis of the PepCVAE-generated sequences reveals superior performance of our model in comparison to a plain VAE, as PepCVAE generates novel AMP sequences with higher long-range diversity, while being closer to the training distribution of biological peptides. These features are highly desired in next-generation antimicrobial design.

deep learning, neural network, sequence, (22 more...)

arXiv.org Machine Learning

1810.07743

Country: North America > United States (0.14)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)

Add feedback

Agent Embeddings: A Latent Representation for Pole-Balancing Networks

Chang, Oscar, Kwiatkowski, Robert, Chen, Siyuan, Lipson, Hod

arXiv.org Artificial IntelligenceNov-11-2018

We show that it is possible to reduce a high-dimensional object like a neural network agent into a low-dimensional vector representation with semantic meaning that we call agent embeddings, akin to word or face embeddings. This can be done by collecting examples of existing networks, vectorizing their weights, and then learning a generative model over the weight space in a supervised fashion. We investigate a pole-balancing task, Cart-Pole, as a case study and show that multiple new pole-balancing networks can be generated from their agent embeddings without direct access to training data from the Cart-Pole simulator. In general, the learned embedding space is helpful for mapping out the space of solutions for a given task. We observe in the case of Cart-Pole the surprising finding that good agents make different decisions despite learning similar representations, whereas bad agents make similar (bad) decisions while learning dissimilar representations. Linearly interpolating between the latent embeddings for a good agent and a bad agent yields an agent embedding that generates a network with intermediate performance, where the performance can be tuned according to the coefficient of interpolation. Linear extrapolation in the latent space also results in performance boosts, up to a point.

agent, deep learning, neural network, (18 more...)

arXiv.org Artificial Intelligence

1811.04516

Country: North America > United States (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Add feedback

Gradient Normalization & Depth Based Decay For Deep Learning

Kwiatkowski, Robert, Chang, Oscar

arXiv.org Machine LearningFeb-28-2018

In this paper we introduce a novel method of gradient normalization and decay with respect to depth. Our method leverages the simple concept of normalizing all gradients in a deep neural network, and then decaying said gradients with respect to their depth in the network. Our proposed normalization and decay techniques can be used in conjunction with most current state of the art optimizers and are a very simple addition to any network. This method, although simple, showed improvements in convergence time on state of the art networks such as DenseNet and ResNet on image classification tasks, as well as on an LSTM for natural language processing tasks.

decay, deep learning, gradient normalization & depth

arXiv.org Machine Learning

1712.03607

Genre: Research Report (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback