
Collaborating Authors

Huh, Dongsung


Discovering Abstract Symbolic Relations by Learning Unitary Group Representations

arXiv.org Artificial Intelligence

We investigate a principled approach for symbolic operation completion (SOC), a minimal task for studying symbolic reasoning. While conceptually similar to matrix completion, SOC poses a unique challenge in modeling abstract relationships between discrete symbols. We demonstrate that SOC can be efficiently solved by a minimal model - a bilinear map - with a novel factorized architecture. Inspired by group representation theory, this architecture leverages matrix embeddings of symbols, modeling each symbol as an operator that dynamically influences others. Our model achieves perfect test accuracy on SOC with comparable or superior sample efficiency to Transformer baselines across most datasets, while boasting significantly faster learning speeds (100-1000$\times$). Crucially, the model exhibits an implicit bias towards learning general group structures, precisely discovering the unitary representations of underlying groups. This remarkable property not only confers interpretability but also has significant implications for automatic symmetry discovery in geometric deep learning. Overall, our work establishes group theory as a powerful guiding principle for discovering abstract algebraic structures in deep learning, and showcases matrix representations as a compelling alternative to traditional vector embeddings for modeling symbolic relationships.
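To make the matrix-embedding idea concrete, here is a minimal sketch (my own illustration, not the paper's code): each symbol gets a matrix embedding, and a missing table entry $a \circ b$ is completed by scoring how well the product of the two operand matrices matches each candidate symbol's matrix. For illustration the embeddings are hand-set to a unitary (permutation) representation of the cyclic group $\mathbb{Z}_3$, the kind of representation the paper reports the model discovers on its own.

```python
import numpy as np

# Hand-picked unitary representation of Z_3: symbol k is embedded as the
# permutation matrix for "add k mod 3". In the paper these matrices are
# learned; here they are fixed so the sketch stays self-contained.
def rotation(k, n=3):
    P = np.zeros((n, n))
    for i in range(n):
        P[(i + k) % n, i] = 1.0
    return P

M = [rotation(k) for k in range(3)]  # one matrix embedding per symbol

def score(a, b, c):
    # Bilinear score: agreement between M[a] @ M[b] and candidate M[c].
    return float(np.sum(M[a] @ M[b] * M[c]))

def complete(a, b):
    # Fill the missing entry of the operation table: best-scoring symbol.
    return max(range(3), key=lambda c: score(a, b, c))
```

Because the embeddings form a genuine group representation, `complete(a, b)` recovers exactly `(a + b) % 3`; a learned model would arrive at (a change of basis of) such matrices from partial observations of the table.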


ISAAC Newton: Input-based Approximate Curvature for Newton's Method

arXiv.org Artificial Intelligence

We present ISAAC (Input-baSed ApproximAte Curvature), a novel method that conditions the gradient using selected second-order information and has an asymptotically vanishing computational overhead, assuming a batch size smaller than the number of neurons. We show that it is possible to compute a good conditioner based on only the input to a respective layer without a substantial computational overhead. The proposed method allows effective training even in small-batch stochastic regimes, which makes it competitive with first-order as well as second-order methods. While second-order optimization methods are traditionally much less explored than first-order methods in large-scale machine learning (ML) applications due to their memory requirements and prohibitive computational cost per iteration, they have recently become more popular in ML mainly due to their fast convergence properties when compared to first-order methods [1]. The expensive computation of an inverse Hessian (also known as a pre-conditioning matrix) in the Newton step has also been tackled via estimating the curvature from the change in gradients.
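The following sketch illustrates the flavor of input-based preconditioning for a single linear layer (variable names and the exact regularization are my assumptions, not the paper's notation): the weight gradient is conditioned by the regularized inverse of the input second-moment matrix, and the Woodbury identity reduces the inversion to a batch-sized matrix, which is why the overhead shrinks when the batch is smaller than the number of neurons.

```python
import numpy as np

rng = np.random.default_rng(0)

b, d_in, d_out = 4, 16, 5          # batch size smaller than layer width
X = rng.normal(size=(b, d_in))     # input to the layer
G = rng.normal(size=(b, d_out))    # gradient w.r.t. the pre-activations

grad_W = X.T @ G / b               # ordinary first-order weight gradient
lam = 1e-2                         # damping term (illustrative value)

# Direct form: invert a d_in x d_in matrix built from the layer input.
C_direct = np.linalg.inv(X.T @ X / b + lam * np.eye(d_in))

# Woodbury form: invert only a b x b matrix instead, which keeps the
# extra cost small whenever b << d_in.
S = X @ X.T
C_woodbury = (np.eye(d_in)
              - X.T @ np.linalg.inv(lam * b * np.eye(b) + S) @ X) / lam

conditioned_grad = C_woodbury @ grad_W  # preconditioned update direction
```

The two forms are algebraically identical; only the cost of forming them differs, which is the small-batch regime the abstract highlights.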


The Missing Invariance Principle Found -- the Reciprocal Twin of Invariant Risk Minimization

arXiv.org Artificial Intelligence

Machine learning models often generalize poorly to out-of-distribution (OOD) data as a result of relying on features that are spuriously correlated with the label during training. Recently, the technique of Invariant Risk Minimization (IRM) was proposed to learn predictors that only use invariant features by conserving the feature-conditioned label expectation $\mathbb{E}_e[y|f(x)]$ across environments. However, more recent studies have demonstrated that IRM-v1, a practical version of IRM, can fail in various settings. Here, we identify a fundamental flaw of the IRM formulation that causes the failure. We then introduce a complementary notion of invariance, MRI, based on conserving the label-conditioned feature expectation $\mathbb{E}_e[f(x)|y]$, which is free of this flaw. Further, we introduce a simplified, practical version of the MRI formulation called MRI-v1. We prove that for general linear problems, MRI-v1 guarantees invariant predictors given a sufficient number of environments. We also empirically demonstrate that MRI-v1 strongly outperforms IRM-v1 and consistently achieves near-optimal OOD generalization in image-based nonlinear problems.
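A minimal sketch of the MRI-style invariance condition (my own illustration, not the paper's implementation): conserving $\mathbb{E}_e[f(x)|y]$ across environments can be turned into a penalty on the squared differences of per-environment, label-conditioned feature means.

```python
import numpy as np

def mri_penalty(feats, labels, envs):
    # Penalize deviation of the label-conditioned feature expectation
    # E_e[f(x) | y] across environments: for every label y, compare the
    # class-mean feature vector between each pair of environments.
    penalty = 0.0
    env_ids = np.unique(envs)
    for y in np.unique(labels):
        means = [feats[(envs == e) & (labels == y)].mean(axis=0)
                 for e in env_ids]
        for i in range(len(means)):
            for j in range(i + 1, len(means)):
                penalty += float(np.sum((means[i] - means[j]) ** 2))
    return penalty
```

A featurizer `f` that relies only on invariant features drives this penalty to zero; a spuriously correlated feature shifts the class means between environments and is penalized.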


Gradient Descent for Spiking Neural Networks

Neural Information Processing Systems

Most large-scale network models use neurons with static nonlinearities that produce analog output, despite the fact that information processing in the brain is predominantly carried out by dynamic neurons that produce discrete pulses called spikes. Research in spike-based computation has been impeded by the lack of an efficient supervised learning algorithm for spiking neural networks. Here, we present a gradient descent method for optimizing spiking network models by introducing a differentiable formulation of spiking dynamics and deriving the exact gradient calculation. For demonstration, we trained recurrent spiking networks on two dynamic tasks: one that requires optimizing fast (~ millisecond) spike-based interactions for efficient encoding of information, and a delayed-memory task over extended duration (~ second). The results show that the gradient descent approach indeed optimizes network dynamics on the time scale of individual spikes as well as on behavioral time scales. In conclusion, our method yields a general-purpose supervised learning algorithm for spiking neural networks, which can facilitate further investigations on spike-based computations.
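To illustrate what a differentiable formulation of spiking dynamics looks like, here is a simplified sketch (the paper derives exact gradients from differentiable synaptic dynamics; this toy replaces the hard spike threshold with a smooth gate, a related but simpler device): a leaky integrate-and-fire neuron whose spike output is a sigmoid of the membrane potential, so the spike train is differentiable with respect to the input and hence to upstream weights.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def run_soft_lif(inputs, dt=1e-3, tau=0.02, theta=1.0, beta=0.05):
    # Leaky integrate-and-fire neuron with a smooth spiking nonlinearity:
    # the hard threshold step(v - theta) is replaced by a sigmoid gate,
    # so d(spike)/d(input) exists everywhere and gradient descent applies.
    v = 0.0
    spikes = []
    for i_t in inputs:
        v += dt / tau * (-v + i_t)        # leaky membrane integration
        s = sigmoid((v - theta) / beta)   # differentiable "spike"
        v -= s * theta                    # soft reset after a spike
        spikes.append(s)
    return np.array(spikes)
```

As `beta` shrinks, the gate approaches the discrete all-or-none spike while keeping a usable gradient during training, which is the core trick that lets millisecond-scale spike interactions be optimized end to end.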


Gradient Descent for Spiking Neural Networks

arXiv.org Machine Learning

Many studies of neural computation are based on network models of static neurons that produce analog output, despite the fact that information processing in the brain is predominantly carried out by dynamic neurons that produce discrete pulses called spikes. Research in spike-based computation has been impeded by the lack of an efficient supervised learning algorithm for spiking networks. Here, we present a gradient descent method for optimizing spiking network models by introducing a differentiable formulation of spiking networks and deriving the exact gradient calculation. For demonstration, we trained recurrent spiking networks on two dynamic tasks: one that requires optimizing fast (~millisecond) spike-based interactions for efficient encoding of information, and a delayed memory XOR task over extended duration (~second). The results show that our method indeed optimizes the spiking network dynamics on the time scale of individual spikes as well as on behavioral time scales. In conclusion, our result offers a general-purpose supervised learning algorithm for spiking neural networks, thus advancing further investigations on spike-based computation.