Zador, Anthony M.


Token-Level Uncertainty-Aware Objective for Language Model Post-Training

arXiv.org Artificial Intelligence

In the current work, we connect token-level uncertainty in causal language modeling to two types of training objectives: 1) masked maximum likelihood estimation (MLE) and 2) self-distillation. We show that masked MLE is effective in reducing epistemic uncertainty and serves as an effective token-level automatic curriculum learning technique. However, masked MLE is prone to overfitting and requires self-distillation regularization to improve or maintain performance on out-of-distribution tasks. We demonstrate significant performance gains via the proposed training objective - combined masked MLE and self-distillation - across multiple architectures (Gemma, LLaMA, Phi) and datasets (Alpaca, ShareGPT, GSM8K), mitigating overfitting while maintaining adaptability during post-training. Our findings suggest that uncertainty-aware training provides an effective mechanism for enhancing language model training.
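
A minimal sketch of how such a combined objective could be written, assuming a PyTorch setting; the masking criterion (keeping the highest-loss tokens as a proxy for token-level uncertainty), the quantile threshold, the distillation temperature, and the loss weights below are illustrative assumptions, not the paper's exact formulation.

```python
# Illustrative sketch only: masking rule, temperature, and weights are assumed.
import torch
import torch.nn.functional as F

def uncertainty_aware_loss(student_logits, teacher_logits, labels,
                           mask_quantile=0.5, distill_weight=0.5, tau=1.0):
    """Masked MLE on high-uncertainty tokens plus self-distillation toward a
    frozen copy of the model (the 'teacher')."""
    vocab = student_logits.size(-1)
    # Per-token negative log-likelihood (no reduction).
    nll = F.cross_entropy(student_logits.view(-1, vocab), labels.view(-1),
                          reduction="none").view(labels.shape)
    # Masked MLE: keep only the most uncertain tokens (highest NLL), an assumed
    # proxy for token-level epistemic uncertainty.
    threshold = torch.quantile(nll.detach(), mask_quantile)
    mask = (nll.detach() >= threshold).float()
    masked_mle = (nll * mask).sum() / mask.sum().clamp(min=1.0)
    # Self-distillation: KL from the frozen teacher's token distribution, acting
    # as a regularizer against overfitting to the masked subset.
    kl = F.kl_div(F.log_softmax(student_logits / tau, dim=-1),
                  F.softmax(teacher_logits / tau, dim=-1),
                  reduction="batchmean") * tau ** 2
    return masked_mle + distill_weight * kl
```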


Neural Circuit Architectural Priors for Embodied Control

arXiv.org Artificial Intelligence

Artificial neural networks for motor control usually adopt generic architectures like fully connected MLPs. While general, these tabula rasa architectures rely on large amounts of experience to learn, are not easily transferable to new bodies, and have internal dynamics that are difficult to interpret. In nature, animals are born with highly structured connectivity in their nervous systems shaped by evolution; this innate circuitry acts synergistically with learning mechanisms to provide inductive biases that enable most animals to function well soon after birth and learn efficiently. Convolutional networks inspired by visual circuitry have encoded useful biases for vision. However, the extent to which ANN architectures inspired by neural circuitry can yield useful biases for other AI domains remains unknown. In this work, we ask what advantages biologically inspired ANN architectures can provide in the domain of motor control. Specifically, we translate C. elegans locomotion circuits into an ANN model controlling a simulated Swimmer agent. On a locomotion task, our architecture achieves good initial performance and asymptotic performance comparable with MLPs, while dramatically improving data efficiency and requiring orders of magnitude fewer parameters. Our architecture is interpretable and transfers to new body designs. An ablation analysis shows that constrained excitation/inhibition is crucial for learning, while weight initialization contributes to good initial performance. Our work demonstrates several advantages of biologically inspired ANN architectures and encourages future work in more complex embodied control.
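
A minimal sketch of what a circuit-derived architectural prior with constrained excitation/inhibition might look like in code; the connectivity mask and signs below are placeholders, not the C. elegans wiring used in the paper, and the layer is only one plausible way to fix connection existence and sign while learning magnitudes.

```python
# Sketch of a sign- and sparsity-constrained layer; mask/signs are illustrative.
import torch
import torch.nn as nn

class ConstrainedLinear(nn.Module):
    def __init__(self, mask, signs):
        super().__init__()
        self.register_buffer("mask", mask)    # 1 where a connection exists
        self.register_buffer("signs", signs)  # +1 excitatory, -1 inhibitory
        self.weight = nn.Parameter(torch.rand_like(mask) * 0.1)

    def forward(self, x):
        # Magnitudes are learned; existence and sign of each connection are fixed.
        w = torch.abs(self.weight) * self.mask * self.signs
        return x @ w.t()

# Toy usage: 4 sensory inputs driving 2 motor units.
mask = torch.tensor([[1., 1., 0., 0.], [0., 0., 1., 1.]])
signs = torch.tensor([[1., -1., 0., 0.], [0., 0., 1., -1.]])
layer = ConstrainedLinear(mask, signs)
out = layer(torch.randn(8, 4))  # batch of 8 observations
```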


Spectro-Temporal Receptive Fields of Subthreshold Responses in Auditory Cortex

Neural Information Processing Systems

How do cortical neurons represent the acoustic environment? This question is often addressed by probing with simple stimuli such as clicks or tone pips. Such stimuli have the advantage of yielding easily interpreted answers, but have the disadvantage that they may fail to uncover complex or higher-order neuronal response properties. Here we adopt an alternative approach, probing neuronal responses with complex acoustic stimuli, including animal vocalizations and music. We have used in vivo whole cell methods in the rat auditory cortex to record subthreshold membrane potential fluctuations elicited by these stimuli.
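
One standard way to estimate a spectro-temporal receptive field from such recordings is ridge-regularized reverse correlation of the membrane potential against the stimulus spectrogram; the sketch below assumes that estimator, and all dimensions and data are illustrative rather than taken from the paper.

```python
# Sketch of STRF estimation by ridge-regularized reverse correlation.
import numpy as np

def estimate_strf(spectrogram, vm, n_lags=40, ridge=1e2):
    """spectrogram: (time, freq) stimulus power; vm: (time,) membrane potential."""
    T, F = spectrogram.shape
    # Design matrix of time-lagged spectrogram slices preceding each sample.
    X = np.zeros((T - n_lags, n_lags * F))
    for t in range(n_lags, T):
        X[t - n_lags] = spectrogram[t - n_lags:t].ravel()
    y = vm[n_lags:]
    # Ridge regression: (X^T X + lambda I)^-1 X^T y
    w = np.linalg.solve(X.T @ X + ridge * np.eye(X.shape[1]), X.T @ y)
    return w.reshape(n_lags, F)  # STRF as a (lag, frequency) kernel

# Toy usage with random data standing in for a vocalization spectrogram.
rng = np.random.default_rng(0)
strf = estimate_strf(rng.standard_normal((2000, 16)), rng.standard_normal(2000))
```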


Binary Coding in Auditory Cortex

Neural Information Processing Systems

Cortical neurons have been reported to use both rate and temporal codes. Here we describe a novel mode in which each neuron generates exactly 0 or 1 action potentials, but not more, in response to a stimulus. We used cell-attached recording, which ensured single-unit isolation, to record responses in rat auditory cortex to brief tone pips. Surprisingly, the majority of neurons exhibited binary behavior with few multi-spike responses; several dramatic examples consisted of exactly one spike on 100% of trials, with no trial-to-trial variability in spike count. Many neurons were tuned to stimulus frequency. Since individual trials yielded at most one spike for most neurons, the information about stimulus frequency was encoded in the population, and would not have been accessible to later stages of processing that only had access to the activity of a single unit. These binary units allow a more efficient population code than is possible with conventional rate coding units, and are consistent with a model of cortical processing in which synchronous packets of spikes propagate stably from one neuronal population to the next.
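
A toy illustration (not the paper's analysis) of the population-coding claim: when each neuron fires at most one spike per trial, a single unit says little about which tone was played, yet the binary population pattern can be decoded reliably. The tuning curves, neuron count, and decoder below are assumptions for demonstration only.

```python
# Toy binary population code: 0-or-1 responses, decoded at the population level.
import numpy as np

rng = np.random.default_rng(1)
freqs = np.linspace(1, 32, 8)          # candidate tone frequencies (kHz)
pref = rng.choice(freqs, size=100)     # each neuron's preferred frequency
sigma = 4.0                            # assumed tuning width

def population_response(stim):
    # Each neuron fires at most one spike, with probability set by its tuning.
    p = np.exp(-((pref - stim) ** 2) / (2 * sigma ** 2))
    return (rng.random(pref.size) < p).astype(int)

def decode(spikes):
    # Maximum likelihood over candidate frequencies given the binary pattern.
    logls = []
    for f in freqs:
        p = np.exp(-((pref - f) ** 2) / (2 * sigma ** 2)).clip(1e-6, 1 - 1e-6)
        logls.append((spikes * np.log(p) + (1 - spikes) * np.log(1 - p)).sum())
    return freqs[int(np.argmax(logls))]

hits = sum(decode(population_response(f)) == f for f in rng.choice(freqs, 200))
print(f"population decoding accuracy: {hits / 200:.2f}")
```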


Processing of Time Series by Neural Circuits with Biologically Realistic Synaptic Dynamics

Neural Information Processing Systems

Experimental data show that biological synapses behave quite differently from the symbolic synapses in common artificial neural network models. Biological synapses are dynamic, i.e., their "weight" changes on a short time scale by several hundred percent depending on the past input to the synapse. In this article we explore the consequences that these synaptic dynamics entail for the computational power of feedforward neural networks. We show that gradient descent suffices to approximate a given (quadratic) filter by a rather small neural system with dynamic synapses. We also compare our network model to artificial neural networks designed for time series processing. Our numerical results are complemented by theoretical analysis which shows that even with just a single hidden layer such networks can approximate a surprisingly large class of nonlinear filters: all filters that can be characterized by Volterra series. This result is robust with regard to various changes in the model for synaptic dynamics.
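
To make the idea of a synapse acting as a history-dependent filter concrete, here is a sketch of a standard depression/facilitation synapse model applied to a spike train; the update equations and time constants are a common textbook formulation used for illustration, not necessarily the exact model analyzed in the paper.

```python
# Sketch of a dynamic synapse as a temporal filter (depression/facilitation).
import numpy as np

def dynamic_synapse(spikes, dt=1e-3, U=0.5, tau_rec=0.8, tau_fac=0.05):
    """Return the effective synaptic output for a binary spike train."""
    r, u, out = 1.0, U, np.zeros_like(spikes, dtype=float)
    for t, s in enumerate(spikes):
        # Recovery of resources and relaxation of facilitation between spikes.
        r += dt * (1.0 - r) / tau_rec
        u += dt * (U - u) / tau_fac
        if s:
            out[t] = u * r        # "weight" at this spike depends on history
            r -= u * r            # deplete resources (depression)
            u += U * (1.0 - u)    # transient increase (facilitation)
    return out

# Usage: a regular 20 Hz train shows progressively depressed responses.
spikes = np.zeros(1000); spikes[::50] = 1
print(dynamic_synapse(spikes)[spikes.astype(bool)][:5])
```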


Dynamic Stochastic Synapses as Computational Units

Neural Information Processing Systems

In most neural network models, synapses are treated as static weights that change only on the slow time scales of learning. In fact, however, synapses are highly dynamic, and show use-dependent plasticity over a wide range of time scales. Moreover, synaptic transmission is an inherently stochastic process: a spike arriving at a presynaptic terminal triggers release of a vesicle of neurotransmitter from a release site with a probability that can be much less than one. Changes in release probability represent one of the main mechanisms by which synaptic efficacy is modulated in neural circuits. We propose and investigate a simple model for dynamic stochastic synapses that can easily be integrated into common models for neural computation. We show through computer simulations and rigorous theoretical analysis that this model for a dynamic stochastic synapse increases computational power in a nontrivial way. Our results may have implications for the processing of time-varying signals by both biological and artificial neural networks. A synapse S carries out computations on spike trains, more precisely on trains of spikes from the presynaptic neuron. Each spike from the presynaptic neuron may or may not trigger the release of a neurotransmitter-filled vesicle at the synapse.
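
A simplified sketch of such a dynamic stochastic synapse: each presynaptic spike releases a vesicle with a history-dependent probability shaped by facilitation and depression. The functional forms, time constants, and the rule that only successful releases depress the synapse are assumptions for illustration, not the paper's exact model.

```python
# Sketch of stochastic vesicle release with history-dependent probability.
import numpy as np

def stochastic_synapse(spike_times, p0=0.3, tau_fac=0.1, tau_dep=0.5,
                       a_fac=0.2, a_dep=0.3, rng=np.random.default_rng(0)):
    fac, dep, last_t, releases = 0.0, 0.0, None, []
    for t in spike_times:
        if last_t is not None:
            dt = t - last_t
            fac *= np.exp(-dt / tau_fac)   # facilitation decays between spikes
            dep *= np.exp(-dt / tau_dep)   # depression decays between spikes
        p = np.clip(p0 + fac - dep, 0.0, 1.0)
        released = rng.random() < p        # stochastic vesicle release
        releases.append(bool(released))
        fac += a_fac                       # every spike facilitates...
        if released:
            dep += a_dep                   # ...but only releases depress
        last_t = t
    return releases

# Usage: a burst of ten spikes 20 ms apart.
print(stochastic_synapse(np.arange(10) * 0.02))
```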

