

Adversarial Robustness is at Odds with Lazy Training

Neural Information Processing Systems

Recent works show that adversarial examples exist for random neural networks [Daniely and Schacham, 2020] and that these examples can be found using a single step of gradient ascent [Bubeck et al., 2021]. In this work, we extend this line of work to "lazy training" of neural networks -- a dominant model in deep learning theory in which neural networks are provably efficiently learnable. We show that over-parametrized neural networks that are guaranteed to generalize well and enjoy strong computational guarantees remain vulnerable to attacks generated using a single step of gradient ascent.
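The single-step gradient-ascent attack the abstract refers to can be sketched on a random two-layer ReLU network. This is a minimal NumPy illustration of the idea (the width, step size, and normalization here are toy choices, not the paper's construction):

```python
import numpy as np

rng = np.random.default_rng(0)

# Random two-layer ReLU network f(x) = a . relu(W x), the kind of model
# studied in random-network / lazy-training analyses.
d, m = 20, 500
W = rng.normal(size=(m, d)) / np.sqrt(d)
a = rng.choice([-1.0, 1.0], size=m) / np.sqrt(m)

def f(x):
    return a @ np.maximum(W @ x, 0.0)

def grad_f(x):
    # df/dx = W^T (a * 1[Wx > 0])
    return W.T @ (a * (W @ x > 0))

x = rng.normal(size=d)
x /= np.linalg.norm(x)

# One normalized gradient step pushing the output toward the opposite sign.
g = -np.sign(f(x)) * grad_f(x)
eta = 0.5
x_adv = x + eta * g / np.linalg.norm(g)

print(f(x), f(x_adv))  # small perturbation, output moved toward the opposite sign
```

The key point mirrored here is that no iterative optimization is needed: a single step along the (sign-corrected) input gradient already shifts the network's output substantially.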


OccamLLM: Fast and Exact Language Model Arithmetic in a Single Step

Neural Information Processing Systems

Language model systems often enable LLMs to generate code for arithmetic operations to achieve accurate calculations. However, this approach compromises speed and security, and fine-tuning risks the language model losing prior capabilities. We propose a framework that enables exact arithmetic in *a single autoregressive step*, providing faster, more secure, and more interpretable LLM systems with arithmetic capabilities. We use the hidden states of an LLM to control a symbolic architecture that performs arithmetic. Furthermore, OccamLlama outperforms GPT-4o with and without a code interpreter on average across a range of mathematical problem solving benchmarks, demonstrating that OccamLLMs can excel in arithmetic tasks, even surpassing much larger models.
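The core idea of routing hidden states to an exact symbolic unit can be sketched as follows. Everything here is hypothetical scaffolding (the operation scores would come from a probe on the LLM's hidden state, and the real OccamLLM architecture differs in detail):

```python
import operator

# Exact symbolic operations the controller can dispatch to.
OPS = {"add": operator.add, "sub": operator.sub,
       "mul": operator.mul, "truediv": operator.truediv}

def occam_step(op_scores, a, b):
    """Pick the highest-scoring operation and apply it exactly, in one step.
    `op_scores` stands in for logits produced from an LLM hidden state."""
    op_name = max(op_scores, key=op_scores.get)
    return OPS[op_name](a, b)

# The symbolic unit returns the exact product, with no token-by-token decoding.
result = occam_step({"add": 0.1, "mul": 2.3, "sub": -1.0, "truediv": 0.0}, 12, 34)
print(result)  # 408
```

Because the arithmetic itself is symbolic rather than generated text, the answer is exact by construction, which is the property the abstract contrasts with code-generation pipelines.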


Graph neural networks extrapolate out-of-distribution for shortest paths

Nerem, Robert R., Chen, Samantha, Dasgupta, Sanjoy, Wang, Yusu

arXiv.org Artificial Intelligence

Neural networks (NNs), despite their success and wide adoption, still struggle to extrapolate out-of-distribution (OOD), i.e., to inputs that are not well-represented by their training dataset. Addressing the OOD generalization gap is crucial when models are deployed in environments significantly different from the training set, such as applying Graph Neural Networks (GNNs) trained on small graphs to large, real-world graphs. One promising approach for achieving robust OOD generalization is the framework of neural algorithmic alignment, which incorporates ideas from classical algorithms by designing neural architectures that resemble specific algorithmic paradigms (e.g. dynamic programming). The hope is that trained models of this form would have superior OOD capabilities, in much the same way that classical algorithms work for all instances. We rigorously analyze the role of algorithmic alignment in achieving OOD generalization, focusing on GNNs applied to the canonical shortest path problem. We prove that GNNs, trained to minimize a sparsity-regularized loss over a small set of shortest path instances, exactly implement the Bellman-Ford (BF) algorithm for shortest paths. In fact, if a GNN minimizes this loss within an error of $\epsilon$, it implements the BF algorithm with an error of $O(\epsilon)$. Consequently, despite limited training data, these GNNs are guaranteed to extrapolate to arbitrary shortest-path problems, including instances of any size. Our empirical results support our theory by showing that NNs trained by gradient descent are able to minimize this loss and extrapolate in practice.
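The Bellman-Ford update that the trained GNNs are shown to implement is a min-aggregation message-passing step. A minimal illustration of that correspondence (the graph and weights below are made up for the example):

```python
# One "GNN layer" as a Bellman-Ford relaxation:
#   h'[v] = min(h[v], min over edges (u, v) of h[u] + w[u, v])
def bellman_ford_layer(h, edges, weights):
    h_new = h.copy()
    for (u, v), wt in zip(edges, weights):
        h_new[v] = min(h_new[v], h[u] + wt)
    return h_new

# Shortest paths from node 0 on a small 4-node graph.
INF = float("inf")
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
weights = [4.0, 1.0, 1.0, 5.0]
h = [0.0, INF, INF, INF]        # node features = current distance estimates
for _ in range(3):               # n - 1 relaxation rounds
    h = bellman_ford_layer(h, edges, weights)
print(h)  # [0.0, 4.0, 1.0, 6.0]
```

Stacking such layers is exactly iterated BF relaxation, which is why a GNN whose layers match this form extrapolates to shortest-path instances of any size.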


Rome was Not Built in a Single Step: Hierarchical Prompting for LLM-based Chip Design

Nakkab, Andre, Zhang, Sai Qian, Karri, Ramesh, Garg, Siddharth

arXiv.org Artificial Intelligence

Large Language Models (LLMs) are effective in computer hardware synthesis via hardware description language (HDL) generation. However, LLM-assisted approaches for HDL generation struggle when handling complex tasks. We introduce a suite of hierarchical prompting techniques which facilitate efficient stepwise design methods, and develop a generalizable automation pipeline for the process. To evaluate these techniques, we present a benchmark set of hardware designs which have solutions with or without architectural hierarchy. Using these benchmarks, we compare various open-source and proprietary LLMs, including our own fine-tuned Code Llama-Verilog model. Our hierarchical methods automatically produce successful designs for complex hardware modules that standard flat prompting methods cannot achieve, allowing smaller open-source LLMs to compete with large proprietary models. Hierarchical prompting reduces HDL generation time and yields savings on LLM costs. Our experiments detail which LLMs are capable of which applications, and how to apply hierarchical methods in various modes. We explore case studies of generating complex cores using automatic scripted hierarchical prompts, including the first-ever LLM-designed processor with no human feedback. Tools for the Recurrent Optimization via Machine Editing (ROME) method can be found at https://github.com/ajn313/ROME-LLM.
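The hierarchical prompting loop described above can be sketched as recursive decomposition: generate leaf submodules with focused prompts, then prompt for a top-level module that instantiates them. The interface below is hypothetical (`llm` is any prompt-to-text callable; the actual ROME pipeline differs in detail):

```python
def generate_hdl(spec, llm, depth=0, max_depth=2):
    """Hierarchical HDL generation sketch: split a complex spec into
    submodule specs, generate each separately, then stitch them together
    under a top-level module."""
    if depth == max_depth or spec.get("simple"):
        return llm(f"Write Verilog for: {spec['name']}")
    sub_names = [s["name"] for s in spec.get("submodules", [])]
    parts = [generate_hdl(sub, llm, depth + 1, max_depth)
             for sub in spec.get("submodules", [])]
    top = llm(f"Write a top-level Verilog module {spec['name']} "
              f"instantiating: {sub_names}")
    return "\n\n".join(parts + [top])

# Stub LLM for illustration only.
fake_llm = lambda prompt: f"// {prompt}\nmodule m(); endmodule"

spec = {"name": "alu",
        "submodules": [{"name": "adder", "simple": True},
                       {"name": "shifter", "simple": True}]}
print(generate_hdl(spec, fake_llm))
```

Each prompt stays small and focused, which is the mechanism by which hierarchical prompting lets smaller models handle designs that defeat a single flat prompt.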


Training trajectories, mini-batch losses and the curious role of the learning rate

Sandler, Mark, Zhmoginov, Andrey, Vladymyrov, Max, Miller, Nolan

arXiv.org Artificial Intelligence

Stochastic gradient descent plays a fundamental role in nearly all applications of deep learning. However, its ability to converge to a global minimum remains shrouded in mystery. In this paper we propose to study the behavior of the loss function on fixed mini-batches along SGD trajectories. We show that the loss function on a fixed batch appears to be remarkably convex-like. In particular, for ResNet the loss for any fixed mini-batch can be accurately modeled by a quadratic function, and a very low loss value can be reached in just one step of gradient descent with a sufficiently large learning rate. We propose a simple model that allows us to analyze the relationship between the gradients of stochastic mini-batches and the full batch. Our analysis allows us to discover the equivalency between iterate aggregates and specific learning rate schedules. In particular, for Exponential Moving Average (EMA) and Stochastic Weight Averaging we show that our proposed model matches the observed training trajectories on ImageNet. Our theoretical model predicts that an even simpler averaging technique, averaging just two points many steps apart, significantly improves accuracy compared to the baseline. We validated our findings on ImageNet and other datasets using the ResNet architecture.
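The "one large step on a quadratic-like batch loss" observation has a clean closed form: on an exact quadratic, a single gradient step with the line-search learning rate already minimizes the loss along the gradient direction. A toy NumPy sketch (the curvature matrix and minimizer below are synthetic stand-ins for a mini-batch loss):

```python
import numpy as np

rng = np.random.default_rng(1)
d = 5
A = rng.normal(size=(d, d))
H = A @ A.T + np.eye(d)          # positive-definite curvature of the batch loss
w_star = rng.normal(size=d)       # minimizer of this fixed-batch loss

def loss(w):
    # Quadratic model L(w) = 0.5 (w - w*)^T H (w - w*)
    return 0.5 * (w - w_star) @ H @ (w - w_star)

def grad(w):
    return H @ (w - w_star)

w0 = rng.normal(size=d)
g = grad(w0)
eta = (g @ g) / (g @ H @ g)       # exact line-search step size for a quadratic
w1 = w0 - eta * g                 # one gradient step with a "large enough" lr
print(loss(w0), loss(w1))         # single step gives a much lower loss
```

After this step the gradient at `w1` is orthogonal to the step direction, which is the sense in which one sufficiently large gradient step exhausts the progress available along the mini-batch gradient.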


New 3-D printing technique can make autonomous robots in a single step

Los Angeles Times

Building a robot is hard. Building one that can sense its environment and learn how to get around on its own is even harder. But UCLA engineers took on an even bigger challenge. Not only did they create autonomous robots, they 3-D printed them in a single step. Each robot is about the size of a fingertip.


Researchers Demonstrate AI Can Be Fooled

#artificialintelligence

The artificial intelligence systems used by image recognition tools, such as those that certain connected cars use to identify street signs, can be tricked to make an incorrect identification by a low-cost but effective attack using a camera, a projector and a PC, according to Purdue University researchers. A research paper describes an Optical Adversarial Attack, or OPAD, which uses a projector to project calculated patterns that alter the appearance of the 3D objects to AI-based image recognition systems. The paper will be presented in October at an ICCV 2021 Workshop. In an experiment, a pattern was projected onto a stop sign, causing the image recognition to read the sign as a speed limit sign instead. The researchers say this attack method could also work with image recognition tools in applications ranging from military drones to facial recognition systems, potentially undermining their reliability.