

Step Back


Lookahead Optimizer: k steps forward, 1 step back

Neural Information Processing Systems

The vast majority of successful deep neural networks are trained using variants of stochastic gradient descent (SGD) algorithms. Recent attempts to improve SGD can be broadly categorized into two approaches: (1) adaptive learning rate schemes, such as AdaGrad and Adam, and (2) accelerated schemes, such as heavy-ball and Nesterov momentum. In this paper, we propose a new optimization algorithm, Lookahead, that is orthogonal to these previous approaches and iteratively updates two sets of weights. Intuitively, the algorithm chooses a search direction by looking ahead at the sequence of "fast weights" generated by another optimizer. We show that Lookahead improves the learning stability and lowers the variance of its inner optimizer with negligible computation and memory cost. We empirically demonstrate Lookahead can significantly improve the performance of SGD and Adam, even with their default hyperparameter settings on ImageNet, CIFAR-10/100, neural machine translation, and Penn Treebank.
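The abstract's "k steps forward, 1 step back" loop can be sketched in a few lines. This is a minimal illustration of the update rule as described above, not the authors' implementation; the names `alpha`, `k`, and `inner_step` are illustrative.

```python
# Minimal sketch of the Lookahead update rule: run any inner optimizer for k
# "fast" steps, then interpolate the "slow" weights toward the result.

def lookahead(slow_weights, inner_step, k=5, alpha=0.5, outer_steps=3):
    """slow_weights: list of floats (the slow parameters phi).
    inner_step: function fast_weights -> fast_weights, one step of any inner optimizer.
    """
    phi = list(slow_weights)
    for _ in range(outer_steps):
        theta = list(phi)                 # sync fast weights to slow weights
        for _ in range(k):                # k steps forward with the inner optimizer
            theta = inner_step(theta)
        # 1 step back: move slow weights a fraction alpha toward the fast weights
        phi = [p + alpha * (t - p) for p, t in zip(phi, theta)]
    return phi

# Toy inner optimizer: plain SGD on f(w) = sum(w_i^2), whose gradient is 2*w.
sgd = lambda w, lr=0.1: [wi - lr * 2 * wi for wi in w]
final = lookahead([1.0, -2.0], sgd)
```

On this toy quadratic each inner SGD step shrinks the weights by a factor of 0.8, so each outer step contracts the slow weights deterministically; the interpolation is what damps the inner optimizer's variance at negligible extra cost.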


CoFineLLM: Conformal Finetuning of LLMs for Language-Instructed Robot Planning

Wang, Jun, Vorobeychik, Yevgeniy, Kantaros, Yiannis

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have recently emerged as planners for language-instructed agents, generating sequences of actions to accomplish natural language tasks. However, their reliability remains a challenge, especially in long-horizon tasks, since they often produce overconfident yet wrong outputs. Conformal Prediction (CP) has been leveraged to address this issue by wrapping LLM outputs into prediction sets that contain the correct action with a user-defined confidence. When the prediction set is a singleton, the planner executes that action; otherwise, it requests help from a user. This has led to LLM-based planners that can ensure plan correctness with a user-defined probability. However, as LLMs are trained in an uncertainty-agnostic manner, without awareness of prediction sets, they tend to produce unnecessarily large sets, particularly at higher confidence levels, resulting in frequent human interventions that limit autonomous deployment. To address this, we introduce CoFineLLM (Conformal Finetuning for LLMs), the first CP-aware fine-tuning framework for LLM-based planners that explicitly reduces prediction-set size and, in turn, the need for user interventions. We evaluate our approach on multiple language-instructed robot planning problems and show consistent improvements over uncertainty-aware and uncertainty-agnostic finetuning baselines in terms of prediction-set size and help rates. Finally, we demonstrate robustness of our method to out-of-distribution scenarios in hardware experiments.


Step Back to Leap Forward: Self-Backtracking for Boosting Reasoning of Language Models

Yang, Xiao-Wen, Zhu, Xuan-Yi, Wei, Wen-Da, Zhang, Ding-Chu, Shao, Jie-Jing, Zhou, Zhi, Guo, Lan-Zhe, Li, Yu-Feng

arXiv.org Artificial Intelligence

The integration of slow-thinking mechanisms into large language models (LLMs) offers a promising way toward achieving Level 2 AGI Reasoners, as exemplified by systems like OpenAI's o1. However, several significant challenges remain, including inefficient overthinking and an overreliance on auxiliary reward models. We point out that these limitations stem from LLMs' inability to internalize the search process, a key component of effective reasoning. A critical step toward addressing this issue is enabling LLMs to autonomously determine when and where to backtrack, a fundamental operation in traditional search algorithms. To this end, we propose a self-backtracking mechanism that equips LLMs with the ability to backtrack during both training and inference. This mechanism enhances not only reasoning ability but also efficiency, by transforming slow-thinking processes into fast-thinking through self-improvement. Empirical evaluations demonstrate that our proposal significantly enhances the reasoning capabilities of LLMs, achieving a performance gain of over 40 percent compared to the optimal-path supervised fine-tuning method. We believe this study introduces a novel and promising pathway for developing more advanced and robust Reasoners.
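The "fundamental operation" the abstract refers to is the classic backtracking step of depth-first search: abandon a dead-end branch and return to the last choice point. The toy problem and names below are illustrative, not the paper's training setup.

```python
# The textbook backtracking loop that, per the abstract, self-backtracking
# teaches an LLM to perform internally: try a branch, and if it dead-ends,
# step back to the previous choice point and try the next alternative.

def backtrack_search(state, goal, expand, path=None):
    """Depth-first search returning a path to `goal`, or None."""
    path = (path or []) + [state]
    if state == goal:
        return path
    for nxt in expand(state):
        result = backtrack_search(nxt, goal, expand, path)
        if result is not None:          # this branch succeeded
            return result
    return None                         # dead end: backtrack to the caller

# Toy search: reach 10 from 1 using moves *2 or +3, never exceeding 10.
expand = lambda n: [m for m in (n * 2, n + 3) if m <= 10]
found = backtrack_search(1, 10, expand)
```

Here the search first explores 1, 2, 4, 8, hits a dead end at 8, backtracks to 4, and succeeds via 7. Deciding *when and where* to take that step back, without an external reward model, is what the proposed mechanism internalizes.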


Reviews: Lookahead Optimizer: k steps forward, 1 step back

Neural Information Processing Systems

Update: I have read the author's response and have kept my score. Please note that in DeVries and Taylor '17, 'ResNet-18' is not truly the ResNet-18 model (it consists of 4 stages and has more than an order of magnitude more parameters than the original ResNet-18 due to wider channels). This should be made clear in the paper in order not to cause more confusion in the community. Originality: Medium/High. The proposed algorithm is considerably different from recently proposed methods for deep learning, which gravitate towards adaptive gradient methods. It has some similarities to variance reduction algorithms with inner and outer loops; however, Lookahead has a very simple outer-loop structure and is easy to implement.



The future of travel? For hyperloop, it's one step forward, two steps back

Al Jazeera

Taipei, Taiwan – Imagine boarding a train that glides above the ground at supersonic speeds. Speeding through an airless tube using powerful electromagnets, passengers could travel from San Francisco to Los Angeles, London to Paris, or Basra to Baghdad in less than an hour. The train would be potentially greener than existing modes of transportation, too, using electricity that could be drawn from renewable energy sources. While it may sound like the stuff of science fiction, scientists and engineers in multiple countries are working on making the concept of the so-called hyperloop a reality. Hyperloop proponents, who include tech billionaire Elon Musk, have announced a series of recent breakthroughs in progressing the technology, whose development has been plagued by commercial setbacks and doubts about its feasibility.


Take a Step Back: Evoking Reasoning via Abstraction in Large Language Models

Zheng, Huaixiu Steven, Mishra, Swaroop, Chen, Xinyun, Cheng, Heng-Tze, Chi, Ed H., Le, Quoc V., Zhou, Denny

arXiv.org Artificial Intelligence

We present Step-Back Prompting, a simple prompting technique that enables LLMs to perform abstraction, deriving high-level concepts and first principles from instances containing specific details. Using the concepts and principles to guide the reasoning steps, LLMs significantly improve their abilities in following a correct reasoning path towards the solution. We conduct experiments of Step-Back Prompting with PaLM-2L models and observe substantial performance gains on a wide range of challenging reasoning-intensive tasks including STEM, Knowledge QA, and Multi-Hop Reasoning. For instance, Step-Back Prompting improves PaLM-2L performance on MMLU Physics and Chemistry by 7% and 11%, TimeQA by 27%, and MuSiQue by 7%.
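The two-stage flow the abstract describes — first elicit an abstraction, then reason from it — can be sketched as a pair of prompts. The prompt wording and the `ask_llm` callable below are assumptions for illustration, not the paper's templates; a stub stands in for a real model so the sketch runs offline.

```python
# Sketch of the two-stage Step-Back flow: (1) ask for the underlying
# concept or first principle, (2) answer the original question using it.

STEP_BACK = ("Here is a question: {q}\n"
             "What underlying concept or first principle does it involve? "
             "State it in general terms.")
REASON = ("Question: {q}\n"
          "Relevant principle: {principle}\n"
          "Using the principle, answer the question step by step.")

def step_back_answer(question, ask_llm):
    principle = ask_llm(STEP_BACK.format(q=question))           # stage 1: abstract
    return ask_llm(REASON.format(q=question, principle=principle))  # stage 2: reason

# Stub model so the sketch runs without any API; replies are hard-coded.
def fake_llm(prompt):
    if prompt.startswith("Here is a question"):
        return "Ideal gas law: PV = nRT"
    return "Pressure doubles (reasoning grounded in PV = nRT)"

answer = step_back_answer(
    "What happens to pressure if temperature doubles at fixed V and n?", fake_llm)
```

The point of the detour is that the second prompt reasons from a retrieved principle rather than directly from the surface details of the question.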


Taking a Step Back with KCal: Multi-Class Kernel-Based Calibration for Deep Neural Networks

Lin, Zhen, Trivedi, Shubhendu, Sun, Jimeng

arXiv.org Machine Learning

Deep neural network (DNN) classifiers are often overconfident, producing miscalibrated class probabilities. Most existing calibration methods either lack theoretical guarantees for producing calibrated outputs or reduce the classification accuracy in the process. This paper proposes a new Kernel-based calibration method called KCal. Unlike other calibration procedures, KCal does not operate directly on the logits or softmax outputs of the DNN. Instead, it uses the penultimate-layer latent embedding to train a metric space in a supervised manner. In effect, KCal amounts to a supervised dimensionality reduction of the neural network embedding, and generates a prediction using kernel density estimation on a holdout calibration set. We first analyze KCal theoretically, showing that it enjoys a provable asymptotic calibration guarantee. Then, through extensive experiments, we confirm that KCal consistently outperforms existing calibration methods in terms of both the classification accuracy and the (confidence and class-wise) calibration error.
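The prediction step the abstract describes — class probabilities from kernel density estimation over a holdout calibration set of embeddings — can be sketched directly. A fixed Gaussian kernel on raw embeddings stands in for KCal's learned metric space; all names, embeddings, and numbers here are illustrative.

```python
# Sketch of KDE-based class probabilities on penultimate-layer embeddings:
# weight each holdout example by a Gaussian kernel of its distance to the
# query embedding, then normalize the per-class weight totals.
import math

def kde_class_probs(x, holdout, bandwidth=1.0):
    """x: query embedding; holdout: list of (embedding, label) pairs.
    Returns {label: probability} summing to 1."""
    weights = {}
    for emb, label in holdout:
        d2 = sum((a - b) ** 2 for a, b in zip(x, emb))
        w = math.exp(-d2 / (2 * bandwidth ** 2))   # Gaussian kernel weight
        weights[label] = weights.get(label, 0.0) + w
    total = sum(weights.values())
    return {lab: w / total for lab, w in weights.items()}

# Toy holdout set of 2-D embeddings with labels.
holdout = [([0.0, 0.0], "cat"), ([0.1, 0.0], "cat"), ([3.0, 3.0], "dog")]
probs = kde_class_probs([0.05, 0.0], holdout)
```

Because the probabilities come from densities on the calibration set rather than from the network's logits, overconfident softmax outputs never enter the estimate; KCal additionally learns the metric in which these distances are computed.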


Lack of diversity in AI development causes serious real-life harm for people of color

#artificialintelligence

Every time you ask Alexa to turn on your lights or play a song, you're using AI. But AI is also put to work in more serious ways, like facial recognition software by law enforcement. Some critics say there's a troubling lack of diversity among those who create the programs, and that is causing serious harm for people of color. We're joined now by Angle Bush. ANGLE BUSH: Thank you for having me.


The secret to AI success? Focusing on data preparation

#artificialintelligence

Datasets are essential to AI models. They provide the truth by which we train AI models and measure a model's success. Engineers often look to the AI model as the key to delivering highly accurate results, but in reality it is often the data that determines an AI model's success. Data flows through every step of the AI workflow, from model training to deployment, and the way it is prepared can be the main driver of accuracy when designing robust AI models. Engineers can use these five tips to improve their data preparation process and drive success when developing a complete AI system.