AITopics | wol

wol

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Egalitarian Gradient Descent: A Simple Approach to Accelerated Grokking

Pasand, Ali Saheb, Dohmatob, Elvis

arXiv.org Artificial Intelligence, Oct-7-2025

Grokking is the phenomenon whereby, unlike the training performance, which peaks early in the training process, the test/generalization performance of a model stagnates over arbitrarily many epochs and then suddenly jumps, usually to close to perfect levels. In practice, it is desirable to reduce the length of such plateaus, that is, to make the learning process "grok" faster. In this work, we provide new insights into grokking. First, we show both empirically and theoretically that grokking can be induced by asymmetric speeds of (stochastic) gradient descent along different principal (i.e., singular) directions of the gradients. We then propose a simple modification that normalizes the gradients so that the dynamics along all the principal directions evolve at exactly the same speed. Then, we establish that this modified method, which we call egalitarian gradient descent (EGD) and which can be seen as a carefully modified form of natural gradient descent, groks much faster. In fact, in some cases the stagnation is completely removed. Finally, we empirically show that on classical arithmetic problems such as modular addition and sparse parity, where this stagnation has been widely observed and intensively studied, our proposed method eliminates the plateaus.
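The abstract above does not spell out the update rule, so the following is only an illustrative sketch of one way to read "normalizes the gradients so that all principal directions evolve at the same speed": take the singular value decomposition of a matrix-shaped gradient and set every nonzero singular value to 1. The function name `egalitarian_direction`, the `eps` cutoff, and the toy gradient are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def egalitarian_direction(grad, eps=1e-12):
    """Return grad with every nonzero singular value replaced by 1.

    Hypothetical helper: an assumed reading of the normalization
    described in the abstract, not the paper's exact method.
    """
    # Thin SVD: grad = u @ diag(s) @ vt
    u, s, vt = np.linalg.svd(grad, full_matrices=False)
    # Drop directions with (numerically) zero singular value.
    keep = s > eps
    # Recombine the singular vectors with unit singular values.
    return u[:, keep] @ vt[keep, :]


# Toy gradient whose principal directions differ in scale by 1000x.
rng = np.random.default_rng(0)
u, _ = np.linalg.qr(rng.standard_normal((4, 3)))  # orthonormal columns
v, _ = np.linalg.qr(rng.standard_normal((3, 3)))  # orthogonal matrix
grad = u @ np.diag([10.0, 1.0, 0.01]) @ v.T

step = egalitarian_direction(grad)

# Singular values before and after normalization.
print(np.linalg.svd(grad, compute_uv=False))
print(np.linalg.svd(step, compute_uv=False))
```

Under this reading, a plain gradient step would move 1000 times faster along the first singular direction of `grad` than along the last, whereas a step along `step` moves equally fast along all three. A training loop would then use something like `weights -= lr * egalitarian_direction(grad)`, where `weights` and `lr` are likewise assumed names.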

artificial intelligence, arxivpreprintarxiv, machine learning, (18 more...)

2510.0493

Genre: Research Report (0.42)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.97)

On LLM Wizards: Identifying Large Language Models' Behaviors for Wizard of Oz Experiments

Fang, Jingchao, Arechiga, Nikos, Namaoshi, Keiichi, Bravo, Nayeli, Hogan, Candice, Shamma, David A.

arXiv.org Artificial Intelligence, Jul-10-2024

The Wizard of Oz (WoZ) method is a widely adopted research approach where a human Wizard "role-plays" a not readily available technology and interacts with participants to elicit user behaviors and probe the design space. With the growing ability for modern large language models (LLMs) to role-play, one can apply LLMs as Wizards in WoZ experiments with better scalability and lower cost than the traditional approach. However, methodological guidance on responsibly applying LLMs in WoZ experiments and a systematic evaluation of LLMs' role-playing ability are lacking. Through two LLM-powered WoZ studies, we take the first step towards identifying an experiment lifecycle for researchers to safely integrate LLMs into WoZ experiments and interpret data generated from settings that involve Wizards role-played by LLMs. We also contribute a heuristic-based evaluation framework that allows the estimation of LLMs' role-playing ability in WoZ experiments and reveals LLMs' behavior patterns at scale.

Figure 1: An overview of our proposed experiment lifecycle compared to traditional Wizard of Oz experiments. We ask GPT-4 empowered agents to play the role of "Wizards" in conversation-based Wizard of Oz experiments. The agents talk to either Simulacrums powered by GPT-4 (in Study 1) or …

experiment, wizard, wol, (16 more...)

doi: 10.1145/3652988.3673967

2407.08067

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > New York > New York County > New York City (0.05)
North America > United States > California > Santa Clara County > Los Altos (0.04)
(22 more...)

Genre:

Research Report > Experimental Study (0.93)
Personal > Interview (0.93)

Industry:

Transportation > Ground > Road (1.00)
Health & Medicine (1.00)
Energy > Renewable (1.00)
(5 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
