The DJI Mini 4K drone will probably never be this cheap again

Popular Science

If you have been in the market for a DJI drone or camera lately, you probably know that prices have been extremely volatile. Looming tariffs have had some models selling well above list price since the beginning of the year. Right now, however, Amazon has the DJI Mini 4K for just $250, which is $50 less than its retail price and likely the lowest it will ever go. At 249 grams, the drone sits just under the 250-gram threshold above which the FAA requires you to register it before you take to the skies. Despite its tiny size, it offers everything you'd expect from a much more expensive model.


Train-Attention: Meta-Learning Where to Focus in Continual Knowledge Learning

Neural Information Processing Systems

Previous studies on continual knowledge learning (CKL) in large language models (LLMs) have predominantly focused on approaches such as regularization, architectural modifications, and rehearsal techniques to mitigate catastrophic forgetting. However, these methods naively inherit the inefficiencies of standard training procedures, indiscriminately applying a uniform weight to all tokens, which can lead to unnecessary parameter updates and increased forgetting. To address these shortcomings, we propose a novel CKL approach termed the Train-Attention-Augmented Language Model (TAALM), which enhances learning efficiency by dynamically predicting and applying weights to tokens according to their usefulness. This method employs a meta-learning framework that optimizes token-importance predictions, facilitating targeted knowledge updates and minimizing forgetting.
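For intuition, the core idea of weighting tokens by usefulness can be sketched as a weighted per-token loss. The numbers below are hand-picked for illustration only; TAALM learns such weights with a meta-trained Train-Attention network rather than fixing them.

```python
# Illustration of token-weighted loss aggregation; not the paper's implementation.

def weighted_loss(token_losses, token_weights):
    """Weighted average of per-token losses."""
    total = sum(token_weights)
    return sum(l * w for l, w in zip(token_losses, token_weights)) / total

losses = [2.0, 0.1, 3.0, 0.2]        # hypothetical per-token cross-entropy values
uniform = [1.0, 1.0, 1.0, 1.0]       # standard training: every token counts equally
importance = [0.1, 0.1, 0.7, 0.1]    # predicted usefulness: focus on token 3

standard = weighted_loss(losses, uniform)     # 1.325
focused = weighted_loss(losses, importance)   # 2.33: gradient driven by the useful token
```

Under the uniform scheme, low-value tokens contribute as much to the update as knowledge-bearing ones; the importance weighting concentrates the update where it matters.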


Verification and search algorithms for causal DAGs

Neural Information Processing Systems

We study two problems related to recovering causal graphs from interventional data: (i) verification, where the task is to check whether a purported causal graph is correct, and (ii) search, where the task is to recover the correct causal graph. For both, we wish to minimize the number of interventions performed. For the first problem, we give a characterization of a minimum-sized set of atomic interventions that is necessary and sufficient to check the correctness of a claimed causal graph. Our characterization uses the notion of covered edges, which enables us to obtain simple proofs and easily reason about previously known results. We also generalize our results to the settings of bounded-size interventions and node-dependent interventional costs.
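For intuition, an edge u → v is covered when v's parent set is exactly u's parent set plus u itself (the standard definition from the literature on Markov equivalence). A minimal sketch of the check, on a toy edge-set representation chosen here for illustration:

```python
def parents(edges, node):
    """Parent set of `node` in a DAG given as a set of (u, v) directed edges."""
    return {u for u, v in edges if v == node}

def is_covered(edges, edge):
    """An edge u -> v is covered iff Pa(v) = Pa(u) ∪ {u}."""
    u, v = edge
    return parents(edges, v) == parents(edges, u) | {u}

# Toy DAG: a -> b, b -> c, a -> c
dag = {("a", "b"), ("b", "c"), ("a", "c")}
is_covered(dag, ("a", "b"))  # True:  Pa(b) = {a} = Pa(a) ∪ {a}
is_covered(dag, ("b", "c"))  # True:  Pa(c) = {a, b} = Pa(b) ∪ {b}
is_covered(dag, ("a", "c"))  # False: Pa(c) = {a, b} but Pa(a) ∪ {a} = {a}
```

Covered edges are exactly the edges whose reversal stays within the Markov equivalence class, which is why they pinpoint where interventions are needed.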


Algorithmic progress in language models (Anson Ho, David Owen)

Neural Information Processing Systems

We investigate the rate at which algorithms for pre-training language models have improved since the advent of deep learning. Using a dataset of over 200 language model evaluations on Wikitext and Penn Treebank spanning 2012-2023, we find that the compute required to reach a set performance threshold has halved approximately every 8 months, with a 90% confidence interval of around 2 to 22 months, substantially faster than hardware gains per Moore's Law. We estimate augmented scaling laws, which enable us to quantify algorithmic progress and determine the relative contributions of scaling models versus innovations in training algorithms. Despite the rapid pace of algorithmic progress and the development of new architectures such as the transformer, our analysis reveals that the increase in compute made an even larger contribution to overall performance improvements over this time period. Though limited by noisy benchmark data, our analysis quantifies the rapid progress in language modeling, shedding light on the relative contributions from compute and algorithms.
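As a back-of-the-envelope illustration of what the 8-month halving time implies: the 24-month hardware doubling time used for comparison below is the common Moore's Law rule of thumb, not a figure from the paper.

```python
def algorithmic_gain(months, halving_months=8.0):
    """Effective-compute multiplier from algorithmic progress alone,
    assuming the central estimate of an 8-month halving time."""
    return 2.0 ** (months / halving_months)

def hardware_gain(months, doubling_months=24.0):
    """Multiplier from hardware alone, using the ~2-year Moore's Law rule of thumb."""
    return 2.0 ** (months / doubling_months)

algorithmic_gain(24)   # 8x in two years from algorithms
hardware_gain(24)      # 2x in two years from hardware
```

So at the central estimate, two years of algorithmic progress delivers roughly four times the gain that hardware alone would, though the wide 2-to-22-month confidence interval means this multiplier is itself quite uncertain.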



A Decentralized Parallel Algorithm for Training Generative Adversarial Nets

Neural Information Processing Systems

Generative Adversarial Networks (GANs) are a powerful class of generative models in the deep learning community. Current practice in large-scale GAN training uses large models and distributed large-batch training strategies, implemented on deep learning frameworks (e.g., TensorFlow, PyTorch) designed in a centralized manner. In a centralized network topology, every worker must either communicate directly with the central node or indirectly with all other workers in every iteration. When network bandwidth is low or network latency is high, performance degrades significantly. Despite recent progress on decentralized algorithms for training deep neural networks, it remains unclear whether GANs can be trained in a decentralized manner. The main difficulty lies in handling the nonconvex-nonconcave min-max optimization and the decentralized communication simultaneously. In this paper, we address this difficulty by designing the first gradient-based decentralized parallel algorithm that allows workers to perform multiple rounds of communication in one iteration and to update the discriminator and generator simultaneously; this design makes the proposed algorithm amenable to convergence analysis. Theoretically, the proposed decentralized algorithm solves a class of nonconvex-nonconcave min-max problems with provable non-asymptotic convergence to a first-order stationary point. Experimental results on GANs demonstrate the effectiveness of the proposed algorithm.
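The decentralized-communication ingredient can be illustrated in isolation with gossip averaging: each worker mixes its parameters only with its ring neighbors, so no central server is involved. This is a generic sketch of decentralized averaging, not the paper's algorithm, which interleaves such mixing with simultaneous discriminator and generator gradient steps.

```python
def ring_matrix(n):
    """Doubly stochastic mixing matrix for a ring topology:
    each worker averages itself with its two neighbors."""
    mix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in (i - 1, i, i + 1):
            mix[i][j % n] = 1.0 / 3.0
    return mix

def gossip_round(params, mix):
    """One communication round: worker i replaces its value with
    a mixing-weighted average of its neighbors' values."""
    n = len(params)
    return [sum(mix[i][j] * params[j] for j in range(n)) for i in range(n)]

params = [0.0, 4.0, 8.0, 12.0]   # one scalar "parameter" per worker
mix = ring_matrix(4)
for _ in range(10):
    params = gossip_round(params, mix)
# all workers approach the global average 6.0 without a central node
```

Because the mixing matrix is doubly stochastic, the global average is preserved every round while per-worker disagreement shrinks geometrically, which is the property decentralized training analyses typically rely on.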


Checklist

Neural Information Processing Systems

For all authors...
(a) Do the main claims made in the abstract and introduction accurately reflect the paper's contributions and scope?

If you ran experiments...
(a) Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] See here: shorturl.at/fnyOQ.

If you used crowdsourcing or conducted research with human subjects...
(a) Did you include the full text of instructions given to participants and screenshots, if applicable? [N/A]
(b) Did you describe any potential participant risks, with links to Institutional Review Board (IRB) approvals, if applicable? [N/A]
(c) Did you include the estimated hourly wage paid to participants and the total amount spent on participant compensation?

First, we describe the parameters and variables we need to formulate the problem. Then, we state the problem and briefly explain what every constraint represents.


Significant advancements in large language models

Neural Information Processing Systems

Large language models have demonstrated remarkable capabilities, but their performance is heavily reliant on effective prompt engineering. Automatic prompt optimization (APO) methods are designed to automate this and can be broadly categorized into those targeting instructions (instruction optimization, IO) and those targeting exemplars (exemplar optimization, EO). Despite their shared objective, these have evolved rather independently, with IO receiving more research attention recently. This paper seeks to bridge this gap by comprehensively comparing the performance of representative IO and EO techniques, both in isolation and in combination, on a diverse set of challenging tasks. Our findings reveal that intelligently reusing model-generated input-output pairs, obtained from evaluating prompts on the validation set, as exemplars consistently improves performance on top of IO methods, but this practice is currently under-investigated. We also find that despite the recent focus on IO, how we select exemplars can outweigh how we optimize instructions, with EO strategies as simple as random search outperforming state-of-the-art IO methods that use seed instructions without any optimization. Moreover, we observe a synergy between EO and IO, with optimal combinations surpassing their individual contributions. We conclude that studying exemplar optimization, both as a standalone method and in combination with instruction optimization, remains a crucial aspect of APO and deserves greater consideration in future research, even in the era of highly capable instruction-following models.
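The random-search EO baseline mentioned above can be sketched as sampling candidate k-shot exemplar sets and keeping the best-scoring one. The pool, scorer, and function names below are hypothetical stand-ins; a real run would score each candidate prompt with the target LLM on a validation set.

```python
import random

def random_search_eo(candidates, score_fn, k=4, trials=20, seed=0):
    """Keep the best k-shot exemplar set found by plain random search."""
    rng = random.Random(seed)
    best_set, best_score = None, float("-inf")
    for _ in range(trials):
        exemplars = rng.sample(candidates, k)
        score = score_fn(exemplars)  # in practice: validation score of the full prompt
        if score > best_score:
            best_set, best_score = exemplars, score
    return best_set, best_score

# Toy stand-in: candidates are (input, output) pairs and the scorer is a dummy heuristic.
pool = [(f"q{i}", f"a{i}") for i in range(10)]

def toy_score(exemplars):
    return sum(int(q[1:]) for q, _ in exemplars)

best, best_score = random_search_eo(pool, toy_score, k=4, trials=30)
```

Even this naive loop changes only which exemplars appear in the prompt, which is the sense in which exemplar selection alone can rival instruction optimization.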


Appendix

Neural Information Processing Systems

A Method Details

A.1 The attention network

The attention network is implemented as a feedforward neural network with one hidden layer:

Input layer: 12 units
Hidden layer: N units, coupled with a dropout layer (p = 0.5)
Output layer: N units, softmax activation function

N is the capacity of the policy memory. From these three policies, we tried to extract all possible information. The information should be cheap to extract and dependent on the current data, so we prefer features extracted from the outputs of these policies (value, entropy, distance, return, etc.). Intuitively, the most important features should be the empirical returns, the values associated with each policy, and the distances, which give a good hint of which virtual policy will yield high performance (e.g., a virtual policy that is closer to a policy that obtained high return and low value loss).

A.2 The advantage function

In this paper, we use GAE [18] as the advantage function for all models and experiments. Note that Algo. 1 illustrates the procedure for one actor.

A.3 The objective function

Following [19], our objective function also includes value loss and entropy terms.
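The A.1 architecture can be sketched framework-free as follows. The hidden activation is not stated above, so ReLU is assumed here, and inverted dropout is used for the p = 0.5 layer; both are assumptions for illustration.

```python
import math
import random

def linear(x, W, b):
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def relu(v):
    return [max(0.0, u) for u in v]

def dropout(v, p, rng):
    """Inverted dropout: zero units with probability p, rescale survivors by 1/(1-p)."""
    return [0.0 if rng.random() < p else u / (1.0 - p) for u in v]

def softmax(v):
    m = max(v)
    exps = [math.exp(u - m) for u in v]
    s = sum(exps)
    return [e / s for e in exps]

def attention_net(x, W1, b1, W2, b2, p=0.5, train=False, rng=None):
    """12 inputs -> N hidden units (dropout p at train time) -> N outputs, softmax."""
    h = relu(linear(x, W1, b1))
    if train:
        h = dropout(h, p, rng or random.Random(0))
    return softmax(linear(h, W2, b2))

# Toy instantiation with N = 3 (N is the policy-memory capacity)
rng = random.Random(42)
N, D = 3, 12
W1 = [[rng.uniform(-0.1, 0.1) for _ in range(D)] for _ in range(N)]
b1 = [0.0] * N
W2 = [[rng.uniform(-0.1, 0.1) for _ in range(N)] for _ in range(N)]
b2 = [0.0] * N
probs = attention_net([1.0] * D, W1, b1, W2, b2)  # a distribution over N policies
```

The softmax output is a probability distribution over the N stored policies, matching the described role of attending over policy memory.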