AITopics | output

Collaborating Authors

output

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Learning to Add, Multiply, and Execute Algorithmic Instructions Exactly with Neural Networks

Neural Information Processing SystemsJun-18-2026, 09:56:39 GMT

Neural networks are known for their ability to approximate smooth functions, yet they fail to generalize perfectly to unseen inputs when trained on discrete operations. Such operations lie at the heart of algorithmic tasks such as arithmetic, which is often used as a test bed for algorithmic execution in neural networks. In this work, we ask: can neural networks learn to execute binary-encoded algorithmic instructions exactly? We use the Neural Tangent Kernel (NTK) framework to study the training dynamics of two-layer fully connected networks in the infinite-width limit and show how a sufficiently large ensemble of such models can be trained to execute exactly, with high probability, four fundamental tasks: binary permutations, binary addition, binary multiplication, and Subtract and Branch if Negative (SBN) instructions. Since SBN is Turing-complete, our framework extends to computable functions. We show how this can be efficiently achieved using only logarithmically many training data. Our approach relies on two techniques: structuring the training data to isolate bit-level rules, and controlling correlations in the NTK regime to align model predictions with the target algorithmic executions.

artificial intelligence, instruction count, machine learning, (18 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.45)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Approximating Real-Time Recurrent Learning with Random Kronecker Factors

Asier Mujika, Florian Meier, Angelika Steger

Neural Information Processing SystemsFeb-14-2026, 21:04:38 GMT

Wealso confirm these theoretical results experimentally. Further,we showempirically thattheKF-RTRLalgorithm captures long-term dependencies and almost matches the performance of TBPTT on real world tasks by trainingRecurrent Highway Networks on a synthetic string memorization task and onthe Penn TreeBank task, respectively.

algorithm, artificial intelligence, machine learning, (17 more...)

Neural Information Processing Systems

Country:

Europe > Switzerland (0.05)
North America > Canada > Quebec > Montreal (0.04)
Europe > Germany > North Rhine-Westphalia > Cologne Region > Bonn (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

e8258e5140317ff36c7f8225a3bf9590-Supplemental.pdf

Neural Information Processing SystemsFeb-11-2026, 16:41:22 GMT

The original MuZero did not use sticky actions (Machado et al., 2017) (a 25% chance that the selected action is ignored and that instead the previous action is repeated) for Atari experiments. For all experiments in this work we used a network architecture based on the one introduced by MuZero(Schrittwieser etal.,2020), To implement the network, we used the modules provided by the Haiku neural network library (Henniganetal.,2020). We did not observe any benefit from using a Gaussian mixture, so instead inallourexperiments weusedasingle Gaussian withdiagonal covariance. All experiments used the Adam optimiser (Kingma & Ba, 2015) with decoupled weight decay (Loshchilov & Hutter, 2017) for training.

artificial intelligence, config, machine learning, (18 more...)

Neural Information Processing Systems

Industry: Leisure & Entertainment > Games (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

SureMap

Neural Information Processing SystemsFeb-9-2026, 17:06:26 GMT

The task is then to use a setS Dn of n 1 i.i.d.

artificial intelligence, justification, machine learning, (17 more...)

Neural Information Processing Systems

Country: North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

ShoppingMMLU: AMassiveMulti-TaskOnline ShoppingBenchmarkforLargeLanguageModels

Neural Information Processing SystemsFeb-9-2026, 07:17:19 GMT

However,existingmodelsand benchmarks are commonly tailored to specific tasks, falling short of capturing the full complexity of online shopping. Large Language Models (LLMs), with their multi-task and few-shot learning abilities, have the potential to profoundly transform online shopping byalleviating task-specific engineering effortsandby providing users with interactiveconversations.

large language model, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country:

Europe > Spain (0.04)
Europe > Germany (0.04)
Europe > France (0.04)
Asia > Japan (0.04)

Industry:

Retail > Online (0.58)
Information Technology (0.58)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

OntheNoiseRobustnessofIn-ContextLearning forTextGeneration

Neural Information Processing SystemsFeb-9-2026, 02:01:16 GMT

Large language models (LLMs) have shown impressive performance on downstream tasks by in-contextlearning (ICL), which heavily relies on the quality of demonstrations selected from a large set of annotated examples.

large language model, machine learning, natural language, (21 more...)

Neural Information Processing Systems

Country:

North America > United States > Montana (0.04)
North America > United States > Alabama (0.04)
Asia > China > Jiangsu Province > Yancheng (0.04)
(2 more...)

Genre: Research Report (0.92)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

ba74855789913e5ed36f87288af79e5b-Supplemental-Datasets_and_Benchmarks.pdf

Neural Information Processing SystemsSep-28-2025, 20:57:43 GMT

agent, algorithm, distribution shift, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > California (0.46)
Asia (0.45)
North America > United States > New York (0.14)
Europe > France (0.14)

Industry:

Transportation > Ground > Road (1.00)
Transportation > Electric Vehicle (1.00)
Energy > Power Industry (1.00)
(8 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.96)
Information Technology > Artificial Intelligence > Natural Language (0.93)

Add feedback

GreenMachine: Automatic Design of Zero-Cost Proxies for Energy-Efficient NAS

Cortês, Gabriel, Lourenço, Nuno, Machado, Penousal

arXiv.org Artificial IntelligenceNov-22-2024

Artificial Intelligence (AI) has driven innovations and created new opportunities across various sectors. However, leveraging domain-specific knowledge often requires automated tools to design and configure models effectively. In the case of Deep Neural Networks (DNNs), researchers and practitioners usually resort to Neural Architecture Search (NAS) approaches, which are resource- and time-intensive, requiring the training and evaluation of numerous candidate architectures. This raises sustainability concerns, particularly due to the high energy demands involved, creating a paradox: the pursuit of the most effective model can undermine sustainability goals. To mitigate this issue, zero-cost proxies have emerged as a promising alternative. These proxies estimate a model's performance without the need for full training, offering a more efficient approach. This paper addresses the challenges of model evaluation by automatically designing zero-cost proxies to assess DNNs efficiently. Our method begins with a randomly generated set of zero-cost proxies, which are evolved and tested using the NATS-Bench benchmark. We assess the proxies' effectiveness using both randomly sampled and stratified subsets of the search space, ensuring they can differentiate between low- and high-performing networks and enhance generalizability. Results show our method outperforms existing approaches on the stratified sampling strategy, achieving strong correlations with ground truth performance, including a Kendall correlation of 0.89 on CIFAR-10 and 0.77 on CIFAR-100 with NATS-Bench-SSS and a Kendall correlation of 0.78 on CIFAR-10 and 0.71 on CIFAR-100 with NATS-Bench-TSS.

evolutionary algorithm, machine learning, proxy, (20 more...)

arXiv.org Artificial Intelligence

2411.1529

Country:

Europe > Switzerland (0.04)
Europe > Austria (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(8 more...)

Genre: Research Report > New Finding (0.66)

Industry: Information Technology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)

Add feedback

GeoGalactica: A Scientific Large Language Model in Geoscience

Lin, Zhouhan, Deng, Cheng, Zhou, Le, Zhang, Tianhang, Xu, Yi, Xu, Yutong, He, Zhongmou, Shi, Yuanyuan, Dai, Beiya, Song, Yunchong, Zeng, Boyi, Chen, Qiyuan, Shi, Tao, Huang, Tianyu, Xu, Yiwei, Wang, Shu, Fu, Luoyi, Zhang, Weinan, He, Junxian, Ma, Chao, Zhu, Yunqiang, Wang, Xinbing, Zhou, Chenghu

arXiv.org Artificial IntelligenceDec-31-2023

Large language models (LLMs) have achieved huge success for their general knowledge and ability to solve a wide spectrum of tasks in natural language processing (NLP). Due to their impressive abilities, LLMs have shed light on potential inter-discipline applications to foster scientific discoveries of a specific domain by using artificial intelligence (AI for science, AI4S). In the meantime, utilizing NLP techniques in geoscience research and practice is wide and convoluted, contributing from knowledge extraction and document classification to question answering and knowledge discovery. In this work, we take the initial step to leverage LLM for science, through a rather straightforward approach. We try to specialize an LLM into geoscience, by further pre-training the model with a vast amount of texts in geoscience, as well as supervised fine-tuning (SFT) the resulting model with our custom collected instruction tuning dataset. These efforts result in a model GeoGalactica consisting of 30 billion parameters. To our best knowledge, it is the largest language model for the geoscience domain. More specifically, GeoGalactica is from further pre-training of Galactica. We train GeoGalactica over a geoscience-related text corpus containing 65 billion tokens curated from extensive data sources in the big science project Deep-time Digital Earth (DDE), preserving as the largest geoscience-specific text corpus. Then we fine-tune the model with 1 million pairs of instruction-tuning data consisting of questions that demand professional geoscience knowledge to answer. In this technical report, we will illustrate in detail all aspects of GeoGalactica, including data collection, data cleaning, base model selection, pre-training, SFT, and evaluation. We open-source our data curation tools and the checkpoints of GeoGalactica during the first 3/4 of pre-training.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2401.00434

Country: