Should We Really Edit Language Models? On the Evaluation of Edited Language Models
Model editing has become an increasingly popular method for efficiently updating knowledge within language models. Current approaches primarily focus on reliability, generalization, and locality, with many methods excelling across these criteria. Some recent studies have highlighted potential pitfalls of these editing methods, such as knowledge distortion and conflicts. However, the general capabilities of post-edited language models remain largely unexplored. In this paper, we conduct a comprehensive evaluation of various editing methods across different language models and report the following findings.
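For concreteness, the three criteria named above are usually scored as simple accuracies over probe prompts: reliability on the edited fact itself, generalization on paraphrases of it, and locality on unrelated facts that should stay unchanged. The minimal sketch below illustrates that scoring; the checkpoint name, probe prompts, and greedy top-1 check are illustrative assumptions, not the paper's evaluation protocol.

# Minimal sketch of the three standard edit metrics; the checkpoint name
# and probe prompts are hypothetical stand-ins, not the paper's setup.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # stand-in for an already-edited checkpoint
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).eval()

def top1_completion(prompt: str) -> str:
    # Greedy next-token prediction used as a crude correctness check.
    ids = tok(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits[0, -1]
    return tok.decode(int(logits.argmax())).strip()

def accuracy(cases):
    # Fraction of (prompt, expected next token) pairs answered correctly.
    return sum(top1_completion(p) == t for p, t in cases) / len(cases)

# Hypothetical probes for a single edit "The capital of France is Rome".
reliability = accuracy([("The capital of France is", "Rome")])
generalization = accuracy([("France's capital city is", "Rome")])
locality = accuracy([("The capital of Germany is", "Berlin")])
print(reliability, generalization, locality)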
A FineWeb Datasheet
Dataset Details
Purpose of the dataset
We released FineWeb to make large language model training more accessible to the machine learning community at large. The dataset was curated by Hugging Face. The dataset was funded by Hugging Face. The dataset is released under the Open Data Commons Attribution License (ODC-By) v1.0 license. The use of this dataset is also subject to Common Crawl's Terms of Use.
The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale
The performance of a large language model (LLM) depends heavily on the quality and size of its pretraining dataset. However, the pretraining datasets for state-of-the-art open LLMs like Llama 3 and Mixtral are not publicly available and very little is known about how they were created. In this work, we introduce FineWeb, a 15-trillion token dataset derived from 96 Common Crawl snapshots that produces better-performing LLMs than other open pretraining datasets. To advance the understanding of how best to curate high-quality pretraining datasets, we carefully document and ablate all of the design choices used in FineWeb, including in-depth investigations of deduplication and filtering strategies. In addition, we introduce FineWeb-Edu, a 1.3-trillion token collection of educational text filtered from FineWeb.
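FineWeb is published on the Hugging Face Hub and can be streamed with the datasets library, as in the minimal sketch below. The config name "sample-10BT" is assumed from the dataset card and should be treated as an assumption; individual Common Crawl dumps are exposed as separate configs.

# Minimal sketch: stream FineWeb text from the Hugging Face Hub.
# The config name "sample-10BT" is assumed from the dataset card;
# per-dump configs can be used instead if needed.
from datasets import load_dataset

fw = load_dataset("HuggingFaceFW/fineweb",
                  name="sample-10BT",
                  split="train",
                  streaming=True)

for i, doc in enumerate(fw):
    print(doc["text"][:200])  # each record carries the cleaned web text
    if i == 2:
        break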
The Inductive Bias of Quantum Kernels
It has been hypothesized that quantum computers may lend themselves well to applications in machine learning. In the present work, we analyze function classes defined via quantum kernels. Quantum computers offer the possibility to efficiently compute inner products of exponentially large density operators that are classically hard to compute.
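For concreteness, the kernels analyzed are typically of the fidelity type: a data point is encoded into a quantum state and the kernel is the inner product of the resulting density operators. The notation below is a common convention and is assumed here rather than taken from the paper.

% Fidelity-type quantum kernel (notation assumed, not the paper's):
% x is encoded by a circuit U(x) into a density operator rho(x), and the
% kernel is the Hilbert-Schmidt inner product of two such encodings.
\[
  \rho(x) = U(x)\,\lvert 0\rangle\langle 0\rvert\,U(x)^{\dagger},
  \qquad
  k(x, x') = \operatorname{Tr}\!\bigl[\rho(x)\,\rho(x')\bigr]
           = \bigl\lvert \langle 0\rvert\, U(x)^{\dagger} U(x')\,\lvert 0\rangle \bigr\rvert^{2}.
\]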
Accelerating Transformers with Spectrum-Preserving Token Merging
Increasing the throughput of the Transformer architecture, a foundational component used in numerous state-of-the-art models for vision and language tasks (e.g., GPT, LLaVa), is an important problem in machine learning. One recent and effective strategy is to merge token representations within Transformer models, aiming to reduce computational and memory requirements while maintaining accuracy. Prior works have proposed algorithms based on Bipartite Soft Matching (BSM), which divides tokens into distinct sets and merges the top k similar tokens. However, these methods have significant drawbacks, such as sensitivity to token-splitting strategies and damage to informative tokens in later layers.
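As background for the merging step these methods share, the sketch below shows a plain bipartite soft matching pass in PyTorch: tokens are split into two alternating sets, each token in one set proposes its most similar partner in the other, and the r best-matched tokens are folded into their partners by averaging. The function name and the averaging rule are illustrative assumptions; this is the baseline BSM idea, not the spectrum-preserving variant proposed here.

import torch

def bipartite_soft_match(x: torch.Tensor, r: int) -> torch.Tensor:
    # x: (N, d) token features for one sequence. Split tokens into two
    # alternating sets A (even indices) and B (odd indices), score each
    # A token against all B tokens by cosine similarity, merge the r
    # highest-scoring A tokens into their best partners, keep the rest.
    a, b = x[0::2], x[1::2]                      # bipartite split
    a_n = torch.nn.functional.normalize(a, dim=-1)
    b_n = torch.nn.functional.normalize(b, dim=-1)
    scores = a_n @ b_n.T                         # cosine similarity (|A| x |B|)
    best_val, best_idx = scores.max(dim=-1)      # best partner in B per A token
    merge_order = best_val.argsort(descending=True)
    merged_a, kept_a = merge_order[:r], merge_order[r:]

    b = b.clone()
    # Average each merged A token into its chosen partner in B
    # (duplicate partners are handled naively here for brevity).
    b[best_idx[merged_a]] = (b[best_idx[merged_a]] + a[merged_a]) / 2
    return torch.cat([a[kept_a], b], dim=0)      # N - r tokens remain

# Example: merge 16 of 197 ViT tokens with 384-dim features.
tokens = torch.randn(197, 384)
print(bipartite_soft_match(tokens, r=16).shape)  # torch.Size([181, 384])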
A Gradient Sampling Method With Complexity Guarantees for Lipschitz Functions in High and Low Dimensions
Their method is a novel modification of Goldstein's classical subgradient method. Their work, however, makes use of a nonstandard subgradient oracle model and requires the function to be directionally differentiable. Our first contribution in this paper is to show that both of these assumptions can be dropped by simply adding a small random perturbation in each step of their algorithm. The resulting method works on any Lipschitz function whose value and gradient can be evaluated at points of differentiability.
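The perturbation idea can be made concrete with a short sketch: before every gradient query, the current iterate is nudged by a small random offset so that, almost surely, the objective is differentiable at the queried point. The sketch below is a bare-bones illustration of that single ingredient, not the full Goldstein-style algorithm analyzed in the paper; the test function, step size, and perturbation scale are assumptions.

import numpy as np

def perturbed_subgradient_descent(f_grad, x0, step=1e-2, sigma=1e-6,
                                  iters=500, seed=0):
    # Subgradient descent with a small random perturbation before each
    # gradient query, so the gradient is (almost surely) evaluated at a
    # point of differentiability of the Lipschitz objective.
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        y = x + sigma * rng.standard_normal(x.shape)  # random perturbation
        x = x - step * f_grad(y)                      # step along gradient at y
    return x

# Example on a nonsmooth Lipschitz function: f(x) = ||x||_1.
grad_l1 = lambda x: np.sign(x)          # gradient wherever all coords are nonzero
x_final = perturbed_subgradient_descent(grad_l1, x0=np.ones(5))
print(np.abs(x_final).max())            # should end up close to 0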