AITopics

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

A Theoretical Analysis

Neural Information Processing Systems, May-30-2025, 08:21:10 GMT

This section contains the theoretical analysis of the loss functions of offline experience replay (Proposition 2), augmented experience replay (Proposition 3), and online experience replay with reservoir sampling (Proposition 1). At each iteration t, t = 1, ..., T, a batch of data is sampled from the incoming task, B

Note 3: Consider a balanced continual learning dataset (e.g., Split-CIFAR100, Split-Mini-ImageNet) where |D

Note 4: Consider general continual learning datasets.

Table 3 lists the image size, the number of classes, the number of tasks, and the data size per task of the four CL benchmarks.

C.1 Continual Learning Implementation

The hyperparameter settings are summarized in Table 4. All models are optimized using vanilla SGD.
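The reservoir sampling mentioned above for online experience replay can be illustrated with a minimal sketch. This is the textbook reservoir algorithm, not code from the paper; the class name `ReservoirBuffer` and its parameters are hypothetical names chosen for illustration.

```python
import random


class ReservoirBuffer:
    """Fixed-size replay memory filled by classic reservoir sampling.

    Hypothetical helper for illustration; not the paper's implementation.
    """

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.items = []
        self.seen = 0  # number of stream items observed so far
        self.rng = random.Random(seed)

    def add(self, item):
        """Offer one stream item; each item seen so far is kept with equal probability."""
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(item)
        else:
            # Draw a slot uniformly from [0, seen); overwrite only if it falls in the buffer.
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.items[j] = item

    def sample(self, batch_size):
        """Draw a replay batch without replacement (at most the current buffer size)."""
        k = min(batch_size, len(self.items))
        return self.rng.sample(self.items, k)


buffer = ReservoirBuffer(capacity=5)
for example in range(100):
    buffer.add(example)
replay_batch = buffer.sample(3)
```

The buffer never needs to know the stream length in advance, which is why this scheme suits a single pass over non-stationary data.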

artificial intelligence, dataset, machine learning, (17 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.70)


Repeated Augmented Rehearsal: A Simple but Strong Baseline for Online Continual Learning

Neural Information Processing Systems, May-30-2025, 08:21:07 GMT

Online continual learning (OCL) aims to train neural networks incrementally from a non-stationary data stream with a single pass through data. Rehearsal-based methods attempt to approximate the observed input distributions over time with a small memory and revisit them later to avoid forgetting. Despite their strong empirical performance, rehearsal methods still suffer from a poor approximation of past data's loss landscape with memory samples. This paper revisits the rehearsal dynamics in online settings. We provide theoretical insights on the inherent memory overfitting risk from the viewpoint of biased and dynamic empirical risk minimization, and examine the merits and limits of repeated rehearsal. Inspired by our analysis, a simple and intuitive baseline, repeated augmented rehearsal (RAR), is designed to address the underfitting-overfitting dilemma of online rehearsal. Surprisingly, across four rather different OCL benchmarks, this simple baseline outperforms vanilla rehearsal by 9%-17% and also significantly improves the state-of-the-art rehearsal-based methods MIR, ASER, and SCR. We also demonstrate that RAR successfully achieves an accurate approximation of the loss landscape of past data and high-loss ridge aversion in its learning trajectory. Extensive ablation studies are conducted to study the interplay between repeated and augmented rehearsal, and reinforcement learning (RL) is applied to dynamically adjust the hyperparameters of RAR to balance the stability-plasticity trade-off online.
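The idea of repeating rehearsal on each incoming batch while augmenting the replayed data can be sketched as follows. This is a rough illustration under stated assumptions, not the authors' algorithm: the function name `rar_iteration`, its parameters, and the choice to augment the incoming batch together with the memory samples are all assumptions, and the model update is abstracted into a caller-supplied `update_fn`.

```python
import random


def rar_iteration(update_fn, augment_fn, incoming_batch, memory,
                  repeats, memory_batch_size, rng):
    """Run `repeats` updates on one incoming batch (hypothetical sketch).

    Each repeat draws a fresh memory batch and applies a fresh random
    augmentation, so no two updates see exactly the same inputs.
    """
    for _ in range(repeats):
        k = min(memory_batch_size, len(memory))
        replay = rng.sample(memory, k)
        # Assumption: incoming and replayed samples are augmented together.
        combined = [augment_fn(x, rng) for x in list(incoming_batch) + replay]
        update_fn(combined)


# Toy usage: numbers stand in for images, jitter stands in for augmentation.
rng = random.Random(0)
update_sizes = []
rar_iteration(
    update_fn=lambda batch: update_sizes.append(len(batch)),
    augment_fn=lambda x, r: x + r.uniform(-0.1, 0.1),
    incoming_batch=[1.0, 2.0],
    memory=[10.0, 11.0, 12.0, 13.0],
    repeats=3,
    memory_batch_size=2,
    rng=rng,
)
```

In a real setting `update_fn` would be one SGD step on the combined batch, and the number of repeats and the augmentation strength are the hyperparameters the abstract describes tuning online.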

machine learning, rehearsal, reinforcement learning, (16 more...)

Neural Information Processing Systems

Country:

Oceania > New Zealand (0.16)
Europe > France (0.14)

Genre:

Research Report > New Finding (1.00)
Instructional Material > Online (0.71)

Industry: Education > Educational Setting > Online (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.34)


OptEx: Expediting First-Order Optimization with Approximately Parallelized Iterations, Ying Tiffany He

Neural Information Processing Systems, May-30-2025, 08:20:54 GMT

First-order optimization (FOO) algorithms are pivotal in numerous computational domains, such as reinforcement learning and deep learning. However, their application to complex tasks often entails significant optimization inefficiency due to their need for many sequential iterations to converge. In response, we introduce first-order optimization expedited with approximately parallelized iterations (OptEx), the first general framework that enhances the optimization efficiency of FOO by leveraging parallel computing to directly mitigate its requirement of many sequential iterations for convergence. To achieve this, OptEx utilizes a kernelized gradient estimation that is based on the history of evaluated gradients to predict the gradients required by the next few sequential iterations in FOO, which helps to break the inherent iterative dependency and hence enables the approximate parallelization of iterations in FOO. We further establish theoretical guarantees for the estimation error of our kernelized gradient estimation and the iteration complexity of SGD-based OptEx, confirming that the estimation error diminishes to zero as the history of gradients accumulates and that our SGD-based OptEx enjoys an effective acceleration rate of Θ(√N) over standard SGD given parallelism of N, in terms of the sequential iterations required for convergence. Finally, we provide extensive empirical studies, including synthetic functions, reinforcement learning tasks, and neural network training on various datasets, to underscore the substantial efficiency improvements achieved by OptEx in practice. Our implementation is available at https://github.com/youyve/OptEx.

artificial intelligence, deep learning, machine learning, (16 more...)

Neural Information Processing Systems

Genre: Research Report > Experimental Study (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)


A Appendix

Neural Information Processing Systems, May-30-2025, 08:20:36 GMT

Summation over i ∈ I decouples i from j and allows reuse of computation over j.

artificial intelligence, equivalence class, machine learning, (18 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)


6 Appendix

Neural Information Processing Systems, May-30-2025, 08:20:24 GMT

We also need "strides" as input to indicate how many new blocks will be kept in each step. BM25 is a well-known TF-IDF-like information retrieval method. Each block is scored based on the words it shares with the query or textual label. However, semantic relevance is neglected. For example, BM25 fails to find the relevance between the label name "sports" and "baseball player". GloVe is a set of pretrained word representations.
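The limitation of BM25 described above can be reproduced with a small sketch. The function `bm25_scores` is a hypothetical, self-contained implementation of Okapi BM25 using one common IDF variant (with +1 inside the logarithm); the toy blocks, the whitespace tokenization, and the default `k1` and `b` values are assumptions for illustration, not the paper's setup.

```python
import math
from collections import Counter


def bm25_scores(query_tokens, blocks, k1=1.5, b=0.75):
    """Score each tokenized block against the query with Okapi BM25."""
    n = len(blocks)
    avg_len = sum(len(blk) for blk in blocks) / n
    # Document frequency: number of blocks that contain each term.
    df = Counter(term for blk in blocks for term in set(blk))
    scores = []
    for blk in blocks:
        tf = Counter(blk)
        score = 0.0
        for term in query_tokens:
            if term not in tf:
                continue  # only exact word overlap contributes
            idf = math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            norm = tf[term] + k1 * (1 - b + b * len(blk) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


blocks = [
    "the baseball player hit a home run".split(),
    "sports news and sports scores".split(),
    "stock markets fell sharply today".split(),
]
scores = bm25_scores(["sports"], blocks)
```

Because the first block shares no word with the label "sports", its score is exactly zero even though it is clearly about sports, which is the gap that embedding-based scoring is meant to close.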

artificial intelligence, information retrieval, natural language, (9 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.58)


CogLTX: Applying BERT to Long Texts Chang Zhou Tsinghua University

Neural Information Processing Systems, May-30-2025, 08:20:17 GMT

BERT is incapable of processing long texts due to its quadratically increasing memory and time consumption. The most natural ways to address this problem, such as slicing the text with a sliding window or simplifying transformers, suffer from insufficient long-range attention or need customized CUDA kernels. The maximum length limit in BERT reminds us of the limited capacity (5 to 9 chunks) of human working memory. How, then, do human beings Cognize Long TeXts?

machine learning, natural language, text classification, (17 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.14)

Industry:

Leisure & Entertainment (1.00)
Media > Film (0.94)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.70)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.47)


Thank you very much for your careful, insightful, and valuable comments. We will address your concerns point by point

Neural Information Processing Systems, May-30-2025, 08:20:06 GMT

We also compare against the space consumption of Longformer, which is still much heavier than that of CogLTX.

artificial intelligence, machine learning, summarization, (19 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.31)


UDA

Neural Information Processing Systems, May-30-2025, 08:19:40 GMT

Cleaning missing values: The human-generated questions may be unanswerable. Thus, we remove the Q&A items that lack available answers. Additionally, documents lacking any valid Q&A pairs are also removed.
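The two cleaning rules above can be sketched in a few lines. The record layout (a list of documents, each with a `qa_pairs` list whose items carry an `answer` field) and the function name `clean_documents` are hypothetical; the benchmark's real schema may differ.

```python
def clean_documents(documents):
    """Drop unanswerable Q&A items, then drop documents left with none.

    Assumes a hypothetical schema: each document is a dict with a
    "qa_pairs" list, and each pair is a dict with an "answer" field.
    """
    cleaned = []
    for doc in documents:
        # Rule 1: keep only Q&A items that have an available answer.
        valid = [qa for qa in doc["qa_pairs"]
                 if qa.get("answer") not in (None, "", [])]
        # Rule 2: keep the document only if some valid pair remains.
        if valid:
            cleaned.append({**doc, "qa_pairs": valid})
    return cleaned


documents = [
    {"id": "doc-a", "qa_pairs": [
        {"question": "What was the revenue?", "answer": "1.2 million"},
        {"question": "Who signed the lease?", "answer": None},
    ]},
    {"id": "doc-b", "qa_pairs": [
        {"question": "What is the due date?", "answer": ""},
    ]},
]
cleaned = clean_documents(documents)
```

Filtering at the item level first and the document level second keeps documents that have at least one usable question, rather than discarding them for a single unanswerable one.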

artificial intelligence, dataset, natural language, (17 more...)

Neural Information Processing Systems

Country: North America > United States (0.49)

Industry: Law (0.96)

Technology: Information Technology > Artificial Intelligence > Natural Language (1.00)


UDA: A Benchmark Suite for Retrieval Augmented Generation in Real-world Document Analysis

Neural Information Processing Systems, May-30-2025, 08:19:37 GMT

The use of Retrieval-Augmented Generation (RAG) has improved Large Language Models (LLMs) in collaborating with external data, yet significant challenges exist in real-world scenarios. In areas such as academic literature and finance question answering, data are often found in raw text and tables in HTML or PDF formats, which can be lengthy and highly unstructured. In this paper, we introduce a benchmark suite, namely Unstructured Document Analysis (UDA), that involves 2,965 real-world documents and 29,590 expert-annotated Q&A pairs. We revisit popular LLM- and RAG-based solutions for document analysis and evaluate the design choices and answer qualities across multiple document domains and diverse query types. Our evaluation yields interesting findings and highlights the importance of data parsing and retrieval. We hope our benchmark can shed light on and better serve real-world document analysis applications.

large language model, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country:

North America > Canada (0.28)
North America > United States > Texas (0.14)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


VaiPhy: a Variational Inference Based Algorithm for Phylogeny Appendix

Neural Information Processing Systems, May-30-2025, 08:19:29 GMT

A.1 Update Equation Details

The update equations of VaiPhy follow the standard mean-field VI updates. Furthermore, ¬i is the set of nodes except node i, and C is a constant. Hence, during the training of VaiPhy, we used a maximum likelihood heuristic to update the branch lengths given a tree topology. After training, we used the tree topologies sampled from SLANTIS and the corresponding branch lengths sampled from the JC sampler to compute the IWELBO.

A.2 Neighbor-Joining Initialization

We utilize the NJ algorithm to initialize VaiPhy with a reasonable state. The sequence data is fed into BIONJ, an NJ algorithm, to create an initial reference phylogenetic tree using the PhyML software, version 3.3.20200621.

artificial intelligence, bayesian inference, machine learning, (19 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)
