
EMA



A Robust Functional EM Algorithm for Incomplete Panel Count Data

Neural Information Processing Systems

Panel count data describes aggregated counts of recurrent events observed at discrete time points. To understand the dynamics of health behaviors and predict future negative events, the field of quantitative behavioral research has evolved to increasingly rely upon panel count data collected via multiple self-reports, for example, about frequencies of smoking, using in-the-moment surveys on mobile devices. However, missing reports are common and present a major barrier to downstream statistical learning.
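
As a concrete illustration (not drawn from the paper), panel count data of this kind can be arranged as a participants-by-assessment-times matrix of event counts, with missing self-reports marked explicitly; the array below is a hypothetical toy example.

```python
import numpy as np

# Hypothetical panel count data: rows are participants, columns are discrete
# assessment times (e.g., daily mobile surveys); entries are reported counts
# of a recurrent event such as smoking. NaN marks a missing self-report.
panel_counts = np.array([
    [3.0, 5.0, np.nan, 2.0, 4.0],
    [0.0, np.nan, 1.0, 0.0, np.nan],
    [7.0, 6.0, 8.0, np.nan, 5.0],
])

observed = ~np.isnan(panel_counts)                   # indicator of observed reports
per_subject_mean = np.nanmean(panel_counts, axis=1)  # naive per-subject rate that ignores missingness
print(observed.mean(), per_subject_mean)
```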




A review of NMF, PLSA, LBA, EMA, and LCA with a focus on the identifiability issue

Qi, Qianqian, van der Heijden, Peter G. M.

arXiv.org Machine Learning

Across fields such as machine learning, social science, and geography, considerable attention has been given to models that factorize a nonnegative matrix into the product of two or three matrices, subject to nonnegativity or row-sum-to-1 constraints. Although these models are to a large extent similar or even equivalent, they are presented under different names, and their similarity is not well known. This paper highlights similarities among five popular models: latent budget analysis (LBA), latent class analysis (LCA), end-member analysis (EMA), probabilistic latent semantic analysis (PLSA), and nonnegative matrix factorization (NMF). We focus on an essential issue for these models, identifiability, and prove that the solution of LBA, EMA, LCA, and PLSA is unique if and only if the solution of NMF is unique. We also provide a brief review of algorithms for these models. We illustrate the models with a time budget dataset from social science, and end the paper with a discussion of closely related models such as archetypal analysis.
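
As a hedged illustration of the shared factorization structure (this is textbook NMF with multiplicative updates, not the paper's algorithms for LBA, PLSA, or the other models), a minimal NumPy sketch:

```python
import numpy as np

def nmf_multiplicative(X, r, n_iter=200, eps=1e-9, seed=0):
    """Generic NMF via Lee-Seung multiplicative updates (Frobenius loss):
    X is approximated by W @ H with W, H >= 0."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = rng.random((n, r)) + eps
    H = rng.random((r, m)) + eps
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Toy nonnegative matrix, e.g., a small time-budget-style table.
X = np.array([[4.0, 1.0, 0.0],
              [2.0, 3.0, 1.0],
              [0.0, 1.0, 5.0],
              [1.0, 0.0, 4.0]])
W, H = nmf_multiplicative(X, r=2)
print(np.linalg.norm(X - W @ H))
```

Roughly speaking, rescaling W and H so that the appropriate rows or columns sum to one yields the probabilistic parameterizations underlying PLSA and LBA, which is one informal way to see the equivalences the paper formalizes; the recovered factors are only meaningful up to the identifiability conditions the paper studies.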


SuRe: Surprise-Driven Prioritised Replay for Continual LLM Learning

Hazard, Hugo, Fountas, Zafeirios, Benfeghoul, Martin A., Oomerjee, Adnan, Wang, Jun, Bou-Ammar, Haitham

arXiv.org Artificial Intelligence

Continual learning, the ability to adapt to a sequence of tasks without forgetting previously acquired knowledge, remains a major challenge in machine learning and a key gap between artificial and human intelligence. While regularisation and replay perform well in vision, they lag behind multi-task learning for large language models (LLMs), especially at scale with many tasks. We revisit replay and argue that two failure modes drive this gap: selection (what to rehearse) and integration (how to consolidate new knowledge). To address selection, we propose Surprise-prioritised Replay (SuRe), a simple, architecture-agnostic rule that ranks and stores the most surprising (high negative log-likelihood) sequences. SuRe achieves state-of-the-art performance in the Large Number of Tasks (LNT) setting and delivers the best overall average across both Standard CL and LNT benchmarks. To address integration, we add a dual-learner design with fast and slow LoRA adapters merged via an exponential moving average (EMA), enabling rapid adaptation while stabilising long-term knowledge. Combining SuRe with the dual learner yields further gains, including improvements of up to +5 accuracy points on LNT over the prior SOTA. Ablation studies confirm that our proposed method remains robust under reduced replay frequency and small buffer sizes, demonstrating both effectiveness and sample efficiency. Taken together, our results establish replay as a strong baseline for continual LLM fine-tuning and demonstrate that surprise-based selection and slow-weight consolidation are complementary components for mitigating catastrophic forgetting.
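
The abstract's two ingredients can be sketched compactly. Below is a minimal, hedged Python sketch assuming a dict-of-tensors layout for the adapter weights and illustrative values for the EMA decay and buffer capacity; it is not the authors' implementation, only the ranking-by-NLL and EMA-merge ideas stated above.

```python
import heapq
import itertools
import random

import torch

def ema_merge(slow_params, fast_params, decay=0.99):
    # Slow-weight consolidation: slow <- decay * slow + (1 - decay) * fast.
    # The decay value and the dict-of-tensors layout are illustrative assumptions.
    with torch.no_grad():
        for name, slow in slow_params.items():
            slow.mul_(decay).add_(fast_params[name], alpha=1.0 - decay)

class SurpriseBuffer:
    """Keep the `capacity` most surprising sequences, ranked by negative log-likelihood."""
    def __init__(self, capacity=512):
        self.capacity = capacity
        self._counter = itertools.count()  # tie-breaker so stored sequences are never compared
        self._heap = []                    # min-heap: the least surprising item sits on top

    def add(self, sequence, nll):
        item = (float(nll), next(self._counter), sequence)
        if len(self._heap) < self.capacity:
            heapq.heappush(self._heap, item)
        elif item[0] > self._heap[0][0]:
            heapq.heapreplace(self._heap, item)  # evict the least surprising entry

    def sample(self, k):
        k = min(k, len(self._heap))
        return [seq for _, _, seq in random.sample(self._heap, k)]

# Toy usage: store sequences with their NLL, keep the 3 most surprising, rehearse 2.
buf = SurpriseBuffer(capacity=3)
for i, nll in enumerate([0.2, 3.1, 1.5, 4.0]):
    buf.add(f"seq-{i}", nll)
print(buf.sample(2))
```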


Preference-Aware Memory Update for Long-Term LLM Agents

Sun, Haoran, Zhang, Zekun, Zeng, Shaoning

arXiv.org Artificial Intelligence

One of the key factors influencing the reasoning capabilities of LLM-based agents is their ability to leverage long-term memory. Integrating long-term memory mechanisms allows agents to make informed decisions grounded in historical interactions. While recent advances have significantly improved the storage and retrieval components, for example by encoding memory into dense vectors for similarity search or organizing memory as structured knowledge graphs, most existing approaches fall short in memory updating. In particular, they lack mechanisms for dynamically refining preference memory representations in response to evolving user behaviors and contexts. To address this gap, we propose a Preference-Aware Memory Update Mechanism (PAMU) that enables dynamic and personalized memory refinement. By integrating sliding-window averages (SW) with exponential moving averages (EMA), PAMU constructs a fused preference-aware representation that captures both short-term fluctuations and long-term user tendencies. We conduct experiments on five task scenarios of the LoCoMo dataset, and the results show that our mechanism can significantly improve the output quality of LLMs across five baselines, validating its effectiveness in long-term conversations.
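
As a rough sketch of the fusion described above (the window size, EMA decay, and fusion weight below are assumptions for illustration; PAMU's actual update rule may differ), one way to combine a sliding-window mean with an EMA over per-turn preference vectors:

```python
from collections import deque

import numpy as np

class PreferenceTracker:
    """Fuse a sliding-window mean (short-term fluctuations) with an EMA
    (long-term tendencies) over preference vectors. All constants are illustrative."""
    def __init__(self, dim, window=5, decay=0.9, fuse=0.5):
        self.window = deque(maxlen=window)
        self.ema = np.zeros(dim)
        self.decay = decay
        self.fuse = fuse
        self._initialized = False

    def update(self, pref_vec):
        pref_vec = np.asarray(pref_vec, dtype=float)
        self.window.append(pref_vec)
        if not self._initialized:
            self.ema = pref_vec.copy()
            self._initialized = True
        else:
            self.ema = self.decay * self.ema + (1.0 - self.decay) * pref_vec
        sw_mean = np.mean(np.stack(list(self.window)), axis=0)
        # Fused preference-aware representation: short-term + long-term signal.
        return self.fuse * sw_mean + (1.0 - self.fuse) * self.ema

tracker = PreferenceTracker(dim=3)
for v in ([1.0, 0.0, 0.0], [0.8, 0.2, 0.0], [0.2, 0.7, 0.1]):
    fused = tracker.update(v)
print(fused)
```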


LLMs on a Budget? Say HOLA

Siddiqui, Zohaib Hasan, Gao, Jiechao, Shabbir, Ebad, Azeez, Mohammad Anas, Ali, Rafiq, Kashyap, Gautam Siddharth, Naseem, Usman

arXiv.org Artificial Intelligence

Running Large Language Models (LLMs) on edge devices is constrained by high compute and memory demands, posing a barrier for real-time applications in sectors like healthcare, education, and embedded systems. Current solutions such as quantization, pruning, and retrieval-augmented generation (RAG) offer only partial optimizations and often compromise on speed or accuracy. We introduce HOLA, an end-to-end optimization framework for efficient LLM deployment. Internally, it leverages Hierarchical Speculative Decoding (HSD) for faster inference without quality loss. Externally, AdaComp-RAG adjusts retrieval complexity based on context needs. Together with LoBi, which blends structured pruning (LoRA) and quantization, HOLA delivers significant gains: 17.6% EMA on GSM8K, 10.5% MCA on ARC, and reduced latency and memory on edge devices like the Jetson Nano, proving the approach both scalable and production-ready.
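
The abstract does not describe Hierarchical Speculative Decoding in detail; as background only, the sketch below shows plain greedy speculative decoding (draft-then-verify), with `draft_next` and `target_next` as hypothetical callables returning the argmax next token for a context. It illustrates the general draft/verify idea, not HOLA's hierarchical variant.

```python
from typing import Callable, List

def speculative_decode_greedy(
    draft_next: Callable[[List[int]], int],
    target_next: Callable[[List[int]], int],
    prompt: List[int],
    max_new: int = 32,
    k: int = 4,
) -> List[int]:
    """Vanilla greedy draft-then-verify decoding, shown for illustration only."""
    tokens = list(prompt)
    while len(tokens) < len(prompt) + max_new:
        # 1) The cheap draft model proposes k tokens greedily.
        proposal, ctx = [], list(tokens)
        for _ in range(k):
            t = draft_next(ctx)
            proposal.append(t)
            ctx.append(t)
        # 2) The target model verifies: keep the longest agreeing prefix,
        #    then append the target's own token at the first disagreement.
        accepted = 0
        for i, t in enumerate(proposal):
            if target_next(tokens + proposal[:i]) == t:
                accepted += 1
            else:
                break
        tokens.extend(proposal[:accepted])
        if accepted < k:
            tokens.append(target_next(tokens))
    return tokens[: len(prompt) + max_new]

# Toy stand-in models: the draft repeats the last token; the target cycles 0, 1, 2.
draft = lambda ctx: ctx[-1]
target = lambda ctx: len(ctx) % 3
print(speculative_decode_greedy(draft, target, [0], max_new=8, k=3))
```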


90fd4f88f588ae64038134f1eeaa023f-AuthorFeedback.pdf

Neural Information Processing Systems

Thank you for all the helpful comments. Several related works were raised by the reviewers, which we discuss here. We note that the authors have marked their arXiv submission as containing errors. Each of their inner loops uses SGD to solve the distance-regularized objectives. First, we use the EMA of slow weights to adjust the training parameters during optimization.


Randomized Matrix Sketching for Neural Network Training and Gradient Monitoring

Antil, Harbir, Verma, Deepanshu

arXiv.org Artificial Intelligence

Neural network training relies on gradient computation through backpropagation, yet memory requirements for storing layer activations present significant scalability challenges. We present the first adaptation of control-theoretic matrix sketching to neural network layer activations, enabling memory-efficient gradient reconstruction in backpropagation. This work builds on recent matrix sketching frameworks for dynamic optimization problems, where similar state trajectory storage challenges motivate sketching techniques. Our approach sketches layer activations using three complementary sketch matrices maintained through exponential moving averages (EMA) with adaptive rank adjustment, automatically balancing memory efficiency against approximation quality. Empirical evaluation on MNIST, CIFAR-10, and physics-informed neural networks demonstrates a controllable accuracy-memory tradeoff. We demonstrate a gradient monitoring application on MNIST showing how sketched activations enable real-time gradient norm tracking with minimal memory overhead. These results establish that sketched activation storage provides a viable path toward memory-efficient neural network training and analysis.
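
As a hedged illustration of the storage saving (the paper maintains three complementary sketch matrices with adaptive rank; the single fixed-rank sketch below is a simplified assumption), one EMA-maintained randomized sketch of a layer's activations might look like:

```python
import numpy as np

class ActivationSketch:
    """Maintain a low-dimensional randomized sketch of one layer's activations,
    updated with an EMA across training steps. Rank, decay, and the use of a
    single Gaussian test matrix are illustrative simplifications."""
    def __init__(self, activation_dim, rank=16, decay=0.95, seed=0):
        rng = np.random.default_rng(seed)
        # Fixed Gaussian test matrix, activation_dim x rank.
        self.omega = rng.standard_normal((activation_dim, rank)) / np.sqrt(rank)
        self.decay = decay
        self.sketch = None  # running EMA of activations @ omega, shape (batch, rank)

    def update(self, activations):
        """activations: (batch, activation_dim) array from one training step."""
        current = activations @ self.omega
        if self.sketch is None:
            self.sketch = current
        else:
            self.sketch = self.decay * self.sketch + (1.0 - self.decay) * current
        return self.sketch

# Toy usage: a 64-unit layer with batch size 8, sketched down to rank 16.
sk = ActivationSketch(activation_dim=64, rank=16)
rng = np.random.default_rng(1)
for _ in range(5):
    sk.update(rng.standard_normal((8, 64)))
print(sk.sketch.shape)  # (8, 16): far smaller than storing the full activations
```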