Fine Tuning a Simulation-Driven Estimator
Lakshminarayanan, Braghadeesh, Guerrero, Margarita A., Rojas, Cristian R.
Many industries now deploy high-fidelity simulators (digital twins) to represent physical systems, yet their parameters must be calibrated to match the true system. This has motivated the construction of simulation-driven parameter estimators, built by generating synthetic observations for sampled parameter values and learning a supervised mapping from observations to parameters. However, when the true parameters lie outside the sampled range, predictions suffer from out-of-distribution (OOD) error. This paper introduces a fine-tuning approach for the Two-Stage estimator that mitigates OOD effects and improves accuracy. The effectiveness of the proposed method is verified through numerical simulations.
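To make the two-stage recipe concrete, here is a minimal Python sketch. The simulator, the exponential-decay system, the sampling ranges, and the random-forest regressor are all illustrative assumptions, not the paper's exact setup.

```python
# Minimal sketch of a simulation-driven (Two-Stage) estimator. `simulate` is a
# hypothetical stand-in for the digital twin; the system and regressor are
# illustrative choices.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def simulate(theta, n_obs=50, rng=None):
    # Stand-in simulator: noisy observations of a system parameterized by theta.
    rng = rng or np.random.default_rng()
    t = np.linspace(0, 1, n_obs)
    return theta[0] * np.exp(-theta[1] * t) + 0.05 * rng.standard_normal(n_obs)

rng = np.random.default_rng(0)
# Stage one: sample parameter values and generate synthetic observations.
thetas = rng.uniform(low=[0.5, 0.1], high=[2.0, 3.0], size=(2000, 2))
ys = np.stack([simulate(th, rng=rng) for th in thetas])

# Stage two: learn a supervised mapping from observations back to parameters.
estimator = RandomForestRegressor(n_estimators=100).fit(ys, thetas)

y_real = simulate(np.array([1.2, 0.8]), rng=rng)   # data from the "true system"
theta_hat = estimator.predict(y_real[None, :])[0]  # OOD error appears when the
print(theta_hat)                                   # true theta leaves the sampled range
```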
Generalizable Reasoning through Compositional Energy Minimization
Generalization is a key challenge in machine learning, especially in reasoning tasks, where models are expected to solve problems more complex than those encountered during training. Existing approaches typically train reasoning models in an end-to-end fashion, directly mapping input instances to solutions. While this allows models to learn useful heuristics from data, it often results in limited generalization beyond the training distribution. In this work, we propose a novel approach to reasoning generalization by learning energy landscapes over the solution spaces of smaller, more tractable subproblems. At test time, we construct a global energy landscape for a given problem by combining the energy functions of multiple subproblems. This compositional approach enables the incorporation of additional constraints during inference, allowing the construction of energy landscapes for problems of increasing difficulty. To improve the sample quality from this newly constructed energy landscape, we introduce Parallel Energy Minimization (PEM). We evaluate our approach on a wide set of reasoning problems. Our method outperforms existing state-of-the-art methods, demonstrating its ability to generalize to larger and more complex problems. The project website can be found at: https://alexoarga.github.io/compositional_reasoning/
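A rough sketch of the compositional idea follows, with toy quadratic energies standing in for learned energy networks; optimizing many candidates in parallel only loosely mirrors Parallel Energy Minimization, whose details are in the paper.

```python
# Hedged sketch: compose subproblem energies into a global landscape and
# minimize it. The quadratic energies are toy stand-ins for learned networks.
import torch

def energy_a(x):  # subproblem 1: pull x toward 1 on all dimensions
    return ((x - 1.0) ** 2).sum(dim=-1)

def energy_b(x):  # subproblem 2 / extra constraint: pull dim 0 toward -1
    return (x[..., 0] + 1.0) ** 2

def global_energy(x):
    # Compositional landscape: sum of subproblem energy functions.
    return energy_a(x) + energy_b(x)

x = torch.randn(64, 8, requires_grad=True)  # 64 candidate solutions in parallel
opt = torch.optim.Adam([x], lr=0.1)
for _ in range(200):
    opt.zero_grad()
    global_energy(x).sum().backward()
    opt.step()

best = x[global_energy(x).argmin()]  # keep the lowest-energy candidate
```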
Reveal and Release: Iterative LLM Unlearning with Self-generated Data
Xie, Linxi, Teng, Xin, Ke, Shichang, Wen, Hongyi, Wang, Shengjie
Large language model (LLM) unlearning has demonstrated effectiveness in removing the influence of undesirable data (also known as forget data). Existing approaches typically assume full access to the forget dataset, overlooking two key challenges: (1) forget data is often privacy-sensitive, rare, or legally regulated, making it expensive or impractical to obtain; and (2) the distribution of available forget data may not align with how that information is represented within the model. To address these limitations, we propose a "Reveal-and-Release" method to unlearn with self-generated data, where we prompt the model to reveal what it knows using optimized instructions. To fully utilize the self-generated forget data, we propose an iterative unlearning framework, where we make incremental adjustments to the model's weight space with parameter-efficient modules trained on the forget data. Experimental results demonstrate that our method balances the tradeoff between forget quality and utility preservation.
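A heavily simplified sketch of the reveal-then-unlearn loop, using Hugging Face transformers and peft: the reveal prompt, the plain gradient-ascent loss, GPT-2 as the base model, and the round count are all illustrative stand-ins for the paper's optimized instructions and unlearning objective.

```python
# Hedged sketch of an iterative reveal-and-release loop; every concrete choice
# here (prompt, loss, model, hyperparameters) is an illustrative assumption.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

for round_ in range(3):  # iterative framework: small incremental weight updates
    # 1) Reveal: prompt the model to surface what it knows about the forget topic.
    prompt = tok("List everything you know about <forget topic>:", return_tensors="pt")
    reveal = model.generate(**prompt, max_new_tokens=64, do_sample=True)
    forget_text = tok.decode(reveal[0], skip_special_tokens=True)

    # 2) Release: train a parameter-efficient module to *raise* loss on that text.
    peft_model = get_peft_model(model, LoraConfig(r=8, target_modules=["c_attn"]))
    opt = torch.optim.AdamW(peft_model.parameters(), lr=1e-4)
    batch = tok(forget_text, return_tensors="pt")
    for _ in range(10):
        loss = -peft_model(**batch, labels=batch["input_ids"]).loss  # gradient ascent
        opt.zero_grad()
        loss.backward()
        opt.step()

    model = peft_model.merge_and_unload()  # fold the adjustment into the weight space
```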
Domain Expansion: Parameter-Efficient Modules as Building Blocks for Composite Domains
Patel, Mann, Panda, Divyajyoti, Mehta, Hilay, Patel, Parth, Parikh, Dhruv
Parameter-Efficient Fine-Tuning (PEFT) is an efficient alternative to full-scale fine-tuning that has gained popularity recently. With pre-trained model sizes growing exponentially, PEFT can be used to fine-tune compact modules, Parameter-Efficient Modules (PEMs), trained to be domain experts over diverse domains. In this project, we explore composing such individually fine-tuned PEMs for distribution generalization over the composite domain. To compose PEMs, simple composing functions are used that operate purely on the weight space of the individually fine-tuned PEMs, without requiring any additional fine-tuning. The proposed method is applied to the task of representing the 16 Myers-Briggs Type Indicator (MBTI) composite personalities via 4 building-block dichotomies, comprising 8 individual traits that can be merged (composed) to yield a unique personality. We evaluate the individual trait PEMs and the composed personality PEMs via an online MBTI personality quiz, validating the efficacy of PEFT for fine-tuning PEMs and of merging PEMs without further fine-tuning for domain composition.
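As a concrete illustration, composing two trait PEMs in weight space can be as simple as averaging their parameters. The sketch below assumes the modules are saved as plain state dicts fine-tuned from the same base model; the file names are hypothetical.

```python
# Minimal sketch of weight-space composition of two trait PEMs, assuming both
# state dicts share the same keys and base model; averaging is one simple
# composing function, and no further fine-tuning is needed.
import torch

introvert = torch.load("pem_introvert.pt")   # PEM for one dichotomy pole
intuitive = torch.load("pem_intuitive.pt")   # PEM for another trait

composed = {k: 0.5 * (introvert[k] + intuitive[k]) for k in introvert}
torch.save(composed, "pem_IN_composite.pt")  # building block toward an "IN__" type
```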
ReARTeR: Retrieval-Augmented Reasoning with Trustworthy Process Rewarding
Sun, Zhongxiang, Wang, Qipeng, Yu, Weijie, Zang, Xiaoxue, Zheng, Kai, Xu, Jun, Zhang, Xiao, Yang, Song, Li, Han
Retrieval-Augmented Generation (RAG) systems for Large Language Models (LLMs) hold promise in knowledge-intensive tasks but face limitations in complex multi-step reasoning. While recent methods have integrated RAG with chain-of-thought reasoning or test-time search using Process Reward Models (PRMs), these approaches encounter challenges such as a lack of explanations, bias in PRM training data, early-step bias in PRM scores, and insufficient post-training optimization of reasoning potential. To address these issues, we propose Retrieval-Augmented Reasoning through Trustworthy Process Rewarding (ReARTeR), a framework that enhances RAG systems' reasoning capabilities through post-training and test-time scaling. At test time, ReARTeR introduces Trustworthy Process Rewarding via a Process Reward Model for accurate scalar scoring and a Process Explanation Model (PEM) for generating natural language explanations, enabling step refinement. During post-training, it utilizes Monte Carlo Tree Search guided by Trustworthy Process Rewarding to collect high-quality step-level preference data, optimized through Iterative Preference Optimization. ReARTeR addresses three core challenges: (1) misalignment between PRM and PEM, tackled through off-policy preference learning; (2) bias in PRM training data, mitigated by balanced annotation methods and stronger annotations for challenging examples; and (3) early-step bias in PRM, resolved through a temporal-difference-based look-ahead search strategy. Experimental results on multi-step reasoning benchmarks demonstrate significant improvements, underscoring ReARTeR's potential to advance the reasoning capabilities of RAG systems.
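The temporal-difference look-ahead used against early-step bias can be sketched in a few lines; `prm_score` and `rollout` are hypothetical helpers, and ReARTeR's actual MCTS-guided search is considerably more involved.

```python
# Hedged sketch of a TD-style look-ahead score for a reasoning-step prefix.
# `prm_score(steps) -> float` and `rollout(steps) -> continued steps` are
# assumed helpers, not ReARTeR's real interfaces.
def lookahead_score(steps, prm_score, rollout, gamma=0.9, n_rollouts=4):
    immediate = prm_score(steps)  # scalar PRM score for the current prefix
    futures = [prm_score(rollout(steps)) for _ in range(n_rollouts)]
    # TD-style blend: discount the best observed continuation into the score,
    # so early steps are judged partly by where they can lead.
    return immediate + gamma * (max(futures) - immediate)
```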
EMPERROR: A Flexible Generative Perception Error Model for Probing Self-Driving Planners
Hanselmann, Niklas, Doll, Simon, Cordts, Marius, Lensch, Hendrik P. A., Geiger, Andreas
To handle the complexities of real-world traffic, learning planners for self-driving from data is a promising direction. While recent approaches have shown great progress, they typically assume a setting in which the ground-truth world state is available as input. However, when deployed, planning needs to be robust to the long tail of errors incurred by a noisy perception system, which is often neglected in evaluation. To address this, previous work has proposed drawing adversarial samples from a perception error model (PEM) mimicking the noise characteristics of a target object detector. However, these methods use simple PEMs that fail to accurately capture all failure modes of detection. In this paper, we present EMPERROR, a novel transformer-based generative PEM, apply it to stress-test an imitation learning (IL)-based planner, and show that it imitates modern detectors more faithfully than previous work. Furthermore, it is able to produce realistic noisy inputs that increase the planner's collision rate by up to 85%, demonstrating its utility as a valuable tool for a more complete evaluation of self-driving planners.
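For contrast, a "simple PEM" of the kind the paper argues is insufficient can be written in a few lines: independent detection drops plus Gaussian box jitter. EMPERROR itself is a learned transformer-based generative model; this baseline and its noise parameters are purely illustrative.

```python
# A deliberately simple perception error model: i.i.d. missed detections and
# Gaussian localization noise on ground-truth boxes. Parameters are illustrative.
import numpy as np

def simple_pem(gt_boxes, p_miss=0.1, sigma=0.3, rng=None):
    """gt_boxes: (N, 4) array of ground-truth boxes; returns noisy 'detections'."""
    rng = rng or np.random.default_rng()
    kept = gt_boxes[rng.random(len(gt_boxes)) > p_miss]    # missed detections
    return kept + sigma * rng.standard_normal(kept.shape)  # localization noise
```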
PEM: Prototype-based Efficient MaskFormer for Image Segmentation
Cavagnero, Niccolò, Rosi, Gabriele, Cuttano, Claudia, Pistilli, Francesca, Ciccone, Marco, Averta, Giuseppe, Cermelli, Fabio
Recent transformer-based architectures have shown impressive results in the field of image segmentation. Thanks to their flexibility, they obtain outstanding performance in multiple segmentation tasks, such as semantic and panoptic, under a single unified framework. To achieve such impressive performance, these architectures employ intensive operations and require substantial computational resources, which are often not available, especially on edge devices. To fill this gap, we propose Prototype-based Efficient MaskFormer (PEM), an efficient transformer-based architecture that can operate in multiple segmentation tasks. PEM proposes a novel prototype-based cross-attention which leverages the redundancy of visual features to restrict the computation and improve the efficiency without harming the performance. In addition, PEM introduces an efficient multi-scale feature pyramid network, capable of extracting features that have high semantic content in an efficient way, thanks to the combination of deformable convolutions and context-based self-modulation. We benchmark the proposed PEM architecture on two tasks, semantic and panoptic segmentation, evaluated on two different datasets, Cityscapes and ADE20K. PEM demonstrates outstanding performance on every task and dataset, outperforming task-specific architectures while being comparable to, or even better than, computationally expensive baselines.
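A toy rendering of the prototype idea: update each query from its single most-similar ("prototype") pixel feature instead of softmax-attending over the whole feature map. PEM's actual module is more elaborate (and is what delivers the efficiency gains); this sketch only conveys the spirit of exploiting feature redundancy.

```python
# Hedged sketch of prototype-style cross-attention: hard top-1 selection of a
# prototype feature per query replaces softmax aggregation over all pixels.
# Shapes and the gating are simplifications, not PEM's exact design.
import torch

def prototype_cross_attention(queries, feats):
    """queries: (Q, C) object queries; feats: (HW, C) flattened pixel features."""
    sim = queries @ feats.T                 # (Q, HW) query-feature similarities
    protos = feats[sim.argmax(dim=-1)]      # one "prototype" feature per query
    gate = torch.sigmoid((queries * protos).sum(-1, keepdim=True))
    return queries + gate * protos          # update each query from its prototype only
```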
Composing Parameter-Efficient Modules with Arithmetic Operations
Zhang, Jinghan, Chen, Shiqi, Liu, Junteng, He, Junxian
As an efficient alternative to conventional full finetuning, parameter-efficient finetuning (PEFT) is becoming the prevailing method to adapt pretrained language models. In PEFT, a lightweight module is learned on each dataset while the underlying pretrained language model remains unchanged, resulting in multiple compact modules representing diverse skills when applied to various domains and tasks. In this paper, we propose to compose these parameter-efficient modules through linear arithmetic operations in the weight space, thereby integrating different module capabilities. Specifically, we first define addition and negation operators for the module, and then further compose these two basic operators to perform flexible arithmetic. Our approach requires no additional training and enables highly flexible module composition. We apply different arithmetic operations to compose the parameter-efficient modules for (1) distribution generalization, (2) multi-tasking, (3) unlearning, and (4) domain transfer. Additionally, we extend our approach to detoxify Alpaca-LoRA, the latest instruction-tuned large language model based on LLaMA. Empirical results demonstrate that our approach produces new and effective parameter-efficient modules that significantly outperform existing ones across all settings.
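In the flat-weight special case, the two basic operators take only a few lines; the paper defines them more carefully (e.g. for LoRA factors), so treat `add` and `neg` below as an illustrative sketch over plain state dicts from the same base model.

```python
# Hedged sketch of the addition and negation operators on module weights,
# assuming modules stored as plain state dicts with matching keys.
def add(module_a, module_b, lam=0.5):
    # Weighted addition: integrate two skills into one module.
    return {k: lam * module_a[k] + (1 - lam) * module_b[k] for k in module_a}

def neg(module):
    # Negation: subtract a skill, e.g. for unlearning or detoxification.
    return {k: -v for k, v in module.items()}

# Flexible composition, e.g. "skill A plus skill B minus toxic style":
# composed = add(add(mod_a, mod_b), neg(mod_toxic), lam=0.7)
```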