Goto

Collaborating Authors

 spl




Empirical Assessment of the Perception of Software Product Line Engineering by an SME before Migrating its Code Base

Georges, Thomas, Huchard, Marianne, König, Mélanie, Nebut, Clémentine, Tibermacine, Chouki

arXiv.org Artificial Intelligence

Migrating a set of software variants into a software product line (SPL) is an expensive and potentially challenging endeavor. Indeed, SPL engineering can significantly impact a company's development process and often requires changes to established developer practices. The work presented in this paper stems from a collaboration with a Small and Medium-sized Enterprise (SME) that decided to migrate its existing code base into an SPL. In this study, we conducted an in-depth evaluation of the company's current development processes and practices, as well as the anticipated benefits and risks associated with the migration. Key stakeholders involved in software development participated in this evaluation to provide insight into their perceptions of the migration and their potential resistance to change. This paper describes the design of the interviews conducted with these stakeholders and presents an analysis of the results. Among the qualitative findings, we observed that all participants, regardless of their role in the development process, identified benefits of the migration relevant to their own activities. Furthermore, our results suggest that an effective risk mitigation strategy involves keeping stakeholders informed and engaged throughout the process, preserving as many good practices as possible, and actively involving them in the migration to ensure a smooth transition and minimize potential challenges.


Diverse Preference Learning for Capabilities and Alignment

Slocum, Stewart, Parker-Sartori, Asher, Hadfield-Menell, Dylan

arXiv.org Artificial Intelligence

The ability of LLMs to represent diverse perspectives is critical as they increasingly impact society. However, recent studies reveal that alignment algorithms such as RLHF and DPO significantly reduce the diversity of LLM outputs. Not only do aligned LLMs generate text with repetitive structure and word choice, they also approach problems in more uniform ways, and their responses reflect a narrower range of societal perspectives. We attribute this problem to the KL divergence regularizer employed in preference learning algorithms. This causes the model to systematically overweight majority opinions and sacrifice diversity in its outputs. To address this, we propose Soft Preference Learning, which decouples the entropy and cross-entropy terms in the KL penalty -- allowing for fine-grained control over LLM generation diversity. From a capabilities perspective, LLMs trained using Soft Preference Learning attain higher accuracy on difficult repeated sampling tasks and produce outputs with greater semantic and lexical diversity. From an alignment perspective, they are capable of representing a wider range of societal viewpoints and display improved logit calibration. Notably, Soft Preference Learning resembles, but is a Pareto improvement over, standard temperature scaling. As LLMs become integrated into how people consume information (Bick et al., 2024) and approach tasks (Deloitte, 2024), their ability to represent diverse perspectives is critical. For example, consider an LLM answering the following multiple-choice question: The best way to reduce income inequality is: (A) Increase minimum wage (B) Expand access to education and job training (C) Implement universal basic income (D) Lower taxes on the wealthy to stimulate job creation Imagine a survey showing people's preferences as: A (55%), B (20%), C (15%), and D (10%). How should an LLM respond to this question? Ideally, we may prefer it to reflect the range of views in the population. If an LLM assigns 99% probability to majority option A, it fails to represent the diversity of perspectives. With LLMs becoming important information sources, this may reinforce dominant narratives at the expense of minority views. However, recent studies show that alignment algorithms such as RLHF and DPO significantly reduce the diversity of LLM outputs. This leads to mode collapse towards majority preferences, as the example above shows (Kirk et al., 2024; Padmakumar & He, 2024; Rafailov et al., 2024; Christiano et al., 2023). In a generative setting, this results in repetitive responses, as illustrated in Figure 1. For example, the DPO model frequently uses the same doctor's name and 1 We highlight Doctor name, gender, and textual aberration features shown in the plots on the right. DPO responses are well-formed but lack diversity (e.g.


Enhancing software product lines with machine learning components

Cobaleda, Luz-Viviana, Carvajal, Julián, Vallejo, Paola, López, Andrés, Mazo, Raúl

arXiv.org Artificial Intelligence

Modern software systems increasingly integrate machine learning (ML) due to its advancements and ability to enhance data-driven decision-making. However, this integration introduces significant challenges for software engineering, especially in software product lines (SPLs), where managing variability and reuse becomes more complex with the inclusion of ML components. Although existing approaches have addressed variability management in SPLs and the integration of ML components in isolated systems, few have explored the intersection of both domains. Specifically, there is limited support for modeling and managing variability in SPLs that incorporate ML components. To bridge this gap, this article proposes a structured framework designed to extend Software Product Line engineering, facilitating the integration of ML components. It facilitates the design of SPLs with ML capabilities by enabling systematic modeling of variability and reuse. The proposal has been partially implemented with the VariaMos tool.


Iterative Residual Cross-Attention Mechanism: An Integrated Approach for Audio-Visual Navigation Tasks

Zhang, Hailong, Yu, Yinfeng, Wang, Liejun, Sun, Fuchun, Zheng, Wendong

arXiv.org Artificial Intelligence

Audio-visual navigation represents a significant area of research in which intelligent agents utilize egocentric visual and auditory perceptions to identify audio targets. Conventional navigation methodologies typically adopt a staged modular design, which involves first executing feature fusion, then utilizing Gated Recurrent Unit (GRU) modules for sequence modeling, and finally making decisions through reinforcement learning. While this modular approach has demonstrated effectiveness, it may also lead to redundant information processing and inconsistencies in information transmission between the various modules during the feature fusion and GRU sequence modeling phases. This paper presents IRCAM-AVN (Iterative Residual Cross-Attention Mechanism for Audiovisual Navigation), an end-to-end framework that integrates multimodal information fusion and sequence modeling within a unified IRCAM module, thereby replacing the traditional separate components for fusion and GRU. This innovative mechanism employs a multi-level residual design that concatenates initial multimodal sequences with processed information sequences. This methodological shift progressively optimizes the feature extraction process while reducing model bias and enhancing the model's stability and generalization capabilities. Empirical results indicate that intelligent agents employing the iterative residual cross-attention mechanism exhibit superior navigation performance.


MUVLA: Learning to Explore Object Navigation via Map Understanding

Han, Peilong, Jia, Fan, Zhang, Min, Qiu, Yutao, Tang, Hongyao, Zheng, Yan, Wang, Tiancai, Hao, Jianye

arXiv.org Artificial Intelligence

In this paper, we present MUVLA, a Map Understanding Vision-Language-Action model tailored for object navigation. It leverages semantic map abstractions to unify and structure historical information, encoding spatial context in a compact and consistent form. MUVLA takes the current and history observations, as well as the semantic map, as inputs and predicts the action sequence based on the description of goal object. Furthermore, it amplifies supervision through reward-guided return modeling based on dense short-horizon progress signals, enabling the model to develop a detailed understanding of action value for reward maximization. MUVLA employs a three-stage training pipeline: learning map-level spatial understanding, imitating behaviors from mixed-quality demonstrations, and reward amplification. This strategy allows MUVLA to unify diverse demonstrations into a robust spatial representation and generate more rational exploration strategies. Experiments on HM3D and Gibson benchmarks demonstrate that MUVLA achieves great generalization and learns effective exploration behaviors even from low-quality or partially successful trajectories.


Dynamic Multi-Target Fusion for Efficient Audio-Visual Navigation

Yu, Yinfeng, Zhang, Hailong, Zhu, Meiling

arXiv.org Artificial Intelligence

Audiovisual embodied navigation enables robots to locate audio sources by dynamically integrating visual observations from onboard sensors with the auditory signals emitted by the target. The core challenge lies in effectively leveraging multimodal cues to guide navigation. While prior works have explored basic fusion of visual and audio data, they often overlook deeper perceptual context. To address this, we propose the Dynamic Multi-Target Fusion for Efficient Audio-Visual Navigation (DMTF-A VN). Our approach uses a multi-target architecture coupled with a refined Transformer mechanism to filter and selectively fuse cross-modal information. Extensive experiments on the Replica and Matterport3D datasets demonstrate that DMTF-A VN achieves state-of-the-art performance, outperforming existing methods in success rate (SR), path efficiency (SPL), and scene adaptation (SNA). Furthermore, the model exhibits strong scalability and generalizability, paving the way for advanced multimodal fusion strategies in robotic navigation.


FiLM-Nav: Efficient and Generalizable Navigation via VLM Fine-tuning

Yokoyama, Naoki, Ha, Sehoon

arXiv.org Artificial Intelligence

Enabling robotic assistants to navigate complex environments and locate objects described in free-form language is a critical capability for real-world deployment. While foundation models, particularly Vision-Language Models (VLMs), offer powerful semantic understanding, effectively adapting their web-scale knowledge for embodied decision-making remains a key challenge. We present FiLM-Nav (Fine-tuned Language Model for Navigation), an approach that directly fine-tunes pre-trained VLM as the navigation policy. In contrast to methods that use foundation models primarily in a zero-shot manner or for map annotation, FiLM-Nav learns to select the next best exploration frontier by conditioning directly on raw visual trajectory history and the navigation goal. Leveraging targeted simulated embodied experience allows the VLM to ground its powerful pre-trained representations in the specific dynamics and visual patterns relevant to goal-driven navigation. Critically, fine-tuning on a diverse data mixture combining ObjectNav, OVON, ImageNav, and an auxiliary spatial reasoning task proves essential for achieving robustness and broad generalization. FiLM-Nav sets a new state-of-the-art in both SPL and success rate on HM3D ObjectNav among open-vocabulary methods, and sets a state-of-the-art SPL on the challenging HM3D-OVON benchmark, demonstrating strong generalization to unseen object categories. Our work validates that directly fine-tuning VLMs on diverse simulated embodied data is a highly effective pathway towards generalizable and efficient semantic navigation capabilities.