Energy
Efficient task and path planning for maintenance automation using a robot system
Friedrich, Christian, Csiszar, Akos, Lechler, Armin, Verl, Alexander
The research and development of intelligent automation solutions is a ground-breaking point for the factory of the future. A promising and challenging mission is the use of autonomous robot systems to automate tasks in the field of maintenance. For this purpose, the robot system must be able to plan autonomously the different manipulation tasks and the corresponding paths. Basic requirements are the development of algorithms with a low computational complexity and the possibility to deal with environmental uncertainties. In this work, an approach is presented, which is especially suited to solve the problem of maintenance automation. For this purpose, offline data from CAD is combined with online data from an RGBD vision system via a probabilistic filter, to compensate uncertainties from offline data. For planning the different tasks, a method is explained, which use a symbolic description, founded on a novel sampling-based method to compute the disassembly space. For path planning we use global state-of-the art algorithms with a method that allows the adaption of the exploration stepsize in order to reduce the planning time. Every method is experimentally validated and discussed.
Maintenance automation: methods for robotics manipulation planning and execution
Friedrich, Christian, Gulde, Ralf, Lechler, Armin, Verl, Alexander
Automating complex tasks using robotic systems requires skills for planning, control and execution. This paper proposes a complete robotic system for maintenance automation, which can automate disassembly and assembly operations under environmental uncertainties (e.g. deviations between prior plan information). The cognition of the robotic system is based on a planning approach (using CAD and RGBD data) and includes a method to interpret a symbolic plan and transform it to a set of executable robot instructions. The complete system is experimentally evaluated using real-world applications. This work shows the first step to transfer these theoretical results into a practical robotic solution.
PKG-DPO: Optimizing Domain-Specific AI systems with Physics Knowledge Graphs and Direct Preference Optimization
Kulkarni, Nitin Nagesh, Wilcox, Bryson, Sawa, Max, Thom, Jason
Advancing AI systems in scientific domains like physics, materials science, and engineering calls for reasoning over complex, multi-physics phenomena while respecting governing principles. Although Large Language Models (LLMs) and existing preference optimization techniques perform well on standard benchmarks, they often struggle to differentiate between physically valid and invalid reasoning. This shortcoming becomes critical in high-stakes applications like metal joining, where seemingly plausible yet physically incorrect recommendations can lead to defects, material waste, equipment damage, and serious safety risks. To address this challenge, we introduce PKG-DPO, a novel framework that integrates Physics Knowledge Graphs (PKGs) with Direct Preference Optimization (DPO) to enforce physical validity in AI-generated outputs. PKG-DPO comprises three key components A) hierarchical physics knowledge graph that encodes cross-domain relationships, conservation laws, and thermodynamic principles. B) A physics reasoning engine that leverages structured knowledge to improve discrimination between physically consistent and inconsistent responses. C) A physics-grounded evaluation suite designed to assess compliance with domain-specific constraints. PKG-DPO achieves 17% fewer constraint violations and an 11% higher Physics Score compared to KG-DPO (knowledge graph-based DPO). Additionally, PKG-DPO demonstrates a 12\% higher relevant parameter accuracy and a 7% higher quality alignment in reasoning accuracy. While our primary focus is on metal joining, the framework is broadly applicable to other multi-scale, physics-driven domains, offering a principled approach to embedding scientific constraints into preference learning.
Spacer: Towards Engineered Scientific Inspiration
Lee, Minhyeong, Hwang, Suyoung, Moon, Seunghyun, Nah, Geonho, Koh, Donghyun, Cho, Youngjun, Park, Johyun, Yoo, Hojin, Park, Jiho, Choi, Haneul, Moon, Sungbin, Hwang, Taehoon, Kim, Seungwon, Kim, Jaeyeong, Kim, Seongjun, Jung, Juneau
Recent advances in LLMs have made automated scientific research the next frontline in the path to artificial superintelligence. However, these systems are bound either to tasks of narrow scope or the limited creative capabilities of LLMs. We propose Spacer, a scientific discovery system that develops creative and factually grounded concepts without external intervention. Spacer attempts to achieve this via 'deliberate decontextualization,' an approach that disassembles information into atomic units - keywords - and draws creativity from unexplored connections between them. Spacer consists of (i) Nuri, an inspiration engine that builds keyword sets, and (ii) the Manifesting Pipeline that refines these sets into elaborate scientific statements. Nuri extracts novel, high-potential keyword sets from a keyword graph built with 180,000 academic publications in biological fields. The Manifesting Pipeline finds links between keywords, analyzes their logical structure, validates their plausibility, and ultimately drafts original scientific concepts. According to our experiments, the evaluation metric of Nuri accurately classifies high-impact publications with an AUROC score of 0.737. Our Manifesting Pipeline also successfully reconstructs core concepts from the latest top-journal articles solely from their keyword sets. An LLM-based scoring system estimates that this reconstruction was sound for over 85% of the cases. Finally, our embedding space analysis shows that outputs from Spacer are significantly more similar to leading publications compared with those from SOTA LLMs.
Generative Artificial Intelligence and Agents in Research and Teaching
Jauhiainen, Jussi S., Toppari, Aurora
This study provides a comprehensive analysis of the development, functioning, and application of generative artificial intelligence (GenAI) and large language models (LLMs), with an emphasis on their implications for research and education. It traces the conceptual evolution from artificial intelligence (AI) through machine learning (ML) and deep learning (DL) to transformer architectures, which constitute the foundation of contemporary generative systems. Technical aspects, including prompting strategies, word embeddings, and probabilistic sampling methods (temperature, top-k, and top-p), are examined alongside the emergence of autonomous agents. These elements are considered in relation to both the opportunities they create and the limitations and risks they entail. The work critically evaluates the integration of GenAI across the research process, from ideation and literature review to research design, data collection, analysis, interpretation, and dissemination. While particular attention is given to geographical research, the discussion extends to wider academic contexts. A parallel strand addresses the pedagogical applications of GenAI, encompassing course and lesson design, teaching delivery, assessment, and feedback, with geography education serving as a case example. Central to the analysis are the ethical, social, and environmental challenges posed by GenAI. Issues of bias, intellectual property, governance, and accountability are assessed, alongside the ecological footprint of LLMs and emerging technological strategies for mitigation. The concluding section considers near- and long-term futures of GenAI, including scenarios of sustained adoption, regulation, and potential decline. By situating GenAI within both scholarly practice and educational contexts, the study contributes to critical debates on its transformative potential and societal responsibilities.
Comparative Analysis of UAV Path Planning Algorithms for Efficient Navigation in Urban 3D Environments
Cheriet, Hichem, Badra, Khellat Kihel, Samira, Chouraqui
The most crucial challenges for UAVs are planning paths and avoiding obstacles in their way. In recent years, a wide variety of path-planning algorithms have been developed. These algorithms have successfully solved path-planning problems; however, they suffer from multiple challenges and limitations. To test the effectiveness and efficiency of three widely used algorithms, namely A*, RRT*, and Particle Swarm Optimization (PSO), this paper conducts extensive experiments in 3D urban city environments cluttered with obstacles. Three experiments were designed with two scenarios each to test the aforementioned algorithms. These experiments consider different city map sizes, different altitudes, and varying obstacle densities and sizes in the environment. According to the experimental results, the A* algorithm outperforms the others in both computation efficiency and path quality. PSO is especially suitable for tight turns and dense environments, and RRT* offers a balance and works well across all experiments due to its randomized approach to finding solutions.
Ensembles of Neural Surrogates for Parametric Sensitivity in Ocean Modeling
Sun, Yixuan, Egele, Romain, Narayanan, Sri Hari Krishna, Van Roekel, Luke, Gonzales, Carmelo, Brus, Steven, Nadiga, Balu, Madireddy, Sandeep, Balaprakash, Prasanna
Accurate simulations of the oceans are crucial in understanding the Earth system. Despite their efficiency, simulations at lower resolutions must rely on various uncertain parameterizations to account for unresolved processes. However, model sensitivity to parameterizations is difficult to quantify, making it challenging to tune these parameterizations to reproduce observations. Deep learning surrogates have shown promise for efficient computation of the parametric sensitivities in the form of partial derivatives, but their reliability is difficult to evaluate without ground truth derivatives. In this work, we leverage large-scale hyperparameter search and ensemble learning to improve both forward predictions, autoregressive rollout, and backward adjoint sensitivity estimation. Particularly, the ensemble method provides epistemic uncertainty of function value predictions and their derivatives, providing improved reliability of the neural surrogates in decision making.
Dense Retrievers Can Fail on Simple Queries: Revealing The Granularity Dilemma of Embeddings
Xu, Liyan, Su, Zhenlin, Yu, Mo, Li, Jiangnan, Meng, Fandong, Zhou, Jie
This work stems from an observed limitation of text encoders: embeddings may not be able to recognize fine-grained entities or events within encoded semantics, resulting in failed retrieval even in simple cases. To examine such behaviors, we first introduce a new evaluation dataset, CapRetrieval, in which passages are image captions and queries are phrases targeting entity or event concepts in diverse forms. Zero-shot evaluation suggests that encoders often struggle with these fine-grained matching, regardless of training sources or model size. Aiming for enhancement, we proceed to finetune encoders with our proposed data generation strategies, enabling a small 0.1B encoder to outperform the state-of-the-art 7B model. Within this process, we further uncover the granularity dilemma, a challenge for embeddings to capture fine-grained salience while aligning with overall semantics. Our dataset, code and models in this work are publicly released at https://github.com/lxucs/CapRetrieval.
Uni-AIMS: AI-Powered Microscopy Image Analysis
Hong, Yanhui, Wang, Nan, Xia, Zhiyi, Tao, Haoyi, Fang, Xi, Li, Yiming, Wang, Jiankun, Jin, Peng, Cai, Xiaochen, Li, Shengyu, Chen, Ziqi, Zhang, Zezhong, Ke, Guolin, Zhang, Linfeng
This paper presents a systematic solution for the intelligent recognition and automatic analysis of microscopy images. We developed a data engine that generates high-quality annotated datasets through a combination of the collection of diverse microscopy images from experiments, synthetic data generation and a human-in-the-loop annotation process. To address the unique challenges of microscopy images, we propose a segmentation model capable of robustly detecting both small and large objects. The model effectively identifies and separates thousands of closely situated targets, even in cluttered visual environments. Furthermore, our solution supports the precise automatic recognition of image scale bars, an essential feature in quantitative microscopic analysis. Building upon these components, we have constructed a comprehensive intelligent analysis platform and validated its effectiveness and practicality in real-world applications. This study not only advances automatic recognition in microscopy imaging but also ensures scalability and generalizability across multiple application domains, offering a powerful tool for automated microscopic analysis in interdisciplinary research. A online application is made available for researchers to access and evaluate the proposed automated analysis service.
A Novel Framework for Uncertainty Quantification via Proper Scores for Classification and Beyond
In this PhD thesis, we propose a novel framework for uncertainty quantification in machine learning, which is based on proper scores. Uncertainty quantification is an important cornerstone for trustworthy and reliable machine learning applications in practice. Usually, approaches to uncertainty quantification are problem-specific, and solutions and insights cannot be readily transferred from one task to another. Proper scores are loss functions minimized by predicting the target distribution. Due to their very general definition, proper scores apply to regression, classification, or even generative modeling tasks. We contribute several theoretical results, that connect epistemic uncertainty, aleatoric uncertainty, and model calibration with proper scores, resulting in a general and widely applicable framework. We achieve this by introducing a general bias-variance decomposition for strictly proper scores via functional Bregman divergences. Specifically, we use the kernel score, a kernel-based proper score, for evaluating sample-based generative models in various domains, like image, audio, and natural language generation. This includes a novel approach for uncertainty estimation of large language models, which outperforms state-of-the-art baselines. Further, we generalize the calibration-sharpness decomposition beyond classification, which motivates the definition of proper calibration errors. We then introduce a novel estimator for proper calibration errors in classification, and a novel risk-based approach to compare different estimators for squared calibration errors. Last, we offer a decomposition of the kernel spherical score, another kernel-based proper score, allowing a more fine-grained and interpretable evaluation of generative image models.