Adaptive Nonlinear Data Assimilation through P-Spline Triangular Measure Transport

arXiv.org Machine Learning

Non-Gaussian statistics are a challenge for data assimilation. Linear methods oversimplify the problem, yet fully nonlinear methods are often too expensive to use in practice. The best solution usually lies between these extremes. Triangular measure transport offers a flexible framework for nonlinear data assimilation. Its success, however, depends on how the map is parametrized. Too much flexibility leads to overfitting; too little misses important structure. To address this balance, we develop an adaptation algorithm that selects a parsimonious parametrization automatically. Our method uses P-spline basis functions and an information criterion as a continuous measure of model complexity. This formulation enables gradient descent and allows efficient, fine-scale adaptation in high-dimensional settings. The resulting algorithm requires no hyperparameter tuning. It adjusts the transport map to the appropriate level of complexity based on the system statistics and ensemble size. We demonstrate its performance in nonlinear, non-Gaussian problems, including a high-dimensional distributed groundwater model.
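As a rough illustration of the P-spline idea mentioned in the abstract, the sketch below computes the standard difference penalty on adjacent spline coefficients, which is what distinguishes a P-spline from an unpenalized B-spline fit. It is not the paper's adaptation algorithm or its information criterion; the function name `difference_penalty` and the example coefficient vectors are assumptions made for illustration.

```python
def difference_penalty(coeffs, order=2):
    """Sum of squared order-th differences of adjacent spline coefficients.

    Hypothetical helper: in a P-spline fit this quantity, scaled by a
    smoothing weight, is added to the data-misfit term.
    """
    diffs = list(coeffs)
    for _ in range(order):
        # One pass of first differences: d[k] = c[k+1] - c[k]
        diffs = [b - a for a, b in zip(diffs, diffs[1:])]
    return sum(d * d for d in diffs)


# Coefficients on a straight line have zero second-order differences,
# so they incur no penalty.
linear = [0.5 * k for k in range(8)]
# Alternating coefficients describe a highly oscillatory spline.
wiggly = [(-1) ** k for k in range(8)]

pen_linear = difference_penalty(linear)
pen_wiggly = difference_penalty(wiggly)
print(pen_linear, pen_wiggly)
```

A larger penalty weight pulls the fitted coefficients toward the smooth (here, linear) case, which is one way the flexibility of a spline parametrization can be varied continuously rather than by adding or removing basis functions.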





Designing value-aligned autonomous vehicles: from moral dilemmas to conflict-sensitive design

AIHub

Imagine an autonomous car driving along a quiet suburban road when a dog suddenly runs onto it. The system must brake hard and decide, within a fraction of a second, whether to swerve into oncoming traffic (where the other autonomous car might make space), steer right and hit the roadside barrier, or continue straight and injure the dog. The first two options risk only material damage; the last harms a living creature. Each choice is justifiable and involves trade-offs between safety, property and ethical concerns. However, today's autonomous systems are not designed to explicitly take such value-laden conflicts into account.



Privacy Preservation and Identity Tracing Prevention in AI-Driven Eye Tracking for Interactive Learning Environments

arXiv.org Artificial Intelligence

Eye-tracking technology can aid in understanding neurodevelopmental disorders and tracing a person's identity. However, this technology poses a significant risk to privacy, as it captures sensitive information about individuals and increases the likelihood that data can be traced back to them. This paper proposes a human-centered framework designed to prevent identity backtracking while preserving the pedagogical benefits of AI-powered eye tracking in interactive learning environments. We explore how real-time data anonymization, ethical design principles, and regulatory compliance (such as GDPR) can be integrated to build trust and transparency. We first demonstrate the potential for backtracking student IDs and diagnoses in various scenarios using serious game-based eye-tracking data. We then provide a two-stage privacy-preserving framework that prevents participants from being tracked while still enabling diagnostic classification. The first phase covers four scenarios: (I) predicting disorder diagnoses based on different game levels; (II) predicting student IDs based on different game levels; (III) predicting student IDs based on randomized data; and (IV) utilizing K-Means for out-of-sample data. In the second phase, we present a two-stage framework that preserves privacy. We also employ Federated Learning (FL) across multiple clients, incorporating a secure identity management system with dummy IDs and administrator-only access controls. In the first phase, the proposed framework achieved 99.3% accuracy for scenario I, 63% accuracy for scenario II, and 99.7% accuracy for scenario III, and it successfully identified and assigned a new student ID in scenario IV. In the second phase, we effectively prevented backtracking and established a secure identity management system with dummy IDs and administrator-only access controls, achieving an overall accuracy of 99.40%.


Calibrated and uncertain? Evaluating uncertainty estimates in binary classification models

arXiv.org Machine Learning

Rigorous statistical methods, including parameter estimation with accompanying uncertainties, underpin the validity of scientific discovery, especially in the natural sciences. With increasingly complex data models such as deep learning techniques, uncertainty quantification has become exceedingly difficult and a plethora of techniques have been proposed. In this case study, we use the unifying framework of approximate Bayesian inference combined with empirical tests on carefully created synthetic classification datasets to investigate qualitative properties of six different probabilistic machine learning algorithms for class probability and uncertainty estimation: (i) a neural network ensemble, (ii) neural network ensemble with conflictual loss, (iii) evidential deep learning, (iv) a single neural network with Monte Carlo Dropout, (v) Gaussian process classification and (vi) a Dirichlet process mixture model. We check if the algorithms produce uncertainty estimates which reflect commonly desired properties, such as being well calibrated and exhibiting an increase in uncertainty for out-of-distribution data points. Our results indicate that all algorithms are well calibrated, but none of the deep learning based algorithms provide uncertainties that consistently reflect lack of experimental evidence for out-of-distribution data points. We hope our study may serve as a clarifying example for researchers developing new methods of uncertainty estimation for scientific data-driven modeling.
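To make the notion of "well calibrated" concrete, here is a minimal sketch of a binned calibration check for a binary classifier: within each probability bin, the mean predicted probability is compared with the observed fraction of positives. This is a common generic diagnostic, not necessarily the evaluation used in the study; the function name `expected_calibration_error`, the bin count, and the synthetic data are all assumptions for illustration.

```python
import random


def expected_calibration_error(probs, labels, n_bins=10):
    """Weighted mean gap between predicted probability and observed
    positive rate, over equal-width probability bins (hypothetical helper)."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, y))
    total = len(probs)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_p = sum(p for p, _ in members) / len(members)
        pos_rate = sum(y for _, y in members) / len(members)
        ece += len(members) / total * abs(mean_p - pos_rate)
    return ece


rng = random.Random(0)
probs = [rng.random() for _ in range(20000)]

# Labels drawn with exactly the predicted probability: calibrated by construction.
calibrated = [1 if rng.random() < p else 0 for p in probs]
# True positive rate pulled halfway toward 0.5: the predictions are overconfident.
overconfident = [1 if rng.random() < 0.5 + 0.5 * (p - 0.5) else 0 for p in probs]

ece_good = expected_calibration_error(probs, calibrated)
ece_bad = expected_calibration_error(probs, overconfident)
print(ece_good, ece_bad)
```

A low value on in-distribution data says nothing about behavior on out-of-distribution points, which is why the abstract treats the two properties separately.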



The CASTLE 2024 Dataset: Advancing the Art of Multimodal Understanding

arXiv.org Artificial Intelligence

Multi-perspective datasets that combine first-person and third-person views are rare and typically include only a limited number of activities and do not last long enough to capture the full range of interactions and social dynamics characteristic of everyday life. In this paper, we introduce the CASTLE 2024 dataset, a multimodal multi-perspective collection of ego-centric (first-person) and exo-centric (third-person) high-resolution video recordings, augmented with additional sensor streams, designed to capture the complexity of daily human experiences. The dataset captures the experience and daily interaction of ten volunteer participants over the course of four days. It shows a broad range of domestic and social activities, including cooking, eating, cleaning, meeting and leisure activities, capturing authentic interactions among participants.

Egocentric video has seen increased interest in recent years, as it is used in a range of areas. However, most existing datasets are limited to a single perspective. In this paper, we present the CASTLE 2024 dataset, a multimodal collection containing ego- and exo-centric (i.e., first- and third-person perspective) video and audio from 15 time-aligned sources, as well as other sensor streams and auxiliary data. The dataset was recorded by volunteer participants over four days in a fixed location and includes the point of view of 10 participants, with an additional 5 fixed cameras providing an exocentric perspective. The entire dataset contains over 600 hours of UHD video recorded at 50 frames per second. In contrast to other datasets, CASTLE 2024 does not contain any partial censoring, such ...