Goto

Collaborating Authors

 Bayesian Learning


Compositional Understanding in Signaling Games

arXiv.org Artificial Intelligence

Even when the signalers send compositional messages, the receivers do not interpret them compositionally. When information from one message component is lost or forgotten, the information from other components is also erased. In this paper I construct signaling game models in which genuine compositional understanding evolves. I present two new models: a minimalist receiver who only learns from the atomic messages of a signal, and a generalist receiver who learns from all of the available information. These models are in many ways simpler than previous alternatives, and allow the receivers to learn from the atomic components of messages.


CLEVER: Stream-based Active Learning for Robust Semantic Perception from Human Instructions

arXiv.org Artificial Intelligence

We propose CLEVER, an active learning system for robust semantic perception with Deep Neural Networks (DNNs). For data arriving in streams, our system seeks human support when encountering failures and adapts DNNs online based on human instructions. In this way, CLEVER can eventually accomplish the given semantic perception tasks. Our main contribution is the design of a system that meets several desiderata of realizing the aforementioned capabilities. The key enabler herein is our Bayesian formulation that encodes domain knowledge through priors. Empirically, we not only motivate CLEVER's design but further demonstrate its capabilities with a user validation study as well as experiments on humanoid and deformable objects. To our knowledge, we are the first to realize stream-based active learning on a real robot, providing evidence that the robustness of the DNN-based semantic perception can be improved in practice. The project website can be accessed at https://sites.google.com/view/thecleversystem.


An Adaptive Random Fourier Features approach Applied to Learning Stochastic Differential Equations

arXiv.org Artificial Intelligence

The efficient identification of dynamical systems from data is a fundamental challenge in many scientific and engineering domains. Classical parameter estimation techniques for stochastic differential equations (SDEs) - including maximum likelihood estimation, the method of moments, and Bayesian inference [15], [21], have widespread applications in physics [19], [23], finance [1], [8] and biology [20]. Despite their utility, these methods impose strong model assumptions, demand substantial analytical effort, and often become computationally intractable for complex or high-dimensional systems. Recent advances in machine learning have offer new options for data-driven modelling of dynamical systems [17]. Deep learning frameworks, such as residual networks, neural ordinary differential equations [3], and neural partial differential equations (PDEs) [14, 18], demonstrate significant promise in approximating complex dynamical systems.


Old Rules in a New Game: Mapping Uncertainty Quantification to Quantum Machine Learning

arXiv.org Artificial Intelligence

One of the key obstacles in traditional deep learning is the reduction in model transparency caused by increasingly intricate model functions, which can lead to problems such as overfitting and excessive confidence in predictions. With the advent of quantum machine learning offering possible advances in computational power and latent space complexity, we notice the same opaque behavior. Despite significant research in classical contexts, there has been little advancement in addressing the black-box nature of quantum machine learning. Consequently, we approach this gap by building upon existing work in classical uncertainty quantification and initial explorations in quantum Bayesian modeling to theoretically develop and empirically evaluate techniques to map classical uncertainty quantification methods to the quantum machine learning domain. Our findings emphasize the necessity of leveraging classical insights into uncertainty quantification to include uncertainty awareness in the process of designing new quantum machine learning models.


Distributed Machine Learning Approach for Low-Latency Localization in Cell-Free Massive MIMO Systems

arXiv.org Artificial Intelligence

--Low-latency localization is critical in cellular networks to support real-time applications requiring precise positioning. In this paper, we propose a distributed machine learning (ML) framework for fingerprint-based localization tailored to cell-free massive multiple-input multiple-output (MIMO) systems, an emerging architecture for 6G networks. The proposed framework enables each access point (AP) to independently train a Gaussian process regression model using local angle-of-arrival and received signal strength fingerprints. These models provide probabilistic position estimates for the user equipment (UE), which are then fused by the UE with minimal computational overhead to derive a final location estimate. This decentralized approach eliminates the need for fronthaul communication between the APs and the central processing unit (CPU), thereby reducing latency. Additionally, distributing computational tasks across the APs alleviates the processing burden on the CPU compared to traditional centralized localization schemes. Simulation results demonstrate that the proposed distributed framework achieves localization accuracy comparable to centralized methods, despite lacking the benefits of centralized data aggregation. Moreover, it effectively reduces uncertainty of the location estimates, as evidenced by the 95% covariance ellipse. The results highlight the potential of distributed ML for enabling low-latency, high-accuracy localization in future 6G networks. The next-generation 6G mobile communication is expected to revolutionize wireless communication systems, with integrated sensing and communication (ISAC) playing a key role in enabling advanced connectivity.


Learning Software Bug Reports: A Systematic Literature Review

arXiv.org Artificial Intelligence

The recent advancement of artificial intelligence, especially machine learning (ML), has significantly impacted software engineering research, including bug report analysis. ML aims to automate the understanding, extraction, and correlation of information from bug reports. Despite its growing importance, there has been no comprehensive review in this area. In this paper, we present a systematic literature review covering 1,825 papers, selecting 204 for detailed analysis. We derive seven key findings: 1) Extensive use of CNN, LSTM, and $k$NN for bug report analysis, with advanced models like BERT underutilized due to their complexity. 2) Word2Vec and TF-IDF are popular for feature representation, with a rise in deep learning approaches. 3) Stop word removal is the most common preprocessing, with structural methods rising after 2020. 4) Eclipse and Mozilla are the most frequently evaluated software projects. 5) Bug categorization is the most common task, followed by bug localization and severity prediction. 6) There is increasing attention on specific bugs like non-functional and performance bugs. 7) Common evaluation metrics are F1-score, Recall, Precision, and Accuracy, with $k$-fold cross-validation preferred for model evaluation. 8) Many studies lack robust statistical tests. We also identify six promising future research directions to provide useful insights for practitioners.


SemiOccam: A Robust Semi-Supervised Image Recognition Network Using Sparse Labels

arXiv.org Artificial Intelligence

We present SemiOccam, an image recognition network that leverages semi-supervised learning in a highly efficient manner. Existing works often rely on complex training techniques and architectures, requiring hundreds of GPU hours for training, while their generalization ability with extremely limited labeled data remains to be improved. To address these limitations, we construct a hierarchical mixture density classification mechanism by optimizing mutual information between feature representations and target classes, compressing redundant information while retaining crucial discriminative components. Experimental results demonstrate that our method achieves state-of-the-art performance on three commonly used datasets, with accuracy exceeding 95% on two of them using only 4 labeled samples per class, and its simple architecture keeps training time at the minute level. Notably, this paper reveals a long-overlooked data leakage issue in the STL-10 dataset for semi-supervised learning and removes duplicates to ensure reliable experimental results. We release the deduplicated CleanSTL-10 dataset to facilitate fair and reproducible research. Code available at https://github.com/Shu1L0n9/SemiOccam.


Toward Temporal Causal Representation Learning with Tensor Decomposition

arXiv.org Machine Learning

Temporal causal representation learning is a powerful tool for uncovering complex patterns in observational studies, which are often represented as low-dimensional time series. However, in many real-world applications, data are high-dimensional with varying input lengths and naturally take the form of irregular tensors. To analyze such data, irregular tensor decomposition is critical for extracting meaningful clusters that capture essential information. In this paper, we focus on modeling causal representation learning based on the transformed information. First, we present a novel causal formulation for a set of latent clusters. We then propose CaRTeD, a joint learning framework that integrates temporal causal representation learning with irregular tensor decomposition. Notably, our framework provides a blueprint for downstream tasks using the learned tensor factors, such as modeling latent structures and extracting causal information, and offers a more flexible regularization design to enhance tensor decomposition. Theoretically, we show that our algorithm converges to a stationary point. More importantly, our results fill the gap in theoretical guarantees for the convergence of state-of-the-art irregular tensor decomposition. Experimental results on synthetic and real-world electronic health record (EHR) datasets (MIMIC-III), with extensive benchmarks from both phenotyping and network recovery perspectives, demonstrate that our proposed method outperforms state-of-the-art techniques and enhances the explainability of causal representations.


Step-DAD: Semi-Amortized Policy-Based Bayesian Experimental Design

arXiv.org Machine Learning

We develop a semi-amortized, policy-based, approach to Bayesian experimental design (BED) called Stepwise Deep Adaptive Design (Step-DAD). Like existing, fully amortized, policy-based BED approaches, Step-DAD trains a design policy upfront before the experiment. However, rather than keeping this policy fixed, Step-DAD periodically updates it as data is gathered, refining it to the particular experimental instance. This test-time adaptation improves both the flexibility and the robustness of the design strategy compared with existing approaches. Empirically, Step-DAD consistently demonstrates superior decision-making and robustness compared with current state-of-the-art BED methods.


Context-Aware Behavior Learning with Heuristic Motion Memory for Underwater Manipulation

arXiv.org Artificial Intelligence

Autonomous motion planning is critical for efficient and safe underwater manipulation in dynamic marine environments. Current motion planning methods often fail to effectively utilize prior motion experiences and adapt to real-time uncertainties inherent in underwater settings. In this paper, we introduce an Adaptive Heuristic Motion Planner framework that integrates a Heuristic Motion Space (HMS) with Bayesian Networks to enhance motion planning for autonomous underwater manipulation. Our approach employs the Probabilistic Roadmap (PRM) algorithm within HMS to optimize paths by minimizing a composite cost function that accounts for distance, uncertainty, energy consumption, and execution time. By leveraging HMS, our framework significantly reduces the search space, thereby boosting computational performance and enabling real-time planning capabilities. Bayesian Networks are utilized to dynamically update uncertainty estimates based on real-time sensor data and environmental conditions, thereby refining the joint probability of path success. Through extensive simulations and real-world test scenarios, we showcase the advantages of our method in terms of enhanced performance and robustness. This probabilistic approach significantly advances the capability of autonomous underwater robots, ensuring optimized motion planning in the face of dynamic marine challenges.