Collaborating Authors

 Queensland


Exploiting Descriptive Completeness Prior for Cross Modal Hashing with Incomplete Labels

Neural Information Processing Systems

In this paper, we tackle the challenge of generating high-quality hash codes for cross-modal retrieval in the presence of incomplete labels, which creates uncertainty in distinguishing between positive and negative pairs. Vision-language models such as CLIP offer a potential solution by providing generic knowledge for missing label recovery, yet their zero-shot performance remains insufficient. To address this, we propose a novel Prompt Contrastive Recovery approach, PCRIL, which progressively identifies promising positive classes from unknown label sets and recursively searches for other relevant labels. Identifying unknowns is nontrivial due to the fixed and long-tailed patterns of positive label sets in training data, which hampers the discovery of new label combinations. Therefore, we consider each subset of positive labels and construct three types of negative prompts through deletion, addition, and replacement for prompt learning. The augmented supervision guides the model to measure the completeness of label sets, thus facilitating the subsequent greedy tree search for label completion. We also address extreme cases of significant unknown labels and lack of negative pairwise supervision by deriving two augmentation strategies: seeking unknown-complementary samples for mixup and random flipping for negative labels. Extensive experiments reveal the vulnerability of current methods and demonstrate the effectiveness of PCRIL, achieving an average 12% mAP improvement over the current SOTA across all datasets. Our code is available at github.com/E-Galois/PCRIL.
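
The three negative constructions named in the abstract (deletion, addition, replacement) can be illustrated at the level of label sets. The sketch below is only a plausible reading of that sentence, not the PCRIL implementation: the function name `make_negative_label_sets`, the `candidate_labels` argument, and the choice to corrupt exactly one label per negative are all assumptions, and turning the resulting label sets into text prompts is left out.

```python
import random


def make_negative_label_sets(positives, candidate_labels, rng):
    """Build corrupted copies of a positive label subset (hypothetical helper).

    positives: labels known to be positive for a sample.
    candidate_labels: the full label vocabulary (assumed available).
    rng: a random.Random instance, passed in for reproducibility.
    """
    positives = sorted(positives)
    outside = sorted(set(candidate_labels) - set(positives))
    negatives = {}

    # Deletion: drop one positive label, so the set is no longer complete.
    if len(positives) > 1:
        dropped = rng.choice(positives)
        negatives["deletion"] = [l for l in positives if l != dropped]

    # Addition: append one label that is not in the positive subset.
    if outside:
        negatives["addition"] = positives + [rng.choice(outside)]

    # Replacement: swap one positive label for a label outside the subset.
    if positives and outside:
        swap_out = rng.choice(positives)
        swap_in = rng.choice(outside)
        negatives["replacement"] = [swap_in if l == swap_out else l for l in positives]

    return negatives


rng = random.Random(0)
negs = make_negative_label_sets(["cat", "dog"], ["cat", "dog", "car", "tree"], rng)
```

Each entry of `negs` is a label set that differs from the original subset by one edit, which is the kind of contrast a completeness-scoring model could be trained against.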


Pseudo-Relevance Feedback Can Improve Zero-Shot LLM-Based Dense Retrieval

arXiv.org Artificial Intelligence

Pseudo-relevance feedback (PRF) refines queries by leveraging initially retrieved documents to improve retrieval effectiveness. In this paper, we investigate how large language models (LLMs) can facilitate PRF for zero-shot LLM-based dense retrieval, extending the recently proposed PromptReps method. Specifically, our approach uses LLMs to extract salient passage features--such as keywords and summaries--from top-ranked documents, which are then integrated into PromptReps to produce enhanced query representations.

Recent advances in language modelling have motivated the replacement of encoder-only backbones like BERT with larger decoder-only backbones (generative LLMs) to form dense representations [2, 13, 23], allowing them to leverage richer contextual information and enhancing dense retrieval generalization. Of particular interest for this paper is PromptReps [23], an LLM-based approach for dense retrieval. PromptReps is unique in that it does not require contrastive learning, producing effective representations for dense ...


Efficient Post-Hoc Uncertainty Calibration via Variance-Based Smoothing

arXiv.org Artificial Intelligence

Since state-of-the-art uncertainty estimation methods are often computationally demanding, we investigate whether incorporating prior information can improve uncertainty estimates in conventional deep neural networks. Our focus is on machine learning tasks where meaningful predictions can be made from sub-parts of the input. For example, in speaker classification, the speech waveform can be divided into sequential patches, each containing information about the same speaker. We observe that the variance between sub-predictions serves as a reliable proxy for uncertainty in such settings. Our proposed variance-based scaling framework produces competitive uncertainty estimates in classification while being less computationally demanding and allowing for integration as a post-hoc calibration tool. This approach also leads to a simple extension of deep ensembles, improving the expressiveness of their predicted distributions.
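
As a rough illustration of using the variance between sub-predictions as an uncertainty signal, the sketch below averages per-patch logits and softens the final distribution when the patches disagree. The mapping `temperature = 1 + alpha * variance`, the parameter `alpha`, and the function names are assumptions made for this example; the abstract does not give the paper's actual scaling rule.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def variance_scaled_prediction(patch_logits, alpha=1.0):
    """Combine per-patch logits; higher disagreement gives a flatter output.

    patch_logits: list of logit vectors, one per input sub-part (patch).
    alpha: assumed hyperparameter controlling how strongly variance smooths.
    Returns (probabilities, variance).
    """
    n = len(patch_logits)
    k = len(patch_logits[0])
    probs = [softmax(l) for l in patch_logits]
    mean_probs = [sum(p[c] for p in probs) / n for c in range(k)]
    # Variance of the sub-predictions, averaged over classes.
    variance = sum(
        sum((p[c] - mean_probs[c]) ** 2 for p in probs) / n for c in range(k)
    ) / k
    mean_logits = [sum(l[c] for l in patch_logits) / n for c in range(k)]
    temperature = 1.0 + alpha * variance  # assumed mapping, not from the paper
    return softmax(mean_logits, temperature), variance


agree_probs, agree_var = variance_scaled_prediction([[4.0, 0.0, 0.0]] * 3)
mixed_probs, mixed_var = variance_scaled_prediction(
    [[4.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 4.0]]
)
```

Because the only extra work is a variance over predictions the network already produces, a rule of this kind can be applied after training without modifying the model.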


Low-cost Real-world Implementation of the Swing-up Pendulum for Deep Reinforcement Learning Experiments

arXiv.org Artificial Intelligence

Deep reinforcement learning (DRL) has had success in virtual and simulated domains, but due to key differences between simulated and real-world environments, DRL-trained policies have had limited success in real-world applications. To help researchers bridge the sim-to-real gap, we describe a low-cost physical inverted pendulum apparatus and software environment for exploring sim-to-real DRL methods. In particular, the design of our apparatus enables detailed examination of the delays that arise in physical systems when sensing, communicating, learning, inferring and actuating. Moreover, we wish to improve access to educational systems, so our apparatus uses readily available materials and parts to reduce cost and logistical barriers. Our design shows how commercial, off-the-shelf electronics and electromechanical and sensor systems, combined with common metal extrusions, dowel, and 3D-printed couplings, provide a pathway for affordable physical DRL apparatus. The physical apparatus is complemented with a simulated environment implemented using a high-fidelity physics engine and an OpenAI Gym interface.


Training Directional Locomotion for Quadrupedal Low-Cost Robotic Systems via Deep Reinforcement Learning

arXiv.org Artificial Intelligence

In this work we present Deep Reinforcement Learning (DRL) training of directional locomotion for low-cost quadrupedal robots in the real world. In particular, we exploit randomization of the heading that the robot must follow to foster exploration of the action-state transitions most useful for learning both forward locomotion and course adjustments. Changing the heading in episode resets to the current yaw plus a random value drawn from a normal distribution yields policies able to follow complex trajectories involving frequent turns in both directions as well as long straight-line stretches. By repeatedly changing the heading, this method keeps the robot moving within the training platform and thus reduces human involvement and the need for manual resets during training. Real-world experiments on a custom-built, low-cost quadruped demonstrate the efficacy of our method, with the robot successfully navigating all validation tests. When trained with other approaches, the robot only succeeds in the forward locomotion test and fails when turning is required.
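
The heading reset described above (current yaw plus a normally distributed offset) is simple enough to sketch directly. The function names, the zero-mean choice, the `sigma` parameter, and the wrapping of the result into [-pi, pi) are assumptions for illustration; the abstract does not state the distribution's parameters or the angle convention.

```python
import math
import random


def wrap_angle(angle):
    """Wrap an angle in radians into the interval [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi


def sample_target_heading(current_yaw, sigma, rng):
    """New target heading = current yaw + Gaussian offset (assumed zero mean)."""
    offset = rng.gauss(0.0, sigma)
    return wrap_angle(current_yaw + offset)


rng = random.Random(1)
# At each episode reset, draw a fresh heading relative to where the robot points now.
headings = [sample_target_heading(current_yaw=3.0, sigma=0.8, rng=rng) for _ in range(5)]
```

Sampling relative to the current yaw, rather than from a fixed global direction, means small offsets dominate while larger turns still occur occasionally, which matches the mix of straight stretches and turns the abstract describes.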


The potential role of AI agents in transforming nuclear medicine research and cancer management in India

arXiv.org Artificial Intelligence

India faces a significant cancer burden, with an incidence-to-mortality ratio indicating that nearly three out of five individuals diagnosed with cancer succumb to the disease. While the limitations of physical healthcare infrastructure are widely acknowledged as a primary challenge, concerted efforts by government and healthcare agencies are underway to mitigate these constraints. However, given the country's vast geography and high population density, it is imperative to explore alternative soft infrastructure solutions to complement existing frameworks. Artificial Intelligence agents are increasingly transforming problem-solving approaches across various domains, with their application in medicine proving particularly transformative. In this perspective, we examine the potential role of AI agents in advancing nuclear medicine for cancer research, diagnosis, and management in India. We begin with a brief overview of AI agents and their capabilities, followed by a proposed agent-based ecosystem that can address prevailing sustainability challenges in India's nuclear medicine.

Keywords: AI Agents; cancer; nuclear medicine ecosystem; sustainability challenges

1. Introduction

India, with a population of 1.4 billion, faces a significant cancer burden, with ~1.5 million new cases and ~850,000 deaths annually [1] [2]. With an incidence-to-mortality percentage of approximately 64.8%, nearly three out of five individuals diagnosed with cancer are expected to succumb to the disease [2]. Projections indicate that mortality rates will rise significantly, increasing from 64.7% to 109.6% between 2022 and 2050, largely due to demographic shifts as the reproductive-age population transitions into middle and old age. This growing cancer burden will place even more pressure on the already overburdened healthcare system, making it essential to address the gaps in both infrastructure and indigenous research and innovations to ensure timely and effective patient treatment [3]. This trend underscores the urgent need for a resilient, patient-centred framework that integrates medical advancements, early detection through diagnostics, timely therapeutic interventions, and equitable access to care. Nuclear medicine uses a small amount of targeted radioactive material to diagnose and treat cancer [4].


Image-Based Relocalization and Alignment for Long-Term Monitoring of Dynamic Underwater Environments

arXiv.org Artificial Intelligence

Effective monitoring of underwater ecosystems is crucial for tracking environmental changes, guiding conservation efforts, and ensuring long-term ecosystem health. However, automating underwater ecosystem management with robotic platforms remains challenging due to the complexities of underwater imagery, which pose significant difficulties for traditional visual localization methods. We propose an integrated pipeline that combines Visual Place Recognition (VPR), feature matching, and image segmentation on video-derived images. This method enables robust identification of revisited areas, estimation of rigid transformations, and downstream analysis of ecosystem changes. Furthermore, we introduce the SQUIDLE+ VPR Benchmark, the first large-scale underwater VPR benchmark designed to leverage an extensive collection of unstructured data from multiple robotic platforms, spanning time intervals from days to years. The dataset encompasses diverse trajectories, arbitrary overlap and diverse seafloor types captured under varying environmental conditions, including differences in depth, lighting, and turbidity. Our code is available at: https://github.com/bev-gorry/underloc


REGRACE: A Robust and Efficient Graph-based Re-localization Algorithm using Consistency Evaluation

arXiv.org Artificial Intelligence

Loop closures are essential for correcting odometry drift and creating consistent maps, especially in the context of large-scale navigation. Current methods using dense point clouds for accurate place recognition do not scale well due to computationally expensive scan-to-scan comparisons. Alternative object-centric approaches are more efficient but often struggle with sensitivity to viewpoint variation. In this work, we introduce REGRACE, a novel approach that addresses these challenges of scalability and perspective difference in re-localization by using LiDAR-based submaps. We introduce rotation-invariant features for each labeled object and enhance them with neighborhood context through a graph neural network. To identify potential revisits, we employ a scalable bag-of-words approach, pooling one learned global feature per submap. Additionally, we define a revisit with geometrical consistency cues rather than embedding distance, allowing us to recognize far-away loop closures. Our evaluations demonstrate that REGRACE achieves similar results compared to state-of-the-art place recognition and registration baselines while being twice as fast.


PostHoc FREE Calibrating on Kolmogorov Arnold Networks

arXiv.org Artificial Intelligence

Kolmogorov-Arnold Networks (KANs) are neural architectures inspired by the Kolmogorov-Arnold representation theorem that leverage B-spline parameterizations for flexible, locally adaptive function approximation. Although KANs can capture complex nonlinearities beyond those modeled by standard Multi-Layer Perceptrons (MLPs), they frequently exhibit miscalibrated confidence estimates, manifesting as overconfidence in dense data regions and underconfidence in sparse areas. In this work, we systematically examine the impact of four critical hyperparameters (Layer Width, Grid Order, Shortcut Function, and Grid Range) on the calibration of KANs. Furthermore, we introduce a novel Temperature-Scaled Loss (TSL) that integrates a temperature parameter directly into the training objective, dynamically adjusting the predictive distribution during learning. Both theoretical analysis and extensive empirical evaluations on standard benchmarks demonstrate that TSL significantly reduces calibration errors, thereby improving the reliability of probabilistic predictions. Overall, our study provides actionable insights into the design of spline-based neural networks and establishes TSL as a robust loss solution for enhancing calibration.
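
One common way a temperature can enter a training objective is to divide the logits by it before computing cross-entropy. The sketch below shows only that generic form; the exact definition of TSL is not given in the abstract, so the function name, the fixed scalar `temperature`, and the formula itself should be read as assumptions rather than the paper's loss.

```python
import math


def temperature_scaled_cross_entropy(logits, target, temperature):
    """Cross-entropy of softmax(logits / temperature) against a target index.

    A generic temperature-in-the-loss form, assumed for illustration only.
    """
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    # log-sum-exp computed stably by subtracting the maximum first.
    log_z = m + math.log(sum(math.exp(s - m) for s in scaled))
    return log_z - scaled[target]


# A confidently wrong prediction is penalized less once the logits are softened.
loss_t1 = temperature_scaled_cross_entropy([5.0, 0.0, 0.0], target=1, temperature=1.0)
loss_t5 = temperature_scaled_cross_entropy([5.0, 0.0, 0.0], target=1, temperature=5.0)
```

With `temperature=1.0` this reduces to ordinary cross-entropy, so the temperature acts purely as a knob on how sharp the predictive distribution is during training.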


Learning Surrogate Equations for the Analysis of an Agent-Based Cancer Model

arXiv.org Artificial Intelligence

In this paper, we adapt a two-species agent-based cancer model that describes the interaction between cancer cells and healthy cells on a uniform grid to include the interaction with a third species, namely immune cells. We run six different scenarios to explore the effects of the competition between cancer and immune cells, and of the initial concentration of the immune cells, on cancer dynamics. We then use coupled equation learning to construct a population-based reaction model for each scenario. We show how they can be unified into a single surrogate population-based reaction model, whose underlying three coupled ordinary differential equations are much easier to analyse than the original agent-based model. As an example, by finding the single steady state of the cancer concentration, we are able to find a linear relationship between this concentration and the initial concentration of the immune cells. This then enables us to estimate suitable values for the competition and initial concentration to reduce the cancer substantially without performing additional complex and expensive simulations from an agent-based stochastic model. The work shows the importance of performing equation learning from agent-based stochastic data for gaining key insights about the behaviour of complex cellular dynamics.