BenSParX: A Robust Explainable Machine Learning Framework for Parkinson's Disease Detection from Bengali Conversational Speech

arXiv.org Artificial Intelligence

Parkinson's disease (PD) poses a growing global health challenge, with Bangladesh experiencing a notable rise in PD-related mortality. Early detection of PD remains particularly challenging in resource-constrained settings, where voice-based analysis has emerged as a promising non-invasive and cost-effective alternative. However, existing studies predominantly focus on English or other major languages; notably, no voice dataset for PD exists for Bengali, which poses a significant barrier to culturally inclusive and accessible healthcare solutions. Moreover, most prior studies employed only a narrow set of acoustic features, with limited or no hyperparameter tuning and feature selection strategies, and little attention to model explainability. This restricts the development of robust and generalizable machine learning models. To address this gap, we present BenSParX, the first Bengali conversational speech dataset for PD detection, along with a robust and explainable machine learning framework tailored for early diagnosis. The proposed framework incorporates diverse acoustic feature categories, systematic feature selection methods, and state-of-the-art machine learning algorithms with extensive hyperparameter optimization. Furthermore, to enhance interpretability and trust in model predictions, the framework incorporates SHAP (SHapley Additive exPlanations) analysis to quantify the contribution of individual acoustic features toward PD detection. Our framework achieves state-of-the-art performance, yielding an accuracy of 95.77%, F1 score of 95.57%, and AUC-ROC of 0.982. We further externally validated our approach by applying the framework to existing PD datasets in other languages, where it consistently outperforms state-of-the-art approaches. To facilitate further research and reproducibility, the dataset has been made publicly available at https://github.com/Riad071/BenSParX.
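
To make concrete the idea of quantifying each feature's contribution with Shapley values, here is a minimal sketch that computes exact Shapley attributions for a toy scoring function. It is not the authors' pipeline: the feature names (jitter, shimmer, hnr), the scoring function, and the all-zero baseline are assumptions made up for illustration, and the SHAP library relies on approximations rather than this exponential-time enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley attributions for input x relative to a baseline input.

    A feature outside a coalition is replaced by its baseline value.
    The cost grows exponentially with the number of features.
    """
    n = len(x)

    def value(coalition):
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                phi[i] += weight * (value(members | {i}) - value(members))
    return phi


def toy_score(features):
    # Hypothetical model over three made-up acoustic features
    jitter, shimmer, hnr = features
    return 2.0 * jitter + 1.0 * shimmer - 0.5 * hnr + 0.3 * jitter * shimmer


x = [1.0, 2.0, 4.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_score, x, baseline)
for name, contribution in zip(["jitter", "shimmer", "hnr"], phi):
    print(f"{name}: {contribution:+.3f}")
```

The attributions sum to the difference between the score at x and the score at the baseline, which is the property that lets per-feature contributions be read as shares of a single prediction.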


TrainBo: An Interactive Robot-assisted Scenario Training System for Older Adults with Dementia

arXiv.org Artificial Intelligence

Dementia is an overall decline in memory and cognitive skills severe enough to reduce an older adult's ability to perform everyday activities. There is an increasing need for accessible cognitive-training technologies to slow cognitive decline. With the ability to provide instant feedback and assistance, social robotic systems have been proven effective in enhancing learning abilities across various age groups. This study presents the design of TrainBo, an interactive robot-assisted scenario training system grounded in self-determination theory; it derives design requirements through formative and formal studies and evaluates the system's usability. A pilot test was conducted with seven older adults with dementia in an elderly care center in Hong Kong over four weeks. Our findings show that older adults with dementia improved in behavioural engagement, emotional engagement, and intrinsic motivation after using TrainBo. These findings can provide valuable insights into the development of more captivating interactive robots for extensive training purposes.


Private Statistical Estimation via Truncation

arXiv.org Machine Learning

We introduce a novel framework for differentially private (DP) statistical estimation via data truncation, addressing a key challenge in DP estimation when the data support is unbounded. Traditional approaches rely on problem-specific sensitivity analysis, limiting their applicability. By leveraging techniques from truncated statistics, we develop computationally efficient DP estimators for exponential family distributions, including Gaussian mean and covariance estimation, achieving near-optimal sample complexity. Previous works on exponential families only consider bounded or one-dimensional families. Our approach mitigates sensitivity through truncation while carefully correcting for the introduced bias using maximum likelihood estimation and DP stochastic gradient descent. Along the way, we establish improved uniform convergence guarantees for the log-likelihood function of exponential families, which may be of independent interest. Our results provide a general blueprint for DP algorithm design via truncated statistics.
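
To illustrate why restricting the data range makes sensitivity manageable, here is a minimal sketch of a differentially private mean that clips values to a fixed interval and adds Laplace noise. It is a simplification, not the paper's estimator: it clips rather than truncates, it does not correct the bias that clipping introduces, and the function name, the interval, and the sample data are assumptions. The privacy claim assumes the sample size is public and that neighboring datasets differ by replacing one record.

```python
import random


def dp_clipped_mean(data, lower, upper, epsilon, rng):
    """Epsilon-DP mean estimate via clipping and the Laplace mechanism."""
    n = len(data)
    # Clipping bounds each record's influence on the mean
    clipped = [min(max(v, lower), upper) for v in data]
    # Replacing one record changes the clipped mean by at most this much
    sensitivity = (upper - lower) / n
    scale = sensitivity / epsilon
    # The difference of two i.i.d. exponentials is Laplace-distributed
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return sum(clipped) / n + noise


rng = random.Random(0)
samples = [rng.gauss(5.0, 1.0) for _ in range(1000)]  # made-up data
estimate = dp_clipped_mean(samples, lower=0.0, upper=10.0, epsilon=1.0, rng=rng)
print(f"private mean estimate: {estimate:.3f}")
```

A tighter interval lowers the noise but biases the estimate more when the true distribution has mass outside it; that trade-off is what the bias-correction step described in the abstract is meant to address.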


Importance Sampling for Nonlinear Models

arXiv.org Machine Learning

While norm-based and leverage-score-based methods have been extensively studied for identifying "important" data points in linear models, analogous tools for nonlinear models remain significantly underdeveloped. By introducing the concept of the adjoint operator of a nonlinear map, we address this gap and generalize norm-based and leverage-score-based importance sampling to nonlinear settings. We demonstrate that sampling based on these generalized notions of norm and leverage scores provides approximation guarantees for the underlying nonlinear mapping, similar to linear subspace embeddings. As direct applications, these nonlinear scores not only reduce the computational complexity of training nonlinear models by enabling efficient sampling over large datasets but also offer a novel mechanism for model explainability and outlier detection. Our contributions are supported by both theoretical analyses and experimental results across a variety of supervised learning scenarios.
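
For context, the linear baseline that this work generalizes can be sketched in a few lines. The example below shows only classical norm-based row sampling for a linear model: each row is drawn with probability proportional to its squared norm and then rescaled so that the Gram matrix of the sample is an unbiased estimate of the full one. It does not implement the nonlinear adjoint-based scores; the function names and the matrix are illustrative assumptions.

```python
import math
import random


def row_norm_probabilities(rows):
    """Sampling probabilities proportional to squared row norms."""
    squared = [sum(v * v for v in row) for row in rows]
    total = sum(squared)
    return [s / total for s in squared]


def sample_rows(rows, m, rng):
    """Draw m rows with replacement and rescale each by 1/sqrt(m * p_i)."""
    probs = row_norm_probabilities(rows)
    picked = rng.choices(range(len(rows)), weights=probs, k=m)
    return [[v / math.sqrt(m * probs[i]) for v in rows[i]] for i in picked]


def gram(rows):
    """Compute X^T X for a matrix stored as a list of rows."""
    d = len(rows[0])
    return [[sum(r[a] * r[b] for r in rows) for b in range(d)] for a in range(d)]


X = [[3.0, 0.0], [0.0, 4.0], [1.0, 1.0], [0.1, 0.2]]  # made-up data
sketch = sample_rows(X, m=50, rng=random.Random(0))
print("full Gram:  ", gram(X))
print("sketch Gram:", gram(sketch))
```

Because every rescaled row has squared norm equal to the squared Frobenius norm of X divided by m, the trace of the sketch's Gram matrix matches that of the full matrix exactly; the off-diagonal structure is matched only in expectation.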


Stronger Together: Unleashing the Social Impact of Hate Speech Research

arXiv.org Artificial Intelligence

The advent of the internet has been both a blessing and a curse for once marginalised communities. When used well, the internet can connect and establish communities crossing different intersections; however, it can also be used as a tool to alienate people and communities as well as perpetuate hate, misinformation, and disinformation, especially on social media platforms. We propose steering hate speech research and researchers away from pre-existing computational solutions and toward social methods that inform social solutions to this social problem. Just as linguistics research can inform language planning policy, linguists should apply what we know about language and society to mitigate some of the emergent risks and dangers of anti-social behaviour in digital spaces. We argue linguists and NLP researchers can play a principal role in unleashing the social impact potential of linguistics research, working alongside communities, advocates, activists, and policymakers to enable equitable digital inclusion and to close the digital divide.


Unlearning for Federated Online Learning to Rank: A Reproducibility Study

arXiv.org Artificial Intelligence

This paper reports on findings from a comparative study on the effectiveness and efficiency of federated unlearning strategies within Federated Online Learning to Rank (FOLTR), with specific attention to systematically analysing the unlearning capabilities of methods in a verifiable manner. Federated approaches to ranking of search results have recently garnered attention as a way to address users' privacy concerns. In FOLTR, privacy is safeguarded by collaboratively training ranking models across decentralized data sources, preserving individual user data while optimizing search results based on implicit feedback, such as clicks. Recent legislation introduced across numerous countries is establishing the so-called "right to be forgotten", according to which services based on machine learning models like those in FOLTR should provide capabilities that allow users to remove their own data from those used to train models. This has sparked the development of unlearning methods, along with evaluation practices to measure whether unlearning of a user's data successfully occurred. Current evaluation practices are, however, often controversial, necessitating the use of multiple metrics for a more comprehensive assessment, yet previous proposals of unlearning methods used only single evaluation metrics. This paper addresses this limitation: our study rigorously assesses the effectiveness of unlearning strategies in managing both under-unlearning and over-unlearning scenarios using adapted and newly proposed evaluation metrics. Thanks to our detailed analysis, we uncover the strengths and limitations of five unlearning strategies, offering valuable insights into optimizing federated unlearning to balance data privacy and system performance within FOLTR. We publicly release our code and complete results at https://github.com/Iris1026/Unlearning-for-FOLTR.git.


UniHM: Universal Human Motion Generation with Object Interactions in Indoor Scenes

arXiv.org Artificial Intelligence

Figure 1: Text-to-Motion sequences (left) and Text-to-HOI sequences (right) generated by our approach.

Human motion synthesis in complex scenes presents a fundamental challenge, extending beyond conventional Text-to-Motion tasks by requiring the integration of diverse modalities such as static environments, movable objects, natural language prompts, and spatial waypoints. Existing language-conditioned motion models often struggle with scene-aware motion generation due to limitations in motion tokenization, which leads to information loss and fails to capture the continuous, context-dependent nature of 3D human movement. To address these issues, we propose UniHM, a unified motion language model that leverages diffusion-based generation for synthesizing scene-aware human motion. UniHM is the first framework to support both Text-to-Motion and Text-to-Human-Object Interaction (HOI) in complex 3D scenes. Our approach introduces three key contributions: (1) a mixed-motion representation that fuses continuous 6DoF motion with discrete local motion tokens to improve motion realism; (2) a novel Look-Up-Free Quantization VAE (LFQ-VAE) that surpasses traditional VQ-VAEs in both reconstruction accuracy and generative performance; and (3) an enriched version of the Lingo dataset augmented with HumanML3D annotations, providing stronger supervision for scene-specific motion learning. Experimental results demonstrate that UniHM achieves comparable performance on the OMOMO benchmark for text-to-HOI synthesis and yields competitive results on HumanML3D for general text-conditioned motion generation.

Human motion synthesis in complex scenes represents a challenging extension of the Text-to-Motion paradigm, with potential applications in virtual reality, robotics, and interactive environments where accurately synthesized human motion is critical to user experience. While language models have demonstrated considerable success in generating realistic human motion sequences based on text prompts, they struggle to achieve similar efficacy in scene-specific motion generation. Scene-based human motion synthesis requires not only an understanding of human motion but also an intricate integration of diverse modalities, such as static scene elements, movable objects, text prompts, and motion waypoints. These modalities add layers of complexity that go beyond standard Text-to-Motion tasks, demanding a cohesive synthesis of environmental context and dynamic interaction.


A Systematic Analysis of Base Model Choice for Reward Modeling

arXiv.org Artificial Intelligence

Reinforcement learning from human feedback (RLHF) and, at its core, reward modeling have become a crucial part of training powerful large language models (LLMs). One commonly overlooked factor in training high-quality reward models (RMs) is the effect of the base model, which is becoming more challenging to choose given the rapidly growing pool of LLMs. In this work, we present a systematic analysis of the effect of base model selection on reward modeling performance. Our results show that performance can be improved by up to 14% compared to the most common (i.e., default) choice. Moreover, we showcase the strong statistical relation between some existing benchmarks and downstream performance. We also demonstrate that the results from a small set of benchmarks can be combined to boost model selection (+18% on average in the top 5-10). Lastly, we illustrate the impact of different post-training steps on the final performance and explore using estimated data distributions to reduce performance prediction error.
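
One common way to quantify a statistical relation between benchmark scores and downstream performance is a rank correlation. The sketch below computes Spearman's coefficient from scratch; the abstract does not say which statistic the authors used, so the choice of Spearman is an assumption, and all scores are made-up placeholders rather than results from the paper.

```python
import math


def average_ranks(values):
    """Rank values from 1, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def spearman(a, b):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(a), average_ranks(b))


# Hypothetical scores for five candidate base models
benchmark_scores = [61.2, 70.5, 55.0, 68.1, 73.4]
reward_model_accuracy = [0.71, 0.78, 0.69, 0.80, 0.82]
print(f"Spearman rho: {spearman(benchmark_scores, reward_model_accuracy):.3f}")
```

A coefficient near 1 would mean that ranking candidate base models by the benchmark nearly reproduces their ranking by downstream reward-modeling performance, which is what makes a benchmark useful for model selection.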


User-centric Music Recommendations

arXiv.org Artificial Intelligence

This work presents a user-centric recommendation framework, designed as a pipeline with four distinct, connected, and customizable phases. These phases are intended to improve explainability and boost user engagement. We have collected the historical Last.fm track playback records of a single user over approximately 15 years. The collected dataset includes more than 90,000 playbacks and approximately 14,000 unique tracks. From the track playback records, we have created a dataset of user temporal contexts (each row is a specific moment when the user listened to certain music descriptors). As music descriptors, we have used community-contributed Last.fm tags and Spotify audio features. They represent the music that the user has been listening to over the years. Next, given the most relevant Last.fm tags of a moment (e.g. the hour of the day), we predict the Spotify audio features that best fit the user's preferences at that particular moment. Finally, we use the predicted audio features to find tracks with similar features. The final aim is to recommend (and discover) tracks that the user may feel like listening to at a particular moment. For our initial case study, we have chosen to predict only a single audio feature target: danceability. The framework, however, allows more target variables to be included. The ability to learn musical habits from a single user can be quite powerful, and this framework could be extended to other users.
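
The final phase, finding tracks whose audio features are close to the predicted ones, can be sketched as a nearest-neighbor lookup on a single feature. This is only an illustration under assumptions: the track catalog, the predicted value, and the function name are made up, and the abstract does not specify the similarity measure the framework actually uses.

```python
def recommend(tracks, predicted_danceability, k=3, exclude=frozenset()):
    """Return the k tracks whose danceability is closest to the prediction.

    Titles listed in `exclude` (for example, recently played tracks) are skipped.
    """
    candidates = [t for t in tracks if t["title"] not in exclude]
    ranked = sorted(
        candidates,
        key=lambda t: abs(t["danceability"] - predicted_danceability),
    )
    return ranked[:k]


# Hypothetical catalog with a danceability value between 0 and 1 per track
catalog = [
    {"title": "Track A", "danceability": 0.82},
    {"title": "Track B", "danceability": 0.35},
    {"title": "Track C", "danceability": 0.68},
    {"title": "Track D", "danceability": 0.74},
]

# Assume the prediction phase returned 0.70 for the current moment
for track in recommend(catalog, predicted_danceability=0.70, k=2):
    print(track["title"], track["danceability"])
```

With more target variables, the absolute difference would be replaced by a distance over the full feature vector, which is one way the single-feature case could extend.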


Towards Better Evaluation for Generated Patent Claims

arXiv.org Artificial Intelligence

Patent claims define the scope of protection and establish the legal boundaries of an invention. Drafting these claims is a complex and time-consuming process that usually requires the expertise of skilled patent attorneys, which can form a large access barrier for many small enterprises. To solve these challenges, researchers have investigated the use of large language models (LLMs) for automating patent claim generation. However, existing studies highlight inconsistencies between automated evaluation metrics and human expert assessments. To bridge this gap, we introduce Patent-CE, the first comprehensive benchmark for evaluating patent claims. Patent-CE includes comparative claim evaluations annotated by patent experts, focusing on five key criteria: feature completeness, conceptual clarity, terminology consistency, logical linkage, and overall quality. Additionally, we propose PatClaimEval, a novel multi-dimensional evaluation method specifically designed for patent claims. Our experiments demonstrate that PatClaimEval achieves the highest correlation with human expert evaluations across all assessment criteria among all tested metrics. This research provides the groundwork for more accurate evaluations of automated patent claim generation systems.