Leveraging Auxiliary Task Relevance for Enhanced Bearing Fault Diagnosis through Curriculum Meta-learning

arXiv.org Artificial Intelligence

The accurate diagnosis of machine breakdowns is crucial for maintaining operational safety in smart manufacturing. Despite the promise shown by deep learning in automating fault identification, the scarcity of labeled training data, particularly for equipment failure instances, poses a significant challenge. This limitation hampers the development of robust classification models. Existing methods like model-agnostic meta-learning (MAML) do not adequately address variable working conditions, affecting knowledge transfer. To address these challenges, a Related Task Aware Curriculum Meta-learning (RT-ACM) enhanced fault diagnosis framework is proposed in this paper, inspired by human cognitive learning processes. RT-ACM improves training by considering the relevance of auxiliary sensor working conditions, adhering to the principle of "paying more attention to more relevant knowledge" and focusing on "easier first, harder later" curriculum sampling. This approach aids the meta-learner in achieving a superior convergence state. Extensive experiments on two real-world datasets demonstrate the superiority of the RT-ACM framework.


Beyond the Comfort Zone: Emerging Solutions to Overcome Challenges in Integrating LLMs into Software Products

arXiv.org Artificial Intelligence

Large Language Models (LLMs) are increasingly embedded into software products across diverse industries, enhancing user experiences, but at the same time introducing numerous challenges for developers. Unique characteristics of LLMs force developers, who are accustomed to traditional software development and evaluation, out of their comfort zones as the LLM components shatter standard assumptions about software systems. This study explores the emerging solutions that software developers are adopting to navigate the encountered challenges. Using a mixed-method approach, including 26 interviews and a survey with 332 responses, the study identifies 19 emerging solutions regarding quality assurance that practitioners across several product teams at Microsoft are exploring. The findings provide valuable insights that can guide the development and evaluation of LLM-based products more broadly in the face of these challenges.


D-Wave's Nonlinear-Program Hybrid Solver: Description and Performance Analysis

arXiv.org Artificial Intelligence

The development of advanced quantum-classical algorithms is among the most prominent strategies in quantum computing. Numerous hybrid solvers have been introduced recently. Many of these methods are created ad hoc to address specific use cases. However, several well-established schemes are frequently utilized to address optimization problems. In this context, D-Wave launched the Hybrid Solver Service in 2020, offering a portfolio of methods designed to accelerate time-to-solution for users aiming to optimize performance and operational processes. Recently, a new technique has been added to this portfolio: the Nonlinear-Program Hybrid Solver. This paper describes this solver and evaluates its performance through a benchmark of 45 instances across three combinatorial optimization problems: the Traveling Salesman Problem, the Knapsack Problem, and the Maximum Cut Problem. To facilitate the use of this relatively unexplored solver, we provide details of the implementation used to solve these three optimization problems.


Beyond algorithm hyperparameters: on preprocessing hyperparameters and associated pitfalls in machine learning applications

arXiv.org Machine Learning

Adequately generating and evaluating prediction models based on supervised machine learning (ML) is often challenging, especially for less experienced users in applied research areas. Special attention is required in settings where the model generation process involves hyperparameter tuning, i.e. data-driven optimization of different types of hyperparameters to improve the predictive performance of the resulting model. Discussions about tuning typically focus on the hyperparameters of the ML algorithm (e.g., the minimum number of observations in each terminal node for a tree-based algorithm). In this context, it is often neglected that hyperparameters also exist for the preprocessing steps that are applied to the data before it is provided to the algorithm (e.g., how to handle missing feature values in the data). As a consequence, users experimenting with different preprocessing options to improve model performance may be unaware that this constitutes a form of hyperparameter tuning - albeit informal and unsystematic - and thus may fail to report or account for this optimization. To illuminate this issue, this paper reviews and empirically illustrates different procedures for generating and evaluating prediction models, explicitly addressing the different ways algorithm and preprocessing hyperparameters are typically handled by applied ML users. By highlighting potential pitfalls, especially those that may lead to exaggerated performance claims, this review aims to further improve the quality of predictive modeling in ML applications.


Deep Learning, Machine Learning, Advancing Big Data Analytics and Management

arXiv.org Artificial Intelligence

Advancements in artificial intelligence, machine learning, and deep learning have catalyzed the transformation of big data analytics and management into pivotal domains for research and application. This work explores the theoretical foundations, methodological advancements, and practical implementations of these technologies, emphasizing their role in uncovering actionable insights from massive, high-dimensional datasets. The study presents a systematic overview of data preprocessing techniques, including data cleaning, normalization, integration, and dimensionality reduction, to prepare raw data for analysis. Core analytics methodologies such as classification, clustering, regression, and anomaly detection are examined, with a focus on algorithmic innovation and scalability. Furthermore, the text delves into state-of-the-art frameworks for data mining and predictive modeling, highlighting the role of neural networks, support vector machines, and ensemble methods in tackling complex analytical challenges. Special emphasis is placed on the convergence of big data with distributed computing paradigms, including cloud and edge computing, to address challenges in storage, computation, and real-time analytics. The integration of ethical considerations, including data privacy and compliance with global standards, ensures a holistic perspective on data management. Practical applications across healthcare, finance, marketing, and policy-making illustrate the real-world impact of these technologies. Through comprehensive case studies and Python-based implementations, this work equips researchers, practitioners, and data enthusiasts with the tools to navigate the complexities of modern data analytics. It bridges the gap between theory and practice, fostering the development of innovative solutions for managing and leveraging data in the era of artificial intelligence.


Selective Reviews of Bandit Problems in AI via a Statistical View

arXiv.org Machine Learning

Introduction: Reinforcement Learning (RL) is one of the most prominent and widely discussed methods in artificial intelligence, primarily focusing on how an agent learns to make decisions by interacting with an environment to maximize cumulative rewards [1]. RL has seen extensive applications in various domains, including autonomous driving [2], recommendation systems [3], unmanned aerial vehicles (UAVs) [4], financial trading [5], causal inference [6], and precision medicine [7,8]; see [9,10] for a review. A classic, simplified problem in RL is the stochastic bandit problem. Stochastic bandit problems exemplify the exploration-exploitation tradeoff, where an agent must choose between exploring new options to gather more information and exploiting known options to maximize rewards. The current review literature on stochastic bandit algorithms highlights applications in areas such as recommendation systems [11-13], experimental design [14], precision medicine [8], and causal inference [15]. Efficient bandit algorithms are designed from a statistical perspective. However, the statistical aspects remain underexplored in existing reviews. This paper aims to address this gap by focusing on the probabilistic and statistical foundations of stochastic algorithms, with particular emphasis on concentration inequalities, minimax rate of regret upper bounds, small-sample statistical inferences, linear models, Bayesian optimization, statistical learning theory, design of experiments, the Neyman-Rubin causal model, functional data analysis, robust statistics, information theory, and so on.
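
The exploration-exploitation tradeoff described above can be illustrated with a minimal sketch of UCB1, a standard stochastic bandit algorithm. This is not taken from the paper: the Bernoulli reward model, the arm means, the function name `ucb1`, and the seed are all assumptions chosen for illustration.

```python
import math
import random


def ucb1(arm_means, rounds, seed=0):
    """Run UCB1 on Bernoulli arms and return the number of pulls per arm.

    arm_means holds hypothetical success probabilities that the learner
    never sees directly; it only observes sampled 0/1 rewards.
    """
    rng = random.Random(seed)
    n_arms = len(arm_means)
    counts = [0] * n_arms      # pulls per arm
    sums = [0.0] * n_arms      # cumulative reward per arm

    for t in range(1, rounds + 1):
        if t <= n_arms:
            # Initialization: pull every arm once.
            arm = t - 1
        else:
            # Exploit the empirical mean, explore via the confidence bonus.
            arm = max(
                range(n_arms),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward

    return counts


if __name__ == "__main__":
    print(ucb1([0.2, 0.5, 0.8], rounds=5000))
```

The confidence bonus shrinks as an arm is pulled more often, so rarely tried arms are revisited occasionally while most pulls concentrate on the arm with the highest estimated mean.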


Medical Multimodal Foundation Models in Clinical Diagnosis and Treatment: Applications, Challenges, and Future Directions

arXiv.org Artificial Intelligence

Recent advancements in deep learning have significantly revolutionized the field of clinical diagnosis and treatment, offering novel approaches to improve diagnostic precision and treatment efficacy across diverse clinical domains, thus driving the pursuit of precision medicine. The growing availability of multi-organ and multimodal datasets has accelerated the development of large-scale Medical Multimodal Foundation Models (MMFMs). These models, known for their strong generalization capabilities and rich representational power, are increasingly being adapted to address a wide range of clinical tasks, from early diagnosis to personalized treatment strategies. This review offers a comprehensive analysis of recent developments in MMFMs, focusing on three key aspects: datasets, model architectures, and clinical applications. We also explore the challenges and opportunities in optimizing multimodal representations and discuss how these advancements are shaping the future of healthcare by enabling improved patient outcomes and more efficient clinical workflows.


An ADHD Diagnostic Interface Based on EEG Spectrograms and Deep Learning Techniques

arXiv.org Artificial Intelligence

This paper introduces an innovative approach to Attention-deficit/hyperactivity disorder (ADHD) diagnosis by employing deep learning (DL) techniques on electroencephalography (EEG) signals. This method addresses the limitations of current behavior-based diagnostic methods, which often lead to misdiagnosis and gender bias. By utilizing a publicly available EEG dataset and converting the signals into spectrograms, a ResNet-18 convolutional neural network (CNN) architecture was used to extract features for ADHD classification. The model achieved high precision and recall, with an overall F1 score of 0.9. Feature extraction highlighted significant brain regions (frontopolar, parietal, and occipital lobes) associated with ADHD. These insights guided the creation of a three-part digital diagnostic system, facilitating cost-effective and accessible ADHD screening, especially in school environments. This system enables earlier and more accurate identification of students at risk for ADHD, providing timely support to enhance their developmental outcomes. This study showcases the potential of integrating EEG analysis with DL to enhance ADHD diagnostics, presenting a viable alternative to traditional methods.


U-Net in Medical Image Segmentation: A Review of Its Applications Across Modalities

arXiv.org Artificial Intelligence

Medical imaging is essential in healthcare to provide key insights into patient anatomy and pathology, aiding in diagnosis and treatment. Non-invasive techniques such as X-ray, Magnetic Resonance Imaging (MRI), Computed Tomography (CT), and Ultrasound (US) capture detailed images of organs, tissues, and abnormalities. Effective analysis of these images requires precise segmentation to delineate regions of interest (ROI), such as organs or lesions. Traditional segmentation methods, relying on manual feature extraction, are labor-intensive and vary across experts. Recent advancements in Artificial Intelligence (AI) and Deep Learning (DL), particularly convolutional models such as U-Net and its variants (U-Net++ and U-Net 3+), have transformed medical image segmentation (MIS) by automating the process and enhancing accuracy. These models enable efficient, precise pixel-wise classification across various imaging modalities, overcoming the limitations of manual segmentation. This review explores various medical imaging techniques, examines the U-Net architectures and their adaptations, and discusses their application across different modalities. It also identifies common challenges in MIS and proposes potential solutions.


A Study on Quantum Neural Networks in Healthcare 5.0

arXiv.org Artificial Intelligence

The working environment in healthcare analytics is transforming with the emergence of healthcare 5.0 and the advancements in quantum neural networks. In addition to analyzing a comprehensive set of case studies, we also review relevant literature from the fields of quantum computing applications and smart healthcare analytics, focusing on the implications of quantum deep neural networks. This study aims to shed light on the existing research gaps regarding the implications of quantum neural networks in healthcare analytics. We argue that the healthcare industry is currently transitioning from automation towards genuine collaboration with quantum networks, which presents new avenues for research and exploration. Specifically, this study focuses on evaluating the performance of Healthcare 5.0, which involves the integration of diverse quantum machine learning and quantum neural network systems. This study also explores a range of potential challenges and future directions for Healthcare 5.0, particularly focusing on the integration of quantum neural networks.