Survey on Leveraging Uncertainty Estimation Towards Trustworthy Deep Neural Networks: The Case of Reject Option and Post-training Processing

arXiv.org Artificial Intelligence

Although neural networks (especially deep neural networks) have achieved better-than-human performance in many fields, their real-world deployment is still questionable due to their lack of awareness of the limits of their knowledge. To incorporate such awareness into the machine learning model, prediction with a reject option (also known as selective classification or classification with abstention) has been proposed in the literature. In this paper, we present a systematic review of prediction with the reject option in the context of various neural networks. To the best of our knowledge, this is the first study focusing on this aspect of neural networks. Moreover, we discuss different novel loss functions related to the reject option and post-training processing (if any) of network output for generating suitable measurements for knowledge awareness of the model. Finally, we address the application of the reject option in reducing the prediction time for real-time problems and present a comprehensive summary of the techniques related to the reject option in the context of an extensive variety of neural networks. Our code is available on GitHub: https://github.com/MehediHasanTutul/Reject_option
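
As a rough illustration of prediction with a reject option, the sketch below abstains whenever the top softmax probability falls under a threshold. This is one common confidence-based rule and not necessarily a method from the surveyed paper; the function names and the 0.7 threshold are assumptions made for the example.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_with_reject(logits, threshold=0.7):
    """Return (class_index, confidence), or (None, confidence) when abstaining.

    `threshold` is a hypothetical confidence cut-off; in practice it would be
    tuned on held-out data to trade coverage against error rate.
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        return None, probs[best]  # reject: the model is not confident enough
    return best, probs[best]
```

A confident input such as `predict_with_reject([4.0, 0.0, 0.0])` is assigned class 0, while near-uniform logits such as `[0.1, 0.0, 0.0]` are rejected.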


A Review on Explainable Artificial Intelligence for Healthcare: Why, How, and When?

arXiv.org Artificial Intelligence

Artificial intelligence (AI) models are increasingly finding applications in the field of medicine. Concerns have been raised about the explainability of the decisions that are made by these AI models. In this article, we give a systematic analysis of explainable artificial intelligence (XAI), with a primary focus on models that are currently being used in the field of healthcare. The literature search is conducted following the preferred reporting items for systematic reviews and meta-analyses (PRISMA) standards for relevant work published from 1 January 2012 to 02 February 2022. The review analyzes the prevailing trends in XAI and lays out the major directions in which research is headed. We investigate the why, how, and when of the uses of these XAI models and their implications. We present a comprehensive examination of XAI methodologies as well as an explanation of how a trustworthy AI can be derived from describing AI models for healthcare fields. The discussion of this work will contribute to the formalization of the XAI field.


A Comprehensive Review of Data-Driven Co-Speech Gesture Generation

arXiv.org Artificial Intelligence

Gestures that accompany speech are an essential part of natural and efficient embodied human communication. The automatic generation of such co-speech gestures is a long-standing problem in computer animation and is considered an enabling technology in film, games, virtual social spaces, and for interaction with social robots. The problem is made challenging by the idiosyncratic and non-periodic nature of human co-speech gesture motion, and by the great diversity of communicative functions that gestures encompass. Gesture generation has seen surging interest recently, owing to the emergence of more and larger datasets of human gesture motion, combined with strides in deep-learning-based generative models that benefit from the growing availability of data. This review article summarizes co-speech gesture generation research, with a particular focus on deep generative models. First, we articulate the theory describing human gesticulation and how it complements speech. Next, we briefly discuss rule-based and classical statistical gesture synthesis, before delving into deep learning approaches. We employ the choice of input modalities as an organizing principle, examining systems that generate gestures from audio, text, and non-linguistic input. We also chronicle the evolution of the related training data sets in terms of size, diversity, motion quality, and collection method. Finally, we identify key research challenges in gesture generation, including data availability and quality; producing human-like motion; grounding the gesture in the co-occurring speech, in interaction with other speakers, and in the environment; performing gesture evaluation; and integration of gesture synthesis into applications. We highlight recent approaches to tackling the various key challenges, as well as the limitations of these approaches, and point toward areas of future development.


AI for IT Operations (AIOps) on Cloud Platforms: Reviews, Opportunities and Challenges

arXiv.org Artificial Intelligence

Artificial Intelligence for IT operations (AIOps) aims to combine the power of AI with the big data generated by IT Operations processes, particularly in cloud infrastructures, to provide actionable insights with the primary goal of maximizing availability. There are a wide variety of problems to address, and multiple use-cases, where AI capabilities can be leveraged to enhance operational efficiency. Here we provide a review of the AIOps vision, trends, challenges and opportunities, specifically focusing on the underlying AI techniques. We discuss in depth the key types of data emitted by IT Operations activities, the scale and challenges in analyzing them, and where they can be helpful. We categorize the key AIOps tasks as incident detection, failure prediction, root cause analysis and automated actions. We discuss the problem formulation for each task, and then present a taxonomy of techniques to solve these problems. We also identify relatively underexplored topics, especially those that could significantly benefit from advances in AI literature. We also provide insights into the trends in this field and the key investment opportunities.


Brief Review -- ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks

#artificialintelligence

ECA-Net clearly outperforms SENet, and also outperforms the fixed-kernel version of ECA-Net. ECA-Net is superior to SENet and CBAM, while it is very competitive with AA-Net at lower model complexity. Note that AA-Net is trained with Inception data augmentation and a different learning-rate setting. ECA-Net performs favorably against state-of-the-art CNNs while benefiting from much lower model complexity. With different frameworks used, ECA-Net generalizes well to the object detection task.


What Are the Data-Centric AI Concepts behind GPT Models?

#artificialintelligence

Artificial Intelligence (AI) has made incredible strides in transforming the way we live, work, and interact with technology. Recently, one area that has seen significant progress is the development of Large Language Models (LLMs), such as GPT-3, ChatGPT, and GPT-4. These models are capable of performing tasks such as language translation, text summarization, and question-answering with impressive accuracy. While it's difficult to ignore the increasing model size of LLMs, it's also important to recognize that their success is due largely to the large amount of high-quality data used to train them. In this article, we will present an overview of the recent advancements in LLMs from a data-centric AI perspective, drawing upon insights from our recent survey papers [1,2] with corresponding technical resources on GitHub.


A Comprehensive Survey on Knowledge Distillation of Diffusion Models

arXiv.org Artificial Intelligence

Diffusion Models (DMs), also referred to as score-based diffusion models, utilize neural networks to specify score functions. Unlike most other probabilistic models, DMs directly model the score functions, which makes them more flexible to parametrize and potentially highly expressive for probabilistic modeling. DMs can learn fine-grained knowledge, i.e., marginal score functions, of the underlying distribution. Therefore, a crucial research direction is to explore how to distill the knowledge of DMs and fully utilize their potential. Our objective is to provide a comprehensible overview of the modern approaches for distilling DMs, starting with an introduction to DMs and a discussion of the challenges involved in distilling them into neural vector fields. We also provide an overview of the existing works on distilling DMs into both stochastic and deterministic implicit generators. Finally, we review the accelerated diffusion sampling algorithms as a training-free method for distillation. Our tutorial is intended for individuals with a basic understanding of generative models who wish to apply DM distillation or embark on a research project in this field.


Class-Imbalanced Learning on Graphs: A Survey

arXiv.org Artificial Intelligence

In recent years, graph representation learning techniques have proven effective in discovering meaningful vector representations of nodes, edges, or entire graphs, resulting in successful applications across a wide range of downstream tasks [29, 52, 68]. However, graph data often presents a significant challenge in the form of class imbalance, where one class's instances significantly outnumber those of other classes. This imbalance can lead to suboptimal performance when applying machine learning techniques to graph data. Class-imbalanced learning on graphs (CILG) is an emerging research area addressing class imbalance in graph data, where traditional methods for non-graph data might be unsuitable or ineffective for several reasons. Firstly, graph data's unique, irregular, non-Euclidean structure complicates traditional class-imbalance techniques designed for Euclidean data [78]. Secondly, graph data often holds rich relational information, necessitating specialized techniques for preservation and leverage during the learning process [51]. Lastly, node dependencies and interactions in a graph make class re-balancing complex, as naïve oversampling or undersampling may disrupt the graph's structure and thus lead to poor performance [35].


Vertical Federated Learning: A Structured Literature Review

arXiv.org Artificial Intelligence

Federated Learning (FL) has emerged as a promising distributed learning paradigm with the added advantage of data privacy. With the growing interest in collaboration among data owners, FL has gained significant attention from organizations. The idea of FL is to enable collaborating participants to train machine learning (ML) models on decentralized data without breaching privacy. In simpler words, federated learning is the approach of "bringing the model to the data, instead of bringing the data to the model". Federated learning, when applied to data which is partitioned vertically across participants, is able to build a complete ML model by combining local models trained using only the data with distinct features at the local sites. This architecture of FL is referred to as vertical federated learning (VFL), which differs from the conventional FL on horizontally partitioned data. As VFL is different from conventional FL, it comes with its own issues and challenges. In this paper, we present a structured literature review discussing the state-of-the-art approaches in VFL. Additionally, the literature review highlights the existing solutions to challenges in VFL and provides potential research directions in this domain.
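
To make the idea of vertical partitioning concrete, the toy sketch below has each party compute a partial linear score on its own feature columns for the same sample, after which a coordinator adds up the partial scores. It is only an illustration of the data layout: it contains no training, encryption, or other privacy mechanism, and the function names and the linear model are assumptions, not taken from the reviewed literature.

```python
def local_score(features, weights):
    """Partial score computed by one party from the features it holds."""
    return sum(f * w for f, w in zip(features, weights))


def vfl_predict(party_features, party_weights, bias=0.0):
    """Combine per-party partial scores into one prediction.

    party_features[i] and party_weights[i] belong to party i; the raw
    features never leave that party, only the scalar partial score does.
    """
    partials = [local_score(f, w) for f, w in zip(party_features, party_weights)]
    return sum(partials) + bias


# Hypothetical example: party A holds two features, party B holds one,
# both describing the same sample.
features = [[1.0, 2.0], [3.0]]
weights = [[0.5, 0.25], [2.0]]
score = vfl_predict(features, weights)
```

For a linear model, the combined score matches what a single model would compute on the concatenated feature vector, which is why the partial scores can simply be added.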


Evolution of Large Language Models: Revealing the Maestro of Linguistic Symphony

#artificialintelligence

Large Language Models (LLMs) have emerged as a cornerstone of artificial intelligence research and development, revolutionizing how machines understand and process natural language. These models, based on advanced deep learning architectures, have become increasingly sophisticated, capable of generating human-like text, answering questions, summarizing content, and performing a plethora of other tasks. The remarkable growth in the capabilities of LLMs can be attributed to advancements in computational power, the availability of large-scale datasets, and the continuous refinement of algorithmic techniques. A key element in the success of LLMs is their use of transformer-based architectures, which employ self-attention mechanisms to capture contextual information across long text sequences. Transformers have demonstrated a remarkable ability to scale, enabling the development of larger models with billions of parameters.
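
For readers unfamiliar with the self-attention mechanism mentioned above, here is a minimal single-head sketch of scaled dot-product attention in plain Python. It omits the learned projection matrices, masking, and multiple heads that real transformer layers use; the function names are chosen for this example only.

```python
import math


def softmax(xs):
    """Normalize scores into weights that sum to 1."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of vectors.

    Each output row is a weighted average of `values`, with weights given by
    the softmax of query-key dot products scaled by sqrt(key dimension).
    """
    d = len(keys[0])
    outputs = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        outputs.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(values[0]))]
        )
    return outputs
```

When all keys are identical, every position receives the same weight, so the output is simply the mean of the value vectors.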