Overview
Selecting a classification performance measure: matching the measure to the problem
Hand, David J., Christen, Peter, Ziyad, Sumayya
The problem of identifying to which of a given set of classes objects belong is ubiquitous, occurring in many research domains and application areas, including medical diagnosis, financial decision making, online commerce, and national security. But such assignments are rarely completely perfect, and classification errors occur. This means it is necessary to compare classification methods and algorithms to decide which is ``best'' for any particular problem. However, just as there are many different classification methods, so there are many different ways of measuring their performance. It is thus vital to choose a measure of performance which matches the aims of the research or application. This paper is a contribution to the growing literature on the relative merits of different performance measures. Its particular focus is the critical importance of matching the properties of the measure to the aims for which the classification is being made.
Denoising: A Powerful Building-Block for Imaging, Inverse Problems, and Machine Learning
Milanfar, Peyman, Delbracio, Mauricio
Denoising, the process of reducing random fluctuations in a signal to emphasize essential patterns, has been a fundamental problem of interest since the dawn of modern scientific inquiry. Recent denoising techniques, particularly in imaging, have achieved remarkable success, nearing theoretical limits by some measures. Yet, despite tens of thousands of research papers, the wide-ranging applications of denoising beyond noise removal have not been fully recognized. This is partly due to the vast and diverse literature, making a clear overview challenging. This paper aims to address this gap. We present a clarifying perspective on denoisers, their structure, and desired properties. We emphasize the increasing importance of denoising and showcase its evolution into an essential building block for complex tasks in imaging, inverse problems, and machine learning. Despite its long history, the community continues to uncover unexpected and groundbreaking uses for denoising, further solidifying its place as a cornerstone of scientific and engineering practice.
Surveying the MLLM Landscape: A Meta-Review of Current Surveys
Li, Ming, Chen, Keyu, Bi, Ziqian, Liu, Ming, Peng, Benji, Niu, Qian, Liu, Junyu, Wang, Jinlang, Zhang, Sen, Pan, Xuanhe, Xu, Jiawei, Feng, Pohsun
The rise of Multimodal Large Language Models (MLLMs) has become a transformative force in the field of artificial intelligence, enabling machines to process and generate content across multiple modalities, such as text, images, audio, and video. These models represent a significant advancement over traditional unimodal systems, opening new frontiers in diverse applications ranging from autonomous agents to medical diagnostics. By integrating multiple modalities, MLLMs achieve a more holistic understanding of information, closely mimicking human perception. As the capabilities of MLLMs expand, the need for comprehensive and accurate performance evaluation has become increasingly critical. This survey aims to provide a systematic review of benchmark tests and evaluation methods for MLLMs, covering key topics such as foundational concepts, applications, evaluation methodologies, ethical concerns, security, efficiency, and domain-specific applications. Through the classification and analysis of existing literature, we summarize the main contributions and methodologies of various surveys, conduct a detailed comparative analysis, and examine their impact within the academic community. Additionally, we identify emerging trends and underexplored areas in MLLM research, proposing potential directions for future studies. This survey is intended to offer researchers and practitioners a comprehensive understanding of the current state of MLLM evaluation, thereby facilitating further progress in this rapidly evolving field.
Trends, Advancements and Challenges in Intelligent Optimization in Satellite Communication
Krajsic, Philippe, Suess, Viola, Cao, Zehong, Kowalczyk, Ryszard, Franczyk, Bogdan
Abstract--Efficient satellite communications play an enormously important role in all of our daily lives. This includes the transmission of data for communication purposes, the operation of IoT applications or the provision of data for ground stations. More and more, AI-based methods are finding their way into these areas. This paper gives an overview of current research in the field of intelligent optimization of satellite communication. For this purpose, a text-mining based literature review was conducted and the identified papers were thematically clustered and analyzed. The identified clusters cover the main topics of routing, resource allocation and, load balancing. Through such a clustering of the literature in overarching topics, a structured analysis of the research papers was enabled, allowing the identification of latest technologies and approaches as well as research needs for intelligent optimization of satellite communication.
Analysis of Convolutional Neural Network-based Image Classifications: A Multi-Featured Application for Rice Leaf Disease Prediction and Recommendations for Farmers
Paneru, Biplov, Paneru, Bishwash, Shah, Krishna Bikram
This study presents a novel method for improving rice disease classification using 8 different convolutional neural network (CNN) algorithms, which will further the field of precision agriculture. Tkinter-based application that offers farmers a feature-rich interface. With the help of this cutting-edge application, farmers will be able to make timely and well-informed decisions by enabling real-time disease prediction and providing personalized recommendations. Together with the user-friendly Tkinter interface, the smooth integration of cutting-edge CNN transfer learning algorithms-based technology that include ResNet-50, InceptionV3, VGG16, and MobileNetv2 with the UCI dataset represents a major advancement toward modernizing agricultural practices and guaranteeing sustainable crop management. Remarkable outcomes include 75% accuracy for ResNet-50, 90% accuracy for DenseNet121, 84% accuracy for VGG16, 95.83% accuracy for MobileNetV2, 91.61% accuracy for DenseNet169, and 86% accuracy for InceptionV3. These results give a concise summary of the models' capabilities, assisting researchers in choosing appropriate strategies for precise and successful rice crop disease identification. A severe overfitting has been seen on VGG19 with 70% accuracy and Nasnet with 80.02% accuracy. On Renset101, only an accuracy of 54% could be achieved, along with only 33% on efficientNetB0. A MobileNetV2-trained model was successfully deployed on a TKinter GUI application to make predictions using image or real-time video capture.
IBM Quantum Computers: Evolution, Performance, and Future Directions
Quantum computers represent a transformative frontier in computational technology, promising exponential speedups beyond classical computing limits. IBM Quantum has led significant advancements in both hardware and software, providing access to quantum hardware via IBM Cloud since 2016, achieving a milestone with the world's first accessible quantum computer. This article explores IBM's quantum computing journey, focusing on the development of practical quantum computers. We summarize the evolution and advancements of IBM Quantum's processors across generations, including their recent breakthrough surpassing the 1,000-qubit barrier. The paper reviews detailed performance metrics across various hardware, tracing their evolution over time and highlighting IBM Quantum's transition from the noisy intermediate-scale quantum (NISQ) computing era towards fault-tolerant quantum computing capabilities.
Practical Aspects on Solving Differential Equations Using Deep Learning: A Primer
Deep learning (DL) [44] has advanced significantly in recent years, providing algorithms that can solve complex problems varying from image classification [12] to playing games such as Go [59] at a human level or even better. More recent advances include the development of Large Language Models (LLMs), which are substantial deep learning models (i.e., they usually have billions of parameters) such as ChatGPT [63, 21], and Llama [66] that have been trained on enormous data sets. Deep learning has naturally affected many scientific fields and has been applied to many complex problems. One such problem is the numerical solution of differential equations, where (deep) neural networks approximate the solution of partial or ordinary differential equations. The idea is not new since there are works from the 90s like Lee et al. [45] and Wang et al. [67] have relied on Hopfield neural networks [33] to solve differential equations using linear systems after discretizing them. Meade and Fernandez have proposed solvers for nonlinear differential equations based on combining neural networks and splines [48]. Lagaris et al. introduced in [42] an efficient method that relies on trial solutions and neural networks to solve boundary and initial values problems. They form a loss function using the differential operator of the problem, and a neural network approximates the solution by minimizing that loss function.
Art and Science of Quantizing Large-Scale Models: A Comprehensive Overview
Wang, Yanshu, Yang, Tong, Liang, Xiyan, Wang, Guoan, Lu, Hanning, Zhe, Xu, Li, Yaoming, Weitao, Li
This paper provides a comprehensive overview of the principles, challenges, and methodologies associated with quantizing large-scale neural network models. As neural networks have evolved towards larger and more complex architectures to address increasingly sophisticated tasks, the computational and energy costs have escalated significantly. We explore the necessity and impact of model size growth, highlighting the performance benefits as well as the computational challenges and environmental considerations. The core focus is on model quantization as a fundamental approach to mitigate these challenges by reducing model size and improving efficiency without substantially compromising accuracy. We delve into various quantization techniques, including both post-training quantization (PTQ) and quantization-aware training (QAT), and analyze several state-of-the-art algorithms such as LLM-QAT, PEQA(L4Q), ZeroQuant, SmoothQuant, and others. Through comparative analysis, we examine how these methods address issues like outliers, importance weighting, and activation quantization, ultimately contributing to more sustainable and accessible deployment of large-scale models.
Towards Ethical Personal AI Applications: Practical Considerations for AI Assistants with Long-Term Memory
One application area of long-term memory (LTM) capabilities with increasing traction is personal AI companions and assistants. With the ability to retain and contextualize past interactions and adapt to user preferences, personal AI companions and assistants promise a profound shift in how we interact with AI and are on track to become indispensable in personal and professional settings. However, this advancement introduces new challenges and vulnerabilities that require careful consideration regarding the deployment and widespread use of these systems. The goal of this paper is to explore the broader implications of building and deploying personal AI applications with LTM capabilities using a holistic evaluation approach. This will be done in three ways: 1) reviewing the technological underpinnings of LTM in Large Language Models, 2) surveying current personal AI companions and assistants, and 3) analyzing critical considerations and implications of deploying and using these applications.
Enhancing the Reliability of LiDAR Point Cloud Sampling: A Colorization and Super-Resolution Approach Based on LiDAR-Generated Images
Ha, Sier, Du, Honghao, Yu, Xianjia, Song, Jian, Westerlund, Tomi
In recent years, Light Detection and Ranging (LiDAR) technology, a critical sensor in robotics and autonomous systems, has seen significant advancements. These improvements include enhanced resolution of point clouds and the capability to provide 360{\deg} low-resolution images. These images encode various data such as depth, reflectivity, and near-infrared light within the pixels. However, an excessive density of points and conventional point cloud sampling can be counterproductive, particularly in applications such as LiDAR odometry, where misleading points and degraded geometry information may induce drift errors. Currently, extensive research efforts are being directed towards leveraging LiDAR-generated images to improve situational awareness. This paper presents a comprehensive review of current deep learning (DL) techniques, including colorization and super-resolution, which are traditionally utilized in conventional computer vision tasks. These techniques are applied to LiDAR-generated images and are analyzed qualitatively. Based on this analysis, we have developed a novel approach that selectively integrates the most suited colorization and super-resolution methods with LiDAR imagery to sample reliable points from the LiDAR point cloud. This approach aims to not only improve the accuracy of point cloud registration but also avoid mismatching caused by lacking geometry information, thereby augmenting the utility and precision of LiDAR systems in practical applications. In our evaluation, the proposed approach demonstrates superior performance compared to our previous work, achieving lower translation and rotation errors with a reduced number of points.