Overview
A Comprehensive Overview of Feature Selection Methods in Machine Learning
A critical phase of machine learning is feature selection. To enhance the model's effectiveness and generalizability, it includes choosing and creating a subset of features from a dataset that are most related to the target variable. There are many different techniques that can be used for feature selection, and the appropriate technique will depend on the specific problem and the type of data you are working with. It's important to note that feature selection is a trade-off between model complexity and predictive power. Adding more features to a model can potentially improve its performance, but it can also make the model more difficult to interpret and increase the risk of overfitting (performing well on the training data but poorly on unseen data). A hybrid method involves combining two or more of the above methods to find the optimal subset of features.
A Comprehensive Review on Autonomous Navigation
Nahavandi, Saeid, Alizadehsani, Roohallah, Nahavandi, Darius, Mohamed, Shady, Mohajer, Navid, Rokonuzzaman, Mohammad, Hossain, Ibrahim
The field of autonomous mobile robots has undergone dramatic advancements over the past decades. Despite achieving important milestones, several challenges are yet to be addressed. Aggregating the achievements of the robotic community as survey papers is vital to keep the track of current state-of-the-art and the challenges that must be tackled in the future. This paper tries to provide a comprehensive review of autonomous mobile robots covering topics such as sensor types, mobile robot platforms, simulation tools, path planning and following, sensor fusion methods, obstacle avoidance, and SLAM. The urge to present a survey paper is twofold. First, autonomous navigation field evolves fast so writing survey papers regularly is crucial to keep the research community well-aware of the current status of this field. Second, deep learning methods have revolutionized many fields including autonomous navigation. Therefore, it is necessary to give an appropriate treatment of the role of deep learning in autonomous navigation as well which is covered in this paper. Future works and research gaps will also be discussed.
Deep Active Learning for Computer Vision: Past and Future
Takezoe, Rinyoichi, Liu, Xu, Mao, Shunan, Chen, Marco Tianyu, Feng, Zhanpeng, Zhang, Shiliang, Wang, Xiaoyu
As an important data selection schema, active learning emerges as the essential component when iterating an Artificial Intelligence (AI) model. It becomes even more critical given the dominance of deep neural network based models, which are composed of a large number of parameters and data hungry, in application. Despite its indispensable role for developing AI models, research on active learning is not as intensive as other research directions. In this paper, we present a review of active learning through deep active learning approaches from the following perspectives: 1) technical advancements in active learning, 2) applications of active learning in computer vision, 3) industrial systems leveraging or with potential to leverage active learning for data iteration, 4) current limitations and future research directions. We expect this paper to clarify the significance of active learning in a modern AI model manufacturing process and to bring additional research attention to active learning. By addressing data automation challenges and coping with automated machine learning systems, active learning will facilitate democratization of AI technologies by boosting model production at scale.
Time series forecasting using fuzzy cognitive maps: a survey - Artificial Intelligence Review
Among various soft computing approaches for time series forecasting, fuzzy cognitive maps (FCMs) have shown remarkable results as a tool to model and analyze the dynamics of complex systems. FCMs have similarities to recurrent neural networks and can be classified as a neuro-fuzzy method. In other words, FCMs are a mixture of fuzzy logic, neural network, and expert system aspects, which act as a powerful tool for simulating and studying the dynamic behavior of complex systems. The most interesting features are knowledge interpretability, dynamic characteristics and learning capability. The goal of this survey paper is mainly to present an overview on the most relevant and recent FCM-based time series forecasting models proposed in the literature.
How to Find Actionable Static Analysis Warnings: A Case Study with FindBugs
Yedida, Rahul, Kang, Hong Jin, Tu, Huy, Yang, Xueqi, Lo, David, Menzies, Tim
Automatically generated static code warnings suffer from a large number of false alarms. Hence, developers only take action on a small percent of those warnings. To better predict which static code warnings should not be ignored, we suggest that analysts need to look deeper into their algorithms to find choices that better improve the particulars of their specific problem. Specifically, we show here that effective predictors of such warnings can be created by methods that locally adjust the decision boundary (between actionable warnings and others). These methods yield a new high water-mark for recognizing actionable static code warnings. For eight open-source Java projects (cassandra, jmeter, commons, lucene-solr, maven, ant, tomcat, derby) we achieve perfect test results on 4/8 datasets and, overall, a median AUC (area under the true negatives, true positives curve) of 92%.
Image Classification with Small Datasets: Overview and Benchmark
Brigato, L., Barz, B., Iocchi, L., Denzler, J.
Image classification with small datasets has been an active research area in the recent past. However, as research in this scope is still in its infancy, two key ingredients are missing for ensuring reliable and truthful progress: a systematic and extensive overview of the state of the art, and a common benchmark to allow for objective comparisons between published methods. This article addresses both issues. First, we systematically organize and connect past studies to consolidate a community that is currently fragmented and scattered. Second, we propose a common benchmark that allows for an objective comparison of approaches. It consists of five datasets spanning various domains (e.g., natural images, medical imagery, satellite data) and data types (RGB, grayscale, multispectral). We use this benchmark to re-evaluate the standard cross-entropy baseline and ten existing methods published between 2017 and 2021 at renowned venues. Surprisingly, we find that thorough hyper-parameter tuning on held-out validation data results in a highly competitive baseline and highlights a stunted growth of performance over the years. Indeed, only a single specialized method dating back to 2019 clearly wins our benchmark and outperforms the baseline classifier.
A Review of Scene Representations for Robot Manipulators
For a robot to act intelligently, it needs to sense the world around it. Increasingly, robots build an internal representation of the world from sensor readings. This representation can then be used to inform downstream tasks, such as manipulation, collision avoidance, or human interaction. In practice, scene representations vary widely depending on the type of robot, the sensing modality, and the task that the robot is designed to do. This review provides an overview of the scene representations used for robot manipulators (robot arms). We focus primarily on representations which are built from real world sensing and are used to inform some downstream robotics task. Building an intermediate scene representation is not necessary for a robotics system. It is completely possible for a robotics system to act directly on sensor data (e.g.
The State of the Art in Enhancing Trust in Machine Learning Models with the Use of Visualizations
Chatzimparmpas, A., Martins, R., Jusufi, I., Kucher, K., Rossi, Fabrice, Kerren, A.
Machine learning (ML) models are nowadays used in complex applications in various domains, such as medicine, bioinformatics, and other sciences. Due to their black box nature, however, it may sometimes be hard to understand and trust the results they provide. This has increased the demand for reliable visualization tools related to enhancing trust in ML models, which has become a prominent topic of research in the visualization community over the past decades. To provide an overview and present the frontiers of current research on the topic, we present a State-of-the-Art Report (STAR) on enhancing trust in ML models with the use of interactive visualization. We define and describe the background of the topic, introduce a categorization for visualization techniques that aim to accomplish this goal, and discuss insights and opportunities for future research directions. Among our contributions is a categorization of trust against different facets of interactive ML, expanded and improved from previous research. Our results are investigated from different analytical perspectives: (a) providing a statistical overview, (b) summarizing key findings, (c) performing topic analyses, and (d) exploring the data sets used in the individual papers, all with the support of an interactive web-based survey browser. We intend this survey to be beneficial for visualization researchers whose interests involve making ML models more trustworthy, as well as researchers and practitioners from other disciplines in their search for effective visualization techniques suitable for solving their tasks with confidence and conveying meaning to their data.
Towards Sustainable Artificial Intelligence: An Overview of Environmental Protection Uses and Issues
Pachot, Arnault, Patissier, Céline
Artificial Intelligence (AI) is used to create more sustainable production methods and model climate change, making it a valuable tool in the fight against environmental degradation. This paper describes the paradox of an energy-consuming technology serving the ecological challenges of tomorrow. The study provides an overview of the sectors that use AI-based solutions for environmental protection. It draws on numerous examples from AI for Green players to present use cases and concrete examples. In the second part of the study, the negative impacts of AI on the environment and the emerging technological solutions to support Green AI are examined. It is also shown that the research on less energy-consuming AI is motivated more by cost and energy autonomy constraints than by environmental considerations. This leads to a rebound effect that favors an increase in the complexity of models. Finally, the need to integrate environmental indicators into algorithms is discussed. The environmental dimension is part of the broader ethical problem of AI, and addressing it is crucial for ensuring the sustainability of AI in the long term.
Federated Learning -- Methods, Applications and beyond
Heusinger, Moritz, Raab, Christoph, Rossi, Fabrice, Schleif, Frank-Michael
In recent years the applications of machine learning models have increased rapidly, due to the large amount of available data and technological progress.While some domains like web analysis can benefit from this with only minor restrictions, other fields like in medicine with patient data are strongerregulated. In particular \emph{data privacy} plays an important role as recently highlighted by the trustworthy AI initiative of the EU or general privacy regulations in legislation. Another major challenge is, that the required training \emph{data is} often \emph{distributed} in terms of features or samples and unavailable for classicalbatch learning approaches. In 2016 Google came up with a framework, called \emph{Federated Learning} to solve both of these problems. We provide a brief overview on existing Methods and Applications in the field of vertical and horizontal \emph{Federated Learning}, as well as \emph{Fderated Transfer Learning}.