Goto

Collaborating Authors

 Overview


AutoDES: AutoML Pipeline Generation of Classification with Dynamic Ensemble Strategy Selection

arXiv.org Artificial Intelligence

Automating machine learning has achieved remarkable technological developments in recent years, and building an automated machine learning pipeline is now an essential task. The model ensemble is the technique of combining multiple models to get a better and more robust model. However, existing automated machine learning tends to be simplistic in handling the model ensemble, where the ensemble strategy is fixed, such as stacked generalization. There have been many techniques on different ensemble methods, especially ensemble selection, and the fixed ensemble strategy limits the upper limit of the model's performance. In this article, we present a novel framework for automated machine learning. Our framework incorporates advances in dynamic ensemble selection, and to our best knowledge, our approach is the first in the field of AutoML to search and optimize ensemble strategies. In the comparison experiments, our method outperforms the state-of-the-art automated machine learning frameworks with the same CPU time in 42 classification datasets from the OpenML platform. Ablation experiments on our framework validate the effectiveness of our proposed method.


Evaluating State of the Art, Forecasting Ensembles- and Meta-learning Strategies for Model Fusion

arXiv.org Artificial Intelligence

Techniques of hybridisation and ensemble learning are popular model fusion techniques for improving the predictive power of forecasting methods. With limited research that instigates combining these two promising approaches, this paper focuses on the utility of the Exponential-Smoothing-Recurrent Neural Network (ES-RNN) in the pool of base models for different ensembles. We compare against some state of the art ensembling techniques and arithmetic model averaging as a benchmark. We experiment with the M4 forecasting data set of 100,000 time-series, and the results show that the Feature-based Forecast Model Averaging (FFORMA), on average, is the best technique for late data fusion with the ES-RNN. However, considering the M4's Daily subset of data, stacking was the only successful ensemble at dealing with the case where all base model performances are similar. Our experimental results indicate that we attain state of the art forecasting results compared to N-BEATS as a benchmark. We conclude that model averaging is a more robust ensemble than model selection and stacking strategies. Further, the results show that gradient boosting is superior for implementing ensemble learning strategies.


GeoAI for Large-Scale Image Analysis and Machine Vision: Recent Progress of Artificial Intelligence in Geography

#artificialintelligence

GeoAI, or geospatial artificial intelligence, has become a trending topic and the frontier for spatial analytics in Geography. Although much progress has been made in exploring the integration of AI and Geography, there is yet no clear definition of GeoAI, its scope of research, or a broad discussion of how it enables new ways of problem solving across social and environmental sciences. This paper provides a comprehensive overview of GeoAI research used in large-scale image analysis, and its methodological foundation, most recent progress in geospatial applications, and comparative advantages over traditional methods. We organize this review of GeoAI research according to different kinds of image or structured data, including satellite and drone images, street views, and geo-scientific data, as well as their applications in a variety of image analysis and machine vision tasks. While different applications tend to use diverse types of data and models, we summarized six major strengths of GeoAI research, including (1) enablement of large-scale analytics; (2) automation; (3) high accuracy; (4) sensitivity in detecting subtle changes; (5) tolerance of noise in data; and (6) rapid technological advancement. As GeoAI remains a rapidly evolving field, we also describe current knowledge gaps and discuss future research directions.


Indoor Localization for Personalized Ambient Assisted Living of Multiple Users in Multi-Floor Smart Environments

arXiv.org Artificial Intelligence

This paper presents a multifunctional interdisciplinary framework that makes four scientific contributions towards the development of personalized ambient assisted living, with a specific focus to address the different and dynamic needs of the diverse aging population in the future of smart living environments. First, it presents a probabilistic reasoning-based mathematical approach to model all possible forms of user interactions for any activity arising from the user diversity of multiple users in such environments. Second, it presents a system that uses this approach with a machine learning method to model individual user profiles and user-specific user interactions for detecting the dynamic indoor location of each specific user. Third, to address the need to develop highly accurate indoor localization systems for increased trust, reliance, and seamless user acceptance, the framework introduces a novel methodology where two boosting approaches Gradient Boosting and the AdaBoost algorithm are integrated and used on a decision tree-based learning model to perform indoor localization. Fourth, the framework introduces two novel functionalities to provide semantic context to indoor localization in terms of detecting each user's floor-specific location as well as tracking whether a specific user was located inside or outside a given spatial region in a multi-floor-based indoor setting. These novel functionalities of the proposed framework were tested on a dataset of localization-related Big Data collected from 18 different users who navigated in 3 buildings consisting of 5 floors and 254 indoor spatial regions. The results show that this approach of indoor localization for personalized AAL that models each specific user always achieves higher accuracy as compared to the traditional approach of modeling an average user.


Residual and Attentional Architectures for Vector-Symbols

arXiv.org Artificial Intelligence

Vector-symbolic architectures (VSAs) provide methods for computing which are highly flexible and carry unique advantages. Concepts in VSAs are represented by 'symbols,' long vectors of values which utilize properties of high-dimensional spaces to represent and manipulate information. In this new work, we combine efficiency of the operations provided within the framework of the Fourier Holographic Reduced Representation (FHRR) VSA with the power of deep networks to construct novel VSA based residual and attention-based neural network architectures. Using an attentional FHRR architecture, we demonstrate that the same network architecture can address problems from different domains (image classification and molecular toxicity prediction) by encoding different information into the network's inputs, similar to the Perceiver model. This demonstrates a novel application of VSAs and a potential path to implementing state-of-the-art neural models on neuromorphic hardware.


A Survey on Participant Selection for Federated Learning in Mobile Networks

#artificialintelligence

Federated Learning (FL) is an efficient distributed machine learning paradigm that employs private datasets in a privacy-preserving manner. The main challenges of FL is that end devices usually possess various computation and communication capabilities and their training data are not independent and identically distributed (non-IID). Due to limited communication bandwidth and unstable availability of such devices in a mobile network, only a fraction of end devices (also referred to as the participants or clients in a FL process) can be selected in each round. Hence, it is of paramount importance to utilize an efficient participant selection scheme to maximize the performance of FL including final model accuracy and training time. In this paper, we provide a review of participant selection techniques for FL.


Physics-Informed Deep Neural Operator Networks

arXiv.org Artificial Intelligence

Standard neural networks can approximate general nonlinear operators, represented either explicitly by a combination of mathematical operators, e.g., in an advection-diffusion-reaction partial differential equation, or simply as a black box, e.g., a system-of-systems. The first neural operator was the Deep Operator Network (DeepONet), proposed in 2019 based on rigorous approximation theory. Since then, a few other less general operators have been published, e.g., based on graph neural networks or Fourier transforms. For black box systems, training of neural operators is data-driven only but if the governing equations are known they can be incorporated into the loss function during training to develop physics-informed neural operators. Neural operators can be used as surrogates in design problems, uncertainty quantification, autonomous systems, and almost in any application requiring real-time inference. Moreover, independently pre-trained DeepONets can be used as components of a complex multi-physics system by coupling them together with relatively light training. Here, we present a review of DeepONet, the Fourier neural operator, and the graph neural operator, as well as appropriate extensions with feature expansions, and highlight their usefulness in diverse applications in computational mechanics, including porous media, fluid mechanics, and solid mechanics.


Pareto Frontier Approximation Network (PA-Net) to Solve Bi-objective TSP

arXiv.org Artificial Intelligence

The travelling salesperson problem (TSP) is a classic resource allocation problem used to find an optimal order of doing a set of tasks while minimizing (or maximizing) an associated objective function. It is widely used in robotics for applications such as planning and scheduling. In this work, we solve TSP for two objectives using reinforcement learning (RL). Often in multi-objective optimization problems, the associated objective functions can be conflicting in nature. In such cases, the optimality is defined in terms of Pareto optimality. A set of these Pareto optimal solutions in the objective space form a Pareto front (or frontier). Each solution has its trade-off. We present the Pareto frontier approximation network (PA-Net), a network that generates good approximations of the Pareto front for the bi-objective travelling salesperson problem (BTSP). Firstly, BTSP is converted into a constrained optimization problem. We then train our network to solve this constrained problem using the Lagrangian relaxation and policy gradient. With PA-Net we improve the performance over an existing deep RL-based method. The average improvement in the hypervolume metric, which is used to measure the optimality of the Pareto front, is 2.3%. At the same time, PA-Net has 4.5x faster inference time. Finally, we present the application of PA-Net to find optimal visiting order in a robotic navigation task/coverage planning. Our code is available on the project website.


An Overview of Distant Supervision for Relation Extraction with a Focus on Denoising and Pre-training Methods

arXiv.org Artificial Intelligence

Relation Extraction (RE) is a foundational task of natural language processing. RE seeks to transform raw, unstructured text into structured knowledge by identifying relational information between entity pairs found in text. RE has numerous uses, such as knowledge graph completion, text summarization, question-answering, and search querying. The history of RE methods can be roughly organized into four phases: pattern-based RE, statistical-based RE, neural-based RE, and large language model-based RE. This survey begins with an overview of a few exemplary works in the earlier phases of RE, highlighting limitations and shortcomings to contextualize progress. Next, we review popular benchmarks and critically examine metrics used to assess RE performance. We then discuss distant supervision, a paradigm that has shaped the development of modern RE methods. Lastly, we review recent RE works focusing on denoising and pre-training methods.


Technology and Consciousness

arXiv.org Artificial Intelligence

We report on a series of eight workshops held in the summer of 2017 on the topic "technology and consciousness." The workshops covered many subjects but the overall goal was to assess the possibility of machine consciousness, and its potential implications. In the body of the report, we summarize most of the basic themes that were discussed: the structure and function of the brain, theories of consciousness, explicit attempts to construct conscious machines, detection and measurement of consciousness, possible emergence of a conscious technology, methods for control of such a technology and ethical considerations that might be owed to it. An appendix outlines the topics of each workshop and provides abstracts of the talks delivered. Update: Although this report was published in 2018 and the workshops it is based on were held in 2017, recent events suggest that it is worth bringing forward. In particular, in the Spring of 2022, a Google engineer claimed that LaMDA, one of their "large language models" is sentient or even conscious. This provoked a flurry of commentary in both the scientific and popular press, some of it interesting and insightful, but almost all of it ignorant of the prior consideration given to these topics and the history of research into machine consciousness. Thus, we are making a lightly refreshed version of this report available in the hope that it will provide useful background to the current debate and will enable more informed commentary. Although this material is five years old, its technical points remain valid and up to date, but we have "refreshed" it by adding a few footnotes highlighting recent developments.