Expert Systems
Industrial Internet of Things Intelligence Empowering Smart Manufacturing: A Literature Review
Hu, Yujiao, Jia, Qingmin, Yao, Yuao, Lee, Yong, Lee, Mengjie, Wang, Chenyi, Zhou, Xiaomao, Xie, Renchao, Yu, F. Richard
The fiercely competitive business environment and increasingly personalized customization needs are driving the digital transformation and upgrading of the manufacturing industry. IIoT intelligence, which can provide innovative and efficient solutions for various aspects of the manufacturing value chain, illuminates the path of transformation for the manufacturing industry. It is time to provide a systematic vision of IIoT intelligence. However, existing surveys often focus on specific areas of IIoT intelligence, which biases researchers and readers toward believing that research in one direction matters most for the development of IIoT intelligence, while ignoring contributions from other directions. Therefore, this paper provides a comprehensive overview of IIoT intelligence. We first conduct an in-depth analysis of the inevitability of manufacturing transformation and study the successful experiences from the practices of Chinese enterprises. Then we give our definition of IIoT intelligence and demonstrate the value of IIoT intelligence for industries in functions, operations, deployments, and applications. Afterwards, we propose a hierarchical development architecture for IIoT intelligence, which consists of five layers. The practical value of technical upgrades at each layer is illustrated by a close look at lighthouse factories. Following that, we identify seven kinds of technologies that accelerate the transformation of manufacturing, and clarify their contributions. Finally, we explore the open challenges and development trends from four aspects to inspire future research.
Omni-SMoLA: Boosting Generalist Multimodal Models with Soft Mixture of Low-rank Experts
Wu, Jialin, Hu, Xia, Wang, Yaqing, Pang, Bo, Soricut, Radu
Large multi-modal models (LMMs) exhibit remarkable performance across numerous tasks. However, generalist LMMs often suffer from performance degradation when tuned over a large collection of tasks. Recent research suggests that Mixture of Experts (MoE) architectures are useful for instruction tuning, but for LMMs of parameter size around O(50-100B), the prohibitive cost of replicating and storing the expert models severely limits the number of experts we can use. We propose Omni-SMoLA, an architecture that uses the Soft MoE approach to (softly) mix many multimodal low rank experts, and avoids introducing a significant number of new parameters compared to conventional MoE models. The core intuition here is that the large model provides a foundational backbone, while different lightweight experts residually learn specialized knowledge, either per-modality or multimodally. Extensive experiments demonstrate that the SMoLA approach helps improve the generalist performance across a broad range of generative vision-and-language tasks, achieving new SoTA generalist performance that often matches or outperforms single specialized LMM baselines, as well as new SoTA specialist performance.
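The core intuition above (a frozen backbone plus softly mixed low-rank residual experts) can be illustrated with a minimal sketch. This is not the Omni-SMoLA implementation: it uses simplified per-input softmax gating rather than the slot-based Soft MoE dispatch, and the names `smola_like_layer`, `w_base`, `experts`, and `router` are assumptions made only for illustration.

```python
import math


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def softmax(z):
    """Numerically stable softmax over a list of logits."""
    mx = max(z)
    e = [math.exp(t - mx) for t in z]
    s = sum(e)
    return [t / s for t in e]


def smola_like_layer(x, w_base, experts, router):
    """Hypothetical layer: frozen base projection plus gated low-rank residuals.

    x       : input vector of length d_in
    w_base  : frozen d_out x d_in matrix (the backbone)
    experts : list of (A, B) pairs, A is r x d_in and B is d_out x r
    router  : n_experts x d_in matrix producing one logit per expert
    """
    out = matvec(w_base, x)
    gates = softmax(matvec(router, x))  # soft weights, no hard top-k choice
    for g, (a, b) in zip(gates, experts):
        delta = matvec(b, matvec(a, x))  # low-rank update B(Ax)
        out = [o + g * d for o, d in zip(out, delta)]
    return out, gates
```

Each expert adds only r * (d_in + d_out) parameters, which is why many experts can be mixed without replicating the backbone.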
A Causality-Aware Pattern Mining Scheme for Group Activity Recognition in a Pervasive Sensor Space
Kim, Hyunju, Son, Heesuk, Lee, Dongman
Human activity recognition (HAR) is a key challenge in pervasive computing, and solutions have been proposed from various disciplines. Specifically, for HAR in a smart space without privacy and accessibility issues, data streams generated by deployed pervasive sensors are leveraged. In this paper, we focus on group activities, in which a group of users performs a collaborative task without user identification, and propose an efficient group activity recognition scheme. The scheme extracts causality patterns from pervasive sensor event sequences generated by a group of users and achieves recognition accuracy comparable to that of the state-of-the-art graphical model. To filter out irrelevant noise events from a given data stream, a set of rules is leveraged to highlight causally related events. Then, a pattern-tree algorithm extracts frequent causal patterns by means of a growing tree structure. Based on the extracted patterns, a weighted sum-based pattern matching algorithm computes the likelihood of each stored group activity for the given test event sequence from the counts of matched event patterns. We evaluate the proposed scheme using data collected from our testbed and the CASAS datasets, where users perform their tasks on a daily basis, and validate its effectiveness in a real environment. Experimental results show that the proposed scheme achieves higher recognition accuracy than the existing schemes, with only a small amount of runtime overhead.
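The weighted-sum matching step can be sketched roughly as follows. The abstract does not specify how patterns are matched or weighted, so this sketch assumes contiguous matching and hand-picked weights; `count_occurrences` and `score_activities` are hypothetical names, not the authors' algorithm.

```python
def count_occurrences(pattern, events):
    """Count contiguous occurrences of `pattern` (a tuple) in `events` (a list).

    Assumption: contiguous matching. The actual scheme may allow gaps.
    """
    n, k = len(events), len(pattern)
    return sum(1 for i in range(n - k + 1) if tuple(events[i:i + k]) == pattern)


def score_activities(events, activity_patterns):
    """Score each stored activity against a test event sequence.

    activity_patterns maps activity -> {pattern_tuple: weight}.
    The raw score is the weighted sum of matched-pattern counts; scores are
    normalized to sum to 1 (all zeros if nothing matches).
    """
    raw = {
        act: sum(w * count_occurrences(p, events) for p, w in pats.items())
        for act, pats in activity_patterns.items()
    }
    total = sum(raw.values())
    return {act: (s / total if total else 0.0) for act, s in raw.items()}
```

The recognized activity would then be the one with the highest normalized score.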
Active learning for data streams: a survey
Cacciarelli, Davide, Kulahci, Murat
Online active learning is a paradigm in machine learning that aims to select the most informative data points to label from a data stream. The problem of minimizing the cost associated with collecting labeled observations has gained a lot of attention in recent years, particularly in real-world applications where data is only available in an unlabeled form. Annotating each observation can be time-consuming and costly, making it difficult to obtain large amounts of labeled data. To overcome this issue, many active learning strategies have been proposed in the last decades, aiming to select the most informative observations for labeling in order to improve the performance of machine learning models. These approaches can be broadly divided into two categories: static pool-based and stream-based active learning. Pool-based active learning involves selecting a subset of observations from a closed pool of unlabeled data, and it has been the focus of many surveys and literature reviews. However, the growing availability of data streams has led to an increase in the number of approaches that focus on online active learning, which involves continuously selecting and labeling observations as they arrive in a stream. This work aims to provide an overview of the most recently proposed approaches for selecting the most informative observations from data streams in real time. We review the various techniques that have been proposed and discuss their strengths and limitations, as well as the challenges and opportunities that exist in this area of research.
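As a rough illustration of the stream-based setting, the sketch below decides, one instance at a time, whether to request a label. It assumes a binary classifier that emits a positive-class probability, a fixed uncertainty threshold, and a fixed labeling budget; the function name and parameters are hypothetical and do not correspond to any specific method covered by the survey.

```python
def stream_query_decisions(prob_stream, threshold=0.2, budget=2):
    """Decide on the fly which arriving instances to send for labeling.

    prob_stream : iterable of positive-class probabilities, one per instance
    threshold   : query when |p - 0.5| < threshold (near the decision boundary)
    budget      : maximum number of labels that may be requested
    """
    decisions = []
    remaining = budget
    for p in prob_stream:
        uncertain = abs(p - 0.5) < threshold
        query = uncertain and remaining > 0
        if query:
            remaining -= 1  # in a full loop, the model would be updated here
        decisions.append(query)
    return decisions
```

Unlike pool-based selection, each decision is made without seeing future instances, so an exhausted budget means later uncertain instances go unlabeled.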
Understanding Your Agent: Leveraging Large Language Models for Behavior Explanation
Zhang, Xijia, Guo, Yue, Stepputtis, Simon, Sycara, Katia, Campbell, Joseph
Intelligent agents such as robots are increasingly deployed in real-world, safety-critical settings. It is vital that these agents are able to explain the reasoning behind their decisions to human counterparts; however, their behavior is often produced by uninterpretable models such as deep neural networks. We propose an approach to generate natural language explanations for an agent's behavior based only on observations of states and actions, thus making our method independent from the underlying model's representation. For such models, we first learn a behavior representation and subsequently use it to produce plausible explanations with minimal hallucination while affording user interaction with a pre-trained large language model. We evaluate our method in a multi-agent search-and-rescue environment and demonstrate the effectiveness of our explanations for agents executing various behaviors. Through user studies and empirical experiments, we show that our approach generates explanations as helpful as those produced by a human domain expert while enabling beneficial interactions such as clarification and counterfactual queries.
CommunityAI: Towards Community-based Federated Learning
Murturi, Ilir, Donta, Praveen Kumar, Dustdar, Schahram
Federated Learning (FL) has emerged as a promising paradigm to train machine learning models collaboratively while preserving data privacy. However, its widespread adoption faces several challenges, including scalability, heterogeneous data and devices, resource constraints, and security concerns. Despite its promise, FL has not been specifically adapted for community domains, primarily due to the wide-ranging differences in data types and context, devices and operational conditions, environmental factors, and stakeholders. In response to these challenges, we present a novel framework for Community-based Federated Learning called CommunityAI. CommunityAI enables participants to be organized into communities based on their shared interests, expertise, or data characteristics. Community participants collectively contribute to training and refining learning models while maintaining data and participant privacy within their respective groups. Within this paper, we discuss the conceptual architecture, system requirements, processes, and future challenges that must be solved. Finally, our goal within this paper is to present our vision regarding enabling a collaborative learning process within various communities.
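One way to picture community-scoped training is to aggregate model updates separately within each community. The sketch below applies sample-weighted averaging (FedAvg-style) per community. The abstract describes a conceptual architecture only, so the aggregation rule, the function names, and the dictionary keys `community`, `weights`, and `n_samples` are all assumptions.

```python
from collections import defaultdict


def fedavg(updates):
    """Sample-weighted mean of parameter vectors.

    updates: list of (weights_vector, n_samples) pairs.
    """
    total = sum(n for _, n in updates)
    dim = len(updates[0][0])
    return [sum(w[i] * n for w, n in updates) / total for i in range(dim)]


def aggregate_by_community(participants):
    """Group participants by community and average within each group.

    Raw data never leaves a participant; only parameter vectors are shared.
    """
    groups = defaultdict(list)
    for p in participants:
        groups[p["community"]].append((p["weights"], p["n_samples"]))
    return {c: fedavg(u) for c, u in groups.items()}
```

Each community ends up with its own model, so participants with very different data characteristics do not pull a single global model in conflicting directions.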
A knowledge-driven AutoML architecture
Cofaru, Corneliu, Loeckx, Johan
Automated machine learning (AutoML) has gathered a significant amount of attention in recent years as a way of automating some of the typical workflows in machine learning (ML) and data science more broadly. For a comprehensive and systematic view of the subject, there is a growing number of survey works that cover the state of the art: Hutter et al. (2019); Yao et al. (2018); Elshawi et al. (2019); Zöller and Huber (2021); Truong et al. (2019); He et al. (2021); Hospedales et al. (2020); Vanschoren (2018); Karmaker ("Santu") et al. Currently, it is becoming apparent that the size of the potential problem space, the required solution sophistication, and transparency and legal constraints (Roscher et al. (2020); Drozdal et al. (2020); Rudin et al. (2021); Veale and Borgesius (2021); Smuha et al. (2021)) render AutoML a problem that is extremely difficult to define and solve either holistically or agnostically.
Students' interest in knowledge acquisition in Artificial Intelligence
Petrescu, Manuela-Andreea, Pop, Emilia-Loredana, Mihoc, Tudor-Dan
Some students' expectations and points of view related to the Artificial Intelligence course are explored and analyzed in this study. We anonymously collected answers from 58 undergraduate students out of 200 enrolled in the Computer Science specialization. The answers were analyzed and interpreted using thematic analysis to identify the students' interests and the attractive and unattractive aspects of the Artificial Intelligence study topic. We concluded that students are interested in Artificial Intelligence due to its trendiness, applicability, their passion and interest in the subject, the potential for future growth, and high salaries. However, the students' expectations were mainly related to achieving medium knowledge in the Artificial Intelligence field, and men seem to be more interested in acquiring high-level skills than women. The aspect students most commonly disliked was the mathematics used in Artificial Intelligence. A small group of them were also aware that Artificial Intelligence could be used unethically for negative purposes. Our study also provides a short comparison to the Databases course, in which students were not as passionate or interested in achieving medium knowledge; their interest was related to DB usage and basic information.
Utilizing Multiple Inputs Autoregressive Models for Bearing Remaining Useful Life Prediction
Wang, Junliang, Zhang, Qinghua, Zhu, Guanhua, Sun, Guoxi
Accurate prediction of the Remaining Useful Life (RUL) of rolling bearings is crucial in industrial production, yet existing models often struggle with limited generalization capabilities due to their inability to fully process all vibration signal patterns. We introduce a novel multi-input autoregressive model to address this challenge in RUL prediction for bearings. Our approach integrates vibration signals with previously predicted Health Indicator (HI) values, employing feature fusion to output the HI value of the current window. Through autoregressive iterations, the model attains a global receptive field, effectively overcoming the limitations in generalization. Furthermore, we incorporate a segmentation method and multiple training iterations to mitigate error accumulation in autoregressive models. Empirical evaluation on the PHM2012 dataset demonstrates that our model, compared to other backbone networks using similar autoregressive approaches, achieves significantly lower Root Mean Square Error (RMSE) and Score. Notably, it outperforms traditional autoregressive models that use label values as inputs and non-autoregressive networks, showing superior generalization abilities with a marked lead in RMSE and Score metrics.
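The autoregressive loop described above, in which each window's HI is predicted from the current vibration window together with previously predicted HI values, can be sketched as follows. The predictor here is a toy stand-in rather than the authors' network, and the names `autoregressive_rollout`, `toy_predictor`, `init_hi`, and `history` are assumptions for illustration. The RMSE helper follows the standard definition.

```python
import math


def autoregressive_rollout(windows, predict_hi, init_hi, history=2):
    """Roll a predictor forward, feeding its own outputs back as inputs.

    windows    : list of vibration windows (each a list of floats)
    predict_hi : callable (window, previous_his) -> float
    init_hi    : seed HI values used before any prediction exists
    history    : how many previous HI values the predictor sees
    """
    his = list(init_hi)
    preds = []
    for w in windows:
        h = predict_hi(w, his[-history:])
        preds.append(h)
        his.append(h)  # the prediction, not a label, becomes future input
    return preds


def rmse(pred, target):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred))


def toy_predictor(window, prev_his):
    """Stand-in model: blend window energy with the mean of previous HIs."""
    energy = sum(v * v for v in window) / len(window)
    return 0.5 * energy + 0.5 * (sum(prev_his) / len(prev_his))
```

Because predictions are fed back, an early error propagates to later windows, which is the error accumulation the abstract says the segmentation method and repeated training iterations are meant to mitigate.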
Active Foundational Models for Fault Diagnosis of Electrical Motors
Anbalagan, Sriram, GP, Sai Shashank, Agarwal, Deepesh, Natarajan, Balasubramaniam, Srinivasan, Babji
Fault detection and diagnosis of electrical motors are of utmost importance in ensuring the safe and reliable operation of several industrial systems. Detection and diagnosis of faults at the incipient stage allows corrective actions to be taken in order to reduce the severity of faults. Existing data-driven deep learning approaches for machine fault diagnosis rely extensively on huge amounts of labeled samples, for which annotations are expensive and time-consuming. Meanwhile, a major portion of unlabeled condition monitoring data is not exploited in the training process. To overcome this limitation, we propose a foundational model-based Active Learning framework that uses fewer labeled samples, selected to be the most informative, and harnesses a large amount of available unlabeled data by effectively combining Active Learning and Contrastive Self-Supervised Learning techniques. It consists of a transformer network-based backbone model trained using an advanced nearest-neighbor contrastive self-supervised learning method. This approach enables the backbone to learn improved representations of samples derived from raw, unlabeled vibration data. Subsequently, the backbone can be fine-tuned to address a range of downstream tasks, both within the same machines and across different machines. The effectiveness of the proposed methodology has been assessed by fine-tuning the backbone for multiple target tasks using three distinct machine-bearing fault datasets. The experimental evaluation demonstrates superior performance compared to existing state-of-the-art fault diagnosis methods while using less labeled data.