Overview
Machine Learning and Consumer Data
Chang, Hannah H., Mukherjee, Anirban
The digital revolution has led to the digitization of human behavior, creating unprecedented opportunities to understand observable actions on an unmatched scale. Emerging phenomena such as crowdfunding and crowdsourcing have further illuminated consumer behavior while also introducing new behavioral patterns. However, the sheer volume and complexity of this data present significant challenges for marketing researchers and practitioners. Traditional methods used to analyze consumer data fall short in handling the breadth, precision, and scale of emerging data sources. To address this, computational methods have been developed to manage the "big data" associated with consumer behavior, which typically includes structured data, textual data, audial data, and visual data. These methods, particularly machine learning, allow for effective parsing and processing of multi-faceted data. Given these recent developments, this review article seeks to familiarize researchers and practitioners with new data sources and analysis techniques for studying consumer behavior at scale. It serves as an introduction to the application of computational social science in understanding and leveraging publicly available consumer data.
Semi-supervised Object Detection: A Survey on Recent Research and Progress
Wang, Yanyang, Liu, Zhaoxiang, Lian, Shiguo
In recent years, deep learning technology has been maturely applied in the field of object detection, and most algorithms tend to be supervised learning. However, a large amount of labeled data requires high costs of human resources, which brings about low efficiency and limitations. Semi-supervised object detection (SSOD) has been paid more and more attentions due to its high research value and practicability. It is designed to learn information by using small amounts of labeled data and large amounts of unlabeled data. In this paper, we present a comprehensive and up-to-date survey on the SSOD approaches from five aspects. We first briefly introduce several ways of data augmentation. Then, we dive the mainstream semi-supervised strategies into pseudo labels, consistent regularization, graph based and transfer learning based methods, and introduce some methods in challenging settings. We further present widely-used loss functions, and then we outline the common benchmark datasets and compare the accuracy among different representative approaches. Finally, we conclude this paper and present some promising research directions for the future. Our survey aims to provide researchers and practitioners new to the field as well as more advanced readers with a solid understanding of the main approaches developed over the past few years.
A Survey on Graph Neural Network Acceleration: Algorithms, Systems, and Customized Hardware
Zhang, Shichang, Sohrabizadeh, Atefeh, Wan, Cheng, Huang, Zijie, Hu, Ziniu, Wang, Yewen, Yingyan, null, Lin, null, Cong, Jason, Sun, Yizhou
Many real-world data can be represented as graphs with nodes denoting a collection of entities and edges denoting their pairwise relationships, such as individuals in social networks, financial transactions between firms and banks, atoms and bonds in molecules, and vehicles in transportation systems. Graph neural networks (GNNs) [45, 71, 125] have recently become the most widely used graph machine learning (ML) model for learning knowledge and making predictions on graph data. GNNs have achieved state-of-the-art performance in many graph ML applications. They are used, for example, in recommendations on social graphs [89, 136, 165], fraud account detection on financial graphs [31], drug discoveries from molecule graphs [64], traffic forecasting on transportation graphs [65], and so on. The superior performance of GNNs on graphs is mainly due to their ability to combine the entity information, represented as the node features, and the relationships, represented as the graph structure.
Large Sequence Models for Sequential Decision-Making: A Survey
Wen, Muning, Lin, Runji, Wang, Hanjing, Yang, Yaodong, Wen, Ying, Mai, Luo, Wang, Jun, Zhang, Haifeng, Zhang, Weinan
Transformer architectures have facilitated the development of large-scale and general-purpose sequence models for prediction tasks in natural language processing and computer vision, e.g., GPT-3 and Swin Transformer. Although originally designed for prediction problems, it is natural to inquire about their suitability for sequential decision-making and reinforcement learning problems, which are typically beset by long-standing issues involving sample efficiency, credit assignment, and partial observability. In recent years, sequence models, especially the Transformer, have attracted increasing interest in the RL communities, spawning numerous approaches with notable effectiveness and generalizability. This survey presents a comprehensive overview of recent works aimed at solving sequential decision-making tasks with sequence models such as the Transformer, by discussing the connection between sequential decision-making and sequence modeling, and categorizing them based on the way they utilize the Transformer. Moreover, this paper puts forth various potential avenues for future research intending to improve the effectiveness of large sequence models for sequential decision-making, encompassing theoretical foundations, network architectures, algorithms, and efficient training systems. As this article has been accepted by the Frontiers of Computer Science, here is an early version, and the most up-to-date version can be found at https://journal.hep.com.cn/fcs/EN/10.1007/s11704-023-2689-5
Split Learning in 6G Edge Networks
Lin, Zheng, Qu, Guanqiao, Chen, Xianhao, Huang, Kaibin
With the proliferation of distributed edge computing resources, the 6G mobile network will evolve into a network for connected intelligence. Along this line, the proposal to incorporate federated learning into the mobile edge has gained considerable interest in recent years. However, the deployment of federated learning faces substantial challenges as massive resource-limited IoT devices can hardly support on-device model training. This leads to the emergence of split learning (SL) which enables servers to handle the major training workload while still enhancing data privacy. In this article, we offer a brief overview of key advancements in SL and articulate its seamless integration with wireless edge networks. We begin by illustrating the tailored 6G architecture to support edge SL. Then, we examine the critical design issues for edge SL, including innovative resource-efficient learning frameworks and resource management strategies under a single edge server. Additionally, we expand the scope to multi-edge scenarios, exploring multi-edge collaboration and mobility management from a networking perspective. Finally, we discuss open problems for edge SL, including convergence analysis, asynchronous SL and U-shaped SL.
A Survey on Neural-symbolic Learning Systems
Yu, Dongran, Yang, Bo, Liu, Dayou, Wang, Hui, Pan, Shirui
In recent years, neural systems have demonstrated highly effective learning ability and superior perception intelligence. However, they have been found to lack effective reasoning and cognitive ability. On the other hand, symbolic systems exhibit exceptional cognitive intelligence but suffer from poor learning capabilities when compared to neural systems. Recognizing the advantages and disadvantages of both methodologies, an ideal solution emerges: combining neural systems and symbolic systems to create neural-symbolic learning systems that possess powerful perception and cognition. The purpose of this paper is to survey the advancements in neural-symbolic learning systems from four distinct perspectives: challenges, methods, applications, and future directions. By doing so, this research aims to propel this emerging field forward, offering researchers a comprehensive and holistic overview. This overview will not only highlight the current state-of-the-art but also identify promising avenues for future research.
A Survey on Multimodal Large Language Models
Yin, Shukang, Fu, Chaoyou, Zhao, Sirui, Li, Ke, Sun, Xing, Xu, Tong, Chen, Enhong
Multimodal Large Language Model (MLLM) recently has been a new rising research hotspot, which uses powerful Large Language Models (LLMs) as a brain to perform multimodal tasks. The surprising emergent capabilities of MLLM, such as writing stories based on images and OCR-free math reasoning, are rare in traditional methods, suggesting a potential path to artificial general intelligence. In this paper, we aim to trace and summarize the recent progress of MLLM. First of all, we present the formulation of MLLM and delineate its related concepts. Then, we discuss the key techniques and applications, including Multimodal Instruction Tuning (M-IT), Multimodal In-Context Learning (M-ICL), Multimodal Chain of Thought (M-CoT), and LLM-Aided Visual Reasoning (LAVR). Finally, we discuss existing challenges and point out promising research directions. In light of the fact that the era of MLLM has only just begun, we will keep updating this survey and hope it can inspire more research. An associated GitHub link collecting the latest papers is available at https://github.com/BradyFU/Awesome-Multimodal-Large-Language-Models.
Efficient Online Processing with Deep Neural Networks
The capabilities and adoption of deep neural networks (DNNs) grow at an exhilarating pace: Vision models accurately classify human actions in videos and identify cancerous tissue in medical scans as precisely than human experts; large language models answer wide-ranging questions, generate code, and write prose, becoming the topic of everyday dinner-table conversations. Even though their uses are exhilarating, the continually increasing model sizes and computational complexities have a dark side. The economic cost and negative environmental externalities of training and serving models is in evident disharmony with financial viability and climate action goals. Instead of pursuing yet another increase in predictive performance, this dissertation is dedicated to the improvement of neural network efficiency. Specifically, a core contribution addresses the efficiency aspects during online inference. Here, the concept of Continual Inference Networks (CINs) is proposed and explored across four publications. CINs extend prior state-of-the-art methods developed for offline processing of spatio-temporal data and reuse their pre-trained weights, improving their online processing efficiency by an order of magnitude. These advances are attained through a bottom-up computational reorganization and judicious architectural modifications. The benefit to online inference is demonstrated by reformulating several widely used network architectures into CINs, including 3D CNNs, ST-GCNs, and Transformer Encoders. An orthogonal contribution tackles the concurrent adaptation and computational acceleration of a large source model into multiple lightweight derived models. Drawing on fusible adapter networks and structured pruning, Structured Pruning Adapters achieve superior predictive accuracy under aggressive pruning using significantly fewer learned weights compared to fine-tuning with pruning.
BogieCopter: A Multi-Modal Aerial-Ground Vehicle for Long-Endurance Inspection Applications
Abstract-- The use of Micro Aerial Vehicles (MAVs) for inspection and surveillance missions has proved to be extremely useful, however, their usability is negatively impacted by the large power requirements and the limited operating time. This work describes the design and development of a novel hybrid aerial-ground vehicle, enabling multi-modal mobility and long operating time, suitable for long-endurance inspection and monitoring applications. The vehicle's performance is evaluated To the best of our knowledge, this is the most complete review of research available for multimodal During the last decade, Micro Aerial Vehicles (MAVs) aerial-ground vehicles. This work aims at addressing have received a great deal of attention, both academically and some of the limitations in prior designs, by exploiting the commercially, due to their ability to quickly reach areas of strengths of both active and passive solutions offered for the interest, overcome obstacles, and provide an elevated view of ground actuation mechanism, while taking into consideration the environment. The applications in which MAVs are used a generous payload required for most industrial applications.