Fang, Zhou
Semantic and Contextual Modeling for Malicious Comment Detection with BERT-BiLSTM
Fang, Zhou, Zhang, Hanlu, He, Jacky, Qi, Zhen, Zheng, Hongye
This study aims to develop an efficient and accurate model for detecting malicious comments, addressing the increasingly severe issue of false and harmful content on social media platforms. We propose a deep learning model that combines BERT and BiLSTM. The BERT model, through pre-training, captures deep semantic features of text, while the BiLSTM network excels at processing sequential data and can further model the contextual dependencies of text. Experimental results on the Jigsaw Unintended Bias in Toxicity Classification dataset demonstrate that the BERT+BiLSTM model achieves superior performance in malicious comment detection tasks, with a precision of 0.94, recall of 0.93, and accuracy of 0.94. This surpasses other models, including standalone BERT, TextCNN, TextRNN, and traditional machine learning algorithms using TF-IDF features. These results confirm the superiority of the BERT+BiLSTM model in handling imbalanced data and capturing deep semantic features of malicious comments, providing an effective technical means for social media content moderation and online environment purification.
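As a reading aid for the reported figures, the sketch below shows how precision, recall, and accuracy are conventionally computed for a binary malicious/benign classifier. The function name `classification_metrics` and the label convention (1 = malicious) are assumptions made for illustration; the abstract does not describe the evaluation code.

```python
def classification_metrics(y_true, y_pred):
    """Return (precision, recall, accuracy) for binary labels.

    Assumed convention for this sketch: 1 = malicious, 0 = benign.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # malicious, flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # benign, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # malicious, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # benign, passed

    # Guard against empty denominators (e.g. no comment flagged at all).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return precision, recall, accuracy
```

On imbalanced data the three numbers can diverge sharply: a classifier that labels every comment benign on a set with 5% malicious comments scores 0.95 accuracy but 0.0 recall, which is why precision and recall are reported alongside accuracy.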
Breaking the Context Bottleneck on Long Time Series Forecasting
Ma, Chao, Hou, Yikai, Li, Xiang, Sun, Yinggang, Yu, Haining, Fang, Zhou, Qu, Jiaxing
Long-term time-series forecasting is essential for planning and decision-making in economics, energy, and transportation, where long foresight is required. To obtain such long foresight, models must be both efficient and effective in processing long sequences. Recent advancements have enhanced the efficiency of these models; however, the challenge of effectively leveraging longer sequences persists. This is primarily due to the tendency of these models to overfit when presented with extended inputs, necessitating the use of shorter input lengths to maintain tolerable error margins. In this work, we investigate the multiscale modeling method and propose the Logsparse Decomposable Multiscaling (LDM) framework for the efficient and effective processing of long sequences. We demonstrate that by decoupling patterns at different scales in time series, we can enhance predictability by reducing non-stationarity, improve efficiency through a compact long input representation, and simplify the architecture by providing clear task assignments. Experimental results demonstrate that LDM not only outperforms all baselines in long-term forecasting benchmarks, but also reduces both training time and memory costs.
Incorporating uncertainty quantification into travel mode choice modeling: a Bayesian neural network (BNN) approach and an uncertainty-guided active survey framework
Zheng, Shuwen, Fang, Zhou, Zhao, Liang
Existing deep learning approaches for travel mode choice modeling fail to inform modelers about their prediction uncertainty. Even when facing scenarios that are out of the distribution of training data, which implies high prediction uncertainty, these approaches still provide deterministic answers, potentially leading to misguidance. To address this limitation, this study introduces the concept of uncertainty from the field of explainable artificial intelligence into travel mode choice modeling. We propose a Bayesian neural network-based travel mode prediction model (BTMP) that quantifies the uncertainty of travel mode predictions, enabling the model itself to "know" and "tell" what it doesn't know. With BTMP, we further propose an uncertainty-guided active survey framework, which dynamically formulates survey questions representing travel mode choice scenarios with high prediction uncertainty. Through iterative collection of responses to these dynamically tailored survey questions, BTMP is iteratively trained to achieve the desired accuracy faster with fewer questions, thereby reducing survey costs. Experimental validation using synthetic datasets confirms the effectiveness of BTMP in quantifying prediction uncertainty. Furthermore, experiments, utilizing both synthetic and real-world data, demonstrate that the BTMP model, trained with the uncertainty-guided active survey framework, requires 20% to 50% fewer survey responses to match the performance of the model trained on randomly collected survey data. Overall, the proposed BTMP model and active survey framework innovatively incorporate uncertainty quantification into travel mode choice modeling, providing model users with essential insights into prediction reliability while optimizing data collection for deep learning model training in a cost-efficient manner.
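The uncertainty-guided selection step can be sketched as follows. This is a minimal illustration, not the BTMP implementation: it assumes that uncertainty is scored by the entropy of the mean predictive distribution over several posterior samples, which the abstract does not specify, and the names `predictive_entropy` and `most_uncertain` are hypothetical.

```python
import math


def predictive_entropy(prob_samples):
    """Entropy (in nats) of the mean predictive distribution.

    prob_samples: list of probability vectors over travel modes, one per
    sampled set of network weights (assumed input format for this sketch).
    """
    n = len(prob_samples)
    k = len(prob_samples[0])
    mean = [sum(sample[j] for sample in prob_samples) / n for j in range(k)]
    # Terms with zero probability contribute nothing to the entropy.
    return -sum(p * math.log(p) for p in mean if p > 0.0)


def most_uncertain(candidates):
    """Pick the scenario whose predictions have the highest entropy.

    candidates: dict mapping a scenario name to its sampled predictions.
    """
    return max(candidates, key=lambda name: predictive_entropy(candidates[name]))
```

In an active survey loop, the scenario returned by `most_uncertain` would be the next question asked, and the model would be retrained on the response before the next round of scoring.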
vMFER: Von Mises-Fisher Experience Resampling Based on Uncertainty of Gradient Directions for Policy Improvement
Zhu, Yiwen, Liu, Jinyi, Wei, Wenya, Fu, Qianyi, Hu, Yujing, Fang, Zhou, An, Bo, Hao, Jianye, Lv, Tangjie, Fan, Changjie
Reinforcement Learning (RL) is a widely employed technique in decision-making problems, encompassing two fundamental operations -- policy evaluation and policy improvement. Enhancing learning efficiency remains a key challenge in RL, with many efforts focused on using ensemble critics to boost policy evaluation efficiency. However, when using multiple critics, the actor in the policy improvement process can obtain different gradients. Previous studies have combined these gradients without considering their disagreements. Therefore, optimizing the policy improvement process is crucial to enhance learning efficiency. This study focuses on investigating the impact of gradient disagreements caused by ensemble critics on policy improvement. We introduce the concept of uncertainty of gradient directions as a means to measure the disagreement among gradients utilized in the policy improvement process. Through measuring the disagreement among gradients, we find that transitions with lower uncertainty of gradient directions are more reliable in the policy improvement process. Building on this analysis, we propose a method called von Mises-Fisher Experience Resampling (vMFER), which optimizes the policy improvement process by resampling transitions and assigning higher confidence to transitions with lower uncertainty of gradient directions. Our experiments demonstrate that vMFER significantly outperforms the benchmark and is particularly well-suited for ensemble structures in RL.
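To make the idea of gradient-direction disagreement concrete, the sketch below scores each transition by the mean resultant length of its normalized per-critic gradients and resamples transitions in proportion to that score. This is a simplified stand-in chosen for illustration: the mean resultant length is the statistic from which a von Mises-Fisher concentration is usually estimated, but the abstract does not give vMFER's exact confidence or resampling rule, and the function names here are hypothetical.

```python
import math
import random


def mean_resultant_length(gradients):
    """Length of the average unit gradient, in [0, 1].

    1.0 means all critics agree on the direction; values near 0.0 mean the
    directions largely cancel out (high uncertainty).
    """
    dim = len(gradients[0])
    total = [0.0] * dim
    for g in gradients:
        norm = math.sqrt(sum(x * x for x in g))
        if norm == 0.0:
            continue  # a zero gradient carries no direction
        for i in range(dim):
            total[i] += g[i] / norm
    return math.sqrt(sum(x * x for x in total)) / len(gradients)


def resample(transitions, per_transition_gradients, k, rng=random):
    """Draw k transitions with probability proportional to agreement.

    Note: random.choices raises ValueError if every weight is zero.
    """
    weights = [mean_resultant_length(g) for g in per_transition_gradients]
    return rng.choices(transitions, weights=weights, k=k)
```

Transitions whose critics point in opposing directions receive a weight near zero and are rarely drawn, which mirrors the abstract's statement that lower-uncertainty transitions are given higher confidence.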
Homotopy Based Reinforcement Learning with Maximum Entropy for Autonomous Air Combat
Zhu, Yiwen, Fang, Zhou, Zheng, Yuan, Wei, Wenya
Intelligent decision-making for unmanned combat aerial vehicles (UCAVs) has long been a challenging problem. Conventional search methods can hardly satisfy the real-time demands of highly dynamic air combat scenarios. Reinforcement learning (RL) methods can significantly shorten the decision time by using neural networks. However, the sparse reward problem limits their convergence speed, and an artificial prior experience reward can easily steer convergence away from the optimum of the original task, which raises great difficulties for applying RL to air combat. In this paper, we propose a homotopy-based soft actor-critic method (HSAC) which addresses these problems by following the homotopy path between the original task with sparse reward and the auxiliary task with artificial prior experience reward. The convergence and the feasibility of this method are also proved in this paper. To confirm the feasibility of our method, we first construct a detailed 3D air combat simulation environment for training RL-based methods, and we then implement our method in both the attack horizontal flight UCAV task and the self-play confrontation task. Experimental results show that our method performs better than the methods utilizing only the sparse reward or only the artificial prior experience reward. The agent trained by our method can reach a win rate of more than 98.3% in the attack horizontal flight UCAV task and an average win rate of 67.4% when confronted with the agents trained by the other two methods.
Fighting Game Commentator with Pitch and Loudness Adjustment Utilizing Highlight Cues
Xu, Junjie H., Fang, Zhou, Chen, Qihang, Ohno, Satoru, Paliyawan, Pujana
Watching video game live-streaming via platforms such as Twitch and YouTube has snowballed in popularity for a decade, and it has become a new kind of entertainment [1] with a considerable market value [2]. Game commentary can keep live-streaming audiences entertained and informed [3]. However, employing human commentators is costly, and the demand for non-human or AI commentators has therefore surfaced and increasingly gained interest from researchers [4]. As game commentary is a kind of expressive speech, synthesizing game commentary requires not only synthesizing realistic speech using text input from game scenes but also adjusting the phonetic variance that expresses the emotional information based on the context [6]. In traditional sports, some research has investigated commentaries by human commentators from the perspective of their phonetic variation [5]. In commercial games of recent decades, such as the NBA 2K series and the FIFA series, commentaries by human commentators were pre-recorded and then replayed during gameplay. Therefore, the demand for building intelligent live commentary generating systems for video game live-streaming, expected to bring higher productivity at lower costs than human commentators, has surfaced and gained much interest from researchers [4], [8]-[10]. The recent advancement on neural models accelerated the development
Incorporating planning intelligence into deep learning: A planning support tool for street network design
Fang, Zhou, Jin, Ying, Yang, Tianren
With the emergence of deep learning techniques, procedural and example-based modeling have been increasingly applied to support automatic content generation and visualization for planning decisions (Hartmann et al., 2017). Procedural modeling relies on manually designated rule sets to produce proposals. Parish and Müller (2001) made one of the first attempts to generate three-dimensional city models for visualization using procedural approaches, where a Lindenmayer system was used to grow road networks and buildings conditioned on global goals and local constraints. Given an initial and a final road point, Galin et al. (2010) developed a cost minimization function to automate path creation, considering the slope of the terrain and natural obstacles. The function was then extended to generate hierarchical road networks between towns at a regional level (Galin et al., 2011). Similar procedural principles can also be applied to allocate land use, subdivide blocks and generate buildings (see, e.g., Chen et al., 2008; Lyu et al., 2015). In comparison, example-based approaches learn from real-world cases in a preprocessing step to extract features and adopt them as templates. Hartmann et al. (2017) developed an automatic road generation tool, StreetGAN, using a generative adversarial network (GAN) to synthesize street networks in a fixed-size region that can maintain the consistency of urban layouts learned from the training data set. Similarly, Kempinska and Murcio (2019) trained Variational Autoencoders (VAEs) using images of street networks derived from OpenStreetMap to capture urban configurations using low-dimensional vectors and to generate new street networks by controlling the encoded vectors.
DeepStreet: A deep learning powered urban street network generation module
Fang, Zhou, Yang, Tianren, Jin, Ying
In countries experiencing unprecedented waves of urbanization, there is a need for rapid and high-quality urban street design. Our study presents a novel deep learning powered approach, DeepStreet (DS), for automatic street network generation that can be applied to urban street design with local characteristics. DS is driven by a Convolutional Neural Network (CNN) that enables the interpolation of streets based on the areas in the immediate vicinity. Specifically, the CNN is first trained to detect, recognize and capture the local features as well as the patterns of the existing street network sourced from OpenStreetMap. With the trained CNN, DS is able to predict street networks' future expansion patterns within the predefined region conditioned on its surrounding street networks. To test the performance of DS, we apply it to an area in and around the Eixample area in the City of Barcelona, a well-known example in the fields of urban and transport planning with iconic grid-like street networks in the centre and irregular road alignments farther afield. The results show that DS can (1) detect and self-cluster different types of complex street patterns in Barcelona; (2) predict both gridiron and irregular street and road networks. DS proves to have great potential as a novel tool for designers to efficiently design urban street networks that maintain consistency between the existing and newly generated street networks. Furthermore, the generated networks can serve as a benchmark to guide local plan-making, especially in rapidly developing cities. Keywords: Urban street network, machine learning, deep learning, Convolutional Neural Network (CNN), Generative Adversarial Network (GAN), image completion, image inpainting
GP-SLAM+: real-time 3D lidar SLAM based on improved regionalized Gaussian process map reconstruction
Ruan, Jianyuan, Li, Bo, Wang, Yinqiang, Fang, Zhou
This paper presents a 3D lidar SLAM system based on improved regionalized Gaussian process (GP) map reconstruction to provide both low-drift state estimation and mapping in real-time for robotics applications. We utilize spatial GP regression to model the environment. This tool enables us to recover surfaces including those in sparsely scanned areas and obtain uniform samples with uncertainty. Those properties facilitate robust data association and map updating in our scan-to-map registration scheme, especially when working with sparse range data. Compared with previous GP-SLAM, this work overcomes the prohibitive computational complexity of GP and redesigns the registration strategy to meet the accuracy requirements in 3D scenarios. For large-scale tasks, a two-thread framework is employed to suppress the drift further. Aerial and ground-based experiments demonstrate that our method allows robust odometry and precise mapping in real-time. It also outperforms the state-of-the-art lidar SLAM systems in our tests with light-weight sensors.