Ding, Yu
Set-Membership Filtering-Based Cooperative State Estimation for Multi-Agent Systems
Ding, Yu, Cong, Yirui, Wang, Xiangke
In this article, we focus on the cooperative state estimation problem of a multi-agent system, where each agent is equipped with absolute and relative measurements. The goal is for each agent to generate its own state estimate using only local measurement information and local communication with neighboring agents, via a Set-Membership Filter (SMF). To handle this problem, we analyze the centralized SMF framework as a benchmark for the distributed SMF and propose a finite-horizon method called the OIT-inspired centralized constrained zonotopic algorithm. Moreover, we put forward a distributed Set-Membership Filtering (SMFing) framework and develop a distributed constrained zonotopic algorithm. Finally, simulations verify our theoretical results, showing that the proposed algorithms can effectively estimate the state of each agent.
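The core object in zonotopic SMF is a zonotope, a set of the form Z = {c + G xi : ||xi||_inf <= 1} with center c and generator matrix G. As a minimal sketch of one ingredient of such filters (not the authors' OIT-inspired algorithm), the prediction step for a linear model x_{k+1} = A x_k + w_k, with noise bounded by the zonotope (0, G_w), can be written as below; all names and shapes are illustrative.

import numpy as np

def zonotope_predict(c, G, A, G_w):
    """Propagate the zonotope {c + G xi : ||xi||_inf <= 1} through
    x_{k+1} = A x_k + w_k, where w_k lies in the zonotope (0, G_w).
    The linear image of a zonotope is a zonotope, and the Minkowski
    sum with the noise set simply concatenates generator matrices."""
    c_pred = A @ c
    G_pred = np.hstack([A @ G, G_w])
    return c_pred, G_pred

# toy 2-D example: unit-box state set, small additive disturbance
c, G = np.zeros(2), np.eye(2)
A = np.array([[1.0, 0.1], [0.0, 1.0]])
c1, G1 = zonotope_predict(c, G, A, 0.05 * np.eye(2))

The measurement update with constrained zonotopes additionally imposes linear constraints derived from the measurements; that step is where the centralized benchmark and the distributed algorithm differ.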
Artificial Intelligence/Operations Research Workshop 2 Report Out
Dickerson, John, Dilkina, Bistra, Ding, Yu, Gupta, Swati, Van Hentenryck, Pascal, Koenig, Sven, Krishnan, Ramayya, Kulkarni, Radhika, Gill, Catherine, Griffin, Haley, Hunter, Maddy, Schwartz, Ann
Artificial intelligence (AI) has received significant attention in recent years, primarily due to breakthroughs in game playing, computer vision, and natural language processing that captured the imagination of the scientific community and the public at large. Many businesses, industries, and academic disciplines are now contemplating the application of AI to their own challenges. Federal governments in the US and other countries have also invested significantly in advancing AI research and created funding initiatives and programs to promote greater collaboration across multiple communities. Examples of such investments in the US include the establishment of the National AI Initiative Office, the launch of the National AI Research Resource Task Force, and, more recently, the establishment of the National AI Advisory Committee. In 2021, INFORMS and ACM SIGAI joined together with the Computing Community Consortium (CCC) to organize a series of three workshops. The objective of this workshop series is to explore ways to exploit the synergies of the AI and Operations Research (OR) communities to transform decision making.
Effective Multimodal Reinforcement Learning with Modality Alignment and Importance Enhancement
Ma, Jinming, Wu, Feng, Chen, Yingfeng, Ji, Xianpeng, Ding, Yu
Many real-world applications require an agent to make robust and deliberate decisions with multimodal information (e.g., robots with multi-sensory inputs). However, it is very challenging to train the agent via reinforcement learning (RL) due to the heterogeneity and dynamic importance of different modalities. Specifically, we observe that these issues make it difficult for conventional RL methods to learn a useful state representation in end-to-end training with multimodal information. To address this, we propose a novel multimodal RL approach that performs modality alignment and importance enhancement according to the similarity and importance of the modalities with respect to the RL task. By doing so, we are able to learn an effective state representation and consequently improve the RL training process. We test our approach on several multimodal RL domains, showing that it outperforms state-of-the-art methods in terms of learning speed and policy quality.
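As an illustrative sketch of the two ideas named above (not the paper's exact architecture), one can align per-modality embeddings toward a common anchor with a cosine-similarity loss and reweight them by learned importance scores; the module and hyperparameters below are assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

class ModalityFusion(nn.Module):
    """Align per-modality embeddings and fuse them with learned
    importance weights. Layer sizes are illustrative."""
    def __init__(self, dims, hidden=64):
        super().__init__()
        self.encoders = nn.ModuleList([nn.Linear(d, hidden) for d in dims])
        self.scorer = nn.Linear(hidden, 1)  # per-modality importance score

    def forward(self, inputs):
        z = torch.stack([enc(x) for enc, x in zip(self.encoders, inputs)], dim=1)
        # alignment: pull each modality embedding toward the mean embedding
        anchor = z.mean(dim=1, keepdim=True)
        align_loss = (1 - F.cosine_similarity(z, anchor, dim=-1)).mean()
        # importance enhancement: softmax-normalized per-modality weights
        w = torch.softmax(self.scorer(z), dim=1)   # (batch, n_modalities, 1)
        state = (w * z).sum(dim=1)                 # fused state representation
        return state, align_loss

The align_loss would be added to the RL objective so the encoders produce comparable embeddings, while the learned weights let the fused state emphasize whichever modality currently matters.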
Towards Futuristic Autonomous Experimentation--A Surprise-Reacting Sequential Experiment Policy
Ahmed, Imtiaz, Bukkapatnam, Satish, Botcha, Bhaskar, Ding, Yu
An autonomous experimentation platform in manufacturing is supposedly capable of conducting a sequential search for suitable manufacturing conditions for advanced materials by itself, or even of discovering new materials, with minimal human intervention. The core of the intelligent control of such platforms is the policy directing the sequential experiments, namely, deciding where to conduct the next experiment based on what has been done thus far. Such a policy inevitably trades off exploitation against exploration, and current practice operates under the Bayesian optimization framework using the expected improvement criterion or its variants. We discuss whether it is beneficial to make this trade-off by measuring the element and degree of surprise associated with the immediate past observation. We devise a surprise-reacting policy using two existing surprise metrics, known as the Shannon surprise and the Bayesian surprise. Our analysis shows that the surprise-reacting policy appears better suited for quickly characterizing the overall landscape of a response surface or a design space under resource constraints. We argue that such capability is much needed for futuristic autonomous experimentation platforms. We do not claim to have a fully autonomous experimentation platform, but believe that our current effort sheds new light or provides a different viewing angle as researchers race to elevate the autonomy of various primitive autonomous experimentation systems.
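Both surprise metrics have standard forms. As a hedged sketch for a Gaussian predictive model (illustrative, not the paper's implementation): Shannon surprise is the negative log predictive density of the newest observation, and Bayesian surprise is the KL divergence between the posterior and the prior after updating on it.

import numpy as np
from scipy.stats import norm

def shannon_surprise(y, mu, sigma):
    """Negative log predictive density of observation y under the
    current Gaussian predictive distribution N(mu, sigma^2)."""
    return -norm.logpdf(y, loc=mu, scale=sigma)

def bayesian_surprise(mu0, var0, y, noise_var):
    """KL(posterior || prior) after a conjugate Gaussian update of a
    mean with known observation noise; closed form for Gaussians."""
    var1 = 1.0 / (1.0 / var0 + 1.0 / noise_var)   # posterior variance
    mu1 = var1 * (mu0 / var0 + y / noise_var)     # posterior mean
    return 0.5 * (var1 / var0 + (mu1 - mu0) ** 2 / var0
                  - 1.0 + np.log(var0 / var1))

A surprise-reacting policy would then, for example, shift toward exploration when the surprise of the latest observation exceeds a threshold; the precise decision rule is the subject of the paper.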
InterMulti: Multi-view Multimodal Interactions with Text-dominated Hierarchical High-order Fusion for Emotion Analysis
Qiu, Feng, Kong, Wanzeng, Ding, Yu
Humans are sophisticated at reading interlocutors' emotions from multimodal signals, such as speech contents, voice tones, and facial expressions. However, machines might struggle to understand various emotions due to the difficulty of effectively decoding emotions from the complex interactions between multimodal signals. In this paper, we propose a multimodal emotion analysis framework, InterMulti, to capture complex multimodal interactions from different views and identify emotions from multimodal signals. Our proposed framework decomposes signals of different modalities into three kinds of multimodal interaction representations: a modality-full interaction representation, a modality-shared interaction representation, and three modality-specific interaction representations. Additionally, to balance the contribution of different modalities and learn a more informative latent interaction representation, we develop a novel Text-dominated Hierarchical High-order Fusion (THHF) module. The THHF module reasonably integrates the above three kinds of representations into a comprehensive multimodal interaction representation. Extensive experimental results on widely used datasets, i.e., MOSEI, MOSI, and IEMOCAP, demonstrate that our method outperforms the state-of-the-art.
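As a hedged sketch of the decomposition described above (layer shapes and names are assumptions, not the paper's architecture), the three kinds of interaction representations for text, audio, and vision streams could be produced as follows.

import torch
import torch.nn as nn

class InteractionDecomposition(nn.Module):
    """Decompose three modality streams into a modality-full, a
    modality-shared, and three modality-specific representations."""
    def __init__(self, d_t, d_a, d_v, h=128):
        super().__init__()
        self.full = nn.Linear(d_t + d_a + d_v, h)   # joint interaction
        self.shared = nn.ModuleList(
            [nn.Linear(d_t, h), nn.Linear(d_a, h), nn.Linear(d_v, h)])
        self.specific = nn.ModuleList(
            [nn.Linear(d_t, h), nn.Linear(d_a, h), nn.Linear(d_v, h)])

    def forward(self, t, a, v):
        r_full = torch.tanh(self.full(torch.cat([t, a, v], dim=-1)))
        # shared: project every modality into one common space, then average
        r_shared = sum(f(x) for f, x in zip(self.shared, (t, a, v))) / 3
        r_spec = [torch.tanh(f(x)) for f, x in zip(self.specific, (t, a, v))]
        return r_full, r_shared, r_spec

A fusion module such as THHF would then combine these outputs, giving the text representation the dominant role.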
TCFimt: Temporal Counterfactual Forecasting from Individual Multiple Treatment Perspective
Xi, Pengfei, Wang, Guifeng, Hu, Zhipeng, Xiong, Yu, Gong, Mingming, Huang, Wei, Wu, Runze, Ding, Yu, Lv, Tangjie, Fan, Changjie, Feng, Xiangnan
Determining the causal effects of temporal multi-interventions assists decision-making. Restricted by time-varying bias, selection bias, and interactions among multiple interventions, work on disentangling and estimating multiple treatment effects from individual temporal data remains rare. To tackle these challenges, we propose a comprehensive framework of temporal counterfactual forecasting from an individual multiple-treatment perspective (TCFimt). TCFimt constructs adversarial tasks in a seq2seq framework to alleviate selection and time-varying bias, and designs a contrastive learning-based block to decouple a mixed treatment effect into separated main treatment effects and causal interactions, which further improves estimation accuracy. In experiments on two real-world datasets from distinct fields, the proposed method shows better performance than state-of-the-art methods in predicting future outcomes under specific treatments and in choosing the optimal treatment type and timing.
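As a hedged sketch of the contrastive idea (an InfoNCE-style objective; the paper's block may differ), sequences that received the same treatment can be treated as positive pairs so that treatment-effect representations separate by treatment type.

import torch
import torch.nn.functional as F

def treatment_contrastive_loss(z, treatment_ids, temperature=0.1):
    """Pull together representations of sequences with the same
    treatment and push apart the rest. z: (batch, dim) embeddings,
    treatment_ids: (batch,) integer treatment labels."""
    z = F.normalize(z, dim=-1)
    sim = z @ z.t() / temperature                     # pairwise similarities
    self_mask = torch.eye(len(z), dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(self_mask, float("-inf"))   # drop self-pairs
    log_prob = sim - sim.logsumexp(dim=1, keepdim=True)
    pos = (treatment_ids.unsqueeze(0) == treatment_ids.unsqueeze(1)) & ~self_mask
    # mean log-probability over each anchor's positive pairs
    loss = -log_prob.masked_fill(~pos, 0.0).sum(dim=1) / pos.sum(dim=1).clamp(min=1)
    return loss.mean()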
EffMulti: Efficiently Modeling Complex Multimodal Interactions for Emotion Analysis
Qiu, Feng, Xie, Chengyang, Ding, Yu, Kong, Wanzeng
Humans are skilled at reading an interlocutor's emotion from multimodal signals, including spoken words, simultaneous speech, and facial expressions. It is still a challenge to effectively decode emotions from the complex interactions of multimodal signals. In this paper, we design three kinds of multimodal latent representations to refine the emotion analysis process and capture complex multimodal interactions from different views, including an intact three-modal integrating representation, a modality-shared representation, and three modality-individual representations. Then, a modality-semantic hierarchical fusion is proposed to reasonably incorporate these representations into a comprehensive interaction representation. The experimental results demonstrate that our EffMulti outperforms the state-of-the-art methods. The compelling performance benefits from its well-designed framework, with ease of implementation, lower computing complexity, and fewer trainable parameters.
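As a hedged sketch of the hierarchical fusion step (dimensions and ordering are assumptions), the three kinds of latent representations could be merged progressively, from the modality-individual representations up to the intact three-modal representation.

import torch
import torch.nn as nn

class HierarchicalFusion(nn.Module):
    """Fuse modality-individual, modality-shared, and intact
    three-modal representations level by level."""
    def __init__(self, h=128):
        super().__init__()
        self.level1 = nn.Linear(3 * h, h)  # merge the three individual reps
        self.level2 = nn.Linear(2 * h, h)  # add the shared rep
        self.level3 = nn.Linear(2 * h, h)  # add the intact three-modal rep

    def forward(self, r_individual, r_shared, r_full):
        u = torch.relu(self.level1(torch.cat(r_individual, dim=-1)))
        u = torch.relu(self.level2(torch.cat([u, r_shared], dim=-1)))
        return self.level3(torch.cat([u, r_full], dim=-1))

Fusing in stages rather than concatenating everything at once keeps the parameter count low, which is consistent with the efficiency claim above.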
Domain Generalization by Learning and Removing Domain-specific Features
Ding, Yu, Wang, Lei, Liang, Bin, Liang, Shuming, Wang, Yang, Chen, Fang
Deep Neural Networks (DNNs) suffer from domain shift when the test dataset follows a distribution different from the training dataset. Domain generalization aims to tackle this issue by learning a model that can generalize to unseen domains. In this paper, we propose a new approach that explicitly removes domain-specific features for domain generalization. Following this approach, we propose a novel framework called Learning and Removing Domain-specific features for Generalization (LRDG) that learns a domain-invariant model by tactically removing domain-specific features from the input images. Specifically, we design one classifier for each source domain to effectively learn that domain's specific features. We then develop an encoder-decoder network to map each input image into a new image space where the learned domain-specific features are removed. With the images output by the encoder-decoder network, another classifier is designed to learn the domain-invariant features for image classification. Extensive experiments demonstrate that our framework achieves superior performance compared with state-of-the-art methods.
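As a hedged sketch of the removal objective (the paper's exact losses may differ), the encoder-decoder can be trained so that the frozen domain-specific classifier becomes maximally uncertain on the mapped image while the domain-invariant classifier still predicts the true class.

import torch
import torch.nn.functional as F

def lrdg_style_losses(domain_logits, task_logits, labels):
    """domain_logits: output of a frozen domain-specific classifier on
    the mapped image; task_logits: output of the invariant classifier."""
    n = domain_logits.size(-1)
    uniform = torch.full_like(domain_logits, 1.0 / n)
    # push domain-specific predictions toward uniform: features removed
    removal = F.kl_div(F.log_softmax(domain_logits, dim=-1),
                       uniform, reduction="batchmean")
    # keep the mapped image classifiable by the invariant classifier
    task = F.cross_entropy(task_logits, labels)
    return removal, task

The total objective would weight these two terms; the domain-specific classifiers themselves are trained beforehand, one per source domain.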
FlowFace: Semantic Flow-guided Shape-aware Face Swapping
Zeng, Hao, Zhang, Wei, Fan, Changjie, Lv, Tangjie, Wang, Suzhen, Zhang, Zhimeng, Ma, Bowen, Li, Lincheng, Ding, Yu, Yu, Xin
In this work, we propose a semantic flow-guided two-stage framework for shape-aware face swapping, namely FlowFace. Unlike most previous methods that focus on transferring the source inner facial features but neglect facial contours, our FlowFace can transfer both of them to a target face, thus leading to more realistic face swapping. Concretely, our FlowFace consists of a face reshaping network and a face swapping network. The face reshaping network addresses the shape outline differences between the source and target faces. It first estimates a semantic flow (i.e., face shape differences) between the source and the target face, and then explicitly warps the target face shape with the estimated semantic flow. After reshaping, the face swapping network generates inner facial features that exhibit the identity of the source face. We employ a pre-trained face masked autoencoder (MAE) to extract facial features from both the source face and the target face. In contrast to previous methods that use identity embedding to preserve identity information, the features extracted by our encoder can better capture facial appearances and identity information. Then, we develop a cross-attention fusion module to adaptively fuse inner facial features from the source face with the target facial attributes, thus leading to better identity preservation. Extensive quantitative and qualitative experiments on in-the-wild faces demonstrate that our FlowFace outperforms the state-of-the-art significantly.
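As a hedged sketch of the cross-attention fusion step (built on the standard nn.MultiheadAttention; the paper's module may differ in detail), target-face tokens attend to source-face tokens so that source identity features are injected while target attributes are preserved.

import torch
import torch.nn as nn

class CrossAttentionFusion(nn.Module):
    """Queries come from the target face, keys and values from the
    source face; a residual connection keeps target attributes."""
    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, target_tokens, source_tokens):
        fused, _ = self.attn(target_tokens, source_tokens, source_tokens)
        return self.norm(target_tokens + fused)

Here target_tokens and source_tokens would be the patch features produced by the pre-trained MAE encoder mentioned above.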
Graph Regularized Autoencoder and its Application in Unsupervised Anomaly Detection
Ahmed, Imtiaz, Galoppo, Travis, Hu, Xia, Ding, Yu
Dimensionality reduction is a crucial first step for many unsupervised learning tasks, including anomaly detection. The autoencoder is a popular mechanism for accomplishing dimensionality reduction. In order to make dimensionality reduction effective for high-dimensional data embedding a nonlinear low-dimensional manifold, it is understood that some sort of geodesic distance metric should be used to discriminate the data samples. Inspired by the success of neighborhood-aware, shortest-path-based geodesic approximators such as ISOMAP, in this work we propose to use a minimum spanning tree (MST), a graph-based algorithm, to approximate the local neighborhood structure and generate structure-preserving distances among data points. We use this MST-based distance metric to replace the Euclidean distance metric in the embedding function of autoencoders and develop a new graph regularized autoencoder which, over 20 benchmark anomaly detection datasets, outperforms the plain autoencoder using no regularizer as well as autoencoders using a Euclidean-based regularizer. We furthermore incorporate the MST regularizer into two generative adversarial networks and find that using the MST regularizer substantially improves anomaly detection performance for both.
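As a hedged sketch of the MST-based metric (the regularizer details in the paper may differ), geodesic distances can be approximated by path lengths through the minimum spanning tree of the complete pairwise Euclidean graph.

import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree, shortest_path
from scipy.spatial.distance import pdist, squareform

def mst_distances(X):
    """Structure-preserving distances: build the MST of the complete
    Euclidean graph over the samples, then measure the path length
    between every pair of points through the tree."""
    D = squareform(pdist(X))        # dense pairwise Euclidean distances
    T = minimum_spanning_tree(D)    # sparse MST with n-1 edges
    return shortest_path(T, directed=False)

# regularizer idea: penalize mismatch between latent-space distances
# and mst_distances(X) in the autoencoder loss, in place of Euclidean.

Replacing the Euclidean metric with these tree distances is what lets the embedding respect the manifold's local neighborhood structure.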