Not enough data to create a plot.
Try a different view from the menu above.
Wang, Li
A score-based particle method for homogeneous Landau equation
Huang, Yan, Wang, Li
The Landau equation stands as one of the fundamental kinetic equations, modeling the evolution of charged particles undergoing Coulomb interaction [27]. It is particularly useful for plasmas where collision effects become non-negligible. Computing the Landau equation presents numerous challenges inherent in kinetic equations, including high dimensionality, multiple scales, and strong nonlinearity and non-locality. On the other hand, deep learning has progressively transformed the numerical computation of partial differential equations by leveraging neural networks' ability to approximate complex functions and the powerful optimization toolbox. However, straightforward application of deep learning to compute PDEs often encounters training difficulties and leads to a loss of physical fidelity. In this paper, we propose a score-based particle method that elegantly combines learning with structure-preserving particle methods. This method inherits the favorable conservative properties of deterministic particle methods while relying only on light training to dynamically obtain the score function over time. The learning component replaces the expensive density estimation in previous particle methods, drastically accelerating computation.
HMAMP: Hypervolume-Driven Multi-Objective Antimicrobial Peptides Design
Wang, Li, Li, Yiping, Fu, Xiangzheng, Ye, Xiucai, Shi, Junfeng, Yen, Gary G., Zeng, Xiangxiang
Antimicrobial peptides (AMPs) have exhibited unprecedented potential as biomaterials in combating multidrug-resistant bacteria. Despite the increasing adoption of artificial intelligence for novel AMP design, challenges pertaining to conflicting attributes such as activity, hemolysis, and toxicity have significantly impeded the progress of researchers. This paper introduces a paradigm shift by considering multiple attributes in AMP design. Presented herein is a novel approach termed Hypervolume-driven Multi-objective Antimicrobial Peptide Design (HMAMP), which prioritizes the simultaneous optimization of multiple attributes of AMPs. By synergizing reinforcement learning and a gradient descent algorithm rooted in the hypervolume maximization concept, HMAMP effectively expands exploration space and mitigates the issue of pattern collapse. This method generates a wide array of prospective AMP candidates that strike a balance among diverse attributes. Furthermore, we pinpoint knee points along the Pareto front of these candidate AMPs. Empirical results across five benchmark models substantiate that HMAMP-designed AMPs exhibit competitive performance and heightened diversity. A detailed analysis of the helical structures and molecular dynamics simulations for ten potential candidate AMPs validates the superiority of HMAMP in the realm of multi-objective AMP design. The ability of HMAMP to systematically craft AMPs considering multiple attributes marks a pioneering milestone, establishing a universal computational framework for the multi-objective design of AMPs.
LongVQ: Long Sequence Modeling with Vector Quantization on Structured Memory
Liu, Zicheng, Wang, Li, Li, Siyuan, Wang, Zedong, Lin, Haitao, Li, Stan Z.
Transformer models have been successful in various sequence processing tasks, but the self-attention mechanism's computational cost limits its practicality for long sequences. Although there are existing attention variants that improve computational efficiency, they have a limited ability to abstract global information effectively based on their hand-crafted mixing strategies. On the other hand, state-space models (SSMs) are tailored for long sequences but cannot capture complicated local information. Therefore, the combination of them as a unified token mixer is a trend in recent long-sequence models. However, the linearized attention degrades performance significantly even when equipped with SSMs. To address the issue, we propose a new method called LongVQ. LongVQ uses the vector quantization (VQ) technique to compress the global abstraction as a length-fixed codebook, enabling the linear-time computation of the attention matrix. This technique effectively maintains dynamic global and local patterns, which helps to complement the lack of long-range dependency issues. Our experiments on the Long Range Arena benchmark, autoregressive language modeling, and image and speech classification demonstrate the effectiveness of LongVQ. Our model achieves significant improvements over other sequence models, including variants of Transformers, Convolutions, and recent State Space Models.
Event Grounded Criminal Court View Generation with Cooperative (Large) Language Models
Yue, Linan, Liu, Qi, Zhao, Lili, Wang, Li, Gao, Weibo, An, Yanqing
With the development of legal intelligence, Criminal Court View Generation has attracted much attention as a crucial task of legal intelligence, which aims to generate concise and coherent texts that summarize case facts and provide explanations for verdicts. Existing researches explore the key information in case facts to yield the court views. Most of them employ a coarse-grained approach that partitions the facts into broad segments (e.g., verdict-related sentences) to make predictions. However, this approach fails to capture the complex details present in the case facts, such as various criminal elements and legal events. To this end, in this paper, we propose an Event Grounded Generation (EGG) method for criminal court view generation with cooperative (Large) Language Models, which introduces the fine-grained event information into the generation. Specifically, we first design a LLMs-based extraction method that can extract events in case facts without massive annotated events. Then, we incorporate the extracted events into court view generation by merging case facts and events. Besides, considering the computational burden posed by the use of LLMs in the extraction phase of EGG, we propose a LLMs-free EGG method that can eliminate the requirement for event extraction using LLMs in the inference phase. Extensive experimental results on a real-world dataset clearly validate the effectiveness of our proposed method.
Towards Faithful Explanations: Boosting Rationalization with Shortcuts Discovery
Yue, Linan, Liu, Qi, Du, Yichao, Wang, Li, Gao, Weibo, An, Yanqing
The remarkable success in neural networks provokes the selective rationalization. It explains the prediction results by identifying a small subset of the inputs sufficient to support them. Since existing methods still suffer from adopting the shortcuts in data to compose rationales and limited large-scale annotated rationales by human, in this paper, we propose a Shortcuts-fused Selective Rationalization (SSR) method, which boosts the rationalization by discovering and exploiting potential shortcuts. Specifically, SSR first designs a shortcuts discovery approach to detect several potential shortcuts. Then, by introducing the identified shortcuts, we propose two strategies to mitigate the problem of utilizing shortcuts to compose rationales. Finally, we develop two data augmentations methods to close the gap in the number of annotated rationales. Extensive experimental results on real-world datasets clearly validate the effectiveness of our proposed method.
Causal Disentanglement for Regulating Social Influence Bias in Social Recommendation
Wang, Li, Xu, Min, Zhang, Quangui, Shi, Yunxiao, Wu, Qiang
Social recommendation systems face the problem of social influence bias, which can lead to an overemphasis on recommending items that friends have interacted with. Addressing this problem is crucial, and existing methods often rely on techniques such as weight adjustment or leveraging unbiased data to eliminate this bias. However, we argue that not all biases are detrimental, i.e., some items recommended by friends may align with the user's interests. Blindly eliminating such biases could undermine these positive effects, potentially diminishing recommendation accuracy. In this paper, we propose a Causal Disentanglement-based framework for Regulating Social influence Bias in social recommendation, named CDRSB, to improve recommendation performance. From the perspective of causal inference, we find that the user social network could be regarded as a confounder between the user and item embeddings (treatment) and ratings (outcome). Due to the presence of this social network confounder, two paths exist from user and item embeddings to ratings: a non-causal social influence path and a causal interest path. Building upon this insight, we propose a disentangled encoder that focuses on disentangling user and item embeddings into interest and social influence embeddings. Mutual information-based objectives are designed to enhance the distinctiveness of these disentangled embeddings, eliminating redundant information. Additionally, a regulatory decoder that employs a weight calculation module to dynamically learn the weights of social influence embeddings for effectively regulating social influence bias has been designed. Experimental results on four large-scale real-world datasets Ciao, Epinions, Dianping, and Douban book demonstrate the effectiveness of CDRSB compared to state-of-the-art baselines.
A Privacy-Preserving Framework with Multi-Modal Data for Cross-Domain Recommendation
Wang, Li, Sang, Lei, Zhang, Quangui, Wu, Qiang, Xu, Min
Cross-domain recommendation (CDR) aims to enhance recommendation accuracy in a target domain with sparse data by leveraging rich information in a source domain, thereby addressing the data-sparsity problem. Some existing CDR methods highlight the advantages of extracting domain-common and domain-specific features to learn comprehensive user and item representations. However, these methods can't effectively disentangle these components as they often rely on simple user-item historical interaction information (such as ratings, clicks, and browsing), neglecting the rich multi-modal features. Additionally, they don't protect user-sensitive data from potential leakage during knowledge transfer between domains. To address these challenges, we propose a Privacy-Preserving Framework with Multi-Modal Data for Cross-Domain Recommendation, called P2M2-CDR. Specifically, we first design a multi-modal disentangled encoder that utilizes multi-modal information to disentangle more informative domain-common and domain-specific embeddings. Furthermore, we introduce a privacy-preserving decoder to mitigate user privacy leakage during knowledge transfer. Local differential privacy (LDP) is utilized to obfuscate the disentangled embeddings before inter-domain exchange, thereby enhancing privacy protection. To ensure both consistency and differentiation among these obfuscated disentangled embeddings, we incorporate contrastive learning-based domain-inter and domain-intra losses. Extensive Experiments conducted on four real-world datasets demonstrate that P2M2-CDR outperforms other state-of-the-art single-domain and cross-domain baselines.
Emergency Caching: Coded Caching-based Reliable Map Transmission in Emergency Networks
Tian, Zeyu, Xu, Lianming, Li, Liang, Wang, Li, Fei, Aiguo
Many rescue missions demand effective perception and real-time decision making, which highly rely on effective data collection and processing. In this study, we propose a three-layer architecture of emergency caching networks focusing on data collection and reliable transmission, by leveraging efficient perception and edge caching technologies. Based on this architecture, we propose a disaster map collection framework that integrates coded caching technologies. Our framework strategically caches coded fragments of maps across unmanned aerial vehicles (UAVs), fostering collaborative uploading for augmented transmission reliability. Additionally, we establish a comprehensive probability model to assess the effective recovery area of disaster maps. Towards the goal of utility maximization, we propose a deep reinforcement learning (DRL) based algorithm that jointly makes decisions about cooperative UAVs selection, bandwidth allocation and coded caching parameter adjustment, accommodating the real-time map updates in a dynamic disaster situation. Our proposed scheme is more effective than the non-coding caching scheme, as validated by simulation.
Emergency Computing: An Adaptive Collaborative Inference Method Based on Hierarchical Reinforcement Learning
Fu, Weiqi, Xu, Lianming, Wu, Xin, Wang, Li, Fei, Aiguo
In achieving effective emergency response, the timely acquisition of environmental information, seamless command data transmission, and prompt decision-making are crucial. This necessitates the establishment of a resilient emergency communication dedicated network, capable of providing communication and sensing services even in the absence of basic infrastructure. In this paper, we propose an Emergency Network with Sensing, Communication, Computation, Caching, and Intelligence (E-SC3I). The framework incorporates mechanisms for emergency computing, caching, integrated communication and sensing, and intelligence empowerment. E-SC3I ensures rapid access to a large user base, reliable data transmission over unstable links, and dynamic network deployment in a changing environment. However, these advantages come at the cost of significant computation overhead. Therefore, we specifically concentrate on emergency computing and propose an adaptive collaborative inference method (ACIM) based on hierarchical reinforcement learning. Experimental results demonstrate our method's ability to achieve rapid inference of AI models with constrained computational and communication resources.
NoisyNN: Exploring the Influence of Information Entropy Change in Learning Systems
Yu, Xiaowei, Huang, Zhe, Xue, Yao, Zhang, Lu, Wang, Li, Liu, Tianming, Zhu, Dajiang
We explore the impact of entropy change in deep learning systems via noise injection at different levels, i.e., the latent space and input image. The series of models that employ our methodology are collectively known as Noisy Neural Networks (NoisyNN), with examples such as NoisyViT and NoisyCNN. Noise is conventionally viewed as a harmful perturbation in various deep learning architectures, such as convolutional neural networks (CNNs) and vision transformers (ViTs), as well as different learning tasks like image classification and transfer learning. However, this work shows noise can be an effective way to change the entropy of the learning system. We demonstrate that specific noise can boost the performance of various deep architectures under certain conditions. We theoretically prove the enhancement gained from positive noise by reducing the task complexity defined by information entropy and experimentally show the significant performance gain in large image datasets, such as the ImageNet. Herein, we use the information entropy to define the complexity of the task. We categorize the noise into two types, positive noise (PN) and harmful noise (HN), based on whether the noise can help reduce the complexity of the task. Extensive experiments of CNNs and ViTs have shown performance improvements by proactively injecting positive noise, where we achieved an unprecedented top 1 accuracy of over 95$\%$ on ImageNet. Both theoretical analysis and empirical evidence have confirmed that the presence of positive noise, can benefit the learning process, while the traditionally perceived harmful noise indeed impairs deep learning models. The different roles of noise offer new explanations for deep models on specific tasks and provide a new paradigm for improving model performance. Moreover, it reminds us that we can influence the performance of learning systems via information entropy change.