Chen, Boyu
Blend the Separated: Mixture of Synergistic Experts for Data-Scarcity Drug-Target Interaction Prediction
Zhai, Xinlong, Wang, Chunchen, Wang, Ruijia, Kang, Jiazheng, Li, Shujie, Chen, Boyu, Ma, Tengfei, Zhou, Zikai, Yang, Cheng, Shi, Chuan
Drug-target interaction prediction (DTI) is essential in various applications including drug discovery and clinical application. There are two perspectives of input data widely used in DTI prediction: Intrinsic data represents how drugs or targets are constructed, and extrinsic data represents how drugs or targets are related to other biological entities. However, any of the two perspectives of input data can be scarce for some drugs or targets, especially for those unpopular or newly discovered. Furthermore, ground-truth labels for specific interaction types can also be scarce. Therefore, we propose the first method to tackle DTI prediction under input data and/or label scarcity. To make our model functional when only one perspective of input data is available, we design two separate experts to process intrinsic and extrinsic data respectively and fuse them adaptively according to different samples. Furthermore, to make the two perspectives complement each other and remedy label scarcity, two experts synergize with each other in a mutually supervised way to exploit the enormous unlabeled data. Extensive experiments on 3 real-world datasets under different extents of input data scarcity and/or label scarcity demonstrate our model outperforms states of the art significantly and steadily, with a maximum improvement of 53.53%. We also test our model without any data scarcity and it still outperforms current methods.
PathRAG: Pruning Graph-based Retrieval Augmented Generation with Relational Paths
Chen, Boyu, Guo, Zirui, Yang, Zidan, Chen, Yuluo, Chen, Junze, Liu, Zhenghao, Shi, Chuan, Yang, Cheng
Retrieval-augmented generation (RAG) improves the response quality of large language models (LLMs) by retrieving knowledge from external databases. Typical RAG approaches split the text database into chunks, organizing them in a flat structure for efficient searches. To better capture the inherent dependencies and structured relationships across the text database, researchers propose to organize textual information into an indexing graph, known asgraph-based RAG. However, we argue that the limitation of current graph-based RAG methods lies in the redundancy of the retrieved information, rather than its insufficiency. Moreover, previous methods use a flat structure to organize retrieved information within the prompts, leading to suboptimal performance. To overcome these limitations, we propose PathRAG, which retrieves key relational paths from the indexing graph, and converts these paths into textual form for prompting LLMs. Specifically, PathRAG effectively reduces redundant information with flow-based pruning, while guiding LLMs to generate more logical and coherent responses with path-based prompting. Experimental results show that PathRAG consistently outperforms state-of-the-art baselines across six datasets and five evaluation dimensions. The code is available at the following link: https://github.com/BUPT-GAMMA/PathRAG
JEN-1 DreamStyler: Customized Musical Concept Learning via Pivotal Parameters Tuning
Chen, Boyu, Li, Peike, Yao, Yao, Wang, Alex
Large models for text-to-music generation have achieved significant progress, facilitating the creation of high-quality and varied musical compositions from provided text prompts. However, input text prompts may not precisely capture user requirements, particularly when the objective is to generate music that embodies a specific concept derived from a designated reference collection. In this paper, we propose a novel method for customized text-to-music generation, which can capture the concept from a two-minute reference music and generate a new piece of music conforming to the concept. We achieve this by fine-tuning a pretrained text-to-music model using the reference music. However, directly fine-tuning all parameters leads to overfitting issues. To address this problem, we propose a Pivotal Parameters Tuning method that enables the model to assimilate the new concept while preserving its original generative capabilities. Additionally, we identify a potential concept conflict when introducing multiple concepts into the pretrained model. We present a concept enhancement strategy to distinguish multiple concepts, enabling the fine-tuned model to generate music incorporating either individual or multiple concepts simultaneously. Since we are the first to work on the customized music generation task, we also introduce a new dataset and evaluation protocol for the new task. Our proposed Jen1-DreamStyler outperforms several baselines in both qualitative and quantitative evaluations. Demos will be available at https://www.jenmusic.ai/research#DreamStyler.
JEN-1 Composer: A Unified Framework for High-Fidelity Multi-Track Music Generation
Yao, Yao, Li, Peike, Chen, Boyu, Wang, Alex
With rapid advances in generative artificial intelligence, the text-to-music synthesis task has emerged as a promising direction for music generation from scratch. However, finer-grained control over multi-track generation remains an open challenge. Existing models exhibit strong raw generation capability but lack the flexibility to compose separate tracks and combine them in a controllable manner, differing from typical workflows of human composers. To address this issue, we propose JEN-1 Composer, a unified framework to efficiently model marginal, conditional, and joint distributions over multi-track music via a single model. JEN-1 Composer framework exhibits the capacity to seamlessly incorporate any diffusion-based music generation system, \textit{e.g.} Jen-1, enhancing its capacity for versatile multi-track music generation. We introduce a curriculum training strategy aimed at incrementally instructing the model in the transition from single-track generation to the flexible generation of multi-track combinations. During the inference, users have the ability to iteratively produce and choose music tracks that meet their preferences, subsequently creating an entire musical composition incrementally following the proposed Human-AI co-composition workflow. Quantitative and qualitative assessments demonstrate state-of-the-art performance in controllable and high-fidelity multi-track music synthesis. The proposed JEN-1 Composer represents a significant advance toward interactive AI-facilitated music creation and composition. Demos will be available at https://www.jenmusic.ai/audio-demos.
Ternary Singular Value Decomposition as a Better Parameterized Form in Linear Mapping
Chen, Boyu, Chen, Hanxuan, He, Jiao, Sun, Fengyu, Jui, Shangling
We present a simple yet novel parameterized form of linear mapping to achieves remarkable network compression performance: a pseudo SVD called Ternary SVD (TSVD). Unlike vanilla SVD, TSVD limits the $U$ and $V$ matrices in SVD to ternary matrices form in $\{\pm 1, 0\}$. This means that instead of using the expensive multiplication instructions, TSVD only requires addition instructions when computing $U(\cdot)$ and $V(\cdot)$. We provide direct and training transition algorithms for TSVD like Post Training Quantization and Quantization Aware Training respectively. Additionally, we analyze the convergence of the direct transition algorithms in theory. In experiments, we demonstrate that TSVD can achieve state-of-the-art network compression performance in various types of networks and tasks, including current baseline models such as ConvNext, Swim, BERT, and large language model like OPT.
JEN-1: Text-Guided Universal Music Generation with Omnidirectional Diffusion Models
Li, Peike, Chen, Boyu, Yao, Yao, Wang, Yikai, Wang, Allen, Wang, Alex
Music generation has attracted growing interest with the advancement of deep generative models. However, generating music conditioned on textual descriptions, known as text-to-music, remains challenging due to the complexity of musical structures and high sampling rate requirements. This paper introduces JEN-1, a universal high-fidelity model for text-to-music generation. JEN-1 is a diffusion model incorporating both autoregressive and non-autoregressive training. Through incontext learning, JEN-1 performs various generation tasks including text-guided music generation, music inpainting, and continuation. Evaluations demonstrate JEN-1's superior performance over state-of-the-art methods in text-music alignment and music quality while maintaining computational efficiency. Our demos are available at https://www.futureverse.com/research/jen/ "Music is the universal language of mankind." - Henry Wadsworth Longfellow Music, as an artistic expression comprising harmony, melody and rhythm, holds great cultural significance and appeal to humans. Recent years have witnessed remarkable progress in music generation with the rise of deep generative models (Liu et al., 2023; Kreuk et al., 2022; Agostinelli et al., 2023).
Modify Training Directions in Function Space to Reduce Generalization Error
Yu, Yi, Lu, Wenlian, Chen, Boyu
We propose theoretical analyses of a modified natural gradient descent method in the neural network function space based on the eigendecompositions of neural tangent kernel and Fisher information matrix. We firstly present analytical expression for the function learned by this modified natural gradient under the assumptions of Gaussian distribution and infinite width limit. Thus, we explicitly derive the generalization error of the learned neural network function using theoretical methods from eigendecomposition and statistics theory. By decomposing of the total generalization error attributed to different eigenspace of the kernel in function space, we propose a criterion for balancing the errors stemming from training set and the distribution discrepancy between the training set and the true data. Through this approach, we establish that modifying the training direction of the neural network in function space leads to a reduction in the total generalization error. Furthermore, We demonstrate that this theoretical framework is capable to explain many existing results of generalization enhancing methods. These theoretical results are also illustrated by numerical examples on synthetic data.
Intelligent Solution System towards Parts Logistics Optimization
Huang, Yaoting, Chen, Boyu, Lu, Wenlian, Jin, Zhong-Xiao, Zheng, Ren
Due to the complication of the presented problem, intelligent algorithms show great power to solve the parts logistics optimization problem related to the vehicle routing problem (VRP). However, most of the existing research to VRP are incomprehensive and failed to solve a real-work parts logistics problem. In this work, towards SAIC logistics problem, we propose a systematic solution to this 2-Dimensional Loading Capacitated Multi-Depot Heterogeneous VRP with Time Windows by integrating diverse types of intelligent algorithms, including, a heuristic algorithm to initialize feasible logistics planning schemes by imitating manual planning, the core Tabu Search algorithm for global optimization, accelerated by a novel bundle technique, heuristically algorithms for routing, packing and queuing associated, and a heuristic post-optimization process to promote the optimal solution. Based on these algorithms, the SAIC Motor has successfully established an intelligent management system to give a systematic solution for the parts logistics planning, superior than manual planning in its performance, customizability and expandability.
Meta-Learning with Hessian Free Approach in Deep Neural Nets Training
Chen, Boyu, Lu, Wenlian
Meta-learning is a promising method to achieve efficient training method towards deep neural net and has been attracting increases interests in recent years. But most of the current methods are still not capable to train complex neuron net model with long-time training process. In this paper, a novel second-order meta-optimizer, named Meta-learning with Hessian-Free(MLHF) approach, is proposed based on the Hessian Free approach as the framework. Two recurrent neural networks are established to generate the damping and the precondition matrix of this Hessian free framework. A series of techniques to meta-train the MLHF towards stable and reinforce the meta-training of this optimizer, including the gradient calculation of $H$, and use experiment replay on $w^0$. Numerical experiments on deep convolution neural nets, including CUDA-convnet and resnet18(v2), with datasets of cifar10 and ILSVRC2012, indicate that the MLHF shows good and continuous training performance during the whole long-time training process, i.e., both the rapid-decreasing early stage and the steadily-deceasing later stage, and so is a promising meta-learning framework towards elevating the training efficiency in real-world deep neural nets.