Collaborating Authors

 Leng, Dawei


CCMB: A Large-scale Chinese Cross-modal Benchmark

arXiv.org Artificial Intelligence

Vision-language pre-training (VLP) on large-scale datasets has shown superior performance on various downstream tasks. In contrast to the abundance of benchmarks built on English corpora, large-scale pre-training datasets and downstream datasets with Chinese corpora remain largely unexplored. In this work, we build a large-scale, high-quality Chinese Cross-Modal Benchmark named CCMB for the research community, which contains the currently largest public pre-training dataset, Zero, and five human-annotated fine-tuning datasets for downstream tasks. Zero contains 250 million images paired with 750 million text descriptions, and two of the five fine-tuning datasets are also currently the largest of their kind for Chinese cross-modal downstream tasks. Along with CCMB, we also develop a VLP framework named R2D2, which applies a pre-Ranking + Ranking strategy to learn powerful vision-language representations and a two-way distillation method (i.e., target-guided distillation and feature-guided distillation) to further enhance the learning capability. With Zero and the R2D2 VLP framework, we achieve state-of-the-art performance on twelve downstream datasets spanning five broad task categories: image-text retrieval, image-text matching, image captioning, text-to-image generation, and zero-shot image classification. The datasets, models, and code are available at https://github.com/yuxie11/R2D2
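The two-way distillation the abstract names can be sketched in a minimal form: target-guided distillation matches the student's predicted distribution to the teacher's soft targets, and feature-guided distillation matches intermediate features. The function names, loss forms, and temperature below are illustrative assumptions, not the R2D2 implementation:

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-softened softmax along the last axis."""
    z = z / T
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def target_guided_loss(student_logits, teacher_logits, T=2.0):
    """KL(teacher || student) on temperature-softened distributions (assumed form)."""
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return float(np.sum(p * (np.log(p) - np.log(q))))

def feature_guided_loss(student_feat, teacher_feat):
    """Mean-squared error between intermediate features (assumed form)."""
    return float(np.mean((student_feat - teacher_feat) ** 2))

teacher_logits = np.array([[2.0, 0.5, -1.0]])
student_logits = np.array([[1.5, 0.7, -0.5]])
loss = target_guided_loss(student_logits, teacher_logits) + \
       feature_guided_loss(np.ones((1, 4)), np.zeros((1, 4)))
```

A perfectly matched student drives both terms to zero, which is the sense in which the teacher "guides" training.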


Bridge Diffusion Model: bridge non-English language-native text-to-image diffusion model with English communities

arXiv.org Artificial Intelligence

Text-to-image generation (TTI) technologies are advancing rapidly, especially in English-language communities. However, English-native TTI models inherently carry biases from English-centric training data, which creates a dilemma for the development of other language-native TTI models. One common choice is to fine-tune an English-native TTI model with translated samples from non-English communities, but this falls short of fully addressing the model bias problem. Alternatively, training a non-English language-native model from scratch can effectively resolve the English-world bias, but it diverges from the English TTI communities and thus can no longer benefit from the strides continuously being made there. To build a non-English language-native TTI model while keeping compatibility with the English TTI communities, we propose a novel model structure referred to as the "Bridge Diffusion Model" (BDM). The proposed BDM employs a backbone-branch network structure to learn non-English language semantics while keeping the latent space compatible with the English-native TTI backbone, in an end-to-end manner. The unique advantage of the proposed BDM is that it is not only adept at generating images that precisely depict non-English language semantics, but also compatible with various English-native TTI plugins, such as different checkpoints, LoRA, ControlNet, DreamBooth, and Textual Inversion. Moreover, BDM can seamlessly combine both non-English-native and English-native semantics within a single image, fostering cultural interaction. We verify our method by applying BDM to build a Chinese-native TTI model; the method itself is generic and applicable to any other language.
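The backbone-branch idea can be illustrated with a toy residual computation: a frozen English-native backbone produces latents, and a trainable language branch injects a residual that steers those latents toward non-English semantics while the latent space itself is unchanged. All names, shapes, and the residual form below are illustrative assumptions; the actual BDM operates inside a diffusion network:

```python
import numpy as np

def frozen_backbone(latent, english_emb):
    """Stand-in for the frozen English-native TTI backbone; its weights never update."""
    return latent + 0.1 * english_emb

def bdm_step(latent, english_emb, native_emb, W_branch):
    """Backbone output plus a residual from the trainable language branch.

    Only W_branch (the branch) would receive gradients during training, so the
    result stays in the backbone's latent space and English-native plugins
    that operate on that space remain applicable.
    """
    base = frozen_backbone(latent, english_emb)
    residual = native_emb @ W_branch  # hypothetical branch: a single linear map
    return base + residual

latent = np.zeros(4)
english_emb = np.ones(4)
native_emb = np.ones(8)          # non-English text embedding (assumed dim)
W_branch = np.full((8, 4), 0.05) # trainable branch weights (assumed shape)
out = bdm_step(latent, english_emb, native_emb, W_branch)
```

With the branch weights at zero, the output reduces exactly to the frozen backbone's output, which is the compatibility property the abstract emphasizes.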


Enhance Information Propagation for Graph Neural Network by Heterogeneous Aggregations

arXiv.org Artificial Intelligence

The success of deep learning in computer vision and natural language processing has recently triggered a flood of research on applying neural networks to graph data (Wu et al., 2020). A graph is a simple yet versatile data structure jointly described by a set of nodes and a set of edges. Aside from the image and text data we are familiar with, much real-world data is better described as graphs and thus processed by graph neural networks, such as social networks (Fan et al., 2019), financial fraud detection (Wang et al., 2020), knowledge graphs (Zhang et al., 2020), biological interaction networks (Higham et al., 2008), and small molecules in drug discovery (Hu et al., 2019), to name a few. Since the seminal works (Kipf and Welling, 2016; Hamilton et al., 2017), tens of different graph neural network variants have been proposed, emphasizing different graph properties and design options. GNN research can be roughly divided into two categories: spectral-based and spatial-based. Spectral-based GNNs approximate the CNN's convolution by defining a Fourier transform on the graph (Kipf and Welling, 2016), which is where the name graph convolution network comes from.
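The spectral graph convolution cited above (Kipf and Welling, 2016) reduces, after their first-order approximation, to the propagation rule H' = D^{-1/2}(A + I)D^{-1/2} H W. A minimal NumPy sketch of one such layer, with a hypothetical identity weight matrix for illustration:

```python
import numpy as np

def gcn_layer(A, H, W):
    """One graph-convolution layer: H' = D^{-1/2} (A + I) D^{-1/2} H W."""
    A_hat = A + np.eye(A.shape[0])           # add self-loops
    d = A_hat.sum(axis=1)                    # degrees of the self-looped graph
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))   # symmetric normalization D^{-1/2}
    return D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W

# Tiny 3-node path graph: 0 - 1 - 2
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
H = np.eye(3)   # one-hot node features
W = np.eye(3)   # identity weights, so the output is the normalized adjacency
H_new = gcn_layer(A, H, W)
```

Each row of `H_new` is a degree-normalized average over a node's 1-hop neighborhood (including itself), which is the information-propagation step the paper's heterogeneous aggregations build on.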


Heterogeneous Graph based Deep Learning for Biomedical Network Link Prediction

arXiv.org Artificial Intelligence

Multi-scale biomedical knowledge networks are expanding with emerging experimental technologies. Link prediction is increasingly used, especially in bipartite biomedical networks. We propose a graph neural network (GNN) method, namely the Graph Pair based Link Prediction model (GPLP), for predicting biomedical network links based solely on their topological interaction information. In GPLP, 1-hop subgraphs extracted from the known network interaction matrix are learned to predict missing links. To evaluate our method, three heterogeneous biomedical networks were used: a Drug-Target Interaction network (DTI), a Compound-Protein Interaction network (CPI) from NIH Tox21, and a Compound-Virus Inhibition network (CVI). In 5-fold cross-validation, our proposed GPLP method significantly outperforms the state-of-the-art baselines. In addition, robustness is tested under different levels of network incompleteness. Our method has potential applications in other biomedical networks.
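The 1-hop subgraph extraction the abstract describes can be sketched for a bipartite interaction matrix: for a candidate pair (i, j), collect each endpoint's 1-hop neighborhood from the known interactions and take the induced sub-matrix. The function name and exact neighborhood definition are illustrative assumptions; the paper's procedure may differ in detail:

```python
import numpy as np

def one_hop_subgraph(M, i, j):
    """Extract the 1-hop enclosing subgraph of pair (i, j).

    M: binary interaction matrix (rows = e.g. drugs, cols = e.g. targets).
    Returns the induced sub-matrix plus the selected row and column indices.
    """
    rows = np.flatnonzero(M[:, j])   # row-nodes interacting with column j
    cols = np.flatnonzero(M[i, :])   # column-nodes interacting with row i
    rows = np.union1d(rows, [i])     # always include the candidate pair itself
    cols = np.union1d(cols, [j])
    return M[np.ix_(rows, cols)], rows, cols

M = np.array([[1, 0, 1],
              [0, 1, 0],
              [1, 1, 0]])
sub, r_idx, c_idx = one_hop_subgraph(M, 0, 1)
```

Such pair-centered sub-matrices are what a GNN can then score to decide whether the (i, j) link is likely present, using topology alone.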