AITopics | Cen, Yukuo

Collaborating Authors

Cen, Yukuo

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

LongRAG: A Dual-Perspective Retrieval-Augmented Generation Paradigm for Long-Context Question Answering

Zhao, Qingfei, Wang, Ruobing, Cen, Yukuo, Zha, Daren, Tan, Shicheng, Dong, Yuxiao, Tang, Jie

arXiv.org Artificial IntelligenceNov-1-2024

Long-Context Question Answering (LCQA), a challenging task, aims to reason over long-context documents to yield accurate answers to questions. Existing long-context Large Language Models (LLMs) for LCQA often struggle with the "lost in the middle" issue. Retrieval-Augmented Generation (RAG) mitigates this issue by providing external factual evidence. However, its chunking strategy disrupts the global long-context information, and its low-quality retrieval in long contexts hinders LLMs from identifying effective factual details due to substantial noise. To this end, we propose LongRAG, a general, dual-perspective, and robust LLM-based RAG system paradigm for LCQA to enhance RAG's understanding of complex long-context knowledge (i.e., global information and factual details). We design LongRAG as a plug-and-play paradigm, facilitating adaptation to various domains and LLMs. Extensive experiments on three multi-hop datasets demonstrate that LongRAG significantly outperforms long-context LLMs (up by 6.94%), advanced RAG (up by 6.16%), and Vanilla RAG (up by 17.25%). Furthermore, we conduct quantitative ablation studies and multi-dimensional analyses, highlighting the effectiveness of the system's components and fine-tuning strategies. Data and code are available at https://github.com/QingFei1/LongRAG.

information, large language model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2410.1805

Country:

Europe (1.00)
Asia (0.93)
North America > United States > California (0.28)

Genre: Research Report > New Finding (0.67)

Industry:

Media > Music (1.00)
Media > Film (1.00)
Leisure & Entertainment (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Generalizing Graph Transformers Across Diverse Graphs and Tasks via Pre-Training on Industrial-Scale Data

He, Yufei, Hou, Zhenyu, Cen, Yukuo, He, Feng, Cheng, Xu, Hooi, Bryan

arXiv.org Artificial IntelligenceJul-4-2024

Graph pre-training has been concentrated on graph-level on small graphs (e.g., molecular graphs) or learning node representations on a fixed graph. Extending graph pre-trained models to web-scale graphs with billions of nodes in industrial scenarios, while avoiding negative transfer across graphs or tasks, remains a challenge. We aim to develop a general graph pre-trained model with inductive ability that can make predictions for unseen new nodes and even new graphs. In this work, we introduce a scalable transformer-based graph pre-training framework called PGT (Pre-trained Graph Transformer). Specifically, we design a flexible and scalable graph transformer as the backbone network. Meanwhile, based on the masked autoencoder architecture, we design two pre-training tasks: one for reconstructing node features and the other one for reconstructing local structures. Unlike the original autoencoder architecture where the pre-trained decoder is discarded, we propose a novel strategy that utilizes the decoder for feature augmentation. We have deployed our framework on Tencent's online game data. Extensive experiments have demonstrated that our framework can perform pre-training on real-world web-scale graphs with over 540 million nodes and 12 billion edges and generalizes effectively to unseen new graphs with different downstream tasks. We further conduct experiments on the publicly available ogbn-papers100M dataset, which consists of 111 million nodes and 1.6 billion edges. Our framework achieves state-of-the-art performance on both industrial datasets and public datasets, while also enjoying scalability and efficiency.

artificial intelligence, machine learning, node, (19 more...)

arXiv.org Artificial Intelligence

2407.03953

Country:

Asia (0.14)
North America > United States (0.14)

Genre: Research Report > New Finding (0.34)

Industry: Leisure & Entertainment > Games > Computer Games (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Add feedback

OAG-Bench: A Human-Curated Benchmark for Academic Graph Mining

Zhang, Fanjin, Shi, Shijie, Zhu, Yifan, Chen, Bo, Cen, Yukuo, Yu, Jifan, Chen, Yelin, Wang, Lulu, Zhao, Qingfei, Cheng, Yuqing, Han, Tianyi, An, Yuwei, Zhang, Dan, Tam, Weng Lam, Cao, Kun, Pang, Yunhe, Guan, Xinyu, Yuan, Huihui, Song, Jian, Li, Xiaoyan, Dong, Yuxiao, Tang, Jie

arXiv.org Artificial IntelligenceJun-20-2024

With the rapid proliferation of scientific literature, versatile academic knowledge services increasingly rely on comprehensive academic graph mining. Despite the availability of public academic graphs, benchmarks, and datasets, these resources often fall short in multi-aspect and fine-grained annotations, are constrained to specific task types and domains, or lack underlying real academic graphs. In this paper, we present OAG-Bench, a comprehensive, multi-aspect, and fine-grained human-curated benchmark based on the Open Academic Graph (OAG). OAG-Bench covers 10 tasks, 20 datasets, 70+ baselines, and 120+ experimental results to date. We propose new data annotation strategies for certain tasks and offer a suite of data pre-processing codes, algorithm implementations, and standardized evaluation protocols to facilitate academic graph mining. Extensive experiments reveal that even advanced algorithms like large language models (LLMs) encounter difficulties in addressing key challenges in certain tasks, such as paper source tracing and scholar profiling. We also introduce the Open Academic Graph Challenge (OAG-Challenge) to encourage community input and sharing. We envisage that OAG-Bench can serve as a common ground for the community to evaluate and compare algorithms in academic graph mining, thereby accelerating algorithm development and advancement in this field. OAG-Bench is accessible at https://www.aminer.cn/data/.

data mining, large language model, machine learning, (24 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3637528.3672354

2402.1581

Country:

Europe > Spain (0.16)
Asia > China (0.14)
North America > United States (0.14)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Data Science > Data Mining > Big Data (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(2 more...)

Add feedback

Does Negative Sampling Matter? A Review with Insights into its Theory and Applications

Yang, Zhen, Ding, Ming, Huang, Tinglin, Cen, Yukuo, Song, Junshuai, Xu, Bin, Dong, Yuxiao, Tang, Jie

arXiv.org Artificial IntelligenceFeb-27-2024

Negative sampling has swiftly risen to prominence as a focal point of research, with wide-ranging applications spanning machine learning, computer vision, natural language processing, data mining, and recommender systems. This growing interest raises several critical questions: Does negative sampling really matter? Is there a general framework that can incorporate all existing negative sampling methods? In what fields is it applied? Addressing these questions, we propose a general framework that leverages negative sampling. Delving into the history of negative sampling, we trace the development of negative sampling through five evolutionary paths. We dissect and categorize the strategies used to select negative sample candidates, detailing global, local, mini-batch, hop, and memory-based approaches. Our review categorizes current negative sampling methods into five types: static, hard, GAN-based, Auxiliary-based, and In-batch methods, providing a clear structure for understanding negative sampling. Beyond detailed categorization, we highlight the application of negative sampling in various areas, offering insights into its practical benefits. Finally, we briefly discuss open problems and future directions for negative sampling.

data mining, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2402.17238

Country:

Asia > China (0.14)
North America > United States (0.14)

Genre:

Overview (0.93)
Research Report (0.81)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(4 more...)

Add feedback

PST-Bench: Tracing and Benchmarking the Source of Publications

Zhang, Fanjin, Cao, Kun, Cen, Yukuo, Yu, Jifan, Yin, Da, Tang, Jie

arXiv.org Artificial IntelligenceFeb-25-2024

Tracing the source of research papers is a fundamental yet challenging task for researchers. The billion-scale citation relations between papers hinder researchers from understanding the evolution of science efficiently. To date, there is still a lack of an accurate and scalable dataset constructed by professional researchers to identify the direct source of their studied papers, based on which automatic algorithms can be developed to expand the evolutionary knowledge of science. In this paper, we study the problem of paper source tracing (PST) and construct a high-quality and ever-increasing dataset PST-Bench in computer science. Based on PST-Bench, we reveal several intriguing discoveries, such as the differing evolution patterns across various topics. An exploration of various methods underscores the hardness of PST-Bench, pinpointing potential directions on this topic. The dataset and codes have been available at https://github.com/THUDM/paper-source-trace.

data mining, machine learning, natural language, (22 more...)

arXiv.org Artificial Intelligence

2402.16009

Country: Asia > China (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
(2 more...)

Add feedback

BatchSampler: Sampling Mini-Batches for Contrastive Learning in Vision, Language, and Graphs

Yang, Zhen, Huang, Tinglin, Ding, Ming, Dong, Yuxiao, Ying, Rex, Cen, Yukuo, Geng, Yangliao, Tang, Jie

arXiv.org Artificial IntelligenceJun-5-2023

In-Batch contrastive learning is a state-of-the-art self-supervised method that brings semantically-similar instances close while pushing dissimilar instances apart within a mini-batch. Its key to success is the negative sharing strategy, in which every instance serves as a negative for the others within the mini-batch. Recent studies aim to improve performance by sampling hard negatives \textit{within the current mini-batch}, whose quality is bounded by the mini-batch itself. In this work, we propose to improve contrastive learning by sampling mini-batches from the input data. We present BatchSampler\footnote{The code is available at \url{https://github.com/THUDM/BatchSampler}} to sample mini-batches of hard-to-distinguish (i.e., hard and true negatives to each other) instances. To make each mini-batch have fewer false negatives, we design the proximity graph of randomly-selected instances. To form the mini-batch, we leverage random walk with restart on the proximity graph to help sample hard-to-distinguish instances. BatchSampler is a simple and general technique that can be directly plugged into existing contrastive learning models in vision, language, and graphs. Extensive experiments on datasets of three modalities show that BatchSampler can consistently improve the performance of powerful contrastive models, as shown by significant improvements of SimCLR on ImageNet-100, SimCSE on STS (language), and GraphCL and MVGRL on graph datasets.

artificial intelligence, machine learning, natural language, (14 more...)

arXiv.org Artificial Intelligence

2306.03355

Country: North America > United States (0.30)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)

Add feedback

GraphMAE2: A Decoding-Enhanced Masked Self-Supervised Graph Learner

Hou, Zhenyu, He, Yufei, Cen, Yukuo, Liu, Xiao, Dong, Yuxiao, Kharlamov, Evgeny, Tang, Jie

arXiv.org Artificial IntelligenceApr-10-2023

Graph self-supervised learning (SSL), including contrastive and generative approaches, offers great potential to address the fundamental challenge of label scarcity in real-world graph data. Among both sets of graph SSL techniques, the masked graph autoencoders (e.g., GraphMAE)--one type of generative method--have recently produced promising results. The idea behind this is to reconstruct the node features (or structures)--that are randomly masked from the input--with the autoencoder architecture. However, the performance of masked feature reconstruction naturally relies on the discriminability of the input features and is usually vulnerable to disturbance in the features. In this paper, we present a masked self-supervised learning framework GraphMAE2 with the goal of overcoming this issue. The idea is to impose regularization on feature reconstruction for graph SSL. Specifically, we design the strategies of multi-view random re-mask decoding and latent representation prediction to regularize the feature reconstruction. The multi-view random re-mask decoding is to introduce randomness into reconstruction in the feature space, while the latent representation prediction is to enforce the reconstruction in the embedding space. Extensive experiments show that GraphMAE2 can consistently generate top results on various public datasets, including at least 2.45% improvements over state-of-the-art baselines on ogbn-Papers100M with 111M nodes and 1.6B edges.

artificial intelligence, graph, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2304.04779

Country: North America > United States (0.48)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

GACT: Activation Compressed Training for Generic Network Architectures

Liu, Xiaoxuan, Zheng, Lianmin, Wang, Dequan, Cen, Yukuo, Chen, Weize, Han, Xu, Chen, Jianfei, Liu, Zhiyuan, Tang, Jie, Gonzalez, Joey, Mahoney, Michael, Cheung, Alvin

arXiv.org Artificial IntelligenceSep-3-2022

Training large neural network (NN) models requires extensive memory resources, and Activation Compressed Training (ACT) is a promising approach to reduce training memory footprint. This paper presents GACT, an ACT framework to support a broad range of machine learning tasks for generic NN architectures with limited domain knowledge. By analyzing a linearized version of ACT's approximate gradient, we prove the convergence of GACT without prior knowledge on operator type or model architecture. To make training stable, we propose an algorithm that decides the compression ratio for each tensor by estimating its impact on the gradient at run time. We implement GACT as a PyTorch library that readily applies to any NN architecture. GACT reduces the activation memory for convolutional NNs, transformers, and graph NNs by up to 8.1x, enabling training with a 4.2x to 24.7x larger batch size, with negligible accuracy loss. We implement GACT as a PyTorch library at https://github.com/LiuXiaoxuanPKU/GACT-ICML.

artificial intelligence, deep learning, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2206.11357

Country: North America > United States (0.46)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

Add feedback

Improving the Training of Graph Neural Networks with Consistency Regularization

Zhang, Chenhui, He, Yufei, Cen, Yukuo, Hou, Zhenyu, Tang, Jie

arXiv.org Machine LearningDec-8-2021

Graph neural networks (GNNs) have achieved notable success in the semi-supervised learning scenario. The message passing mechanism in graph neural networks helps unlabeled nodes gather supervision signals from their labeled neighbors. In this work, we investigate how consistency regularization, one of widely adopted semi-supervised learning methods, can help improve the performance of graph neural networks. We revisit two methods of consistency regularization for graph neural networks. One is simple consistency regularization (SCR), and the other is mean-teacher consistency regularization (MCR). We combine the consistency regularization methods with two state-of-the-art GNNs and conduct experiments on the ogbn-products dataset. With the consistency regularization, the performance of state-of-the-art GNNs can be improved by 0.3% on the ogbn-products dataset of Open Graph Benchmark (OGB) both with and without external data.

machine learning, teaching medhods, teaching method, (14 more...)

arXiv.org Machine Learning

2112.04319

Country: North America > United States (0.14)

Genre: Research Report (0.82)

Industry: Education (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

Graph Robustness Benchmark: Benchmarking the Adversarial Robustness of Graph Machine Learning

Zheng, Qinkai, Zou, Xu, Dong, Yuxiao, Cen, Yukuo, Yin, Da, Xu, Jiarong, Yang, Yang, Tang, Jie

arXiv.org Artificial IntelligenceNov-8-2021

Adversarial attacks on graphs have posed a major threat to the robustness of graph machine learning (GML) models. Naturally, there is an ever-escalating arms race between attackers and defenders. However, the strategies behind both sides are often not fairly compared under the same and realistic conditions. To bridge this gap, we present the Graph Robustness Benchmark (GRB) with the goal of providing a scalable, unified, modular, and reproducible evaluation for the adversarial robustness of GML models. GRB standardizes the process of attacks and defenses by 1) developing scalable and diverse datasets, 2) modularizing the attack and defense implementations, and 3) unifying the evaluation protocol in refined scenarios. By leveraging the GRB pipeline, the end-users can focus on the development of robust GML models with automated data processing and experimental evaluations. To support open and reproducible research on graph adversarial learning, GRB also hosts public leaderboards across different scenarios. As a starting point, we conduct extensive experiments to benchmark baseline techniques. GRB is open-source and welcomes contributions from the community.

artificial intelligence, gml model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2111.04314

Country: North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report (0.49)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military (0.89)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback