
Collaborating Authors: Yuan, Baichuan


Learning Graph Quantized Tokenizers for Transformers

arXiv.org Artificial Intelligence

Transformers serve as the backbone architectures of Foundational Models, where a domain-specific tokenizer helps them adapt to various domains. Graph Transformers (GTs) have recently emerged as a leading model in geometric deep learning, outperforming Graph Neural Networks (GNNs) in various graph learning tasks. However, the development of tokenizers for graphs has lagged behind other modalities, with existing approaches relying on heuristics or GNNs co-trained with Transformers. To address this, we introduce GQT (Graph Quantized Tokenizer), which decouples tokenizer training from Transformer training by leveraging multitask graph self-supervised learning, yielding robust and generalizable graph tokens. Furthermore, the GQT utilizes Residual Vector Quantization (RVQ) to learn hierarchical discrete tokens, resulting in significantly reduced memory requirements and improved generalization capabilities. By combining the GQT with token modulation, a Transformer encoder achieves state-of-the-art performance on 16 out of 18 benchmarks, including large-scale homophilic and heterophilic datasets.

Unlike message-passing Graph Neural Networks (GNNs), which rely on strong locality inductive biases (Battaglia et al., 2018; Veličković et al., 2018; Hou et al., 2020; Hamilton et al., 2017a; Kipf & Welling, 2017), GTs are inherently more expressive due to their ability to capture long-range interactions between nodes (Ma et al., 2023). This is particularly beneficial in heterophilous settings, where local alignment does not hold (Fu et al., 2024). GTs possess an expressive power at least equivalent to the 2-Weisfeiler-Lehman (WL) isomorphism test (Kim et al., 2022), which is sufficient for most real-world tasks (Zopf, 2022). This surpasses the expressive power of message-passing GNNs, which are limited to the 1-WL test (Ying et al., 2021a).
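
The abstract does not spell out how the hierarchical discrete tokens are produced, but the core Residual Vector Quantization idea it names (quantizing an embedding against a sequence of codebooks, each operating on the residual left by the previous level) can be sketched roughly as below. The codebook size, depth, straight-through estimator, and PyTorch framing are illustrative assumptions, not the paper's actual configuration.

```python
# Minimal sketch of Residual Vector Quantization (RVQ) for node embeddings.
# Codebook size, number of levels, and the straight-through estimator are
# assumptions for illustration; they are not taken from the GQT paper.
import torch
import torch.nn as nn


class ResidualVQ(nn.Module):
    def __init__(self, dim: int, codebook_size: int = 256, num_levels: int = 3):
        super().__init__()
        self.codebooks = nn.ModuleList(
            nn.Embedding(codebook_size, dim) for _ in range(num_levels)
        )

    def forward(self, z: torch.Tensor):
        """Quantize node embeddings z of shape (num_nodes, dim)."""
        residual = z
        quantized = torch.zeros_like(z)
        codes = []
        for codebook in self.codebooks:
            # Nearest codeword for the current residual.
            dists = torch.cdist(residual, codebook.weight)  # (N, K)
            idx = dists.argmin(dim=-1)                       # (N,)
            selected = codebook(idx)                         # (N, dim)
            quantized = quantized + selected
            residual = residual - selected
            codes.append(idx)
        # Straight-through estimator so gradients can flow to an upstream encoder.
        quantized = z + (quantized - z).detach()
        # Each node ends up represented by a short sequence of discrete codes.
        return quantized, torch.stack(codes, dim=-1)


if __name__ == "__main__":
    z = torch.randn(10, 64)        # 10 toy node embeddings
    rvq = ResidualVQ(dim=64)
    z_q, codes = rvq(z)
    print(z_q.shape, codes.shape)  # torch.Size([10, 64]) torch.Size([10, 3])
```

The hierarchical aspect comes from later levels refining the residual error of earlier ones, so coarse structure is captured by the first codebook and finer detail by subsequent ones.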


Do We Really Need Complicated Model Architectures For Temporal Networks?

arXiv.org Artificial Intelligence

Recurrent neural networks (RNNs) and the self-attention mechanism (SAM) are the de facto methods for extracting spatial-temporal information in temporal graph learning. Interestingly, we find that although both RNN and SAM can lead to good performance, in practice neither of them is always necessary. In this paper, we propose GraphMixer, a conceptually and technically simple architecture that consists of three components: (1) a link-encoder based only on multi-layer perceptrons (MLPs) to summarize the information from temporal links, (2) a node-encoder based only on neighbor mean-pooling to summarize node information, and (3) an MLP-based link classifier that performs link prediction based on the outputs of the two encoders. These results motivate us to rethink the importance of simpler model architectures.

In recent years, temporal graph learning has been recognized as an important machine learning problem and has become the cornerstone behind a wealth of high-impact applications (Yu et al., 2018; Bui et al., 2021; Kazemi et al., 2020; Zhou et al., 2020; Cong et al., 2021b). Temporal link prediction is a classic downstream task that focuses on predicting future interactions among nodes. For example, in an ads ranking system, user-ad clicks can be modeled as a temporal bipartite graph whose nodes represent users and ads, and whose links carry timestamps indicating when users clicked ads; link prediction then amounts to predicting whether a user will click a given ad. Designing graph learning models that can capture node evolutionary patterns and accurately predict future links is a crucial direction for many real-world recommender systems. In temporal graph learning, the RNN and SAM have become the de facto standard (Kumar et al., 2019; Sankar et al., 2020; Xu et al., 2020; Rossi et al., 2020; Wang et al., 2020), and the majority of existing works focus on designing neural architectures that combine one of them with additional components to learn representations from raw data. Although powerful, these methods are conceptually and technically complicated, with advanced model architectures.
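
The three-component design can be sketched loosely as below. The dimensions, the fixed number of sampled recent links, and the plain-MLP link encoder are illustrative assumptions (the paper's link encoder is built on an MLP-Mixer over recent temporal links), so this is a structural sketch rather than the exact GraphMixer architecture.

```python
# Rough sketch of a GraphMixer-style model: (1) an MLP-only link encoder,
# (2) a mean-pooling node encoder, (3) an MLP link classifier. Shapes and the
# plain-MLP link encoder are illustrative assumptions, not the paper's design.
import torch
import torch.nn as nn


class GraphMixerSketch(nn.Module):
    def __init__(self, link_feat_dim: int, node_feat_dim: int,
                 hidden_dim: int = 128, num_recent_links: int = 20):
        super().__init__()
        # (1) Link encoder: MLP over each node's most recent temporal links.
        self.link_encoder = nn.Sequential(
            nn.Linear(num_recent_links * link_feat_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim),
        )
        # (3) Link classifier: MLP over the two endpoint representations.
        self.classifier = nn.Sequential(
            nn.Linear(2 * (hidden_dim + node_feat_dim), hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 1),
        )

    def encode_node(self, recent_link_feats, neighbor_feats):
        # recent_link_feats: (batch, num_recent_links, link_feat_dim)
        # neighbor_feats:    (batch, num_neighbors, node_feat_dim)
        link_summary = self.link_encoder(recent_link_feats.flatten(1))
        # (2) Node encoder: simple mean-pooling over neighbor features.
        node_summary = neighbor_feats.mean(dim=1)
        return torch.cat([link_summary, node_summary], dim=-1)

    def forward(self, src_inputs, dst_inputs):
        h_src = self.encode_node(*src_inputs)
        h_dst = self.encode_node(*dst_inputs)
        return self.classifier(torch.cat([h_src, h_dst], dim=-1))  # link logit


if __name__ == "__main__":
    model = GraphMixerSketch(link_feat_dim=4, node_feat_dim=8)
    src = (torch.randn(2, 20, 4), torch.randn(2, 5, 8))
    dst = (torch.randn(2, 20, 4), torch.randn(2, 5, 8))
    print(model(src, dst).shape)  # torch.Size([2, 1])
```

The point of the sketch is simply that no recurrence or attention appears anywhere: both encoders are fixed-size MLP/pooling operations.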


Multivariate Spatiotemporal Hawkes Processes and Network Reconstruction

arXiv.org Machine Learning

There is often latent network structure in spatial and temporal data, and the tools of network analysis can yield fascinating insights into such data. In this paper, we develop a nonparametric method for network reconstruction from spatiotemporal data sets using multivariate Hawkes processes. In contrast to prior work on network reconstruction with point-process models, which has often focused on exclusively temporal information, our approach uses both temporal and spatial information and does not assume a specific parametric form of network dynamics. This leads to an effective way of recovering an underlying network. We illustrate our approach using both synthetic networks and networks constructed from real-world data sets (a location-based social media network, a narrative of crime events, and violent gang crimes). Our results demonstrate that, in comparison to using only temporal data, our spatiotemporal approach yields improved network reconstruction, providing a basis for meaningful subsequent analysis of the reconstructed networks, such as community structure and motif analysis.
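
The paper's estimation is nonparametric, but the multivariate spatiotemporal Hawkes intensity it builds on can be illustrated with standard parametric kernels. The exponential temporal kernel and Gaussian spatial kernel below, along with all numbers, are assumptions chosen only for the example; in network reconstruction the excitation matrix `alpha` plays the role of the inferred edge weights.

```python
# Illustrative evaluation of a multivariate spatiotemporal Hawkes intensity
#   lambda_u(t, x) = mu_u + sum_{t_i < t} alpha[u_i, u] * g(t - t_i) * f(x - x_i)
# The exponential temporal kernel g and Gaussian spatial kernel f are standard
# parametric choices used only for illustration; the paper estimates the
# triggering kernels nonparametrically.
import numpy as np


def intensity(u, t, x, events, mu, alpha, beta=1.0, sigma=1.0):
    """Conditional intensity of node u at time t and 2-D location x.

    events: list of (node_index, time, location) triples observed before t
    mu:     background rates, shape (num_nodes,)
    alpha:  excitation matrix; alpha[i, j] = influence of node i on node j
    """
    lam = mu[u]
    for (ui, ti, xi) in events:
        if ti >= t:
            continue
        g = beta * np.exp(-beta * (t - ti))                        # temporal decay
        d2 = np.sum((np.asarray(x) - np.asarray(xi)) ** 2)
        f = np.exp(-d2 / (2 * sigma ** 2)) / (2 * np.pi * sigma ** 2)  # spatial decay
        lam += alpha[ui, u] * g * f
    return lam


if __name__ == "__main__":
    mu = np.array([0.1, 0.2])
    alpha = np.array([[0.5, 0.3], [0.0, 0.4]])  # inferred matrix ~ network edges
    events = [(0, 1.0, (0.0, 0.0)), (1, 2.5, (1.0, 1.0))]
    print(intensity(u=1, t=3.0, x=(0.5, 0.5), events=events, mu=mu, alpha=alpha))
```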


Graph-Based Deep Modeling and Real Time Forecasting of Sparse Spatio-Temporal Data

arXiv.org Machine Learning

We present a generic framework for spatio-temporal (ST) data modeling, analysis, and forecasting, with a special focus on data that is sparse in both space and time. Our multi-scale framework seamlessly couples two major components: a self-exciting point process that models the macroscale statistical behavior of the ST data, and a graph-structured recurrent neural network (GSRNN) that discovers the microscale patterns of the ST data on the inferred graph. This novel deep neural network (DNN) incorporates the real-time interactions of the graph nodes to enable more accurate real-time forecasting. The effectiveness of our method is demonstrated on both crime and traffic forecasting.
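
The abstract only outlines the coupling, so the sketch below is a loose interpretation of the recurrent, graph-structured part: each node keeps a recurrent state that is updated from its own input together with an aggregate of its neighbors' previous states on the inferred graph. The GRU cell, mean aggregation, and all dimensions are assumptions for illustration, not the paper's exact GSRNN design.

```python
# Loose sketch of one step of a graph-structured RNN on an inferred graph.
# GRU cell, mean aggregation, and dimensions are illustrative assumptions.
import torch
import torch.nn as nn


class GSRNNStep(nn.Module):
    def __init__(self, in_dim: int, hidden_dim: int):
        super().__init__()
        self.cell = nn.GRUCell(in_dim + hidden_dim, hidden_dim)

    def forward(self, x, h, adj):
        # x:   (num_nodes, in_dim)      node inputs at the current time step
        # h:   (num_nodes, hidden_dim)  previous hidden states
        # adj: (num_nodes, num_nodes)   adjacency matrix of the inferred graph
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1)
        neighbor_msg = adj @ h / deg  # mean of neighboring hidden states
        return self.cell(torch.cat([x, neighbor_msg], dim=-1), h)


if __name__ == "__main__":
    n, d, hdim = 5, 8, 16
    step = GSRNNStep(d, hdim)
    adj = (torch.rand(n, n) > 0.5).float()
    h = torch.zeros(n, hdim)
    for t in range(3):                # unroll over a few time steps
        h = step(torch.randn(n, d), h, adj)
    print(h.shape)                    # torch.Size([5, 16])
```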