AITopics | Wang, Yihang

Collaborating Authors

Wang, Yihang

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Graph Foundation Models for Recommendation: A Comprehensive Survey

Wu, Bin, Wang, Yihang, Zeng, Yuanhao, Liu, Jiawei, Zhao, Jiashu, Yang, Cheng, Li, Yawen, Xia, Long, Yin, Dawei, Shi, Chuan

arXiv.org Artificial IntelligenceFeb-16-2025

Recommender systems (RS) serve as a fundamental tool for navigating the vast expanse of online information, with deep learning advancements playing an increasingly important role in improving ranking accuracy. Among these, graph neural networks (GNNs) excel at extracting higher-order structural information, while large language models (LLMs) are designed to process and comprehend natural language, making both approaches highly effective and widely adopted. Recent research has focused on graph foundation models (GFMs), which integrate the strengths of GNNs and LLMs to model complex RS problems more efficiently by leveraging the graph-based structure of user-item relationships alongside textual understanding. In this survey, we provide a comprehensive overview of GFM-based RS technologies by introducing a clear taxonomy of current approaches, diving into methodological details, and highlighting key challenges and future directions. By synthesizing recent advancements, we aim to offer valuable insights into the evolving landscape of GFM-based recommender systems.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2502.08346

Country: North America > United States > Minnesota (0.28)

Genre: Overview (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

Add feedback

FoundTS: Comprehensive and Unified Benchmarking of Foundation Models for Time Series Forecasting

Li, Zhe, Qiu, Xiangfei, Chen, Peng, Wang, Yihang, Cheng, Hanyin, Shu, Yang, Hu, Jilin, Guo, Chenjuan, Zhou, Aoying, Wen, Qingsong, Jensen, Christian S., Yang, Bin

arXiv.org Artificial IntelligenceNov-26-2024

Time Series Forecasting (TSF) is key functionality in numerous fields, including in finance, weather services, and energy management. While TSF methods are emerging these days, many of them require domain-specific data collection and model training and struggle with poor generalization performance on new domains. Foundation models aim to overcome this limitation. Pre-trained on large-scale language or time series data, they exhibit promising inferencing capabilities in new or unseen data. This has spurred a surge in new TSF foundation models. We propose a new benchmark, FoundTS, to enable thorough and fair evaluation and comparison of such models. FoundTS covers a variety of TSF foundation models, including those based on large language models and those pretrained on time series. Next, FoundTS supports different forecasting strategies, including zero-shot, few-shot, and full-shot, thereby facilitating more thorough evaluations. Finally, FoundTS offers a pipeline that standardizes evaluation processes such as dataset splitting, loading, normalization, and few-shot sampling, thereby facilitating fair evaluations. Building on this, we report on an extensive evaluation of TSF foundation models on a broad range of datasets from diverse domains and with different statistical characteristics. Specifically, we identify pros and cons and inherent limitations of existing foundation models, and we identify directions for future model design. We make our code and datasets available at https://anonymous.4open.science/r/FoundTS-C2B0.

data mining, large language model, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2410.11802

Country: North America > United States (0.48)

Genre: Research Report > New Finding (0.92)

Industry:

Energy > Power Industry (1.00)
Government > Regional Government > North America Government > United States Government (0.34)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

CATCH: Channel-Aware multivariate Time Series Anomaly Detection via Frequency Patching

Wu, Xingjian, Qiu, Xiangfei, Li, Zhengyu, Wang, Yihang, Hu, Jilin, Guo, Chenjuan, Xiong, Hui, Yang, Bin

arXiv.org Artificial IntelligenceOct-16-2024

Anomaly detection in multivariate time series is challenging as heterogeneous subsequence anomalies may occur. Reconstruction-based methods, which focus on learning nomral patterns in the frequency domain to detect diverse abnormal subsequences, achieve promising resutls, while still falling short on capturing fine-grained frequency characteristics and channel correlations. To contend with the limitations, we introduce CATCH, a framework based on frequency patching. We propose to patchify the frequency domain into frequency bands, which enhances its ability to capture fine-grained frequency characteristics. To perceive appropriate channel correlations, we propose a Channel Fusion Module (CFM), which features a patch-wise mask generator and a masked-attention mechanism. Driven by a bi-level multi-objective optimization algorithm, the CFM is encouraged to iteratively discover appropriate patch-wise channel correlations, and to cluster relevant channels while isolating adverse effects from irrelevant channels. Extensive experiments on 9 real-world datasets and 12 synthetic datasets demonstrate that CATCH achieves state-of-the-art performance.

channel correlation, data mining, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2410.12261

Country:

North America (0.28)
Asia > China (0.27)

Genre: Research Report (1.00)

Industry: Information Technology (0.92)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

ROSE: Register Assisted General Time Series Forecasting with Decomposed Frequency Learning

Wang, Yihang, Qiu, Yuying, Chen, Peng, Zhao, Kai, Shu, Yang, Rao, Zhongwen, Pan, Lujia, Yang, Bin, Guo, Chenjuan

arXiv.org Machine LearningMay-24-2024

With the increasing collection of time series data from various domains, there arises a strong demand for general time series forecasting models pre-trained on a large number of time-series datasets to support a variety of downstream prediction tasks. Enabling general time series forecasting faces two challenges: how to obtain unified representations from multi-domian time series data, and how to capture domain-specific features from time series data across various domains for adaptive transfer in downstream tasks. To address these challenges, we propose a Register Assisted General Time Series Forecasting Model with Decomposed Frequency Learning (ROSE), a novel pre-trained model for time series forecasting. ROSE employs Decomposed Frequency Learning for the pre-training task, which decomposes coupled semantic and periodic information in time series with frequency-based masking and reconstruction to obtain unified representations across domains. We also equip ROSE with a Time Series Register, which learns to generate a register codebook to capture domain-specific representations during pre-training and enhances domain-adaptive transfer by selecting related register tokens on downstream tasks. After pre-training on large-scale time series data, ROSE achieves state-of-the-art forecasting performance on 8 real-world benchmarks. Remarkably, even in few-shot scenarios, it demonstrates competitive or superior performance compared to existing methods trained with full data.

data mining, information, machine learning, (17 more...)

arXiv.org Machine Learning

2405.17478

Country:

North America > United States > California (0.14)
Asia > China (0.14)

Genre: Research Report (1.00)

Industry: Energy > Power Industry (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Token-Efficient Leverage Learning in Large Language Models

Zeng, Yuanhao, Wang, Min, Wang, Yihang, Shao, Yingxia

arXiv.org Artificial IntelligenceApr-1-2024

Large Language Models (LLMs) have excelled in various tasks but perform better in high-resource scenarios, which presents challenges in low-resource scenarios. Data scarcity and the inherent difficulty of adapting LLMs to specific tasks compound the challenge. To address the twin hurdles, we introduce \textbf{Leverage Learning}. We present a streamlined implement of this methodology called Token-Efficient Leverage Learning (TELL). TELL showcases the potential of Leverage Learning, demonstrating effectiveness across various LLMs and low-resource tasks, ranging from $10^4$ to $10^6$ tokens. It reduces task data requirements by up to nearly an order of magnitude compared to conventional Supervised Fine-Tuning (SFT) while delivering competitive performance. With the same amount of task data, TELL leads in improving task performance compared to SFT. We discuss the mechanism of Leverage Learning, suggesting it aligns with quantization hypothesis and explore its promising potential through empirical testing.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2404.00914

Country:

Asia > China (0.29)
Asia > Middle East > UAE (0.14)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Pathformer: Multi-scale transformers with Adaptive Pathways for Time Series Forecasting

Chen, Peng, Zhang, Yingying, Cheng, Yunyao, Shu, Yang, Wang, Yihang, Wen, Qingsong, Yang, Bin, Guo, Chenjuan

arXiv.org Artificial IntelligenceFeb-4-2024

Transformer-based models have achieved some success in time series forecasting. Existing methods mainly model time series from limited or fixed scales, making it challenging to capture different characteristics spanning various scales. In this paper, we propose multi-scale transformers with adaptive pathways (Pathformer). The proposed Transformer integrates both temporal resolution and temporal distance for multi-scale modeling. Multi-scale division divides the time series into different temporal resolutions using patches of various sizes. Based on the division of each scale, dual attention is performed over these patches to capture global correlations and local details as temporal dependencies. We further enrich the multi-scale transformer with adaptive pathways, which adaptively adjust the multi-scale modeling process based on the varying temporal dynamics in the input time series, improving the prediction accuracy and generalization of Pathformer. Extensive experiments on eleven real-world datasets demonstrate that Pathformer not only achieves state-of-the-art performance by surpassing all current models but also exhibits stronger generalization abilities under various transfer scenarios. Time series forecasting is an essential task for various industries, such as energy, finance, traffic, and cloud computing (Chen et al., 2012; Cirstea et al., 2022b; Qin et al., 2023; Pan et al., 2023). Motivated by its widespread application in sequence modeling and impressive success in various fields such as CV and NLP (Dosovitskiy et al., 2021; Brown et al., 2020), Transformer (Vaswani et al., 2017) receives emerging attention in time series (Wu et al., 2021; Liu et al., 2022c).

data mining, machine learning, natural language, (15 more...)

arXiv.org Artificial Intelligence

2402.05956

Country: North America > United States > California (0.28)

Genre: Research Report (1.00)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

From latent dynamics to meaningful representations

Wang, Dedi, Wang, Yihang, Evans, Luke, Tiwary, Pratyush

arXiv.org Artificial IntelligenceSep-11-2023

While representation learning has been central to the rise of machine learning and artificial intelligence, a key problem remains in making the learnt representations meaningful. For this the typical approach is to regularize the learned representation through prior probability distributions. However such priors are usually unavailable or ad hoc. To deal with this, we propose a dynamics-constrained representation learning framework. Instead of using predefined probabilities, we restrict the latent representation to follow specific dynamics, which is a more natural constraint for representation learning in dynamical systems. Our belief stems from a fundamental observation in physics that though different systems can have different marginalized probability distributions, they typically obey the same dynamics, such as Newton's and Schrodinger's equations. We validate our framework for different systems including a real-world fluorescent DNA movie dataset. We show that our algorithm can uniquely identify an uncorrelated, isometric and meaningful latent representation.

artificial intelligence, machine learning, representation, (17 more...)

arXiv.org Artificial Intelligence

2209.00905

Country: North America > United States > Maryland > Prince George's County > College Park (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)

Add feedback