Wang, Daoyu
A Survey on Knowledge-Oriented Retrieval-Augmented Generation
Cheng, Mingyue, Luo, Yucong, Ouyang, Jie, Liu, Qi, Liu, Huijie, Li, Li, Yu, Shuo, Zhang, Bohou, Cao, Jiawei, Ma, Jie, Wang, Daoyu, Chen, Enhong
Retrieval-Augmented Generation (RAG) has gained significant attention in recent years for its potential to enhance natural language understanding and generation by combining large-scale retrieval systems with generative models. RAG leverages external knowledge sources, such as documents, databases, or structured data, to improve model performance and generate more accurate and contextually relevant outputs. This survey aims to provide a comprehensive overview of RAG by examining its fundamental components, including retrieval mechanisms, generation processes, and the integration between the two. We discuss the key characteristics of RAG, such as its ability to augment generative models with dynamic external knowledge, and the challenges associated with aligning retrieved information with generative objectives. We also present a taxonomy that categorizes RAG methods, ranging from basic retrieval-augmented approaches to more advanced models incorporating multi-modal data and reasoning capabilities. Additionally, we review the evaluation benchmarks and datasets commonly used to assess RAG systems, along with a detailed exploration of its applications in fields such as question answering, summarization, and information retrieval. Finally, we highlight emerging research directions and opportunities for improving RAG systems, such as enhanced retrieval efficiency, model interpretability, and domain-specific adaptations. This paper concludes by outlining the prospects for RAG in addressing real-world challenges and its potential to drive further advancements in natural language processing.
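The retrieve-then-generate pattern the survey examines can be sketched minimally. The snippet below is an illustrative toy, not a method from the survey: it uses a bag-of-words cosine retriever and assembles a prompt for a downstream generator; all names (`retrieve`, `build_prompt`, the sample corpus) are hypothetical.

```python
from collections import Counter
import math

def embed(text):
    # Toy bag-of-words "embedding": token counts (real RAG systems use dense encoders).
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, corpus, k=2):
    # Rank corpus passages by similarity to the query; keep the top k.
    q = embed(query)
    return sorted(corpus, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(query, passages):
    # Augmentation step: splice retrieved passages into the generator's prompt.
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using the context below.\nContext:\n{context}\nQuestion: {query}"

corpus = [
    "RAG combines retrieval with generation.",
    "Transformers use self-attention.",
    "Diffusion models add and remove noise.",
]
passages = retrieve("What does RAG combine?", corpus)
prompt = build_prompt("What does RAG combine?", passages)
```

The retriever, augmenter, and (here, omitted) generator correspond to the three components whose integration the survey analyzes.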
Diffusion Auto-regressive Transformer for Effective Self-supervised Time Series Forecasting
Wang, Daoyu, Cheng, Mingyue, Liu, Zhiding, Liu, Qi, Chen, Enhong
Self-supervised learning has become a popular and effective approach for enhancing time series forecasting, enabling models to learn universal representations from unlabeled data. However, effectively capturing both global sequence dependencies and local detail features within time series data remains challenging. To address this, we propose a novel generative self-supervised method called TimeDART, denoting Diffusion Auto-regressive Transformer for Time series forecasting. In TimeDART, we treat time series patches as basic modeling units. Specifically, we employ a self-attention-based Transformer encoder to model inter-patch dependencies. Additionally, we introduce diffusion and denoising mechanisms to capture local detail features within each patch. Notably, we design a cross-attention-based denoising decoder that allows for adjustable optimization difficulty in the self-supervised task, facilitating more effective self-supervised pre-training. Furthermore, the entire model is optimized in an auto-regressive manner to obtain transferable representations. Extensive experiments demonstrate that TimeDART achieves state-of-the-art fine-tuning performance compared with the most advanced competitive methods in forecasting tasks.

Time series forecasting (Harvey, 1990; Hamilton, 2020; Box et al., 2015; Cheng et al., 2024b) is crucial in a wide array of domains, including finance (Black & Scholes, 1973), healthcare (Cheng et al., 2024c), and energy management (Zhou et al., 2024). Accurate predictions of future data points enable better decision-making, resource allocation, and risk management, ultimately leading to significant operational improvements and strategic advantages. Among the various methods developed for time series forecasting (Miller et al., 2024), deep neural networks (Ding et al., 2024; Jin et al., 2023; Cao et al., 2023; Cheng et al., 2024b) have emerged as a popular and effective solution paradigm.
To further enhance the performance of time series forecasting, self-supervised learning has become an increasingly popular research paradigm (Nie et al., 2022).
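The two ingredients the TimeDART abstract names first, patches as basic modeling units and auto-regressive optimization, can be sketched as follows. This is a minimal illustration under assumed conventions (non-overlapping patches, next-patch prediction targets); the diffusion/denoising decoder is omitted, and the function names are hypothetical.

```python
import numpy as np

def make_patches(series, patch_len):
    # Split a 1-D series into non-overlapping patches (the basic modeling units).
    n = len(series) // patch_len
    return series[: n * patch_len].reshape(n, patch_len)

def autoregressive_pairs(patches):
    # Causal pre-training targets: each prefix of patches predicts the next patch.
    return [(patches[:i], patches[i]) for i in range(1, len(patches))]

series = np.arange(12, dtype=float)          # toy series 0..11
patches = make_patches(series, patch_len=3)  # 4 patches of length 3
pairs = autoregressive_pairs(patches)        # 3 (context, target) pairs
```

In the actual model, a Transformer encoder would consume the context patches and a denoising decoder would reconstruct the (noised) target patch.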
FDF: Flexible Decoupled Framework for Time Series Forecasting with Conditional Denoising and Polynomial Modeling
Zhang, Jintao, Cheng, Mingyue, Tao, Xiaoyu, Liu, Zhiding, Wang, Daoyu
Time series forecasting is vital in numerous web applications, influencing critical decision-making across industries. While diffusion models have recently gained increasing popularity for this task, we argue they suffer from a significant drawback: indiscriminate noise addition to the original time series followed by denoising, which can obscure the underlying evolving trend and complicate forecasting. To address this limitation, we propose a novel flexible decoupled framework (FDF) that learns high-quality time series representations for enhanced forecasting performance. A key characteristic of our approach is that it leverages the inherent inductive bias of time series data: its decomposition into trend and seasonal components, each modeled separately to enable decoupled analysis and modeling. Specifically, we propose an innovative Conditional Denoising Seasonal Module (CDSM) within the diffusion model, which leverages statistical information from the historical window to conditionally model the complex seasonal component. Notably, we incorporate a Polynomial Trend Module (PTM) to effectively capture the smooth trend component, thereby enhancing the model's ability to represent temporal dependencies. Extensive experiments validate the effectiveness of our framework, demonstrating superior performance over existing methods and highlighting its flexibility in time series forecasting. The source code is available at https://github.com/zjt-gpu/FDF.
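The trend/seasonal decoupling that FDF builds on can be illustrated with a simple polynomial fit. This sketch only mimics the decomposition step: a low-order polynomial stands in for the smooth trend, and the residual stands in for the seasonal component that FDF's conditional diffusion module would model; it is not the authors' implementation.

```python
import numpy as np

def decompose(series, degree=2):
    # Fit a low-order polynomial as the smooth trend; the residual
    # approximates the seasonal component. Illustrative only -- FDF
    # models the seasonal residual with a conditional denoising module.
    t = np.arange(len(series))
    coeffs = np.polyfit(t, series, degree)
    trend = np.polyval(coeffs, t)
    seasonal = series - trend
    return trend, seasonal

t = np.arange(48)
series = 0.1 * t + np.sin(2 * np.pi * t / 12)  # linear trend + period-12 seasonality
trend, seasonal = decompose(series)
```

By construction the two components sum back to the original series, so each can be forecast separately and recombined.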
Revisiting the Solution of Meta KDD Cup 2024: CRAG
Ouyang, Jie, Luo, Yucong, Cheng, Mingyue, Wang, Daoyu, Yu, Shuo, Liu, Qi, Chen, Enhong
This paper presents the solution of our team APEX in the Meta KDD CUP 2024: CRAG Comprehensive RAG Benchmark Challenge. The CRAG benchmark addresses the limitations of existing QA benchmarks in evaluating the diverse and dynamic challenges faced by Retrieval-Augmented Generation (RAG) systems. It provides a more comprehensive assessment of RAG performance and contributes to advancing research in this field. We propose a routing-based, domain- and dynamism-adaptive RAG pipeline, which applies question-specific processing across all three stages: retrieval, augmentation, and generation. Our method achieved superior performance on CRAG and ranked 2nd on the final competition leaderboard for Tasks 2 and 3. Our implementation is available at this link: https://github.com/USTCAGI/CRAG-in-KDD-Cup2024.
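The routing idea described in the abstract, dispatching each question to domain-specific processing before retrieval, can be sketched as a simple dispatcher. Everything here (the `route` function, keyword matchers, domain names) is a hypothetical illustration, not the APEX team's actual router.

```python
def route(question, routers):
    # Return the first domain whose matcher fires; fall back to open-domain handling.
    for domain, match in routers.items():
        if match(question):
            return domain
    return "open"

# Toy keyword matchers; a real router might use a classifier over the question.
routers = {
    "finance": lambda q: "stock" in q.lower() or "price" in q.lower(),
    "sports":  lambda q: "score" in q.lower(),
}
```

Each routed domain could then configure its own retrieval sources, augmentation templates, and generation settings, matching the stage-specific processing the abstract describes.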