Collaborating Authors: Yang, Jingyi


Shall Your Data Strategy Work? Perform a Swift Study

arXiv.org Artificial Intelligence

This work presents a swift method to assess the efficacy of particular types of instruction-tuning data, utilizing just a handful of probe examples and eliminating the need for model retraining. The method builds on gradient-based data influence estimation: it projects the gradients of probe examples from the chosen strategy onto the gradients of evaluation examples to gauge the strategy's benefit. Building upon this method, we conducted three swift studies to investigate the potential of Chain-of-Thought (CoT) data, query clarification data, and response evaluation data for enhancing model generalization. We then conducted a validation study to corroborate the findings of these swift studies: for each studied strategy, we constructed a tailored training dataset and compared model performance with and without it. The results of the validation study aligned with the findings of the swift studies, confirming the efficacy of our proposed method.
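
A minimal sketch of the gradient-projection idea in PyTorch (the paper's exact estimator may differ; `loss_fn` and the batch format are hypothetical): the influence of probe data on evaluation data is approximated by the alignment of their loss gradients.

```python
# Sketch of gradient-based influence estimation via gradient projection
# (assumed formulation: influence ~ similarity of probe and eval gradients).
import torch

def flat_grad(model, loss):
    """Flatten the gradient of a scalar `loss` w.r.t. all trainable parameters."""
    params = [p for p in model.parameters() if p.requires_grad]
    grads = torch.autograd.grad(loss, params)
    return torch.cat([g.reshape(-1) for g in grads])

def influence(model, loss_fn, probe_batch, eval_batch):
    """Cosine similarity between probe and evaluation gradients.
    A positive score suggests training on the probe data would move the
    model in a direction that also lowers the evaluation loss."""
    g_probe = flat_grad(model, loss_fn(model, probe_batch))
    g_eval = flat_grad(model, loss_fn(model, eval_batch))
    return torch.nn.functional.cosine_similarity(g_probe, g_eval, dim=0)
```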


STGCN-LSTM for Olympic Medal Prediction: Dynamic Power Modeling and Causal Policy Optimization

arXiv.org Artificial Intelligence

This paper proposes a novel hybrid model, STGCN-LSTM, to forecast Olympic medal distributions by integrating the spatio-temporal relationships among countries and the long-term dependencies of national performance. The Spatial-Temporal Graph Convolution Network (STGCN) captures geographic and interactive factors, such as coaching exchange and socio-economic links, while the Long Short-Term Memory (LSTM) module models historical trends in medal counts, economic data, and demographics. To address zero-inflated outputs (i.e., the disparity between countries that consistently win medals and those that have never won any), a Zero-Inflated Compound Poisson (ZICP) framework is incorporated to separate random zeros from structural zeros, providing a clearer view of potential breakthrough performances. Validation includes historical backtesting, policy shock simulations, and causal inference checks, confirming the robustness of the proposed method. Results shed light on the influence of coaching mobility, event specialization, and strategic investment on medal forecasts, offering a data-driven foundation for optimizing sports policies and resource allocation in diverse Olympic contexts.
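
To illustrate the zero-inflation idea, here is a sketch using a plain zero-inflated Poisson as a stand-in for the paper's compound Poisson: a country produces a structural zero with probability `pi`, otherwise a Poisson draw, so zeros can arise two ways.

```python
# Zero-inflated count likelihood sketch (simplified stand-in for ZICP):
# P(y=0) = pi + (1-pi)*exp(-lam), P(y=k>0) = (1-pi)*Poisson(k; lam).
import numpy as np
from scipy.stats import poisson

def zip_log_likelihood(y, pi, lam):
    """Log-likelihood of medal counts y under a zero-inflated Poisson."""
    y = np.asarray(y)
    ll = np.where(
        y == 0,
        np.log(pi + (1.0 - pi) * np.exp(-lam)),     # structural or random zero
        np.log(1.0 - pi) + poisson.logpmf(y, lam),  # positive counts
    )
    return ll.sum()
```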


Context Parallelism for Scalable Million-Token Inference

arXiv.org Artificial Intelligence

We present context parallelism for long-context large language model inference, which achieves near-linear scaling of long-context prefill latency with up to 128 H100 GPUs across 16 nodes. In particular, our method achieves 1M-token context prefill with the Llama3 405B model in 77s (93% parallelization efficiency, 63% FLOPS utilization) and 128K context prefill in 3.8s. We develop two lossless, exact ring attention variants, pass-KV and pass-Q, that cover a wide range of use cases with state-of-the-art performance: full prefill, persistent-KV prefill, and decode. Benchmarks on H100 GPU hosts interconnected with RDMA and with TCP both show similar scalability for long-context prefill, demonstrating that our method scales well in common commercial data centers with medium-to-low inter-host bandwidth.
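
A single-process sketch of the pass-KV communication pattern (the pass-Q variant, which rotates query shards instead, is analogous): each rank owns a query shard and a KV shard, KV shards rotate around the ring, and each rank folds incoming blocks into a running online softmax. This is a non-causal numpy simulation for clarity, not the distributed implementation.

```python
# Simulation of pass-KV ring attention with online-softmax accumulation.
import numpy as np

def ring_attention_pass_kv(Q, K, V, world_size):
    d = Q.shape[-1]
    Qs = np.array_split(Q, world_size)          # per-rank query shards
    Ks = np.array_split(K, world_size)          # per-rank KV shards
    Vs = np.array_split(V, world_size)
    outs = []
    for r in range(world_size):                 # loop stands in for the ranks
        q = Qs[r]
        m = np.full(q.shape[0], -np.inf)        # running row max
        l = np.zeros(q.shape[0])                # running softmax denominator
        acc = np.zeros_like(q)                  # running weighted-V sum
        for step in range(world_size):          # one ring rotation per step
            src = (r - step) % world_size       # which KV shard arrives now
            s = q @ Ks[src].T / np.sqrt(d)
            m_new = np.maximum(m, s.max(axis=1))
            scale = np.exp(m - m_new)           # rescale old accumulators
            p = np.exp(s - m_new[:, None])
            l = l * scale + p.sum(axis=1)
            acc = acc * scale[:, None] + p @ Vs[src]
            m = m_new
        outs.append(acc / l[:, None])
    return np.vstack(outs)                      # matches full softmax attention
```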


FedPAE: Peer-Adaptive Ensemble Learning for Asynchronous and Model-Heterogeneous Federated Learning

arXiv.org Artificial Intelligence

Federated learning (FL) enables multiple clients with distributed data sources to collaboratively train a shared model without compromising data privacy. However, existing FL paradigms face challenges due to heterogeneity in client data distributions and system capabilities. Personalized federated learning (pFL) has been proposed to mitigate these problems, but often requires a shared model architecture and a central entity for parameter aggregation, resulting in scalability and communication issues. More recently, model-heterogeneous FL has gained attention due to its ability to support diverse client models, but existing methods are limited by their dependence on a centralized framework, synchronized training, and publicly available datasets. To address these limitations, we introduce Federated Peer-Adaptive Ensemble Learning (FedPAE), a fully decentralized pFL algorithm that supports model heterogeneity and asynchronous learning. Our approach utilizes a peer-to-peer model sharing mechanism and ensemble selection to achieve a more refined balance between local and global information. Experimental results show that FedPAE outperforms existing state-of-the-art pFL algorithms, effectively managing diverse client capabilities and demonstrating robustness against statistical heterogeneity.
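
The abstract does not spell out the selection procedure, so here is one plausible sketch of peer-adaptive ensemble selection: greedy forward selection of received peer models, keeping a peer only if it improves majority-vote accuracy on the client's local validation data. All names are illustrative.

```python
# Hypothetical ensemble-selection sketch for a FedPAE-style client.
import numpy as np

def ensemble_acc(models, X, y):
    """Majority-vote accuracy of an ensemble (assumes integer class labels)."""
    votes = np.stack([m.predict(X) for m in models])  # (n_models, n_samples)
    preds = np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)
    return (preds == y).mean()

def select_peers(local_model, peer_models, X_val, y_val):
    """Start from the local model; add a peer model only if it helps."""
    chosen = [local_model]
    best = ensemble_acc(chosen, X_val, y_val)
    for peer in peer_models:
        cand = ensemble_acc(chosen + [peer], X_val, y_val)
        if cand > best:
            chosen.append(peer)
            best = cand
    return chosen
```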


Design and Verification of a Novel Triphibian Robot

arXiv.org Artificial Intelligence

Multi-modal robots extend their operation from one working medium to another, from land to air for example. The majority of multi-modal robots are platforms that operate in two different media; for all-terrain tasks, however, there is little research to date in the literature. Generally, locomotion in different working media, i.e., land, water, and air, requires different propelling actuators, and thus a triphibian system becomes bulky. To overcome this challenge, we propose a triphibian robot whose driving forces support all-terrain operation in an efficient way. A morphable mechanism is designed to enable the transition between different motion modes; specifically, a cylindrical body is implemented as the rolling mechanism in land mode. Detailed design principles of the different mechanisms and the transitions between the various locomotion modes are analyzed. Finally, a triphibian robot prototype is fabricated and tested in various working media with both mono-modal and multi-modal functionalities. Experiments verify our platform, and the results show promising adaptability for future exploration tasks in various working scenarios.


Cardinality Estimation in DBMS: A Comprehensive Benchmark Evaluation

arXiv.org Artificial Intelligence

Cardinality estimation (CardEst) plays a significant role in generating high-quality query plans for a query optimizer in a DBMS. In the last decade, an increasing number of advanced CardEst methods (especially ML-based ones) have been proposed, with outstanding estimation accuracy and inference latency. However, no study has systematically evaluated the quality of these methods or answered the fundamental question: to what extent can these methods improve the performance of the query optimizer in real-world settings, which is the ultimate goal of a CardEst method? In this paper, we comprehensively and systematically compare the effectiveness of CardEst methods in a real DBMS. We establish a new benchmark for CardEst, which contains a new complex real-world dataset, STATS, and a diverse query workload, STATS-CEB. We integrate the most representative CardEst methods into the open-source database system PostgreSQL and comprehensively evaluate their true effectiveness in improving query plan quality, as well as other important aspects affecting their applicability, ranging from inference latency, model size, and training time to update efficiency and accuracy. We obtain a number of key findings for the CardEst methods under different data and query settings. Furthermore, we find that the widely used estimation accuracy metric (Q-Error) cannot distinguish the importance of different sub-plan queries during query optimization and thus cannot truly reflect the quality of the query plans generated by CardEst methods. Therefore, we propose a new metric, P-Error, to evaluate the performance of CardEst methods, which overcomes the limitation of Q-Error and is able to reflect the overall end-to-end performance of CardEst methods. We have made all of the benchmark data and evaluation code publicly available at https://github.com/Nathaniel-Han/End-to-End-CardEst-Benchmark.
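
A sketch of the contrast between the two metrics: Q-Error scores a single estimate in isolation, while a P-Error-style metric scores the plan that the estimates induce. Here `choose_plan` and `plan_cost` are hypothetical stand-ins for the optimizer's plan search and cost model; the paper's exact definition may differ in detail.

```python
# Q-Error vs. a P-Error-style plan-quality metric (sketch).
def q_error(est: float, true: float) -> float:
    """max(est/true, true/est); cardinalities clamped to >= 1."""
    est, true = max(est, 1.0), max(true, 1.0)
    return max(est / true, true / est)

def p_error(query, est_cards, true_cards, choose_plan, plan_cost):
    """Cost, under true cardinalities, of the plan picked with estimated
    cardinalities, relative to the plan picked with true cardinalities."""
    plan_est = choose_plan(query, est_cards)    # plan the optimizer actually picks
    plan_true = choose_plan(query, true_cards)  # plan it would pick ideally
    return plan_cost(plan_est, true_cards) / plan_cost(plan_true, true_cards)
```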


Partially Interpretable Estimators (PIE): Black-Box-Refined Interpretable Machine Learning

arXiv.org Artificial Intelligence

We propose Partially Interpretable Estimators (PIE), which attribute a prediction to individual features via an interpretable model, while a (possibly) small part of the PIE prediction is attributed to feature interactions via a black-box model, with the goal of boosting predictive performance while maintaining interpretability. As such, the interpretable model captures the main contributions of individual features, and the black-box model attempts to complement the interpretable piece by capturing the "nuances" of feature interactions as a refinement. We design an iterative training algorithm to jointly train the two types of models. Experimental results show that PIE is highly competitive with black-box models while outperforming interpretable baselines. In addition, the understandability of PIE is comparable to that of simple linear models, as validated via a human evaluation.
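
A minimal sketch of the PIE idea with scikit-learn pieces: a linear model carries the per-feature attribution, and a black-box model is fit to its residuals to capture interactions. The alternating residual-fitting loop is an assumption for illustration; the paper's joint training algorithm may differ.

```python
# Sketch: interpretable main effects + black-box residual refinement.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.ensemble import GradientBoostingRegressor

def fit_pie(X, y, rounds=3):
    linear = Ridge(alpha=1.0)                   # interpretable part
    blackbox = GradientBoostingRegressor()      # interaction "nuances"
    bb_pred = np.zeros(len(y))
    for _ in range(rounds):
        linear.fit(X, y - bb_pred)              # main effects on current residuals
        blackbox.fit(X, y - linear.predict(X))  # refine what the linear part misses
        bb_pred = blackbox.predict(X)
    predict = lambda Xn: linear.predict(Xn) + blackbox.predict(Xn)
    return linear, blackbox, predict
```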