AITopics | Yao, Jiawei

Yao, Jiawei

KABB: Knowledge-Aware Bayesian Bandits for Dynamic Expert Coordination in Multi-Agent Systems

Zhang, Jusheng, Huang, Zimeng, Fan, Yijia, Liu, Ningyuan, Li, Mingyan, Yang, Zhuojie, Yao, Jiawei, Wang, Jian, Wang, Keze

arXiv.org Artificial Intelligence | Feb-11-2025

As scaling large language models faces prohibitive costs, multi-agent systems emerge as a promising alternative, though challenged by static knowledge assumptions and coordination inefficiencies. We introduce Knowledge-Aware Bayesian Bandits (KABB), a novel framework that enhances multi-agent system coordination through semantic understanding and dynamic adaptation. The framework features three key innovations: a three-dimensional knowledge distance model for deep semantic understanding, a dual-adaptation mechanism for continuous expert optimization, and a knowledge-aware Thompson Sampling strategy for efficient expert selection.

Multi-Agent Systems (MAS) (Guo et al., 2024b) offer a promising alternative by coordinating multiple specialized agents to achieve superior performance compared to individual systems while maintaining manageable computational costs and budgets. Recent advances in MAS have led to the development of several frameworks. For example, the Mixture of Agents (MoA) (Wang et al., 2024) employs multiple LLMs as proposers to iteratively refine responses, with a central aggregator delivering the final output. Although MoA has demonstrated robustness and scalability in deployment, its computational cost scales linearly with the number of agents, and significant redundancy and noise become a …
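For readers unfamiliar with Thompson Sampling, the sketch below shows the plain Beta-Bernoulli version applied to choosing among experts. It is not the knowledge-aware variant the abstract describes: it has no knowledge distance model and no dual-adaptation mechanism. The function names, the uniform Beta(1, 1) prior, and the simulated success rates are all assumptions made for illustration.

```python
import random


def thompson_select(successes, failures, rng):
    """Sample a success rate per expert from its Beta posterior; pick the largest."""
    samples = [
        rng.betavariate(s + 1, f + 1)  # Beta(1, 1) prior, an illustrative assumption
        for s, f in zip(successes, failures)
    ]
    return max(range(len(samples)), key=samples.__getitem__)


def run_bandit(true_rates, rounds, seed=0):
    """Simulate expert selection against hypothetical per-expert success rates."""
    rng = random.Random(seed)
    n = len(true_rates)
    successes = [0] * n
    failures = [0] * n
    for _ in range(rounds):
        k = thompson_select(successes, failures, rng)
        # Simulated feedback: expert k answers well with probability true_rates[k].
        if rng.random() < true_rates[k]:
            successes[k] += 1
        else:
            failures[k] += 1
    return successes, failures
```

Calling run_bandit([0.1, 0.9], rounds=500) returns per-expert success and failure counts; over time the selection concentrates on the expert with the higher simulated rate while still occasionally exploring the other.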

artificial intelligence, machine learning, submission and formatting instruction, (14 more...)

2502.0735

Country: North America > United States > Ohio (0.14)

Genre: Research Report > New Finding (0.92)

Industry:

Health & Medicine (0.68)
Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.98)

Enhancing Multimodal Large Language Models Complex Reason via Similarity Computation

Zhang, Xiaofeng, Zeng, Fanshuo, Quan, Yihao, Hui, Zheng, Yao, Jiawei

arXiv.org Artificial Intelligence | Dec-12-2024

Multimodal large language models have experienced rapid growth, and numerous different models have emerged. The interpretability of large vision-language models (LVLMs) remains an under-explored area. Especially when faced with more complex tasks such as chain-of-thought reasoning, their internal mechanisms still resemble a black box that is difficult to decipher. By studying the interaction and information flow between images and text, we noticed that in models such as LLaVA1.5, image tokens that are semantically related to text are more likely to have information flow convergence in the LLM decoding layer, and these image tokens receive higher attention scores. However, image tokens that are less relevant to the text do not show information flow convergence, and they receive only very small attention scores. To efficiently utilize the image information, we propose a new image token reduction method, Simignore, which aims to improve the complex reasoning ability of LVLMs by computing the similarity between image and text embeddings and ignoring image tokens that are irrelevant and unimportant to the text. Through extensive experiments, we demonstrate the effectiveness of our method for complex reasoning tasks. The paper's source code can be accessed from https://github.com/FanshuoZeng/Simignore.
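The idea of ignoring image tokens with low similarity to the text can be sketched as follows. This is an illustration only, not the Simignore implementation: the abstract does not specify the similarity measure, how scores are aggregated over text tokens, or how many tokens are kept, so cosine similarity, the max over text tokens, and the keep parameter are all assumptions, and the function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def keep_relevant_tokens(image_tokens, text_tokens, keep):
    """Return indices, in original order, of the `keep` image tokens most similar to the text.

    Each image token is scored by its best cosine match against any text token
    (an assumed aggregation rule); the remaining tokens would be ignored.
    """
    scores = [
        max(cosine(img, txt) for txt in text_tokens)
        for img in image_tokens
    ]
    ranked = sorted(range(len(image_tokens)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:keep])
```

With toy embeddings image_tokens = [[1, 0], [0, 1], [0.9, 0.1], [-1, 0]] and text_tokens = [[1, 0]], calling keep_relevant_tokens(image_tokens, text_tokens, keep=2) returns [0, 2], the two image tokens pointing in roughly the same direction as the text.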

artificial intelligence, large language model, natural language, (15 more...)

2412.09817

Country: North America (0.28)

Genre: Research Report (0.82)

Industry: Education (0.46)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Customized Multiple Clustering via Multi-Modal Subspace Proxy Learning

Yao, Jiawei, Qian, Qi, Hu, Juhua

arXiv.org Artificial Intelligence | Nov-6-2024

Multiple clustering aims to discover various latent structures of data from different aspects. Deep multiple clustering methods have achieved remarkable performance by exploiting complex patterns and relationships in data. However, existing works struggle to flexibly adapt to diverse user-specific needs in data grouping, which may require manual understanding of each clustering. To address these limitations, we introduce Multi-Sub, a novel end-to-end multiple clustering approach that incorporates a multi-modal subspace proxy learning framework. Utilizing the synergistic capabilities of CLIP and GPT-4, Multi-Sub aligns textual prompts expressing user preferences with their corresponding visual representations. This is achieved by automatically generating proxy words from large language models that act as subspace bases, thus allowing data to be represented in terms specific to the user's interests. Our method consistently outperforms existing baselines across a broad set of datasets in visual multiple clustering tasks.

large language model, machine learning, natural language, (20 more...)

2411.03978

Country: North America > United States > Washington > Pierce County > Tacoma (0.14)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.88)

Swift Sampler: Efficient Learning of Sampler by 10 Parameters

Yao, Jiawei, Li, Chuming, Xiao, Canran

arXiv.org Artificial Intelligence | Oct-7-2024

Data selection is essential for training deep learning models. An effective data sampler assigns proper sampling probabilities to training data and helps the model converge to a good local minimum with high performance. Previous studies in data sampling are mainly based on heuristic rules or on learning through a huge number of time-consuming trials. In this paper, we propose an automatic swift sampler search algorithm, SS, to learn effective samplers efficiently. In particular, SS utilizes a novel formulation to map a sampler to a low-dimensional set of hyper-parameters and uses an approximated local minimum to quickly examine the quality of a sampler. Benefiting from its low computational expense, SS can be applied to large-scale data sets with high efficiency. Comprehensive experiments on various tasks demonstrate that SS-powered sampling can achieve clear improvements (e.g., 1.5% on ImageNet) and transfer among different neural networks. Project page: https://github.com/Alexander-Yao/Swift-Sampler.

artificial intelligence, machine learning, sampler, (12 more...)

2410.05578

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)

Inverse Moment Methods for Sufficient Forecasting using High-Dimensional Predictors

Luo, Wei, Xue, Lingzhou, Yao, Jiawei

arXiv.org Machine Learning | Apr-30-2017

We consider forecasting a single time series using high-dimensional predictors in the presence of a possible nonlinear forecast function. The sufficient forecasting (Fan et al., 2016) used sliced inverse regression to estimate lower-dimensional sufficient indices for nonparametric forecasting using factor models. However, Fan et al. (2016) is fundamentally limited to the inverse first-moment method, as it assumes a restricted, fixed number of factors, a linearity condition for factors, and a monotone effect of factors on the response. In this work, we study the inverse second-moment method using directional regression and the inverse third-moment method to extend the methodology and applicability of the sufficient forecasting. As the number of factors diverges with the dimension of predictors, the proposed method relaxes the distributional assumption of the predictor and enhances the capability of capturing the non-monotone effect of factors on the response. We not only provide a high-dimensional analysis of inverse moment methods such as exhaustiveness and rate of convergence, but also prove their model selection consistency. The power of our proposed methods is demonstrated in both simulation studies and an empirical study of forecasting monthly macroeconomic data from Q1 1959 to Q1 2016. During our theoretical development, we prove an invariance result for inverse moment methods, which makes a separate contribution to the sufficient dimension reduction.

artificial intelligence, banking & finance, regression, (17 more...)

1705.00395

Country: North America > United States (0.28)

Genre: Research Report (0.81)

Industry: Banking & Finance (1.00)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.88)

Sufficient Forecasting Using Factor Models

Fan, Jianqing, Xue, Lingzhou, Yao, Jiawei

arXiv.org Machine Learning | Dec-24-2015

We consider forecasting a single time series when there is a large number of predictors and a possible nonlinear effect. The dimensionality is first reduced via a high-dimensional (approximate) factor model implemented by principal component analysis. Using the extracted factors, we develop a novel forecasting method called the sufficient forecasting, which provides a set of sufficient predictive indices, inferred from high-dimensional predictors, to deliver additional predictive power. Projected principal component analysis is employed to enhance the accuracy of inferred factors when a semi-parametric (approximate) factor model is assumed. Our method is also applicable to cross-sectional sufficient regression using extracted factors. The connection between the sufficient forecasting and the deep learning architecture is explicitly stated. The sufficient forecasting correctly estimates projection indices of the underlying factors even in the presence of a nonparametric forecasting function. The proposed method extends the sufficient dimension reduction to high-dimensional regimes by condensing the cross-sectional information through factor models. We derive asymptotic properties for the estimate of the central subspace spanned by these projection directions as well as the estimates of the sufficient predictive indices. We further show that the natural method of running multiple regression of target on estimated factors yields a linear estimate that actually falls into this central subspace. Our method and theory allow the number of predictors to be larger than the number of observations. We finally demonstrate that the sufficient forecasting improves upon the linear forecasting in both simulation studies and an empirical study of forecasting macroeconomic variables.

deep learning, forecasting, neural network, (22 more...)

1505.07414

Country: North America > United States (0.93)

Genre: Research Report (0.81)

Industry:

Health & Medicine (1.00)
Banking & Finance > Economy (0.92)
Government (0.67)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.44)
