AITopics | Lu, Siyuan

Collaborating Authors

Lu, Siyuan

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

PreAct: Prediction Enhances Agent's Planning Ability

Fu, Dayuan, Huang, Jianzhao, Lu, Siyuan, Dong, Guanting, Wang, Yejie, He, Keqing, Xu, Weiran

arXiv.org Artificial IntelligenceDec-4-2024

Addressing the disparity between forecasts and actual results can enable individuals to expand their thought processes and stimulate self-reflection, thus promoting accurate planning. In this research, we present **PreAct**, an agent framework that integrates **pre**diction, **rea**soning, and **act**ion. By utilizing the information derived from predictions, the large language model (LLM) agent can provide a wider range and more strategically focused reasoning. This leads to more efficient actions that aid the agent in accomplishing intricate tasks. Our experimental results show that PreAct surpasses the ReAct method in completing complex tasks and that PreAct's performance can be further improved when paired with other memory or selection strategy techniques. We presented the model with varying quantities of historical predictions and discovered that these predictions consistently enhance LLM planning.The variances in single-step reasoning between PreAct and ReAct indicate that PreAct indeed has benefits in terms of diversity and strategic orientation over ReAct.

artificial intelligence, large language model, natural language, (18 more...)

arXiv.org Artificial Intelligence

2402.11534

Country: Asia > China (0.14)

Genre: Research Report > New Finding (0.87)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

MorphAgent: Empowering Agents through Self-Evolving Profiles and Decentralized Collaboration

Lu, Siyuan, Shao, Jiaqi, Luo, Bing, Lin, Tao

arXiv.org Artificial IntelligenceOct-19-2024

The rapid advancement of Large Language Models (LLMs) (Achiam et al., 2023; Touvron et al., 2023b) has ushered in a new era of artificial intelligence, enabling the creation of sophisticated AI agents capable of tackling complex tasks across various domains (Nakajima, 2023; Torantulino, 2023). As these AI systems become more intricate, there is a growing need for effective collaboration mechanisms that allow multiple agents to work together. This collaborative approach, known as Multi-Agent Systems (MAS) (Han et al., 2024), has shown great promise in addressing challenges that are too complex or diverse for single-agent systems (Hong et al., 2024; Liu et al., 2023). While existing MAS implementations have shown promising results, they often rely on predefined roles (Li et al., 2023), centralized coordination (Guo et al., 2024; Chen et al., 2024), or rigid organizational structures (Wang et al., 2024b; Hong et al., 2024). These approaches limit cooperative resilience within MAS (Chacon-Chamorro et al., 2024), which focuses on robustness and adaptability in dynamic, unpredictable environments. Figure 1 presents two examples to illustrate the real-world challenges with details elaborated below: Example 1.1 (Domain shift). Domain shift refers to a change in the characteristics or requirements of a task as it progresses through different phases or contexts, presenting new challenges and requiring different skill sets. For instance, a scientific research project could begin with literature review, move to experiment design, and conclude with result analysis and paper writing. These transitions demand a flexible and adaptive multi-agent system that can seamlessly adjust its collaborative strategies and agent roles as the task progresses.

agent, artificial intelligence, survey article, (16 more...)

arXiv.org Artificial Intelligence

2410.15048

Genre:

Overview (0.88)
Research Report > New Finding (0.46)

Industry:

Health & Medicine (1.00)
Leisure & Entertainment (0.68)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (1.00)

Add feedback

Unleashing the Potential of Large Language Models as Prompt Optimizers: An Analogical Analysis with Gradient-based Model Optimizers

Tang, Xinyu, Wang, Xiaolei, Zhao, Wayne Xin, Lu, Siyuan, Li, Yaliang, Wen, Ji-Rong

arXiv.org Artificial IntelligenceApr-16-2024

Automatic prompt optimization is an important approach to improving the performance of large language models (LLMs). Recent research demonstrates the potential of using LLMs as prompt optimizers, which can generate improved task prompts via iterative refinement. In this paper, we propose a novel perspective to investigate the design of LLM-based prompt optimizers, by drawing an analogy with gradient-based model optimizers. To connect these two approaches, we identify two pivotal factors in model parameter learning: update direction and update method. Focused on the two aspects, we borrow the theoretical framework and learning methods from gradient-based optimization to design improved strategies for LLM-based prompt optimizers. By systematically analyzing a rich set of improvement strategies, we further develop a capable Gradient-inspired LLM-based Prompt Optimizer called GPO. At each step, it first retrieves relevant prompts from the optimization trajectory as the update direction. Then, it utilizes the generation-based refinement strategy to perform the update, while controlling the edit distance through a cosine-based decay strategy. Extensive experiments demonstrate the effectiveness and efficiency of GPO. In particular, GPO brings an additional improvement of up to 56.8% on Big-Bench Hard and 55.3% on MMLU compared to baseline methods.

current prompt, large language model, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2402.17564

Country:

Europe (1.00)
Asia > Middle East > UAE (0.14)
North America > United States > California (0.14)

Genre:

Workflow (1.00)
Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Add feedback

Fast and Accurate FSA System Using ELBERT: An Efficient and Lightweight BERT

Lu, Siyuan, Zhou, Chenchen, Xie, Keli, Lin, Jun, Wang, Zhongfeng

arXiv.org Artificial IntelligenceDec-5-2022

With the development of deep learning and Transformer-based pre-trained models like BERT, the accuracy of many NLP tasks has been dramatically improved. However, the large number of parameters and computations also pose challenges for their deployment. For instance, using BERT can improve the predictions in the financial sentiment analysis (FSA) task but slow it down, where speed and accuracy are equally important in terms of profits. To address these issues, we first propose an efficient and lightweight BERT (ELBERT) along with a novel confidence-window-based (CWB) early exit mechanism. Based on ELBERT, an innovative method to accelerate text processing on the GPU platform is developed, solving the difficult problem of making the early exit mechanism work more effectively with a large input batch size. Afterward, a fast and high-accuracy FSA system is built. Experimental results show that the proposed CWB early exit mechanism achieves significantly higher accuracy than existing early exit methods on BERT under the same computation cost. By using this acceleration method, our FSA system can boost the processing speed by nearly 40 times to over 1000 texts per second with sufficient accuracy, which is nearly twice as fast as FastBERT, thus providing a more powerful text processing capability for modern trading systems.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2211.08842

Country:

Asia (0.68)
North America > United States > Minnesota (0.28)

Genre:

Research Report > New Finding (0.66)
Research Report > Promising Solution (0.48)

Industry:

Banking & Finance > Trading (1.00)
Semiconductors & Electronics (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

PAIRS AutoGeo: an Automated Machine Learning Framework for Massive Geospatial Data

Zhou, Wang, Klein, Levente J., Lu, Siyuan

arXiv.org Artificial IntelligenceDec-12-2020

An automated machine learning framework for geospatial data named PAIRS AutoGeo is introduced on IBM PAIRS Geoscope big data and analytics platform. The framework simplifies the development of industrial machine learning solutions leveraging geospatial data to the extent that the user inputs are minimized to merely a text file containing labeled GPS coordinates. PAIRS AutoGeo automatically gathers required data at the location coordinates, assembles the training data, performs quality check, and trains multiple machine learning models for subsequent deployment. The framework is validated using a realistic industrial use case of tree species classification. Open-source tree species data are used as the input to train a random forest classifier and a modified ResNet model for 10-way tree species classification based on aerial imagery, which leads to an accuracy of $59.8\%$ and $81.4\%$, respectively. This use case exemplifies how PAIRS AutoGeo enables users to leverage machine learning without extensive geospatial expertise.

deep learning, geospatial data, neural network, (20 more...)

arXiv.org Artificial Intelligence

2012.06907

Country:

North America > United States > Texas (0.28)
North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report (0.40)

Industry:

Information Technology (1.00)
Energy (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Add feedback

An Unsupervised Clustering-Based Short-Term Solar Forecasting Methodology Using Multi-Model Machine Learning Blending

Feng, Cong, Cui, Mingjian, Hodge, Bri-Mathias, Lu, Siyuan, Hamann, Hendrik F., Zhang, Jie

arXiv.org Machine LearningMay-10-2018

Solar forecasting accuracy is affected by weather conditions, and weather awareness forecasting models are expected to improve the performance. However, it may not be available and reliable to classify different forecasting tasks by using only meteorological weather categorization. In this paper, an unsupervised clustering-based (UC-based) solar forecasting methodology is developed for short-term (1-hour-ahead) global horizontal irradiance (GHI) forecasting. This methodology consists of three parts: GHI time series unsupervised clustering, pattern recognition, and UC-based forecasting. The daily GHI time series is first clustered by an Optimized Cross-validated ClUsteRing (OCCUR) method, which determines the optimal number of clusters and best clustering results. Then, support vector machine pattern recognition (SVM-PR) is adopted to recognize the category of a certain day using the first few hours' data in the forecasting stage. GHI forecasts are generated by the most suitable models in different clusters, which are built by a two-layer Machine learning based Multi-Model (M3) forecasting framework. The developed UC-based methodology is validated by using 1-year of data with six solar features. Numerical results show that (i) UC-based models outperform non-UC (all-in-one) models with the same M3 architecture by approximately 20%; (ii) M3-based models also outperform the single-algorithm machine learning (SAML) models by approximately 20%.

forecasting, renewable energy, survey article, (18 more...)

arXiv.org Machine Learning

1805.04193

Country:

North America > United States > Texas (0.14)
North America > United States > Colorado (0.14)

Genre: Research Report (0.84)

Industry:

Energy > Renewable > Solar (1.00)
Energy > Power Industry (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Add feedback