AITopics | Wang, Wenyi

Wang, Wenyi

How to Correctly do Semantic Backpropagation on Language-based Agentic Systems

Wang, Wenyi, Alyahya, Hisham A., Ashley, Dylan R., Serikov, Oleg, Khizbullin, Dmitrii, Faccio, Francesco, Schmidhuber, Jürgen

arXiv.org Machine Learning, Dec-4-2024

Language-based agentic systems have shown great promise in recent years, transitioning from solving small-scale research problems to being deployed in challenging real-world tasks. However, optimizing these systems often requires substantial manual labor. Recent studies have demonstrated that these systems can be represented as computational graphs, enabling automatic optimization. Despite these advancements, most current efforts in Graph-based Agentic System Optimization (GASO) fail to properly assign feedback to the system's components given feedback on the system's output. To address this challenge, we formalize the concept of semantic backpropagation with semantic gradients -- a generalization that aligns several key optimization techniques, including reverse-mode automatic differentiation and the more recent TextGrad by exploiting the relationship among nodes with a common successor. This serves as a method for computing directional information about how changes to each component of an agentic system might improve the system's output. To use these gradients, we propose a method called semantic gradient descent which enables us to solve GASO effectively. Our results on both BIG-Bench Hard and GSM8K show that our approach outperforms existing state-of-the-art methods for solving GASO problems. A detailed ablation study on the LIAR dataset demonstrates the parsimonious nature of our method. A full copy of our implementation is publicly available at https://github.com/HishamAlyahya/semantic_backprop
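
The idea of assigning feedback to a system's components from feedback on its output can be illustrated with a toy reverse traversal of a computational graph. The sketch below is not the authors' implementation: the graph, the node names, and the `stub_feedback` function (which stands in for a language-model call) are all hypothetical. It only shows the general shape of the computation, where each predecessor receives feedback that also takes into account the other nodes sharing the same successor.

```python
from collections import defaultdict

# Hypothetical toy graph: node -> list of predecessor nodes.
PREDECESSORS = {
    "question": [],
    "retrieve": ["question"],
    "reason": ["question", "retrieve"],
    "answer": ["reason"],
}


def topological_order(preds):
    """Return nodes so that every node appears after its predecessors."""
    order, seen = [], set()

    def visit(node):
        if node in seen:
            return
        seen.add(node)
        for p in preds[node]:
            visit(p)
        order.append(node)

    for node in preds:
        visit(node)
    return order


def backward_feedback(preds, output_node, output_feedback, make_feedback):
    """Propagate textual feedback from the output node back to every ancestor."""
    feedback = defaultdict(list)
    feedback[output_node].append(output_feedback)
    # Visit successors before their predecessors.
    for node in reversed(topological_order(preds)):
        if not feedback[node]:
            continue
        combined = " | ".join(feedback[node])
        for p in preds[node]:
            # Other predecessors of the same successor are passed as context.
            siblings = [q for q in preds[node] if q != p]
            feedback[p].append(make_feedback(p, node, siblings, combined))
    return dict(feedback)


def stub_feedback(pred, succ, siblings, succ_feedback):
    # Assumption: a real system would query a language model here.
    return f"{pred}->{succ} (with {siblings}): {succ_feedback}"


result = backward_feedback(PREDECESSORS, "answer", "too vague", stub_feedback)
```

In this toy graph, `question` feeds both `retrieve` and `reason`, so it collects two feedback messages, one per outgoing edge.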

large language model, machine learning, natural language, (21 more...)

arXiv.org Machine Learning

2412.03624

Country:

North America > United States (0.28)
Asia > Middle East > Saudi Arabia (0.28)
Europe (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Government (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Backpropagation (0.62)

FACTS: A Factored State-Space Framework For World Modelling

Nanbo, Li, Laakom, Firas, Xu, Yucheng, Wang, Wenyi, Schmidhuber, Jürgen

arXiv.org Artificial Intelligence, Oct-28-2024

World modelling is essential for understanding and predicting the dynamics of complex systems by learning both spatial and temporal dependencies. However, current frameworks, such as Transformers and selective state-space models like Mambas, exhibit limitations in efficiently encoding spatial and temporal structures, particularly in scenarios requiring long-term high-dimensional sequence modelling. To address these issues, we propose a novel recurrent framework, the FACTored State-space (FACTS) model, for spatial-temporal world modelling. The FACTS framework constructs a graph-structured memory with a routing mechanism that learns permutable memory representations, ensuring invariance to input permutations while adapting through selective state-space propagation. Furthermore, FACTS supports parallel computation of high-dimensional sequences. We empirically evaluate FACTS across diverse tasks, including multivariate time series forecasting and object-centric world modelling, demonstrating that it consistently outperforms or matches specialised state-of-the-art models, despite its general-purpose world modelling design.

arxiv preprint arxiv, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2410.20922

Country: Europe (0.46)

Genre: Research Report (1.00)

Industry: Energy (0.94)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

Agent-as-a-Judge: Evaluate Agents with Agents

Zhuge, Mingchen, Zhao, Changsheng, Ashley, Dylan, Wang, Wenyi, Khizbullin, Dmitrii, Xiong, Yunyang, Liu, Zechun, Chang, Ernie, Krishnamoorthi, Raghuraman, Tian, Yuandong, Shi, Yangyang, Chandra, Vikas, Schmidhuber, Jürgen

arXiv.org Artificial Intelligence, Oct-16-2024

Recent years have seen multimodal agentic systems move from occasionally being able to solve small toy problems to being regularly deployed for challenging real-world problems (the dream of most AI research). Yet, the current evaluation methods and the available benchmarks for agentic systems are struggling to keep up with these rapid advances, dramatically slowing true progress. We believe that the current issue with evaluating agentic systems stems from the lack of feedback during the intermediate task-solving stages for these nontraditional systems. Agentic systems think more like humans, often act step-by-step (Wooldridge, 1999) and often host very human-like symbolic communications internally to solve problems (Zhuge et al., 2023). And thus agentic systems should be evaluated like a human, with rich evaluative feedback which looks at the full thought and action trajectory; evaluating an agentic system in the traditional way is like evaluating a student using multiple-choice testing--a comparatively unreliable estimator (Park, 2010). For example, while SWE-Bench (Yang et al., 2024a) is widespread, its evaluation method, which relies solely on the final resolve rate for long-term automated repair tasks, does not effectively pinpoint what is happening within agentic systems that affects the resolve rate. On the other hand, performing a better evaluation with a human is prohibitively expensive. We instead propose that agentic systems should be used to evaluate agentic systems. Inspired by LLM-as-a-Judge (Zheng et al., 2024; Fu et al., 2023; Chen et al., 2024b), which uses LLMs to evaluate LLMs, we call this framework Agent-as-a-Judge, of which it is

arxiv preprint arxiv, large language model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2410.10934

Country: North America > United States (0.28)

Genre:

Workflow (1.00)
Research Report > New Finding (1.00)

Industry:

Information Technology (0.67)
Education (0.48)
Banking & Finance > Economy (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Towards a Robust Retrieval-Based Summarization System

Liu, Shengjie, Wu, Jing, Bao, Jingyuan, Wang, Wenyi, Hovakimyan, Naira, Healey, Christopher G

arXiv.org Artificial Intelligence, Mar-28-2024

This paper describes an investigation of the robustness of large language models (LLMs) for retrieval augmented generation (RAG)-based summarization tasks. While LLMs provide summarization capabilities, their performance in complex, real-world scenarios remains under-explored. Our first contribution is LogicSumm, an innovative evaluation framework incorporating realistic scenarios to assess LLM robustness during RAG-based summarization. Based on limitations identified by LogicSumm, we then developed SummRAG, a comprehensive system to create training dialogues and fine-tune a model to enhance robustness within LogicSumm's scenarios. SummRAG is an example of our goal of defining structured methods to test the capabilities of an LLM, rather than addressing issues in a one-off fashion. Experimental results confirm the power of SummRAG, showcasing improved logical coherence and summarization quality. Data, corresponding model weights, and Python code are available online.

large language model, machine learning, natural language, (15 more...)

arXiv.org Artificial Intelligence

2403.19889

Country:

Europe (0.28)
North America > United States > Illinois (0.14)

Genre: Research Report (1.00)

Industry:

Media (0.46)
Banking & Finance (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Language Agents as Optimizable Graphs

Zhuge, Mingchen, Wang, Wenyi, Kirsch, Louis, Faccio, Francesco, Khizbullin, Dmitrii, Schmidhuber, Jürgen

arXiv.org Artificial Intelligence, Feb-27-2024

Various human-designed prompt engineering techniques have been proposed to improve problem solvers based on Large Language Models (LLMs), yielding many disparate code bases. We unify these approaches by describing LLM-based agents as computational graphs. The nodes implement functions to process multimodal data or query LLMs, and the edges describe the information flow between operations. Graphs can be recursively combined into larger composite graphs representing hierarchies of inter-agent collaboration (where edges connect operations of different agents). Our novel automatic graph optimizers (1) refine node-level LLM prompts (node optimization) and (2) improve agent orchestration by changing graph connectivity (edge optimization). Experiments demonstrate that our framework can be used to efficiently develop, integrate, and automatically improve various LLM agents. The code can be found at https://github.com/metauto-ai/gptswarm.
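
As a rough illustration of describing an agent as a computational graph, the sketch below runs nodes in dependency order and passes each node the outputs of its predecessors. It does not use the gptswarm API: `run_graph`, `NODES`, and `EDGES` are hypothetical names, and the lambdas stand in for LLM queries or data-processing functions.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_graph(nodes, edges, task):
    """Execute a graph of operations.

    nodes: name -> callable(list_of_predecessor_outputs, task)
    edges: list of (source, destination) pairs describing information flow
    """
    preds = {name: [] for name in nodes}
    for src, dst in edges:
        preds[dst].append(src)
    outputs = {}
    # static_order() yields each node after all of its predecessors.
    for name in TopologicalSorter(preds).static_order():
        outputs[name] = nodes[name]([outputs[p] for p in preds[name]], task)
    return outputs


# Hypothetical nodes; a real node would query an LLM or process data.
NODES = {
    "draft": lambda ins, task: f"draft({task})",
    "critique": lambda ins, task: f"critique({ins[0]})",
    "final": lambda ins, task: f"final({', '.join(ins)})",
}
EDGES = [("draft", "critique"), ("draft", "final"), ("critique", "final")]

full = run_graph(NODES, EDGES, "x")["final"]
# Dropping an edge changes which information reaches the final node.
pruned = run_graph(NODES, EDGES[:2], "x")["final"]
```

Changing the edge list changes what the `final` node sees, which gives a small-scale picture of why graph connectivity is something an optimizer could adjust.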

large language model, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2402.16823

Country:

North America > United States (0.46)
Europe (0.28)

Genre: Research Report (1.00)

Industry: Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Trillion Parameter AI Serving Infrastructure for Scientific Discovery: A Survey and Vision

Hudson, Nathaniel, Pauloski, J. Gregory, Baughman, Matt, Kamatar, Alok, Sakarvadia, Mansi, Ward, Logan, Chard, Ryan, Bauer, André, Levental, Maksim, Wang, Wenyi, Engler, Will, Skelly, Owen Price, Blaiszik, Ben, Stevens, Rick, Chard, Kyle, Foster, Ian

arXiv.org Artificial Intelligence, Feb-5-2024

Deep learning methods are transforming research, enabling new techniques, and ultimately leading to new discoveries. As the demand for more capable AI models continues to grow, we are now entering an era of Trillion Parameter Models (TPM), or models with more than a trillion parameters -- such as Huawei's PanGu-Σ. We describe a vision for the ecosystem of TPM users and providers that caters to the specific needs of the scientific community. We then outline the significant technical challenges and open problems in system design for serving TPMs to enable scientific research and discovery. Specifically, we describe the requirements of a comprehensive software stack and interfaces to support the diverse and flexible requirements of researchers.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2402.0348

Country: North America > United States (0.94)

Genre:

Research Report (0.84)
Overview (0.68)

Industry:

Information Technology > Security & Privacy (0.46)
Information Technology > Services (0.46)
Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Mindstorms in Natural Language-Based Societies of Mind

Zhuge, Mingchen, Liu, Haozhe, Faccio, Francesco, Ashley, Dylan R., Csordás, Róbert, Gopalakrishnan, Anand, Hamdi, Abdullah, Hammoud, Hasan Abed Al Kader, Herrmann, Vincent, Irie, Kazuki, Kirsch, Louis, Li, Bing, Li, Guohao, Liu, Shuming, Mai, Jinjie, Piękos, Piotr, Ramesh, Aditya, Schlag, Imanol, Shi, Weimin, Stanić, Aleksandar, Wang, Wenyi, Wang, Yuhui, Xu, Mengmeng, Fan, Deng-Ping, Ghanem, Bernard, Schmidhuber, Jürgen

arXiv.org Artificial Intelligence, May-26-2023

Both Minsky's "society of mind" and Schmidhuber's "learning to think" inspire diverse societies of large multimodal neural networks (NNs) that solve problems by interviewing each other in a "mindstorm." Recent implementations of NN-based societies of minds consist of large language models (LLMs) and other NN-based experts communicating through a natural language interface. In doing so, they overcome the limitations of single LLMs, improving multimodal zero-shot reasoning. In these natural language-based societies of mind (NLSOMs), new agents -- all communicating through the same universal symbolic language -- are easily added in a modular fashion. To demonstrate the power of NLSOMs, we assemble and experiment with several of them (having up to 129 members), leveraging mindstorms in them to solve some practical AI tasks: visual question answering, image captioning, text-to-image synthesis, 3D generation, egocentric retrieval, embodied AI, and general language-based task solving. We view this as a starting point towards much larger NLSOMs with billions of agents -- some of which may be humans. And with this emergence of great societies of heterogeneous minds, many new research questions have suddenly become paramount to the future of artificial intelligence. What should be the social structure of an NLSOM? What would be the (dis)advantages of having a monarchical rather than a democratic structure? How can principles of NN economies be used to maximize the total reward of a reinforcement learning NLSOM? In this work, we identify, discuss, and try to answer some of these questions.

machine learning, natural language, proposal, (19 more...)

arXiv.org Artificial Intelligence

2305.17066

Country:

Asia (1.00)
Europe > United Kingdom > England (0.28)
North America > United States > California (0.27)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study > Negative Result (0.34)

Industry:

Transportation > Air (1.00)
Leisure & Entertainment (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
(10 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

A Distribution Similarity Based Regularizer for Learning Bayesian Networks

Kong, Weirui, Wang, Wenyi

arXiv.org Machine Learning, Aug-20-2018

Probabilistic graphical models compactly represent joint distributions by decomposing them into factors over subsets of random variables. In Bayesian networks, the factors are conditional probability distributions. For many problems, common information exists among those factors. Adding similarity restrictions can be viewed as imposing prior knowledge for model regularization. With proper restrictions, learned models usually generalize better. In this work, we study methods that exploit such high-level similarities to regularize the learning process and apply them to the task of modeling wave propagation in inhomogeneous media. We propose a novel distribution-based penalization approach that encourages similar conditional probability distributions rather than forcing the parameters to be similar explicitly. We show experimentally that our proposed algorithm solves the wave propagation modeling problem, which other baseline methods are not able to solve.
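
The difference between penalizing parameters and penalizing distributions can be seen in a small example. The sketch below is an assumption-laden illustration rather than the paper's regularizer: it parameterizes two categorical distributions by logits and uses a symmetric KL divergence as the distribution-level penalty, a choice made here only for concreteness. Two logit vectors that differ by a constant shift define the same distribution, so a parameter-level penalty is large while the distribution-level penalty is zero.

```python
import math


def softmax(logits):
    """Convert logits to a categorical distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def parameter_penalty(theta_a, theta_b):
    """Squared Euclidean distance between raw parameters."""
    return sum((a - b) ** 2 for a, b in zip(theta_a, theta_b))


def distribution_penalty(theta_a, theta_b):
    """Symmetric KL divergence between the induced distributions.

    Assumption: the divergence is chosen for illustration only.
    """
    p, q = softmax(theta_a), softmax(theta_b)
    kl_pq = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    kl_qp = sum(qi * math.log(qi / pi) for pi, qi in zip(p, q))
    return kl_pq + kl_qp


theta_a = [0.0, 1.0, 2.0]
theta_b = [5.0, 6.0, 7.0]  # same distribution as theta_a, shifted logits

param_gap = parameter_penalty(theta_a, theta_b)
dist_gap = distribution_penalty(theta_a, theta_b)
```

Here `param_gap` is large even though both vectors encode the same conditional distribution, which is the kind of mismatch a distribution-based penalty avoids.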

bayesian inference, upstream oil & gas, wave propagation, (16 more...)

arXiv.org Machine Learning

1808.06347

Genre: Research Report (0.50)

Industry: Energy > Oil & Gas > Upstream (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

A Formulation of Recursive Self-Improvement and Its Possible Efficiency

Wang, Wenyi

arXiv.org Artificial Intelligence, May-17-2018

Recursive self-improving (RSI) systems have been dreamed of since the early days of computer science and artificial intelligence. However, many existing studies on RSI systems remain philosophical and lack clear formulations and results. In this paper, we provide a formal definition for one class of RSI systems, and then demonstrate the existence of computable and efficient RSI systems on a restricted version. We use simulation to empirically show that we achieve logarithmic runtime complexity with respect to the size of the search space, and these results suggest it is possible to achieve efficient recursive self-improvement.

artificial intelligence, neural network, score function, (19 more...)

arXiv.org Artificial Intelligence

1805.0661

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.99)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.35)

Bayesian Optimization Using Monotonicity Information and Its Application in Machine Learning Hyperparameter

Wang, Wenyi, Welch, William J.

arXiv.org Machine Learning, Feb-16-2018

We propose an algorithm for a family of optimization problems where the objective can be decomposed as a sum of functions with monotonicity properties. The motivating problem is optimization of hyperparameters of machine learning algorithms, where we argue that the objective, validation error, can be decomposed as monotonic functions of the hyperparameters. Our proposed algorithm adapts Bayesian optimization methods to incorporate the monotonicity constraints. We illustrate the advantages of exploiting monotonicity using illustrative examples and demonstrate the improvements in optimization efficiency for some machine learning hyperparameter tuning applications.

algorithm, neural network, optimization problem, (16 more...)

arXiv.org Machine Learning

1802.03532

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
