Collaborating Authors

 Cui, Hao


CURIE: Evaluating LLMs On Multitask Scientific Long Context Understanding and Reasoning

arXiv.org Artificial Intelligence

Scientific problem-solving involves synthesizing information while applying expert knowledge. We introduce CURIE, a scientific long-Context Understanding, Reasoning and Information Extraction benchmark to measure the potential of Large Language Models (LLMs) in scientific problem-solving and assisting scientists in realistic workflows. This benchmark introduces ten challenging tasks with a total of 580 problems and solution pairs curated by experts in six disciplines - materials science, condensed matter physics, quantum computing, geospatial analysis, biodiversity, and proteins - covering both experimental and theoretical workflows in science. We evaluate a range of closed and open LLMs on tasks in CURIE, which require domain expertise, comprehension of long in-context information, and multi-step reasoning. While Gemini Flash 2.0 and Claude-3 show consistently high comprehension across domains, the popular GPT-4o and command-R+ fail dramatically on protein sequencing tasks. With the best performance at 32%, there is much room for improvement for all models. We hope that insights gained from CURIE can guide the future development of LLMs in the sciences. Evaluation code and data are available at https://github.com/google/curie


AI-enhanced Collective Intelligence: The State of the Art and Prospects

arXiv.org Artificial Intelligence

The current societal challenges exceed the capacity of human individual or collective effort alone. As AI evolves, its role within human collectives is poised to vary from an assistive tool to a participatory member. Humans and AI possess complementary capabilities that, when synergized, can achieve a level of collective intelligence that surpasses the collective capabilities of either humans or AI in isolation. However, the interactions in human-AI systems are inherently complex, involving intricate processes and interdependencies. This review incorporates perspectives from network science to conceptualize a multilayer representation of human-AI collective intelligence, comprising a cognition layer, a physical layer, and an information layer. Within this multilayer network, humans and AI agents exhibit varying characteristics; humans differ in diversity from surface-level to deep-level attributes, while AI agents range in degrees of functionality and anthropomorphism. The interplay among these agents shapes the overall structure and dynamics of the system. We explore how agents' diversity and interactions influence the system's collective intelligence. Furthermore, we present an analysis of real-world instances of AI-enhanced collective intelligence. We conclude by addressing the potential challenges in AI-enhanced collective intelligence and offer perspectives on future developments in this field.


From Stochastic Planning to Marginal MAP

Neural Information Processing Systems

It is well known that the problems of stochastic planning and probabilistic inference are closely related. This paper makes two contributions in this context. The first is to provide an analysis of the recently developed SOGBOFA heuristic planning algorithm that was shown to be effective for problems with large factored state and action spaces. It is shown that SOGBOFA can be seen as a specialized inference algorithm that computes its solutions through a combination of a symbolic variant of belief propagation and gradient ascent. The second contribution is a new solver for Marginal MAP (MMAP) inference. We introduce a new reduction from MMAP to maximum expected utility problems which are suitable for the symbolic computation in SOGBOFA. This yields a novel algebraic gradient-based solver (AGS) for MMAP. An experimental evaluation illustrates the potential of AGS in solving difficult MMAP problems.


Lifted Stochastic Planning, Belief Propagation and Marginal MAP

AAAI Conferences

It is well known that the problems of stochastic planning and probabilistic inference are closely related. This paper makes several contributions in this context for factored spaces where the complexity of solutions is challenging. First, we analyze the recently developed SOGBOFA heuristic, which performs stochastic planning by building an explicit computation graph capturing an approximate aggregate simulation of the dynamics. It is shown that the values computed by this algorithm are identical to the approximation provided by Belief Propagation (BP). Second, as a consequence of this observation, we show how ideas on lifted BP can be used to develop a lifted version of SOGBOFA. Unlike implementations of lifted BP, Lifted SOGBOFA has a very simple implementation as a dynamic programming version of the original graph construction. Third, we show that the idea of graph construction for aggregate simulation can be used to solve marginal MAP (MMAP) problems in Bayesian networks, where MAP variables are constrained to be at roots of the network. This yields a novel algorithm for MMAP for this subclass. An experimental evaluation illustrates the advantage of Lifted SOGBOFA for planning.


Factored MCTS for Large Scale Stochastic Planning

AAAI Conferences

This paper investigates stochastic planning problems with large factored state and action spaces. We show that even with a moderate increase in the size of existing challenge problems, the performance of state-of-the-art algorithms deteriorates rapidly, making them ineffective. To address this problem we propose a family of simple but scalable online planning algorithms that combine sampling, as in Monte Carlo tree search, with “aggregation,” where the aggregation approximates a distribution over random variables by the product of their marginals. The algorithms are correct under some rather strong technical conditions and can serve as an unsound but effective heuristic when the conditions do not hold. An extensive experimental evaluation demonstrates that the new algorithms provide significant improvement over the state of the art when solving large problems in a number of challenge benchmark domains.
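
The "aggregation" idea in the abstract above, approximating a joint distribution by the product of its marginals, can be illustrated with a small sketch. This is not the paper's algorithm: the function names, the binary-variable setting, and the toy samples are all assumptions made for illustration only.

```python
import itertools

def estimate_marginals(samples):
    """Estimate P(X_i = 1) for each binary variable from a list of samples."""
    n_vars = len(samples[0])
    return [sum(s[i] for s in samples) / len(samples) for i in range(n_vars)]

def product_of_marginals(marginals, assignment):
    """Probability of a full assignment if the variables are treated as independent."""
    p = 1.0
    for m, x in zip(marginals, assignment):
        p *= m if x == 1 else 1.0 - m
    return p

# Hypothetical toy data: two perfectly correlated bits.
samples = [(0, 0), (1, 1), (0, 0), (1, 1)]
marginals = estimate_marginals(samples)

# The aggregated (product) distribution over all four joint assignments.
approx = {
    a: product_of_marginals(marginals, a)
    for a in itertools.product((0, 1), repeat=2)
}
```

In this toy case the samples never contain (0, 1) or (1, 0), yet the product approximation assigns them nonzero probability. That loss of correlation is the kind of error that would make such a method a heuristic rather than an exact procedure.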