AITopics

AI agents have been evaluated in isolation or within small groups, where interactions remain limited in scope and complexity. Large-scale simulations involving many autonomous agents -- reflecting the full spectrum of civilizational processes -- have yet to be explored. Here, we demonstrate how 10 - 1000+ AI agents behave and progress within agent societies. We first introduce the PIANO (Parallel Information Aggregation via Neural Orchestration) architecture, which enables agents to interact with humans and other agents in real-time while maintaining coherence across multiple output streams. We then evaluate agent performance in agent simulations using civilizational benchmarks inspired by human history. These simulations, set within a Minecraft environment, reveal that agents are capable of meaningful progress -- autonomously developing specialized roles, adhering to and changing collective rules, and engaging in cultural and religious transmission. These preliminary results show that agents can achieve significant milestones towards AI civilizations, opening new avenues for large simulations, agentic organizational intelligence, and integrating AI into human civilizations.

agent, artificial intelligence, arxiv preprint arxiv, (15 more...)

2411.00114

Genre: Research Report > New Finding (1.00)

Industry:

Materials > Metals & Mining (1.00)
Law (1.00)
Government (1.00)
Leisure & Entertainment > Games > Computer Games (0.37)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (1.00)

Brzychczy, Edyta, Pełech-Pilichowski, Tomasz, Dworakowski, Ziemowit

Case ID detection based on time series data -- the mining use case

Process mining is gaining popularity in business process analysis, including in heavy industry. It requires a specific data format called an event log, whose basic structure includes a case identifier (case ID), activity (event) name, and timestamp. In industrial processes, data is very often provided by a monitoring system as time series of low-level sensor readings. This data cannot be used directly for process mining, since activities are not explicitly marked in the event log and sometimes the case ID is not provided. We propose a novel rule-based algorithm for pattern identification that detects the case ID from significant changes in the short-term mean values of a selected variable. We present our solution on the mining use case. We compare the computed results (identified patterns) with expert labels of the same dataset. Experiments show that the developed algorithm correctly detects IDs in most cases, in datasets with and without outliers, reaching F1 scores of 96.8% and 97% respectively. We also evaluate our algorithm on a dataset from the manufacturing domain, reaching an F1 score of 92.6%.
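
As a rough illustration of the idea in the abstract above (not the authors' algorithm, whose rules are not given here), the sketch below marks a candidate case boundary wherever the short-term mean of a signal shifts by more than a threshold between adjacent windows. The function name `detect_case_starts`, the window length, the threshold, and the sample readings are all assumptions chosen for the example.

```python
from statistics import mean

def detect_case_starts(signal, window=3, threshold=8.0):
    """Return indices where the mean of the next `window` samples differs
    from the mean of the previous `window` samples by more than `threshold`.
    Hypothetical helper; window and threshold are illustrative assumptions."""
    starts = []
    i = window
    while i + window <= len(signal):
        before = mean(signal[i - window:i])
        after = mean(signal[i:i + window])
        if abs(after - before) > threshold:
            starts.append(i)
            i += window  # skip ahead so a single shift is reported once
        else:
            i += 1
    return starts

# Made-up sensor readings: low, then high, then low again.
readings = [0, 0, 0, 0, 10, 10, 10, 10, 0, 0, 0, 0]
case_starts = detect_case_starts(readings)
```

On these readings the two level changes (at indices 4 and 8) are the only positions flagged; a flat signal yields no boundaries.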

algorithm, dataset, sensor data, (13 more...)

2410.23846

Country:

Europe > Switzerland (0.04)
Europe > Poland > Lesser Poland Province > Kraków (0.04)
North America > United States > New York > New York County > New York City (0.04)
(2 more...)

Genre: Research Report (0.64)

Industry: Materials > Metals & Mining (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining (0.72)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.67)

Local Superior Soups: A Catalyst for Model Merging in Cross-Silo Federated Learning

Chen, Minghui, Jiang, Meirui, Zhang, Xin, Dou, Qi, Wang, Zehua, Li, Xiaoxiao

Federated learning (FL) is a learning paradigm that enables collaborative training of models using decentralized data. Recently, the utilization of pre-trained weight initialization in FL has been demonstrated to effectively improve model performance. However, the evolving complexity of current pre-trained models, characterized by a substantial increase in parameters, markedly intensifies the challenges associated with communication rounds required for their adaptation to FL. To address these communication cost issues and increase the performance of pre-trained model adaptation in FL, we propose an innovative model interpolation-based local training technique called "Local Superior Soups." Our method enhances local training across different clients, encouraging the exploration of a connected low-loss basin within a few communication rounds through regularized model interpolation. This approach acts as a catalyst for the seamless adaptation of pre-trained models in FL. We demonstrated its effectiveness and efficiency across diverse widely-used FL datasets. Our code is available at https://github.com/ubc-tea/Local-Superior-Soups.
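
For readers unfamiliar with model interpolation, the sketch below shows only the basic building block: a per-parameter linear blend of two weight sets. It is not the Local Superior Soups method, which the abstract describes as adding regularization during local training. The function `interpolate`, the parameter names, and the toy values are assumptions for illustration.

```python
def interpolate(weights_a, weights_b, alpha):
    """Linearly interpolate two models that share parameter names and shapes:
    (1 - alpha) * a + alpha * b, applied element-wise per parameter.
    Hypothetical helper operating on plain dicts of float lists."""
    if weights_a.keys() != weights_b.keys():
        raise ValueError("models must share parameter names")
    return {
        name: [(1 - alpha) * x + alpha * y
               for x, y in zip(weights_a[name], weights_b[name])]
        for name in weights_a
    }

# Toy "models" with made-up parameters.
model_a = {"layer.weight": [0.0, 2.0], "layer.bias": [1.0]}
model_b = {"layer.weight": [4.0, 6.0], "layer.bias": [3.0]}
midpoint = interpolate(model_a, model_b, alpha=0.5)
```

With `alpha=0.5` the result is the plain average of the two weight sets; `alpha=0.0` and `alpha=1.0` recover the two endpoints.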

accuracy, communication round, dataset, (15 more...)

2410.2366

Country:

North America > United States > Virginia (0.04)
Asia > China > Hong Kong (0.04)
North America > Canada > British Columbia (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > Promising Solution (0.66)

Industry:

Information Technology (0.67)
Materials > Chemicals > Specialty Chemicals (0.60)
Education (0.55)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)

Mars: Situated Inductive Reasoning in an Open-World Environment

Tang, Xiaojuan, Li, Jiaqi, Liang, Yitao, Zhu, Song-chun, Zhang, Muhan, Zheng, Zilong

Large Language Models (LLMs) trained on massive corpora have shown remarkable success in knowledge-intensive tasks. Yet, most of them rely on pre-stored knowledge. Inducing new general knowledge from a specific environment and performing reasoning with the acquired knowledge -- situated inductive reasoning -- is crucial and challenging for machine intelligence. In this paper, we design Mars, an interactive environment devised for situated inductive reasoning. It introduces counter-commonsense game mechanisms by modifying terrain, survival setting and task dependency while adhering to certain principles. In Mars, agents need to actively interact with their surroundings, derive useful rules and perform decision-making tasks in specific contexts. We conduct experiments on various RL-based and LLM-based methods, finding that they all struggle on this challenging situated inductive reasoning benchmark. Furthermore, we explore Induction from Reflection, where we instruct agents to perform inductive reasoning from history trajectory. The superior performance underscores the importance of inductive reasoning in Mars. Through Mars, we aim to galvanize advancements in situated inductive reasoning and set the stage for developing the next generation of AI systems that can reason in an adaptive and context-sensitive way.

diamond, func, pickaxe, (15 more...)

2410.08126

Country:

Asia > China (0.04)
North America > United States > Massachusetts (0.04)
North America > Montserrat (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre:

Workflow (0.93)
Research Report > New Finding (0.92)

Industry:

Materials > Metals & Mining > Diamonds (0.67)
Materials > Metals & Mining > Coal (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

König, Gunnar, Günther, Eric, von Luxburg, Ulrike

Disentangling Interactions and Dependencies in Feature Attribution

arXiv.org Machine Learning, Oct-31-2024

In explainable machine learning, global feature importance methods try to determine how much each individual feature contributes to predicting the target variable, resulting in one importance score for each feature. But often, predicting the target variable requires interactions between several features (such as in the XOR function), and features might have complex statistical dependencies that allow one feature to be partially replaced by another. In commonly used feature importance scores these cooperative effects are conflated with the features' individual contributions, making them prone to misinterpretation. In this work, we derive DIP, a new mathematical decomposition of individual feature importance scores that disentangles three components: the standalone contribution and the contributions stemming from interactions and dependencies. We prove that the DIP decomposition is unique and show how it can be estimated in practice. Based on these results, we propose a new visualization of feature importance scores that clearly illustrates the different contributions.
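
The XOR case mentioned in the abstract can be checked directly. The sketch below does not implement DIP. It only shows, under the assumption of two binary features distributed uniformly over their four combinations, that neither feature predicts the XOR target better than chance on its own while both together predict it perfectly. The helper `best_accuracy` is a hypothetical name introduced for the example.

```python
from itertools import product

# All four input combinations of two binary features, with XOR as the target.
rows = [(a, b, a ^ b) for a, b in product([0, 1], repeat=2)]

def best_accuracy(feature_indices):
    """Accuracy of the best deterministic predictor that sees only the
    chosen features: for each observed feature pattern, predict the
    majority target value (rows are assumed equally likely)."""
    groups = {}
    for row in rows:
        key = tuple(row[i] for i in feature_indices)
        groups.setdefault(key, []).append(row[2])
    correct = sum(max(ys.count(0), ys.count(1)) for ys in groups.values())
    return correct / len(rows)

acc_a = best_accuracy([0])      # feature a alone
acc_b = best_accuracy([1])      # feature b alone
acc_ab = best_accuracy([0, 1])  # both features together
```

Here `acc_a` and `acc_b` are both 0.5 while `acc_ab` is 1.0, so all of the predictive value comes from the interaction, which is the kind of effect a single per-feature score cannot separate out.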

decomposition, dependency, interaction, (12 more...)

2410.23772

Country:

North America > United States > California (0.05)
Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)

Genre: Research Report (1.00)

Industry: Materials (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

arXiv.org Artificial Intelligence, Oct-30-2024

On Memorization of Large Language Models in Logical Reasoning

Xie, Chulin, Huang, Yangsibo, Zhang, Chiyuan, Yu, Da, Chen, Xinyun, Lin, Bill Yuchen, Li, Bo, Ghazi, Badih, Kumar, Ravi

Large language models (LLMs) achieve good performance on challenging reasoning benchmarks, yet could also make basic reasoning mistakes. This contrasting behavior is puzzling when it comes to understanding the mechanisms behind LLMs' reasoning capabilities. One hypothesis is that the increasingly high and nearly saturated performance on common reasoning benchmarks could be due to the memorization of similar problems. In this paper, we systematically investigate this hypothesis with a quantitative measurement of memorization in reasoning tasks, using a dynamically generated logical reasoning benchmark based on Knights and Knaves (K&K) puzzles. We found that LLMs could interpolate the training puzzles (achieving near-perfect accuracy) after fine-tuning, yet fail when those puzzles are slightly perturbed, suggesting that the models heavily rely on memorization to solve those training puzzles. On the other hand, we show that while fine-tuning leads to heavy memorization, it also consistently improves generalization performance. In-depth analyses with perturbation tests, cross difficulty-level transferability, probing model internals, and fine-tuning with wrong answers suggest that the LLMs learn to reason on K&K puzzles despite training data memorization. This phenomenon indicates that LLMs exhibit a complex interplay between memorization and genuine reasoning abilities. Finally, our analysis with per-sample memorization score sheds light on how LLMs switch between reasoning and memorization in solving logical puzzles. Our code and data are available at https://memkklogic.github.io.
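
For context, a Knights and Knaves puzzle asks which speakers always tell the truth (knights) and which always lie (knaves). The sketch below is a minimal brute-force solver for one made-up two-person puzzle. It is not the benchmark generator from the paper, and the `solve` function and the sample statements are assumptions for illustration.

```python
from itertools import product

def solve(names, statements):
    """Brute-force a Knights and Knaves puzzle.
    `statements` maps each speaker to a function of the role assignment
    (dict name -> True for knight) that returns whether the claim holds.
    A knight's claim must be true; a knave's claim must be false."""
    solutions = []
    for roles in product([True, False], repeat=len(names)):
        world = dict(zip(names, roles))
        if all(world[s] == claim(world) for s, claim in statements.items()):
            solutions.append(world)
    return solutions

# Made-up puzzle: A says "B is a knave"; B says "We are both knights".
puzzle = {
    "A": lambda w: not w["B"],
    "B": lambda w: w["A"] and w["B"],
}
solutions = solve(["A", "B"], puzzle)
```

The only consistent assignment for this puzzle is A as a knight and B as a knave. Because each puzzle has a checkable answer like this, small perturbations of a statement can change the solution, which is what makes such puzzles useful for probing memorization.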

large language model, machine learning, puzzle, (20 more...)

2410.23123

Country: North America > United States > Illinois (0.14)

Genre: Research Report > New Finding (0.92)

Industry:

Materials > Chemicals > Industrial Gases > Liquified Gas (0.45)
Materials > Chemicals > Commodity Chemicals > Petrochemicals > LNG (0.45)
Energy > Oil & Gas > Midstream (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning > Rote Learning (1.00)

Ravendran, Ahalya, Bryson, Mitch, Dansereau, Donald G.

LBurst: Learning-Based Robotic Burst Feature Extraction for 3D Reconstruction in Low Light

arXiv.org Artificial Intelligence, Oct-30-2024

Drones have revolutionized the fields of aerial imaging, mapping, and disaster recovery. However, the deployment of drones in low-light conditions is constrained by the image quality produced by their on-board cameras. In this paper, we present a learning architecture for improving 3D reconstructions in low-light conditions by finding features in a burst. Our approach enhances visual reconstruction by detecting and describing more high-quality true features and fewer spurious features in low signal-to-noise ratio images. We demonstrate that our method is capable of handling challenging scenes in millilux illumination, making it a significant step towards drones operating at night and in extremely low-light applications such as underground mining and search and rescue operations.

computer vision, reconstruction, robotic burst, (13 more...)

2410.23522

Country: Oceania > Australia > New South Wales > Sydney (0.04)

Genre: Research Report (0.50)

Industry: Materials > Metals & Mining (0.88)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (0.94)
Information Technology > Sensing and Signal Processing > Image Processing (0.93)

arXiv.org Artificial Intelligence, Oct-30-2024

TumblerBots: Tumbling Robotic sensors for Minimally-invasive Benthic Monitoring

Romanello, L., Teboul, A., Wiesemuller, F., Nguyen, P. H., Kovac, M., Armanini, S. F.

Robotic systems show significant promise for water environmental sensing applications such as water quality monitoring, pollution mapping and biodiversity data collection. Conventional deployment methods often disrupt fragile ecosystems, preventing depiction of the undisturbed environmental condition. In response to this challenge, we propose a novel framework utilizing a lightweight tumbler system equipped with a sensing unit, deployed via a drone. The sensing unit is detached once on the water surface, enabling precise and non-invasive data collection from the benthic zone. The tumbler is designed to be lightweight and compact, enabling deployment via a drone. The sensing pod, which detaches from the tumbler and descends to the bottom of the water body, is equipped with temperature and pressure sensors, as well as a buoyancy system. The latter, activated upon task completion, utilizes a silicon membrane inflated via a chemical reaction. The reaction generates a pressure of 70 kPa, causing the silicon membrane to expand by 30%, which exceeds the 5.7% volume increase required for positive buoyancy. The tumblers, made from eco-friendly materials to minimize environmental impact when lost during the mission, were tested for their gliding ratio and descent rate. Additionally, the system demonstrated robustness in moderate to strong wind conditions during outdoor tests, validating the overall framework.

payload, sensor, tumbler, (16 more...)

2410.23049

Country:

North America > United States > California (0.04)
Europe > Switzerland > Vaud > Lausanne (0.04)
Europe > Germany > Saarland (0.04)
(3 more...)

Genre: Research Report (0.82)

Industry:

Materials > Chemicals (0.93)
Energy (0.93)
Transportation > Air (0.69)

Technology:

Information Technology > Communications > Networks > Sensor Networks (0.48)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.46)

arXiv.org Artificial Intelligence, Oct-28-2024

Molecular Dynamics and Machine Learning Unlock Possibilities in Beauty Design -- A Perspective

Xu, Yuzhi, Ni, Haowei, Gao, Qinhui, Chang, Chia-Hua, Huo, Yanran, Zhao, Fanyu, Hu, Shiyu, Xia, Wei, Zhang, Yike, Grovu, Radu, He, Min, Zhang, John Z. H., Wang, Yuanqing

Computational molecular design -- the endeavor to design molecules, with various missions, aided by machine learning and molecular dynamics approaches -- has been widely applied to create valuable new molecular entities, from small molecule therapeutics to protein biologics. In the small data regime, physics-based approaches model the interaction between the molecule being designed and proteins of key physiological functions, providing structural insights into the mechanism. When abundant data has been collected, a quantitative structure-activity relationship (QSAR) can be more directly constructed from experimental data, from which machine learning can distill key insights to guide the design of the next round of experiments. Machine learning methodologies can also facilitate physical modeling, from improving the accuracy of force fields and extending them to unseen chemical spaces, to more directly enhancing the sampling on the conformational spaces. We argue that these techniques are mature enough to be applied to extend not just the longevity of life, but also the beauty it manifests. In this perspective, we review the current frontiers in the research & development of skin care products, as well as the statistical and physical toolbox applicable to addressing the challenges in this industry. Feasible interdisciplinary research projects are proposed to harness the power of machine learning tools to design innovative, effective, and inexpensive skin care products.

large language model, machine learning, natural language, (22 more...)

doi: 10.1063/5.0245365

2410.18101

Country: North America > United States > California (0.28)

Genre:

Overview (1.00)
Research Report > Experimental Study (0.93)
Research Report > Strength High (0.67)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Consumer Health (1.00)
(5 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

arXiv.org Artificial Intelligence, Oct-28-2024

MOOSE-Chem: Large Language Models for Rediscovering Unseen Chemistry Scientific Hypotheses

Yang, Zonglin, Liu, Wanhao, Gao, Ben, Xie, Tong, Li, Yuqiang, Ouyang, Wanli, Poria, Soujanya, Cambria, Erik, Zhou, Dongzhan

Scientific discovery contributes largely to human society's prosperity, and recent progress shows that LLMs could potentially catalyze this process. However, it is still unclear whether LLMs can discover novel and valid hypotheses in chemistry. In this work, we investigate this central research question: Can LLMs automatically discover novel and valid chemistry research hypotheses given only a chemistry research background (consisting of a research question and/or a background survey), without limitation on the domain of the research question? After extensive discussions with chemistry experts, we propose an assumption that a majority of chemistry hypotheses can result from a research background and several inspirations. With this key insight, we break the central question into three smaller fundamental questions. In brief, they are: (1) given a background question, whether LLMs can retrieve good inspirations; (2) with background and inspirations, whether LLMs can arrive at a hypothesis; and (3) whether LLMs can identify good hypotheses to rank them higher. To investigate these questions, we construct a benchmark consisting of 51 chemistry papers published in Nature, Science, or a venue of similar level in 2024 (all papers have only been available online since 2024). Every paper is divided by chemistry PhD students into three components: background, inspirations, and hypothesis. The goal is to rediscover the hypothesis, given only the background and a large randomly selected chemistry literature corpus containing the ground truth inspiration papers, with LLMs trained with data up to 2023. We also develop an LLM-based multi-agent framework that leverages the assumption, consisting of three stages reflecting the three smaller questions. The proposed method can rediscover many hypotheses with very high similarity to the ground truth ones, covering the main innovations.

artificial intelligence, large language model, natural language, (19 more...)

2410.07076

Country:

Europe > Austria > Vienna (0.14)
Asia > Singapore (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(9 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.75)

Industry:

Materials > Chemicals (0.93)
Energy (0.67)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)