AITopics

2501.16867

Country:

North America > United States (0.46)
Europe (0.28)
Asia > India (0.28)

Genre:

Research Report > New Finding (0.48)
Research Report > Experimental Study (0.34)

Industry:

Energy > Oil & Gas > Upstream (1.00)
Materials > Chemicals > Commodity Chemicals > Petrochemicals (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Tiefenbacher, Maximilian X., Bachmair, Brigitta, Chen, Cheng Giuseppe, Westermayr, Julia, Marquetand, Philipp, Dietschreit, Johannes C. B., González, Leticia

Excited-state nonadiabatic dynamics in explicit solvent using machine learned interatomic potentials

arXiv.org Artificial IntelligenceJan-28-2025

Excited-state nonadiabatic simulations with quantum mechanics/molecular mechanics (QM/MM) are essential to understand photoinduced processes in explicit environments. However, the high computational cost of the underlying quantum chemical calculations limits its application in combination with trajectory surface hopping methods. Here, we use FieldSchNet, a machine-learned interatomic potential capable of incorporating electric field effects into the electronic states, to replace traditional QM/MM electrostatic embedding with its ML/MM counterpart for nonadiabatic excited state trajectories. The developed method is applied to furan in water, including five coupled singlet states. Our results demonstrate that with sufficiently curated training data, the ML/MM model reproduces the electronic kinetics and structural rearrangements of QM/MM surface hopping reference simulations. Furthermore, we identify performance metrics that provide robust and interpretable validation of model accuracy.

artificial intelligence, machine learning, trajectory, (15 more...)

2501.16974

Country:

Europe > Austria > Vienna (0.14)
North America > United States (0.14)
Europe > Germany > Saxony > Leipzig (0.04)
Europe > Italy (0.04)

Genre: Research Report > New Finding (0.86)

Industry:

Energy (0.46)
Materials > Chemicals > Commodity Chemicals > Petrochemicals (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

arXiv.org Artificial IntelligenceJan-28-2025

PhysBench: Benchmarking and Enhancing Vision-Language Models for Physical World Understanding

Chow, Wei, Mao, Jiageng, Li, Boyi, Seita, Daniel, Guizilini, Vitor, Wang, Yue

Understanding the physical world is a fundamental challenge in embodied AI, critical for enabling agents to perform complex tasks and operate safely in real-world environments. While Vision-Language Models (VLMs) have shown great promise in reasoning and task planning for embodied agents, their ability to comprehend physical phenomena remains extremely limited. To close this gap, we introduce PhysBench, a comprehensive benchmark designed to evaluate VLMs' physical world understanding capability across a diverse set of tasks. PhysBench contains 10,002 entries of interleaved video-image-text data, categorized into four major domains: physical object properties, physical object relationships, physical scene understanding, and physics-based dynamics, further divided into 19 subclasses and 8 distinct capability dimensions. Our extensive experiments, conducted on 75 representative VLMs, reveal that while these models excel in common-sense reasoning, they struggle with understanding the physical world -- likely due to the absence of physical knowledge in their training data and the lack of embedded physical priors. To tackle the shortfall, we introduce PhysAgent, a novel framework that combines the generalization strengths of VLMs with the specialized expertise of vision models, significantly enhancing VLMs' physical understanding across a variety of tasks, including an 18.4\% improvement on GPT-4o. Furthermore, our results demonstrate that enhancing VLMs' physical world understanding capabilities can help embodied agents such as MOKA. We believe that PhysBench and PhysAgent offer valuable insights and contribute to bridging the gap between VLMs and physical world understanding.

ieee cvf international conference, large language model, machine learning, (19 more...)

2501.16411

Country:

North America > United States > California (0.13)
North America > United States > New York > New York County > New York City (0.04)
Europe > Netherlands > South Holland > Delft (0.04)
(3 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Leisure & Entertainment (1.00)
Information Technology (1.00)
Education > Educational Setting (1.00)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(3 more...)

Graziani, Mara, Molnar, Malina, Morales, Irina Espejo, Cadow-Gossweiler, Joris, Laino, Teodoro

Making Sense of Data in the Wild: Data Analysis Automation at Scale

As the volume of publicly available data continues to grow, researchers face the challenge of limited diversity in benchmarking machine learning tasks. Although thousands of datasets are available in public repositories, the sheer abundance often complicates the search for suitable data, leaving many valuable datasets underexplored. This situation is further amplified by the fact that, despite longstanding advocacy for improving data curation quality, current solutions remain prohibitively time-consuming and resource-intensive. In this paper, we propose a novel approach that combines intelligent agents with retrieval augmented generation to automate data analysis, dataset curation and indexing at scale. Our system leverages multiple agents to analyze raw, unstructured data across public repositories, generating dataset reports and interactive visual indexes that can be easily explored. We demonstrate that our approach results in more detailed dataset descriptions, higher hit rates and greater diversity in dataset retrieval tasks. Additionally, we show that the dataset reports generated by our method can be leveraged by other machine learning models to improve the performance on specific tasks, such as improving the accuracy and realism of synthetic data generation. By streamlining the process of transforming raw data into machine-learning-ready datasets, our approach enables researchers to better utilize existing data resources.

large language model, machine learning, natural language, (20 more...)

2502.15718

Country:

Europe > Germany (0.28)
North America > United States > New York (0.14)
North America > United States > Massachusetts > Middlesex County (0.14)
(6 more...)

Genre:

Research Report > New Finding (0.67)
Research Report > Experimental Study (0.46)

Industry:

Materials > Chemicals > Commodity Chemicals > Petrochemicals (1.00)
Health & Medicine (1.00)
Transportation (0.93)
(6 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

MME-Industry: A Cross-Industry Multimodal Evaluation Benchmark

Yi, Dongyi, Zhu, Guibo, Ding, Chenglin, Li, Zongshu, Yi, Dong, Wang, Jinqiao

With the rapid advancement of Multimodal Large Language Models (MLLMs), numerous evaluation benchmarks have emerged. However, comprehensive assessments of their performance across diverse industrial applications remain limited. In this paper, we introduce MME-Industry, a novel benchmark designed specifically for evaluating MLLMs in industrial settings.The benchmark encompasses 21 distinct domain, comprising 1050 question-answer pairs with 50 questions per domain. To ensure data integrity and prevent potential leakage from public datasets, all question-answer pairs were manually crafted and validated by domain experts. Besides, the benchmark's complexity is effectively enhanced by incorporating non-OCR questions that can be answered directly, along with tasks requiring specialized domain knowledge. Moreover, we provide both Chinese and English versions of the benchmark, enabling comparative analysis of MLLMs' capabilities across these languages. Our findings contribute valuable insights into MLLMs' practical industrial applications and illuminate promising directions for future model optimization research.

arxiv preprint arxiv, large language model, machine learning, (18 more...)

2501.16688

Country: Asia > China > Hubei Province > Wuhan (0.04)

Genre: Research Report > New Finding (0.66)

Industry:

Energy (1.00)
Materials > Chemicals (0.94)
Automobiles & Trucks (0.93)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.95)
Information Technology > Artificial Intelligence > Vision (0.94)

Safe Reinforcement Learning for Real-World Engine Control

Bedei, Julian, Koch, Lucas, Badalian, Kevin, Winkler, Alexander, Schaber, Patrick, Andert, Jakob

This work introduces a toolchain for applying Reinforcement Learning (RL), specifically the Deep Deterministic Policy Gradient (DDPG) algorithm, in safety-critical real-world environments. As an exemplary application, transient load control is demonstrated on a single-cylinder internal combustion engine testbench in Homogeneous Charge Compression Ignition (HCCI) mode, that offers high thermal efficiency and low emissions. However, HCCI poses challenges for traditional control methods due to its nonlinear, autoregressive, and stochastic nature. RL provides a viable solution, however, safety concerns, such as excessive pressure rise rates, must be addressed when applying to HCCI. A single unsuitable control input can severely damage the engine or cause misfiring and shut down. Additionally, operating limits are not known a priori and must be determined experimentally. To mitigate these risks, real-time safety monitoring based on the k-nearest neighbor algorithm is implemented, enabling safe interaction with the testbench. The feasibility of this approach is demonstrated as the RL agent learns a control policy through interaction with the testbench. A root mean square error of 0.1374 bar is achieved for the indicated mean effective pressure, comparable to neural network-based controllers from the literature. The toolchain's flexibility is further demonstrated by adapting the agent's policy to increase ethanol energy shares, promoting renewable fuel use while maintaining safety. This RL approach addresses the longstanding challenge of applying RL to safety-critical real-world environments. The developed toolchain, with its adaptability and safety mechanisms, paves the way for future applicability of RL in engine testbenches and other safety-critical settings.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

2501.16613

Country:

South America > Uruguay > Maldonado > Maldonado (0.04)
Asia > Middle East > Jordan (0.04)
North America > United States > California > San Diego County > Carlsbad (0.04)
Europe > Germany > North Rhine-Westphalia > Cologne Region > Aachen (0.04)

Genre: Research Report (0.82)

Industry:

Materials > Chemicals > Commodity Chemicals (0.59)
Energy > Renewable > Biofuel > Ethanol (0.36)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.88)

Cissé, Abdoulatif, Evangelopoulos, Xenophon, Gusev, Vladimir V., Cooper, Andrew I.

Language-Based Bayesian Optimization Research Assistant (BORA)

Many important scientific problems involve multivariate optimization coupled with slow and laborious experimental measurements. These complex, high-dimensional searches can be defined by non-convex optimization landscapes that resemble needle-in-a-haystack surfaces, leading to entrapment in local minima. Contextualizing optimizers with human domain knowledge is a powerful approach to guide searches to localized fruitful regions. However, this approach is susceptible to human confirmation bias and it is also challenging for domain experts to keep track of the rapidly expanding scientific literature. Here, we propose the use of Large Language Models (LLMs) for contextualizing Bayesian optimization (BO) via a hybrid optimization framework that intelligently and economically blends stochastic inference with domain knowledge-based insights from the LLM, which is used to suggest new, better-performing areas of the search space for exploration. Our method fosters user engagement by offering real-time commentary on the optimization progress, explaining the reasoning behind the search strategies. We validate the effectiveness of our approach on synthetic benchmarks with up to 15 independent variables and demonstrate the ability of LLMs to reason in four real-world experimental tasks where context-aware suggestions boost optimization performance substantially.

experiment, large language model, natural language, (20 more...)

2501.16224

Country:

Europe > United Kingdom > England > Merseyside > Liverpool (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > Oregon > Jackson County > Central Point (0.04)
(2 more...)

Genre: Research Report > Experimental Study (0.46)

Industry:

Materials > Chemicals (0.99)
Energy > Renewable > Solar (0.94)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

From Molecules to Mixtures: Learning Representations of Olfactory Mixture Similarity using Inductive Biases

Tom, Gary, Ser, Cher Tian, Rajaonson, Ella M., Lo, Stanley, Park, Hyun Suk, Lee, Brian K., Sanchez-Lengeling, Benjamin

Olfaction -- how molecules are perceived as odors to humans -- remains poorly understood. Recently, the principal odor map (POM) was introduced to digitize the olfactory properties of single compounds. However, smells in real life are not pure single molecules, but complex mixtures of molecules, whose representations remain relatively under-explored. In this work, we introduce POMMix, an extension of the POM to represent mixtures. Our representation builds upon the symmetries of the problem space in a hierarchical manner: (1) graph neural networks for building molecular embeddings, (2) attention mechanisms for aggregating molecular representations into mixture representations, and (3) cosine prediction heads to encode olfactory perceptual distance in the mixture embedding space. POMMix achieves state-of-the-art predictive performance across multiple datasets. We also evaluate the generalizability of the representation on multiple splits when applied to unseen molecules and mixture sizes. Our work advances the effort to digitize olfaction, and highlights the synergy of domain expertise and deep learning in crafting expressive representations in low-data regimes.

dataset, molecule, representation, (13 more...)

2501.16271

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > New York (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Europe > Germany > Baden-Württemberg > Karlsruhe Region > Weinheim (0.04)

Genre: Research Report (0.83)

Industry:

Materials > Chemicals (1.00)
Health & Medicine (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)

arXiv.org Artificial IntelligenceJan-26-2025

AirIO: Learning Inertial Odometry with Enhanced IMU Feature Observability

Qiu, Yuheng, Xu, Can, Chen, Yutian, Zhao, Shibo, Geng, Junyi, Scherer, Sebastian

Inertial odometry (IO) using only Inertial Measurement Units (IMUs) offers a lightweight and cost-effective solution for Unmanned Aerial Vehicle (UAV) applications, yet existing learning-based IO models often fail to generalize to UAVs due to the highly dynamic and non-linear-flight patterns that differ from pedestrian motion. In this work, we identify that the conventional practice of transforming raw IMU data to global coordinates undermines the observability of critical kinematic information in UAVs. By preserving the body-frame representation, our method achieves substantial performance improvements, with a 66.7% average increase in accuracy across three datasets. Furthermore, explicitly encoding attitude information into the motion network results in an additional 23.8% improvement over prior results. Combined with a data-driven IMU correction model (AirIMU) and an uncertainty-aware Extended Kalman Filter (EKF), our approach ensures robust state estimation under aggressive UAV maneuvers without relying on external sensors or control inputs. Notably, our method also demonstrates strong generalizability to unseen data not included in the training set, underscoring its potential for real-world UAV applications.

artificial intelligence, dataset, machine learning, (19 more...)

2501.15659

Country: North America > United States > Pennsylvania (0.28)

Genre: Research Report (0.82)

Industry:

Information Technology > Robotics & Automation (0.88)
Aerospace & Defense > Aircraft (0.66)
Materials > Chemicals > Industrial Gases > Liquified Gas (0.46)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Haddadnia, Mohammad, Grashoff, Leonie, Strieth-Kalthoff, Felix

BoTier: Multi-Objective Bayesian Optimization with Tiered Composite Objectives

arXiv.org Machine LearningJan-26-2025

Scientific optimization problems are usually concerned with balancing multiple competing objectives, which come as preferences over both the outcomes of an experiment (e.g. maximize the reaction yield) and the corresponding input parameters (e.g. minimize the use of an expensive reagent). Typically, practical and economic considerations define a hierarchy over these objectives, which must be reflected in algorithms for sample-efficient experiment planning. Herein, we introduce BoTier, a composite objective that can flexibly represent a hierarchy of preferences over both experiment outcomes and input parameters. We provide systematic benchmarks on synthetic and real-life surfaces, demonstrating the robust applicability of BoTier across a number of use cases. Importantly, BoTier is implemented in an auto-differentiable fashion, enabling seamless integration with the BoTorch library, thereby facilitating adoption by the scientific community.

artificial intelligence, machine learning, objective, (14 more...)

arXiv.org Machine Learning

2501.15554

Country:

North America > United States > Massachusetts > Suffolk County > Boston (0.04)
Europe > Germany (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (1.00)

Industry: Materials > Chemicals (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)