AITopics | mep

07a363fd2263091c2063998e0034999c-Paper-Conference.pdf

Neural Information Processing Systems, Apr-24-2026, 11:29:46 GMT

machine learning, natural language, reinforcement learning, (18 more...)

Neural Information Processing Systems

Country: Asia > Japan (0.28)

Genre: Research Report (0.46)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.96)

An Efficient End-to-End Training Approach for Zero-Shot Human-AI Coordination

Xue Yan

Neural Information Processing Systems, Feb-7-2026, 13:23:49 GMT

The goal of zero-shot human-AI coordination is to develop an agent capable of collaborating with humans without relying on human data.

large language model, machine learning, reinforcement learning, (19 more...)

Neural Information Processing Systems

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Europe > United Kingdom > England > Greater London > London (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.46)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.64)

Entropic Confinement and Mode Connectivity in Overparameterized Neural Networks

Di Carlo, Luca, Goddard, Chase, Schwab, David J.

arXiv.org Machine Learning, Dec-9-2025

Modern neural networks exhibit a striking property: basins of attraction in the loss landscape are often connected by low-loss paths, yet optimization dynamics generally remain confined to a single convex basin (Baity-Jesi et al., 2019; Juneja et al., 2023) and rarely explore intermediate points. We resolve this paradox by identifying entropic barriers arising from the interplay between curvature variations along these paths and noise in optimization dynamics. Empirically, we find that curvature systematically rises away from minima, producing effective forces that bias noisy dynamics back toward the endpoints -- even when the loss remains nearly flat. These barriers persist longer than energetic barriers, shaping the late-time localization of solutions in parameter space. Our results highlight the role of curvature-induced entropic forces in governing both connectivity and confinement in deep learning landscapes.

Deep neural networks trained in the overparametrized regime exhibit a number of surprising and counterintuitive properties. One of the most striking is the observation that distinct solutions, found with standard optimization algorithms, are often connected by low-loss paths in parameter space (Garipov et al., 2018; Draxler et al., 2018; Frankle et al., 2020). Such mode connectivity results imply that the landscape is far less rugged than once assumed: minima that appear isolated are, in fact, linked by paths of low, nearly constant loss. At the same time, however, optimization dynamics display a seemingly contradictory behavior.
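
A toy illustration of how curvature alone can confine noisy dynamics, offered as a minimal sketch and not as the authors' model: assume a two-parameter loss L(x, y) = 0.5 * k(x) * y^2, which is exactly zero along the path y = 0, and assume the noise samples a Boltzmann distribution at temperature T. Integrating out the transverse direction y then leaves an effective free energy F(x) = (T/2) log k(x) along the path. The curvature profile `curvature` and the temperature are hypothetical choices made only for illustration.

```python
import math


def curvature(x):
    # Hypothetical transverse curvature k(x) along a path x in [0, 1]:
    # lowest at the two endpoints (the minima), highest midway.
    return 1.0 + 8.0 * x * (1.0 - x)


def entropic_free_energy(x, temperature=1.0):
    # For L(x, y) = 0.5 * k(x) * y**2, the Gaussian integral over y gives
    # Z(x) = sqrt(2 * pi * T / k(x)), so F(x) = -T * log Z(x)
    # = (T / 2) * log k(x) up to an x-independent constant.
    return 0.5 * temperature * math.log(curvature(x))


# The loss on the path itself is zero everywhere, yet the free energy
# is higher in the middle than at the endpoints.
path = [i / 10 for i in range(11)]
barrier = [entropic_free_energy(x) for x in path]
```

In this sketch the barrier scales with the temperature, so it vanishes without noise, which is the sense in which it is entropic and not energetic.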

curvature, entropic force, minima, (16 more...)

arXiv.org Machine Learning

2512.06297

Country:

North America > United States > New Jersey > Mercer County > Princeton (0.04)
Africa > Middle East > Tunisia > Ben Arous Governorate > Ben Arous (0.04)

Genre: Research Report > New Finding (0.88)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.55)

High-Resolution Probabilistic Data-Driven Weather Modeling with a Stretched-Grid

Nordhagen, Even Marius, Haugen, Håvard Homleid, Salihi, Aram Farhad Shafiq, Ingstad, Magnus Sikora, Nipen, Thomas Nils, Seierstad, Ivar Ambjørn, Frogner, Inger-Lise, Clare, Mariana, Lang, Simon, Chantry, Matthew, Dueben, Peter, Kristiansen, Jørn

arXiv.org Artificial Intelligence, Dec-1-2025

We present a probabilistic data-driven weather model capable of providing an ensemble of high spatial resolution realizations of 87 variables at arbitrary forecast length and ensemble size. The model uses a stretched grid, dedicating 2.5 km resolution to a region of interest, and 31 km resolution elsewhere. Based on a stochastic encoder-decoder architecture, the model is trained using a loss function based on the Continuous Ranked Probability Score (CRPS) evaluated point-wise in real and spectral space. The spectral loss component is shown to be necessary to create fields that are spatially coherent. The model is compared to high-resolution operational numerical weather prediction forecasts from the MetCoOp Ensemble Prediction System (MEPS), showing competitive forecasts when evaluated against observations from surface weather stations. The model produces fields that are more spatially coherent than those of models based on mean squared error and of CRPS-based models without the spectral component in the loss.
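
For readers unfamiliar with the CRPS, the following minimal sketch shows the standard ensemble estimator for a single scalar value. It is an assumption that this plain estimator is a useful stand-in: the abstract does not say which CRPS variant the authors use, and the spectral-space term is not reproduced here. The function name `crps_ensemble` is hypothetical.

```python
def crps_ensemble(members, observation):
    """Standard ensemble estimate of the CRPS for one scalar observation.

    CRPS = mean |x_i - y| - 0.5 * mean |x_i - x_j|
    Lower is better; for a single member it reduces to the absolute error.
    """
    m = len(members)
    if m == 0:
        raise ValueError("need at least one ensemble member")
    # Accuracy term: average distance between members and the observation.
    skill = sum(abs(x - observation) for x in members) / m
    # Spread term: average pairwise distance between members.
    spread = sum(abs(a - b) for a in members for b in members) / (2 * m * m)
    return skill - spread


score = crps_ensemble([1.0, 3.0], 2.0)
```

The spread term rewards ensembles that are appropriately dispersed, which is what distinguishes the CRPS from a mean absolute error on the ensemble members.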

artificial intelligence, high-resolution probabilistic data-driven weather modeling, machine learning, (11 more...)

arXiv.org Artificial Intelligence

2511.23043

Country: Europe > Norway (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)

Exploring Image Generation via Mutually Exclusive Probability Spaces and Local Correlation Hypothesis

Zhao, Chenqiu, Basu, Anup

arXiv.org Artificial Intelligence, Sep-24-2025

A common assumption in probabilistic generative models for image generation is that learning the global data distribution suffices to generate novel images via sampling. We investigate the limitation of this core assumption, namely that learning global distributions leads to memorization rather than generative behavior. We propose two theoretical frameworks, the Mutually Exclusive Probability Space (MEPS) and the Local Dependence Hypothesis (LDH), for investigation. MEPS arises from the observation that deterministic mappings (e.g. We further propose a lower bound in terms of the overlap coefficient, and introduce a Binary Latent Autoencoder (BL-AE) that encodes images into signed binary latent representations. LDH formalizes dependence within a finite observation radius, which motivates our γ-Autoregressive Random Variable Model (γ-ARVM). Using γ-ARVM, we observe that as the observation range increases, autoregressive models progressively shift toward memorization. In the limit of global dependence, the model behaves as a pure memorizer when operating on the binary latents produced by our BL-AE. Comprehensive experiments and discussions support our investigation.

Figure 1: Selecting images for values in the overlap range is ambiguous.

Probabilistic generative models such as Variational Autoencoders (VAEs), Generative Adversarial Networks (GANs), diffusion models, and autoregressive models have achieved remarkable progress in image generation. A core assumption is that these models learn an image distribution from which new images can be generated via sampling (Bond-Taylor et al., 2022). Specifically, we focus on autoregressive models. For this investigation, we introduce two theoretical frameworks.
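
The abstract mentions a bound in terms of the overlap coefficient without defining it. Assuming it refers to the usual distributional overlap (the sum of pointwise minima of two probability mass functions), a minimal sketch looks like the following. The function name and the example distributions are hypothetical, and the paper's own definition may differ.

```python
def overlap_coefficient(p, q):
    """Overlap of two discrete distributions on the same support.

    Returns sum_i min(p_i, q_i): 1.0 for identical distributions,
    0.0 when their supports are disjoint.
    """
    if len(p) != len(q):
        raise ValueError("distributions must share the same support")
    return sum(min(a, b) for a, b in zip(p, q))


# Two hypothetical distributions that share half of their mass.
shared = overlap_coefficient([0.5, 0.5, 0.0], [0.0, 0.5, 0.5])
```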

artificial intelligence, autoregressive model, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2506.21731

Country: North America (0.28)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Follow the MEP: Scalable Neural Representations for Minimum-Energy Path Discovery in Molecular Systems

Petersen, Magnus, Roig, Gemma, Covino, Roberto

arXiv.org Artificial Intelligence, Sep-22-2025

Characterizing conformational transitions in physical systems remains a fundamental challenge, as traditional sampling methods struggle with the high-dimensional nature of molecular systems and high-energy barriers between stable states. These rare events often represent the most biologically significant processes, yet may require months of continuous simulation to observe. One way to understand the function and mechanics of such systems is through the minimum energy path (MEP), which represents the most probable transition pathway between stable states in the high-friction, low-temperature limit. We present a method that reformulates MEP discovery as a fast and scalable neural optimization problem. By representing paths as implicit neural representations and training with differentiable molecular force fields, our method discovers transition pathways without expensive sampling. Our approach scales to large biomolecular systems through a simple loss function derived from the path's likelihood via the Onsager-Machlup action and a scalable new architecture, AdaPath. We demonstrate this approach on two proteins, including an explicitly hydrated BPTI system with more than 3,500 atoms. Our method identifies an MEP that captures the same conformational change observed in a millisecond-scale molecular dynamics (MD) simulation in just minutes on a standard GPU, rather than weeks on a specialized cluster.
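
To make the Onsager-Machlup action concrete, here is a minimal one-dimensional sketch for overdamped dynamics. It discretizes S = (1/(4D)) ∫ (dx/dt + U'(x)/γ)^2 dt and drops the divergence correction term. The double-well potential, the parameter values, and all function names are hypothetical; this is not the loss or the AdaPath architecture from the paper.

```python
def potential_gradient(x):
    # Hypothetical double-well potential U(x) = (x**2 - 1)**2.
    return 4.0 * x * (x * x - 1.0)


def onsager_machlup_action(path, dt, gamma=1.0, diffusion=1.0):
    """Discretized action of a path under overdamped Langevin dynamics."""
    action = 0.0
    for x_now, x_next in zip(path, path[1:]):
        velocity = (x_next - x_now) / dt
        drift = -potential_gradient(x_now) / gamma
        # Penalize deviations of the path velocity from the deterministic drift.
        action += (velocity - drift) ** 2 * dt / (4.0 * diffusion)
    return action


def relaxation_path(x0, dt, steps, gamma=1.0):
    """Noise-free downhill trajectory (explicit Euler steps along the drift)."""
    path = [x0]
    for _ in range(steps):
        path.append(path[-1] - dt * potential_gradient(path[-1]) / gamma)
    return path


downhill = relaxation_path(0.5, dt=0.01, steps=200)
uphill = downhill[::-1]
downhill_action = onsager_machlup_action(downhill, dt=0.01)
uphill_action = onsager_machlup_action(uphill, dt=0.01)
```

A path that follows the drift costs essentially nothing, while the time-reversed uphill path has a strictly positive action, so minimizing the action over paths between two states singles out the most probable way to climb the barrier.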

artificial intelligence, machine learning, optimization problem, (18 more...)

arXiv.org Artificial Intelligence

2504.16381

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Benchmarking Gender and Political Bias in Large Language Models

Yang, Jinrui, Han, Xudong, Baldwin, Timothy

arXiv.org Artificial Intelligence, Sep-17-2025

We introduce EuroParlVote, a novel benchmark for evaluating large language models (LLMs) in politically sensitive contexts. It links European Parliament debate speeches to roll-call vote outcomes and includes rich demographic metadata for each Member of the European Parliament (MEP), such as gender, age, country, and political group. Using EuroParlVote, we evaluate state-of-the-art LLMs on two tasks -- gender classification and vote prediction -- revealing consistent patterns of bias. We find that LLMs frequently misclassify female MEPs as male and demonstrate reduced accuracy when simulating votes for female speakers. Politically, LLMs tend to favor centrist groups while underperforming on both far-left and far-right ones. Proprietary models like GPT-4o outperform open-weight alternatives in terms of both robustness and fairness. We release the EuroParlVote dataset, code, and demo to support future research on fairness and accountability in NLP within political contexts.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2509.06164

Country:

Europe (1.00)
Asia (1.00)
North America > United States (0.93)

Genre: Research Report > New Finding (1.00)

Industry:

Law > Civil Rights & Constitutional Law (1.00)
Government > Foreign Policy (0.93)
Government > Regional Government > Europe Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)

Transferable Learning of Reaction Pathways from Geometric Priors

Nam, Juno, Steiner, Miguel, Misterka, Max, Yang, Soojung, Singhal, Avni, Gómez-Bombarelli, Rafael

arXiv.org Artificial Intelligence, Apr-23-2025

Identifying minimum-energy paths (MEPs) is crucial for understanding chemical reaction mechanisms but remains computationally demanding. We introduce MEPIN, a scalable machine-learning method for efficiently predicting MEPs from reactant and product configurations, without relying on transition-state geometries or pre-optimized reaction paths during training. The task is defined as predicting deviations from geometric interpolations along reaction coordinates. We address this task with a continuous reaction path model based on a symmetry-broken equivariant neural network that generates a flexible number of intermediate structures. The model is trained using an energy-based objective, with efficiency enhanced by incorporating geometric priors from geodesic interpolation as initial interpolations or pre-training objectives. Our approach generalizes across diverse chemical reactions and achieves accurate alignment with reference intrinsic reaction coordinates, as demonstrated on various small molecule reactions and [3+2] cycloadditions. Our method enables the exploration of large chemical reaction spaces with efficient, data-driven predictions of reaction pathways.
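
The idea of predicting deviations from a geometric interpolation can be sketched as follows. Several things here are assumptions made for illustration: plain linear interpolation stands in for the geodesic interpolation mentioned in the abstract, the callable `deviation` stands in for a learned model, and the `t * (1 - t)` envelope that pins the endpoints is one possible design, not necessarily the one used in MEPIN.

```python
def interpolate(reactant, product, t):
    # Linear interpolation between two flattened coordinate vectors.
    return [(1.0 - t) * r + t * p for r, p in zip(reactant, product)]


def reaction_path(reactant, product, deviation, t):
    """Interpolated geometry plus a correction that vanishes at t=0 and t=1."""
    base = interpolate(reactant, product, t)
    envelope = t * (1.0 - t)  # zero at both endpoints
    return [b + envelope * d for b, d in zip(base, deviation(t))]


# Hypothetical stand-in for a learned deviation model.
def toy_deviation(t):
    return [0.0, 4.0]


reactant = [0.0, 0.0]
product = [2.0, 0.0]
midpoint = reaction_path(reactant, product, toy_deviation, 0.5)
```

Because the correction is evaluated at a continuous coordinate `t`, the same function can return any number of intermediate structures along the path.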

artificial intelligence, machine learning, reaction, (19 more...)

arXiv.org Artificial Intelligence

2504.1537

Country: North America > United States > Massachusetts (0.29)

Genre: Research Report (1.00)

Industry:

Energy (0.93)
Government > Regional Government > North America Government > United States Government (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.89)

EvoAgent: Agent Autonomous Evolution with Continual World Model for Long-Horizon Tasks

Feng, Tongtong, Wang, Xin, Zhou, Zekai, Wang, Ren, Zhan, Yuwei, Li, Guangyao, Li, Qing, Zhu, Wenwu

arXiv.org Artificial Intelligence, Feb-9-2025

Completing Long-Horizon (LH) tasks in open-ended worlds is an important yet difficult problem for embodied agents. Existing approaches suffer from two key challenges: (1) they heavily rely on experiences obtained from human-created data or curricula, lacking the ability to continuously update multimodal experiences, and (2) they may encounter catastrophic forgetting issues when faced with new tasks, lacking the ability to continuously update world knowledge. To solve these challenges, this paper presents EvoAgent, an autonomous-evolving agent with a continual World Model (WM), which can autonomously complete various LH tasks across environments through self-planning, self-control, and self-reflection, without human intervention. Our proposed EvoAgent contains three modules, i.e., i) the memory-driven planner, which uses an LLM along with the WM and interaction memory to convert LH tasks into executable sub-tasks; ii) the WM-guided action controller, which leverages the WM to generate low-level actions and incorporates a self-verification mechanism to update multimodal experiences; iii) the experience-inspired reflector, which implements a two-stage curriculum learning algorithm to select experiences for task-adaptive WM updates. Moreover, we develop a continual World Model for EvoAgent, which can continuously update the multimodal experience pool and world knowledge through closed-loop dynamics. We conducted extensive experiments on Minecraft; compared with existing methods, EvoAgent achieves an average success rate improvement of 105% and reduces ineffective actions by more than 6x.

evoagent, machine learning, reinforcement learning, (14 more...)

arXiv.org Artificial Intelligence

2502.05907

Country:

Oceania > Australia > New South Wales > Sydney (0.04)
Asia > Middle East > Jordan (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment > Games > Computer Games (0.89)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.69)

Increasing transformer token length with a Maximum Entropy Principle Method

Cukier, R. I.

arXiv.org Artificial Intelligence, Aug-17-2024

Transformers suffer from the computational overhead of their quadratic dependence on the length of sequences processed. We present three methods, all adding an intermediate step between training and inference/generation, which extend the autoregressive length of transformers. All rely on a Maximum Entropy Principle (MEP) whereby entropy is maximized in the presence of suitable constraints, accounted for by use of Lagrange Multipliers. These constraint methods extend the autoregressive character from T to 2T tokens in a linear-with-T fashion. There is overhead associated with this added step, but they should still be faster than the standard methods.
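
As background on the Maximum Entropy Principle with a Lagrange multiplier, here is a minimal sketch of the textbook loaded-die problem, not the constraint methods from the paper. Maximizing entropy subject to a fixed mean gives probabilities proportional to exp(-λ v), and the multiplier λ is found numerically. The function name, the bisection bounds, and the target mean of 4.5 are hypothetical choices.

```python
import math


def maxent_distribution(values, target_mean, lo=-50.0, hi=50.0, iterations=200):
    """Maximum-entropy distribution over `values` with a prescribed mean.

    The solution has the form p_i = exp(-lam * v_i) / Z, where lam is the
    Lagrange multiplier of the mean constraint. The mean decreases
    monotonically as lam grows, so lam can be found by bisection.
    """
    def weights_for(lam):
        return [math.exp(-lam * v) for v in values]

    def mean_for(lam):
        w = weights_for(lam)
        return sum(v * wi for v, wi in zip(values, w)) / sum(w)

    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mean_for(mid) > target_mean:
            lo = mid  # mean too high: increase lam
        else:
            hi = mid  # mean too low: decrease lam
    lam = 0.5 * (lo + hi)
    w = weights_for(lam)
    z = sum(w)
    return lam, [wi / z for wi in w]


faces = [1, 2, 3, 4, 5, 6]
lam, probs = maxent_distribution(faces, target_mean=4.5)
mean = sum(f * p for f, p in zip(faces, probs))
```

With a target mean above the fair-die value of 3.5, the multiplier comes out negative and the probabilities increase with the face value.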

constraint equation, equation, probability, (11 more...)

arXiv.org Artificial Intelligence

2408.10277

Country:

North America > United States > New York (0.04)
North America > United States > Michigan > Ingham County > Lansing (0.04)
North America > United States > Michigan > Ingham County > East Lansing (0.04)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Maximum Entropy (0.61)
