mace
MACE: Higher Order Equivariant Message Passing Neural Networks for Fast and Accurate Force Fields
Creating fast and accurate force fields is a long-standing challenge in computational chemistry and materials science. Recently, Equivariant Message Passing Neural Networks (MPNNs) have emerged as a powerful tool for building machine learning interatomic potentials, outperforming other approaches in terms of accuracy. However, they suffer from high computational cost and poor scalability. Moreover, most MPNNs pass only two-body messages, leading to an intricate relationship between the number of layers and the expressivity of the features. This work introduces MACE, a new equivariant MPNN model that uses higher-order messages, and demonstrates that this leads to an improved learning law. We show that by using four-body messages, the required number of message passing iterations reduces to just one, resulting in a fast and highly parallelizable model that reaches or exceeds state-of-the-art accuracy on the rMD17 and 3BPA benchmark tasks. Our implementation is available at https://github.com/ACEsuit/mace.
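The core idea of the abstract, raising the body order of messages by forming products of pooled two-body features rather than stacking more message-passing layers, can be illustrated with a toy sketch. This is not the actual MACE implementation; all function names, the radial basis, and the channel counts below are illustrative assumptions in the spirit of ACE/MACE A- and B-features.

```python
# Toy sketch (not the real MACE code): higher body order via products of
# pooled neighbor features. All names and bases here are assumptions.
import math

def radial_basis(r, k):
    # toy radial feature; the real model uses learnable radial bases
    return math.exp(-k * r)

def atomic_basis(distances, num_channels=3):
    # "A-features": pool two-body contributions over all neighbors at once
    return [sum(radial_basis(r, k + 1) for r in distances)
            for k in range(num_channels)]

def higher_order_features(A, correlation=3):
    # "B-features": products of A-features raise the body order without
    # extra message-passing layers (correlation 3 ~ four-body messages)
    feats = list(A)
    prev = list(A)
    for _ in range(correlation - 1):
        prev = [p * a for p in prev for a in A]
        feats.extend(prev)
    return feats

A = atomic_basis([1.0, 1.5, 2.0])   # one atom, three neighbor distances
B = higher_order_features(A, correlation=3)
print(len(A), len(B))               # 3 channels -> 3 + 9 + 27 = 39 features
```

Because pooling over neighbors happens once before the products are taken, the cost stays linear in the number of neighbors, which is the mechanism behind the speed claim in the abstract.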
Learning Potential Energy Surfaces of Hydrogen Atom Transfer Reactions in Peptides
Neubert, Marlen, Reiser, Patrick, Gräter, Frauke, Friederich, Pascal
Hydrogen atom transfer (HAT) reactions are essential in many biological processes, such as radical migration in damaged proteins, but their mechanistic pathways remain incompletely understood. Simulating HAT is challenging due to the need for quantum chemical accuracy at biologically relevant scales; thus, neither classical force fields nor DFT-based molecular dynamics are applicable. Machine-learned potentials offer an alternative, able to learn potential energy surfaces (PESs) with near-quantum accuracy. However, training these models to generalize across diverse HAT configurations, especially at radical positions in proteins, requires tailored data generation and careful model selection. Here, we systematically generate HAT configurations in peptides to build large datasets using semiempirical methods and DFT. We benchmark three graph neural network architectures (SchNet, Allegro, and MACE) on their ability to learn HAT PESs and indirectly predict reaction barriers from energy predictions. MACE consistently outperforms the others in energy, force, and barrier prediction, achieving a mean absolute error of 1.13 kcal/mol on out-of-distribution DFT barrier predictions. Using molecular dynamics, we show our MACE potential is stable, reactive, and generalizes beyond training data to model HAT barriers in collagen I. This accuracy enables integration of ML potentials into large-scale collagen simulations to compute reaction rates from predicted barriers, advancing mechanistic understanding of HAT and radical migration in peptides. We analyze scaling laws, model transferability, and cost-performance trade-offs, and outline strategies for improvement by combining ML potentials with transition state search algorithms and active learning. Our approach is generalizable to other biomolecular systems, enabling quantum-accurate simulations of chemical reactivity in complex environments.
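The abstract's "indirectly predict reaction barriers from energy predictions" step can be sketched as follows: read the barrier off an ML-predicted energy profile along the reaction path and score it against a reference with a mean absolute error. The profiles and values below are made up for illustration; they are not data from the paper.

```python
# Hedged sketch: estimating a HAT barrier from predicted energy profiles
# and scoring it against reference barriers. All numbers are invented.

def barrier(profile):
    # forward barrier: highest point relative to the reactant-side minimum
    return max(profile) - profile[0]

def mae(pred, ref):
    return sum(abs(p - r) for p, r in zip(pred, ref)) / len(pred)

# energies (kcal/mol) along two hypothetical HAT paths: ML vs. reference DFT
ml_profiles = [[0.0, 8.2, 17.5, 9.1, 2.3], [0.0, 6.9, 14.0, 5.5]]
dft_profiles = [[0.0, 8.0, 18.4, 9.0, 2.1], [0.0, 7.2, 13.1, 5.8]]

ml_barriers = [barrier(p) for p in ml_profiles]
dft_barriers = [barrier(p) for p in dft_profiles]
print(round(mae(ml_barriers, dft_barriers), 2))  # 0.9 kcal/mol
```

A per-barrier MAE like this is what the reported 1.13 kcal/mol out-of-distribution figure summarizes over the full test set.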
- Europe > Germany > Baden-Württemberg > Karlsruhe Region > Karlsruhe (0.04)
- Europe > Germany > Rheinland-Pfalz > Mainz (0.04)
- Health & Medicine > Pharmaceuticals & Biotechnology (0.93)
- Energy (0.93)
- Europe > Germany > Baden-Württemberg > Stuttgart Region > Stuttgart (0.04)
- Europe > Italy (0.04)
- Energy (1.00)
- Health & Medicine > Pharmaceuticals & Biotechnology (0.92)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.05)
- North America > Canada > British Columbia > Vancouver (0.04)
- Europe > France (0.04)
Balancing Quality and Variation: Spam Filtering Distorts Data Label Distributions
Fleisig, Eve, Orlikowski, Matthias, Cimiano, Philipp, Klein, Dan
For machine learning datasets to accurately represent diverse opinions in a population, they must preserve variation in data labels while filtering out spam or low-quality responses. How can we balance annotator reliability and representation? We empirically evaluate how a range of heuristics for annotator filtering affect the preservation of variation on subjective tasks. We find that these methods, designed for contexts in which variation from a single ground-truth label is considered noise, often remove annotators who disagree instead of spam annotators, introducing suboptimal tradeoffs between accuracy and label diversity. We find that conservative settings for annotator removal (<5%) are best, after which all tested methods increase the mean absolute error from the true average label. Analyzing performance on synthetic spam, we observe that these methods often assume spam annotators are more random than real spammers tend to be: most spammers are distributionally indistinguishable from real annotators, and the minority that are distinguishable tend to give relatively fixed answers, not random ones. Thus, tasks requiring the preservation of variation reverse the intuition of existing spam filtering methods: spammers tend to be less random than non-spammers, so metrics that assume variation is spam fare worse. These results highlight the need for spam removal methods that account for label diversity.
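The failure mode described above can be made concrete with a toy example: a common heuristic removes the annotator whose labels deviate most from the per-item means, and on subjective data this flags a consistent minority viewpoint before it flags a fixed-answer spammer. The annotators, labels, and the specific deviation heuristic below are synthetic illustrations, not the paper's exact methods or data.

```python
# Illustrative sketch of the tension the paper describes: a deviation-based
# filter removes principled disagreers before (non-random) spammers.
# Annotators, labels (1-5 scale), and the heuristic are all synthetic.

def item_means(labels):
    # labels: {annotator: [label per item]}
    n_items = len(next(iter(labels.values())))
    return [sum(v[i] for v in labels.values()) / len(labels)
            for i in range(n_items)]

def deviation(labels, annotator):
    # mean absolute deviation of one annotator from the per-item means
    means = item_means(labels)
    return sum(abs(l - m) for l, m in zip(labels[annotator], means)) / len(means)

labels = {
    "majority_1": [1, 1, 5, 5],
    "majority_2": [1, 2, 5, 4],
    "dissenter":  [4, 5, 2, 1],   # consistent minority viewpoint
    "spammer":    [3, 3, 3, 3],   # fixed-answer spam, close to every mean
}

ranked = sorted(labels, key=lambda a: deviation(labels, a), reverse=True)
print(ranked[0])  # the heuristic flags the dissenter first, not the spammer
```

The fixed-answer spammer sits near the mean on every item and so looks maximally "reliable" to the heuristic, which is exactly the reversal of intuition the paper reports.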
- Europe > Austria > Vienna (0.14)
- North America > Canada > Ontario > Toronto (0.04)
- Asia > Singapore (0.04)
- (18 more...)
- Information Technology > Security & Privacy > Spam Filtering (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Communications > Social Media > Crowdsourcing (0.69)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)
Nancy Mace Curses, Berates Confused Cops in Airport Meltdown: Police Report
At an airport in South Carolina on Thursday, US representative Nancy Mace called police officers "fucking incompetent" and berated them repeatedly, according to an incident report.

Nancy Mace, the South Carolina Republican congresswoman, unleashed a tirade against law enforcement at the Charleston International Airport on Thursday, WIRED has learned. According to an incident report obtained by WIRED under South Carolina's Freedom of Information Act, Mace cursed at police officers, making repeated derogatory comments toward them. The report says that a Transportation Security Administration (TSA) supervisor told officers that Mace had treated their staff similarly and that they would be reporting her to their superiors. According to the report, officers with the Charleston County Aviation Authority Police Department were tasked with meeting Mace at 6:30 am to escort her from the curb to her flight and had been told that she would be arriving in a white BMW at the ticketing curb area.
- North America > United States > South Carolina > Charleston County (0.25)
- North America > United States > California (0.15)
- North America > United States > New York (0.06)
- (3 more...)
MACE: A Hybrid LLM Serving System with Colocated SLO-aware Continuous Retraining Alignment
Li, Yufei, Fu, Yu, Dong, Yue, Liu, Cong
Large language models (LLMs) deployed on edge servers are increasingly used in latency-sensitive applications such as personalized assistants, recommendation, and content moderation. However, the non-stationary nature of user data necessitates frequent retraining, which introduces a fundamental tension between inference latency and model accuracy under constrained GPU resources. Existing retraining strategies either delay model updates, over-commit resources to retraining, or overlook iteration-level retraining granularity. In this paper, we identify that iteration-level scheduling is crucial for adapting retraining frequency to model drift without violating service-level objectives (SLOs). We propose MACE, a hybrid LLM system that colocates concurrent inference (prefill, decode) and fine-tuning, with intelligent memory management to maximize task performance while preserving inference throughput. MACE leverages the insight that not all model updates equally affect output alignment and allocates GPU cycles accordingly to balance throughput, latency, and update freshness. Our trace-driven evaluation shows that MACE matches or exceeds continuous retraining while reducing inference latency by up to 63% and maintaining throughput under resource constraints. Compared to periodic retraining, MACE improves latency breakdown across prefill, decode, and fine-tune stages, and sustains GPU utilization above 85% on the NVIDIA AGX Orin. These results demonstrate that iteration-level hybrid scheduling is a promising direction for deploying LLMs with continual learning capabilities on edge platforms.
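The iteration-level scheduling idea can be sketched in miniature: instead of retraining in coarse periodic batches, squeeze individual fine-tuning iterations into whatever latency slack the SLO leaves after each decode step, and only when model drift warrants it. The constants, the drift threshold, and the scheduling rule below are assumptions for illustration, not MACE's actual policy.

```python
# Minimal sketch (numbers and policy are assumptions, not MACE's actual
# scheduler): colocate fine-tuning iterations with decode work only when
# the per-step SLO slack permits and drift makes an update worthwhile.

SLO_MS = 50.0          # per-step latency target (assumed)
DECODE_MS = 30.0       # cost of one decode iteration (assumed)
FINETUNE_MS = 15.0     # cost of one fine-tuning iteration (assumed)

def schedule(num_steps, drift):
    # drift in [0, 1]: higher drift -> spend the slack on retraining
    plan = []
    for _ in range(num_steps):
        slack = SLO_MS - DECODE_MS
        if drift > 0.5 and FINETUNE_MS <= slack:
            plan.append(("decode", "finetune"))   # colocated, still in SLO
        else:
            plan.append(("decode",))              # inference only
    return plan

low = schedule(4, drift=0.1)
high = schedule(4, drift=0.9)
print(sum("finetune" in s for s in low), sum("finetune" in s for s in high))
```

Even this toy version shows the granularity argument: the decision to retrain is made per iteration against the SLO budget, rather than once per retraining period.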
- North America > United States > California > Santa Clara County > Santa Clara (0.04)
- North America > United States > California > Riverside County > Riverside (0.04)
- North America > Canada > Ontario > Toronto (0.04)
- Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
Automated Evaluation can Distinguish the Good and Bad AI Responses to Patient Questions about Hospitalization
Soni, Sarvesh, Demner-Fushman, Dina
Automated approaches to answering patient-posed health questions are on the rise, but selecting among systems requires reliable evaluation. The current gold standard for evaluating free-text artificial intelligence (AI) responses, human expert review, is labor-intensive and slow, limiting scalability. Automated metrics are promising yet variably aligned with human judgments and often context-dependent. To address the feasibility of automating the evaluation of AI responses to hospitalization-related questions posed by patients, we conducted a large systematic study of evaluation approaches. Across 100 patient cases, we collected responses from 28 AI systems (2800 total) and assessed them along three dimensions: whether a system response (1) answers the question, (2) appropriately uses clinical note evidence, and (3) uses general medical knowledge. Using clinician-authored reference answers to anchor metrics, automated rankings closely matched expert ratings. Our findings suggest that carefully designed automated evaluation can scale comparative assessment of AI systems and support patient-clinician communication.
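The reference-anchored evaluation described above can be sketched with the simplest possible metric: score each system's response by token-overlap F1 against a clinician-written reference answer and rank systems by that score. The metric choice, the reference text, and the system responses below are invented for illustration; the paper's actual metrics and data are not reproduced here.

```python
# Hedged sketch of reference-anchored automated evaluation: rank systems
# by token-overlap F1 against a reference answer. All text is made up.

def f1(pred, ref):
    # bag-of-unique-tokens F1 between a prediction and a reference
    p, r = set(pred.lower().split()), set(ref.lower().split())
    overlap = len(p & r)
    if overlap == 0:
        return 0.0
    prec, rec = overlap / len(p), overlap / len(r)
    return 2 * prec * rec / (prec + rec)

reference = "take the antibiotic twice daily with food"
systems = {
    "system_a": "take the antibiotic twice daily with food as prescribed",
    "system_b": "take your medicine with food",
    "system_c": "call the hospital billing office",
}

ranking = sorted(systems, key=lambda s: f1(systems[s], reference),
                 reverse=True)
print(ranking)
```

In the study's setting, the interesting question is whether a ranking produced this way agrees with the ranking from expert review; the abstract reports that, with clinician-authored references as anchors, it largely does.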
- Europe > Austria > Vienna (0.14)
- North America > United States > Texas > Travis County > Austin (0.04)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- (12 more...)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.49)
- Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.47)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)