 Nashua




Meta AI adviser spreads disinformation about shootings, vaccines and trans people

The Guardian

Robby Starbuck speaks in an interview in New York in March. Critics condemn Robby Starbuck, appointed in lawsuit settlement, for 'peddling lies and pushing extremism'. A prominent anti-DEI campaigner appointed by Meta in August as an adviser on AI bias has spent the weeks since his appointment spreading disinformation about shootings, transgender people, vaccines, crime, and protests. Robby Starbuck, 36, of Nashville, was appointed as an adviser by Meta - owner of Facebook, Instagram, WhatsApp, and other tech platforms - as part of an August lawsuit settlement. Since his appointment, Starbuck has baselessly claimed that individual shooters in the US were motivated by leftist ideology, described faith-based protest groups as communists, and without evidence tied Democratic lawmakers to murders.


A global log for medical AI

Noori, Ayush, Rodman, Adam, Karthikesalingam, Alan, Mateen, Bilal A., Longhurst, Christopher A., Yang, Daniel, deBronkart, Dave, Galea, Gauden, Wolf, Harold F. III, Waxman, Jacob, Mandel, Joshua C., Rotich, Juliana, Mandl, Kenneth D., Mustafa, Maryam, Miles, Melissa, Shah, Nigam H., Lee, Peter, Korom, Robert, Mahoney, Scott, Hain, Seth, Wong, Tien Yin, Mundel, Trevor, Natarajan, Vivek, Dagan, Noa, Clifton, David A., Balicer, Ran D., Kohane, Isaac S., Zitnik, Marinka

arXiv.org Artificial Intelligence

Modern computer systems often rely on syslog, a simple, universal protocol that records every critical event across heterogeneous infrastructure. However, healthcare's rapidly growing clinical AI stack has no equivalent. As hospitals rush to pilot large language models and other AI-based clinical decision support tools, we still lack a standard way to record how, when, by whom, and for whom these AI models are used. Without that transparency and visibility, it is challenging to measure real-world performance and outcomes, detect adverse events, or correct bias or dataset drift. In the spirit of syslog, we introduce MedLog, a protocol for event-level logging of clinical AI. Any time an AI model is invoked to interact with a human, interface with another algorithm, or act independently, a MedLog record is created. This record consists of nine core fields: header, model, user, target, inputs, artifacts, outputs, outcomes, and feedback, providing a structured and consistent record of model activity. To encourage early adoption, especially in low-resource settings, and minimize the data footprint, MedLog supports risk-based sampling, lifecycle-aware retention policies, and write-behind caching; detailed traces for complex, agentic, or multi-stage workflows can also be captured under MedLog. MedLog can catalyze the development of new databases and software to store and analyze MedLog records. Realizing this vision would enable continuous surveillance, auditing, and iterative improvement of medical AI, laying the foundation for a new form of digital epidemiology.
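To make the record structure concrete, here is a minimal sketch of what a single MedLog record could look like in Python. Only the nine top-level field names are taken from the abstract; the nested keys (`event_id`, `timestamp`, `name`, `role`, and so on), the use of plain dictionaries, and the JSON serialization are illustrative assumptions, not part of the MedLog specification.

```python
import json
import uuid
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Any, Dict

# The nine core field names listed in the abstract, in the order given there.
CORE_FIELDS = ("header", "model", "user", "target", "inputs",
               "artifacts", "outputs", "outcomes", "feedback")


def _new_header() -> Dict[str, Any]:
    # Hypothetical header contents: a unique event id and a UTC timestamp.
    return {
        "event_id": str(uuid.uuid4()),
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }


@dataclass
class MedLogRecord:
    """One event-level record; the nested structure is assumed, not specified."""
    header: Dict[str, Any] = field(default_factory=_new_header)
    model: Dict[str, Any] = field(default_factory=dict)      # which model ran
    user: Dict[str, Any] = field(default_factory=dict)       # who invoked it
    target: Dict[str, Any] = field(default_factory=dict)     # for whom it ran
    inputs: Dict[str, Any] = field(default_factory=dict)
    artifacts: Dict[str, Any] = field(default_factory=dict)
    outputs: Dict[str, Any] = field(default_factory=dict)
    outcomes: Dict[str, Any] = field(default_factory=dict)   # may be filled in later
    feedback: Dict[str, Any] = field(default_factory=dict)   # may be filled in later

    def to_json(self) -> str:
        # Serialize the whole record as a single JSON object.
        return json.dumps(asdict(self), sort_keys=True)


# Example values below are made up for illustration only.
record = MedLogRecord(
    model={"name": "example-llm", "version": "0.1"},
    user={"role": "clinician"},
    target={"kind": "patient", "pseudonym": "p-001"},
    inputs={"prompt_chars": 412},
    outputs={"summary_chars": 980},
)
print(record.to_json())
```

Leaving `outcomes` and `feedback` empty at creation time reflects the assumption that they are appended after the model call, once downstream results and user reactions are known.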



Breaking Reversibility Accelerates Langevin Dynamics for Non-Convex Optimization

Neural Information Processing Systems

Langevin dynamics (LD) has proven to be a powerful technique for optimizing a non-convex objective: it efficiently finds local minima while eventually visiting a global minimum on longer time-scales.
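As a rough illustration, the sketch below runs the standard Euler discretization of overdamped Langevin dynamics, x ← x − η∇f(x) + sqrt(2η/β)·ξ, on a toy two-dimensional double-well objective. The optional parameter `a` adds an antisymmetric matrix J to the drift, giving −(I + J)∇f, which is one common way to make the dynamics non-reversible; whether this matches the paper's exact construction is an assumption, since the listing does not say. The objective, step size, and inverse temperature are likewise illustrative choices.

```python
import math
import random


def grad_f(x, y):
    # Gradient of the toy objective f(x, y) = (x^2 - 1)^2 + y^2,
    # which is non-convex in x with minima at (1, 0) and (-1, 0).
    return 4.0 * x * (x * x - 1.0), 2.0 * y


def langevin(steps, eta, beta, a=0.0, seed=0, start=(2.0, 2.0)):
    """Euler-discretized Langevin dynamics with drift -(I + J) grad f.

    J = [[0, a], [-a, 0]] is antisymmetric; a = 0 recovers the usual
    reversible (gradient-only) dynamics.
    """
    rng = random.Random(seed)
    x, y = start
    noise = math.sqrt(2.0 * eta / beta)  # noise scale per step
    for _ in range(steps):
        gx, gy = grad_f(x, y)
        drift_x = -(gx + a * gy)   # first row of -(I + J) grad f
        drift_y = -(gy - a * gx)   # second row of -(I + J) grad f
        x += eta * drift_x + noise * rng.gauss(0.0, 1.0)
        y += eta * drift_y + noise * rng.gauss(0.0, 1.0)
    return x, y


# Low temperature (large beta): the iterate should settle near a minimum.
reversible = langevin(steps=2000, eta=0.01, beta=1e6, a=0.0)
non_reversible = langevin(steps=2000, eta=0.01, beta=1e6, a=0.5)
print(reversible, non_reversible)
```

Because J is antisymmetric, the extra drift term is orthogonal to the gradient and does not change the rate at which f decreases in continuous time; it only rotates the trajectory.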


Trust-Region Twisted Policy Improvement

de Vries, Joery A., He, Jinke, Oren, Yaniv, Spaan, Matthijs T. J.

arXiv.org Artificial Intelligence

Monte-Carlo tree search (MCTS) has driven many recent breakthroughs in deep reinforcement learning (RL). However, scaling MCTS to parallel compute has proven challenging in practice, which has motivated alternative planners like sequential Monte-Carlo (SMC). Many of these SMC methods adopt particle filters for smoothing through a reformulation of RL as a policy inference problem. Yet, persisting design choices of these particle filters often conflict with the aim of online planning in RL, which is to obtain a policy improvement at the start of planning. Drawing inspiration from MCTS, we tailor SMC planners specifically for RL by improving data generation within the planner through constrained action sampling and explicit terminal state handling, as well as improving policy and value target estimation. This leads to our Trust-Region Twisted SMC (TRT-SMC), which shows improved runtime and sample-efficiency over baseline MCTS and SMC methods in both discrete and continuous domains.


Pro-Routing: Proactive Routing of Autonomous Multi-Capacity Robots for Pickup-and-Delivery Tasks

Garces, Daniel, Gil, Stephanie

arXiv.org Artificial Intelligence

We consider a multi-robot setting, where we have a fleet of multi-capacity autonomous robots that must service spatially distributed pickup-and-delivery requests with fixed maximum wait times. Requests can either be scheduled ahead of time or enter the system in real time. In this setting, stability for a routing policy is defined as the cost of the policy being uniformly bounded over time. Most previous work either solves the problem offline to theoretically maintain stability or considers dynamically arriving requests at the expense of theoretical guarantees on stability. In this paper, we aim to bridge this gap by proposing a novel proactive rollout-based routing framework that adapts to real-time demand while still provably maintaining the stability of the learned routing policy. We derive provable stability guarantees for our method by proposing a fleet sizing algorithm that obtains a sufficiently large fleet that ensures stability by construction. To validate our theoretical results, we consider a case study on real ride requests for Harvard's evening Van System. We also evaluate the performance of our framework using the currently deployed smaller fleet size. In this smaller setup, we compare against the currently deployed routing algorithm, greedy heuristics, and Monte-Carlo-Tree-Search-based algorithms. Our empirical results show that our framework maintains stability when we use the sufficiently large fleet size found in our theoretical results. For the smaller currently deployed fleet size, our method services 6% more requests than the closest baseline while reducing median passenger wait times by 33%.


OptiChat: Bridging Optimization Models and Practitioners with Large Language Models

Chen, Hao, Constante-Flores, Gonzalo Esteban, Mantri, Krishna Sri Ipsit, Kompalli, Sai Madhukiran, Ahluwalia, Akshdeep Singh, Li, Can

arXiv.org Artificial Intelligence

Optimization models have been applied to solve a wide variety of decision-making problems. These models are usually developed by optimization experts but are used by practitioners without optimization expertise in various application domains. As a result, practitioners often struggle to interact with and draw useful conclusions from optimization models independently. To fill this gap, we introduce OptiChat, a natural language dialogue system designed to help practitioners interpret model formulation, diagnose infeasibility, analyze sensitivity, retrieve information, evaluate modifications, and provide counterfactual explanations. By augmenting large language models (LLMs) with functional calls and code generation tailored for optimization models, we enable seamless interaction and minimize the risk of hallucinations in OptiChat. We develop a new dataset to evaluate OptiChat's performance in explaining optimization models. Experiments demonstrate that OptiChat effectively bridges the gap between optimization models and practitioners, delivering autonomous, accurate, and instant responses.


Recent Advances in Non-convex Smoothness Conditions and Applicability to Deep Linear Neural Networks

Patel, Vivak, Varner, Christian

arXiv.org Artificial Intelligence

The presence of non-convexity in smooth optimization problems arising from deep learning has sparked new smoothness conditions in the literature and corresponding convergence analyses. We discuss these smoothness conditions, order them, provide conditions for determining whether they hold, and evaluate their applicability to training a deep linear neural network for binary classification.