
Highly Adaptive Principal Component Regression

Wang, Mingxun, Schuler, Alejandro, van der Laan, Mark, Meixide, Carlos García

arXiv.org Machine Learning

The Highly Adaptive Lasso (HAL) is a nonparametric regression method that achieves almost dimension-free convergence rates under minimal smoothness assumptions, but its implementation can be computationally prohibitive in high dimensions due to the large basis matrix it requires. The Highly Adaptive Ridge (HAR) has been proposed as a scalable alternative. Building on both procedures, we introduce the Principal Component based Highly Adaptive Lasso (PCHAL) and the Principal Component based Highly Adaptive Ridge (PCHAR). These estimators perform an outcome-blind dimension reduction that offers substantial gains in computational efficiency while matching the empirical performance of HAL and HAR. We also uncover a striking spectral link between the leading principal components of the HAL/HAR Gram operator and a discrete sinusoidal basis, revealing an explicit Fourier-type structure underlying the PC truncation.
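The basis-then-PCA-then-penalized-fit pipeline the abstract describes can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a zero-order indicator (HAL-style) basis with quantile knots and pairwise interactions only, and the knot count, component count, and scikit-learn estimators are illustrative choices.

```python
import numpy as np
from itertools import combinations
from sklearn.decomposition import PCA
from sklearn.linear_model import LassoCV, RidgeCV

def hal_basis(X, knots):
    """Zero-order HAL-style basis: indicators 1{x_s >= knot} per
    coordinate, plus pairwise products (deeper interactions omitted
    to keep the sketch small)."""
    n, d = X.shape
    mains = np.column_stack(
        [(X[:, s] >= t).astype(float) for s in range(d) for t in knots[s]]
    )
    inter = [mains[:, i] * mains[:, j]
             for i, j in combinations(range(mains.shape[1]), 2)]
    return np.column_stack([mains] + inter) if inter else mains

rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 2))
y = np.sin(4 * X[:, 0]) + X[:, 1] + 0.1 * rng.normal(size=200)

knots = [np.quantile(X[:, s], np.linspace(0.1, 0.9, 5)) for s in range(2)]
H = hal_basis(X, knots)

# Outcome-blind truncation: the PCA sees only the basis matrix H, never y.
pca = PCA(n_components=20).fit(H)
Z = pca.transform(H)

pchal = LassoCV(cv=5).fit(Z, y)   # principal components + lasso ("PCHAL"-style)
pchar = RidgeCV().fit(Z, y)       # principal components + ridge ("PCHAR"-style)
```

The computational gain comes from fitting the penalized regression on 20 principal-component columns instead of the full 55-column basis, and the truncation stays outcome-blind because y enters only at the final fit.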


That wasn't a strike, Hal!

Slate

They also discuss the retirement of all-time basketball great Diana Taurasi. Finally, they talk about a right-wing proposal for a steroid-fueled sports league. For Afterballs, Alex remembers the careers of Czech hockey superstar Jaromir Jagr as well as the beloved voice of the Pittsburgh Penguins, Mike Lange, who died last month. Episode Notes: Read about robot umpires being tested in Spring Training. Read Time's interview with Diana Taurasi, in which she announced her retirement. Read about the Enhanced Games' push to be a steroid-fueled Olympics.


Automated Code Generation and Validation for Software Components of Microcontrollers

Haug, Sebastian, Böhm, Christoph, Mayer, Daniel

arXiv.org Artificial Intelligence

This paper proposes a method for generating software components for embedded systems, integrating seamlessly into existing implementations without developer intervention. We demonstrate this by automatically generating hardware abstraction layer (HAL) code for GPIO operations on the STM32F407 microcontroller. Using Abstract Syntax Trees (AST) for code analysis and Retrieval-Augmented Generation (RAG) for component generation, our approach enables autonomous code completion for embedded applications.
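The parse-then-retrieve stage of such a pipeline can be sketched as below. This is a toy illustration under stated assumptions: the paper targets C code for the STM32F407, whereas this sketch uses Python's `ast` module and a tiny hypothetical corpus; `index_functions`, `retrieve`, and the GPIO helper names are invented for illustration, and a real RAG system would use embedding-based retrieval feeding an LLM generator.

```python
import ast
import textwrap

# Hypothetical mini-corpus standing in for an existing HAL code base.
corpus = textwrap.dedent("""
    def gpio_set_pin(port, pin):
        '''Set a GPIO pin high'''
        port.BSRR = 1 << pin

    def gpio_clear_pin(port, pin):
        '''Clear a GPIO pin'''
        port.BSRR = 1 << (pin + 16)
""")

def index_functions(source):
    """Walk the AST and record (name, args, docstring) per function."""
    out = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef):
            out[node.name] = {
                "args": [a.arg for a in node.args.args],
                "doc": ast.get_docstring(node) or "",
            }
    return out

def retrieve(query, idx):
    """Naive retrieval: rank functions by word overlap with the query
    (a real RAG system would use embedding similarity here)."""
    q = set(query.lower().split())
    def score(name):
        return len(q & set((name + " " + idx[name]["doc"]).lower().split()))
    return max(idx, key=score)

idx = index_functions(corpus)
best = retrieve("set a gpio pin high", idx)
```

The AST index supplies the generator with exact signatures and docstrings of existing components, which is what lets generated code slot into the surrounding implementation without manual glue.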


Constructing optimal treatment length strategies to maximize quality-adjusted lifetimes

Sun, Hao, Ertefaie, Ashkan, Duttweiler, Luke, Johnson, Brent A.

arXiv.org Machine Learning

Real-world clinical decision making is a complex process that involves balancing the risks and benefits of treatments. Quality-adjusted lifetime is a composite outcome that combines patients' quantity and quality of life, making it an attractive endpoint in clinical research. We propose methods for constructing optimal treatment length strategies to maximize this outcome. Existing methods for estimating optimal treatment strategies for survival outcomes cannot be applied to quality-adjusted lifetimes because the quality adjustment induces informative censoring. We propose a weighted estimating equation that adjusts for both confounding and informative censoring, along with a nonparametric estimator of the mean counterfactual quality-adjusted lifetime survival curve under a given treatment length strategy, where the weights are estimated using an undersmoothed sieve-based estimator. We show that the estimator is asymptotically linear and provide a data-dependent undersmoothing criterion. We apply our method to determine the optimal timing of percutaneous endoscopic gastrostomy insertion in patients with amyotrophic lateral sclerosis.
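The core weighting idea can be illustrated with a deliberately simplified sketch: an inverse-probability-of-censoring-weighted (IPCW) mean of a toy quality-adjusted lifetime, assuming independent censoring with a *known* exponential distribution. The paper's method is more general: it estimates the weights with an undersmoothed sieve and adjusts for confounding, neither of which appears here.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000
T = rng.exponential(2.0, n)           # event time
C = rng.exponential(4.0, n)           # independent censoring time
X = np.minimum(T, C)                  # observed follow-up
delta = (T <= C).astype(float)        # event indicator

def qal(t):
    """Toy quality-adjusted lifetime: quality weight 1.0 in the first
    year and 0.5 afterwards, so QAL(t) = min(t, 1) + 0.5 * max(t - 1, 0)."""
    return np.minimum(t, 1.0) + 0.5 * np.maximum(t - 1.0, 0.0)

# IPCW estimator of E[QAL(T)]: uncensored subjects are upweighted by
# 1 / G(X), where G(t) = P(C > t). Here G is the known exponential
# survival function; the paper instead estimates the weights with an
# undersmoothed sieve-based estimator.
G = np.exp(-X / 4.0)
est = np.mean(delta * qal(X) / G)

# Reference value by Monte Carlo on uncensored draws.
truth = qal(rng.exponential(2.0, 200_000)).mean()
```

The weighting matters because naively averaging qal(X) over uncensored subjects is biased: censoring removes the long survivors, who contribute the largest quality-adjusted lifetimes.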


Lassoed Tree Boosting

Schuler, Alejandro, Li, Yi, van der Laan, Mark

arXiv.org Machine Learning

Gradient boosting performs exceptionally well in most prediction problems and scales well to large datasets. In this paper we prove that a "lassoed" gradient boosted tree algorithm with early stopping achieves faster-than-$n^{-1/4}$ L2 convergence in the large nonparametric space of cadlag functions of bounded sectional variation. This rate is remarkable because it does not depend on the dimension, sparsity, or smoothness. We use simulations and real data to confirm our theory and to demonstrate empirical performance and scalability on par with standard boosting. Our convergence proofs are based on a novel, general theorem on early stopping with empirical loss minimizers of nested Donsker classes.
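The two-stage structure of such a procedure can be sketched with scikit-learn: boost with built-in early stopping, then treat each fitted tree as one basis function and refit the ensemble weights with the lasso. This is a minimal sketch under stated assumptions, not the paper's exact algorithm; the hyperparameters and the in-sample refit are illustrative.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(500, 3))
y = np.sign(X[:, 0]) + X[:, 1] * X[:, 2] + 0.1 * rng.normal(size=500)

# Stage 1: boosted trees with early stopping on a validation split.
gbm = GradientBoostingRegressor(
    n_estimators=300, learning_rate=0.1, max_depth=2,
    validation_fraction=0.2, n_iter_no_change=10, random_state=0,
).fit(X, y)

# Stage 2: one basis column per fitted tree, with the ensemble
# weights re-estimated by the lasso ("lassoed" boosting).
B = np.column_stack([t[0].predict(X) for t in gbm.estimators_])
lasso = LassoCV(cv=5).fit(B, y)
```

The lasso step is what controls the variation norm of the final fit: trees whose contribution does not survive the L1 penalty are zeroed out rather than kept at their boosted weights.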


Exploration and Incentives in Reinforcement Learning

Simchowitz, Max, Slivkins, Aleksandrs

arXiv.org Artificial Intelligence

How do you incentivize self-interested agents to explore when they prefer to exploit? We consider complex exploration problems, where each agent faces the same (but unknown) MDP. In contrast with traditional formulations of reinforcement learning, agents control the choice of policies, whereas an algorithm can only issue recommendations. However, the algorithm controls the flow of information, and can incentivize the agents to explore via information asymmetry. We design an algorithm which explores all reachable states in the MDP. We achieve provable guarantees similar to those for incentivizing exploration in static, stateless exploration problems studied previously. To the best of our knowledge, this is the first work to consider mechanism design in a stateful, reinforcement learning setting.


Hyperactive Learning (HAL) for Data-Driven Interatomic Potentials

van der Oord, Cas, Sachs, Matthias, Kovács, Dávid Péter, Ortner, Christoph, Csányi, Gábor

arXiv.org Machine Learning

Data-driven interatomic potentials have emerged as a powerful class of surrogate models for ab initio potential energy surfaces that are able to reliably predict macroscopic properties with experimental accuracy. In generating accurate and transferable potentials, the most time-consuming and arguably most important task is generating the training set, which still requires significant expert user input. To accelerate this process, this work presents hyperactive learning (HAL), a framework for formulating an accelerated sampling algorithm specifically for the task of training database generation. The key idea is to start from a physically motivated sampler (e.g., molecular dynamics) and add a biasing term that drives the system towards high uncertainty and thus to unseen training configurations. Building on this framework, general protocols for building training databases for alloys and polymers are presented. For alloys, ACE potentials for AlSi10 are created by fitting to a minimal HAL-generated database containing 88 configurations (32 atoms each), with fast evaluation times of under 100 microseconds per atom per CPU core. These potentials are demonstrated to predict the melting temperature with excellent accuracy. For polymers, a HAL database is built using ACE that is able to determine, with experimental accuracy, the density of a long polyethylene glycol (PEG) polymer formed of 200 monomer units by fitting only to small isolated PEG chains of 2 to 32 monomer units.
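The biasing idea can be illustrated in one dimension. This is a toy sketch under stated assumptions: a polynomial committee stands in for the surrogate model (committee spread as the uncertainty), a Metropolis random walk replaces molecular dynamics, and the bias strength `tau`, inverse temperature `beta`, and the [-3, 3] clipping window are all invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def true_energy(x):
    """Stand-in for the ab initio potential energy surface."""
    return 0.5 * x**2

# Committee of surrogate models fit on a narrow training window;
# the committee spread serves as the uncertainty estimate.
train_x = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])
committee = [
    np.polyfit(train_x,
               true_energy(train_x) + 0.05 * rng.normal(size=train_x.size),
               3)
    for _ in range(8)
]

def predict(x):
    preds = np.array([np.polyval(c, x) for c in committee])
    return preds.mean(), preds.std()

# HAL-style biased sampling: a Metropolis walk on E_mean - tau * sigma,
# so the walker is pulled toward high-uncertainty (unseen) regions.
tau, beta = 5.0, 5.0
x = 0.0
mu0, sd0 = predict(x)
visited = []
for _ in range(2000):
    xp = float(np.clip(x + 0.2 * rng.normal(), -3.0, 3.0))
    mu, sd = predict(xp)
    log_acc = min(0.0, -beta * ((mu - tau * sd) - (mu0 - tau * sd0)))
    if rng.uniform() < np.exp(log_acc):
        x, mu0, sd0 = xp, mu, sd
    visited.append((x, sd0))

# The highest-uncertainty configuration visited becomes the next
# candidate for an ab initio label and addition to the database.
next_config = max(visited, key=lambda v: v[1])[0]
```

An unbiased sampler would stay near the minimum of the well, where the model is already accurate; subtracting tau * sigma from the sampled energy tilts the surface so the walker drifts toward the poorly-covered regions that most improve the training database.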


AI Is Not Taking Away Our Jobs -- Because It Can't Do Them

#artificialintelligence

Note: HAL 9000 "was an incredibly knowledgeable AI system that was given one simple order: to make sure that the ship reached its destination at Jupiter" (VillainsWiki). As Dr. Marks notes, AI has no natural ethical system, so HAL did not hesitate to plot the deaths of crew members in order to guide the ship to its destination: "At one point on the trip from Earth to Jupiter, HAL becomes suspicious that the crew might be sabotaging the mission. HAL then purposely tries to kill all the crew. The most logical explanation for this act is a coding error. HAL was programmed to operate on the basis that the mission took priority over human life." By contrast, science fiction writer Isaac Asimov did not allow his AI to kill.


Machine Mind

#artificialintelligence

"We shall not cease from exploration And the end of all our exploring Will be to arrive where we started And know the place for the first time." In 1956, the Dartmouth Summer Research Project on Artificial Intelligence gave A.I. its standing as a legitimate field of study. "Every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it." A.I. development has fallen in line with this sentiment ever since. Our scientists, analysts, and engineers have taken a multidisciplinary approach that focuses on describing and reproducing as many observable faculties of the mind as possible. This reproduction has proven to be quite successful.


Possibility Before Utility: Learning And Using Hierarchical Affordances

Costales, Robby, Iqbal, Shariq, Sha, Fei

arXiv.org Machine Learning

Reinforcement learning algorithms struggle on tasks with complex hierarchical dependency structures. Humans and other intelligent agents do not waste time assessing the utility of every high-level action in existence; instead, they only consider the ones they deem possible in the first place. By focusing only on what is feasible, or "afforded," at the present moment, an agent can spend more time both evaluating the utility of and acting on what matters. To this end, we present Hierarchical Affordance Learning (HAL), a method that learns a model of hierarchical affordances in order to prune impossible subtasks for more effective learning. Existing work in hierarchical reinforcement learning provides agents with structural representations of subtasks but is not affordance-aware. By grounding our definition of hierarchical affordances in the present state, our approach is more flexible than the multitude of approaches that ground their subtask dependencies in a symbolic history. While these logic-based methods often require complete knowledge of the subtask hierarchy, our approach is able to utilize incomplete and varying symbolic specifications. Furthermore, we demonstrate that, relative to non-affordance-aware methods, HAL agents are better able to efficiently learn complex tasks, navigate environment stochasticity, and acquire diverse skills in the absence of extrinsic supervision -- all of which are hallmarks of human learning.
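The pruning mechanism can be sketched in a toy crafting domain. This is an invented illustration, not the paper's algorithm: the domain, the subtask names, and the count-based affordance estimates (a crude stand-in for HAL's learned, state-conditioned affordance model) are all assumptions made for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
SUBTASKS = ["chop_wood", "mine_stone", "build_tool", "build_house"]

def true_afforded(state):
    """Ground-truth hierarchical affordances, hidden from the agent:
    a tool needs wood and stone; a house needs a tool."""
    wood, stone, tools = state
    return [True, True, wood > 0 and stone > 0, tools > 0]

# Learned affordance model: per-subtask success/failure counts.
counts = np.ones((4, 2))
q = np.array([0.2, 0.15, 0.5, 1.0])   # utility estimates favor the end goal

def select(eps=0.2, threshold=0.35):
    """Affordance-aware selection: prune subtasks whose estimated
    affordance probability is low, then act eps-greedily on the rest."""
    p = counts[:, 0] / counts.sum(axis=1)
    feasible = np.flatnonzero(p >= threshold)
    if rng.uniform() < eps:
        return int(rng.choice(feasible))
    return int(feasible[np.argmax(q[feasible])])

state = [0, 0, 0]                     # (wood, stone, tools)
for _ in range(50):
    i = select()
    if true_afforded(state)[i]:
        counts[i, 0] += 1
        if i == 0:
            state[0] += 1
        elif i == 1:
            state[1] += 1
        elif i == 2:
            state[0] -= 1; state[1] -= 1; state[2] += 1
    else:
        counts[i, 1] += 1
```

Without the mask, a utility-driven agent keeps retrying build_house because it has the highest value estimate; with the mask, one failed attempt drives its affordance estimate below threshold and the agent's remaining steps go to subtasks that are actually possible.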