Burns, Ethan
Exploring Large Language Models for Specialist-level Oncology Care
Palepu, Anil, Dhillon, Vikram, Niravath, Polly, Weng, Wei-Hung, Prasad, Preethi, Saab, Khaled, Tanno, Ryutaro, Cheng, Yong, Mai, Hanh, Burns, Ethan, Ajmal, Zainub, Kulkarni, Kavita, Mansfield, Philip, Webster, Dale, Barral, Joelle, Gottweis, Juraj, Schaekermann, Mike, Mahdavi, S. Sara, Natarajan, Vivek, Karthikesalingam, Alan, Tu, Tao
Large language models (LLMs) have shown remarkable progress in encoding clinical knowledge and responding to complex medical queries with appropriate clinical reasoning. However, their applicability in subspecialist or complex medical settings remains underexplored. In this work, we probe the performance of AMIE, a research conversational diagnostic AI system, in the subspecialist domain of breast oncology care without specific fine-tuning to this challenging domain. To perform this evaluation, we curated a set of 50 synthetic breast cancer vignettes representing a range of treatment-naive and treatment-refractory cases and mirroring the key information available to a multidisciplinary tumor board for decision-making (openly released with this work). We developed a detailed clinical rubric for evaluating management plans, including axes such as the quality of case summarization, safety of the proposed care plan, and recommendations for chemotherapy, radiotherapy, surgery and hormonal therapy. To improve performance, we enhanced AMIE with the inference-time ability to perform web search retrieval to gather relevant and up-to-date clinical knowledge and refine its responses with a multi-stage self-critique pipeline. We compare response quality of AMIE with internal medicine trainees, oncology fellows, and general oncology attendings under both automated and specialist clinician evaluations. In our evaluations, AMIE outperformed trainees and fellows, demonstrating the potential of the system in this challenging and important domain. We further demonstrate, through qualitative examples, how systems such as AMIE might facilitate conversational interactions to assist clinicians in their decision making. However, AMIE's performance was overall inferior to attending oncologists, suggesting that further research is needed prior to consideration of prospective uses.
Solving Large Problems with Heuristic Search: General-Purpose Parallel External-Memory Search
Hatem, Matthew, Burns, Ethan, Ruml, Wheeler
Classic best-first heuristic search algorithms, like A*, record every unique state they encounter in RAM, making them infeasible for solving large problems. In this paper, we demonstrate how best-first search can be scaled to solve much larger problems by exploiting disk storage and parallel processing and, in some cases, slightly relaxing the strict best-first node expansion order. Some previous disk-based search algorithms abandon best-first search order in an attempt to increase efficiency. We present two case studies showing that A*, when augmented with Delayed Duplicate Detection, can actually be more efficient than these non-best-first search orders. First, we present a straightforward external variant of A*, called PEDAL, that slightly relaxes best-first order in order to be I/O efficient in both theory and practice, even on problems featuring real-valued node costs. Because it is easy to parallelize, PEDAL can be faster than in-memory IDA* even on domains with few duplicate states, such as the sliding-tile puzzle. Second, we present a variant of PEDAL, called PE2A*, that uses partial expansion to handle problems that have large branching factors. When tested on the problem of Multiple Sequence Alignment, PE2A* is the first algorithm capable of solving the entire Reference Set 1 of the standard BAliBASE benchmark using a biologically accurate cost function. This work shows that classic best-first algorithms like A* can be applied to large real-world problems. We also provide a detailed implementation guide with source code both for generic parallel disk-based best-first search and for Multiple Sequence Alignment with a biologically accurate cost function. Given its effectiveness as a general-purpose problem-solving method, we hope that this makes parallel and disk-based search accessible to a wider audience.
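The idea of delayed duplicate detection mentioned in this abstract can be illustrated with a small sketch. This is not PEDAL or the authors' released code: the function name `remove_duplicates_delayed` is hypothetical, an in-memory list stands in for the on-disk file of generated nodes, and a Python set stands in for the closed list, which a disk-based search would also keep on disk and scan sequentially.

```python
from itertools import groupby
from operator import itemgetter

def remove_duplicates_delayed(generated, closed):
    """Resolve duplicates for a whole batch at once instead of per node.

    generated: (state, g) pairs that were appended with no duplicate check.
    closed: states that have already been expanded (assumed to fit in a set here).
    Returns one (state, g) pair per surviving state, keeping the lowest g.
    """
    # On disk this step would be an external sort; sorted() stands in for it.
    ordered = sorted(generated, key=itemgetter(0, 1))
    survivors = []
    for state, group in groupby(ordered, key=itemgetter(0)):
        if state in closed:
            continue  # already expanded earlier, so drop every copy
        survivors.append(next(group))  # first copy in the group has the lowest g
    return survivors

batch = [("b", 3), ("a", 2), ("b", 1), ("c", 5), ("a", 4)]
print(remove_duplicates_delayed(batch, closed={"c"}))
```

Deferring the check lets the search write generated nodes sequentially and pay for duplicate removal once per batch, which is the access pattern that suits disk storage.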
Integrating Vehicle Routing and Motion Planning
Kiesel, Scott (University of New Hampshire) | Burns, Ethan (University of New Hampshire) | Wilt, Christopher (University of New Hampshire) | Ruml, Wheeler (University of New Hampshire)
There has been much interest recently in problems that combine high-level task planning with low-level motion planning. In this paper, we present a problem of this kind that arises in multi-vehicle mission planning. It tightly integrates task allocation and scheduling, who will do what when, with path planning, how each task will actually be performed. It extends classical vehicle routing in that the cost of executing a set of high-level tasks can vary significantly in time and cost according to the low-level paths selected. It extends classical motion planning in that each path must minimize cost while also respecting temporal constraints, including those imposed by the agent's other tasks and the tasks assigned to other agents. Furthermore, the problem is a subtask within an interactive system and therefore must operate within severe time constraints. We present an approach to the problem based on a combination of tabu search, linear programming, and heuristic search. We evaluate our planner on representative problem instances and find that its performance meets the demanding requirements of our application. These results demonstrate how integrating multiple diverse techniques can successfully solve challenging real-world planning problems that are beyond the reach of any single method.
Anticipatory On-Line Planning
Burns, Ethan (University of New Hampshire) | Benton, J. (Graduate Student, Arizona State University) | Ruml, Wheeler (University of New Hampshire) | Yoon, Sungwook (Palo Alto Research Center) | Do, Minh B. (NASA Ames Research Center)
Consider the problem faced by an unmanned aerial vehicle (UAV) dispatcher who must plan for a set of UAVs to service a set of observation requests. To service a request, one of the UAVs must fly over a given strip of land with its observation equipment turned on. The dispatcher wants to minimize the time between when a request arrives and when a UAV has completed the flyover. Even when the actions of the UAV, such as flying particular routes or switching on/off observational equipment, can be regarded as deterministic, the stochastic arrival of new requests can make for a challenging [...] It assumes that the probability distribution over incoming goals is either known or learnable and employs the technique of optimization in hindsight, previously developed for online scheduling and recently investigated for planning with stochastic actions (Mercier and van Hentenryck 2007; Yoon et al. 2008; 2010). This technique first samples from the distribution of possible future goal arrivals and then considers which next action optimizes the expected cost when averaged over the sampled futures. By using this anticipatory technique, our planner is able to take future goals into account.
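The sample-then-average step described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' planner: `hindsight_choice`, `sample_futures`, and `response_distance` are hypothetical names, and the toy domain (a vehicle on a line choosing where to wait for a single future request) is invented for the example.

```python
import random
from statistics import mean

def hindsight_choice(actions, futures, cost_in_hindsight):
    """Return the action with the lowest cost averaged over the sampled futures."""
    def expected_cost(action):
        # Each future is fully known once sampled, so its cost is deterministic.
        return mean(cost_in_hindsight(action, future) for future in futures)
    return min(actions, key=expected_cost)

def sample_futures(rng, n):
    # Assumed arrival distribution: requests appear mostly to the right of 0.
    return [rng.choice([-2, 1, 2, 3]) for _ in range(n)]

def response_distance(position, request):
    # Assumed cost: distance the vehicle must still travel to reach the request.
    return abs(request - position)

futures = sample_futures(random.Random(0), 200)
print(hindsight_choice([-1, 0, 1], futures, response_distance))
```

All candidate actions are scored against the same set of sampled futures, so differences in the averages reflect the actions rather than sampling noise.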
Heuristic Search for Large Problems With Real Costs
Hatem, Matthew (University of New Hampshire) | Burns, Ethan (University of New Hampshire) | Ruml, Wheeler (University of New Hampshire)
The memory requirements of basic best-first heuristic search algorithms like A* make them infeasible for solving large problems. External disk storage is cheap and plentiful compared to the cost of internal RAM. Unfortunately, state-of-the-art external memory search algorithms either rely on brute-force search techniques, such as breadth-first search, or they rely on all node values falling in a narrow range of integers, and thus perform poorly on real-world domains with real-valued costs. We present a new general-purpose algorithm, PEDAL, that uses external memory and parallelism to perform a best-first heuristic search capable of solving large problems with real costs. We show theoretically that PEDAL is I/O efficient and empirically that it is both better on a standard unit-cost benchmark, surpassing internal IDA* on the 15-puzzle, and gives far superior performance on problems with real costs.
Searching Without a Heuristic: Efficient Use of Abstraction
Larsen, Bradford John (University of New Hampshire) | Burns, Ethan (University of New Hampshire) | Ruml, Wheeler (University of New Hampshire) | Holte, Robert (University of Alberta)
In problem domains where an informative heuristic evaluation function is not known or not easily computed, abstraction can be used to derive admissible heuristic values. Optimal path lengths in the abstracted problem are consistent heuristic estimates for the original problem. Pattern databases are the traditional method of creating such heuristics, but they exhaustively compute costs for all abstract states and are thus usually appropriate only when all instances share the same single goal state. Hierarchical heuristic search algorithms address these shortcomings by searching for paths in the abstract space on an as-needed basis. However, existing hierarchical algorithms search less efficiently than pattern database constructors: abstract nodes may be expanded many times during the course of a base-level search. We present a novel hierarchical heuristic search algorithm, called Switchback, that uses an alternating direction of search to avoid abstract node re-expansions. This algorithm is simple to implement and demonstrates superior performance to existing hierarchical heuristic search algorithms on several standard benchmarks.
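The way an abstraction yields heuristic values can be sketched with a toy example. This shows the exhaustive, pattern-database-style computation that the abstract contrasts with hierarchical search, not Switchback itself. The grid domain, the column-only abstraction `phi`, and all function names are assumptions made for the illustration.

```python
from collections import deque

def abstract_graph(edges, phi):
    """Map every concrete edge (u, v) to the abstract edge (phi(u), phi(v))."""
    graph = {}
    for u, v in edges:
        a, b = phi(u), phi(v)
        graph.setdefault(a, set())
        graph.setdefault(b, set())
        if a != b:
            # Edges are assumed undirected with unit cost.
            graph[a].add(b)
            graph[b].add(a)
    return graph

def abstract_distances(graph, abstract_goal):
    """Breadth-first search outward from the abstract goal."""
    dist = {abstract_goal: 0}
    queue = deque([abstract_goal])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

# Toy domain (assumed): a 3x3 grid with 4-connected unit-cost moves.
edges = [((x, y), (x + 1, y)) for x in range(2) for y in range(3)]
edges += [((x, y), (x, y + 1)) for x in range(3) for y in range(2)]

def phi(cell):
    return cell[0]  # the abstraction keeps only the column

goal = (2, 2)
dist = abstract_distances(abstract_graph(edges, phi), phi(goal))

def h(cell):
    """Heuristic for a concrete cell: optimal distance in the abstract space."""
    return dist[phi(cell)]

print(h((0, 0)), h((1, 2)), h((2, 0)))
```

Because every concrete move maps to an abstract move or to no move at all, the abstract distance never exceeds the true distance. For example, `h((0, 0))` is 2 while the true shortest path to `(2, 2)` has length 4.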
Parallel Best-First Search: The Role of Abstraction
Burns, Ethan (University of New Hampshire) | Lemons, Sofia (University of New Hampshire) | Ruml, Wheeler (University of New Hampshire) | Zhou, Rong (Palo Alto Research Center)
To harness modern multicore processors, it is imperative to develop parallel versions of fundamental algorithms. In this paper, we present a general approach to best-first heuristic search in a shared-memory setting. Each thread attempts to expand the most promising nodes. By using abstraction to partition the state space, we detect duplicate states while avoiding lock contention. We allow speculative expansions when necessary to keep threads busy. We identify and fix potential livelock conditions. In an empirical comparison on STRIPS planning, grid pathfinding, and sliding tile puzzle problems using an 8-core machine, we show that A* implemented in our framework yields faster search performance than previous parallel search proposals. We also demonstrate that our approach extends easily to other best-first searches, such as weighted A* and anytime heuristic search.