Tran, Daniel
Vision-Language Modeling in PET/CT for Visual Grounding of Positive Findings
Huemann, Zachary, Church, Samuel, Warner, Joshua D., Tran, Daniel, Tie, Xin, McMillan, Alan B, Hu, Junjie, Cho, Steve Y., Lubner, Meghan, Bradshaw, Tyler J.
Vision-language models can connect the text description of an object to its specific location in an image through visual grounding. This has potential applications in enhanced radiology reporting. However, these models require large annotated image-text datasets, which are lacking for PET/CT. We developed an automated pipeline to generate weak labels linking PET/CT report descriptions to their image locations and used it to train a 3D vision-language visual grounding model. Our pipeline finds positive findings in PET/CT reports by identifying mentions of SUVmax and axial slice numbers. From 25,578 PET/CT exams, we extracted 11,356 sentence-label pairs. Using this data, we trained ConTEXTual Net 3D, which integrates text embeddings from a large language model with a 3D nnU-Net via token-level cross-attention. The model's performance was compared against LLMSeg, a 2.5D version of ConTEXTual Net, and two nuclear medicine physicians. The weak-labeling pipeline accurately identified lesion locations in 98% of cases (246/251), with 7.5% requiring boundary adjustments. ConTEXTual Net 3D achieved an F1 score of 0.80, outperforming LLMSeg (F1=0.22) and the 2.5D model (F1=0.53), though it underperformed both physicians (F1=0.94 and 0.91). The model achieved better performance on FDG (F1=0.78) and DCFPyL (F1=0.75) exams, while performance dropped on DOTATE (F1=0.58) and Fluciclovine (F1=0.66). The model performed consistently across lesion sizes but showed reduced accuracy on lesions with low uptake. Our novel weak labeling pipeline accurately produced an annotated dataset of PET/CT image-text pairs, facilitating the development of 3D visual grounding models. ConTEXTual Net 3D significantly outperformed other models but fell short of the performance of nuclear medicine physicians. Our study suggests that even larger datasets may be needed to close this performance gap.
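As a rough illustration of the weak-labeling idea described in this abstract, the sketch below scans report sentences for a SUVmax value and an axial slice number and keeps only sentences that mention both. The regular expressions, the function name `extract_weak_labels`, and the sample report text are assumptions made for illustration; the abstract does not specify the rules the authors' pipeline uses.

```python
import re

# Hypothetical patterns (assumptions, not the authors' actual rules).
SUV_RE = re.compile(r"SUV\s*max\s*(?:of|=|:)?\s*(\d+(?:\.\d+)?)", re.IGNORECASE)
SLICE_RE = re.compile(r"(?:slice|image)\s*#?\s*(\d+)", re.IGNORECASE)


def extract_weak_labels(report):
    """Return (sentence, suvmax, slice_number) for sentences mentioning both."""
    labels = []
    # Naive sentence split: break after ., ! or ? when followed by whitespace.
    # Decimals such as "8.4" are not split because no whitespace follows the dot.
    for sentence in re.split(r"(?<=[.!?])\s+", report.strip()):
        suv = SUV_RE.search(sentence)
        slc = SLICE_RE.search(sentence)
        if suv and slc:
            labels.append((sentence, float(suv.group(1)), int(slc.group(1))))
    return labels


# Made-up report text, for demonstration only.
report = (
    "Hypermetabolic right hilar node with SUVmax 8.4 (slice 112). "
    "No abnormal uptake in the liver."
)
labels = extract_weak_labels(report)
```

Here only the first sentence yields a sentence-label pair, since the second mentions neither a SUVmax nor a slice number. How such a pair is then mapped to a 3D image location is not covered by this sketch.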
PARTNR: A Benchmark for Planning and Reasoning in Embodied Multi-agent Tasks
Chang, Matthew, Chhablani, Gunjan, Clegg, Alexander, Cote, Mikael Dallaire, Desai, Ruta, Hlavac, Michal, Karashchuk, Vladimir, Krantz, Jacob, Mottaghi, Roozbeh, Parashar, Priyam, Patki, Siddharth, Prasad, Ishita, Puig, Xavier, Rai, Akshara, Ramrakhya, Ram, Tran, Daniel, Truong, Joanne, Turner, John M., Undersander, Eric, Yang, Tsung-Yen
We present a benchmark for Planning And Reasoning Tasks in humaN-Robot collaboration (PARTNR) designed to study human-robot coordination in household activities. PARTNR tasks exhibit characteristics of everyday tasks, such as spatial, temporal, and heterogeneous agent capability constraints. We employ a semi-automated task generation pipeline using Large Language Models (LLMs), incorporating simulation in the loop for grounding and verification. PARTNR stands as the largest benchmark of its kind, comprising 100,000 natural language tasks, spanning 60 houses and 5,819 unique objects. We analyze state-of-the-art LLMs on PARTNR tasks, across the axes of planning, perception and skill execution. The analysis reveals significant limitations in SoTA models, such as poor coordination and failures in task tracking and recovery from errors. When LLMs are paired with real humans, they require 1.5x as many steps as two humans collaborating and 1.1x more steps than a single human, underscoring the potential for improvement in these models. We further show that fine-tuning smaller LLMs with planning data can achieve performance on par with models 9 times larger, while being 8.6x faster at inference. Overall, PARTNR highlights significant challenges facing collaborative embodied agents and aims to drive research in this direction.
Leveraging Multiple Artificial Intelligence Techniques to Improve the Responsiveness in Operations Planning: ASPEN for Orbital Express
Knight, Russell (Jet Propulsion Laboratory, California Institute of Technology) | Chouinard, Caroline (Red Canyon Software) | Jones, Grailing (Jet Propulsion Laboratory, California Institute of Technology) | Tran, Daniel (Jet Propulsion Laboratory, California Institute of Technology)
The challenging timeline for DARPA’s Orbital Express mission demanded a flexible, responsive, and (above all) safe approach to mission planning. Mission planning for space is challenging because of the mixture of goals and constraints. Every space mission tries to squeeze all of the capacity possible out of the spacecraft. For Orbital Express, this means performing as many experiments as possible, while still keeping the spacecraft safe. Keeping the spacecraft safe can be very challenging because we need to maintain the correct thermal environment (or batteries might freeze), we need to avoid pointing cameras and sensitive sensors at the sun, we need to keep the spacecraft batteries charged, and we need to keep the two spacecraft from colliding, a task made more difficult because only one of the spacecraft had thrusters. Because the mission was a technology demonstration, pertinent planning information was learned during actual mission execution. For example, we didn’t know for certain how long it would take to transfer propellant from one spacecraft to the other, although this was a primary mission goal. The only way to find out was to perform the task and monitor how long it actually took. This information led to amendments to procedures, which led to changes in the mission plan. In general, we used the ASPEN planner/scheduler to generate and validate the mission plans. ASPEN is a planning system that allows us to enter all of the spacecraft constraints, the resources, the communications windows, and our objectives. ASPEN then could automatically plan our day. We enhanced ASPEN to enable it to reason about uncertainty. We also developed a model generator that would read the text of a procedure and translate it into an ASPEN model. Note that a model is the input to ASPEN that describes constraints, resources, and activities. These technologies had a significant impact on the success of the Orbital Express mission. Finally, we formulated a technique for converting procedural information to declarative information by transforming procedures into models of hierarchical task networks (HTNs). The impact of this effort on the mission was a significant reduction in (1) the execution time of the mission, (2) the daily staff required to produce plans, and (3) planning errors. Not a single misconfigured command was sent during operations.
Automated Scheduling for NASA's Deep Space Network
Johnston, Mark D. (Jet Propulsion Laboratory, California Institute of Technology) | Tran, Daniel (Jet Propulsion Laboratory, California Institute of Technology) | Arroyo, Belinda (Jet Propulsion Laboratory, California Institute of Technology) | Sorensen, Sugi (Jet Propulsion Laboratory, California Institute of Technology) | Tay, Peter (Jet Propulsion Laboratory, California Institute of Technology) | Carruth, Butch (Innovative Productivity Solutions, Inc.) | Coffman, Adam (Innovative Productivity Solutions, Inc.) | Wallace, Mike (Innovative Productivity Solutions, Inc.)
This article describes the DSN Scheduling Engine (DSE) component of a new scheduling system being deployed for NASA's Deep Space Network (DSN). The DSE provides core automation functionality for scheduling the network, including the interpretation of scheduling requirements expressed by users, their elaboration into tracking passes, and the resolution of conflicts and constraint violations. The DSE incorporates both systematic search and repair-based algorithms, used for different phases and purposes in the overall system. It has been integrated with a web application that provides DSE functionality to all DSN users through a standard web browser, as part of a peer-to-peer schedule negotiation process for the entire network. The system has been deployed operationally and is in routine use, and it is being extended to support long-range planning and forecasting as well as near-real-time scheduling.
An Efficient Search Strategy for Aggregation and Discretization of Attributes of Bayesian Networks Using Minimum Description Length
Corcoran, Jem, Tran, Daniel, Levine, Nicholas
Bayesian networks are convenient graphical expressions for high-dimensional probability distributions representing complex relationships between a large number of random variables. They have been employed extensively in areas such as bioinformatics, artificial intelligence, diagnosis, and risk management. The recovery of the structure of a network from data is of prime importance for the purposes of modeling, analysis, and prediction. Most recovery algorithms in the literature assume either discrete or continuous but Gaussian data. For general continuous data, discretization is usually employed but often destroys the very structure one is out to recover. Friedman and Goldszmidt suggest an approach based on the minimum description length principle that chooses a discretization which preserves the information in the original data set. However, it is difficult, if not impossible, to implement for even moderately sized networks. In this paper we provide an extremely efficient search strategy which allows one to use the Friedman and Goldszmidt discretization in practice.