Goto

Collaborating Authors

 high-quality plan


A learning-driven automatic planning framework for proton PBS treatments of H&N cancers

arXiv.org Artificial Intelligence

Proton pencil beam scanning (PBS) treatment planning for head & neck (H&N) cancers involves numerous conflicting objectives, requiring iterative objective parameter adjustments to balance multiple clinical goals. We propose a learning-driven inverse optimizer and integrate it into a proximal policy optimization (PPO)-based planning framework to automatically generate high-quality plans for patients with diverse treatment requirements. The inverse optimizer is a learning-to-optimize (L2O) method that predicts update steps by learning from task-specific data distributions. For the first time, long-context processing techniques developed for large language models (LLMs) are utilized to address the scalability limitations of existing L2O methods, enabling simultaneous optimization over a substantially large set of variables. The PPO framework functions as an outer-loop virtual planner, autonomously adjusting objective parameters through a policy network, and the inner-loop L2O inverse optimizer computes machine-deliverable spot monitor unit (MU) values based on the PPO-refined objectives. Moreover, a Swin UnetR dose predictor is trained with prescription- and beam-specific information to estimate the initial objective parameters. In our experiments, total 97 patients with bilateral or ipsilateral H&N cancers are collected for training and testing. Compared with the second-order gradient-based methods, our L2O optimizer improves the effectiveness and efficiency of the time-consuming inverse optimization by 22.97% and 36.41%, respectively, and in conjunction with the PPO-based virtual planner, plans are generated within clinically acceptable times, i.e. 2.55 hours in average, and shows improved or comparable organs-at-risk sparing with superior target coverage compared with human-generated plans.


Probabilistic contingent planning based on HTN for high-quality plans

arXiv.org Artificial Intelligence

Deterministic planning assumes that the planning evolves along a fully predictable path, and therefore it loses the practical value in most real projections. A more realistic view is that planning ought to take into consideration partial observability beforehand and aim for a more flexible and robust solution. What is more significant, it is inevitable that the quality of plan varies dramatically in the partially observable environment. In this paper we propose a probabilistic contingent Hierarchical Task Network (HTN) planner, named High-Quality Contingent Planner (HQCP), to generate high-quality plans in the partially observable environment. The formalisms in HTN planning are extended into partial observability and are evaluated regarding the cost. Next, we explore a novel heuristic for high-quality plans and develop the integrated planning algorithm. Finally, an empirical study verifies the effectiveness and efficiency of the planner both in probabilistic contingent planning and for obtaining high-quality plans.


Hypothesis Exploration for Malware Detection Using Planning

AAAI Conferences

In this paper we apply AI planning to address the hypothesis exploration problem and provide assistance to network administrators in detecting malware based on unreliable observations derived from network traffic.Building on the already established characterization and use of AI planning for similar problems, we propose a formulation of the hypothesis generation problem for malware detection as an AI planning problem with temporally extended goals and actions costs. Furthermore, we propose a notion of hypothesis ``plausibility'' under unreliable observations, which we model as plan quality. We then show that in the presence of unreliable observations, simply finding one most ``plausible'' hypothesis, although challenging, is not sufficient for effective malware detection. To that end, we propose a method for applying a state-of-the-art planner within a principled exploration process, to generate multiple distinct high-quality plans. We experimentally evaluate this approach by generating random problems of varying hardness both with respect to the number of observations, as well as the degree of unreliability. Based on these experiments, we argue that our approach presents a significant improvement over prior work that are focused on finding a single optimal plan, and that our hypothesis exploration application can motivate the development of new planners capable of generating the top high-quality plans.