deadend
- Europe > Austria > Vienna (0.14)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- North America > Canada (0.04)
- (9 more...)
- Europe > Austria > Vienna (0.14)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- North America > Canada (0.04)
- (9 more...)
SOAP: Enhancing Efficiency of Generated Code via Self-Optimization
Huang, Dong, Dai, Jianbo, Weng, Han, Wu, Puzhen, Qing, Yuhao, Zhang, Jie M., Cui, Heming, Guo, Zhijiang
Large language models (LLMs) have shown remarkable progress in code generation, but their generated code often suffers from inefficiency, resulting in longer execution times and higher memory consumption. To address this issue, we propose Self Optimization based on OverheAd Profile (SOAP), a self-optimization framework that utilizes execution overhead profiles to improve the efficiency of LLM-generated code. SOAP first generates code using an LLM, then executes it locally to capture execution time and memory usage profiles. These profiles are fed back to the LLM, which then revises the code to reduce overhead. To evaluate the effectiveness of SOAP, we conduct extensive experiments on the EffiBench, HumanEval, and MBPP with 16 open-source and 6 closed-source models. Our evaluation results demonstrate that through iterative self-optimization, SOAP significantly enhances the efficiency of LLM-generated code. For example, the execution time (ET) of StarCoder2-15B for the EffiBench decreases from 0.93 (s) to 0.12 (s) which reduces 87.1% execution time requirement compared with the initial code. The total memory usage (TMU) of StarCoder2-15B also decreases from 22.02 (Mb*s) to 2.03 (Mb*s), which decreases 90.8% total memory consumption during the execution process. The source code of SOAP was released in https://github.com/huangd1999/SOAP.
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- North America > Canada > Ontario > Toronto (0.04)
- Asia > China > Hong Kong (0.04)
- (8 more...)
PRP Rebooted: Advancing the State of the Art in FOND Planning
Muise, Christian, McIlraith, Sheila A., Beck, J. Christopher
Fully Observable Non-Deterministic (FOND) planning is a variant of classical symbolic planning in which actions are nondeterministic, with an action's outcome known only upon execution. It is a popular planning paradigm with applications ranging from robot planning to dialogue-agent design and reactive synthesis. Over the last 20 years, a number of approaches to FOND planning have emerged. In this work, we establish a new state of the art, following in the footsteps of some of the most powerful FOND planners to date. Our planner, PR2, decisively outperforms the four leading FOND planners, at times by a large margin, in 17 of 18 domains that represent a comprehensive benchmark suite. Ablation studies demonstrate the impact of various techniques we introduce, with the largest improvement coming from our novel FOND-aware heuristic.
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > Georgia > Fulton County > Atlanta (0.04)
- (3 more...)
Safe-Planner: A Single-Outcome Replanner for Computing Strong Cyclic Policies in Fully Observable Non-Deterministic Domains
Mokhtari, Vahid, Sathya, Ajay Suresha, Tsiogkas, Nikolaos, Decre, Wilm
Replanners are efficient methods for solving non-deterministic planning problems. Despite showing good scalability, existing replanners often fail to solve problems involving a large number of misleading plans, i.e., weak plans that do not lead to strong solutions, however, due to their minimal lengths, are likely to be found at every replanning iteration. The poor performance of replanners in such problems is due to their all-outcome determinization. That is, when compiling from non-deterministic to classical, they include all compiled classical operators in a single deterministic domain which leads replanners to continually generate misleading plans. We introduce an offline replanner, called Safe-Planner (SP), that relies on a single-outcome determinization to compile a non-deterministic domain to a set of classical domains, and ordering heuristics for ranking the obtained classical domains. The proposed single-outcome determinization and the heuristics allow for alternating between different classical domains. We show experimentally that this approach can allow SP to avoid generating misleading plans but to generate weak plans that directly lead to strong solutions. The experiments show that SP outperforms state-of-the-art non-deterministic solvers by solving a broader range of problems. We also validate the practical utility of SP in real-world non-deterministic robotic tasks.
- Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.04)
- Africa > Madagascar (0.04)
Driving in Dense Traffic with Model-Free Reinforcement Learning
Saxena, Dhruv Mauria, Bae, Sangjae, Nakhaei, Alireza, Fujimura, Kikuo, Likhachev, Maxim
Traditional planning and control methods could fail to find a feasible trajectory for an autonomous vehicle to execute amongst dense traffic on roads. This is because the obstacle-free volume in spacetime is very small in these scenarios for the vehicle to drive through. However, that does not mean the task is infeasible since human drivers are known to be able to drive amongst dense traffic by leveraging the cooperativeness of other drivers to open a gap. The traditional methods fail to take into account the fact that the actions taken by an agent affect the behaviour of other vehicles on the road. In this work, we rely on the ability of deep reinforcement learning to implicitly model such interactions and learn a continuous control policy over the action space of an autonomous vehicle. The application we consider requires our agent to negotiate and open a gap in the road in order to successfully merge or change lanes. Our policy learns to repeatedly probe into the target road lane while trying to find a safe spot to move in to. We compare against two model-predictive control-based algorithms and show that our policy outperforms them in simulation.
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)
- Europe > Spain > Galicia > Madrid (0.04)
- (12 more...)
- Transportation > Ground > Road (1.00)
- Automobiles & Trucks (0.95)
Improved Non-Deterministic Planning by Exploiting State Relevance
Muise, Christian James (University of Toronto) | McIlraith, Sheila A. (University of Toronto) | Beck, Christopher (University of Toronto)
We address the problem of computing a policy for fully observable non-deterministic (FOND) planning problems. By focusing on the relevant aspects of the state of the world, we introduce a series of improvements to the previous state of the art and extend the applicability of our planner, PRP, to work in an online setting. The use of state relevance allows our policy to be exponentially more succinct in representing a solution to a FOND problem for some domains. Through the introduction of new techniques for avoiding deadends and determining sufficient validity conditions, PRP has the potential to compute a policy up to several orders of magnitude faster than previous approaches. We also find dramatic improvements over the state of the art in online replanning when we treat suitable probabilistic domains as FOND domains.