 Planning & Scheduling


A Provably Efficient Sample Collection Strategy for Reinforcement Learning

arXiv.org Machine Learning

A common assumption in reinforcement learning (RL) is access to a generative model (i.e., a simulator of the environment), which allows samples to be generated from any desired state-action pair. Nonetheless, in many settings a generative model may not be available and an adaptive exploration strategy is needed to efficiently collect samples from an unknown environment by direct interaction. In this paper, we study the scenario where an algorithm based on the generative model assumption defines the (possibly time-varying) number of samples $b(s,a)$ required at each state-action pair $(s,a)$ and an exploration strategy has to learn how to generate $b(s,a)$ samples as fast as possible. Building on recent results for regret minimization in the stochastic shortest path (SSP) setting (Cohen et al., 2020; Tarbouriech et al., 2020), we derive an algorithm that requires $\tilde{O}( B D + D^{3/2} S^2 A)$ time steps to collect the $B = \sum_{s,a} b(s,a)$ desired samples, in any unknown and communicating MDP with $S$ states, $A$ actions and diameter $D$. Leveraging the generality of our strategy, we readily apply it to a variety of existing settings (e.g., model estimation, pure exploration in MDPs) for which we obtain improved sample-complexity guarantees, and to a set of new problems such as best-state identification and sparse reward discovery.
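
To make the problem setting concrete, here is a minimal Python sketch of collecting $b(s,a)$ samples by direct interaction. It is not the algorithm from the paper: it assumes the MDP is deterministic and already known, so navigation reduces to a breadth-first search, whereas the paper addresses the much harder unknown, stochastic case. The names `collect_samples`, `first_action_towards`, and the three-state ring are hypothetical and chosen only for illustration.

```python
from collections import deque


def first_action_towards(transitions, state, goals):
    """Return the first action on a shortest path from `state` to any goal state."""
    queue = deque((nxt, a) for a, nxt in transitions[state].items())
    seen = {state}
    while queue:
        node, first = queue.popleft()
        if node in goals:
            return first
        if node in seen:
            continue
        seen.add(node)
        # Remember which first action led here while expanding the frontier.
        queue.extend((nxt, first) for nxt in transitions[node].values())
    raise ValueError("goal states unreachable; the MDP is not communicating")


def collect_samples(transitions, requirements, start):
    """Greedy collection in a KNOWN deterministic MDP (illustrative assumption).

    transitions[s][a] is the next state; requirements[(s, a)] plays the role
    of b(s, a). Returns (time_steps, counts of required samples collected).
    """
    remaining = {sa: n for sa, n in requirements.items() if n > 0}
    counts = {sa: 0 for sa in requirements}
    state, steps = start, 0
    while remaining:
        needed = [a for a in transitions[state] if (state, a) in remaining]
        if needed:
            action = needed[0]  # an under-sampled pair is available right here
        else:
            goals = {s for s, _ in remaining}
            action = first_action_towards(transitions, state, goals)
        pair = (state, action)
        if pair in remaining:
            counts[pair] += 1
            remaining[pair] -= 1
            if remaining[pair] == 0:
                del remaining[pair]
        state = transitions[state][action]
        steps += 1
    return steps, counts


# Hypothetical 3-state ring: "fwd" moves to the next state, "stay" self-loops.
ring = {s: {"fwd": (s + 1) % 3, "stay": s} for s in range(3)}
b = {(0, "stay"): 2, (2, "stay"): 1}
steps, counts = collect_samples(ring, b, start=0)
print(steps, counts)
```

In this toy run the agent spends three steps on required samples and two on travel. That travel overhead is the quantity the $BD$ term bounds in the worst case, since reaching any state takes at most $D$ steps in expectation.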


Deployment and Evaluation of a Flexible Human-Robot Collaboration Model Based on AND/OR Graphs in a Manufacturing Environment

arXiv.org Artificial Intelligence

The Industry 4.0 paradigm promises shorter development times, improved ergonomics, higher flexibility, and resource efficiency in manufacturing environments. Collaborative robots are an important tangible technology for implementing such a paradigm. A major bottleneck in effectively deploying collaborative robots in manufacturing is developing task planning algorithms that enable them to recognize and naturally adapt to varying and even unpredictable human actions while ensuring overall efficiency in terms of production cycle time. In this context, an architecture encompassing task representation, task planning, sensing, and robot control has been designed, developed and evaluated in a real industrial environment. A pick-and-place palletization task, which requires collaboration between humans and robots, is investigated. The architecture uses AND/OR graphs for representing and reasoning upon human-robot collaboration models online. Furthermore, objective measures of the overall computational performance and subjective measures of naturalness in human-robot collaboration have been evaluated by performing experiments with production-line operators. The results of this user study demonstrate how human-robot collaboration models like the one we propose can leverage the flexibility and the comfort of operators in the workplace. In this regard, an extensive comparison study among recent models has been carried out.
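
As a rough illustration of how an AND/OR graph can encode alternative ways of completing a cooperative task, the Python sketch below checks whether a node is solved. The node names and the two alternative decompositions are hypothetical and are not taken from the paper; the graph is assumed to be acyclic.

```python
def solved(node, hyperarcs, done):
    """A node is solved if it is already done, or if at least one of its
    hyper-arcs (OR) has all of its child nodes solved (AND)."""
    if node in done:
        return True
    return any(
        all(solved(child, hyperarcs, done) for child in children)
        for children in hyperarcs.get(node, [])
    )


# Hypothetical palletization model: the pallet counts as loaded either when
# the robot picks and places the box, or when the human places it directly.
hyperarcs = {
    "pallet_loaded": [
        ["box_picked_by_robot", "box_placed_by_robot"],  # AND of two steps
        ["box_placed_by_human"],                         # alternative (OR)
    ],
}

print(solved("pallet_loaded", hyperarcs, {"box_placed_by_human"}))
print(solved("pallet_loaded", hyperarcs, {"box_picked_by_robot"}))
```

Because either hyper-arc suffices, the same model covers the case where the operator unexpectedly takes over a step, which is the kind of flexibility the abstract describes.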


Efficient Planning in Large MDPs with Weak Linear Function Approximation

arXiv.org Machine Learning

Large-scale Markov decision processes (MDPs) require planning algorithms with runtime independent of the number of states of the MDP. We consider the planning problem in MDPs using linear value function approximation with only weak requirements: low approximation error for the optimal value function, and a small set of "core" states whose features span those of other states. In particular, we make no assumptions about the representability of policies or value functions of non-optimal policies. Our algorithm produces almost-optimal actions for any state using a generative oracle (simulator) for the MDP, while its computation time scales polynomially with the number of features, core states, and actions and the effective horizon.


Integrating Artificial Intelligence in Treatment Planning

#artificialintelligence

At the American Association of Physicists in Medicine (AAPM) 2019 meeting, new artificial intelligence (AI) software to assist with radiotherapy treatment planning systems was highlighted. The goal of the AI-based systems is to save staff time, while still allowing clinicians to do the final patient review. RaySearch demonstrated a new U.S. Food and Drug Administration (FDA)-cleared machine learning treatment planning system. The RaySearch RayStation machine learning algorithm is being used clinically by University Health Network, Princess Margaret Cancer Center, Toronto, Canada, where it was rolled out over several months in late-2019. Medical physicist Leigh Conroy, Ph.D., was involved in this rollout and helped conduct a study, showing the automated plans and traditionally made plans to radiation oncologists to get valuable feedback.


Bottom-up mechanism and improved contract net protocol for the dynamic task planning of heterogeneous Earth observation resources

arXiv.org Artificial Intelligence

Earth observation resources are becoming increasingly indispensable in disaster relief, damage assessment and related domains. Many unpredicted factors, such as changes in observation task requirements, bad weather, and resource failures, may cause the scheduled observation scheme to become infeasible. Therefore, it is crucial to be able to develop high-quality replanned observation schemes promptly, and possibly frequently, while minimizing the effects on the scheduled tasks. A bottom-up distributed coordinated framework together with an improved contract net are proposed to facilitate dynamic task replanning for heterogeneous Earth observation resources. This hierarchical framework consists of three levels, namely, neighboring resource coordination, single planning center coordination, and multiple planning center coordination. Observation tasks affected by unpredicted factors are assigned and treated along a bottom-up route from resources to planning centers. This bottom-up distributed coordinated framework transfers part of the computing load to various nodes of the observation systems to allocate tasks more efficiently and robustly. To support the prompt assignment of large-scale tasks to proper Earth observation resources in dynamic environments, we propose a multiround combinatorial allocation (MCA) method. Moreover, a new float interval-based local search algorithm is proposed to obtain a promising planning scheme more quickly. The experiments demonstrate that the MCA method can achieve a better task completion rate for large-scale tasks with satisfactory time efficiency. They also demonstrate that this method can help to efficiently obtain replanning schemes based on the original scheme in dynamic environments.
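
For readers unfamiliar with the contract net idea, the following Python sketch shows one basic announce, bid, and award round. It illustrates the classic protocol only, not the improved contract net or the MCA method proposed in the paper. The resource names, task names, and cost table are hypothetical.

```python
def contract_net_round(tasks, resources, cost):
    """Run one announce-bid-award round and return {task: winning resource}.

    cost(resource, task) returns a numeric bid, or None if the resource
    cannot perform the task. Each resource wins at most one task per round.
    """
    awards = {}
    busy = set()
    for task in tasks:                      # the manager announces each task
        bids = {}
        for resource in resources:          # free resources respond with bids
            if resource in busy:
                continue
            bid = cost(resource, task)
            if bid is not None:
                bids[resource] = bid
        if not bids:
            continue                        # task stays unassigned this round
        winner = min(bids, key=bids.get)    # award to the lowest-cost bidder
        awards[task] = winner
        busy.add(winner)
    return awards


# Hypothetical bids: lower cost means the resource is better suited.
cost_table = {
    ("sat_A", "flood_area"): 2.0,
    ("uav_B", "flood_area"): 5.0,
    ("sat_A", "fire_area"): 1.0,
    ("uav_B", "fire_area"): 4.0,
}
awards = contract_net_round(
    ["flood_area", "fire_area"],
    ["sat_A", "uav_B"],
    lambda resource, task: cost_table.get((resource, task)),
)
print(awards)
```

Note that the sequential award here gives `fire_area` to `uav_B` even though `sat_A` would have been cheaper for it. A single greedy round can miss better joint assignments, which is one motivation for multiround or combinatorial allocation schemes.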


A Survey of Algorithms for Black-Box Safety Validation

arXiv.org Artificial Intelligence

Autonomous and semi-autonomous systems for safety-critical applications require rigorous testing before deployment. Due to the complexity of these systems, formal verification may be impossible and real-world testing may be dangerous during development. Therefore, simulation-based techniques have been developed that treat the system under test as a black box during testing. Safety validation tasks include finding disturbances to the system that cause it to fail (falsification), finding the most-likely failure, and estimating the probability that the system fails. Motivated by the prevalence of safety-critical artificial intelligence, this work provides a survey of state-of-the-art safety validation techniques with a focus on applied algorithms and their modifications for the safety validation problem. We present and discuss algorithms in the domains of optimization, path planning, reinforcement learning, and importance sampling. Problem decomposition techniques are presented to help scale algorithms to large state spaces, and a brief overview of safety-critical applications is given, including autonomous vehicles and aircraft collision avoidance systems. Finally, we present a survey of existing academic and commercially available safety validation tools.
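
The failure-probability estimation task mentioned above can be sketched in Python on a toy black box. Everything here is an assumption made for illustration: the "system" is a stand-in that fails when a scalar disturbance exceeds 3, disturbances are standard normal, and the proposal distribution is a normal shifted onto the failure boundary. None of it is taken from the survey.

```python
import math
import random


def fails(disturbance):
    """Stand-in black-box system: fails when the disturbance exceeds 3."""
    return disturbance > 3.0


def naive_monte_carlo(n, rng):
    """Estimate P(failure) by sampling disturbances from the nominal N(0, 1)."""
    return sum(fails(rng.gauss(0.0, 1.0)) for _ in range(n)) / n


def importance_sampling(n, rng, shift=3.0):
    """Estimate P(failure) by sampling from N(shift, 1) and reweighting.

    The likelihood ratio of N(0, 1) to N(shift, 1) at x is
    exp(-shift * x + shift**2 / 2).
    """
    total = 0.0
    for _ in range(n):
        x = rng.gauss(shift, 1.0)
        if fails(x):
            total += math.exp(-shift * x + 0.5 * shift ** 2)
    return total / n


rng = random.Random(0)
exact = 0.5 * math.erfc(3.0 / math.sqrt(2.0))  # closed form for this toy case
print("exact               ", exact)
print("naive Monte Carlo   ", naive_monte_carlo(20000, rng))
print("importance sampling ", importance_sampling(20000, rng))
```

Failures are rare under the nominal distribution, so naive sampling sees only a handful of them. Sampling near the failure region and correcting with the likelihood ratio typically gives a much lower-variance estimate for the same budget.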


Grassley disappointed Iowa-Iowa State game nixed after Big Ten make schedule change

FOX News

The Big Ten's plan to have a conference-only schedule during the college football season stunned many fans and left at least one U.S. senator upset. Sen. Chuck Grassley, R-Iowa, tweeted his disappointment Thursday after learning one of the top conferences in college football was changing up its schedule as the U.S. continues to battle the coronavirus pandemic with no real end in sight. Grassley was particularly upset that there would be no rivalry game between Iowa and Iowa State this year.


Learning Generalized Relational Heuristic Networks for Model-Agnostic Planning

arXiv.org Artificial Intelligence

Computing goal-directed behavior (sequential decision-making, or planning) is essential to designing efficient AI systems. Due to the computational complexity of planning, current approaches rely primarily upon hand-coded symbolic domain models and hand-coded heuristic-function generators for efficiency. Learned heuristics for such problems have been of limited utility as they are difficult to apply to problems with objects and object quantities that are significantly different from those in the training data. This paper develops a new approach for learning generalized heuristics in the absence of symbolic domain models using deep neural networks that utilize an input predicate vocabulary but are agnostic to object names and quantities. It uses an abstract state representation to facilitate data-efficient, generalizable learning. Empirical evaluation on a range of benchmark domains shows that, in contrast to prior approaches, generalized heuristics computed by this method can be transferred easily to problems with different objects and with object quantities much larger than those in the training data.


Current Advancements on Autonomous Mission Planning and Management Systems: an AUV and UAV perspective

arXiv.org Artificial Intelligence

Analyzing the surrounding situation is the most crucial part of autonomous adaptation. Because the real environment contains many unknown and constantly changing factors, autonomy requires moment-to-moment adjustment to changing circumstances. To respond properly to a changing environment, a fully autonomous vehicle must be able to comprehend its own position and its surrounding environment. However, these vehicles rely heavily on human involvement to resolve complex missions that cannot be precisely characterized in advance, which restricts their applications and accuracy. Dependence on human supervision can be reduced by improving the level of autonomy. Over the past decades, autonomy and mission planning have been researched extensively on different structures and under diverse conditions; nevertheless, aiming at robust mission planning in extreme conditions, we provide here an exhaustive study of the autonomy of UVs and its related properties in internal and external situation awareness. The discussion that follows covers the difficulties that arise in the scope of AUVs and UAVs.