skipper
Identifying and Addressing Delusions for Target-Directed Decision-Making
Zhao, Mingde, Sylvain, Tristan, Precup, Doina, Bengio, Yoshua
Target-directed agents use self-generated targets to guide their behaviors for better generalization. These agents are prone to blindly chasing problematic targets, resulting in worse generalization and safety catastrophes. We show that these behaviors can result from delusions, stemming from improper designs around training: the agent may naturally come to hold false beliefs about certain targets. We identify delusions via intuitive examples in controlled environments, and investigate their causes and mitigations. With these insights, we demonstrate how we can make agents address delusions preemptively and autonomously. We validate empirically the effectiveness of the proposed strategies in correcting delusional behaviors and improving out-of-distribution generalization.
- North America > Canada > Quebec > Montreal (0.14)
- North America > United States (0.04)
- Asia > Vietnam > Hanoi > Hanoi (0.04)
- Information Technology > Artificial Intelligence > Robots (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.69)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Human-Robot Collaboration System Setup for Weed Harvesting Scenarios in Aquatic Lakes
Elsayed, Ahmed H., Lejman, Andrej, Stahl, Frederic
Artificial Water Bodies (AWBs) are human-made and require continuous monitoring due to their artificial biological processes. These systems necessitate regular maintenance to manage their ecosystems effectively. An Unmanned Surface Vehicle (USV) offers a collaborative approach for monitoring these environments, working alongside human operators such as boat skippers to identify specific locations. This paper discusses a weed harvesting scenario, demonstrating how human-robot collaboration can be achieved, supported by preliminary results.
I. INTRODUCTION
AWBs are created by humans for different reasons such as water retention in dam construction, urban development, rainwater storage, or leisure activities.
- North America > United States > Florida (0.05)
- Europe > Germany > Lower Saxony > Hanover (0.05)
- Asia > Japan (0.05)
Skipper: Improving the Reach and Fidelity of Quantum Annealers by Skipping Long Chains
Ayanzadeh, Ramin, Qureshi, Moinuddin
Quantum Annealers (QAs) operate as single-instruction machines, lacking a SWAP operation to overcome limited qubit connectivity. Consequently, multiple physical qubits are chained to form a program qubit with higher connectivity, which diminishes the effective QA capacity by up to 33x. We observe that in QAs: (a) chain lengths exhibit a power-law distribution, with a few dominant chains holding substantially more qubits than others; and (b) about 25% of physical qubits remain unused, isolated between these chains. We propose Skipper, a software technique that enhances the capacity and fidelity of QAs by skipping dominant chains and substituting their program qubit with two readout results. Using a 5761-qubit QA, we demonstrate that Skipper can tackle up to 59% (Avg. 28%) larger problems when eleven chains are skipped. Additionally, Skipper can improve QA fidelity by up to 44% (Avg. 33%) when cutting five chains (32 runs). Users can specify up to eleven chain cuts in Skipper, necessitating about 2,000 distinct quantum executable runs. To mitigate this, we introduce Skipper-G, a greedy scheme that skips sub-problems less likely to hold the global optimum, executing a maximum of 23 quantum executables with eleven chain trims. Skipper-G can boost QA fidelity by up to 41% (Avg. 29%) when cutting five chains (11 runs).
- Oceania > New Zealand > North Island > Auckland Region > Auckland (0.04)
- North America > United States > Tennessee > Anderson County > Oak Ridge (0.04)
- North America > United States > Maryland > Baltimore County (0.04)
- Information Technology (1.00)
- Health & Medicine (0.67)
- Information Technology > Hardware (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.68)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)
- Information Technology > Artificial Intelligence > Machine Learning (0.66)
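The run counts quoted in the Skipper abstract above (32 and about 2,000 runs for Skipper, 11 and 23 for Skipper-G) are consistent with a simple pattern: each skipped chain doubles the number of sub-problems for Skipper, while the greedy variant needs two runs per skipped chain plus one. The sketch below only checks that arithmetic. The formulas `2**k` and `2*k + 1` and the function names are assumptions inferred from the quoted figures, not taken from the paper.

```python
def skipper_runs(chains_skipped: int) -> int:
    """Assumed count: every skipped chain doubles the number of sub-problems."""
    return 2 ** chains_skipped


def skipper_g_runs(chains_skipped: int) -> int:
    """Assumed greedy count: two runs per skipped chain, plus one more run."""
    return 2 * chains_skipped + 1


if __name__ == "__main__":
    # Five and eleven skipped chains are the two cases quoted in the abstract.
    for k in (5, 11):
        print(k, skipper_runs(k), skipper_g_runs(k))
```

Under these assumptions, five skipped chains give 32 and 11 runs, and eleven skipped chains give 2048 and 23 runs, matching the abstract's "32 runs", "about 2,000", "11 runs" and "maximum of 23".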
Nonstochastic Multiarmed Bandits with Unrestricted Delays
Thune, Tobias Sommer, Cesa-Bianchi, Nicolò, Seldin, Yevgeny
We investigate multiarmed bandits with delayed feedback, where the delays need be neither identical nor bounded. We first prove that "delayed" Exp3 achieves the $O(\sqrt{(KT + D)\ln K})$ regret bound conjectured by Cesa-Bianchi et al. [2016] in the case of variable but bounded delays. Here, $K$ is the number of actions and $D$ is the total delay over $T$ rounds. We then introduce a new algorithm that lifts the requirement of bounded delays by using a wrapper that skips rounds with excessively large delays. The new algorithm maintains the same regret bound but, like its predecessor, requires prior knowledge of $D$ and $T$. For this algorithm we then construct a novel doubling scheme that forgoes the prior knowledge requirement under the assumption that the delays are available at action time (rather than at loss observation time). This assumption is satisfied in a broad range of applications, including interaction with servers and service providers. The resulting oracle regret bound is of order $\min_\beta (|S_\beta|+\beta \ln K + (KT + D_\beta)/\beta)$, where $|S_\beta|$ is the number of observations with delay exceeding $\beta$, and $D_\beta$ is the total delay of observations with delay below $\beta$. The bound relaxes to $O(\sqrt{(KT + D)\ln K})$, but we also provide examples where $D_\beta \ll D$ and the oracle bound has a polynomially better dependence on the problem parameters.
- Europe > Denmark > Capital Region > Copenhagen (0.05)
- Europe > Italy > Lombardy > Milan (0.04)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
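To make the oracle bound in the abstract above concrete, the following sketch evaluates the expression $|S_\beta| + \beta \ln K + (KT + D_\beta)/\beta$ for a given delay sequence and picks the best threshold from a list of candidates. It is only an illustration of the formula, not the paper's algorithm. The function names, the candidate grid, and the choice to count a delay equal to $\beta$ as "below" the threshold are assumptions.

```python
import math


def oracle_bound(delays, num_actions, beta):
    """Evaluate |S_beta| + beta*ln(K) + (K*T + D_beta)/beta for one threshold."""
    num_rounds = len(delays)                              # T
    skipped = sum(1 for d in delays if d > beta)          # |S_beta|
    kept_delay = sum(d for d in delays if d <= beta)      # D_beta (assumed <=)
    return (skipped
            + beta * math.log(num_actions)
            + (num_actions * num_rounds + kept_delay) / beta)


def best_threshold(delays, num_actions, candidates):
    """Return (bound, beta) minimising the expression over the candidates."""
    return min((oracle_bound(delays, num_actions, b), b) for b in candidates)


if __name__ == "__main__":
    # Hypothetical delays: 99 short delays and one very long one, so that
    # the total delay D is dominated by a single observation.
    delays = [1] * 99 + [10_000]
    bound, beta = best_threshold(delays, num_actions=2,
                                 candidates=[1, 10, 20, 10_000])
    print(beta, round(bound, 2))
```

In this hypothetical example the single long delay is cheaper to skip than to absorb, which is the situation the abstract describes as $D_\beta \ll D$.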
Nonstochastic Multiarmed Bandits with Unrestricted Delays
Thune, Tobias Sommer, Cesa-Bianchi, Nicolò, Seldin, Yevgeny
We investigate multiarmed bandits with delayed feedback, where the delays need be neither identical nor bounded. We first prove that the "delayed" Exp3 achieves the $O(\sqrt{(KT + D)\ln K})$ regret bound conjectured by Cesa-Bianchi et al. [2016], in the case of variable but bounded delays. Here, $K$ is the number of actions and $D$ is the total delay over $T$ rounds. We then introduce a new algorithm that lifts the requirement of bounded delays by using a wrapper that skips rounds with excessively large delays. The new algorithm maintains the same regret bound but, like its predecessor, requires prior knowledge of $D$ and $T$. For this algorithm we then construct a novel doubling scheme that forgoes this requirement under the assumption that the delays are available at action time (rather than at loss observation time). This assumption is satisfied in a broad range of applications, including interaction with servers and service providers. The resulting oracle regret bound is of order $\min_{\beta} (|S_\beta|+\beta \ln K + (KT + D_\beta)/\beta)$, where $|S_\beta|$ is the number of observations with delay exceeding $\beta$, and $D_\beta$ is the total delay of observations with delay below $\beta$. The bound relaxes to $O(\sqrt{(KT + D)\ln K})$, but we also provide examples where $D_\beta \ll D$ and the oracle bound has a polynomially better dependence on the problem parameters.
A poet does TensorFlow
After reading Pete Warden's excellent TensorFlow for Poets, I was impressed by how easy it seemed to build a working deep learning classifier. It was so simple that I had to try it myself. I have a lot of photos around, mostly of birds and butterflies. So, I decided to build a simple butterfly classifier. I chose butterflies because I didn't have as many photos to work with, and because they were already fairly well sorted.