TacticZero: Learning to Prove Theorems from Scratch with Deep Reinforcement Learning
We propose a novel approach to interactive theorem proving (ITP) using deep reinforcement learning. The proposed framework is able to learn proof search strategies as well as tactic and argument prediction in an end-to-end manner. We formulate the process of ITP as a Markov decision process (MDP) in which each state represents a set of potential derivation paths. This structure allows us to introduce a search mechanism which enables the agent to efficiently discard (predicted) dead-end derivations and restart from promising alternatives. We implement the framework in the HOL4 theorem prover. Experimental results show that the framework using learned search strategies outperforms existing automated theorem provers (i.e.
Appendix: On the Expressivity of Markov Reward
We first address questions that might arise in response to the main text. That is, if Alice chooses a SOAP, PO, or TO for Bob to learn to solve, when can Alice determine Bob has solved the task? A: Bob can be said to be doing better on a given task if his behavior improves, as is typical in evaluating behavior under reward. The difference with SOAPs, POs, and TOs is that we measure improvement relative to the task rather than reward. For instance, given a SOAP, we might say that Bob has solved the task once he has found one of the good policies, and we might measure Bob's progress on a task in terms of the distance of his greedy policy to one of the good policies (as done in our learning experiments). The same reasoning applies to POs and TOs: Bob is doing better on a task insofar as his greedy policy (or trajectories) is (are) higher up the ordering. That is, the studied reward functions must be a function of s, (s, a), or (s, a, s'). A: Indeed, as discussed in our introduction, our goal is to examine the expressivity of Markov rewards in the context of finite MDPs.
Are you a Flat Earther? You're probably ARROGANT: People who believe in conspiracy theories are 'massively overconfident', study finds
When it comes to conspiracy theories, there are some pretty extreme ones out there. While some people insist the Earth is flat, others are certain the world is secretly ruled by reptilian humanoids. Now, a study has revealed that people who believe in these concepts are likely to be hugely overconfident. And it could go some way to explaining why it's impossible to try and change their minds. Analysis of eight studies has found a consistent pattern among people who believe in conspiracy theories – they tend to be overconfident in their cognitive abilities and significantly overestimate how much others agree with them.
Risk Awareness in HTN Planning
Alnazer, Ebaa, Georgievski, Ilche, Aiello, Marco
Real-world domains are characterised by uncertain situations in which acting and using resources may entail taking risks. Performing actions in such domains incurs the cost of consuming some resource, such as time or energy, and knowledge about these costs can range from fully known to totally unknown. In autonomous vehicles, for example, actions have uncertain costs due to factors like traffic. Choosing an action requires assessing delay risks, as each road may have unpredictable congestion. Thus, these domains call not only for planning under uncertainty but also for planning while embracing risk. HTN planning is a widely used planning technique in real-world applications, yet existing HTN approaches assume risk neutrality, relying on single-valued action costs without considering risk. Here, we enhance HTN planning with risk awareness by considering expected utility theory. We introduce a general framework for HTN planning that allows modelling risk and uncertainty using a probability distribution of action costs, upon which we define risk-aware HTN planning as being capable of accounting for different risk attitudes and allowing the computation of plans that go beyond risk neutrality. We lay out that computing risk-aware plans requires finding plans with the highest expected utility. We argue that it is possible for HTN planning agents to solve specialised risk-aware HTN planning problems by adapting existing HTN planning approaches, and develop an approach that surpasses the expressiveness of current approaches by allowing these agents to compute plans tailored to a particular risk attitude. An empirical evaluation of two case studies highlights the feasibility and expressiveness of this approach. We also highlight open issues, such as applying the proposal beyond HTN planning, covering both modelling and plan generation.
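To illustrate the idea of choosing the plan with the highest expected utility under different risk attitudes, the following is a minimal sketch and not the authors' framework. The action names, the cost distributions, the exponential utility functions, and the assumption that action costs are independent are all hypothetical choices made for the example.

```python
import math
from itertools import product

# Hypothetical cost distributions per action: lists of (cost, probability).
ACTION_COSTS = {
    "drive_side_road": [(10.0, 1.0)],            # certain cost
    "drive_highway": [(2.0, 0.5), (17.0, 0.5)],  # cheaper on average, but risky
}

# Candidate plans (sequences of actions), e.g. alternative decompositions.
PLANS = [("drive_side_road",), ("drive_highway",)]


def risk_neutral(cost):
    """Utility linear in cost: only the expected cost matters."""
    return -cost


def risk_averse(cost, a=0.2):
    """Assumed exponential utility that penalises high costs more than linearly."""
    return -math.exp(a * cost)


def risk_seeking(cost, a=0.2):
    """Assumed exponential utility that rewards the chance of a low cost."""
    return math.exp(-a * cost)


def expected_utility(plan, utility):
    """Expected utility of a plan's total cost, assuming independent action costs."""
    distributions = [ACTION_COSTS[action] for action in plan]
    total = 0.0
    # Enumerate every joint outcome of the per-action cost distributions.
    for outcome in product(*distributions):
        cost = sum(c for c, _ in outcome)
        prob = math.prod(p for _, p in outcome)
        total += prob * utility(cost)
    return total


def best_plan(plans, utility):
    """Return the plan with the highest expected utility."""
    return max(plans, key=lambda plan: expected_utility(plan, utility))
```

With these assumed numbers, a risk-neutral or risk-seeking agent prefers the highway (expected cost 9.5 versus 10), while the risk-averse agent prefers the side road because the possible cost of 17 dominates its utility.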
WAVE-UNET: Wavelength based Image Reconstruction method using attention UNET for OCT images
Viqar, Maryam, Sahin, Erdem, Madjarova, Violeta, Stoykova, Elena, Hong, Keehoon
In this work, we propose to leverage a deep-learning (DL) based reconstruction framework for high quality Swept-Source Optical Coherence Tomography (SS-OCT) images, by incorporating wavelength (λ) space interferometric fringes. Generally, the SS-OCT captured fringe is linear in wavelength space and if Inverse Discrete Fourier Transform (IDFT) is applied to extract depth-resolved spectral information, the resultant images are blurred due to the broadened Point Spread Function (PSF). Thus, the recorded wavelength space fringe is to be scaled to uniform grid in wavenumber (k) space using k-linearization and calibration involving interpolations which may result in loss of information along with increased system complexity. Another challenge in OCT is the speckle noise, inherent in the low coherence interferometry-based systems. Hence, we propose a systematic design methodology WAVE-UNET to reconstruct the high-quality OCT images directly from the λ-space to reduce the complexity. The novel design paradigm surpasses the linearization procedures and uses DL to enhance the realism and quality of raw λ-space scans. This framework uses modified UNET having attention gating and residual connections, with IDFT processed λ-space fringes as the input. The method consistently outperforms the traditional OCT system by generating good-quality B-scans with highly reduced time-complexity.
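For context, the conventional k-linearization step that the abstract describes (and that WAVE-UNET aims to avoid) can be sketched as a resampling of the fringe from a uniform wavelength grid onto a uniform wavenumber grid. This is a minimal illustration only: the function name, the use of simple linear interpolation, and the relation k = 2π/λ as the grid mapping are assumptions of the example, not details taken from the paper.

```python
import math


def linearize_in_k(wavelengths, fringe):
    """Resample a fringe sampled at increasing wavelengths onto a uniform k grid.

    Uses k = 2*pi/lambda and plain linear interpolation (an assumed,
    simplified stand-in for a calibrated interpolation scheme).
    """
    # Increasing wavelength means decreasing k, so reverse to get increasing k.
    k = [2.0 * math.pi / w for w in reversed(wavelengths)]
    f = list(reversed(fringe))
    n = len(k)

    # Uniformly spaced wavenumber grid spanning the same range.
    step = (k[-1] - k[0]) / (n - 1)
    k_uniform = [k[0] + i * step for i in range(n)]

    resampled = []
    j = 0
    for kq in k_uniform:
        # Advance to the interval [k[j], k[j+1]] that contains kq.
        while j < n - 2 and k[j + 1] < kq:
            j += 1
        t = (kq - k[j]) / (k[j + 1] - k[j])
        resampled.append(f[j] + t * (f[j + 1] - f[j]))
    return k_uniform, resampled


# Hypothetical sweep: 11 samples, uniform in wavelength (arbitrary units).
wavelengths = [1.00 + 0.01 * i for i in range(11)]
fringe = [3.0 * (2.0 * math.pi / w) + 1.0 for w in wavelengths]  # linear in k
k_uniform, resampled = linearize_in_k(wavelengths, fringe)
```

Each interpolated sample is an estimate rather than a measurement, which is the kind of information loss and added processing the abstract attributes to the linearization step.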
Serpentine Synergy: Design and Fabrication of a Dual Soft Continuum Manipulator and Soft Snake Robot
S, Rajashekhar V, Rajesh, Aravinth, Athaaillah, Muhammad Imam Anugrahadi, Prabhakar, Gowdham
This work presents a soft continuum robot (SCR) that can be used as a soft continuum manipulator (SCM) and a soft snake robot (SSR). This is achieved using expanded polyethylene foam (EPE) modules as the soft material. In situations like post-earthquake search operations, these dual-purpose robots could play a vital role. The soft continuum manipulator with a camera attached to the tip can manually search for survivors in the debris. On the other hand, the soft snake robot can be made by attaching an active wheel to the soft continuum manipulator. This mobile robot can reach places humans cannot and gather information about survivors. This work presents the design, fabrication, and experimental validation of the dual soft continuum robot.