Static and Dynamic Values of Computation in MCTS

arXiv.org Artificial Intelligence

Monte-Carlo Tree Search (MCTS) is one of the most widely used methods for planning, and has powered many recent advances in artificial intelligence. In MCTS, one typically performs computations (i.e., simulations) to collect statistics about the possible future consequences of actions, and then chooses accordingly. Many popular MCTS methods such as UCT and its variants decide which computations to perform by trading off exploration and exploitation. In this work, we take a more direct approach, and explicitly quantify the value of a computation based on its expected impact on the quality of the action eventually chosen. Our approach goes beyond the "myopic" limitations of existing computation-value-based methods in two senses: we are able to account for the impact of (I) non-immediate (i.e., future) computations (II) on non-immediate actions. We show that policies that greedily optimize computation values are optimal under certain assumptions and obtain results that are competitive with the state-of-the-art.
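
For readers unfamiliar with the exploration/exploitation trade-off mentioned above, the following is a minimal sketch of the standard UCB1 selection rule that UCT applies at each tree node. It illustrates the baseline the abstract contrasts against, not the paper's value-of-computation method. The function name `ucb1_select` and the default exploration constant are assumptions made for illustration.

```python
import math


def ucb1_select(counts, values, c=math.sqrt(2)):
    """Return the index of the action with the highest UCB1 score.

    counts[i]: number of simulations that went through action i
    values[i]: mean return observed so far for action i
    c: exploration constant (assumed default; tuned per problem in practice)
    """
    # Any action that has never been simulated is tried first.
    for i, n in enumerate(counts):
        if n == 0:
            return i

    total = sum(counts)
    # Mean value (exploitation) plus a bonus that shrinks as an action
    # accumulates simulations (exploration).
    scores = [
        values[i] + c * math.sqrt(math.log(total) / counts[i])
        for i in range(len(counts))
    ]
    return max(range(len(scores)), key=scores.__getitem__)
```

With equal visit counts the rule picks the action with the higher mean; with very unequal counts, the rarely visited action can win on its exploration bonus alone.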


ReClor: A Reading Comprehension Dataset Requiring Logical Reasoning

arXiv.org Artificial Intelligence

Recent powerful pre-trained language models have achieved remarkable performance on most of the popular datasets for reading comprehension. It is time to introduce more challenging datasets to push the development of this field towards more comprehensive reasoning of text. In this paper, we introduce a new Reading Comprehension dataset requiring logical reasoning (ReClor) extracted from standardized graduate admission examinations. As earlier studies suggest, human-annotated datasets usually contain biases, which are often exploited by models to achieve high accuracy without truly understanding the text. In order to comprehensively evaluate the logical reasoning ability of models on ReClor, we propose to identify biased data points and separate them into an EASY set, with the rest forming a HARD set. Empirical results show that state-of-the-art models have an outstanding ability to capture biases contained in the dataset, with high accuracy on the EASY set. However, they struggle on the HARD set, with poor performance near that of random guessing, indicating that more research is needed to fundamentally enhance the logical reasoning ability of current models.


Learning Coupled Policies for Simultaneous Machine Translation

arXiv.org Artificial Intelligence

In simultaneous machine translation, the system needs to incrementally generate the output translation before the input sentence ends. This is a coupled decision process consisting of a programmer and an interpreter. The programmer's policy decides when to WRITE the next output or READ the next input, and the interpreter's policy decides what word to write. We present an imitation learning (IL) approach to efficiently learn effective coupled programmer-interpreter policies. To enable IL, we present an algorithmic oracle that uses word alignments to produce oracle READ/WRITE actions for the bilingual sentence pairs used in training. We attribute the effectiveness of the learned coupled policies to (i) scheduled sampling addressing the coupled exposure bias, and (ii) quality of oracle actions capturing enough information from the partial input before writing the output. Experiments show our method outperforms strong baselines in terms of translation quality and delay, when translating from German/Arabic/Czech/Bulgarian/Romanian to English.
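
The abstract does not spell out the oracle, so the following is only one plausible sketch of how READ/WRITE actions could be derived from word alignments: each target word is written as soon as every source word aligned to it has been read. The function name `oracle_actions`, the alignment format (pairs of zero-based source and target indices), and the rule itself are assumptions for illustration, not the authors' algorithm.

```python
def oracle_actions(alignments, tgt_len):
    """Derive a READ/WRITE action sequence from word alignments.

    alignments: iterable of (source_index, target_index) pairs, zero-based
    tgt_len: number of target words
    Assumed rule: write target word j once all source words aligned to it
    have been read; unaligned target words are written immediately.
    """
    # needed[j] = how many source words must have been read before writing j.
    needed = [0] * tgt_len
    for src, tgt in alignments:
        needed[tgt] = max(needed[tgt], src + 1)

    actions = []
    read = 0  # number of source words consumed so far
    for j in range(tgt_len):
        while read < needed[j]:
            actions.append("READ")
            read += 1
        actions.append("WRITE")
    return actions
```

For example, with alignments `[(0, 0), (2, 1), (1, 2)]` and three target words, the second target word forces two extra READs, after which the third can be written without further input. Source words not needed by any target word are never read in this sketch.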


Human-to-Robot Attention Transfer for Robot Execution Failure Avoidance Using Stacked Neural Networks

arXiv.org Artificial Intelligence

Due to world dynamics and hardware uncertainty, robots inevitably fail in task executions, leading to undesired or even dangerous executions. To avoid failures for improved robot performance, it is critical to identify and correct abnormal robot executions at an early stage. However, limited by its reasoning capability and knowledge level, it is challenging for a robot to self-diagnose and correct its abnormal behaviors. To solve this problem, a novel method, human-to-robot attention transfer (H2R-AT), is proposed to seek help from a human. H2R-AT is developed based on a novel stacked neural networks model, transferring human attention embedded in verbal reminders to robot attention embedded in robot visual perception. With the attention transfer from a human, a robot understands what and where human concerns are, so that it can identify and correct its abnormal executions. To validate the effectiveness of H2R-AT, two representative task scenarios, "serve water for a human in a kitchen" and "pick up a defective gear in a factory", with abnormal robot executions, were designed in the open-access simulation platform V-REP; 252 volunteers were recruited to provide about 12000 verbal reminders to learn and test the attention transfer model H2R-AT. With an accuracy of 73.68% in transferring attention and an accuracy of 66.86% in avoiding robot execution failures, the effectiveness of H2R-AT was validated.


Hyper-Meta Reinforcement Learning with Sparse Reward

arXiv.org Artificial Intelligence

Despite their success, existing meta reinforcement learning methods still have difficulty in learning a meta policy effectively for RL problems with sparse reward. To this end, we develop a novel meta reinforcement learning framework, Hyper-Meta RL (HMRL), for sparse reward RL problems. It consists of meta state embedding, meta reward shaping and meta policy learning modules: the cross-environment meta state embedding module constructs a common meta state space to adapt to different environments; the meta-state-based, environment-specific meta reward shaping module effectively extends the original sparse reward trajectory by cross-environmental knowledge complementarity; as a consequence, the meta policy achieves better generalization and efficiency with the shaped meta reward. Experiments with sparse reward show the superiority of HMRL on both transferability and policy learning efficiency.


Efficiently Learning and Sampling Interventional Distributions from Observations

arXiv.org Artificial Intelligence

We study the problem of efficiently estimating the effect of an intervention on a single variable using observational samples in a causal Bayesian network. Our goal is to give algorithms that are efficient in both time and sample complexity in a non-parametric setting. Tian and Pearl (AAAI '02) have exactly characterized the class of causal graphs for which causal effects of atomic interventions can be identified from observational data. We make their result quantitative. Suppose P is a causal model on a set V of n observable variables with respect to a given causal graph G with observable distribution $P$. Let $P_x$ denote the interventional distribution over the observables with respect to an intervention that sets a designated variable X to x. We show the following, assuming that G has bounded in-degree and bounded c-components, and that the observational distribution is identifiable and satisfies a certain strong positivity condition: 1. [Evaluation] There is an algorithm that outputs with probability $2/3$ an evaluator for a distribution $P'$ that satisfies $d_{tv}(P_x, P') \leq \epsilon$ using $m=\tilde{O}(n\epsilon^{-2})$ samples from $P$ and $O(mn)$ time. The evaluator can return in $O(n)$ time the probability $P'(v)$ for any assignment $v$ to $V$. 2. [Generation] There is an algorithm that outputs with probability $2/3$ a sampler for a distribution $\hat{P}$ that satisfies $d_{tv}(P_x, \hat{P}) \leq \epsilon$ using $m=\tilde{O}(n\epsilon^{-2})$ samples from $P$ and $O(mn)$ time. The sampler returns an iid sample from $\hat{P}$ with probability $1-\delta$ in $O(n\epsilon^{-1} \log\delta^{-1})$ time. We extend our techniques to estimate marginals $P_x|_Y$ over a given $Y \subset V$ of interest. We also show lower bounds for the sample complexity, showing that our sample complexity has optimal dependence on the parameters n and $\epsilon$ as well as the strong positivity parameter.
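
Both guarantees above are stated in total variation distance, $d_{tv}$. As a small illustration of that metric only (not of the paper's algorithms), the sketch below computes it for two discrete distributions. The function name `total_variation` and the dictionary representation are assumptions for illustration.

```python
def total_variation(p, q):
    """Total variation distance between two discrete distributions.

    p, q: dicts mapping outcomes to probabilities; outcomes missing from a
    dict are treated as having probability 0.
    """
    support = set(p) | set(q)
    # d_tv(p, q) = (1/2) * sum over outcomes of |p(v) - q(v)|
    return 0.5 * sum(abs(p.get(v, 0.0) - q.get(v, 0.0)) for v in support)
```

Identical distributions are at distance 0 and distributions with disjoint support are at distance 1, so a bound of $\epsilon$ means the learned distribution assigns any event a probability within $\epsilon$ of its probability under $P_x$.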


Brain scans can help predict who'll benefit from an antidepressant

New Scientist

An AI can predict from people's brainwaves whether an antidepressant is likely to help them. The technique may offer a new approach to prescribing medicines for mental illnesses. "We have a central problem in psychiatry because we characterise diseases by their end point, such as what behaviours they cause," says Amit Etkin at Stanford University in California. "You tell me you're depressed, and I don't know any more than that. I don't really know what's going on in the brain and we prescribe medication on very little information."


Elephants mourn their dead even if they did not have a close bond

Daily Mail - Science & tech

Elephants mourn their dead even if they did not have a close bond and continue to take an interest long after their bodies start to decay, a new study finds. Experts from the San Diego Zoo Institute for Conservation Research looked at 32 wild elephant carcasses from 12 different sources across Africa. They monitored the way in which the animals interacted with the carcasses and found that, in all cases, they would touch and examine the remains. They were also seen vocalising and attempting to lift or pull fallen elephants that had just died, according to researchers. The idea that elephants have a 'unique relationship' with the dead has been touted for a number of years, but this new study is the first to examine it in detail.


Leaks give detailed preview of Samsung's new smart speaker, the Galaxy Home Mini ahead of release

Daily Mail - Science & tech

With a mere 24 hours before Samsung's Unpacked event, leaks of some of its soon-to-be-announced products are still rolling in. In a tweet sent out just three days before Samsung's major product event, Max Weinbach of XDA Developers, who has leaked several other major details about Samsung's forthcoming products, showed off real glimpses of the company's new smart speaker, the Galaxy Home Mini. A video of the smart speaker in action and corresponding literature for the Galaxy Home Mini offer insight into just what the product will do. According to images posted by Weinbach, among the capabilities will be the usual list of voice-activated queries like 'what's the weather?' or 'play Jazz music' in addition to a range of smart home controls. The sheet leaked by Weinbach also suggests that users will be able to summon Samsung's voice assistant Bixby to change smart thermostats, turn devices off or on, or, with applicable hardware, even change the channel on a TV.


Code & Supply

#artificialintelligence

Stephanie Vaughn - Mind the Gap: Closing the Digital Divide in America (Abstractions II Raw Cuts) - Duration: 25 minutes.