Optimal strategies in the Fighting Fantasy gaming system: influencing stochastic dynamics by gambling with limited resource
Fighting Fantasy is a popular recreational fantasy gaming system worldwide. Combat in this system progresses through a stochastic game involving a series of rounds, each of which may be won or lost. Each round, a limited resource ('luck') may be spent on a gamble to amplify the benefit from a win or mitigate the deficit from a loss. However, the success of this gamble depends on the amount of remaining resource, and if the gamble is unsuccessful, benefits are reduced and deficits increased. Players thus dynamically choose to expend resource to attempt to influence the stochastic dynamics of the game, with diminishing probability of positive return. The identification of the optimal strategy for victory is a Markov decision problem that has not yet been solved. Here, we combine stochastic analysis and simulation with dynamic programming to characterise the dynamical behaviour of the system in the absence and presence of gambling policy. We derive a simple expression for the victory probability without luck-based strategy. We use a backward induction approach to solve the Bellman equation for the system and identify the optimal strategy for any given state during the game. The optimal control strategies can dramatically enhance success probabilities, but take detailed forms; we use stochastic simulation to approximate these optimal strategies with simple heuristics that can be practically employed. Our findings provide a roadmap to improving success in the games that millions of people play worldwide, and inform a class of resource allocation problems with diminishing returns in stochastic games.
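The backward induction described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not state the rules, so the following are assumptions made for illustration: each side rolls two dice plus a skill score, the loser of a round loses 2 stamina, a luck gamble succeeds when two dice total at most the current luck, every gamble costs 1 luck, a successful gamble changes the damage to 4 (after a win) or 1 (after a loss), and a failed gamble changes it to 1 (after a win) or 3 (after a loss). The names `round_win_prob`, `luck_success_prob` and `victory_prob` are hypothetical.

```python
from functools import lru_cache
from itertools import product

DICE = range(1, 7)


def round_win_prob(skill_hero: int, skill_opp: int) -> float:
    """P(hero wins a round | the round is not a draw); each side rolls 2d6 + skill (assumed rule)."""
    wins = losses = 0
    for a, b, c, d in product(DICE, repeat=4):
        hero, opp = a + b + skill_hero, c + d + skill_opp
        wins += hero > opp
        losses += hero < opp
    return wins / (wins + losses)


def luck_success_prob(luck: int) -> float:
    """P(2d6 <= luck): assumed chance that a luck gamble succeeds."""
    return sum(a + b <= luck for a, b in product(DICE, repeat=2)) / 36


def victory_prob(stamina_hero: int, stamina_opp: int, luck: int, p_win: float) -> float:
    """Victory probability under the optimal luck policy, by memoised backward induction."""

    @lru_cache(maxsize=None)
    def value(sh: int, so: int, l: int) -> float:
        if so <= 0:
            return 1.0  # opponent defeated
        if sh <= 0:
            return 0.0  # hero defeated
        # Options after winning or losing a round without gambling.
        after_win = [value(sh, so - 2, l)]
        after_loss = [value(sh - 2, so, l)]
        if l > 0:
            q = luck_success_prob(l)
            # Gambling spends 1 luck; damage values are the assumed ones listed above.
            after_win.append(q * value(sh, so - 4, l - 1) + (1 - q) * value(sh, so - 1, l - 1))
            after_loss.append(q * value(sh - 1, so, l - 1) + (1 - q) * value(sh - 3, so, l - 1))
        # Bellman equation: take the best option in each branch.
        return p_win * max(after_win) + (1 - p_win) * max(after_loss)

    return value(stamina_hero, stamina_opp, luck)


p = round_win_prob(10, 10)
print("no luck:     ", victory_prob(12, 12, 0, p))
print("optimal luck:", victory_prob(12, 12, 9, p))
```

Setting `luck` to 0 removes the gamble options, so the same function also gives the victory probability without any luck-based strategy under these assumed rules.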
Cognitive Argumentation and the Suppression Task
Saldanha, Emmanuelle-Anna Dietz, Kakas, Antonis
This paper addresses the challenge of modeling human reasoning, within a new framework called Cognitive Argumentation. This framework rests on the assumption that human logical reasoning is inherently a process of dialectic argumentation and aims to develop a cognitive model for human reasoning that is computational and implementable. To give logical reasoning a human cognitive form the framework relies on cognitive principles, based on empirical and theoretical work in Cognitive Science, to suitably adapt a general and abstract framework of computational argumentation from AI. The approach of Cognitive Argumentation is evaluated with respect to Byrne's suppression task, where the aim is not only to capture the suppression effect between different groups of people but also to account for the variation of reasoning within each group. Two main cognitive principles are particularly important to capture human conditional reasoning that explain the participants' responses: (i) the interpretation of a condition within a conditional as sufficient and/or necessary and (ii) the mode of reasoning either as predictive or explanatory. We argue that Cognitive Argumentation provides a coherent and cognitively adequate model for human conditional reasoning that allows a natural distinction between definite and plausible conclusions, exhibiting the important characteristics of context-sensitive and defeasible reasoning.
Safe reinforcement learning for probabilistic reachability and safety specifications: A Lyapunov-based approach
Emerging applications in robotics and autonomous systems, such as autonomous driving and robotic surgery, often involve critical safety constraints that must be satisfied even when information about system models is limited. In this regard, we propose a model-free safety specification method that learns the maximal probability of safe operation by carefully combining probabilistic reachability analysis and safe reinforcement learning (RL). Our approach constructs a Lyapunov function with respect to a safe policy to restrain each policy improvement stage. As a result, it yields a sequence of safe policies that determine the range of safe operation, called the safe set, which monotonically expands and gradually converges. We also develop an efficient safe exploration scheme that accelerates the process of identifying the safety of unexamined states. Exploiting the Lyapunov shielding, our method regulates the exploratory policy to avoid dangerous states with high confidence. To handle high-dimensional systems, we further extend our approach to deep RL by introducing a Lagrangian relaxation technique to establish a tractable actor-critic algorithm. The empirical performance of our method is demonstrated through continuous control benchmark problems, such as a reaching task on a planar robot arm.
Predicting Subjective Features from Questions on QA Websites using BERT
Annamoradnejad, Issa, Fazli, Mohammadamin, Habibi, Jafar
Modern Question-Answering websites, such as StackOverflow and Quora, have specific user rules to maintain their content quality. These systems rely on user reports for assessing new content, which causes serious problems, including the slow handling of violations, the loss of normal and experienced users' time, the low quality of some reports, and discouraging feedback to new users. Therefore, with the overall goal of providing solutions for automating moderation actions in Q&A websites, we aim to provide a model to predict 20 quality or subjective aspects of questions in QA websites. To this end, we used data gathered by the CrowdSource team at Google Research in 2019 and fine-tuned a pre-trained BERT model on our problem. The model achieved 95.4% accuracy after 2 epochs of training and did not improve substantially in subsequent epochs. The results confirm that simple fine-tuning can produce accurate models in little time and with less data.
Towards precise causal effect estimation from data with hidden variables
Cheng, Debo, Li, Jiuyong, Liu, Lin, Yu, Kui, Lee, Thuc Duy, Liu, Jixue
Causal effect estimation from observational data is a crucial but challenging task. Currently, only a limited number of data-driven causal effect estimation methods are available. These methods either provide only a bound on the causal effect of a treatment on the outcome or, although they provide a unique estimate, rely on impractical assumptions about the data or have low efficiency. In this paper, we identify a practical problem setting and propose an approach to achieving unique causal effect estimation from data with hidden variables under this setting. For the approach, we develop theorems that support the discovery of the proper covariate sets for confounding adjustment (adjustment sets). Based on the theorems, two algorithms are presented for finding the proper adjustment sets from data with hidden variables to obtain unbiased and unique causal effect estimation. Experiments with benchmark Bayesian networks and real-world datasets have demonstrated the efficiency and effectiveness of the proposed algorithms, indicating the practicability of the identified problem setting and the potential of the approach in real-world applications.
Scalable Constrained Bayesian Optimization
Eriksson, David, Poloczek, Matthias
The global optimization of a high-dimensional black-box function under black-box constraints is a pervasive task in machine learning, control, and engineering. These problems are difficult since the feasible set is typically non-convex and hard to find, in addition to the curses of dimensionality and the heterogeneity of the underlying functions. In particular, these characteristics dramatically impact the performance of Bayesian optimization methods, which have otherwise become the de facto standard for sample-efficient optimization in unconstrained settings. Due to the lack of sample-efficient methods, practitioners usually fall back to evolutionary strategies or heuristics. We propose the scalable constrained Bayesian optimization (SCBO) algorithm that addresses the above challenges by data-independent transformations of the functions and follows the recent theme of local Bayesian optimization. A comprehensive experimental evaluation demonstrates that SCBO achieves excellent results and outperforms the state-of-the-art methods.
You created a machine learning application. Now make sure it's secure.
In a recent post, we described what it would take to build a sustainable machine learning practice. By "sustainable," we mean projects that aren't just proofs of concepts or experiments. A sustainable practice means projects that are integral to an organization's mission: projects by which an organization lives or dies. These projects are built and supported by a stable team of engineers, and supported by a management team that understands what machine learning is, why it's important, and what it's capable of accomplishing. Finally, sustainable machine learning means that as many aspects of product development as possible are automated: not just building models, but cleaning data, building and managing data pipelines, testing, and much more. Machine learning will penetrate our organizations so deeply that it won't be possible for humans to manage them unassisted. Organizations throughout the world are waking up to the fact that security is essential to their software projects. Nobody wants to be the next Sony, the next Anthem, or the next Equifax. But while we know how to make traditional software more secure (even though we frequently don't), machine learning presents a new set of problems. Any sustainable machine learning practice must address machine learning's unique security issues. We didn't do that for traditional software, and we're paying the price now.
Privacy Attacks on Machine Learning Models
Machine learning is an exciting field of new opportunities and applications, but like most technology, it also brings dangers as we expand the reach of machine learning systems within our organizations. The use of machine learning on sensitive information, such as financial data, shopping histories, conversations with friends and health-related data, has expanded in the past five years -- and so has the research on vulnerabilities within those machine learning systems. In the news and commentary today, the most common example of hacking a machine learning system is adversarial input. Adversarial inputs, like the one in the video shown below, are crafted examples that fool a machine learning system into making a false prediction. In this video, a group of researchers at MIT show that they can 3D print an adversarial turtle which is misclassified as a rifle from multiple angles by a computer vision system.
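To make the idea of a crafted input concrete, here is a toy Python sketch. It is far simpler than the 3D-printed turtle attack and is not taken from the article. It assumes a hypothetical linear classifier with made-up weights, and it shifts every feature by a small amount `epsilon` in the direction that pushes the score across the decision boundary (a sign-of-the-gradient step, which for a linear model is just the sign of each weight).

```python
def predict(weights, bias, x):
    """Return (label, score) for a linear classifier: label is 1 when the score is positive."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return (1 if score > 0 else 0), score


def sign(v):
    return (v > 0) - (v < 0)


def adversarial_input(weights, x, label, epsilon):
    """Shift each feature by at most epsilon so the score moves away from `label`."""
    direction = -1 if label == 1 else 1
    return [xi + direction * epsilon * sign(w) for w, xi in zip(weights, x)]


# Hypothetical model and input, chosen only for illustration.
weights, bias = [2.0, -1.0, 0.5], -0.5
x = [1.0, 0.5, 1.0]

label, score = predict(weights, bias, x)
x_adv = adversarial_input(weights, x, label, epsilon=0.5)
adv_label, adv_score = predict(weights, bias, x_adv)

print("original:   ", x, "label", label, "score", score)
print("adversarial:", x_adv, "label", adv_label, "score", adv_score)
```

In this toy case no feature changes by more than `epsilon`, yet the predicted label flips. Attacks on image classifiers apply the same principle to a non-linear model with many more input dimensions.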
Harnham hiring Machine Learning Engineer in New York, New York, United States LinkedIn
Based in NYC, a specialized data-driven firm is expanding its opportunity to create data-driven products in the e-commerce space. The company is one of the largest and most profitable consumer product companies in the Amazon ecosystem while building an amazing place to work. The team there is utilizing applied intelligence to unlock the full potential of an e-commerce market. The team is creating innovative and actionable algorithms to reveal research-based secrets pulled from massive data sources, in order to get ahead of today's market. They've recently received funding for a Series A round and are already profitable. This role is part individual contributor, part manager, where you will work with business stakeholders to design data-driven products and build the internal tools that will help ensure data quality and create opportunities for team members to quickly access data and make decisions related to the growth of the brands.
Novateur Research Solutions hiring Machine Learning Scientist in New York City Metropolitan Area LinkedIn
We value creativity, vision, collaboration, and above all, ambition to innovate. We are looking for Scientists and Mathematicians to join our research team and help us solve challenging scientific and computational problems in machine learning, computer vision, distributed computing, and related areas. As part of the Novateur Team, you will actively collaborate with world-renowned researchers in academia and industry to develop cutting-edge technologies for smart systems. You will have opportunities and professional freedom to create novel research and technical directions in your areas of interest, attend major scientific conferences and seminars, and publish research papers. Novateur offers competitive pay and benefits comparable to Fortune 500 companies that include a wide choice of healthcare options with generous company subsidy, 401(k) with generous employer match, paid holidays and paid time off increasing with tenure, and company paid short-term disability, long-term disability, and life insurance.