
Collaborating Authors: Polani, Daniel


Normative Feeling: Socially Patterned Affective Mechanisms

arXiv.org Artificial Intelligence

Norms, and the normative processes (such as social maintenance) that enforce them, are considered fundamental building blocks of human societies, shaping many aspects of our cognition. However, emerging work argues that the building blocks of normativity appeared much earlier in evolution than previously considered. In light of this, we argue that normative processes must be taken into account when considering the evolution of even ancient processes such as affect. We show through an agent-based model (with an evolvable model of affect) that different affective dispositions emerge when social maintenance is taken into account. Further, we demonstrate that social maintenance results in the emergence of a minimal population-regulation mechanism in a dynamic environment, without the need to predict the state of the environment or reason about the mental states of others. We use a cultural interpretation of our model to derive a new definition of norm emergence which distinguishes between indirect and direct social maintenance. Indirect social maintenance tends towards a single equilibrium (similar to environmental scaffolding), whereas the richer direct social maintenance results in many possible behavioural equilibria, capturing an important aspect of normative behaviour: it bears a certain degree of arbitrariness. We also distinguish between single-variable and mechanistic normative regularities. A mechanistic regularity, rather than a particular behaviour specified by one value (e.g., walking speed), is a collection of values that specify a culturally patterned version of a psychological mechanism (e.g., a disposition). This is how culture reprograms entire cognitive and physiological systems.
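
As a flavour of the kind of agent-based model described above, the following is a minimal Python sketch: agents carry a single evolvable affective disposition ("greed"), neighbours sanction deviations from the local norm (a stand-in for direct social maintenance), and the population evolves by fitness-proportional selection with mutation. All names, parameters, and dynamics are illustrative assumptions, not the paper's actual model.

    # Minimal sketch (not the paper's model): an evolvable affective
    # disposition under social maintenance. Names and dynamics are
    # illustrative assumptions.
    import random

    POP, GENS, SIGMA = 50, 200, 0.05

    def fitness(greed, neighbours, sanction_strength=1.0):
        # Intake grows with the agent's disposition, but neighbours
        # sanction deviation from the local norm: this is the (direct)
        # social-maintenance term.
        intake = greed
        local_norm = sum(neighbours) / len(neighbours)
        sanction = sanction_strength * abs(greed - local_norm)
        return intake - sanction

    population = [random.random() for _ in range(POP)]
    for _ in range(GENS):
        # Here the whole population acts as the neighbourhood.
        scores = [fitness(g, population) for g in population]
        # Fitness-proportional reproduction with Gaussian mutation.
        lo = min(scores)
        weights = [s - lo + 1e-9 for s in scores]
        population = [
            max(0.0, min(1.0, random.choices(population, weights)[0]
                         + random.gauss(0, SIGMA)))
            for _ in range(POP)
        ]

    print(f"evolved disposition: mean={sum(population)/POP:.3f}")

The point of the sanction term is that where the population settles depends on its own history rather than only on the environment, echoing the arbitrariness of direct social maintenance.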


SuPLE: Robot Learning with Lyapunov Rewards

arXiv.org Artificial Intelligence

The reward function is an essential component in robot learning. Reward directly affects the sample and computational complexity of learning, and the quality of a solution. The design of informative rewards requires domain knowledge, which is not always available. We use the properties of the dynamics to produce a system-appropriate reward without adding external assumptions. Specifically, we explore an approach that utilizes the Lyapunov exponents of the system dynamics to generate a system-immanent reward. We demonstrate that the 'Sum of the Positive Lyapunov Exponents' (SuPLE) is a strong candidate for the design of such a reward. We develop a computational framework for deriving this reward and demonstrate its effectiveness on classical benchmarks for sample-based stabilization of various dynamical systems. It eliminates the need to start the training trajectories at arbitrary states, also known as auxiliary exploration. While the latter is common practice in simulated robot learning, it is impractical in real robotic systems, since they typically start from natural rest states, such as a pendulum at the bottom or a robot on the ground, and cannot easily be initialized at arbitrary states. Comparing the performance of SuPLE to commonly used reward functions, we observe that the latter fail to find a solution without auxiliary exploration, even for the task of swinging up a double pendulum and keeping it stable in the upright position, a prototypical scenario for multi-linked robots. SuPLE-induced rewards thus offer a novel route for effective robot learning in typical, as opposed to highly specialized or fine-tuned, scenarios. Our code is publicly available for reproducibility and further research.
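
To make the idea concrete, here is a hedged sketch of one plausible way to compute a SuPLE-style reward: local Lyapunov exponents are approximated by the real parts of the eigenvalues of a finite-difference Jacobian of the dynamics, and the positive ones are summed. The pendulum dynamics, the estimator, and all parameters are illustrative assumptions; the paper's actual computational framework may differ.

    # Sketch of a SuPLE-style reward (assumptions: finite-difference
    # Jacobian, local exponents taken as its eigenvalue real parts).
    import numpy as np

    def pendulum(x, g=9.81, l=1.0, b=0.1):
        # x = [theta, theta_dot], theta measured from the upright position.
        theta, omega = x
        return np.array([omega, (g / l) * np.sin(theta) - b * omega])

    def jacobian(f, x, eps=1e-6):
        n = len(x)
        J = np.zeros((n, n))
        f0 = f(x)
        for i in range(n):
            dx = np.zeros(n); dx[i] = eps
            J[:, i] = (f(x + dx) - f0) / eps
        return J

    def suple_reward(x):
        # Local proxy for the sum of positive Lyapunov exponents:
        # positive real parts of the Jacobian's eigenvalues.
        eig = np.linalg.eigvals(jacobian(pendulum, x))
        return float(np.sum(np.clip(eig.real, 0.0, None)))

    print(suple_reward(np.array([0.0, 0.0])))      # unstable upright: > 0
    print(suple_reward(np.array([np.pi, 0.0])))    # stable bottom: ~ 0

Note how this proxy peaks near the unstable upright equilibrium, which is exactly where a stabilizing controller needs to do its work, so no auxiliary exploration is needed to make that region rewarding.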


The Effect of Noise on the Emergence of Continuous Norms and its Evolutionary Dynamics

arXiv.org Artificial Intelligence

The social world is replete with norms, an important aspect of organising societies. Social norms reduce the degrees of freedom in the actions of individuals, making them more predictable and stabilising societies (FeldmanHall and Shenhav, 2019). Norms also enable unrelated agents to manage shared resources (Mathew et al., 2013), thereby extending cooperation beyond genetic relatives (Richerson et al., 2016).

Going beyond continuous game theory, Aubert-Kato et al. (2015) investigated the emergence of frugal and greedy behaviours in an embodied version of a dilemma where agents varied in how long they exploited a food source: the longer an agent exploits the food source, the more selfish it is. Michaeli and Spiro (2015) showed how "liberal" and "conservative" punishment regimes can affect …


Dimensionality Reduction of Dynamics on Lie Manifolds via Structure-Aware Canonical Correlation Analysis

arXiv.org Artificial Intelligence

Incorporating prior knowledge into a data-driven modeling problem can drastically improve performance, reliability, and generalization outside of the training sample. The stronger the structural properties, the more effective these improvements become. Manifolds are a powerful, nonlinear generalization of Euclidean space for modeling in finite dimensions. The structural impositions on constrained systems increase further when a group structure is applied, turning them into Lie manifolds. The range of their applications is very wide and includes the important case of robotic tasks. Canonical Correlation Analysis (CCA) can construct a hierarchical sequence of maximal correlations between two paired data sets in Euclidean spaces. We present a method that generalizes this concept to Lie manifolds and demonstrate its efficacy through the substantial improvements it achieves in making structure-consistent predictions about changes in the state of a robotic hand.
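
For reference, the Euclidean baseline that the paper generalizes can be sketched in a few lines: classical CCA computed by whitening both blocks and taking an SVD of the cross-covariance. This is only the flat-space starting point; the structure-aware Lie-manifold step (e.g., first mapping the data into a linear space via a logarithm map) is deliberately omitted, and the details below are assumptions rather than the authors' method.

    # Classical (Euclidean) CCA via whitening + SVD: the flat-space
    # baseline only, not the paper's Lie-manifold construction.
    import numpy as np

    def cca(X, Y, k=2, reg=1e-8):
        """Top-k canonical directions for row-paired data X (n,p), Y (n,q)."""
        X = X - X.mean(0); Y = Y - Y.mean(0)
        n = X.shape[0]
        Cxx = X.T @ X / n + reg * np.eye(X.shape[1])
        Cyy = Y.T @ Y / n + reg * np.eye(Y.shape[1])
        Cxy = X.T @ Y / n
        # Whiten each block; the SVD's singular values of the whitened
        # cross-covariance are the canonical correlations.
        Wx = np.linalg.inv(np.linalg.cholesky(Cxx)).T
        Wy = np.linalg.inv(np.linalg.cholesky(Cyy)).T
        U, s, Vt = np.linalg.svd(Wx.T @ Cxy @ Wy)
        return Wx @ U[:, :k], Wy @ Vt.T[:, :k], s[:k]

    rng = np.random.default_rng(0)
    Z = rng.normal(size=(500, 2))                       # shared latent
    X = Z @ rng.normal(size=(2, 5)) + 0.1 * rng.normal(size=(500, 5))
    Y = Z @ rng.normal(size=(2, 4)) + 0.1 * rng.normal(size=(500, 4))
    A, B, corrs = cca(X, Y)
    print("canonical correlations:", np.round(corrs, 3))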


A space of goals: the cognitive geometry of informationally bounded agents

arXiv.org Artificial Intelligence

Traditionally, Euclidean geometry is treated by scientists as a priori and objective. However, when we take the position of an agent, the problem of selecting a best route must also factor in the abilities of the agent, its embodiment, and particularly its cognitive effort. In this paper we consider geometry in terms of travel between states within a world, by incorporating information-processing costs alongside the appropriate spatial distances. This induces a geometry that increasingly differs from the original geometry of the given world as information costs become increasingly important. We visualize this "cognitive geometry" by projecting it onto 2- and 3-dimensional spaces, showing distinct distortions that reflect the emergence of epistemic and information-saving strategies as well as pivot states. The analogies between traditional cost-based geometries and those induced by additional informational costs invite a generalization of the traditional notion of geodesics as cheapest routes towards the notion of "infodesics". Crucially, the concept of infodesics approximates the usual geometric property that, when travelling from a start to a goal along a geodesic, not only the goal but all intermediate points are visited at optimal cost from the start.
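
The following toy sketch illustrates the flavour of an infodesic search (it is not the paper's formalism): each decision adds an information cost proportional to the bits needed to select among the available actions, so the cheapest route can differ from the metrically shortest one. The graph, the per-step cost beta * log2(out-degree), and all names are illustrative assumptions.

    # Toy illustration: augment metric edge costs with a per-decision
    # information cost, so cheapest routes can prefer low-branching
    # corridors over metrically shorter but decision-heavy hubs.
    import heapq, math

    def infodesic(graph, start, goal, beta=1.0):
        """graph: node -> list of (neighbour, metric_cost)."""
        dist, prev = {start: 0.0}, {}
        pq = [(0.0, start)]
        while pq:
            d, u = heapq.heappop(pq)
            if u == goal:
                break
            if d > dist.get(u, math.inf):
                continue
            info = beta * math.log2(max(len(graph[u]), 1))  # bits to choose
            for v, c in graph[u]:
                nd = d + c + info
                if nd < dist.get(v, math.inf):
                    dist[v], prev[v] = nd, u
                    heapq.heappush(pq, (nd, v))
        path, u = [goal], goal
        while u != start:
            u = prev[u]
            path.append(u)
        return path[::-1], dist[goal]

    # 'hub' is metrically short but information-expensive (high branching).
    g = {
        "s": [("hub", 1.0), ("a", 1.2)],
        "hub": [("t", 1.0), ("x1", 1.0), ("x2", 1.0), ("x3", 1.0)],
        "a": [("b", 1.2)],
        "b": [("t", 1.2)],
        "x1": [], "x2": [], "x3": [], "t": [],
    }
    print(infodesic(g, "s", "t", beta=0.0))  # via hub (pure metric)
    print(infodesic(g, "s", "t", beta=2.0))  # via corridor a-b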


Causal blankets: Theory and algorithmic framework

arXiv.org Artificial Intelligence

We introduce a novel framework to identify perception-action loops (PALOs) directly from data, based on the principles of computational mechanics. Our approach rests on the notion of a causal blanket, which captures sensory and active variables as dynamical sufficient statistics, i.e., as the "differences that make a difference." Furthermore, our theory provides a broadly applicable procedure to construct PALOs that requires neither a steady state nor Markovian dynamics. Using our theory, we show that every bipartite stochastic process has a causal blanket, but the extent to which this leads to an effective PALO formulation varies depending on the integrated information of the bipartition.
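
A small sketch of the underlying sufficiency idea (illustrative only; the paper's construction via computational mechanics is more general): a candidate blanket variable s(x) should make the predictive distribution of the other partition depend on x only through s(x). The test below simply compares empirical conditionals; all names and data are hypothetical.

    # Illustrative sufficiency check: does P(y'|x) depend on x only
    # through the candidate statistic s(x)?
    import random
    from collections import Counter, defaultdict

    def sufficiency_gap(transitions, s):
        """transitions: list of (x, y_next) samples; s: candidate statistic.
        Returns max total-variation gap between P(y'|x) and P(y'|s(x))."""
        by_x, by_s = defaultdict(Counter), defaultdict(Counter)
        for x, y in transitions:
            by_x[x][y] += 1
            by_s[s(x)][y] += 1
        def dist(counter):
            n = sum(counter.values())
            return {y: c / n for y, c in counter.items()}
        gap = 0.0
        for x, cx in by_x.items():
            px, ps = dist(cx), dist(by_s[s(x)])
            ys = set(px) | set(ps)
            tv = 0.5 * sum(abs(px.get(y, 0) - ps.get(y, 0)) for y in ys)
            gap = max(gap, tv)
        return gap

    # Here y' depends on x only via its parity, so parity is sufficient.
    data = [(x, (x % 2) ^ (random.random() < 0.1))
            for x in [random.randrange(8) for _ in range(5000)]]
    print(sufficiency_gap(data, s=lambda x: x % 2))  # ~ small
    print(sufficiency_gap(data, s=lambda x: 0))      # larger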


AvE: Assistance via Empowerment

arXiv.org Artificial Intelligence

One difficulty in using artificial agents for human-assistive applications lies in the challenge of accurately assisting with a person's goal(s). Existing methods tend to rely on inferring the human's goal, which is challenging when there are many potential goals or when the set of candidate goals is difficult to identify. We propose a new paradigm for assistance that instead increases the human's ability to control their environment, and formalize this approach by augmenting reinforcement learning with human empowerment. This task-agnostic objective preserves the person's autonomy and ability to achieve any eventual state. We test our approach against assistance based on goal inference, highlighting scenarios where our method overcomes failure modes stemming from goal ambiguity or misspecification. As existing methods for estimating empowerment in continuous domains are computationally hard, precluding their use in real-time learned assistance, we also propose an efficient, empowerment-inspired proxy metric. Using this, we are able to successfully demonstrate our method in a shared-autonomy user study for a challenging simulated teleoperation task with human-in-the-loop training.
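
As a flavour of how a cheap empowerment estimate might look (an assumption-laden sketch, not the proxy proposed in the paper): in a small deterministic world, log2 of the number of distinct states reachable within n steps upper-bounds the n-step actuation channel capacity, and already captures the intuition that walls and obstacles reduce the human's control.

    # Crude empowerment proxy for a deterministic world: log2 of the
    # number of distinct states reachable in n steps.
    import math
    from itertools import product

    def empowerment_proxy(step, state, actions, n=3):
        reachable = set()
        for seq in product(actions, repeat=n):
            s = state
            for a in seq:
                s = step(s, a)
            reachable.add(s)
        return math.log2(len(reachable))

    # 1-D corridor of length 10; walls reduce reachability.
    def step(s, a):
        return min(max(s + a, 0), 9)

    print(empowerment_proxy(step, 5, actions=(-1, 0, 1)))  # mid-corridor: high
    print(empowerment_proxy(step, 0, actions=(-1, 0, 1)))  # at a wall: lower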


Bold Hearts Team Description for RoboCup 2019 (Humanoid Kid Size League)

arXiv.org Artificial Intelligence

We participated in the RoboCup 2018 competition in Montreal with our newly developed BoldBot, based on the Darwin-OP and mostly self-printed custom parts. This paper covers the lessons learnt from that competition and further developments for the RoboCup 2019 competition. Firstly, we briefly introduce the team along with an overview of past achievements. We then present a simple, standalone 2D simulator we use to simplify the entry for new members by making basic RoboCup concepts quickly accessible. We describe our approach to semantic segmentation for the vision system used in the 2018 competition, which replaced the lookup-table (LUT) implementation we had before. We also discuss the extra structural support we plan to add to the printed parts of the BoldBot and our transition to ROS 2 as our new middleware. Lastly, we present a collection of our team's open-source contributions.


Expanding the Active Inference Landscape: More Intrinsic Motivations in the Perception-Action Loop

arXiv.org Artificial Intelligence

Active inference is an ambitious theory that treats perception, inference, and action selection of autonomous agents under the heading of a single principle. It suggests biologically plausible explanations for many cognitive phenomena, including consciousness. In active inference, action selection is driven by an objective function that evaluates possible future actions with respect to current, inferred beliefs about the world. Active inference at its core is independent of extrinsic rewards, resulting in a high level of robustness across, e.g., different environments or agent morphologies. In the literature, paradigms that share this independence have been summarised under the notion of intrinsic motivations. In general, and in contrast to active inference, these models of motivation come without a commitment to particular inference and action-selection mechanisms. In this article, we study whether the inference and action-selection machinery of active inference can also be used by alternatives to the originally included intrinsic motivation. The perception-action loop explicitly relates inference and action selection to the environment and agent memory, and is consequently used as the foundation for our analysis. We reconstruct the active inference approach, locate the original formulation within it, and show how alternative intrinsic motivations can be used while keeping many of the original features intact. Furthermore, we illustrate the connection to universal reinforcement learning by means of our formalism. Active inference research may profit from comparisons of the dynamics induced by alternative intrinsic motivations. Research on intrinsic motivations may profit from an additional way to implement intrinsically motivated agents that also shares the biological plausibility of active inference.
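
A minimal sketch of the article's central point, under strong simplifying assumptions: keep belief-based evaluation and softmax action selection, but make the intrinsic-motivation term a pluggable function. Here the default term is the expected information gain (active inference's epistemic value); any alternative intrinsic motivation with the same signature could be swapped in. The discrete setup and all names are illustrative.

    # Pluggable intrinsic motivation inside a fixed selection machinery.
    import numpy as np

    def entropy(p):
        p = p[p > 0]
        return -np.sum(p * np.log(p))

    def info_gain(belief, lik):
        """Expected reduction in state entropy from observing o under
        likelihood lik[o, s] -- the epistemic value of active inference."""
        joint = lik * belief                 # p(o, s)
        p_o = joint.sum(axis=1)
        posteriors = joint / p_o[:, None]
        return entropy(belief) - sum(
            p_o[o] * entropy(posteriors[o]) for o in range(len(p_o)))

    def select_action(belief, likelihoods, intrinsic=info_gain, beta=4.0):
        # Softmax over the (swappable) intrinsic value of each action.
        values = np.array([intrinsic(belief, lik) for lik in likelihoods])
        p = np.exp(beta * values); p /= p.sum()
        return np.random.choice(len(likelihoods), p=p)

    belief = np.array([0.5, 0.5])
    look = np.array([[0.9, 0.1], [0.1, 0.9]])    # informative observation
    ignore = np.array([[0.5, 0.5], [0.5, 0.5]])  # uninformative
    print(select_action(belief, [look, ignore]))  # mostly picks 'look' (0)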


Action and perception for spatiotemporal patterns

arXiv.org Artificial Intelligence

This is a contribution to the formalization of the concept of agents in multivariate Markov chains. Agents are commonly defined as entities that act, perceive, and are goal-directed. In a multivariate Markov chain (e.g., a cellular automaton) the transition matrix completely determines the dynamics. This seems to contradict the possibility of acting entities within such a system. Here we present definitions of actions and perceptions within multivariate Markov chains based on entity-sets. An entity-set represents a largely independent choice of a set of spatiotemporal patterns that are considered to be all the entities within the Markov chain. For example, the entity-set can be chosen according to operational-closure conditions or complete specific integration. Importantly, the perception-action loop also induces an entity-set and is a multivariate Markov chain. We then show that our definition of actions leads to non-heteronomy and that our definition of perceptions specializes to the usual concept of perception in the perception-action loop.
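
To make the notion of a spatiotemporal pattern concrete, here is a minimal sketch (illustrative assumptions throughout, not the paper's formalism): in a small two-variable binary Markov chain, a pattern is a partial assignment of values to (time, variable) pairs, and its probability can be computed by brute-force enumeration. Entity-sets would then be particular choices of sets of such patterns.

    # Probability of a spatiotemporal pattern in a toy multivariate
    # Markov chain, by enumeration over all trajectories.
    from itertools import product

    VARS = 2          # variables per time step
    T = 3             # time steps

    def step_prob(prev, cur):
        # Toy factorized transition: each variable copies its left
        # neighbour (cyclically) with probability 0.9.
        p = 1.0
        for i in range(VARS):
            p *= 0.9 if cur[i] == prev[(i - 1) % VARS] else 0.1
        return p

    def pattern_prob(pattern):
        total = 0.0
        for traj in product(product((0, 1), repeat=VARS), repeat=T):
            if any(traj[t][v] != val for (t, v), val in pattern.items()):
                continue
            p = 0.25  # uniform initial distribution over 4 joint states
            for t in range(1, T):
                p *= step_prob(traj[t - 1], traj[t])
            total += p
        return total

    # An "entity-like" pattern: variable 0 holds value 1 at every step.
    print(pattern_prob({(0, 0): 1, (1, 0): 1, (2, 0): 1}))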