AITopics

2209.0617

Country:

Asia > Bangladesh > Dhaka Division > Dhaka District > Dhaka (0.04)
North America > United States > Massachusetts > Middlesex County > Somerville (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Asia > India (0.04)

Genre:

Overview (1.00)
Research Report > New Finding (0.88)
Research Report > Experimental Study (0.66)

Industry:

Information Technology > Services (1.00)
Media (0.71)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

McClement, Daniel G., Lawrence, Nathan P., Backstrom, Johan U., Loewen, Philip D., Forbes, Michael G., Gopaluni, R. Bhushan

Meta-Reinforcement Learning for the Tuning of PI Controllers: An Offline Approach

Meta-learning is a branch of machine learning which trains neural network models to synthesize a wide variety of data in order to rapidly solve new problems. In process control, many systems have similar and well-understood dynamics, which suggests it is feasible to create a generalizable controller through meta-learning. In this work, we formulate a meta reinforcement learning (meta-RL) control strategy that can be used to tune proportional--integral controllers. Our meta-RL agent has a recurrent structure that accumulates "context" to learn a system's dynamics through a hidden state variable in closed-loop. This architecture enables the agent to automatically adapt to changes in the process dynamics. In tests reported here, the meta-RL agent was trained entirely offline on first order plus time delay systems, and produced excellent results on novel systems drawn from the same distribution of process dynamics used for training. A key design element is the ability to leverage model-based information offline during training in simulated environments while maintaining a model-free policy structure for interacting with novel processes where there is uncertainty regarding the true process dynamics. Meta-learning is a promising approach for constructing sample-efficient intelligent controllers.

machine learning, meta-rl agent, reinforcement learning, (18 more...)

doi: 10.1016/j.jprocont.2022.08.002

2203.09661

Genre:

Overview (1.00)
Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Journal of Artificial Intelligence ResearchSep-19-2022

Motion Planning Under Uncertainty with Complex Agents and Environments via Hybrid Search

Strawser, Daniel (a:1:{s:5:"en_US";s:3:"MIT";}) | Williams, Brian (MIT)

As autonomous systems and robots are applied to more real world situations, they must reason about uncertainty when planning actions. Mission success oftentimes cannot be guaranteed and the planner must reason about the probability of failure. Unfortunately, computing a trajectory that satisfies mission goals while constraining the probability of failure is difficult because of the need to reason about complex, multidimensional probability distributions. Recent methods have seen success using chance-constrained, model-based planning. However, the majority of these methods can only handle simple environment and agent models. We argue that there are two main drawbacks of current approaches to goal-directed motion planning under uncertainty. First, current methods suffer from an inability to deal with expressive environment models such as 3D non-convex obstacles. Second, most planners rely on considerable simplifications when computing trajectory risk including approximating the agent’s dynamics, geometry, and uncertainty. In this article, we apply hybrid search to the risk-bound, goal-directed planning problem. The hybrid search consists of a region planner and a trajectory planner. The region planner makes discrete choices by reasoning about geometric regions that the autonomous agent should visit in order to accomplish its mission. In formulating the region planner, we propose landmark regions that help produce obstacle-free paths. The region planner passes paths through the environment to a trajectory planner; the task of the trajectory planner is to optimize trajectories that respect the agent’s dynamics and the user’s desired risk of mission failure. We discuss three approaches to modeling trajectory risk: a CDF-based approach, a sampling-based collocation method, and an algorithm named Shooting Method Monte Carlo. These models allow computation of trajectory risk with more complex environments, agent dynamics, geometries, and models of uncertainty than past approaches. A variety of 2D and 3D test cases are presented including a linear case, a Dubins car model, and an underwater autonomous vehicle. The method is shown to outperform other methods in terms of speed and utility of the solution. Additionally, the models of trajectory risk are shown to better approximate risk in simulation.

artificial intelligence, constraint, planning & scheduling, (20 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.1.13361

AI Access Foundation

13361

Journal of Artificial Intelligence Research

Country:

Europe (0.27)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
Oceania > Australia (0.14)
(2 more...)

Genre:

Overview (0.45)
Research Report (0.45)

Industry: Energy > Oil & Gas > Upstream (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots > Robot Planning & Action (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

Aubret, Arthur, Matignon, Laetitia, Hassas, Salima

An information-theoretic perspective on intrinsic motivation in reinforcement learning: a survey

Traditionally, an agent maximizes a reward defined according to the task to perform: it may be a score when the agent learns to solve a game or a distance function when the agent learns to reach a goal. The reward is then considered as extrinsic (or as a feedback) because the reward function is provided expertly and specifically for the task. With an extrinsic reward, many spectacular results have been obtained on Atari game [Bellemare et al. 2015] with the Deep Q-network (DQN) [Mnih et al. 2015] through the integration of deep learning to RL, leading to deep reinforcement learning (DRL). However, despite the recent improvements of DRL approaches, they turn out to be most of the time unsuccessful when the rewards are scattered in the environment, as the agent is then unable to learn the desired behavior for the targeted task [Francois-Lavet et al. 2018]. Moreover, the behaviors learned by the agent are hardly reusable, both within the same task and across many different tasks [Francois-Lavet et al. 2018]. It is difficult for an agent to generalize the learnt skills to make high-level decisions in the environment. For example, such skill could be go to the door using primitive actions consisting in moving in the four cardinal directions; or even to move forward controlling different joints of a humanoid robot like in the robotic simulator MuJoCo [Todorov et al. 2012]. On another side, unlike RL, developmental learning [Cangelosi and Schlesinger 2018; Oudeyer and Smith 2016; Piaget and Cook 1952] is based on the trend that babies, or more broadly organisms, acquire new skill while spontaneously exploring their environment [Barto 2013; Gopnik et al. 1999].

artificial intelligence, machine learning, reinforcement learning, (13 more...)

doi: 10.3390/e25020327

2209.0889

Country:

Africa > Ethiopia > Addis Ababa > Addis Ababa (0.04)
North America > United States > New York (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(9 more...)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Education (1.00)
Leisure & Entertainment > Games > Computer Games (0.86)

Salahirad, Alireza, Gay, Gregory, Mohammadi, Ehsan

Mapping the Structure and Evolution of Software Testing Research Over the Past Three Decades

Background: The field of software testing is growing and rapidly-evolving. Aims: Based on keywords assigned to publications, we seek to identify predominant research topics and understand how they are connected and have evolved. Method: We apply co-word analysis to map the topology of testing research as a network where author-assigned keywords are connected by edges indicating co-occurrence in publications. Keywords are clustered based on edge density and frequency of connection. We examine the most popular keywords, summarize clusters into high-level research topics, examine how topics connect, and examine how the field is changing. Results: Testing research can be divided into 16 high-level topics and 18 subtopics. Creation guidance, automated test generation, evolution and maintenance, and test oracles have particularly strong connections to other topics, highlighting their multidisciplinary nature. Emerging keywords relate to web and mobile apps, machine learning, energy consumption, automated program repair and test generation, while emerging connections have formed between web apps, test oracles, and machine learning with many topics. Random and requirements-based testing show potential decline. Conclusions: Our observations, advice, and map data offer a deeper understanding of the field and inspiration regarding challenges and connections to explore.

data mining, evolutionary algorithm, machine learning, (20 more...)

2109.04086

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > South Carolina (0.04)
(6 more...)

Genre:

Overview (1.00)
Research Report > New Finding (0.46)

Industry:

Energy (0.87)
Information Technology > Security & Privacy (0.68)

Technology:

Information Technology > Software Engineering (1.00)
Information Technology > Software (1.00)
Information Technology > Data Science > Data Mining (1.00)
(5 more...)

Seo, Sangwon, Unhelkar, Vaibhav V.

Semi-Supervised Imitation Learning of Team Policies from Suboptimal Demonstrations

We present Bayesian Team Imitation Learner (BTIL), an imitation learning algorithm to model the behavior of teams performing sequential tasks in Markovian domains. In contrast to existing multi-agent imitation learning techniques, BTIL explicitly models and infers the time-varying mental states of team members, thereby enabling learning of decentralized team policies from demonstrations of suboptimal teamwork. Further, to allow for sample- and label-efficient policy learning from small datasets, BTIL employs a Bayesian perspective and is capable of learning from semi-supervised demonstrations. We demonstrate and benchmark the performance of BTIL on synthetic multi-agent tasks as well as a novel dataset of human-agent teamwork. Our experiments show that BTIL can successfully learn team policies from demonstrations despite the influence of team members' (time-varying and potentially misaligned) mental states on their behavior.

artificial intelligence, demonstration, machine learning, (18 more...)

doi: 10.24963/ijcai.2022/346

2205.02959

Country:

North America > United States (0.28)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre:

Research Report (1.00)
Overview (0.93)

Industry:

Leisure & Entertainment (0.67)
Health & Medicine (0.46)
Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
(2 more...)

Space-time tradeoffs of lenses and optics via higher category theory

Gavranović, Bruno

Optics and lenses are abstract categorical gadgets that model systems with bidirectional data flow. In this paper we observe that the denotational definition of optics - identifying two optics as equivalent by observing their behaviour from the outside - is not suitable for operational, software oriented approaches where optics are not merely observed, but built with their internal setups in mind. We identify operational differences between denotationally isomorphic categories of cartesian optics and lenses: their different composition rule and corresponding space-time tradeoffs, positioning them at two opposite ends of a spectrum. With these motivations we lift the existing categorical constructions and their relationships to the 2-categorical level, showing that the relevant operational concerns become visible. We define the 2-category $\textbf{2-Optic}(\mathcal{C})$ whose 2-cells explicitly track optics' internal configuration. We show that the 1-category $\textbf{Optic}(\mathcal{C})$ arises by locally quotienting out the connected components of this 2-category. We show that the embedding of lenses into cartesian optics gets weakened from a functor to an oplax functor whose oplaxator now detects the different composition rule. We determine the difficulties in showing this functor forms a part of an adjunction in any of the standard 2-categories. We establish a conjecture that the well-known isomorphism between cartesian lenses and optics arises out of the lax 2-adjunction between their double-categorical counterparts. In addition to presenting new research, this paper is also meant to be an accessible introduction to the topic.

artificial intelligence, category, machine learning, (18 more...)

2209.09351

Genre:

Research Report (0.50)
Overview (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.46)

Evolved Open-Endedness in Cultural Evolution: A New Dimension in Open-Ended Evolution Research

Borg, James M., Buskell, Andrew, Kapitany, Rohan, Powers, Simon T., Reindl, Eva, Tennie, Claudio

The goal of Artificial Life research, as articulated by Chris Langton, is "to contribute to theoretical biology by locating life-as-we-know-it within the larger picture of life-as-it-could-be" (1989, p.1). The study and pursuit of open-ended evolution in artificial evolutionary systems exemplifies this goal. However, open-ended evolution research is hampered by two fundamental issues; the struggle to replicate open-endedness in an artificial evolutionary system, and the fact that we only have one system (genetic evolution) from which to draw inspiration. Here we argue that cultural evolution should be seen not only as another real-world example of an open-ended evolutionary system, but that the unique qualities seen in cultural evolution provide us with a new perspective from which we can assess the fundamental properties of, and ask new questions about, open-ended evolutionary systems, especially in regard to evolved open-endedness and transitions from bounded to unbounded evolution. Here we provide an overview of culture as an evolutionary system, highlight the interesting case of human cultural evolution as an open-ended evolutionary system, and contextualise cultural evolution under the framework of (evolved) open-ended evolution. We go on to provide a set of new questions that can be asked once we consider cultural evolution within the framework of open-ended evolution, and introduce new insights that we may be able to gain about evolved open-endedness as a result of asking these questions.

artificial intelligence, evolutionary algorithm, machine learning, (17 more...)

2203.1305

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.28)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.28)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
(6 more...)

Genre:

Overview (0.68)
Instructional Material (0.46)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.68)
Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)

Müller, Ricardo, Schreyer, Marco, Sattarov, Timur, Borth, Damian

RESHAPE: Explaining Accounting Anomalies in Financial Statement Audits by enhancing SHapley Additive exPlanations

Detecting accounting anomalies is a recurrent challenge in financial statement audits. Recently, novel methods derived from Deep-Learning (DL) have been proposed to audit the large volumes of a statement's underlying accounting records. However, due to their vast number of parameters, such models exhibit the drawback of being inherently opaque. At the same time, the concealing of a model's inner workings often hinders its real-world application. This observation holds particularly true in financial audits since auditors must reasonably explain and justify their audit decisions. Nowadays, various Explainable AI (XAI) techniques have been proposed to address this challenge, e.g., SHapley Additive exPlanations (SHAP). However, in unsupervised DL as often applied in financial audits, these methods explain the model output at the level of encoded variables. As a result, the explanations of Autoencoder Neural Networks (AENNs) are often hard to comprehend by human auditors. To mitigate this drawback, we propose (RESHAPE), which explains the model output on an aggregated attribute-level. In addition, we introduce an evaluation framework to compare the versatility of XAI methods in auditing. Our experimental results show empirical evidence that RESHAPE results in versatile explanations compared to state-of-the-art baselines. We envision such attribute-level explanations as a necessary next step in the adoption of unsupervised DL techniques in financial auditing.

data mining, explanation, machine learning, (18 more...)

2209.09157

Country:

Europe > Switzerland > St. Gallen > St. Gallen (0.04)
North America > United States > Hawaii (0.04)
North America > United States > New York > New York County > New York City (0.04)
(3 more...)

Genre:

Overview (0.94)
Research Report > New Finding (0.34)

Industry: Banking & Finance > Audit (0.75)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

Tsereteli, Tornike, Kartal, Yavuz Selim, Ponzetto, Simone Paolo, Zielinski, Andrea, Eckert, Kai, Mayr, Philipp

Overview of the SV-Ident 2022 Shared Task on Survey Variable Identification in Social Science Publications

In this paper, we provide an overview of the SV-Ident shared task as part of the 3rd Workshop on Scholarly Document Processing (SDP) at COLING 2022. In the shared task, participants were provided with a sentence and a vocabulary of variables, and asked to identify which variables, if any, are mentioned in individual sentences from scholarly documents in full text. Two teams made a total of 9 submissions to the shared task leaderboard. While none of the teams improve on the baseline systems, we still draw insights from their submissions. Furthermore, we provide a detailed evaluation. Data and baselines for our shared task are freely available at https://github.com/vadis-project/sv-ident

artificial intelligence, machine learning, natural language, (20 more...)

2209.09062

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
Europe > Germany > Baden-Württemberg > Stuttgart Region > Stuttgart (0.04)
(5 more...)

Genre:

Overview (0.86)
Research Report > Experimental Study (0.68)
Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Information Management > Search (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.67)