AITopics

Parallel search algorithms have been shown to improve planning speed by harnessing the multithreading capability of modern processors. One such algorithm PA*SE achieves this by parallelizing state expansions, whereas another algorithm ePA*SE achieves this by effectively parallelizing edge evaluations. ePA*SE targets domains in which the action space comprises actions with expensive but similar evaluation times. However, in a number of robotics domains, the action space is heterogenous in the computational effort required to evaluate the cost of an action and its outcome. Motivated by this, we introduce GePA*SE: Generalized Edge-based Parallel A* for Slow Evaluations, which generalizes the key ideas of PA*SE and ePA*SE i.e. parallelization of state expansions and edge evaluations respectively. This extends its applicability to domains that have actions requiring varying computational effort to evaluate them. The open-source code for GePA*SE along with the baselines is available here: https://github.com/shohinm/parallel_search

artificial intelligence, expansion, likhachev, (13 more...)

2301.10347

Country:

North America > United States > Pennsylvania > Centre County > University Park (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.91)

GFlowCausal: Generative Flow Networks for Causal Discovery

Li, Wenqian, Li, Yinchuan, Zhu, Shengyu, Shao, Yunfeng, Hao, Jianye, Pang, Yan

Causal discovery aims to uncover causal structure among a set of variables. Score-based approaches mainly focus on searching for the best Directed Acyclic Graph (DAG) based on a predefined score function. However, most of them are not applicable on a large scale due to the limited searchability. Inspired by the active learning in generative flow networks, we propose a novel approach to learning a DAG from observational data called GFlowCausal. It converts the graph search problem to a generation problem, in which direct edges are added gradually. GFlowCausal aims to learn the best policy to generate high-reward DAGs by sequential actions with probabilities proportional to predefined rewards. We propose a plug-and-play module based on transitive closure to ensure efficient sampling. Theoretical analysis shows that this module could guarantee acyclicity properties effectively and the consistency between final states and fully-connected graphs. We conduct extensive experiments on both synthetic and real datasets, and results show the proposed approach to be superior and also performs well in a large-scale setting.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

2210.08185

Country:

Asia > China > Ningxia Hui Autonomous Region > Yinchuan (0.04)
Asia > Singapore (0.04)

Genre:

Research Report > Promising Solution (0.34)
Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
(3 more...)

DDPNAS: Efficient Neural Architecture Search via Dynamic Distribution Pruning

Zheng, Xiawu, Yang, Chenyi, Zhang, Shaokun, Wang, Yan, Zhang, Baochang, Wu, Yongjian, Wu, Yunsheng, Shao, Ling, Ji, Rongrong

Neural Architecture Search (NAS) has demonstrated state-of-the-art performance on various computer vision tasks. Despite the superior performance achieved, the efficiency and generality of existing methods are highly valued due to their high computational complexity and low generality. In this paper, we propose an efficient and unified NAS framework termed DDPNAS via dynamic distribution pruning, facilitating a theoretical bound on accuracy and efficiency. In particular, we first sample architectures from a joint categorical distribution. Then the search space is dynamically pruned and its distribution is updated every few epochs. With the proposed efficient network generation method, we directly obtain the optimal neural architectures on given constraints, which is practical for on-device models across diverse search spaces and constraints. The architectures searched by our method achieve remarkable top-1 accuracies, 97.56 and 77.2 on CIFAR-10 and ImageNet (mobile settings), respectively, with the fastest search process, i.e., only 1.8 GPU hours on a Tesla V100. Codes for searching and network generation are available at: https://openi.pcl.ac.cn/PCL AutoML/XNAS.

artificial intelligence, deep learning, machine learning, (17 more...)

doi: 10.1007/s11263-023-01753-6

1905.13543

Country:

Asia > China > Fujian Province > Xiamen (0.04)
North America > United States > Washington > King County > Seattle (0.04)
Asia > China > Shanghai > Shanghai (0.04)
Africa > Ethiopia (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Norambuena, Brian Keith, Mitra, Tanushree, North, Chris

A Survey on Event-based News Narrative Extraction

Narratives are fundamental to our understanding of the world, providing us with a natural structure for knowledge representation over time. Computational narrative extraction is a subfield of artificial intelligence that makes heavy use of information retrieval and natural language processing techniques. Despite the importance of computational narrative extraction, relatively little scholarly work exists on synthesizing previous research and strategizing future research in the area. In particular, this article focuses on extracting news narratives from an event-centric perspective. Extracting narratives from news data has multiple applications in understanding the evolving information landscape. This survey presents an extensive study of research in the area of event-based news narrative extraction. In particular, we screened over 900 articles that yielded 54 relevant articles. These articles are synthesized and organized by representation model, extraction criteria, and evaluation approaches. Based on the reviewed studies, we identify recent trends, open challenges, and potential research lines.

information retrieval, machine learning, natural language, (21 more...)

doi: 10.1145/3584741

2302.08351

Country:

North America > United States > Washington > King County > Seattle (0.28)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > New York > New York County > New York City (0.04)
(32 more...)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Media > News (1.00)
Government (1.00)
Law Enforcement & Public Safety (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
(3 more...)

Luo, Zhezheng, Mao, Jiayuan, Wu, Jiajun, Lozano-Pérez, Tomás, Tenenbaum, Joshua B., Kaelbling, Leslie Pack

Learning Rational Subgoals from Demonstrations and Instructions

arXiv.org Artificial IntelligenceMar-9-2023

We present a framework for learning useful subgoals that support efficient long-term planning to achieve novel goals. At the core of our framework is a collection of rational subgoals (RSGs), which are essentially binary classifiers over the environmental states. RSGs can be learned from weakly-annotated data, in the form of unsegmented demonstration trajectories, paired with abstract task descriptions, which are composed of terms initially unknown to the agent (e.g., collect-wood then craft-boat then go-across-river). Our framework also discovers dependencies between RSGs, e.g., the task collect-wood is a helpful subgoal for the task craft-boat. Given a goal description, the learned subgoals and the derived dependencies facilitate off-the-shelf planning algorithms, such as A* and RRT, by setting helpful subgoals as waypoints to the planner, which significantly improves performance-time efficiency.

artificial intelligence, machine learning, reinforcement learning, (19 more...)

2303.05487

Country:

North America > United States > Oregon (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Iowa (0.04)

Genre:

Workflow (0.69)
Research Report (0.63)

Industry: Materials > Metals & Mining (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.69)

Pirinen, Aleksis, Samuelsson, Anton, Backsund, John, Åström, Kalle

Aerial View Localization with Reinforcement Learning: Towards Emulating Search-and-Rescue

arXiv.org Artificial IntelligenceMar-9-2023

Climate-induced disasters are and will continue to be on the rise, and thus search-and-rescue (SAR) operations, where the task is to localize and assist one or several people who are missing, become increasingly relevant. In many cases the rough location may be known and a UAV can be deployed to explore a given, confined area to precisely localize the missing people. Due to time and battery constraints it is often critical that localization is performed as efficiently as possible. In this work we approach this type of problem by abstracting it as an aerial view goal localization task in a framework that emulates a SAR-like setup without requiring access to actual UAVs. In this framework, an agent operates on top of an aerial image (proxy for a search area) and is tasked with localizing a goal that is described in terms of visual cues. To further mimic the situation on an actual UAV, the agent is not able to observe the search area in its entirety, not even at low resolution, and thus it has to operate solely based on partial glimpses when navigating towards the goal. To tackle this task, we propose AiRLoc, a reinforcement learning (RL)-based model that decouples exploration (searching for distant goals) and exploitation (localizing nearby goals). Extensive evaluations show that AiRLoc outperforms heuristic search methods as well as alternative learnable approaches, and that it generalizes across datasets, e.g. to disaster-hit areas without seeing a single disaster scenario during training. We also conduct a proof-of-concept study which indicates that the learnable methods outperform humans on average. Code and models have been made publicly available at https://github.com/aleksispi/airloc.

artificial intelligence, machine learning, reinforcement learning, (19 more...)

2209.03694

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > Massachusetts (0.05)
Asia > Middle East > UAE > Dubai Emirate > Dubai (0.04)
Europe > Sweden (0.04)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.71)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

arXiv.org Artificial IntelligenceMar-9-2023

Planning with Large Language Models for Code Generation

Zhang, Shun, Chen, Zhenfang, Shen, Yikang, Ding, Mingyu, Tenenbaum, Joshua B., Gan, Chuang

Existing large language model-based code generation pipelines typically use beam search or sampling algorithms during the decoding process. Although the programs they generate achieve high token-matching-based scores, they often fail to compile or generate incorrect outputs. The main reason is that conventional Transformer decoding algorithms may not be the best choice for code generation. In this work, we propose a novel Transformer decoding algorithm, Planning-Guided Transformer Decoding (PG-TD), that uses a planning algorithm to do lookahead search and guide the Transformer to generate better programs. Specifically, instead of simply optimizing the likelihood of the generated sequences, the Transformer makes use of a planner to generate candidate programs and test them on public test cases. The Transformer can therefore make more informed decisions and generate tokens that will eventually lead to higher-quality programs. We also design a mechanism that shares information between the Transformer and the planner to make our algorithm computationally efficient. We empirically evaluate our framework with several large language models as backbones on public coding challenge benchmarks, showing that 1) it can generate programs that consistently achieve higher performance compared with competing baseline methods; 2) it enables controllable code generation, such as concise codes and highly-commented codes by optimizing modified objective.

large language model, machine learning, natural language, (22 more...)

2303.0551

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Dominican Republic (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Automatic Programming (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial IntelligenceMar-8-2023

BeamAttack: Generating High-quality Textual Adversarial Examples through Beam Search and Mixed Semantic Spaces

Zhu, Hai, Zhao, Qingyang, Wu, Yuren

Natural language processing models based on neural networks are vulnerable to adversarial examples. These adversarial examples are imperceptible to human readers but can mislead models to make the wrong predictions. In a black-box setting, attacker can fool the model without knowing model's parameters and architecture. Previous works on word-level attacks widely use single semantic space and greedy search as a search strategy. However, these methods fail to balance the attack success rate, quality of adversarial examples and time consumption. In this paper, we propose BeamAttack, a textual attack algorithm that makes use of mixed semantic spaces and improved beam search to craft high-quality adversarial examples. Extensive experiments demonstrate that BeamAttack can improve attack success rate while saving numerous queries and time, e.g., improving at most 7\% attack success rate than greedy search when attacking the examples from MR dataset. Compared with heuristic search, BeamAttack can save at most 85\% model queries and achieve a competitive attack success rate. The adversarial examples crafted by BeamAttack are highly transferable and can effectively improve model's robustness during adversarial training. Code is available at https://github.com/zhuhai-ustc/beamattack/tree/master

adversarial example, machine learning, natural language, (17 more...)

2303.07199

Country:

Asia > China > Guangdong Province > Zhuhai (0.24)
Asia > China > Guangdong Province > Shenzhen (0.05)
Asia > China > Shaanxi Province > Xi'an (0.04)
Asia > China > Anhui Province > Hefei (0.04)

Genre: Research Report (0.64)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)

Ferry, Julien, Laberge, Gabriel, Aïvodji, Ulrich

Learning Hybrid Interpretable Models: Theory, Taxonomy, and Methods

arXiv.org Artificial IntelligenceMar-8-2023

A hybrid model involves the cooperation of an interpretable model and a complex black box. At inference, any input of the hybrid model is assigned to either its interpretable or complex component based on a gating mechanism. The advantages of such models over classical ones are two-fold: 1) They grant users precise control over the level of transparency of the system and 2) They can potentially perform better than a standalone black box since redirecting some of the inputs to an interpretable model implicitly acts as regularization. Still, despite their high potential, hybrid models remain under-studied in the interpretability/explainability literature. In this paper, we remedy this fact by presenting a thorough investigation of such models from three perspectives: Theory, Taxonomy, and Methods. First, we explore the theory behind the generalization of hybrid models from the Probably-Approximately-Correct (PAC) perspective. A consequence of our PAC guarantee is the existence of a sweet spot for the optimal transparency of the system. When such a sweet spot is attained, a hybrid model can potentially perform better than a standalone black box. Secondly, we provide a general taxonomy for the different ways of training hybrid models: the Post-Black-Box and Pre-Black-Box paradigms. These approaches differ in the order in which the interpretable and complex components are trained. We show where the state-of-the-art hybrid models Hybrid-Rule-Set and Companion-Rule-List fall in this taxonomy. Thirdly, we implement the two paradigms in a single method: HybridCORELS, which extends the CORELS algorithm to hybrid modeling. By leveraging CORELS, HybridCORELS provides a certificate of optimality of its interpretable component and precise control over transparency. We finally show empirically that HybridCORELS is competitive with existing hybrid models, and performs just as well as a standalone black box (or even better) while being partly transparent.

data mining, hybrid model, machine learning, (13 more...)

2303.04437

Country:

North America > Canada > Quebec > Montreal (0.04)
Europe > France > Occitanie > Haute-Garonne > Toulouse (0.04)
North America > United States > Texas (0.04)
(2 more...)

Genre: Research Report > New Finding (0.92)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining (0.92)

Beard, Jared J., Butts, R. Michael, Gu, Yu

Feeling Optimistic? Ambiguity Attitudes for Online Decision Making

arXiv.org Artificial IntelligenceMar-7-2023

As autonomous agents enter complex environments, it becomes more difficult to adequately model the interactions between the two. Agents must therefore cope with greater ambiguity (e.g., unknown environments, underdefined models, and vague problem definitions). Despite the consequences of ignoring ambiguity, tools for decision making under ambiguity are understudied. The general approach has been to avoid ambiguity (exploit known information) using robust methods. This work contributes ambiguity attitude graph search (AAGS), generalizing robust methods with ambiguity attitudes--the ability to trade-off between seeking and avoiding ambiguity in the problem. AAGS solves online decision making problems with limited budget to learn about their environment. To evaluate this approach AAGS is tasked with path planning in static and dynamic environments. Results demonstrate that appropriate ambiguity attitudes are dependent on the quality of information from the environment. In relatively certain environments, AAGS can readily exploit information with robust policies. Conversely, model complexity reduces the information conveyed by individual samples; this allows the risks taken by optimistic policies to achieve better performance.

ambiguity, artificial intelligence, machine learning, (18 more...)

2303.04225

Country:

North America > United States > West Virginia > Monongalia County > Morgantown (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.34)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.34)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.34)