 tspn


Towards a Reward-Free Reinforcement Learning Framework for Vehicle Control

arXiv.org Artificial Intelligence

Reinforcement learning plays a crucial role in vehicle control by guiding agents to learn optimal control strategies from designed or learned reward signals. However, in vehicle control applications, rewards typically must be designed manually while accounting for multiple implicit factors, which easily introduces human bias. Although imitation learning methods do not rely on explicit reward signals, they require high-quality expert actions, which are often difficult to acquire. To address these issues, we propose a reward-free reinforcement learning framework (RFRLF). This framework directly learns the target states to optimize agent behavior through a target state prediction network (TSPN) and a reward-free state-guided policy network (RFSGPN), avoiding the dependence on manually designed reward signals. Specifically, the policy network is learned by minimizing the difference between the predicted state and the expert state. Experimental results demonstrate the effectiveness of the proposed RFRLF in controlling vehicle driving, showing its advantages in improving learning efficiency and adapting to reward-free environments.
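The state-matching objective described above can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: a one-dimensional state, the hypothetical `step` dynamics, the hypothetical linear `policy`, and the use of the expert next state directly as the target (standing in for what a target state prediction network would output). It only shows that a policy parameter can be fitted from state pairs alone, with no rewards and no expert actions.

```python
# Toy illustration only: 1-D state, hypothetical dynamics and policy.

def step(state, action):
    """Hypothetical dynamics: the action is simply added to the state."""
    return state + action

def policy(gain, state, target):
    """Hypothetical linear policy: act proportionally to the state error."""
    return gain * (target - state)

def state_matching_loss(gain, transitions):
    """Mean squared gap between the predicted and the expert next state."""
    total = 0.0
    for state, expert_next in transitions:
        predicted_next = step(state, policy(gain, state, expert_next))
        total += (predicted_next - expert_next) ** 2
    return total / len(transitions)

def train(transitions, gain=0.0, lr=0.05, iters=200, eps=1e-5):
    """Gradient descent on the gain with a central finite-difference gradient."""
    for _ in range(iters):
        grad = (state_matching_loss(gain + eps, transitions)
                - state_matching_loss(gain - eps, transitions)) / (2 * eps)
        gain -= lr * grad
    return gain

# (current state, expert next state) pairs; no actions or rewards are stored.
expert_transitions = [(0.0, 1.0), (1.0, 1.5), (2.0, 1.0), (-1.0, 0.5)]
learned_gain = train(expert_transitions)
final_loss = state_matching_loss(learned_gain, expert_transitions)
```

In this toy setting the loss is zero when the gain equals 1, so gradient descent drives `learned_gain` toward that value.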


What and When to Look?: Temporal Span Proposal Network for Video Visual Relation Detection

arXiv.org Artificial Intelligence

Identifying relations between objects is central to understanding a scene. While several works have addressed relation modeling in the image domain, progress in the video domain has been constrained by the challenging dynamics of spatio-temporal interactions (e.g., Between which objects is there an interaction? When do relations begin and end?). To date, two representative approaches have been proposed to tackle Video Visual Relation Detection (VidVRD): segment-based and window-based. We first point out the limitations of these two approaches and then propose the Temporal Span Proposal Network (TSPN), a novel method with two advantages in terms of efficiency and effectiveness. 1) TSPN tells what to look at: it sparsifies the relation search space by scoring the relationness of each object pair (i.e., a confidence score for the existence of a relation between the two objects). 2) TSPN tells when to look: it leverages the full video context to simultaneously predict the temporal spans and categories of all relations. TSPN demonstrates its effectiveness by achieving a new state of the art by a significant margin on two VidVRD benchmarks (ImageNet-VidVRD and VidOR) while also showing lower time complexity than existing methods; in particular, it is twice as efficient as a popular segment-based approach.
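The idea of sparsifying the relation search space by relationness can be sketched as a simple top-k filter over object pairs. This is not the TSPN architecture: the function name `sparsify_pairs`, the object labels, and the scores are all hypothetical, and in the real method the scores would come from a learned model rather than a hand-written table.

```python
from itertools import permutations

def sparsify_pairs(objects, relationness, keep):
    """Keep the `keep` highest-scoring ordered (subject, object) pairs."""
    candidates = list(permutations(objects, 2))
    ranked = sorted(candidates,
                    key=lambda pair: relationness.get(pair, 0.0),
                    reverse=True)
    return ranked[:keep]

# Hypothetical detections and hand-written relationness scores.
objects = ["person", "dog", "frisbee", "tree"]
scores = {
    ("person", "dog"): 0.9,
    ("dog", "frisbee"): 0.8,
    ("person", "frisbee"): 0.6,
    ("tree", "dog"): 0.1,
}

all_pairs = list(permutations(objects, 2))   # 12 ordered pairs
kept = sparsify_pairs(objects, scores, keep=3)
```

Only the kept pairs would be passed on to the more expensive step of predicting temporal spans and relation categories.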


Deep Compression of Sum-Product Networks on Tensor Networks

arXiv.org Machine Learning

Abstract: Sum-product networks (SPNs) represent an emerging class of neural networks with clear probabilistic semantics and superior inference speed over graphical models. This work reveals a strikingly intimate connection between SPNs and tensor networks, thus leading to a highly efficient representation that we call tensor SPNs (tSPNs). For the first time, through mapping an SPN onto a tSPN and employing novel optimization techniques, we demonstrate remarkable parameter compression with negligible loss in accuracy.

Since the inception of sum-product networks (SPNs) [1], a multitude of works have emerged on their structure and weight learning, e.g., [2], [3], [4], as well as on their application in image completion, speech modeling, semantic mapping and robotics, e.g., [5], to name just a few. An SPN exhibits clear semantics of mixtures (sum nodes) and features (product nodes). Compared to other probabilistic graphical models such as Bayesian and Markov networks, whose inference is #P- or NP-hard, an SPN enjoys tractable exact inference, and its learning is relatively simple and fast. On the other hand, there has been an explosion of works on tensors (multilinear operators rooted in physics) [6], including their connection to and use in various engineering fields such as signal processing [7], and lately also in neural networks and machine learning [8], [9], [10].
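The mixture and feature semantics of an SPN, and why exact inference is tractable, can be seen in a very small hand-built example. The sketch below is a generic illustration, not the tSPN construction from the paper; the class names, the two binary variables `A` and `B`, and all weights are assumptions chosen for the example. A single bottom-up pass answers both joint and marginal queries, because a leaf whose variable is unobserved simply evaluates to 1.

```python
class Leaf:
    """Bernoulli leaf over one binary variable."""
    def __init__(self, var, p_true):
        self.var = var
        self.p_true = p_true

    def value(self, evidence):
        x = evidence.get(self.var)
        if x is None:
            return 1.0  # variable marginalized out: the leaf sums to 1
        return self.p_true if x else 1.0 - self.p_true


class Product:
    """Product node: children cover disjoint sets of variables."""
    def __init__(self, children):
        self.children = children

    def value(self, evidence):
        result = 1.0
        for child in self.children:
            result *= child.value(evidence)
        return result


class Sum:
    """Sum node: a mixture whose weights add up to 1."""
    def __init__(self, weighted_children):
        self.weighted_children = weighted_children

    def value(self, evidence):
        return sum(w * child.value(evidence)
                   for w, child in self.weighted_children)


# Hypothetical two-component mixture over binary variables A and B.
spn = Sum([
    (0.3, Product([Leaf("A", 0.8), Leaf("B", 0.1)])),
    (0.7, Product([Leaf("A", 0.2), Leaf("B", 0.6)])),
])

joint = spn.value({"A": True, "B": False})   # P(A=1, B=0)
marginal = spn.value({"A": True})            # P(A=1), B summed out
total = spn.value({})                        # normalization check
```

Each query visits every node once, so the cost is linear in the size of the network.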