vtt
Exploring environment exploitation for self-reconfiguration in modular robotics
Wyder, Philippe Martin, Li, Haorui, Bae, Andrew, Zhao, Henry, Yim, Mark
Modular robotics research has long been preoccupied with perfecting the modules themselves -- their actuation methods, connectors, controls, communication, and fabrication. This inward focus results, in part, from the complexity of the task and largely confines modular robots to sterile laboratory settings. The latest generation of truss modular robots, such as the Variable Topology Truss and the Truss Link, have begun to focus outward and reveal a key insight: the environment is not just a backdrop; it is a tool. In this work, we shift the paradigm from building better robots to building better robot-environment interactions for modular truss robots. We study how modular robots can effectively exploit their surroundings to achieve faster locomotion, adaptive self-reconfiguration, and complex three-dimensional assembly from simple two-dimensional robot assemblies. By using environment features -- ledges, gaps, and slopes -- we show how the environment can extend the robots' capabilities. Nature has long mastered this principle: organisms not only adapt to their environments but also exploit them to their advantage. Robots must learn to do the same. This study is a step towards modular robotic systems that transcend their limitations by exploiting environmental features.
An economically-consistent discrete choice model with flexible utility specification based on artificial neural networks
Hernandez, Jose Ignacio, Mouter, Niek, van Cranenburgh, Sander
Random utility maximisation (RUM) models are one of the cornerstones of discrete choice modelling. However, specifying the utility function of RUM models is not straightforward and has a considerable impact on the resulting interpretable outcomes and welfare measures. In this paper, we propose a new discrete choice model based on artificial neural networks (ANNs) named "Alternative-Specific and Shared weights Neural Network (ASS-NN)", which provides a further balance between flexible utility approximation from the data and consistency with two assumptions: RUM theory and fungibility of money (i.e., "one euro is one euro"). Therefore, the ASS-NN can derive economically-consistent outcomes, such as marginal utilities or willingness to pay, without explicitly specifying the utility functional form. Using a Monte Carlo experiment and empirical data from the Swissmetro dataset, we show that ASS-NN outperforms (in terms of goodness of fit) conventional multinomial logit (MNL) models under different utility specifications. Furthermore, we show how the ASS-NN is used to derive marginal utilities and willingness to pay measures.
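To make the quantities in this abstract concrete, the following is a minimal sketch of the conventional MNL baseline calculation, not the ASS-NN itself. It assumes a hypothetical linear-in-parameters utility with made-up coefficients (`beta_time`, `beta_cost`) and made-up attribute values; under that assumption the marginal utilities are simply the coefficients, and willingness to pay is their ratio.

```python
import numpy as np

def mnl_probabilities(utilities):
    """MNL choice probabilities: a softmax over the alternatives' utilities."""
    shifted = utilities - np.max(utilities)  # stabilise the exponentials
    exp_u = np.exp(shifted)
    return exp_u / exp_u.sum()

def willingness_to_pay(mu_attribute, mu_cost):
    """WTP for one unit of an attribute: ratio of marginal utilities."""
    return mu_attribute / mu_cost

# Hypothetical utility V = beta_time * time + beta_cost * cost.
# The coefficients are illustrative values, not estimates from any dataset.
beta_time, beta_cost = -0.02, -0.10
times = np.array([60.0, 45.0, 90.0])   # travel time per alternative (minutes)
costs = np.array([10.0, 14.0, 6.0])    # cost per alternative (euros)

V = beta_time * times + beta_cost * costs
probs = mnl_probabilities(V)
value_of_time = willingness_to_pay(beta_time, beta_cost)  # euros per minute
```

With a flexible utility such as a neural network, the marginal utilities would no longer be constants, which is why the abstract stresses deriving them without fixing the functional form in advance.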
Motion Planning for Variable Topology Trusses: Reconfiguration and Locomotion
Liu, Chao, Yu, Sencheng, Yim, Mark
Truss robots are highly redundant parallel robotic systems that can be applied in a variety of scenarios. The variable topology truss (VTT) is a class of modular truss robots. As a self-reconfigurable modular robot, a VTT is composed of many edge modules that can be rearranged into various structures depending on the task. These robots change their shape not only by controlling joint positions, as fixed-morphology robots do, but also by reconfiguring the connectivity between truss members in order to change their topology. The motion planning problem for VTT robots is difficult due to their varying morphology, high dimensionality, the high likelihood of self-collision, and complex motion constraints. In this paper, a new motion planning framework to dramatically alter the structure of a VTT is presented. It can also be used to solve locomotion tasks far more efficiently than previous work. Several test scenarios are used to show its effectiveness. Supplementary materials are available at https://www.modlabupenn.org/vtt-motion-planning/.
Visuo-Tactile Transformers for Manipulation
Chen, Yizhou, Sipos, Andrea, Van der Merwe, Mark, Fazeli, Nima
Learning representations in the joint domain of vision and touch can improve manipulation dexterity, robustness, and sample complexity by exploiting mutual information and complementary cues. Here, we present Visuo-Tactile Transformers (VTTs), a novel multimodal representation learning approach suited for model-based reinforcement learning and planning. Our approach extends the Visual Transformer \cite{dosovitskiy2021image} to handle visuo-tactile feedback. Specifically, VTT uses tactile feedback together with self- and cross-modal attention to build latent heatmap representations that focus attention on important task features in the visual domain. We demonstrate the efficacy of VTT for representation learning with a comparative evaluation against baselines on four simulated robot tasks and one real-world block-pushing task. We conduct an ablation study over the components of VTT to highlight the importance of cross-modality in representation learning.
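As a rough illustration of the cross-modal attention mentioned in this abstract, the sketch below implements generic scaled dot-product attention in which tactile tokens attend over visual tokens. It is not the VTT architecture: the token counts, the embedding dimension, and all names are assumptions, and the learned query/key/value projections of a real transformer are omitted.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys, values):
    """Scaled dot-product attention: each query attends over all keys."""
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)   # (n_queries, n_keys)
    weights = softmax(scores, axis=-1)       # each row sums to 1
    return weights @ values, weights

rng = np.random.default_rng(0)
tactile_tokens = rng.normal(size=(2, 8))   # hypothetical: 2 tactile tokens, dim 8
visual_tokens = rng.normal(size=(16, 8))   # hypothetical: 16 image-patch tokens

# Tactile queries attend over visual keys/values; `attn` holds one weight
# per image patch and can be read as a coarse heatmap over the image.
fused, attn = cross_attention(tactile_tokens, visual_tokens, visual_tokens)
```

Each row of `attn` is a distribution over image patches, which is one simple way to picture how touch could steer attention toward task-relevant regions of the visual input.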
Toward a Human-Level Video Understanding Intelligence
Heo, Yu-Jung, Lee, Minsu, Choi, Seongho, Choi, Woo Suk, Shin, Minjung, Jung, Minjoon, Ryu, Jeh-Kwang, Zhang, Byoung-Tak
We aim to develop an AI agent that can watch video clips and have a conversation with humans about the video story. Developing video understanding intelligence is a significantly challenging task, and evaluation methods for adequately measuring and analyzing the progress of AI agents are lacking as well. In this paper, we propose the Video Turing Test to provide effective and practical assessments of video understanding intelligence as well as human-likeness evaluation of AI agents. We define a general format and procedure of the Video Turing Test and present a case study to confirm the effectiveness and usefulness of the proposed test.
TRECVID 2019: An Evaluation Campaign to Benchmark Video Activity Detection, Video Captioning and Matching, and Video Search & Retrieval
Awad, George, Butt, Asad A., Curtis, Keith, Lee, Yooyoung, Fiscus, Jonathan, Godil, Afzal, Delgado, Andrew, Zhang, Jesse, Godard, Eliot, Diduch, Lukas, Smeaton, Alan F., Graham, Yvette, Kraaij, Wessel, Quenot, Georges
The TREC Video Retrieval Evaluation (TRECVID) 2019 was a TREC-style video analysis and retrieval evaluation, the goal of which remains to promote progress in research and development of content-based exploitation and retrieval of information from digital video via open, metrics-based evaluation. Over the last nineteen years this effort has yielded a better understanding of how systems can effectively accomplish such processing and how one can reliably benchmark their performance. TRECVID has been funded by NIST (National Institute of Standards and Technology) and other US government agencies. In addition, many organizations and individuals worldwide contribute significant time and effort. TRECVID 2019 represented a continuation of four tasks from TRECVID 2018. In total, 27 teams from various research organizations worldwide completed one or more of the following four tasks: (1) Ad-hoc Video Search (AVS), (2) Instance Search (INS), (3) Activities in Extended Video (ActEV), and (4) Video to Text Description (VTT). This paper is an introduction to the evaluation framework, tasks, data, and measures used in the workshop.
VTT founds a subsidiary to offer innovations for autonomous mobility
Finland is a forerunner in autonomous and remotely controlled solutions that are renewing the industrial sector, logistics and the way people move. The use of autonomous systems in various fields of society requires robust investment in competence, innovations and technology. VTT will strengthen the development of autonomous systems by founding a subsidiary, VTT SenseWay Oy, that focuses on such systems. The new company will seek access to the global markets from Turku, where some of the world's leading expertise in autonomous shipping systems can already be found. Autonomous systems are also making a strong entry into other transport and logistics sectors and mobile work machines, and VTT will respond to this demand with its solutions.
Third AI winter is coming unless we change course – VTT
We are heading towards a third AI winter. AI winter refers to a period of time during which companies, researchers, research funders and the general public are disappointed in artificial intelligence and the results achieved through it. This, in turn, leads to the freezing of funding and investments as well as to the stagnation of development. Researchers and experts move on to other technologies or start calling their activities by other names. Disappointment is preceded by a hype period of high expectations.