Tsetserukou, Dzmitry
SwarmGear: Heterogeneous Swarm of Drones with Reconfigurable Leader Drone and Virtual Impedance Links for Multi-Robot Inspection
Darush, Zhanibek, Martynov, Mikhail, Fedoseev, Aleksey, Shcherbak, Aleksei, Tsetserukou, Dzmitry
Continuous monitoring by drone swarms remains a challenging problem due to limited onboard power and the inability of drones to land on uneven surfaces. Heterogeneous swarms that include both ground and aerial vehicles can support longer inspections and carry more sensors on board. However, their capabilities are limited by the mobility of wheeled and legged robots in cluttered environments. In this paper, we propose a novel concept for autonomous inspection that we call SwarmGear. SwarmGear utilizes a heterogeneous swarm that investigates the environment in a leader-follower formation. The leader drone is able to land on rough terrain and traverse it with four compliant robotic legs, combining the functionalities of an aerial and a mobile robot. To preserve the formation of the swarm during its motion, virtual impedance links were developed between the leader and the follower drones. We experimentally evaluated the accuracy of the hybrid leader drone's ground locomotion. By varying the step parameters, the optimal step configuration was found, and two types of gaits were evaluated. The experiments revealed low cross-track error (mean of 2 cm, maximum of 4.8 cm) and the ability of the leader drone to move with a 190 mm step length and a yaw standard deviation of 3 degrees. Four types of drone formations were considered; the best one was used in the SwarmGear experiments and showed low overall cross-track error for the swarm (mean of 7.9 cm for the type 1 gait and 5.1 cm for the type 2 gait). The proposed system can potentially improve the performance of autonomous swarms in cluttered and unstructured environments by allowing all agents of the swarm to switch between aerial and ground formations to overcome various obstacles and perform missions over a large area.
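Conceptually, each virtual impedance link behaves like a mass-spring-damper that pulls a follower toward a desired offset from the leader. The minimal Python sketch below illustrates that idea; the class name, gains, and offset are illustrative assumptions, not values from the paper.

```python
import numpy as np

class VirtualImpedanceLink:
    """Mass-spring-damper link pulling a follower toward a leader offset."""

    def __init__(self, offset, m=1.0, d=4.0, k=8.0):
        self.offset = np.asarray(offset, dtype=float)  # desired offset from the leader
        self.m, self.d, self.k = m, d, k               # virtual mass, damping, stiffness
        self.vel = np.zeros(3)                         # follower velocity state

    def step(self, leader_pos, follower_pos, dt):
        """Integrate the impedance dynamics one step; return the new follower position."""
        target = np.asarray(leader_pos, dtype=float) + self.offset
        err = target - np.asarray(follower_pos, dtype=float)
        acc = (self.k * err - self.d * self.vel) / self.m   # m*a = k*err - d*v
        self.vel = self.vel + acc * dt
        return np.asarray(follower_pos, dtype=float) + self.vel * dt

link = VirtualImpedanceLink(offset=[-1.0, 0.5, 0.0])
pos = np.array([0.0, 0.0, 1.0])
for step_idx in range(100):                # leader moves slowly along x
    pos = link.step([step_idx * 0.01, 0.0, 1.0], pos, dt=0.05)
```

A stiffer spring (larger k) tightens the formation, while more damping (larger d) smooths the follower's response to abrupt leader motion.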
Multi-sensor large-scale dataset for multi-view 3D reconstruction
Voynov, Oleg, Bobrovskikh, Gleb, Karpyshev, Pavel, Galochkin, Saveliy, Ardelean, Andrei-Timotei, Bozhenko, Arseniy, Karmanova, Ekaterina, Kopanev, Pavel, Labutin-Rymsho, Yaroslav, Rakhimov, Ruslan, Safin, Aleksandr, Serpiva, Valerii, Artemov, Alexey, Burnaev, Evgeny, Tsetserukou, Dzmitry, Zorin, Denis
We present a new multi-sensor dataset for multi-view 3D surface reconstruction. It includes registered RGB and depth data from sensors of different resolutions and modalities: smartphones, Intel RealSense, Microsoft Kinect, industrial cameras, and a structured-light scanner. The scenes are selected to emphasize a diverse set of material properties that are challenging for existing algorithms. We provide around 1.4 million images of 107 different scenes acquired from 100 viewing directions under 14 lighting conditions. We expect our dataset to be useful for the evaluation and training of 3D reconstruction algorithms and for related tasks. The dataset is available at skoltech3d.appliedai.tech.
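As a rough consistency check on the reported size: 107 scenes, 100 viewing directions, and 14 lighting conditions give 149,800 view/lighting combinations, so on the order of nine images per combination across the sensor streams (an assumed per-combination count, not a figure from the paper) lands near the stated 1.4 million:

```python
# Back-of-the-envelope count of the dataset size described above; the
# number of sensor streams is an assumption used only to show the arithmetic.
scenes, views, lights = 107, 100, 14
combinations = scenes * views * lights   # 149,800 view/lighting combinations
streams = 9                              # assumed number of RGB/depth streams
print(combinations * streams)            # 1,348,200 -- close to "around 1.4 million"
```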
Comparison of modern open-source visual SLAM approaches
Sharafutdinov, Dinar, Griguletskii, Mark, Kopanev, Pavel, Kurenkov, Mikhail, Ferrer, Gonzalo, Burkov, Aleksey, Gonnochenko, Aleksei, Tsetserukou, Dzmitry
SLAM is one of the most fundamental areas of research in robotics and computer vision. State-of-the-art solutions have advanced significantly in terms of accuracy and stability. Unfortunately, not all of these approaches are available as open-source, free-to-use solutions. The results of some of them are difficult to reproduce, and there is a lack of comparison on common datasets. In our work, we make a comparative analysis of state-of-the-art open-source methods. We assess the algorithms based on accuracy, computational performance, robustness, and fault tolerance. Moreover, we present a comparison of datasets as well as an analysis of the algorithms from a practical point of view. The findings of the work raise several crucial questions for SLAM researchers.
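Accuracy in such comparisons is commonly reported as absolute trajectory error (ATE). The sketch below computes a simplified ATE RMSE with centroid-only alignment; full evaluations typically solve for the optimal rotation as well (Horn/Umeyama), and the function name and data are illustrative, not the paper's evaluation code.

```python
import numpy as np

def ate_rmse(gt, est):
    """RMSE of translational error after aligning trajectory centroids.

    Simplified: real ATE pipelines also estimate the optimal rigid
    rotation (Horn/Umeyama) before computing the residuals.
    """
    gt, est = np.asarray(gt, float), np.asarray(est, float)
    est_aligned = est - est.mean(axis=0) + gt.mean(axis=0)
    return float(np.sqrt(np.mean(np.sum((gt - est_aligned) ** 2, axis=1))))

gt = [[0, 0, 0], [1, 0, 0], [2, 0, 0]]               # ground-truth positions (m)
est = [[0.1, 0.0, 0], [1.0, 0.1, 0], [2.1, 0.0, 0]]  # estimated positions (m)
print(ate_rmse(gt, est))
```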
RL-Based Guidance in Outpatient Hysteroscopy Training: A Feasibility Study
Poliakov, Vladimir, Niu, Kenan, Poorten, Emmanuel Vander, Tsetserukou, Dzmitry
This work presents an RL-based agent for outpatient hysteroscopy training. Hysteroscopy is a gynecological procedure for examination of the uterine cavity. Recent advances have enabled performing this type of intervention in an outpatient setting without anaesthesia. While beneficial to the patient, this approach introduces new challenges for clinicians, who must take additional measures to maintain patient comfort and prevent tissue damage. Our prior work presented a platform for hysteroscopic training with a focus on the passage of the cervical canal. With this work, we aim to extend the functionality of the platform by designing a subsystem that autonomously performs the task of passing the cervical canal. This feature can later be used as a virtual instructor to provide educational cues for trainees and assess their performance. The developed algorithm is based on the soft actor-critic (SAC) approach to smooth the agent's learning curve and ensure uniform exploration of the workspace. The designed algorithm was tested against the performance of five clinicians. Overall, the algorithm demonstrated high efficiency and reliability, succeeding in 98% of trials and outperforming the expert group in three out of four measured metrics.
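The core of soft actor-critic is an entropy-regularized actor objective: the policy maximizes the critic's value minus a temperature-weighted log-probability, which is what encourages uniform exploration of the workspace. Below is a minimal PyTorch sketch of that actor update; the dimensions, network shapes, and temperature are assumptions for illustration, not the platform's actual code.

```python
import torch
import torch.nn as nn

obs_dim, act_dim, alpha = 8, 2, 0.2   # assumed sizes and entropy temperature

actor = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, 2 * act_dim))
critic = nn.Sequential(nn.Linear(obs_dim + act_dim, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.Adam(actor.parameters(), lr=3e-4)

obs = torch.randn(32, obs_dim)                     # a batch of observations
mu, log_std = actor(obs).chunk(2, dim=-1)          # Gaussian policy parameters
dist = torch.distributions.Normal(mu, log_std.clamp(-5, 2).exp())
raw = dist.rsample()                               # reparameterized sample
action = torch.tanh(raw)                           # squash into action bounds
# log-prob with the change-of-variables correction for the tanh squashing
log_prob = dist.log_prob(raw).sum(-1) - torch.log(1 - action.pow(2) + 1e-6).sum(-1)
q = critic(torch.cat([obs, action], dim=-1)).squeeze(-1)
actor_loss = (alpha * log_prob - q).mean()         # entropy-regularized objective
opt.zero_grad(); actor_loss.backward(); opt.step()
```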
LiePoseNet: Heterogeneous Loss Function Based on Lie Group for Significant Speed-up of PoseNet Training Process
Kurenkov, Mikhail, Kalinov, Ivan, Tsetserukou, Dzmitry
Visual localization is an essential modern technology for robotics and computer vision. Popular approaches for solving this task are image-based methods. Nowadays, these methods suffer from low accuracy and long training times, caused by the lack of rigid-body and projective geometry awareness, landmark symmetry, and the homogeneous error assumption. We propose a heterogeneous loss function based on a concentrated Gaussian distribution on a Lie group to overcome these difficulties.
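The intuition behind such a loss is to measure pose error in the tangent space of the group (via the log map) and to weight the rotational and translational components by separate variances instead of assuming homogeneous error. A minimal numpy sketch of that idea follows; the variances are illustrative placeholders, not the paper's learned covariance.

```python
import numpy as np

def so3_log(R):
    """Rotation matrix -> axis-angle vector (the log map of SO(3))."""
    cos_theta = np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)
    theta = np.arccos(cos_theta)
    if theta < 1e-8:
        return np.zeros(3)
    w = np.array([R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])
    return theta / (2.0 * np.sin(theta)) * w

def pose_loss(R_est, t_est, R_gt, t_gt, var_rot=0.01, var_trans=1.0):
    """Mahalanobis-style pose loss with separate rotation/translation variances."""
    e_rot = so3_log(R_gt.T @ R_est)          # rotational error in the tangent space
    e_trans = t_est - t_gt                   # translational error
    return e_rot @ e_rot / var_rot + e_trans @ e_trans / var_trans

R, t = np.eye(3), np.zeros(3)
print(pose_loss(R, t + 0.1, R, t))           # translation-only error: 0.03
```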
HyperPalm: DNN-based hand gesture recognition interface for intelligent communication with quadruped robot in 3D space
Nazarova, Elena, Babataev, Ildar, Weerakkodi, Nipun, Fedoseev, Aleksey, Tsetserukou, Dzmitry
Nowadays, autonomous mobile robots support people in many areas where human presence is either redundant or too dangerous. They have successfully proven themselves in expeditions, the gas industry, mines, warehouses, etc. However, even legged robots may get stuck in rough terrain, requiring human cognitive abilities to navigate the system. While gamepads and keyboards are convenient for wheeled robot control, a quadruped robot in 3D space can move along all linear coordinates and Euler angles, requiring at least 12 buttons for independent control of its DoF. Therefore, more convenient control interfaces are required. In this paper we present HyperPalm: a novel gesture interface for intuitive human-robot interaction with quadruped robots. Without additional devices, the operator has full position and orientation control of the quadruped robot in 3D space through hand gesture recognition with only 5 gestures and 6-DoF hand motion. The experimental results revealed that the system classifies 5 static gestures with high accuracy (96.5%) and accurately predicts the 6D pose of the hand in three-dimensional space. The root mean square deviation (RMSD) of the absolute linear deviation of the proposed approach is 11.7 mm, almost 50% lower than for the second tested approach; the RMSD of the absolute angular deviation is 2.6 degrees, almost 27% lower than for the second tested approach. Moreover, a user study was conducted to explore users' subjective experience of human-robot interaction through the proposed gesture interface. The participants evaluated their interaction with HyperPalm as intuitive (2.0), not causing frustration (2.63), and requiring low physical demand (2.0).
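For reference, the linear and angular RMSD figures above are root-mean-square Euclidean deviations between predicted and ground-truth hand poses. A minimal sketch of the metric, with illustrative sample values rather than the study's data:

```python
import numpy as np

def rmsd(pred, target):
    """Root-mean-square Euclidean deviation between paired samples."""
    pred, target = np.asarray(pred, float), np.asarray(target, float)
    return float(np.sqrt(np.mean(np.sum((pred - target) ** 2, axis=-1))))

pred_pos = [[0.011, 0.0, 0.0], [0.0, 0.012, 0.0]]   # predicted positions (m)
true_pos = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]       # ground truth (m)
print(rmsd(pred_pos, true_pos) * 1000, "mm")        # compare with the 11.7 mm figure
```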