
Reviewer

Neural Information Processing Systems

We thank the reviewers for their helpful comments. The code and models will be open-sourced. Along with peak RAM, we report inference FLOPs for all the models. The rationale behind ImageNet-10 can be found in Appendix A.1. Finally, ReNet can use either GRU or LSTM as the RNN unit; we use GRU as it is more efficient.


RENet: Fault-Tolerant Motion Control for Quadruped Robots via Redundant Estimator Networks under Visual Collapse

Zhang, Yueqi, Qian, Quancheng, Hou, Taixian, Zhai, Peng, Wei, Xiaoyi, Hu, Kangmai, Yi, Jiafu, Zhang, Lihua

arXiv.org Artificial Intelligence

Abstract--Vision-based locomotion in outdoor environments presents significant challenges for quadruped robots. Accurate environmental prediction and effective handling of depth sensor noise during real-world deployment remain difficult, severely restricting the outdoor applications of such algorithms. To address these deployment challenges in vision-based motion control, this letter proposes the Redundant Estimator Network (RENet) framework. The framework employs a dual-estimator architecture that ensures robust motion performance while maintaining deployment stability during onboard vision failures. Through online estimator adaptation, our method enables seamless transitions between estimation modules when handling visual perception uncertainties. Experimental validation on a real-world robot demonstrates the framework's effectiveness in complex outdoor environments, showing particular advantages in scenarios with degraded visual perception. This framework demonstrates its potential as a practical solution for reliable robotic deployment in challenging field conditions.

In the field of legged robot control, the state estimator plays a crucial role in environmental perception [1] and maintaining dynamic balance [2]. Learning-based implicit estimators, which are trained via supervised learning approaches, are also widely adopted in robust robot control systems [3], [4].
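The core idea of the dual-estimator design is that a vision-based estimator is used while depth sensing is healthy, and the controller falls back to a proprioception-only estimator when vision collapses. The following is a minimal sketch of that switching logic; the class name, the NaN-pixel validity heuristic, and the dummy estimators are all illustrative assumptions, not the paper's actual method.

```python
import numpy as np

class RedundantEstimator:
    """Toy dual-estimator selector: prefer the vision-based estimate,
    fall back to a blind (proprioception-only) estimate on sensor failure.
    Hypothetical sketch; not the paper's implementation."""

    def __init__(self, vision_estimator, blind_estimator, min_valid_ratio=0.5):
        self.vision_estimator = vision_estimator
        self.blind_estimator = blind_estimator
        self.min_valid_ratio = min_valid_ratio

    def estimate(self, depth_image, proprioception):
        # Treat NaN/inf pixels as dropped depth returns; switch to the
        # proprioception-only estimator when too few pixels are usable.
        valid_ratio = np.mean(np.isfinite(depth_image))
        if valid_ratio >= self.min_valid_ratio:
            return "vision", self.vision_estimator(depth_image, proprioception)
        return "blind", self.blind_estimator(proprioception)

# Dummy callables standing in for the two learned estimation modules.
est = RedundantEstimator(
    vision_estimator=lambda depth, prop: prop + np.nanmean(depth),
    blind_estimator=lambda prop: prop,
)
good_depth = np.ones((4, 4))
bad_depth = np.full((4, 4), np.nan)
print(est.estimate(good_depth, 1.0))  # ('vision', 2.0)
print(est.estimate(bad_depth, 1.0))   # ('blind', 1.0)
```

In the actual framework the transition is handled by online estimator adaptation rather than a hard threshold, but the sketch shows why redundancy keeps the controller supplied with a state estimate when the depth stream degrades.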


DartsReNet: Exploring new RNN cells in ReNet architectures

Moser, Brian, Raue, Federico, Hees, Jörn, Dengel, Andreas

arXiv.org Artificial Intelligence

We present new Recurrent Neural Network (RNN) cells for image classification using a Neural Architecture Search (NAS) approach called DARTS. We are interested in the ReNet architecture, an RNN-based approach presented as an alternative to convolutional and pooling layers. ReNet can be defined using any standard RNN cell, such as LSTM or GRU. One limitation is that standard RNN cells were designed for one-dimensional sequential data, not for two-dimensional inputs such as images. We overcome this limitation by using DARTS to find new cell designs. We compare our results with ReNet using GRU and LSTM cells. The cells we found outperform the standard RNN cells on CIFAR-10 and SVHN. The improvements on SVHN indicate generalizability, as we derived the RNN cell designs from CIFAR-10 without performing a new cell search for SVHN.
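A ReNet layer replaces convolution and pooling by splitting the image into non-overlapping patches and sweeping bidirectional RNNs over them, first horizontally and then vertically. The following is a minimal NumPy sketch of one such layer using a plain tanh RNN cell in place of GRU/LSTM; the function names and random weight initialization are illustrative assumptions, not the paper's code.

```python
import numpy as np

def rnn_sweep(seq, W_x, W_h):
    """Minimal tanh RNN over a sequence of vectors; returns all hidden states."""
    h = np.zeros(W_h.shape[0])
    states = []
    for x in seq:
        h = np.tanh(W_x @ x + W_h @ h)
        states.append(h)
    return np.stack(states)

def renet_layer(img, patch, hidden, rng):
    """One ReNet layer: patchify, bidirectional horizontal sweep, then vertical."""
    H, W, C = img.shape
    gh, gw = H // patch, W // patch
    # Flatten non-overlapping patches into feature vectors: (gh, gw, patch*patch*C).
    feats = img[:gh * patch, :gw * patch]
    feats = feats.reshape(gh, patch, gw, patch, C).transpose(0, 2, 1, 3, 4)
    feats = feats.reshape(gh, gw, patch * patch * C)
    d_in = patch * patch * C

    def weights(d):
        return rng.standard_normal((hidden, d)) * 0.1, rng.standard_normal((hidden, hidden)) * 0.1

    # Horizontal sweep: left-to-right and right-to-left over each row of patches.
    Wx_f, Wh_f = weights(d_in)
    Wx_b, Wh_b = weights(d_in)
    horiz = np.stack([
        np.concatenate([rnn_sweep(row, Wx_f, Wh_f),
                        rnn_sweep(row[::-1], Wx_b, Wh_b)[::-1]], axis=-1)
        for row in feats
    ])
    # Vertical sweep: top-down and bottom-up over each column of the horizontal output.
    Wx_u, Wh_u = weights(2 * hidden)
    Wx_d, Wh_d = weights(2 * hidden)
    vert = np.stack([
        np.concatenate([rnn_sweep(col, Wx_u, Wh_u),
                        rnn_sweep(col[::-1], Wx_d, Wh_d)[::-1]], axis=-1)
        for col in horiz.transpose(1, 0, 2)
    ]).transpose(1, 0, 2)
    return vert  # (gh, gw, 2 * hidden) feature map

rng = np.random.default_rng(0)
out = renet_layer(rng.standard_normal((32, 32, 3)), patch=4, hidden=16, rng=rng)
print(out.shape)  # (8, 8, 32)
```

Swapping `rnn_sweep` for a GRU, LSTM, or a DARTS-discovered cell is exactly the design axis the paper explores: the sweep structure stays fixed while the cell varies.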


Researchers at Udacity develop AI that can generate lecture videos from audio narration

#artificialintelligence

Producing content for Massive Open Online Course (MOOC) platforms like Coursera and EdX might be academically rewarding (and potentially lucrative), but it's time-consuming -- particularly where videos are involved. Professional-level lecture clips require not only a veritable studio's worth of equipment, but significant resources to transfer, edit, and upload footage of each lesson. That's why research scientists formerly at Udacity, an online learning platform with over 150 courses, are investigating a machine learning framework that automatically generates lecture videos from audio narration alone. They claim in a preprint paper ("LumièreNet: Lecture Video Synthesis from Audio") on Arxiv.org that their AI system -- LumièreNet -- can synthesize footage of any length by directly mapping between audio and corresponding visuals. "In current video production pipeline, an AI machinery which semi (or fully) automates lecture video production at scale would be highly valuable to enable agile video content development (rather than reshooting each new video)," wrote the paper's coauthors.