 Information Fusion


TTCDist: Fast Distance Estimation From an Active Monocular Camera Using Time-to-Contact

arXiv.org Artificial Intelligence

Distance estimation from vision is fundamental for a myriad of robotic applications such as navigation, manipulation, and planning. Inspired by the mammalian visual system, which gazes at specific objects, we develop two novel constraints relating time-to-contact, acceleration, and distance that we call the $\tau$-constraint and $\Phi$-constraint. They allow an active (moving) camera to estimate depth efficiently and accurately while using only a small portion of the image. The constraints are applicable to range sensing, sensor fusion, and visual servoing. We successfully validate the proposed constraints with two experiments. The first applies both constraints in a trajectory estimation task with a monocular camera and an Inertial Measurement Unit (IMU). Our methods achieve 30-70% less average trajectory error while running 25$\times$ and 6.2$\times$ faster than the popular Visual-Inertial Odometry methods VINS-Mono and ROVIO respectively. The second experiment demonstrates that when the constraints are used for feedback with efference copies, the resulting closed-loop system's eigenvalues are invariant to scaling of the applied control signal. We believe these results indicate the potential of the $\tau$- and $\Phi$-constraints as the basis of robust and efficient algorithms for a multitude of robotic applications.


Early or Late Fusion Matters: Efficient RGB-D Fusion in Vision Transformers for 3D Object Recognition

arXiv.org Artificial Intelligence

The Vision Transformer (ViT) architecture has established its place in computer vision literature; however, training ViTs for RGB-D object recognition remains an understudied topic, viewed in recent literature only through the lens of multi-task pretraining in multiple vision modalities. Such approaches are often computationally intensive, relying on the scale of multiple pretraining datasets to align RGB with 3D information. In this work, we propose a simple yet strong recipe for transferring pretrained ViTs in RGB-D domains for 3D object recognition, focusing on fusing RGB and depth representations encoded jointly by the ViT. Compared to previous works in multimodal Transformers, the key challenge here is to use the attested flexibility of ViTs to capture cross-modal interactions at the downstream and not the pretraining stage. We explore which depth representation is better in terms of resulting accuracy and compare early and late fusion techniques for aligning the RGB and depth modalities within the ViT architecture. Experimental results in the Washington RGB-D Objects dataset (ROD) demonstrate that in such RGB -> RGB-D scenarios, late fusion techniques work better than most popularly employed early fusion. With our transfer baseline, fusion ViTs score up to 95.4% top-1 accuracy in ROD, achieving new state-of-the-art results in this benchmark. We further show the benefits of using our multimodal fusion baseline over unimodal feature extractors in a synthetic-to-real visual adaptation as well as in an open-ended lifelong learning scenario in the ROD benchmark, where our model outperforms previous works by a margin of >8%. Finally, we integrate our method with a robot framework and demonstrate how it can serve as a perception utility in an interactive robot learning scenario, both in simulation and with a real robot.


Robust, High-Precision GNSS Carrier-Phase Positioning with Visual-Inertial Fusion

arXiv.org Artificial Intelligence

Robust, high-precision global localization is fundamental to a wide range of outdoor robotics applications. Conventional fusion methods use low-accuracy pseudorange-based GNSS measurements ($\gg 5$ m errors) and can only yield a coarse registration to the global earth-centered-earth-fixed (ECEF) frame. In this paper, we leverage high-precision GNSS carrier-phase positioning and aid it with local visual-inertial odometry (VIO) tracking using an extended Kalman filter (EKF) framework that better resolves the integer ambiguity associated with GNSS carrier-phase measurements. We also propose an algorithm for accurate GNSS-antenna-to-IMU extrinsics calibration to accurately align VIO to the ECEF frame. Together, our system achieves robust global positioning demonstrated by real-world hardware experiments in severely occluded urban canyons, and outperforms the state-of-the-art RTKLIB by a significant margin in terms of integer ambiguity solution fix rate and positioning RMSE accuracy.


Informatica launches AI tool for marketers - AI News

#artificialintelligence

Informatica, an enterprise cloud data management specialist, has launched the industry's only free cloud data loading, integration and ETL/ELT service – Informatica Cloud Data Integration-Free and PayGo. The new offering targets data practitioners and non-technical users such as in marketing, sales, and revenue operations teams to build data pipelines within minutes. For example, it provides operations teams with a fast, free, and frictionless way to load, integrate and analyze high-quality campaign, pipeline, forecast, and revenue data. In addition, data analysts and data engineers benefit from increased productivity and rapid development. This is the second in a series of releases that began with the Informatica Data Loader launch in May 2022.


Autonomous Driving Machine Learning Engineer - Sensor Fusion at Mercedes-Benz R&D North America - Sunnyvale, CA

#artificialintelligence

Find open roles in Artificial Intelligence (AI), Machine Learning (ML), Natural Language Processing (NLP), Computer Vision (CV), Data Engineering, Data Analytics, Big Data, and Data Science in general, filtered by job title or popular skill, toolset and products used.


From Data Extraction to Transformation: Creating an ELT Pipeline with Python

#artificialintelligence

Extracting and transforming data is a crucial task in the field of data analytics and data science. The process of extracting data from various sources, transforming it to fit specific business requirements, and loading it into a data warehouse or data lake is commonly known as ETL (Extract, Transform, Load). However, in recent years, a new approach called ELT (Extract, Load, Transform) has emerged, which emphasizes loading data into a target data store before transforming it. In this tutorial, we will walk you through the process of creating an ELT pipeline using Python. The first step is to set up the development environment and install the required dependencies.
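The extract, load, then transform ordering described above can be illustrated with a minimal sketch. This is not the tutorial's own code: it assumes an in-memory SQLite database as the target store, and the sample CSV, the table names (`raw_orders`, `customer_totals`), and the function names are all hypothetical. It uses only the Python standard library.

```python
import csv
import io
import sqlite3

# Hypothetical sample data standing in for a real source system.
RAW_CSV = """order_id,customer,amount
1,alice,10.50
2,bob,4.25
3,alice,7.00
"""


def extract(text):
    """Extract: parse the raw CSV rows as-is, without cleaning or typing them."""
    return list(csv.DictReader(io.StringIO(text)))


def load(conn, rows):
    """Load: copy the untransformed rows into a raw staging table."""
    conn.execute(
        "CREATE TABLE raw_orders (order_id TEXT, customer TEXT, amount TEXT)"
    )
    conn.executemany(
        "INSERT INTO raw_orders VALUES (:order_id, :customer, :amount)", rows
    )


def transform(conn):
    """Transform: build a typed, aggregated table inside the target store."""
    conn.execute(
        """
        CREATE TABLE customer_totals AS
        SELECT customer, SUM(CAST(amount AS REAL)) AS total
        FROM raw_orders
        GROUP BY customer
        """
    )


def run_pipeline():
    """Run extract -> load -> transform and return the transformed rows."""
    conn = sqlite3.connect(":memory:")  # assumption: SQLite stands in for a warehouse
    load(conn, extract(RAW_CSV))
    transform(conn)
    rows = conn.execute(
        "SELECT customer, total FROM customer_totals ORDER BY customer"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(run_pipeline())
```

The point of the ordering is that the raw table keeps the data exactly as extracted, so the transformation is an SQL step run in the target store and can be revised later without re-extracting from the source.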


Integration remains key challenge for digital transformation

#artificialintelligence

It's a business pain point most know only too well, and new research confirms that integration challenges are not just a pain, they're slowing companies' digital ambitions and causing infrastructure issues and risks. MuleSoft's eighth annual Connectivity Benchmark Report shows the number of applications in Australian organisations (sorry, New Zealand, there are no Kiwi results in this one) has increased nearly 10 percent in the past year, to 1,032, highlighting the complexity of the digital landscape. But 68 percent of those applications are not integrated with other applications used by the business, creating data silos and the flow-on effects, including increased costs, duplicated work, productivity bottlenecks and disconnected experiences. It's a situation that's proving costly – not just in terms of money spent building custom integrations (read on for those eye-watering figures) but also in the slowing of digital transformation efforts – something 84 percent of Australians said was happening, causing infrastructure and major risks as IT budgets come under increased scrutiny. And the cost of failing to complete digital transformation initiatives successfully?


Streamlining Multimodal Data Fusion in Wireless Communication and Sensor Networks

arXiv.org Artificial Intelligence

This paper presents a novel approach for multi-modal data fusion based on the Vector-Quantized Variational Autoencoder (VQVAE) architecture. The proposed method is simple yet effective in achieving excellent reconstruction performance on paired MNIST-SVHN data and WiFi spectrogram data. Additionally, the multimodal VQVAE model is extended to the 5G communication scenario, where an end-to-end Channel State Information (CSI) feedback system is implemented to compress data transmitted between the base-station (eNodeB) and User Equipment (UE), without significant loss of performance. The proposed model learns a discriminative compressed feature space for various types of input data (CSI, spectrograms, natural images, etc.), making it a suitable solution for applications with limited computational resources. Multimodal fusion is an important aspect of modern artificial intelligence and machine learning systems. It is a process of combining data from multiple sensors to create a comprehensive understanding of the environment. In various applications, such as robotics, autonomous vehicles, and Internet of Things (IoT), multiple sensors are used to capture information from the environment, including vision, audio, lidar, radar, sonar, GPS and more. By combining this data, a more accurate and robust representation of the environment can be created. Multimodal sensor fusion is important because it helps to overcome the limitations of individual sensors and allows for more reliable and robust decision-making. However, compression of multimodal data is also needed for increasing efficiency, decreasing the cost of storage and transmission, and facilitating real-time processing of substantial datasets in a variety of applications. For example, in 5G networks, Channel State Information (CSI) feedback plays a critical role in the communication system.


Senior Associate Data Engineering L2 DE-Big Data AWS at Publicis Groupe - India, India

#artificialintelligence

Publicis Sapient is a digital transformation partner helping established organizations get to their future, digitally-enabled state, both in the way they work and the way they serve their customers. We help unlock value through a start-up mindset and modern methods, fusing strategy, consulting and customer experience with agile engineering and problem-solving creativity. United by our core values and our purpose of helping people thrive in the brave pursuit of next, our 20,000 people in 53 offices around the world combine experience across technology, data sciences, consulting and customer obsession to accelerate our clients' businesses through designing the products and services their customers truly value. As Senior Associate L2 in Data Engineering, you will translate client requirements into technical design, and implement components for data engineering solutions.


Senior Data Integrations Manager at BigCommerce - Mexico - Remote

#artificialintelligence

BigCommerce's mission is to help merchants sell more at every stage of growth, from small startups, to mid-market businesses to large enterprises. We focus on being the best ecommerce platform so our customers can focus on what matters most: growing their businesses. We are equally passionate about growing our employees' careers and providing them an incredible experience as we rapidly expand across the globe. We are proud to have been recognized numerous times for our product and workplace culture. We empower our people and customers to build, innovate and grow, so together we can redefine the ecommerce industry.