Sensing and Signal Processing

Vacos Cam review: This promising security camera is handcuffed to a mess of an app


Battery-powered security cameras are a great option for outdoor use, because they remove the logistical hassle of finding a convenient electrical outlet to power them. But their easier installation comes at a cost: they tend to be priced higher than their AC-powered counterparts. The $139 Vacos Cam would seem to offer the best of both worlds, then: supremely flexible and modestly priced. Unfortunately, testing revealed this camera to be far from a polished product. While its video quality and smart motion detection are solid, its half-baked app makes the camera virtually unusable. The camera is the latest to crib its look from the Arlo line of home security cameras, in this case the Arlo Go (except that that camera connects to the internet via an onboard LTE radio).

Sony's new Bravia XR TVs are all about 'cognitive intelligence'


Image processing has always been at the heart of Sony's TV designs. Sure, its premium Bravia TVs have typically featured the latest and greatest display hardware around, but it's the company's devotion to image quality that has set it apart from competitors. This year, Sony is doubling down on that reputation with the Cognitive Processor XR, a new image processor that will focus on bringing "cognitive intelligence" to its upcoming Bravia XR LED and OLED TVs. I know, that sounds like a marketing buzzword, but it describes a new approach to image processing for Sony. Its previous chips used artificial intelligence to optimize individual elements of the picture, such as brightness, contrast, and color.

Can the Government Regulate Deepfakes? WSJD - Technology

Last month, the British television network Channel 4 broadcast an "alternative Christmas address" by Queen Elizabeth II, in which the 94-year-old monarch was shown cracking jokes and performing a dance popular on TikTok. Of course, it wasn't real: the video was produced as a warning about deepfakes, that is, apparently real images or videos that show people doing or saying things they never did or said. If an image of a person can be found, new technologies using artificial intelligence and machine learning now make it possible to show that person doing almost anything at all. The dangers of the technology are clear: a high-school teacher could be shown in a compromising situation with a student; a neighbor could be depicted as a terrorist. Can deepfakes, as such, be prohibited under American law?

TikTok's new confetti effect uses iPhone 12 Pro's LiDAR sensor


One of the most prominently advertised new features of the iPhone 12 Pro was its LiDAR sensor, which enables some cool augmented reality effects thanks to its ability to position objects precisely in 3D space. The only problem is that actual use cases for the feature are still few and far between. TikTok has announced a new AR effect, its first to take advantage of the LiDAR sensor (meaning you have to have the iPhone 12 Pro or the iPhone 12 Pro Max to use it). The effect displays a virtual Times Square-like disco ball above a person's head; when the counter on the ball reaches zero, it explodes and a "2021" sign pops up while the person (along with any furniture in the frame) is covered in virtual confetti. As TikTok put it: "To ring in 2021 we released our first AR effect on the new iPhone 12 Pro, using LiDAR technology which allows us to create effects that interact with your environment - visually bridging the digital and physical worlds."

Wake up! This device targets tired truckers


Trucking will one day be fully autonomous. In the meantime, big trucks cause a lot of accidents. According to a recent report by the Federal Motor Carrier Safety Administration (FMCSA), 2018 saw roughly 500,000 accidents attributed to large trucks (over 10,000 lb) in the US alone, while the rate of trucking-related fatalities rose to its highest in 30 years. Joe Exotic may have had a high-stakes job, but trucking tops the charts as the most dangerous job in the US, and those rising fatality numbers underscore the urgency of its safety problems.

Hot papers on arXiv from the past month – December 2020


Here are the most tweeted papers that were uploaded to arXiv during December 2020. Results are powered by Arxiv Sanity Preserver. Abstract: Self-attention networks have revolutionized natural language processing and are making impressive strides in image analysis tasks such as image classification and object detection. Inspired by this success, we investigate the application of self-attention networks to 3D point cloud processing. We design self-attention layers for point clouds and use these to construct self-attention networks for tasks such as semantic scene segmentation, object part segmentation, and object classification.
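The core operation this abstract describes can be illustrated with a toy self-attention layer over a point set. This is only a sketch in plain NumPy, not the paper's architecture: the projection weights are random placeholders, and the relative-position term is a simple squared-distance penalty standing in for a learned positional encoding.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def point_self_attention(feats, pos, dim=8, seed=0):
    """Toy self-attention over a point cloud: every point attends to all
    others, with a positional term added to the attention logits.
    `feats` is (N, C) per-point features, `pos` is (N, 3) coordinates."""
    rng = np.random.default_rng(seed)
    n, c = feats.shape
    # Random projection weights stand in for learned parameters.
    Wq, Wk, Wv = (rng.standard_normal((c, dim)) * 0.1 for _ in range(3))
    q, k, v = feats @ Wq, feats @ Wk, feats @ Wv
    # Relative-position bias: negative squared distance between points,
    # so nearby points receive larger attention logits.
    d2 = ((pos[:, None, :] - pos[None, :, :]) ** 2).sum(-1)
    logits = (q @ k.T) / np.sqrt(dim) - d2
    attn = softmax(logits, axis=-1)   # rows sum to 1 over all points
    return attn @ v                   # (N, dim) attended features

pts = np.random.default_rng(1).standard_normal((16, 3))
feats = np.random.default_rng(2).standard_normal((16, 6))
out = point_self_attention(feats, pts)
```

Stacking such layers, with pooling between them, is the general recipe for building the segmentation and classification networks the abstract mentions.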

CLAIRE COVID-19 taskforce webinar


As part of its second anniversary activities, CLAIRE hosted a webinar presenting the progress and future plans of its COVID-19 taskforce. Entitled "CLAIRE taskforce for AI and COVID-19: results and next steps", the webinar was held on 15 July 2020, focusing on the three-month research outcomes in the areas of AI for bioinformatics, drug repurposing, and medical image analysis. "When the pandemic hit Europe, we immediately thought that we have to do something to support the European government and health institutions, with CLAIRE being the biggest community of AI experts in the world," said Emanuela Girardi, co-coordinator of the CLAIRE COVID-19 taskforce, in her introductory note during the event. Following the launch of CLAIRE's COVID-19 taskforce on 20 March 2020, more than 150 AI researchers throughout Europe collected and curated resources aimed at leveraging AI techniques in the context of COVID-19 and at supporting the development of new projects in several application areas. Under this taskforce, seven major groups were formed, working on mobility and monitoring data analysis, bioinformatics, medical image analysis, social dynamics and networks monitoring, robotics, and scheduling & resource management.

Worldsheet: Wrapping the World in a 3D Sheet for View Synthesis from a Single Image Machine Learning

We present Worldsheet, a method for novel view synthesis using just a single RGB image as input. This is a challenging problem as it requires an understanding of the 3D geometry of the scene as well as texture mapping to generate both visible and occluded regions from new viewpoints. Our main insight is that simply shrink-wrapping a planar mesh sheet onto the input image, consistent with the learned intermediate depth, captures underlying geometry sufficient to generate photorealistic unseen views with arbitrarily large viewpoint changes. To operationalize this, we propose a novel differentiable texture sampler that allows our wrapped mesh sheet to be textured, which is then transformed into a target image via differentiable rendering. Our approach is category-agnostic, end-to-end trainable without any 3D supervision, and requires a single image at test time. Worldsheet consistently outperforms prior state-of-the-art methods on single-image view synthesis across several datasets. Furthermore, this simple idea captures novel views surprisingly well on a wide range of high-resolution in-the-wild images, converting them into a navigable 3D pop-up. Video results and code at
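The differentiable texture sampler at the heart of such a pipeline reduces to interpolating a texture at continuous coordinates, so that the sampled color is a smooth function of position and gradients can flow back through the sampling step. A minimal sketch of that idea (bilinear interpolation in NumPy, forward pass only; the function name and shapes are illustrative, not the paper's API):

```python
import numpy as np

def bilinear_sample(tex, uv):
    """Sample texture `tex` (H, W, C) at continuous coordinates `uv`
    (N, 2) in [0, 1]^2 using bilinear interpolation. Because the result
    is a smooth function of `uv`, an autodiff framework could propagate
    gradients through this step (here plain NumPy, forward only)."""
    h, w, _ = tex.shape
    x = uv[:, 0] * (w - 1)
    y = uv[:, 1] * (h - 1)
    x0, y0 = np.floor(x).astype(int), np.floor(y).astype(int)
    x1, y1 = np.clip(x0 + 1, 0, w - 1), np.clip(y0 + 1, 0, h - 1)
    fx, fy = (x - x0)[:, None], (y - y0)[:, None]
    # Blend the four neighbouring texels by their fractional offsets.
    return (tex[y0, x0] * (1 - fx) * (1 - fy) + tex[y0, x1] * fx * (1 - fy)
            + tex[y1, x0] * (1 - fx) * fy + tex[y1, x1] * fx * fy)

tex = np.arange(16, dtype=float).reshape(4, 4, 1)   # tiny 4x4 gray texture
uv = np.array([[0.0, 0.0], [1.0, 1.0], [0.5, 0.5]])
vals = bilinear_sample(tex, uv)
```

In frameworks such as PyTorch, `torch.nn.functional.grid_sample` provides this operation with gradients built in.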

Unsupervised Image Segmentation using Mutual Mean-Teaching Artificial Intelligence

Unsupervised image segmentation aims to assign pixels with similar features to the same cluster without annotation, an important task in computer vision. Due to the lack of prior knowledge, most existing models usually need to be trained several times to obtain suitable results. To address this problem, we propose an unsupervised image segmentation model based on the Mutual Mean-Teaching (MMT) framework that produces more stable results. In addition, since the pixel labels from the two models are not matched, a label alignment algorithm based on the Hungarian algorithm is proposed to match the cluster labels. Experimental results demonstrate that the proposed model is able to segment various types of images and achieves better performance than existing methods.
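The label-alignment step can be made concrete: build an overlap matrix between the two models' cluster ids, then pick the relabeling that maximizes agreement. The brute-force permutation search below is a stand-in for the Hungarian algorithm (the function and example data are illustrative, not the paper's code):

```python
from itertools import permutations

def align_labels(labels_a, labels_b, k):
    """Relabel model B's cluster ids so they agree with model A's as much
    as possible. Brute force over permutations stands in for the Hungarian
    algorithm here; for larger k, run scipy.optimize.linear_sum_assignment
    on the (negated) overlap matrix instead."""
    # overlap[i][j] = number of pixels labelled i by model A and j by model B
    overlap = [[0] * k for _ in range(k)]
    for a, b in zip(labels_a, labels_b):
        overlap[a][b] += 1
    # Choose perm so that mapping B-label j -> A-label perm[j] maximizes
    # the number of agreeing pixels.
    best_perm = max(
        permutations(range(k)),
        key=lambda perm: sum(overlap[perm[j]][j] for j in range(k)),
    )
    return [best_perm[b] for b in labels_b]

a = [0, 0, 1, 1, 2, 2]
b = [2, 2, 0, 0, 1, 1]   # same segmentation, different cluster ids
aligned = align_labels(a, b, k=3)   # matches `a` after relabeling
```

With the labels aligned, the two models' predictions can supervise each other, which is the essence of the mutual mean-teaching setup.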

Sequential Attacks on Kalman Filter-based Forward Collision Warning Systems Artificial Intelligence

Kalman Filter (KF) is widely used in various domains to perform sequential learning or variable estimation. In the context of autonomous vehicles, KF constitutes the core component of many Advanced Driver Assistance Systems (ADAS), such as Forward Collision Warning (FCW). It tracks the states (distance, velocity, etc.) of relevant traffic objects based on sensor measurements. The tracking output of KF is often fed into downstream logic to produce alerts, which will then be used by human drivers to make driving decisions in near-collision scenarios. In this paper, we study adversarial attacks on KF as part of the more complex machine-human hybrid system of Forward Collision Warning. Our attack goal is to negatively affect human braking decisions by causing KF to output incorrect state estimations that lead to false or delayed alerts. We accomplish this by sequentially manipulating measurements fed into the KF, and propose a novel Model Predictive Control (MPC) approach to compute the optimal manipulation. Via experiments conducted in a simulated driving environment, we show that the attacker is able to successfully change FCW alert signals through planned manipulation over measurements prior to the desired target time. These results demonstrate that our attack can stealthily mislead a distracted human driver and cause vehicle collisions.
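To make the attack surface concrete, here is a minimal constant-velocity Kalman filter tracking distance and velocity from range measurements, plus a toy "attacker" that adds a slowly growing bias to the measurement stream so the filter believes the gap to the lead vehicle is larger than it is. This is an illustrative sketch under made-up parameters, not the paper's MPC-based attack:

```python
import numpy as np

def kalman_track(zs, dt=0.1, q=1e-3, r=0.5):
    """Minimal constant-velocity Kalman filter over scalar distance
    measurements `zs`; state is [distance, velocity]. Returns the
    sequence of state estimates after each update."""
    F = np.array([[1.0, dt], [0.0, 1.0]])   # state transition
    H = np.array([[1.0, 0.0]])              # we observe distance only
    Q = q * np.eye(2)                       # process noise covariance
    R = np.array([[r]])                     # measurement noise covariance
    x = np.array([[zs[0]], [0.0]])          # initial state
    P = np.eye(2)
    out = []
    for z in zs[1:]:
        x = F @ x                            # predict
        P = F @ P @ F.T + Q
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)       # Kalman gain
        x = x + K @ (np.array([[z]]) - H @ x)  # update with measurement
        P = (np.eye(2) - K @ H) @ P
        out.append(x.ravel().copy())
    return np.array(out)

# A lead vehicle closing at 2 m/s from 50 m away; the attacker injects
# a bias that grows by 0.2 m per second ("sequential manipulation").
t = np.arange(0, 5, 0.1)
clean = 50.0 - 2.0 * t
spoofed = clean + 0.2 * t
est_clean = kalman_track(clean)
est_spoofed = kalman_track(spoofed)
```

Under the spoofed stream the filter's distance estimate ends up larger than under the clean one, which is exactly the kind of shifted state estimate that could delay a downstream FCW alert.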