
Regional Attention Network (RAN) for Head Pose and Fine-grained Gesture Recognition

arXiv.org Artificial Intelligence

Affect is often expressed via non-verbal body language such as actions/gestures, which are vital indicators of human behavior. Recent studies on recognition of fine-grained actions/gestures in monocular images have mainly focused on modeling the spatial configuration of body parts representing body pose, human-object interactions and variations in local appearance. The results show that this is a brittle approach since it relies on accurate body part/object detection. In this work, we argue that there exist local discriminative semantic regions, whose "informativeness" can be evaluated by the attention mechanism for inferring fine-grained gestures/actions. To this end, we propose a novel end-to-end Regional Attention Network (RAN), which is a fully Convolutional Neural Network (CNN) that combines multiple contextual regions through an attention mechanism, focusing on the parts of the images that are most relevant to a given task. Our regions consist of one or more consecutive cells and are adapted from the strategies used in computing the HOG (Histogram of Oriented Gradients) descriptor. The model is extensively evaluated on ten datasets belonging to three different scenarios: 1) head pose recognition, 2) driver state recognition, and 3) human action and facial expression recognition. The proposed approach outperforms the state-of-the-art by a considerable margin in different metrics.
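The idea of building regions from consecutive cells and weighting them by attention can be illustrated with a small sketch. It is not the RAN architecture: it assumes that a region is any axis-aligned block of consecutive grid cells, that a region feature is the mean of its cell features, and that attention scores are supplied by the caller rather than learned. The function names are hypothetical.

```python
import math

def enumerate_regions(rows, cols):
    """List every axis-aligned block of consecutive cells.

    Each region is (top, left, bottom, right) with inclusive bounds.
    Assumption: the abstract does not say which cell groupings are used.
    """
    regions = []
    for top in range(rows):
        for bottom in range(top, rows):
            for left in range(cols):
                for right in range(left, cols):
                    regions.append((top, left, bottom, right))
    return regions

def region_feature(cell_features, region):
    """Average the per-cell feature vectors that fall inside one region."""
    top, left, bottom, right = region
    cells = [cell_features[r][c]
             for r in range(top, bottom + 1)
             for c in range(left, right + 1)]
    dim = len(cells[0])
    return [sum(cell[d] for cell in cells) / len(cells) for d in range(dim)]

def attention_pool(features, scores):
    """Softmax the scores, then return (weights, weighted sum of features)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(features[0])
    pooled = [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]
    return weights, pooled
```

In a trained network the scores would come from a learned scoring layer; here they are plain inputs so that the weighting step can be inspected in isolation.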


Black-box Adversarial Attacks in Autonomous Vehicle Technology

arXiv.org Artificial Intelligence

Despite the high-quality performance of deep neural networks in real-world applications, they are susceptible to the minor perturbations of adversarial attacks, which are mostly undetectable to human vision. The impact of such attacks has become extremely detrimental in autonomous vehicles with real-time "safety" concerns. Black-box adversarial attacks cause drastic misclassification of critical scene elements such as road signs and traffic lights, leading the autonomous vehicle to crash into other vehicles or pedestrians. In this paper, we propose a novel query-based attack method called Modified Simple black-box attack (M-SimBA) to overcome the use of a white-box source in transfer-based attack methods. Also, the issue of late convergence in the Simple black-box attack (SimBA) is addressed by minimizing the loss of the most confused class, which is the incorrect class predicted by the model with the highest probability, instead of trying to maximize the loss of the correct class. We evaluate the performance of the proposed approach on the German Traffic Sign Recognition Benchmark (GTSRB) dataset. We show that the proposed model outperforms existing models such as Transfer-based projected gradient descent (T-PGD) and SimBA in terms of convergence time, flattening the distribution of confused class probability, and producing adversarial samples with the least confidence on the true class.
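The change of objective described above, from lowering the true-class probability to raising the probability of the most confused class, can be sketched as a query-only coordinate search. This is a toy illustration under stated assumptions, not the authors' implementation: `toy_predict` is a hypothetical 3-class linear model standing in for the black box, the most confused class is fixed from the first query, and the step size `eps` and the step budget are arbitrary.

```python
import math
import random

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def toy_predict(x):
    # Hypothetical black-box classifier: only its output probabilities are used.
    weights = [[1.0, 0.5, -0.5, 0.2],
               [0.2, -0.3, 0.8, 0.1],
               [-0.4, 0.6, 0.3, -0.7]]
    return softmax([sum(w * v for w, v in zip(row, x)) for row in weights])

def most_confused_class(probs, true_label):
    # The incorrect class with the highest predicted probability.
    return max((c for c in range(len(probs)) if c != true_label),
               key=lambda c: probs[c])

def m_simba_sketch(predict, x, true_label, eps=0.2, max_steps=50, seed=0):
    rng = random.Random(seed)
    x = list(x)  # work on a copy so the caller's input is untouched
    probs = predict(x)
    target = most_confused_class(probs, true_label)  # assumption: chosen once
    best = probs[target]
    coords = list(range(len(x)))
    rng.shuffle(coords)
    for step in range(max_steps):
        # Stop as soon as the model no longer predicts the true class.
        if max(range(len(probs)), key=lambda c: probs[c]) != true_label:
            break
        i = coords[step % len(coords)]
        for sign in (1, -1):
            candidate = list(x)
            candidate[i] += sign * eps
            candidate_probs = predict(candidate)
            # Keep the step only if the confused class became more likely.
            if candidate_probs[target] > best:
                x, probs, best = candidate, candidate_probs, candidate_probs[target]
                break
    return x, probs, target
```

A SimBA-style variant would differ only in the acceptance test, keeping a step when the true-class probability decreases.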


Heatmap-based Object Detection and Tracking with a Fully Convolutional Neural Network

arXiv.org Artificial Intelligence

The main topic of this paper is a brief overview of the field of Artificial Intelligence. The core of this paper is a practical implementation of an algorithm for object detection and tracking. The ability to detect and track fast-moving objects is crucial for various applications of Artificial Intelligence like autonomous driving, ball tracking in sports, robotics or object counting. As part of this paper the Fully Convolutional Neural Network "CueNet" was developed. It detects and tracks the cueball on a labyrinth game robustly and reliably. While CueNet V1 has a single input image, the approach with CueNet V2 was to take three consecutive 240 x 180-pixel images as an input and transform them into a probability heatmap for the cueball's location. The network was tested with a separate video that contained all sorts of distractions to test its robustness. When confronted with our testing data, CueNet V1 predicted the correct cueball location in 99.6% of all frames, while CueNet V2 had 99.8% accuracy.
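Reading a location out of a probability heatmap and scoring it per frame can be sketched as follows. The abstract does not state how CueNet's heatmap is decoded or how a "correct" frame is defined, so the arg-max decoding and the pixel tolerance below are assumptions, and the function names are hypothetical.

```python
def heatmap_to_location(heatmap):
    """Return (x, y, confidence) of the highest-valued cell.

    `heatmap` is a row-major list of rows; ties keep the first cell found.
    """
    best_x, best_y, best_val = 0, 0, heatmap[0][0]
    for y, row in enumerate(heatmap):
        for x, value in enumerate(row):
            if value > best_val:
                best_x, best_y, best_val = x, y, value
    return best_x, best_y, best_val

def frame_accuracy(predictions, ground_truth, tolerance=0):
    """Fraction of frames whose predicted (x, y) is within `tolerance`
    pixels of the ground truth on both axes (an assumed criterion)."""
    hits = sum(1 for (px, py), (gx, gy) in zip(predictions, ground_truth)
               if abs(px - gx) <= tolerance and abs(py - gy) <= tolerance)
    return hits / len(ground_truth)
```

For a 240 x 180 input, the heatmap would have 180 rows of 240 values if it matches the input resolution, which is also an assumption here.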


Explainable Artificial Intelligence (XAI): An Engineering Perspective

arXiv.org Artificial Intelligence

The remarkable advancements in Deep Learning (DL) algorithms have fueled enthusiasm for using Artificial Intelligence (AI) technologies in almost every domain; however, the opaqueness of these algorithms puts a question mark on their applications in safety-critical systems. In this regard, the 'explainability' dimension is essential not only to explain the inner workings of black-box algorithms, but also to add the accountability and transparency dimensions that are of prime importance for regulators, consumers, and service providers. eXplainable Artificial Intelligence (XAI) is the set of techniques and methods to convert the so-called black-box AI algorithms to white-box algorithms, where the results achieved by these algorithms and the variables, parameters, and steps taken by the algorithm to reach the obtained results are transparent and explainable. To complement the existing literature on XAI, in this paper, we take an 'engineering' approach to illustrate the concepts of XAI. We discuss the stakeholders in XAI and describe the mathematical contours of XAI from an engineering perspective. Then we take the autonomous car as a use-case and discuss the applications of XAI for its different components such as object detection, perception, control, action decision, and so on. This work is an exploratory study to identify new avenues of research in the field of XAI.


Identifying Human Edited Images using a CNN

arXiv.org Artificial Intelligence

Most non-professional photo manipulations are not made using proprietary software like Adobe Photoshop, which is expensive and complicated to use for the average consumer selfie-taker or meme-maker. Instead, these individuals opt for user-friendly mobile applications like Facetune and Pixlr to make human face edits and alterations. Unfortunately, there is no existing dataset to train a model to classify these types of manipulations. In this paper, we present a generative model that approximates the distribution of human face edits and a method for detecting Facetune and Pixlr manipulations to human faces.


Self-Attention Based Context-Aware 3D Object Detection

arXiv.org Artificial Intelligence

Most existing point-cloud based 3D object detectors use convolution-like operators to process information in a local neighbourhood with fixed-weight kernels and aggregate global context hierarchically. However, recent work on non-local neural networks and self-attention for 2D vision has shown that explicitly modeling global context and long-range interactions between positions can lead to more robust and competitive models. In this paper, we explore two variants of self-attention for contextual modeling in 3D object detection by augmenting convolutional features with self-attention features. We first incorporate the pairwise self-attention mechanism into the current state-of-the-art BEV, voxel and point-based detectors and show consistent improvement over strong baseline models while simultaneously significantly reducing their parameter footprint and computational cost. We also propose a self-attention variant that samples a subset of the most representative features by learning deformations over randomly sampled locations. This not only allows us to scale explicit global contextual modeling to larger point-clouds, but also leads to more discriminative and informative feature descriptors. Our method can be flexibly applied to most state-of-the-art detectors with increased accuracy and parameter and compute efficiency. We achieve new state-of-the-art detection performance on KITTI and nuScenes datasets. Code is available at https://github.com/AutoVision-cloud/SA-Det3D.
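Augmenting local features with pairwise self-attention features can be sketched in a few lines. This is a generic scaled dot-product formulation, not the code from the linked repository: the query, key and value projections are assumed to be identity maps, and the local and context features are simply concatenated. The function names are hypothetical.

```python
import math

def softmax(values):
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(features):
    """Scaled dot-product self-attention over a list of feature vectors.

    Assumption: identity query/key/value maps; a real detector would
    learn these projections.
    """
    dim = len(features[0])
    scale = math.sqrt(dim)
    attended, all_weights = [], []
    for query in features:
        # Every position attends to every other position (global context).
        scores = [sum(q * k for q, k in zip(query, key)) / scale
                  for key in features]
        weights = softmax(scores)
        all_weights.append(weights)
        attended.append([sum(w * value[d] for w, value in zip(weights, features))
                         for d in range(dim)])
    return attended, all_weights

def augment_with_context(features):
    """Concatenate each local feature with its self-attention feature."""
    attended, _ = self_attention(features)
    return [local + context for local, context in zip(features, attended)]
```

The pairwise form costs quadratic time in the number of positions, which is the scaling issue the sampled variant in the abstract is meant to address.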


Artificial Intelligence Methods in In-Cabin Use Cases: A Survey

arXiv.org Artificial Intelligence

As interest in autonomous driving increases, efforts are being made to meet requirements for the high-level automation of vehicles. In this context, the functionality inside the vehicle cabin plays a key role in ensuring a safe and pleasant journey for driver and passenger alike. At the same time, recent advances in the field of artificial intelligence (AI) have enabled a whole range of new applications and assistance systems to solve automated problems in the vehicle cabin. This paper presents a thorough survey of existing work that utilizes AI methods for use-cases inside the driving cabin, focusing, in particular, on application scenarios related to (1) driving safety and (2) driving comfort. Results from the surveyed works show that AI technology has a promising future in tackling in-cabin tasks in the context of autonomous driving.


Personal Privacy Protection via Irrelevant Faces Tracking and Pixelation in Video Live Streaming

arXiv.org Artificial Intelligence

To date, pixelation tasks intended for privacy protection are still labor-intensive and have yet to be studied. With the prevalence of video live streaming, establishing an online face pixelation mechanism during streaming is urgent. In this paper, we develop a new method called Face Pixelation in Video Live Streaming (FPVLS) to generate automatic personal privacy filtering during unconstrained streaming activities. Simply applying multi-face trackers encounters problems with target drifting, computing efficiency, and over-pixelation. Therefore, for fast and accurate pixelation of irrelevant people's faces, FPVLS is organized in a frame-to-video structure of two core stages. On individual frames, FPVLS utilizes image-based face detection and embedding networks to yield face vectors. In the raw trajectories generation stage, the proposed Positioned Incremental Affinity Propagation (PIAP) clustering algorithm leverages face vectors and positioned information to quickly associate the same person's faces across frames. Such frame-wise accumulated raw trajectories are likely to be intermittent and unreliable on video level. Hence, we further introduce the trajectory refinement stage that merges a proposal network with the two-sample test based on the Empirical Likelihood Ratio (ELR) statistic to refine the raw trajectories. A Gaussian filter is laid on the refined trajectories for final pixelation. On the video live streaming dataset we collected, FPVLS obtains satisfactory accuracy and real-time efficiency, and contains the over-pixelation problem.
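The final step, laying a Gaussian filter over the face boxes of a refined trajectory, can be sketched for a single grayscale frame. This is a minimal illustration and not the FPVLS pipeline: the box format, the kernel radius and sigma, and the clamped edge handling are all assumptions, and the function names are hypothetical.

```python
import math

def gaussian_kernel(radius, sigma):
    """Normalized 1-D Gaussian kernel of length 2 * radius + 1."""
    raw = [math.exp(-(i * i) / (2.0 * sigma * sigma))
           for i in range(-radius, radius + 1)]
    total = sum(raw)
    return [v / total for v in raw]

def blur_box(image, box, radius=2, sigma=1.0):
    """Blur only the pixels inside box = (top, left, bottom, right), inclusive.

    `image` is a list of rows of floats; a new image is returned.
    """
    top, left, bottom, right = box
    kernel = gaussian_kernel(radius, sigma)
    offsets = range(-radius, radius + 1)
    height, width = len(image), len(image[0])

    def clamp(v, lo, hi):
        return max(lo, min(hi, v))

    # Horizontal pass, reading from the original image (edges are clamped).
    horizontal = [row[:] for row in image]
    for y in range(top, bottom + 1):
        for x in range(left, right + 1):
            horizontal[y][x] = sum(
                k * image[y][clamp(x + off, 0, width - 1)]
                for k, off in zip(kernel, offsets))

    # Vertical pass, reading from the horizontal result.
    output = [row[:] for row in horizontal]
    for y in range(top, bottom + 1):
        for x in range(left, right + 1):
            output[y][x] = sum(
                k * horizontal[clamp(y + off, 0, height - 1)][x]
                for k, off in zip(kernel, offsets))
    return output
```

Applied to a trajectory, the same call would be repeated once per frame with that frame's face box, leaving every pixel outside the box untouched.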


Top 100 Artificial Intelligence Companies in the World

#artificialintelligence

Artificial Intelligence (AI) is not just a buzzword, but a crucial part of the technology landscape. AI is changing every industry and business function, which results in increased interest in its applications, subdomains and related fields. This makes AI companies the top leaders driving the technology shift. AI helps us to optimise and automate crucial business processes, gather essential data and transform the world, one step at a time. From Google and Amazon to Apple and Microsoft, every major tech company is dedicating resources to breakthroughs in artificial intelligence. As big enterprises are busy acquiring or merging with other emerging inventions, small AI companies are also working hard to develop their own intelligent technology and services. By leveraging artificial intelligence, organizations get an innovative edge in the digital age. AI consultancies are also working to provide companies with expertise that can help them grow. In this digital era, AI is also a significant place for investment. AI companies are constantly developing the latest products to provide the simplest solutions. Hence, Analytics Insight brings you the list of top 100 AI companies that are leading the technology drive towards a better tomorrow. AEye develops advanced vision hardware, software, and algorithms that act as the eyes and visual cortex of autonomous vehicles. AEye is an artificial perception pioneer and creator of iDAR, a new form of intelligent data collection that acts as the eyes and visual cortex of autonomous vehicles. Since the demonstration of its solid-state LiDAR scanner in 2013, AEye has pioneered breakthroughs in intelligent sensing. Their mission was to acquire the most information with the fewest ones and zeros. This would allow AEye to drive the automotive industry into the next realm of autonomy. Algorithmia invented the AI Layer.


Amazon Web Services launches new tool to detect bias and blind spots in machine learning

#artificialintelligence

A new feature from Amazon Web Services will alert developers to potential bias in machine learning algorithms, part of a larger effort by the tech industry to keep automated predictions from discriminating against women, people of color and other underrepresented groups. The feature, SageMaker Clarify, was announced at the AWS re:Invent conference Tuesday as a new component of the AWS SageMaker machine learning platform. The technology analyzes the data used to train machine learning models for telltale signs of bias, including data sets that don't accurately reflect the larger population. It also analyzes the machine learning model itself to help ensure the accuracy of the resulting predictions. A 2018 MIT study found that the presence of a disproportionate number of white males in data sets used to train facial recognition algorithms led to a larger number of errors in recognizing women and people of color.
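The two kinds of check described above, one on the training data and one on the model's predictions, can be illustrated with plain Python. This is a generic sketch and does not use or reproduce the SageMaker Clarify API: the function names, the group labels and the choice of measures are all assumptions.

```python
from collections import Counter

def representation_gap(sample_groups, population_shares):
    """Return {group: share in the sample minus expected population share}.

    A positive value means the group is over-represented in the data.
    """
    counts = Counter(sample_groups)
    total = len(sample_groups)
    return {group: counts.get(group, 0) / total - share
            for group, share in population_shares.items()}

def error_rate_by_group(labels, predictions, groups):
    """Return {group: fraction of that group's examples predicted wrongly}."""
    totals, errors = Counter(), Counter()
    for label, prediction, group in zip(labels, predictions, groups):
        totals[group] += 1
        if label != prediction:
            errors[group] += 1
    return {group: errors[group] / totals[group] for group in totals}
```

A large representation gap, or error rates that differ sharply between groups, is the kind of signal such a tool would surface for a developer to investigate.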