Artificial intelligence


Deep learning[133] uses several layers of neurons between the network's inputs and outputs. The multiple layers can progressively extract higher-level features from the raw input. For example, in image processing, lower layers may identify edges, while higher layers may identify the concepts relevant to a human such as digits or letters or faces.[134] Deep learning has drastically improved the performance of programs in many important subfields of artificial intelligence, including computer vision, speech recognition, image classification[135] and others. Deep learning often uses convolutional neural networks for many or all of its layers.
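The idea that a lower layer can "identify edges" comes down to convolving the input with small filters. As a rough illustration (not taken from the article), here is a toy convolution of a synthetic image with a Sobel-style vertical-edge filter, the kind of feature detector early convolutional layers often learn; the image and filter values are made up for the example:

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D cross-correlation of a grayscale image with a kernel."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A vertical step edge: dark left half, bright right half.
img = np.zeros((5, 6))
img[:, 3:] = 1.0

# Sobel-style vertical edge filter.
sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)

edges = conv2d(img, sobel_x)
print(edges)  # strongest responses line up with the brightness step
```

In a trained network these filter weights are learned rather than hand-written, and later layers combine such edge responses into progressively higher-level features.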

Using Makeup to Block Surveillance

Communications of the ACM

Anti-surveillance makeup, used to fool facial recognition systems by people who do not want to be identified, is bold and striking, not exactly cloak-and-dagger stuff. While experts' opinions vary on the makeup's effectiveness at evading detection, they agree that its use is not yet widespread. Anti-surveillance makeup works against machine learning and deep learning models by using highly contrasted markings to "break up the symmetry of a typical human face," says John Magee, an associate professor of computer science at Clark University in Worcester, MA, who specializes in computer vision research. However, Magee adds, "If you go out [wearing] that makeup, you're going to draw attention to yourself." The makeup's effectiveness has been debated in the context of racial justice protesters who do not want to be tracked, Magee notes.

Optical Character Recognition using PaddleOCR


Reading huge documents can be very tiring and time-consuming. You have probably seen software or applications where you simply take a picture of a document and get its key information back. This is done by a technique called Optical Character Recognition (OCR), one of the key research areas in AI in recent years. OCR recognizes and converts text in an image into a machine-readable format by analyzing and understanding its underlying patterns, and it can handle handwritten text, printed text, and text "in the wild". In short, OCR enables computers to read. But how does OCR work? OCR makes use of deep learning and computer vision techniques. This blog post will focus on implementing and comparing various OCR algorithms provided by PaddleOCR using just a few lines of code.
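The "few lines of code" claim is plausible: PaddleOCR's classic API pairs each detected text box with a (text, confidence) tuple. The call below is shown in comments because it needs the paddleocr package and a real image; the file name and the sample result are hypothetical, made up to mimic the library's output format:

```python
# Typical PaddleOCR usage (requires `pip install paddleocr paddlepaddle`):
#
#   from paddleocr import PaddleOCR
#   ocr = PaddleOCR(use_angle_cls=True, lang='en')  # loads detection + recognition models
#   result = ocr.ocr('document.jpg', cls=True)      # 'document.jpg' is a placeholder path
#
# Each entry in result[0] pairs a quadrilateral text box with a
# (text, confidence) tuple. A small helper keeps only confident lines:

def extract_text(page_result, min_conf=0.6):
    """Return recognized strings whose confidence meets the threshold."""
    return [text for _box, (text, conf) in page_result if conf >= min_conf]

# Hypothetical sample mimicking PaddleOCR's nested output format.
sample = [
    [[[10, 10], [210, 10], [210, 40], [10, 40]], ("Invoice No. 42", 0.97)],
    [[[10, 60], [180, 60], [180, 90], [10, 90]], ("Tota1 $13.50", 0.41)],  # low-confidence misread
]
print(extract_text(sample))  # only the confident line survives
```

Filtering on confidence like this is a common post-processing step, since recognition models happily emit garbled strings with low scores.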

Guide to Panoptic Segmentation - A Semantic + Instance Segmentation Approach


Panoptic segmentation is an image segmentation method used in computer vision tasks. It unifies two distinct approaches to segmenting images: semantic segmentation and instance segmentation. The panoptic segmentation technique was introduced (version v3, April 2019) by Kaiming He, Ross Girshick, and Piotr Dollár of Facebook AI Research (FAIR), Carsten Rother of HCI/IWR at Heidelberg University (Germany), and Alexander Kirillov, a member of both of the above-mentioned organizations. To have clarity about panoptic segmentation, let us first understand the semantic segmentation and instance segmentation approaches. A computer vision project aims at developing a deep learning model that can accurately and precisely detect the real-world objects in input images or videos.
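The unification can be sketched concretely: semantic segmentation assigns every pixel a class, instance segmentation separates objects of the same class, and a panoptic output gives each pixel a (class_id, instance_id) pair. The tiny hand-made arrays below are an illustration of that output format only, not the paper's algorithm:

```python
import numpy as np

# Toy 4x4 scene: semantic map labels every pixel (0 = sky, 1 = road, 2 = car).
semantic = np.array([[0, 0, 0, 0],
                     [1, 1, 1, 1],
                     [1, 2, 1, 2],
                     [1, 2, 1, 2]])

# Instance masks separate the two cars that share semantic class 2.
car_a = (semantic == 2) & (np.arange(4)[None, :] < 2)   # left car
car_b = (semantic == 2) & (np.arange(4)[None, :] >= 2)  # right car

# Panoptic output: each pixel gets (class_id, instance_id); "stuff"
# classes like sky and road keep instance_id 0.
panoptic = np.stack([semantic, np.zeros_like(semantic)], axis=-1)
panoptic[car_a, 1] = 1
panoptic[car_b, 1] = 2
print(panoptic[2])  # row 2 now distinguishes car 1 from car 2
```

Semantic segmentation alone would label both cars identically; instance segmentation alone would ignore the road and sky. The panoptic map keeps both kinds of information.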

Neuron – Machine Learning & AI Startups HTML Template


There are 24 unique pages with 3 different home pages included, covering most page types. This template is suitable for any type of Machine Learning, Deep Learning, Artificial Intelligence, Computer Vision, Natural Language Processing (NLP), Face Recognition, Speech Analysis, Self Driving Car, or startup business website. The template includes LESS files, so you can change the template color easily without any hassle. It is 100% fluid responsive and fits any device perfectly. Using this template, you can easily build your own website just the way you like it! Features: 03 Unique Awesome Home Pages, 20 HTML Templates Available, Product Demo pa

An Overview of Small object detection by Slicing Aided Hyper Inference (SAHI)


In surveillance applications, detecting tiny objects and objects far away in the scene is practically very difficult: such objects are represented by only a small number of pixels in the image, so traditional detectors have a tough time finding them. In this article, we will look at how current models fail to recognize objects at the far end of a scene, and at a strategy presented by Fatih Akyon et al. to address this problem, called Slicing Aided Hyper Inference (SAHI). Object detection is the task of predicting bounding boxes and classifying them into categories in order to locate all objects of interest in an input. Several approaches have been proposed to accomplish this goal, ranging from traditional methodologies to deep learning-based alternatives. What are the 2 approaches to object detection?
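The core of the slicing idea is simple enough to sketch: cut the image into overlapping tiles so small objects occupy more of each tile, run the detector per tile, then shift each tile-local box back into full-image coordinates. The slice size, overlap ratio, and the "detected" box below are assumptions for illustration, not SAHI's actual implementation:

```python
def slice_starts(length, slice_size, step):
    """Start offsets of overlapping slices along one axis."""
    starts = list(range(0, max(length - slice_size, 0) + 1, step))
    if starts[-1] + slice_size < length:  # make sure the image border is covered
        starts.append(length - slice_size)
    return starts

def make_slices(width, height, slice_size=512, overlap=0.2):
    """Top-left corners of overlapping square slices covering the image."""
    step = int(slice_size * (1 - overlap))
    return [(x, y) for y in slice_starts(height, slice_size, step)
                   for x in slice_starts(width, slice_size, step)]

def to_global(box, slice_xy):
    """Shift a slice-local (x1, y1, x2, y2) box into full-image coordinates."""
    x1, y1, x2, y2 = box
    sx, sy = slice_xy
    return (x1 + sx, y1 + sy, x2 + sx, y2 + sy)

# A 1024x1024 image cut into 512-pixel slices with 20% overlap.
slices = make_slices(1024, 1024)
# Pretend the detector found a small object inside the slice at (409, 0).
print(to_global((10, 20, 40, 60), (409, 0)))  # (419, 20, 449, 60)
```

After mapping all boxes back, overlapping duplicates from neighbouring slices are typically merged with non-maximum suppression before the final output.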

A Realistic Face Made Using Maya & DeepFace Live


With each passing day, the technologies behind AI-generated ultra-realistic faces get better and better. According to the artist, the project is a test of a Maya viewport applied to a real-time deepfake. To create the deepfake itself, the author used DeepFace Live. The workflow was based on that of Brielle Garcia, an AR/VR software developer who is also known for creating realistic deepfakes.

Computer Vision, Deep Learning and Object Detection


The human visual mechanism is fascinating. The visual sensors perceive an image and convert it into electrical signals, which they pass to the neural system. The brain then processes the signals, eventually allowing humans to see and to understand the context of an image, including which objects it contains, where they are, and how many there are. All of these complex processes happen instantly. Given a pen and asked to draw a box around every visible object, a person can do so easily; it is questionable whether a machine can perform this task as efficiently as humans. Convolutional Neural Networks (ConvNets or CNNs) are good at extracting features from a given image and finally classifying it as, say, a cat or a dog. This process is known as image classification, and it is an easy task when the objects are centred and only a few objects are in the image. When the number of objects increases and the objects belong to different classes, they must also be distinguished and localised within the image.
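Once multiple objects must be distinguished and localised, detectors typically score many overlapping candidate boxes and then keep the best one per object via non-maximum suppression (NMS). Here is a minimal pure-Python sketch of greedy NMS with made-up boxes and scores, not any particular detector's implementation:

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def nms(boxes, scores, thresh=0.5):
    """Greedy non-maximum suppression; returns indices of kept boxes."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Drop remaining boxes that overlap the kept box too much.
        order = [i for i in order if iou(boxes[best], boxes[i]) < thresh]
    return keep

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # the two overlapping boxes collapse to one
```

The first two boxes overlap heavily (IoU about 0.68), so only the higher-scoring one survives, while the distant third box is kept untouched.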

What is Computer Vision? Know Computer Vision Basic to Advanced & How Does it Work?


Computer vision is a field of study that enables computers to replicate the human visual system. It is a subset of artificial intelligence that collects information from digital images or videos and processes it to derive defined attributes. The entire process involves image acquisition, screening, analysis, identification, and information extraction. This extensive processing helps computers understand any visual content and act on it accordingly. You can also take a free computer vision course to understand the basics of this artificial intelligence domain.
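Those stages can be illustrated on a toy example: acquire an image, screen (normalize) it, analyse it with a threshold, and extract information such as the location of a bright region. The synthetic array and threshold below are invented for the sketch; a real pipeline would load an actual photograph and use far more sophisticated analysis:

```python
import numpy as np

# Acquire: a synthetic 8-bit grayscale "image" containing one bright square.
img = np.zeros((20, 20), dtype=np.uint8)
img[5:12, 8:15] = 200

# Screen: normalize pixel values to [0, 1].
norm = img.astype(float) / 255.0

# Analyse: threshold to separate foreground from background.
mask = norm > 0.5

# Identify / extract: bounding box of the detected region.
ys, xs = np.nonzero(mask)
bbox = (xs.min(), ys.min(), xs.max(), ys.max())
print(bbox)  # (8, 5, 14, 11)
```

Even this trivial pipeline mirrors the described flow: raw pixels in, a compact, actionable description of the content out.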

Core Challenges in Embodied Vision-Language Planning

Journal of Artificial Intelligence Research

Recent advances in the areas of multimodal machine learning and artificial intelligence (AI) have led to the development of challenging tasks at the intersection of Computer Vision, Natural Language Processing, and Embodied AI. Whereas many approaches and previous survey pursuits have characterised one or two of these dimensions, there has not been a holistic analysis at the center of all three. Moreover, even when combinations of these topics are considered, more focus is placed on describing, e.g., current architectural methods, as opposed to also illustrating high-level challenges and opportunities for the field. In this survey paper, we discuss Embodied Vision-Language Planning (EVLP) tasks, a family of prominent embodied navigation and manipulation problems that jointly use computer vision and natural language. We propose a taxonomy to unify these tasks and provide an in-depth analysis and comparison of the new and current algorithmic approaches, metrics, simulated environments, as well as the datasets used for EVLP tasks. Finally, we present the core challenges that we believe new EVLP works should seek to address, and we advocate for task construction that enables model generalizability and furthers real-world deployment.