 Pattern Recognition


Dahua AI Gait Recognition Breaks CASIA-B Gait Records

#artificialintelligence

The average Rank-1 accuracy for NM (normal walking), BG (carrying a bag) and CL (wearing a coat) reached 97.4 per cent, 94.0 per cent, and 87.0 per cent respectively, setting a new record and maintaining Dahua's leading position. Gait recognition uses body shape and walking posture to identify a person, even if the face is occluded. It is one of the biometric recognition technologies with the greatest potential for long-distance recognition scenarios. To address the technical difficulties of gait recognition in clothing-change, carrying-change and cross-view search, Dahua Technology combines multi-modal gait algorithms, local gait feature extraction, and spatio-temporal gait feature extraction. Combined with powerful model training and object recognition, this greatly improves the algorithm's robustness in special scenes such as clothing changes, similar clothing, facial occlusion, and facial disguise, making gait recognition analysis more accurate and efficient.
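For readers unfamiliar with the Rank-1 figures quoted above, the following is a minimal sketch of how Rank-1 accuracy is commonly computed in retrieval-style recognition: each probe sample is matched to its nearest gallery sample, and the match counts as a hit when the identities agree. The function name, the Euclidean distance, and the toy two-dimensional features are illustrative assumptions, not Dahua's evaluation code or the CASIA-B protocol.

```python
import math


def rank1_accuracy(probe_feats, probe_ids, gallery_feats, gallery_ids):
    """Fraction of probes whose nearest gallery sample shares their identity."""
    hits = 0
    for feat, pid in zip(probe_feats, probe_ids):
        # Index of the gallery sample closest to this probe (Euclidean distance).
        nearest = min(
            range(len(gallery_feats)),
            key=lambda j: math.dist(feat, gallery_feats[j]),
        )
        if gallery_ids[nearest] == pid:
            hits += 1
    return hits / len(probe_feats)


# Toy data (hypothetical): three enrolled identities and three probes.
gallery_feats = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
gallery_ids = ["A", "B", "C"]
probe_feats = [(0.1, 0.0), (0.9, 1.2), (2.0, 2.0)]
probe_ids = ["A", "B", "C"]

rank1 = rank1_accuracy(probe_feats, probe_ids, gallery_feats, gallery_ids)
print(f"Rank-1 accuracy: {rank1:.3f}")
```

In this toy example the third probe lies closer to identity B than to its true identity C, so two of three probes are matched correctly.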


The rise of image recognition AI in medical diagnostics - Electronic Products & Technology

#artificialintelligence

The use of image visualization and limited recognition software in medical diagnostics started over 20 years ago. This technology had, however, nearly reached its performance limits when deep learning (DL) and convolutional neural networks (CNNs) were developed, heralding a step-change in the capability and performance of machine vision. This progress demonstrates that image recognition AI technology can match or even exceed human-level performance (in terms of accuracy, sensitivity, and specificity) in many disease areas and on many imaging modalities. The technical threshold for the automation of these diagnostic tasks has already been reached, laying the groundwork for commercial growth in the short and long term. This is shown in the market projections below.
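The three performance measures named above (accuracy, sensitivity, and specificity) have standard definitions in terms of true and false positives and negatives. The sketch below computes them for a binary diagnostic task; the function name and the toy labels are illustrative assumptions and are not drawn from the article.

```python
def diagnostic_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity for binary labels (1 = disease present)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # share of diseased cases that are detected
        "specificity": tn / (tn + fp),  # share of healthy cases that are cleared
    }


# Toy labels (hypothetical): 3 diseased and 5 healthy cases.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1, 0]

metrics = diagnostic_metrics(y_true, y_pred)
print(metrics)
```

Sensitivity and specificity are reported separately because a model can score high accuracy on an imbalanced dataset while still missing most positive cases.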


TRACE: Transform Aggregate and Compose Visiolinguistic Representations for Image Search with Text Feedback

arXiv.org Artificial Intelligence

The ability to efficiently search for images over an indexed database is the cornerstone for several user experiences. Incorporating user feedback through multi-modal inputs provides flexibility and interaction to serve fine-grained specificity in requirements. We specifically focus on text feedback, through descriptive natural language queries. Given a reference image and textual user feedback, our goal is to retrieve images that satisfy constraints specified by both of these input modalities. The task is challenging as it requires understanding the textual semantics from the text feedback and then applying these changes to the visual representation. To address these challenges, we propose a novel architecture, TRACE, which contains a hierarchical feature aggregation module to learn the composite visio-linguistic representations. TRACE achieves state-of-the-art (SOTA) performance on 3 benchmark datasets: FashionIQ, Shoes, and Birds-to-Words, with an average improvement of at least ~5.7%, ~3%, and ~5% respectively in the R@K metric. Our extensive experiments and ablation studies show that TRACE consistently outperforms the existing techniques by significant margins both quantitatively and qualitatively.
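The R@K (recall at K) metric cited above is, in its usual form, the fraction of queries whose target image appears among the top K retrieved results. The following is a minimal sketch under that assumption, with one target per query; the function name and the toy rankings are illustrative and are not taken from the TRACE paper.

```python
def recall_at_k(ranked_lists, targets, k):
    """Fraction of queries whose target appears in the top-k retrieved items."""
    hits = sum(
        1 for ranked, target in zip(ranked_lists, targets) if target in ranked[:k]
    )
    return hits / len(targets)


# Toy rankings (hypothetical): three queries, each seeking image "a".
ranked_lists = [
    ["a", "b", "c"],  # target retrieved at rank 1
    ["c", "a", "b"],  # target retrieved at rank 2
    ["b", "c", "a"],  # target retrieved at rank 3
]
targets = ["a", "a", "a"]

for k in (1, 2, 3):
    print(f"R@{k} = {recall_at_k(ranked_lists, targets, k):.3f}")
```

R@K is non-decreasing in K, which is why papers typically report it at several cutoffs rather than one.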


Tasks Integrated Networks: Joint Detection and Retrieval for Image Search

arXiv.org Artificial Intelligence

The traditional object retrieval task aims to learn a discriminative feature representation with intra-similarity and inter-dissimilarity, which supposes that the objects in an image are manually or automatically pre-cropped exactly. However, in many real-world searching scenarios (e.g., video surveillance), the objects (e.g., persons, vehicles, etc.) are seldom accurately detected or annotated. Therefore, object-level retrieval becomes intractable without bounding-box annotation, which leads to a new but challenging topic, i.e. image-level search. In this paper, to address the image search issue, we first introduce an end-to-end Integrated Net (I-Net), which has three merits: 1) A Siamese architecture and an on-line pairing strategy for similar and dissimilar objects in the given images are designed. 2) A novel on-line pairing (OLP) loss is introduced with a dynamic feature dictionary, which alleviates the multi-task training stagnation problem, by automatically generating a number of negative pairs to restrict the positives. 3) A hard example priority (HEP) based softmax loss is proposed to improve the robustness of classification task by selecting hard categories. With the philosophy of divide and conquer, we further propose an improved I-Net, called DC-I-Net, which makes two new contributions: 1) two modules are tailored to handle different tasks separately in the integrated framework, such that the task specification is guaranteed. 2) A class-center guided HEP loss (C2HEP) by exploiting the stored class centers is proposed, such that the intra-similarity and inter-dissimilarity can be captured for ultimate retrieval. Extensive experiments on famous image-level search oriented benchmark datasets demonstrate that the proposed DC-I-Net outperforms the state-of-the-art tasks-integrated and tasks-separated image search models.


On Open and Strong-Scaling Tools for Atom Probe Crystallography: High-Throughput Methods for Indexing Crystal Structure and Orientation

arXiv.org Artificial Intelligence

Volumetric crystal structure indexing and orientation mapping are key data processing steps for virtually any quantitative study of spatial correlations between the local chemistry and the microstructure of a material. For electron and X-ray diffraction methods it is possible to develop indexing tools which compare measured and analytically computed patterns to decode the structure and relative orientation within local regions of interest. Consequently, a number of numerically efficient and automated software tools exist to solve the above characterisation tasks. For atom probe tomography (APT) experiments, however, the strategy of making comparisons between measured and analytically computed patterns is less robust because many APT datasets may contain substantial noise. Given that sufficiently general predictive models for such noise remain elusive, crystallography tools for APT face several limitations: their robustness to noise, and therefore their capability to identify and distinguish different crystal structures and orientations, is limited. In addition, the tools are sequential and demand substantial manual interaction. In combination, this makes robust uncertainty quantification with automated high-throughput studies of the latent crystallographic information a difficult task with APT data. To improve the situation, we review the existing methods and discuss how they link to those in the diffraction communities. With this we modify some of the APT methods to yield more robust descriptors of the atomic arrangement. We report how this enables the development of an open-source software tool for strong-scaling and automated identification of crystal structure and mapping of crystal orientation in nanocrystalline APT datasets with multiple phases.


Gesture Recognition: The Right Way to AI Interaction

#artificialintelligence

This article introduces the research about the interactive ability of gesture recognition by the Tmall Genie M laboratory. The research covers the exploration of business and algorithms in gesture recognition, future applications, and prospects for the algorithms. "Gestures are the most natural form of human communication. Hardware is the only limitation that prevents us from controlling our devices well." Here, the hardware limitation refers to the need for additional depth sensors by traditional gesture recognition algorithms.


What is Machine Learning Image Recognition in Retail?

#artificialintelligence

It's not unusual to say that AI is the future. AI is entering almost every field and, for the most part, leading those sectors on a path to success. Opinions may vary, but we all have to agree that it has opened the gates to a whole new era of opportunities, making possible things we only expected to exist in movies. Having said this, it's no surprise that automatic store checkouts are also designed with the help of a subset of AI, namely machine learning or, to be more precise, deep learning. Deep learning, which is essentially a form of machine learning, helps build the image recognition and object recognition mechanisms. Though the terms image recognition and object recognition are used interchangeably, they are not exactly identical, as explained later in the blog.


Know your Artificial Intelligence…

#artificialintelligence

There is a lot of discussion around AI and its related technologies. There certainly is information overload around artificial intelligence; however, one thing that stands out is that, despite all the available information, several business leaders seem to lack an understanding of AI. There are several misconceptions around what artificial intelligence is, what existing technologies can do, where it is applicable, and, most importantly, the differences between RPA, Machine Learning, Natural Language Processing, Computational Linguistics and more. So if you are trying to pick up some basics around these topics, the material below might help clarify some outstanding questions and how they apply in the real world. Note: This is not a technical write-up and will not get into the deep technical aspects of any specific technology. Also, AI is a very broad topic; this blog will not discuss concepts like Image Recognition, Computer Vision, Robotics, Augmented Reality, Autonomous Vehicles, Real-time emotion recognition, cognitive cyber security and other greenfield technologies.


EASTER: Efficient and Scalable Text Recognizer

arXiv.org Machine Learning

Recent progress in deep learning has led to the development of Optical Character Recognition (OCR) systems which perform remarkably well. Most research has centred on recurrent networks and complex gated layers, which make the overall solution complex and difficult to scale. In this paper, we present an Efficient And Scalable TExt Recognizer (EASTER) to perform optical character recognition on both machine-printed and handwritten text. Our model utilises 1-D convolutional layers without any recurrence, which enables parallel training with a considerably smaller volume of data. We experimented with multiple variations of our architecture, and one of the smallest variants (in terms of depth and number of parameters) performs comparably to complex RNN-based choices. Our deepest variant, with 20 layers, outperforms RNN architectures by a good margin on benchmark datasets like IIIT-5k and SVT. We also showcase improvements over the current best results on the offline handwritten text recognition task. We also present data generation pipelines with an augmentation setup to generate synthetic datasets for both handwritten and machine-printed text.
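To illustrate why 1-D convolution without recurrence lends itself to parallel computation, the sketch below implements a single valid-mode 1-D convolution (cross-correlation, as in most deep learning layers) in plain Python. Each output position depends only on a fixed local window of the input, not on previously computed outputs, so all positions can in principle be evaluated independently. This is a generic illustration with a hypothetical function name and toy values; it is not the EASTER architecture.

```python
def conv1d(signal, kernel):
    """Valid-mode 1-D cross-correlation of a sequence with a kernel."""
    k = len(kernel)
    # Each output element reads only signal[i : i + k]; there is no dependence
    # on earlier outputs, unlike the hidden-state update of a recurrent network.
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


# Toy input (hypothetical): a linear ramp and a simple difference kernel.
signal = [1, 2, 3, 4, 5]
kernel = [1, 0, -1]

features = conv1d(signal, kernel)
print(features)
```

A recurrent layer, by contrast, must process time steps in order because each hidden state is a function of the previous one.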


Intelligence Primer

arXiv.org Artificial Intelligence

This primer explores the exciting subject of intelligence. Intelligence is a fundamental component of all living things, as well as Artificial Intelligence (AI). Artificial Intelligence has the potential to affect all of our lives and to usher in a new era for modern humans. This paper is an attempt to explore the ideas associated with intelligence, and by doing so understand the implications, constraints, and potentially the capabilities of future Artificial Intelligence. As an exploration, we journey into different parts of intelligence that appear essential. We hope that people find this useful in determining where Artificial Intelligence may be headed. Also, during the exploration, we hope to create new thought-provoking questions. Intelligence is not a single weighable quantity but a subject that spans Biology, Physics, Philosophy, Cognitive Science, Neuroscience, Psychology, and Computer Science. Historian Yuval Noah Harari pointed out that engineers and scientists in the future will have to broaden their understanding to include disciplines such as Psychology, Philosophy, and Ethics. Fiction writers have long portrayed engineers and scientists as deficient in these areas. Today, modern society, the emergence of Artificial Intelligence, and legal requirements all act as forcing functions to push these broader subjects into the foreground. We start with an introduction to intelligence and move quickly onto more profound thoughts and ideas. We call this a Life, the Universe and Everything primer, after the famous science fiction book by Douglas Adams. Forty-two may very well be the right answer, but what are the questions?