Pattern Recognition
Scalable Reverse Image Search Engine for NASA Worldview
Sodani, Abhigya, Levy, Michael, Koul, Anirudh, Kasam, Meher Anand, Ganju, Siddha
Researchers often spend weeks sifting through decades of unlabeled satellite imagery (on NASA Worldview) to build datasets on which they can start conducting research. We developed an interactive, scalable, and fast image similarity search engine, which accepts one or more images as the query, that automatically sifts through the unlabeled dataset and reduces dataset generation time from weeks to minutes. In this work, we describe the key components of the end-to-end pipeline. Our similarity search system is designed to identify images similar to an input image in a potentially petabyte-scale database. To do so, we break each query image down into features generated by a CNN that was trained in a supervised manner and then stripped of its classification layer. Storing and searching these features efficiently required several scalability improvements. To improve speed, reduce storage, and shrink the memory requirements of embedding search, we add a fully connected layer to our CNN that maps every image to a 128-length vector before the classification layers. This compressed our image features from 2048 dimensions (for ResNet, which we initially tried as our featurizer) to 128 for our new custom model. Additionally, we use existing approximate nearest neighbor search libraries to significantly speed up embedding search. Our system currently searches our entire database of images in 5 seconds per query on a single virtual machine in the cloud. In the future, we would like to incorporate a SimCLR-based featurizing model that can be trained without any human labeling (since the classification aspect of the model is irrelevant to this use case).
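As a rough illustration of the embedding search described above, the following Python sketch ranks 128-length vectors by cosine similarity. It is a minimal stand-in under stated assumptions, not the authors' pipeline: the embeddings are random toy data rather than CNN features, multiple query images are combined by averaging their normalized embeddings (the abstract does not say how they are combined), and the exhaustive loop takes the place of an approximate nearest neighbor library. The names `search`, `database`, and `tile_42` are hypothetical.

```python
import math
import random

DIM = 128  # embedding length mentioned in the abstract


def normalize(vec):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def mean_vector(vectors):
    """Element-wise mean of equally sized vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def search(query_embeddings, database, top_k=5):
    """Exhaustive cosine-similarity search over a dict of id -> embedding.

    Assumption: several query images are merged by averaging their
    normalized embeddings. An ANN index would replace this linear scan.
    """
    query = normalize(mean_vector([normalize(q) for q in query_embeddings]))
    scored = []
    for image_id, embedding in database.items():
        score = sum(a * b for a, b in zip(query, normalize(embedding)))
        scored.append((score, image_id))
    scored.sort(reverse=True)
    return [(image_id, score) for score, image_id in scored[:top_k]]


# Toy database of random embeddings standing in for CNN features.
rng = random.Random(0)
database = {
    f"tile_{i}": [rng.gauss(0, 1) for _ in range(DIM)] for i in range(1000)
}

# A query that is a slightly perturbed copy of one stored embedding.
query = [x + rng.gauss(0, 0.05) for x in database["tile_42"]]
results = search([query], database, top_k=3)
print(results[0][0])  # tile_42
```

The linear scan costs time proportional to the database size, which is why an approximate nearest neighbor index is attractive at the scale the abstract describes.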
PimEyes: Face Recognition Search Engine and Reverse Image Search
A reverse image search is a technique for finding things, people, brands, etc. using a photo. In a regular search you usually type a word or phrase related to the information you are trying to find; in a reverse image search, you upload a picture to a search engine. The results of a regular search are a list of websites connected to the phrases you typed. The results of a reverse image search are photos of similar things, people, etc., linked to websites about them. Reverse search by image is the best solution when looking for similar images, smaller or larger versions of them, or duplicate content.
Fast and Scalable Image Search For Histology
Chen, Chengkuan, Lu, Ming Y., Williamson, Drew F. K., Chen, Tiffany Y., Schaumberg, Andrew J., Mahmood, Faisal
The expanding adoption of digital pathology has enabled the curation of large repositories of histology whole slide images (WSIs), which contain a wealth of information. Similar pathology image search offers the opportunity to comb through large historical repositories of gigapixel WSIs to identify cases with similar morphological features and can be particularly useful for diagnosing rare diseases and for identifying similar cases to predict prognosis, treatment outcomes, and potential clinical trial success. A critical challenge in developing a WSI search and retrieval system is scalability, which is uniquely challenging given the need to search a growing number of slides that can each consist of billions of pixels and be several gigabytes in size. Such systems are typically slow, and their retrieval time often grows with the size of the repository they search through, which makes clinical adoption tedious and renders them infeasible for repositories that are constantly growing. Here we present Fast Image Search for Histopathology (FISH), a histology image search pipeline that is infinitely scalable and achieves constant search speed that is independent of the image database size, while being interpretable and without requiring detailed annotations. FISH uses self-supervised deep learning to encode meaningful representations from WSIs and a van Emde Boas tree for fast search, followed by an uncertainty-based ranking algorithm to retrieve similar WSIs. We evaluated FISH on multiple tasks and datasets with over 22,000 patient cases spanning 56 disease subtypes. We additionally demonstrate that FISH can be used to assist with the diagnosis of rare cancer types where sufficient cases may not be available to train traditional supervised deep models.
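To make the van Emde Boas tree mentioned above more concrete, here is a compact Python sketch of the data structure with insertion, membership, and successor queries. It assumes each image has already been reduced to an integer key in a fixed range; how FISH derives its keys and ranks results is not covered, and the class name `VEBTree` is hypothetical. Successor queries take O(log log U) time for a universe of U possible keys, which depends on the key range and not on how many keys are stored.

```python
class VEBTree:
    """van Emde Boas tree over integer keys in [0, 2**bits). No deletion."""

    def __init__(self, bits):
        self.bits = bits
        self.min = None           # smallest key, kept out of the clusters
        self.max = None           # largest key
        self.low_bits = bits // 2
        self.summary = None       # tracks which clusters are non-empty
        self.clusters = {}        # cluster index -> VEBTree(low_bits)

    def _high(self, x):
        return x >> self.low_bits

    def _low(self, x):
        return x & ((1 << self.low_bits) - 1)

    def _index(self, high, low):
        return (high << self.low_bits) | low

    def insert(self, x):
        if self.min is None:
            self.min = self.max = x
            return
        if x == self.min:
            return  # already stored
        if x < self.min:
            x, self.min = self.min, x  # new minimum; push the old one down
        if self.bits > 1:
            high, low = self._high(x), self._low(x)
            if high not in self.clusters:
                self.clusters[high] = VEBTree(self.low_bits)
                if self.summary is None:
                    self.summary = VEBTree(self.bits - self.low_bits)
                self.summary.insert(high)
            self.clusters[high].insert(low)
        if x > self.max:
            self.max = x

    def __contains__(self, x):
        if x == self.min or x == self.max:
            return True
        if self.bits == 1:
            return False
        cluster = self.clusters.get(self._high(x))
        return cluster is not None and self._low(x) in cluster

    def successor(self, x):
        """Smallest stored key strictly greater than x, or None."""
        if self.bits == 1:
            return 1 if (x == 0 and self.max == 1) else None
        if self.min is not None and x < self.min:
            return self.min
        high, low = self._high(x), self._low(x)
        cluster = self.clusters.get(high)
        if cluster is not None and low < cluster.max:
            # The answer lies in the same cluster as x.
            return self._index(high, cluster.successor(low))
        if self.summary is None:
            return None
        next_high = self.summary.successor(high)
        if next_high is None:
            return None
        # Otherwise it is the minimum of the next non-empty cluster.
        return self._index(next_high, self.clusters[next_high].min)


demo = VEBTree(4)  # keys in [0, 16)
for key in (2, 3, 4, 5, 7, 14, 15):
    demo.insert(key)
print(demo.successor(5))  # 7
```

A predecessor query can be written symmetrically, which would allow looking up the closest stored keys on both sides of a query key.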
Exploring and mining attributed sequences of interactions
Viard, Tiphaine, Soldano, Henry, Santini, Guillaume
We are faced with data comprised of entities interacting over time: individuals meeting, customers buying products, or machines exchanging packets on an IP network, among others. Capturing the dynamics as well as the structure of these interactions is of crucial importance for analysis. These interactions can almost always be labeled with content: group membership, product reviews, abstracts, etc. We model these streams of interactions as stream graphs, a recent framework for modeling interactions over time. Formal Concept Analysis provides a framework for analyzing concepts evolving within a context. Considering graphs as the context, it has recently been applied to perform closed pattern mining on social graphs. In this paper, we are interested in pattern mining in sequences of interactions. After recalling notions from formal concept analysis on graphs and extending them to stream graphs, we introduce algorithms to enumerate closed patterns on a labeled stream graph, and introduce a way to select relevant closed patterns. We run experiments on two real-world datasets, interactions among students and citations between authors, and show both the feasibility and the relevance of our method.
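The notion of a closed pattern can be illustrated with a small Python sketch of the closure operator from Formal Concept Analysis. This is a simplified, static example under stated assumptions: it ignores the time dimension of stream graphs entirely, the toy context (three students with labels) is invented for illustration, and the enumeration simply closes the objects' label sets under intersection, which is not necessarily the algorithm the paper introduces.

```python
def extent(context, pattern):
    """Objects whose label set contains every label in the pattern."""
    return frozenset(obj for obj, labels in context.items() if pattern <= labels)


def closure(context, pattern):
    """Largest label set shared by all objects that match the pattern."""
    objects = extent(context, pattern)
    if not objects:
        # No object matches: by convention the closure is every label.
        return frozenset().union(*context.values())
    return frozenset.intersection(*(context[obj] for obj in objects))


def closed_patterns(context):
    """All patterns obtainable by intersecting the label sets of objects."""
    closed = set()
    for labels in context.values():
        closed |= {labels} | {labels & other for other in closed}
    return closed


# Hypothetical context: students labeled with a field and a year.
context = {
    "alice": frozenset({"cs", "year1"}),
    "bob": frozenset({"cs", "year2"}),
    "carol": frozenset({"math", "year1"}),
}

# "year2" alone is not closed: every student with "year2" also has "cs".
print(sorted(closure(context, frozenset({"year2"}))))  # ['cs', 'year2']
print(len(closed_patterns(context)))  # 6
```

A pattern is closed exactly when `closure` returns it unchanged, so closed patterns summarize the label combinations that cannot be extended without losing matching objects.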
Semantic-guided Pixel Sampling for Cloth-Changing Person Re-identification
Shu, Xiujun, Li, Ge, Wang, Xiao, Ruan, Weijian, Tian, Qi
Cloth-changing person re-identification (re-ID) is an emerging research topic that aims at retrieving pedestrians whose clothes have changed. This task is quite challenging and has not been fully studied to date. Current works mainly focus on body shape or contour sketches, but they are not robust enough to view and posture variations. The key to this task is to exploit cloth-irrelevant cues. This paper proposes a semantic-guided pixel sampling approach for the cloth-changing person re-ID task. We do not explicitly define which features to extract but force the model to automatically learn cloth-irrelevant cues. Specifically, we first recognize each pedestrian's upper clothes and pants, then randomly change them by sampling pixels from other pedestrians. The changed samples retain their identity labels but exchange the pixels of upper clothes or pants among different pedestrians. In addition, we adopt a loss function that constrains the learned features to remain consistent before and after the changes. In this way, the model is forced to learn cues that are irrelevant to upper clothes and pants. We conduct extensive experiments on the recently released PRCC dataset. Our method achieves 65.8% Rank-1 accuracy, which outperforms previous methods by a large margin. The code is available at https://github.com/shuxjweb/pixel_sampling.git.
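The pixel sampling step described above can be sketched in a few lines of Python. This is an illustrative toy under stated assumptions, not the code from the linked repository: images are nested lists of RGB tuples, the semantic parsing masks are assumed to be given, and the label ids and the function name `sample_clothes_pixels` are hypothetical.

```python
import random

# Hypothetical label ids for the semantic parsing mask.
BACKGROUND, UPPER_CLOTHES, PANTS = 0, 1, 2


def sample_clothes_pixels(image, mask, donor_image, donor_mask, part, rng):
    """Replace the pixels of one clothing part with pixels sampled at random
    from the same part of another pedestrian. The identity label of `image`
    is meant to stay the same; only the clothing appearance changes."""
    donor_pixels = [
        donor_image[r][c]
        for r, row in enumerate(donor_mask)
        for c, label in enumerate(row)
        if label == part
    ]
    changed = [list(row) for row in image]  # copy so the input is untouched
    if not donor_pixels:
        return changed  # donor has no such part; leave the image as is
    for r, row in enumerate(mask):
        for c, label in enumerate(row):
            if label == part:
                changed[r][c] = rng.choice(donor_pixels)
    return changed


# Toy 3x3 example: pedestrian A is uniformly gray, pedestrian B wears red.
image_a = [[(10, 10, 10)] * 3 for _ in range(3)]
mask_a = [[0, 1, 0], [0, 1, 0], [0, 2, 0]]
image_b = [
    [(200, 0, 0), (210, 0, 0), (0, 0, 0)],
    [(0, 0, 200), (0, 0, 210), (0, 0, 0)],
    [(0, 0, 0), (0, 0, 0), (0, 0, 0)],
]
mask_b = [[1, 1, 0], [2, 2, 0], [0, 0, 0]]

changed_a = sample_clothes_pixels(
    image_a, mask_a, image_b, mask_b, UPPER_CLOTHES, random.Random(0)
)
```

In this toy case only the two upper-clothes pixels of pedestrian A are overwritten with red pixels from pedestrian B, while the pants and background pixels are left unchanged.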
A pattern recognition approach for distinguishing between prose and poetry
Poetry and prose are written artistic expressions that help us to appreciate the reality we live in. Each of these styles has its own set of subjective properties, such as rhyme and rhythm, which are easily caught by a human reader's eye and ear. With the recent advances in artificial intelligence, the gap between humans and machines may have decreased, and today we observe algorithms mastering tasks that were once exclusively performed by humans. In this paper, we propose an automated method to distinguish between poetry and prose based solely on aural and rhythmic properties. In order to compare prose and poetry rhythms, we represent the rhymes and phones as temporal sequences and propose a procedure for extracting rhythmic features from these sequences.
EMG Pattern Recognition via Bayesian Inference with Scale Mixture-Based Stochastic Generative Models
Furui, Akira, Igaue, Takuya, Tsuji, Toshio
Electromyogram (EMG) has been utilized to interface signals for prosthetic hands and information devices owing to its ability to reflect human motion intentions. Although various EMG classification methods have been introduced into EMG-based control systems, they do not fully consider the stochastic characteristics of EMG signals. This paper proposes an EMG pattern classification method incorporating a scale mixture-based generative model. A scale mixture model is a stochastic EMG model in which the EMG variance is considered as a random variable, enabling the representation of uncertainty in the variance. This model is extended in this study and utilized for EMG pattern classification. The proposed method is trained by variational Bayesian learning, thereby allowing the automatic determination of the model complexity. Furthermore, to optimize the hyperparameters of the proposed method with a partial discriminative approach, a mutual information-based determination method is introduced. Simulation and EMG analysis experiments demonstrated the relationship between the hyperparameters and the classification accuracy of the proposed method as well as its validity. A comparison using public EMG datasets revealed that the proposed method outperformed various conventional classifiers. These results indicate the validity of the proposed method and its applicability to EMG-based control systems. In EMG pattern recognition, a classifier based on a generative model that reflects the stochastic characteristics of EMG signals can outperform conventional general-purpose classifiers.
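The idea of treating the variance as a random variable can be illustrated numerically. The Python sketch below assumes a zero-mean, one-dimensional Gaussian signal whose variance follows an inverse-gamma distribution; the abstract does not state which distribution the variance follows, so this choice and the parameter names `alpha` and `beta` are assumptions made for illustration. Under that assumption the marginal density is a Student-t type distribution, which the code checks against a direct numerical integration over the variance.

```python
import math


def normal_pdf(x, variance):
    """Zero-mean Gaussian density with the given variance."""
    return math.exp(-x * x / (2.0 * variance)) / math.sqrt(2.0 * math.pi * variance)


def inverse_gamma_pdf(s, alpha, beta):
    """Inverse-gamma density, the assumed distribution of the variance s."""
    log_pdf = (
        alpha * math.log(beta)
        - math.lgamma(alpha)
        - (alpha + 1.0) * math.log(s)
        - beta / s
    )
    return math.exp(log_pdf)


def marginal_numeric(x, alpha, beta, steps=4000, lo=-15.0, hi=15.0):
    """Integrate N(x | 0, s) * IG(s | alpha, beta) over s by the trapezoid
    rule in u = log(s), so that ds = s du."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        s = math.exp(lo + i * h)
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * normal_pdf(x, s) * inverse_gamma_pdf(s, alpha, beta) * s
    return total * h


def marginal_closed_form(x, alpha, beta):
    """Closed-form marginal: a Student-t type density with 2 * alpha
    degrees of freedom."""
    log_const = (
        math.lgamma(alpha + 0.5)
        - math.lgamma(alpha)
        - 0.5 * math.log(2.0 * math.pi * beta)
    )
    return math.exp(log_const - (alpha + 0.5) * math.log1p(x * x / (2.0 * beta)))


alpha, beta = 2.0, 1.0  # marginal variance is beta / (alpha - 1) = 1
for x in (0.0, 0.5, 2.0):
    print(x, marginal_numeric(x, alpha, beta), marginal_closed_form(x, alpha, beta))
```

With these parameters the marginal has unit variance yet assigns far more density to large amplitudes than a unit-variance Gaussian does, which is the kind of variance uncertainty a fixed-variance model cannot represent.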
The World of Reality, Causality and Real Artificial Intelligence: Exposing the Great Unknown Unknowns
"All men by nature desire to know." - Aristotle "He who does not know what the world is does not know where he is." - Marcus Aurelius "If I have seen further, it is by standing on the shoulders of giants." "The universe is a giant causal machine. The world is "at the bottom" governed by causal algorithms. Our bodies are causal machines. Our brains and minds are causal AI computers". The 3 biggest unknown unknowns are described and analyzed in terms of human intelligence and machine intelligence. A deep understanding of reality and its causality is to revolutionize the world, its science and technology, including AI machines. The content is the introduction to the Real AI Project Confidential Report: How to Engineer Man-Machine Superintelligence 2025: AI for Everything and Everyone (AI4EE). It is all a power set of {known, unknown; known unknown}: known knowns, known unknowns, unknown knowns, and unknown unknowns, much like the material parts of the universe: about 4.6% baryonic matter, about 26.8% dark matter, and about 68.3% dark energy. There is a large number of sciences of all sorts and kinds, hard sciences and soft sciences. But what we are still missing is the science of all sciences, the Science of the World as a Whole, which makes it the biggest unknown unknown. It is what man/AI does not know that it does not know, neither understanding it nor being aware of its scope and scale, sense and extent. "the universe consists of objects having various qualities and standing in various relationships" (Whitehead, Russell), "the world is the totality of states of affairs" (D. "World of physical objects and events, including, in particular, biological beings; World of mental objects and events; World of objective contents of thought" (K. That the world is still an unknown unknown can be seen from the most popular lexical ontology, WordNet; see supplement.
The construct of the world is typically missing its essential meaning, "the world as a whole", the world of reality, the ultimate totality of all worlds, universes, and realities, beings, things, and entities, the unified totalities. The world or reality or being or existence is "all that is, has been and will be". Of which the physical universe and cosmos is a key part, as "the totality of space and times and matter and energy, with all causative fundamental interactions".
Robust Learning for Text Classification with Multi-source Noise Simulation and Hard Example Mining
Xu, Guowei, Ding, Wenbiao, Fu, Weiping, Wu, Zhongqin, Liu, Zitao
Many real-world applications involve the use of Optical Character Recognition (OCR) engines to transform handwritten images into transcripts on which downstream Natural Language Processing (NLP) models are applied. In this process, OCR engines may introduce errors, and the inputs to downstream NLP models become noisy. Although pre-trained models achieve state-of-the-art performance on many NLP benchmarks, we show that they are not robust to noisy texts generated by real OCR engines. This greatly limits the application of NLP models in real-world scenarios. To improve model performance on noisy OCR transcripts, it is natural to train the NLP model on labelled noisy texts. However, in most cases only labelled clean texts are available. Since there are no handwritten pictures corresponding to the text, it is impossible to directly use the recognition model to obtain noisy labelled data. Human resources can be employed to copy texts and take pictures, but this is extremely expensive considering the size of the data needed for model training. Consequently, we are interested in making NLP models intrinsically robust to OCR errors in a low-resource manner. We propose a novel robust training framework that 1) employs simple but effective methods to directly simulate natural OCR noise from clean texts, 2) iteratively mines hard examples from a large number of simulated samples for optimal performance, and 3) employs a stability loss so that the model learns noise-invariant representations. Experiments on three real-world datasets show that the proposed framework boosts the robustness of pre-trained models by a large margin. We believe that this work can greatly promote the application of NLP models in actual scenarios, although the algorithm we use is simple and straightforward. We make our code and three datasets publicly available at https://github.com/tal-ai/Robust-learning-MSSHEM.
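The first two components of the framework described above, noise simulation and hard example mining, might look roughly like the following Python sketch. It is an illustration under stated assumptions and not the code from the linked repository: the confusion table, the three noise types (substitution, deletion, insertion), their probabilities, and the function names are all hypothetical, and the loss function is left as a caller-supplied placeholder for a real model's loss.

```python
import random

# Hypothetical table of visually confusable characters.
CONFUSIONS = {"o": "0", "l": "1", "e": "c", "s": "5", "m": "rn"}
ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def simulate_ocr_noise(text, rng, sub_p=0.05, del_p=0.02, ins_p=0.02):
    """Corrupt clean text with character-level noise.

    Each character is dropped with probability del_p, replaced by a
    confusable character with probability sub_p (if the table has one),
    and may be followed by a spurious character with probability ins_p.
    """
    out = []
    for ch in text:
        r = rng.random()
        if r < del_p:
            continue  # deletion: the character is lost
        if r < del_p + sub_p and ch in CONFUSIONS:
            out.append(CONFUSIONS[ch])  # substitution
        else:
            out.append(ch)
        if rng.random() < ins_p:
            out.append(rng.choice(ALPHABET))  # insertion
    return "".join(out)


def mine_hard_examples(samples, loss_fn, k):
    """Keep the k simulated samples on which the current model does worst.

    loss_fn is a placeholder for a model's per-sample loss.
    """
    return sorted(samples, key=loss_fn, reverse=True)[:k]


rng = random.Random(0)
clean = "the model reads handwritten notes"
noisy_pool = [simulate_ocr_noise(clean, rng, sub_p=0.3) for _ in range(20)]

# Stand-in loss: count positions that differ from the clean text.
def toy_loss(noisy):
    return sum(a != b for a, b in zip(noisy, clean)) + abs(len(noisy) - len(clean))

hard = mine_hard_examples(noisy_pool, toy_loss, k=5)
```

In an actual training loop the mining step would be repeated as the model improves, so that the selected examples keep tracking what the current model finds difficult.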