 Pattern Recognition


Top Machine Learning recommended systems

#artificialintelligence

We cannot overstate the vital role the Internet plays in our personal and professional lives! We all rely on technology these days. A decade ago we relied on manual methods to achieve our goals, and we never expected to be thinking about machine learning applications today. We never thought we could check real-time traffic conditions before setting out for our destination. Ten years ago it was hard to imagine that we could order food with just a few clicks!


Curricular SincNet: Towards Robust Deep Speaker Recognition by Emphasizing Hard Samples in Latent Space

arXiv.org Artificial Intelligence

Deep learning models have become an increasingly preferred option for biometric recognition systems, such as speaker recognition. SincNet, a deep neural network architecture, gained popularity in speaker recognition tasks due to its parameterized sinc functions that allow it to work directly on the speech signal. The original SincNet architecture uses the softmax loss, which may not be the most suitable choice for recognition-based tasks. Such loss functions neither impose inter-class margins nor differentiate between easy and hard training samples. Curriculum learning techniques, particularly those leveraging angular margin-based losses, have proven very successful in other biometric applications such as face recognition. The advantage of such curriculum learning-based techniques is that they impose inter-class margins and take into account easy and hard samples. In this paper, we propose Curricular SincNet (CL-SincNet), an improved SincNet model where we use a curricular loss function to train the SincNet architecture. The proposed model is evaluated on multiple datasets using intra-dataset and inter-dataset evaluation protocols. In both settings, the model performs competitively with other previously published work. In the case of inter-dataset testing, it achieves the best overall results, with a 4% reduction in error rate compared to SincNet and other published work.
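
The abstract does not spell out the loss, so the following is only a minimal sketch of how an angular-margin loss with curriculum-style weighting of hard samples could look. It assumes the CurricularFace-style rule known from face recognition: the target class gets cos(theta + m), and a negative class whose cosine exceeds that value is treated as "hard" and rescaled by (t + cos). The function names, the default `margin` and `scale`, and the way `t` is supplied are illustrative assumptions, not the authors' implementation.

```python
import math


def curricular_logits(cosines, target, margin=0.5, scale=30.0, t=0.0):
    """Scaled logits for one sample under a CurricularFace-style rule (assumed)."""
    cos_target = max(-1.0, min(1.0, cosines[target]))
    # Target class: add an angular margin, cos(theta + m).
    # Simplification: no special handling when theta + m exceeds pi.
    positive = math.cos(math.acos(cos_target) + margin)
    logits = []
    for j, c in enumerate(cosines):
        if j == target:
            logits.append(scale * positive)
        elif c > positive:
            # Hard negative: rescale by (t + c). With small t this shrinks the
            # logit; as t grows during training, hard samples gain emphasis.
            logits.append(scale * c * (t + c))
        else:
            # Easy negative: left unchanged.
            logits.append(scale * c)
    return logits


def cross_entropy(logits, target):
    """Numerically stable softmax cross-entropy for a single sample."""
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[target]
```

In the formulation assumed here, `t` would be tracked during training as a running average of the target-class cosines, so the emphasis on hard samples increases as the model improves.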


Web image search engine based on LSH index and CNN Resnet50

arXiv.org Artificial Intelligence

To implement a good Content Based Image Retrieval (CBIR) system, it is essential to adopt efficient search methods. One way to achieve this result is by exploiting approximate search techniques. In fact, when we deal with very large collections of data, using an exact search method makes the system very slow. In this project, we adopt the Locality Sensitive Hashing (LSH) index to implement a CBIR system that allows us to perform fast similarity search on deep features. Specifically, we exploit transfer learning techniques to extract deep features from images; this phase is done using two famous Convolutional Neural Networks (CNNs) as feature extractors: Resnet50 and Resnet50v2, both pre-trained on ImageNet. Then we try out several fully connected deep neural networks, built on top of both of the previously mentioned CNNs, in order to fine-tune them on our dataset. In both cases, we index the features within our LSH index implementation and within a sequential scan, to better understand how much the introduction of the index affects the results. Finally, we carry out a performance analysis: we evaluate the relevance of the result set, computing the mAP (mean Average Precision) value obtained during the different experiments with respect to the number of comparisons performed and varying the hyper-parameter values of the LSH index.
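
To make the trade-off between an LSH index and a sequential scan concrete, here is a minimal sketch. The abstract does not say which LSH family the project uses, so this assumes random-hyperplane hashing for cosine similarity; the class name `HyperplaneLSH`, its parameters (`n_bits`, `n_tables`), and the helper `sequential_scan` are hypothetical. Plain Python lists stand in for the deep feature vectors.

```python
import math
import random
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class HyperplaneLSH:
    """Random-hyperplane LSH index (an assumed LSH family, for illustration)."""

    def __init__(self, dim, n_bits=8, n_tables=4, seed=0):
        rng = random.Random(seed)
        # One set of n_bits random hyperplanes per hash table.
        self.planes = [
            [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]
            for _ in range(n_tables)
        ]
        self.tables = [defaultdict(list) for _ in range(n_tables)]
        self.vectors = {}

    @staticmethod
    def _key(planes, vector):
        # Each bit records which side of a hyperplane the vector falls on.
        return tuple(
            sum(p * x for p, x in zip(plane, vector)) >= 0 for plane in planes
        )

    def add(self, item_id, vector):
        self.vectors[item_id] = vector
        for planes, table in zip(self.planes, self.tables):
            table[self._key(planes, vector)].append(item_id)

    def query(self, vector, k=5):
        """Return (top-k ids, number of candidates actually compared)."""
        candidates = set()
        for planes, table in zip(self.planes, self.tables):
            candidates.update(table.get(self._key(planes, vector), []))
        ranked = sorted(
            candidates, key=lambda i: cosine(vector, self.vectors[i]), reverse=True
        )
        return ranked[:k], len(candidates)


def sequential_scan(vectors, vector, k=5):
    """Exact baseline: compare the query against every stored vector."""
    ranked = sorted(vectors, key=lambda i: cosine(vector, vectors[i]), reverse=True)
    return ranked[:k]
```

The second value returned by `query` is the number of comparisons performed, which is the quantity a study like this one would plot against mAP. In this sketch, raising `n_bits` shrinks the buckets (fewer comparisons, lower recall), while raising `n_tables` does the opposite.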


Pinterest launches hair pattern search with BIPOC users in mind

Engadget

Pinterest has launched a new search feature that could make it easier for Black, Brown, Indigenous, Latinx and other POC users to find hair inspiration that would suit their hair types. The visual discovery website has introduced hair pattern search, it said, with BIPOC users in mind. This new feature uses computer vision-powered object detection to enable users to refine their searches by six different hair patterns: protective, coily, curly, wavy, straight and shaved/bald. Now, after users search for broader terms like "summer hairstyles," "glam hair" or "short hair," they'll find new hair pattern buttons that will narrow down the results. The feature is now live in the US, UK, Ireland, Canada, Australia and New Zealand on desktop, as well as on iOS and Android. It will roll out to more locations over the coming months.


Coalesced Multi-Output Tsetlin Machines with Clause Sharing

arXiv.org Artificial Intelligence

Using finite-state machines to learn patterns, Tsetlin machines (TMs) have obtained competitive accuracy and learning speed across several benchmarks, with a frugal memory and energy footprint. A TM represents patterns as conjunctive clauses in propositional logic (AND-rules), each clause voting for or against a particular output. While efficient for single-output problems, one needs a separate TM per output for multi-output problems. Employing multiple TMs hinders pattern reuse because each TM then operates in a silo. In this paper, we introduce clause sharing, merging multiple TMs into a single one. Each clause is related to each output by using a weight. A positive weight makes the clause vote for output $1$, while a negative weight makes the clause vote for output $0$. The clauses thus coalesce to produce multiple outputs. The resulting coalesced Tsetlin Machine (CoTM) simultaneously learns both the weights and the composition of each clause by employing interacting Stochastic Searching on the Line (SSL) and Tsetlin Automata (TA) teams. Our empirical results on MNIST, Fashion-MNIST, and Kuzushiji-MNIST show that CoTM obtains significantly higher accuracy than TM on $50$- to $1$K-clause configurations, indicating an ability to repurpose clauses. For example, accuracy goes from $71.99$% to $89.66$% on Fashion-MNIST when employing $50$ clauses per class (22 Kb memory). While TM and CoTM accuracy is similar when using more than $1$K clauses per class, CoTM reaches peak accuracy $3\times$ faster on MNIST with $8$K clauses. We further investigate robustness towards imbalanced training data. Our evaluations on imbalanced versions of IMDb and CIFAR10 data show that CoTM is robust towards high degrees of class imbalance. Because it can share clauses, we believe CoTM will enable new TM application domains that involve multiple outputs, such as learning language models and auto-encoding.
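
The clause-sharing idea can be illustrated with a small inference-only sketch. It covers only the voting step described above (shared AND-clauses, one signed weight per clause and output), not the SSL/TA learning procedure. The clause encoding as `(input_index, negated)` pairs, the function names, and the choice to threshold the vote sum at zero are assumptions made for illustration.

```python
def clause_output(clause, x):
    """Evaluate one conjunctive clause on a binary input vector.

    A clause is a list of (input_index, negated) literals; it outputs 1 only
    if every literal is satisfied (an AND-rule).
    """
    return int(all((x[i] == 0) if negated else (x[i] == 1) for i, negated in clause))


def cotm_predict(clauses, weights, x):
    """Predict every output from one shared pool of clauses.

    weights[o][c] is the signed weight linking clause c to output o:
    positive weights vote for output 1, negative weights for output 0.
    Thresholding the vote sum at zero is an assumption of this sketch.
    """
    fired = [clause_output(clause, x) for clause in clauses]
    outputs = []
    for row in weights:
        votes = sum(w * f for w, f in zip(row, fired))
        outputs.append(1 if votes >= 0 else 0)
    return outputs
```

Because each clause is evaluated once and then reused by every output through its weight row, adding an output costs one extra row of weights rather than a whole new set of clauses.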


NIST SRE CTS Superset: A large-scale dataset for telephony speaker recognition

arXiv.org Artificial Intelligence

This document provides a brief description of the National Institute of Standards and Technology (NIST) speaker recognition evaluation (SRE) conversational telephone speech (CTS) Superset. The CTS Superset has been created in an attempt to provide the research community with a large-scale dataset along with uniform metadata that can be used to effectively train and develop telephony (narrowband) speaker recognition systems. It contains a large number of telephony speech segments from more than 6800 speakers with speech durations distributed uniformly in the [10s, 60s] range. The segments have been extracted from the source corpora used to compile prior SRE datasets (SRE1996-2012), including the Greybeard corpus as well as the Switchboard and Mixer series collected by the Linguistic Data Consortium (LDC). In addition to the brief description, we also report speaker recognition results on the NIST 2020 CTS Speaker Recognition Challenge, obtained using a system trained with the CTS Superset. The results will serve as a reference baseline for the challenge.


HCR-Net: A deep learning based script independent handwritten character recognition network

arXiv.org Artificial Intelligence

Handwritten character recognition (HCR) is a challenging learning problem in pattern recognition, mainly due to similarity in the structure of characters, different handwriting styles, noisy datasets, and a large variety of languages and scripts. The HCR problem has been studied extensively for a few decades, but there is very limited research on script-independent models. This is due to factors such as the diversity of scripts, the focus of most conventional research efforts on handcrafted feature extraction techniques, which are language/script specific and not always available, and the unavailability of public datasets and code to reproduce the results. On the other hand, deep learning has witnessed huge success in different areas of pattern recognition, including HCR, and provides end-to-end learning, i.e., automated feature extraction and recognition. In this paper, we propose a novel deep learning architecture, called HCR-Net, which exploits transfer learning and image augmentation for end-to-end learning for script-independent handwritten character recognition. The network is based on a novel transfer learning approach for HCR, where some of the lower layers of a pre-trained VGG16 network are utilised. Due to transfer learning and image augmentation, HCR-Net provides faster training, better performance and better generalisation. The experimental results on publicly available datasets of Bangla, Punjabi, Hindi, English, Swedish, Urdu, Farsi, Tibetan, Kannada, Malayalam, Telugu, Marathi, Nepali and Arabic languages prove the efficacy of HCR-Net and establish several new benchmarks. For reproducibility of the results and for the advancement of HCR research, the complete code is publicly released on GitHub at https://github.com/jmdvinodjmd/HCR-Net.


Working of Image Recognition Explained

#artificialintelligence

Did you know that the Compound Annual Growth Rate (CAGR) of the image recognition market in the United States of America (USA) was 19.5% between 2016 and 2021? Image recognition is essentially the computer's way of looking at the world (computer vision). Image recognition technology is far from mature, yet its implications are already profound in the real world for both consumers and businesses. Smartphones, the potent computers with incredible cameras and highly accurate sensors that we carry in our pockets, are fueling image recognition technology more than ever before! Image recognition is improving areas such as cybersecurity, gaming, product design, and more.


How is Machine Learning used in Image Recognition?

#artificialintelligence

In the article on Artificial Intelligence, Wikipedia states that: "Artificial Intelligence (AI) is intelligence demonstrated by machines, unlike the intelligence of humans and animals, which involves consciousness and emotionality." Machine Learning (ML), as a subset of Artificial Intelligence (AI), can learn by itself.