 Pattern Recognition


An improved uncertainty propagation method for robust i-vector based speaker recognition

arXiv.org Artificial Intelligence

The performance of automatic speaker recognition systems degrades when facing distorted speech data containing additive noise and/or reverberation. Statistical uncertainty propagation has been introduced as a promising paradigm to address this challenge. So far, different uncertainty propagation methods have been proposed to compensate for noise and reverberation in i-vectors in the context of speaker recognition. They have achieved promising results on small datasets such as YOHO and Wall Street Journal, but little or no improvement on the larger, highly variable NIST Speaker Recognition Evaluation (SRE) corpus. In this paper, we propose a complete uncertainty propagation method, whereby we model the effect of uncertainty both in the computation of unbiased Baum-Welch statistics and in the derivation of the posterior expectation of the i-vector. We conduct experiments on the NIST-SRE corpus mixed with real domestic noise and reverberation from the CHiME-2 corpus and preprocessed by multichannel speech enhancement. The proposed method improves the equal error rate (EER) by 4% relative compared to a conventional i-vector based speaker verification baseline. By contrast, previous methods degrade performance.
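For readers unfamiliar with the metric, the equal error rate is the operating point at which the false acceptance rate equals the false rejection rate, and a "relative" improvement is measured as a fraction of the baseline EER. The following is a minimal Python sketch of both ideas. The helper names `equal_error_rate` and `relative_improvement` and all scores are hypothetical and chosen for illustration only; none of them come from the paper.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    target_scores: scores of genuine (same-speaker) trials.
    nontarget_scores: scores of impostor (different-speaker) trials.
    A trial is accepted when its score is >= the threshold.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, best_eer = float("inf"), None
    for t in thresholds:
        # False rejection rate: genuine trials that fall below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # False acceptance rate: impostor trials at or above the threshold.
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            # Report the midpoint of the two rates where they are closest.
            best_gap, best_eer = gap, (far + frr) / 2
    return best_eer


def relative_improvement(baseline_eer, new_eer):
    """Fraction by which new_eer improves on baseline_eer."""
    return (baseline_eer - new_eer) / baseline_eer


# Illustrative scores only (not from the paper).
genuine = [0.9, 0.8, 0.7, 0.4]
impostor = [0.1, 0.2, 0.3, 0.6]
eer = equal_error_rate(genuine, impostor)
```

Under this definition, a 4% relative improvement would correspond, for example, to an EER moving from a hypothetical 10.0% to 9.6%; the abstract does not state the absolute values.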


Automated Machine Learning: is it the Holy Grail? - AnalyticsWeek

#artificialintelligence

Machine learning is in the ascendancy. Particularly when it comes to pattern recognition, machine learning is the method of choice. Tangible examples of its applications include fraud detection, image recognition, predictive maintenance, and train delay prediction systems. In day-to-day machine learning (ML) and the quest to deploy the knowledge gained, we typically encounter three main problems (though they are not the only ones). Data Quality: Data from multiple sources across multiple time frames can be difficult to collate into clean and coherent data sets that will yield the maximum benefit from machine learning.


AI is reinventing the way we invent

#artificialintelligence

Amgen's drug discovery group is a few blocks beyond that. Until recently, Barzilay, one of the world's leading researchers in artificial intelligence, hadn't given much thought to these nearby buildings full of chemists and biologists. But as AI and machine learning began to perform ever more impressive feats in image recognition and language comprehension, she began to wonder: could it also transform the task of finding new drugs? The problem is that human researchers can explore only a tiny slice of what is possible. It's estimated that there are as many as 10^60 potentially drug-like molecules--more than the number of atoms in the solar system. But traversing seemingly unlimited possibilities is what machine learning is good at. Trained on large databases of existing molecules and their properties, the programs can explore all possible related molecules.


Graph-RISE: Graph-Regularized Image Semantic Embedding

arXiv.org Machine Learning

Learning image representations to capture fine-grained semantics has been a challenging and important task enabling many applications such as image search and clustering. In this paper, we present Graph-Regularized Image Semantic Embedding (Graph-RISE), a large-scale neural graph learning framework that allows us to train embeddings to discriminate an unprecedented O(40M) ultra-fine-grained semantic labels. Graph-RISE outperforms state-of-the-art image embedding algorithms on several evaluation tasks, including image classification and triplet ranking. We provide case studies to demonstrate that, qualitatively, image retrieval based on Graph-RISE effectively captures semantics and, compared to the state-of-the-art, differentiates nuances at levels that are closer to human-perception.


WhatsApp iPhone update: How to enable facial or fingerprint recognition to keep your chats safe

The Independent - Tech

WhatsApp's recently introduced security feature could be the key to keeping secret messages safe and secure. The company recently unveiled biometric tools in the iPhone version of the app which mean that the phone will check you're the right person before allowing you in. The setting means that you can only open WhatsApp if you have the right fingerprint or face, just like when you unlock your phone. It means that anyone who shares their phone around or is likely to have it unlocked can keep messages secret, even if other apps aren't locked up. The feature involves a slight trade-off: it's much harder for anyone to get into your chats, but it's also a little harder for you to do so, too.


The Long and the Short of It: Summarising Event Sequences with Serial Episodes

arXiv.org Artificial Intelligence

An ideal outcome of pattern mining is a small set of informative patterns, containing no redundancy or noise, that identifies the key structure of the data at hand. Standard frequent pattern miners do not achieve this goal because, owing to the pattern explosion, they typically return very large numbers of highly redundant patterns. We pursue the ideal for sequential data by employing a pattern set mining approach, in which, instead of ranking patterns individually, we consider results as a whole. Pattern set mining has been successfully applied to transactional data, but has been surprisingly understudied for sequential data. In this paper, we employ the MDL principle to identify the set of sequential patterns that summarises the data best. In particular, we formalise how to encode sequential data using sets of serial episodes, and use the encoded length as a quality score. As a search strategy, we propose two approaches: the first algorithm selects a good pattern set from a large candidate set, while the second is a parameter-free any-time algorithm that mines pattern sets directly from the data. Experimentation on synthetic and real data demonstrates that we efficiently discover small sets of informative patterns.


Why CAPTCHAs have gotten so difficult

#artificialintelligence

At some point last year, Google's constant requests to prove I'm human began to feel increasingly aggressive. More and more, the simple, slightly too-cute button saying "I'm not a robot" was followed by demands to prove it -- by selecting all the traffic lights, crosswalks, and storefronts in an image grid. Soon the traffic lights were buried in distant foliage, the crosswalks warped and half around a corner, the storefront signage blurry and in Korean. There's something uniquely dispiriting about being asked to identify a fire hydrant and struggling at it. These tests are called CAPTCHA, an acronym for Completely Automated Public Turing test to tell Computers and Humans Apart, and they've reached this sort of inscrutability plateau before. In the early 2000s, simple images of text were enough to stump most spambots.


Hyperbox based machine learning algorithms: A comprehensive survey

arXiv.org Machine Learning

With the rapid development of digital information, the data volume generated by humans and machines is growing exponentially. Along with this trend, machine learning algorithms have been developed and have evolved continuously to discover new information and knowledge from different data sources. Learning algorithms using hyperboxes as fundamental representational and building blocks are a branch of machine learning methods. These algorithms have enormous potential for high scalability and online adaptation of predictors built using hyperbox data representations to dynamically changing environments and streaming data. This paper aims to give a comprehensive survey of the literature on hyperbox-based machine learning models. In general, according to the architecture and characteristic features of the resulting models, the existing hyperbox-based learning algorithms may be grouped into three major categories: fuzzy min-max neural networks, hyperbox-based hybrid models, and other algorithms based on hyperbox representation. Within each of these groups, this paper gives a brief description of the structure of the models, the associated learning algorithms, and an analysis of their advantages and drawbacks. The main applications of these hyperbox-based models to real-world problems are also described in this paper. Finally, we discuss some open problems and identify potential future research directions in this field.
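To make the term concrete: a hyperbox is an axis-aligned box in feature space, described by a lower and an upper corner. The Python sketch below shows only crisp containment and the simplest possible online expansion step. It is a simplified illustration under that assumption, not any particular algorithm from the survey (fuzzy min-max networks, for instance, use graded membership functions and constrain how far a box may grow). The class name `Hyperbox` and its methods are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hyperbox:
    """Axis-aligned box given by its lower and upper corner points."""
    lower: List[float]
    upper: List[float]

    def contains(self, x: List[float]) -> bool:
        # A point is inside when every coordinate lies within the box bounds.
        return all(lo <= xi <= hi
                   for lo, xi, hi in zip(self.lower, x, self.upper))

    def expand(self, x: List[float]) -> None:
        # Grow the box just enough to cover x (no size limit in this sketch).
        self.lower = [min(lo, xi) for lo, xi in zip(self.lower, x)]
        self.upper = [max(hi, xi) for hi, xi in zip(self.upper, x)]


box = Hyperbox(lower=[0.2, 0.2], upper=[0.4, 0.4])
inside_before = box.contains([0.5, 0.3])   # point lies outside the box
box.expand([0.5, 0.3])                     # online update with the new sample
inside_after = box.contains([0.5, 0.3])    # point is now covered
```

Because an update touches only the two corner vectors, each new sample can be absorbed without revisiting earlier data, which is the property behind the online adaptation the abstract mentions.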


5 Steps to Get Started with AI in the Enterprise

#artificialintelligence

Plenty has been written about artificial intelligence (AI) and its game-changing potential. But, like many technologies before it, AI poses an important question. Are the potential benefits enough to get the enterprise on board? And if so, why is everyone so nervous about getting going? It will only take a quick search on your favorite search engine or social networking site before you are sufficiently overwhelmed by the multitude of views on AI, cognitive and automation technologies.


Learned Indexes for Dynamic Workloads

arXiv.org Artificial Intelligence

The recent proposal of learned index structures opens up a new perspective on how traditional range indexes can be optimized. However, current learned indexes assume that the data distribution is relatively static and the access pattern is uniform, while real-world scenarios involve skewed query distributions and evolving data. In this paper, we demonstrate that the lack of consideration of access patterns and dynamic data distributions notably hinders the applicability of learned indexes. To this end, we propose solutions for learned indexes for dynamic workloads (called Doraemon). To improve the latency for skewed queries, Doraemon augments the training data with access frequencies. To address the slow model re-training when the data distribution shifts, Doraemon caches the previously trained models and incrementally fine-tunes them for similar access patterns and data distributions. Our preliminary results show that Doraemon improves the query latency by 45.1% and reduces the model re-training time to 1/20.
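As background, the general idea of a learned index is to replace tree traversal with a model that predicts where a key sits in a sorted array, followed by a short search within the model's known error bound. The Python sketch below illustrates that general idea with a single least-squares line over unique keys. It is not Doraemon and omits everything the abstract describes (access-frequency augmentation, model caching, fine-tuning); the class name `LinearLearnedIndex` is hypothetical.

```python
import bisect


class LinearLearnedIndex:
    """Toy learned index: one linear model plus a bounded local search.

    Assumes sorted_keys is non-empty, sorted, and contains unique numbers.
    """

    def __init__(self, sorted_keys):
        self.keys = list(sorted_keys)
        n = len(self.keys)
        # Fit position ~ slope * key + intercept by ordinary least squares.
        mean_k = sum(self.keys) / n
        mean_p = (n - 1) / 2
        var = sum((k - mean_k) ** 2 for k in self.keys)
        cov = sum((k - mean_k) * (p - mean_p)
                  for p, k in enumerate(self.keys))
        self.slope = cov / var if var else 0.0
        self.intercept = mean_p - self.slope * mean_k
        # Worst-case prediction error over the stored keys bounds the search.
        self.max_err = max(abs(self._predict(k) - p)
                           for p, k in enumerate(self.keys))

    def _predict(self, key):
        return int(round(self.slope * key + self.intercept))

    def lookup(self, key):
        """Return the position of key, or None if it is not stored."""
        n = len(self.keys)
        guess = self._predict(key)
        # Clamp the search window to the array and keep lo <= hi.
        lo = min(max(0, guess - self.max_err), n)
        hi = max(lo, min(n, guess + self.max_err + 1))
        i = bisect.bisect_left(self.keys, key, lo, hi)
        return i if i < n and self.keys[i] == key else None


keys = [2, 3, 5, 8, 13, 21, 34, 55]
index = LinearLearnedIndex(keys)
positions = [index.lookup(k) for k in keys]
```

The error bound is computed once from the data the model was fitted on, so it stops being valid as soon as keys are inserted or the distribution drifts. That is the kind of limitation under dynamic workloads that the abstract sets out to address.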