length representation
Building a Deep Image Search Engine using tf.Keras
Imagine having a collection of hundreds of thousands to millions of images without any metadata describing the content of each image. How can we build a system that finds the subset of those images that best answers a user's search query? What we need is a search engine that ranks image results by how well they correspond to the search query, which can be expressed either in natural language or as another query image. The way we will solve the problem in this post is by training a deep neural model that learns a fixed-length representation (or embedding) of any input image or text, such that those representations are close in Euclidean space when the text-image or image-image pair is "similar". I could not find a search-result ranking dataset that is big enough, but I was able to get this dataset: http://jmcauley.ucsd.edu/data/amazon/
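To make the ranking idea concrete, here is a minimal sketch of the retrieval step only, assuming the embeddings have already been produced by some trained model. The file names, the 3-dimensional vectors, and the `rank_by_distance` helper are hypothetical and chosen for illustration; they do not come from the post.

```python
import math


def rank_by_distance(query, items):
    """Return item ids sorted from closest to farthest from the query embedding."""
    # math.dist computes the Euclidean distance between two equal-length vectors.
    scored = [(math.dist(query, emb), item_id) for item_id, emb in items.items()]
    scored.sort()
    return [item_id for _, item_id in scored]


# Hypothetical embeddings; a real model would output much longer vectors.
catalog = {
    "red_dress.jpg": [0.9, 0.1, 0.0],
    "blue_jeans.jpg": [0.1, 0.8, 0.2],
    "red_skirt.jpg": [0.8, 0.2, 0.1],
}

# Hypothetical embedding of a text or image query.
query_embedding = [1.0, 0.0, 0.0]

print(rank_by_distance(query_embedding, catalog))
```

Because text and images share one embedding space in this setup, the same ranking function would serve both text queries and image queries. For millions of images, a brute-force scan like this would likely be replaced by an approximate nearest-neighbor index.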
Learning to recognize touch gestures: recurrent vs. convolutional features and dynamic sampling
Debard, Quentin, Wolf, Christian, Canu, Stéphane, Arné, Julien
Abstract: We propose a fully automatic method for learning gestures on big touch devices in a potentially multi-user context. The goal is to learn general models capable of adapting to different gestures, user styles and hardware variations (e.g. [...]). Based on deep neural networks, our method features a novel dynamic sampling and temporal normalization component, transforming variable-length gestures into fixed-length representations while preserving finger/surface contact transitions, that is, the topology of the signal. This sequential representation is then processed with a convolutional model capable, unlike recurrent networks, of learning hierarchical representations with different levels of abstraction. To demonstrate the interest of the proposed method, we introduce a new touch gestures dataset with 6591 gestures performed by 27 people, which is, to our knowledge, the first of its kind: a publicly available multi-touch gesture dataset for interaction. We also tested our method on a standard dataset of symbolic touch gesture recognition, the MMG dataset, outperforming the state of the art and reporting close to perfect performance.
I. INTRODUCTION
Touch screen technology has been widely integrated into many different devices for about a decade, becoming a major interface with use cases ranging from smartphones to big touch tables. Starting with simple interactions, such as taps or single-touch gestures, we are now using these interfaces to perform more and more complex actions, involving multiple touches and/or multiple users. While simple interactions do not require complicated engineering to perform well, advanced manipulations such as navigating through a 3D model or designing a document in parallel with different users still call for easier and better interactions.
To date, different methods and frameworks for touch gesture recognition have been developed (see for instance [15], [28] and [7] for reviews). These methods define a specific model for the class, and it is up to the user to execute the correct protocol.
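As an illustration of what turning a variable-length gesture into a fixed-length representation can look like, the sketch below resamples a single-finger trajectory to a fixed number of points using plain uniform linear interpolation. This is not the dynamic sampling described in the abstract, which additionally preserves finger/surface contact transitions; the `resample` helper and the sample trajectory are hypothetical.

```python
def resample(points, target_len):
    """Linearly interpolate a list of (x, y) points to exactly target_len samples."""
    if target_len < 2 or len(points) < 2:
        raise ValueError("need at least two input points and two output samples")
    last = len(points) - 1
    out = []
    for i in range(target_len):
        # Fractional position of output sample i along the input sequence.
        pos = i * last / (target_len - 1)
        lo = int(pos)
        hi = min(lo + 1, last)
        t = pos - lo
        x = points[lo][0] + t * (points[hi][0] - points[lo][0])
        y = points[lo][1] + t * (points[hi][1] - points[lo][1])
        out.append((x, y))
    return out


# Hypothetical 3-point trajectory stretched to 5 samples.
gesture = [(0, 0), (2, 2), (4, 4)]
fixed = resample(gesture, 5)
print(fixed)
```

Uniform resampling like this treats the signal as one continuous stroke, so it would blur the instants at which fingers touch or leave the surface; that limitation is the kind of topology loss the abstract says its dynamic sampling is designed to avoid.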