Yang, Weixin
Learning stochastic differential equations using RNN with log signature features
Liao, Shujian, Lyons, Terry, Yang, Weixin, Ni, Hao
This paper contributes to the challenge of learning a function on streamed multimodal data through evaluation. The core of our result is the combination of two quite different approaches to this problem. One comes from the mathematically principled technology of signatures and log-signatures as representations for streamed data, while the other draws on the techniques of recurrent neural networks (RNNs). The ability of the former to manage high-sample-rate streams and of the latter to manage large-scale nonlinear interactions allows hybrid algorithms that are easy to code, quicker to train, and of lower complexity for a given accuracy. We illustrate the approach by approximating the unknown functional as a controlled differential equation. Linear functionals on solutions of controlled differential equations are the natural universal class of functions on data streams. They are mathematically very explicit and interpretable, allow quantitative arguments, and yet are able to approximate any continuous function on streams arbitrarily well. They form the basis of rough path theory. Stochastic differential equations are examples of controlled differential equations in which the controlling stream is a stochastic process. Following this approach, we give a hybrid Logsig-RNN algorithm that learns functionals on streamed data with outstanding performance.
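As a rough illustration of the log-signature representation mentioned in the abstract above, the following sketch computes the signature and log-signature of a piecewise-linear stream, truncated at level 2, and then summarizes a long stream as a short sequence of per-segment log-signature features of the kind a recurrent network could consume. The function names (`signature_level2`, `log_signature_level2`, `logsig_features`), the truncation at level 2, and the equal-length segmentation are assumptions made for this example; they are not taken from the paper.

```python
import numpy as np


def signature_level2(path):
    """Levels 1 and 2 of the signature of a piecewise-linear path.

    path: array-like of shape (n_points, d).
    Returns (s1, s2) with shapes (d,) and (d, d).
    """
    path = np.asarray(path, dtype=float)
    d = path.shape[1]
    s1 = np.zeros(d)
    s2 = np.zeros((d, d))
    for inc in np.diff(path, axis=0):
        # Chen's identity: append a straight segment with increment `inc`,
        # whose own signature is (inc, inc (x) inc / 2).
        s2 = s2 + np.outer(s1, inc) + 0.5 * np.outer(inc, inc)
        s1 = s1 + inc
    return s1, s2


def log_signature_level2(path):
    """Log-signature truncated at level 2.

    Level 1 is the total increment; level 2 is the antisymmetric
    matrix of Levy areas, s2 - (s1 (x) s1) / 2.
    """
    s1, s2 = signature_level2(path)
    return s1, s2 - 0.5 * np.outer(s1, s1)


def logsig_features(path, n_segments):
    """Split the stream into n_segments pieces (assumed roughly equal in
    length) and return one log-signature feature vector per piece.

    Output shape: (n_segments, d + d * (d - 1) // 2).
    """
    path = np.asarray(path, dtype=float)
    d = path.shape[1]
    cuts = np.linspace(0, len(path) - 1, n_segments + 1).astype(int)
    upper = np.triu_indices(d, k=1)  # independent entries of the area matrix
    feats = []
    for a, b in zip(cuts[:-1], cuts[1:]):
        l1, l2 = log_signature_level2(path[a:b + 1])
        feats.append(np.concatenate([l1, l2[upper]]))
    return np.stack(feats)
```

For a stream sampled at many time points, `logsig_features(stream, n_segments)` returns a much shorter sequence of fixed-size vectors; feeding that sequence to any standard RNN layer is one way to read the hybrid idea the abstract describes.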
DeepWriterID: An End-to-end Online Text-independent Writer Identification System
Yang, Weixin, Jin, Lianwen, Liu, Manfei
Owing to the rapid growth of touchscreen mobile terminals and pen-based interfaces, handwriting-based writer identification systems are attracting increasing attention for personal authentication, digital forensics, and other applications. However, most studies on writer identification have not been satisfactory because of insufficient data and the difficulty of designing good features under varying handwriting conditions. Hence, we introduce an end-to-end system, DeepWriterID, which employs a deep convolutional neural network (CNN) to address these problems. A key feature of DeepWriterID is a new method we propose, called DropSegment. It is designed to achieve data augmentation and improve the generalized applicability of the CNN. For sufficient feature representation, we further introduce path signature feature maps to improve performance. Experiments were conducted on the NLPR handwriting database. Even though we use only pen-position information in the pen-down state of the given handwriting samples, we achieved new state-of-the-art identification rates of 95.72% for Chinese text and 98.51% for English text.
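The abstract above does not say how DropSegment works, so the sketch below is only one plausible reading of the name: an augmentation that randomly removes pen-down segments from an online handwriting sample so that each pass yields a different variant. The function name `drop_segments`, the `drop_prob` parameter, and the rule of always keeping at least one segment are all assumptions made for this example and should not be taken as the authors' procedure.

```python
import numpy as np


def drop_segments(strokes, drop_prob=0.3, rng=None):
    """Hypothetical segment-dropping augmentation (not the published method).

    strokes: list of arrays, each of shape (n_points, 2), holding the
             pen positions of one pen-down segment.
    drop_prob: probability of discarding each segment independently.
    rng: optional numpy.random.Generator for reproducibility.

    Returns a new list with the surviving segments in their original order.
    """
    if len(strokes) == 0:
        return []
    rng = np.random.default_rng() if rng is None else rng
    keep = rng.random(len(strokes)) >= drop_prob
    if not keep.any():
        # Assumption: never return an empty sample; keep one random segment.
        keep[rng.integers(len(strokes))] = True
    return [s for s, k in zip(strokes, keep) if k]
```

Calling `drop_segments(sample, rng=np.random.default_rng(0))` several times with different seeds would produce several training variants from a single handwriting sample.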