### On Stacked Denoising Autoencoder based Pre-training of ANN for Isolated Handwritten Bengali Numerals Dataset Recognition

This work attempts to find the most optimal parameter setting of a deep artificial neural network (ANN) for Bengali digit dataset by pre-training it using stacked denoising autoencoder (SDA). Although SDA based recognition is hugely popular in image, speech and language processing related tasks among the researchers, it was never tried in Bengali dataset recognition. For this work, a dataset of 70000 handwritten samples were used from (Chowdhury and Rahman, 2016) and was recognized using several settings of network architecture. Among all these settings, the most optimal setting being found to be five or more deeper hidden layers with sigmoid activation and one output layer with softmax activation. We proposed the optimal number of neurons that can be used in the hidden layer is 1500 or more. The minimum validation error found from this work is 2.34% which is the lowest error rate on handwritten Bengali dataset proposed till date.

### Visual Learning of Arithmetic Operation

A simple Neural Network model is presented for end-to-end visual learning of arithmetic operations from pictures of numbers. The input consists of two pictures, each showing a 7-digit number. The output, also a picture, displays the number showing the result of an arithmetic operation (e.g., addition or subtraction) on the two input numbers. The concepts of a number, or of an operator, are not explicitly introduced. This indicates that addition is a simple cognitive task, which can be learned visually using a very small number of neurons. Other operations, e.g., multiplication, were not learnable using this architecture. Some tasks were not learnable end-to-end (e.g., addition with Roman numerals), but were easily learnable once broken into two separate sub-tasks: a perceptual Character Recognition and cognitive Arithmetic sub-tasks. This indicates that while some tasks may be easily learnable end-to-end, other may need to be broken into sub-tasks.

### Visual Learning of Arithmetic Operations

A simple Neural Network model is presented for end-to-end visual learning of arithmetic operations from pictures of numbers. The input consists of two pictures, each showing a 7-digit number. The output, also a picture, displays the number showing the result of an arithmetic operation (e.g., addition or subtraction) on the two input numbers. The concepts of a number, or of an operator, are not explicitly introduced. This indicates that addition is a simple cognitive task, which can be learned visually using a very small number of neurons. Other operations, e.g., multiplication, were not learnable using this architecture. Some tasks were not learnable end-to-end (e.g., addition with Roman numerals), but were easily learnable once broken into two separate sub-tasks: a perceptual \textit{Character Recognition} and cognitive \textit{Arithmetic} sub-tasks. This indicates that while some tasks may be easily learnable end-to-end, other may need to be broken into sub-tasks.

In this paper, we disseminate a new handwritten digits-dataset, termed Kannada-MNIST, for the Kannada script, that can potentially serve as a direct drop-in replacement for the original MNIST dataset. In addition to this dataset, we disseminate an additional real world handwritten dataset (with $10k$ images), which we term as the Dig-MNIST dataset that can serve as an out-of-domain test dataset. We also duly open source all the code as well as the raw scanned images along with the scanner settings so that researchers who want to try out different signal processing pipelines can perform end-to-end comparisons. We provide high level morphological comparisons with the MNIST dataset and provide baselines accuracies for the dataset disseminated. The initial baselines obtained using an oft-used CNN architecture ($96.8\%$ for the main test-set and $76.1\%$ for the Dig-MNIST test-set) indicate that these datasets do provide a sterner challenge with regards to generalizability than MNIST or the KMNIST datasets. We also hope this dissemination will spur the creation of similar datasets for all the languages that use different symbols for the numeral digits.