
Linguistically-Informed Neural Architectures for Lexical, Syntactic and Semantic Tasks in Sanskrit

Sandhan, Jivnesh

arXiv.org Artificial Intelligence

The primary focus of this thesis is to make Sanskrit manuscripts more accessible to end users through natural language technologies. The morphological richness, compounding, free word order, and low-resource nature of Sanskrit pose significant challenges for developing deep learning solutions. We identify four fundamental tasks that are crucial for developing robust NLP technology for Sanskrit: word segmentation, dependency parsing, compound type identification, and poetry analysis. The first task, Sanskrit Word Segmentation (SWS), is a fundamental text processing step for all downstream applications. However, it is challenging due to the sandhi phenomenon, which modifies characters at word boundaries. Similarly, existing dependency parsing approaches struggle with morphologically rich, low-resource languages like Sanskrit. Compound type identification is also challenging for Sanskrit due to the context-sensitive semantic relation between components. All these challenges result in sub-optimal performance in NLP applications such as question answering and machine translation. Finally, Sanskrit poetry has not been extensively studied in computational linguistics. In addressing these challenges, this thesis makes several contributions: (1) it proposes linguistically-informed neural architectures for these tasks; (2) we showcase the interpretability and multilingual extension of the proposed systems; (3) our proposed systems report state-of-the-art performance; (4) finally, we present SanskritShala, a neural toolkit delivered as a web-based application that provides real-time analysis of input for various NLP tasks. Overall, this thesis contributes to making Sanskrit manuscripts more accessible by developing robust NLP technology and releasing various resources, datasets, and a web-based toolkit.


$\Phi$-DVAE: Physics-Informed Dynamical Variational Autoencoders for Unstructured Data Assimilation

Glyn-Davies, Alex, Duffin, Connor, Akyildiz, Ö. Deniz, Girolami, Mark

arXiv.org Artificial Intelligence

Incorporating unstructured data into physical models is a challenging problem that is emerging in data assimilation. Traditional approaches focus on well-defined observation operators whose functional forms are typically assumed to be known. This prevents these methods from achieving a consistent model-data synthesis in configurations where the mapping from data-space to model-space is unknown. To address these shortcomings, in this paper we develop a physics-informed dynamical variational autoencoder ($\Phi$-DVAE) to embed diverse data streams into time-evolving physical systems described by differential equations. Our approach combines a standard, possibly nonlinear, filter for the latent state-space model with a VAE, in order to assimilate the unstructured data into the latent dynamical system. In our example systems, the unstructured data comes in the form of video data and velocity field measurements; however, the methodology is suitably generic to allow for arbitrary unknown observation operators. A variational Bayesian framework is used for the joint estimation of the encoding, latent states, and unknown system parameters. To demonstrate the method, we provide case studies with the Lorenz-63 ordinary differential equation, and the advection and Korteweg-de Vries partial differential equations. Our results, with synthetic data, show that $\Phi$-DVAE provides a data-efficient dynamics encoding methodology which is competitive with standard approaches. Unknown parameters are recovered with uncertainty quantification, and unseen data are accurately predicted.
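As a concrete anchor for the Lorenz-63 case study mentioned above, here is a minimal sketch of the latent dynamics that such a filter/VAE pair would assimilate against. The integrator (forward Euler) and the parameter values (the classic chaotic defaults) are illustrative assumptions, not necessarily the paper's exact setup.

```python
import numpy as np

def lorenz63_step(state, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """One forward-Euler step of the Lorenz-63 ODE: dx/dt = sigma*(y - x),
    dy/dt = x*(rho - z) - y, dz/dt = x*y - beta*z."""
    x, y, z = state
    deriv = np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])
    return state + dt * deriv

# Roll out a short latent trajectory; in the Phi-DVAE setting, unstructured
# observations (e.g. video frames) would be generated from states like these
# through an unknown observation operator learned by the VAE.
state = np.array([1.0, 1.0, 1.0])
traj = [state]
for _ in range(1000):
    state = lorenz63_step(state)
    traj.append(state)
traj = np.array(traj)
```

A production experiment would use a higher-order integrator (e.g. RK4), but the sketch conveys the shape of the latent system the encoder must be consistent with.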


Fast Newton method solving KLR based on Multilevel Circulant Matrix with log-linear complexity

Zhang, Junna, Zhou, Shuisheng, Fu, Cui, Ye, Feng

arXiv.org Artificial Intelligence

Kernel logistic regression (KLR) is a conventional nonlinear classifier in machine learning. With the explosive growth of data sizes, the storage and computation of large dense kernel matrices is a major challenge in scaling KLR. Even when the Nyström approximation is applied to solve KLR, it still faces a time complexity of $O(nc^2)$ and a space complexity of $O(nc)$, where $n$ is the number of training instances and $c$ is the sampling size. In this paper, we propose a fast Newton method that efficiently solves large-scale KLR problems by exploiting the storage and computing advantages of the multilevel circulant matrix (MCM). Specifically, by approximating the kernel matrix with an MCM, the storage space is reduced to $O(n)$, and by further approximating the coefficient matrix of the Newton equation as an MCM, the computational complexity of each Newton iteration is reduced to $O(n \log n)$. The proposed method runs in log-linear time per iteration because the multiplication of an MCM (or its inverse) with a vector can be implemented via the multidimensional fast Fourier transform (mFFT). Experimental results on several large-scale binary and multi-class classification problems show that the proposed method enables KLR to scale to large problems with less memory consumption and less training time, without sacrificing test accuracy.
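The log-linear cost above rests on the classical identity that a circulant matrix-vector product is a circular convolution, computable with the FFT. Below is a minimal one-level sketch (the multilevel case applies the mFFT analogously); the function name is illustrative, not from the paper.

```python
import numpy as np

def circulant_matvec(c, x):
    """Multiply the circulant matrix whose first column is c by the vector x
    in O(n log n) via the FFT, instead of O(n^2) for the dense product."""
    return np.real(np.fft.ifft(np.fft.fft(c) * np.fft.fft(x)))

# Build the explicit dense circulant matrix C[i, j] = c[(i - j) % n]
# to compare against the FFT-based product.
rng = np.random.default_rng(0)
n = 8
c = rng.standard_normal(n)
x = rng.standard_normal(n)
C = np.array([[c[(i - j) % n] for j in range(n)] for i in range(n)])
```

Solving the Newton equation with an MCM coefficient matrix amounts to applying the same identity with a division in Fourier space, which is why each iteration stays at $O(n \log n)$.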


Three ways to maximize the business value of AI

#artificialintelligence

Business interest in artificial intelligence (AI) has rocketed in recent years -- spending could reach $15.7 trillion by 2030, according to PwC. But there remain lingering concerns that businesses are failing to realize the full value from their investments. Ever since their emergence, AI, machine learning (ML), and data science have all been surrounded by hype. We've been promised technology that will solve our most complex challenges for us and automatically optimize everything from internal processes to customer experiences. Advances are being made every day that promise to transform virtually every aspect of our lives.


Fast Fair Regression via Efficient Approximations of Mutual Information

Steinberg, Daniel, Reid, Alistair, O'Callaghan, Simon, Lattimore, Finnian, McCalman, Lachlan, Caetano, Tiberio

arXiv.org Machine Learning

Most work in algorithmic fairness to date has focused on discrete outcomes, such as deciding whether or not to grant someone a loan. In these classification settings, group fairness criteria such as independence, separation and sufficiency can be measured directly by comparing rates of outcomes between subpopulations. Many important problems, however, require the prediction of a real-valued outcome, such as a risk score or insurance premium. In such regression settings, measuring group fairness criteria is computationally challenging, as it requires estimating information-theoretic divergences between conditional probability density functions. This paper introduces fast approximations of the independence, separation and sufficiency group fairness criteria for regression models from their (conditional) mutual information definitions, and uses these approximations as regularisers to enforce fairness within a regularised risk minimisation framework. Experiments on real-world datasets indicate that, in spite of its superior computational efficiency, our algorithm still displays state-of-the-art accuracy/fairness tradeoffs.
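To make the regularised-risk-minimisation framing concrete, here is a deliberately crude sketch: ridge regression with a first-moment "mean gap" penalty standing in for the paper's mutual-information approximations of the independence criterion. The penalty, all names, and the synthetic data are illustrative assumptions, not the paper's method.

```python
import numpy as np

def fair_ridge(X, y, group, lam=0.1, gamma=1.0, lr=0.01, steps=2000):
    """Gradient descent on ridge regression plus a crude independence
    penalty: the squared gap between the mean predictions of two groups.
    This moment-matching proxy merely illustrates the regularised-risk
    framing; the paper uses mutual-information approximations instead."""
    n, d = X.shape
    w = np.zeros(d)
    g0, g1 = group == 0, group == 1
    for _ in range(steps):
        pred = X @ w
        grad = X.T @ (pred - y) / n + lam * w
        gap = pred[g0].mean() - pred[g1].mean()
        grad += gamma * 2.0 * gap * (X[g0].mean(axis=0) - X[g1].mean(axis=0))
        w -= lr * grad
    return w

def group_gap(w, X, group):
    """Absolute difference of mean predictions between the two groups."""
    pred = X @ w
    return abs(pred[group == 0].mean() - pred[group == 1].mean())

# Synthetic data in which the features (and hence the target) shift with group.
rng = np.random.default_rng(0)
n = 400
group = rng.integers(0, 2, size=n)
X = rng.standard_normal((n, 3)) + group[:, None]
y = X @ np.array([1.0, -0.5, 0.2])
w_base = fair_ridge(X, y, group, gamma=0.0)
w_fair = fair_ridge(X, y, group, gamma=5.0)
```

Turning up `gamma` trades predictive fit for a smaller disparity between groups, which is the accuracy/fairness tradeoff the experiments quantify.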


Finland's airports will soon be run by AI - TechHQ

#artificialintelligence

Delays, overcrowded departure gates, disgruntled passengers -- airports are rarely stress-free or smooth sailing. But airports around the globe with savvy management are increasingly experimenting with data analytics and AI to make the process of flying more attractive for the customers who pass through their terminals. In Finland, Finavia -- the company behind all 21 of the country's airports -- teamed up with advisory firm Fourkind and agency Reaktor to take a look at its airport of Kittilä in Lapland. Finavia found that the airport wasn't able to keep up with the seasonal demands of the country's booming tourism industry. Tourists flocking to see the Northern Lights and catch a glimpse of Santa Claus were causing lengthy delays.


Linear-Time Inference for Pairwise Comparisons with Gaussian-Process Dynamics

Maystre, Lucas, Kristof, Victor, Grossglauser, Matthias

arXiv.org Machine Learning

In many competitive sports and games (such as tennis, basketball, chess and electronic sports), the most useful definition of a competitor's skill is the propensity of that competitor to win against an opponent. It is often difficult to measure this skill explicitly: take basketball for example, where a team's skill depends on the abilities of its players in terms of shooting accuracy, physical fitness and mental preparation, but also on the team's cohesion and coordination, on its strategy, on the enthusiasm of its fans, and on a number of other intangible factors. However, it is easy to observe this skill implicitly through the outcomes of matches. In this setting, probabilistic models of pairwise-comparison outcomes provide an elegant and effective approach to quantifying skill and to predicting future match outcomes given past data. These models, pioneered by Zermelo [1928] in the context of chess (and by Thurstone [1927] in the context of psychophysics), have been studied for almost a century. They posit that each competitor i (i.e., a team or player) is characterized by a latent score s_i ∈ ℝ and that the outcome probabilities of a match between i and j are a function of the difference between their scores.
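One canonical instance of such a model is Bradley-Terry, where the probability that i beats j is a logistic function of the score difference (Thurstone's variant uses a Gaussian CDF instead). A minimal, illustrative sketch:

```python
import math

def win_probability(s_i, s_j):
    """Bradley-Terry win probability: the chance that competitor i beats
    competitor j, as a logistic function of the latent score difference."""
    return 1.0 / (1.0 + math.exp(-(s_i - s_j)))
```

Skill estimation then amounts to inferring the latent scores from observed match outcomes; the paper additionally lets each score evolve over time under a Gaussian-process prior and performs inference in linear time.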


Kalman Temporal Differences

Geist, M., Pietquin, O.

Journal of Artificial Intelligence Research

Because reinforcement learning suffers from a lack of scalability, online value (and Q-) function approximation has received increasing interest over the last decade. This contribution introduces a novel approximation scheme, namely the Kalman Temporal Differences (KTD) framework, that exhibits the following features: sample efficiency, non-linear approximation, non-stationarity handling and uncertainty management. A first KTD-based algorithm is provided for deterministic Markov Decision Processes (MDPs); it produces biased estimates in the case of stochastic transitions. Then the eXtended KTD framework (XKTD), which solves stochastic MDPs, is described. Convergence is analyzed in special cases for both deterministic and stochastic transitions. The related algorithms are evaluated on classical benchmarks and compare favorably to the state of the art while exhibiting the announced features.
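To give the flavour of the framework: with a linear value function and deterministic transitions, the KTD idea amounts to running a Kalman filter over the value-function weights, treating each reward as a noisy linear observation of them. The sketch below shows only that reduced case (the full framework handles non-linear approximation via sigma-point techniques); variable names and noise settings are illustrative.

```python
import numpy as np

def ktd_update(w, P, phi_s, phi_next, r, gamma=0.95, obs_noise=1.0):
    """One Kalman-style update for a linear value function V(s) = w @ phi(s).
    For a deterministic transition s -> s' with reward r, the Bellman
    equation gives r ~= (phi(s) - gamma * phi(s')) @ w, so the reward acts
    as a noisy linear observation of the weight vector."""
    h = phi_s - gamma * phi_next
    innovation = r - h @ w
    s = h @ P @ h + obs_noise        # innovation variance
    k = P @ h / s                    # Kalman gain
    w = w + k * innovation
    P = P - np.outer(k, h @ P)       # posterior weight covariance
    return w, P

# Repeatedly observing the same deterministic transition shrinks both the
# Bellman residual and the weight uncertainty P.
w, P = np.zeros(2), np.eye(2)
phi_s, phi_next = np.array([1.0, 0.0]), np.array([0.0, 1.0])
for _ in range(50):
    w, P = ktd_update(w, P, phi_s, phi_next, r=1.0)
residual = abs(1.0 - (phi_s - 0.95 * phi_next) @ w)
```

The covariance P is what provides the uncertainty management highlighted in the abstract: it can drive exploration or flag states whose values are still poorly known.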