AITopics | Country

Plotting

Country

Teaching Machines to Read and Comprehend

Hermann, Karl Moritz, Kočiský, Tomáš, Grefenstette, Edward, Espeholt, Lasse, Kay, Will, Suleyman, Mustafa, Blunsom, Phil

arXiv.org Artificial IntelligenceNov-19-2015

Teaching machines to read natural language documents remains an elusive challenge. Machine reading systems can be tested on their ability to answer questions posed on the contents of documents that they have seen, but until now large scale training and test datasets have been missing for this type of evaluation. In this work we define a new methodology that resolves this bottleneck and provides large scale supervised reading comprehension data. This allows us to develop a class of attention based deep neural networks that learn to read real documents and answer complex questions with minimal prior knowledge of language structure.

deep learning, neural network, query, (21 more...)

arXiv.org Artificial Intelligence

1506.0334

Country: Europe > United Kingdom > England (0.15)

Industry:

Education (0.88)
Media > News (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Stochastic Expectation Propagation

Li, Yingzhen, Hernandez-Lobato, Jose Miguel, Turner, Richard E.

arXiv.org Machine LearningNov-18-2015

Expectation propagation (EP) is a deterministic approximation algorithm that is often used to perform approximate Bayesian parameter learning. EP approximates the full intractable posterior distribution through a set of local approximations that are iteratively refined for each datapoint. EP can offer analytic and computational advantages over other approximations, such as Variational Inference (VI), and is the method of choice for a number of models. The local nature of EP appears to make it an ideal candidate for performing Bayesian learning on large models in large-scale dataset settings. However, EP has a crucial limitation in this context: the number of approximating factors needs to increase with the number of data-points, N, which often entails a prohibitively large memory overhead. This paper presents an extension to EP, called stochastic expectation propagation (SEP), that maintains a global posterior approximation (like VI) but updates it in a local way (like EP). Experiments on a number of canonical learning problems using synthetic and real-world datasets indicate that SEP performs almost as well as full EP, but reduces the memory consumption by a factor of $N$. SEP is therefore ideally suited to performing approximate Bayesian learning in the large model, large dataset setting.

approximation, bayesian inference, neural network, (17 more...)

arXiv.org Machine Learning

1506.04132

Country:

North America (0.28)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.54)

Add feedback

Enhancements in statistical spoken language translation by de-normalization of ASR results

Wołk, Agnieszka, Wołk, Krzysztof, Marasek, Krzysztof

arXiv.org Machine LearningNov-18-2015

Spoken language translation (SLT) has become very important in an increasingly globalized world. Machine translation (MT) for automatic speech recognition (ASR) systems is a major challenge of great interest. This research investigates that automatic sentence segmentation of speech that is important for enriching speech recognition output and for aiding downstream language processing. This article focuses on the automatic sentence segmentation of speech and improving MT results. We explore the problem of identifying sentence boundaries in the transcriptions produced by automatic speech recognition systems in the Polish language. We also experiment with reverse normalization of the recognized speech samples.

machine translation, speech recognition, translation, (18 more...)

arXiv.org Machine Learning

1511.09392

Country:

Europe (1.00)
North America > United States (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)

Add feedback

Harvesting comparable corpora and mining them for equivalent bilingual sentences using statistical classification and analogy- based heuristics

Wołk, Krzysztof, Rejmund, Emilia, Marasek, Krzysztof

arXiv.org Machine LearningNov-18-2015

Parallel sentences are a relatively scarce but extremely useful resource for many applications including cross-lingual retrieval and statistical machine translation. This research explores our new methodologies for mining such data from previously obtained comparable corpora. The task is highly practical since non-parallel multilingual data exist in far greater quantities than parallel corpora, but parallel sentences are a much more useful resource. Here we propose a web crawling method for building subject-aligned comparable corpora from e.g. Wikipedia dumps and Euronews web page. The improvements in machine translation are shown on Polish-English language pair for various text domains. We also tested another method of building parallel corpora based on comparable corpora data. It lets automatically broad existing corpus of sentences from subject of corpora based on analogies between them.

artificial intelligence, corpora, machine translation, (13 more...)

arXiv.org Machine Learning

doi: 10.1007/978-3-319-25252-0_46

1511.06285

Country:

Europe (1.00)
North America > United States > Oregon (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

On the Global Linear Convergence of Frank-Wolfe Optimization Variants

Lacoste-Julien, Simon, Jaggi, Martin

arXiv.org Machine LearningNov-18-2015

The Frank-Wolfe (FW) optimization algorithm has lately re-gained popularity thanks in particular to its ability to nicely handle the structured constraints appearing in machine learning applications. However, its convergence rate is known to be slow (sublinear) when the solution lies at the boundary. A simple less-known fix is to add the possibility to take 'away steps' during optimization, an operation that importantly does not require a feasibility oracle. In this paper, we highlight and clarify several variants of the Frank-Wolfe optimization algorithm that have been successfully applied in practice: away-steps FW, pairwise FW, fully-corrective FW and Wolfe's minimum norm point algorithm, and prove for the first time that they all enjoy global linear convergence, under a weaker condition than strong convexity of the objective. The constant in the convergence rate has an elegant interpretation as the product of the (classical) condition number of the function with a novel geometric quantity that plays the role of a 'condition number' of the constraint set. We provide pointers to where these algorithms have made a difference in practice, in particular with the flow polytope, the marginal polytope and the base polytope for submodular optimization.

algorithm, artificial intelligence, optimization problem, (17 more...)

arXiv.org Machine Learning

1511.05932

Country: Europe > Switzerland > Zürich > Zürich (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Efficient Output Kernel Learning for Multiple Tasks

Jawanpuria, Pratik, Lapin, Maksim, Hein, Matthias, Schiele, Bernt

arXiv.org Machine LearningNov-18-2015

The paradigm of multi-task learning is that one can achieve better generalization by learning tasks jointly and thus exploiting the similarity between the tasks rather than learning them independently of each other. While previously the relationship between tasks had to be user-defined in the form of an output kernel, recent approaches jointly learn the tasks and the output kernel. As the output kernel is a positive semidefinite matrix, the resulting optimization problems are not scalable in the number of tasks as an eigendecomposition is required in each step. \mbox{Using} the theory of positive semidefinite kernels we show in this paper that for a certain class of regularizers on the output kernel, the constraint of being positive semidefinite can be dropped as it is automatically satisfied for the relaxed problem. This leads to an unconstrained dual problem which can be solved efficiently. Experiments on several multi-task and multi-class data sets illustrate the efficacy of our approach in terms of computational efficiency as well as generalization performance.

artificial intelligence, matrix, optimization problem, (15 more...)

arXiv.org Machine Learning

1511.05706

Country: Europe > Germany > Saarland (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.89)

Add feedback

GAP Safe screening rules for sparse multi-task and multi-class models

Ndiaye, Eugene, Fercoq, Olivier, Gramfort, Alexandre, Salmon, Joseph

arXiv.org Machine LearningNov-18-2015

High dimensional regression benefits from sparsity promoting regularizations. Screening rules leverage the known sparsity of the solution by ignoring some variables in the optimization, hence speeding up solvers. When the procedure is proven not to discard features wrongly the rules are said to be \emph{safe}. In this paper we derive new safe rules for generalized linear models regularized with $\ell_1$ and $\ell_1/\ell_2$ norms. The rules are based on duality gap computations and spherical safe regions whose diameters converge to zero. This allows to discard safely more variables, in particular for low regularization parameters. The GAP Safe rule can cope with any iterative solver and we illustrate its performance on coordinate descent for multi-task Lasso, binary and multinomial logistic regression, demonstrating significant speed ups on all tested datasets with respect to previous safe rules.

neurology, optimization problem, screening rule, (17 more...)

arXiv.org Machine Learning

1506.03736

Country: North America > United States (0.28)

Genre:

Research Report > New Finding (0.50)
Research Report > Experimental Study (0.36)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.36)

Add feedback

Bayesian hypothesis testing for one bit compressed sensing with sensing matrix perturbation

Zayyani, H., Korki, M., Marvasti, F.

arXiv.org Machine LearningNov-18-2015

This letter proposes a low-computational Bayesian algorithm for noisy sparse recovery in the context of one bit compressed sensing with sensing matrix perturbation. The proposed algorithm which is called BHT-MLE comprises a sparse support detector and an amplitude estimator. The support detector utilizes Bayesian hypothesis test, while the amplitude estimator uses an ML estimator which is obtained by solving a convex optimization problem. Simulation results show that BHT-MLE algorithm offers more reconstruction accuracy than that of an ML estimator (MLE) at a low computational cost.

algorithm, artificial intelligence, sparse vector, (14 more...)

arXiv.org Machine Learning

1511.0566

Country: Asia > Middle East > Iran (0.29)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (0.43)

Add feedback

Online learning in repeated auctions

Weed, Jonathan, Perchet, Vianney, Rigollet, Philippe

arXiv.org Machine LearningNov-18-2015

Motivated by online advertising auctions, we consider repeated Vickrey auctions where goods of unknown value are sold sequentially and bidders only learn (potentially noisy) information about a good's value once it is purchased. We adopt an online learning approach with bandit feedback to model this problem and derive bidding strategies for two models: stochastic and adversarial. In the stochastic model, the observed values of the goods are random variables centered around the true value of the good. In this case, logarithmic regret is achievable when competing against well behaved adversaries. In the adversarial model, the goods need not be identical and we simply compare our performance against that of the best fixed bid in hindsight. We show that sublinear regret is also achievable in this case and prove matching minimax lower bounds. To our knowledge, this is the first complete set of strategies for bidders participating in auctions of this type.

auction, computer based training, game theory, (26 more...)

arXiv.org Machine Learning

1511.0572

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)

Genre: Research Report (0.50)

Industry:

Marketing (0.66)
Education > Educational Setting > Online (0.61)
Information Technology > Services (0.48)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.61)
Information Technology > Data Science > Data Mining > Big Data (0.47)

Add feedback

backShift: Learning causal cyclic graphs from unknown shift interventions

Rothenhäusler, Dominik, Heinze, Christina, Peters, Jonas, Meinshausen, Nicolai

arXiv.org Machine LearningNov-18-2015

We propose a simple method to learn linear causal cyclic models in the presence of latent variables. The method relies on equilibrium data of the model recorded under a specific kind of interventions ("shift interventions"). The location and strength of these interventions do not have to be known and can be estimated from the data. Our method, called backShift, only uses second moments of the data and performs simple joint matrix diagonalization, applied to differences between covariance matrices. We give a sufficient and necessary condition for identifiability of the system, which is fulfilled almost surely under some quite general assumptions if and only if there are at least three distinct experimental settings, one of which can be pure observational data. We demonstrate the performance on some simulated data and applications in flow cytometry and financial time series. The code is made available as R-package backShift.

artificial intelligence, health & medicine, intervention, (17 more...)

arXiv.org Machine Learning

1506.02494

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)

Genre: Research Report (0.50)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback