AITopics

Bello, Kevin, Honorio, Jean

Exact inference in structured prediction

Neural Information Processing SystemsMar-18-2020, 22:00:50 GMT

Structured prediction can be thought of as a simultaneous prediction of multiple labels. This is often done by maximizing a score function on the space of labels, which decomposes as a sum of pairwise and unary potentials. The above is naturally modeled with a graph, where edges and vertices are related to pairwise and unary potentials, respectively. We consider the generative process proposed by Globerson et al. (2015) and apply it to general connected graphs. We analyze the structural conditions of the graph that allow for the exact recovery of the labels.

exact inference, graph, prediction, (2 more...)

Neural Information Processing Systems

Genre: Research Report (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.66)

Touati, Ahmed, Taiga, Adrien Ali, Bellemare, Marc G.

Zooming for Efficient Model-Free Reinforcement Learning in Metric Spaces

arXiv.org Machine LearningMar-9-2020

Despite the wealth of research into provably efficient reinforcement learning algorithms, most works focus on tabular representation and thus struggle to handle exponentially or infinitely large state-action spaces. In this paper, we consider episodic reinforcement learning with a continuous state-action space which is assumed to be equipped with a natural metric that characterizes the proximity between different states and actions. We propose ZoomRL, an online algorithm that leverages ideas from continuous bandits to learn an adaptive discretization of the joint space by zooming in more promising and frequently visited regions while carefully balancing the exploitation-exploration trade-off. We show that ZoomRL achieves a worst-case regret $\tilde{O}(H^{\frac{5}{2}} K^{\frac{d+1}{d+2}})$ where $H$ is the planning horizon, $K$ is the number of episodes and $d$ is the covering dimension of the space with respect to the metric. Moreover, our algorithm enjoys improved metric-dependent guarantees that reflect the geometry of the underlying space. Finally, we show that our algorithm is robust to small misspecification errors.

algorithm, inequality follow, rad, (16 more...)

2003.04069

Country:

North America > Canada > Quebec > Montreal (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.41)

Palagin, Oleksandr, Velychko, Vitalii, Malakhov, Kyrylo, Shchurov, Oleksandr

Distributional semantic modeling: a revised technique to train term/word vector space models applying the ontology-related approach

arXiv.org Artificial IntelligenceMar-6-2020

We design a new technique for the distributional semantic modeling with a neural network-based approach to learn distributed term representations (or term embeddings) - term vector space models as a result, inspired by the recent ontology-related approach (using different types of contextual knowledge such as syntactic knowledge, terminological knowledge, semantic knowledge, etc.) to the identification of terms (term extraction) and relations between them (relation extraction) called semantic pre-processing technology - SPT. Our method relies on automatic term extraction from the natural language texts and subsequent formation of the problem-oriented or application-oriented (also deeply annotated) text corpora where the fundamental entity is the term (includes non-compositional and compositional terms). This gives us an opportunity to changeover from distributed word representations (or word embeddings) to distributed term representations (or term embeddings). This transition will allow to generate more accurate semantic maps of different subject domains (also, of relations between input terms - it is useful to explore clusters and oppositions, or to test your hypotheses about them). The semantic map can be represented as a graph using Vec2graph - a Python library for visualizing word embeddings (term embeddings in our case) as dynamic and interactive graphs. The Vec2graph library coupled with term embeddings will not only improve accuracy in solving standard NLP tasks, but also update the conventional concept of automated ontology development. The main practical result of our work is the development kit (set of toolkits represented as web service APIs and web application), which provides all necessary routines for the basic linguistic pre-processing and the semantic pre-processing of the natural language texts in Ukrainian for future training of term vector space models.

representation, space model, vector space model, (16 more...)

2003.0335

Country:

North America > United States > Georgia > Fulton County > Atlanta (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
Europe > Ukraine > Kyiv Oblast > Kyiv (0.04)
(5 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.85)

Tulabandhula, Theja, Sinha, Deeksha

Optimizing Revenue while showing Relevant Assortments at Scale

arXiv.org Artificial IntelligenceMar-6-2020

Scalable real-time assortment optimization has become essential in e-commerce operations due to the need for personalization and the availability of a large variety of items. While this can be done when there are simplistic assortment choices to be made, imposing constraints on the collection of feasible assortments gives more flexibility to incorporate insights of store-managers and historically well-performing assortments. We design fast and flexible algorithms based on variations of binary search that find the revenue of the (approximately) optimal assortment. In particular, we revisit the problem of large-scale assortment optimization under the multinomial logit choice model without any assumptions on the structure of the feasible assortments. We speed up the comparisons steps using novel vector space embeddings, based on advances in the fields of information retrieval and machine learning. For an arbitrary collection of assortments, our algorithms can find a solution in time that is sub-linear in the number of assortments and for the simpler case of cardinality constraints - linear in the number of items (existing methods are quadratic or worse). Empirical validations using the Billion Prices dataset and several retail transaction datasets show that our algorithms are competitive even when the number of items is $\sim 10^5$ ($100$x larger instances than previously studied).

algorithm, assort-mnl, assortment, (17 more...)

2003.04736

Country:

South America (0.04)
North America > United States > Massachusetts (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
North America > Central America (0.04)

Genre: Research Report > Promising Solution (0.48)

Industry:

Retail (0.66)
Transportation (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)
(2 more...)

arXiv.org Machine LearningMar-2-2020

Semiparametric Nonlinear Bipartite Graph Representation Learning with Provable Guarantees

Na, Sen, Luo, Yuwei, Yang, Zhuoran, Wang, Zhaoran, Kolar, Mladen

Graph representation learning is a ubiquitous task in machine learning where the goal is to embed each vertex into a low-dimensional vector space. We consider the bipartite graph and formalize its representation learning problem as a statistical estimation problem of parameters in a semiparametric exponential family distribution. The bipartite graph is assumed to be generated by a semiparametric exponential family distribution, whose parametric component is given by the proximity of outputs of two one-layer neural networks, while nonparametric (nuisance) component is the base measure. Neural networks take high-dimensional features as inputs and output embedding vectors. In this setting, the representation learning problem is equivalent to recovering the weight matrices. The main challenges of estimation arise from the nonlinearity of activation functions and the nonparametric nuisance component of the distribution. To overcome these challenges, we propose a pseudo-likelihood objective based on the rank-order decomposition technique and focus on its local geometry. We show that the proposed objective is strongly convex in a neighborhood around the ground truth, so that a gradient descent-based method achieves linear convergence rate. Moreover, we prove that the sample complexity of the problem is linear in dimensions (up to logarithmic factors), which is consistent with parametric Gaussian models. However, our estimator is robust to any model misspecification within the exponential family, which is validated in extensive experiments.

diag, exp, representation, (9 more...)

2003.01013

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > New Jersey > Mercer County > Princeton (0.04)
North America > United States > Illinois > Cook County > Evanston (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.45)

Industry:

Health & Medicine (0.45)
Education > Focused Education > Special Education (0.44)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.45)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.34)

Collins, Benoît, Kumari, Sushma, Pestov, Vladimir G.

Universal consistency of the $k$-NN rule in metric spaces and Nagata dimension

arXiv.org Machine LearningFeb-28-2020

The $k$ nearest neighbour learning rule (under the uniform distance tie breaking) is universally consistent in every metric space $X$ that is sigma-finite dimensional in the sense of Nagata. This was pointed out by C\'erou and Guyader (2006) as a consequence of the main result by those authors, combined with a theorem in real analysis sketched by D. Preiss (1971) (and elaborated in detail by Assouad and Quentin de Gromard (2006)). We show that it is possible to give a direct proof along the same lines as the original theorem of Charles J. Stone (1977) about the universal consistency of the $k$-NN classifier in the finite dimensional Euclidean space. The generalization is non-trivial because of the distance ties being more prevalent in the non-euclidean setting, and on the way we investigate the relevant geometric properties of the metrics and the limitations of the Stone argument, by constructing various examples.

dimension, nagata dimension, nearest neighbour, (14 more...)

2003.00894

Country:

Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.04)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
South America > Brazil > Bahia > Salvador (0.04)
(3 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.76)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.74)

arXiv.org Machine LearningFeb-28-2020

Query2box: Reasoning over Knowledge Graphs in Vector Space using Box Embeddings

Ren, Hongyu, Hu, Weihua, Leskovec, Jure

Answering complex logical queries on large-scale incomplete knowledge graphs (KGs) is a fundamental yet challenging task. Recently, a promising approach to this problem has been to embed KG entities as well as the query into a vector space such that entities that answer the query are embedded close to the query. However, prior work models queries as single points in the vector space, which is problematic because a complex query represents a potentially large set of its answer entities, but it is unclear how such a set can be represented as a single point. Furthermore, prior work can only handle queries that use conjunctions ($\wedge$) and existential quantifiers ($\exists$). Handling queries with logical disjunctions ($\vee$) remains an open problem. Here we propose query2box, an embedding-based framework for reasoning over arbitrary queries with $\wedge$, $\vee$, and $\exists$ operators in massive and incomplete KGs. Our main insight is that queries can be embedded as boxes (i.e., hyper-rectangles), where a set of points inside the box corresponds to a set of answer entities of the query. We show that conjunctions can be naturally represented as intersections of boxes and also prove a negative result that handling disjunctions would require embedding with dimension proportional to the number of KG entities. However, we show that by transforming queries into a Disjunctive Normal Form, query2box is capable of handling arbitrary logical queries with $\wedge$, $\vee$, $\exists$ in a scalable manner. We demonstrate the effectiveness of query2box on three large KGs and show that query2box achieves up to 25% relative improvement over the state of the art.

conjunctive query, query, query structure, (17 more...)

2002.05969

Country:

North America > Canada (0.05)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.83)
(2 more...)

Malinin, Andrey, Gales, Mark

Uncertainty in Structured Prediction

arXiv.org Artificial IntelligenceFeb-28-2020

Uncertainty estimation is important for ensuring safety and robustness of AI systems, especially for high-risk applications. While much progress has recently been made in this area, most research has focused on un-structured prediction, such as image classification and regression tasks. However, while task-specific forms of confidence score estimation have been investigated by the speech and machine translation communities, limited work has investigated general uncertainty estimation approaches for structured prediction. Thus, this work aims to investigate uncertainty estimation for structured prediction tasks within a single unified and interpretable probabilistic ensemble-based framework. We consider uncertainty estimation for sequence data at the token-level and complete sequence-level, provide interpretations for, and applications of, various measures of uncertainty and discuss the challenges associated with obtaining them. This work also explores the practical challenges associated with obtaining uncertainty estimates for structured predictions tasks and provides baselines for token-level error detection, sequence-level prediction rejection, and sequence-level out-of-domain input detection using ensembles of auto-regressive transformer models trained on the WMT'14 English-French and WMT'17 English-German translation and LibriSpeech speech recognition datasets.

detection, ensemble, knowledge uncertainty, (15 more...)

2002.0765

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.28)
Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
(2 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
(2 more...)

Friedman, Tal, Broeck, Guy Van den

Symbolic Querying of Vector Spaces: Probabilistic Databases Meets Relational Embeddings

arXiv.org Artificial IntelligenceFeb-23-2020

To deal with increasing amounts of uncertainty and incompleteness in relational data, we propose unifying techniques from probabilistic databases and relational embedding models. We use probabilistic databases as our formalism to define the probabilistic model with respect to which all queries are done. This allows us to leverage the rich literature of theory and algorithms from probabilistic databases for solving problems. While this formalization can be used with any relational embedding model, the lack of a well defined joint probability distribution causes simple problems to become provably hard. With this in mind, we introduce \TO, a relational embedding model designed in terms of probabilistic databases to exploit typical embedding assumptions within the probabilistic framework. Using principled, efficient inference algorithms that can be derived from its definition, we empirically demonstrate that \TOs is an effective and general model for these tasks.

database, probability, query, (17 more...)

2002.10029

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > New York (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Databases (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.40)