projectivity
A Fundamental Algorithm for Dependency Parsing (With Corrections)
Abstract: This paper develops, from first principles, several variations on a fundamental algorithm for parsing natural language sentences into dependency trees. It is an exposition of an algorithm that has been known, in some form, since the 1960s but is not presented systematically in the extant literature. Unlike phrase-structure (constituency) parsers, this algorithm operates one word at a time, attaching each word as soon as it can be attached. There is good evidence that the parsing process used by the human mind has these properties [1].
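The word-at-a-time, attach-as-soon-as-possible strategy can be sketched as follows. This is a minimal illustration under assumptions of mine, not the paper's full algorithm: the link oracle `can_depend` is a hypothetical POS-rule table, and the paper's additional constraints (e.g., projectivity checks) are omitted.

```python
def parse(words, can_depend):
    """Attach each incoming word as soon as possible (sketch).

    words: list of (form, pos) pairs, in sentence order.
    can_depend(dep_pos, head_pos): True if a word with dep_pos may
        depend on a word with head_pos (hypothetical rule oracle).
    Returns head[i] = index of word i's head, or None if unattached.
    """
    head = [None] * len(words)
    for i in range(1, len(words)):
        # Scan leftward from the new word, trying both link directions.
        for j in range(i - 1, -1, -1):
            if head[j] is None and can_depend(words[j][1], words[i][1]):
                head[j] = i          # earlier word depends on new word
            elif head[i] is None and can_depend(words[i][1], words[j][1]):
                head[i] = j          # new word depends on earlier word
    return head

# Toy rule table: determiners depend on nouns, nouns on verbs.
RULES = {("DET", "NOUN"), ("NOUN", "VERB")}
heads = parse([("the", "DET"), ("dog", "NOUN"), ("barked", "VERB")],
              lambda d, h: (d, h) in RULES)
```

Note that "the" is attached to "dog" the moment "dog" arrives, and "dog" to "barked" the moment "barked" arrives — no word waits for the end of the sentence.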
- North America > United States > Georgia > Clarke County > Athens (0.14)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > New York (0.04)
- North America > United States > California > Los Angeles County > Santa Monica (0.04)
Review for NeurIPS paper: Baxter Permutation Process
Summary and Contributions: The Baxter Permutation Process provides a generalization of the Mondrian process so that an arbitrary tiling with hyperrectangles is supported. In particular, the Mondrian process (MP) is a Bayesian nonparametric version of a decision tree. Since the MP is defined by a tree, each cut must extend to the edges of the hyperrectangle being cut. This means that the first cut must extend from -infinity to infinity, and the second cut (should it be perpendicular) must extend from infinity, or -infinity, to the level of the first cut (or, if it is parallel, again from -infinity to infinity). This full-extent cut structure is not a fatal flaw, but it could lead to a lack of local modelling.
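The tree-structured, full-extent cuts the review describes can be made concrete with a minimal Mondrian-process sketch (budgeted recursive axis-aligned cuts; the budget and parameter choices here are illustrative, not from the paper):

```python
import random

def mondrian(cell, budget, rng):
    """Recursively cut an axis-aligned rectangle, Mondrian-process style.

    cell: ((x0, x1), (y0, y1)). Each cut spans the FULL extent of its
    cell in the other dimension -- the tree constraint discussed above.
    Returns the list of leaf cells (a tiling of the input rectangle).
    """
    (x0, x1), (y0, y1) = cell
    lin = (x1 - x0) + (y1 - y0)            # linear dimension of the cell
    cost = rng.expovariate(lin)            # time until the next cut
    if cost > budget:
        return [cell]                      # budget exhausted: leaf cell
    # Pick the cut dimension with probability proportional to side length.
    if rng.random() < (x1 - x0) / lin:
        c = rng.uniform(x0, x1)
        kids = [((x0, c), (y0, y1)), ((c, x1), (y0, y1))]
    else:
        c = rng.uniform(y0, y1)
        kids = [((x0, x1), (y0, c)), ((x0, x1), (c, y1))]
    return [leaf for k in kids for leaf in mondrian(k, budget - cost, rng)]

cells = mondrian(((0.0, 1.0), (0.0, 1.0)), budget=2.0, rng=random.Random(0))
```

Because every cut spans its whole cell, the leaves always tile the original rectangle — and this is exactly what rules out the more local, non-full-extent cuts that a Baxter-permutation-based tiling allows.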
PROPRES: Investigating the Projectivity of Presupposition with Various Triggers and Environments
What makes a presupposition of an utterance -- information taken for granted by its speaker -- different from other pragmatic inferences such as entailment is projectivity (e.g., the negative sentence "the boy did not stop shedding tears" still presupposes that the boy had shed tears before). Projectivity may vary depending on the combination of presupposition triggers and environments. However, prior natural language understanding studies fail to take this into account, as they either use no human baseline or include only negation as an entailment-canceling environment when evaluating models' performance. The current study attempts to address these issues. We introduce a new dataset, Projectivity of Presupposition (PROPRES), which includes 12k premise-hypothesis pairs crossing six triggers, involving some lexical variety, with five environments. Our human evaluation reveals that humans exhibit variable projectivity in some cases. However, the model evaluation shows that the best-performing model, DeBERTa, does not fully capture it. Our findings suggest that probing studies on pragmatic inferences should pay extra attention to human judgment variability and to the combination of linguistic items.
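Crossing triggers with entailment-canceling environments can be sketched as below. The trigger and environment inventories here are invented for illustration and are not the actual PROPRES items:

```python
from itertools import product

# Hypothetical presupposition triggers: (clause template, presupposed hypothesis).
TRIGGERS = [
    ("{} stopped shedding tears", "the boy had shed tears before"),
    ("{} returned to the city", "the boy had been to the city before"),
]
# Hypothetical entailment-canceling environments wrapping the clause.
# A presupposition "projects" if it survives this wrapping.
ENVIRONMENTS = [
    ("affirmative", "{}."),
    ("negation", "it is not the case that {}."),
    ("question", "is it the case that {}?"),
    ("conditional", "if {}, it matters."),
]

def build_pairs(subject="the boy"):
    """Generate (environment, premise, hypothesis) NLI triples."""
    pairs = []
    for (template, presup), (env, frame) in product(TRIGGERS, ENVIRONMENTS):
        clause = template.format(subject)
        pairs.append((env, frame.format(clause), presup))
    return pairs

pairs = build_pairs()
```

Each triple asks whether the presupposed hypothesis still follows from the wrapped premise — the projectivity judgment the dataset elicits from humans and models.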
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Europe > Netherlands > North Holland > Amsterdam (0.04)
- North America > United States > New York (0.04)
- (13 more...)
The expected sum of edge lengths in planar linearizations of trees. Theory and applications
Alemany-Puig, Lluís, Ferrer-i-Cancho, Ramon
Dependency trees have proven to be a very successful model to represent the syntactic structure of sentences of human languages. In these structures, vertices are words and edges connect syntactically-dependent words. The tendency of these dependencies to be short has been demonstrated using random baselines for the sum of the lengths of the edges or its variants. A ubiquitous baseline is the expected sum in projective orderings (wherein edges do not cross and the root word of the sentence is not covered by any edge), which can be computed in time $O(n)$. Here we focus on a weaker formal constraint, namely planarity. In the theoretical domain, we present a characterization of planarity that, given a sentence, yields either the number of planar permutations or an efficient algorithm to generate uniformly random planar permutations of the words. We also show the relationship between the expected sum in planar arrangements and the expected sum in projective arrangements. In the domain of applications, we derive an $O(n)$-time algorithm to calculate the expected value of the sum of edge lengths. We also apply this research to a parallel corpus and find that the gap between actual dependency distance and the random baseline narrows as the strength of the formal constraint on dependency structures increases, suggesting that formal constraints absorb part of the dependency distance minimization effect. Our research paves the way for replicating past research on dependency distance minimization using random planar linearizations as random baseline.
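The kind of random baseline the abstract refers to can be illustrated with the simplest, unconstrained case (all $n!$ orderings, with no projectivity or planarity constraint — the paper's baselines are constrained variants of this quantity). For one edge the expected distance under a uniformly random ordering is $(n+1)/3$, so a tree with $n-1$ edges has expected sum $(n-1)(n+1)/3$; the snippet checks this exhaustively for a small tree:

```python
from itertools import permutations

def edge_length_sum(edges, order):
    """Sum of |position(u) - position(v)| over the tree's edges."""
    pos = {w: i for i, w in enumerate(order)}
    return sum(abs(pos[u] - pos[v]) for u, v in edges)

# A 4-vertex path tree: 0 - 1 - 2 - 3.
edges = [(0, 1), (1, 2), (2, 3)]
n = 4

# Average the sum over all n! unconstrained linear arrangements.
total = sum(edge_length_sum(edges, p) for p in permutations(range(n)))
average = total / 24                    # 4! = 24 orderings
closed_form = (n - 1) * (n + 1) / 3     # unconstrained baseline
```

Restricting the average to planar or projective permutations only — the paper's contribution — lowers the baseline, which is why stronger formal constraints "absorb" part of the observed distance minimization.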
- Europe > Czechia > Prague (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (4 more...)
Projectivity revisited
The behaviour of statistical relational representations across differently sized domains has become a focal area of research from both a modelling and a complexity viewpoint. Recently, projectivity of a family of distributions emerged as a key property, ensuring that marginal probabilities are independent of the domain size. However, the formalisation used currently assumes that the domain is characterised only by its size. This contribution extends the notion of projectivity from families of distributions indexed by domain size to functors taking extensional data from a database. This makes projectivity available for the large range of applications taking structured input. We transfer key known results on projective families of distributions to the new setting. This includes a characterisation of projective fragments in different statistical relational formalisms as well as a general representation theorem for projective families of distributions. Furthermore, we prove a correspondence between projectivity and distributions on countably infinite domains, which we use to unify and generalise earlier work on statistical relational representations in infinite domains. Finally, we use the extended notion of projectivity to define a further strengthening, which we call $\sigma$-projectivity, and which allows the use of the same representation in different modes while retaining projectivity.
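Projectivity of a family of distributions, in the size-indexed sense this abstract starts from, is the consistency condition that marginalizing the size-$(n+1)$ distribution over one element recovers the size-$n$ distribution. A brute-force check of that condition for a trivially projective family (i.i.d. Bernoulli worlds) can be sketched as:

```python
from itertools import product

def p_iid(world, theta=0.3):
    """Probability of a binary world under an i.i.d. Bernoulli family."""
    p = 1.0
    for x in world:
        p *= theta if x else (1.0 - theta)
    return p

def is_projective(p_family, n, tol=1e-12):
    """Check: summing out the last element of size n+1 recovers size n."""
    for world in product([0, 1], repeat=n):
        marginal = sum(p_family(world + (b,)) for b in (0, 1))
        if abs(marginal - p_family(world)) > tol:
            return False
    return True
```

A family whose per-element parameter varies with the domain size (e.g. `theta = 0.5 / len(world)`) fails this check, which is the domain-size sensitivity the projectivity literature is concerned with.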
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
- North America > United States > New York (0.04)
- Europe > Germany > North Rhine-Westphalia > Upper Bavaria > Munich (0.04)
- (7 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.94)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.94)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.69)
Why the Rich Get Richer? On the Balancedness of Random Partition Models
Lee, Changwoo J., Sang, Huiyan
Random partition models are widely used in Bayesian methods for various clustering tasks, such as mixture models, topic models, and community detection problems. While the number of clusters induced by random partition models has been studied extensively, another important model property regarding the balancedness of cluster sizes has been largely neglected. We formulate a framework to define and theoretically study the balancedness of exchangeable random partition models, by analyzing how a model assigns probabilities to partitions with different levels of balancedness. We demonstrate that the "rich-get-richer" characteristic of many existing popular random partition models is an inevitable consequence of two common assumptions: product-form exchangeability and projectivity. We propose a principled way to compare the balancedness of random partition models, which gives a better understanding of what model works better and what doesn't for different applications. We also introduce the "rich-get-poorer" random partition models and illustrate their application to entity resolution tasks.
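The rich-get-richer behaviour can be made concrete with the Chinese restaurant process, the canonical exchangeable, projective random partition model: each new item joins an existing cluster with probability proportional to the cluster's current size. A minimal sketch:

```python
import random

def crp(n, alpha, rng):
    """Sample cluster sizes for n items from a Chinese restaurant process.

    Item i joins an existing cluster of size s with probability
    s / (i + alpha), or starts a new cluster with probability
    alpha / (i + alpha) -- so larger clusters attract more items.
    """
    sizes = []
    for i in range(n):
        r = rng.uniform(0, i + alpha)
        acc = 0.0
        for k, s in enumerate(sizes):
            acc += s
            if r < acc:
                sizes[k] += 1        # joined an existing cluster
                break
        else:
            sizes.append(1)          # r fell in the alpha-weighted slice
    return sizes

sizes = crp(200, alpha=1.0, rng=random.Random(0))
```

Because the join probability is size-proportional, early clusters tend to dominate — the imbalance the paper traces to product-form exchangeability plus projectivity.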
- North America > United States > Texas > Brazos County > College Station (0.04)
- Asia > Middle East > Jordan (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.67)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.66)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)
Inference, Learning, and Population Size: Projectivity for SRL Models
Jaeger, Manfred, Schulte, Oliver
A subtle difference between propositional and relational data is that in many relational models, marginal probabilities depend on the population or domain size. This paper connects the dependence on population size to the classic notion of projectivity from statistical theory: Projectivity implies that relational predictions are robust with respect to changes in domain size. We discuss projectivity for a number of common SRL systems, and identify syntactic fragments that are guaranteed to yield projective models. The syntactic conditions are restrictive, which suggests that projectivity is difficult to achieve in SRL, and care must be taken when working with different domain sizes.
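The dependence on population size can be demonstrated by brute force on a toy Markov-logic-style model (my illustration, not an example from the paper): worlds over $n$ unary atoms Smokes(x) with a weight $w$ on every co-smoking pair, so $P(\text{world}) \propto \exp(w \cdot \binom{k}{2})$ where $k$ is the number of smokers. This family is not projective, and the marginal $P(\mathrm{Smokes}(a))$ grows with $n$:

```python
from math import comb, exp

def marginal_smokes(n, w):
    """Marginal P(Smokes(a)) when P(world) ∝ exp(w * C(#smokers, 2)).

    The unnormalized weight depends only on k, the number of smokers,
    so we sum over k instead of over all 2^n worlds.
    """
    z = sum(comb(n, k) * exp(w * comb(k, 2)) for k in range(n + 1))
    # P(Smokes(a)) = sum_k (k/n) * P(exactly k smokers).
    return sum((k / n) * comb(n, k) * exp(w * comb(k, 2))
               for k in range(n + 1)) / z

m2, m6 = marginal_smokes(2, 1.0), marginal_smokes(6, 1.0)
```

With $w = 0$ the atoms decouple and the marginal is $1/2$ for every $n$ — that degenerate case is projective, which matches the paper's point that only restrictive syntactic fragments guarantee projectivity.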
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Europe > Denmark > North Jutland > Aalborg (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.95)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.94)
- Information Technology > Communications (0.94)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.69)