AITopics | ferrer-i-cancho

Collaborating Authors

ferrer-i-cancho

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

The Distribution of Dependency Distance and Hierarchical Distance in Contemporary Written Japanese and Its Influencing Factors

Wang, Linxuan, Yu, Shuiyuan

arXiv.org Artificial IntelligenceNov-27-2025

To explore the relationship between dependency distance (DD) and hierarchical distance (HD) in Japanese, we compared the probability distributions of DD and HD with and without sentence length fixed, and analyzed the changes in mean dependency distance (MDD) and mean hierarchical distance (MHD) as sentence length increases, along with their correlation coefficient based on the Balanced Corpus of Contemporary Written Japanese. It was found that the valency of the predicates is the underlying factor behind the trade-off relation between MDD and MHD in Japanese. Native speakers of Japanese regulate the linear complexity and hierarchical complexity through the valency of the predicates, and the relative sizes of MDD and MHD depend on whether the threshold of valency has been reached. Apart from the cognitive load, the valency of the predicates also affects the probability distributions of DD and HD. The effect of the valency of the predicates on the distribution of HD is greater than on that of DD, which leads to differences in their probability distributions and causes the mean of MDD to be lower than that of MHD.

artificial intelligence, natural language, probability distribution, (17 more...)

arXiv.org Artificial Intelligence

2504.21421

Country: Asia > China (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.68)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (0.94)

Add feedback

The exponential distribution of the orders of demonstrative, numeral, adjective and noun

Ferrer-i-Cancho, Ramon

arXiv.org Artificial IntelligenceFeb-10-2025

The frequency of the preferred order for a noun phrase formed by demonstrative, numeral, adjective and noun has received significant attention over the last two decades. We investigate the actual distribution of the preferred 24 possible orders. There is no consensus on whether it can be well-fitted by an exponential or a power law distribution. We find that an exponential distribution is a much better model. This finding and other circumstances where an exponential-like distribution is found challenge the view that power-law distributions, e.g., Zipf's law for word frequencies, are inevitable. We also investigate which of two exponential distributions gives a better fit: an exponential model where the 24 orders have non-zero probability or an exponential model where the number of orders that can have non-zero probability is variable. When parsimony and generalizability are prioritized, we find strong support for the exponential model where all 24 orders have non-zero probability. This finding suggests that there is no hard constraint on word order variation and then unattested orders merely result from undersampling, consistently with Cysouw's view.

artificial intelligence, geometric 2, natural language, (15 more...)

arXiv.org Artificial Intelligence

2502.06342

Country:

North America > United States > New York (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > Canada > Ontario > Toronto (0.04)
(3 more...)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.87)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (0.36)

Add feedback

Predictability maximization and the origins of word order harmony

Ferrer-i-Cancho, Ramon

arXiv.org Artificial IntelligenceSep-12-2024

We address the linguistic problem of the sequential arrangement of a head and its dependents from an information theoretic perspective. In particular, we consider the optimal placement of a head that maximizes the predictability of the sequence. We assume that dependents are statistically independent given a head, in line with the open-choice principle and the core assumptions of dependency grammar. We demonstrate the optimality of harmonic order, i.e., placing the head last maximizes the predictability of the head whereas placing the head first maximizes the predictability of dependents. We also show that postponing the head is the optimal strategy to maximize its predictability while bringing it forward is the optimal strategy to maximize the predictability of dependents. We unravel the advantages of the strategy of maximizing the predictability of the head over maximizing the predictability of dependents. Our findings shed light on the placements of the head adopted by real languages or emerging in different kinds of experiments.

artificial intelligence, natural language, predictability, (15 more...)

arXiv.org Artificial Intelligence

2408.1657

Country:

Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.14)
Europe > Austria > Vienna (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(7 more...)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.48)

Add feedback

The optimal placement of the head in the noun phrase. The case of demonstrative, numeral, adjective and noun

Ferrer-i-Cancho, Ramon

arXiv.org Artificial IntelligenceMar-22-2024

The word order of a sentence is shaped by multiple principles. The principle of syntactic dependency distance minimization is in conflict with the principle of surprisal minimization (or predictability maximization) in single head syntactic dependency structures: while the former predicts that the head should be placed at the center of the linear arrangement, the latter predicts that the head should be placed at one of the ends (either first or last). A critical question is when surprisal minimization (or predictability maximization) should surpass syntactic dependency distance minimization. In the context of single head structures, it has been predicted that this is more likely to happen when two conditions are met, i.e. (a) fewer words are involved and (b) words are shorter. Here we test the prediction on the noun phrase when it is composed of a demonstrative, a numeral, an adjective and a noun. We find that, across preferred orders in languages, the noun tends to be placed at one of the ends, confirming the theoretical prediction. We also show evidence of anti locality effects: syntactic dependency distances in preferred orders are longer than expected by chance.

dependency distance, ferrer-i-cancho, minimization, (13 more...)

arXiv.org Artificial Intelligence

2402.10311

Country:

Europe > Austria > Vienna (0.14)
North America > United States > Michigan (0.04)
North America > United States > New York (0.04)
(7 more...)

Genre:

Research Report > Experimental Study (0.46)
Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.46)

Add feedback

Swap distance minimization in SOV languages. Cognitive and mathematical foundations

Ferrer-i-Cancho, Ramon, Namboodiripad, Savithry

arXiv.org Artificial IntelligenceDec-7-2023

Distance minimization is a general principle of language. A special case of this principle in the domain of word order is swap distance minimization. This principle predicts that variations from a canonical order that are reached by fewer swaps of adjacent constituents are lest costly and thus more likely. Here we investigate the principle in the context of the triple formed by subject (S), object (O) and verb (V). We introduce the concept of word order rotation as a cognitive underpinning of that prediction. When the canonical order of a language is SOV, the principle predicts SOV < SVO, OSV < VSO, OVS < VOS, in order of increasing cognitive cost. We test the prediction in three flexible order SOV languages: Korean (Koreanic), Malayalam (Dravidian), and Sinhalese (Indo-European). Evidence of swap distance minimization is found in all three languages, but it is weaker in Sinhalese. Swap distance minimization is stronger than a preference for the canonical order in Korean and especially Malayalam.

distance minimization, minimization, swap distance minimization, (15 more...)

arXiv.org Artificial Intelligence

doi: 10.53482/2023_55_412

2312.04219

Country:

North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
(6 more...)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine (0.46)

Technology: Information Technology > Artificial Intelligence > Natural Language (1.00)

Add feedback

The expected sum of edge lengths in planar linearizations of trees. Theory and applications

Alemany-Puig, Lluís, Ferrer-i-Cancho, Ramon

arXiv.org Artificial IntelligenceSep-18-2023

Dependency trees have proven to be a very successful model to represent the syntactic structure of sentences of human languages. In these structures, vertices are words and edges connect syntactically-dependent words. The tendency of these dependencies to be short has been demonstrated using random baselines for the sum of the lengths of the edges or its variants. A ubiquitous baseline is the expected sum in projective orderings (wherein edges do not cross and the root word of the sentence is not covered by any edge), that can be computed in time $O(n)$. Here we focus on a weaker formal constraint, namely planarity. In the theoretical domain, we present a characterization of planarity that, given a sentence, yields either the number of planar permutations or an efficient algorithm to generate uniformly random planar permutations of the words. We also show the relationship between the expected sum in planar arrangements and the expected sum in projective arrangements. In the domain of applications, we derive a $O(n)$-time algorithm to calculate the expected value of the sum of edge lengths. We also apply this research to a parallel corpus and find that the gap between actual dependency distance and the random baseline reduces as the strength of the formal constraint on dependency structures increases, suggesting that formal constraints absorb part of the dependency distance minimization effect. Our research paves the way for replicating past research on dependency distance minimization using random planar linearizations as random baseline.

alemany-puig and ferrer-i-cancho, arrangement, ferrer-i-cancho, (16 more...)

arXiv.org Artificial Intelligence

2207.05564

Country:

Europe > Czechia > Prague (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(4 more...)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.86)

Add feedback

The distribution of syntactic dependency distances

Petrini, Sonia, Ferrer-i-Cancho, Ramon

arXiv.org Artificial IntelligenceNov-26-2022

The syntactic structure of a sentence can be represented as a graph where vertices are words and edges indicate syntactic dependencies between them. In this setting, the distance between two syntactically linked words can be defined as the difference between their positions. Here we want to contribute to the characterization of the actual distribution of syntactic dependency distances, and unveil its relationship with short-term memory limitations. We propose a new double-exponential model in which decay in probability is allowed to change after a break-point. This transition could mirror the transition from the processing of words chunks to higher-level structures. We find that a two-regime model -- where the first regime follows either an exponential or a power-law decay -- is the most likely one in all 20 languages we considered, independently of sentence length and annotation style. Moreover, the break-point is fairly stable across languages and averages values of 4-5 words, suggesting that the amount of words that can be simultaneously processed abstracts from the specific language to a high degree. Finally, we give an account of the relation between the best estimated model and the closeness of syntactic dependencies, as measured by a recently introduced optimality score.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2211.1462

Country:

North America > United States (0.27)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > Spain > Valencian Community > Valencia Province > Valencia (0.04)
(4 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Memory limitations are hidden in grammar

Gómez-Rodríguez, Carlos, Christiansen, Morten H., Ferrer-i-Cancho, Ramon

arXiv.org Artificial IntelligenceApr-5-2022

For many centuries, the goal of linguistics has been to capture this capacity by a formal description--a grammar--consisting of a systematic set of rules and/or principles that determine which sentences are part of a given language and which are not (Bod, 2013). Over the years, these formal grammars have taken many forms but common to them all is the assumption that they capture the idealized linguistic competence of a native speaker/hearer, independent of any memory limitations or other non-linguistic cognitive constraints (Chomsky, 1965; Miller, 2000). These abstract formal descriptions have come to play a foundational role in the language sciences, from linguistics, psycholinguistics, and neurolinguistics (Hauser et al., 2002; Pinker, 2003) to computer science, engineering, and machine learning (Klein and Manning, 2003; Dyer et al., 2016; Gómez-Rodríguez et al., 2018). Despite evidence that processing difficulty underpins the unacceptability of certain sentences (Morrill, 2010; Hawkins, 2004), the cognitive independence assumption that is a defining feature of linguistic competence has not been examined in a systematic way using the tools of formal grammar. It is therefore unclear whether these supposedly idealized descriptions of language are free of non-linguistic cognitive constraints, such as memory limitations.

dependency structure, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

doi: 10.53482/2022_52_397

1908.06629

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > Czechia > Prague (0.05)
(21 more...)

Genre: Research Report > New Finding (0.93)

Industry:

Education > Curriculum > Subject-Specific Education (0.68)
Health & Medicine (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)

Add feedback

Parallels of human language in the behavior of bottlenose dolphins

Ferrer-i-Cancho, R., Lusseau, D., McCowan, B.

arXiv.org Artificial IntelligenceMar-25-2022

Here we review them with the help of quantitative linguistics and information theory.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

doi: 10.2478/lf-2022-0002

1605.01661

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > California > Yolo County > Davis (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
(9 more...)

Genre:

Research Report (0.40)
Overview (0.34)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.70)
Information Technology > Artificial Intelligence > Machine Learning (0.68)

Add feedback

The advent and fall of a vocabulary learning bias from communicative efficiency

Carrera-Casado, David, Ferrer-i-Cancho, Ramon

arXiv.org Artificial IntelligenceJul-20-2021

Biosemiosis is a process of choice-making between simultaneously alternative options. It is well-known that, when sufficiently young children encounter a new word, they tend to interpret it as pointing to a meaning that does not have a word yet in their lexicon rather than to a meaning that already has a word attached. In previous research, the strategy was shown to be optimal from an information theoretic standpoint. In that framework, interpretation is hypothesized to be driven by the minimization of a cost function: the option of least communication cost is chosen. However, the information theoretic model employed in that research neither explains the weakening of that vocabulary learning bias in older children or polylinguals nor reproduces Zipf's meaning-frequency law, namely the non-linear relationship between the number of meanings of a word and its frequency. Here we consider a generalization of the model that is channeled to reproduce that law. The analysis of the new model reveals regions of the phase space where the bias disappears consistently with the weakening or loss of the bias in older children or polylinguals. The model is abstract enough to support future research on other levels of life that are relevant to biosemiotics. In the deep learning era, the model is a transparent low-dimensional tool for future experimental research and illustrates the predictive power of a theoretical framework originally designed to shed light on the origins of Zipf's rank-frequency law.

artificial intelligence, machine learning, vocabulary learning bias, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1007/s12304-021-09452-w

2105.11519

Country:

North America > United States > New York (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
South America > Uruguay > Maldonado > Maldonado (0.04)
(9 more...)

Genre:

Research Report > Experimental Study (0.47)
Research Report > New Finding (0.34)

Industry:

Health & Medicine (0.67)
Education (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Add feedback