AITopics | lexicography

Vision-Enabled LLMs in Historical Lexicography: Digitising and Enriching Estonian-German Dictionaries from the 17th and 18th Centuries

Jürviste, Madis, Jakobson, Joonatan

arXiv.org Artificial Intelligence, Oct-10-2025

This article presents research conducted at the Institute of the Estonian Language between 2022 and 2025 on the application of large language models (LLMs) to the study of 17th and 18th century Estonian dictionaries. The authors address three main areas: enriching historical dictionaries with modern word forms and meanings; using vision-enabled LLMs to perform text recognition on sources printed in Gothic script (Fraktur); and preparing for the creation of a unified, cross-source dataset. Initial experiments with J. Gutslaff's 1648 dictionary indicate that LLMs have significant potential for semi-automatic enrichment of dictionary information. When provided with sufficient context, Claude 3.7 Sonnet accurately provided meanings and modern equivalents for 81% of headword entries. In a text recognition experiment with A. T. Helle's 1732 dictionary, a zero-shot method successfully identified and structured 41% of headword entries into error-free JSON-formatted output. For digitising the Estonian-German dictionary section of A. W. Hupel's 1780 grammar, overlapping tiling of scanned image files is employed, with one LLM being used for text recognition and a second for merging the structured output. These findings demonstrate that even for minor languages LLMs have a significant potential for saving time and financial resources.
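The overlapping tiling mentioned in the abstract can be illustrated with a minimal sketch. The abstract gives no tile size, overlap width, or implementation details, so the function names and the numbers below are hypothetical and only show the general idea: cut a scanned page into boxes that share a margin, so that an entry falling on a tile boundary appears whole in at least one tile.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def tile_starts(length: int, tile: int, overlap: int) -> List[int]:
    """Start offsets of tiles that cover [0, length) with at least `overlap` shared pixels."""
    if tile <= 0 or not 0 <= overlap < tile:
        raise ValueError("need tile > 0 and 0 <= overlap < tile")
    if length <= tile:
        return [0]  # a single tile already covers the whole extent
    step = tile - overlap
    starts = list(range(0, length - tile, step))
    starts.append(length - tile)  # final tile sits flush with the far edge
    return starts


def overlapping_tiles(width: int, height: int,
                      tile_w: int, tile_h: int, overlap: int) -> List[Box]:
    """Row-major list of overlapping crop boxes for a width x height image."""
    return [
        (x, y, min(x + tile_w, width), min(y + tile_h, height))
        for y in tile_starts(height, tile_h, overlap)
        for x in tile_starts(width, tile_w, overlap)
    ]


# Hypothetical page of 1000 x 700 pixels, 400 x 400 tiles, 100-pixel overlap.
boxes = overlapping_tiles(1000, 700, 400, 400, 100)
for box in boxes:
    print(box)
```

Each box uses the (left, upper, right, lower) order that Pillow's `Image.crop` accepts, so the tiles could be cropped and sent to a recognition model one by one. Merging the per-tile output, which the abstract assigns to a second LLM, is not covered by this sketch.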

large language model, machine learning, natural language, (21 more...)

2510.07931

Country: Europe > Estonia (0.52)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

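Der Effizienz- und Intelligenzbegriff in der Lexikographie und künstlichen Intelligenz: kann ChatGPT die lexikographische Textsorte nachbilden? [The concepts of efficiency and intelligence in lexicography and artificial intelligence: can ChatGPT reproduce the lexicographic text type?]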

Arias-Arias, Ivan, Vazquez, Maria Jose Dominguez, Riveiro, Carlos Valcarcel

arXiv.org Artificial Intelligence, Dec-11-2024

By means of pilot experiments for the language pair German and Galician, this paper examines the concepts of efficiency and intelligence in lexicography and artificial intelligence (AI). The aim of the experiments is to gain empirically and statistically based insights into the lexicographical text type "dictionary article" in the responses of ChatGPT 3.5, as well as into the lexicographical data on which this chatbot was trained. Both quantitative and qualitative methods are used for this purpose. The analysis is based on the evaluation of the outputs of several sessions with the same prompt in ChatGPT 3.5. On the one hand, the algorithmic performance of intelligent systems is evaluated in comparison with data from lexicographical works. On the other hand, the data supplied by ChatGPT is analysed using specific text passages of the aforementioned lexicographical text type. The results of this study not only help to evaluate the efficiency of this chatbot in creating dictionary articles, but also to delve deeper into the concept of intelligence, the thought processes and the actions to be carried out in both disciplines.

effizienz-und intelligenzbegriff, lexicography, lexiko, (15 more...)

doi: 10.5788/34-1-1879.

2412.08599

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Europe > Czechia > South Moravian Region > Brno (0.05)
Europe > Slovenia > Central Slovenia > Municipality of Ljubljana > Ljubljana (0.04)
(11 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.46)

Building another Spanish dictionary, this time with GPT-4

Ortega-Martín, Miguel, García-Sierra, Óscar, Ardoiz, Alfonso, Armenteros, Juan Carlos, Garrido, Ignacio, Álvarez, Jorge, Torrón, Camilo, Galdeano, Iñigo, Arranz, Ignacio, Vorontsov, Oleg, Alonso, Adrián

arXiv.org Artificial Intelligence, Jun-17-2024

We present the "Spanish Built Factual Freectianary 2.0" (Spanish-BFF-2) as the second iteration of an AI-generated Spanish dictionary. Previously, we developed the inaugural version of this unique free dictionary employing GPT-3. In this study, we aim to improve the dictionary by using GPT-4-turbo instead. Furthermore, we explore improvements made to the initial version and compare the performance of both models.

dle, lemma, spanish-bff-2, (13 more...)

2406.11218

Country:

Europe > Spain > Galicia > Madrid (0.04)
North America > United States (0.04)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Contribución de la semántica combinatoria al desarrollo de herramientas digitales multilingües [Contribution of combinatorial semantics to the development of multilingual digital tools]

Vázquez, María José Domínguez

arXiv.org Artificial Intelligence, Dec-26-2023

This paper describes how the field of Combinatorial Semantics has contributed to the design of three prototypes for the automatic generation of argument patterns in nominal phrases in Spanish, French and German (Xera, Combinatoria and CombiContext). It also shows the importance of knowing about the syntactic-semantic interface of arguments in a production situation in the context of foreign languages. After a descriptive section on the design, typology and information levels of the resources, there follows an explanation of the central role of combinatorial meaning (roles and ontological features). The study deals with the different semantic filters applied in the selection, organization and expansion of the lexicon, which are key to generating grammatically correct and semantically acceptable mono- and biargumental nominal phrases.

ejemplo, semántica, vázquez, (15 more...)

doi: 10.5209/clac.73849

2312.16309

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.05)
Europe > Spain > Galicia > A Coruña Province > Santiago de Compostela (0.05)
Europe > France (0.05)
(14 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

"Definition Modeling: To model definitions." Generating Definitions With Little to No Semantics

Segonne, Vincent, Mickus, Timothee

arXiv.org Artificial Intelligence, Jun-14-2023

Definition Modeling, the task of generating definitions, was first proposed as a means to evaluate the semantic quality of word embeddings: a coherent lexical semantic representation of a word in context should contain all the information necessary to generate its definition. The relative novelty of this task means that we do not know which factors a Definition Modeling system actually relies on. In this paper, we present evidence that the task may not involve as much semantics as one might expect: we show that an earlier model from the literature is both rather insensitive to semantic aspects such as explicit polysemy and reliant on formal similarities between headwords and words occurring in its glosses, casting doubt on the validity of the task as a means to evaluate embeddings.

computational linguistic, machine learning, natural language, (19 more...)

2306.08433

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.05)
North America > United States > New York > New York County > New York City (0.04)
(23 more...)

Genre:

Research Report > New Finding (0.68)
Research Report > Experimental Study (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.88)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.69)

Spanish Built Factual Freectianary (Spanish-BFF): the first AI-generated free dictionary

Ortega-Martín, Miguel, García-Sierra, Óscar, Ardoiz, Alfonso, Armenteros, Juan Carlos, Álvarez, Jorge, Alonso, Adrián

arXiv.org Artificial Intelligence, Feb-28-2023

Dictionaries are one of the oldest and most used linguistic resources. Building them is a complex task that, to the best of our knowledge, has yet to be explored with generative Large Language Models (LLMs). We introduce the "Spanish Built Factual Freectianary" (Spanish-BFF) as the first Spanish AI-generated dictionary. This first-of-its-kind free dictionary uses GPT-3. We also define future steps we aim to follow to improve this initial contribution to the field, such as adding more languages.

large language model, machine learning, natural language, (18 more...)

2302.12746

Country:

North America > United States (0.04)
Europe > Spain > Galicia > Madrid (0.04)
Europe > Spain > Balearic Islands (0.04)
(2 more...)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)

American English Is Now Reliant on Scrabble's Dictionary

Slate, Nov-28-2022, 10:40:00 GMT

In the mid-1970s, top players in an emerging tournament Scrabble scene persuaded the game's corporate owner to adopt a universal lexicon for competition. Players manually scraped five standard college dictionaries, recording every unique two- through eight-letter word (plus inflections) that met the game's rules. When the Official Scrabble Players Dictionary was published, in 1978, players rejoiced. "You can retire the boxing gloves and put up your swords," the Scrabble Players Newspaper wrote. "You now have an arbiter to settle all arguments."

dictionary, merriam, new word, (17 more...)

Country:

North America > United States > Indiana (0.05)
North America > United States > District of Columbia > Washington (0.05)
North America > United States > Connecticut > Fairfield County > Westport (0.05)
Europe > United Kingdom > England > East Sussex > Brighton (0.05)

Industry: Leisure & Entertainment > Games > Scrabble (1.00)

Technology: Information Technology > Artificial Intelligence > Games > Scrabble (0.40)

Lexicography from Α to Ω

VideoLectures.NET, Sep-8-2022, 11:40:02 GMT

ELEXIS from Α to Ω: Outcomes, Sustainability & Afterlife of a New European Lexicographic Infrastructure. The ELEXIS Showcase Event 2022 invites representatives of institutions that have become observers, as well as people from industry operating in fields such as language technology, machine translation, language learning and dictionary publishing.

lexicography

Industry: Education > Curriculum > Subject-Specific Education (0.45)

Technology: Information Technology > Artificial Intelligence > Natural Language (0.95)
