AITopics | linguistique

linguistique

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question "What is Artificial Intelligence?" and you only have a minute, here is the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Multilingual corpora for the study of new concepts in the social sciences and humanities:

Kyriakoglou, Revekka, Pappa, Anna

arXiv.org Artificial Intelligence, Dec-9-2025

This article presents a hybrid methodology for building a multilingual corpus designed to support the study of emerging concepts in the humanities and social sciences (HSS), illustrated here through the case of "non-technological innovation". The corpus relies on two complementary sources: (1) textual content automatically extracted from company websites, cleaned for French and English, and (2) annual reports collected and automatically filtered according to documentary criteria (year, format, duplication). The processing pipeline includes automatic language detection, filtering of non-relevant content, extraction of relevant segments, and enrichment with structural metadata. From this initial corpus, a derived dataset in English is created for machine learning purposes. For each occurrence of a term from the expert lexicon, a contextual block of five sentences is extracted (two preceding and two following the sentence containing the term). Each occurrence is annotated with the thematic category associated with the term, enabling the construction of data suitable for supervised classification tasks. This approach results in a reproducible and extensible resource, suitable both for analyzing lexical variability around emerging concepts and for generating datasets dedicated to natural language processing applications.
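
The five-sentence context extraction described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the regex-based sentence splitter, the function names, and the example lexicon entry and category are all assumptions made for illustration.

```python
import re
from typing import Dict, List


def split_sentences(text: str) -> List[str]:
    """Naive splitter (an assumption): break after '.', '!' or '?' plus whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extract_contexts(
    text: str,
    lexicon: Dict[str, str],  # hypothetical mapping: term -> thematic category
    before: int = 2,
    after: int = 2,
) -> List[dict]:
    """Return one labeled record per occurrence of a lexicon term.

    Each record holds the sentence containing the term plus up to `before`
    preceding and `after` following sentences (fewer at document edges).
    """
    sentences = split_sentences(text)
    records = []
    for i, sentence in enumerate(sentences):
        for term, category in lexicon.items():
            pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
            # One record per occurrence, even if a term repeats in a sentence.
            for _ in pattern.finditer(sentence):
                start = max(0, i - before)
                end = min(len(sentences), i + after + 1)
                records.append(
                    {
                        "term": term,
                        "label": category,
                        "context": " ".join(sentences[start:end]),
                    }
                )
    return records
```

Each record pairs a context block with the category of the matched term, which is the shape a supervised classifier would need. A production pipeline would likely replace the naive splitter with a proper sentence segmenter.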

artificial intelligence, data & corpus, natural language, (13 more...)

arXiv.org Artificial Intelligence

2512.07367

Country: Europe (0.46)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language (1.00)

Animer une base de connaissance: des ontologies aux modèles d'I.A. générative

Stockinger, Peter

arXiv.org Artificial Intelligence, Sep-3-2025

Animating a Knowledge Base: From Ontologies to Generative AI Models. From Expert Systems and the Semantic Web to Generative AI: Model-Driven and Data-Driven Approaches in Area Studies. In a context where the social sciences and humanities are experimenting with non-anthropocentric analytical frames, this article proposes a semiotic (structural) reading of the hybridization between symbolic AI and neural (or sub-symbolic) AI based on a field of application: the design and use of a knowledge base for area studies. We describe the LaCAS ecosystem, Open Archives in Linguistic and Cultural Studies (thesaurus; RDF/OWL ontology; LOD services; harvesting; expertise; publication), deployed at Inalco (National Institute for Oriental Languages and Civilizations) in Paris with the Okapi (Open Knowledge and Annotation Interface) software environment from Ina (National Audiovisual Institute), which now has around 160,000 documentary resources and ten knowledge macro-domains grouping together several thousand knowledge objects. We illustrate this approach using the knowledge domain "Languages of the world" (~540 languages) and the knowledge object "Quechua (language)". On this basis, we discuss the controlled integration of neural tools, more specifically generative tools, into the life cycle of a knowledge base: assistance with data localization/qualification, index extraction and aggregation, property suggestion and testing, dynamic file generation, and engineering of contextualized prompts (generic, contextual, explanatory, adjustment, procedural) aligned with a domain ontology. We outline an ecosystem of specialized agents capable of animating the database while respecting its symbolic constraints, by articulating model-driven and data-driven methods.
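
As a rough illustration of what a contextualized prompt aligned with a domain ontology might look like, the sketch below assembles a prompt from subject-property-value triples. It is not the LaCAS or Okapi implementation: the triple store is a plain Python list, and the property names (`type`, `memberOfDomain`) and the prompt layout are hypothetical.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (subject, property, value)

# Hypothetical triples; property names are illustrative only and are
# not taken from the LaCAS ontology.
TRIPLES: List[Triple] = [
    ("Quechua (language)", "type", "Language"),
    ("Quechua (language)", "memberOfDomain", "Languages of the world"),
]


def describe(subject: str, triples: List[Triple]) -> List[str]:
    """Collect 'property: value' statements recorded for one knowledge object."""
    return [f"{prop}: {value}" for subj, prop, value in triples if subj == subject]


def contextual_prompt(subject: str, triples: List[Triple], task: str) -> str:
    """Build a prompt that states the symbolic facts before the generative task."""
    lines = [
        f"Knowledge object: {subject}",
        "Known properties (do not contradict):",
    ]
    lines += [f"- {fact}" for fact in describe(subject, triples)]
    lines.append(f"Task: {task}")
    return "\n".join(lines)
```

Stating the ontology-derived facts ahead of the task is one simple way to keep a generative tool within the symbolic constraints of the knowledge base; a real system would presumably query an RDF store rather than a hard-coded list.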

artificial intelligence, connaissance, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2509.01304

Country: Europe (1.00)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.44)

Un modèle de base de connaissances terminologiques

Séguéla, Patrick, Aussenac-Gilles, Nathalie

arXiv.org Artificial Intelligence, Feb-16-2023

In the present paper, we argue that Terminological Knowledge Bases (TKB) are all the more useful for addressing various needs as they do not fulfill formal criteria. Moreover, they are intended to clarify the terminology of a given domain by illustrating term uses in various contexts. We therefore designed a TKB structure with three linked components: terms, concepts, and texts, the last of which present the particular use of each term in the domain. Concepts are represented as frames whose non-formal description is standardized. Associated with this structure, we defined modeling criteria at the conceptual level. Finally, we discuss the position of TKBs with regard to ontologies, and the use of TKBs for the development of AI systems.
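
The three linked components named in the abstract (terms, concepts, texts) can be pictured as a small data structure. The sketch below is only one possible reading, not the authors' model: the class names, the frame slots, and the helper function are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Concept:
    """A concept as a frame: named slots with non-formal (free-text) fillers."""
    name: str
    slots: Dict[str, str] = field(default_factory=dict)


@dataclass
class TextOccurrence:
    """A passage showing how a term is actually used in the domain."""
    source: str
    excerpt: str


@dataclass
class Term:
    """A term linked to the concept it denotes and to the texts that attest it."""
    label: str
    concept: Concept
    occurrences: List[TextOccurrence] = field(default_factory=list)


def contexts_for(term: Term) -> List[str]:
    """Return the attested excerpts for a term, in insertion order."""
    return [occurrence.excerpt for occurrence in term.occurrences]
```

Keeping the concept description as free text in frame slots reflects the abstract's point that the description is standardized but not formal.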

artificial intelligence, expert system, knowledge management, (18 more...)

arXiv.org Artificial Intelligence

2302.08198

Country:

Europe > France > Occitanie > Haute-Garonne > Toulouse (0.05)
Europe > Netherlands > North Holland > Amsterdam (0.04)
Europe > France > Pays de la Loire > Loire-Atlantique > Nantes (0.04)
(2 more...)

Genre: Research Report (0.40)

Technology:

Information Technology > Knowledge Management > Knowledge Engineering (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)

Un modèle générique d'organisation de corpus en ligne: application à la FReeBank

Salmon-Alt, Susanne, Romary, Laurent, Pierrel, Jean-Marie

arXiv.org Artificial Intelligence, Dec-1-2009

The few available French resources for evaluating linguistic models or algorithms on linguistic levels other than morpho-syntax are either insufficient, both quantitatively and qualitatively, or not freely accessible. In response, the FREEBANK project intends to create French corpora built from manually revised output of a hybrid Constraint Grammar parser and annotated on several linguistic levels (structure, morpho-syntax, syntax, coreference), with the objective of making them available online for research purposes. We therefore focus on the use of standard annotation schemes, the integration of existing resources, and maintenance that allows for continuous enrichment of the annotations. Before presenting the prototype that has been implemented, this paper describes a generic model for the organization and deployment of a linguistic resource archive, in compliance with the work currently conducted within international standardization initiatives (TEI and ISO/TC 37/SC 4).

artificial intelligence, linguistique, natural language, (18 more...)

arXiv.org Artificial Intelligence

cs/0611026

Country: Europe (1.00)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Natural Language (1.00)

Raisonner avec des diagrammes : perspectives cognitives et computationnelles

Recanati, Catherine

arXiv.org Artificial Intelligence, Dec-1-2009

Diagrammatic, analogical or iconic representations are often contrasted with linguistic or logical representations, in which the shape of the symbols is arbitrary. The aim of this paper is to make a case for the usefulness of diagrams in inferential knowledge representation systems. Although commonly used, diagrams have long suffered from the reputation of being only a heuristic tool or a mere support for intuition. The first part of this paper gives a historical background, paying tribute to the logicians, psychologists and computer scientists who put an end to this formal prejudice against diagrams. The second part discusses their characteristics as opposed to those of linguistic forms. The last part aims at reviving interest in heterogeneous representation systems that include both linguistic and diagrammatic representations.

artificial intelligence, diagrammatique, sentation, (17 more...)

arXiv.org Artificial Intelligence

cs/0607051

Country:

Europe > United Kingdom > England (0.28)
North America > United States > Massachusetts > Middlesex County (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)

Sur le statut référentiel des entités nommées

Poibeau, Thierry

arXiv.org Artificial Intelligence, Dec-1-2009

We show in this paper that, on the one hand, named entities can be designated using different denominations and that, on the other hand, names denoting named entities are polysemous. The analysis cannot be limited to reference resolution but should take into account naming strategies, which are mainly based on two linguistic operations: synecdoche and metonymy. Lastly, we present a model that explicitly represents the different denominations in discourse, unifying the representation of linguistic knowledge and world knowledge.

artificial intelligence, natural language, text processing, (18 more...)

arXiv.org Artificial Intelligence

cs/0510020

Country:

North America > United States (0.68)
Europe (0.68)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.87)

Le terme et le concept : fondements d'une ontoterminologie

Roche, Christophe

arXiv.org Artificial Intelligence, Jan-8-2008

Most definitions of ontology, viewed as a "specification of a conceptualization", agree that although an ontology can take different forms, it necessarily includes a vocabulary of terms and some specification of their meaning in relation to the domain's conceptualization. And as domain knowledge is mainly conveyed through scientific and technical texts, we can hope to extract some useful information from them for building an ontology. But is it as simple as this? In this article we shall see that the lexical structure, i.e. the network of words linked by linguistic relationships, does not necessarily match the domain conceptualization. We have to bear in mind that writing documents is the concern of textual linguistics, one of whose principles is the incompleteness of text, whereas building an ontology (viewed as task-independent knowledge) is concerned with conceptualization based on formal rather than natural languages. Nevertheless, the famous Sapir-Whorf hypothesis, concerning the interdependence of thought and language, is also applicable to formal languages. This means that the way an ontology is built and a concept is defined depends directly on the formal language that is used, and the results will not be the same. The introduction of the notion of ontoterminology makes it possible to take epistemological principles into account when building formal ontologies.

artificial intelligence, terminologie, terminologie & ontologie, (17 more...)

arXiv.org Artificial Intelligence

0801.1275

Country: Europe (0.28)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
