AITopics

2507.09617

Country: Europe (0.46)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(2 more...)

Makovska, Viktoriia, Fletcher, George, Stoyanovich, Julia

ONION: A Multi-Layered Framework for Participatory ER Design

arXiv.org Artificial IntelligenceJul-14-2025

We present ONION, a multi-layered framework for participatory Entity-Relationship (ER) modeling that integrates insights from design justice, participatory AI, and conceptual modeling. ONION introduces a five-stage methodology: Observe, Nurture, Integrate, Optimize, Normalize. It supports progressive abstraction from unstructured stakeholder input to structured ER diagrams. Our approach aims to reduce designer bias, promote inclusive participation, and increase transparency through the modeling process. We evaluate ONION through real-world workshops focused on sociotechnical systems in Ukraine, highlighting how diverse stakeholder engagement leads to richer data models and deeper mutual understanding. Early results demonstrate ONION's potential to host diversity in early-stage data modeling. We conclude with lessons learned, limitations and challenges involved in scaling and refining the framework for broader adoption.

artificial intelligence, information management, participant, (18 more...)

doi: 10.1145/3736733.3736736

2507.08702

Country:

Europe > Ukraine (0.49)
North America > United States > New York (0.14)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (0.95)

Gupta, Anant, Bhowmik, Rajarshi, Gunow, Geoffrey

Agentic Retrieval of Topics and Insights from Earnings Calls

arXiv.org Artificial IntelligenceJul-11-2025

Tracking the strategic focus of companies through topics in their earnings calls is a key task in financial analysis. However, as industries evolve, traditional topic modeling techniques struggle to dynamically capture emerging topics and their relationships. In this work, we propose an LLM-agent driven approach to discover and retrieve emerging topics from quarterly earnings calls. We propose an LLM-agent to extract topics from documents, structure them into a hierarchical ontology, and establish relationships between new and existing topics through a topic ontology. We demonstrate the use of extracted topics to infer company-level insights and emerging trends over time. We evaluate our approach by measuring ontology coherence, topic evolution accuracy, and its ability to surface emerging financial trends.

large language model, machine learning, natural language, (20 more...)

2507.07906

Country: Asia (0.28)

Genre:

Financial News (1.00)
Research Report > New Finding (0.46)

Industry:

Health & Medicine (1.00)
Automobiles & Trucks (1.00)
Banking & Finance > Trading (0.93)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.89)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (0.80)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

arXiv.org Artificial IntelligenceJul-9-2025

Bridging Perception and Language: A Systematic Benchmark for LVLMs' Understanding of Amodal Completion Reports

Watahiki, Amane, Doi, Tomoki, Shinozaki, Taiga, Nishida, Satoshi, Niikawa, Takuya, Miyahara, Katsunori, Yanaka, Hitomi

One of the main objectives in developing large vision-language models (L VLMs) is to engineer systems that can assist humans with multimodal tasks, including interpreting descriptions of perceptual experiences. A central phenomenon in this context is amodal completion, in which people perceive objects even when parts of those objects are hidden. Although numerous studies have assessed whether computer-vision algorithms can detect or reconstruct occluded regions, the inferential abilities of L VLMs on texts related to amodal completion remain unexplored. To address this gap, we constructed a benchmark grounded in Basic Formal Ontology to achieve a systematic classification of amodal completion. Our results indicate that while many L VLMs achieve human-comparable performance overall, their accuracy diverges for certain types of objects being completed. Notably, in certain categories, some LLaV A-NeXT variants and Claude 3.5 Sonnet exhibit lower accuracy on original images compared to blank stimuli lacking visual content. Intriguingly, this disparity emerges only under Japanese prompting, suggesting a deficiency in Japanese-specific linguistic competence among these models.

category, large language model, natural language, (20 more...)

2507.05799

Country:

Asia > Japan (0.28)
Europe > United Kingdom > England (0.28)

Genre: Research Report > New Finding (0.88)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.90)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (0.70)

Dasgupta, Subhasis, Stephens, Jon, Gupta, Amarnath

OLG++: A Semantic Extension of Obligation Logic Graph

arXiv.org Artificial IntelligenceJul-9-2025

We present OLG++, a semantic extension of the Obligation Logic Graph (OLG) for modeling regulatory and legal rules in municipal and interjurisdictional contexts. OLG++ introduces richer node and edge types, including spatial, temporal, party group, defeasibility, and logical grouping constructs, enabling nuanced representations of legal obligations, exceptions, and hierarchies. The model supports structured reasoning over rules with contextual conditions, precedence, and complex triggers. We demonstrate its expressiveness through examples from food business regulations, showing how OLG++ supports legal question answering using property graph queries. OLG++ also improves over LegalRuleML by providing native support for subClassOf, spatial constraints, and reified exception structures. Our examples show that OLG++ is more expressive than prior graph-based models for legal knowledge representation.

artificial intelligence, natural language, obligation, (15 more...)

2507.05488

Country: North America > United States > California > San Diego County (0.17)

Genre: Research Report (0.50)

Industry: Law > Statutes (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.68)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (0.47)

Ioannides, Georgios, Constantinou, Christos, Jain, Vinija, Chadha, Aman, Elkins, Aaron

MOD-X: A Modular Open Decentralized eXchange Framework proposal for Heterogeneous Interoperable Artificial Intelligence Agents

arXiv.org Artificial IntelligenceJul-9-2025

As Artificial Intelligence systems evolve from monolithic models to ecosystems of specialized agents, the need for standardized communication protocols becomes increasingly critical. This paper introduces MOD-X (Modular Open Decentralized eXchange), a novel architectural framework proposal for agent interoperability that addresses key limitations of existing protocols. Unlike current approaches, MOD-X proposes a layered architecture with a Universal Message Bus, thorough state management, translation capabilities, and blockchain-based security mechanisms. We present MOD-X's architecture, compare it with existing protocols, and demonstrate its application through a worked example how it enables integration between heterogeneous specialist agents (agents with different architectures, vendors, capabilities, and knowledge representations--including rule-based systems, neural networks, symbolic reasoning engines, and legacy software with agent wrappers). MOD-X's key innovations include a publish-subscribe communication model, semantic capability discovery, and dynamic workflow orchestration--providing a framework that bridges theoretical formalism with practical implementation. This architecture addresses the growing need for truly decentralized, interoperable agent ecosystems that can scale effectively without the need for central coordination.

artificial intelligence, natural language, text processing, (15 more...)

2507.04376

Country: North America > United States (0.47)

Genre: Research Report (0.64)

Industry:

Transportation > Passenger (1.00)
Transportation > Air (1.00)
Consumer Products & Services > Travel (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (0.94)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.93)

NDAI-NeuroMAP: A Neuroscience-Specific Embedding Model for Domain-Specific Retrieval

Patel, Devendra, Jain, Aaditya, Verma, Jayant, Rajput, Divyansh, Mahala, Sunil, Khapare, Ketki Suresh, Kalla, Jayateja

The exponential growth in neuroscience research output and clinical data necessitates the development of specialized natural language processing models tailored to this domain. Contemporary embedding models, while demonstrating superior performance on general-purpose benchmarks, exhibit suboptimal efficacy when applied to neuroscience-specific tasks due to their broad training objectives and limited exposure to domain-specific terminologies and conceptual relationships. This limitation significantly constrains the development of advanced applications including patient-centric retrieval-augmented generation (RAG) systems and comprehensive electronic health record (EHR) mining for neurological healthcare applications. To address this critical gap, we present NDAI-NeuroMAP, the first neuroscience-domain-specific dense vector embedding model engineered for high-precision information retrieval tasks. Our methodology encompasses the curation of an extensive domain-specific training corpus comprising 500,000 carefully constructed triplets (query-positive-negative configurations), augmented with 250,000 neuroscience-specific definitional entries and 250,000 structured knowledge-graph triplets derived from authoritative neurological ontologies. We employ a sophisticated fine-tuning approach utilizing the FremyCompany/BioLORD-2023 foundation model, implementing a multi-objective optimization framework combining contrastive learning with triplet-based metric learning paradigms. Comprehensive evaluation on a held-out test dataset comprising approximately 24,000 neuroscience-specific queries demonstrates substantial performance improvements over state-of-the-art general-purpose and biomedical embedding models. These empirical findings underscore the critical importance of domain-specific embedding architectures for neuroscience-oriented RAG systems and related clinical natural language processing applications. The landscape of natural language processing (NLP) has evolved profoundly over the past decade, driven by advances in neural embedding architectures. These models, which transform text into dense, high-dimensional vectors, now support diverse tasks spanning cross-lingual translation to large-scale information retrieval. Early methods, such as the seminal Word2V ec [1] and GloV e [2], introduced static word embeddings that successfully captured semantic relationships through distributional statistics, but failed to account for context, producing identical vectors for terms like "bank" regardless of meaning. Contextualized embedding architectures subsequently overcame these limitations.

information retrieval, large language model, machine learning, (22 more...)

2507.03329

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (0.87)
Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (0.69)
(5 more...)

Fox, Oliver Robert, Bergami, Giacomo, Morgan, Graham

Verified Language Processing with Hybrid Explainability: A Technical Report

The volume and diversity of digital information have led to a growing reliance on Machine Learning techniques, such as Natural Language Processing, for interpreting and accessing appropriate data. While vector and graph embeddings represent data for similarity tasks, current state-of-the-art pipelines lack guaranteed explainability, failing to determine similarity for given full texts accurately. These considerations can also be applied to classifiers exploiting generative language models with logical prompts, which fail to correctly distinguish between logical implication, indifference, and inconsistency, despite being explicitly trained to recognise the first two classes. We present a novel pipeline designed for hybrid explainability to address this. Our methodology combines graphs and logic to produce First-Order Logic representations, creating machine- and human-readable representations through Montague Grammar. Preliminary results indicate the effectiveness of this approach in accurately capturing full text similarity. To the best of our knowledge, this is the first approach to differentiate between implication, inconsistency, and indifference for text classification tasks. To address the limitations of existing approaches, we use three self-contained datasets annotated for the former classification task to determine the suitability of these approaches in capturing sentence structure equivalence, logical connectives, and spatiotemporal reasoning. We also use these data to compare the proposed method with language models pre-trained for detecting sentence entailment. The results show that the proposed method outperforms state-of-the-art models, indicating that natural language understanding cannot be easily generalised by training over extensive document corpora. This work offers a step toward more transparent and reliable Information Retrieval from extensive textual data.

logic & formal reasoning, machine learning, natural language, (21 more...)

2507.05017

Country:

Asia (1.00)
Europe > United Kingdom (0.92)
North America > United States > California (0.45)
(3 more...)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
(3 more...)

RELRaE: LLM-Based Relationship Extraction, Labelling, Refinement, and Evaluation

Hannah, George, de Berardinis, Jacopo, Payne, Terry R., Tamma, Valentina, Mitchell, Andrew, Piercy, Ellen, Johnson, Ewan, Ng, Andrew, Rostron, Harry, Konev, Boris

A large volume of XML data is produced in experiments carried out by robots in laboratories. In order to support the interoperability of data between labs, there is a motivation to translate the XML data into a knowledge graph. A key stage of this process is the enrichment of the XML schema to lay the foundation of an ontology schema. To achieve this, we present the RELRaE framework, a framework that employs large language models in different stages to extract and accurately label the relationships implicitly present in the XML schema. We investigate the capability of LLMs to accurately generate these labels and then evaluate them. Our work demonstrates that LLMs can be effectively used to support the generation of relationship labels in the context of lab automation, and that they can play a valuable role within semi-automatic ontology generation frameworks more generally.

large language model, machine learning, natural language, (18 more...)

2507.03829

Country: Europe (1.00)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Alharbi, Reham, Tamma, Valentina, Payne, Terry R., de Berardinis, Jacopo

A Comparative Study of Competency Question Elicitation Methods from Ontology Requirements

Competency Questions (CQs) are pivotal in knowledge engineering, guiding the design, validation, and testing of ontologies. A number of diverse formulation approaches have been proposed in the literature, ranging from completely manual to Large Language Model (LLM) driven ones. However, attempts to characterise the outputs of these approaches and their systematic comparison are scarce. This paper presents an empirical comparative evaluation of three distinct CQ formulation approaches: manual formulation by ontology engineers, instantiation of CQ patterns, and generation using state of the art LLMs. We generate CQs using each approach from a set of requirements for cultural heritage, and assess them across different dimensions: degree of acceptability, ambiguity, relevance, readability and complexity. Our contribution is twofold: (i) the first multi-annotator dataset of CQs generated from the same source using different methods; and (ii) a systematic comparison of the characteristics of the CQs resulting from each approach. Our study shows that different CQ generation approaches have different characteristics and that LLMs can be used as a way to initially elicit CQs, however these are sensitive to the model used to generate CQs and they generally require a further refinement step before they can be used to model requirements.

large language model, machine learning, natural language, (19 more...)

2507.02989

Genre: Research Report > New Finding (1.00)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)