AITopics

2409.18695

Country:

Asia > Middle East > Jordan (0.04)
Asia > China > Shanghai > Shanghai (0.04)
Europe > Monaco (0.04)

Genre:

Research Report (0.50)
Workflow (0.46)

Industry:

Leisure & Entertainment > Games > Computer Games (0.54)
Materials > Chemicals > Commodity Chemicals > Petrochemicals (0.31)

Technology:

Information Technology > Knowledge Management > Knowledge Engineering (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Hidalgo, Rafael, Parron, Jesse, Varde, Aparna S., Wang, Weitian

Robo-CSK-Organizer: Commonsense Knowledge to Organize Detected Objects for Multipurpose Robots

In the rapidly evolving field of robotics, the integration of commonsense knowledge (CSK) into AI systems is becoming crucial for enhancing the decision-making capabilities of robots, especially in next-generation multipurpose environments. This paper presents Robo-CSK-Organizer, a pioneering system that employs CSK, via a classical knowledge base, to facilitate sophisticated task-based object organization for multipurpose robots. Unlike systems relying solely on deep learning tools such as ChatGPT, Robo-CSK-Organizer stands out in several respects: (1) its ability to resolve ambiguities and maintain consistency in object placement; (2) its adaptability to diverse task-based classifications; and (3) its contributions to explainable AI (XAI), which help foster trust and human-robot collaboration. The system's efficacy is underpinned by DETIC (DEtector with Image Classes), an advanced extension of Detectron2, for object identification; BLIP (Bootstrapping Language-Image Pre-training) for context discernment; and, most vitally, an adaptation of ConceptNet, a well-grounded commonsense knowledge base, for reasoning based on semantic as well as pragmatic knowledge. While we deploy ConceptNet to extract CSK, the process in Robo-CSK-Organizer is generic enough to be replicated with other state-of-the-art knowledge bases. In controlled experiments and real-world applications, synopsized in this paper, Robo-CSK-Organizer demonstrates superior performance in placing objects in contextually relevant locations, highlighting its capacity for commonsense-guided decision-making closer to the thresholds of human cognition. Hence, Robo-CSK-Organizer makes valuable contributions to Robotics and AI.
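To make the idea of knowledge-base-driven placement concrete, here is a minimal sketch of how a detected object label might be mapped to a location using weighted commonsense edges. It is not the authors' pipeline: the edge list, its weights, and the function `best_location` are hypothetical, and only the relation names `AtLocation` and `UsedFor` are borrowed from ConceptNet's vocabulary. A real system would query a knowledge base rather than a hard-coded list.

```python
# Hypothetical, hand-written edges in (start, relation, end, weight) form.
# The relation names mirror ConceptNet; the data itself is invented.
EDGES = [
    ("mug", "AtLocation", "kitchen", 2.0),
    ("mug", "AtLocation", "office", 1.0),
    ("mug", "UsedFor", "drinking", 2.5),
    ("toothbrush", "AtLocation", "bathroom", 3.0),
]


def best_location(obj, candidates, edges=EDGES):
    """Pick the candidate location with the highest total AtLocation weight.

    Returns (location, scores). The location is None when no edge supports
    any candidate, so the caller can fall back to asking a human.
    """
    scores = {c: 0.0 for c in candidates}
    for start, relation, end, weight in edges:
        if start == obj and relation == "AtLocation" and end in scores:
            scores[end] += weight
    best = max(scores, key=scores.get)
    if scores[best] == 0.0:
        return None, scores
    return best, scores


if __name__ == "__main__":
    print(best_location("mug", ["kitchen", "office", "bathroom"]))
    print(best_location("lamp", ["kitchen", "office"]))
```

Because the scores come from explicit edges, the same lookup can also be reported to the user as the reason for a placement, which is the kind of explainability the abstract emphasizes.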

large language model, machine learning, natural language, (20 more...)

2409.18385

Country:

North America > United States (0.14)
Europe > Germany > Saarland > Saarbrücken (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

LCMDC: Large-scale Chinese Medical Dialogue Corpora for Automatic Triage and Medical Consultation

Wang, Xinyuan, Li, Haozhou, Zheng, Dingfang, Peng, Qinke

The global COVID-19 pandemic underscored major deficiencies in traditional healthcare systems, hastening the advancement of online medical services, especially in medical triage and consultation. However, existing studies face two main challenges. First, large-scale, publicly available, domain-specific medical datasets are scarce due to privacy concerns; current datasets are small and limited to a few diseases, which limits the effectiveness of triage methods based on Pre-trained Language Models (PLMs). Second, existing methods lack medical knowledge and struggle to accurately understand professional terms and expressions in patient-doctor consultations. To overcome these obstacles, we construct the Large-scale Chinese Medical Dialogue Corpora (LCMDC), comprising a Coarse-grained Triage dataset with 439,630 samples, a Fine-grained Diagnosis dataset with 199,600 samples, and a Medical Consultation dataset with 472,418 items, thereby addressing the data shortage in this field. We further propose a novel triage system that combines BERT-based supervised learning with prompt learning, as well as a GPT-based medical consultation model using reinforcement learning. To enhance domain knowledge acquisition, we pre-trained PLMs using our self-constructed background corpus. Experimental results on the LCMDC demonstrate the efficacy of our proposed systems.

large language model, machine learning, natural language, (18 more...)

2410.03521

Country:

Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States > Pennsylvania (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report (0.64)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Health Care Technology > Telehealth (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (0.89)

Contrastive Learning for Knowledge-Based Question Generation in Large Language Models

Zhang, Zhenhong, Chen, Jiajing, Shi, Weiyan, Yi, Lingjie, Wang, Chihang, Yu, Qian

With the rapid development of artificial intelligence technology, especially the increasingly widespread application of question-and-answer systems, high-quality question generation has become a key component in supporting the development of these systems. This article focuses on knowledge-based question generation technology, which aims to enable computers to simulate the human questioning process based on understanding specific texts or knowledge bases. In light of the issues of hallucination and knowledge gaps present in large-scale language models when applied to knowledge-intensive tasks, this paper proposes an enhanced question generation method that incorporates contrastive learning. This method utilizes multiple models to jointly mine domain knowledge and uses contrastive learning to guide the model in reducing noise and hallucinations in generation. Experimental results show that by designing prompts containing contrasting examples, the model's performance in question generation improves considerably, particularly when contrasting instructions and examples are used simultaneously, leading to the highest quality of generated questions and improved accuracy. These results demonstrate that the method proposed in this study, which combines contrasting context and chain-of-thought prompts, can effectively improve both the quality and the practicality of question generation.
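As an illustration of what a prompt "containing contrasting examples" could look like, the sketch below assembles a question-generation prompt from a passage, a few good questions, and a few flawed ones with the reason each is flawed. The function name `build_contrastive_prompt` and all of the prompt wording are assumptions made for this example; the abstract does not give the authors' templates.

```python
def build_contrastive_prompt(passage, good_examples, bad_examples):
    """Assemble a prompt that pairs an instruction with contrasting examples.

    good_examples: list of question strings to imitate.
    bad_examples:  list of (question, flaw) tuples to steer away from.
    The wording is illustrative only, not taken from the paper.
    """
    lines = [
        "Generate one question that can be answered from the passage below.",
        "Follow the style of the GOOD examples and avoid the flaws in the BAD examples.",
        "",
        "Passage:",
        passage.strip(),
        "",
        "GOOD examples (answerable from the passage):",
    ]
    lines += [f"- {q}" for q in good_examples]
    lines += ["", "BAD examples (unanswerable or hallucinated details):"]
    lines += [f"- {q} (flaw: {why})" for q, why in bad_examples]
    lines += [
        "",
        # A chain-of-thought style cue, mirroring the abstract's mention of
        # combining contrasting context with chain-of-thought prompts.
        "Think step by step about which facts the passage supports, then write the question.",
        "Question:",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_contrastive_prompt(
        "Water boils at 100 degrees Celsius at sea level.",
        ["At what temperature does water boil at sea level?"],
        [("Who first measured the boiling point?", "not stated in the passage")],
    ))
```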

large language model, machine learning, question answering, (16 more...)

2409.13994

Country:

North America > United States > New York > Suffolk County > Stony Brook (0.04)
Oceania > Australia > Australian Capital Territory > Canberra (0.04)
Europe > Switzerland (0.04)
Asia > Singapore > Central Region > Singapore (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

RmGPT: Rotating Machinery Generative Pretrained Model

Wang, Yilin, Yu, Yifei, Sun, Kong, Lei, Peixuan, Zhang, Yuxuan, Zio, Enrico, Xia, Aiguo, Li, Yuanxiang

In industry, the reliability of rotating machinery is critical for production efficiency and safety. Current methods of Prognostics and Health Management (PHM) often rely on task-specific models, which face significant challenges in handling diverse datasets with varying signal characteristics, fault modes and operating conditions. Inspired by advancements in generative pretrained models, we propose RmGPT, a unified model for diagnosis and prognosis tasks. RmGPT introduces a novel token-based framework, incorporating Signal Tokens, Prompt Tokens, Time-Frequency Task Tokens and Fault Tokens to handle heterogeneous data within a unified model architecture. We leverage self-supervised learning for robust feature extraction and introduce a next signal token prediction pretraining strategy, alongside efficient prompt learning for task-specific adaptation. Extensive experiments demonstrate that RmGPT significantly outperforms state-of-the-art algorithms, achieving near-perfect accuracy in diagnosis tasks and exceptionally low errors in prognosis tasks. Notably, RmGPT excels in few-shot learning scenarios, achieving 92% accuracy in 16-class one-shot experiments, highlighting its adaptability and robustness. This work establishes RmGPT as a powerful PHM foundation model for rotating machinery, advancing the scalability and generalizability of PHM solutions.

dataset, diagnosis and prognosis task, rmgpt, (13 more...)

2409.17604

Country:

Asia > China > Shanghai > Shanghai (0.05)
Asia > China > Beijing > Beijing (0.04)
Europe > Italy > Lombardy > Milan (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)

Genre: Research Report > New Finding (0.93)

Industry:

Health & Medicine > Consumer Health (0.67)
Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

arXiv.org Artificial Intelligence Sep-23-2024

DSG-KD: Knowledge Distillation from Domain-Specific to General Language Models

Cho, Sangyeon, Jeon, Jangyeong, Lee, Dongjoon, Lee, Changhee, Kim, Junyeong

The use of pre-trained language models fine-tuned to address specific downstream tasks is a common approach in natural language processing (NLP). However, acquiring domain-specific knowledge via fine-tuning is challenging. Traditional methods involve pretraining language models using vast amounts of domain-specific data before fine-tuning for particular tasks. This study investigates emergency/non-emergency classification tasks based on electronic medical record (EMR) data obtained from pediatric emergency departments (PEDs) in Korea. Our findings reveal that existing domain-specific pre-trained language models underperform compared to general language models in handling the N-lingual free-text data characteristic of non-English-speaking regions. To address these limitations, we propose a domain knowledge transfer methodology that leverages knowledge distillation to infuse general language models with domain-specific knowledge via fine-tuning. This study demonstrates the effective transfer of specialized knowledge between models by defining a general language model as the student model and a domain-specific pre-trained model as the teacher model. In particular, we address the complexities of EMR data obtained from PEDs in non-English-speaking regions, such as Korea, and demonstrate that the proposed method enhances classification performance in such contexts. The proposed methodology not only outperforms baseline models on Korean PED EMR data, but also promises broader applicability in various professional and technical domains. In future work, we intend to extend this methodology to include diverse non-English-speaking regions and address additional downstream tasks, with the aim of developing advanced model architectures using state-of-the-art KD techniques. The code is available at https://github.com/JoSangYeon/DSG-KD.
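The student/teacher setup described above can be illustrated with the commonly used temperature-scaled distillation objective. This is a sketch of that generic formulation, assumed here for illustration; the abstract does not state the exact loss used in DSG-KD, and the names `distillation_loss`, `temperature`, and `alpha` are this example's own.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Weighted sum of hard-label cross-entropy and soft-label KL divergence.

    alpha weights the cross-entropy against the ground-truth label;
    (1 - alpha) weights KL(teacher || student) on temperature-softened
    distributions, scaled by temperature**2 as is conventional.
    """
    hard = -math.log(softmax(student_logits)[label])
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    soft = sum(t * math.log(t / s) for t, s in zip(p_teacher, p_student) if t > 0)
    return alpha * hard + (1 - alpha) * temperature ** 2 * soft


if __name__ == "__main__":
    # Two classes, e.g. emergency vs. non-emergency; the logits are made up.
    print(distillation_loss([1.2, -0.3], [2.0, -1.0], label=0))
```

In the setting of the abstract, `student_logits` would come from the general language model and `teacher_logits` from the domain-specific pre-trained model.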

knowledge, language model, teacher model, (13 more...)

2409.14904

Country:

North America > United States > Pennsylvania (0.04)
North America > United States > Illinois (0.04)
Asia > South Korea > Seoul > Seoul (0.04)
Asia > China > Guangdong Province > Guangzhou (0.04)

Genre: Research Report > New Finding (0.66)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Health Care Technology > Medical Record (1.00)
Health & Medicine > Health Care Providers & Services (1.00)
Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

arXiv.org Artificial Intelligence Sep-18-2024

Geometric Relational Embeddings

Xiong, Bo

Relational representation learning transforms relational data into continuous and low-dimensional vector representations. However, vector-based representations fall short in capturing crucial properties of relational data that are complex and symbolic. We propose geometric relational embeddings, a paradigm of relational embeddings that respect the underlying symbolic structures. Specifically, this dissertation introduces various geometric relational embedding models capable of capturing: 1) complex structured patterns like hierarchies and cycles in networks and knowledge graphs; 2) logical structures in ontologies and logical constraints applicable for constraining machine learning model outputs; and 3) high-order structures between entities and relations. Our results obtained from benchmark and real-world datasets demonstrate the efficacy of geometric relational embeddings in adeptly capturing these discrete, symbolic, and structured properties inherent in relational data.

hyperbolic embedding inference, knowledge discovery and data mining, rotation reflection and hyperbolic rotation, (16 more...)

2409.15369

Country:

North America > United States > California > Los Angeles County > Long Beach (0.13)
Asia > Russia (0.13)
Europe > Switzerland > Zürich > Zürich (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Government > Regional Government > North America Government > United States Government (0.67)

Technology:

Information Technology > Knowledge Management > Knowledge Engineering (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Social Media (1.00)

Silva, P. H. O., Cerqueira, A. S., Nepomuceno, E. G.

Insightful Railway Track Evaluation: Leveraging NARX Feature Interpretation

arXiv.org Artificial Intelligence Sep-17-2024

The classification of time series is essential for extracting meaningful insights and aiding decision-making in engineering domains. Parametric modeling techniques like NARX are invaluable for comprehending intricate processes, such as environmental time series, owing to their easily interpretable and transparent structures. This article introduces a classification algorithm, Logistic-NARX Multinomial, which merges the NARX methodology with logistic regression. This approach not only produces interpretable models but also effectively tackles challenges associated with multiclass classification. Furthermore, this study introduces an innovative methodology tailored for the railway sector, offering a tool by employing NARX models to interpret the multitude of features derived from onboard sensors. This solution provides profound insights through feature importance analysis, enabling informed decision-making regarding safety and maintenance.
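For readers unfamiliar with NARX, the sketch below shows the kind of lagged, polynomial regressor matrix such models are built on. It covers only the feature-construction step, under the assumption of one input signal `u` and one output signal `y`. The function name `narx_regressors` and the lag parameters are hypothetical, and the logistic (multinomial) stage and term selection described in the abstract are not reproduced here.

```python
def narx_regressors(y, u, n_y=2, n_u=2):
    """Build NARX-style candidate regressors up to polynomial degree 2.

    For each time step k, the linear terms are the past outputs
    y[k-1..k-n_y] and past inputs u[k-1..k-n_u]; the quadratic terms are
    all pairwise products of those linear terms (including squares).
    Returns one row of features per usable time step.
    """
    start = max(n_y, n_u)
    rows = []
    for k in range(start, len(y)):
        past_y = [y[k - i] for i in range(1, n_y + 1)]
        past_u = [u[k - i] for i in range(1, n_u + 1)]
        linear = past_y + past_u
        quadratic = [
            linear[i] * linear[j]
            for i in range(len(linear))
            for j in range(i, len(linear))
        ]
        rows.append(linear + quadratic)
    return rows


if __name__ == "__main__":
    for row in narx_regressors([1, 2, 3, 4], [10, 20, 30, 40], n_y=1, n_u=1):
        print(row)
```

Each column corresponds to a named term such as `y[k-1]` or `y[k-1]*u[k-1]`, which is what makes the fitted model readable: a coefficient can be traced back to a specific lagged signal.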

artificial intelligence, expert system, machine learning, (16 more...)

2410.0277

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
South America > Brazil (0.04)
Europe > Ireland (0.04)
Asia > Japan (0.04)

Genre: Research Report (0.91)

Industry: Transportation > Ground > Rail (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (0.46)

arXiv.org Artificial Intelligence Sep-17-2024

ProSLM: A Prolog Synergized Language Model for explainable Domain Specific Knowledge Based Question Answering

Vakharia, Priyesh, Kufeldt, Abigail, Meyers, Max, Lane, Ian, Gilpin, Leilani

Neurosymbolic approaches can add robustness to opaque neural systems by incorporating explainable symbolic representations. However, previous approaches have not used formal logic to contextualize queries to and validate outputs of large language models (LLMs). We propose ProSLM, a novel neurosymbolic framework, to improve the robustness and reliability of LLMs in question-answering tasks. We provide ProSLM with a domain-specific knowledge base, a logical reasoning system, and an integration to an existing LLM. This framework has two capabilities: (1) context gathering: generating explainable and relevant context for a given query, and (2) validation: confirming and validating the factual accuracy of a statement in accordance with a knowledge base (KB). Our work opens a new area of neurosymbolic generative AI text validation and user personalization.
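The two capabilities can be pictured with a toy rule engine. The sketch below is written in Python rather than Prolog and uses ground (variable-free) Horn clauses, so it is far simpler than a real Prolog system. The facts, the rule, and the function names `prove` and `validate` are invented for illustration and are not taken from ProSLM.

```python
# Hypothetical knowledge base: ground facts and rules of the form head :- body.
FACTS = {
    "enrolled(alice, cs101)",
    "prerequisite_met(alice, cs102)",
}
RULES = {
    "can_register(alice, cs102)": [
        "enrolled(alice, cs101)",
        "prerequisite_met(alice, cs102)",
    ],
}


def prove(goal, facts, rules, trace):
    """Backward-chain on a ground goal, recording each step in trace.

    Assumes the rule set is acyclic; a cyclic rule set would recurse forever.
    """
    if goal in facts:
        trace.append(f"fact: {goal}")
        return True
    body = rules.get(goal)
    if body is None:
        return False
    if all(prove(sub, facts, rules, trace) for sub in body):
        trace.append(f"rule: {goal} :- {', '.join(body)}")
        return True
    return False


def validate(statement, facts=FACTS, rules=RULES):
    """Return (is_supported, explanation). The explanation is empty on failure."""
    trace = []
    ok = prove(statement, facts, rules, trace)
    return ok, (trace if ok else [])


if __name__ == "__main__":
    print(validate("can_register(alice, cs102)"))
    print(validate("can_register(bob, cs102)"))
```

The returned trace plays the role of gathered context: it lists exactly which facts and rules support a statement, and an unsupported statement comes back as `False` with no justification.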

language model, llm, proslm, (14 more...)

doi: 10.1007/978-3-031-71170-1_23

2409.11589

Country:

North America > United States > California > Santa Cruz County > Santa Cruz (0.14)
Asia > Singapore (0.04)
Asia > Indonesia > Bali (0.04)

Genre: Research Report (0.40)

Industry: Education > Educational Setting > Online (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.48)

Villagomez, Enrique Luna, Mahalec, Vladimir

Fault Detection and Identification via Monitoring Modules Based on Clusters of Interacting Measurements

arXiv.org Artificial Intelligence Sep-16-2024

This work introduces a novel control-aware distributed process monitoring methodology based on modules comprised of clusters of interacting measurements. The methodology relies on the process flow diagram (PFD) and control system structure without requiring cross-correlation data to create monitoring modules. The methodology is validated on the Tennessee Eastman Process benchmark using full Principal Component Analysis (f-PCA) in the monitoring modules. The results are comparable to nonlinear techniques implemented in a centralized manner such as Kernel PCA (KPCA), Autoencoders (AE), and Recurrent Neural Networks (RNN), or distributed techniques like the Distributed Canonical Correlation Analysis (DCCA). Temporal plots of fault detection by different modules show clearly the magnitude and propagation of the fault through each module, pinpointing the module where the fault originates, and separating controllable faults from other faults. This information, combined with PCA contribution plots, helps detection and identification as effectively as more complex nonlinear centralized or distributed methods.

artificial intelligence, expert system, machine learning, (17 more...)

2409.11444

Country:

North America > Canada (0.28)
North America > United States > Tennessee (0.25)

Genre: Research Report > New Finding (0.46)

Industry: Energy > Oil & Gas > Upstream (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)