 Dinh, Sang


GloCOM: A Short Text Neural Topic Model via Global Clustering Context

arXiv.org Artificial Intelligence

Uncovering hidden topics from short texts is challenging for traditional and neural models due to data sparsity, which limits word co-occurrence patterns, and label sparsity, stemming from incomplete reconstruction targets. Although data aggregation offers a potential solution, existing neural topic models often overlook it due to time complexity, poor aggregation quality, and difficulty in inferring topic proportions for individual documents. In this paper, we propose a novel model, GloCOM (Global Clustering COntexts for Topic Models), which addresses these challenges by constructing aggregated global clustering contexts for short documents, leveraging text embeddings from pre-trained language models. GloCOM can infer both global topic distributions for clustering contexts and local distributions for individual short texts. Additionally, the model incorporates these global contexts to augment the reconstruction loss, effectively handling the label sparsity issue. Extensive experiments on short text datasets show that our approach outperforms other state-of-the-art models in both topic quality and document representations.
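
To make the aggregation idea concrete, below is a minimal sketch of GloCOM-style global clustering contexts, not the authors' implementation: short texts are embedded with a pre-trained encoder, clustered so each cluster defines one global context, and each document's bag-of-words reconstruction target is mixed with its cluster's aggregated counts. The encoder name, cluster count, and mixing weight `lam` are illustrative assumptions.

```python
# Sketch of global clustering contexts for short texts (illustrative, not the
# authors' code). Assumes sentence-transformers and scikit-learn are installed.
import numpy as np
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import CountVectorizer

docs = [
    "gpu memory error",
    "cuda out of memory",
    "stock prices fell",
    "markets closed lower",
]

# 1. Embed short texts with a pre-trained language model.
encoder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed encoder choice
emb = encoder.encode(docs)

# 2. Cluster embeddings; each cluster defines one aggregated global context.
k = 2
labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(emb)

# 3. Bag-of-words per document, summed per cluster into global context counts.
vec = CountVectorizer()
bow = vec.fit_transform(docs).toarray().astype(float)               # (docs, vocab)
global_bow = np.stack([bow[labels == c].sum(0) for c in range(k)])  # (k, vocab)

# 4. Augment each document's reconstruction target with its cluster's context,
#    easing label sparsity (`lam` is a hypothetical mixing weight).
lam = 0.5
targets = (1 - lam) * bow + lam * global_bow[labels]
```

A topic model trained against `targets` instead of the raw `bow` sees denser co-occurrence signal per document, while per-document (local) topic proportions can still be inferred alongside the per-cluster (global) ones.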


SemiKong: Curating, Training, and Evaluating A Semiconductor Industry-Specific Large Language Model

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have demonstrated the potential to address some issues within the semiconductor industry. However, they are often general-purpose models that lack the specialized knowledge needed to tackle the unique challenges of this sector, such as the intricate physics and chemistry of semiconductor devices and processes. SemiKong, the first industry-specific LLM for the semiconductor domain, provides a foundation that can be used to develop tailored proprietary models. With SemiKong 1.0, we aim to develop a foundational model capable of understanding etching problems at an expert level. Our key contributions include (a) curating a comprehensive corpus of semiconductor-related texts, (b) creating a foundational model with in-depth semiconductor knowledge, and (c) introducing a framework for integrating expert knowledge, thereby advancing the evaluation process of domain-specific AI models. Through fine-tuning a pre-trained LLM using our curated dataset, we have shown that SemiKong outperforms larger, general-purpose LLMs in various semiconductor manufacturing and design tasks. Our extensive experiments underscore the importance of developing domain-specific LLMs as a foundation for company- or tool-specific proprietary models, paving the way for further research and applications in the semiconductor domain. Code and dataset will be available at https://github.com/aitomatic/semikong
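
As a rough illustration of the fine-tuning step described above, here is a parameter-efficient (LoRA) domain-adaptation sketch, not the released SemiKong pipeline. The base model name, corpus file `semiconductor_corpus.txt`, and all hyperparameters are placeholders; it assumes the `transformers`, `peft`, and `datasets` libraries.

```python
# Sketch of domain-adaptive fine-tuning on a curated corpus (illustrative only).
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

base = "meta-llama/Llama-2-7b-hf"  # placeholder base model
tok = AutoTokenizer.from_pretrained(base)
tok.pad_token = tok.eos_token
model = AutoModelForCausalLM.from_pretrained(base)

# Parameter-efficient adaptation: train low-rank adapters, freeze base weights.
model = get_peft_model(model, LoraConfig(r=16, lora_alpha=32,
                                         target_modules=["q_proj", "v_proj"]))

# Curated domain texts, one document per line (hypothetical file).
data = load_dataset("text", data_files="semiconductor_corpus.txt")["train"]
data = data.map(lambda x: tok(x["text"], truncation=True, max_length=1024),
                remove_columns=["text"])

Trainer(
    model=model,
    args=TrainingArguments(output_dir="semikong-lora", num_train_epochs=1,
                           per_device_train_batch_size=2, learning_rate=2e-4),
    train_dataset=data,
    data_collator=DataCollatorForLanguageModeling(tok, mlm=False),
).train()
```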


DANA: Domain-Aware Neurosymbolic Agents for Consistency and Accuracy

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have shown remarkable capabilities, but their inherent probabilistic nature often leads to inconsistency and inaccuracy in complex problem-solving tasks. This paper introduces DANA (Domain-Aware Neurosymbolic Agent), an architecture that addresses these issues by integrating domain-specific knowledge with neurosymbolic approaches. We begin by analyzing current AI architectures, including AutoGPT, LangChain ReAct, and OpenAI's ChatGPT, through a neurosymbolic lens, highlighting how their reliance on probabilistic inference contributes to inconsistent outputs. In response, DANA captures and applies domain expertise in both natural-language and symbolic forms, enabling more deterministic and reliable problem-solving behaviors. We implement a variant of DANA using Hierarchical Task Plans (HTPs) in the open-source OpenSSA framework. This implementation achieves over 90% accuracy on the FinanceBench financial-analysis benchmark, significantly outperforming current LLM-based systems in both consistency and accuracy. Application of DANA in physical industries such as semiconductors shows that its flexible architecture for incorporating knowledge is effective in mitigating the probabilistic limitations of LLMs and has potential for tackling complex, real-world problems that require reliability and precision.
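
The sketch below illustrates the Hierarchical Task Plan idea in plain Python; it is not the OpenSSA API. A task is decomposed into subtasks, each answered by a deterministic domain routine when one exists, with a probabilistic LLM call (a stand-in function here) used only as a fallback or to aggregate results.

```python
# Illustrative Hierarchical Task Plan (HTP), not OpenSSA's actual interface.
from dataclasses import dataclass, field
from typing import Callable, Optional


def llm(prompt: str) -> str:
    """Stand-in for an LLM completion call (hypothetical)."""
    return f"<LLM answer to: {prompt}>"


@dataclass
class Task:
    goal: str
    solver: Optional[Callable[[], str]] = None   # symbolic domain knowledge
    subtasks: list["Task"] = field(default_factory=list)

    def solve(self) -> str:
        if self.subtasks:                        # decompose, then aggregate
            parts = [t.solve() for t in self.subtasks]
            return llm(f"Combine into an answer to '{self.goal}': {parts}")
        if self.solver:                          # deterministic path preferred
            return self.solver()
        return llm(self.goal)                    # probabilistic fallback


# Toy financial-analysis plan: the margin comes from an exact formula, not the LLM.
plan = Task(
    goal="Assess 2023 profitability",
    subtasks=[
        Task("Compute gross margin", solver=lambda: f"{(150 - 90) / 150:.1%}"),
        Task("Summarize revenue drivers"),       # no symbolic solver: LLM call
    ],
)
print(plan.solve())
```

Routing as much of the plan as possible through symbolic solvers is what makes the overall behavior more deterministic than a single end-to-end LLM call.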


Enhancing Q&A with Domain-Specific Fine-Tuning and Iterative Reasoning: A Comparative Study

arXiv.org Artificial Intelligence

AI-powered question-answering (Q&A) systems have emerged as important tools, alongside established search technologies, to enable quick access to relevant information and knowledge from large digital sources that are complex and time-consuming for humans to navigate. Advancements in large language models (LLMs) have revolutionized the field of Q&A, with models like GPT-3 (Brown et al. 2020), BERT (Devlin et al. 2018), and RoBERTa (Liu et al. 2019) demonstrating remarkable abilities in understanding and generating human-like text. However, the effectiveness of such models in handling domain-specific questions that require specialized knowledge is limited. Retrieval-augmented generation (RAG) techniques, which combine information retrieval and generative models (Lewis et al. 2021), have shown promise in boosting the quality of LLM output in Q&A tasks. RAG systems leverage the strengths of both retrieval and generation components to provide contextually relevant and informative responses. While there is a lack of established quantification of RAG accuracy, early findings suggest that generic RAG does not perform well in complex domains such as finance. In one instance, RAG based on generic LLMs such as GPT-4-Turbo fails to answer 81% of the questions derived from Securities and Exchange Commission (SEC) financial filings (Islam et al. 2023).
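
For readers unfamiliar with the RAG loop the abstract refers to, here is a minimal sketch: chunks of a filing are embedded, the top-k chunks by cosine similarity to the question are retrieved, and the generator is prompted with that context. The encoder name and the `generate` stand-in are assumptions, not any specific production system.

```python
# Minimal retrieval-augmented generation loop (illustrative sketch).
import numpy as np
from sentence_transformers import SentenceTransformer

chunks = [
    "Revenue for fiscal 2023 was $4.2B, up 7% year over year.",
    "Operating expenses increased due to R&D headcount growth.",
    "The company repurchased $500M of common stock.",
]

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed encoder choice
chunk_emb = encoder.encode(chunks, normalize_embeddings=True)


def generate(prompt: str) -> str:
    """Placeholder for an LLM completion call (hypothetical)."""
    return f"<answer grounded in: {prompt[:60]}...>"


def rag_answer(question: str, k: int = 2) -> str:
    q = encoder.encode([question], normalize_embeddings=True)[0]
    top = np.argsort(chunk_emb @ q)[::-1][:k]     # cosine-similarity ranking
    context = "\n".join(chunks[i] for i in top)
    return generate(f"Context:\n{context}\n\nQuestion: {question}\nAnswer:")


print(rag_answer("How did revenue change in fiscal 2023?"))
```

Generic RAG of this shape retrieves by surface similarity only; the paper's point is that domain-specific fine-tuning and iterative reasoning are needed when questions hinge on specialized knowledge, as in SEC filings.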