Collaborating Authors

 Yoon, David Seunghyun


Domain-specific Question Answering with Hybrid Search

arXiv.org Artificial Intelligence

With the increasing adoption of Large Language Models (LLMs) in enterprise settings, ensuring accurate and reliable question-answering systems remains a critical challenge. Building upon our previous work on domain-specific question answering about Adobe products (Sharma et al. 2024), which established a retrieval-aware framework with self-supervised training, we now present a production-ready, generalizable architecture alongside a comprehensive evaluation methodology. Our core contribution is a flexible, scalable framework built on Elasticsearch that can be adapted for any LLM-based question-answering system. This framework seamlessly integrates hybrid retrieval mechanisms, combining dense and sparse search with boost matching. Our key contributions include:

- A production-ready, generalizable framework for LLM-based QA systems built on Elasticsearch
- A flexible hybrid retrieval mechanism combining dense and sparse search methods
- A comprehensive evaluation framework for assessing QA system performance
- Empirical analysis demonstrating the effectiveness of our approach across various metrics

Through this work, we provide not only theoretical insights but also a practical, deployable solution for building reliable domain-specific question-answering systems that can be adapted to various enterprise needs.
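The hybrid retrieval the abstract describes can be approximated with Elasticsearch 8.x, which accepts a BM25 query and a kNN clause in a single search request and sums their scores. Below is a minimal sketch; the index name, field names, boost weights, and endpoint are illustrative assumptions, not the paper's actual configuration.

```python
# Minimal sketch of hybrid dense + sparse retrieval with Elasticsearch 8.x.
# Index layout ("docs" with a text field `content` and a dense_vector field
# `embedding`) and boost weights are assumptions for illustration.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumed local cluster

def hybrid_search(query_text: str, query_vector: list[float], k: int = 10):
    """Combine BM25 (sparse) and kNN (dense) scores in one request.

    Elasticsearch adds the BM25 and kNN scores per document, so the
    `boost` values act as interpolation weights between the two signals.
    """
    return es.search(
        index="docs",
        query={  # sparse side: BM25 keyword match
            "match": {
                "content": {"query": query_text, "boost": 0.3}
            }
        },
        knn={  # dense side: vector similarity over the embedding field
            "field": "embedding",
            "query_vector": query_vector,
            "k": k,
            "num_candidates": 100,
            "boost": 0.7,
        },
        size=k,
    )
```

Keyword `boost` values can likewise be raised on exact-match fields (for example, product names) to implement the boost matching mentioned above.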


Retrieval Augmented Generation for Domain-specific Question Answering

arXiv.org Artificial Intelligence

Question answering (QA) has become an important application amid the rapid development of large language models. General pre-trained large language models for question answering are not trained to properly understand the knowledge or terminology of a specific domain, such as finance, healthcare, education, or customer service for a product. To better cater to domain-specific understanding, we build an in-house question-answering system for Adobe products. We propose a novel framework to compile a large question-answer database and develop an approach for retrieval-aware fine-tuning of a Large Language Model (LLM). We show that fine-tuning the retriever leads to major improvements in the final generation. Our overall approach reduces hallucinations during generation while keeping the latest retrieved information in context for grounding.
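The abstract credits most of the improvement to fine-tuning the retriever on domain question-answer pairs. The sketch below shows one common way to do that with sentence-transformers and in-batch negatives; the base model, loss, and hyperparameters are assumptions for illustration, since the abstract does not specify the paper's training objective.

```python
# Sketch: fine-tune a dense retriever on (question, answer) pairs.
# Base model and MultipleNegativesRankingLoss (in-batch negatives) are
# assumed stand-ins, not necessarily the paper's setup.
from sentence_transformers import SentenceTransformer, InputExample, losses
from torch.utils.data import DataLoader

# Hypothetical domain QA pairs mined from product documentation.
qa_pairs = [
    ("How do I export a PDF?", "Open the File menu and choose Export to PDF."),
    ("How do I reset my password?", "Go to account settings and select Reset Password."),
]

model = SentenceTransformer("all-MiniLM-L6-v2")
train_examples = [InputExample(texts=[q, a]) for q, a in qa_pairs]
loader = DataLoader(train_examples, shuffle=True, batch_size=2)

# Each question is pulled toward its paired answer and pushed away from
# the other answers in the batch.
loss = losses.MultipleNegativesRankingLoss(model)
model.fit(train_objectives=[(loader, loss)], epochs=1, warmup_steps=10)
```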


PDFTriage: Question Answering over Long, Structured Documents

arXiv.org Artificial Intelligence

Large Language Models (LLMs) struggle with document question answering (QA) when the document does not fit within the LLM's limited context window. To overcome this issue, most existing works focus on retrieving the relevant context from the document and representing it as plain text. However, documents such as PDFs, web pages, and presentations are naturally structured, with pages, tables, sections, and so on. Representing such structured documents as plain text is incongruous with the user's mental model of these documents and their rich structure. When a system queries the document for context, this incongruity comes to the fore, and seemingly trivial questions can trip up the QA system. To bridge this fundamental gap in handling structured documents, we propose PDFTriage, an approach that enables models to retrieve context based on either structure or content. Our experiments demonstrate the effectiveness of PDFTriage-augmented models across several classes of questions where existing retrieval-augmented LLMs fail. To facilitate further research on this fundamental problem, we release our benchmark dataset consisting of 900+ human-generated questions over 80 structured documents from 10 different categories of question types for document QA. Our code and datasets will be released soon on GitHub.
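The core idea is to let the model fetch context by document structure (pages, sections, tables) rather than only by text similarity. Below is a minimal sketch of what such a structured-retrieval interface might look like; the class, function names, and document representation are assumptions based on the abstract's description, not the paper's exact API.

```python
# Sketch of structure-aware context fetching in the spirit of PDFTriage.
# The representation and fetch functions are illustrative assumptions.
from dataclasses import dataclass, field

@dataclass
class StructuredDoc:
    pages: dict[int, str]                    # page number -> page text
    sections: dict[str, str]                 # section title -> section text
    tables: dict[str, str] = field(default_factory=dict)  # table id -> serialized table

    def fetch_pages(self, numbers: list[int]) -> str:
        """Return the raw text of the requested pages."""
        return "\n".join(self.pages[n] for n in numbers if n in self.pages)

    def fetch_sections(self, titles: list[str]) -> str:
        """Return the text of the requested sections, looked up by title."""
        return "\n".join(self.sections[t] for t in titles if t in self.sections)

    def fetch_table(self, table_id: str) -> str:
        """Return one table serialized as text."""
        return self.tables.get(table_id, "")

# Usage pattern: the LLM is first shown the document's metadata (page count,
# section titles, table ids), chooses which fetch call to issue for a given
# question, and the returned context is placed in the prompt for generation.
```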