AITopics | Frequently Asked Questions (FAQ)

Frequently Asked Questions (FAQ)

Five questions and answers about artificial intelligence

arXiv.org Artificial Intelligence, Sep-24-2024

Rapid advances in Artificial Intelligence (AI) are generating much controversy in society, often without scientific basis. As occurred with the development of other emerging technologies, such as the introduction of electricity in the early 20th century, AI causes both fascination and fear. Following the philosopher R.W. Emerson's advice that 'knowledge is the antidote to fear', this paper seeks to contribute to the dissemination of knowledge about AI. To this end, it reflects on the following questions: the origins of AI, its possible future evolution, its ability to show feelings, the associated threats and dangers, and the concept of AI singularity.

Keywords: Artificial Intelligence (AI), Fourth Industrial Revolution, Beginnings of AI, Development of AI, Automatic learning, Machine learning, Feelings in AI, Dangers of AI, Advantages of AI, Singularity of AI, Superintelligence, Frictionless Reproducibility (FR), Large Language Models, General AI (GAI), Intelligence, GPT Chat.

information, intelligence, question and answer, (15 more...)

arXiv.org Artificial Intelligence

2409.15903

Country:

North America > United States > Pennsylvania (0.04)
North America > United States > New York (0.04)
Europe > Spain > Catalonia (0.04)
Europe > Spain > Andalusia > Granada Province > Granada (0.04)

Genre:

Frequently Asked Questions (FAQ) (0.41)
Research Report (0.40)

Industry:

Leisure & Entertainment > Games > Chess (1.00)
Law (0.93)
Government (0.93)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

AsthmaBot: Multi-modal, Multi-Lingual Retrieval Augmented Generation For Asthma Patient Support

Bahaj, Adil, Ghogho, Mounir

arXiv.org Artificial Intelligence, Sep-24-2024

Asthma rates have risen globally, driven by environmental and lifestyle factors. Access to immediate medical care is limited, particularly in developing countries, necessitating automated support systems. Large Language Models like ChatGPT (Chat Generative Pre-trained Transformer) and Gemini have advanced natural language processing in general and question answering in particular; however, they are prone to producing factually incorrect responses (i.e., hallucinations). Retrieval-augmented generation systems, integrating curated documents, can improve large language models' performance and reduce the incidence of hallucination. We introduce AsthmaBot, a multi-lingual, multi-modal retrieval-augmented generation system for asthma support. Evaluation on an asthma-related frequently asked questions dataset shows AsthmaBot's efficacy. AsthmaBot has an added interactive and intuitive interface that integrates different data modalities (text, images, videos) to make it accessible to the larger public. AsthmaBot is available online via asthmabot.datanets.org.
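To make the retrieval-augmented generation idea in this abstract concrete, here is a minimal sketch of the retrieve-then-prompt step. It is not AsthmaBot's implementation: the bag-of-words retriever, the function names (bow, cosine, retrieve, build_prompt), the prompt wording, and the sample documents are all assumptions chosen for illustration, and the call to a language model is left out.

```python
import math
import re
from collections import Counter


def bow(text):
    """Turn text into a bag-of-words count vector (hypothetical toy encoder)."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, documents, k=2):
    """Return the k curated documents most similar to the query."""
    q = bow(query)
    ranked = sorted(documents, key=lambda d: cosine(q, bow(d)), reverse=True)
    return ranked[:k]


def build_prompt(query, documents, k=2):
    """Assemble a grounded prompt; a real system would send this to an LLM."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents, k))
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )


# Illustrative documents, not taken from the paper.
DOCS = [
    "An inhaler delivers asthma medication directly to the lungs.",
    "Chess openings include the Sicilian defence.",
    "Common asthma triggers include pollen and smoke.",
]
```

Calling build_prompt("What triggers asthma?", DOCS) places the two asthma-related documents in the context and leaves the unrelated one out, which is the mechanism by which curated documents constrain the model's answer.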

asthmabot, language model, query, (13 more...)

arXiv.org Artificial Intelligence

2409.15815

Country:

Europe > United Kingdom > England > West Yorkshire > Leeds (0.04)
Europe > Lithuania > Kaunas County > Kaunas (0.04)
Africa > Middle East > Morocco > Rabat-Salé-Kénitra Region > Rabat (0.04)

Genre:

Research Report (0.50)
Frequently Asked Questions (FAQ) (0.37)

Industry: Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases > Asthma (0.51)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Auto FAQ Generation

Kalvakolanu, Anjaneya Teja, Chandra, NagaSai, Fekadu, Michael

arXiv.org Artificial Intelligence, May-12-2024

FAQ documents are commonly used with text documents and websites to provide important information in the form of question-answer pairs to either aid in reading comprehension or provide a shortcut to the key ideas. We suppose that salient sentences from a given document serve as a good proxy for the answers to an aggregated set of FAQs from readers. We propose a system for generating FAQ documents that extracts the salient questions and their corresponding answers from sizeable text documents scraped from the Stanford Encyclopedia of Philosophy. We use existing text summarization, sentence ranking via the TextRank algorithm, and question-generation tools to create an initial set of questions and answers. Finally, we apply some heuristics to filter out invalid questions. We use human evaluation to rate the generated questions on grammar, whether the question is meaningful, and whether the question is answerable from a summarized context. On average, participants thought 71 percent of the questions were meaningful.
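The sentence-ranking step named in this abstract can be sketched as follows. This is a simplified, self-contained take on TextRank and not the authors' pipeline: the naive sentence splitter, the word-overlap similarity with log-length normalisation, the damping factor of 0.85, and the function names are assumptions for illustration.

```python
import math
import re


def split_sentences(text):
    """Naive splitter: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def similarity(a, b):
    """Shared-word count normalised by the log of each sentence's vocabulary size."""
    wa = set(re.findall(r"\w+", a.lower()))
    wb = set(re.findall(r"\w+", b.lower()))
    if len(wa) < 2 or len(wb) < 2:
        return 0.0
    return len(wa & wb) / (math.log(len(wa)) + math.log(len(wb)))


def textrank(sentences, damping=0.85, iterations=50):
    """Score sentences by weighted PageRank over the similarity graph."""
    n = len(sentences)
    if n == 0:
        return []
    weights = [
        [similarity(sentences[i], sentences[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            # Each neighbour j passes on a share of its score proportional
            # to the weight of the edge j -> i.
            rank = sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            new_scores.append((1 - damping) / n + damping * rank)
        scores = new_scores
    return scores


def top_sentences(text, k=2):
    """Return the k highest-scoring sentences in their original order."""
    sentences = split_sentences(text)
    scores = textrank(sentences)
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return [sentences[i] for i in sorted(ranked[:k])]
```

In a pipeline like the one described, the top-ranked sentences would serve as candidate answers that a separate question-generation tool turns into questions.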

arxiv preprint arxiv, question generation, summarization, (12 more...)

arXiv.org Artificial Intelligence

2405.13006

Country:

North America > United States > California > San Luis Obispo County > San Luis Obispo (0.29)
North America > United States > California > Santa Clara County > Palo Alto (0.05)
Europe > Denmark > Capital Region > Copenhagen (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre:

Frequently Asked Questions (FAQ) (1.00)
Research Report (0.82)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.69)
(2 more...)

[Call for Papers] The 2nd BabyLM Challenge: Sample-efficient pretraining on a developmentally plausible corpus

Choshen, Leshem, Cotterell, Ryan, Hu, Michael Y., Linzen, Tal, Mueller, Aaron, Ross, Candace, Warstadt, Alex, Wilcox, Ethan, Williams, Adina, Zhuang, Chengxu

arXiv.org Artificial Intelligence, Apr-9-2024

After last year's successful BabyLM Challenge, the competition will be hosted again in 2024/2025. The overarching goals of the challenge remain the same; however, some of the competition rules will be different. The big changes for this year's competition are as follows: First, we replace the loose track with a paper track, which allows (for example) non-model-based submissions, novel cognitively-inspired benchmarks, or analysis techniques. Second, we are relaxing the rules around pretraining data, and will now allow participants to construct their own datasets provided they stay within the 100M-word or 10M-word budget. Third, we introduce a multimodal vision-and-language track, and will release a corpus of 50% text-only and 50% image-text multimodal data as a starting point for LM model training. The purpose of this CfP is to provide rules for this year's challenge, explain these rule changes and their rationale in greater detail, give a timeline of this year's competition, and provide answers to frequently asked questions from last year's challenge.

babylm challenge, dataset, submission, (13 more...)

arXiv.org Artificial Intelligence

2404.06214

Country:

Europe > Switzerland > Zürich > Zürich (0.05)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > Slovenia (0.04)

Genre:

Research Report (0.50)
Frequently Asked Questions (FAQ) (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.95)

FAQ-Gen: An automated system to generate domain-specific FAQs to aid content comprehension

Kale, Sahil, Khaire, Gautam, Patankar, Jay

arXiv.org Artificial Intelligence, Feb-8-2024

Frequently Asked Questions (FAQs) refer to the most common inquiries about specific content. They serve as content comprehension aids by simplifying topics and enhancing understanding through succinct presentation of information. In this paper, we address FAQ generation as a well-defined Natural Language Processing (NLP) task through the development of an end-to-end system leveraging text-to-text transformation models. We present a literature review covering traditional question-answering systems, highlighting their limitations when applied directly to the FAQ generation task. We propose our system capable of building FAQs from textual content tailored to specific domains, enhancing their accuracy and relevance. We utilise self-curated algorithms for obtaining optimal representation of information to be provided as input and also for ranking the question-answer pairs to maximise human comprehension. Qualitative human evaluation showcases the generated FAQs to be well-constructed and readable, while also utilising domain-specific constructs to highlight domain-based nuances and jargon in the original content.

dataset, faq, generate domain-specific faq, (13 more...)

arXiv.org Artificial Intelligence

2402.05812

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Poland > Masovia Province > Warsaw (0.05)
Asia > India > Maharashtra > Pune (0.04)
(6 more...)

Genre: Frequently Asked Questions (FAQ) (1.00)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Reinforcement Learning for Optimizing RAG for Domain Chatbots

Kulkarni, Mandar, Tangarajan, Praveen, Kim, Kyung, Trivedi, Anusua

arXiv.org Artificial Intelligence, Jan-9-2024

With the advent of Large Language Models (LLMs), conversational assistants have become prevalent for domain use cases. LLMs acquire the ability to perform contextual question answering through training, and Retrieval Augmented Generation (RAG) further enables the bot to answer domain-specific questions. This paper describes a RAG-based approach for building a chatbot that answers users' queries using Frequently Asked Questions (FAQ) data. We train an in-house retrieval embedding model using infoNCE loss, and experimental results demonstrate that the in-house model works significantly better than the well-known general-purpose public embedding model, both in terms of retrieval accuracy and Out-of-Domain (OOD) query detection. As an LLM, we use an open API-based paid ChatGPT model. We noticed that a previously retrieved context could be used to generate an answer for specific patterns/sequences of queries (e.g., follow-up queries). Hence, there is scope to optimize the number of LLM tokens and cost. Assuming a fixed retrieval model and an LLM, we optimize the number of LLM tokens using Reinforcement Learning (RL). Specifically, we propose a policy-based model external to the RAG, which interacts with the RAG pipeline through policy actions and updates the policy to optimize the cost. The policy model can perform two actions: to fetch FAQ context or skip retrieval. We use the open API-based GPT-4 as the reward model. We then train a policy model using policy gradient on multiple training chat sessions. As a policy model, we experimented with a public gpt-2 model and an in-house BERT model. With the proposed RL-based optimization combined with a similarity threshold, we are able to achieve significant cost savings while getting slightly improved accuracy. Though we demonstrate results for the FAQ chatbot, the proposed RL approach is generic and can be experimented with on any existing RAG pipeline.
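The infoNCE loss mentioned in this abstract has a compact standard form: the negative log-probability of the positive pair under a softmax over similarity scores. The sketch below shows that form for a single query. The dot-product similarity, the temperature of 0.1, and the function names are assumptions; the abstract does not state the authors' exact training setup.

```python
import math


def dot(u, v):
    """Dot product of two equal-length embedding vectors."""
    return sum(a * b for a, b in zip(u, v))


def info_nce(query, positive, negatives, temperature=0.1):
    """InfoNCE loss for one query: -log softmax score of the positive.

    The positive is scored against the query alongside all negatives;
    the loss is small when the positive scores well above the negatives.
    """
    logits = [dot(query, positive) / temperature]
    logits += [dot(query, n) / temperature for n in negatives]
    # Log-sum-exp with the maximum subtracted for numerical stability.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_denominator - logits[0]
```

With a query embedding that matches its paired FAQ embedding and is dissimilar to the negatives, the loss approaches zero; when a negative scores as high as the positive, the loss rises to log 2 for a single negative.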

faq context, policy model, query, (13 more...)

arXiv.org Artificial Intelligence

2401.068

Country:

North America > United States > Washington > King County > Seattle (0.04)
Asia > India (0.04)

Genre:

Frequently Asked Questions (FAQ) (1.00)
Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

AdapterDistillation: Non-Destructive Task Composition with Knowledge Distillation

Wang, Junjie, Chen, Yicheng, Zhang, Wangshu, Hu, Sen, Xu, Teng, Zheng, Jing

arXiv.org Artificial Intelligence, Dec-26-2023

Leveraging knowledge from multiple tasks by introducing a small number of task-specific parameters into each transformer layer, also known as adapters, has received much attention recently. However, adding an extra fusion layer to implement knowledge composition not only increases the inference time but also is non-scalable for some applications. To avoid these issues, we propose a two-stage knowledge distillation algorithm called AdapterDistillation. In the first stage, we extract task-specific knowledge by using local data to train a student adapter. In the second stage, we distill the knowledge from the existing teacher adapters into the student adapter to help its inference. Extensive experiments on frequently asked question retrieval in task-oriented dialog systems validate the efficiency of AdapterDistillation. We show that AdapterDistillation outperforms existing algorithms in terms of accuracy, resource consumption and inference time.

adapter, adapterdistillation, tenant, (15 more...)

arXiv.org Artificial Intelligence

2312.16261

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States > New Jersey > Mercer County > Princeton (0.04)
(5 more...)

Genre:

Research Report (0.40)
Frequently Asked Questions (FAQ) (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Portuguese FAQ for Financial Services

Finardi, Paulo, Melo, Wanderley M., Neto, Edgard D. Medeiros, Mansano, Alex F., Costa, Pablo B., Caridá, Vinicius F.

arXiv.org Artificial Intelligence, Nov-19-2023

Scarcity of domain-specific data in the Portuguese financial domain has hindered the development of Natural Language Processing (NLP) applications. To address this limitation, the present study advocates for the utilization of synthetic data generated through data augmentation techniques. The investigation focuses on the augmentation of a dataset sourced from the Central Bank of Brazil FAQ, employing techniques that vary in semantic similarity. Supervised and unsupervised tasks are conducted to evaluate the impact of augmented data in both low and high semantic similarity scenarios. Additionally, the resultant dataset will be publicly disseminated on the Hugging Face Datasets platform, thereby enhancing accessibility and fostering broader engagement within the NLP research community.

dataset, portuguese faq, similarity, (14 more...)

arXiv.org Artificial Intelligence

2311.11331

Country:

South America > Brazil > São Paulo (0.04)
South America > Brazil > Ceará > Fortaleza (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.04)

Genre:

Research Report (1.00)
Frequently Asked Questions (FAQ) (0.77)

Industry: Banking & Finance > Financial Services (0.42)

Technology: Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.88)

What Are People Asking About COVID-19? A Question Classification Dataset

Wei, Jerry, Huang, Chengyu, Vosoughi, Soroush, Wei, Jason

arXiv.org Artificial Intelligence, Sep-8-2023

We present COVID-Q, a set of 1,690 questions about COVID-19 from 13 sources, which we annotate into 15 question categories and 207 question clusters. The most common questions in our dataset asked about transmission, prevention, and societal effects of COVID, and we found that many questions that appeared in multiple sources were not answered by any FAQ websites of reputable organizations such as the CDC and FDA. We post our dataset publicly at https://github.com/JerryWeiAI/COVID-Q. For classifying questions into 15 categories, a BERT baseline scored 58.1% accuracy when trained on 20 examples per category, and for a question clustering task, a BERT + triplet loss baseline achieved 49.5% accuracy. We hope COVID-Q can help either for direct use in developing applied systems or as a domain-specific resource for model evaluation.
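The triplet loss used in the clustering baseline above has a simple standard form, sketched here for a single triplet of embeddings. The Euclidean distance, the margin of 1.0, and the function names are assumptions; the abstract does not give the configuration used with BERT.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge on the gap between anchor-positive and anchor-negative distance.

    The loss is zero once the negative is at least `margin` farther from
    the anchor than the positive is.
    """
    return max(
        0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin
    )
```

In a question clustering setting, the anchor and positive would be embeddings of two questions from the same cluster and the negative a question from a different cluster, so that minimising the loss pulls same-cluster questions together.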

category, covid, question cluster, (16 more...)

arXiv.org Artificial Intelligence

2005.12522

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > New York (0.04)
North America > United States > Illinois (0.04)
(3 more...)

Genre:

Research Report (0.51)
Frequently Asked Questions (FAQ) (0.50)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Public Health (1.00)
(2 more...)

Technology:

Information Technology > Communications > Social Media (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.47)

ELQA: A Corpus of Metalinguistic Questions and Answers about English

Behzad, Shabnam, Sakaguchi, Keisuke, Schneider, Nathan, Zeldes, Amir

arXiv.org Artificial Intelligence, Jul-3-2023

We present ELQA, a corpus of questions and answers in and about the English language. Collected from two online forums, the >70k questions (from English learners and others) cover wide-ranging topics including grammar, meaning, fluency, and etymology. The answers include descriptions of general properties of English vocabulary and grammar as well as explanations about specific (correct and incorrect) usage examples. Unlike most NLP datasets, this corpus is metalinguistic -- it consists of language about language. As such, it can facilitate investigations of the metalinguistic capabilities of NLU models, as well as educational applications in the language learning domain. To study this, we define a free-form question answering task on our dataset and conduct evaluations on multiple LLMs (Large Language Models) to analyze their capacity to generate metalinguistic answers.

computational linguistic, large language model, natural language, (18 more...)

arXiv.org Artificial Intelligence

2205.00395

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Asia > Japan > Honshū > Tōhoku (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
(14 more...)

Genre:

Frequently Asked Questions (FAQ) (0.50)
Research Report (0.50)

Industry: Education (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.90)
