Improving Retrieval for RAG based Question Answering Models on Financial Documents
Setty, Spurthi, Jijo, Katherine, Chung, Eden, Vidra, Natan
In recent years, the emergence of Large Language Models (LLMs) has represented a critical turning point in Generative AI and its ability to expedite productivity across a variety of domains. However, the capabilities of these models, while impressive, are limited in ways that have kept certain industries from taking full advantage of this technology. Key disadvantages are the tendency of LLMs to hallucinate information and their lack of knowledge in domain-specific areas. The knowledge of LLMs is limited by their training data, and without additional techniques these models perform very poorly on highly domain-specific tasks. The first step in developing a large language model is pre-training, in which a transformer is trained on a very large corpus of text data. This data is general rather than specific to any particular domain or field, and it does not change over time. This is why LLMs like ChatGPT may perform well on general queries yet fail on questions about more specialized, higher-level topics. Additionally, a model's performance on a given topic depends heavily on how often that information appears in the training data, meaning that LLMs struggle with information that appears infrequently.
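The retrieval step that RAG adds on top of a pre-trained model can be sketched in a few lines. This is an illustrative stand-in only: a real pipeline would embed chunks with a dense embedding model and search a vector store, whereas here a simple bag-of-words cosine score substitutes for the similarity function, and the example documents are invented.

```python
# Minimal sketch of RAG-style retrieval: score document chunks against a
# query, keep the top-k, and prepend them to the prompt as grounding context.
from collections import Counter
import math

def similarity(a: str, b: str) -> float:
    """Cosine similarity over bag-of-words term counts (toy stand-in for
    a dense embedding model)."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[t] * cb[t] for t in set(ca) & set(cb))
    norm = (math.sqrt(sum(v * v for v in ca.values()))
            * math.sqrt(sum(v * v for v in cb.values())))
    return dot / norm if norm else 0.0

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query."""
    return sorted(chunks, key=lambda c: similarity(query, c), reverse=True)[:k]

def build_prompt(query: str, chunks: list[str]) -> str:
    """Prepend retrieved context so the model answers from the documents
    rather than from its (possibly stale) training data."""
    context = "\n".join(retrieve(query, chunks))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

# Hypothetical financial-document chunks for illustration.
docs = [
    "Net revenue for fiscal 2022 was $4.2 billion.",
    "The board approved a dividend of $0.25 per share.",
    "Operating expenses rose due to higher marketing spend.",
]
print(build_prompt("What was net revenue in 2022?", docs))
```

Because the retrieved context is fetched at query time, the model's answers can reflect documents it never saw during pre-training, which is the gap the abstract describes.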
Enhancing Large Language Model Performance To Answer Questions and Extract Information More Accurately
Zhang, Liang, Jijo, Katherine, Setty, Spurthi, Chung, Eden, Javid, Fatima, Vidra, Natan, Clifford, Tommy
Large Language Models (LLMs) generate responses to questions; however, their effectiveness is often hindered by sub-optimal answer quality and occasional failures to respond accurately. To address these challenges, a fine-tuning process is employed, using feedback and examples to refine the models. The objective is to enhance AI models through continuous feedback loops, utilizing metrics such as cosine similarity, LLM-based evaluation, and ROUGE-L scores to evaluate the models. Leveraging LLMs such as GPT-3.5, GPT4ALL, LLaMA2, and Claude, this approach is benchmarked on financial datasets, including FinanceBench and the RAG Instruct Benchmark Tester Dataset, illustrating the necessity of fine-tuning. The results showcase the capability of fine-tuned models to surpass the accuracy of zero-shot LLMs, providing superior question-answering capabilities. Notably, combining fine-tuning with a process known as Retrieval Augmented Generation (RAG) proves to generate responses with further improved accuracy.
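Two of the automatic metrics named in the abstract can be made concrete with a short sketch. The ROUGE-L implementation below follows the standard definition (F1 over the longest common subsequence of tokens); the cosine function operates on embedding vectors, whose production by an actual embedding model is assumed and not shown here.

```python
# Sketch of two answer-evaluation metrics: ROUGE-L (token-level LCS F1
# against a reference answer) and cosine similarity between embeddings.
import math

def lcs_len(a: list[str], b: list[str]) -> int:
    """Length of the longest common subsequence of two token lists."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if x == y else max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]

def rouge_l(candidate: str, reference: str) -> float:
    """ROUGE-L F1: harmonic mean of LCS-based precision and recall."""
    c, r = candidate.split(), reference.split()
    lcs = lcs_len(c, r)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(c), lcs / len(r)
    return 2 * precision * recall / (precision + recall)

def cosine(u: list[float], v: list[float]) -> float:
    """Cosine similarity between two embedding vectors; the vectors are
    assumed to come from an external embedding model."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0
```

In a feedback loop like the one described, scores such as these would be computed for each model answer against a gold reference, and low-scoring examples would feed the next round of fine-tuning.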