AITopics | Mishra, Lokesh

Collaborating Authors

Mishra, Lokesh

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Know Your RAG: Dataset Taxonomy and Generation Strategies for Evaluating RAG Systems

de Lima, Rafael Teixeira, Gupta, Shubham, Berrospi, Cesar, Mishra, Lokesh, Dolfi, Michele, Staar, Peter, Vagenas, Panagiotis

arXiv.org Artificial IntelligenceNov-29-2024

Retrieval Augmented Generation (RAG) systems are a widespread application of Large Language Models (LLMs) in the industry. While many tools exist empowering developers to build their own systems, measuring their performance locally, with datasets reflective of the system's use cases, is a technological challenge. Solutions to this problem range from non-specific and cheap (most public datasets) to specific and costly (generating data from local documents). In this paper, we show that using public question and answer (Q&A) datasets to assess retrieval performance can lead to non-optimal systems design, and that common tools for RAG dataset generation can lead to unbalanced data. We propose solutions to these issues based on the characterization of RAG datasets through labels and through label-targeted data generation. Finally, we show that fine-tuned small LLMs can efficiently generate Q&A datasets. We believe that these observations are invaluable to the know-your-data step of RAG systems development.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2411.1971

Country:

Europe > Switzerland (0.14)
Asia > Thailand (0.14)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Add feedback

Statements: Universal Information Extraction from Tables with Large Language Models for ESG KPIs

Mishra, Lokesh, Dhibi, Sohayl, Kim, Yusik, Ramis, Cesar Berrospi, Gupta, Shubham, Dolfi, Michele, Staar, Peter

arXiv.org Artificial IntelligenceJun-27-2024

Environment, Social, and Governance (ESG) KPIs assess an organization's performance on issues such as climate change, greenhouse gas emissions, water consumption, waste management, human rights, diversity, and policies. ESG reports convey this valuable quantitative information through tables. Unfortunately, extracting this information is difficult due to high variability in the table structure as well as content. We propose Statements, a novel domain agnostic data structure for extracting quantitative facts and related information. We propose translating tables to statements as a new supervised deep-learning universal information extraction task. We introduce SemTabNet - a dataset of over 100K annotated tables. Investigating a family of T5-based Statement Extraction Models, our best model generates statements which are 82% similar to the ground-truth (compared to baseline of 21%). We demonstrate the advantages of statements by applying our model to over 2700 tables from ESG reports. The homogeneous nature of statements permits exploratory data analysis on expansive information found in large collections of ESG reports.

large language model, machine learning, node, (18 more...)

arXiv.org Artificial Intelligence

2406.19102

Country:

Europe > Switzerland (0.14)
Europe > Ireland (0.14)
Asia > Middle East (0.14)
Africa > Ethiopia (0.14)

Genre: Research Report (0.81)

Industry:

Water & Waste Management > Solid Waste Management (0.67)
Energy > Oil & Gas (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

ESG Accountability Made Easy: DocQA at Your Service

Mishra, Lokesh, Berrospi, Cesar, Dinkla, Kasper, Antognini, Diego, Fusco, Francesco, Bothur, Benedikt, Lysak, Maksym, Livathinos, Nikolaos, Nassar, Ahmed, Vagenas, Panagiotis, Morin, Lucas, Auer, Christoph, Dolfi, Michele, Staar, Peter

arXiv.org Artificial IntelligenceNov-30-2023

We present Deep Search DocQA. This application enables information extraction from documents via a question-answering conversational assistant. The system integrates several technologies from different AI disciplines consisting of document conversion to machine-readable format (via computer vision), finding relevant data (via natural language processing), and formulating an eloquent response (via large language models). Users can explore over 10,000 Environmental, Social, and Governance (ESG) disclosure reports from over 2000 corporations. The Deep Search platform can be accessed at: https://ds4sd.github.io.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2311.18481

Country: Europe > Switzerland > Zürich > Zürich (0.16)

Genre: Research Report (0.40)

Industry: Banking & Finance (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.92)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.68)

Add feedback