AITopics | Lastras, Luis A.

Collaborating Authors

Lastras, Luis A.

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

API-BLEND: A Comprehensive Corpora for Training and Benchmarking API LLMs

Basu, Kinjal, Abdelaziz, Ibrahim, Chaudhury, Subhajit, Dan, Soham, Crouse, Maxwell, Munawar, Asim, Kumaravel, Sadhana, Muthusamy, Vinod, Kapanipathi, Pavan, Lastras, Luis A.

arXiv.org Artificial IntelligenceMay-20-2024

There is a growing need for Large Language Models (LLMs) to effectively use tools and external Application Programming Interfaces (APIs) to plan and complete tasks. As such, there is tremendous interest in methods that can acquire sufficient quantities of train and test data that involve calls to tools / APIs. Two lines of research have emerged as the predominant strategies for addressing this challenge. The first has focused on synthetic data generation techniques, while the second has involved curating task-adjacent datasets which can be transformed into API / Tool-based tasks. In this paper, we focus on the task of identifying, curating, and transforming existing datasets and, in turn, introduce API-BLEND, a large corpora for training and systematic testing of tool-augmented LLMs. The datasets mimic real-world scenarios involving API-tasks such as API / tool detection, slot filling, and sequencing of the detected APIs. We demonstrate the utility of the API-BLEND dataset for both training and benchmarking purposes.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2402.15491

Country: North America > United States (0.14)

Genre: Research Report (0.82)

Industry: Consumer Products & Services > Travel (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Add feedback

Pointwise Mutual Information Based Metric and Decoding Strategy for Faithful Generation in Document Grounded Dialogs

Nandwani, Yatin, Kumar, Vineet, Raghu, Dinesh, Joshi, Sachindra, Lastras, Luis A.

arXiv.org Artificial IntelligenceDec-1-2023

A major concern in using deep learning based generative models for document-grounded dialogs is the potential generation of responses that are not \textit{faithful} to the underlying document. Existing automated metrics used for evaluating the faithfulness of response with respect to the grounding document measure the degree of similarity between the generated response and the document's content. However, these automated metrics are far from being well aligned with human judgments. Therefore, to improve the measurement of faithfulness, we propose a new metric that utilizes (Conditional) Point-wise Mutual Information (PMI) between the generated response and the source document, conditioned on the dialogue. PMI quantifies the extent to which the document influences the generated response -- with a higher PMI indicating a more faithful response. We build upon this idea to create a new decoding technique that incorporates PMI into the response generation process to predict more faithful responses. Our experiments on the BEGIN benchmark demonstrate an improved correlation of our metric with human evaluation. We also show that our decoding technique is effective in generating more faithful responses when compared to standard decoding techniques on a set of publicly available document-grounded dialog datasets.

machine learning, natural language, question answering, (18 more...)

arXiv.org Artificial Intelligence

2305.12191

Country:

Europe (0.93)
North America > United States > Louisiana (0.14)
North America > United States > Pennsylvania (0.14)
North America > United States > California (0.14)

Genre: Research Report (1.00)

Industry:

Leisure & Entertainment (1.00)
Media > Film (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Natural Language > Generation (0.48)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.46)

Add feedback

Lattice Representation Learning

Lastras, Luis A.

arXiv.org Machine LearningJun-24-2020

In this article we introduce theory and algorithms for learning discrete representations that take on a lattice that is embedded in an Euclidean space. Lattice representations possess an interesting combination of properties: a) they can be computed explicitly using lattice quantization, yet they can be learned efficiently using the ideas we introduce in this paper, b) they are highly related to Gaussian Variational Autoencoders, allowing designers familiar with the latter to easily produce discrete representations from their models and c) since lattices satisfy the axioms of a group, their adoption can lead into a way of learning simple algebras for modeling binary operations between objects through symbolic formalisms, yet learn these structures also formally using differentiation techniques. This article will focus on laying the groundwork for exploring and exploiting the first two properties, including a new mathematical result linking expressions used during training and inference time and experimental validation on two popular datasets.

artificial intelligence, lattice, neural network, (19 more...)

arXiv.org Machine Learning

2006.13833

Country:

Europe (0.28)
North America > United States > New York (0.14)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

Information Theoretic Lower Bounds on Negative Log Likelihood

Lastras, Luis A.

arXiv.org Machine LearningApr-12-2019

In this article we use rate-distortion theory, a branch of information theory devoted to the problem of lossy compression, to shed light on an important problem in latent variable modeling of data: is there room to improve the model? One way to address this question is to find an upper bound on the probability (equivalently a lower bound on the negative log likelihood) that the model can assign to some data as one varies the prior and/or the likelihood function in a latent variable model. The core of our contribution is to formally show that the problem of optimizing priors in latent variable models is exactly an instance of the variational optimization problem that information theorists solve when computing rate-distortion functions, and then to use this to derive a lower bound on negative log likelihood. Moreover, we will show that if changing the prior can improve the log likelihood, then there is a way to change the likelihood function instead and attain the same log likelihood, and thus rate-distortion theory is of relevance to both optimizing priors as well as optimizing likelihood functions. We will experimentally argue for the usefulness of quantities derived from rate-distortion theory in latent variable modeling by applying them to a problem in image modeling.

artificial intelligence, deep learning, neural network, (16 more...)

arXiv.org Machine Learning

1904.06395

Country: North America > United States (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback