AITopics | Calixto, Iacer

Evaluating Linguistic Capabilities of Multimodal LLMs in the Lens of Few-Shot Learning

Dogan, Mustafa, Kesen, Ilker, Calixto, Iacer, Erdem, Aykut, Erdem, Erkut

arXiv.org Artificial Intelligence, Jul-17-2024

The linguistic capabilities of Multimodal Large Language Models (MLLMs) are critical for their effective application across diverse tasks. This study aims to evaluate the performance of MLLMs on the VALSE benchmark, focusing on the efficacy of few-shot In-Context Learning (ICL) and Chain-of-Thought (CoT) prompting. We conducted a comprehensive assessment of state-of-the-art MLLMs, varying in model size and pretraining datasets. The experimental results reveal that ICL and CoT prompting significantly boost model performance, particularly in tasks requiring complex reasoning and contextual understanding. Models pretrained on captioning datasets show superior zero-shot performance, while those trained on interleaved image-text data benefit from few-shot learning. Our findings provide valuable insights into optimizing MLLMs for better grounding of language in visual contexts, highlighting the importance of the composition of pretraining data and the potential of few-shot learning strategies to improve the reasoning abilities of MLLMs.

large language model, machine learning, natural language, (18 more...)

2407.12498

Country:

Europe (1.00)
North America > United States (0.27)
Asia > Middle East > UAE (0.14)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine (0.67)
Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

LLM aided semi-supervision for Extractive Dialog Summarization

Mishra, Nishant, Sahu, Gaurav, Calixto, Iacer, Abu-Hanna, Ameen, Laradji, Issam H.

arXiv.org Artificial Intelligence, Nov-23-2023

Generating high-quality summaries for chat dialogs often requires large labeled datasets. We propose a method to efficiently use unlabeled data for extractive summarization of customer-agent dialogs. In our method, we frame summarization as a question-answering problem and use state-of-the-art large language models (LLMs) to generate pseudo-labels for a dialog. We then use these pseudo-labels to fine-tune a chat summarization model, effectively transferring knowledge from the LLM into a smaller specialized model. We demonstrate our method on the TweetSumm dataset, and show that using 10% of the original labeled dataset we can achieve 65.9/57.0/61.0 ROUGE-1/-2/-L, whereas the current state of the art trained on the entire training dataset obtains 65.16/55.81/64.37 ROUGE-1/-2/-L. In other words, in the worst case (i.e., ROUGE-L) we still effectively retain 94.7% of the performance while using only 10% of the data.
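
The retention figure quoted in this abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only: the scores are copied from the abstract, while the variable and function names (`semi_supervised`, `fully_supervised`, `retention`) are assumptions introduced here and do not come from the paper.

```python
# ROUGE scores as quoted in the abstract (ROUGE-1, ROUGE-2, ROUGE-L).
# The dictionary and function names are hypothetical, chosen for illustration.
semi_supervised = {"ROUGE-1": 65.9, "ROUGE-2": 57.0, "ROUGE-L": 61.0}      # 10% of labels
fully_supervised = {"ROUGE-1": 65.16, "ROUGE-2": 55.81, "ROUGE-L": 64.37}  # full training set


def retention(ours, baseline):
    """Return each metric's score as a fraction of the baseline score."""
    return {name: ours[name] / baseline[name] for name in ours}


ratios = retention(semi_supervised, fully_supervised)
# The "worst case" is the metric with the smallest retained fraction.
worst_metric = min(ratios, key=ratios.get)

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.4f} of the fully supervised score")
print(f"worst case: {worst_metric}")
```

On these numbers, ROUGE-1 and ROUGE-2 come out above 1.0 (the semi-supervised model scores higher), and ROUGE-L is the only metric below it, at roughly 0.9476 of the fully supervised score. That is consistent with the abstract's 94.7% if the value is truncated rather than rounded.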

large language model, machine learning, natural language, (19 more...)

2311.11462

Country:

Europe (1.00)
North America > United States (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.35)

ViLMA: A Zero-Shot Benchmark for Linguistic and Temporal Grounding in Video-Language Models

Kesen, Ilker, Pedrotti, Andrea, Dogan, Mustafa, Cafagna, Michele, Acikgoz, Emre Can, Parcalabescu, Letitia, Calixto, Iacer, Frank, Anette, Gatt, Albert, Erdem, Aykut, Erdem, Erkut

arXiv.org Artificial Intelligence, Nov-12-2023

With the ever-increasing popularity of pretrained Video-Language Models (VidLMs), there is a pressing need to develop robust evaluation methodologies that delve deeper into their visio-linguistic capabilities. To address this challenge, we present ViLMA (Video Language Model Assessment), a task-agnostic benchmark that places the assessment of fine-grained capabilities of these models on a firm footing. Task-based evaluations, while valuable, fail to capture the complexities and specific temporal aspects of moving images that VidLMs need to process. Through carefully curated counterfactuals, ViLMA offers a controlled evaluation suite that sheds light on the true potential of these models, as well as their performance gaps compared to human-level understanding. ViLMA also includes proficiency tests, which assess basic capabilities deemed essential to solving the main counterfactual tests. We show that current VidLMs' grounding abilities are no better than those of vision-language models which use static images. This is especially striking once the performance on proficiency tests is factored in. Our benchmark serves as a catalyst for future research on VidLMs, helping to highlight areas that still need to be explored.

artificial intelligence, linguistic and temporal grounding, video-language model, (2 more...)

2311.07022

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.40)

Fixing confirmation bias in feature attribution methods via semantic match

Cinà, Giovanni, Fernandez-Llaneza, Daniel, Mishra, Nishant, Röber, Tabea E., Pezzelle, Sandro, Calixto, Iacer, Goedhart, Rob, Birbil, Ş. İlker

arXiv.org Artificial Intelligence, Jul-3-2023

Feature attribution methods have become a staple method to disentangle the complex behavior of black box models. Despite their success, some scholars have argued that such methods suffer from a serious flaw: they do not allow a reliable interpretation in terms of human concepts. Simply put, visualizing an array of feature contributions is not enough for humans to conclude something about a model's internal representations, and confirmation bias can trick users into false beliefs about model behavior. We argue that a structured approach is required to test whether our hypotheses on the model are confirmed by the feature attributions. This is what we call the "semantic match" between human concepts and (sub-symbolic) explanations. Building on the conceptual framework put forward in Cinà et al. [2023], we propose a structured approach to evaluate semantic match in practice. We showcase the procedure in a suite of experiments spanning tabular and image data, and show how the assessment of semantic match can give insight into both desirable (e.g., focusing on an object relevant for prediction) and undesirable model behaviors (e.g., focusing on a spurious correlation). We couple our experimental results with an analysis on the metrics to measure semantic match, and argue that this approach constitutes the first step towards resolving the issue of confirmation bias in XAI.

data mining, explanation, machine learning, (17 more...)

2307.00897

Country: Europe > Switzerland > Zürich > Zürich (0.14)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine (1.00)
Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Soft-prompt tuning to predict lung cancer using primary care free-text Dutch medical notes

Elfrink, Auke, Vagliano, Iacopo, Abu-Hanna, Ameen, Calixto, Iacer

arXiv.org Artificial Intelligence, Mar-28-2023

We investigate different natural language processing (NLP) approaches based on contextualised word representations for the problem of early prediction of lung cancer using free-text patient medical notes of Dutch primary care physicians. Because lung cancer has a low prevalence in primary care, we also address the problem of classification under highly imbalanced classes. Specifically, we use large Transformer-based pretrained language models (PLMs) and investigate: 1) how soft-prompt tuning, an NLP technique used to adapt PLMs using small amounts of training data, compares to standard model fine-tuning; 2) whether simpler static word embedding models (WEMs) can be more robust compared to PLMs in highly imbalanced settings; and 3) how models fare when trained on notes from a small number of patients. We find that 1) soft-prompt tuning is an efficient alternative to standard model fine-tuning; 2) PLMs show better discrimination but worse calibration compared to simpler static word embedding models as the classification problem becomes more imbalanced; and 3) results when training models on a small number of patients are mixed and show no clear differences between PLMs and WEMs. All our code is available open source at https://bitbucket.org/aumc-kik/prompt

artificial intelligence, machine learning, natural language, (16 more...)

2303.15846

Country: Europe > Netherlands (0.16)

Genre:

Research Report > New Finding (0.46)
Research Report > Experimental Study (0.46)

Industry: Health & Medicine > Therapeutic Area > Oncology > Lung Cancer (0.84)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Neural Natural Language Generation: A Survey on Multilinguality, Multimodality, Controllability and Learning

Erdem, Erkut (Hacettepe University, Ankara, Turkey) | Kuyu, Menekse (Hacettepe University, Ankara, Turkey) | Yagcioglu, Semih (Hacettepe University, Ankara, Turkey) | Frank, Anette (Heidelberg University, Heidelberg, Germany) | Parcalabescu, Letitia (Heidelberg University, Heidelberg, Germany) | Plank, Barbara (IT University of Copenhagen, Copenhagen, Denmark) | Babii, Andrii (Kharkiv National University of Radio Electronics, Ukraine) | Turuta, Oleksii (Kharkiv National University of Radio Electronics, Ukraine) | Erdem, Aykut | Calixto, Iacer (New York University, U.S.A. / University of Amsterdam, Netherlands) | Lloret, Elena (University of Alicante, Alicante, Spain) | Apostol, Elena-Simona (University Politehnica of Bucharest, Bucharest, Romania) | Truică, Ciprian-Octavian (University Politehnica of Bucharest, Bucharest, Romania) | Šandrih, Branislava (University of Belgrade, Belgrade, Serbia) | Martinčić-Ipšić, Sanda (University of Rijeka, Rijeka, Croatia) | Berend, Gábor (University of Szeged, Szeged, Hungary) | Gatt, Albert (University of Malta, Malta) | Korvel, Gražina (Vilnius University, Vilnius, Lithuania)

Journal of Artificial Intelligence Research, Apr-6-2022

Developing artificial learning systems that can understand and generate natural language has been one of the long-standing goals of artificial intelligence. Recent decades have witnessed impressive progress on both of these problems, giving rise to a new family of approaches. In particular, advances in deep learning over the past couple of years have led to neural approaches to natural language generation (NLG). These methods combine generative language learning techniques with neural-network-based frameworks. With a wide range of applications in natural language processing, neural NLG (NNLG) is a new and fast-growing field of research. In this state-of-the-art report, we investigate the recent developments and applications of NNLG to their full extent from a multidimensional view, covering critical perspectives such as multimodality, multilinguality, controllability and learning strategies. We summarize the fundamental building blocks of NNLG approaches from these aspects and provide detailed reviews of commonly used preprocessing steps and basic neural architectures. This report also focuses on the seminal applications of these NNLG models such as machine translation, description generation, automatic speech recognition, abstractive summarization, text simplification, question answering and generation, and dialogue generation. Finally, we conclude with a thorough discussion of the described frameworks by pointing out some open research directions.

machine learning, natural language, text simplification and paraphrasing, (25 more...)

doi: 10.1613/jair.1.12918

AI Access Foundation

Country:

Europe > Spain (0.67)
North America > United States > California (0.46)
North America > United States > Minnesota (0.28)
(2 more...)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)

Industry:

Health & Medicine (1.00)
Education > Curriculum > Subject-Specific Education (0.48)
Government > Regional Government > Europe Government (0.45)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Generation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

VALSE: A Task-Independent Benchmark for Vision and Language Models Centered on Linguistic Phenomena

Parcalabescu, Letitia, Cafagna, Michele, Muradjan, Lilitta, Frank, Anette, Calixto, Iacer, Gatt, Albert

arXiv.org Artificial Intelligence, Mar-14-2022

We propose VALSE (Vision And Language Structured Evaluation), a novel benchmark designed for testing general-purpose pretrained vision and language (V&L) models for their visio-linguistic grounding capabilities on specific linguistic phenomena. VALSE offers a suite of six tests covering various linguistic constructs. Solving these requires models to ground linguistic phenomena in the visual modality, allowing more fine-grained evaluations than hitherto possible. We build VALSE using methods that support the construction of valid foils, and report results from evaluating five widely-used V&L models. Our experiments suggest that current models have considerable difficulty addressing most phenomena. Hence, we expect VALSE to serve as an important benchmark to measure future progress of pretrained V&L models from a linguistic perspective, complementing the canonical task-centred V&L evaluations.

large language model, machine learning, natural language, (24 more...)

doi: 10.18653/v1/2022.acl-long.567

2112.07566

Country:

Europe (1.00)
North America > United States > Minnesota (0.28)

Genre: Research Report > Experimental Study (0.34)

Industry: Leisure & Entertainment > Sports (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)
(2 more...)

VisualSem: a high-quality knowledge graph for vision and language

Alberts, Houda, Huang, Teresa, Deshpande, Yash, Liu, Yibo, Cho, Kyunghyun, Vania, Clara, Calixto, Iacer

arXiv.org Artificial Intelligence, Aug-20-2020

We argue that the next frontier in natural language understanding (NLU) and generation (NLG) will include models that can efficiently access external structured knowledge repositories. In order to support the development of such models, we release the VisualSem knowledge graph (KG), which includes nodes with multilingual glosses, multiple illustrative images, and visually relevant relations. We also release a neural multi-modal retrieval model that can use images or sentences as inputs and retrieves entities in the KG. This multi-modal retrieval model can be integrated into any (neural network) model pipeline and we encourage the research community to use VisualSem for data augmentation and/or as a source of grounding, among other possible uses. VisualSem as well as the multi-modal retrieval model are publicly available and can be downloaded from: https://github.com/iacercalixto/visualsem.

neural network, node, soccer, (21 more...)

2008.0915

Country:

Asia (1.00)
North America > United States > Minnesota (0.28)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment > Sports > Soccer (0.47)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.86)
Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (0.71)
