Villegas, Marta
Breaking Language Barriers in Visual Language Models via Multilingual Textual Regularization
Pikabea, Iñigo, Lacunza, Iñaki, Pareras, Oriol, Escolano, Carlos, Gonzalez-Agirre, Aitor, Hernando, Javier, Villegas, Marta
Rapid advancements in Visual Language Models (VLMs) have transformed multimodal understanding, but these models often default to generating English responses regardless of the input language. This phenomenon, termed Image-induced Fidelity Loss (IFL), stems from limited multimodal multilingual training data. To address this, we propose a continuous multilingual integration strategy that injects text-only multilingual data during visual instruction tuning, preserving the language model's original multilingual capabilities. Extensive evaluations demonstrate that our approach significantly improves linguistic fidelity across languages without degradation in visual performance. We also explore model merging, which improves language fidelity but comes at the cost of visual performance. In contrast, our core method achieves robust multilingual alignment without trade-offs, offering a scalable and effective path to mitigating IFL for global VLM adoption.
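As a rough illustration of the data-mixing idea (not the paper's exact training pipeline), the sketch below interleaves text-only multilingual samples into a visual instruction-tuning stream at a fixed ratio; the mixing ratio, field names, and batch layout are illustrative assumptions.

```python
import random

# Illustrative mixing ratio; the value used in the paper may differ.
TEXT_ONLY_RATIO = 0.2

def mixed_batches(visual_samples, multilingual_text_samples, batch_size=8):
    """Yield instruction-tuning batches in which a fraction of examples are
    text-only multilingual data, regularizing the language backbone."""
    while True:
        batch = []
        for _ in range(batch_size):
            if random.random() < TEXT_ONLY_RATIO:
                sample = random.choice(multilingual_text_samples)  # {"prompt", "response", "lang"}
                sample = {**sample, "image": None}                 # no visual input
            else:
                sample = random.choice(visual_samples)             # {"image", "prompt", "response"}
            batch.append(sample)
        yield batch

# Usage: feed these batches to the usual visual instruction-tuning loss;
# text-only examples simply bypass the vision encoder.
```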
Salamandra Technical Report
Gonzalez-Agirre, Aitor, Pàmies, Marc, Llop, Joan, Baucells, Irene, Da Dalt, Severino, Tamayo, Daniel, Saiz, José Javier, Espuña, Ferran, Prats, Jaume, Aula-Blasco, Javier, Mina, Mario, Pikabea, Iñigo, Rubio, Adrián, Shvets, Alexander, Sallés, Anna, Lacunza, Iñaki, Palomar, Jorge, Falcão, Júlia, Tormo, Lucía, Vasquez-Reina, Luis, Marimon, Montserrat, Pareras, Oriol, Ruiz-Fernández, Valle, Villegas, Marta
This work introduces Salamandra, a suite of open-source decoder-only large language models available in three different sizes: 2, 7, and 40 billion parameters. The models were trained from scratch on highly multilingual data that comprises text in 35 European languages and code. Our carefully curated corpus is made exclusively from open-access data compiled from a wide variety of sources. Along with the base models, supplementary checkpoints that were fine-tuned on public-domain instruction data are also released for chat applications. Additionally, we share our preliminary experiments on multimodality, which serve as proof-of-concept to showcase potential applications for the Salamandra family. Our extensive evaluations on multilingual benchmarks reveal that Salamandra has strong capabilities, achieving competitive performance when compared to similarly sized open-source models. We provide comprehensive evaluation results both on standard downstream tasks as well as key aspects related to bias and safety. With this technical report, we intend to promote open science by sharing all the details behind our design choices, data curation strategy and evaluation methodology. In addition, we deviate from the usual practice by making our training and evaluation scripts publicly accessible. We release all models under a permissive Apache 2.0 license in order to foster future research and facilitate commercial use, thereby contributing to the open-source ecosystem of large language models.
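A minimal sketch for trying one of the instructed checkpoints with Hugging Face Transformers; the repository identifier used here is an assumption, so substitute the one given in the official release.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "BSC-LT/salamandra-2b-instruct"  # assumed Hub identifier; check the release
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# A Spanish prompt ("Summarise in one sentence what Salamandra is") to exercise
# the model's multilingual training.
messages = [{"role": "user", "content": "Resume en una frase qué es Salamandra."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```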
Mass-Editing Memory with Attention in Transformers: A cross-lingual exploration of knowledge
Tamayo, Daniel, Gonzalez-Agirre, Aitor, Hernando, Javier, Villegas, Marta
Recent research has explored methods for updating and modifying factual knowledge in large language models, often focusing on specific multi-layer perceptron blocks. This study expands on this work by examining the effectiveness of existing knowledge editing methods across languages and delving into the role of attention mechanisms in this process. Drawing from the insights gained, we propose Mass-Editing Memory with Attention in Transformers (MEMAT), a method that achieves significant improvements in all metrics while requiring minimal parameter modifications. MEMAT delivers a remarkable 10% increase in magnitude metrics, benefits languages not included in the training data and also demonstrates a high degree of portability. Our code and data are at https://github.com/dtamayo-nlp/MEMAT.
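For orientation only: the sketch below shows the kind of closed-form MLP weight update used by the mass-editing methods this work builds on, not the MEMAT algorithm itself (which additionally exploits attention heads; see the linked repository). All shapes and the regularization constant are illustrative assumptions.

```python
import torch

def mass_edit_mlp(W, keys, values, cov, lam=1.0):
    """Generic locate-and-edit update in the ROME/MEMIT family:
    choose dW so that (W + dW) @ keys ~= values while disturbing
    previously stored associations (summarized by `cov`) as little as possible.

    W:      (d_out, d_in) MLP projection weight being edited
    keys:   (d_in, n)     input representations of the facts to insert
    values: (d_out, n)    desired output representations
    cov:    (d_in, d_in)  estimate of E[k k^T] over generic text
    """
    residual = values - W @ keys                 # what the layer still gets wrong
    denom = lam * cov + keys @ keys.T            # regularized normal equations
    dW = residual @ torch.linalg.solve(denom, keys).T
    return W + dW
```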
The Catalan Language CLUB
Rodriguez-Penagos, Carlos, Armentano-Oller, Carme, Villegas, Marta, Melero, Maite, Gonzalez, Aitor, Bonet, Ona de Gibert, Pio, Casimiro Carrino
The Catalan Language Understanding Benchmark (CLUB) encompasses various datasets representative of different NLU tasks that enable accurate evaluations of language models, following the General Language Understanding Evaluation (GLUE) example. It is part of AINA and PlanTL, two public funding initiatives to empower the Catalan language in the Artificial Intelligence era.
Spanish Legalese Language Model and Corpora
Gutiérrez-Fandiño, Asier, Armengol-Estapé, Jordi, Gonzalez-Agirre, Aitor, Villegas, Marta
There are many language models for English, in line with its worldwide relevance. For Spanish, however, despite it being a widely spoken language, there are very few language models, and they tend to be small and too general. Legal language can be regarded as a Spanish variant in its own right, as its vocabulary, semantics and phrasing are particularly complex. For this work we gathered legal-domain corpora from different sources, trained a model and evaluated it against Spanish general-domain tasks. The model provides reasonable results in those tasks.
Spanish Language Models
Gutiérrez-Fandiño, Asier, Armengol-Estapé, Jordi, Pàmies, Marc, Llop-Palao, Joan, Silveira-Ocampo, Joaquín, Carrino, Casimiro Pio, Gonzalez-Agirre, Aitor, Armentano-Oller, Carme, Rodriguez-Penagos, Carlos, Villegas, Marta
This paper presents the Spanish RoBERTa-base and RoBERTa-large models, as well as the corresponding performance evaluations. Both models were pre-trained using the largest Spanish corpus known to date, with a total of 570GB of clean and deduplicated text processed for this work, compiled from the web crawlings performed by the National Library of Spain from 2009 to 2019. We extended the current evaluation datasets with an extractive Question Answering dataset and our models outperform the existing Spanish models across tasks and settings.
Overview of BioASQ 2020: The eighth BioASQ challenge on Large-Scale Biomedical Semantic Indexing and Question Answering
Nentidis, Anastasios, Krithara, Anastasia, Bougiatiotis, Konstantinos, Krallinger, Martin, Rodriguez-Penagos, Carlos, Villegas, Marta, Paliouras, Georgios
In this paper, we present an overview of the eighth edition of the BioASQ challenge, which ran as a lab in the Conference and Labs of the Evaluation Forum (CLEF) 2020. BioASQ is a series of challenges aiming at the promotion of systems and methodologies for large-scale biomedical semantic indexing and question answering. To this end, shared tasks have been organized yearly since 2012, where different teams develop systems that compete on the same demanding benchmark datasets, which represent the real information needs of experts in the biomedical domain. This year, the challenge was extended with the introduction of a new task on medical semantic indexing in Spanish. In total, 34 teams with more than 100 systems participated in the three tasks of the challenge. As in previous years, the results of the evaluation reveal that the top-performing systems managed to outperform the strong baselines, which suggests that state-of-the-art systems keep pushing the frontier of research through continuous improvements.
Persistent Homology Captures the Generalization of Neural Networks Without A Validation Set
Gutiérrez-Fandiño, Asier, Pérez-Fernández, David, Armengol-Estapé, Jordi, Villegas, Marta
The training of neural networks is usually monitored with a validation (holdout) set to estimate the generalization of the model. This is done instead of measuring intrinsic properties of the model to determine whether it is learning appropriately. In this work, we suggest studying the training of neural networks with Algebraic Topology, specifically Persistent Homology (PH). Using simplicial complex representations of neural networks, we study the PH diagram distance evolution on the neural network learning process with different architectures and several datasets. Results show that the PH diagram distance between consecutive neural network states correlates with the validation accuracy, implying that the generalization error of a neural network could be intrinsically estimated without any holdout set.
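A hedged sketch of this kind of measurement, assuming the `ripser` and `persim` packages and a simple recipe for turning a weight matrix into a neuron-to-neuron distance matrix; the paper's actual simplicial-complex construction is more elaborate.

```python
import numpy as np
from ripser import ripser           # persistent homology of a distance matrix
from persim import wasserstein      # distance between persistence diagrams

def neuron_distance_matrix(weight):
    """Illustrative construction: a strong |weight| between two units makes them
    'close'. weight has shape (n_out, n_in)."""
    w = np.abs(weight)
    d = 1.0 - w / (w.max() + 1e-12)          # strong connections -> small distances
    n_out, n_in = d.shape
    full = np.ones((n_out + n_in, n_out + n_in))
    full[:n_out, n_out:] = d
    full[n_out:, :n_out] = d.T
    np.fill_diagonal(full, 0.0)
    return full

def ph_distance(weight_a, weight_b):
    """Wasserstein distance between the H0 persistence diagrams of two
    consecutive training snapshots of the same layer."""
    dgm_a = ripser(neuron_distance_matrix(weight_a), distance_matrix=True)["dgms"][0]
    dgm_b = ripser(neuron_distance_matrix(weight_b), distance_matrix=True)["dgms"][0]
    # Drop the infinite-death point of H0 before comparing diagrams.
    dgm_a = dgm_a[np.isfinite(dgm_a[:, 1])]
    dgm_b = dgm_b[np.isfinite(dgm_b[:, 1])]
    return wasserstein(dgm_a, dgm_b)
```

Tracking this distance across consecutive checkpoints is the quantity the abstract reports as correlating with validation accuracy.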
Spanish Biomedical and Clinical Language Embeddings
Gutiérrez-Fandiño, Asier, Armengol-Estapé, Jordi, Carrino, Casimiro Pio, De Gibert, Ona, Gonzalez-Agirre, Aitor, Villegas, Marta
We have developed two types of embeddings using two different corpora: the Spanish Biomedical Corpora and the Spanish Clinical Corpora, the former being of a much larger magnitude. We evaluated the Biomedical Word Embeddings, obtaining better results than previous versions, showing that with more data we obtain better representations.