AITopics | Faria, Gonçalo

Collaborating Authors

Faria, Gonçalo

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Sample, Don't Search: Rethinking Test-Time Alignment for Language Models

Faria, Gonçalo, Smith, Noah A.

arXiv.org Machine LearningApr-3-2025

Increasing test-time computation has emerged as a promising direction for improving language model performance, particularly in scenarios where model finetuning is impractical or impossible due to computational constraints or private model weights. However, existing test-time search methods using a reward model (RM) often degrade in quality as compute scales, due to the over-optimization of what are inherently imperfect reward proxies. We introduce QAlign, a new test-time alignment approach. As we scale test-time compute, QAlign converges to sampling from the optimal aligned distribution for each individual prompt. By adopting recent advances in Markov chain Monte Carlo for text generation, our method enables better-aligned outputs without modifying the underlying model or even requiring logit access. We demonstrate the effectiveness of QAlign on mathematical reasoning benchmarks (GSM8K and GSM-Symbolic) using a task-specific RM, showing consistent improvements over existing test-time compute methods like best-of-n and majority voting. Furthermore, when applied with more realistic RMs trained on the Tulu 3 preference dataset, QAlign outperforms direct preference optimization (DPO), best-of-n, majority voting, and weighted majority voting on a diverse range of datasets (GSM8K, MATH500, IFEval, MMLU-Redux, and TruthfulQA). A practical solution to aligning language models at test time using additional computation without degradation, our approach expands the limits of the capability that can be obtained from off-the-shelf language models without further training.

large language model, machine learning, natural language, (20 more...)

arXiv.org Machine Learning

2504.0379

Country:

Europe (0.28)
North America (0.28)

Genre: Research Report (0.86)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.88)
(3 more...)

Add feedback

Modeling User Preferences with Automatic Metrics: Creating a High-Quality Preference Dataset for Machine Translation

Agrawal, Sweta, de Souza, José G. C., Rei, Ricardo, Farinhas, António, Faria, Gonçalo, Fernandes, Patrick, Guerreiro, Nuno M, Martins, Andre

arXiv.org Artificial IntelligenceOct-10-2024

Alignment with human preferences is an important step in developing accurate and safe large language models. This is no exception in machine translation (MT), where better handling of language nuances and context-specific variations leads to improved quality. However, preference data based on human feedback can be very expensive to obtain and curate at a large scale. Automatic metrics, on the other hand, can induce preferences, but they might not match human expectations perfectly. In this paper, we propose an approach that leverages the best of both worlds. We first collect sentence-level quality assessments from professional linguists on translations generated by multiple high-quality MT systems and evaluate the ability of current automatic metrics to recover these preferences. We then use this analysis to curate a new dataset, MT-Pref (metric induced translation preference) dataset, which comprises 18k instances covering 18 language directions, using texts sourced from multiple domains post-2022. We show that aligning TOWER models on MT-Pref significantly improves translation quality on WMT23 and FLORES benchmarks.

artificial intelligence, natural language, translation, (16 more...)

arXiv.org Artificial Intelligence

2410.07779

Country:

Europe > Portugal > Lisbon > Lisbon (0.14)
North America > United States > New Mexico (0.14)
North America > United States > Louisiana (0.14)

Genre: Research Report > New Finding (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Add feedback

Learning to compute inner consensus -- A noble approach to modeling agreement between Capsules

Faria, Gonçalo

arXiv.org Machine LearningSep-27-2019

The now called field of Deep Learning has expanded these ideas by creating models that stack multiple layers of Perceptrons. These Multilayer Perceptrons, commonly known as Neural Networks [7], achieve greater representation capacity, due to the layered manner the computational complexity is added, especially when compared with its precursor. Attributable to this compositional approach they are especially hard-wired to learn a nested hierarchy of concepts [27]. As an approach to soft-computing, Neural Networks stand in opposition to the precisely stated view of analytical algorithms that, unlike the human mind, are not tolerant of imprecision, uncertainty, partial truth and approximation [5]. In conjunction with other Deep Learning models, they stand at the vanguard of Artificial Intelligence Research, employed in tasks that previously have been found computationally intractable.

capsule, deep learning, neural network, (18 more...)

arXiv.org Machine Learning

1909.12737

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback