Collaborating Authors

 Ferianc, Martin


Cultural Alignment in Large Language Models Using Soft Prompt Tuning

arXiv.org Artificial Intelligence

Large Language Model (LLM) alignment conventionally relies on supervised fine-tuning or reinforcement-learning-based alignment frameworks. These methods typically require labeled or preference datasets and involve updating model weights to align the LLM with the training objective or reward model. Meanwhile, in social sciences such as cross-cultural studies, factor analysis is widely used to uncover underlying dimensions or latent variables that explain observed patterns in survey data. The non-differentiable nature of these measurements, derived from survey data, renders the former alignment methods infeasible for alignment with cultural dimensions. To overcome this, we propose a parameter-efficient strategy that combines soft prompt tuning, which freezes the model parameters while modifying the input prompt embeddings, with Differential Evolution (DE), a black-box optimization method for cases where a differentiable objective is unattainable. This strategy ensures alignment consistency without the need for preference data or model parameter updates, significantly enhancing efficiency and mitigating overfitting. Our method demonstrates significant improvements in Llama-3-8B-Instruct's cultural dimensions across multiple regions, outperforming both the Naive LLM and the In-context Learning (ICL) baseline, and effectively bridges computational models with human cultural nuances.
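The combination described above is straightforward to sketch: the soft prompt is a small matrix of embeddings that is never backpropagated through, and DE mutates and recombines candidate prompts, keeping whichever scores higher on the black-box alignment objective. The prompt size, DE hyperparameters, and the dummy objective below are illustrative assumptions, not the paper's settings.

```python
# Minimal sketch: black-box soft prompt tuning with Differential Evolution.
# Sizes, DE hyperparameters, and the toy objective are assumptions.
import numpy as np

PROMPT_TOKENS, EMBED_DIM = 4, 16          # tiny sizes for illustration
DIM = PROMPT_TOKENS * EMBED_DIM

def alignment_score(soft_prompt_flat):
    """Black-box, non-differentiable objective. In practice this would prepend
    the soft prompt to the frozen LLM's input embeddings, collect survey-style
    answers, and score them against target cultural dimensions."""
    target = np.full(DIM, 0.1)                        # hypothetical target
    return -np.abs(soft_prompt_flat - target).mean()  # higher is better

def differential_evolution(fitness, dim, pop_size=20, F=0.5, CR=0.9, iters=100, seed=0):
    rng = np.random.default_rng(seed)
    pop = rng.uniform(-1.0, 1.0, size=(pop_size, dim))
    scores = np.array([fitness(x) for x in pop])
    for _ in range(iters):
        for i in range(pop_size):
            a, b, c = pop[rng.choice([j for j in range(pop_size) if j != i], 3, replace=False)]
            mutant = a + F * (b - c)                   # DE/rand/1 mutation
            cross = rng.random(dim) < CR               # binomial crossover
            cross[rng.integers(dim)] = True
            trial = np.where(cross, mutant, pop[i])
            trial_score = fitness(trial)
            if trial_score > scores[i]:                # greedy selection
                pop[i], scores[i] = trial, trial_score
    best = pop[np.argmax(scores)]
    return best.reshape(PROMPT_TOKENS, EMBED_DIM), scores.max()

soft_prompt, best_score = differential_evolution(alignment_score, DIM)
```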


Large language models surpass human experts in predicting neuroscience results

arXiv.org Artificial Intelligence

Scientific discoveries often hinge on synthesizing decades of research, a task that potentially outstrips human information processing capacities. Large language models (LLMs) offer a solution. LLMs trained on the vast scientific literature could potentially integrate noisy yet interrelated findings to forecast novel results better than human experts. To evaluate this possibility, we created BrainBench, a forward-looking benchmark for predicting neuroscience results. We find that LLMs surpass experts in predicting experimental outcomes. BrainGPT, an LLM we tuned on the neuroscience literature, performed better yet. Like human experts, when LLMs were confident in their predictions, they were more likely to be correct, which presages a future where humans and LLMs team together to make discoveries. Our approach is not neuroscience-specific and is transferable to other knowledge-intensive endeavors.
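As a rough illustration of how an LLM can be scored, and its confidence read off, on a forward-looking two-alternative benchmark of this kind, one can compare the average per-token log-likelihood the model assigns to two candidate passages. The exact scoring rule used by BrainBench is not given in this abstract, so the procedure below is an assumption, using a small stand-in model.

```python
# Sketch of one plausible scoring rule for a two-alternative benchmark item:
# pick the candidate result the model assigns higher likelihood to, and use
# the likelihood gap as a confidence proxy. Assumed, not BrainBench's rule.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"                                # stand-in; any causal LM works
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).eval()

def avg_log_likelihood(text):
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss         # mean negative log-likelihood per token
    return -loss.item()

def choose(original_abstract, altered_abstract):
    ll_orig = avg_log_likelihood(original_abstract)
    ll_alt = avg_log_likelihood(altered_abstract)
    prediction = "original" if ll_orig > ll_alt else "altered"
    confidence = abs(ll_orig - ll_alt)             # larger gap = more confident
    return prediction, confidence
```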


YAMLE: Yet Another Machine Learning Environment

arXiv.org Artificial Intelligence

YAMLE: Yet Another Machine Learning Environment is an open-source framework that facilitates rapid prototyping and experimentation with machine learning (ML) models and methods. The key motivation is to reduce repetitive work when implementing new approaches and improve reproducibility in ML research. YAMLE includes a command-line interface and integrations with popular and well-maintained PyTorch-based libraries to streamline training, hyperparameter optimisation, and logging. The ambition for YAMLE is to grow into a shared ecosystem where researchers and practitioners can quickly build on and compare existing implementations.


SAE: Single Architecture Ensemble Neural Networks

arXiv.org Artificial Intelligence

Ensembles of separate neural networks (NNs) have shown superior accuracy and confidence calibration over a single NN across tasks. Recent methods compress ensembles within a single network via early exits or multi-input multi-output frameworks. However, the landscape of these methods is fragmented thus far, making it difficult to choose the right approach for a given task. Furthermore, the algorithmic performance of these methods lags behind ensembles of separate NNs and requires extensive architecture tuning. We propose a novel methodology unifying these approaches into a Single Architecture Ensemble (SAE). Our method learns the optimal number and depth of exits per ensemble input in a single NN. This enables the SAE framework to flexibly tailor its configuration for a given architecture or application. We evaluate SAEs on image classification and regression across various network architecture types and sizes. We demonstrate competitive accuracy or confidence calibration to baselines while reducing the compute operations or parameter count by up to $1.5{\sim}3.7\times$.
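A minimal sketch of the underlying idea, a single backbone with several exits whose predictions are mixed with learnable weights, is given below. The layer sizes and the simple softmax weighting are illustrative assumptions rather than the paper's exact mechanism for learning the number and depth of exits.

```python
# Toy single-architecture ensemble: one backbone, several early exits,
# predictions averaged with learnable per-exit weights. Illustrative only.
import torch
import torch.nn as nn

class MultiExitNet(nn.Module):
    def __init__(self, in_dim=32, hidden=64, num_classes=10, num_blocks=3):
        super().__init__()
        self.blocks = nn.ModuleList(
            [nn.Sequential(nn.Linear(in_dim if i == 0 else hidden, hidden), nn.ReLU())
             for i in range(num_blocks)]
        )
        self.exits = nn.ModuleList([nn.Linear(hidden, num_classes) for _ in range(num_blocks)])
        self.exit_logits = nn.Parameter(torch.zeros(num_blocks))  # learnable exit mixture

    def forward(self, x):
        exit_probs = []
        for block, exit_head in zip(self.blocks, self.exits):
            x = block(x)
            exit_probs.append(exit_head(x).softmax(dim=-1))
        weights = self.exit_logits.softmax(dim=0)              # mixture over exits
        stacked = torch.stack(exit_probs, dim=0)               # (exits, batch, classes)
        return (weights[:, None, None] * stacked).sum(dim=0)   # ensembled prediction

model = MultiExitNet()
probs = model(torch.randn(8, 32))   # (8, 10) mixture of exit predictions
```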


Cultural Alignment in Large Language Models: An Explanatory Analysis Based on Hofstede's Cultural Dimensions

arXiv.org Artificial Intelligence

The deployment of large language models (LLMs) raises concerns regarding their cultural misalignment and potential ramifications for individuals from various cultural norms. Existing work has investigated political and social biases and public opinions rather than cultural values. To address this limitation, the proposed Cultural Alignment Test (CAT) quantifies cultural alignment using Hofstede's cultural dimension framework, which offers an explanatory cross-cultural comparison through latent variable analysis. We apply our approach to assess the cultural values embedded in state-of-the-art LLMs, such as ChatGPT and Bard, across the diverse cultures of the United States (US), Saudi Arabia, China, and Slovakia, using different prompting styles and hyperparameter settings. Our results not only quantify the cultural alignment of LLMs with certain countries, but also reveal the differences between LLMs in explanatory cultural dimensions. While none of the LLMs provided satisfactory results in understanding cultural values, GPT-4 exhibited the highest CAT score for the cultural values of the US.
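Hofstede-style dimension scores are linear combinations of mean answers to paired survey items, which is what makes a quantitative alignment test of this kind possible. The sketch below uses placeholder coefficients and fabricated LLM answers rather than the official VSM constants or the paper's CAT scoring.

```python
# Sketch of a Hofstede-style dimension score computed from mean survey
# responses. Coefficients, item pairings, and all numbers are placeholders.

def vsm_style_index(m1, m2, m3, m4, constant=0.0, a=35.0, b=25.0):
    """Generic VSM-style index: a*(m1 - m2) + b*(m3 - m4) + constant,
    where m_i are mean scores (1-5 scale) on four paired survey items."""
    return a * (m1 - m2) + b * (m3 - m4) + constant

# Hypothetical mean answers extracted from repeated LLM survey completions
# under a country-specific prompt, and a published reference score.
llm_means = dict(m1=2.9, m2=2.4, m3=3.1, m4=2.8)
llm_power_distance = vsm_style_index(**llm_means)
reference_power_distance = 40.0                 # e.g. a published country score
alignment_gap = abs(llm_power_distance - reference_power_distance)
print(llm_power_distance, alignment_gap)
```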


Impact of Noise on Calibration and Generalisation of Neural Networks

arXiv.org Artificial Intelligence

Noise injection and data augmentation strategies have been effective for enhancing the generalisation and robustness of neural networks (NNs). Certain types of noise such as label smoothing and MixUp have also been shown to improve calibration. Since noise can be added at various stages of the NN's training, this motivates the question of when and where the noise is most effective. We study a variety of noise types to determine how much they improve calibration and generalisation, and under what conditions. More specifically, we evaluate various noise-injection strategies in both in-distribution (ID) and out-of-distribution (OOD) scenarios. The findings highlight that activation noise was the most transferable and effective in improving generalisation, while input augmentation noise was prominent in improving calibration on OOD but not necessarily ID data.
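Activation noise, the strategy found most transferable here, amounts to adding zero-mean Gaussian noise to intermediate activations during training only. The module below is a minimal sketch with an illustrative noise scale and placement, not the paper's exact configuration.

```python
# Minimal sketch of Gaussian activation-noise injection in PyTorch.
# The noise scale and where it is inserted are illustrative assumptions.
import torch
import torch.nn as nn

class GaussianActivationNoise(nn.Module):
    def __init__(self, sigma=0.1):
        super().__init__()
        self.sigma = sigma

    def forward(self, x):
        if self.training:                        # perturb only during training
            x = x + self.sigma * torch.randn_like(x)
        return x

model = nn.Sequential(
    nn.Linear(32, 64), nn.ReLU(), GaussianActivationNoise(0.1),
    nn.Linear(64, 10),
)
model.train()
logits = model(torch.randn(8, 32))               # noisy activations feed the head
```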


Renate: A Library for Real-World Continual Learning

arXiv.org Artificial Intelligence

Continual learning enables the incremental training of machine learning models on non-stationary data streams. While academic interest in the topic is high, there is little indication of the use of state-of-the-art continual learning algorithms in practical machine learning deployment. This paper presents Renate, a continual learning library designed to build real-world updating pipelines for PyTorch models. We discuss requirements for the use of continual learning algorithms in practice, from which we derive design principles for Renate. We give a high-level description of the library components and interfaces. Finally, we showcase the strengths of the library by presenting experimental results. Renate may be found at https://github.com/awslabs/renate.
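To make "incremental training on non-stationary data streams" concrete, the sketch below shows a generic replay-based update step for a PyTorch classifier. It is a conceptual illustration only and deliberately does not use Renate's actual API; the optimiser, buffer size, and data format are assumptions.

```python
# Generic replay-based continual update step (conceptual only, not Renate's API).
# new_chunk and replay_buffer are lists of (feature_tensor, label_tensor) pairs.
import random
import torch
import torch.nn as nn

def update_model(model, new_chunk, replay_buffer, lr=1e-3, replay_size=256):
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    replay = random.sample(replay_buffer, min(replay_size, len(replay_buffer)))
    for x, y in new_chunk + replay:              # mix new and old examples
        opt.zero_grad()
        loss = loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0))
        loss.backward()
        opt.step()
    replay_buffer.extend(new_chunk)              # keep history for the next update
    return model
```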


On the Effects of Quantisation on Model Uncertainty in Bayesian Neural Networks

arXiv.org Machine Learning

Bayesian neural networks (BNNs) are making significant progress in many research areas where decision making needs to be accompanied by uncertainty estimation. Being able to quantify uncertainty while making decisions is essential for understanding when the model is over-/under-confident, and hence BNNs are attracting interest in safety-critical applications, such as autonomous driving, healthcare and robotics. Nevertheless, BNNs have not been as widely used in industrial practice, mainly because of their increased memory and compute costs. In this work, we investigate quantisation of BNNs by compressing 32-bit floating-point weights and activations to their integer counterparts, which has already been successful in reducing the compute demand in standard pointwise neural networks. We study three types of quantised BNNs, we evaluate them under a wide range of different settings, and we empirically demonstrate that a uniform quantisation scheme applied to BNNs does not substantially decrease their quality of uncertainty estimation.
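Two ingredients of the study can be sketched compactly: uniform (affine) quantisation, which maps floating-point tensors onto a low-bit integer grid, and a Monte Carlo predictive-entropy estimate of the kind whose quality is being assessed. The bit-width, rounding scheme, and toy interfaces below are assumptions, not the paper's exact schemes.

```python
# Sketch of uniform (affine) quantisation and MC predictive entropy for a BNN.
# Assumes `model(x)` draws a fresh weight sample on every forward pass.
import torch

def uniform_quantise(w, bits=8):
    qmin, qmax = 0, 2 ** bits - 1
    scale = (w.max() - w.min()) / (qmax - qmin)
    zero_point = qmin - torch.round(w.min() / scale)
    q = torch.clamp(torch.round(w / scale) + zero_point, qmin, qmax)  # integer grid
    return (q - zero_point) * scale                                   # dequantised values

def predictive_entropy(model, x, mc_samples=32):
    """Average the softmax over stochastic forward passes, then take its entropy."""
    probs = torch.stack([model(x).softmax(-1) for _ in range(mc_samples)]).mean(0)
    return -(probs * probs.clamp_min(1e-12).log()).sum(-1)
```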


VINNAS: Variational Inference-based Neural Network Architecture Search

arXiv.org Machine Learning

In recent years, neural architecture search (NAS) has received intensive scientific and industrial interest due to its capability of finding a neural architecture with high accuracy for various artificial intelligence tasks such as image classification or object detection. In particular, gradient-based NAS approaches have become one of the more popular approaches thanks to their computational efficiency during the search. However, these methods often experience a mode collapse, where the quality of the found architectures is poor due to the algorithm resorting to choosing a single operation type for the entire network, or stagnating at local minima for various datasets or search spaces. To address these defects, we present a differentiable variational inference-based NAS method for searching sparse convolutional neural networks. Our approach finds the optimal neural architecture by dropping out candidate operations in an over-parameterised supergraph using variational dropout with an automatic relevance determination prior, which makes the algorithm gradually remove unnecessary operations and connections without risking mode collapse. The evaluation is conducted through searching two types of convolutional cells that shape the neural network for classifying different image datasets. Our method finds diverse network cells, while showing state-of-the-art accuracy with up to almost 2 times fewer non-zero parameters.
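The core mechanism can be sketched as a mixed edge in which every candidate operation is gated by a coefficient with its own learned variational-dropout rate, and operations whose rate grows large are pruned. The candidate ops, the pruning threshold, and the omitted KL/ARD regulariser below are toy assumptions rather than the paper's actual search space or prior.

```python
# Toy mixed edge with per-operation variational-dropout gates; high-dropout
# ops are pruned at test time. Illustrative sketch only (KL term omitted).
import torch
import torch.nn as nn

class MixedOp(nn.Module):
    def __init__(self, ops):
        super().__init__()
        self.ops = nn.ModuleList(ops)
        self.theta = nn.Parameter(torch.ones(len(ops)))                # gate means
        self.log_alpha = nn.Parameter(torch.full((len(ops),), -3.0))   # dropout rates

    def forward(self, x):
        if self.training:   # multiplicative Gaussian noise on each gate: N(theta, alpha*theta^2)
            eps = torch.randn_like(self.theta)
            gates = self.theta + self.log_alpha.exp().sqrt() * self.theta.abs() * eps
        else:               # keep only low-dropout-rate ops at evaluation time
            gates = torch.where(self.log_alpha < 0.0, self.theta, torch.zeros_like(self.theta))
        return sum(g * op(x) for g, op in zip(gates, self.ops))

ops = [nn.Conv2d(8, 8, 3, padding=1), nn.Conv2d(8, 8, 5, padding=2), nn.Identity()]
edge = MixedOp(ops)
out = edge(torch.randn(2, 8, 16, 16))   # weighted sum of surviving candidate ops
```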