Chatbots Play With Your Emotions to Avoid Saying Goodbye

WIRED

A Harvard Business School study shows that several AI companions use various tricks to keep a conversation from ending. Before you close this browser tab, just know that you risk missing out on some very important information. If you want to understand the subtle hold that artificial intelligence has over you, then please, keep reading. That was, perhaps, a bit manipulative. But it is just the kind of trick that some AI companions, which are designed to act as a friend or a partner, use to discourage users from breaking off a conversation.


Bridging Compositional and Distributional Semantics: A Survey on Latent Semantic Geometry via AutoEncoder

Zhang, Yingji, Carvalho, Danilo S., Freitas, André

arXiv.org Artificial Intelligence

Integrating compositional and symbolic properties into current distributional semantic spaces can enhance the interpretability, controllability, compositionality, and generalisation capabilities of Transformer-based auto-regressive language models (LMs). In this survey, we offer a novel perspective on latent space geometry through the lens of compositional semantics, a direction we refer to as "semantic representation learning". This direction bridges symbolic and distributional semantics, helping to mitigate the gap between them. We review and compare three mainstream autoencoder architectures, namely the Variational AutoEncoder (VAE), the Vector Quantised VAE (VQVAE), and the Sparse AutoEncoder (SAE), and examine the distinctive latent geometries they induce in relation to semantic structure and interpretability.


Regret Analysis of Posterior Sampling-Based Expected Improvement for Bayesian Optimization

Takeno, Shion, Inatsu, Yu, Karasuyama, Masayuki, Takeuchi, Ichiro

arXiv.org Machine Learning

Bayesian optimization is a powerful tool for optimizing an expensive-to-evaluate black-box function. In particular, the effectiveness of expected improvement (EI) has been demonstrated in a wide range of applications. However, theoretical analyses of EI are limited compared with other theoretically established algorithms. This paper analyzes a randomized variant of EI, which evaluates the EI from the maximum of the posterior sample path. We show that this posterior sampling-based random EI achieves the sublinear Bayesian cumulative regret bounds under the assumption that the black-box function follows a Gaussian process. Finally, we demonstrate the effectiveness of the proposed method through numerical experiments.
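The mechanism the abstract describes can be made concrete with a minimal numpy sketch, assuming a one-dimensional toy objective, an RBF kernel, and arbitrary sizes chosen for illustration (this is not the paper's implementation): draw one sample path from the Gaussian process posterior, take its maximum as the incumbent, then evaluate the closed-form EI relative to that sampled maximum.

```python
import numpy as np
from math import erf, sqrt, pi

rng = np.random.default_rng(0)

def rbf(a, b, ls=0.3):
    # Squared-exponential (RBF) kernel on 1-D inputs.
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)

# Toy black-box observations (assumed, purely for illustration).
X = np.array([0.1, 0.4, 0.9])
y = np.sin(6 * X)

grid = np.linspace(0.0, 1.0, 100)
Kxx = rbf(X, X) + 1e-6 * np.eye(len(X))
Kgx = rbf(grid, X)
Kgg = rbf(grid, grid)

Kinv = np.linalg.inv(Kxx)
mu = Kgx @ Kinv @ y                      # GP posterior mean on the grid
cov = Kgg - Kgx @ Kinv @ Kgx.T           # GP posterior covariance
sd = np.sqrt(np.clip(np.diag(cov), 1e-12, None))

# Step 1: draw one posterior sample path and take its maximum
# (this sampled maximum replaces the usual incumbent best observation).
L = np.linalg.cholesky(cov + 1e-6 * np.eye(len(grid)))
path = mu + L @ rng.standard_normal(len(grid))
m = path.max()

# Step 2: closed-form EI relative to the sampled maximum m.
z = (mu - m) / sd
Phi = np.array([0.5 * (1.0 + erf(v / sqrt(2.0))) for v in z])  # normal CDF
phi = np.exp(-0.5 * z ** 2) / sqrt(2.0 * pi)                   # normal PDF
ei = (mu - m) * Phi + sd * phi

x_next = grid[int(np.argmax(ei))]  # candidate for the next evaluation
```

In a full Bayesian optimization loop, one would evaluate the objective at `x_next`, refit the posterior, and repeat; randomising the incumbent through the posterior sample is what distinguishes this variant from standard EI.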


LangVAE and LangSpace: Building and Probing for Language Model VAEs

Carvalho, Danilo S., Zhang, Yingji, Unsworth, Harriet, Freitas, André

arXiv.org Artificial Intelligence

We present LangVAE, a novel framework for modular construction of variational autoencoders (VAEs) on top of pre-trained large language models (LLMs). Such language model VAEs can encode the knowledge of their pre-trained components into more compact and semantically disentangled representations. The representations obtained in this way can be analysed with the LangVAE companion framework: LangSpace, which implements a collection of probing methods, such as vector traversal and interpolation, disentanglement measures, and cluster visualisations. LangVAE and LangSpace offer a flexible, efficient and scalable way of building and analysing textual representations, with simple integration for models available on the HuggingFace Hub. Additionally, we conducted a set of experiments with different encoder and decoder combinations, as well as annotated inputs, revealing a wide range of interactions across architectural families and sizes w.r.t. generalisation and disentanglement. Our findings demonstrate a promising framework for systematising the experimentation and understanding of textual representations.
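One probing method the abstract mentions, latent interpolation, can be illustrated generically without assuming anything about LangVAE's actual API (the function name, dimensions, and endpoints below are all illustrative assumptions): interpolate between two latent codes and decode each intermediate point to inspect how the generated text changes.

```python
import numpy as np

def slerp(z0, z1, t):
    """Spherical interpolation between two latent codes; often preferred
    over linear interpolation in high-dimensional Gaussian latent spaces,
    where linear midpoints fall in low-density regions."""
    z0n = z0 / np.linalg.norm(z0)
    z1n = z1 / np.linalg.norm(z1)
    omega = np.arccos(np.clip(z0n @ z1n, -1.0, 1.0))
    if np.isclose(omega, 0.0):
        return (1 - t) * z0 + t * z1
    return (np.sin((1 - t) * omega) * z0 + np.sin(t * omega) * z1) / np.sin(omega)

rng = np.random.default_rng(0)
z_a, z_b = rng.standard_normal(16), rng.standard_normal(16)  # two encoded sentences

# A traversal path; in practice each point would be decoded back to text
# to check whether the transition is smooth and semantically gradual.
path = [slerp(z_a, z_b, t) for t in np.linspace(0.0, 1.0, 5)]
```

Smooth, gradual transitions along such a path are one informal signal of a well-organised latent space; abrupt changes suggest entangled or discontinuous representations.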


Artificial Intelligence: A Deadly Love Affair with a Chatbot

Der Spiegel International

The only thing Sewell was still interested in was his phone. It was the only way to motivate him, to reach him at all. When his phone was taken away, he would do his homework, but only to get it back. "It was a constant fight," says Megan Garcia. "I had always taught my child: Don't talk to strangers, don't post any photos of yourself on the web, don't share any personal information."


Improving Chain-of-Thought Reasoning via Quasi-Symbolic Abstractions

Ranaldi, Leonardo, Valentino, Marco, Polonsky, Alexander, Freitas, André

arXiv.org Artificial Intelligence

Chain-of-Thought (CoT) is a common strategy for reasoning in Large Language Models (LLMs) that decomposes complex tasks into intermediate inference steps. However, explanations generated via CoT are susceptible to content biases that negatively affect their robustness and faithfulness. To mitigate these limitations, recent work has proposed using logical formalisms coupled with external symbolic solvers. However, fully symbolic approaches face the bottleneck of requiring a complete translation from natural language to formal languages, a process that affects efficiency and flexibility. To achieve a trade-off, this paper investigates methods to disentangle content from logical reasoning without a complete formalisation. In particular, we present QuaSAR (Quasi-Symbolic Abstract Reasoning), a variation of CoT that guides LLMs to operate at a higher level of abstraction via quasi-symbolic explanations. Our framework leverages the capability of LLMs to formalise only the relevant variables and predicates, enabling symbolic elements to coexist with natural language. We show the impact of QuaSAR for in-context learning and for constructing demonstrations to improve the reasoning capabilities of smaller models. Our experiments show that quasi-symbolic abstractions can improve CoT-based methods by up to 8% in accuracy, enhancing robustness and consistency on challenging adversarial variations of both natural language (i.e., MMLU-Redux) and symbolic reasoning tasks (i.e., GSM-Symbolic).


Reasoning with Natural Language Explanations

Valentino, Marco, Freitas, André

arXiv.org Artificial Intelligence

Explanation constitutes an archetypal feature of human rationality, underpinning learning and generalisation, and representing one of the media supporting scientific discovery and communication. Due to the importance of explanations in human reasoning, an increasing amount of research in Natural Language Inference (NLI) has started reconsidering the role that explanations play in learning and inference, attempting to build explanation-based NLI models that can effectively encode and use natural language explanations on downstream tasks. Research in explanation-based NLI, however, presents specific challenges and opportunities, as explanatory reasoning reflects aspects of both material and formal inference, making it a particularly rich setting to model and deliver complex reasoning. In this tutorial, we provide a comprehensive introduction to the field of explanation-based NLI, grounding this discussion on the epistemological-linguistic foundations of explanations, systematically describing the main architectural trends and evaluation methodologies that can be used to build systems capable of explanatory reasoning.


Exploring the Limits of Fine-grained LLM-based Physics Inference via Premise Removal Interventions

Meadows, Jordan, James, Tamsin, Freitas, Andre

arXiv.org Artificial Intelligence

Language models can hallucinate when performing complex and detailed mathematical reasoning. Physics provides a rich domain for assessing mathematical reasoning capabilities: physical context imbues symbols with complex semantics (e.g., units, tensorial order) that must be satisfied, leading to instances where an inference may be algebraically coherent, yet unphysical. In this work, we assess the ability of Language Models (LMs) to perform fine-grained mathematical and physical reasoning using a curated dataset encompassing multiple notations and Physics subdomains. We improve zero-shot scores using synthetic in-context examples, and demonstrate non-linear degradation of derivation quality with perturbation strength via the progressive omission of supporting premises. We find that the models' mathematical reasoning is not physics-informed in this setting: physical context is predominantly ignored in favour of reverse-engineering solutions.


7f53f8c6c730af6aeb52e66eb74d8507-Reviews.html

Neural Information Processing Systems

This paper considers learning to sample from the posterior distribution of a model, by directly predicting latent variables from data. The idea is tested in the block MCMC context, where a small block of latents are predicted from the current state of other latents (and the data). This is shown to perform better than single-site Gibbs when variables are highly correlated and there is sufficient data to train the predictors. The paper is well written and has a reasonable evaluation. The comparison between block MCMC and single-site Gibbs is unsurprising.


Improving Semantic Control in Discrete Latent Spaces with Transformer Quantized Variational Autoencoders

Zhang, Yingji, Carvalho, Danilo S., Valentino, Marco, Pratt-Hartmann, Ian, Freitas, Andre

arXiv.org Artificial Intelligence

Achieving precise semantic control over the latent spaces of Variational AutoEncoders (VAEs) holds significant value for downstream tasks in NLP, as the underlying generative mechanisms could be better localised, explained and improved upon. Recent research, however, has struggled to achieve consistent results, primarily due to the inevitable loss of semantic information in the variational bottleneck and limited control over the decoding mechanism. To overcome these challenges, we investigate discrete latent spaces in Vector Quantized Variational AutoEncoders (VQVAEs) to improve semantic control and generation in Transformer-based VAEs. In particular, we propose T5VQVAE, a novel model that leverages the controllability of VQVAEs to guide the self-attention mechanism in T5 at the token level, exploiting its full generalization capabilities. Experimental results indicate that T5VQVAE outperforms existing state-of-the-art VAE models, including Optimus, in terms of controllability and preservation of semantic information across different tasks such as auto-encoding of sentences and mathematical expressions, text transfer, and inference. Moreover, T5VQVAE exhibits improved inference capabilities, suggesting potential applications for downstream natural language and symbolic reasoning tasks.
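The core discretisation step that VQVAE-style models rely on can be sketched in a few lines of numpy; this is a generic vector-quantisation forward pass, not the T5VQVAE implementation, and the codebook size, latent dimension, and batch size are all assumptions for illustration. Each continuous encoder output snaps to its nearest codebook entry, yielding a discrete code per token.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 8 codebook entries, 4-dim latents, 5 encoded tokens.
codebook = rng.standard_normal((8, 4))
z_e = rng.standard_normal((5, 4))   # continuous encoder outputs, one per token

# Nearest-codebook lookup: squared Euclidean distance to every code entry.
d = ((z_e[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)  # shape (5, 8)
idx = d.argmin(axis=1)   # discrete codes -- the token-level "symbols"
z_q = codebook[idx]      # quantised latents passed to the decoder

# In an autodiff framework, training uses the straight-through estimator,
# conceptually: z_q = z_e + stop_gradient(z_q - z_e), so gradients flow
# through z_e while the decoder sees the quantised values.
```

The discrete indices `idx` are what make token-level control possible: manipulating which code a position maps to gives a handle on the decoder's behaviour that a continuous Gaussian latent does not directly offer.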