AITopics | South America

Collaborating Authors

South America

Student-t processes as infinite-width limits of posterior Bayesian neural networks

Caporali, Francesco, Favaro, Stefano, Trevisan, Dario

arXiv.org Machine LearningFeb-6-2025

The asymptotic properties of Bayesian Neural Networks (BNNs) have been extensively studied, particularly regarding their approximations by Gaussian processes in the infinite-width limit. We extend these results by showing that posterior BNNs can be approximated by Student-t processes, which offer greater flexibility in modeling uncertainty. Specifically, we show that, if the parameters of a BNN follow a Gaussian prior distribution, and the variance of both the last hidden layer and the Gaussian likelihood function follows an Inverse-Gamma prior distribution, then the resulting posterior BNN converges to a Student-t process in the infinite-width limit. Our proof leverages the Wasserstein metric to establish control over the convergence rate of the Student-t process approximation.

artificial intelligence, bayesian inference, machine learning, (17 more...)

arXiv.org Machine Learning

2502.04247

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > New York (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)

Add feedback

Malleable Robots

Clark, Angus B., Wang, Xinran, Ranne, Alex, Rojas, Nicolas

arXiv.org Artificial IntelligenceFeb-6-2025

Reconfigurable robot systems provide several key potential advantages over traditional robots, including increased task versatility by adapting to better suit tasks, and reduced robot cost due to a smaller total number of modules, such as links and joints. As such, there has been significant research into the development of reconfigurable robots, with the most popular approach utilising modularity as the method of reconfiguration, as this allows for the interchangeability of parts, leading to self-repair [71, 60]. The reconfigurability feature has specifically been of interest in unstructured and unpredictable environments, characterised by changing operating contexts, which take the most advantage from robots that can adapt their shape and operating mode [66]. An alternative approach for the application of reconfigurable robot manipulators can be found in the industrial field of serial manipulators. In an ideal case, a manipulator would be designed with the exact number and configuration of joints necessary for its expected set of tasks [26].

artificial intelligence, machine learning, robot, (20 more...)

arXiv.org Artificial Intelligence

doi: 10.1007/978-3-031-68620-7

2502.04012

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > Oregon > Benton County > Corvallis (0.04)

Genre: Research Report (0.81)

Industry:

Health & Medicine > Surgery (0.67)
Materials > Chemicals > Commodity Chemicals > Petrochemicals > Polymers & Plastics (0.45)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

Sports and Women's Sports: Gender Bias in Text Generation with Olympic Data

Biester, Laura

arXiv.org Artificial IntelligenceFeb-6-2025

Large Language Models (LLMs) have been shown to be biased in prior work, as they generate text that is in line with stereotypical views of the world or that is not representative of the viewpoints and values of historically marginalized demographic groups. In this work, we propose using data from parallel men's and women's events at the Olympic Games to investigate different forms of gender bias in language models. We define three metrics to measure bias, and find that models are consistently biased against women when the gender is ambiguous in the prompt. In this case, the model frequently retrieves only the results of the men's event with or without acknowledging them as such, revealing pervasive gender bias in LLMs in the context of athletics.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2502.04218

Country:

Europe > Russia (0.14)
Asia > Russia (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
(20 more...)

Genre:

Personal > Honors (0.49)
Research Report > Experimental Study (0.46)

Industry: Leisure & Entertainment > Sports > Olympic Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Modular Training of Neural Networks aids Interpretability

Golechha, Satvik, Chaudhary, Maheep, Velja, Joan, Abate, Alessandro, Schoots, Nandi

arXiv.org Artificial IntelligenceFeb-6-2025

An approach to improve neural network interpretability is via clusterability, i.e., splitting a model into disjoint clusters that can be studied independently. We define a measure for clusterability and show that pre-trained models form highly enmeshed clusters via spectral graph clustering. We thus train models to be more modular using a "clusterability loss" function that encourages the formation of non-interacting clusters. Using automated interpretability techniques, we show that our method can help train models that are more modular and learn different, disjoint, and smaller circuits. We investigate CNNs trained on MNIST and CIFAR, small transformers trained on modular addition, and language models. Our approach provides a promising direction for training neural networks that learn simpler functions and are easier to interpret.

artificial intelligence, machine learning, modular training, (16 more...)

arXiv.org Artificial Intelligence

2502.0247

Country:

Europe > Austria > Vienna (0.14)
South America > Colombia > Meta Department > Villavicencio (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(4 more...)

Genre: Research Report (0.83)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.40)
Health & Medicine > Therapeutic Area > Immunology (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.89)

Add feedback

Learning Street View Representations with Spatiotemporal Contrast

Li, Yong, Huang, Yingjing, Mai, Gengchen, Zhang, Fan

arXiv.org Artificial IntelligenceFeb-6-2025

Street view imagery is extensively utilized in representation learning for urban visual environments, supporting various sustainable development tasks such as environmental perception and socio-economic assessment. However, it is challenging for existing image representations to specifically encode the dynamic urban environment (such as pedestrians, vehicles, and vegetation), the built environment (including buildings, roads, and urban infrastructure), and the environmental ambiance (such as the cultural and socioeconomic atmosphere) depicted in street view imagery to address downstream tasks related to the city. In this work, we propose an innovative self-supervised learning framework that leverages temporal and spatial attributes of street view imagery to learn image representations of the dynamic urban environment for diverse downstream tasks. By employing street view images captured at the same location over time and spatially nearby views at the same time, we construct contrastive learning tasks designed to learn the temporal-invariant characteristics of the built environment and the spatial-invariant neighborhood ambiance. Our approach significantly outperforms traditional supervised and unsupervised methods in tasks such as visual place recognition, socioeconomic estimation, and human-environment perception. Moreover, we demonstrate the varying behaviors of image representations learned through different contrastive learning objectives across various downstream tasks. This study systematically discusses representation learning strategies for urban studies based on street view images, providing a benchmark that enhances the applicability of visual data in urban science. The code is available at https://github.com/yonglleee/UrbanSTCL.

artificial intelligence, machine learning, street view image, (16 more...)

arXiv.org Artificial Intelligence

2502.04638

Country:

North America > United States > Illinois > Cook County > Chicago (0.05)
North America > United States > California > Los Angeles County > Los Angeles (0.05)
North America > Canada > Quebec > Montreal (0.04)
(16 more...)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Add feedback

Evaluating Inter-Column Logical Relationships in Synthetic Tabular Data Generation

Long, Yunbo, Xu, Liming, Brintrup, Alexandra

arXiv.org Artificial IntelligenceFeb-6-2025

To evaluate the fidelity of synthetic tabular data, numerous metrics have been proposed to assess accuracy and diversity, including both low-order statistics (e.g., Density Estimation and Correlation Score (Zhang et al., 2023), Average Coverage Scores (Zein & Urvoy, 2022)) and high-order statistics (e.g., α-Precision and β-Recall (Alaa et al., 2022)). However, these metrics operate at a high level and fail to evaluate whether synthetic data preserves logical relationships, such as hierarchical or semantic dependencies between features. This highlights the need for a more fine-grained, context-aware evaluation of multivariate dependencies. To address this, we propose three evaluation metrics: Hierarchical Consistency Score (HCS), Multivariate Dependency Index (MDI), and Distributional Similarity Index (DSI). To assess the effectiveness of these metrics in quantifying inter-column relationships, we select five representative tabular data generation methods from different categories for evaluation. Their performance is measured using both existing and our proposed metrics on a real-world dataset rich in logical consistency and dependency constraints. Experimental results validate the effectiveness of our proposed metrics and reveal the limitations of existing approaches in preserving logical relationships in synthetic tabular data. Additionally, we discuss potential pathways to better capture logical constraints within joint distributions, paying the way for future advancements in synthetic tabular data generation.

machine learning, natural language, tabular data, (17 more...)

arXiv.org Artificial Intelligence

2502.04055

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Asia > Southeast Asia (0.06)
(49 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.88)

Add feedback

Extracting and Understanding the Superficial Knowledge in Alignment

Chen, Runjin, Perin, Gabriel Jacob, Chen, Xuxi, Chen, Xilun, Han, Yan, Hirata, Nina S. T., Hong, Junyuan, Kailkhura, Bhavya

arXiv.org Artificial IntelligenceFeb-6-2025

Alignment of large language models (LLMs) with human values and preferences, often achieved through fine-tuning based on human feedback, is essential for ensuring safe and responsible AI behaviors. However, the process typically requires substantial data and computation resources. Recent studies have revealed that alignment might be attainable at lower costs through simpler methods, such as in-context learning. This leads to the question: Is alignment predominantly superficial? In this paper, we delve into this question and provide a quantitative analysis. We formalize the concept of superficial knowledge, defining it as knowledge that can be acquired through easily token restyling, without affecting the model's ability to capture underlying causal relationships between tokens. We propose a method to extract and isolate superficial knowledge from aligned models, focusing on the shallow modifications to the final token selection process. By comparing models augmented only with superficial knowledge to fully aligned models, we quantify the superficial portion of alignment. Our findings reveal that while superficial knowledge constitutes a significant portion of alignment, particularly in safety and detoxification tasks, it is not the whole story. Tasks requiring reasoning and contextual understanding still rely on deeper knowledge. Additionally, we demonstrate two practical advantages of isolated superficial knowledge: (1) it can be transferred between models, enabling efficient offsite alignment of larger models using extracted superficial knowledge from smaller models, and (2) it is recoverable, allowing for the restoration of alignment in compromised models without sacrificing performance.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2502.04602

Country:

South America > Brazil > São Paulo (0.04)
North America > United States > Texas > Travis County > Austin (0.04)

Genre: Research Report > New Finding (0.88)

Industry: Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Add feedback

Llasa: Scaling Train-Time and Inference-Time Compute for Llama-based Speech Synthesis

Ye, Zhen, Zhu, Xinfa, Chan, Chi-Min, Wang, Xinsheng, Tan, Xu, Lei, Jiahe, Peng, Yi, Liu, Haohe, Jin, Yizhu, DAI, Zheqi, Lin, Hongzhan, Chen, Jianyi, Du, Xingjian, Xue, Liumeng, Chen, Yunlin, Li, Zhifei, Xie, Lei, Kong, Qiuqiang, Guo, Yike, Xue, Wei

arXiv.org Artificial IntelligenceFeb-6-2025

Recent advances in text-based large language models (LLMs), particularly in the GPT series and the o1 model, have demonstrated the effectiveness of scaling both training-time and inference-time compute. However, current state-of-the-art TTS systems leveraging LLMs are often multi-stage, requiring separate models (e.g., diffusion models after LLM), complicating the decision of whether to scale a particular model during training or testing. This work makes the following contributions: First, we explore the scaling of train-time and inference-time compute for speech synthesis. Second, we propose a simple framework Llasa for speech synthesis that employs a single-layer vector quantizer (VQ) codec and a single Transformer architecture to fully align with standard LLMs such as Llama. Our experiments reveal that scaling train-time compute for Llasa consistently improves the naturalness of synthesized speech and enables the generation of more complex and accurate prosody patterns. Furthermore, from the perspective of scaling inference-time compute, we employ speech understanding models as verifiers during the search, finding that scaling inference-time compute shifts the sampling modes toward the preferences of specific verifiers, thereby improving emotional expressiveness, timbre consistency, and content accuracy. In addition, we released the checkpoint and training code for our TTS model (1B, 3B, 8B) and codec model publicly available.

large language model, machine learning, natural language, (12 more...)

arXiv.org Artificial Intelligence

2502.04128

Country:

Asia > Maldives (0.04)
Asia > China > Hong Kong (0.04)
South America > Brazil > Rio de Janeiro > Rio de Janeiro (0.04)
(15 more...)

Genre:

Personal (1.00)
Research Report > New Finding (0.46)

Industry:

Leisure & Entertainment (1.00)
Media (0.93)

Technology:

Information Technology > Artificial Intelligence > Speech (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Aligner-Encoders: Self-Attention Transformers Can Be Self-Transducers

Stooke, Adam, Prabhavalkar, Rohit, Sim, Khe Chai, Mengibar, Pedro Moreno

arXiv.org Artificial IntelligenceFeb-6-2025

Modern systems for automatic speech recognition, including the RNN-Transducer and Attention-based Encoder-Decoder (AED), are designed so that the encoder is not required to alter the time-position of information from the audio sequence into the embedding; alignment to the final text output is processed during decoding. We discover that the transformer-based encoder adopted in recent years is actually capable of performing the alignment internally during the forward pass, prior to decoding. This new phenomenon enables a simpler and more efficient model, the "Aligner-Encoder". To train it, we discard the dynamic programming of RNN-T in favor of the frame-wise cross-entropy loss of AED, while the decoder employs the lighter text-only recurrence of RNN-T without learned cross-attention -- it simply scans embedding frames in order from the beginning, producing one token each until predicting the end-of-message. We conduct experiments demonstrating performance remarkably close to the state of the art, including a special inference configuration enabling long-form recognition. In a representative comparison, we measure the total inference time for our model to be 2x faster than RNN-T and 16x faster than AED. Lastly, we find that the audio-text alignment is clearly visible in the self-attention weights of a certain layer, which could be said to perform "self-transduction".

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2502.05232

Country:

North America > United States (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Asia > South Korea > Incheon > Incheon (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.90)

Add feedback

Ontology-Guided, Hybrid Prompt Learning for Generalization in Knowledge Graph Question Answering

Jiang, Longquan, Huang, Junbo, Möller, Cedric, Usbeck, Ricardo

arXiv.org Artificial IntelligenceFeb-6-2025

Most existing Knowledge Graph Question Answering (KGQA) approaches are designed for a specific KG, such as Wikidata, DBpedia or Freebase. Due to the heterogeneity of the underlying graph schema, topology and assertions, most KGQA systems cannot be transferred to unseen Knowledge Graphs (KGs) without resource-intensive training data. We present OntoSCPrompt, a novel Large Language Model (LLM)-based KGQA approach with a two-stage architecture that separates semantic parsing from KG-dependent interactions. OntoSCPrompt first generates a SPARQL query structure (including SPARQL keywords such as SELECT, ASK, WHERE and placeholders for missing tokens) and then fills them with KG-specific information. To enhance the understanding of the underlying KG, we present an ontology-guided, hybrid prompt learning strategy that integrates KG ontology into the learning process of hybrid prompts (e.g., discrete and continuous vectors). We also present several task-specific decoding strategies to ensure the correctness and executability of generated SPARQL queries in both stages. Experimental results demonstrate that OntoSCPrompt performs as well as SOTA approaches without retraining on a number of KGQA datasets such as CWQ, WebQSP and LC-QuAD 1.0 in a resource-efficient manner and can generalize well to unseen domain-specific KGs like DBLP-QuAD and CoyPu KG Code: \href{https://github.com/LongquanJiang/OntoSCPrompt}{https://github.com/LongquanJiang/OntoSCPrompt}

artificial intelligence, computational linguistic, natural language, (16 more...)

arXiv.org Artificial Intelligence

2502.03992

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
South America > Colombia > Meta Department > Villavicencio (0.04)
(11 more...)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)

Add feedback