AITopics | South America

Collaborating Authors

South America

Adaptive posterior distributions for uncertainty analysis of covariance matrices in Bayesian inversion problems for multioutput signals

Curbelo, E., Martino, L., Llorente, F., Delgado-Gomez, D.

arXiv.org Machine LearningJan-2-2025

In this paper we address the problem of performing Bayesian inference for the parameters of a nonlinear multi-output model and the covariance matrix of the different output signals. We propose an adaptive importance sampling (AIS) scheme for multivariate Bayesian inversion problems, which is based in two main ideas: the variables of interest are split in two blocks and the inference takes advantage of known analytical optimization formulas. We estimate both the unknown parameters of the multivariate non-linear model and the covariance matrix of the noise. In the first part of the proposed inference scheme, a novel AIS technique called adaptive target adaptive importance sampling (ATAIS) is designed, which alternates iteratively between an IS technique over the parameters of the non-linear model and a frequentist approach for the covariance matrix of the noise. In the second part of the proposed inference scheme, a prior density over the covariance matrix is considered and the cloud of samples obtained by ATAIS are recycled and re-weighted to obtain a complete Bayesian study over the model parameters and covariance matrix. ATAIS is the main contribution of the work. Additionally, the inverted layered importance sampling (ILIS) is presented as a possible compelling algorithm (but based on a conceptually simpler idea). Different numerical examples show the benefits of the proposed approaches

atais, covariance matrix, matrix, (17 more...)

arXiv.org Machine Learning

doi: 10.1016/j.jfranklin.2024.107441

2501.01148

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Europe > Spain > Galicia > Madrid (0.04)
North America > United States > New York > New York County > New York City (0.04)
(4 more...)

Genre: Research Report (0.40)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)

Add feedback

Large Language Models Are Read/Write Policy-Makers for Simultaneous Generation

Guo, Shoutao, Zhang, Shaolei, Ma, Zhengrui, Feng, Yang

arXiv.org Artificial IntelligenceJan-1-2025

Simultaneous generation models write generation results while reading streaming inputs, necessitating a policy-maker to determine the appropriate output timing. Existing simultaneous generation methods generally adopt the traditional encoder-decoder architecture and learn the generation and policy-making capabilities through complex dynamic programming techniques. Although LLMs excel at text generation, they face challenges in taking on the role of policy-makers through traditional training methods, limiting their exploration in simultaneous generation. To overcome these limitations, we propose a novel LLM-driven Simultaneous Generation (LSG) framework, which allows the off-the-shelf LLM to decide the generation timing and produce output concurrently. Specifically, LSG selects the generation policy that minimizes latency as the baseline policy. Referring to the baseline policy, LSG enables the LLM to devise an improved generation policy that better balances latency and generation quality, and writes generation results accordingly. Experiments on simultaneous translation and streaming automatic speech recognition tasks show that our method can achieve state-of-the-art performance utilizing the open-source LLMs and demonstrate practicality in real-world scenarios.

computational linguistic, large language model, natural language, (18 more...)

arXiv.org Artificial Intelligence

2501.00868

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Dominican Republic (0.04)
North America > Canada > Ontario > Toronto (0.04)
(12 more...)

Genre: Research Report > New Finding (0.46)

Industry: Government (0.90)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

ValuesRAG: Enhancing Cultural Alignment Through Retrieval-Augmented Contextual Learning

Seo, Wonduk, Yuan, Zonghao, Bu, Yi

arXiv.org Artificial IntelligenceJan-1-2025

Cultural values alignment in Large Language Models (LLMs) is a critical challenge due to their tendency to embed Western-centric biases from training data, leading to misrepresentations and fairness issues in cross-cultural contexts. Recent approaches, such as role-assignment and few-shot learning, often struggle with reliable cultural alignment as they heavily rely on pre-trained knowledge, lack scalability, and fail to capture nuanced cultural values effectively. To address these issues, we propose ValuesRAG, a novel and effective framework that applies Retrieval-Augmented Generation (RAG) with in-context learning to integrate cultural and demographic knowledge dynamically during text generation. Leveraging the World Values Survey (WVS) dataset, ValuesRAG first generates summaries of values for each individual. Subsequently, we curated several representative regional datasets to serve as test datasets and retrieve relevant summaries of values based on demographic features, followed by a reranking step to select the top-k relevant summaries. ValuesRAG consistently outperforms baseline methods, both in the main experiment and in the ablation study where only the values summary was provided, highlighting ValuesRAG's potential to foster culturally aligned AI systems and enhance the inclusivity of AI-driven applications.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2501.01031

Country:

Asia > India (0.05)
Africa (0.05)
Europe (0.04)
(7 more...)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Add feedback

Reasoning based on symbolic and parametric knowledge bases: a survey

Xu, Mayi, Ning, Yunfeng, Li, Yongqi, Chen, Jianhao, Wen, Jintao, Xiao, Yao, Zhou, Shen, Pan, Birong, Bao, Zepeng, Miao, Xin, Kang, Hankun, Sun, Ke, Qian, Tieyun

arXiv.org Artificial IntelligenceJan-1-2025

Reasoning is fundamental to human intelligence, and critical for problem-solving, decision-making, and critical thinking. Reasoning refers to drawing new conclusions based on existing knowledge, which can support various applications like clinical diagnosis, basic education, and financial analysis. Though a good number of surveys have been proposed for reviewing reasoning-related methods, none of them has systematically investigated these methods from the viewpoint of their dependent knowledge base. Both the scenarios to which the knowledge bases are applied and their storage formats are significantly different. Hence, investigating reasoning methods from the knowledge base perspective helps us better understand the challenges and future directions. To fill this gap, this paper first classifies the knowledge base into symbolic and parametric ones. The former explicitly stores information in human-readable symbols, and the latter implicitly encodes knowledge within parameters. Then, we provide a comprehensive overview of reasoning methods using symbolic knowledge bases, parametric knowledge bases, and both of them. Finally, we identify the future direction toward enhancing reasoning capabilities to bridge the gap between human and machine intelligence.

11th international joint conference, international acm sigir conference, research and development, (14 more...)

arXiv.org Artificial Intelligence

2501.0103

Country:

Europe > Austria > Vienna (0.14)
North America > United States > Texas > Travis County > Austin (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
(42 more...)

Genre:

Overview (1.00)
Research Report > New Finding (0.34)

Industry:

Health & Medicine > Therapeutic Area (0.67)
Education > Assessment & Standards (0.46)
Information Technology > Security & Privacy (0.45)

Technology:

Information Technology > Knowledge Management > Knowledge Engineering (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Optimizing Noise Schedules of Generative Models in High Dimensionss

Aranguri, Santiago, Biroli, Giulio, Mezard, Marc, Vanden-Eijnden, Eric

arXiv.org Artificial IntelligenceJan-1-2025

Recent works have shown that diffusion models can undergo phase transitions, the resolution of which is needed for accurately generating samples. This has motivated the use of different noise schedules, the two most common choices being referred to as variance preserving (VP) and variance exploding (VE). Here we revisit these schedules within the framework of stochastic interpolants. Using the Gaussian Mixture (GM) and Curie-Weiss (CW) data distributions as test case models, we first investigate the effect of the variance of the initial noise distribution and show that VP recovers the low-level feature (the distribution of each mode) but misses the high-level feature (the asymmetry between modes), whereas VE performs oppositely. We also show that this dichotomy, which happens when denoising by a constant amount in each step, can be avoided by using noise schedules specific to VP and VE that allow for the recovery of both high- and low-level features. Finally we show that these schedules yield generative models for the GM and CW model whose probability flow ODE can be discretized using $\Theta_d(1)$ steps in dimension $d$ instead of the $\Theta_d(\sqrt{d})$ steps required by constant denoising.

equation, interpolant, probability flow ode, (14 more...)

arXiv.org Artificial Intelligence

2501.00988

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.05)
North America > United States > New York (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Generation (0.62)

Add feedback

Automatic Text Pronunciation Correlation Generation and Application for Contextual Biasing

Cheng, Gaofeng, Lu, Haitian, Yang, Chengxu, Wang, Xuyang, Li, Ta, Yan, Yonghong

arXiv.org Artificial IntelligenceJan-1-2025

Effectively distinguishing the pronunciation correlations between different written texts is a significant issue in linguistic acoustics. Traditionally, such pronunciation correlations are obtained through manually designed pronunciation lexicons. In this paper, we propose a data-driven method to automatically acquire these pronunciation correlations, called automatic text pronunciation correlation (ATPC). The supervision required for this method is consistent with the supervision needed for training end-to-end automatic speech recognition (E2E-ASR) systems, i.e., speech and corresponding text annotations. First, the iteratively-trained timestamp estimator (ITSE) algorithm is employed to align the speech with their corresponding annotated text symbols. Then, a speech encoder is used to convert the speech into speech embeddings. Finally, we compare the speech embeddings distances of different text symbols to obtain ATPC. Experimental results on Mandarin show that ATPC enhances E2E-ASR performance in contextual biasing and holds promise for dialects or languages lacking artificial pronunciation lexicons.

correlation, pronunciation correlation, pronunciation lexicon, (12 more...)

arXiv.org Artificial Intelligence

2501.00804

Country:

Asia > China (0.05)
South America > Paraguay (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Enhancing Transformers for Generalizable First-Order Logical Entailment

Zheng, Tianshi, Wang, Jiazheng, Wang, Zihao, Bai, Jiaxin, Yin, Hang, Deng, Zheye, Song, Yangqiu, Li, Jianxin

arXiv.org Artificial IntelligenceJan-1-2025

Transformers, as a fundamental deep learning architecture, have demonstrated remarkable capabilities in reasoning. This paper investigates the generalizable first-order logical reasoning ability of transformers with their parameterized knowledge and explores ways to improve it. The first-order reasoning capability of transformers is assessed through their ability to perform first-order logical entailment, which is quantitatively measured by their performance in answering knowledge graph queries. We establish connections between (1) two types of distribution shifts studied in out-of-distribution generalization and (2) the unseen knowledge and query settings discussed in the task of knowledge graph query answering, enabling a characterization of fine-grained generalizability. Results on our comprehensive dataset show that transformers outperform previous methods specifically designed for this task and provide detailed empirical evidence on the impact of input query syntax, token embedding, and transformer architectures on the reasoning capability of transformers. Interestingly, our findings reveal a mismatch between positional encoding and other design choices in transformer architectures employed in prior practices. This discovery motivates us to propose a more sophisticated, logic-aware architecture, TEGA, to enhance the capability for generalizable first-order logical entailment in transformers.

preprint, query, transformer, (13 more...)

arXiv.org Artificial Intelligence

2501.00759

Country:

Asia > China > Beijing > Beijing (0.04)
North America > United States > New York > New York County > New York City (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
(6 more...)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

A Survey on Large Language Model Acceleration based on KV Cache Management

Li, Haoyang, Li, Yiming, Tian, Anxin, Tang, Tianhao, Xu, Zhanchao, Chen, Xuejia, Hu, Nicole, Dong, Wei, Li, Qing, Chen, Lei

arXiv.org Artificial IntelligenceJan-1-2025

Large Language Models (LLMs) have revolutionized a wide range of domains such as natural language processing, computer vision, and multi-modal tasks due to their ability to comprehend context and perform logical reasoning. However, the computational and memory demands of LLMs, particularly during inference, pose significant challenges when scaling them to real-world, long-context, and real-time applications. Key-Value (KV) cache management has emerged as a critical optimization technique for accelerating LLM inference by reducing redundant computations and improving memory utilization. This survey provides a comprehensive overview of KV cache management strategies for LLM acceleration, categorizing them into token-level, model-level, and system-level optimizations. Token-level strategies include KV cache selection, budget allocation, merging, quantization, and low-rank decomposition, while model-level optimizations focus on architectural innovations and attention mechanisms to enhance KV reuse. System-level approaches address memory management, scheduling, and hardware-aware designs to improve efficiency across diverse computing environments. Additionally, the survey provides an overview of both text and multimodal datasets and benchmarks used to evaluate these strategies. By presenting detailed taxonomies and comparative analyses, this work aims to offer useful insights for researchers and practitioners to support the development of efficient and scalable KV cache management techniques, contributing to the practical deployment of LLMs in real-world applications. The curated paper list for KV cache management is in: \href{https://github.com/TreeAI-Lab/Awesome-KV-Cache-Management}{https://github.com/TreeAI-Lab/Awesome-KV-Cache-Management}.

arxiv, inference, zhang, (13 more...)

arXiv.org Artificial Intelligence

2412.19442

Country:

Europe > Austria > Vienna (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Asia > Thailand > Bangkok > Bangkok (0.04)
(22 more...)

Genre:

Overview (1.00)
Research Report > Promising Solution (0.67)

Industry:

Health & Medicine (1.00)
Education (0.67)
Information Technology > Services (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Satori: Towards Proactive AR Assistant with Belief-Desire-Intention User Modeling

Li, Chenyi, Wu, Guande, Chan, Gromit Yeuk-Yin, Turakhia, Dishita G, Quispe, Sonia Castelo, Li, Dong, Welch, Leslie, Silva, Claudio, Qian, Jing

arXiv.org Artificial IntelligenceJan-1-2025

Augmented Reality assistance are increasingly popular for supporting users with tasks like assembly and cooking. However, current practice typically provide reactive responses initialized from user requests, lacking consideration of rich contextual and user-specific information. To address this limitation, we propose a novel AR assistance system, Satori, that models both user states and environmental contexts to deliver proactive guidance. Our system combines the Belief-Desire-Intention (BDI) model with a state-of-the-art multi-modal large language model (LLM) to infer contextually appropriate guidance. The design is informed by two formative studies involving twelve experts. A sixteen within-subject study find that Satori achieves performance comparable to an designer-created Wizard-of-Oz (WoZ) system without relying on manual configurations or heuristics, thereby enhancing generalizability, reusability and opening up new possibilities for AR assistance.

ar assistance, assistance, satori, (16 more...)

arXiv.org Artificial Intelligence

2410.16668

Country:

North America > United States > New York > New York County > New York City (0.28)
North America > United States > New York > Kings County > New York City (0.14)
North America > United States > California > Los Angeles County > Los Angeles (0.14)
(28 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Questionnaire & Opinion Survey (1.00)
(2 more...)

Industry:

Education (0.92)
Information Technology (0.92)
Health & Medicine > Consumer Health (0.46)

Technology:

Information Technology > Human Computer Interaction > Interfaces (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
(5 more...)

Add feedback

Phase-aware Training Schedule Simplifies Learning in Flow-Based Generative Models

Aranguri, Santiago, Insulla, Francesco

arXiv.org Machine LearningJan-1-2025

We analyze the training of a two-layer autoencoder used to parameterize a flow-based generative model for sampling from a high-dimensional Gaussian mixture. Previous work shows that the phase where the relative probability between the modes is learned disappears as the dimension goes to infinity without an appropriate time schedule. We introduce a time dilation that solves this problem. This enables us to characterize the learned velocity field, finding a first phase where the probability of each mode is learned and a second phase where the variance of each mode is learned. We find that the autoencoder representing the velocity field learns to simplify by estimating only the parameters relevant to each phase. Turning to real data, we propose a method that, for a given feature, finds intervals of time where training improves accuracy the most on that feature. Since practitioners take a uniform distribution over training times, our method enables more efficient training. We provide preliminary experiments validating this approach.

diffusion model, probability, velocity field, (15 more...)

arXiv.org Machine Learning

2412.07972

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > Santa Clara County > Stanford (0.04)
(2 more...)

Genre: Research Report (0.51)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Generation (0.61)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.54)

Add feedback