AITopics | Tarragona

Intervening to learn and compose disentangled representations

Markham, Alex, Chang, Jeri A., Hirsch, Isaac, Solus, Liam, Aragam, Bryon

arXiv.org Machine Learning, Jul-8-2025

In designing generative models, it is commonly believed that in order to learn useful latent structure, we face a fundamental tension between expressivity and structure. In this paper we challenge this view by proposing a new approach to training arbitrarily expressive generative models that simultaneously learn disentangled latent structure. This is accomplished by adding a simple decoder-only module to the head of an existing decoder block that can be arbitrarily complex. The module learns to process concept information by implicitly inverting linear representations from an encoder. Inspired by the notion of intervention in causal graphical models, our module selectively modifies its architecture during training, allowing it to learn a compact joint model over different contexts. We show how adding this module leads to disentangled representations that can be composed for out-of-distribution generation. To further validate our proposed approach, we prove a new identifiability result that extends existing work on identifying structured representations in nonlinear models.

intervention, machine learning, natural language, (18 more...)

arXiv.org Machine Learning

2507.04754

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
Asia > Japan > Honshū > Tōhoku > Iwate Prefecture > Morioka (0.04)
North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)
(2 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

SMOTE-DP: Improving Privacy-Utility Tradeoff with Synthetic Data

Zhou, Yan, Malin, Bradley, Kantarcioglu, Murat

arXiv.org Machine Learning, Jun-3-2025

Privacy-preserving data publication, including synthetic data sharing, often experiences trade-offs between privacy and utility. Synthetic data is generally more effective than data anonymization in balancing this trade-off; however, it is not without its own challenges. Synthetic data produced by generative models trained on source data may inadvertently reveal information about outliers. Techniques specifically designed for preserving privacy, such as introducing noise to satisfy differential privacy, often incur unpredictable and significant losses in utility. In this work we show that, with the right mechanism of synthetic data generation, we can achieve strong privacy protection without significant utility loss. Synthetic data generators producing contracting data patterns, such as Synthetic Minority Over-sampling Technique (SMOTE), can enhance a differentially private data generator, leveraging the strengths of both. We prove in theory and through empirical demonstration that this SMOTE-DP technique can produce synthetic data that not only ensures robust privacy protection but also maintains utility in downstream learning tasks.

artificial intelligence, data mining, machine learning, (18 more...)

arXiv.org Machine Learning

2506.01907

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England (0.04)
North America > United States > Virginia > Montgomery County > Blacksburg (0.04)
(5 more...)

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining (0.88)

A Consensus Privacy Metrics Framework for Synthetic Data

Pilgram, Lisa, Dankar, Fida K., Drechsler, Jorg, Elliot, Mark, Domingo-Ferrer, Josep, Francis, Paul, Kantarcioglu, Murat, Kong, Linglong, Malin, Bradley, Muralidhar, Krishnamurty, Myles, Puja, Prasser, Fabian, Raisaro, Jean Louis, Yan, Chao, Emam, Khaled El

arXiv.org Artificial Intelligence, Mar-6-2025

Synthetic data generation is one approach for sharing individual-level data. However, to meet legislative requirements, it is necessary to demonstrate that the individuals' privacy is adequately protected. There is no consolidated standard for measuring privacy in synthetic data. Through an expert panel and consensus process, we developed a framework for evaluating privacy in synthetic data. Our findings indicate that current similarity metrics fail to measure identity disclosure, and their use is discouraged. For differentially private synthetic data, a privacy budget other than close to zero was not considered interpretable. There was consensus on the importance of membership and attribute disclosure, both of which involve inferring personal information about an individual without necessarily revealing their identity. The resultant framework provides precise recommendations for metrics that address these types of disclosures effectively. Our findings further present specific opportunities for future research that can help with widespread adoption of synthetic data.

data anonymity vulnerability measure, differentially private synthetic data, relative attribute disclosure metric, (16 more...)

arXiv.org Artificial Intelligence

2503.0498

Country:

North America > Canada > Alberta (0.14)
Europe > Netherlands (0.14)
Europe > Germany > Berlin (0.14)
(30 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
(10 more...)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Modeling & Simulation (1.00)
Information Technology > Information Management (1.00)
(7 more...)

Advanced ingestion process powered by LLM parsing for RAG system

Perez, Arnau, Vizcaino, Xavier

arXiv.org Artificial Intelligence, Dec-16-2024

This paper introduces a novel multi-strategy parsing approach using LLM-powered OCR to extract content from diverse document types, including presentations and high-text-density files, whether scanned or not. The methodology employs a node-based extraction technique that creates relationships between different information types and generates context-aware metadata. By implementing a Multimodal Assembler Agent and a flexible embedding strategy, the system enhances document comprehension and retrieval capabilities. Experimental evaluations across multiple knowledge bases demonstrate the approach's effectiveness, showing improvements in answer relevancy and information faithfulness.

advanced ingestion process, large language model, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2412.15262

Country: Europe > Spain > Catalonia > Tarragona Province > Tarragona (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Smart ETL and LLM-based contents classification: the European Smart Tourism Tools Observatory experience

Cosme, Diogo, Galvão, António, Abreu, Fernando Brito e

arXiv.org Artificial Intelligence, Oct-24-2024

Purpose: Our research project focuses on improving the content update of the online European Smart Tourism Tools (STTs) Observatory by incorporating and categorizing STTs. The categorization is based on their taxonomy, and it facilitates the end user's search process. The use of a Smart ETL (Extract, Transform, and Load) process, where "Smart" indicates the use of Artificial Intelligence (AI), is central to this endeavor. Methods: The contents describing STTs are derived from PDF catalogs, where PDF-scraping techniques extract QR codes, images, links, and text information. Duplicate STTs between the catalogs are removed, and the remaining ones are classified based on their text information using Large Language Models (LLMs). Finally, the data is transformed to comply with the Dublin Core metadata structure (the observatory's metadata structure), chosen for its wide acceptance and flexibility. Results: The Smart ETL process to import STTs to the observatory combines PDF-scraping techniques with LLMs for text content-based classification. Our preliminary results have demonstrated the potential of LLMs for text content-based classification. Conclusion: The proposed approach's feasibility is a step towards efficient content-based classification, not only in Smart Tourism but also adaptable to other fields. Future work will mainly focus on refining this classification process.

catalog, large language model, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2410.18641

Country:

Europe > Portugal > Lisbon > Lisbon (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
(5 more...)

Genre: Research Report > New Finding (0.34)

Industry: Consumer Products & Services > Travel (1.00)

Technology:

Information Technology > Data Science > Data Integration (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

FGR-Net: Interpretable fundus image gradeability classification based on deep reconstruction learning

Khalid, Saif, Rashwan, Hatem A., Abdulwahab, Saddam, Abdel-Nasser, Mohamed, Quiroga, Facundo Manuel, Puig, Domenec

arXiv.org Artificial Intelligence, Sep-16-2024

The performance of Computer-Aided Diagnosis (CAD) systems for retinal diseases depends on the quality of the retinal images being screened. Thus, many studies have been developed to evaluate and assess the quality of such retinal images. However, most of them did not investigate the relationship between the accuracy of the developed models and the quality of the visualization of interpretability methods for distinguishing between gradable and non-gradable retinal images. Consequently, this paper presents a novel framework called FGR-Net to automatically assess and interpret underlying fundus image quality by merging an autoencoder network with a classifier network. The FGR-Net model also provides an interpretable quality assessment through visualizations. In particular, FGR-Net uses a deep autoencoder to reconstruct the input image in order to extract the visual characteristics of the input fundus images based on self-supervised learning. The features extracted by the autoencoder are then fed into a deep classifier network to distinguish between gradable and ungradable fundus images. FGR-Net is evaluated with different interpretability methods, which indicates that the autoencoder is a key factor in forcing the classifier to focus on the relevant structures of the fundus images, such as the fovea, optic disk, and prominent blood vessels. Additionally, the interpretability methods can provide visual feedback for ophthalmologists to understand how our model evaluates the quality of fundus images. The experimental results showed the superiority of FGR-Net over the state-of-the-art quality assessment methods, with an accuracy of 89% and an F1-score of 87%.

dataset, fundus image, loss function, (14 more...)

arXiv.org Artificial Intelligence

2409.10246

Country:

Europe > Spain > Catalonia > Tarragona Province > Tarragona (0.04)
Africa > Middle East > Egypt > Aswan Governorate > Aswan (0.04)
South America > Argentina > Pampas > Buenos Aires F.D. > Buenos Aires (0.04)
(3 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area > Ophthalmology/Optometry (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

ASR Error Correction using Large Language Models

Ma, Rao, Qian, Mengjie, Gales, Mark, Knill, Kate

arXiv.org Artificial Intelligence, Sep-14-2024

Error correction (EC) models play a crucial role in refining Automatic Speech Recognition (ASR) transcriptions, enhancing the readability and quality of transcriptions. Without requiring access to the underlying code or model weights, EC can improve performance and provide domain adaptation for black-box ASR systems. This work investigates the use of large language models (LLMs) for error correction across diverse scenarios. 1-best ASR hypotheses are commonly used as the input to EC models. We propose building high-performance EC models using ASR N-best lists, which should provide more contextual information for the correction process. Additionally, the generation process of a standard EC model is unrestricted in the sense that any output sequence can be generated. For some scenarios, such as unseen domains, this flexibility may impact performance. To address this, we introduce a constrained decoding approach based on the N-best list or an ASR lattice. Finally, most EC models are trained for a specific ASR system, requiring retraining whenever the underlying ASR system is changed. This paper explores the ability of EC models to operate on the output of different ASR systems. This concept is further extended to zero-shot error correction using LLMs, such as ChatGPT. Experiments on three standard datasets demonstrate the efficacy of our proposed methods for both Transducer and attention-based encoder-decoder ASR systems. In addition, the proposed method can serve as an effective method for model ensembling.

correction, hypothesis, n-best list, (15 more...)

arXiv.org Artificial Intelligence

2409.09554

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Spain > Catalonia > Tarragona Province > Tarragona (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
Europe > Germany > Saxony > Leipzig (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Improving the quality of Persian clinical text with a novel spelling correction system

Dashti, Seyed Mohammad Sadegh, Dashti, Seyedeh Fatemeh

arXiv.org Artificial Intelligence, Aug-7-2024

Background: The accuracy of spelling in Electronic Health Records (EHRs) is a critical factor for efficient clinical care, research, and ensuring patient safety. The Persian language, with its abundant vocabulary and complex characteristics, poses unique challenges for real-word error correction. This research aimed to develop an innovative approach for detecting and correcting spelling errors in Persian clinical text. Methods: Our strategy employs a state-of-the-art pre-trained model that has been meticulously fine-tuned specifically for the task of spelling correction in the Persian clinical domain. This model is complemented by an innovative orthographic similarity matching algorithm, PERTO, which uses visual similarity of characters for ranking correction candidates. Results: The evaluation of our approach demonstrated its robustness and precision in detecting and rectifying word errors in Persian clinical text. In terms of non-word error correction, our model achieved an F1-Score of 90.0% when the PERTO algorithm was employed. For real-word error detection, our model demonstrated its highest performance, achieving an F1-Score of 90.6%. Furthermore, the model reached its highest F1-Score of 91.5% for real-word error correction when the PERTO algorithm was employed. Conclusions: Despite certain limitations, our method represents a substantial advancement in the field of spelling error detection and correction for Persian clinical text. By effectively addressing the unique challenges posed by the Persian language, our approach paves the way for more accurate and efficient clinical documentation, contributing to improved patient care and safety. Future research could explore its use in other areas of the Persian medical domain, enhancing its impact and utility.

correction, persian clinical text, real-word error, (14 more...)

arXiv.org Artificial Intelligence

doi: 10.1186/s12911-024-02613-0

2408.03622

Country:

Asia > Middle East > Iran > Tehran Province > Tehran (0.04)
Asia > Middle East > Iran > Bushehr Province > Bushehr (0.04)
North America > United States > New Jersey (0.04)
(11 more...)

Genre:

Research Report > Promising Solution (1.00)
Research Report > New Finding (1.00)
Overview (1.00)

Industry:

Health & Medicine > Health Care Technology > Medical Record (1.00)
Health & Medicine > Diagnostic Medicine (1.00)
Health & Medicine > Therapeutic Area (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

A Sensitivity Analysis of Cellular Automata and Heterogeneous Topology Networks: Partially-Local Cellular Automata and Homogeneous Homogeneous Random Boolean Networks

Glover, Tom Eivind, Jahren, Ruben, Martinuzzi, Francesco, Lind, Pedro Gonçalves, Nichele, Stefano

arXiv.org Artificial Intelligence, Jul-25-2024

Elementary Cellular Automata (ECA) are a well-studied computational universe that is, despite its simple configurations, capable of impressive computational variety. Harvesting this computation in a useful way has historically proven difficult, but when combined with reservoir computing (RC), it becomes much more feasible. Furthermore, RC and ECA enable energy-efficient AI, making the combination a promising concept for Edge AI. In this work, we contrast ECA to substrates of Partially-Local CA (PLCA) and Homogeneous Homogeneous Random Boolean Networks (HHRBN). They are, in comparison, the topologically heterogeneous counterparts of ECA. This represents a step from ECA towards more biologically plausible substrates. We analyse these substrates by testing on an RC benchmark (5-bit memory), using Temporal Derrida plots to estimate the sensitivity and assess the defect collapse rate. We find that, counterintuitively, disordered topology does not necessarily mean disordered computation. There are countering computational "forces" of topology imperfections leading to a higher collapse rate (order) and yet, if accounted for, an increased sensitivity to the initial condition. These observations together suggest a shrinking critical range.

benchmark, cellular automata, substrate, (13 more...)

arXiv.org Artificial Intelligence

2407.18017

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > Illinois > Champaign County > Champaign (0.14)
Europe > Germany > Saxony > Leipzig (0.04)
(7 more...)

Genre: Research Report (1.00)

Industry:

Energy (0.92)
Health & Medicine > Therapeutic Area > Neurology (0.46)

Technology:

Information Technology > Artificial Intelligence > Systems & Languages > Problem-Independent Architectures (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
(2 more...)

Error Correction by Paying Attention to Both Acoustic and Confidence References for Automatic Speech Recognition

Shu, Yuchun, Hu, Bo, He, Yifeng, Shi, Hao, Wang, Longbiao, Dang, Jianwu

arXiv.org Artificial Intelligence, Jun-29-2024

Accurately locating the wrong words in an automatic speech recognition (ASR) hypothesis and recovering them reliably is the goal of speech error correction. In this paper, we propose a non-autoregressive speech error correction method. A Confidence Module measures the uncertainty of each word of the N-best ASR hypotheses as a reference for locating wrong word positions. In addition, acoustic features from the ASR encoder are used to provide correct pronunciation references. N-best candidates from ASR are aligned using the edit path, so that they confirm each other and recover some missing character errors. Furthermore, a cross-attention mechanism fuses the information between the error correction references and the ASR hypothesis. The experimental results show that both the acoustic and confidence references help with error correction. The proposed system reduces the error rate by 21% compared with the ASR model.

correction, error correction, information, (13 more...)

arXiv.org Artificial Intelligence

2407.12817

Country:

Asia > China > Tianjin Province > Tianjin (0.05)
Europe > Spain > Catalonia > Tarragona Province > Tarragona (0.04)
Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Data Science > Data Quality > Data Cleaning (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
