AITopics | Marina

Exact Rate-Distortion in Autoencoders via Echo Noise

Rob Brekelmans, Daniel Moyer, Aram Galstyan, Greg Ver Steeg

Neural Information Processing Systems | Oct-3-2025, 07:26:55 GMT

Neural Information Processing Systems http://nips.cc/

information, mutual information, noise, (15 more...)

Country:

North America > United States > California > Monterey County > Marina (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Fast structure learning with modular regularization

Greg Ver Steeg, Hrayr Harutyunyan, Daniel Moyer, Aram Galstyan

Neural Information Processing Systems | Aug-20-2025, 06:41:18 GMT

We also use our approach for estimating covariance structure for a number of real-world datasets and show that it consistently outperforms state-of-the-art estimators at a fraction of the computational cost.

factor model, latent factor, latent factor model, (14 more...)

Country:

North America > United States > California > Monterey County > Marina (0.05)
North America > Canada (0.04)

Industry:

Banking & Finance > Trading (0.94)
Energy (0.69)
Health & Medicine > Pharmaceuticals & Biotechnology (0.68)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.64)

FedPaI: Achieving Extreme Sparsity in Federated Learning via Pruning at Initialization

Wang, Haonan, Liu, Zeli, Hoshino, Kajimusugura, Zhang, Tuo, Walters, John Paul, Crago, Stephen

arXiv.org Artificial Intelligence | Mar-31-2025

Federated Learning (FL) enables distributed training on edge devices but faces significant challenges due to resource constraints in edge environments, impacting both communication and computational efficiency. Existing iterative pruning techniques improve communication efficiency but are limited by their centralized design, which struggles with FL's decentralized and data-imbalanced nature, resulting in suboptimal sparsity levels. To address these issues, we propose FedPaI, a novel efficient FL framework that leverages Pruning at Initialization (PaI) to achieve extreme sparsity. FedPaI identifies optimal sparse connections at an early stage, maximizing model capacity and significantly reducing communication and computation overhead by fixing sparsity patterns at the start of training. To adapt to diverse hardware and software environments, FedPaI supports both structured and unstructured pruning. Additionally, we introduce personalized client-side pruning mechanisms for improved learning capacity and sparsity-aware server-side aggregation for enhanced efficiency. Experimental results demonstrate that FedPaI consistently outperforms existing efficient FL methods that apply conventional iterative pruning, with significant gains in efficiency and model accuracy. For the first time, our proposed FedPaI achieves an extreme sparsity level of up to 98% without compromising the model accuracy compared to unpruned baselines, even under challenging non-IID settings. By employing our FedPaI with joint optimization of model learning capacity and sparsity, FL applications can benefit from faster convergence and accelerate the training by 6.4× to 7.9×. Federated Learning (FL) [1], [2] has emerged as a promising approach for decentralized machine learning on edge devices, which are rapidly growing in number and capability.

artificial intelligence, machine learning, pruning, (16 more...)

arXiv.org Artificial Intelligence

2504.00308

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.29)
North America > United States > California > Monterey County > Marina (0.04)
North America > Canada > Ontario > Toronto (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Education (0.68)
Information Technology > Security & Privacy (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Discovering Structure in High-Dimensional Data Through Correlation Explanation

Greg Ver Steeg, Aram Galstyan

Neural Information Processing Systems | Feb-9-2025, 03:00:04 GMT

We introduce a method to learn a hierarchy of successively more abstract representations of complex data based on optimizing an information-theoretic objective. Intuitively, the optimization searches for a set of latent factors that best explain the correlations in the data as measured by multivariate mutual information. The method is unsupervised, requires no model assumptions, and scales linearly with the number of variables, which makes it an attractive approach for very high-dimensional systems. We demonstrate that Correlation Explanation (CorEx) automatically discovers meaningful structure for data from diverse sources including personality tests, DNA, and human language.

artificial intelligence, machine learning, representation, (18 more...)

Country:

Africa (0.06)
Oceania (0.04)
North America > United States > California > Monterey County > Marina (0.04)
(5 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

Fear and Loathing on the Frontline: Decoding the Language of Othering by Russia-Ukraine War Bloggers

Gerard, Patrick, Theisen, William, Weninger, Tim, Lerman, Kristina

arXiv.org Artificial Intelligence | Sep-19-2024

Othering, the act of portraying outgroups as fundamentally different from the ingroup, often escalates into framing them as existential threats--fueling intergroup conflict and justifying exclusion and violence. These dynamics are alarmingly pervasive, spanning from the extreme historical examples of genocides against minorities in Germany and Rwanda to the ongoing violence and rhetoric targeting migrants in the US and Europe. While concepts like hate speech and fear speech have been explored in existing literature, they capture only part of this broader and more nuanced dynamic which can often be harder to detect, particularly in online speech and propaganda. To address this challenge, we introduce a novel computational framework that leverages large language models (LLMs) to quantify othering across diverse contexts, extending beyond traditional linguistic indicators of hostility. Applying the model to real-world data from Telegram war bloggers and political discussions on Gab reveals how othering escalates during conflicts, interacts with moral language, and garners significant attention, particularly during periods of crisis. Our framework, designed to offer deeper insights into othering dynamics, combines with a rapid adaptation process to provide essential tools for mitigating othering's adverse impacts on social cohesion.

blogger, threat, war blogger, (16 more...)

arXiv.org Artificial Intelligence

2409.13064

Country:

Asia > Russia (0.66)
Europe > Russia (0.42)
Europe > Germany (0.24)
(15 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Media (1.00)
Law Enforcement & Public Safety (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
(2 more...)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

SelECT-SQL: Self-correcting ensemble Chain-of-Thought for Text-to-SQL

Shen, Ke, Kejriwal, Mayank

arXiv.org Artificial Intelligence | Sep-16-2024

Natural language interfaces to databases allow non-SQL experts to query relational databases more conveniently. Text-to-SQL, which automatically maps natural language questions to SQL queries [1, 2], has therefore emerged as an important problem, especially due to generative AI. Early Text-to-SQL systems were domain-specific with limited user interaction, often relying on rule-based approaches to parse input questions [3, 4, 5, 6]. Recent advancements have shifted towards greater domain independence by introducing supervised models trained on various cross-domain datasets [7, 8], and transformer-based models fine-tuned with built-in modules and constraints [9, 10, 11, 12]. Unlike retrieval-augmented generation (RAG) [13], which uses transformer-based language models fine-tuned on external knowledge, Text-to-SQL reduces potential hallucinations in domain-specific or knowledge-intensive tasks because the answer comes from querying the database rather than being generated directly by a model. Recent developments in Text-to-SQL use large language models (LLMs) with zero-shot [14, 15] and few-shot prompting [16, 17], demonstrating that LLMs can serve as strong baselines with minimal demonstration of questions and schemas and no fine-tuning.

query, select-sql, self-correcting ensemble chain-of-thought, (14 more...)

arXiv.org Artificial Intelligence

2409.10007

Country:

North America > United States > California > Monterey County > Marina (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Modeling Information Narrative Detection and Evolution on Telegram during the Russia-Ukraine War

Gerard, Patrick, Volkova, Svitlana, Penafiel, Louis, Lerman, Kristina, Weninger, Tim

arXiv.org Artificial Intelligence | Sep-11-2024

Following the Russian Federation's full-scale invasion of Ukraine in February 2022, a multitude of information narratives emerged within both pro-Russian and pro-Ukrainian communities online. As the conflict progresses, so too do the information narratives, constantly adapting and influencing local and global community perceptions and attitudes. This dynamic nature of the evolving information environment (IE) underscores a critical need to fully discern how narratives evolve and affect online communities. Existing research, however, often fails to capture information narrative evolution, overlooking both the fluid nature of narratives and the internal mechanisms that drive their evolution. Recognizing this, we introduce a novel approach designed to both model narrative evolution and uncover the underlying mechanisms driving them. In this work we perform a comparative discourse analysis across communities on Telegram covering the initial three months following the invasion. First, we uncover substantial disparities in narratives and perceptions between pro-Russian and pro-Ukrainian communities. Then, we probe deeper into prevalent narratives of each group, identifying key themes and examining the underlying mechanisms fueling their evolution. Finally, we explore influences and factors that may shape the development and spread of narratives.

evolution, narrative, story cluster, (16 more...)

arXiv.org Artificial Intelligence

2409.07684

Country:

Asia > Russia (1.00)
Europe > Russia (0.73)
Europe > Ukraine > Kyiv Oblast > Kyiv (0.14)
(7 more...)

Genre:

Research Report > New Finding (0.46)
Research Report > Promising Solution (0.34)
Overview > Innovation (0.34)

Industry:

Media > News (1.00)
Government > Military (1.00)
Government > Regional Government > Europe Government > Russia Government (0.46)
Government > Regional Government > Asia Government > Russia Government (0.46)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.93)

Integrating Pre-Trained Language Model with Physical Layer Communications

Lee, Ju-Hyung, Lee, Dong-Ho, Lee, Joohan, Pujara, Jay

arXiv.org Artificial Intelligence | Jun-28-2024

The burgeoning field of on-device AI communication, where devices exchange information directly through embedded foundation models, such as language models (LMs), requires robust, efficient, and generalizable communication frameworks. However, integrating these frameworks with existing wireless systems and effectively managing noise and bit errors pose significant challenges. In this work, we introduce a practical on-device AI communication framework, integrated with physical layer (PHY) communication functions, demonstrated through its performance on a link-level simulator. Our framework incorporates end-to-end training with channel noise to enhance resilience, incorporates vector quantized variational autoencoders (VQ-VAE) for efficient and robust communication, and utilizes pre-trained encoder-decoder transformers for improved generalization capabilities. Simulations, across various communication scenarios, reveal that our framework achieves a 50% reduction in transmission size while demonstrating substantial generalization ability and noise robustness under standardized 3GPP channel models.

communication, communication system, transmission, (16 more...)

arXiv.org Artificial Intelligence

2402.11656

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > Pennsylvania (0.04)
(4 more...)

Genre: Research Report (1.00)

Industry: Information Technology (0.47)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Deriva-ML: A Continuous FAIRness Approach to Reproducible Machine Learning Models

Li, Zhiwei, Kesselman, Carl, D'Arch, Mike, Pazzani, Michael, Xu, Benjamin Yizing

arXiv.org Artificial Intelligence | Jun-27-2024

Increasingly, artificial intelligence (AI) and machine learning (ML) are used in eScience applications [9]. While these approaches have great potential, the literature has shown that ML-based approaches frequently suffer from results that are either incorrect or unreproducible due to mismanagement or misuse of data used for training and validating the models [12, 15]. Recognition of the necessity of high-quality data for correct ML results has led to data-centric ML approaches that shift the central focus from model development to creation of high-quality data sets to train and validate the models [14, 20]. However, there are limited tools and methods available for data-centric approaches to explore and evaluate ML solutions for eScience problems, which often require collaborative multidisciplinary teams working with models and data that will rapidly evolve as an investigation unfolds [1]. In this paper, we show how data management tools based on the principle that all of the data for ML should be findable, accessible, interoperable and reusable (i.e., FAIR [26]) can significantly improve the quality of data that is used for ML applications. When combined with best practices that apply these tools to the entire life cycle of an ML-based eScience investigation, we can significantly improve the ability of an eScience team to create correct and reproducible ML solutions. We propose an architecture and implementation of such tools and demonstrate through two use cases how they can be used to improve ML-based eScience investigations.

catalog, dataset, workflow, (17 more...)

arXiv.org Artificial Intelligence

2407.01608

Country:

North America > United States > California > Monterey County > Marina (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > Los Angeles County > Los Angeles (0.04)

Genre:

Workflow (0.97)
Research Report (0.82)

Industry: Health & Medicine > Therapeutic Area > Ophthalmology/Optometry (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Harmful Speech Detection by Language Models Exhibits Gender-Queer Dialect Bias

Dorn, Rebecca, Kezar, Lee, Morstatter, Fred, Lerman, Kristina

arXiv.org Artificial Intelligence | Jun-21-2024

Content moderation on social media platforms shapes the dynamics of online discourse, influencing whose voices are amplified and whose are suppressed. Recent studies have raised concerns about the fairness of content moderation practices, particularly for aggressively flagging posts from transgender and non-binary individuals as toxic. In this study, we investigate the presence of bias in harmful speech classification of gender-queer dialect online, focusing specifically on the treatment of reclaimed slurs. We introduce a novel dataset, QueerReclaimLex, based on 109 curated templates exemplifying non-derogatory uses of LGBTQ+ slurs. Dataset instances are scored by gender-queer annotators for potential harm depending on additional context about speaker identity. We systematically evaluate the performance of five off-the-shelf language models in assessing the harm of these texts and explore the effectiveness of chain-of-thought prompting to teach large language models (LLMs) to leverage author identity context. We reveal a tendency for these models to inaccurately flag texts authored by gender-queer individuals as harmful. Strikingly, across all LLMs the performance is poorest for texts that show signs of being written by individuals targeted by the featured slur (F1 <= 0.24). We highlight an urgent need for fairness and inclusivity in content moderation systems. By uncovering these biases, this work aims to inform the development of more equitable content moderation practices and contribute to the creation of inclusive online spaces for all users.

annotator, language model, slur, (16 more...)

arXiv.org Artificial Intelligence

2406.0002

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > Washington > King County > Seattle (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
(8 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Law (1.00)
Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
