AITopics | Kompella, Ramana Rao

Collaborating Authors

Kompella, Ramana Rao

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

ProDiF: Protecting Domain-Invariant Features to Secure Pre-Trained Models Against Extraction

Zhou, Tong, Duan, Shijin, Liu, Gaowen, Fleming, Charles, Kompella, Ramana Rao, Ren, Shaolei, Xu, Xiaolin

arXiv.org Artificial IntelligenceMar-17-2025

Pre-trained models are valuable intellectual property, capturing both domainspecific and domain-invariant features within their weight spaces. However, model extraction attacks threaten these assets by enabling unauthorized source-domain inference and facilitating cross-domain transfer via the exploitation of domaininvariant features. In this work, we introduce ProDiF, a novel framework that leverages targeted weight space manipulation to secure pre-trained models against extraction attacks. ProDiF quantifies the transferability of filters and perturbs the weights of critical filters in unsecured memory, while preserving actual critical weights in a Trusted Execution Environment (TEE) for authorized users. A bilevel optimization further ensures resilience against adaptive fine-tuning attacks. Experimental results show that ProDiF reduces source-domain accuracy to nearrandom levels and decreases cross-domain transferability by 74.65%, providing robust protection for pre-trained models. This work offers comprehensive protection for pre-trained DNN models and highlights the potential of weight space manipulation as a novel approach to model security. Pre-trained deep neural networks (DNNs) excel across diverse tasks and encapsulate rich information in their weight spaces, making them valuable intellectual property (IP) Xue et al. (2021). However, this richness also makes them vulnerable to model extraction attacks Sun et al. (2021); Dubey et al. (2022), enabling attackers to perform source-domain inference using the extracted models. To counter this, prior works use Trusted Execution Environments (TEEs) to protect critical model weights, ensuring attackers obtain incomplete parameters, degrading source-domain performance Chakraborty et al. (2020); Zhou et al. (2023).

artificial intelligence, deep learning, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2503.13224

Genre: Research Report > New Finding (0.48)

Industry: Information Technology > Security & Privacy (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Towards Vector Optimization on Low-Dimensional Vector Symbolic Architecture

Duan, Shijin, Liu, Yejia, Liu, Gaowen, Kompella, Ramana Rao, Ren, Shaolei, Xu, Xiaolin

arXiv.org Artificial IntelligenceMar-15-2025

Vector Symbolic Architecture (VSA) is emerging in machine learning due to its efficiency, but they are hindered by issues of hyperdimensionality and accuracy. As a promising mitigation, the Low-Dimensional Computing (LDC) method significantly reduces the vector dimension by ~100 times while maintaining accuracy, by employing a gradient-based optimization. Despite its potential, LDC optimization for VSA is still underexplored. Our investigation into vector updates underscores the importance of stable, adaptive dynamics in LDC training. We also reveal the overlooked yet critical roles of batch normalization (BN) and knowledge distillation (KD) in standard approaches. Besides the accuracy boost, BN does not add computational overhead during inference, and KD significantly enhances inference confidence. Through extensive experiments and ablation studies across multiple benchmarks, we provide a thorough evaluation of our approach and extend the interpretability of binary neural network optimization similar to LDC, previously unaddressed in BNN literature.

artificial intelligence, machine learning, vsa model, (18 more...)

arXiv.org Artificial Intelligence

2502.14075

Country:

Europe (0.28)
North America > United States > California (0.14)

Genre: Research Report > New Finding (0.68)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Vision (0.93)

Add feedback

Reversing the Forget-Retain Objectives: An Efficient LLM Unlearning Framework from Logit Difference

Ji, Jiabao, Liu, Yujian, Zhang, Yang, Liu, Gaowen, Kompella, Ramana Rao, Liu, Sijia, Chang, Shiyu

arXiv.org Artificial IntelligenceJun-12-2024

A conventional LLM unlearning task typically involves two goals: (1) The target LLM should forget the knowledge in the specified forget documents, and (2) it should retain the other knowledge that the LLM possesses, for which we assume access to a small number of retain documents. To achieve both goals, a mainstream class of LLM unlearning methods introduces an optimization framework with a combination of two objectives - maximizing the prediction loss on the forget documents while minimizing that on the retain documents, which suffers from two challenges, degenerated output and catastrophic forgetting. In this paper, we propose a novel unlearning framework called Unlearning from Logit Difference (ULD), which introduces an assistant LLM that aims to achieve the opposite of the unlearning goals: remembering the forget documents and forgetting the retain knowledge. ULD then derives the unlearned LLM by computing the logit difference between the target and the assistant LLMs. We show that such reversed objectives would naturally resolve both aforementioned challenges while significantly improving the training efficiency. Extensive experiments demonstrate that our method efficiently achieves the intended forgetting while preserving the LLM's overall capabilities, reducing training time by more than threefold. Notably, our method loses 0% of model utility on the ToFU benchmark, whereas baseline methods may sacrifice 17% of utility on average to achieve comparable forget quality.

artificial intelligence, large language model, natural language, (18 more...)

arXiv.org Artificial Intelligence

2406.08607

Country:

North America > United States (1.00)
Europe > United Kingdom > England (0.29)

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (0.46)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Not All Federated Learning Algorithms Are Created Equal: A Performance Evaluation Study

Baumgart, Gustav A., Shin, Jaemin, Payani, Ali, Lee, Myungjin, Kompella, Ramana Rao

arXiv.org Artificial IntelligenceMar-25-2024

Federated Learning (FL) emerged as a practical approach to training a model from decentralized data. The proliferation of FL led to the development of numerous FL algorithms and mechanisms. Many prior efforts have given their primary focus on accuracy of those approaches, but there exists little understanding of other aspects such as computational overheads, performance and training stability, etc. To bridge this gap, we conduct extensive performance evaluation on several canonical FL algorithms (FedAvg, FedProx, FedYogi, FedAdam, SCAFFOLD, and FedDyn) by leveraging an open-source federated learning framework called Flame. Our comprehensive measurement study reveals that no single algorithm works best across different performance metrics. A few key observations are: (1) While some state-of-the-art algorithms achieve higher accuracy than others, they incur either higher computation overheads (FedDyn) or communication overheads (SCAFFOLD). (2) Recent algorithms present smaller standard deviation in accuracy across clients than FedAvg, indicating that the advanced algorithms' performances are stable. (3) However, algorithms such as FedDyn and SCAFFOLD are more prone to catastrophic failures without the support of additional techniques such as gradient clipping. We hope that our empirical study can help the community to build best practices in evaluating FL algorithms.

algorithm, artificial intelligence, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2403.17287

Country: North America > United States (0.46)

Genre: Research Report (0.50)

Industry: Information Technology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.53)

Add feedback

Prometheus: Infrastructure Security Posture Analysis with AI-generated Attack Graphs

Jin, Xin, Katsis, Charalampos, Sang, Fan, Sun, Jiahao, Bertino, Elisa, Kompella, Ramana Rao, Kundu, Ashish

arXiv.org Artificial IntelligenceDec-20-2023

The rampant occurrence of cybersecurity breaches imposes substantial limitations on the progress of network infrastructures, leading to compromised data, financial losses, potential harm to individuals, and disruptions in essential services. The current security landscape demands the urgent development of a holistic security assessment solution that encompasses vulnerability analysis and investigates the potential exploitation of these vulnerabilities as attack paths. In this paper, we propose Prometheus, an advanced system designed to provide a detailed analysis of the security posture of computing infrastructures. Using user-provided information, such as device details and software versions, Prometheus performs a comprehensive security assessment. This assessment includes identifying associated vulnerabilities and constructing potential attack graphs that adversaries can exploit. Furthermore, Prometheus evaluates the exploitability of these attack paths and quantifies the overall security posture through a scoring mechanism. The system takes a holistic approach by analyzing security layers encompassing hardware, system, network, and cryptography. Furthermore, Prometheus delves into the interconnections between these layers, exploring how vulnerabilities in one layer can be leveraged to exploit vulnerabilities in others. In this paper, we present the end-to-end pipeline implemented in Prometheus, showcasing the systematic approach adopted for conducting this thorough security analysis.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2312.13119

Country: North America > United States > New Mexico (0.14)

Genre:

Overview (1.00)
Research Report (0.82)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military > Cyberwarfare (0.86)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(3 more...)

Add feedback

From Trojan Horses to Castle Walls: Unveiling Bilateral Backdoor Effects in Diffusion Models

Pan, Zhuoshi, Yao, Yuguang, Liu, Gaowen, Shen, Bingquan, Zhao, H. Vicky, Kompella, Ramana Rao, Liu, Sijia

arXiv.org Artificial IntelligenceNov-4-2023

While state-of-the-art diffusion models (DMs) excel in image generation, concerns regarding their security persist. Earlier research highlighted DMs' vulnerability to backdoor attacks, but these studies placed stricter requirements than conventional methods like 'BadNets' in image classification. This is because the former necessitates modifications to the diffusion sampling and training procedures. Unlike the prior work, we investigate whether generating backdoor attacks in DMs can be as simple as BadNets, i.e., by only contaminating the training dataset without tampering the original diffusion process. In this more realistic backdoor setting, we uncover bilateral backdoor effects that not only serve an adversarial purpose (compromising the functionality of DMs) but also offer a defensive advantage (which can be leveraged for backdoor defense). Specifically, we find that a BadNets-like backdoor attack remains effective in DMs for producing incorrect images (misaligned with the intended text conditions), and thereby yielding incorrect predictions when DMs are used as classifiers. Meanwhile, backdoored DMs exhibit an increased ratio of backdoor triggers, a phenomenon we refer to as `trigger amplification', among the generated images. We show that this latter insight can be used to enhance the detection of backdoor-poisoned training data. Even under a low backdoor poisoning ratio, studying the backdoor effects of DMs is also valuable for designing anti-backdoor image classifiers. Last but not least, we establish a meaningful linkage between backdoor attacks and the phenomenon of data replications by exploring DMs' inherent data memorization tendencies. The codes of our work are available at https://github.com/OPTML-Group/BiBadDiff.

artificial intelligence, backdoor attack, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2311.02373

Country: Asia (0.14)

Genre: Research Report (0.82)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Graph Mixture of Experts: Learning on Large-Scale Graphs with Explicit Diversity Modeling

Wang, Haotao, Jiang, Ziyu, You, Yuning, Han, Yan, Liu, Gaowen, Srinivasa, Jayanth, Kompella, Ramana Rao, Wang, Zhangyang

arXiv.org Artificial IntelligenceOct-17-2023

Graph neural networks (GNNs) have found extensive applications in learning from graph data. However, real-world graphs often possess diverse structures and comprise nodes and edges of varying types. To bolster the generalization capacity of GNNs, it has become customary to augment training graph structures through techniques like graph augmentations and large-scale pre-training on a wider array of graphs. Balancing this diversity while avoiding increased computational costs and the notorious trainability issues of GNNs is crucial. This study introduces the concept of Mixture-of-Experts (MoE) to GNNs, with the aim of augmenting their capacity to adapt to a diverse range of training graph structures, without incurring explosive computational overhead. The proposed Graph Mixture of Experts (GMoE) model empowers individual nodes in the graph to dynamically and adaptively select more general information aggregation experts. These experts are trained to capture distinct subgroups of graph structures and to incorporate information with varying hop sizes, where those with larger hop sizes specialize in gathering information over longer distances. The effectiveness of GMoE is validated through a series of experiments on a diverse set of tasks, including graph, node, and link prediction, using the OGB benchmark. Notably, it enhances ROC-AUC by $1.81\%$ in ogbg-molhiv and by $1.40\%$ in ogbg-molbbbp, when compared to the non-MoE baselines. Our code is publicly available at https://github.com/VITA-Group/Graph-Mixture-of-Experts.

artificial intelligence, arxiv preprint arxiv, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2304.02806

Country: North America > United States > Texas > Travis County > Austin (0.14)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.90)

Add feedback

Riemannian Multiclass Logistics Regression for SPD Neural Networks

Chen, Ziheng, Song, Yue, Liu, Gaowen, Kompella, Ramana Rao, Wu, Xiaojun, Sebe, Nicu

arXiv.org Artificial IntelligenceMay-18-2023

Deep neural networks for learning symmetric positive definite (SPD) matrices are gaining increasing attention in machine learning. Despite the significant progress, most existing SPD networks use traditional Euclidean classifiers on approximated spaces rather than intrinsic classifiers that accurately capture the geometry of SPD manifolds. Inspired by the success of hyperbolic neural networks (HNNs), we propose Riemannian multiclass logistics regression (RMLR) for SPD networks. We introduce a general unified framework for a family of Riemannian metrics on SPD manifolds and showcase the specific $\orth{n}$-invariant Log-Euclidean Metrics for SPD networks. Moreover, we encompass the most popular classifier in existing SPD networks as a special case of our framework. Extensive experiments on popular SPD learning benchmarks demonstrate the superiority of our classifiers.

artificial intelligence, machine learning, manifold, (18 more...)

arXiv.org Artificial Intelligence

2305.11288

Genre: Research Report (0.65)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Federated Learning Operations Made Simple with Flame

Daga, Harshit, Shin, Jaemin, Garg, Dhruv, Gavrilovska, Ada, Lee, Myungjin, Kompella, Ramana Rao

arXiv.org Artificial IntelligenceMay-8-2023

Distributed machine learning approaches, including a broad class of federated learning techniques, present a number of benefits when deploying machine learning applications over widely distributed infrastructures. To realize the expected benefits, however, introduces substantial operational challenges due to required application and configuration-level changes related to deployment-specific details. Such complexities can be greatly reduced by introducing higher-level abstractions -- role and channel -- using which federated learning applications are described as Topology Abstraction Graphs (TAGs). TAGs decouple the ML application logic from the underlying deployment details, making it possible to specialize the application deployment, thus reducing development effort and paving the way for improved automation and tuning. We present Flame, the first system that supports these abstractions, and demonstrate its benefits for several use cases.

artificial intelligence, machine learning, topology, (16 more...)

arXiv.org Artificial Intelligence

2305.05118

Country: North America > United States (0.28)

Genre: Research Report (0.82)

Industry: Information Technology > Security & Privacy (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback