AITopics

2501.13904

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.67)

Industry: Information Technology > Security & Privacy (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

arXiv.org Artificial IntelligenceJan-27-2025

PBM-VFL: Vertical Federated Learning with Feature and Sample Privacy

Tran, Linh, Castiglia, Timothy, Patterson, Stacy, Milanova, Ana

We present Poisson Binomial Mechanism Vertical Federated Learning (PBM-VFL), a communication-efficient Vertical Federated Learning algorithm with Differential Privacy guarantees. PBM-VFL combines Secure Multi-Party Computation with the recently introduced Poisson Binomial Mechanism to protect parties' private datasets during model training. We define the novel concept of feature privacy and analyze end-to-end feature and sample privacy of our algorithm. We compare sample privacy loss in VFL with privacy loss in HFL. We also provide the first theoretical characterization of the relationship between privacy budget, convergence error, and communication cost in differentially-private VFL. Finally, we empirically show that our model performs well with high levels of privacy.

artificial intelligence, machine learning, privacy, (14 more...)

2501.13916

Country: North America (0.14)

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

arXiv.org Artificial IntelligenceJul-9-2024

A Differentially Private Blockchain-Based Approach for Vertical Federated Learning

Tran, Linh, Chari, Sanjay, Khan, Md. Saikat Islam, Zachariah, Aaron, Patterson, Stacy, Seneviratne, Oshani

We present the Differentially Private Blockchain-Based Vertical Federal Learning (DP-BBVFL) algorithm that provides verifiability and privacy guarantees for decentralized applications. DP-BBVFL uses a smart contract to aggregate the feature representations, i.e., the embeddings, from clients transparently. We apply local differential privacy to provide privacy for embeddings stored on a blockchain, hence protecting the original data. We provide the first prototype application of differential privacy with blockchain for vertical federated learning. Our experiments with medical data show that DP-BBVFL achieves high accuracy with a tradeoff in training time due to on-chain aggregation. This innovative fusion of differential privacy and blockchain technology in DP-BBVFL could herald a new era of collaborative and trustworthy machine learning applications across several decentralized application domains.

artificial intelligence, machine learning, privacy, (17 more...)

2407.07054

Country: North America > United States > New York (0.15)

Genre: Research Report (0.64)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)
Banking & Finance (1.00)

Technology:

Information Technology > e-Commerce > Financial Technology (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)

arXiv.org Artificial IntelligenceSep-5-2023

Representation Learning for Sequential Volumetric Design Tasks

Alam, Md Ferdous, Wang, Yi, Tran, Linh, Cheng, Chin-Yi, Luo, Jieliang

Volumetric design, also called massing design, is the first and critical step in professional building design which is sequential in nature. As the volumetric design process is complex, the underlying sequential design process encodes valuable information for designers. Many efforts have been made to automatically generate reasonable volumetric designs, but the quality of the generated design solutions varies, and evaluating a design solution requires either a prohibitively comprehensive set of metrics or expensive human expertise. While previous approaches focused on learning only the final design instead of sequential design tasks, we propose to encode the design knowledge from a collection of expert or high-performing design sequences and extract useful representations using transformer-based models. Later we propose to utilize the learned representations for crucial downstream applications such as design preference evaluation and procedural design generation. We develop the preference model by estimating the density of the learned representations whereas we train an autoregressive transformer model for sequential design generation. We demonstrate our ideas by leveraging a novel dataset of thousands of sequential volumetric designs. Our preference model can compare two arbitrarily given design sequences and is almost 90% accurate in evaluation against random design sequences. Our autoregressive model is also capable of autocompleting a volumetric design sequence from a partial design sequence.

artificial intelligence, representation learning, sequential volumetric design task

2309.02583

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence (1.00)

arXiv.org Artificial IntelligenceOct-8-2022

MaskTune: Mitigating Spurious Correlations by Forcing to Explore

Taghanaki, Saeid Asgari, Khani, Aliasghar, Khani, Fereshte, Gholami, Ali, Tran, Linh, Mahdavi-Amiri, Ali, Hamarneh, Ghassan

A fundamental challenge of over-parameterized deep learning models is learning meaningful data representations that yield good performance on a downstream task without over-fitting spurious input features. This work proposes MaskTune, a masking strategy that prevents over-reliance on spurious (or a limited number of) features. MaskTune forces the trained model to explore new features during a single epoch finetuning by masking previously discovered features. MaskTune, unlike earlier approaches for mitigating shortcut learning, does not require any supervision, such as annotating spurious features or labels for subgroup samples in a dataset. Our empirical results on biased MNIST, CelebA, Waterbirds, and ImagenNet-9L datasets show that MaskTune is effective on tasks that often suffer from the existence of spurious correlations. Finally, we show that MaskTune outperforms or achieves similar performance to the competing methods when applied to the selective classification (classification with rejection option) task. Code for MaskTune is available at https://github.com/aliasgharkhani/

artificial intelligence, machine learning, masktune, (16 more...)

2210.00055

Genre: Research Report (1.00)

Industry: Health & Medicine (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Machine LearningFeb-3-2021

Noise-robust classification with hypergraph neural network

Dang, Nguyen Trinh Vu, Tran, Loc, Tran, Linh

This paper presents a novel version of the hypergraph neural network method. This method is utilized to solve the noisy label learning problem. First, we apply the PCA dimensional reduction technique to the feature matrices of the image datasets in order to reduce the "noise" and the redundant features in the feature matrices of the image datasets and to reduce the runtime constructing the hypergraph of the hypergraph neural network method. Then, the classic graph-based semi-supervised learning method, the classic hypergraph based semi-supervised learning method, the graph neural network, the hypergraph neural network, and our proposed hypergraph neural network are employed to solve the noisy label learning problem. The accuracies of these five methods are evaluated and compared. Experimental results show that the hypergraph neural network methods achieve the best performance when the noise level increases. Moreover, the hypergraph neural network methods are at least as good as the graph neural network.

artificial intelligence, deep learning, neural network, (18 more...)

2102.01934

Country: North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report > New Finding (0.48)

Industry: Education > Focused Education > Special Education (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.31)

arXiv.org Machine LearningSep-3-2019

To Detect Irregular Trade Behaviors In Stock Market By Using Graph Based Ranking Methods

Tran, Loc, Tran, Linh

To detect the irregular trade behaviors in the stock market is the important problem in machine learning field. These irregular trade behaviors are obviously illegal. To detect these irregular trade behaviors in the stock market, data scientists normally employ the supervised learning techniques. In this paper, we employ the three graph Laplacian based semi-supervised ranking methods to solve the irregular trade behavior detection problem. Experimental results show that that the un-normalized and symmetric normalized graph Laplacian based semi-supervised ranking methods outperform the random walk Laplacian based semi-supervised ranking method.

health & medicine, inductive learning, laplacian, (17 more...)

1909.08964

Country: Asia > Vietnam > Bình Dương Province (0.14)

Genre: Research Report (0.84)

Industry:

Banking & Finance > Trading (1.00)
Health & Medicine (0.96)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)

arXiv.org Machine LearningAug-29-2019

Solve fraud detection problem by using graph based learning methods

Tran, Loc, Tran, Tuan, Tran, Linh, Mai, An

The credit cards' fraud transactions detection is the important problem in machine learning field. To detect the credit cards's fraud transactions help reduce the significant loss of the credit cards' holders and the banks. To detect the credit cards' fraud transactions, data scientists normally employ the unsupervised learning techniques and supervised learning techniques. In this paper, we employ the graph p-Laplacian based semi-supervised learning methods combined with the undersampling techniques such as Cluster Centroids to solve the credit cards' fraud transactions detection problem. Experimental results show that the graph p-Laplacian semi-supervised learning methods outperform the current state of the art graph Laplacian based semi-supervised learning method (p=2).

law enforcement, operator, public safety, (24 more...)

1908.11708

Country: Asia > Vietnam > Bình Dương Province (0.14)

Genre: Research Report (1.00)

Industry:

Law Enforcement & Public Safety > Fraud (1.00)
Information Technology > Security & Privacy (0.81)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.30)

arXiv.org Machine LearningAug-26-2013

The Generalized Mean Information Coefficient

Luedtke, Alexander, Tran, Linh

Reshef & Reshef recently published a paper in which they present a method called the Maximal Information Coefficient (MIC) that can detect all forms of statistical dependence between pairs of variables as sample size goes to infinity. While this method has been praised by some, it has also been criticized for its lack of power in finite samples. We seek to modify MIC so that it has higher power in detecting associations for limited sample sizes. Here we present the Generalized Mean Information Coefficient (GMIC), a generalization of MIC which incorporates a tuning parameter that can be used to modify the complexity of the association favored by the measure. We define GMIC and prove it maintains several key asymptotic properties of MIC. Its increased power over MIC is demonstrated using a simulation of eight different functional relationships at sixty different noise levels. The results are compared to the Pearson correlation, distance correlation, and MIC. Simulation results suggest that while generally GMIC has slightly lower power than the distance correlation measure, it achieves higher power than MIC for many forms of underlying association. For some functional relationships, GMIC surpasses all other statistics calculated. Preliminary results suggest choosing a moderate value of the tuning parameter for GMIC will yield a test that is robust across underlying relationships. GMIC is a promising new method that mitigates the power issues suffered by MIC, at the possible expense of equitability. Nonetheless, distance correlation was in our simulations more powerful for many forms of underlying relationships. At a minimum, this work motivates further consideration of maximal information-based nonparametric exploration (MINE) methods as statistical tests of independence.

artificial intelligence, cor dcor mic minic gmic, machine learning, (10 more...)

1308.5712

Country: North America > United States > California (0.28)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)