Huang, Ziyue
Training on Fake Labels: Mitigating Label Leakage in Split Learning via Secure Dimension Transformation
Jiang, Yukun, Wang, Peiran, Lin, Chengguo, Huang, Ziyue, Cheng, Yong
Two-party split learning has emerged as a popular paradigm for vertical federated learning. To preserve the privacy of the label owner, split learning utilizes a split model that only requires the exchange of intermediate representations (IRs) of the inputs and the gradients for each IR between the two parties during training. However, split learning has recently been shown to be vulnerable to label inference attacks. Although several defense methods could be adopted, they either have limited defensive performance or significantly harm the original task. In this paper, we propose a novel two-party split learning method that defends against existing label inference attacks while maintaining the high utility of the learned models. Specifically, we first craft a dimension transformation module, SecDT, which achieves bidirectional mapping between the original labels and expanded K-class labels to mitigate label leakage from the directional perspective. Then, a gradient normalization algorithm is designed to remove the magnitude divergence of gradients from different classes. Finally, we introduce softmax-normalized Gaussian noise to mitigate privacy leakage and make K unknowable to adversaries. We conducted experiments on real-world datasets, including two binary-classification datasets (Avazu and Criteo) and three multi-classification datasets (MNIST, FashionMNIST, CIFAR-10), against current attack schemes, including direction, norm, spectral, and model completion attacks. The detailed experiments demonstrate our method's effectiveness and superiority over existing approaches. For instance, on the Avazu dataset, the attack AUC of the four evaluated attacks is reduced by 0.4532±0.0127.
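To make the gradient normalization idea concrete, here is a minimal sketch (not the authors' released code) of per-example gradient normalization at the cut layer, assuming `grads` holds the gradient of the loss with respect to each example's IR; the choice of a common target magnitude is an illustrative assumption.

import torch

def normalize_cut_layer_grads(grads: torch.Tensor, eps: float = 1e-12) -> torch.Tensor:
    """Rescale every per-example gradient to the same L2 norm so that
    magnitude differences between classes leak no label information."""
    norms = grads.norm(dim=1, keepdim=True)   # per-example L2 norms
    target = norms.mean().detach()            # common target magnitude (assumption)
    return grads / (norms + eps) * target

# Example: gradients for a batch of 4 IRs of width 8
g = torch.randn(4, 8)
g_safe = normalize_cut_layer_grads(g)
print(g_safe.norm(dim=1))  # all norms are now (approximately) equal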
BeamVQ: Aligning Space-Time Forecasting Model via Self-training on Physics-aware Metrics
Wu, Hao, Shi, Xingjian, Huang, Ziyue, Zhao, Penghao, Xiong, Wei, Xue, Jinbao, Tao, Yangyu, Huang, Xiaomeng, Wang, Weiyan
Data-driven deep learning has emerged as the new paradigm for modeling complex physical space-time systems. Unlike traditional model-driven numerical methods, these data-driven methods learn patterns by optimizing statistical metrics and tend to overlook adherence to physical laws; thus, they often generate predictions that are not physically realistic. On the other hand, when a large number of candidate predictions are sampled from a data-driven model, some will be more physically plausible than others and closer to what will actually happen. Based on this observation, we propose \emph{Beam search by Vector Quantization} (BeamVQ) to enhance the physical alignment of data-driven space-time forecasting models. The key idea of BeamVQ is to train the model on self-generated samples filtered with physics-aware metrics. To flexibly support different backbone architectures, BeamVQ leverages a code bank to transform the continuous state space of any encoder-decoder model into discrete codes. Afterwards, it iteratively employs beam search to sample high-quality sequences, retains those with the highest physics-aware scores, and trains the model on the new dataset. Comprehensive experiments show that BeamVQ not only boosts the average statistical skill score by more than 32% across ten backbones on five datasets, but also significantly improves physics-aware metrics.
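The sample-filter-retrain loop can be sketched as below. This is a hypothetical outline, assuming a `sample_beams(context, width)` routine that beam-searches the quantized code space, a `physics_score(pred)` metric (e.g., a conservation-law residual), and a `fine_tune(model, pairs)` routine; none of these names come from the paper.

def beamvq_round(model, dataset, sample_beams, physics_score, fine_tune,
                 beam_width=8, keep_top=2):
    selected = []
    for context, _target in dataset:
        # Beam search over the discrete (code-bank) state space
        candidates = sample_beams(context, beam_width)
        # Keep only the most physically plausible rollouts
        ranked = sorted(candidates, key=physics_score, reverse=True)
        selected.extend((context, y) for y in ranked[:keep_top])
    fine_tune(model, selected)  # self-training on the filtered samples
    return model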
Gate Recurrent Unit Network based on Hilbert-Schmidt Independence Criterion for State-of-Health Estimation
Huang, Ziyue, Dang, Lujuan, Xie, Yuqing, Ma, Wentao, Chen, Badong
State-of-health (SOH) estimation is a key step in ensuring the safe and reliable operation of batteries. Due to issues such as varying data distributions and sequence lengths across cycles, most existing methods require health feature extraction techniques, which can be time-consuming and labor-intensive. The gated recurrent unit (GRU) network can address this problem well owing to its simple structure and superior performance, and has therefore received widespread attention. However, redundant information still exists within the network and impairs the accuracy of SOH estimation. To address this issue, a new GRU network based on the Hilbert-Schmidt Independence Criterion (GRU-HSIC) is proposed. First, a zero-masking network is used to transform battery data of varying lengths in each cycle into sequences of the same length, while still retaining information about the original data size of each cycle. Second, the Hilbert-Schmidt Independence Criterion (HSIC) bottleneck, which evolved from Information Bottleneck (IB) theory, is extended to the GRU to compress the information in the hidden layers. To evaluate the proposed method, we conducted experiments on datasets from the Center for Advanced Life Cycle Engineering (CALCE) of the University of Maryland and the NASA Ames Prognostics Center of Excellence. Experimental results demonstrate that our model achieves higher accuracy than other recurrent models.
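For reference, the following is a minimal sketch of the standard (biased) empirical HSIC estimator that HSIC-bottleneck-style training builds on; the RBF bandwidth and how the terms are weighted inside the GRU training loop are assumptions, not the paper's exact recipe.

import torch

def rbf_kernel(x: torch.Tensor, sigma: float) -> torch.Tensor:
    d2 = torch.cdist(x, x).pow(2)        # pairwise squared distances
    return torch.exp(-d2 / (2 * sigma ** 2))

def hsic(x: torch.Tensor, y: torch.Tensor, sigma: float = 1.0) -> torch.Tensor:
    n = x.shape[0]
    K, L = rbf_kernel(x, sigma), rbf_kernel(y, sigma)
    H = torch.eye(n) - torch.ones(n, n) / n   # centering matrix
    return torch.trace(K @ H @ L @ H) / (n - 1) ** 2

# HSIC-bottleneck objective per hidden state Z: minimize HSIC(Z, X)
# (compress away input redundancy) while maximizing HSIC(Z, Y)
# (retain label-relevant information).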
Deep Graph Library: Towards Efficient and Scalable Deep Learning on Graphs
Wang, Minjie, Yu, Lingfan, Zheng, Da, Gan, Quan, Gai, Yu, Ye, Zihao, Li, Mufei, Zhou, Jinjing, Huang, Qi, Ma, Chao, Huang, Ziyue, Guo, Qipeng, Zhang, Hao, Lin, Haibin, Zhao, Junbo, Li, Jinyang, Smola, Alexander, Zhang, Zheng
DGL is platform-agnostic so that it can easily be integrated with tensor-oriented frameworks like PyTorch and MXNet. It is an open-source project under active development. Appendix A summarizes the models released in the DGL repository. In this paper, we compare DGL against state-of-the-art libraries on multiple standard GNN setups and show improvements in training speed and memory efficiency. 2 Framework Requirements of Deep Learning on Graphs. Message passing paradigm. Formally, we define a graph $G(V, E)$. $V$ is the set of nodes, with $v_i$ being the feature vector associated with node $i$. $E$ is the set of edge tuples $(e_k, r_k, s_k)$, where $s_k \to r_k$ represents the edge from node $s_k$ to node $r_k$, and $e_k$ is the feature vector associated with the edge. DGNs are defined by the following edge-wise and node-wise computations: Edge-wise: $m_k^{(t)} = \phi^e\big(e_k^{(t-1)}, v_{r_k}^{(t-1)}, v_{s_k}^{(t-1)}\big)$; Node-wise: $v_i^{(t)} = \phi^v\big(v_i^{(t-1)}, \rho\big(\{m_k^{(t)} \mid \forall k \text{ s.t. } r_k = i\}\big)\big)$, where $\rho$ aggregates the messages on the incoming edges of node $i$.
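As a small, hedged illustration of this message-passing paradigm in DGL's public API (graph construction plus update_all with builtin message/reduce functions), consider the toy graph below; the graph, feature sizes, and the particular choices of $\phi^e$ (source feature plus edge feature) and $\rho$ (sum) are illustrative only.

import torch
import dgl
import dgl.function as fn

# Three nodes, edges 0->1, 0->2, 1->2; node features v_i, edge features e_k
g = dgl.graph((torch.tensor([0, 0, 1]), torch.tensor([1, 2, 2])))
g.ndata['v'] = torch.randn(3, 4)
g.edata['e'] = torch.randn(3, 4)

# Edge-wise phi^e: message m_k from source and edge features;
# node-wise rho: sum the incoming messages at each receiver r_k = i.
g.update_all(fn.u_add_e('v', 'e', 'm'), fn.sum('m', 'v_new'))
print(g.ndata['v_new'].shape)  # torch.Size([3, 4])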
Communication-Efficient Distributed Blockwise Momentum SGD with Error-Feedback
Zheng, Shuai, Huang, Ziyue, Kwok, James T.
Communication overhead is a major bottleneck hampering the scalability of distributed machine learning systems. Recently, there has been a surge of interest in using gradient compression to improve the communication efficiency of distributed neural network training. Using 1-bit quantization, signSGD with majority vote achieves a 32x reduction in communication cost. However, its convergence rests on unrealistic assumptions, and it can diverge in practice. In this paper, we propose a general distributed compressed SGD with Nesterov's momentum. We consider two-way compression, which compresses the gradients both to and from workers. A convergence analysis on nonconvex problems for general gradient compressors is provided. By partitioning the gradient into blocks, a blockwise compressor is introduced such that each gradient block is compressed and transmitted in 1-bit format with a scaling factor, leading to a nearly 32x reduction in communication. Experimental results show that the proposed method converges as fast as full-precision distributed momentum SGD and achieves the same testing accuracy. In particular, on distributed ResNet training with 7 workers on ImageNet, the proposed algorithm achieves the same testing accuracy as momentum SGD using full-precision gradients, but with $46\%$ less wall-clock time.
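The following is a minimal sketch of blockwise 1-bit compression with error feedback in the spirit of the paper; the block layout and the mean-absolute-value scaling factor are illustrative assumptions, not the exact released algorithm.

import torch

def compress_block(g: torch.Tensor):
    """sign(g) plus one scaling factor per block (mean absolute value)."""
    return torch.sign(g), g.abs().mean()

def step_with_error_feedback(grad_blocks, residuals):
    sent = []
    for i, g in enumerate(grad_blocks):
        corrected = g + residuals[i]        # add back past compression error
        signs, scale = compress_block(corrected)
        approx = scale * signs              # what is actually transmitted
        residuals[i] = corrected - approx   # keep the new error locally
        sent.append(approx)
    return sent

blocks = [torch.randn(10), torch.randn(5)]
residuals = [torch.zeros_like(b) for b in blocks]
sent = step_with_error_feedback(blocks, residuals)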