
Collaborating Authors

Zhang, Haixia


Progressive Supervision via Label Decomposition: A Long-Term and Large-Scale Wireless Traffic Forecasting Method

arXiv.org Artificial Intelligence

Long-term and Large-scale Wireless Traffic Forecasting (LL-WTF) is pivotal for strategic network management and comprehensive planning on a macro scale. However, LL-WTF poses greater challenges than short-term forecasting due to the pronounced non-stationarity of extended wireless traffic and the vast number of nodes distributed at the city scale. To cope with this, we propose a Progressive Supervision method based on Label Decomposition (PSLD). Specifically, we first introduce a Random Subgraph Sampling (RSS) algorithm designed to sample a tractable subset from large-scale traffic data, thereby enabling efficient network training. Then, PSLD employs label decomposition to obtain multiple easy-to-learn components, which are learned progressively at shallow layers and combined at deep layers to effectively cope with the non-stationarity posed by LL-WTF tasks. Finally, we compare the proposed method with various state-of-the-art (SOTA) methods on three large-scale WT datasets. Extensive experimental results demonstrate that the proposed PSLD significantly outperforms existing methods, with average performance improvements of 2%, 4%, and 11% on the three WT datasets, respectively. In addition, we have built an open-source library for WT forecasting (WTFlib) to facilitate related research; it contains numerous SOTA methods and provides a strong benchmark. Experiments can be reproduced through https://github.com/Anoise/WTFlib.
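
The core idea of decomposing the supervision signal into easy-to-learn components that shallow layers fit first can be sketched in a few lines. This is a minimal illustration only: it assumes moving-average smoothing as the decomposition and MSE as the per-component loss, neither of which is necessarily the paper's exact choice.

```python
import torch
import torch.nn.functional as F

def decompose_label(y, kernels=(25, 13)):
    """Split a label sequence y (batch, time) into progressively finer
    components: moving averages with odd window sizes, plus a residual."""
    components, residual = [], y
    for k in kernels:
        smooth = F.avg_pool1d(residual.unsqueeze(1), k, stride=1,
                              padding=k // 2).squeeze(1)
        components.append(smooth)
        residual = residual - smooth
    components.append(residual)  # whatever remains after all smoothing
    return components

def progressive_loss(layer_outputs, y):
    """Supervise each layer's auxiliary output with one label component:
    shallow layers fit the smooth parts, deeper layers the residuals."""
    return sum(F.mse_loss(out, comp)
               for out, comp in zip(layer_outputs, decompose_label(y)))
```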


Multi-Head Encoding for Extreme Label Classification

arXiv.org Artificial Intelligence

The number of categories of instances in the real world is often enormous, and each instance may carry multiple labels. To distinguish among these massive labels with machine learning, eXtreme Label Classification (XLC) has been established. However, as the number of categories increases, the number of parameters and nonlinear operations in the classifier also rises. This results in a Classifier Computational Overload Problem (CCOP). To address this, we propose a Multi-Head Encoding (MHE) mechanism, which replaces the vanilla classifier with a multi-head classifier. During training, MHE decomposes extreme labels into the product of multiple short local labels, and each head is trained on these local labels. During testing, the predicted labels can be directly calculated from the local predictions of each head, which reduces the computational load geometrically. Then, according to the characteristics of different XLC tasks, e.g., single-label, multi-label, and model pretraining tasks, three MHE-based implementations, i.e., Multi-Head Product, Multi-Head Cascade, and Multi-Head Sampling, are proposed to more effectively cope with CCOP. Moreover, we theoretically demonstrate that MHE can achieve performance approximately equivalent to that of the vanilla classifier by generalizing the low-rank approximation problem from the Frobenius norm to Cross-Entropy. Experimental results show that the proposed methods achieve state-of-the-art performance while significantly streamlining the training and inference processes of XLC tasks. The source code has been made public at https://github.com/Anoise/MHE.
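
For the single-label case, the label-decomposition arithmetic is easy to make concrete. The sketch below is in the spirit of Multi-Head Product and assumes the label space factorizes as C = c1 * c2 with two linear heads; the head count, the factorization, and the way local logits are combined are illustrative assumptions, not the paper's exact implementation.

```python
import torch
import torch.nn as nn

class MultiHeadClassifier(nn.Module):
    """Replace one C-way classifier (C = c1 * c2) with two small heads.
    A global label y in [0, C) maps to local labels (y // c2, y % c2)."""
    def __init__(self, dim, c1, c2):
        super().__init__()
        self.c2 = c2
        self.head1 = nn.Linear(dim, c1)   # c1 + c2 outputs in total,
        self.head2 = nn.Linear(dim, c2)   # instead of c1 * c2

    def forward(self, x):
        return self.head1(x), self.head2(x)

    def loss(self, x, y):
        l1, l2 = self.forward(x)
        ce = nn.functional.cross_entropy
        return ce(l1, y // self.c2) + ce(l2, y % self.c2)

    @torch.no_grad()
    def predict(self, x):
        # the global prediction is recovered directly from local argmaxes
        l1, l2 = self.forward(x)
        return l1.argmax(-1) * self.c2 + l2.argmax(-1)
```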


Act Now: A Novel Online Forecasting Framework for Large-Scale Streaming Data

arXiv.org Artificial Intelligence

In this paper, we find that existing online forecasting methods have the following issues: 1) They do not consider the update frequency of streaming data and directly use labels (future signals) to update the model, leading to information leakage. 2) Eliminating information leakage can exacerbate concept drift, and online parameter updates can damage prediction accuracy. 3) Setting aside a validation set cuts off the model's continued learning. 4) Existing GPU devices cannot support online learning on large-scale streaming data. To address these issues, we propose a novel online learning framework, Act-Now, to improve online prediction on large-scale streaming data. First, we introduce a Random Subgraph Sampling (RSS) algorithm designed to enable efficient model training. Then, we design a Fast Stream Buffer (FSB) and a Slow Stream Buffer (SSB) to update the model online. FSB updates the model immediately with consistent pseudo-labels and partial labels to avoid information leakage, while SSB updates the model in parallel using complete labels from earlier times. Further, to address concept drift, we propose a Label Decomposition model (Lade) with statistical and normalization flows: Lade forecasts both the statistical variations and the normalized future values of the data, integrating them through a combiner to produce the final predictions. Finally, we propose to perform online updates on the validation set to ensure the consistency of model learning on streaming data. Extensive experiments demonstrate that the proposed Act-Now framework performs well on large-scale streaming data, with average performance improvements of 28.4% and 19.5%, respectively. Experiments can be reproduced via https://github.com/Anoise/Act-Now.
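
The fast/slow update discipline can be pictured with a small buffer, assuming labels for step t become complete only after a fixed delay. In the toy below, `fetch_complete_label` and `model.update` are hypothetical placeholders for the data feed and the optimizer step; the real FSB/SSB run in parallel and use consistent pseudo-labels rather than this simplification.

```python
from collections import deque

class FastSlowUpdater:
    """Toy fast/slow stream buffers. Fast path: update immediately with a
    pseudo/partial label (no future information, so no leakage). Slow
    path: re-update with the complete label once it has fully arrived,
    `delay` steps later."""
    def __init__(self, delay):
        self.delay = delay
        self.pending = deque()  # (t, x) pairs awaiting complete labels

    def step(self, t, x, y_partial, model):
        model.update(x, y_partial)  # fast, leakage-free immediate update
        self.pending.append((t, x))
        # slow path: labels for steps <= t - delay are now complete
        while self.pending and self.pending[0][0] <= t - self.delay:
            t0, x0 = self.pending.popleft()
            model.update(x0, fetch_complete_label(t0))  # hypothetical accessor
```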


DistPred: A Distribution-Free Probabilistic Inference Method for Regression and Forecasting

arXiv.org Machine Learning

Traditional regression and prediction methods typically provide only deterministic point estimates. To estimate the uncertainty or distribution of the response variable, methods such as Bayesian inference, model ensembling, or MC Dropout are typically used. These methods either assume that the posterior distribution of samples follows a Gaussian process or require thousands of forward passes for sample generation. We propose a novel approach called DistPred for regression and forecasting tasks, which overcomes the limitations of existing methods while remaining simple and powerful. Specifically, we transform proper scoring rules that measure the discrepancy between the predicted distribution and the target distribution into a differentiable discrete form and use it as a loss function to train the model end-to-end. This allows the model to generate numerous samples in a single forward pass to estimate the potential distribution of the response variable. We have compared our method with several existing approaches on multiple datasets and achieved state-of-the-art performance. Additionally, our method significantly improves computational efficiency; for example, compared to state-of-the-art models, DistPred achieves a 90x faster inference speed. Experimental results can be reproduced through https://github.com/Anoise/DistPred.
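
One standard proper scoring rule that admits such a differentiable, sample-based form is the CRPS in its energy representation. The estimator below is a common sample-based CRPS and is shown only to illustrate the idea; it is not claimed to be DistPred's exact loss.

```python
import torch

def crps_sample_loss(samples, y):
    """Sample-based CRPS estimate: E|X - y| - 0.5 * E|X - X'|, where X, X'
    range over the K samples the model emits in one forward pass.
    samples: (batch, K); y: (batch,). Differentiable w.r.t. samples."""
    term1 = (samples - y.unsqueeze(-1)).abs().mean(dim=-1)          # E|X - y|
    pairwise = (samples.unsqueeze(-1) - samples.unsqueeze(-2)).abs()  # |X - X'|
    return (term1 - 0.5 * pairwise.mean(dim=(-1, -2))).mean()
```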


ESFL: Efficient Split Federated Learning over Resource-Constrained Heterogeneous Wireless Devices

arXiv.org Artificial Intelligence

Federated learning (FL) allows multiple parties (distributed devices) to train a machine learning model without sharing raw data. How to effectively and efficiently utilize the resources on devices and the central server is a highly interesting yet challenging problem. In this paper, we propose an efficient split federated learning algorithm (ESFL) to take full advantage of the powerful computing capabilities at a central server under a split federated learning framework with heterogeneous end devices (EDs). By splitting the model into different submodels between the server and the EDs, our approach jointly optimizes user-side workload and server-side computing resource allocation while accounting for users' heterogeneity. We formulate the whole optimization problem as a mixed-integer nonlinear program, which is NP-hard, and develop an iterative approach to obtain an approximate solution efficiently. Extensive simulations validate the significantly improved efficiency of our ESFL approach compared with standard federated learning, split learning, and splitfed learning.
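
The basic model split is simple to picture: the device computes a front portion of the network and transmits the intermediate activations to the server, which finishes the forward pass. The toy below illustrates only this split; the layer sizes and split point are arbitrary assumptions, and ESFL's actual contribution is the resource-allocation optimization around it.

```python
import torch
import torch.nn as nn

# Front portion runs on the end device; the remainder runs on the server.
device_part = nn.Sequential(nn.Linear(32, 64), nn.ReLU())
server_part = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 10))

def split_forward(x):
    smashed = device_part(x)     # cheap computation on the device
    # in a real system the activations are transmitted to the server here
    return server_part(smashed)  # heavy computation offloaded to the server

logits = split_forward(torch.randn(8, 32))
```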


Minusformer: Improving Time Series Forecasting by Progressively Learning Residuals

arXiv.org Artificial Intelligence

In this paper, we find that ubiquitous time series (TS) forecasting models are prone to severe overfitting. To cope with this problem, we embrace a de-redundancy approach that progressively reinstates the intrinsic values of TS for future intervals. Specifically, we renovate the vanilla Transformer by reorienting the information aggregation mechanism from addition to subtraction. Then, we incorporate an auxiliary output branch into each block of the original model to construct a highway leading to the ultimate prediction. The output of each subsequent module in this branch subtracts the previously learned results, enabling the model to learn the residuals of the supervision signal, layer by layer. This design facilitates a learning-driven, implicit progressive decomposition of the input and output streams, empowering the model with heightened versatility, interpretability, and resilience against overfitting. Since all aggregations in the model use minus signs, it is called Minusformer. Extensive experiments demonstrate that the proposed method outperforms existing state-of-the-art methods, yielding an average performance improvement of 11.9% across various datasets.
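
A toy rendering of the subtraction-based wiring is shown below, assuming plain linear blocks with one auxiliary linear branch each. Minusformer's actual streams are Transformer blocks, so treat this only as an illustration of residual-by-subtraction learning, not the model itself.

```python
import torch
import torch.nn as nn

class MinusStack(nn.Module):
    """Toy subtraction-based aggregation: each block removes what it has
    explained from the input stream, and its auxiliary branch contributes
    a partial forecast to an output highway aggregated with minus signs."""
    def __init__(self, dim, horizon, n_blocks=3):
        super().__init__()
        self.blocks = nn.ModuleList(nn.Linear(dim, dim) for _ in range(n_blocks))
        self.branches = nn.ModuleList(nn.Linear(dim, horizon) for _ in range(n_blocks))

    def forward(self, x):
        pred = None
        for block, branch in zip(self.blocks, self.branches):
            h = torch.relu(block(x))
            x = x - h                      # input stream: subtract, don't add
            out = branch(h)
            # output stream also aggregates by subtraction, so each later
            # block ends up fitting a residual of the supervision signal
            pred = out if pred is None else pred - out
        return pred
```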


Joint Service Caching, Communication and Computing Resource Allocation in Collaborative MEC Systems: A DRL-based Two-timescale Approach

arXiv.org Artificial Intelligence

Meeting the strict Quality of Service (QoS) requirements of terminals imposes a significant challenge on Multi-access Edge Computing (MEC) systems due to their limited multidimensional resources. To address this challenge, we propose a collaborative MEC framework that facilitates resource sharing among edge servers, with the aim of maximizing the long-term QoS and reducing the cache switching cost through joint optimization of service caching, collaborative offloading, and computation and communication resource allocation. The dual-timescale feature and the temporal recurrence relationship between service caching and the other resource allocation decisions make the problem even more challenging to solve. To solve it, we propose a deep reinforcement learning (DRL)-based dual-timescale scheme, called DGL-DDPG, which is composed of a short-term genetic algorithm (GA) and a long short-term memory network-based deep deterministic policy gradient (LSTM-DDPG). In doing so, we reformulate the optimization problem as a Markov decision process (MDP) in which the small-timescale resource allocation decisions generated by an improved GA are taken as the states and fed into a centralized LSTM-DDPG agent to generate the service caching decision for the large timescale. Simulation results demonstrate that our proposed algorithm outperforms the baseline algorithms in terms of average QoS and cache switching cost.
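
The dual-timescale control flow amounts to a nested loop: the slow agent fixes a caching decision once per large frame, and a fast search allocates resources in every small slot under that decision. Everything in the skeleton below (`env`, `ddpg_agent`, `ga_allocate`, and their methods) is a hypothetical placeholder, not the paper's API.

```python
def dual_timescale_loop(env, ddpg_agent, ga_allocate, n_frames, slots_per_frame):
    """Skeleton of a dual-timescale scheme in the spirit of DGL-DDPG."""
    for _ in range(n_frames):
        state = env.observe()
        caching = ddpg_agent.act(state)          # large timescale: service caching
        slot_records = []
        for _ in range(slots_per_frame):
            alloc = ga_allocate(env, caching)    # small timescale: GA resource search
            reward = env.step(caching, alloc)
            slot_records.append((alloc, reward))
        # the LSTM-based agent consumes the slot sequence as its state/feedback
        ddpg_agent.learn(state, caching, slot_records)
```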


Does Long-Term Series Forecasting Need Complex Attention and Extra Long Inputs?

arXiv.org Artificial Intelligence

As Transformer-based models have achieved impressive performance on various time series tasks, Long-Term Series Forecasting (LTSF) has also received extensive attention in recent years. However, due to the inherent computational complexity and long input sequences demanded by Transformer-based methods, their application to LTSF tasks still raises two major questions that need further investigation: 1) whether the sparse attention mechanisms designed by these methods actually reduce running time on real devices; 2) whether these models need extra-long input sequences to guarantee their performance. The answers given in this paper are negative. Accordingly, this paper proposes Periodformer, into which a gating mechanism is embedded to regulate the influence of the attention module on the prediction results. This enables Periodformer to have much more powerful and flexible sequence modeling capability with linear computational complexity, which guarantees higher prediction performance and shorter runtime on real devices. Furthermore, to take full advantage of GPUs for fast hyperparameter optimization (e.g., finding a suitable input length), a Multi-GPU Asynchronous parallel algorithm based on Bayesian Optimization (MABO) is presented. MABO allocates a process to each GPU via a queue mechanism and then creates multiple trials at a time for asynchronous parallel search, which greatly reduces the search time. Experimental results show that Periodformer consistently achieves the best performance on six widely used benchmark datasets.
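
The gating idea itself is compact: a learnable scalar gate scales the attention branch so the model can attenuate attention when it does not help the forecast. The block below is a generic sketch of such gating around standard multi-head attention, not Periodformer's actual architecture.

```python
import torch
import torch.nn as nn

class GatedAttentionBlock(nn.Module):
    """Toy gated attention: sigmoid(gate) in (0, 1) scales the attention
    branch, letting training down-weight attention when it hurts."""
    def __init__(self, dim, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.gate = nn.Parameter(torch.zeros(1))  # sigmoid(0) = 0.5 at init

    def forward(self, x):                 # x: (batch, seq_len, dim)
        a, _ = self.attn(x, x, x)
        return x + torch.sigmoid(self.gate) * a
```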


Federated Radio Frequency Fingerprinting with Model Transfer and Adaptation

arXiv.org Artificial Intelligence

Radio frequency (RF) fingerprinting makes highly secure device authentication possible for future networks by exploiting hardware imperfections introduced during manufacturing. Although this technique has received considerable attention over the past few years, RF fingerprinting still faces the great challenge of channel-variation-induced data distribution drift between the training phase and the test phase. To address this fundamental challenge and support model training and testing at the edge, we propose a federated RF fingerprinting algorithm with a novel strategy called model transfer and adaptation (MTA). The proposed algorithm introduces dense connectivity among convolutional layers into RF fingerprinting to enhance learning accuracy and reduce model complexity. Besides, we implement the proposed algorithm in the context of federated learning, making it communication-efficient and privacy-preserving. To further conquer the data mismatch challenge, we transfer the model learned under one channel condition and adapt it to other channel conditions with only a limited amount of information, leading to highly accurate predictions under environmental drifts. Experimental results on real-world datasets demonstrate that the proposed algorithm is model-agnostic and also signal-irrelevant. Compared with state-of-the-art RF fingerprinting algorithms, our algorithm improves prediction performance considerably, with a gain of up to 15%.
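
A minimal sketch of the transfer-and-adaptation step is shown below: reuse a model trained under one channel condition and fine-tune only a small part of it on a few samples from the new channel. The frozen/trainable split and the `classifier` name prefix are illustrative assumptions, not the paper's exact MTA recipe.

```python
import torch
import torch.nn as nn

def adapt_to_new_channel(model: nn.Module, head_prefix: str = "classifier"):
    """Toy MTA-style adaptation: freeze the backbone learned under the
    source channel and return an optimizer over the head only, so a few
    target-channel samples suffice for fine-tuning."""
    for name, param in model.named_parameters():
        param.requires_grad = name.startswith(head_prefix)  # head stays trainable
    trainable = [p for p in model.parameters() if p.requires_grad]
    return torch.optim.Adam(trainable, lr=1e-4)
```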