AITopics | Zhu, Hongbo

Collaborating Authors

Zhu, Hongbo

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Extract-and-Abstract: Unifying Extractive and Abstractive Summarization within Single Encoder-Decoder Framework

Wu, Yuping, Li, Hao, Zhu, Hongbo, Nenadic, Goran, Zeng, Xiao-Jun

arXiv.org Artificial IntelligenceSep-18-2024

Extract-then-Abstract is a naturally coherent paradigm to conduct abstractive summarization with the help of salient information identified by the extractive model. Previous works that adopt this paradigm train the extractor and abstractor separately and introduce extra parameters to highlight the extracted salients to the abstractor, which results in error accumulation and additional training costs. In this paper, we first introduce a parameter-free highlight method into the encoder-decoder framework: replacing the encoder attention mask with a saliency mask in the cross-attention module to force the decoder to focus only on salient parts of the input. A preliminary analysis compares different highlight methods, demonstrating the effectiveness of our saliency mask. We further propose the novel extract-and-abstract paradigm, ExtAbs, which jointly and seamlessly performs Extractive and Abstractive summarization tasks within single encoder-decoder model to reduce error accumulation. In ExtAbs, the vanilla encoder is augmented to extract salients, and the vanilla decoder is modified with the proposed saliency mask to generate summaries. Built upon BART and PEGASUS, experiments on three datasets show that ExtAbs can achieve superior performance than baselines on the extractive task and performs comparable, or even better than the vanilla models on the abstractive task.

computational linguistic, large language model, machine learning, (21 more...)

arXiv.org Artificial Intelligence

2409.11827

Country:

Europe (1.00)
Asia (0.93)
North America > United States > Minnesota (0.28)

Genre: Research Report > Experimental Study (0.46)

Industry:

Law (0.46)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.46)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Add feedback

Noise-Free Explanation for Driving Action Prediction

Zhu, Hongbo, Wulff, Theodor, Maharjan, Rahul Singh, Han, Jinpei, Cangelosi, Angelo

arXiv.org Artificial IntelligenceJul-8-2024

Although attention mechanisms have achieved considerable progress in Transformer-based architectures across various Artificial Intelligence (AI) domains, their inner workings remain to be explored. Existing explainable methods have different emphases but are rather one-sided. They primarily analyse the attention mechanisms or gradient-based attribution while neglecting the magnitudes of input feature values or the skip-connection module. Moreover, they inevitably bring spurious noisy pixel attributions unrelated to the model's decision, hindering humans' trust in the spotted visualization result. Hence, we propose an easy-to-implement but effective way to remedy this flaw: Smooth Noise Norm Attention (SNNA). We weigh the attention by the norm of the transformed value vector and guide the label-specific signal with the attention gradient, then randomly sample the input perturbations and average the corresponding gradients to produce noise-free attribution. Instead of evaluating the explanation method on the binary or multi-class classification tasks like in previous works, we explore the more complex multi-label classification scenario in this work, i.e., the driving action prediction task, and trained a model for it specifically. Both qualitative and quantitative evaluation results show the superiority of SNNA compared to other SOTA attention-based explainable methods in generating a clearer visual explanation map and ranking the input pixel importance.

artificial intelligence, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2407.06339

Genre: Research Report (0.84)

Industry:

Information Technology (0.47)
Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

Add feedback

Gradient Sparsification for Efficient Wireless Federated Learning with Differential Privacy

Wei, Kang, Li, Jun, Ma, Chuan, Ding, Ming, Shu, Feng, Zhao, Haitao, Chen, Wen, Zhu, Hongbo

arXiv.org Artificial IntelligenceDec-20-2023

Federated learning (FL) enables distributed clients to collaboratively train a machine learning model without sharing raw data with each other. However, it suffers the leakage of private information from uploading models. In addition, as the model size grows, the training latency increases due to limited transmission bandwidth and the model performance degrades while using differential privacy (DP) protection. In this paper, we propose a gradient sparsification empowered FL framework over wireless channels, in order to improve training efficiency without sacrificing convergence performance. Specifically, we first design a random sparsification algorithm to retain a fraction of the gradient elements in each client's local training, thereby mitigating the performance degradation induced by DP and and reducing the number of transmission parameters over wireless channels. Then, we analyze the convergence bound of the proposed algorithm, by modeling a non-convex FL problem. Next, we formulate a time-sequential stochastic optimization problem for minimizing the developed convergence bound, under the constraints of transmit power, the average transmitting delay, as well as the client's DP requirement. Utilizing the Lyapunov drift-plus-penalty framework, we develop an analytical solution to the optimization problem. Extensive experiments have been implemented on three real life datasets to demonstrate the effectiveness of our proposed algorithm. We show that our proposed algorithms can fully exploit the interworking between communication and computation to outperform the baselines, i.e., random scheduling, round robin and delay-minimization algorithms.

algorithm, artificial intelligence, machine learning, (14 more...)

arXiv.org Artificial Intelligence

doi: 10.1007/s11432-023-3918-9

2304.04164

Country:

North America > United States (0.67)
Europe > Austria > Vienna (0.14)

Genre: Research Report (0.40)

Industry: Information Technology > Security & Privacy (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.87)

Add feedback

LIPEx-Locally Interpretable Probabilistic Explanations-To Look Beyond The True Class

Zhu, Hongbo, Cangelosi, Angelo, Sen, Procheta, Mukherjee, Anirbit

arXiv.org Artificial IntelligenceDec-7-2023

In this work, we instantiate a novel perturbation-based multi-class explanation framework, LIPEx (Locally Interpretable Probabilistic Explanation). We demonstrate that LIPEx not only locally replicates the probability distributions output by the widely used complex classification models but also provides insight into how every feature deemed to be important affects the prediction probability for each of the possible classes. We achieve this by defining the explanation as a matrix obtained via regression with respect to the Hellinger distance in the space of probability distributions. Ablation tests on text and image data, show that LIPEx-guided removal of important features from the data causes more change in predictions for the underlying model than similar tests based on other saliency-based or feature importance-based Explainable AI (XAI) methods. It is also shown that compared to LIME, LIPEx is more data efficient in terms of using a lesser number of perturbations of the data to obtain a reliable explanation. This data-efficiency is seen to manifest as LIPEx being able to compute its explanation matrix around 53% faster than all-class LIME, for classification experiments with text data.

artificial intelligence, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2310.04856

Country:

Europe > United Kingdom > England (0.14)
North America > United States > Maryland (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)

Genre: Research Report (1.00)

Industry:

Law (0.46)
Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.88)
Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (0.87)

Add feedback

A Legged Soft Robot Platform for Dynamic Locomotion

Xia, Boxi, Fu, Jiaming, Zhu, Hongbo, Song, Zhicheng, Jiang, Yibo, Lipson, Hod

arXiv.org Artificial IntelligenceMar-25-2021

We present an open-source untethered quadrupedal soft robot platform for dynamic locomotion (e.g., high-speed running and backflipping). The robot is mostly soft (80 vol.%) while driven by four geared servo motors. The robot's soft body and soft legs were 3D printed with gyroid infill using a flexible material, enabling it to conform to the environment and passively stabilize during locomotion on multi-terrain environments. In addition, we simulated the robot in a real-time soft body simulation. With tuned gaits in simulation, the real robot can locomote at a speed of 0.9 m/s (2.5 body length/second), substantially faster than most untethered legged soft robots published to date. We hope this platform, along with its verified simulator, can catalyze the development of soft robotics.

artificial intelligence, dynamic locomotion, legged soft robot platform

arXiv.org Artificial Intelligence

doi: 10.1109/ICRA48506.2021.9561018

2011.06749

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Robots (1.00)

Add feedback