AITopics | Hu, Guoqiang

Hu, Guoqiang

Rethinking Remaining Useful Life Prediction with Scarce Time Series Data: Regression under Indirect Supervision

Cheng, Jiaxiang, Pang, Yipeng, Hu, Guoqiang

arXiv.org Machine Learning, Apr-12-2025

Supervised time series prediction relies on directly measured target variables, but real-world use cases such as predicting remaining useful life (RUL) involve indirect supervision, where the target variable is labeled as a function of another dependent variable. Trending temporal regression techniques rely on sequential time series inputs to capture temporal patterns, requiring interpolation when dealing with sparsely and irregularly sampled covariates along the timeline. However, interpolation can introduce significant biases, particularly with highly scarce data. In this paper, we address the RUL prediction problem with data scarcity as time series regression under indirect supervision. We introduce a unified framework called parameterized static regression, which takes single data points as inputs for regression of target values, inherently handling data scarcity without requiring interpolation. The time dependency under indirect supervision is captured via a parametrical rectification (PR) process, approximating a parametric function during inference with historical posteriori estimates, following the same underlying distribution used for labeling during training. Additionally, we propose a novel batch training technique for tasks in indirect supervision to prevent overfitting and enhance efficiency. We evaluate our model on public benchmarks for RUL prediction with simulated data scarcity. Our method demonstrates competitive performance in prediction accuracy when dealing with highly scarce time series data.
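To make the notion of indirect supervision concrete, here is a minimal sketch; it is not the authors' implementation. It assumes the simplest possible labeling function, RUL = failure time minus observation time, and the names `Sample` and `label_rul` are hypothetical. It shows how each sparsely sampled observation becomes a standalone (features, RUL) pair, so no interpolation between observations is needed to build the training set.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Sample:
    """One sparsely sampled observation of a unit (hypothetical structure)."""
    time: float                # timestamp of the observation
    features: Sequence[float]  # covariates measured at that timestamp


def label_rul(samples: List[Sample], failure_time: float) -> List[Tuple[Sequence[float], float]]:
    """Label each observation with RUL = failure_time - time.

    Assumption: a linear labeling function. The target is never measured
    directly; it is derived from the timestamp (indirect supervision).
    """
    pairs = []
    for s in samples:
        if s.time > failure_time:
            raise ValueError("observation recorded after failure")
        pairs.append((s.features, failure_time - s.time))
    return pairs


# Irregular, scarce observations of one unit that fails at t = 100.
observations = [Sample(3.0, [0.1]), Sample(40.0, [0.4]), Sample(95.0, [0.9])]
training_pairs = label_rul(observations, failure_time=100.0)
```

Each pair can be fed to any static regressor on its own; how the paper then restores time dependency through its PR step is not reproduced here.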

artificial intelligence, machine learning, prediction, (17 more...)

2504.09206

Country: North America > United States (0.68)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.68)

Proactive Depot Discovery: A Generative Framework for Flexible Location-Routing

Qu, Site, Hu, Guoqiang

arXiv.org Artificial Intelligence, Feb-17-2025

The Location-Routing Problem (LRP) is a critical optimization challenge in the urban logistics industry, combining two interdependent decisions: selecting depot locations where vehicles commence and conclude their tasks, and planning vehicle routes for serving customers. This integration is crucial because the depot locations directly affect vehicle route planning, thereby impacting overall costs [1]. The LRP can be formally defined as follows [2]: given a set of customers with specific locations and demand quantities, and a set of potential depot candidates, each with a fleet of vehicles of fixed capacity, the aim is to select a subset of depots and plan routes for vehicles departing from the chosen depots to meet customers' demands, minimizing both depot-related and route-related costs without violating specific constraints. In this traditional problem configuration, solving the LRP has relied on a predefined set of depot candidates [3, 4, 5, 6] instead of directly generating the desired optimal depot locations, thereby limiting the solution space and potentially leading to suboptimal outcomes. This constraint is particularly pronounced in scenarios where the optimal depot locations are not included in the candidate set, or when the problem configuration demands a high degree of flexibility in depot placement, requiring depot locations to be established and adjusted quickly. A real-world application that underscores the necessity of generating depots without predefined candidates is medical rescue and disaster relief logistics: in the aftermath of a natural disaster, such as an earthquake or flood, the existing infrastructure may be severely damaged, rendering previously established depots unusable. In such scenarios, the ability to dynamically generate new depot locations based on current needs and constraints is crucial for efficient and effective relief operations.
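The objective in the definition above can be sketched as a small cost evaluator. This is an illustration under stated assumptions, not the paper's formulation: travel cost is Euclidean distance, every opened depot incurs the same fixed cost, and all vehicles share one capacity. The names `route_length`, `solution_cost`, `routes`, and `open_cost` are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def route_length(depot: Point, stops: List[Point]) -> float:
    """Length of a tour that starts and ends at the depot."""
    path = [depot, *stops, depot]
    return sum(math.dist(a, b) for a, b in zip(path, path[1:]))


def solution_cost(
    routes: Dict[Point, List[List[int]]],       # depot -> list of routes (customer ids)
    customers: Dict[int, Tuple[Point, float]],  # id -> (location, demand)
    capacity: float,
    open_cost: float,
) -> float:
    """Depot-related plus route-related cost; raises if capacity is violated."""
    total = 0.0
    for depot, depot_routes in routes.items():
        total += open_cost  # depot-related cost (assumed fixed per opened depot)
        for route in depot_routes:
            load = sum(customers[c][1] for c in route)
            if load > capacity:
                raise ValueError(f"route {route} exceeds vehicle capacity")
            total += route_length(depot, [customers[c][0] for c in route])
    return total


# One depot at the origin serving two customers on a single route.
customers = {1: ((3.0, 4.0), 2.0), 2: ((3.0, 0.0), 1.0)}
routes = {(0.0, 0.0): [[1, 2]]}
cost = solution_cost(routes, customers, capacity=5.0, open_cost=10.0)
```

Because depots are keyed by coordinates rather than by an index into a candidate list, the same evaluator applies whether depot locations are picked from candidates or generated freely, which is the distinction the abstract draws.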

artificial intelligence, depot, machine learning, (18 more...)

2502.11715

Genre: Research Report (0.82)

Industry: Transportation > Freight & Logistics Services (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

A Mel Spectrogram Enhancement Paradigm Based on CWT in Speech Synthesis

Hu, Guoqiang, Tan, Huaning, Li, Ruilai

arXiv.org Artificial Intelligence, Jul-9-2024

Acoustic features play an important role in improving the quality of synthesised speech. Currently, the Mel spectrogram is a widely employed acoustic feature in most acoustic models. However, due to the fine-grained loss caused by its Fourier transform process, the clarity of speech synthesised from the Mel spectrogram is compromised in mutant signals. In order to obtain a more detailed Mel spectrogram, we propose a Mel spectrogram enhancement paradigm based on the continuous wavelet transform (CWT). This paradigm introduces an additional task: a more detailed wavelet spectrogram, which, like the post-processing network, takes as input the Mel spectrogram output by the decoder. We choose Tacotron2 and FastSpeech2 for experimental validation in order to test autoregressive (AR) and non-autoregressive (NAR) speech systems, respectively. The experimental results demonstrate that speech synthesised using models with the Mel spectrogram enhancement paradigm exhibits a higher MOS, with improvements of 0.14 and 0.09 over the respective baseline models. These findings provide some validation for the universality of the enhancement paradigm, as they demonstrate its success in different architectures.

data quality, machine learning, spectrogram, (14 more...)

2406.12164

Country: Asia > China (0.29)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Data Science > Data Quality (0.94)
Information Technology > Artificial Intelligence > Speech > Speech Synthesis (0.71)

Simul-Whisper: Attention-Guided Streaming Whisper with Truncation Detection

Wang, Haoyu, Hu, Guoqiang, Lin, Guodong, Zhang, Wei-Qiang, Li, Jian

arXiv.org Artificial Intelligence, Jun-14-2024

As a robust and large-scale multilingual speech recognition model, Whisper has demonstrated impressive results in many low-resource and out-of-distribution scenarios. However, its encoder-decoder structure hinders its application to streaming speech recognition. In this paper, we introduce Simul-Whisper, which uses the time alignment embedded in Whisper's cross-attention to guide auto-regressive decoding and achieve chunk-based streaming ASR without any fine-tuning of the pre-trained model. Furthermore, we observe the negative effect of the truncated words at the chunk boundaries on the decoding results and propose an integrate-and-fire-based truncation detection model to address this issue. Experiments on multiple languages and Whisper architectures show that Simul-Whisper achieves an average absolute word error rate degradation of only 1.46% at a chunk size of 1 second, which significantly outperforms the current state-of-the-art baseline.
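For readers unfamiliar with the metric, word error rate (WER) is the word-level edit distance between a hypothesis and its reference transcript, divided by the reference length. The sketch below uses a hypothetical helper `wer` and made-up example sentences; it is not taken from the paper. The "absolute degradation" is read here as streaming WER minus offline WER, which is an interpretation of the abstract rather than a quoted definition.

```python
from typing import List


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: (substitutions + deletions + insertions) / reference length."""
    ref: List[str] = reference.split()
    hyp: List[str] = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] = edit distance between the processed prefix of ref and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, h in enumerate(hyp, start=1):
            curr[j] = min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (r != h),  # substitution or match
            )
        prev = curr
    return prev[-1] / len(ref)


# Made-up transcripts: the streaming hypothesis drops one word.
reference = "the cat sat on the mat"
offline_wer = wer(reference, "the cat sat on the mat")
streaming_wer = wer(reference, "the cat sat on mat")
absolute_degradation = streaming_wer - offline_wer
```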

artificial intelligence, machine learning, natural language, (18 more...)

2406.10052

Country: Asia > China (0.29)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.95)

Recent Advances in End-to-End Simultaneous Speech Translation

Liu, Xiaoqian, Hu, Guoqiang, Du, Yangfan, He, Erfeng, Luo, YingFeng, Xu, Chen, Xiao, Tong, Zhu, Jingbo

arXiv.org Artificial Intelligence, Jun-1-2024

Simultaneous speech translation (SimulST) is a demanding task that involves generating translations in real-time while continuously processing speech input. This paper offers a comprehensive overview of the recent developments in SimulST research, focusing on four major challenges. Firstly, the complexities associated with processing lengthy and continuous speech streams pose significant hurdles. Secondly, satisfying real-time requirements presents inherent difficulties due to the need for immediate translation output. Thirdly, striking a balance between translation quality and latency constraints remains a critical challenge. Finally, the scarcity of annotated data adds another layer of complexity to the task. Through our exploration of these challenges and the proposed solutions, we aim to provide valuable insights into the current landscape of SimulST research and suggest promising directions for future exploration.

artificial intelligence, natural language, translation, (20 more...)

2406.00497

Country: Asia > China > Liaoning Province (0.14)

Genre:

Overview (0.48)
Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Enhancing Unsupervised Anomaly Detection with Score-Guided Network

Huang, Zongyuan, Zhang, Baohua, Hu, Guoqiang, Li, Longyuan, Xu, Yanyan, Jin, Yaohui

arXiv.org Artificial Intelligence, Sep-10-2021

Anomaly detection plays a crucial role in various real-world applications, including healthcare and finance systems. Owing to the limited number of anomaly labels in these complex systems, unsupervised anomaly detection methods have attracted great attention in recent years. Two major challenges faced by existing unsupervised methods are: (i) distinguishing between normal and abnormal data in the transition field, where normal and abnormal data are highly mixed together; (ii) defining an effective metric to maximize the gap between normal and abnormal data in a hypothesis space, which is built by a representation learner. To that end, this work proposes a novel scoring network with a score-guided regularization to learn and enlarge the anomaly score disparities between normal and abnormal data. With such a score-guided strategy, the representation learner can gradually learn more informative representations during the model training stage, especially for the samples in the transition field. We next propose a score-guided autoencoder (SG-AE), incorporating the scoring network into an autoencoder framework for anomaly detection, as well as three other state-of-the-art models, to further demonstrate the effectiveness and transferability of the design. Extensive experiments on both synthetic and real-world datasets demonstrate the state-of-the-art performance of these score-guided models (SGMs).

dataset, deep learning, neural network, (21 more...)

2109.04684

Country:

Asia > China (0.15)
Europe > Germany (0.14)

Genre:

Research Report > New Finding (0.67)
Research Report > Promising Solution (0.66)

Industry:

Information Technology > Security & Privacy (0.46)
Health & Medicine > Therapeutic Area > Oncology (0.46)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Cognitive Visual Inspection Service for LCD Manufacturing Industry

Ding, Yuanyuan, Yan, Junchi, Hu, Guoqiang, Zhu, Jun

arXiv.org Artificial Intelligence, Jan-11-2021

With the rapid growth of display devices, quality inspection via machine vision technology has become increasingly important for the flat-panel display (FPD) industry. This paper discloses a novel visual inspection system for liquid crystal displays (LCDs), currently a dominant type in the FPD industry. The system is based on two cornerstones: a robust, high-performance defect recognition model and a cognitive visual inspection service architecture. A hybrid application of conventional computer vision techniques and the latest deep convolutional neural network (DCNN) leads to an integrated defect detection, classification and impact evaluation model that can be economically trained with only image-level class annotations to achieve a high inspection accuracy. In addition, the properly trained model is robust to variation in image quality, significantly alleviating the dependency between the model's prediction performance and the image acquisition environment. This in turn justifies decoupling the defect recognition functions from the front-end device to the back-end service, motivating the design and realization of the cognitive visual inspection service architecture. An empirical case study is performed on a large-scale real-world LCD dataset from a manufacturing line with different layers and products, showing the promising utility of our system, which has been deployed in a real-world LCD manufacturing line of a major global player.

inspection image, law enforcement, neural network, (21 more...)

2101.03747

Country: Asia > China (0.14)

Genre: Research Report (0.50)

Industry: Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.82)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Hardware (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)
