Hu, Min
Dual-Splitting Conformal Prediction for Multi-Step Time Series Forecasting
Yu, Qingdi, Cao, Zhiwei, Wang, Ruihang, Yang, Zhen, Deng, Lijun, Hu, Min, Luo, Yong, Zhou, Xin
Time series forecasting is crucial for applications like resource scheduling and risk management, where multi-step predictions provide a comprehensive view of future trends. Uncertainty Quantification (UQ) is a mainstream approach for addressing forecasting uncertainties, with Conformal Prediction (CP) gaining attention due to its model-agnostic nature and statistical guarantees. However, most variants of CP are designed for single-step predictions and face challenges in multi-step scenarios, such as reliance on real-time data and limited scalability. This highlights the need for CP methods specifically tailored to multi-step forecasting. We propose the Dual-Splitting Conformal Prediction (DSCP) method, a novel CP approach designed to capture inherent dependencies within time-series data for multi-step forecasting. Experimental results on real-world datasets from four different domains demonstrate that the proposed DSCP significantly outperforms existing CP variants in terms of the Winkler Score, achieving a performance improvement of up to 23.59% compared to state-of-the-art methods. Furthermore, we deployed the DSCP approach for renewable energy generation and IT load forecasting in power management of a real-world trajectory-based application, achieving an 11.25% reduction in carbon emissions through predictive optimization of data center operations and controls.
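The abstract does not spell out the dual-splitting construction itself, so the following is only a minimal sketch of the generic split conformal recipe that DSCP builds on, applied independently at each forecast horizon; `model.predict`, the data shapes, and the per-horizon calibration are illustrative assumptions, not the paper's method.

```python
# Minimal per-horizon split conformal prediction sketch (not DSCP itself).
# Assumes a fitted multi-step forecaster with a hypothetical `predict` method
# returning arrays of shape (n_samples, H) for H forecast steps.
import numpy as np

def split_conformal_intervals(model, X_cal, Y_cal, X_test, alpha=0.1):
    preds_cal = model.predict(X_cal)          # (n_cal, H)
    scores = np.abs(Y_cal - preds_cal)        # absolute residuals per step
    n = len(Y_cal)
    # Finite-sample-corrected quantile level for (1 - alpha) coverage.
    q_level = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)
    q = np.quantile(scores, q_level, axis=0)  # one width per horizon, (H,)
    preds = model.predict(X_test)             # (n_test, H)
    return preds - q, preds + q               # lower and upper interval bands
```

Intervals produced this way can then be evaluated with the Winkler score used in the paper, which penalizes both interval width and miscoverage.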
ReverseNER: A Self-Generated Example-Driven Framework for Zero-Shot Named Entity Recognition with Large Language Models
Wang, Anbang, Mei, Difei, Zhang, Zhichao, Bai, Xiuxiu, Yao, Ran, Fang, Zewen, Hu, Min, Cao, Zhirui, Sun, Haitao, Guo, Yifeng, Zhou, Hongyao, Guo, Yu
This paper presents ReverseNER, a method aimed at overcoming a key limitation of large language models (LLMs) in zero-shot named entity recognition (NER): their reliance on pre-provided demonstrations. ReverseNER tackles this challenge by constructing a reliable example library composed of dozens of entity-labeled sentences, generated through the reverse of the usual NER process. Specifically, whereas conventional NER methods label the entities in a given sentence, ReverseNER reverses the process, using an LLM to generate entities from their definitions and then expand them into full sentences. During this expansion, the LLM is guided to generate sentences that replicate the structures of a set of specific "feature sentences", extracted from the task sentences by clustering. The expansion yields dozens of entity-labeled, task-relevant sentences. After constructing the example library, the method selects several semantically similar entity-labeled examples for each task sentence as references to facilitate the LLM's entity recognition. We also propose an entity-level self-consistency scoring mechanism to improve NER performance with LLMs. Experiments show that ReverseNER significantly outperforms other zero-shot NER methods with LLMs, marking a notable improvement in NER for domains without labeled data while reducing computational resource consumption.
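As a rough illustration of the reverse process described above, here is a hedged Python sketch of the example-library construction; `llm(prompt)` is a hypothetical text-generation callable, and the prompts, parsing, and helper names are assumptions rather than the paper's actual interface.

```python
# Hedged sketch of ReverseNER-style example-library construction.
# `llm(prompt) -> str` is a hypothetical LLM completion function.

def build_example_library(entity_definitions, feature_sentences, llm):
    library = []
    for entity_type, definition in entity_definitions.items():
        # Reverse step 1: generate candidate entities from a type definition.
        raw = llm(f"List entities of type '{entity_type}', defined as: "
                  f"{definition}. Answer with a comma-separated list.")
        for entity in [e.strip() for e in raw.split(",") if e.strip()]:
            # Reverse step 2: expand each entity into a full sentence that
            # mimics the structure of a clustered 'feature sentence'.
            for template in feature_sentences:
                sentence = llm(f"Write one sentence mentioning '{entity}' "
                               f"with the same structure as: '{template}'")
                library.append({"sentence": sentence,
                                "entity": entity,
                                "type": entity_type})
    return library
```

At inference time, the library entries most similar to each task sentence (e.g., by embedding similarity) would then be selected as in-context demonstrations.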
Futures Quantitative Investment with Heterogeneous Continual Graph Neural Network
Hu, Min, Tan, Zhizhong, Liu, Bin, Yin, Guosheng
This study addresses the challenges of futures price prediction in high-frequency trading (HFT) by proposing a continual learning factor predictor based on graph neural networks. The model integrates multi-factor pricing theories with real-time market dynamics, overcoming the limitations of existing methods that lack financial theory guidance and ignore trend signals and their interactions. We propose three heterogeneous tasks, price moving-average regression, price gap regression, and change-point detection, to trace the short-, intermediate-, and long-term trend factors present in the data. The study also considers the cross-sectional correlation characteristics of futures contracts, whose prices often show strong dynamic correlations: each variable (futures contract) depends not only on its own historical values (temporal) but also on the observations of other variables (cross-sectional). To capture these dynamic relationships more accurately, we adopt a spatio-temporal graph neural network (STGNN) to enhance the predictive power of the model. The model employs a continual learning strategy to consider these tasks (factors) simultaneously. Because the tasks are heterogeneous, we propose calculating parameter importance as the mutual information between the original observations and the extracted features to mitigate catastrophic forgetting (CF). Empirical tests on 49 commodity futures in China's futures market demonstrate that the proposed model outperforms other state-of-the-art models in prediction accuracy. This research not only promotes the integration of financial theory and deep learning but also provides a scientific basis for actual trading decisions.
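The abstract does not give the exact regularizer, but the described importance-weighted mitigation of catastrophic forgetting can be sketched as a quadratic penalty in PyTorch; here `importance` stands in for the paper's mutual-information-based estimate, whose computation is not reproduced.

```python
# Hedged sketch of importance-weighted continual learning (an EWC-style
# penalty). `old_params` and `importance` map parameter names to tensors saved
# after the previous task; the mutual-information importance estimate itself
# is assumed, not implemented here.
import torch

def continual_loss(model, task_loss, old_params, importance, lam=1.0):
    penalty = torch.zeros((), device=task_loss.device)
    for name, p in model.named_parameters():
        if name in old_params:
            # Parameters deemed important for earlier tasks are anchored to
            # their previous values; unimportant ones remain free to move.
            penalty = penalty + (importance[name]
                                 * (p - old_params[name]).pow(2)).sum()
    return task_loss + lam * penalty
```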
Adaptive loose optimization for robust question answering
Ma, Jie, Wang, Pinghui, Wang, Zewei, Kong, Dechen, Hu, Min, Han, Ting, Liu, Jun
Question answering (QA) methods are well known for exploiting data bias, such as the language prior in visual question answering and the position bias in machine reading comprehension (extractive question answering). Current debiasing methods often sacrifice significant in-distribution performance to achieve favorable out-of-distribution generalizability, while non-debiasing methods sacrifice considerable out-of-distribution performance to obtain high in-distribution performance. It is therefore difficult for either to cope with complicated, changing real-world situations. In this paper, we propose a simple yet effective loss function with adaptive loose optimization, which seeks to make the best of both worlds for question answering. Our main technical contribution is to reduce the loss adaptively according to the ratio between the previous and current optimization states on mini-batch training data. This loose optimization prevents non-debiasing methods from over-learning data bias while allowing debiasing methods to retain slight bias learning. Experiments on visual question answering datasets, including VQA v2, VQA-CP v1, VQA-CP v2, and GQA-OOD, and the extractive question answering dataset SQuAD demonstrate that our approach enables QA methods to obtain state-of-the-art in- and out-of-distribution performance in most cases. The source code has been released publicly at https://github.com/reml-group/ALO.
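To make the core idea concrete, here is a minimal sketch of loss scaling by the ratio of the previous to the current mini-batch loss; the exact weighting rule and clipping are assumptions, and the released code at https://github.com/reml-group/ALO is the authoritative reference.

```python
# Hedged sketch of adaptive loose optimization: scale the mini-batch loss by
# the previous-to-current loss ratio, capped at 1 so the objective is only
# ever loosened, never amplified. The precise rule is an assumption.
import torch

class LooseLoss:
    def __init__(self, base_loss_fn):
        self.base_loss_fn = base_loss_fn
        self.prev_loss = None  # raw loss value from the previous mini-batch

    def __call__(self, logits, targets):
        raw = self.base_loss_fn(logits, targets)
        loss = raw
        if self.prev_loss is not None:
            # Down-weight the batch when its loss exceeds the previous one,
            # i.e. when the ratio prev/current falls below 1.
            ratio = (self.prev_loss / (raw.detach() + 1e-8)).clamp(max=1.0)
            loss = ratio * raw
        self.prev_loss = raw.detach()
        return loss

# Usage: wrap a standard criterion.
criterion = LooseLoss(torch.nn.functional.cross_entropy)
```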
Neural CRF transducers for sequence labeling
Hu, Kai, Ou, Zhijian, Hu, Min, Feng, Junlan
Conditional random fields (CRFs) are among the most successful approaches to sequence labeling. Various linear-chain neural CRFs (NCRFs) have been developed to implement non-linear node potentials in CRFs, while keeping the linear-chain hidden structure. In this paper, we propose NCRF transducers, which consist of two RNNs: one extracting features from the observations and the other capturing (theoretically infinite) long-range dependencies between labels. Different sequence labeling methods are evaluated on POS tagging, chunking, and NER (English and Dutch). Experimental results show that NCRF transducers achieve consistent improvements over linear-chain NCRFs and RNN transducers across all four tasks and can improve state-of-the-art results.
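A minimal PyTorch sketch of the two-RNN transducer architecture described above follows; the dimensions, start-symbol handling, and joint scoring layer are illustrative assumptions rather than the paper's exact configuration.

```python
# Hedged sketch of an NCRF-transducer-style model: one RNN encodes the
# observations, a second RNN consumes the label history, so label dependencies
# are not restricted to the previous tag as in a linear-chain CRF.
import torch
import torch.nn as nn

class NCRFTransducer(nn.Module):
    def __init__(self, input_dim, num_labels, hidden=128):
        super().__init__()
        self.encoder = nn.LSTM(input_dim, hidden, batch_first=True)
        self.label_emb = nn.Embedding(num_labels + 1, hidden)  # +1: start tag
        self.predictor = nn.LSTM(hidden, hidden, batch_first=True)
        self.joint = nn.Linear(2 * hidden, num_labels)

    def forward(self, x, prev_labels):
        # x: (B, T, input_dim); prev_labels: (B, T) gold labels shifted right,
        # with the start-tag index at position 0 (teacher forcing).
        obs, _ = self.encoder(x)                               # observation features
        hist, _ = self.predictor(self.label_emb(prev_labels))  # label history
        return self.joint(torch.cat([obs, hist], dim=-1))      # per-step scores
```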