AITopics | Wang, Hao

Collaborating Authors

Wang, Hao

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Towards Convexity in Anomaly Detection: A New Formulation of SSLM with Unique Optimal Solutions

Liu, Hongying, Wang, Hao, Chu, Haoran, Wu, Yibo

arXiv.org Artificial IntelligenceOct-31-2024

An unsolved issue in widely used methods such as Support Vector Data Description (SVDD) and Small Sphere and Large Margin SVM (SSLM) for anomaly detection is their nonconvexity, which hampers the analysis of optimal solutions in a manner similar to SVMs and limits their applicability in large-scale scenarios. In this paper, we introduce a novel convex SSLM formulation which has been demonstrated to revert to a convex quadratic programming problem for hyperparameter values of interest. Leveraging the convexity of our method, we derive numerous results that are unattainable with traditional nonconvex approaches. We conduct a thorough analysis of how hyperparameters influence the optimal solution, pointing out scenarios where optimal solutions can be trivially found and identifying instances of ill-posedness. Most notably, we establish connections between our method and traditional approaches, providing a clear determination of when the optimal solution is unique -- a task unachievable with traditional nonconvex methods. We also derive the {\nu}-property to elucidate the interactions between hyperparameters and the fractions of support vectors and margin errors in both positive and negative classes.

artificial intelligence, data mining, machine learning, (21 more...)

arXiv.org Artificial Intelligence

2410.23774

Country:

Asia > China (0.28)
North America > United States (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.70)

Add feedback

Variational Language Concepts for Interpreting Foundation Language Models

Wang, Hengyi, Tan, Shiwei, Hong, Zhiqing, Zhang, Desheng, Wang, Hao

arXiv.org Machine LearningOct-28-2024

Foundation Language Models (FLMs) such as BERT and its variants have achieved remarkable success in natural language processing. To date, the interpretability of FLMs has primarily relied on the attention weights in their self-attention layers. However, these attention weights only provide word-level interpretations, failing to capture higher-level structures, and are therefore lacking in readability and intuitiveness. To address this challenge, we first provide a formal definition of conceptual interpretation and then propose a variational Bayesian framework, dubbed VAriational Language Concept (VALC), to go beyond word-level interpretations and provide concept-level interpretations. Our theoretical analysis shows that our VALC finds the optimal language concepts to interpret FLM predictions. Empirical results on several real-world datasets show that our method can successfully provide conceptual interpretation for FLMs.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Machine Learning

2410.03964

Country:

Europe (1.00)
Asia (1.00)
North America > United States (0.93)

Genre: Research Report (1.00)

Industry:

Government > Military (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.67)
Health & Medicine > Therapeutic Area (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)

Add feedback

Privacy without Noisy Gradients: Slicing Mechanism for Generative Model Training

Greenewald, Kristjan, Yu, Yuancheng, Wang, Hao, Xu, Kai

arXiv.org Artificial IntelligenceOct-25-2024

Training generative models with differential privacy (DP) typically involves injecting noise into gradient updates or adapting the discriminator's training procedure. As a result, such approaches often struggle with hyper-parameter tuning and convergence. We consider the slicing privacy mechanism that injects noise into random low-dimensional projections of the private data, and provide strong privacy guarantees for it. These noisy projections are used for training generative models. To enable optimizing generative models using this DP approach, we introduce the smoothed-sliced $f$-divergence and show it enjoys statistical consistency. Moreover, we present a kernel-based estimator for this divergence, circumventing the need for adversarial training. Extensive numerical experiments demonstrate that our approach can generate synthetic data of higher quality compared with baselines. Beyond performance improvement, our method, by sidestepping the need for noisy gradients, offers data scientists the flexibility to adjust generator architecture and hyper-parameters, run the optimization over any number of epochs, and even restart the optimization process -- all without incurring additional privacy costs.

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2410.19941

Country: North America > United States (0.68)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Generation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.48)

Add feedback

Synthetic Data Generation for Residential Load Patterns via Recurrent GAN and Ensemble Method

Liang, Xinyu, Wang, Ziheng, Wang, Hao

arXiv.org Artificial IntelligenceOct-20-2024

Generating synthetic residential load data that can accurately represent actual electricity consumption patterns is crucial for effective power system planning and operation. The necessity for synthetic data is underscored by the inherent challenges associated with using real-world load data, such as privacy considerations and logistical complexities in large-scale data collection. In this work, we tackle the above-mentioned challenges by developing the Ensemble Recurrent Generative Adversarial Network (ERGAN) framework to generate high-fidelity synthetic residential load data. ERGAN leverages an ensemble of recurrent Generative Adversarial Networks, augmented by a loss function that concurrently takes into account adversarial loss and differences between statistical properties. Our developed ERGAN can capture diverse load patterns across various households, thereby enhancing the realism and diversity of the synthetic data generated. Comprehensive evaluations demonstrate that our method consistently outperforms established benchmarks in the synthetic generation of residential load data across various performance metrics including diversity, similarity, and statistical measures. The findings confirm the potential of ERGAN as an effective tool for energy applications requiring synthetic yet realistic load data. We also make the generated synthetic residential load patterns publicly available.

artificial intelligence, load pattern, machine learning, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/TIM.2024.3480225

2410.15379

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.88)

Industry: Energy > Power Industry (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

A Prompt Refinement-based Large Language Model for Metro Passenger Flow Forecasting under Delay Conditions

Huang, Ping, He, Yuxin, Wang, Hao, Chen, Jingjing, Luo, Qin

arXiv.org Artificial IntelligenceOct-19-2024

Accurate short-term forecasts of passenger flow in metro systems under delay conditions are crucial for emergency response and service recovery, which pose significant challenges and are currently under-researched. Due to the rare occurrence of delay events, the limited sample size under delay condictions make it difficult for conventional models to effectively capture the complex impacts of delays on passenger flow, resulting in low forecasting accuracy. Recognizing the strengths of large language models (LLMs) in few-shot learning due to their powerful pre-training, contextual understanding, ability to perform zero-shot and few-shot reasoning, to address the issues that effectively generalize and adapt with minimal data, we propose a passenger flow forecasting framework under delay conditions that synthesizes an LLM with carefully designed prompt engineering. By Refining prompt design, we enable the LLM to understand delay event information and the pattern from historical passenger flow data, thus overcoming the challenges of passenger flow forecasting under delay conditions. The propmpt engineering in the framework consists of two main stages: systematic prompt generation and prompt refinement. In the prompt generation stage, multi-source data is transformed into descriptive texts understandable by the LLM and stored. In the prompt refinement stage, we employ the multidimensional Chain of Thought (CoT) method to refine the prompts. We verify the proposed framework by conducting experiments using real-world datasets specifically targeting passenger flow forecasting under delay conditions of Shenzhen metro in China. The experimental results demonstrate that the proposed model performs particularly well in forecasting passenger flow under delay conditions.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2410.15111

Country: Asia > China > Guangdong Province > Shenzhen (0.27)

Genre: Research Report > New Finding (1.00)

Industry:

Transportation > Passenger (1.00)
Transportation > Ground > Rail (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Anderson Acceleration in Nonsmooth Problems: Local Convergence via Active Manifold Identification

Li, Kexin, Bai, Luwei, Wang, Xiao, Wang, Hao

arXiv.org Artificial IntelligenceOct-15-2024

Anderson acceleration is an effective technique for enhancing the efficiency of fixed-point iterations; however, analyzing its convergence in nonsmooth settings presents significant challenges. In this paper, we investigate a class of nonsmooth optimization algorithms characterized by the active manifold identification property. This class includes a diverse array of methods such as the proximal point method, proximal gradient method, proximal linear method, proximal coordinate descent method, Douglas-Rachford splitting (or the alternating direction method of multipliers), and the iteratively reweighted $\ell_1$ method, among others. Under the assumption that the optimization problem possesses an active manifold at a stationary point, we establish a local R-linear convergence rate for the Anderson-accelerated algorithm. Our extensive numerical experiments further highlight the robust performance of the proposed Anderson-accelerated methods.

algorithm, artificial intelligence, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2410.0942

Country: Asia > China (0.28)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.88)

Add feedback

On Calibration of LLM-based Guard Models for Reliable Content Moderation

Liu, Hongfu, Huang, Hengguan, Wang, Hao, Gu, Xiangming, Wang, Ye

arXiv.org Artificial IntelligenceOct-14-2024

Large language models (LLMs) pose significant risks due to the potential for generating harmful content or users attempting to evade guardrails. Existing studies have developed LLM-based guard models designed to moderate the input and output of threat LLMs, ensuring adherence to safety policies by blocking content that violates these protocols upon deployment. However, limited attention has been given to the reliability and calibration of such guard models. In this work, we empirically conduct comprehensive investigations of confidence calibration for 9 existing LLM-based guard models on 12 benchmarks in both user input and model output classification. Our findings reveal that current LLM-based guard models tend to 1) produce overconfident predictions, 2) exhibit significant miscalibration when subjected to jailbreak attacks, and 3) demonstrate limited robustness to the outputs generated by different types of response models. Additionally, we assess the effectiveness of post-hoc calibration methods to mitigate miscalibration. We demonstrate the efficacy of temperature scaling and, for the first time, highlight the benefits of contextual calibration for confidence calibration of guard models, particularly in the absence of validation sets. Our analysis and experiments underscore the limitations of current LLM-based guard models and provide valuable insights for the future development of well-calibrated guard models toward more reliable content moderation. We also advocate for incorporating reliability evaluation of confidence calibration when releasing future LLM-based guard models.

classification, large language model, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2410.10414

Genre: Research Report > New Finding (1.00)

Industry:

Law (0.93)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.46)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Meta-Transfer Learning Empowered Temporal Graph Networks for Cross-City Real Estate Appraisal

Zhang, Weijia, Han, Jindong, Liu, Hao, Fan, Wei, Wang, Hao, Xiong, Hui

arXiv.org Artificial IntelligenceOct-11-2024

Real estate appraisal is important for a variety of endeavors such as real estate deals, investment analysis, and real property taxation. Recently, deep learning has shown great promise for real estate appraisal by harnessing substantial online transaction data from web platforms. Nonetheless, deep learning is data-hungry, and thus it may not be trivially applicable to enormous small cities with limited data. To this end, we propose Meta-Transfer Learning Empowered Temporal Graph Networks (MetaTransfer) to transfer valuable knowledge from multiple data-rich metropolises to the data-scarce city to improve valuation performance. Specifically, by modeling the ever-growing real estate transactions with associated residential communities as a temporal event heterogeneous graph, we first design an Event-Triggered Temporal Graph Network to model the irregular spatiotemporal correlations between evolving real estate transactions. Besides, we formulate the city-wide real estate appraisal as a multi-task dynamic graph link label prediction problem, where the valuation of each community in a city is regarded as an individual task. A Hypernetwork-Based Multi-Task Learning module is proposed to simultaneously facilitate intra-city knowledge sharing between multiple communities and task-specific parameters generation to accommodate the community-wise real estate price distribution. Furthermore, we propose a Tri-Level Optimization Based Meta- Learning framework to adaptively re-weight training transaction instances from multiple source cities to mitigate negative transfer, and thus improve the cross-city knowledge transfer effectiveness. Finally, extensive experiments based on five real-world datasets demonstrate the significant superiority of MetaTransfer compared with eleven baseline algorithms.

artificial intelligence, deep learning, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2410.08947

Country:

Asia > China (0.95)
North America > United States (0.69)

Genre:

Research Report (0.50)
Instructional Material > Course Syllabus & Notes (0.34)

Industry: Banking & Finance > Real Estate (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

MDAP: A Multi-view Disentangled and Adaptive Preference Learning Framework for Cross-Domain Recommendation

Tong, Junxiong, Yin, Mingjia, Wang, Hao, Pan, Qiushi, Lian, Defu, Chen, Enhong

arXiv.org Artificial IntelligenceOct-8-2024

Cross-domain Recommendation (CDR) systems leverage multi-domain user interactions to improve performance, especially in sparse data or new user scenarios. However, CDR faces challenges such as effectively capturing user preferences and avoiding negative transfer. To address these issues, we propose the Multi-view Disentangled and Adaptive Preference Learning (MDAP) framework. Our MDAP framework uses a multiview encoder to capture diverse user preferences. The framework includes a gated decoder that adaptively combines embeddings from different views to generate a comprehensive user representation. By disentangling representations and allowing adaptive feature selection, our model enhances recommendations' adaptability and effectiveness. Extensive experiments on benchmark datasets demonstrate that our method significantly outperforms state-of-the-art CDR and single-domain models, providing more accurate recommendations and deeper insights into user behavior across different domains.

artificial intelligence, machine learning, recommendation, (17 more...)

arXiv.org Artificial Intelligence

2410.05877

Country: Asia > China > Anhui Province (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Loki: An Open-Source Tool for Fact Verification

Li, Haonan, Han, Xudong, Wang, Hao, Wang, Yuxia, Wang, Minghan, Xing, Rui, Geng, Yilin, Zhai, Zenan, Nakov, Preslav, Baldwin, Timothy

arXiv.org Artificial IntelligenceOct-2-2024

We introduce Loki, an open-source tool designed to address the growing problem of misinformation. Loki adopts a human-centered approach, striking a balance between the quality of fact-checking and the cost of human involvement. It decomposes the fact-checking task into a five-step pipeline: breaking down long texts into individual claims, assessing their check-worthiness, generating queries, retrieving evidence, and verifying the claims. Instead of fully automating the claim verification process, Loki provides essential information at each step to assist human judgment, especially for general users such as journalists and content moderators. Moreover, it has been optimized for latency, robustness, and cost efficiency at a commercially usable level. Loki is released under an MIT license and is available on GitHub. We also provide a video presenting the system and its capabilities.

large language model, machine learning, oki, (22 more...)

arXiv.org Artificial Intelligence

2410.01794

Country: Asia > Thailand (0.14)

Genre: Research Report (0.50)

Industry: Media > News (0.55)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Human Computer Interaction (0.95)
(2 more...)

Add feedback