
Collaborating Authors

 Wu, Caesar


A Unified Hyperparameter Optimization Pipeline for Transformer-Based Time Series Forecasting Models

arXiv.org Artificial Intelligence

Transformer-based models for time series forecasting (TSF) have attracted significant attention in recent years due to their effectiveness and versatility. However, these models often require extensive hyperparameter optimization (HPO) to achieve the best possible performance, and a unified pipeline for HPO in transformer-based TSF remains lacking. In this paper, we present one such pipeline and conduct extensive experiments on several state-of-the-art (SOTA) transformer-based TSF models. These experiments are conducted on standard benchmark datasets to evaluate and compare the performance of different models, generating practical insights and examples. Our pipeline is generalizable beyond transformer-based architectures and can be applied to other SOTA models, such as Mamba and TimeMixer, as demonstrated in our experiments. The goal of this work is to provide valuable guidance to both industry practitioners and academic researchers in efficiently identifying optimal hyperparameters suited to their specific domain applications. The code and complete experimental results are available on GitHub.
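
For readers who want a concrete starting point, the sketch below shows the skeleton of a simple random-search HPO loop for a TSF model; the search space, trial budget, and the placeholder train_and_evaluate function are illustrative assumptions rather than the pipeline released with this paper.

    # Skeleton of a random-search HPO loop for a time series forecasting model.
    # The search space and train_and_evaluate() are hypothetical placeholders;
    # the actual pipeline and configurations are in the paper's GitHub repository.
    import random

    SEARCH_SPACE = {
        "learning_rate": [1e-4, 3e-4, 1e-3],
        "d_model": [64, 128, 256],
        "num_encoder_layers": [1, 2, 3],
        "dropout": [0.0, 0.1, 0.2],
    }

    def train_and_evaluate(config):
        # Placeholder: train the forecaster with `config` and return validation MSE.
        # Replace the dummy score below with real training and evaluation code.
        return random.random()

    def random_search(n_trials=20, seed=0):
        rng = random.Random(seed)
        best_config, best_mse = None, float("inf")
        for _ in range(n_trials):
            config = {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}
            mse = train_and_evaluate(config)
            if mse < best_mse:
                best_config, best_mse = config, mse
        return best_config, best_mse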


Trustworthiness of Stochastic Gradient Descent in Distributed Learning

arXiv.org Artificial Intelligence

Distributed learning (DL) accelerates the training of deep learning models by distributing training tasks across multiple computing nodes [1]. However, as data scales continue to grow, the complexity of model gradients increases accordingly; for example, training deep models on ImageNet [2], which contains over 14 million labeled images spanning approximately 22,000 categories, places severe constraints on communication efficiency [3]. Gradient compression reduces the communication overhead of transmitting gradients between nodes and thereby improves overall system efficiency [4, 5, 6], so it has emerged as an effective optimization technique in distributed learning, especially when training complex models on large-scale data. Among the various gradient compression techniques, PowerSGD [6] and Top-K SGD [7] stand out as prominent solutions for their ability to substantially reduce communication costs while preserving scalability and model accuracy in large-scale distributed learning. These two algorithms are particularly suitable for our study because they represent the two fundamental approaches to gradient compression: PowerSGD uses low-rank approximation, while Top-K SGD relies on sparsification, keeping only the largest-magnitude gradient entries. Both techniques are widely recognized for their practical effectiveness, especially when combined, to varying extents, with features such as error feedback, warm start, and all-reduce, making them ideal compressed-SGD candidates for assessing privacy risks in distributed deep learning systems.
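
To make the two compression styles concrete, the short NumPy sketch below keeps only the k largest-magnitude gradient entries (the idea behind Top-K SGD) and builds a rank-r approximation of a gradient matrix (the idea behind PowerSGD, which in practice uses power iteration rather than the full SVD shown here); error feedback, warm start, and all-reduce are omitted.

    # Illustrative NumPy reduction of the two compression ideas, not the real
    # PowerSGD / Top-K SGD implementations (those add error feedback, warm start,
    # and all-reduce, and PowerSGD uses power iteration instead of a full SVD).
    import numpy as np

    def top_k_compress(grad, k):
        # Keep only the k largest-magnitude entries; everything else becomes zero.
        flat = grad.ravel()
        keep = np.argpartition(np.abs(flat), -k)[-k:]
        sparse = np.zeros_like(flat)
        sparse[keep] = flat[keep]
        return sparse.reshape(grad.shape)

    def low_rank_compress(grad, rank):
        # Approximate a 2-D gradient by the product of two thin factors P and Q.
        u, s, vt = np.linalg.svd(grad, full_matrices=False)
        p = u[:, :rank] * s[:rank]        # left factor, shape (m, rank)
        q = vt[:rank, :]                  # right factor, shape (rank, n)
        return p @ q                      # decompressed rank-`rank` approximation

    grad = np.random.randn(256, 128)
    sparse_grad = top_k_compress(grad, k=1000)
    low_rank_grad = low_rank_compress(grad, rank=4)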


Transformer Multivariate Forecasting: Less is More?

arXiv.org Artificial Intelligence

In the domain of multivariate forecasting, transformer models stand out as powerful tools, displaying exceptional capabilities in handling the messy datasets that arise in real-world contexts. However, the inherent complexity of these datasets, characterized by numerous variables and lengthy temporal sequences, poses challenges, including increased noise and extended model runtime. This paper focuses on reducing redundant information to improve forecasting accuracy while optimizing runtime efficiency. We propose a novel transformer forecasting framework enhanced by Principal Component Analysis (PCA) to tackle this challenge. The framework is evaluated with five state-of-the-art (SOTA) models on four diverse real-world datasets. Our experimental results demonstrate the framework's ability to reduce prediction errors across all models and datasets while significantly reducing runtime. From the model perspective, one of the PCA-enhanced models, PCA+Crossformer, reduces mean squared error (MSE) by 33.3% and runtime by 49.2% on average. From the dataset perspective, the framework delivers a 14.3% MSE and 76.6% runtime reduction on the Electricity dataset, as well as a 4.8% MSE and 86.9% runtime reduction on the Traffic dataset. This study aims to advance various SOTA models and enhance transformer-based time series forecasting for intricate data.
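
A minimal sketch of the dimensionality-reduction step is given below: PCA is fitted on the training split of a multivariate series and the same projection is applied to the test split before either is fed to a forecaster. The 95% retained-variance threshold and the variable names are illustrative assumptions, not the exact configuration used in the paper.

    # Minimal sketch of the PCA preprocessing step: shrink the number of variables
    # in a multivariate series before windowing it for a transformer forecaster.
    # The 0.95 retained-variance threshold is an illustrative assumption.
    import numpy as np
    from sklearn.decomposition import PCA

    def pca_reduce(train_series, test_series, variance_to_keep=0.95):
        # Both inputs have shape (time_steps, num_variables).
        pca = PCA(n_components=variance_to_keep)           # keep ~95% of the variance
        train_reduced = pca.fit_transform(train_series)    # fit on the training split only
        test_reduced = pca.transform(test_series)          # reuse the fitted projection
        return train_reduced, test_reduced, pca

    # The reduced series (fewer columns) are then windowed and passed to a model
    # such as Crossformer in place of the raw variables.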


Trustworthy AI: Deciding What to Decide

arXiv.org Artificial Intelligence

When engaging in strategic decision-making, we are frequently confronted with overwhelming information and data. The situation can be further complicated when certain pieces of evidence contradict each other or become paradoxical. The primary challenge is to determine which information can be trusted when we adopt Artificial Intelligence (AI) systems for decision-making. This issue is known as deciding what to decide, or Trustworthy AI. However, the AI system itself is often considered an opaque black box. We propose a new approach to address this issue by introducing a novel framework of Trustworthy AI (TAI) encompassing three crucial components of AI: representation space, loss function, and optimizer. Each component is loosely coupled with four TAI properties, so the framework comprises twelve TAI properties in total. We aim to use this framework to conduct TAI experiments with quantitative and qualitative research methods that satisfy the TAI properties in a decision-making context. The framework allows us to formulate an optimal prediction model, trained on a given dataset, for strategic investment decisions on credit default swaps (CDS) in the technology sector. Finally, we provide our view of the future direction of TAI research.


Survey of Trustworthy AI: A Meta Decision of AI

arXiv.org Artificial Intelligence

When making strategic decisions, we are often confronted with overwhelming information to process. The situation can be further complicated when some pieces of evidence contradict each other or are paradoxical. The challenge then becomes determining which information is useful and which should be eliminated. This process is known as meta-decision. Likewise, when it comes to using Artificial Intelligence (AI) systems for strategic decision-making, placing trust in the AI itself becomes a meta-decision, given that many AI systems are viewed as opaque "black boxes" that process large amounts of data. Trusting an opaque system involves deciding on the level of Trustworthy AI (TAI). We propose a new approach to address this issue by introducing a novel taxonomy, or framework, of TAI that encompasses three crucial domains corresponding to different levels of trust: articulate, authentic, and basic. To underpin these domains, we define ten dimensions for measuring trust: explainability/transparency, fairness/diversity, generalizability, privacy, data governance, safety/robustness, accountability, reproducibility, reliability, and sustainability. We aim to use this taxonomy to conduct a comprehensive survey and explore different TAI approaches from a strategic decision-making perspective.
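
For reference, the taxonomy's structure can be written out as plain data, as in the sketch below; which dimensions underpin which domain is detailed in the survey itself and is deliberately left unspecified here.

    # The survey's taxonomy written out as plain data: three trust domains and the
    # ten dimensions used to measure trust. The domain-to-dimension mapping is
    # covered in the paper and intentionally not asserted here.
    TAI_DOMAINS = ("articulate", "authentic", "basic")

    TAI_DIMENSIONS = (
        "explainability/transparency",
        "fairness/diversity",
        "generalizability",
        "privacy",
        "data governance",
        "safety/robustness",
        "accountability",
        "reproducibility",
        "reliability",
        "sustainability",
    )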


The Emerging Artificial Intelligence Protocol for Hierarchical Information Network

arXiv.org Artificial Intelligence

Recent developments in artificial intelligence enable machines to achieve a human level of intelligence. Problem-solving and decision-making are two mental abilities used to measure human intelligence. Many scholars have proposed different models; however, there is a gap in establishing an AI-oriented hierarchical model with multiple levels of abstraction. This study proposes a novel model, the emerging AI protocol, which consists of seven distinct layers capable of providing an optimal and explainable solution for a given problem.