AITopics | Wang, Guanghui

Collaborating Authors

Wang, Guanghui

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Atten-Transformer: A Deep Learning Framework for User App Usage Prediction

Li, Longlong, Qu, Cunquan, Wang, Guanghui

arXiv.org Artificial IntelligenceFeb-24-2025

Accurately predicting smartphone app usage patterns is crucial for user experience optimization and targeted marketing. However, existing methods struggle to capture intricate dependencies in user behavior, particularly in sparse or complex usage scenarios. To address these challenges, we introduce Atten-Transformer, a novel model that integrates temporal attention with a Transformer network to dynamically identify and leverage key app usage patterns. Unlike conventional methods that primarily consider app order and duration, our approach employs a multi-dimensional feature representation, incorporating both feature encoding and temporal encoding to enhance predictive accuracy. The proposed attention mechanism effectively assigns importance to critical app usage moments, improving both model interpretability and generalization. Extensive experiments on multiple smartphone usage datasets, including LSapp and Tsinghua App Usage datasets, demonstrate that Atten-Transformer consistently outperforms state-of-the-art models across different data splits. Specifically, our model achieves a 45.24\% improvement in HR@1 on the Tsinghua dataset (Time-based Split) and a 18.25\% improvement in HR@1 on the LSapp dataset (Cold Start Split), showcasing its robustness across diverse app usage scenarios. These findings highlight the potential of integrating adaptive attention mechanisms in mobile usage forecasting, paving the way for enhanced user engagement and resource allocation.

data mining, machine learning, prediction, (18 more...)

arXiv.org Artificial Intelligence

2502.16957

Country: North America > United States (0.14)

Genre:

Research Report > Promising Solution (0.68)
Research Report > New Finding (0.67)

Industry:

Telecommunications (0.88)
Information Technology (0.66)

Technology:

Information Technology > Communications > Mobile (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Multi-Agent Performative Prediction Beyond the Insensitivity Assumption: A Case Study for Mortgage Competition

Wang, Guanghui, Acharya, Krishna, Lakshmikanthan, Lokranjan, Muthukumar, Vidya, Ziani, Juba

arXiv.org Artificial IntelligenceFeb-11-2025

Performative prediction models account for feedback loops in decision-making processes where predictions influence future data distributions. While existing work largely assumes insensitivity of data distributions to small strategy changes, this assumption usually fails in real-world competitive (i.e. multi-agent) settings. For example, in Bertrand-type competitions, a small reduction in one firm's price can lead that firm to capture the entire demand, while all others sharply lose all of their customers. We study a representative setting of multi-agent performative prediction in which insensitivity assumptions do not hold, and investigate the convergence of natural dynamics. To do so, we focus on a specific game that we call the ''Bank Game'', where two lenders compete over interest rates and credit score thresholds. Consumers act similarly as to in a Bertrand Competition, with each consumer selecting the firm with the lowest interest rate that they are eligible for based on the firms' credit thresholds. Our analysis characterizes the equilibria of this game and demonstrates that when both firms use a common and natural no-regret learning dynamic -- exponential weights -- with proper initialization, the dynamics always converge to stable outcomes despite the general-sum structure. Notably, our setting admits multiple stable equilibria, with convergence dependent on initial conditions. We also provide theoretical convergence results in the stochastic case when the utility matrix is not fully known, but each learner can observe sufficiently many samples of consumers at each time step to estimate it, showing robustness to slight mis-specifications. Finally, we provide experimental results that validate our theoretical findings.

artificial intelligence, converge, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2502.08063

Genre: Research Report > New Finding (0.45)

Industry:

Leisure & Entertainment > Games (0.47)
Banking & Finance > Credit (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

A Survey on Mamba Architecture for Vision Applications

Ibrahim, Fady, Liu, Guangjun, Wang, Guanghui

arXiv.org Artificial IntelligenceFeb-10-2025

Transformers have become foundational for visual tasks such as object detection, semantic segmentation, and video understanding, but their quadratic complexity in attention mechanisms presents scalability challenges. To address these limitations, the Mamba architecture utilizes state-space models (SSMs) for linear scalability, efficient processing, and improved contextual awareness. This paper investigates Mamba architecture for visual domain applications and its recent advancements, including Vision Mamba (ViM) and VideoMamba, which introduce bidirectional scanning, selective scanning mechanisms, and spatiotemporal processing to enhance image and video understanding. Architectural innovations like position embeddings, cross-scan modules, and hierarchical designs further optimize the Mamba framework for global and local feature extraction. These advancements position Mamba as a promising architecture in computer vision research and applications.

artificial intelligence, arxiv preprint arxiv, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2502.07161

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)

Genre:

Research Report (1.00)
Overview (1.00)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Vision > Image Understanding (1.00)
Information Technology > Artificial Intelligence > Systems & Languages > Problem-Specific Architectures (0.76)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.68)

Add feedback

Federated Class-Incremental Learning: A Hybrid Approach Using Latent Exemplars and Data-Free Techniques to Address Local and Global Forgetting

Nori, Milad Khademi, Kim, Il-Min, Wang, Guanghui

arXiv.org Artificial IntelligenceJan-30-2025

Federated Class-Incremental Learning (FCIL) refers to a scenario where a dynamically changing number of clients collaboratively learn an ever-increasing number of incoming tasks. FCIL is known to suffer from local forgetting due to class imbalance at each client and global forgetting due to class imbalance across clients. We develop a mathematical framework for FCIL that formulates local and global forgetting. Then, we propose an approach called Hybrid Rehearsal (HR), which utilizes latent exemplars and data-free techniques to address local and global forgetting, respectively. HR employs a customized autoencoder designed for both data classification and the generation of synthetic data. To determine the embeddings of new tasks for all clients in the latent space of the encoder, the server uses the Lennard-Jones Potential formulations. Meanwhile, at the clients, the decoder decodes the stored low-dimensional latent space exemplars back to the high-dimensional input space, used to address local forgetting. To overcome global forgetting, the decoder generates synthetic data. Furthermore, our mathematical framework proves that our proposed approach HR can, in principle, tackle the two local and global forgetting challenges. In practice, extensive experiments demonstrate that while preserving privacy, our proposed approach outperforms the state-of-the-art baselines on multiple FCIL benchmarks with low compute and memory footprints.

artificial intelligence, learning, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2501.15356

Country: North America > Canada > Ontario > Kingston (0.14)

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.89)

Add feedback

FgC2F-UDiff: Frequency-guided and Coarse-to-fine Unified Diffusion Model for Multi-modality Missing MRI Synthesis

Xiao, Xiaojiao, Hu, Qinmin Vivian, Wang, Guanghui

arXiv.org Artificial IntelligenceJan-6-2025

Multi-modality magnetic resonance imaging (MRI) is essential for the diagnosis and treatment of brain tumors. However, missing modalities are commonly observed due to limitations in scan time, scan corruption, artifacts, motion, and contrast agent intolerance. Synthesis of missing MRI has been a means to address the limitations of modality insufficiency in clinical practice and research. However, there are still some challenges, such as poor generalization, inaccurate non-linear mapping, and slow processing speeds. To address the aforementioned issues, we propose a novel unified synthesis model, the Frequency-guided and Coarse-to-fine Unified Diffusion Model (FgC2F-UDiff), designed for multiple inputs and outputs. Specifically, the Coarse-to-fine Unified Network (CUN) fully exploits the iterative denoising properties of diffusion models, from global to detail, by dividing the denoising process into two stages, coarse and fine, to enhance the fidelity of synthesized images. Secondly, the Frequency-guided Collaborative Strategy (FCS) harnesses appropriate frequency information as prior knowledge to guide the learning of a unified, highly non-linear mapping. Thirdly, the Specific-acceleration Hybrid Mechanism (SHM) integrates specific mechanisms to accelerate the diffusion model and enhance the feasibility of many-to-many synthesis. Extensive experimental evaluations have demonstrated that our proposed FgC2F-UDiff model achieves superior performance on two datasets, validated through a comprehensive assessment that includes both qualitative observations and quantitative metrics, such as PSNR SSIM, LPIPS, and FID.

artificial intelligence, deep learning, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2501.03526

Country:

Europe (1.00)
Asia (0.67)
North America > United States (0.28)

Genre: Research Report > New Finding (0.68)

Industry:

Information Technology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

KA-GNN: Kolmogorov-Arnold Graph Neural Networks for Molecular Property Prediction

Li, Longlong, Zhang, Yipeng, Wang, Guanghui, Xia, Kelin

arXiv.org Artificial IntelligenceDec-18-2024

As key models in geometric deep learning, graph neural networks have demonstrated enormous power in molecular data analysis. Recently, a specially-designed learning scheme, known as Kolmogorov-Arnold Network (KAN), shows unique potential for the improvement of model accuracy, efficiency, and explainability. Here we propose the first non-trivial Kolmogorov-Arnold Network-based Graph Neural Networks (KA-GNNs), including KAN-based graph convolutional networks(KA-GCN) and KAN-based graph attention network (KA-GAT). The essential idea is to utilizes KAN's unique power to optimize GNN architectures at three major levels, including node embedding, message passing, and readout. Further, with the strong approximation capability of Fourier series, we develop Fourier series-based KAN model and provide a rigorous mathematical prove of the robust approximation capability of this Fourier KAN architecture. To validate our KA-GNNs, we consider seven most-widely-used benchmark datasets for molecular property prediction and extensively compare with existing state-of-the-art models. It has been found that our KA-GNNs can outperform traditional GNN models. More importantly, our Fourier KAN module can not only increase the model accuracy but also reduce the computational time. This work not only highlights the great power of KA-GNNs in molecular property prediction but also provides a novel geometric deep learning framework for the general non-Euclidean data analysis.

artificial intelligence, deep learning, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2410.11323

Country: North America > United States (0.46)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.70)
Government > Regional Government > North America Government > United States Government > FDA (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Universal Online Convex Optimization Meets Second-order Bounds

Zhang, Lijun, Wang, Yibo, Wang, Guanghui, Yi, Jinfeng, Yang, Tianbao

arXiv.org Artificial IntelligenceNov-20-2024

Recently, several universal methods have been proposed for online convex optimization, and attain minimax rates for multiple types of convex functions simultaneously. However, they need to design and optimize one surrogate loss for each type of functions, making it difficult to exploit the structure of the problem and utilize existing algorithms. In this paper, we propose a simple strategy for universal online convex optimization, which avoids these limitations. The key idea is to construct a set of experts to process the original online functions, and deploy a meta-algorithm over the linearized losses to aggregate predictions from experts. Specifically, the meta-algorithm is required to yield a second-order bound with excess losses, so that it can leverage strong convexity and exponential concavity to control the meta-regret. In this way, our strategy inherits the theoretical guarantee of any expert designed for strongly convex functions and exponentially concave functions, up to a double logarithmic factor. As a result, we can plug in off-the-shelf online solvers as black-box experts to deliver problem-dependent regret bounds. For general convex functions, it maintains the minimax optimality and also achieves a small-loss bound. Furthermore, we extend our universal strategy to online composite optimization, where the loss function comprises a time-varying function and a fixed regularizer. To deal with the composite loss functions, we employ a meta-algorithm based on the optimistic online learning framework, which not only possesses a second-order bound, but also can utilize estimations for upcoming loss functions. With appropriate configurations, we demonstrate that the additional regularizer does not contribute to the meta-regret, thus maintaining the universality in the composite setting.

artificial intelligence, convex function, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2105.03681

Country:

Asia (0.28)
North America > United States > Texas (0.14)

Genre: Research Report (0.40)

Industry: Education > Educational Setting > Online (0.49)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.54)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.49)

Add feedback

Robust 3D Point Clouds Classification based on Declarative Defenders

Li, Kaidong, Zhang, Tianxiao, Zhong, Cuncong, Zhang, Ziming, Wang, Guanghui

arXiv.org Artificial IntelligenceOct-18-2024

3D point cloud classification requires distinct models from 2D image classification due to the divergent characteristics of the respective input data. While 3D point clouds are unstructured and sparse, 2D images are structured and dense. Bridging the domain gap between these two data types is a non-trivial challenge to enable model interchangeability. Recent research using Lattice Point Classifier (LPC) highlights the feasibility of cross-domain applicability. However, the lattice projection operation in LPC generates 2D images with disconnected projected pixels. In this paper, we explore three distinct algorithms for mapping 3D point clouds into 2D images. Through extensive experiments, we thoroughly examine and analyze their performance and defense mechanisms. Leveraging current large foundation models, we scrutinize the feature disparities between regular 2D images and projected 2D images. The proposed approaches demonstrate superior accuracy and robustness against adversarial attacks. The generative model-based mapping algorithms yield regular 2D images, further minimizing the domain gap from regular 2D classification tasks. The source code is available at https://github.com/KaidongLi/pytorch-LatticePointClassifier.git.

artificial intelligence, machine learning, natural language, (15 more...)

arXiv.org Artificial Intelligence

2410.09691

Country: North America (0.46)

Genre: Research Report (0.50)

Industry: Information Technology > Security & Privacy (0.49)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Add feedback

Keypoint Detection Technique for Image-Based Visual Servoing of Manipulators

Amiri, Niloufar, Wang, Guanghui, Janabi-Sharifi, Farrokh

arXiv.org Artificial IntelligenceSep-20-2024

This paper introduces an innovative keypoint detection technique based on Convolutional Neural Networks (CNNs) to enhance the performance of existing Deep Visual Servoing (DVS) models. To validate the convergence of the Image-Based Visual Servoing (IBVS) algorithm, real-world experiments utilizing fiducial markers for feature detection are conducted before designing the CNN-based feature detector. To address the limitations of fiducial markers, the novel feature detector focuses on extracting keypoints that represent the corners of a more realistic object compared to fiducial markers. A dataset is generated from sample data captured by the camera mounted on the robot end-effector while the robot operates randomly in the task space. The samples are automatically labeled, and the dataset size is increased by flipping and rotation. The CNN model is developed by modifying the VGG-19 pre-trained on the ImageNet dataset. While the weights in the base model remain fixed, the fully connected layer's weights are updated to minimize the mean absolute error, defined based on the deviation of predictions from the real pixel coordinates of the corners. The model undergoes two modifications: replacing max-pooling with average-pooling in the base model and implementing an adaptive learning rate that decreases during epochs. These changes lead to a 50 percent reduction in validation loss. Finally, the trained model's reliability is assessed through k-fold cross-validation.

artificial intelligence, detection, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2409.13668

Country:

Europe (0.46)
North America > Canada > Ontario > Toronto (0.15)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

Image-to-Joint Inverse Kinematic of a Supportive Continuum Arm Using Deep Learning

Sepahvand, Shayan, Wang, Guanghui, Janabi-Sharifi, Farrokh

arXiv.org Artificial IntelligenceJun-26-2024

In this work, a deep learning-based technique is used to study the image-to-joint inverse kinematics of a tendon-driven supportive continuum arm. An eye-off-hand configuration is considered by mounting a camera at a fixed pose with respect to the inertial frame attached at the arm base. This camera captures an image for each distinct joint variable at each sampling time to construct the training dataset. This dataset is then employed to adapt a feed-forward deep convolutional neural network, namely the modified VGG-16 model, to estimate the joint variable. One thousand images are recorded to train the deep network, and transfer learning and fine-tuning techniques are applied to the modified VGG-16 to further improve the training. Finally, training is also completed with a larger dataset of images that are affected by various types of noises, changes in illumination, and partial occlusion. The main contribution of this research is the development of an image-to-joint network that can estimate the joint variable given an image of the arm, even if the image is not captured in an ideal condition. The key benefits of this research are twofold: 1) image-to-joint mapping can offer a real-time alternative to computationally complex inverse kinematic mapping through analytical models; and 2) the proposed technique can provide robustness against noise, occlusion, and changes in illumination. The dataset is publicly available on Kaggle.

artificial intelligence, continuum robot, machine learning, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.21428/d82e957c.d8706a7c

2405.20248

Country: North America > Canada > Ontario > Toronto (0.15)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback