AITopics | Li, Yi

Collaborating Authors

Li, Yi

Revisiting Spurious Correlation in Domain Generalization

Qin, Bin, Li, Jiangmeng, Li, Yi, Wu, Xuesong, Wang, Yupeng, Qiang, Wenwen, Cao, Jianwen

arXiv.org Artificial Intelligence, Jun-17-2024

Without loss of generality, existing machine learning techniques may learn spurious correlations that depend on the domain, which degrades the generalization of models in out-of-distribution (OOD) scenarios. To address this issue, recent works build a structural causal model (SCM) to describe the causality within the data generation process, thereby motivating methods that keep models from learning spurious correlations. However, from the machine learning viewpoint, such a theoretical analysis omits the nuanced difference between the data generation process and the representation learning process, so the causal analysis based on the former does not adapt well to the latter. To this end, we build an SCM for the representation learning process and conduct a thorough analysis of the mechanisms underlying spurious correlation. We underscore that adjusting for erroneous covariates introduces bias, thus necessitating the correct selection of spurious correlation mechanisms based on practical application scenarios. In this regard, we substantiate the correctness of the proposed SCM and further propose to control confounding bias in OOD generalization by introducing a propensity score weighted estimator, which can be integrated into any existing OOD method as a plug-and-play module. The empirical results comprehensively demonstrate the effectiveness of our method on synthetic and large-scale real OOD datasets.
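The abstract does not spell out its estimator, so the following is only a minimal sketch of the textbook inverse propensity weighting idea that propensity score weighted estimators build on. It is not the authors' method. The function names `ipw_ate` and `naive_difference`, the `clip` parameter, and the toy data are all hypothetical, chosen for illustration.

```python
def ipw_ate(outcomes, treated, propensity, clip=1e-6):
    """Horvitz-Thompson style inverse-propensity-weighted effect estimate.

    outcomes:   observed outcome per unit
    treated:    1 if the unit received the treatment, else 0
    propensity: estimated probability of treatment given the covariates
    clip:       hypothetical guard that keeps the weights finite
    """
    n = len(outcomes)
    total_treated = 0.0
    total_control = 0.0
    for y, t, e in zip(outcomes, treated, propensity):
        e = min(max(e, clip), 1.0 - clip)
        if t:
            total_treated += y / e          # weight treated units by 1/e
        else:
            total_control += y / (1.0 - e)  # weight controls by 1/(1-e)
    return total_treated / n - total_control / n


def naive_difference(outcomes, treated):
    """Unweighted difference in group means, biased under confounding."""
    y1 = [y for y, t in zip(outcomes, treated) if t]
    y0 = [y for y, t in zip(outcomes, treated) if not t]
    return sum(y1) / len(y1) - sum(y0) / len(y0)


# Toy data with two strata; the true effect is 2 in both strata.
outcomes = [3, 3, 3, 3, 1, 2, 0, 0, 0, 0]
treated = [1, 1, 1, 1, 0, 1, 0, 0, 0, 0]
propensity = [0.8] * 5 + [0.2] * 5

naive = naive_difference(outcomes, treated)
weighted = ipw_ate(outcomes, treated, propensity)
```

On this toy data the naive difference is 2.6, while the weighted estimate recovers the stratum-level effect of 2.0, because the stratum that is treated more often also has higher baseline outcomes.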

artificial intelligence, correlation, machine learning

arXiv.org Artificial Intelligence

2406.11517

Country:

Europe > France (0.14)
Asia > China (0.14)

Genre: Research Report > Experimental Study (0.46)

Industry: Health & Medicine (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)


Continuous-Time Digital Twin with Analogue Memristive Neural Ordinary Differential Equation Solver

Chen, Hegan, Yang, Jichang, Chen, Jia, Wang, Songqi, Wang, Shaocong, Wang, Dingchen, Tian, Xinyu, Yu, Yifei, Chen, Xi, Lin, Yinan, He, Yangu, Wu, Xiaoshan, Li, Yi, Zhang, Xinyuan, Lin, Ning, Xu, Meng, Li, Yi, Zhang, Xumeng, Wang, Zhongrui, Wang, Han, Shang, Dashan, Liu, Qi, Cheng, Kwang-Ting, Liu, Ming

arXiv.org Artificial Intelligence, Jun-12-2024

Digital twins, the cornerstone of Industry 4.0, replicate real-world entities through computer models, revolutionising fields such as manufacturing management and industrial automation. Recent advances in machine learning provide data-driven methods for developing digital twins using discrete-time data and finite-depth models on digital computers. However, this approach fails to capture the underlying continuous dynamics and struggles with modelling complex system behaviour. Additionally, the architecture of digital computers, with separate storage and processing units, necessitates frequent data transfers and Analogue-Digital (A/D) conversion, thereby significantly increasing both time and energy costs. Here, we introduce a memristive neural ordinary differential equation (ODE) solver for digital twins, which is capable of capturing continuous-time dynamics and facilitates the modelling of complex systems using an infinite-depth model. By integrating storage and computation within analogue memristor arrays, we circumvent the von Neumann bottleneck, thus enhancing both speed and energy efficiency. We experimentally validate our approach by developing a digital twin of the HP memristor, which accurately extrapolates its nonlinear dynamics, achieving a 4.2-fold projected speedup and a 41.4-fold projected decrease in energy consumption compared to state-of-the-art digital hardware, while maintaining an acceptable error margin. Additionally, we demonstrate scalability through experimentally grounded simulations of Lorenz96 dynamics, exhibiting projected performance improvements of 12.6-fold in speed and 189.7-fold in energy efficiency relative to traditional digital approaches. By harnessing the capabilities of fully analogue computing, our breakthrough accelerates the development of digital twins, offering an efficient and rapid solution to meet the demands of Industry 4.0.
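For contrast with the analogue solver described above, here is a minimal sketch of the conventional digital route: discretising the Lorenz96 system with a fixed-step fourth-order Runge-Kutta integrator. It uses the standard Lorenz96 equations, dx_i/dt = (x_{i+1} - x_{i-2}) x_{i-1} - x_i + F. The forcing value, step size, and function names are assumptions for illustration and are not taken from the paper.

```python
def lorenz96(x, forcing=8.0):
    """Right-hand side of the Lorenz96 system with cyclic indexing."""
    n = len(x)
    return [
        (x[(i + 1) % n] - x[i - 2]) * x[i - 1] - x[i] + forcing
        for i in range(n)
    ]


def rk4_step(f, x, dt):
    """One classical fourth-order Runge-Kutta step for dx/dt = f(x)."""
    k1 = f(x)
    k2 = f([xi + 0.5 * dt * k for xi, k in zip(x, k1)])
    k3 = f([xi + 0.5 * dt * k for xi, k in zip(x, k2)])
    k4 = f([xi + dt * k for xi, k in zip(x, k3)])
    return [
        xi + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for xi, a, b, c, d in zip(x, k1, k2, k3, k4)
    ]


def integrate(f, x0, dt, steps):
    """Discrete-time trajectory: every step is a separate digital update."""
    trajectory = [list(x0)]
    for _ in range(steps):
        trajectory.append(rk4_step(f, trajectory[-1], dt))
    return trajectory


# Slightly perturbed start near the fixed point x_i = F.
start = [8.0, 8.0, 8.01, 8.0, 8.0]
trajectory = integrate(lorenz96, start, dt=0.01, steps=100)
```

Each call to `rk4_step` evaluates the dynamics four times and stores the intermediate states, which is the kind of repeated digital update that a continuous-time analogue integrator is meant to avoid.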

artificial intelligence, digital twin, machine learning

arXiv.org Artificial Intelligence

2406.08343

Country:

Asia > China (0.69)
North America > United States > Hawaii (0.14)

Genre: Research Report (1.00)

Industry:

Energy (0.89)
Semiconductors & Electronics (0.68)

Technology:

Information Technology > Hardware (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


One-shot Active Learning Based on Lewis Weight Sampling for Multiple Deep Models

Huang, Sheng-Jun, Li, Yi, Sun, Yiming, Tang, Ying-Peng

arXiv.org Artificial Intelligence, May-22-2024

Active learning (AL) for multiple target models aims to reduce labeled data querying while effectively training multiple models concurrently. Existing AL algorithms often rely on iterative model training, which can be computationally expensive, particularly for deep models. In this paper, we propose a one-shot AL method to address this challenge, which performs all label queries without repeated model training. Specifically, we extract different representations of the same dataset using distinct network backbones, and actively learn the linear prediction layer on each representation via an $\ell_p$-regression formulation. The regression problems are solved approximately by sampling and reweighting the unlabeled instances based on their maximum Lewis weights across the representations. An upper bound on the number of samples needed is provided with a rigorous analysis for $p\in [1, +\infty)$. Experimental results on 11 benchmarks show that our one-shot approach performs competitively with state-of-the-art AL methods for multiple target models.
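As a rough illustration of the quantity being sampled from, the sketch below computes $\ell_p$ Lewis weights with a commonly used fixed-point iteration, $w_i \leftarrow (a_i^\top (A^\top W^{1-2/p} A)^{-1} a_i)^{p/2}$, and then takes the per-instance maximum across representations. For $p = 2$ the weights reduce to statistical leverage scores. The iteration is generally expected to converge only for $p < 4$, so it does not cover the full range analysed in the paper, and the paper's own procedure may differ. All function names and the toy matrices are hypothetical.

```python
def solve(G, b):
    """Solve G z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(G)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= factor * M[c][k]
    z = [0.0] * n
    for c in reversed(range(n)):
        rest = sum(M[c][k] * z[k] for k in range(c + 1, n))
        z[c] = (M[c][n] - rest) / M[c][c]
    return z


def lewis_weights(A, p=2.0, iters=30):
    """Approximate l_p Lewis weights of the rows of A (full column rank)."""
    n, d = len(A), len(A[0])
    w = [1.0] * n
    for _ in range(iters):
        scale = [wi ** (1.0 - 2.0 / p) for wi in w]
        # G = A^T diag(scale) A
        G = [
            [sum(scale[i] * A[i][j] * A[i][k] for i in range(n))
             for k in range(d)]
            for j in range(d)
        ]
        w = [
            max(sum(aj * zj for aj, zj in zip(a, solve(G, a))), 1e-300)
            ** (p / 2.0)
            for a in A
        ]
    return w


def max_lewis_weights(representations, p=2.0):
    """Per-instance maximum of the Lewis weights over all representations."""
    per_rep = [lewis_weights(A, p) for A in representations]
    return [max(ws) for ws in zip(*per_rep)]


# Two toy "representations" of the same three instances.
rep_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
rep_b = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
scores = max_lewis_weights([rep_a, rep_b], p=2.0)
```

A sampler would then draw each unlabeled instance with probability proportional to its entry in `scores`, so an instance that is important under any one backbone is likely to be queried.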

artificial intelligence, lewis weight, machine learning

arXiv.org Artificial Intelligence

2405.14121

Country:

Asia (0.46)
North America > United States > Wisconsin (0.14)
North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)


PropertyGPT: LLM-driven Formal Verification of Smart Contracts through Retrieval-Augmented Property Generation

Liu, Ye, Xue, Yue, Wu, Daoyuan, Sun, Yuqiang, Li, Yi, Shi, Miaolei, Liu, Yang

arXiv.org Artificial Intelligence, May-4-2024

With recent advances in large language models (LLMs), this paper explores the potential of leveraging state-of-the-art LLMs, such as GPT-4, to transfer existing human-written properties (e.g., those from Certora auditing reports) and automatically generate customized properties for unknown code. To this end, we embed existing properties into a vector database and retrieve a reference property for LLM-based in-context learning to generate a new property for a given code. While this basic process is relatively straightforward, ensuring that the generated properties are (i) compilable, (ii) appropriate, and (iii) runtime-verifiable presents challenges. To address (i), we use the compilation and static analysis feedback as an external oracle to guide LLMs in iteratively revising the generated properties. For (ii), we consider multiple dimensions of similarity to rank the properties and employ a weighted algorithm to identify the top-K properties as the final result. For (iii), we design a dedicated prover to formally verify the correctness of the generated properties. We have implemented these strategies into a novel system called PropertyGPT, with 623 human-written properties collected from 23 Certora projects. Our experiments show that PropertyGPT can generate comprehensive and high-quality properties, achieving an 80% recall compared to the ground truth. It successfully detected 26 CVEs/attack incidents out of 37 tested and also uncovered 12 zero-day vulnerabilities, resulting in $8,256 bug bounty rewards.

large language model, machine learning, propertygpt

arXiv.org Artificial Intelligence

2405.0258

Country: North America > United States > California (0.14)

Genre: Research Report > New Finding (0.67)

Industry:

Information Technology > Security & Privacy (1.00)
Banking & Finance (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Federated Graph Learning for EV Charging Demand Forecasting with Personalization Against Cyberattacks

Li, Yi, Xie, Renyou, Li, Chaojie, Wang, Yi, Dong, Zhaoyang

arXiv.org Machine Learning, Apr-30-2024

Mitigating cybersecurity risk in electric vehicle (EV) charging demand forecasting plays a crucial role in the safe operation of collective EV charging, the stability of the power grid, and cost-effective infrastructure expansion. However, existing methods either suffer from data privacy issues and susceptibility to cyberattacks or fail to consider the spatial correlation among different stations. To address these challenges, a federated graph learning approach involving multiple charging stations is proposed to collaboratively train a more generalized deep learning model for demand forecasting while capturing spatial correlations among various stations and enhancing robustness against potential attacks. Firstly, for better model performance, a Graph Neural Network (GNN) model is leveraged to characterize the geographic correlation among different charging stations in a federated manner. Secondly, to ensure robustness and deal with the data heterogeneity in a federated setting, a message-passing scheme that uses a global attention mechanism to aggregate personalized models for each client is proposed. Thirdly, to address cyberattacks, a special credit-based function is designed to mitigate potential threats from malicious clients or unwanted attacks. Extensive experiments on a public EV charging dataset are conducted using various deep learning techniques and federated learning methods to demonstrate the prediction accuracy and robustness of the proposed approach.

artificial intelligence, forecasting, machine learning

arXiv.org Machine Learning

2405.00742

Country: Oceania > Australia > New South Wales (0.14)

Genre: Research Report (0.82)

Industry:

Transportation > Ground > Road (1.00)
Transportation > Electric Vehicle (1.00)
Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Harmonic Machine Learning Models are Robust

Kersting, Nicholas S., Li, Yi, Mohanty, Aman, Obisesan, Oyindamola, Okochu, Raphael

arXiv.org Artificial Intelligence, Apr-29-2024

We introduce Harmonic Robustness, a powerful and intuitive method to test the robustness of any machine-learning model either during training or in black-box real-time inference monitoring without ground-truth labels. It is based on functional deviation from the harmonic mean value property, indicating instability and lack of explainability. We show implementation examples in low-dimensional trees and feedforward NNs, where the method reliably identifies overfitting, as well as in more complex high-dimensional models such as ResNet-50 and Vision Transformer where it efficiently measures adversarial vulnerability across image classes.
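The harmonic mean value property states that a harmonic function's value at a point equals its average over any sphere centred on that point. The sketch below measures the deviation from this property for a two-dimensional function by averaging over equally spaced points on a circle. It illustrates the underlying property only and is not the paper's exact metric. The function name `harmonic_deviation`, the radius, and the number of sample points are assumptions.

```python
import math


def harmonic_deviation(f, x, y, radius=0.1, n_points=16):
    """Absolute gap between f(x, y) and its mean on a surrounding circle.

    No ground-truth labels are needed: only evaluations of f are used,
    which is what makes this kind of check usable on a black-box model.
    """
    total = 0.0
    for k in range(n_points):
        angle = 2.0 * math.pi * k / n_points
        total += f(x + radius * math.cos(angle), y + radius * math.sin(angle))
    return abs(total / n_points - f(x, y))


def harmonic_example(x, y):
    """x^2 - y^2 is harmonic, so its deviation should be near zero."""
    return x * x - y * y


def non_harmonic_example(x, y):
    """x^2 + y^2 is not harmonic; its circle mean exceeds f by radius^2."""
    return x * x + y * y


dev_harmonic = harmonic_deviation(harmonic_example, 0.3, -0.7)
dev_non_harmonic = harmonic_deviation(non_harmonic_example, 0.3, -0.7)
```

A larger deviation at a point suggests that the function behaves less smoothly in that neighbourhood, which is the intuition the abstract ties to instability.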

artificial intelligence, boundary, machine learning

arXiv.org Artificial Intelligence

2404.18825

Country: Europe > Italy (0.14)

Genre: Research Report (0.50)

Industry: Information Technology > Security & Privacy (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.95)


Cross-Temporal Spectrogram Autoencoder (CTSAE): Unsupervised Dimensionality Reduction for Clustering Gravitational Wave Glitches

Li, Yi, Wu, Yunan, Katsaggelos, Aggelos K.

arXiv.org Artificial Intelligence, Apr-23-2024

The advancement of the Laser Interferometer Gravitational-Wave Observatory (LIGO) has significantly enhanced the feasibility and reliability of gravitational wave detection. However, LIGO's high sensitivity makes it susceptible to transient noises known as glitches, which must be effectively differentiated from real gravitational wave signals. Traditional approaches predominantly employ fully supervised or semi-supervised algorithms for the task of glitch classification and clustering. In the future task of identifying and classifying glitches across main and auxiliary channels, it is impractical to build a dataset with manually labeled ground truth. In addition, the patterns of glitches can vary with time, generating new glitches without manual labels. In response to this challenge, we introduce the Cross-Temporal Spectrogram Autoencoder (CTSAE), a pioneering unsupervised method for the dimensionality reduction and clustering of gravitational wave glitches. CTSAE integrates a novel four-branch autoencoder with a hybrid of Convolutional Neural Networks (CNN) and Vision Transformers (ViT). To further extract features across multi-branches, we introduce a novel multi-branch fusion method using the CLS (Class) token. Our model, trained and evaluated on the GravitySpy O3 dataset on the main channel, demonstrates superior performance in clustering tasks when compared to state-of-the-art semi-supervised learning methods. To the best of our knowledge, CTSAE represents the first unsupervised approach tailored specifically for clustering LIGO data, marking a significant step forward in the field of gravitational wave research. The code of this paper is available at https://github.com/Zod-L/CTSAE

artificial intelligence, glitch, machine learning

arXiv.org Artificial Intelligence

2404.15552

Country:

North America > United States (0.14)
Europe > Netherlands (0.14)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Resistive Memory-based Neural Differential Equation Solver for Score-based Diffusion Model

Yang, Jichang, Chen, Hegan, Chen, Jia, Wang, Songqi, Wang, Shaocong, Yu, Yifei, Chen, Xi, Wang, Bo, Zhang, Xinyuan, Cui, Binbin, Li, Yi, Lin, Ning, Xu, Meng, Li, Yi, Xu, Xiaoxin, Qi, Xiaojuan, Wang, Zhongrui, Zhang, Xumeng, Shang, Dashan, Wang, Han, Liu, Qi, Cheng, Kwang-Ting, Liu, Ming

arXiv.org Artificial Intelligence, Apr-8-2024

Human brains image complicated scenes when reading a novel. Replicating this imagination is one of the ultimate goals of AI-Generated Content (AIGC). However, current AIGC methods, such as score-based diffusion, are still deficient in terms of rapidity and efficiency. This deficiency is rooted in the difference between the brain and digital computers. Digital computers have physically separated storage and processing units, resulting in frequent data transfers during iterative calculations, incurring large time and energy overheads. This issue is further intensified by the conversion of inherently continuous and analog generation dynamics, which can be formulated by neural differential equations, into discrete and digital operations. Inspired by the brain, we propose a time-continuous and analog in-memory neural differential equation solver for score-based diffusion, employing emerging resistive memory. The integration of storage and computation within resistive memory synapses surmounts the von Neumann bottleneck, benefiting the generative speed and energy efficiency. The closed-loop feedback integrator is time-continuous, analog, and compact, physically implementing an infinite-depth neural network. Moreover, the software-hardware co-design is intrinsically robust to analog noise. We experimentally validate our solution with 180 nm resistive memory in-memory computing macros. Demonstrating equivalent generative quality to the software baseline, our system achieved remarkable enhancements in generative speed for both unconditional and conditional generation tasks, by factors of 64.8 and 156.5, respectively. Moreover, it accomplished reductions in energy consumption by factors of 5.2 and 4.1. Our approach heralds a new horizon for hardware solutions in edge computing for generative AI applications.

artificial intelligence, machine learning, neural network

arXiv.org Artificial Intelligence

2404.05648

Country: Asia > China (0.69)

Genre: Research Report (0.82)

Industry:

Semiconductors & Electronics (1.00)
Energy (0.69)

Technology:

Information Technology > Hardware (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.34)


Longitudinal Targeted Minimum Loss-based Estimation with Temporal-Difference Heterogeneous Transformer

Shirakawa, Toru, Li, Yi, Wu, Yulun, Qiu, Sky, Li, Yuxuan, Zhao, Mingduo, Iso, Hiroyasu, van der Laan, Mark

arXiv.org Machine Learning, Apr-5-2024

We propose Deep Longitudinal Targeted Minimum Loss-based Estimation (Deep LTMLE), a novel approach to estimate the counterfactual mean of outcome under dynamic treatment policies in longitudinal problem settings. Our approach utilizes a transformer architecture with heterogeneous type embedding trained using temporal-difference learning. After obtaining an initial estimate using the transformer, following the targeted minimum loss-based likelihood estimation (TMLE) framework, we statistically correct for the bias commonly associated with machine learning algorithms. Furthermore, our method facilitates statistical inference by providing 95% confidence intervals grounded in asymptotic statistical theory. Simulation results demonstrate our method's superior performance over existing approaches, particularly in complex, long time-horizon scenarios. It remains effective in small-sample, short-duration contexts, matching the performance of asymptotically efficient estimators. To demonstrate our method in practice, we apply it to estimate counterfactual mean outcomes for standard versus intensive blood pressure management strategies in a real-world cardiovascular epidemiology cohort study.

artificial intelligence, machine learning, reinforcement learning

arXiv.org Machine Learning

2404.04399

Country:

North America > United States > New York > New York County > New York City (0.14)
North America > United States > California > Alameda County > Berkeley (0.14)
Asia > Japan > Honshū > Kantō (0.14)

Genre:

Research Report > New Finding (0.86)
Research Report > Strength Medium (0.66)
Research Report > Experimental Study (0.66)

Industry:

Health & Medicine > Epidemiology (0.48)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)


GT-Rain Single Image Deraining Challenge Report

Zhang, Howard, Ba, Yunhao, Yang, Ethan, Upadhyay, Rishi, Wong, Alex, Kadambi, Achuta, Guo, Yun, Xiao, Xueyao, Wang, Xiaoxiong, Li, Yi, Chang, Yi, Yan, Luxin, Zheng, Chaochao, Wang, Luping, Liu, Bin, Khowaja, Sunder Ali, Yoon, Jiseok, Lee, Ik-Hyun, Zhang, Zhao, Wei, Yanyan, Ren, Jiahuan, Zhao, Suiyi, Zheng, Huan

arXiv.org Artificial Intelligence, Mar-18-2024

This report reviews the results of the GT-Rain challenge on single image deraining at the UG2+ workshop at CVPR 2023. The aim of this competition is to study the rainy weather phenomenon in real-world scenarios, provide a novel real-world rainy image dataset, and spark innovative ideas that will further the development of single image deraining methods on real images. Submissions were trained on the GT-Rain dataset and evaluated on an extension of the dataset consisting of 15 additional scenes. Each scene in GT-Rain comprises a real rainy image and a ground-truth image captured moments after the rain had stopped. A total of 275 participants registered for the challenge, and 55 competed in the final testing phase.

artificial intelligence, dataset, machine learning

arXiv.org Artificial Intelligence

2403.12327

Country:

North America > United States > California (0.14)
Asia > Pakistan (0.14)

Genre:

Research Report (1.00)
Overview (0.75)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.95)
Information Technology > Artificial Intelligence > Vision (0.71)
