Lu, Jianfeng
Bi-Lipschitz Ansatz for Anti-Symmetric Functions
Dym, Nadav, Lu, Jianfeng, Mizrachi, Matan
The main advantage of this ansatz over previous alternatives is that it is bi-Lipschitz with respect to a naturally defined metric. As a result, we are able to obtain quantitative approximation results for Lipschitz continuous antisymmetric functions. Moreover, we provide preliminary experimental evidence of the improved performance of this ansatz for learning antisymmetric functions. The search for an ansatz for quantum many-body wave functions dates back to the early days of quantum mechanics [Sla29] and has been a central task in quantum chemistry [SO96]. In recent years, it has received renewed attention, primarily due to advances in neural-network-based ansatzes.
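As an illustrative sketch of the classical determinant route to antisymmetry (the Slater construction, not the paper's bi-Lipschitz ansatz), the defining sign flip under particle exchange can be checked numerically:

```python
import numpy as np

# Minimal sketch: a Slater-determinant antisymmetric function. Given
# single-particle orbitals phi_j, det[phi_j(x_i)] changes sign whenever two
# particle coordinates are swapped. The monomial orbitals phi_j(x) = x**j
# are an arbitrary illustrative choice, not taken from the paper.

def slater(xs):
    """Antisymmetric function of 1D particle coordinates xs (shape (n,))."""
    n = len(xs)
    A = np.vander(xs, n, increasing=True)   # orbital matrix phi_j(x_i)
    return np.linalg.det(A)

xs = np.array([0.3, -1.2, 0.7])
swapped = xs[[1, 0, 2]]                     # exchange particles 0 and 1
val, val_swapped = slater(xs), slater(swapped)
antisymmetric = np.isclose(val, -val_swapped)
```

Swapping two coordinates swaps two rows of the orbital matrix, which negates the determinant.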
Convergence of two-timescale gradient descent ascent dynamics: finite-dimensional and mean-field perspectives
An, Jing, Lu, Jianfeng
The two-timescale gradient descent-ascent (GDA) is a canonical gradient algorithm designed to find Nash equilibria in min-max games. We analyze the two-timescale GDA by investigating the effects of learning rate ratios on convergence behavior in both finite-dimensional and mean-field settings. In particular, for finite-dimensional quadratic min-max games, we obtain long-time convergence in near quasi-static regimes through the hypocoercivity method. For mean-field GDA dynamics, we investigate convergence under a finite-scale ratio using a mixed synchronous-reflection coupling technique.
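A minimal numerical sketch (illustrative only, not the paper's hypocoercivity or mean-field analysis) of the two-timescale mechanism on a quadratic min-max game: the ascent variable uses a larger learning rate, so it tracks the quasi-static best response to the slowly moving descent variable.

```python
# Two-timescale GDA on f(x, y) = x**2/2 - y**2/2 + x*y, whose unique Nash
# equilibrium is (0, 0). The learning rate ratio eta_y / eta_x is the knob
# studied in the abstract; the specific values below are illustrative.

def two_timescale_gda(x, y, eta_x=0.01, ratio=10.0, steps=2000):
    eta_y = ratio * eta_x
    for _ in range(steps):
        gx = x + y                  # df/dx
        gy = x - y                  # df/dy
        # simultaneous descent in x, ascent in y
        x, y = x - eta_x * gx, y + eta_y * gy
    return x, y

x_star, y_star = two_timescale_gda(1.0, -2.0)
```

For this quadratic game the iteration matrix is a stable contraction at this ratio, so the iterates approach the equilibrium.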
A Unified Blockwise Measurement Design for Learning Quantum Channels and Lindbladians via Low-Rank Matrix Sensing
Lang, Quanjun, Lu, Jianfeng
Quantum superoperator learning is a pivotal task in quantum information science, enabling accurate reconstruction of unknown quantum operations from measurement data. We propose a robust approach based on matrix sensing techniques for quantum superoperator learning that extends beyond the positive semidefinite case, encompassing both quantum channels and Lindbladians. We first introduce a randomized measurement design using a near-optimal number of measurements. By leveraging the restricted isometry property (RIP), we provide theoretical guarantees for the identifiability and recovery of low-rank superoperators in the presence of noise. Additionally, we propose a blockwise measurement design that restricts the tomography to sub-blocks, significantly enhancing performance while maintaining a comparable scale of measurements. We also provide a performance guarantee for this setup. Our approach employs alternating least squares (ALS) with acceleration for optimization in matrix sensing. Numerical experiments validate the efficiency and scalability of the proposed methods.
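A minimal sketch of the ALS idea in the simplest setting, rank-one matrix sensing: recover X* = u v^T from linear measurements y_i = ⟨A_i, X*⟩ by alternately solving least-squares problems in each factor. The random Gaussian A_i here are a generic stand-in, not the blockwise measurement design of the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d, m = 5, 60
X_star = np.outer(rng.normal(size=d), rng.normal(size=d))
A = rng.normal(size=(m, d, d))
y = np.einsum('mij,ij->m', A, X_star)           # y_i = <A_i, X*>

def residual(u, v):
    return np.linalg.norm(np.einsum('mij,i,j->m', A, u, v) - y)

v = rng.normal(size=d)                           # random initialization
u = np.linalg.lstsq(A @ v, y, rcond=None)[0]     # rows of (A @ v) are A_i v
r_init = residual(u, v)
for _ in range(25):
    # Fix v, solve for u: y_i = u^T (A_i v) is linear in u.
    u = np.linalg.lstsq(A @ v, y, rcond=None)[0]
    # Fix u, solve for v: y_i = (A_i^T u)^T v is linear in v.
    v = np.linalg.lstsq(np.einsum('mij,i->mj', A, u), y, rcond=None)[0]
r_final = residual(u, v)
rel_err = np.linalg.norm(np.outer(u, v) - X_star) / np.linalg.norm(X_star)
```

Each half-step minimizes the measurement residual over one factor, so the residual is nonincreasing along the iteration.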
FedCross: Intertemporal Federated Learning Under Evolutionary Games
Lu, Jianfeng, Zhang, Ying, Jia, Riheng, Cao, Shuqin, Liu, Jing, Fu, Hao
Federated Learning (FL) mitigates privacy leakage in decentralized machine learning by allowing multiple clients to train a shared model collaboratively on their local data. However, dynamic mobile networks with high mobility, intermittent connectivity, and bandwidth limitations severely hinder model updates to the cloud server. Although previous studies have typically addressed the user mobility issue through task reassignment or predictive modeling, frequent migrations may result in high communication overhead. Overcoming this obstacle involves not only dealing with resource constraints, but also mitigating the challenges posed by user migrations. We therefore propose an intertemporal incentive framework, FedCross, which ensures the continuity of FL tasks by migrating interrupted training tasks to feasible mobile devices. Specifically, FedCross comprises two distinct stages. In Stage 1, we address the task allocation problem across regions under resource constraints by employing a multi-objective migration algorithm to identify the optimal task receivers. Moreover, we adopt evolutionary game theory to capture the dynamic decision-making of users, forecasting the evolution of user proportions across different regions to mitigate frequent migrations. In Stage 2, we utilize a procurement auction mechanism to allocate rewards among base stations, ensuring that those providing high-quality models receive optimal compensation. This approach incentivizes sustained user participation, thereby ensuring the overall feasibility of FedCross. Finally, experimental results validate the theoretical soundness of FedCross and demonstrate that it significantly reduces communication overhead.
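As a toy illustration of forecasting user proportions with evolutionary game theory (a generic replicator-dynamics sketch, not the FedCross model; the per-region payoffs are made-up constants), regions whose payoff exceeds the population average attract users over time:

```python
# Replicator dynamics: dx_i/dt = x_i * (f_i - average payoff), discretized
# with a forward Euler step. x holds the proportion of users per region.

def replicator_step(x, payoffs, dt=0.1):
    avg = sum(xi * fi for xi, fi in zip(x, payoffs))
    return [xi + dt * xi * (fi - avg) for xi, fi in zip(x, payoffs)]

x = [0.5, 0.3, 0.2]            # initial user proportions in three regions
payoffs = [1.0, 2.0, 0.5]      # assumed fixed per-region payoffs
for _ in range(200):
    x = replicator_step(x, payoffs)
```

The Euler step preserves the total proportion exactly, and the highest-payoff region absorbs nearly all users in the long run.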
TRAIL: Trust-Aware Client Scheduling for Semi-Decentralized Federated Learning
Hu, Gangqiang, Lu, Jianfeng, Han, Jianmin, Cao, Shuqin, Liu, Jing, Fu, Hao
Due to the sensitivity of data, Federated Learning (FL) is employed to enable distributed machine learning while safeguarding data privacy and accommodating the requirements of various devices. However, in the context of semi-decentralized FL, clients' communication and training states are dynamic. This variability arises from local training fluctuations, heterogeneous data distributions, and intermittent client participation. Most existing studies primarily focus on stable client states, neglecting the dynamic challenges inherent in real-world scenarios. To tackle this issue, we propose a TRust-Aware clIent scheduLing mechanism called TRAIL, which assesses client states and contributions, enhancing model training efficiency through selective client participation. We focus on a semi-decentralized FL framework where edge servers and clients train a shared global model using unreliable intra-cluster model aggregation and inter-cluster model consensus. First, we propose an adaptive hidden semi-Markov model to estimate clients' communication states and contributions. Next, we address a client-server association optimization problem to minimize global training loss. Using convergence analysis, we propose a greedy client scheduling algorithm. Finally, our experiments conducted on real-world datasets demonstrate that TRAIL outperforms state-of-the-art baselines, achieving an improvement of 8.7% in test accuracy and a reduction of 15.3% in training loss.
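As a self-contained illustration of greedy client scheduling (a generic score-per-cost heuristic under a communication budget, not TRAIL's actual algorithm; all numbers are illustrative), clients with the best estimated contribution per unit cost are selected first:

```python
# Greedy scheduling: rank clients by estimated contribution / communication
# cost and admit them while the budget allows.

def greedy_schedule(scores, costs, budget):
    order = sorted(range(len(scores)),
                   key=lambda i: scores[i] / costs[i], reverse=True)
    chosen, spent = [], 0.0
    for i in order:
        if spent + costs[i] <= budget:
            chosen.append(i)
            spent += costs[i]
    return sorted(chosen)

scores = [0.9, 0.2, 0.7, 0.4]   # estimated per-client contribution (assumed)
costs = [2.0, 1.0, 1.5, 1.0]    # per-client communication cost (assumed)
selected = greedy_schedule(scores, costs, budget=3.5)
```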
Towards characterizing the value of edge embeddings in Graph Neural Networks
Rohatgi, Dhruv, Marwah, Tanya, Lipton, Zachary Chase, Lu, Jianfeng, Moitra, Ankur, Risteski, Andrej
Graph neural networks (GNNs) have emerged as the dominant approach for solving machine learning tasks on graphs. Over the span of the last decade, many different architectures have been proposed, both in order to improve different notions of efficiency, and to improve performance on a variety of benchmarks. Nevertheless, theoretical and empirical understanding of the impact of different architectural design choices remains elusive. One previous line of work (Xu et al., 2018) has focused on characterizing the representational limitations stemming from the symmetry-preserving properties of GNNs when the node features are not informative (also called "anonymous GNNs") -- in particular, relating GNNs to the Weisfeiler-Lehman graph isomorphism test (Leman & Weisfeiler, 1968). Another line of work (Oono & Suzuki, 2019) focuses on the potential pitfalls of the (over)smoothing effect of deep GNN architectures, with particular choices of weights and non-linearities, in an effort to explain the difficulties of training deep GNN models. Yet another (Black et al., 2023) focuses on training difficulties, akin to vanishing gradients, introduced by "bottlenecks" in the graph topology. In this paper, we focus on the benefits of maintaining and updating edge embeddings over the course of the computation of the GNN. More concretely, a typical way to parametrize a layer l of a GNN (Xu et al., 2018) is to maintain, for each node v in the graph, a node embedding h
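A minimal sketch of one message-passing layer that maintains both node and edge embeddings: each edge embedding is updated from its endpoints, and each node embedding aggregates the updated embeddings of its incident edges. The random linear maps and ReLU nonlinearities below are generic illustrative choices, not the parametrization studied in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4
edges = [(0, 1), (1, 2), (2, 0)]                 # a triangle graph
h = rng.normal(size=(3, d))                      # node embeddings h_v
e = rng.normal(size=(len(edges), d))             # edge embeddings e_uv
W_edge = rng.normal(size=(3 * d, d)) / np.sqrt(3 * d)
W_node = rng.normal(size=(2 * d, d)) / np.sqrt(2 * d)

def layer(h, e):
    # Edge update: e'_uv = relu([h_u, h_v, e_uv] @ W_edge)
    src = h[[u for u, v in edges]]
    dst = h[[v for u, v in edges]]
    e_new = np.maximum(np.concatenate([src, dst, e], axis=1) @ W_edge, 0)
    # Node update: h'_v = relu([h_v, sum of incident e'_uv] @ W_node)
    agg = np.zeros_like(h)
    for k, (u, v) in enumerate(edges):
        agg[u] += e_new[k]
        agg[v] += e_new[k]
    h_new = np.maximum(np.concatenate([h, agg], axis=1) @ W_node, 0)
    return h_new, e_new

h, e = layer(h, e)
```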
Posterior sampling via Langevin dynamics based on generative priors
Purohit, Vishal, Repasky, Matthew, Lu, Jianfeng, Qiu, Qiang, Xie, Yao, Cheng, Xiuyuan
Posterior sampling in high-dimensional spaces using generative models holds significant promise for various applications, including but not limited to inverse problems and guided generation tasks. Despite many recent developments, generating diverse posterior samples remains a challenge, as existing methods require restarting the entire generative process for each new sample, making the procedure computationally expensive. In this work, we propose efficient posterior sampling by simulating Langevin dynamics in the noise space of a pre-trained generative model. By exploiting the mapping between the noise and data spaces which can be provided by distilled flows or consistency models, our method enables seamless exploration of the posterior without the need to re-run the full sampling chain, drastically reducing computational overhead. Theoretically, we prove a guarantee for the proposed noise-space Langevin dynamics to approximate the posterior, assuming that the generative model sufficiently approximates the prior distribution. Our framework is experimentally validated on image restoration tasks involving noisy linear and nonlinear forward operators applied to LSUN-Bedroom (256 x 256) and ImageNet (64 x 64) datasets. The results demonstrate that our approach generates high-fidelity samples with enhanced semantic diversity even under a limited number of function evaluations, offering superior efficiency and performance compared to existing diffusion-based posterior sampling techniques.
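A minimal sketch of Langevin dynamics in the noise space of a generative model. Here the "generator" g(z) = Gz is a toy linear stand-in for a distilled flow or consistency model, and the posterior over z combines a standard normal prior with a Gaussian likelihood for observations y = A g(z) + noise; none of the specific matrices come from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
dz, dx, dy = 3, 4, 2
G = rng.normal(size=(dx, dz))            # toy generator: x = G z
A = rng.normal(size=(dy, dx))            # linear forward operator
sigma = 0.5
z_true = rng.normal(size=dz)
y = A @ G @ z_true + sigma * rng.normal(size=dy)

def grad_log_post(z):
    # gradient of -||A G z - y||^2 / (2 sigma^2) - ||z||^2 / 2
    return -(G.T @ A.T @ (A @ G @ z - y)) / sigma**2 - z

step = 1e-3
z = rng.normal(size=dz)
samples = []
for t in range(5000):
    # unadjusted Langevin update in noise space
    z = z + step * grad_log_post(z) + np.sqrt(2 * step) * rng.normal(size=dz)
    if t >= 2000:                        # discard burn-in
        samples.append(z.copy())
samples = np.array(samples)
```

Because the chain moves in z rather than restarting generation, each new posterior sample costs only one gradient step plus a generator evaluation.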
What does guidance do? A fine-grained analysis in a simple setting
Chidambaram, Muthu, Gatmiry, Khashayar, Chen, Sitan, Lee, Holden, Lu, Jianfeng
The use of guidance in diffusion models was originally motivated by the premise that the guidance-modified score is that of the data distribution tilted by a conditional likelihood raised to some power. In this work we clarify this misconception by rigorously proving that guidance fails to sample from the intended tilted distribution. Our main result is to give a fine-grained characterization of the dynamics of guidance in two cases, (1) mixtures of compactly supported distributions and (2) mixtures of Gaussians, which reflect salient properties of guidance that manifest on real-world data. In both cases, we prove that as the guidance parameter increases, the guided model samples more heavily from the boundary of the support of the conditional distribution. We also prove that for any nonzero level of score estimation error, sufficiently large guidance will result in sampling away from the support, theoretically justifying the empirical finding that large guidance results in distorted generations. In addition to verifying these results empirically in synthetic settings, we also show how our theoretical insights can offer useful prescriptions for practical deployment.
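A one-dimensional, single-Gaussian sketch of the phenomenon (illustrative, not the paper's mixture analysis): with conditional score for N(mu, s2) and unconditional score for N(0, s2), the guided score (1 + w)·s_cond − w·s_uncond equals the score of N((1 + w)·mu, s2), so increasing the guidance weight w pushes mass further from the unconditional mean rather than merely tilting the distribution.

```python
mu, s2 = 1.0, 1.0

def s_cond(x):    # score of N(mu, s2)
    return (mu - x) / s2

def s_uncond(x):  # score of N(0, s2)
    return -x / s2

def s_guided(x, w):
    return (1 + w) * s_cond(x) - w * s_uncond(x)

# The guided mode is where the guided score vanishes; since s_guided is
# strictly decreasing in x here, bisection locates it.
def guided_mode(w, lo=-50.0, hi=50.0):
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if s_guided(mid, w) > 0 else (lo, mid)
    return 0.5 * (lo + hi)
```

Algebraically, s_guided(x, w) = ((1 + w)·mu − x) / s2, so the mode sits at (1 + w)·mu and moves linearly in w.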
The Solution for Temporal Sound Localisation Task of ICCV 1st Perception Test Challenge 2023
Huang, Yurui, Yang, Yang, Chen, Shou, Wu, Xiangyu, Chen, Qingguo, Lu, Jianfeng
In this paper, we propose a solution for improving the quality of temporal sound localization. We employ a multimodal fusion approach to combine visual and audio features. High-quality visual features are extracted using a state-of-the-art self-supervised pre-training network, resulting in efficient video feature representations. At the same time, audio features serve as complementary information to help the model better localize the start and end of sounds. The fused features are then fed into a multi-scale Transformer for training. On the final test dataset, we achieved a mean average precision (mAP) of 0.33, obtaining the second-best performance in this track.
SSUMamba: Spatial-Spectral Selective State Space Model for Hyperspectral Image Denoising
Fu, Guanyiman, Xiong, Fengchao, Lu, Jianfeng, Zhou, Jun
Denoising is a crucial preprocessing step for hyperspectral images (HSIs) due to noise arising from intra-imaging mechanisms and environmental factors. Long-range spatial-spectral correlation modeling is beneficial for HSI denoising but often comes with high computational complexity. Based on the state space model (SSM), Mamba is known for its remarkable long-range dependency modeling capabilities and computational efficiency. Building on this, we introduce a memory-efficient spatial-spectral UMamba (SSUMamba) for HSI denoising, with the spatial-spectral continuous scan (SSCS) Mamba being the core component. SSCS Mamba alternates the row, column, and band in six different orders to generate the sequence and uses the bidirectional SSM to exploit long-range spatial-spectral dependencies. In each order, the images are rearranged between adjacent scans to ensure spatial-spectral continuity. Additionally, 3D convolutions are embedded into the SSCS Mamba to enhance local spatial-spectral modeling. Experiments demonstrate that SSUMamba achieves superior denoising results with lower memory consumption per batch compared to transformer-based methods. The source code is available at https://github.com/lronkitty/SSUMamba.
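A minimal sketch of generating six spatial-spectral scan sequences from an HSI cube by permuting the (row, column, band) axes before flattening; the snake-style reversal of every other slice below is a simple stand-in for the continuity rearrangement between adjacent scans described in the abstract, not the paper's exact scheme.

```python
import numpy as np
from itertools import permutations

def scan_sequences(cube):
    """Flatten a 3D cube along all six axis orders with a snake-style scan."""
    seqs = []
    for axes in permutations(range(3)):          # six (row, col, band) orders
        vol = np.transpose(cube, axes).copy()
        # reverse every other slab so consecutive elements stay adjacent
        vol[1::2] = vol[1::2, ::-1].copy()
        seqs.append(vol.reshape(-1))
    return seqs

cube = np.arange(2 * 3 * 4).reshape(2, 3, 4)     # toy (row, col, band) cube
seqs = scan_sequences(cube)
```

Each of the six sequences visits every voxel exactly once, just in a different spatial-spectral order.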