
Collaborating Authors

 Zhao, Zhizhen


Towards Hierarchical Rectified Flow

arXiv.org Artificial Intelligence

Published as a conference paper at ICLR 2025. Yichi Zhang, Yici Yan, Alex Schwing, Zhizhen Zhao (University of Illinois Urbana-Champaign).

We formulate a hierarchical rectified flow to model data distributions. It hierarchically couples multiple ordinary differential equations (ODEs) and defines a time-differentiable stochastic process that generates a data distribution from a known source distribution. Each ODE resembles the ODE that is solved in a classic rectified flow, but differs in its domain, i.e., location, velocity, acceleration, etc. Unlike the classic rectified flow formulation, which formulates a single ODE in the location domain and only captures the expected velocity field (sufficient to capture a multi-modal data distribution), the hierarchical rectified flow formulation models the multi-modal random velocity field, acceleration field, etc., in their entirety. This more faithful modeling of the random velocity field enables integration paths to intersect when the underlying ODE is solved during data generation. Intersecting paths in turn lead to integration trajectories that are more straight than those obtained in the classic rectified flow formulation, where integration paths cannot intersect. This leads to modeling of data distributions with fewer neural function evaluations. We empirically verify this on synthetic 1D and 2D data as well as MNIST, CIFAR-10, and ImageNet-32 data. Our code is available at: https://riccizz.github.io/HRF/.

Diffusion models (Ho et al., 2020; Song et al., 2021a;b) and particularly also flow matching (Liu et al., 2023; Lipman et al., 2023; Albergo & Vanden-Eijnden, 2023; Albergo et al., 2023) have gained significant attention recently.
This is partly due to impressive results that have been reported across domains, from computer vision (Ho et al., 2020) and medical imaging (Song et al., 2022) to robotics (Kapelyukh et al., 2023) and computational biology (Guo et al., 2024). Beyond impressive results, flow matching has also been reported to faithfully model multi-modal data distributions. In addition, sampling is reasonably straightforward: it requires solving an ordinary differential equation (ODE) via forward integration of a set of source distribution points along an estimated velocity field from time zero to time one. The source distribution points are sampled from a simple, known source distribution, e.g., a standard Gaussian. The velocity field is obtained by matching velocities from a constructed "ground-truth" integration path with a parametric deep net using a mean squared error (MSE) objective. See Figure 1(a) for the "ground-truth" integration paths of classic rectified flow. Studying the "ground-truth" velocity distribution at a distinct location and time for rectified flow reveals a multi-modal distribution.
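The sampling recipe described above can be sketched end-to-end in a few lines: pairs (x0, x1) define straight coupling paths, a velocity model is fit to the "ground-truth" velocity x1 - x0 with a least-squares (MSE) objective, and new samples are produced by Euler integration of the learned ODE from t = 0 to t = 1. The following is a minimal 1D sketch with a linear model standing in for the deep net; the bimodal toy target and all names are our own illustrative assumptions, not code from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1D setup: source x0 ~ N(0, 1), bimodal target x1 in {-1, +1}.
# Rectified flow sets x_t = (1 - t) x0 + t x1 and regresses a velocity
# model v(x_t, t) onto the path velocity x1 - x0 with an MSE objective.

def features(x, t):
    # Simple polynomial features: a crude linear stand-in for a deep net.
    return np.stack([np.ones_like(x), x, t, x * t], axis=1)

n = 4096
x0 = rng.standard_normal(n)
x1 = rng.choice([-1.0, 1.0], size=n)
t = rng.uniform(0.0, 1.0, size=n)
xt = (1 - t) * x0 + t * x1
target_v = x1 - x0  # "ground-truth" velocity of the straight path

# Least-squares velocity matching (the MSE objective in closed form).
Phi = features(xt, t)
w, *_ = np.linalg.lstsq(Phi, target_v, rcond=None)

def sample(x, steps=100):
    # Forward Euler integration of dx/dt = v(x, t) from t = 0 to t = 1.
    dt = 1.0 / steps
    for k in range(steps):
        tk = np.full_like(x, k * dt)
        x = x + dt * (features(x, tk) @ w)
    return x

out = sample(rng.standard_normal(2000))
print(out.mean())  # stays near 0 by the symmetry of the two modes
```

Note that the learned model only captures the *expected* velocity at each (x, t); this is exactly the averaging that the hierarchical formulation above replaces with a model of the full velocity distribution.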


Boosting Test Performance with Importance Sampling--a Subpopulation Perspective

arXiv.org Machine Learning

Although empirical risk minimization (ERM) is widely applied in the machine learning community, its performance is limited on data with spurious correlations or subpopulations introduced by hidden attributes. Existing literature proposes techniques to maximize group-balanced or worst-group accuracy when such correlations are present, yet at the cost of lower average accuracy. In addition, many existing works survey different subpopulation methods without revealing the inherent connections between them, which could hinder technological advancement in this area. In this paper, we identify importance sampling as a simple yet powerful tool for solving the subpopulation problem. On the theory side, we provide a new systematic formulation of the subpopulation problem and explicitly identify the assumptions that are not clearly stated in existing works. This helps to uncover the cause of the dropped average accuracy. We provide the first theoretical discussion of the connections among existing methods, revealing the core components that make them differ. On the application side, we demonstrate that a single estimator is enough to solve the subpopulation problem. In particular, we introduce the estimator in both attribute-known and attribute-unknown scenarios in the subpopulation setup, offering flexibility in practical use cases. Empirically, we achieve state-of-the-art performance on commonly used benchmark datasets.
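The attribute-known idea can be illustrated with a toy two-group example: reweight each sample's loss by the ratio of a target group probability to the empirical group frequency, which turns the same data into an unbiased estimate of the group-balanced risk. A minimal numpy sketch (the group sizes, loss values, and uniform target mixture are illustrative assumptions, not the paper's estimator verbatim):

```python
import numpy as np

# Two subpopulations of sizes 900 and 100.  Plain ERM averages the
# per-sample losses, so the minority group barely contributes; an
# importance-sampling estimator reweights each sample by
# (target group probability) / (empirical group frequency).

groups = np.concatenate([np.zeros(900, int), np.ones(100, int)])
losses = np.where(groups == 0, 0.1, 1.0)  # minority group incurs higher loss

erm_risk = losses.mean()

freq = np.bincount(groups) / len(groups)   # empirical group frequencies
target = np.array([0.5, 0.5])              # group-balanced target mixture
weights = target[groups] / freq[groups]    # importance weights
balanced_risk = np.mean(weights * losses)

print(erm_risk)       # 0.19 -- dominated by the majority group
print(balanced_risk)  # 0.55 -- equals 0.5*0.1 + 0.5*1.0, as desired
```

Choosing a different `target` mixture (e.g., the worst group's indicator) recovers other group-weighted objectives from the same reweighting template.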


DeepDRK: Deep Dependency Regularized Knockoff for Feature Selection

arXiv.org Artificial Intelligence

Model-X knockoffs, among various feature selection methods, have received much attention recently due to their guarantee of false discovery rate (FDR) control. Subsequent to its introduction in the parametric design, the knockoff framework was advanced to handle arbitrary data distributions using deep learning-based generative modeling. However, we observe that current implementations of the deep Model-X knockoff framework exhibit limitations. Notably, the "swap property" that knockoffs necessitate frequently encounters challenges at the sample level, leading to diminished selection power. To overcome this, we develop "Deep Dependency Regularized Knockoff (DeepDRK)", a distribution-free deep learning method that strikes a balance between FDR and power. In DeepDRK, a generative model grounded in a transformer architecture is introduced to better achieve the "swap property". Novel, efficient regularization techniques are also proposed to reach higher power. Our model outperforms other benchmarks on synthetic, semi-synthetic, and real-world data, especially when the sample size is small and the data distribution is complex.
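The downstream selection step that any knockoff construction feeds into is fixed: compute per-feature statistics W_j that are symmetric around zero for null features, then apply the knockoff+ threshold (Barber and Candès) to control the FDR at level q. A numpy sketch with synthetic statistics; the signal/null split and magnitudes are illustrative assumptions, not DeepDRK outputs:

```python
import numpy as np

rng = np.random.default_rng(5)

def knockoff_plus_threshold(W, q):
    # Smallest t in {|W_j|} with estimated FDP (1 + #{W <= -t}) / #{W >= t} <= q.
    ts = np.sort(np.abs(W[W != 0]))
    for t in ts:
        fdp_hat = (1 + np.sum(W <= -t)) / max(1, np.sum(W >= t))
        if fdp_hat <= q:
            return t
    return np.inf  # no threshold achieves the target level: select nothing

# Synthetic statistics: 20 signal features with large positive W, and
# 80 nulls with sign-symmetric W (the symmetry knockoffs guarantee).
W = np.concatenate([rng.uniform(3.0, 6.0, 20),
                    rng.standard_normal(80)])
tau = knockoff_plus_threshold(W, q=0.1)
selected = np.flatnonzero(W >= tau)
print(len(selected))
```

When the swap property fails at the sample level, the null W_j lose their sign symmetry and the FDP estimate above is biased, which is precisely the failure mode the regularization in DeepDRK targets.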


FAIR AI Models in High Energy Physics

arXiv.org Artificial Intelligence

The findable, accessible, interoperable, and reusable (FAIR) data principles provide a framework for examining, evaluating, and improving how data is shared to facilitate scientific discovery. Generalizing these principles to research software and other digital products is an active area of research. Machine learning (ML) models -- algorithms that have been trained on data without being explicitly programmed -- and more generally, artificial intelligence (AI) models, are an important target for this because of the ever-increasing pace with which AI is transforming scientific domains, such as experimental high energy physics (HEP). In this paper, we propose a practical definition of FAIR principles for AI models in HEP and describe a template for the application of these principles. We demonstrate the template's use with an example AI model applied to HEP, in which a graph neural network is used to identify Higgs bosons decaying to two bottom quarks. We report on the robustness of this FAIR AI model, its portability across hardware architectures and software frameworks, and its interpretability.


Multi-Frequency Joint Community Detection and Phase Synchronization

arXiv.org Machine Learning

This paper studies the joint community detection and phase synchronization problem on the stochastic block model with relative phase, where each node is associated with an unknown phase angle. This problem, with a variety of real-world applications, aims to recover the cluster structure and the associated phase angles simultaneously. We show that this problem exhibits a "multi-frequency" structure by closely examining its maximum likelihood estimation (MLE) formulation, a perspective from which existing methods do not originate. To this end, we propose two simple yet efficient algorithms that leverage the MLE formulation and benefit from information across multiple frequencies. The first is a spectral method based on a novel multi-frequency column-pivoted QR factorization; applied to the top eigenvectors of the observation matrix, the factorization provides key information about the cluster structure and the associated phase angles. The second is an iterative multi-frequency generalized power method, where each iteration updates the estimate in a matrix-multiplication-then-projection manner. Numerical experiments show that our proposed algorithms significantly improve the ability to exactly recover the cluster structure and the accuracy of the estimated phase angles, compared to state-of-the-art algorithms.
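The matrix-multiplication-then-projection update can be illustrated on the single-cluster special case, plain phase synchronization: multiply the current phase estimate by the observation matrix, then project every entry back onto the unit circle. A numpy sketch under our own toy noise model (not code from the paper):

```python
import numpy as np

rng = np.random.default_rng(2)

# Observations H_ij ~ exp(i(theta_i - theta_j)) + noise; the generalized
# power method alternates a matrix multiply with an entrywise projection.

n = 60
theta = rng.uniform(0, 2 * np.pi, n)
z_true = np.exp(1j * theta)
H = np.outer(z_true, z_true.conj())
H += 0.3 * (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
H = (H + H.conj().T) / 2  # keep the observation matrix Hermitian

def project(z):
    return z / np.abs(z)  # entrywise projection onto the unit circle

z = project(rng.standard_normal(n) + 1j * rng.standard_normal(n))
for _ in range(50):
    z = project(H @ z)  # multiplication, then projection

# Alignment with the truth, up to a global phase (1.0 = exact recovery).
corr = np.abs(z.conj() @ z_true) / n
print(corr)  # close to 1 at this noise level
```

The multi-frequency variants in the paper run analogous updates on powers of the phase observations and fuse the information; the sketch above shows only the single-frequency skeleton.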


FAIR for AI: An interdisciplinary and international community building perspective

arXiv.org Artificial Intelligence

A foundational set of findable, accessible, interoperable, and reusable (FAIR) principles were proposed in 2016 as prerequisites for proper data management and stewardship, with the goal of enabling the reusability of scholarly data. The principles were also meant to apply to other digital assets, at a high level, and over time, the FAIR guiding principles have been re-interpreted or extended to include the software, tools, algorithms, and workflows that produce data. FAIR principles are now being adapted in the context of AI models and datasets. Here, we present the perspectives, vision, and experiences of researchers from different countries, disciplines, and backgrounds who are leading the definition and adoption of FAIR principles in their communities of practice, and discuss outcomes that may result from pursuing and incentivizing FAIR AI research. The material for this report builds on the FAIR for AI Workshop held at Argonne National Laboratory on June 7, 2022.


Convolutional GRU Network for Seasonal Prediction of the El Niño-Southern Oscillation

arXiv.org Artificial Intelligence

Predicting sea surface temperature (SST) within the El Niño-Southern Oscillation (ENSO) region has been extensively studied due to its significant influence on global temperature and precipitation patterns. Statistical models such as the linear inverse model (LIM), analog forecasting (AF), and recurrent neural networks (RNN) have been widely used for ENSO prediction, offering flexibility and relatively low computational expense compared to large dynamic models. However, these models are limited in capturing spatial patterns in SST variability or rely on linear dynamics. Here we present a modified Convolutional Gated Recurrent Unit (ConvGRU) network for the ENSO-region spatio-temporal sequence prediction problem, along with Niño 3.4 index prediction as a downstream task. The proposed ConvGRU network, with an encoder-decoder sequence-to-sequence structure, takes historical SST maps of the Pacific region as input and generates future SST maps for subsequent months within the ENSO region. To evaluate the performance of the ConvGRU network, we trained and tested it using data from multiple large climate models. The results demonstrate that the ConvGRU network significantly improves the predictability of the Niño 3.4 index compared to LIM, AF, and RNN, as evidenced by an extended useful prediction range, higher Pearson correlation, and lower root-mean-square error. The proposed model holds promise for improving our understanding and prediction of the ENSO phenomenon, and can be broadly applied to other weather and climate prediction scenarios with spatial patterns and teleconnections.
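At its core, a ConvGRU cell is the standard GRU gating with convolutions in place of dense products, so the hidden state retains the spatial layout of the SST maps. Below is a minimal single-channel sketch with hand-rolled 3x3 same-padding convolutions; the kernel names, map sizes, and random inputs are our own illustrative assumptions, not the paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(3)

def conv2d_same(x, k):
    """3x3 'same' convolution (cross-correlation) via zero padding."""
    p = np.pad(x, 1)
    out = np.zeros_like(x)
    for i in range(3):
        for j in range(3):
            out += k[i, j] * p[i:i + x.shape[0], j:j + x.shape[1]]
    return out

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def convgru_step(x, h, K):
    # GRU equations with every matrix product replaced by a convolution.
    z = sigmoid(conv2d_same(x, K["zx"]) + conv2d_same(h, K["zh"]))  # update gate
    r = sigmoid(conv2d_same(x, K["rx"]) + conv2d_same(h, K["rh"]))  # reset gate
    h_tilde = np.tanh(conv2d_same(x, K["hx"]) + conv2d_same(r * h, K["hh"]))
    return (1 - z) * h + z * h_tilde

K = {name: 0.1 * rng.standard_normal((3, 3))
     for name in ["zx", "zh", "rx", "rh", "hx", "hh"]}

h = np.zeros((16, 16))
for _ in range(5):  # feed five monthly SST maps
    h = convgru_step(rng.standard_normal((16, 16)), h, K)
print(h.shape)  # (16, 16): the hidden state keeps the map's spatial shape
```

The encoder-decoder model stacks such cells: the encoder consumes the historical SST sequence into hidden maps, and the decoder unrolls them to emit future SST maps.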


Gauge Invariant and Anyonic Symmetric Autoregressive Neural Networks for Quantum Lattice Models

arXiv.org Artificial Intelligence

Symmetries such as gauge invariance and anyonic symmetry play a crucial role in quantum many-body physics. We develop a general approach to constructing gauge-invariant or anyonic symmetric autoregressive neural networks, covering a wide range of architectures such as the Transformer and recurrent neural networks, for quantum lattice models. These networks can be efficiently sampled and explicitly obey gauge symmetries or anyonic constraints. We prove that our methods can provide exact representations for the ground and excited states of the 2D and 3D toric codes and the X-cube fracton model. We variationally optimize our symmetry-incorporated autoregressive neural networks for ground states as well as real-time dynamics for a variety of models. We simulate the dynamics and ground states of the quantum link model of $\text{U(1)}$ lattice gauge theory, obtain the phase diagram for the 2D $\mathbb{Z}_2$ gauge theory, determine the phase transition and the central charge of the $\text{SU(2)}_3$ anyonic chain, and compute the ground-state energy of the $\text{SU(2)}$-invariant Heisenberg spin chain. Our approach provides powerful tools for exploring condensed matter physics, high energy physics, and quantum information science.


A Spectral Method for Joint Community Detection and Orthogonal Group Synchronization

arXiv.org Machine Learning

Community detection and synchronization are both fundamental problems in signal processing, machine learning, and computer vision. Recently, there has been increasing interest in their joint problem [27, 8, 44]: in the presence of heterogeneous data, where data points associated with random group elements (e.g., from the orthogonal group O(d) of dimension d) fall into multiple underlying clusters, the joint problem is to simultaneously recover the cluster structure and the group elements. A motivating example is the 2D class averaging process in cryo-electron microscopy single particle reconstruction [30, 58, 68], whose goal is to align (with SO(2) group synchronization) and average projection images of a single particle with similar viewing angles to improve their signal-to-noise ratio (SNR). Another application, in computer vision, is simultaneous permutation group synchronization and clustering on heterogeneous object collections consisting of 2D images or 3D shapes [8]. In this work, we study the joint problem based on the probabilistic model introduced in [27], which extends the celebrated stochastic block model (SBM) [19, 21, 22, 29, 38, 41, 49, 50, 51, 52] (see Figure 1) for community detection. In particular, we focus on the orthogonal group O(d), which covers the wide range of applications mentioned above.
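A building block shared by such O(d) synchronization methods is rounding a noisy d x d block back onto the group: the nearest orthogonal matrix in Frobenius norm is obtained from the SVD by discarding the singular values (a standard Procrustes-style projection, sketched here with numpy; the toy noise model is our own):

```python
import numpy as np

rng = np.random.default_rng(4)

def project_Od(M):
    # Nearest orthogonal matrix: keep the singular vectors, drop the values.
    U, _, Vt = np.linalg.svd(M)
    return U @ Vt

d = 3
Q = project_Od(rng.standard_normal((d, d)))   # a random element of O(d)
noisy = Q + 0.1 * rng.standard_normal((d, d))  # perturbed observation block
Q_hat = project_Od(noisy)

# The projection is exactly orthogonal again, and close to the original.
print(np.allclose(Q_hat @ Q_hat.T, np.eye(d), atol=1e-8))  # True
print(np.linalg.norm(Q_hat - Q))
```

In the joint problem, this rounding is applied blockwise to spectral or power-method estimates, while the block magnitudes carry the cluster information.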


A FAIR and AI-ready Higgs Boson Decay Dataset

arXiv.org Artificial Intelligence

To enable the reusability of massive scientific datasets by humans and machines, researchers aim to create scientific datasets that adhere to the principles of findability, accessibility, interoperability, and reusability (FAIR) for data and artificial intelligence (AI) models. This article provides a domain-agnostic, step-by-step assessment guide to evaluate whether or not a given dataset meets each FAIR principle. We then demonstrate how to use this guide to evaluate the FAIRness of an open simulated dataset produced by the CMS Collaboration at the CERN Large Hadron Collider. This dataset consists of Higgs boson decays and quark and gluon background, and is available through the CERN Open Data Portal. We also use other available tools to assess the FAIRness of this dataset, and incorporate feedback from members of the FAIR community to validate our results. This article is accompanied by a Jupyter notebook to facilitate an understanding and exploration of the dataset, including visualization of its elements. This study marks the first in a planned series of articles that will guide scientists in the creation and quantification of FAIRness in high energy particle physics datasets and AI models.