Not enough data to create a plot.
Try a different view from the menu above.
Zhang, Yijie
If there is no underfitting, there is no Cold Posterior Effect
Zhang, Yijie, Wu, Yi-Shan, Ortega, Luis A., Masegosa, Andrés R.
The cold posterior effect (CPE) (Wenzel et al., 2020) in Bayesian deep learning shows that, for posteriors with a temperature T < 1, the resulting posterior predictive could have better performances than the Bayesian posterior (T = 1). As the Bayesian posterior is known to be optimal under perfect model specification, many recent works have studied the presence of CPE as a model misspecification problem, arising from the prior and/or from the likelihood function. In this work, we provide a more nuanced understanding of the CPE as we show that misspecification leads to CPE only when the resulting Bayesian posterior underfits. In fact, we theoretically show that if there is no underfitting, there is no CPE. In Bayesian deep learning, the cold posterior effect (CPE) (Wenzel et al., 2020) refers to the phenomenon in which if we artificially "temper" the posterior by either p(θ|D) (p(D|θ)p(θ)) The discovery of the CPE has sparked debates in the community about its potential contributing factors. If the prior and likelihood are properly specified, the Bayesian solution (i.e., T = 1) should be optimal (Gelman et al., 2013), assuming approximate inference is properly working. Hence, the presence of the CPE implies either the prior (Wenzel et al., 2020; Fortuin et al., 2022), the likelihood (Aitchison, 2021; Kapoor et al., 2022), or both are misspecified. This has been, so far, the main argument of many works trying to explain the CPE. One line of research examines the impact of the prior misspecification on the CPE (Wenzel et al., 2020; Fortuin et al., 2022). The priors of modern Bayesian neural networks are often selected for tractability. Consequently, the quality of the selected priors in relation to the CPE is a natural concern.
Cycle Consistency-based Uncertainty Quantification of Neural Networks in Inverse Imaging Problems
Huang, Luzhe, Li, Jianing, Ding, Xiaofu, Zhang, Yijie, Chen, Hanlong, Ozcan, Aydogan
Uncertainty estimation is critical for numerous applications of deep neural networks and draws growing attention from researchers. Here, we demonstrate an uncertainty quantification approach for deep neural networks used in inverse problems based on cycle consistency. We build forward-backward cycles using the physical forward model available and a trained deep neural network solving the inverse problem at hand, and accordingly derive uncertainty estimators through regression analysis on the consistency of these forward-backward cycles. We theoretically analyze cycle consistency metrics and derive their relationship with respect to uncertainty, bias, and robustness of the neural network inference. To demonstrate the effectiveness of these cycle consistency-based uncertainty estimators, we classified corrupted and out-of-distribution input image data using some of the widely used image deblurring and super-resolution neural networks as testbeds. The blind testing of our method outperformed other models in identifying unseen input data corruption and distribution shifts. This work provides a simple-to-implement and rapid uncertainty quantification method that can be universally applied to various neural networks used for solving inverse problems.
Hierarchical Data-efficient Representation Learning for Tertiary Structure-based RNA Design
Tan, Cheng, Zhang, Yijie, Gao, Zhangyang, Cao, Hanqun, Li, Stan Z.
While artificial intelligence has made remarkable strides in revealing the relationship between biological macromolecules' primary sequence and tertiary structure, designing RNA sequences based on specified tertiary structures remains challenging. Though existing approaches in protein design have thoroughly explored structure-to-sequence dependencies in proteins, RNA design still confronts difficulties due to structural complexity and data scarcity. Adding to the problem, direct transplantation of protein design methodologies into RNA design fails to achieve satisfactory outcomes although sharing similar structural components. In this study, we aim to systematically construct a data-driven RNA design pipeline. We crafted a large, well-curated benchmark dataset and designed a comprehensive structural modeling approach to represent the complex RNA tertiary structure. More importantly, we proposed a hierarchical data-efficient representation learning framework that learns structural representations through contrastive learning at both cluster-level and sample-level to fully leverage the limited data. By constraining data representations within a limited hyperspherical space, the intrinsic relationships between data points could be explicitly imposed. Moreover, we incorporated extracted secondary structures with base pairs as prior knowledge to facilitate the RNA design process. Extensive experiments demonstrate the effectiveness of our proposed method, providing a reliable baseline for future RNA design tasks. The source code and benchmark dataset will be released publicly.
Deep Learning-enabled Virtual Histological Staining of Biological Samples
Bai, Bijie, Yang, Xilin, Li, Yuzhu, Zhang, Yijie, Pillar, Nir, Ozcan, Aydogan
Histological staining is the gold standard for tissue examination in clinical pathology and life-science research, which visualizes the tissue and cellular structures using chromatic dyes or fluorescence labels to aid the microscopic assessment of tissue. However, the current histological staining workflow requires tedious sample preparation steps, specialized laboratory infrastructure, and trained histotechnologists, making it expensive, time-consuming, and not accessible in resource-limited settings. Deep learning techniques created new opportunities to revolutionize staining methods by digitally generating histological stains using trained neural networks, providing rapid, cost-effective, and accurate alternatives to standard chemical staining methods. These techniques, broadly referred to as virtual staining, were extensively explored by multiple research groups and demonstrated to be successful in generating various types of histological stains from label-free microscopic images of unstained samples; similar approaches were also used for transforming images of an already stained tissue sample into another type of stain, performing virtual stain-to-stain transformations. In this Review, we provide a comprehensive overview of the recent research advances in deep learning-enabled virtual histological staining techniques. The basic concepts and the typical workflow of virtual staining are introduced, followed by a discussion of representative works and their technical innovations. We also share our perspectives on the future of this emerging field, aiming to inspire readers from diverse scientific fields to further expand the scope of deep learning-enabled virtual histological staining techniques and their applications.
Virtual impactor-based label-free bio-aerosol detection using holography and deep learning
Luo, Yi, Zhang, Yijie, Liu, Tairan, Yu, Alan, Wu, Yichen, Ozcan, Aydogan
Exposure to bio-aerosols such as mold spores and pollen can lead to adverse health effects. There is a need for a portable and cost-effective device for long-term monitoring and quantification of various bio-aerosols. To address this need, we present a mobile and cost-effective label-free bio-aerosol sensor that takes holographic images of flowing particulate matter concentrated by a virtual impactor, which selectively slows down and guides particles larger than ~6 microns to fly through an imaging window. The flowing particles are illuminated by a pulsed laser diode, casting their inline holograms on a CMOS image sensor in a lens-free mobile imaging device. The illumination contains three short pulses with a negligible shift of the flowing particle within one pulse, and triplicate holograms of the same particle are recorded at a single frame before it exits the imaging field-of-view, revealing different perspectives of each particle. The particles within the virtual impactor are localized through a differential detection scheme, and a deep neural network classifies the aerosol type in a label-free manner, based on the acquired holographic images. We demonstrated the success of this mobile bio-aerosol detector with a virtual impactor using different types of pollen (i.e., bermuda, elm, oak, pine, sycamore, and wheat) and achieved a blind classification accuracy of 92.91%. This mobile and cost-effective device weighs ~700 g and can be used for label-free sensing and quantification of various bio-aerosols over extended periods since it is based on a cartridge-free virtual impactor that does not capture or immobilize particulate matter.
Opponent Learning Awareness and Modelling in Multi-Objective Normal Form Games
Rădulescu, Roxana, Verstraeten, Timothy, Zhang, Yijie, Mannion, Patrick, Roijers, Diederik M., Nowé, Ann
Many real-world multi-agent interactions consider multiple distinct criteria, i.e. the payoffs are multi-objective in nature. However, the same multi-objective payoff vector may lead to different utilities for each participant. Therefore, it is essential for an agent to learn about the behaviour of other agents in the system. In this work, we present the first study of the effects of such opponent modelling on multi-objective multi-agent interactions with non-linear utilities. Specifically, we consider two-player multi-objective normal form games with non-linear utility functions under the scalarised expected returns optimisation criterion. We contribute novel actor-critic and policy gradient formulations to allow reinforcement learning of mixed strategies in this setting, along with extensions that incorporate opponent policy reconstruction and learning with opponent learning awareness (i.e., learning while considering the impact of one's policy when anticipating the opponent's learning step). Empirical results in five different MONFGs demonstrate that opponent learning awareness and modelling can drastically alter the learning dynamics in this setting. When equilibria are present, opponent modelling can confer significant benefits on agents that implement it. When there are no Nash equilibria, opponent learning awareness and modelling allows agents to still converge to meaningful solutions that approximate equilibria.
Relation Mention Extraction from Noisy Data with Hierarchical Reinforcement Learning
Feng, Jun, Huang, Minlie, Zhang, Yijie, Yang, Yang, Zhu, Xiaoyan
In this paper we address a task of relation mention extraction from noisy data: extracting representative phrases for a particular relation from noisy sentences that are collected via distant supervision. Despite its significance and value in many downstream applications, this task is less studied on noisy data. The major challenges exists in 1) the lack of annotation on mention phrases, and more severely, 2) handling noisy sentences which do not express a relation at all. To address the two challenges, we formulate the task as a semi-Markov decision process and propose a novel hierarchical reinforcement learning model. Our model consists of a top-level sentence selector to remove noisy sentences, a low-level mention extractor to extract relation mentions, and a reward estimator to provide signals to guide data denoising and mention extraction without explicit annotations. Experimental results show that our model is effective to extract relation mentions from noisy data.