
Collaborating Authors

 Chen, Junyu


Deep Compression Autoencoder for Efficient High-Resolution Diffusion Models

arXiv.org Artificial Intelligence

Existing autoencoders have demonstrated impressive results at a moderate spatial compression ratio (e.g., 8×), but fail to maintain satisfactory reconstruction accuracy at high spatial compression ratios (e.g., 64×). We address this challenge by introducing two key techniques: (1) Residual Autoencoding, where we design our models to learn residuals on top of space-to-channel transformed features to alleviate the optimization difficulty of high spatial-compression autoencoders; (2) Decoupled High-Resolution Adaptation, an efficient decoupled three-phase training strategy for mitigating the generalization penalty of high spatial-compression autoencoders. With these designs, we improve the autoencoder's spatial compression ratio up to 128× while maintaining reconstruction quality. Applying our DC-AE to latent diffusion models, we achieve significant speedups without any accuracy drop. For example, on ImageNet 512×512, our DC-AE provides 19.1× inference speedup and 17.9× training speedup on an H100 GPU for UViT-H while achieving a better FID than the widely used SD-VAE-f8 autoencoder. Latent diffusion models (Rombach et al., 2022) have emerged as a leading framework and demonstrated great success in image synthesis (Labs, 2024; Esser et al., 2024). They employ an autoencoder to project images into a latent space to reduce the cost of diffusion models.
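
A minimal sketch of the residual-autoencoding idea, assuming a PyTorch setup: the downsampling block adds a parameter-free space-to-channel shortcut (with channel averaging to match widths), so the convolutions only need to learn a residual correction rather than the full compression. Module and variable names are illustrative, not the authors' code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualDownBlock(nn.Module):
    """Downsample 2x by learning a residual on top of a
    space-to-channel rearrangement (illustrative sketch)."""

    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, stride=2, padding=1),
            nn.SiLU(),
            nn.Conv2d(out_ch, out_ch, 3, padding=1),
        )
        # assumes 4 * in_ch is divisible by out_ch, so channel
        # averaging can match widths without adding parameters
        self.group = (4 * in_ch) // out_ch

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        shortcut = F.pixel_unshuffle(x, 2)          # (B, 4C, H/2, W/2), lossless
        B, C4, H, W = shortcut.shape
        shortcut = shortcut.view(B, self.group, -1, H, W).mean(1)  # match out_ch
        return self.conv(x) + shortcut              # learn only the residual

x = torch.randn(1, 64, 32, 32)
y = ResidualDownBlock(64, 128)(x)
print(y.shape)  # torch.Size([1, 128, 16, 16])
```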


MobileH2R: Learning Generalizable Human to Mobile Robot Handover Exclusively from Scalable and Diverse Synthetic Data

arXiv.org Artificial Intelligence

This paper introduces MobileH2R, a framework for learning generalizable vision-based human-to-mobile-robot (H2MR) handover skills. Unlike traditional fixed-base handovers, this task requires a mobile robot to reliably receive objects in a large workspace enabled by its mobility. Our key insight is that generalizable handover skills can be developed in simulators using high-quality synthetic data, without the need for real-world demonstrations. To achieve this, we propose a scalable pipeline for generating diverse synthetic full-body human motion data, an automated method for creating safe and imitation-friendly demonstrations, and an efficient 4D imitation learning method for distilling large-scale demonstrations into closed-loop policies with base-arm coordination. Experimental evaluations in both simulators and the real world show significant improvements (at least +15% success rate) over baseline methods in all cases. Experiments also validate that large-scale and diverse synthetic data greatly enhances robot learning, highlighting our scalable framework.
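
The closed-loop policy with base-arm coordination might look roughly like the sketch below: a per-frame point-cloud encoder, a temporal model over a short observation history, and separate heads for base and arm commands. This is a hypothetical minimal layout, not the MobileH2R architecture; all dimensions are assumptions.

```python
import torch
import torch.nn as nn

class BaseArmPolicy(nn.Module):
    """Illustrative closed-loop policy: encodes a short history of
    point clouds and emits coordinated base and arm actions."""

    def __init__(self, d=256, arm_dof=7):
        super().__init__()
        self.point_mlp = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, d))
        self.temporal = nn.GRU(d, d, batch_first=True)
        self.base_head = nn.Linear(d, 3)       # planar base: (vx, vy, yaw rate)
        self.arm_head = nn.Linear(d, arm_dof)  # arm joint velocities

    def forward(self, clouds):                 # clouds: (B, T, N, 3)
        feat = self.point_mlp(clouds).max(dim=2).values  # per-frame global feature
        h, _ = self.temporal(feat)                       # (B, T, d)
        last = h[:, -1]
        return self.base_head(last), self.arm_head(last)

policy = BaseArmPolicy()
base_cmd, arm_cmd = policy(torch.randn(2, 4, 1024, 3))
print(base_cmd.shape, arm_cmd.shape)  # (2, 3) (2, 7)
```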


INSIGHT: Explainable Weakly-Supervised Medical Image Analysis

arXiv.org Artificial Intelligence

Pathology images (WSIs) are often processed by extracting embeddings from local regions and then having an aggregator make predictions from this set, because processing such images end-to-end with deep neural networks is computationally infeasible. Pipelines therefore rely on aggregators that synthesize local embeddings extracted from tiles (WSIs) or slices (volumes) into global predictions [5, 6, 23]. While this divide-and-conquer strategy is efficient, current methods often discard spatial information during feature aggregation and depend on post-hoc visualization tools, such as Grad-CAM [33], to generate interpretable heatmaps. These visualizations are prone to missing clinically significant features, often fail to localize small yet clinically crucial details, and introduce additional complexity. To address these limitations, we introduce INSIGHT, a novel weakly-supervised aggregator that integrates heatmap generation as an inductive bias. Starting from pre-trained feature maps, INSIGHT employs a detection module with small convolutional kernels to capture fine details.
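
The heatmap-as-inductive-bias idea can be sketched as follows, assuming tile embeddings arranged on a spatial grid: a small-kernel detection head scores each location, and the slide-level logit is a heatmap-weighted pool, so the heatmap is produced inside the forward pass rather than by a post-hoc tool. Names and sizes are illustrative, not INSIGHT's implementation.

```python
import torch
import torch.nn as nn

class HeatmapAggregator(nn.Module):
    """Illustrative weakly-supervised aggregator with a built-in heatmap."""

    def __init__(self, d=768, n_classes=2):
        super().__init__()
        self.detect = nn.Conv2d(d, 1, kernel_size=3, padding=1)  # small kernels for fine detail
        self.classify = nn.Conv2d(d, n_classes, kernel_size=1)

    def forward(self, feat):                    # feat: (B, d, H, W) grid of tile embeddings
        heat = torch.sigmoid(self.detect(feat))             # (B, 1, H, W), trained with slide labels only
        logits = self.classify(feat)                        # (B, C, H, W) per-location evidence
        pooled = (heat * logits).flatten(2).sum(-1) / heat.flatten(2).sum(-1).clamp_min(1e-6)
        return pooled, heat                                 # slide-level logits + heatmap

model = HeatmapAggregator()
logits, heat = model(torch.randn(2, 768, 16, 16))
print(logits.shape, heat.shape)  # (2, 2) (2, 1, 16, 16)
```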


Gene-Metabolite Association Prediction with Interactive Knowledge Transfer Enhanced Graph for Metabolite Production

arXiv.org Artificial Intelligence

In the rapidly evolving field of metabolic engineering, the quest for efficient and precise gene target identification for enhancing metabolite production presents significant challenges. Traditional approaches, whether knowledge-based or model-based, are notably time-consuming and labor-intensive, due to the vast scale of the research literature and the approximate nature of genome-scale metabolic model (GEM) simulations. We therefore propose a new task, Gene-Metabolite Association Prediction based on metabolic graphs, to automate the process of candidate gene discovery for a given metabolite and its candidate-associated genes, and present the first benchmark containing 2474 metabolites and 1947 genes of two commonly used microorganisms, Saccharomyces cerevisiae (SC) and Issatchenkia orientalis (IO). This task is challenging due to the incompleteness of the metabolic graphs and the heterogeneity among distinct metabolisms. To overcome these limitations, we propose an Interactive Knowledge Transfer mechanism based on Metabolism Graph (IKT4Meta), which improves association prediction accuracy by integrating knowledge from different metabolism graphs. First, to build a bridge between the two graphs for knowledge transfer, we utilize Pretrained Language Models (PLMs) with external knowledge of genes and metabolites to help generate inter-graph links, significantly alleviating the impact of heterogeneity. Second, we propagate intra-graph links across the different metabolic graphs using the inter-graph links as anchors. Finally, we conduct gene-metabolite association prediction on the enriched metabolism graphs, which integrate knowledge from multiple microorganisms. Experiments on both organisms demonstrate that our proposed methodology outperforms baselines by up to 12.3% across various link prediction frameworks.
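
The PLM-based inter-graph linking step might be approximated as below: embed textual descriptions of genes and metabolites from each graph with a pretrained encoder and treat high-similarity cross-graph pairs as anchor links. The encoder choice, example node names, and threshold are assumptions for illustration, not the paper's settings.

```python
# Illustrative anchor-link generation between two metabolic graphs.
import torch
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed encoder, not the paper's

sc_nodes = ["pyruvate decarboxylase PDC1", "ethanol", "glucose-6-phosphate"]
io_nodes = ["pyruvate decarboxylase", "ethanol", "citrate"]

emb_sc = encoder.encode(sc_nodes, convert_to_tensor=True, normalize_embeddings=True)
emb_io = encoder.encode(io_nodes, convert_to_tensor=True, normalize_embeddings=True)

sim = emb_sc @ emb_io.T                      # cosine similarity matrix
threshold = 0.8                              # assumed cutoff
anchors = [(sc_nodes[i], io_nodes[j])
           for i, j in (sim > threshold).nonzero(as_tuple=False).tolist()]
print(anchors)  # candidate inter-graph links, used as anchors to propagate intra-graph edges
```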


HART: Efficient Visual Generation with Hybrid Autoregressive Transformer

arXiv.org Artificial Intelligence

Figure 1: HART is an early autoregressive model that can directly generate 1024×1024 images with quality comparable to diffusion models, while offering significantly improved efficiency: it achieves 4.5-7.7× higher throughput and 3.1-5.9× lower MACs. Check out our online demo and video. We introduce the Hybrid Autoregressive Transformer (HART), an autoregressive (AR) visual generation model capable of directly generating 1024×1024 images, rivaling diffusion models in image generation quality. Existing AR models face limitations due to the poor image reconstruction quality of their discrete tokenizers and the prohibitive training costs associated with generating 1024px images. To address these challenges, we present the hybrid tokenizer, which decomposes the continuous latents from the autoencoder into two components: discrete tokens representing the big picture and continuous tokens representing the residual components that cannot be represented by the discrete tokens. The discrete component is modeled by a scalable-resolution discrete AR model, while the continuous component is learned with a lightweight residual diffusion module with only 37M parameters. Compared with the discrete-only VAR tokenizer, our hybrid approach improves reconstruction FID from 2.11 to 0.30 on MJHQ-30K, leading to a 31% generation FID improvement from 7.85 to 5.38. HART also outperforms state-of-the-art diffusion models in both FID and CLIP score, with 4.5-7.7× higher throughput and 6.9-13.4× lower MACs. Part of the work was done when Haotian Tang and Shang Yang were summer interns at NVIDIA.
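
A minimal sketch of the hybrid-tokenizer decomposition, assuming a plain nearest-neighbor VQ codebook: the discrete tokens carry the coarse content, and the quantization residual is kept as continuous tokens (which HART models with a small diffusion module; here they are simply returned). Names and sizes are illustrative.

```python
import torch
import torch.nn as nn

class HybridTokenizer(nn.Module):
    """Illustrative split of a continuous latent into discrete + residual parts."""

    def __init__(self, n_codes=4096, d=32):
        super().__init__()
        self.codebook = nn.Embedding(n_codes, d)

    def forward(self, z):                          # z: (B, N, d) continuous latents
        dist = torch.cdist(z, self.codebook.weight)   # distance to every code
        idx = dist.argmin(-1)                         # discrete tokens (big picture)
        residual = z - self.codebook(idx)             # continuous tokens (fine detail)
        return idx, residual

    def decode(self, idx, residual):
        return self.codebook(idx) + residual          # exact latent if residual is kept

tok = HybridTokenizer()
z = torch.randn(2, 256, 32)
idx, res = tok(z)
print(torch.allclose(tok.decode(idx, res), z))  # True
```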


GenH2R: Learning Generalizable Human-to-Robot Handover via Scalable Simulation, Demonstration, and Imitation

arXiv.org Artificial Intelligence

This paper presents GenH2R, a framework for learning generalizable vision-based human-to-robot (H2R) handover skills. The goal is to equip robots with the ability to reliably receive objects with unseen geometry handed over by humans along various complex trajectories. We acquire such generalizability by learning H2R handover at scale with a comprehensive solution that includes procedural simulation asset creation, automated demonstration generation, and effective imitation learning. We leverage large-scale 3D model repositories, dexterous grasp generation methods, and curve-based 3D animation to create an H2R handover simulation environment named GenH2R-Sim, surpassing the number of scenes in existing simulators by three orders of magnitude. We further introduce a distillation-friendly demonstration generation method that automatically generates a million high-quality demonstrations suitable for learning. Finally, we present a 4D imitation learning method augmented by a future forecasting objective to distill demonstrations into a visuo-motor handover policy. Experimental evaluations in both simulators and the real world demonstrate significant improvements (at least +10% success rate) over baselines in all cases. The project page is https://GenH2R.github.io/.
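
The future-forecasting augmentation can be illustrated as an auxiliary head and loss term, as in the hypothetical sketch below: the policy predicts expert actions and, in addition, future point positions, so its representation must anticipate motion. Shapes and the loss weight are assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

class ForecastingPolicy(nn.Module):
    """Illustrative visuo-motor policy with an auxiliary forecasting head."""

    def __init__(self, d=256, action_dim=6, horizon=3, n_points=128):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, d))
        self.action_head = nn.Linear(d, action_dim)
        self.forecast_head = nn.Linear(d, horizon * n_points * 3)
        self.horizon, self.n_points = horizon, n_points

    def forward(self, cloud):                      # cloud: (B, N, 3)
        g = self.encoder(cloud).max(1).values      # global point feature
        future = self.forecast_head(g).view(-1, self.horizon, self.n_points, 3)
        return self.action_head(g), future

policy = ForecastingPolicy()
cloud, expert_action = torch.randn(4, 128, 3), torch.randn(4, 6)
future_gt = torch.randn(4, 3, 128, 3)              # future point positions from demos
action, future = policy(cloud)
# behavior cloning loss + weighted future-forecasting loss
loss = nn.functional.mse_loss(action, expert_action) \
     + 0.5 * nn.functional.mse_loss(future, future_gt)
loss.backward()
```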


Learning to Evaluate the Artness of AI-generated Images

arXiv.org Artificial Intelligence

Assessing the artness of AI-generated images continues to be a challenge within the realm of image generation. Most existing metrics cannot be used to perform instance-level and reference-free artness evaluation. This paper presents ArtScore, a metric designed to evaluate the degree to which an image resembles authentic artworks by artists (or conversely photographs), thereby offering a novel approach to artness assessment. We first blend pre-trained models for photo and artwork generation, resulting in a series of mixed models. Subsequently, we utilize these mixed models to generate images exhibiting varying degrees of artness with pseudo-annotations. Each photorealistic image has a corresponding artistic counterpart and a series of interpolated images that range from realistic to artistic. This dataset is then employed to train a neural network that learns to estimate quantized artness levels of arbitrary images. Extensive experiments reveal that the artness levels predicted by ArtScore align more closely with human artistic evaluation than existing evaluation metrics, such as Gram loss and ArtFID.
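
The model-blending step can be sketched as plain weight interpolation between two pretrained generators; each blended generator's coefficient serves as the pseudo artness label for the images it produces. The generators below are tiny stand-in modules, not the paper's actual checkpoints.

```python
import copy
import torch

def blend(photo_gen: torch.nn.Module, art_gen: torch.nn.Module, alpha: float):
    """Return a generator whose weights interpolate photo (alpha=0) -> art (alpha=1)."""
    mixed = copy.deepcopy(photo_gen)
    with torch.no_grad():
        for p_mix, p_photo, p_art in zip(mixed.parameters(),
                                         photo_gen.parameters(),
                                         art_gen.parameters()):
            p_mix.copy_((1 - alpha) * p_photo + alpha * p_art)
    return mixed

photo_gen = torch.nn.Linear(8, 8)   # stand-ins for pretrained generators
art_gen = torch.nn.Linear(8, 8)
levels = [blend(photo_gen, art_gen, a) for a in (0.0, 0.25, 0.5, 0.75, 1.0)]
# each generator in `levels` yields training images pseudo-labeled with its alpha,
# giving interpolations from realistic to artistic for training the artness regressor
```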


Predicting Adverse Neonatal Outcomes for Preterm Neonates with Multi-Task Learning

arXiv.org Artificial Intelligence

Diagnosis of adverse neonatal outcomes is crucial for preterm survival since it enables doctors to provide timely treatment. Machine learning (ML) algorithms have been demonstrated to be effective in predicting adverse neonatal outcomes. However, most previous ML-based methods have only focused on predicting a single outcome, ignoring the potential correlations between different outcomes, and potentially leading to suboptimal results and overfitting issues. In this work, we first analyze the correlations between three adverse neonatal outcomes and then formulate the diagnosis of multiple neonatal outcomes as a multi-task learning (MTL) problem. We then propose an MTL framework to jointly predict multiple adverse neonatal outcomes. In particular, the MTL framework contains shared hidden layers and multiple task-specific branches. Extensive experiments have been conducted using Electronic Health Records (EHRs) from 121 preterm neonates. Empirical results demonstrate the effectiveness of the MTL framework. Furthermore, the feature importance is analyzed for each neonatal outcome, providing insights into model interpretability.
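
A minimal sketch of the described shared-trunk design: shared hidden layers encode the EHR features, and one task-specific branch per outcome produces its own logit, with the task losses summed for joint training. Layer sizes and feature counts are assumptions.

```python
import torch
import torch.nn as nn

class MTLNet(nn.Module):
    """Shared hidden layers + one branch per neonatal outcome."""

    def __init__(self, n_features=40, n_tasks=3, hidden=64):
        super().__init__()
        self.shared = nn.Sequential(
            nn.Linear(n_features, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.branches = nn.ModuleList([
            nn.Sequential(nn.Linear(hidden, 32), nn.ReLU(), nn.Linear(32, 1))
            for _ in range(n_tasks)
        ])

    def forward(self, x):
        h = self.shared(x)                                   # shared representation
        return torch.cat([b(h) for b in self.branches], dim=1)  # one logit per outcome

model = MTLNet()
x, y = torch.randn(8, 40), torch.randint(0, 2, (8, 3)).float()
loss = nn.functional.binary_cross_entropy_with_logits(model(x), y)  # joint task loss
loss.backward()
```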


An investigation of licensing of datasets for machine learning based on the GQM model

arXiv.org Artificial Intelligence

Dataset licensing is currently an issue in the development of machine learning systems, which rely heavily on publicly available datasets. However, since the images in publicly available datasets are mainly obtained from the Internet, some of them are not commercially usable. Furthermore, developers of machine learning systems often do not check the license of a dataset when training machine learning models with it. In short, the licensing of datasets for machine learning systems is incomplete in every respect at this stage. Our investigation of two dataset collections revealed that most current datasets lack licenses, and this absence makes it impossible to determine their commercial availability. We therefore take a more scientific and systematic approach to investigating the licensing of datasets, and of the machine learning systems that use them, so that future developers of machine learning systems can work more easily and compliantly.


A Noise-level-aware Framework for PET Image Denoising

arXiv.org Artificial Intelligence

In PET, the amount of relative (signal-dependent) noise present in different body regions can vary significantly and is inherently related to the number of counts present in each region. The number of counts in a region depends, in principle and among other factors, on the total administered activity, scanner sensitivity, image acquisition duration, radiopharmaceutical tracer uptake in the region, and the patient's local body morphometry surrounding the region. In theory, less denoising is needed for a high-count (low relative noise) image than for a low-count (high relative noise) image, and vice versa. Current deep-learning-based methods for PET image denoising are predominantly trained on image appearance only and have no special treatment for images of different noise levels. Our hypothesis is that by explicitly providing the local relative noise level of the input image to a deep convolutional neural network (DCNN), the DCNN can outperform the same network trained on image appearance alone. To this end, we propose a noise-level-aware denoising framework that allows the local noise level to be embedded into a DCNN. The proposed framework is trained and tested on 30 and 15 patient PET images, respectively, acquired on a GE Discovery MI PET/CT system. Our experiments showed that the increases in both PSNR and SSIM from our backbone network with relative noise level embedding (NLE), versus the same network without NLE, were statistically significant (p < 0.001), and the proposed method significantly outperformed a strong baseline method by a large margin.
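
One simple way to realize noise-level embedding, consistent with the description above though not necessarily the paper's architecture, is to feed the local relative noise level to the network as an extra input channel alongside the noisy image:

```python
import torch
import torch.nn as nn

class NLEDenoiser(nn.Module):
    """Toy residual denoiser conditioned on a local noise-level map."""

    def __init__(self, ch=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(2, ch, 3, padding=1), nn.ReLU(),   # 2 channels: image + noise map
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, 1, 3, padding=1),
        )

    def forward(self, image, noise_map):
        x = torch.cat([image, noise_map], dim=1)         # embed noise level at the input
        return image + self.net(x)                       # residual denoising

den = NLEDenoiser()
img = torch.randn(1, 1, 128, 128)
noise = torch.rand(1, 1, 128, 128)   # local relative noise level, e.g. from count statistics
print(den(img, noise).shape)         # torch.Size([1, 1, 128, 128])
```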