AITopics | Li, Tianhong

Unified Autoregressive Visual Generation and Understanding with Continuous Tokens

Fan, Lijie, Tang, Luming, Qin, Siyang, Li, Tianhong, Yang, Xuan, Qiao, Siyuan, Steiner, Andreas, Sun, Chen, Li, Yuanzhen, Zhu, Tao, Rubinstein, Michael, Raptis, Michalis, Sun, Deqing, Soricut, Radu

arXiv.org Artificial Intelligence | Mar-17-2025

Our unified autoregressive architecture, UniFluid, processes multimodal image and text inputs, generating discrete tokens for text and continuous tokens for images. We find that, although there is an inherent trade-off between the image generation and understanding tasks, a carefully tuned training recipe enables them to improve each other. By selecting an appropriate loss balance weight, the unified model achieves results comparable to or exceeding those of single-task baselines on both tasks. Furthermore, we demonstrate that employing stronger pre-trained LLMs and random-order generation during training is important to achieve high-fidelity image generation within this unified framework. Built upon the Gemma model series, UniFluid exhibits competitive performance across both image generation and understanding, demonstrating strong transferability to various downstream tasks, including image editing for generation, as well as visual captioning and question answering for understanding.

arxiv preprint arxiv, machine learning, natural language, (16 more...)

2503.13436

Country: Europe > Switzerland > Zürich > Zürich (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Fractal Generative Models

Li, Tianhong, Sun, Qinyi, Fan, Lijie, He, Kaiming

arXiv.org Artificial Intelligence | Feb-25-2025

Modularization is a cornerstone of computer science, abstracting complex functions into atomic building blocks. In this paper, we introduce a new level of modularization by abstracting generative models into atomic generative modules. Analogous to fractals in mathematics, our method constructs a new type of generative model by recursively invoking atomic generative modules, resulting in self-similar fractal architectures that we call fractal generative models. As a running example, we instantiate our fractal framework using autoregressive models as the atomic generative modules and examine it on the challenging task of pixel-by-pixel image generation, demonstrating strong performance in both likelihood estimation and generation quality. We hope this work could open a new paradigm in generative modeling and provide a fertile ground for future research. Code is available at https://github.com/LTH14/fractalgen.
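The recursive construction described in this abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration: `atomic_module` is a hypothetical stand-in (a hand-written sampling rule, not a trained autoregressive network), and the quadrant-splitting layout is one simple way to make the recursion self-similar. It is not the code from the linked repository.

```python
import random

def atomic_module(context, rng):
    """Hypothetical atomic generative module (toy rule, not a trained model).

    Draws four child values one after another; each draw depends on the
    parent context and on the previously drawn sibling.
    """
    children = []
    prev = context
    for _ in range(4):
        p = 0.5 * context + 0.5 * prev      # lean toward parent and last sibling
        p = min(max(p, 0.05), 0.95)         # keep some randomness at every level
        child = 1.0 if rng.random() < p else 0.0
        children.append(child)
        prev = child
    return children

def generate(levels, context, rng):
    """Recursively invoke the same module; returns a 2**levels square binary image."""
    if levels == 0:
        return [[int(context)]]             # base case: a single pixel
    quads = [generate(levels - 1, c, rng) for c in atomic_module(context, rng)]
    top_left, top_right, bottom_left, bottom_right = quads
    top = [a + b for a, b in zip(top_left, top_right)]
    bottom = [a + b for a, b in zip(bottom_left, bottom_right)]
    return top + bottom

image = generate(3, 0.5, random.Random(0))  # 8 x 8 image from three recursion levels
```

Each level reuses the same module on a smaller region, so the number of generated pixels grows as 4 to the power of the depth while the module itself stays fixed.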

artificial intelligence, machine learning, natural language, (17 more...)

2502.17437

Country: North America > United States (0.14)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine (0.46)
Information Technology (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Generation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

CellFlow: Simulating Cellular Morphology Changes via Flow Matching

Zhang, Yuhui, Su, Yuchang, Wang, Chenyu, Li, Tianhong, Wefers, Zoe, Nirschl, Jeffrey, Burgess, James, Ding, Daisy, Lozano, Alejandro, Lundberg, Emma, Yeung-Levy, Serena

arXiv.org Artificial Intelligence | Feb-13-2025

Building a virtual cell capable of accurately simulating cellular behaviors in silico has long been a dream in computational biology. We introduce CellFlow, an image-generative model that simulates cellular morphology changes induced by chemical and genetic perturbations using flow matching. Unlike prior methods, CellFlow models distribution-wise transformations from unperturbed to perturbed cell states, effectively distinguishing actual perturbation effects from experimental artifacts such as batch effects -- a major challenge in biological data. Evaluated on chemical (BBBC021), genetic (RxRx1), and combined perturbation (JUMP) datasets, CellFlow generates biologically meaningful cell images that faithfully capture perturbation-specific morphological changes, achieving a 35% improvement in FID scores and a 12% increase in mode-of-action prediction accuracy over existing methods. Additionally, CellFlow enables continuous interpolation between cellular states, providing a potential tool for studying perturbation dynamics. These capabilities mark a significant step toward realizing virtual cell modeling for biomedical research.
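As a rough illustration of flow matching between two distributions, here is a one-dimensional toy sketch. It assumes the common linear-interpolation path with target velocity `x1 - x0`; the abstract does not say which variant CellFlow uses. The Gaussian "unperturbed" and "perturbed" samples and the constant-velocity model are hypothetical stand-ins for cell images and a neural network.

```python
import random

rng = random.Random(0)

def sample_pair():
    """Toy stand-ins: x0 from the source distribution, x1 from the target."""
    x0 = rng.gauss(0.0, 1.0)   # "unperturbed" sample (assumed N(0, 1))
    x1 = rng.gauss(3.0, 1.0)   # "perturbed" sample (assumed N(3, 1))
    return x0, x1

def training_example():
    """One flow-matching regression example on the linear path."""
    x0, x1 = sample_pair()
    t = rng.random()                  # time drawn uniformly from [0, 1)
    x_t = (1.0 - t) * x0 + t * x1     # point on the straight path at time t
    target = x1 - x0                  # velocity of that path
    return x_t, t, target

# Simplest possible model: a constant velocity v(x, t) = theta.
# The mean-squared-error minimizer for a constant is the mean of the targets.
targets = [training_example()[2] for _ in range(5000)]
theta = sum(targets) / len(targets)

def integrate(x, steps=10):
    """Euler integration of dx/dt = theta from t = 0 to t = 1."""
    dt = 1.0 / steps
    for _ in range(steps):
        x += theta * dt
    return x

moved = [integrate(rng.gauss(0.0, 1.0)) for _ in range(5000)]
moved_mean = sum(moved) / len(moved)
```

Stopping the integration at an intermediate time gives a state between source and target, which is the kind of interpolation the abstract mentions.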

artificial intelligence, machine learning, natural language, (17 more...)

2502.09775

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Fluid: Scaling Autoregressive Text-to-image Generative Models with Continuous Tokens

Fan, Lijie, Li, Tianhong, Qin, Siyang, Li, Yuanzhen, Sun, Chen, Rubinstein, Michael, Sun, Deqing, He, Kaiming, Tian, Yonglong

arXiv.org Artificial Intelligence | Oct-17-2024

Scaling up autoregressive models in vision has not proven as beneficial as in large language models. In this work, we investigate this scaling problem in the context of text-to-image generation, focusing on two critical factors: whether models use discrete or continuous tokens, and whether tokens are generated in a random or fixed raster order using BERT- or GPT-like transformer architectures. Our empirical results show that, while all models scale effectively in terms of validation loss, their evaluation performance -- measured by FID, GenEval score, and visual quality -- follows different trends. Models based on continuous tokens achieve significantly better visual quality than those using discrete tokens. Furthermore, the generation order and attention mechanisms significantly affect the GenEval score: random-order models achieve notably better GenEval scores than raster-order models. Inspired by these findings, we train Fluid, a random-order autoregressive model on continuous tokens. The Fluid 10.5B model achieves a new state-of-the-art zero-shot FID of 6.16 on MS-COCO 30K and an overall score of 0.69 on the GenEval benchmark. We hope our findings and results will encourage future efforts to further bridge the scaling gap between vision and language models.

large language model, machine learning, natural language, (16 more...)

2410.13863

Genre: Research Report > New Finding (0.86)

Industry: Leisure & Entertainment > Sports (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Knowledge Distillation from Few Samples

Li, Tianhong, Li, Jianguo, Liu, Zhuang, Zhang, Changshui

arXiv.org Machine Learning | Dec-5-2018

Current knowledge distillation methods require full training data to distill knowledge from a large "teacher" network to a compact "student" network by matching certain statistics between "teacher" and "student" such as softmax outputs and feature responses. This is not only time-consuming but also inconsistent with human cognition in which children can learn knowledge from adults with few examples. This paper proposes a novel and simple method for knowledge distillation from few samples. Assuming that both "teacher" and "student" have the same feature map sizes at each corresponding block, we add a 1x1 conv-layer at the end of each block in the student-net, and align the block-level outputs between "teacher" and "student" by estimating the parameters of the added layer with limited samples. We prove that the added layer can be absorbed/merged into the previous conv-layer to formulate a new conv-layer with the same number of parameters and computation cost as the previous one. Experiments verify that the proposed method is very efficient and effective to distill knowledge from teacher-net to student-net constructed in different ways on various datasets.
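The claim that a 1x1 conv-layer can be merged into the preceding conv-layer can be checked numerically. The sketch below is a pure-Python illustration under stated assumptions: no nonlinearity or bias sits between the two layers, the 1x1 layer is a square channel-mixing matrix `q`, and the shapes and helper names (`conv2d`, `pointwise`, `merge_pointwise`) are hypothetical, not taken from the paper.

```python
import random

def conv2d(x, w):
    """Naive 'valid' cross-correlation.

    x: [C_in][H][W], w: [C_out][C_in][k][k] -> [C_out][H-k+1][W-k+1].
    """
    c_in, h, wd = len(x), len(x[0]), len(x[0][0])
    k = len(w[0][0])
    return [[[sum(w_o[c][a][b] * x[c][i + a][j + b]
                  for c in range(c_in) for a in range(k) for b in range(k))
              for j in range(wd - k + 1)]
             for i in range(h - k + 1)]
            for w_o in w]

def pointwise(y, q):
    """1x1 convolution: mix channels at every pixel with q[C_new][C_old]."""
    return [[[sum(q_o[c] * y[c][i][j] for c in range(len(y)))
              for j in range(len(y[0][0]))]
             for i in range(len(y[0]))]
            for q_o in q]

def merge_pointwise(w, q):
    """Fold q into w: merged[o] = sum over c of q[o][c] * w[c]."""
    c_mid, c_in, k = len(w), len(w[0]), len(w[0][0])
    return [[[[sum(q_o[c] * w[c][i][a][b] for c in range(c_mid))
               for b in range(k)]
              for a in range(k)]
             for i in range(c_in)]
            for q_o in q]

rng = random.Random(0)

def rand(*shape):
    """Nested lists of standard normal samples with the given shape."""
    if len(shape) == 1:
        return [rng.gauss(0.0, 1.0) for _ in range(shape[0])]
    return [rand(*shape[1:]) for _ in range(shape[0])]

x = rand(3, 6, 6)        # input feature map: 3 channels, 6 x 6
w = rand(4, 3, 3, 3)     # 3 x 3 conv: 3 -> 4 channels
q = rand(4, 4)           # added 1 x 1 conv: 4 -> 4 channels

merged_w = merge_pointwise(w, q)
two_step = pointwise(conv2d(x, w), q)   # conv followed by the 1 x 1 layer
one_step = conv2d(x, merged_w)          # single merged conv
max_abs_diff = max(abs(a - b)
                   for plane_a, plane_b in zip(two_step, one_step)
                   for row_a, row_b in zip(plane_a, plane_b)
                   for a, b in zip(row_a, row_b))
```

Because `q` is square, `merged_w` has the same shape as `w`, which matches the abstract's statement that the merged layer costs no extra parameters or computation.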

deep learning, fskd, neural network, (19 more...)

1812.01839

Country:

North America > United States > Massachusetts (0.14)
North America > United States > California (0.14)

Genre: Research Report (0.64)

Industry: Education (0.89)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Quadratic Upper Bound for Recursive Teaching Dimension of Finite VC Classes

Hu, Lunjia, Wu, Ruihan, Li, Tianhong, Wang, Liwei

arXiv.org Machine Learning | Feb-18-2017

In this work we study the quantitative relation between the recursive teaching dimension (RTD) and the VC dimension (VCD) of concept classes of finite sizes. The RTD of a concept class $\mathcal C \subseteq \{0, 1\}^n$, introduced by Zilles et al. (2011), is a combinatorial complexity measure characterized by the worst-case number of examples necessary to identify a concept in $\mathcal C$ according to the recursive teaching model. For any finite concept class $\mathcal C \subseteq \{0,1\}^n$ with $\mathrm{VCD}(\mathcal C)=d$, Simon & Zilles (2015) posed the open problem of whether $\mathrm{RTD}(\mathcal C) = O(d)$, i.e., whether RTD is linearly upper bounded by VCD. Previously, the best known result was an exponential upper bound $\mathrm{RTD}(\mathcal C) = O(d \cdot 2^d)$, due to Chen et al. (2016). In this paper, we show a quadratic upper bound: $\mathrm{RTD}(\mathcal C) = O(d^2)$, much closer to an answer to the open problem. We also discuss the challenges in fully solving the problem.
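For very small concept classes, both quantities can be computed by brute force, which may help make the definitions concrete. The sketch below is an illustration only, with concepts written as 0/1 tuples of length `n`. It computes the RTD with the greedy (canonical) teaching plan, which repeatedly removes the concepts that are easiest to teach within the remaining class; this is assumed here to attain the RTD, following the usual presentation of the recursive teaching model. The function names are hypothetical.

```python
from itertools import combinations

def vc_dimension(concepts, n):
    """Size of the largest coordinate set shattered by a non-empty class."""
    best = 0
    for size in range(1, n + 1):
        shattered = any(
            len({tuple(c[i] for i in idx) for c in concepts}) == 2 ** size
            for idx in combinations(range(n), size)
        )
        if not shattered:
            break          # shattering is monotone, so larger sets fail too
        best = size
    return best

def teaching_dimension(c, concepts, n):
    """Fewest coordinates on which c is the only consistent concept."""
    for size in range(n + 1):
        for idx in combinations(range(n), size):
            consistent = sum(all(o[i] == c[i] for i in idx) for o in concepts)
            if consistent == 1:
                return size
    raise ValueError("concepts must be distinct")

def recursive_teaching_dimension(concepts, n):
    """Greedy teaching plan: peel off the easiest-to-teach concepts each round."""
    remaining = set(concepts)
    rtd = 0
    while remaining:
        td = {c: teaching_dimension(c, remaining, n) for c in remaining}
        easiest = min(td.values())
        rtd = max(rtd, easiest)
        remaining -= {c for c, v in td.items() if v == easiest}
    return rtd

# Example 1: the empty concept plus the three singletons over n = 3.
singletons = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
# Example 2: all four concepts over n = 2.
full_square = [(0, 0), (0, 1), (1, 0), (1, 1)]

vcd_singletons = vc_dimension(singletons, 3)
rtd_singletons = recursive_teaching_dimension(singletons, 3)
vcd_full = vc_dimension(full_square, 2)
rtd_full = recursive_teaching_dimension(full_square, 2)
```

In the first example the all-zero concept needs three examples when taught directly, but after the singletons are removed it needs none, which is why the recursive measure can be much smaller than the plain teaching dimension.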

artificial intelligence, machine learning, rtd, (16 more...)

1702.05677

Country: North America > United States > California (0.14)

Genre: Research Report (0.40)

Industry: Education (0.36)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.92)

Exploring Efficient Strategies for Minesweeper

Tu, Jinzheng (Tsinghua University) | Li, Tianhong (Tsinghua University) | Chen, Shiteng (Institute of Software, Chinese Academy of Sciences) | Zu, Chong (University of California, Berkeley) | Gu, Zhaoquan (The University of Hong Kong)

AAAI Conferences | Feb-4-2017

Minesweeper is a famous single-player computer game, in which the grid of blocks contains some mines and the player is to uncover (probe) all blocks that do not contain any mines. Many heuristic strategies have been proposed to play the game, but the rate of success is not high. In this paper, we explore efficient strategies for the Minesweeper game. First, we show a counterintuitive result that probing the corner blocks could increase the rate of success. Then, we present a series of heuristic strategies, and the combination of them could lead to better results. We also incorporate the optimal procedure into our proposed methods, and it achieves the highest rate of success. Through extensive simulations, a combination of heuristic strategies, "PSEQ", yields a success rate of 81.627(8)%, 78.122(8)%, and 39.616(5)% for beginner, intermediate, and expert levels respectively, outperforming the state-of-the-art strategies. Moreover, the developed quasi-optimal methods, combining the optimal procedure and our heuristic methods, raise the success rate to at least 81.79(2)%, 78.22(3)%, and 40.06(2)% respectively.
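One quantity that differs between corner and interior probes is the chance that the probed block has no neighboring mines. The sketch below computes it in closed form. It is only an illustration, not the paper's analysis: the 9 x 9 board with 10 mines is an assumed beginner setting (the abstract does not give board sizes), mines are assumed to be placed uniformly at random, and the probability is conditioned on the probed block itself being safe.

```python
from math import comb

def prob_no_adjacent_mine(total_cells, mines, neighbors):
    """P(no mine among the `neighbors` adjacent blocks | probed block is safe).

    With the probed block safe, the mines are spread uniformly over the
    other total_cells - 1 blocks.
    """
    others = total_cells - 1
    return comb(others - neighbors, mines) / comb(others, mines)

ROWS, COLS, MINES = 9, 9, 10   # assumed beginner board, not from the abstract
CELLS = ROWS * COLS

p_corner = prob_no_adjacent_mine(CELLS, MINES, 3)     # a corner has 3 neighbors
p_edge = prob_no_adjacent_mine(CELLS, MINES, 5)       # a non-corner edge has 5
p_interior = prob_no_adjacent_mine(CELLS, MINES, 8)   # an interior block has 8
```

Under these assumptions the corner is the block most likely to show a zero, since it has the fewest neighbors that could hold a mine.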

algorithm, artificial intelligence, heuristic strategy, (14 more...)

Workshops at the Thirty-First AAAI Conference on Artificial Intelligence

Country:

Asia > China (0.29)
Europe (0.28)

Industry: Government > Military > Navy (0.88)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)
