Zhang, Kaichen
Large Multi-modal Models Can Interpret Features in Large Multi-modal Models
Zhang, Kaichen, Shen, Yifei, Li, Bo, Liu, Ziwei
Recent advances in Large Multimodal Models (LMMs) have led to significant breakthroughs in both academia and industry. One question that arises is how we, as humans, can understand their internal neural representations. This paper takes an initial step towards addressing this question by presenting a versatile framework to identify and interpret the semantics within LMMs. Specifically, 1) we first apply a Sparse Autoencoder (SAE) to disentangle the representations into human-understandable features. 2) We then present an automatic interpretation framework in which the open-semantic features learned by the SAE are interpreted by LMMs themselves. We employ this framework to analyze the LLaVA-NeXT-8B model using the LLaVA-OV-72B model, demonstrating that these features can effectively steer the model's behavior. Our results contribute to a deeper understanding of why LMMs excel in specific tasks, including EQ tests, and illuminate the nature of their mistakes along with potential strategies for their rectification. These findings offer new insights into the internal mechanisms of LMMs and suggest parallels with the cognitive processes of the human brain.
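The disentangling step described above can be illustrated with a minimal sparse-autoencoder sketch; the layer widths, sparsity coefficient, and names below are illustrative assumptions, not the paper's released code:

import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    """Maps hidden activations to a wider, sparsely activated feature dictionary."""
    def __init__(self, d_model: int = 4096, d_features: int = 65536, l1_coef: float = 5e-4):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_features)
        self.decoder = nn.Linear(d_features, d_model)
        self.l1_coef = l1_coef

    def forward(self, acts: torch.Tensor):
        # ReLU keeps only a small set of active features per token.
        features = torch.relu(self.encoder(acts))
        recon = self.decoder(features)
        # Reconstruction loss plus an L1 penalty that encourages sparsity.
        loss = (recon - acts).pow(2).mean() + self.l1_coef * features.abs().sum(dim=-1).mean()
        return features, recon, loss

The L1 penalty drives most feature activations to zero, which is what makes individual features amenable to human (or LMM-based) interpretation.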
MixEval-X: Any-to-Any Evaluations from Real-World Data Mixtures
Ni, Jinjie, Song, Yifan, Ghosal, Deepanway, Li, Bo, Zhang, David Junhao, Yue, Xiang, Xue, Fuzhao, Zheng, Zian, Zhang, Kaichen, Shah, Mahir, Jain, Kabir, You, Yang, Shieh, Michael
Perceiving and generating diverse modalities are crucial for AI models to effectively learn from and engage with real-world signals, necessitating reliable evaluations for their development. We identify two major issues in current evaluations: (1) inconsistent standards, shaped by different communities with varying protocols and maturity levels; and (2) significant query, grading, and generalization biases. To address these, we introduce MixEval-X, the first any-to-any, real-world benchmark designed to optimize and standardize evaluations across diverse input and output modalities. We propose multi-modal benchmark mixture and adaptation-rectification pipelines to reconstruct real-world task distributions, ensuring evaluations generalize effectively to real-world use cases. Extensive meta-evaluations show our approach effectively aligns benchmark samples with real-world task distributions. Meanwhile, MixEval-X's model rankings correlate strongly with those of crowd-sourced real-world evaluations (up to 0.98) while being much more efficient. We provide comprehensive leaderboards to re-rank existing models and organizations and offer insights to enhance understanding of multi-modal evaluations and inform future research.
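The kind of ranking agreement reported above can be checked with a rank correlation; the scores below are made up for illustration, and the exact correlation metric used by MixEval-X is not specified here:

from scipy.stats import spearmanr

# Hypothetical per-model scores; real MixEval-X and crowd-sourced numbers are not reproduced here.
benchmark_scores = [82.1, 79.4, 75.0, 68.3]
crowd_sourced_scores = [1250, 1210, 1180, 1100]

rho, _ = spearmanr(benchmark_scores, crowd_sourced_scores)
print(f"rank correlation: {rho:.2f}")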
LMMs-Eval: Reality Check on the Evaluation of Large Multimodal Models
Zhang, Kaichen, Li, Bo, Zhang, Peiyuan, Pu, Fanyi, Cahyono, Joshua Adrian, Hu, Kairui, Liu, Shuai, Zhang, Yuanhan, Yang, Jingkang, Li, Chunyuan, Liu, Ziwei
The advances of large foundation models necessitate wide-coverage, low-cost, and zero-contamination benchmarks. Despite continuous exploration of language model evaluations, comprehensive studies on the evaluation of Large Multi-modal Models (LMMs) remain limited. In this work, we introduce LMMS-EVAL, a unified and standardized multimodal benchmark framework with over 50 tasks and more than 10 models to promote transparent and reproducible evaluations. Although LMMS-EVAL offers comprehensive coverage, we find it still falls short in achieving low cost and zero contamination. To approach this evaluation trilemma, we further introduce LMMS-EVAL LITE, a pruned evaluation toolkit that emphasizes both coverage and efficiency. Additionally, we present Multimodal LIVEBENCH, which utilizes continuously updating news and online forums to assess models' generalization abilities in the wild, featuring a low-cost and zero-contamination evaluation approach. In summary, our work highlights the importance of considering the evaluation trilemma and provides practical solutions to navigate the trade-offs in evaluating large multi-modal models, paving the way for more effective and reliable benchmarking of LMMs. We open-source our codebase and maintain the LIVEBENCH leaderboard at https://github.com/EvolvingLMMs-Lab/lmms-eval and https://huggingface.co/spaces/lmms-lab/LiveBench.
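A hypothetical sketch of the unified evaluation loop that a harness like LMMS-EVAL standardizes; the class and function names below are illustrative assumptions, not the toolkit's actual API (see the linked repository for that):

from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class Task:
    name: str
    samples: List[dict]                  # e.g. {"image": ..., "question": ..., "answer": ...}
    metric: Callable[[str, str], float]  # scores a prediction against a reference answer

def evaluate(generate: Callable[[dict], str], tasks: List[Task]) -> Dict[str, float]:
    """Run every task through one interface so scores stay comparable across models."""
    results: Dict[str, float] = {}
    for task in tasks:
        scores = [task.metric(generate(sample), sample["answer"]) for sample in task.samples]
        results[task.name] = sum(scores) / max(len(scores), 1)
    return results

Keeping model inference behind a single generate interface and metrics behind a single per-task scorer is what makes results transparent and reproducible across models.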
Robust Reward Placement under Uncertainty
Petsinis, Petros, Zhang, Kaichen, Pavlogiannis, Andreas, Zhou, Jingbo, Karras, Panagiotis
We consider the problem of placing generators of rewards to be collected by randomly moving agents in a network. In many settings, the precise mobility pattern may be one of several possible, based on parameters outside our control, such as weather conditions. The placement should be robust to this uncertainty, so as to secure a competitive total reward across the possible networks. To study such scenarios, we introduce the Robust Reward Placement problem (RRP). Agents move randomly according to a Markovian Mobility Model over a predetermined set of locations whose connectivity is chosen adversarially from a known set $\Pi$ of candidates. We aim to select a set of reward states within a budget that maximizes the minimum ratio, among all candidates in $\Pi$, of the collected total reward over the optimal collectable reward under the same candidate. We prove that RRP is NP-hard and inapproximable, and develop $\Psi$-Saturate, a pseudo-polynomial time algorithm that achieves an $\epsilon$-additive approximation by exceeding the budget constraint by a factor that scales as $O(\ln |\Pi|/\epsilon)$. In addition, we present several heuristics, most prominently one inspired by a dynamic programming algorithm for the max-min 0-1 KNAPSACK problem. We corroborate our theoretical analysis with an experimental evaluation on synthetic and real data.
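The objective stated above can be written compactly as a max-min ratio; here $R_{\pi}(S)$ denotes the total reward collected under candidate $\pi$ with reward set $S$ and $b$ the budget, both illustrative symbols rather than the paper's exact notation:

$$
S^{\star} \;=\; \arg\max_{S:\,|S| \le b} \;\; \min_{\pi \in \Pi} \;\; \frac{R_{\pi}(S)}{\max_{S':\,|S'| \le b} R_{\pi}(S')}
$$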
The Impact of Generative Artificial Intelligence
Zhang, Kaichen, Kwon, Ohchan, Xiong, Hui
The rise of generative artificial intelligence (AI) has sparked concerns about its potential influence on unemployment and market depression. This study addresses this concern by examining the impact of generative AI on product markets. To overcome the challenge of causal inference, given the inherent limitations of conducting controlled experiments, this paper identifies an unanticipated and sudden leak of a highly proficient image-generative AI as a novel instance of a "natural experiment". This AI leak spread rapidly, significantly reducing the cost of generating anime-style images compared to other styles, creating an opportunity for comparative assessment. We collect real-world data from an artwork outsourcing platform. Surprisingly, our results show that while generative AI lowers average prices, it substantially boosts order volume and overall revenue. This counterintuitive finding suggests that generative AI confers benefits upon artists rather than detriments. The study further offers theoretical economic explanations to elucidate this unexpected phenomenon. By furnishing empirical evidence, this paper dispels the notion that generative AI might engender market depression, instead underscoring its potential to foster market prosperity. These findings carry significant implications for practitioners, policymakers, and the broader AI community.