AITopics | Chen, Jiahao

Chen, Jiahao

Think on your feet: Seamless Transition between Human-like Locomotion in Response to Changing Commands

Huang, Huaxing, Cui, Wenhao, Zhang, Tonghe, Li, Shengtao, Han, Jinchao, Qin, Bangyu, Zhang, Tianchu, Zheng, Liang, Tang, Ziyang, Hu, Chenxu, Yan, Ning, Chen, Jiahao, Zhang, Shipu, Jiang, Zheyuan

arXiv.org Artificial Intelligence, Feb-26-2025

While it is relatively easy to train humanoid robots to mimic specific locomotion skills, it is more challenging to learn from various motions and adhere to continuously changing commands. These robots must accurately track motion instructions, seamlessly transition between a variety of movements, and master intermediate motions not present in their reference data. In this work, we propose a novel approach that integrates human-like motion transfer with precise velocity tracking through a series of improvements to classical imitation learning. To enhance generalization, we employ the Wasserstein divergence criterion (WGAN-div). Furthermore, a Hybrid Internal Model provides structured estimates of hidden states and velocity to enhance mobile stability and environment adaptability, while a curiosity bonus fosters exploration. Our comprehensive method promises highly human-like locomotion that adapts to varying velocity requirements, direct generalization to unseen motions and multitasking, as well as zero-shot transfer to the simulator and the real world across different terrains. These advancements are validated through simulations across various robot models and extensive real-world experiments.

artificial intelligence, locomotion, robot, (16 more...)

arXiv.org Artificial Intelligence

2502.18901

Genre: Research Report (0.84)

Technology: Information Technology > Artificial Intelligence > Robots (1.00)

Rethinking the Bias of Foundation Model under Long-tailed Distribution

Chen, Jiahao, Qin, Bin, Li, Jiangmeng, Chen, Hao, Su, Bing

arXiv.org Machine Learning, Jan-27-2025

Long-tailed learning has garnered increasing attention due to its practical significance. Among the various approaches, the fine-tuning paradigm has gained considerable interest with the advent of foundation models. However, most existing methods primarily focus on leveraging knowledge from these models, overlooking the inherent biases introduced by the imbalanced training data they rely on. In this paper, we examine how such imbalances from pre-training affect long-tailed downstream tasks. Specifically, we characterize the imbalance biases that foundation models carry into downstream tasks as parameter imbalance and data imbalance. During fine-tuning, we observe that parameter imbalance plays a more critical role, while data imbalance can be mitigated using existing re-balancing strategies. Moreover, unlike data imbalance, parameter imbalance cannot be effectively addressed during training by current re-balancing techniques such as adjusting the logits. To tackle both imbalances simultaneously, we build our method on causal learning and view the incomplete semantic factor as the confounder, which brings spurious correlations between input samples and labels. To resolve the negative effects of this, we propose a novel backdoor adjustment method that learns the true causal effect between input samples and labels, rather than merely fitting the correlations in the data. Notably, we achieve an average performance increase of about 1.67% on each dataset.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Machine Learning

2501.15955

Country:

Europe (0.14)
Asia > China (0.14)

Genre:

Research Report > New Finding (0.46)
Research Report > Promising Solution (0.46)

Industry: Transportation (0.47)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Interpretable Enzyme Function Prediction via Residue-Level Detection

Yang, Zhao, Su, Bing, Chen, Jiahao, Wen, Ji-Rong

arXiv.org Artificial Intelligence, Jan-9-2025

Predicting multiple functions labeled with Enzyme Commission (EC) numbers from the enzyme sequence is of great significance but remains a challenge due to its sparse multi-label classification nature, i.e., each enzyme is typically associated with only a few labels out of more than 6000 possible EC numbers. However, existing machine learning algorithms generally learn a fixed global representation for each enzyme to classify all functions, so they lack interpretability, and the fine-grained information of some function-specific local residue fragments may be overwhelmed. Here we present an attention-based framework, namely ProtDETR (Protein Detection Transformer), by casting enzyme function prediction as a detection problem. It uses a set of learnable functional queries to adaptively extract different local representations from the sequence of residue-level features for predicting different EC numbers. ProtDETR not only significantly outperforms existing deep learning-based enzyme function prediction methods, but also provides a new interpretable perspective on automatically detecting different local regions for identifying different functions through cross-attentions between queries and residue-level features.

The development of genome sequencing technologies has unveiled a vast collection of protein sequences, but detailed functional annotations are only available for a very small number of them [2]. Evaluating the functions of protein sequences via wet experiments is time-consuming, labor-intensive, and expensive, underscoring the critical need for computational methods to predict protein functions. This need is particularly acute in the study of enzymes, which catalyze various biological reactions and are central to understanding metabolic processes. In the most widely used classification scheme, each class of enzyme function is assigned an EC number, a four-level hierarchy reflecting the intricate organization of enzyme functions.
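The query-based extraction described in the abstract can be illustrated with a minimal cross-attention sketch. This is not ProtDETR's implementation: the function names, the toy 2-dimensional features, and the omission of learned projection matrices are all simplifying assumptions made here for illustration. Each query scores every residue, and the softmax weights indicate which residues that query attends to.

```python
import math

def softmax(scores):
    # Subtract the maximum for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, residues):
    """Scaled dot-product cross-attention without learned projections (assumption).

    queries:  list of query vectors (stand-ins for learnable functional queries)
    residues: list of residue-level feature vectors
    Returns (pooled, weights): one pooled vector and one weight list per query.
    """
    d = len(residues[0])
    pooled, weights = [], []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, r)) / math.sqrt(d) for r in residues]
        w = softmax(scores)
        # Weighted sum of residue features: the query-specific local representation.
        pooled.append([sum(w[i] * residues[i][j] for i in range(len(residues)))
                       for j in range(d)])
        weights.append(w)
    return pooled, weights

# Hypothetical toy data: four residues, two queries.
residues = [[0.1, 0.0], [3.0, 0.0], [0.0, 0.2], [0.0, 2.0]]
queries = [[1.0, 0.0], [0.0, 1.0]]
pooled, weights = cross_attention(queries, residues)
# Index of the residue each query attends to most strongly.
peaks = [w.index(max(w)) for w in weights]
```

Inspecting `weights` per query is the kind of readout that makes such a model interpretable: each query highlights a different region of the sequence.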

artificial intelligence, deep learning, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2501.05644

Country:

North America > United States (0.14)
Asia (0.14)

Genre: Research Report (0.82)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

AEIOU: A Unified Defense Framework against NSFW Prompts in Text-to-Image Models

Wang, Yiming, Chen, Jiahao, Li, Qingming, Yang, Xing, Ji, Shouling

arXiv.org Artificial Intelligence, Dec-23-2024

As text-to-image (T2I) models continue to advance and gain widespread adoption, their associated safety issues are becoming increasingly prominent. Malicious users often exploit these models to generate Not-Safe-for-Work (NSFW) images using harmful or adversarial prompts, highlighting the critical need for robust safeguards to ensure the integrity and compliance of model outputs. Current internal safeguards frequently degrade image quality, while external detection methods often suffer from low accuracy and inefficiency. In this paper, we introduce AEIOU, a defense framework that is Adaptable, Efficient, Interpretable, Optimizable, and Unified against NSFW prompts in T2I models. AEIOU extracts NSFW features from the hidden states of the model's text encoder, utilizing the separable nature of these features to detect NSFW prompts. The detection process is efficient, requiring minimal inference time. AEIOU also offers real-time interpretation of results and supports optimization through data augmentation techniques. The framework is versatile, accommodating various T2I architectures. Our extensive experiments show that AEIOU significantly outperforms both commercial and open-source moderation tools, achieving over 95% accuracy across all datasets and improving efficiency by at least tenfold. It effectively counters adaptive attacks and excels in few-shot and multi-label scenarios.

artificial intelligence, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2412.18123

Country: Europe > Switzerland (0.28)

Genre: Research Report > New Finding (0.67)

Industry:

Information Technology > Security & Privacy (0.68)
Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Analysis of frequent trading effects of various machine learning models

Chen, Jiahao, Li, Xiaofei

arXiv.org Artificial Intelligence, Sep-14-2023

In recent years, high-frequency trading has emerged as a crucial strategy in stock trading. This study aims to develop an advanced high-frequency trading algorithm and compare the performance of three different mathematical models: the combination of the cross-entropy loss function and the quasi-Newton algorithm, the FCNN model, and the vector machine. The proposed algorithm employs neural network predictions to generate trading signals and execute buy and sell operations based on specific conditions. By harnessing the power of neural networks, the algorithm enhances the accuracy and reliability of the trading strategy. To assess the effectiveness of the algorithm, the study evaluates the performance of the three mathematical models. The combination of the cross-entropy loss function and the quasi-Newton algorithm is a widely utilized logistic regression approach. The FCNN model, on the other hand, is a deep learning algorithm that can extract and classify features from stock data. Meanwhile, the vector machine is a supervised learning algorithm recognized for achieving improved classification results by mapping data into high-dimensional spaces. By comparing the performance of these three models, the study aims to determine the most effective approach for high-frequency trading. This research makes a valuable contribution by introducing a novel methodology for high-frequency trading, thereby providing investors with a more accurate and reliable stock trading strategy.
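The first of the three models, cross-entropy loss minimized with a quasi-Newton method, is logistic regression. The sketch below is a minimal pure-Python illustration of that combination using a BFGS update with a backtracking line search. It is not the authors' code: the function names, the toy data, and the choice of plain BFGS (rather than any particular variant the paper may use) are assumptions.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def loss_and_grad(w, X, y):
    """Mean cross-entropy loss of a logistic model and its gradient."""
    n, loss, grad = len(X), 0.0, [0.0] * len(w)
    for xi, yi in zip(X, y):
        p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)))
        pc = min(max(p, 1e-12), 1.0 - 1e-12)  # guard the logarithms
        loss -= yi * math.log(pc) + (1 - yi) * math.log(1.0 - pc)
        for j, xj in enumerate(xi):
            grad[j] += (p - yi) * xj
    return loss / n, [g / n for g in grad]

def fit_logistic_bfgs(X, y, iters=100, tol=1e-8):
    d = len(X[0])
    w = [0.0] * d
    # H approximates the inverse Hessian; start from the identity.
    H = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]
    f, g = loss_and_grad(w, X, y)
    for _ in range(iters):
        if math.sqrt(sum(gi * gi for gi in g)) < tol:
            break
        p = [-sum(H[i][j] * g[j] for j in range(d)) for i in range(d)]
        slope = sum(pi * gi for pi, gi in zip(p, g))
        t = 1.0
        while True:  # backtracking (Armijo) line search
            w_new = [wi + t * pi for wi, pi in zip(w, p)]
            f_new, g_new = loss_and_grad(w_new, X, y)
            if f_new <= f + 1e-4 * t * slope or t < 1e-10:
                break
            t *= 0.5
        s = [a - b for a, b in zip(w_new, w)]
        yv = [a - b for a, b in zip(g_new, g)]
        sy = sum(a * b for a, b in zip(s, yv))
        if sy > 1e-12:  # update only when the curvature condition holds
            rho = 1.0 / sy
            Hy = [sum(H[i][j] * yv[j] for j in range(d)) for i in range(d)]
            yHy = sum(yv[i] * Hy[i] for i in range(d))
            for i in range(d):
                for j in range(d):
                    H[i][j] += ((1.0 + rho * yHy) * rho * s[i] * s[j]
                                - rho * (Hy[i] * s[j] + s[i] * Hy[j]))
        w, f, g = w_new, f_new, g_new
    return w

# Hypothetical toy data: a bias column plus one feature, with overlapping classes.
xs = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]
X = [[1.0, x] for x in xs]
y = [0, 0, 0, 1, 0, 1, 0, 1, 1]
w = fit_logistic_bfgs(X, y)
final_loss, final_grad = loss_and_grad(w, X, y)
```

The classes overlap on purpose: with perfectly separable data the maximum-likelihood weights diverge, so an unregularized fit would not settle.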

artificial intelligence, deep learning, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2311.10719

Country: Asia > China (0.46)

Genre: Research Report > New Finding (1.00)

Industry: Banking & Finance > Trading (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Universal Defensive Underpainting Patch: Making Your Text Invisible to Optical Character Recognition

Deng, JiaCheng, Dong, Li, Chen, Jiahao, Yan, Diqun, Wang, Rangding, Ye, Dengpan, Zhao, Lingchen, Tian, Jinyu

arXiv.org Artificial Intelligence, Aug-4-2023

Optical Character Recognition (OCR) enables automatic text extraction from scanned or digitized text images, but it also makes it easy to pirate valuable or sensitive text from these images. Previous methods to prevent OCR piracy by distorting characters in text images are impractical in real-world scenarios, as pirates can capture arbitrary portions of the text images, rendering the defenses ineffective. In this work, we propose a novel and effective defense mechanism termed the Universal Defensive Underpainting Patch (UDUP) that modifies the underpainting of text images instead of the characters. UDUP is created through an iterative optimization process to craft a small, fixed-size defensive patch that can generate non-overlapping underpainting for text images of any size. Experimental results show that UDUP effectively defends against unauthorized OCR under the setting of any screenshot range or complex image background. It is agnostic to the content, size, colors, and languages of characters, and is robust to typical image operations such as scaling and compressing. In addition, the transferability of UDUP is demonstrated by evading several off-the-shelf OCRs. The code is available at https://github.com/QRICKDD/UDUP.
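One plausible reading of "a small, fixed-size defensive patch that can generate non-overlapping underpainting for text images of any size" is simple tiling, sketched below. This is an assumption made for illustration, not the UDUP code from the linked repository; the function name and the integer "pixel" values are hypothetical, and the optimization that produces the patch is not shown.

```python
def tile_underpainting(patch, height, width):
    """Fill a height x width background by repeating a fixed-size patch.

    Copies of the patch sit side by side without overlapping, so any cropped
    region of the background still contains the repeated pattern.
    """
    ph, pw = len(patch), len(patch[0])
    return [[patch[r % ph][c % pw] for c in range(width)] for r in range(height)]

# Hypothetical 2x2 patch of pixel values tiled over a 3x5 background.
patch = [[1, 2], [3, 4]]
background = tile_underpainting(patch, 3, 5)
```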

machine learning, natural language, text image, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3581783.3613768

2308.02369

Country:

Asia > China > Zhejiang Province (0.15)
Asia > China > Hubei Province (0.14)

Genre: Research Report > New Finding (0.66)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision > Optical Character Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.89)

Can Querying for Bias Leak Protected Attributes? Achieving Privacy With Smooth Sensitivity

Hamman, Faisal, Chen, Jiahao, Dutta, Sanghamitra

arXiv.org Artificial Intelligence, Jun-5-2023

Existing regulations prohibit model developers from accessing protected attributes (gender, race, etc.), often resulting in fairness assessments on populations without knowing their protected groups. In such scenarios, institutions often adopt a separation between the model developers (who train models with no access to the protected attributes) and a compliance team (who may have access to the entire dataset for auditing purposes). However, the model developers might be allowed to test their models for bias by querying the compliance team for group fairness metrics. In this paper, we first demonstrate that simply querying for fairness metrics, such as statistical parity and equalized odds, can leak the protected attributes of individuals to the model developers. We demonstrate that there always exist strategies by which the model developers can identify the protected attribute of a targeted individual in the test dataset from just a single query. In particular, we show that one can reconstruct the protected attributes of all the individuals from O(Nk \log( n /Nk)) queries when Nk<
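The single-query leakage claim can be made concrete with a toy sketch. The strategy below (predict positive for the target only, then read the sign of the statistical parity gap) is one simple illustration chosen here and is not necessarily the construction used in the paper. The function names, the binary attribute, and the assumption that the compliance team returns an exact signed gap are all hypothetical.

```python
def statistical_parity_gap(predictions, protected):
    """Signed gap P(yhat=1 | A=1) - P(yhat=1 | A=0) for a binary attribute A.

    Assumes both groups are non-empty.
    """
    positives = {0: 0, 1: 0}
    counts = {0: 0, 1: 0}
    for yhat, a in zip(predictions, protected):
        counts[a] += 1
        positives[a] += yhat
    return positives[1] / counts[1] - positives[0] / counts[0]

def infer_attribute(target_index, n, query):
    """Guess one individual's protected attribute from a single metric query."""
    # Predict positive for the target only; every other prediction is negative.
    predictions = [0] * n
    predictions[target_index] = 1
    # The only positive prediction sits in the target's group, so the sign of
    # the returned gap reveals which group that is.
    return 1 if query(predictions) > 0 else 0

# Hypothetical test set: the compliance team holds `protected`, while the
# developer only sees the value returned by `query`.
protected = [0, 1, 1, 0, 1]
query = lambda preds: statistical_parity_gap(preds, protected)
recovered = [infer_attribute(i, len(protected), query) for i in range(len(protected))]
```

This naive loop spends one query per individual, which is why a result that needs far fewer queries to reconstruct every attribute is notable.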

artificial intelligence, data mining, machine learning, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3593013.3594086

2211.02139

Country: North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report > New Finding (0.88)

Industry:

Law > Civil Rights & Constitutional Law (1.00)
Information Technology > Security & Privacy (1.00)
Government > Regional Government > North America Government > United States Government (0.67)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)

Atomic and Subgraph-aware Bilateral Aggregation for Molecular Representation Learning

Chen, Jiahao, Liu, Yurou, Li, Jiangmeng, Su, Bing, Wen, Jirong

arXiv.org Artificial Intelligence, May-21-2023

Molecular representation learning is a crucial task in predicting molecular properties. Molecules are often modeled as graphs where atoms and chemical bonds are represented as nodes and edges, respectively, and Graph Neural Networks (GNNs) have been commonly utilized to predict atom-related properties, such as reactivity and solubility. However, functional groups (subgraphs) are closely related to some chemical properties of molecules, such as efficacy and metabolic properties, which cannot be solely determined by individual atoms. In this paper, we introduce a new model for molecular representation learning called the Atomic and Subgraph-aware Bilateral Aggregation (ASBA), which addresses the limitations of previous atom-wise and subgraph-wise models by incorporating both types of information. ASBA consists of two branches, one for atom-wise information and the other for subgraph-wise information. Because existing atom-wise GNNs cannot properly extract invariant subgraph features, we propose a decomposition-polymerization GNN architecture for the subgraph-wise branch. Furthermore, we propose cooperative node-level and graph-level self-supervised learning strategies for ASBA to improve its generalization. Our method offers a more comprehensive way to learn representations for molecular property prediction and has broad potential in drug and material discovery applications. Extensive experiments have demonstrated the effectiveness of our method.

artificial intelligence, machine learning, subgraph, (14 more...)

arXiv.org Artificial Intelligence

2305.12618

Genre: Research Report > New Finding (0.48)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.46)
Energy > Oil & Gas (0.35)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)

SC-Ques: A Sentence Completion Question Dataset for English as a Second Language Learners

Liu, Qiongqiong, Huang, Yaying, Liu, Zitao, Huang, Shuyan, Chen, Jiahao, Zhao, Xiangyu, Lin, Guimin, Zhou, Yuyu, Luo, Weiqi

arXiv.org Artificial Intelligence, Apr-7-2023

Sentence completion (SC) questions present a sentence with one or more blanks that need to be filled in, with three to five possible words or phrases as options. SC questions are widely used for students learning English as a Second Language (ESL). In this paper, we present a large-scale SC dataset, SC-Ques, which is made up of 289,148 ESL SC questions from real-world standardized English examinations. Furthermore, we build a comprehensive benchmark for automatically solving the SC questions by training large-scale pre-trained language models on the proposed SC-Ques dataset. We conduct a detailed analysis of the baseline models' performance, limitations, and trade-offs. The data and our code are available for research purposes from: https://github.com/ai4ed/SC-Ques.

machine learning, natural language, sc question, (17 more...)

arXiv.org Artificial Intelligence

2206.12036

Country: Asia > China (0.70)

Genre: Research Report (0.64)

Industry:

Education > Curriculum > Subject-Specific Education (0.93)
Education > Educational Setting (0.68)
Education > Focused Education > Reading & Literacy > English As A Second Language (0.61)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Improving Interpretability of Deep Sequential Knowledge Tracing Models with Question-centric Cognitive Representations

Chen, Jiahao, Liu, Zitao, Huang, Shuyan, Liu, Qiongqiong, Luo, Weiqi

arXiv.org Artificial Intelligence, Mar-16-2023

Knowledge tracing (KT) is a crucial technique to predict students' future performance by observing their historical learning processes. Due to the powerful representation ability of deep neural networks, remarkable progress has been made by using deep learning techniques to solve the KT problem. The majority of existing approaches rely on the "homogeneous question" assumption that questions have equivalent contributions if they share the same set of knowledge components. Unfortunately, this assumption is inaccurate in real-world educational scenarios. Furthermore, it is very challenging to interpret the prediction results from the existing deep learning based KT models. Therefore, in this paper, we present QIKT, a question-centric interpretable KT model to address the above challenges. The proposed QIKT approach explicitly models students' knowledge state variations at a fine-grained level with question-sensitive cognitive representations that are jointly learned from a question-centric knowledge acquisition module and a question-centric problem solving module. Meanwhile, QIKT utilizes an item response theory based prediction layer to generate interpretable prediction results. The proposed QIKT model is evaluated on three public real-world educational datasets. The results demonstrate that our approach is superior on the KT prediction task, and it outperforms a wide range of deep learning based KT models in terms of prediction accuracy with better model interpretability. To encourage reproducible results, we have provided all the datasets and code at https://pykt.org/.
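For readers unfamiliar with item response theory, the sketch below shows its simplest form, the one-parameter (Rasch) model, in which the probability of a correct answer depends on the difference between student ability and question difficulty. This is a textbook illustration, not QIKT's prediction layer, which the abstract does not specify; the function and parameter names are hypothetical.

```python
import math

def irt_probability(ability, difficulty):
    """One-parameter (Rasch) IRT: P(correct) = sigmoid(ability - difficulty)."""
    z = ability - difficulty
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

# Hypothetical values: the same student facing an easy and a hard question.
p_easy = irt_probability(ability=0.5, difficulty=-1.0)
p_hard = irt_probability(ability=0.5, difficulty=2.0)
p_matched = irt_probability(ability=1.0, difficulty=1.0)
```

A layer of this shape is interpretable because each prediction decomposes into two named quantities that can be inspected separately.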

artificial intelligence, machine learning, question-centric cognitive representation, (3 more...)

arXiv.org Artificial Intelligence

2302.06885

Genre: Research Report (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
