AITopics | Chen, Yuqiang

Collaborating Authors

Chen, Yuqiang

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

LogQuant: Log-Distributed 2-Bit Quantization of KV Cache with Superior Accuracy Preservation

Chen, Han, Jiang, Zicong, Zhang, Zining, He, Bingsheng, Luo, Pingyi, Lu, Mian, Chen, Yuqiang

arXiv.org Artificial IntelligenceMar-25-2025

We introduce LogQuant, a groundbreaking 2-bit quantization technique for KV Cache in large language model (LLM) inference, delivering substantial memory savings while preserving superior performance. Previous methods either assume that later tokens are more important or attempt to predict important tokens based on earlier attention patterns. Both approaches, however, can result in performance bottlenecks or frequent mispredictions. LogQuant takes a different approach. By applying a log-based filtering mechanism, it selectively compresses the KV Cache across the entire context, achieving better performance with the same or even reduced memory footprint compared to existing methods. In benchmark tests, it enhances throughput by 25% and boosts batch size by 60% without increasing memory consumption. For challenging tasks such as Math and Code Completion, LogQuant improves accuracy by 40% to 200% at the same compression ratio, outperforming comparable techniques.LogQuant integrates effortlessly with popular inference frameworks like Python's transformers library. Implementation can be available in https://github.com/Concyclics/LogQuantKV.

arxiv preprint arxiv, large language model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2503.1995

Country: Asia (0.28)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.97)

Add feedback

OpenMLDB: A Real-Time Relational Data Feature Computation System for Online ML

Zhou, Xuanhe, Zhou, Wei, Qi, Liguo, Zhang, Hao, Chen, Dihao, He, Bingsheng, Lu, Mian, Li, Guoliang, Wu, Fan, Chen, Yuqiang

arXiv.org Artificial IntelligenceJan-15-2025

Efficient and consistent feature computation is crucial for a wide range of online ML applications. Typically, feature computation is divided into two distinct phases, i.e., offline stage for model training and online stage for model serving. These phases often rely on execution engines with different interface languages and function implementations, causing significant inconsistencies. Moreover, many online ML features involve complex time-series computations (e.g., functions over varied-length table windows) that differ from standard streaming and analytical queries. Existing data processing systems (e.g., Spark, Flink, DuckDB) often incur multi-second latencies for these computations, making them unsuitable for real-time online ML applications that demand timely feature updates. This paper presents OpenMLDB, a feature computation system deployed in 4Paradigm's SageOne platform and over 100 real scenarios. Technically, OpenMLDB first employs a unified query plan generator for consistent computation results across the offline and online stages, significantly reducing feature deployment overhead. Second, OpenMLDB provides an online execution engine that resolves performance bottlenecks caused by long window computations (via pre-aggregation) and multi-table window unions (via data self-adjusting). It also provides a high-performance offline execution engine with window parallel optimization and time-aware data skew resolving. Third, OpenMLDB features a compact data format and stream-focused indexing to maximize memory usage and accelerate data access. Evaluations in testing and real workloads reveal significant performance improvements and resource savings compared to the baseline systems. The open community of OpenMLDB now has over 150 contributors and gained 1.6k stars on GitHub.

artificial intelligence, data mining, natural language, (20 more...)

arXiv.org Artificial Intelligence

2501.08591

Country:

North America > United States (0.68)
Europe (0.46)

Genre: Research Report (0.82)

Industry:

Banking & Finance (0.67)
Information Technology (0.67)

Technology:

Information Technology > Data Science > Data Mining > Big Data (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (0.67)

Add feedback

Self-supervised Graph Neural Network for Mechanical CAD Retrieval

Quan, Yuhan, Zhao, Huan, Yi, Jinfeng, Chen, Yuqiang

arXiv.org Artificial IntelligenceJun-17-2024

CAD (Computer-Aided Design) plays a crucial role in mechanical industry, where large numbers of similar-shaped CAD parts are often created. Efficiently reusing these parts is key to reducing design and production costs for enterprises. Retrieval systems are vital for achieving CAD reuse, but the complex shapes of CAD models are difficult to accurately describe using text or keywords, making traditional retrieval methods ineffective. While existing representation learning approaches have been developed for CAD, manually labeling similar samples in these methods is expensive. Additionally, CAD models' unique parameterized data structure presents challenges for applying existing 3D shape representation learning techniques directly. In this work, we propose GC-CAD, a self-supervised contrastive graph neural network-based method for mechanical CAD retrieval that directly models parameterized CAD raw files. GC-CAD consists of two key modules: structure-aware representation learning and contrastive graph learning framework. The method leverages graph neural networks to extract both geometric and topological information from CAD models, generating feature representations. We then introduce a simple yet effective contrastive graph learning framework approach, enabling the model to train without manual labels and generate retrieval-ready representations. Experimental results on four datasets including human evaluation demonstrate that the proposed method achieves significant accuracy improvements and up to 100 times efficiency improvement over the baseline methods.

artificial intelligence, machine learning, representation, (15 more...)

arXiv.org Artificial Intelligence

2406.08863

Country:

Asia > China (0.28)
Europe (0.28)

Genre: Research Report (1.00)

Industry: Information Technology (0.49)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Add feedback

IIP-Mixer:Intra-Inter Patch Mixing Architecture for Battery Remaining Useful Life Prediction

Ye, Guangzai, Feng, Li, Guo, Jianlan, Chen, Yuqiang

arXiv.org Artificial IntelligenceMar-27-2024

Accurately estimating the Remaining Useful Life (RUL) of lithium-ion batteries is crucial for maintaining the safe and stable operation of rechargeable battery management systems. However, this task is often challenging due to the complex temporal dynamics involved. Recently, attention-based networks, such as Transformers and Informer, have been the popular architecture in time series forecasting. Despite their effectiveness, these models with abundant parameters necessitate substantial training time to unravel temporal patterns. To tackle these challenges, we propose a simple MLP-Mixer-based architecture named 'Intra-Inter Patch Mixer' (IIP-Mixer), which is an architecture based exclusively on multi-layer perceptrons (MLPs), extracting information by mixing operations along both intra-patch and inter-patch dimensions for battery RUL prediction. The proposed IIP-Mixer comprises parallel dual-head mixer layers: the intra-patch mixing MLP, capturing local temporal patterns in the short-term period, and the inter-patch mixing MLP, capturing global temporal patterns in the long-term period. Notably, to address the varying importance of features in RUL prediction, we introduce a weighted loss function in the MLP-Mixer-based architecture, marking the first time such an approach has been employed. Our experiments demonstrate that IIP-Mixer achieves competitive performance in battery RUL prediction, outperforming other popular time-series frameworks

artificial intelligence, deep learning, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2403.18379

Country:

North America > United States (0.14)
Asia (0.14)

Genre: Research Report > New Finding (0.46)

Industry:

Energy > Energy Storage (1.00)
Electrical Industrial Apparatus (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.70)

Add feedback

Network On Network for Tabular Data Classification in Real-world Applications

Luo, Yuanfei, Zhou, Hao, Tu, Weiwei, Chen, Yuqiang, Dai, Wenyuan, Yang, Qiang

arXiv.org Machine LearningMay-29-2020

Tabular data is the most common data format adopted by our customers ranging from retail, finance to E-commerce, and tabular data classification plays an essential role to their businesses. In this paper, we present Network On Network (NON), a practical tabular data classification model based on deep neural network to provide accurate predictions. Various deep methods have been proposed and promising progress has been made. However, most of them use operations like neural network and factorization machines to fuse the embeddings of different features directly, and linearly combine the outputs of those operations to get the final prediction. As a result, the intra-field information and the non-linear interactions between those operations (e.g. neural network and factorization machines) are ignored. Intra-field information is the information that features inside each field belong to the same field. NON is proposed to take full advantage of intra-field information and non-linear interactions. It consists of three components: field-wise network at the bottom to capture the intra-field information, across field network in the middle to choose suitable operations data-drivenly, and operation fusion network on the top to fuse outputs of the chosen operations deeply. Extensive experiments on six real-world datasets demonstrate NON can outperform the state-of-the-art models significantly. Furthermore, both qualitative and quantitative study of the features in the embedding space show NON can capture intra-field information effectively.

deep learning, neural network, opération, (20 more...)

arXiv.org Machine Learning

doi: 10.1145/3397271.3401437

2005.10114

Country:

North America > United States > Michigan > Isabella County (0.27)
North America > United States > Alaska > Anchorage Municipality > Anchorage (0.14)

Genre: Research Report (1.00)

Industry: Information Technology > Services (0.66)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

AutoML @ NeurIPS 2018 challenge: Design and Results

Escalante, Hugo Jair, Tu, Wei-Wei, Guyon, Isabelle, Silver, Daniel L., Viegas, Evelyne, Chen, Yuqiang, Dai, Wenyuan, Yang, Qiang

arXiv.org Machine LearningMar-13-2019

Machine learning has achieved great successes in online advertising, recommender systems, financial market analysis, computer vision, computational linguistics, bioinformatics and many other fields. However, its success crucially relies on human machine learning experts, as human experts are involved to some extent, in all systems design stages. In fact, it is still common for humans to take critical decisions in aspects like: converting a real world problem into a machine learning one, data gathering, formatting and preprocessing, feature engineering, selecting or designing model architectures, hyper-parameter tuning, assessment of model performance, deploying online ML systems, among others.

artificial intelligence, machine learning, participant, (17 more...)

arXiv.org Machine Learning

1903.05263

Country:

Europe (1.00)
North America > United States > California > San Francisco County > San Francisco (0.14)

Genre:

Overview (0.46)
Research Report (0.40)

Industry:

Marketing (0.54)
Education (0.47)
Information Technology (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.34)

Add feedback

Privacy-preserving Transfer Learning for Knowledge Sharing

Guo, Xiawei, Yao, Quanming, Tu, WeiWei, Chen, Yuqiang, Dai, Wenyuan, Yang, Qiang

arXiv.org Artificial IntelligenceNov-23-2018

In many practical machine-learning applications, it is critical to allow knowledge to be transferred from external domains while preserving user privacy. Unfortunately, existing transfer-learning works do not have a privacy guarantee. In this paper, for the first time, we propose a method that can simultaneously transfer knowledge from external datasets while offering an $\epsilon$-differential privacy guarantee. First, we show that a simple combination of the hypothesis transfer learning and the privacy preserving logistic regression can address the problem. However, the performance of this approach can be poor as the sample size in the target domain may be small. To address this problem, we propose a new method which splits the feature set in source and target data into several subsets, and trains models on these subsets before finally aggregating the predictions by a stacked generalization. Feature importance can also be incorporated into the proposed method to further improve performance. We prove that the proposed method has an $\epsilon$-differential privacy guarantee, and further analysis shows that its performance is better than above simple combination given the same privacy budget. Finally, experiments on MINST and real-world RUIJIN datasets show that our proposed method achieves the start-of-the-art performance.

big data, dataset, health & medicine, (20 more...)

arXiv.org Artificial Intelligence

1811.09491

Country: Asia > China (0.28)

Genre:

Research Report > New Finding (0.49)
Research Report > Experimental Study (0.35)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.83)
Information Technology > Data Science > Data Mining > Big Data (0.63)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.35)

Add feedback

Heterogeneous Transfer Learning for Image Classification

Zhu, Yin (Hong Kong University of Science and Technology) | Chen, Yuqiang (Shanghai Jiao Tong University) | Lu, Zhongqi (&dagger;Hong Kong University of Science and Technology) | Pan, Sinno Jialin (Institute for Infocomm Research) | Xue, Gui-Rong (Shanghai Jiao Tong University) | Yu, Yong (Shanghai Jiao Tong University) | Yang, Qiang (Hong Kong University of Science and Technology)

AAAI ConferencesAug-4-2011

Transfer learning as a new machine learning paradigm has gained increasing attention lately. In situations where the training data in a target domain are not sufficient to learn predictive models effectively, transfer learning leverages auxiliary source data from other related source domains for learning. While most of the existing works in this area only focused on using the source data with the same structure as the target data, in this paper, we push this boundary further by proposing a heterogeneous transfer learning framework for knowledge transfer between text and images. We observe that for a target-domain classification problem, some annotated images can be found on many social Web sites, which can serve as a bridge to transfer knowledge from the abundant text documents available over the Web. A key question is how to effectively transfer the knowledge in the source data even though the text can be arbitrarily found. Our solution is to enrich the representation of the target images with semantic concepts extracted from the auxiliary source data through a novel matrix factorization method. By using the latent semantic features generated by the auxiliary data, we are able to build a better integrated image classifier. We empirically demonstrate the effectiveness of our algorithm on the Caltech-256 image dataset.

artificial intelligence, image classification, machine learning, (16 more...)

AAAI Conferences

Twenty-Fifth AAAI Conference on Artificial Intelligence

Country:

Asia > China > Hong Kong (0.14)
North America > United States > Wisconsin (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

Visual Contextual Advertising: Bringing Textual Advertisements to Images

Chen, Yuqiang (Shanghai Jiao Tong University) | Jin, Ou (Shanghai Jiao Tong University) | Xue, Gui-Rong (Shanghai Jiao Tong University) | Chen, Jia (Shanghai Jiao Tong University) | Yang, Qiang (Hong Kong University of Science and Technology)

AAAI ConferencesJul-15-2010

Advertising in the case of textual Web pages has been studied extensively by many researchers. However, with the increasing amount of multimedia data such as image, audio and video on the Web, the need for recommending advertisement for the multimedia data is becoming a reality. In this paper, we address the novel problem of visual contextual advertising, which is to directly advertise when users are viewing images which do not have any surrounding text. A key challenging issue of visual contextual advertising is that images and advertisements are usually represented in image space and word space respectively, which are quite different with each other inherently. As a result, existing methods for Web page advertising are inapplicable since they represent both Web pages and advertisement in the same word space. In order to solve the problem, we propose to exploit the social Web to link these two feature spaces together. In particular, we present a unified generative model to integrate advertisements, words and images. Specifically, our solution combines two parts in a principled approach: First, we transform images from a image feature space to a word space utilizing the knowledge from images with annotations from social Web. Then, a language model based approach is applied to estimate the relevance between transformed images and advertisements. Moreover, in this model, the probability of recommending an advertisement can be inferred efficiently given an image, which enables potential applications to online advertising.

advertisement, artificial intelligence, information technology services, (17 more...)

AAAI Conferences

Twenty-Fourth AAAI Conference on Artificial Intelligence

Country: Asia > China (0.46)

Industry:

Marketing (1.00)
Information Technology > Services (0.71)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Add feedback

Translated Learning: Transfer Learning across Different Feature Spaces

Dai, Wenyuan, Chen, Yuqiang, Xue, Gui-rong, Yang, Qiang, Yu, Yong

Neural Information Processing SystemsDec-31-2009

This paper investigates a new machine learning strategy called translated learning. Unlikemany previous learning tasks, we focus on how to use labeled data from one feature space to enhance the classification of other entirely different learning spaces. For example, we might wish to use labeled text data to help learn a model for classifying image data, when the labeled images are difficult to obtain. Animportant aspect of translated learning is to build a "bridge" to link one feature space (known as the "source space") to another space (known as the "target space")through a translator in order to migrate the knowledge from source to target. The translated learning solution uses a language model to link the class labels to the features in the source spaces, which in turn is translated to the features inthe target spaces. Finally, this chain of linkages is completed by tracing back to the instances in the target spaces. We show that this path of linkage can be modeled using a Markov chain and risk minimization. Through experiments on the text-aided image classification and cross-language classification tasks, we demonstrate that our translated learning framework can greatly outperform many state-of-the-art baseline methods.

artificial intelligence, image understanding, tlrisk, (18 more...)

Neural Information Processing Systems

Country:

Asia > China > Hong Kong (0.14)
North America > United States > Wisconsin (0.14)

Genre:

Research Report (0.87)
Overview (0.68)

Industry: Education (0.87)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.52)
Information Technology > Artificial Intelligence > Vision > Image Understanding (0.36)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.35)

Add feedback