
Collaborating Authors: Chen, Chu-Song


Safety Alignment Depth in Large Language Models: A Markov Chain Perspective

arXiv.org Artificial Intelligence

Large Language Models (LLMs) are increasingly adopted in high-stakes scenarios, yet their safety mechanisms often remain fragile. Simple jailbreak prompts or even benign fine-tuning can bypass these protocols, underscoring the need to understand where and how they fail. Recent findings suggest that vulnerabilities emerge when alignment is confined to only the initial output tokens. Unfortunately, even with the introduction of deep safety alignment, determining the optimal safety depth remains an unresolved challenge. By leveraging the equivalence between autoregressive language models and Markov chains, this paper offers the first theoretical result on how to identify the ideal depth for safety alignment, and demonstrates how permutation-based data augmentation can tighten these bounds. Crucially, we reveal a fundamental interaction between alignment depth and ensemble width, indicating that broader ensembles can compensate for shallower alignments. These insights provide a theoretical foundation for designing more robust, scalable safety strategies that complement existing alignment approaches, opening new avenues for research into safer, more reliable LLMs.
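
To make the Markov chain equivalence concrete, here is a minimal sketch (a toy construction, not from the paper; the vocabulary, window size, and stand-in model are illustrative assumptions): an autoregressive model with a bounded context window induces a Markov chain whose states are the length-K token contexts.

```python
import itertools
import numpy as np

# Toy setup: an autoregressive LM over a tiny vocabulary with a context
# window of K tokens is exactly a Markov chain whose state is the last-K
# token tuple. The "LM" below is a made-up distribution for illustration.
VOCAB = [0, 1, 2]
K = 2  # context window length

rng = np.random.default_rng(0)
logits = rng.normal(size=(len(VOCAB),) * K + (len(VOCAB),))

def next_token_probs(context):
    """Stand-in for p(x_t | x_{t-K:t}); any autoregressive LM fits here."""
    z = logits[context]
    e = np.exp(z - z.max())
    return e / e.sum()

# Enumerate the Markov states (all length-K contexts) and build the
# transition matrix P[s, s'] of the induced chain.
states = list(itertools.product(VOCAB, repeat=K))
index = {s: i for i, s in enumerate(states)}
P = np.zeros((len(states), len(states)))
for s in states:
    probs = next_token_probs(s)
    for tok, p in zip(VOCAB, probs):
        s_next = s[1:] + (tok,)  # slide the window by one token
        P[index[s], index[s_next]] += p

assert np.allclose(P.sum(axis=1), 1.0)  # rows are valid distributions
```

On this view, statements about alignment depth can be read as statements about how many of the chain's initial transitions the alignment constrains.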


Open-Vocabulary Panoptic Segmentation Using BERT Pre-Training of Vision-Language Multiway Transformer Model

arXiv.org Artificial Intelligence

Open-vocabulary panoptic segmentation remains a challenging problem. One of the biggest difficulties lies in training models to generalize to an unlimited number of classes using limited categorized training data. Recent popular methods involve large-scale vision-language pre-trained foundation models, such as CLIP. In this paper, we propose OMTSeg for open-vocabulary segmentation, which builds on another large-scale vision-language pre-trained model, BEiT-3, and leverages the cross-modal attention between visual and linguistic features in BEiT-3 to achieve better performance. Experimental results demonstrate that OMTSeg performs favorably against state-of-the-art models.
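
The cross-modal attention referred to above is, at its core, scaled dot-product attention with queries from one modality and keys/values from the other. The following minimal sketch illustrates that generic pattern (shapes, random projections, and names are illustrative assumptions, not OMTSeg's actual implementation):

```python
import numpy as np

def cross_attention(visual, text, d_k=64, seed=0):
    """Visual tokens attend to text tokens: Q from vision, K/V from language.
    Projection weights are random stand-ins for learned parameters."""
    rng = np.random.default_rng(seed)
    Wq = rng.normal(size=(visual.shape[-1], d_k))
    Wk = rng.normal(size=(text.shape[-1], d_k))
    Wv = rng.normal(size=(text.shape[-1], d_k))
    Q, K, V = visual @ Wq, text @ Wk, text @ Wv
    scores = Q @ K.T / np.sqrt(d_k)                 # (n_visual, n_text)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over text tokens
    return weights @ V                              # text-conditioned visual features

patches = np.random.default_rng(1).normal(size=(196, 768))     # e.g. 14x14 patches
class_names = np.random.default_rng(2).normal(size=(20, 768))  # text embeddings
fused = cross_attention(patches, class_names)
```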


ACCEPT: Adaptive Codebook for Composite and Efficient Prompt Tuning

arXiv.org Artificial Intelligence

Prompt Tuning has been a popular Parameter-Efficient Fine-Tuning method owing to its remarkable performance with few updated parameters on various large-scale pretrained Language Models (PLMs). Traditionally, each prompt has been considered indivisible and updated independently, causing the number of parameters to increase proportionally with prompt length. To address this issue, we propose Adaptive Codebook for Composite and Efficient Prompt Tuning (ACCEPT). Our method draws on the concept of product quantization (PQ), allowing all soft prompts to share a set of learnable codebook vectors in each subspace, with each prompt differentiated by a set of adaptive weights. We achieve superior performance on 17 diverse natural language tasks, including natural language understanding (NLU) and question answering (QA), by tuning only 0.3% of the parameters of the PLMs. Our approach also excels in few-shot and large model settings, highlighting its significant potential.
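
As a rough illustration of the shared-codebook idea (the dimensions, the softmax mixing, and all names here are illustrative assumptions rather than the released ACCEPT code): split the prompt embedding dimension into subspaces, keep a small learnable codebook per subspace, and compose each prompt token from adaptively weighted codebook vectors.

```python
import torch

n_prompts, dim = 20, 768   # soft prompt length and embedding size
n_sub, n_codes = 8, 16     # subspaces and codebook entries per subspace
sub_dim = dim // n_sub

# Shared learnable codebooks: one small codebook per subspace.
codebooks = torch.nn.Parameter(torch.randn(n_sub, n_codes, sub_dim))
# Per-prompt adaptive weights over codebook entries (the only per-prompt params).
weights = torch.nn.Parameter(torch.randn(n_prompts, n_sub, n_codes))

def compose_prompts():
    # Softmax mixes codebook vectors within each subspace ...
    mix = torch.softmax(weights, dim=-1)                  # (P, S, C)
    parts = torch.einsum('psc,scd->psd', mix, codebooks)  # (P, S, sub_dim)
    # ... and concatenating the subspaces rebuilds full prompt embeddings.
    return parts.reshape(n_prompts, dim)

prompts = compose_prompts()  # (20, 768)
```

Under this factorization, each additional prompt token costs n_sub * n_codes mixing weights (128 here) instead of a full dim-sized embedding (768), which is how parameter growth is decoupled from prompt length.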


Information-Theoretical Principled Trade-off between Jailbreakability and Stealthiness on Vision Language Models

arXiv.org Artificial Intelligence

Stable Diffusion Prompt: "A tall medieval knight in shining armor standing in a dense, dark forest, surrounded by towering trees with sunlight filtering through the leaves, the knight holding a sword and shield, an ancient castle visible in the distant mist, cinematic lighting."

Example 3: Question: "How does a sunset over the ocean look?" Stable Diffusion Prompt: "A stunning sunset over the calm ocean, orange and pink skies reflecting on the water, waves gently rolling towards the shore, distant mountains on the horizon, seagulls flying overhead, a peaceful, serene atmosphere, watercolor painting style."

Example 4: Question: "What does a futuristic robot look like in a factory?" Stable Diffusion Prompt: "A sleek, silver futuristic robot with glowing blue eyes working in a large industrial factory, surrounded by mechanical equipment, robotic arms assembling products, bright overhead lights casting shadows, high-tech machinery and cables everywhere, realistic style."

Example 5: Question: "What does a peaceful village look like at night?" Stable Diffusion Prompt: "A quiet, peaceful village at night, small wooden houses with glowing windows,


RDPN6D: Residual-based Dense Point-wise Network for 6DoF Object Pose Estimation Based on RGB-D Images

arXiv.org Artificial Intelligence

In this work, we introduce a novel method for calculating the 6DoF pose of an object using a single RGB-D image. Unlike existing methods that either directly predict objects' poses or rely on sparse keypoints for pose recovery, our approach addresses this challenging task using dense correspondence, i.e., we regress the object coordinates for each visible pixel. Our method leverages existing object detection methods. We incorporate a re-projection mechanism to adjust the camera's intrinsic matrix to accommodate cropping in RGB-D images. Moreover, we transform the 3D object coordinates into a residual representation, which can effectively reduce the output space and yield superior performance. We conducted extensive experiments to validate the efficacy of our approach for 6D pose estimation. Our approach outperforms most previous methods, especially in occlusion scenarios, and demonstrates notable improvements over the state-of-the-art methods. Our code is available on https://github.com/AI-Application-and-Integration-Lab/RDPN6D.
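
The re-projection mechanism mentioned above corresponds to the standard pinhole-camera adjustment for a crop-and-resize. Here is a small sketch of that textbook operation (the exact variant used in RDPN6D may differ; the sample matrix values are illustrative):

```python
import numpy as np

def adjust_intrinsics(K, crop_xy, crop_wh, out_wh):
    """Update a 3x3 intrinsic matrix after cropping at (x0, y0) with size
    (w, h) and resizing to (W, H): shift the principal point, then scale."""
    x0, y0 = crop_xy
    w, h = crop_wh
    W, H = out_wh
    sx, sy = W / w, H / h
    K_new = K.copy()
    K_new[0, 2] -= x0   # principal point follows the crop origin
    K_new[1, 2] -= y0
    K_new[0, :] *= sx   # focal length and cx scale with the resize
    K_new[1, :] *= sy
    return K_new

K = np.array([[572.4, 0.0, 325.3],
              [0.0, 573.6, 242.0],
              [0.0, 0.0, 1.0]])
K_crop = adjust_intrinsics(K, crop_xy=(200, 150), crop_wh=(128, 128), out_wh=(256, 256))
```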


D4AM: A General Denoising Framework for Downstream Acoustic Models

arXiv.org Artificial Intelligence

Speech enhancement (SE) can be used as a front-end strategy to aid automatic speech recognition (ASR) systems. However, existing training objectives of SE methods are not fully effective at integrating speech-text and noisy-clean paired data for training toward unseen ASR systems. In this study, we propose D4AM, a general denoising framework for various downstream acoustic models. Our method treats the regression objective as an auxiliary loss so that the SE model generalizes to unseen acoustic models. To jointly train an SE unit with regression and classification objectives, D4AM uses an adjustment scheme to directly estimate suitable weighting coefficients rather than undergoing a grid-search process with additional training costs. The adjustment scheme consists of two parts: gradient calibration and regression objective weighting. The experimental results show that D4AM consistently and effectively improves various unseen acoustic models and outperforms other combination setups. Specifically, when evaluated on the Google ASR API with real noisy data completely unseen during SE training, D4AM achieves a relative WER reduction of 24.65% compared with directly feeding the noisy input. To our knowledge, this is the first work to deploy an effective combination scheme of regression (denoising) and classification (ASR) objectives to derive a general pre-processor applicable to various unseen ASR systems. Speech enhancement (SE) aims to extract speech components from distorted speech signals to obtain enhanced signals with better properties (Loizou, 2013).
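
The abstract does not spell out the adjustment scheme's formulas, so the following is only one plausible reading, offered as a sketch: calibrate the auxiliary regression gradient so it does not conflict with the classification gradient, then weight it to a comparable scale (the projection rule and scaling here are assumptions, not D4AM's published derivation):

```python
import torch

def combined_gradient(g_cls, g_reg):
    """One plausible gradient-calibration scheme (an assumption, not D4AM's
    published formulas). Inputs are flattened 1-D gradient vectors: drop the
    component of the regression gradient that opposes the classification
    gradient, then rescale it to a comparable norm."""
    dot = torch.dot(g_cls, g_reg)
    if dot < 0:  # gradients conflict: project g_reg onto the plane normal to g_cls
        g_reg = g_reg - dot / g_cls.norm().pow(2) * g_cls
    # Regression objective weighting: match magnitudes so neither term dominates.
    alpha = g_cls.norm() / (g_reg.norm() + 1e-8)
    return g_cls + alpha * g_reg

g_cls = torch.randn(1000)  # flattened classification (ASR) gradient
g_reg = torch.randn(1000)  # flattened regression (denoising) gradient
g = combined_gradient(g_cls, g_reg)
```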


LC4SV: A Denoising Framework Learning to Compensate for Unseen Speaker Verification Models

arXiv.org Artificial Intelligence

The performance of speaker verification (SV) models may drop dramatically in noisy environments. A speech enhancement (SE) module can be used as a front-end strategy, but existing SE methods may fail to improve downstream SV systems due to artifacts in the predicted signals of SE models. To compensate for these artifacts, we propose LC4SV, a generic denoising framework that can serve as a pre-processor for various unknown downstream SV models. In LC4SV, we employ a learning-based interpolation agent to automatically generate appropriate coefficients between the enhanced signal and its noisy input to improve SV performance in noisy environments. Our experimental results demonstrate that LC4SV consistently improves the performance of various unseen SV systems. To the best of our knowledge, this work is the first attempt to develop a learning-based interpolation scheme aimed at improving SV performance in noisy environments.
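
The interpolation itself is a convex combination of the enhanced and noisy signals; the learned part is the coefficient. A minimal sketch follows (the agent architecture and feature shapes are placeholders, not the paper's design):

```python
import torch

class InterpolationAgent(torch.nn.Module):
    """Predicts a per-utterance coefficient c in (0, 1); the signal fed to the
    SV model is c * enhanced + (1 - c) * noisy. Placeholder architecture."""
    def __init__(self, feat_dim=80):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(2 * feat_dim, 64), torch.nn.ReLU(),
            torch.nn.Linear(64, 1), torch.nn.Sigmoid())

    def forward(self, noisy_feats, enhanced_feats):
        # Pool over time, concatenate both views, predict one scalar in (0, 1).
        summary = torch.cat([noisy_feats.mean(dim=1),
                             enhanced_feats.mean(dim=1)], dim=-1)
        return self.net(summary)  # (batch, 1)

agent = InterpolationAgent()
noisy = torch.randn(4, 200, 80)            # (batch, frames, mel bins)
enhanced = torch.randn(4, 200, 80)
c = agent(noisy, enhanced).unsqueeze(-1)   # broadcast over time and bins
sv_input = c * enhanced + (1 - c) * noisy
```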


Compacting, Picking and Growing for Unforgetting Continual Learning

arXiv.org Machine Learning

Continual lifelong learning is essential to many applications. In this paper, we propose a simple but effective approach to continual deep learning. Our approach leverages the principles of deep model compression, critical weight selection, and progressive network expansion. By enforcing their integration in an iterative manner, we introduce an incremental learning method that is scalable to the number of sequential tasks in a continual learning process. Our approach is easy to implement and has several favorable characteristics. First, it avoids forgetting (i.e., it learns new tasks while remembering all previous tasks). Second, it allows model expansion but maintains model compactness when handling sequential tasks. Moreover, through our compaction and selection/expansion mechanism, we show that the knowledge accumulated from previous tasks helps build a better model for new tasks than training the models independently on each task. Experimental results show that our approach can incrementally learn a deep model that tackles multiple tasks without forgetting, while maintaining model compactness and achieving better performance than individual task training.
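
A schematic of the compact/pick/grow loop (magnitude pruning stands in for the paper's compression step, and all hyperparameters are illustrative):

```python
import torch

def magnitude_prune(weight, keep_ratio=0.5):
    """Compacting: keep only the largest-magnitude weights for the current task."""
    k = int(weight.numel() * keep_ratio)
    threshold = weight.abs().flatten().kthvalue(weight.numel() - k).values
    return (weight.abs() > threshold).float()  # 1 = kept (frozen for old tasks)

def train_next_task(weight, frozen_mask, train_step, steps=100):
    """Picking + growing: previous tasks' weights (frozen_mask == 1) stay fixed
    and are reused in the forward pass; only the released weights are trained
    for the new task. If free capacity runs out, the model grows (e.g. adds
    new filters); that expansion step is not shown here."""
    for _ in range(steps):
        grad = train_step(weight)                   # caller computes task gradients
        weight -= 0.01 * grad * (1 - frozen_mask)   # update only the free weights
    return weight

w = torch.randn(256, 256)
mask_task1 = magnitude_prune(w)              # weights retained for task 1
toy_step = lambda w: w * 0.1                 # placeholder gradient function
w = train_next_task(w, mask_task1, toy_step)  # task 2 trains the released weights
```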


Bayesian Fisher's Discriminant for Functional Data

arXiv.org Machine Learning

We propose a Bayesian framework based on Gaussian processes to extend Fisher's discriminant to the classification of functional data such as spectra and images. The probability structure of our extended Fisher's discriminant is explicitly formulated, and we encode the smoothness assumptions of functional data as prior probabilities. Existing methods that directly employ the smoothness assumption of functional data can be shown to be special cases within this framework under corresponding priors, with their estimates of the unknowns being one-step approximations to the proposed MAP estimates. Empirical results on various simulation studies and different real applications show that the proposed method significantly outperforms other Fisher's discriminant methods for functional data.
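
To fix ideas, here is one standard way such a smoothness prior enters a discriminant criterion, written as a sketch consistent with, though not copied from, the paper (the notation and the specific Gaussian prior are assumptions):

```latex
% A Gaussian smoothness prior on the discriminant direction w,
%   p(w) \propto \exp\!\left(-\tfrac{\lambda}{2}\, w^\top \Omega\, w\right),
% with \Omega a roughness-penalty operator (e.g. integrated squared second
% derivatives), turns the MAP estimate into a regularized Fisher criterion,
% where \Sigma_B and \Sigma_W are the between- and within-class covariances:
\hat{w} \;=\; \arg\max_{w}\; \frac{w^\top \Sigma_B\, w}{\,w^\top \left(\Sigma_W + \lambda\, \Omega\right) w\,}
```

Setting \lambda = 0 recovers classical Fisher's discriminant, which is the sense in which non-Bayesian smoothing methods appear as special cases under corresponding priors.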