Huang, Chu-Ren
Sparse Brains are Also Adaptive Brains: Cognitive-Load-Aware Dynamic Activation for LLMs
Yang, Yiheng, Wang, Yujie, Ma, Chi, Yu, Lei, Chersoni, Emmanuele, Huang, Chu-Ren
Dense large language models (LLMs) face critical efficiency bottlenecks because they rigidly activate all parameters regardless of input complexity. Existing sparsity methods (static pruning or dynamic activation) address this only partially: they either lack adaptivity to contextual or model-structural demands or incur prohibitive computational overhead. Inspired by the human brain's dual-process mechanisms, predictive coding (N400) for backbone sparsity and structural reanalysis (P600) for complex contexts, we propose CLADA, a Cognitive-Load-Aware Dynamic Activation framework that synergizes statistical sparsity with semantic adaptability. Our key insight is that LLM activations exhibit two complementary patterns: 1) global statistical sparsity driven by sequence-level prefix information, and 2) local semantic adaptability modulated by cognitive-load metrics (e.g., surprisal and entropy). CLADA employs a hierarchical thresholding strategy: a baseline obtained from offline error-controlled optimization ensures 40%+ sparsity and is then dynamically adjusted by real-time cognitive signals. Evaluations across six mainstream LLMs and nine benchmarks demonstrate that CLADA achieves roughly 20% average speedup with less than 2% accuracy drop, outperforming Griffin (5%+ degradation) and TT (negligible speedup). Crucially, we establish the first formal connection between neurolinguistic event-related potential (ERP) components and LLM efficiency mechanisms through multi-level regression analysis (R^2 = 0.17 for the sparsity-adaptation synergy). Requiring no retraining or architectural changes, CLADA offers a deployable solution for resource-aware LLM inference while advancing biologically inspired AI design. Our code is available at https://github.com/Oldify/CLADA.
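The hierarchical thresholding idea can be illustrated with a minimal sketch. This is not the paper's implementation: the coefficients, the normalization by log-vocabulary size, and the magnitude-quantile activation mask below are all illustrative assumptions. It only shows how a baseline sparsity level could be relaxed when surprisal and entropy (the cognitive-load signals) are high.

```python
import numpy as np

def cognitive_load_threshold(logits, token_id, base_sparsity=0.4,
                             alpha=0.1, beta=0.1):
    """Adjust a baseline sparsity level using two cognitive-load signals
    from the next-token distribution: surprisal of the observed token and
    entropy of the full distribution (both normalized by log vocab size).
    Higher load -> lower sparsity -> more neurons kept active.
    All coefficients are illustrative, not the paper's values."""
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    norm = np.log(logits.size)
    surprisal = -np.log(probs[token_id] + 1e-12) / norm
    entropy = -(probs * np.log(probs + 1e-12)).sum() / norm
    return max(0.0, base_sparsity - alpha * surprisal - beta * entropy)

def sparse_activation(hidden, sparsity):
    """Zero out the lowest-magnitude fraction of activations."""
    cutoff = np.quantile(np.abs(hidden), sparsity)
    return np.where(np.abs(hidden) >= cutoff, hidden, 0.0)

# Toy usage with random stand-in logits and MLP activations.
rng = np.random.default_rng(0)
logits = rng.normal(size=32000)
hidden = rng.normal(size=4096)
s = cognitive_load_threshold(logits, token_id=123)
kept = np.count_nonzero(sparse_activation(hidden, s)) / hidden.size
print(f"sparsity={s:.3f}, kept fraction={kept:.1%}")
```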
Dual Memory Network Model for Biased Product Review Classification
Long, Yunfei, Ma, Mingyu, Lu, Qin, Xiang, Rong, Huang, Chu-Ren
In sentiment analysis (SA) of product reviews, both user and product information have proven useful. Current approaches handle user profiles and product information in a single unified model, which may not learn salient features of users and products effectively. In this work, we propose a dual user and product memory network (DUPMN) model that learns user profiles and product reviews with separate memory networks; the two representations are then used jointly for sentiment prediction. Using separate models aims to capture user profiles and product information more effectively. Compared to state-of-the-art unified prediction models, evaluations on three benchmark datasets, IMDB, Yelp13, and Yelp14, show that our dual learning model yields performance gains of 0.6%, 1.2%, and 0.9%, respectively. The improvements are also statistically significant as measured by p-values.
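As a rough illustration of the dual-memory idea (not the authors' implementation; the hop count, dimensions, and attention form are assumptions), the PyTorch sketch below keeps two separate memory networks, one attending over a user's review vectors and one over a product's, and concatenates their outputs for sentiment classification.

```python
import torch
import torch.nn as nn

class MemoryHop(nn.Module):
    """One memory-network hop: attend over a memory of review vectors
    conditioned on the current document representation."""
    def __init__(self, dim):
        super().__init__()
        self.proj = nn.Linear(dim, dim)

    def forward(self, query, memory):
        # query: (B, D), memory: (B, M, D)
        scores = torch.bmm(memory, query.unsqueeze(-1)).squeeze(-1)  # (B, M)
        attn = torch.softmax(scores, dim=-1)
        read = torch.bmm(attn.unsqueeze(1), memory).squeeze(1)       # (B, D)
        return self.proj(read) + query

class DUPMNSketch(nn.Module):
    """Separate memory networks for user and product histories; their
    outputs are concatenated for the final sentiment prediction."""
    def __init__(self, dim=128, hops=2, num_classes=5):
        super().__init__()
        self.user_hops = nn.ModuleList([MemoryHop(dim) for _ in range(hops)])
        self.prod_hops = nn.ModuleList([MemoryHop(dim) for _ in range(hops)])
        self.classifier = nn.Linear(2 * dim, num_classes)

    def forward(self, doc, user_mem, prod_mem):
        u, p = doc, doc
        for hop in self.user_hops:
            u = hop(u, user_mem)
        for hop in self.prod_hops:
            p = hop(p, prod_mem)
        return self.classifier(torch.cat([u, p], dim=-1))

# Toy usage with random document and memory embeddings.
model = DUPMNSketch()
doc = torch.randn(4, 128)
logits = model(doc, torch.randn(4, 10, 128), torch.randn(4, 10, 128))
print(logits.shape)  # torch.Size([4, 5])
```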
ROOT13: Spotting Hypernyms, Co-Hyponyms and Randoms
Santus, Enrico (The Hong Kong Polytechnic University) | Lenci, Alessandro (University of Pisa) | Chiu, Tin-Shing (The Hong Kong Polytechnic University) | Lu, Qin (The Hong Kong Polytechnic University) | Huang, Chu-Ren (The Hong Kong Polytechnic University)
In this paper, we describe ROOT13, a supervised system for the classification of hypernyms, co-hyponyms, and random words. The system relies on a Random Forest algorithm and 13 unsupervised corpus-based features. We evaluate it with 10-fold cross-validation on 9,600 pairs, equally distributed among the three classes and covering several parts of speech (i.e., adjectives, nouns, and verbs). When all the classes are present, ROOT13 achieves an F1 score of 88.3%, against a baseline of 57.6% (vector cosine). When the classification is binary, ROOT13 achieves the following results: hypernyms vs. co-hyponyms (93.4% vs. 60.2%), hypernyms vs. random (92.3% vs. 65.5%), and co-hyponyms vs. random (97.3% vs. 81.5%). Our results are competitive with state-of-the-art models.
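Since ROOT13 is a Random Forest over 13 corpus-based features evaluated with 10-fold cross-validation, the scikit-learn sketch below shows that evaluation setup; the feature matrix and labels are random stand-in data, not the paper's features or scores.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Stand-in data: 9,600 word pairs x 13 corpus-based features.
# Real features would come from corpus statistics (frequency,
# co-occurrence, entropy-style measures, vector cosine, etc.).
rng = np.random.default_rng(0)
X = rng.normal(size=(9600, 13))
y = rng.integers(0, 3, size=9600)  # 0=hypernym, 1=co-hyponym, 2=random pair

clf = RandomForestClassifier(n_estimators=100, random_state=0)
scores = cross_val_score(clf, X, y, cv=10, scoring="f1_macro")
print(f"10-fold macro-F1: {scores.mean():.3f}")
```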
Unsupervised Measure of Word Similarity: How to Outperform Co-Occurrence and Vector Cosine in VSMs
Santus, Enrico (The Hong Kong Polytechnic University) | Lenci, Alessandro (University of Pisa) | Chiu, Tin-Shing (The Hong Kong Polytechnic University) | Lu, Qin (The Hong Kong Polytechnic University) | Huang, Chu-Ren (The Hong Kong Polytechnic University)
In this paper, we claim that vector cosine, generally considered among the most effective unsupervised measures for identifying word similarity in Vector Space Models, can be outperformed by an unsupervised measure that calculates the extent of the intersection among the most mutually dependent contexts of the target words. To prove this, we describe and evaluate APSyn, a variant of Average Precision that, without any optimization, outperforms both vector cosine and co-occurrence on the standard ESL test set, with improvements ranging from +9.00% to +17.98%, depending on the number of top contexts chosen.
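A minimal sketch of an APSyn-style score follows, assuming the common formulation in which each context shared by the two words' top-N context lists contributes the inverse of the average of its two ranks; the association scores and top-N value in the toy example are made up.

```python
def apsyn(contexts_w1, contexts_w2, top_n=1000):
    """APSyn-style similarity: rank each word's contexts by an association
    measure (e.g. PPMI), keep the top N, and sum 1 / mean-rank over the
    contexts the two words share. Inputs: dicts context -> score."""
    def top_ranked(contexts):
        ordered = sorted(contexts, key=contexts.get, reverse=True)[:top_n]
        return {c: rank for rank, c in enumerate(ordered, start=1)}

    r1, r2 = top_ranked(contexts_w1), top_ranked(contexts_w2)
    shared = r1.keys() & r2.keys()
    return sum(1.0 / ((r1[c] + r2[c]) / 2.0) for c in shared)

# Toy usage with made-up association scores.
dog = {"bark": 5.1, "pet": 4.2, "tail": 3.9, "walk": 3.0}
cat = {"pet": 4.8, "tail": 4.0, "purr": 3.7, "walk": 2.5}
print(apsyn(dog, cat, top_n=4))  # higher means more shared salient contexts
```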