

cf66f995883298c4db2f0dcba28fb211-Paper-Conference.pdf

Neural Information Processing Systems

Time series forecasting is crucial for applications across multiple domains and various scenarios. Although Transformers have dramatically advanced the landscape of forecasting, their effectiveness remains debated.





EEGChaT: A Transformer-Based Modular Channel Selector for SEEG Analysis

Wang, Chen, Wang, Yansen, Han, Dongqi, Wang, Zilong, Li, Dongsheng

arXiv.org Artificial Intelligence

Analyzing stereoelectroencephalography (SEEG) signals is critical for brain-computer interface (BCI) applications and neuroscience research, yet poses significant challenges due to the large number of input channels and their heterogeneous relevance. Traditional channel selection methods struggle to scale or provide meaningful interpretability for SEEG data. In this work, we propose EEGChaT, a novel Transformer-based channel selection module designed to automatically identify the most task-relevant channels in SEEG recordings. EEGChaT introduces Channel Aggregation Tokens (CATs) to aggregate information across channels, and leverages an improved Attention Rollout technique to compute interpretable, quantitative channel importance scores. We evaluate EEGChaT on the DuIN dataset, demonstrating that integrating EEGChaT with existing classification models consistently improves decoding accuracy, achieving up to 17% absolute gains. Furthermore, the channel weights produced by EEGChaT show substantial overlap with manually selected channels, supporting the interpretability of the approach. Our results suggest that EEGChaT is an effective and generalizable solution for channel selection in high-dimensional SEEG analysis, offering both enhanced performance and insights into neural signal relevance.
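The abstract does not specify what the "improved" Attention Rollout looks like, but the general mechanism can be sketched with the standard Attention Rollout procedure (fold residual connections into each layer's attention map, multiply the maps across layers, then read channel weights off the row of a hypothetical aggregation token). The function and parameter names below (`attention_rollout`, `channel_importance`, `cat_index`) are illustrative assumptions, not the paper's API:

```python
import numpy as np

def attention_rollout(attn_layers):
    """Standard Attention Rollout: average each layer's attention map
    with the identity (residual connection), renormalize rows, and
    multiply across layers to get token-to-token influence scores."""
    n = attn_layers[0].shape[0]
    rollout = np.eye(n)
    for attn in attn_layers:
        a = 0.5 * attn + 0.5 * np.eye(n)   # account for the residual stream
        a = a / a.sum(axis=1, keepdims=True)
        rollout = a @ rollout
    return rollout

def channel_importance(attn_layers, cat_index=0):
    """Read the aggregation token's row: how much each channel token
    contributed to it after all layers, normalized to sum to 1."""
    rollout = attention_rollout(attn_layers)
    scores = rollout[cat_index].copy()
    scores[cat_index] = 0.0                # drop the token's self-influence
    return scores / scores.sum()

# Toy example: 1 aggregation token + 3 channel tokens, 2 layers of
# random row-stochastic attention.
rng = np.random.default_rng(0)
layers = []
for _ in range(2):
    a = rng.random((4, 4))
    layers.append(a / a.sum(axis=1, keepdims=True))
w = channel_importance(layers)
print(w)  # normalized weights over the 3 channel tokens
```

The resulting vector can be ranked to pick the top-k channels for a downstream classifier, which is the role the abstract describes for EEGChaT's channel weights.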




Learning Conjoint Attentions for Graph Neural Nets: Supplementary Materials

He, Tiantian, Ong, Yew-Soon, Bai, Lu (Agency for Science, Technology and Research (A*STAR))

Neural Information Processing Systems

To prove Theorem 1, both directions of the iff condition must be considered. The equation in question cannot hold, since the terms inside the summation operator are positive. From Eq. (4), the RHS of Eq. (10) can be an irrational number while the LHS is a rational number, so Eq. (9) can be rewritten accordingly. Theorem 2 is proved by following the same procedure used for Theorem 1: since Eq. (20) holds for any input, the corresponding equation cannot hold because the softmax function is positive.



Improving Multilingual Social Media Insights: Aspect-based Comment Analysis

Zhang, Longyin, Zou, Bowei, Aw, Ai Ti

arXiv.org Artificial Intelligence

The inherent nature of social media posts, characterized by free-form language and a disjointed array of diverse opinions and topics, poses significant challenges to downstream NLP tasks such as comment clustering, comment summarization, and social media opinion analysis. To address this, we propose identifying and generating aspect terms at the granularity of individual comments to guide model attention. Specifically, we leverage multilingual large language models with supervised fine-tuning for comment aspect term generation (CAT-G), further aligning the model's predictions with human expectations through direct preference optimization (DPO). We demonstrate the effectiveness of our method in enhancing the comprehension of social media discourse on two NLP tasks. Moreover, this paper contributes the first multilingual CAT-G test set covering English, Chinese, Malay, and Bahasa Indonesia. As LLM capabilities vary among languages, this test set allows for a comparative analysis of performance across languages with varying levels of LLM proficiency.
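The DPO alignment step mentioned in the abstract has a standard closed-form loss: push up the margin between the preferred and dispreferred response, measured as log-probability ratios against a frozen reference model. A minimal sketch for one preference pair follows; the specific log-probability values and the `beta=0.1` temperature are illustrative, not taken from the paper:

```python
import math

def dpo_loss(policy_logp_chosen, policy_logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Direct Preference Optimization loss for one preference pair:
    -log(sigmoid(beta * margin)), where the margin compares policy-vs-
    reference log-probability ratios of chosen and rejected responses."""
    chosen_reward = beta * (policy_logp_chosen - ref_logp_chosen)
    rejected_reward = beta * (policy_logp_rejected - ref_logp_rejected)
    margin = chosen_reward - rejected_reward
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Policy already favors the chosen response relative to the reference,
# so the margin is positive and the loss is below log(2).
print(dpo_loss(-5.0, -9.0, -6.0, -6.0))
```

Gradient descent on this loss (summed over a dataset of human-labeled preference pairs) is what aligns the fine-tuned CAT-G model's predictions with annotator expectations.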


Reward Model Generalization for Compute-Aware Test-Time Reasoning

Song, Zeen, Qiang, Wenwen, Zhao, Siyu, Zheng, Changwen, Hua, Gang

arXiv.org Artificial Intelligence

External test-time reasoning enhances large language models (LLMs) by decoupling generation and selection. At inference time, the model generates multiple reasoning paths, and an auxiliary process reward model (PRM) is used to score and select the best one. A central challenge in this setting is test-time compute optimality (TCO), i.e., how to maximize answer accuracy under a fixed inference budget. In this work, we establish a theoretical framework to analyze how the generalization error of the PRM affects compute efficiency and reasoning performance. Leveraging PAC-Bayes theory, we derive generalization bounds and show that a lower generalization error of PRM leads to fewer samples required to find correct answers. Motivated by this analysis, we propose Compute-Aware Tree Search (CATS), an actor-critic framework that dynamically controls search behavior. The actor outputs sampling hyperparameters based on reward distributions and sparsity statistics, while the critic estimates their utility to guide budget allocation. Experiments on the MATH and AIME benchmarks with various LLMs and PRMs demonstrate that CATS consistently outperforms other external TTS methods, validating our theoretical predictions.
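The decoupled generate-then-select loop the abstract describes reduces, in its simplest form, to best-of-N sampling scored by a PRM; CATS then chooses hyperparameters like N adaptively. A minimal sketch of the fixed-N baseline, with toy stand-ins for the generator and the PRM (both hypothetical, not the paper's models):

```python
def best_of_n(generate, score, n):
    """External test-time scaling baseline: sample n reasoning paths,
    score each with a process reward model (PRM), return the best.
    Larger n costs more compute but raises the chance of covering a
    correct path - the trade-off the paper's TCO analysis formalizes."""
    paths = [generate() for _ in range(n)]
    return max(paths, key=score)

# Toy stand-ins: candidate "reasoning paths" are just numbers, and the
# hypothetical PRM scores a path by its value.
candidates = iter([3, 8, 5, 12, 7])
best = best_of_n(candidates.__next__, score=lambda p: p, n=5)
print(best)  # → 12
```

An adaptive controller in the spirit of CATS would replace the fixed `n=5` with a value chosen per query from reward-distribution statistics, spending budget where the PRM is least certain.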