information interaction
Enhancing Chain-of-Thought Reasoning with Critical Representation Fine-tuning
Huang, Chenxi, Yan, Shaotian, Xie, Liang, Lin, Binbin, Fan, Sinan, Xin, Yue, Cai, Deng, Shen, Chen, Ye, Jieping
Representation Fine-tuning (ReFT), a recently proposed Parameter-Efficient Fine-Tuning (PEFT) method, has attracted widespread attention for significantly improving parameter efficiency by editing the representation space alone. In this work, we investigate applying ReFT to complex reasoning tasks. However, directly using the native ReFT method, which modifies fixed representations at the beginning and end of each layer, yields suboptimal performance, as these fixed-position representations have an uncertain impact on the outputs. We observe that, in complex reasoning tasks, there often exist certain critical representations. These representations either integrate significant information from preceding layers or regulate subsequent layer representations. Through layer-by-layer propagation, they exert a substantial influence on the final output. Naturally, fine-tuning these critical representations has the potential to greatly enhance reasoning performance. Building upon these insights, we propose Critical Representation Fine-Tuning (CRFT), a novel method that identifies and optimizes these critical representations through information flow analysis. CRFT operates within a supervised learning framework, dynamically optimizing critical representations in a low-rank linear subspace while freezing the base model. The effectiveness and efficiency of our method are validated across eight benchmarks for arithmetic and commonsense reasoning, using the LLaMA and Mistral model families. Furthermore, our method adapts effectively to few-shot settings, boosting one-shot accuracy by 16.4%. Our work highlights the untapped potential of representation-level optimization for CoT reasoning, offering a lightweight yet powerful alternative to traditional PEFT methods.
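The core mechanism the abstract describes, editing a hidden representation inside a low-rank linear subspace while the base model stays frozen, can be illustrated with a minimal sketch. The module below is a hypothetical illustration in the spirit of ReFT-style interventions, not the authors' code: the class name, shapes, and omission of the orthonormality constraint are assumptions, and CRFT's selection of critical positions via information flow analysis is not reproduced.

```python
# Hypothetical sketch (not the authors' implementation): a low-rank
# linear-subspace edit of a single hidden representation, with the base
# model frozen and only these small parameters trained.
import torch
import torch.nn as nn

class LowRankReprEdit(nn.Module):
    """Edit a hidden state only inside a learned rank-r linear subspace."""
    def __init__(self, hidden_size: int, rank: int = 8):
        super().__init__()
        # R spans the rank-r subspace (orthonormality constraint omitted here);
        # W and its bias produce the learned target values inside that subspace.
        self.R = nn.Linear(hidden_size, rank, bias=False)
        self.W = nn.Linear(hidden_size, rank)

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # h' = h + R^T (W h + b - R h): change h only along the subspace of R.
        delta = self.W(h) - self.R(h)          # (..., rank)
        return h + delta @ self.R.weight       # (..., rank) @ (rank, d) -> (..., d)

# Usage: apply to chosen hidden states (e.g. via forward hooks on selected
# positions) while the base model's own parameters stay frozen.
edit = LowRankReprEdit(hidden_size=4096, rank=8)
h = torch.randn(2, 4096)                        # stand-in for a hidden state
h_edited = edit(h)
```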
- North America > United States > California > San Diego County > San Diego (0.04)
- North America > Mexico > Mexico City > Mexico City (0.04)
- Asia > China > Zhejiang Province > Ningbo (0.04)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.68)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Commonsense Reasoning (0.67)
Panmodal Information Interaction
The emergence of generative artificial intelligence (GenAI) is transforming information interaction. For decades, search engines such as Google and Bing have been the primary means of locating relevant information for the general population. They have provided search results in the same standard format (the so-called "10 blue links"). The recent ability to chat via natural language with AI-based agents and have GenAI automatically synthesize answers in real-time (grounded in top-ranked results) is changing how people interact with and consume information at massive scale. These two information interaction modalities (traditional search and AI-powered chat) coexist in current search engines, either loosely coupled (e.g., as separate options/tabs) or tightly coupled (e.g., integrated as a chat answer embedded directly within a traditional search result page). We believe that the existence of these two different modalities, and potentially many others, is creating an opportunity to re-imagine the search experience, capitalize on the strengths of many modalities, and develop systems and strategies to support seamless flow between them. We refer to these as panmodal experiences. Unlike monomodal experiences, where only one modality is available and/or used for the task at hand, panmodal experiences make multiple modalities available to users (multimodal), directly support transitions between modalities (crossmodal), and seamlessly combine modalities to tailor task assistance (transmodal). While our focus is search and chat, drawing on insights from a survey of over 100 individuals who recently performed common tasks using these two modalities, we also present a more general vision for the future of information interaction using multiple modalities and the emergent capabilities of GenAI.
- North America > United States > Washington > King County > Seattle (0.14)
- North America > United States > Washington > King County > Redmond (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Information Technology (0.47)
- Health & Medicine (0.47)
A Decoupling and Aggregating Framework for Joint Extraction of Entities and Relations
Wang, Yao, Liu, Xin, Kong, Weikun, Yu, Hai-Tao, Racharak, Teeradaj, Kim, Kyoung-Sook, Nguyen, Minh Le
Named Entity Recognition and Relation Extraction are two crucial and challenging subtasks in the field of Information Extraction. Despite the successes achieved by the traditional approaches, fundamental research questions remain open. First, most recent studies use parameter sharing for a single subtask or shared features for both subtasks, ignoring their semantic differences. Second, information interaction mainly focuses on the two subtasks, leaving the fine-grained information interaction among the subtask-specific features of encoding subjects, relations, and objects unexplored. Motivated by the aforementioned limitations, we propose a novel model to jointly extract entities and relations. The main novelties are as follows: (1) We propose to decouple the feature encoding process into three parts, namely encoding subjects, encoding objects, and encoding relations. Thanks to this, we are able to use fine-grained subtask-specific features. The experimental results demonstrate that our model outperforms several previous state-of-the-art models. Extensive additional experiments further confirm the effectiveness of our model.
Introduction: Named Entity Recognition (NER) and Relation Extraction (RE), as two essential subtasks in information extraction, aim to extract entities and relations from semi-structured and unstructured texts. They are used in many downstream applications in different domains, such as knowledge graph construction [38, 39], Question-Answering [36, 37], and knowledge graph-based recommendation systems [40, 41]. Most traditional models and some methods used in specialized areas [9, 35, 43, 46] construct separate models for NER and RE to extract entities and relations in a pipelined manner. This type of method suffers from error propagation and unilateral information interaction.
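As a rough illustration of the decoupling idea the abstract describes (three subtask-specific encoding parts on top of a shared contextual representation), here is a minimal sketch. It is an assumed simplification, not the paper's architecture: the class name and layer choices (single linear projections with tanh) are hypothetical, and the aggregation and decoding stages are omitted.

```python
# Hypothetical sketch: decoupled subtask-specific feature encoders for
# subjects, objects, and relations, sharing one contextual encoder output.
import torch
import torch.nn as nn

class DecoupledEncoders(nn.Module):
    def __init__(self, hidden_size: int = 768):
        super().__init__()
        # One lightweight projection per subtask-specific feature space.
        self.subject_enc = nn.Linear(hidden_size, hidden_size)
        self.object_enc = nn.Linear(hidden_size, hidden_size)
        self.relation_enc = nn.Linear(hidden_size, hidden_size)

    def forward(self, shared: torch.Tensor):
        # shared: (batch, seq_len, hidden) from a shared contextual encoder
        subj = torch.tanh(self.subject_enc(shared))
        obj = torch.tanh(self.object_enc(shared))
        rel = torch.tanh(self.relation_enc(shared))
        # The fine-grained features would be aggregated by downstream
        # entity and relation decoders (not shown).
        return subj, obj, rel

enc = DecoupledEncoders()
subj, obj, rel = enc(torch.randn(2, 16, 768))
```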
- North America > United States (0.14)
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
- Asia > Middle East > Iraq > Karbala Governorate > Karbala (0.05)
- (3 more...)
- Energy (0.67)
- Health & Medicine (0.46)
Decoupled Iterative Refinement Framework for Interacting Hands Reconstruction from a Single RGB Image
Ren, Pengfei, Wen, Chao, Zheng, Xiaozheng, Xue, Zhou, Sun, Haifeng, Qi, Qi, Wang, Jingyu, Liao, Jianxin
Reconstructing interacting hands from a single RGB image is a very challenging task. On the one hand, severe mutual occlusion and similar local appearance between the two hands confuse the extraction of visual features, resulting in misalignment between the estimated hand meshes and the image. On the other hand, there are complex spatial relationships between interacting hands, which significantly enlarge the solution space of hand poses and increase the difficulty of network learning. In this paper, we propose a decoupled iterative refinement framework to achieve pixel-aligned hand reconstruction while efficiently modeling the spatial relationship between hands. Specifically, we define two feature spaces with different characteristics, namely the 2D visual feature space and the 3D joint feature space. First, we obtain joint-wise features from the visual feature map and utilize a graph convolution network and a transformer to perform intra- and inter-hand information interaction in the 3D joint feature space, respectively. Then, we project the joint features with global information back into the 2D visual feature space in an obfuscation-free manner and utilize 2D convolution for pixel-wise enhancement. By performing multiple alternate enhancements in the two feature spaces, our method achieves accurate and robust reconstruction of interacting hands. Our method outperforms all existing two-hand reconstruction methods by a large margin on the InterHand2.6M dataset.
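A highly simplified sketch of one alternation step between the two feature spaces the abstract describes follows. It is an assumed illustration, not the paper's implementation: a single attention module stands in for the GCN-plus-transformer joint interaction, the back-projection is reduced to a global context term, and all names and shapes are hypothetical.

```python
# Hypothetical sketch: gather joint-wise features from a 2D feature map,
# refine them in joint space, then enhance the 2D features with the result.
import torch
import torch.nn as nn

class AlternatingRefineStep(nn.Module):
    def __init__(self, channels: int = 64, num_joints: int = 42):
        super().__init__()
        self.joint_interact = nn.MultiheadAttention(channels, num_heads=4,
                                                    batch_first=True)
        self.pixel_enhance = nn.Conv2d(channels, channels, 3, padding=1)
        self.num_joints = num_joints

    def forward(self, feat_2d: torch.Tensor, joint_uv: torch.Tensor):
        # feat_2d: (B, C, H, W); joint_uv: (B, J, 2) normalized coords in [-1, 1]
        B, C, H, W = feat_2d.shape
        grid = joint_uv.view(B, self.num_joints, 1, 2)
        joint_feat = nn.functional.grid_sample(
            feat_2d, grid, align_corners=True).squeeze(-1).transpose(1, 2)  # (B, J, C)
        # Intra-/inter-hand information interaction in the joint feature space.
        joint_feat, _ = self.joint_interact(joint_feat, joint_feat, joint_feat)
        # Project refined joint evidence back (here: a global context term)
        # and perform pixel-wise enhancement in the 2D visual feature space.
        global_ctx = joint_feat.mean(dim=1).view(B, C, 1, 1)
        return self.pixel_enhance(feat_2d + global_ctx)

step = AlternatingRefineStep()
out = step(torch.randn(2, 64, 32, 32), torch.rand(2, 42, 2) * 2 - 1)
```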
Boosting Span-based Joint Entity and Relation Extraction via Sequence Tagging Mechanism
Ji, Bin, Li, Shasha, Yu, Jie, Ma, Jun, Liu, Huijun
Span-based joint extraction simultaneously conducts named entity recognition (NER) and relation extraction (RE) in text span form. Recent studies have shown that token labels can convey crucial task-specific information and enrich token semantics. However, as far as we know, because they completely abstain from the sequence tagging mechanism, all prior span-based works fail to use token label information. To solve this problem, we propose the Sequence Tagging enhanced Span-based Network (STSN), a span-based joint extraction network that is enhanced by token BIO label information derived from sequence-tagging-based NER. By stacking multiple attention layers in depth, we design a deep neural architecture to build STSN, where each attention layer consists of three basic attention units. The deep neural architecture first learns semantic representations for token labels and span-based joint extraction, and then constructs information interactions between them, which also realizes bidirectional information interactions between span-based NER and RE. Furthermore, we extend the BIO tagging scheme so that STSN can extract overlapping entities. Experiments on three benchmark datasets show that our model consistently outperforms previous optimal models by a large margin, creating new state-of-the-art results.
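The bidirectional information interaction between token-label features and span-based features that the abstract describes can be suggested with a small sketch. This is an assumed simplification, not STSN itself: the actual layer uses three attention units, while the example below only shows a two-way cross-attention exchange with hypothetical names and dimensions.

```python
# Hypothetical sketch: let token features and BIO-label features attend to
# each other so information flows in both directions.
import torch
import torch.nn as nn

class LabelTokenInteraction(nn.Module):
    def __init__(self, hidden: int = 256, heads: int = 4):
        super().__init__()
        self.tok_to_label = nn.MultiheadAttention(hidden, heads, batch_first=True)
        self.label_to_tok = nn.MultiheadAttention(hidden, heads, batch_first=True)

    def forward(self, token_feat: torch.Tensor, label_feat: torch.Tensor):
        # token_feat, label_feat: (batch, seq_len, hidden)
        tok_out, _ = self.tok_to_label(token_feat, label_feat, label_feat)
        lab_out, _ = self.label_to_tok(label_feat, token_feat, token_feat)
        # Residual updates keep both representations and enrich each with
        # information from the other.
        return token_feat + tok_out, label_feat + lab_out

layer = LabelTokenInteraction()
tok, lab = layer(torch.randn(2, 20, 256), torch.randn(2, 20, 256))
```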
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Europe > Italy > Tuscany > Florence (0.04)
- Europe > Denmark > Capital Region > Copenhagen (0.04)
- (10 more...)
Full Transformer Framework for Robust Point Cloud Registration with Deep Information Interaction
Chen, Guangyan, Wang, Meiling, Yue, Yufeng, Zhang, Qingxiang, Yuan, Li
Recent Transformer-based methods have achieved advanced performance in point cloud registration by exploiting the Transformer's advantages in order-invariance and dependency modeling to aggregate information. However, they still suffer from indistinct feature extraction and sensitivity to noise and outliers. The reasons are: (1) the adoption of CNNs fails to model global relations due to their local receptive fields, resulting in extracted features that are susceptible to noise; (2) the shallow-wide architecture of Transformers and the lack of positional encoding lead to indistinct feature extraction due to inefficient information interaction; (3) the omission of geometrical compatibility leads to inaccurate classification of inliers and outliers. To address the above limitations, a novel full Transformer network for point cloud registration is proposed, named the Deep Interaction Transformer (DIT), which incorporates: (1) a Point Cloud Structure Extractor (PSE) to model global relations and retrieve structural information with Transformer encoders; (2) a deep-narrow Point Feature Transformer (PFT) to facilitate deep information interaction across the two point clouds with positional encoding, such that Transformers can establish comprehensive associations and directly learn the relative positions between points; (3) a Geometric Matching-based Correspondence Confidence Evaluation (GMCCE) method to measure spatial consistency and estimate inlier confidence by designing a triangulated descriptor. Extensive experiments on clean, noisy, and partially overlapping point cloud registration demonstrate that our method outperforms state-of-the-art methods.
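The spatial-consistency idea behind correspondence confidence estimation can be shown with a generic sketch. This is not the paper's GMCCE or its triangulated descriptor; it is a standard pairwise distance-consistency check, with the function name and the sigma parameter chosen for illustration only.

```python
# Hypothetical sketch: score candidate correspondences by how well pairwise
# distances are preserved across the two point clouds (rigid motion preserves
# them, so inliers should be mutually compatible).
import torch

def compatibility_scores(src: torch.Tensor, dst: torch.Tensor,
                         sigma: float = 0.1) -> torch.Tensor:
    # src, dst: (N, 3) matched points; src[i] is hypothesized to match dst[i].
    d_src = torch.cdist(src, src)            # (N, N) pairwise distances
    d_dst = torch.cdist(dst, dst)
    compat = torch.exp(-((d_src - d_dst) ** 2) / (2 * sigma ** 2))
    # A correspondence is confident if it is compatible with many others.
    return compat.mean(dim=1)                # (N,) per-correspondence score

scores = compatibility_scores(torch.randn(100, 3), torch.randn(100, 3))
inliers = scores > scores.mean()             # simple thresholding example
```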
- North America > United States > Washington > King County > Seattle (0.04)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- Asia > Singapore (0.04)
- Asia > China > Beijing > Beijing (0.04)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
Human Information Interaction, Artificial Intelligence, and Errors
Russell, Stephen (Army Research Laboratory) | Moskowitz, Ira S. (The Naval Research Laboratory)
In a time of pervasive and increasingly transparent computing, humans will interact more and more with information objects and less and less with the computing devices that define them. Artificial Intelligence (AI) will be the proxy for humans’ interaction with information. Because interaction creates opportunities for error, the trend towards AI-augmented human information interaction (HII) will mandate an increased emphasis on cognition-oriented information science research and new ways of thinking about errors and error handling. A review of HII and its relationship to AI is presented, with a focus on errors in this context.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > New Jersey (0.04)
- (4 more...)