This supplementary material describes the implementation details of our experiments.
We use BLIP-2 models built on the FLAN-T5 language model family and keep the same padding side as the FLAN-T5 models. We use a batch size of 8 for all datasets and models, and the Q-Former is kept in full precision. To produce decompositions, we use multinomial beam-search sampling with 5 beams and a top-p of 0.95.
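As a concrete illustration of this decoding setup, the following is a minimal sketch assuming the Hugging Face transformers implementation of BLIP-2 with a FLAN-T5 backbone; the checkpoint name, image path, and prompt are illustrative placeholders, and the batching and mixed-precision details described above are not reproduced.

```python
# A minimal sketch of the decoding setup described above, assuming the Hugging Face
# `transformers` implementation of BLIP-2 with a FLAN-T5 backbone. The checkpoint
# name, image path, and prompt are illustrative placeholders.
import torch
from PIL import Image
from transformers import Blip2Processor, Blip2ForConditionalGeneration

device = "cuda" if torch.cuda.is_available() else "cpu"
processor = Blip2Processor.from_pretrained("Salesforce/blip2-flan-t5-xl")
model = Blip2ForConditionalGeneration.from_pretrained("Salesforce/blip2-flan-t5-xl").to(device)

image = Image.open("example.jpg")  # placeholder input image
prompt = "Question: what is happening in this image? Answer:"
inputs = processor(images=image, text=prompt, return_tensors="pt").to(device)

# Multinomial beam-search sampling: 5 beams combined with nucleus (top-p) sampling.
generated_ids = model.generate(
    **inputs,
    num_beams=5,
    do_sample=True,
    top_p=0.95,
    max_new_tokens=64,
)
print(processor.batch_decode(generated_ids, skip_special_tokens=True)[0].strip())
```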
Appendix Contents
Every moral scenario consists of a triple (context, action 1, action 2) and a set of auxiliary labels. The actions describe two possible actions in the first person. The moral scenarios can be categorized into: 1. MoralChoice-LowAmbiguity. The LLM-assisted construction (i.e., zero- and few-shot prompting setups) of the scenarios is grounded in the rules summarised below (Category, Rule, Refined Rule Description):

Category: Do not harm
- Do not kill: Do not kill (i.e., do not cause permanent loss of consciousness).
- Do not cause pain: Do not cause physical or emotional pain or unpleasant feelings (e.g., anger, sadness) to someone.
- Do not disable: Do not deprive someone of their physical, mental or volitional ability.
- Do not deprive of freedom: Do not deprive someone of their freedom (i.e., make a person unable to do something by altering the person's environment or situation).
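The triple structure above lends itself to a simple record type. Below is an illustrative sketch only (not the authors' schema); the field names and example values are assumptions made for the sake of the example.

```python
# Illustrative only (not the authors' schema): one way to represent a moral
# scenario triple with a few auxiliary labels. Field names and example values
# are assumptions.
from dataclasses import dataclass, field

@dataclass
class MoralScenario:
    context: str                       # description of the situation
    action1: str                       # first possible action, in the first person
    action2: str                       # second possible action, in the first person
    ambiguity: str = "low"             # e.g. "low" (MoralChoice-LowAmbiguity) or "high"
    rule_category: str = ""            # e.g. "Do not harm"
    rule: str = ""                     # e.g. "Do not kill"
    aux_labels: dict = field(default_factory=dict)  # any further auxiliary labels

# Hypothetical example instance.
scenario = MoralScenario(
    context="You are driving at night and see a pedestrian crossing the road.",
    action1="I slow down and let the pedestrian cross.",
    action2="I keep driving at the same speed.",
    ambiguity="low",
    rule_category="Do not harm",
    rule="Do not kill",
)
print(scenario.action1)
```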
- Oceania > New Zealand (0.04)
- Oceania > Australia (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- North America > Canada (0.04)
- (4 more...)
- Questionnaire & Opinion Survey (1.00)
- Research Report > New Finding (0.46)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.72)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.67)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > Finland (0.04)
- (2 more...)
Supplementary figure 1. Ablation study: each row represents the ablated layer and each column the module ablated from that layer; for example, the first panel shows ablation of attention-key in layer 5. Different layers of the GPT2-XL model were ablated, and the effect of ablation on curvature was measured for 2000 sentences from the UD corpus. The red bar marks the layer where the ablation was applied.
Supplementary figure 3. A. Curvature values for 2000 sampled sentences in the RWKV model (RNN), for both the trained and untrained versions. B. Correlation between model-generated surprisal and curvature in the RWKV model. Diamonds: syntactic surprisal.
Supplementary figure 5. Effect of different decoding strategies on GPT2-XL sequence generation and its comparison to the ground truth (true); same as figure 4b in the main manuscript.
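For reference, a single-module ablation of the kind shown in Supplementary figure 1 could be sketched as follows, assuming the Hugging Face "gpt2-xl" checkpoint; the choice of layer 5, the attention module, the hook-based zeroing, and the input sentence are illustrative assumptions, and the curvature computation itself is not reproduced.

```python
# A minimal sketch of a single-module ablation in the spirit of Supplementary
# figure 1, assuming the Hugging Face "gpt2-xl" checkpoint. The choice of layer 5,
# the attention module, and the input sentence are illustrative; the curvature
# computation itself is not reproduced here.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2-xl")
model = GPT2LMHeadModel.from_pretrained("gpt2-xl").eval()

def zero_output(module, inputs, output):
    # Replace the module's main output with zeros, keeping any extra return values.
    if isinstance(output, tuple):
        return (torch.zeros_like(output[0]),) + output[1:]
    return torch.zeros_like(output)

# Ablate the attention block of layer 5 for one forward pass.
handle = model.transformer.h[5].attn.register_forward_hook(zero_output)
inputs = tokenizer("The cat sat on the mat.", return_tensors="pt")
with torch.no_grad():
    hidden_states = model(**inputs, output_hidden_states=True).hidden_states
handle.remove()
# `hidden_states` (one tensor per layer) can then feed the curvature measurement.
```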
7 Appendix
The details of the datasets used in this work are summarised in Table 3. These datasets contain no personally identifiable information. The Gowalla dataset includes POIs spread across different cities, whereas FS-NYC/FS-TKY each cover a single city. The Education dataset shows only one morning peak. POIs with different semantic meanings exhibit different visiting patterns.
Route Sparse Autoencoder to Interpret Large Language Models
Shi, Wei, Li, Sihang, Liang, Tao, Wan, Mingyang, Ma, Gojun, Wang, Xiang, He, Xiangnan
Mechanistic interpretability of large language models (LLMs) aims to uncover the internal processes of information propagation and reasoning. Sparse autoencoders (SAEs) have demonstrated promise in this domain by extracting interpretable and monosemantic features. However, prior work primarily focuses on feature extraction from a single layer, failing to effectively capture activations that span multiple layers. In this paper, we introduce Route Sparse Autoencoder (RouteSAE), a new framework that integrates a routing mechanism with a shared SAE to efficiently extract features from multiple layers. It dynamically assigns weights to activations from different layers, incurring minimal parameter overhead while achieving high interpretability and flexibility for targeted feature manipulation. We evaluate RouteSAE through extensive experiments on Llama-3.2-1B-Instruct. Specifically, under the same sparsity constraint of 64, RouteSAE extracts 22.5% more features than baseline SAEs while achieving a 22.3% higher interpretability score. These results underscore the potential of RouteSAE as a scalable and effective method for LLM interpretability, with applications in feature discovery and model intervention. Our code is available at https://github.com/swei2001/RouteSAEs.
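As a rough illustration of the idea described in the abstract, the sketch below combines a lightweight router that weights activations from several layers with a single shared top-k sparse autoencoder. This is not the authors' implementation (see the linked repository); the dimensions, the routing rule, and the top-k value of 64 are assumptions drawn from the text.

```python
# A minimal, illustrative sketch of a routed, shared sparse autoencoder
# (not the RouteSAE implementation): a router weights activations from several
# layers, and one shared top-k SAE encodes the routed activation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RouteSAESketch(nn.Module):
    def __init__(self, d_model: int, n_layers: int, n_features: int, k: int = 64):
        super().__init__()
        self.router = nn.Linear(d_model, n_layers)      # one routing score per layer
        self.encoder = nn.Linear(d_model, n_features)   # shared SAE encoder
        self.decoder = nn.Linear(n_features, d_model)   # shared SAE decoder
        self.k = k

    def forward(self, layer_acts: torch.Tensor):
        # layer_acts: (batch, n_layers, d_model) activations collected from several layers.
        weights = F.softmax(self.router(layer_acts.mean(dim=1)), dim=-1)   # (batch, n_layers)
        routed = torch.einsum("bl,bld->bd", weights, layer_acts)           # weighted combination
        pre_acts = self.encoder(routed)
        topk = torch.topk(pre_acts, self.k, dim=-1)                        # keep the k largest features
        feats = torch.zeros_like(pre_acts).scatter_(-1, topk.indices, F.relu(topk.values))
        recon = self.decoder(feats)
        return recon, feats, weights

# Dummy usage with hidden size 2048 (Llama-3.2-1B) and 4 hooked layers.
sae = RouteSAESketch(d_model=2048, n_layers=4, n_features=16384, k=64)
acts = torch.randn(8, 4, 2048)
recon, feats, weights = sae(acts)
```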
- Asia > China (0.28)
- North America > United States > California (0.28)
Indeterminacy in Affective Computing: Considering Meaning and Context in Data Collection Practices
Dudzik, Bernd, Hrkalovic, Tiffany Matej, Hao, Chenxu, Raman, Chirag, Tsfasman, Masha
Automatic Affect Prediction (AAP) uses computational analysis of input data such as text, speech, images, and physiological signals to predict various affective phenomena (e.g., emotions or moods). These models are typically constructed using supervised machine-learning algorithms, which rely heavily on labeled training datasets. In this position paper, we posit that all AAP training data are derived from human Affective Interpretation Processes (AIPs), resulting in a form of Affective Meaning. Research on human affect indicates a form of complexity that is fundamental to such meaning: it can possess what we refer to here broadly as Qualities of Indeterminacy (QIs), encompassing Subjectivity (meaning depends on who is interpreting), Uncertainty (lack of confidence regarding meanings' correctness), Ambiguity (meaning contains mutually exclusive concepts) and Vagueness (meaning is situated at different levels in a nested hierarchy). Failing to appropriately consider QIs leads to models incapable of meaningful and reliable predictions. Based on this premise, we argue that a crucial step in adequately addressing indeterminacy in AAP is the development of data collection practices for modeling corpora that involve the systematic consideration of 1) a relevant set of QIs and 2) context for the associated interpretation processes. To this end, we 1) outline a conceptual model of AIPs and the QIs associated with the meaning they produce, together with a conceptual structure of relevant context that supports understanding of its role. Finally, we use our framework to 2) discuss examples of context-sensitivity-related challenges for addressing QIs in data collection setups. We believe our efforts can stimulate a structured discussion of the roles of both indeterminacy and context in research on AAP, informing the development of better practices for data collection and analysis.
- Europe > Netherlands > South Holland > Delft (0.04)
- Europe > Netherlands > North Holland > Amsterdam (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Research Report (0.50)
- Workflow (0.48)
Expect the Unexpected: FailSafe Long Context QA for Finance
Kamble, Kiran, Russak, Melisa, Mozolevskyi, Dmytro, Ali, Muayad, Russak, Mateusz, AlShikh, Waseem
We propose a new long-context financial benchmark, FailSafeQA, designed to test the robustness and context-awareness of LLMs against six variations in human-interface interactions in LLM-based query-answer systems within finance. We concentrate on two case studies: Query Failure and Context Failure. In the Query Failure scenario, we perturb the original query to vary in domain expertise, completeness, and linguistic accuracy. In the Context Failure case, we simulate the uploads of degraded, irrelevant, and empty documents. We employ the LLM-as-a-Judge methodology with Qwen2.5-72B-Instruct and use fine-grained rating criteria to define and calculate Robustness, Context Grounding, and Compliance scores for 24 off-the-shelf models. The results suggest that although some models excel at mitigating input perturbations, they must balance robust answering with the ability to refrain from hallucinating. Notably, Palmyra-Fin-128k-Instruct, recognized as the most compliant model, maintained strong baseline performance but encountered challenges in sustaining robust predictions in 17% of test cases. On the other hand, the most robust model, OpenAI o3-mini, fabricated information in 41% of tested cases. The results demonstrate that even high-performing models have significant room for improvement and highlight the role of FailSafeQA as a tool for developing LLMs optimized for dependability in financial applications. The dataset is available at: https://huggingface.co/datasets/Writer/FailSafeQA
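A hedged sketch of how an LLM-as-a-Judge scoring loop of this kind might look is given below; it is not the benchmark's actual harness, and the judge prompt, the 1-6 rating scale, and the criteria mapping are assumptions, with the judge call left as a pluggable function (in practice, a call to Qwen2.5-72B-Instruct).

```python
# A hedged sketch of an LLM-as-a-Judge scoring loop in the spirit of FailSafeQA,
# not the benchmark's actual harness. The judge prompt, the 1-6 rating scale, and
# the criteria mapping are assumptions; `judge` is a pluggable callable that, in
# practice, would query Qwen2.5-72B-Instruct.
from statistics import mean

CRITERIA = ["Robustness", "Context Grounding", "Compliance"]

def build_judge_prompt(question: str, context: str, answer: str, criterion: str) -> str:
    return (
        f"You are a strict judge for financial question answering. Criterion: {criterion}.\n"
        f"Question: {question}\nContext: {context}\nAnswer: {answer}\n"
        "Rate the answer from 1 (fails the criterion) to 6 (fully satisfies it). "
        "Reply with the number only."
    )

def score_answers(cases: list[dict], judge) -> dict:
    # Each case holds one (possibly perturbed) question, its context document, and the model answer.
    scores = {criterion: [] for criterion in CRITERIA}
    for case in cases:
        for criterion in CRITERIA:
            prompt = build_judge_prompt(case["question"], case["context"], case["answer"], criterion)
            scores[criterion].append(judge(prompt))
    return {criterion: mean(ratings) for criterion, ratings in scores.items()}

# Usage with a dummy judge; replace the lambda with a real call to the judge model.
cases = [{"question": "What was Q3 revenue?", "context": "",
          "answer": "The document does not contain this information."}]
print(score_answers(cases, judge=lambda prompt: 4))
```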
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > California > San Bernardino County > Barstow (0.04)
- Asia > Thailand > Bangkok > Bangkok (0.04)
- (5 more...)
- Information Technology (0.88)
- Law > Business Law (0.68)
- Banking & Finance > Trading (0.46)
- Government > Regional Government > North America Government > United States Government (0.46)