castle
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- North America > United States > Wisconsin > Dane County > Madison (0.04)
- (4 more...)
- Government (0.46)
- Health & Medicine (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
600-year-old Viking shipwreck is the largest of its kind
The medieval 'cog' was nearly 92 feet long and featured castles on its bow and stern. Archaeologists in Denmark say a sunken Viking ship near Copenhagen is the largest boat of its kind ever discovered, and that's saying something. At nearly 92 feet long, the 600-year-old vessel is also one of the best-preserved examples of a cog, a "super ship" whose advanced design and carrying capacity helped transform trade in medieval Europe. "The find is a milestone for maritime archaeology," excavation lead Otto Uldum said in a statement, adding that the boat now offers a "unique opportunity to understand both the construction and life on board the biggest trading ships of the Middle Ages."
- Europe > Denmark > Capital Region > Copenhagen (0.25)
- North America > United States > Michigan (0.05)
- North America > United States > Massachusetts (0.05)
- Europe > Netherlands (0.05)
CASTLE: Regularization via Auxiliary Causal Graph Discovery
Regularization improves the generalization of supervised models to out-of-sample data. Prior works have shown that prediction in the causal direction (predicting effect from cause) results in lower testing error than prediction in the anti-causal direction. However, existing regularization methods are agnostic of causality. We introduce Causal Structure Learning (CASTLE) regularization, which regularizes a neural network by jointly learning the causal relationships between variables. CASTLE learns the causal directed acyclic graph (DAG) as an adjacency matrix embedded in the neural network's input layers, thereby facilitating the discovery of optimal predictors. Furthermore, CASTLE efficiently reconstructs only the features in the causal DAG that have a causal neighbor, whereas reconstruction-based regularizers suboptimally reconstruct all input features. We provide a theoretical generalization bound for our approach and conduct experiments on a range of synthetic and publicly available real datasets, demonstrating that CASTLE consistently leads to better out-of-sample predictions than other popular benchmark regularizers.
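The reconstruction idea above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it assumes a binary adjacency matrix A and a linear predictor per feature, and it restricts the reconstruction loss to features that have at least one neighbor in the (hypothetically learned) DAG.

```python
import numpy as np

def castle_style_reconstruction_loss(X, A, W):
    """Reconstruction term restricted to features with a causal neighbor.

    X : (n, d) data matrix
    A : (d, d) binary adjacency, A[i, j] = 1 means feature i -> feature j
    W : (d, d) illustrative linear weights predicting each feature
        from its parents (masked by A)
    """
    # Predict feature j from its parents only: mask the weights by A.
    X_hat = X @ (W * A)                                   # (n, d)
    # A feature "has a neighbor" if it has any parent or any child.
    has_neighbor = (A.sum(axis=0) + A.sum(axis=1)) > 0    # (d,)
    residual = (X - X_hat) ** 2
    # Average squared error only over features with at least one neighbor;
    # isolated features are excluded, unlike plain auto-encoder losses.
    return residual[:, has_neighbor].mean()
```

An isolated feature thus contributes nothing to the penalty, which is the distinction the abstract draws against reconstruction-based regularizers that reconstruct all inputs.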
Dynamically Scaled Activation Steering
Ferrando, Alex, Suau, Xavier, Gonzàlez, Jordi, Rodriguez, Pau
Activation steering has emerged as a powerful method for guiding the behavior of generative models towards desired outcomes such as toxicity mitigation. However, most existing methods apply interventions uniformly across all inputs, degrading model performance when steering is unnecessary. We introduce Dynamically Scaled Activation Steering (DSAS), a method-agnostic steering framework that decouples when to steer from how to steer. DSAS adaptively modulates the strength of existing steering transformations across layers and inputs, intervening strongly only when undesired behavior is detected. At generation time, DSAS computes context-dependent scaling factors that selectively adjust the strength of any steering method. We also show how DSAS can be jointly optimized end-to-end together with the steering function. When combined with existing steering methods, DSAS consistently improves the Pareto front with respect to steering alone, achieving a better trade-off between toxicity mitigation and utility preservation. We further demonstrate DSAS's generality by applying it to a text-to-image diffusion model, showing how adaptive steering allows the modulation of specific concepts. Finally, DSAS introduces minimal computational overhead while improving interpretability, pinpointing which tokens require steering and by how much.
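The core idea of decoupling "when to steer" from "how to steer" can be sketched as follows. This is a hypothetical minimal form, not the authors' parameterization: a linear detector scores how strongly a hidden state exhibits the undesired behavior, and that score gates the strength of whatever steering vector an existing method supplies.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def dsas_steer(h, v, w_det, b_det, alpha_max=1.0):
    """Dynamically scaled steering (illustrative sketch).

    h      : (d,) hidden activation at some layer
    v      : (d,) steering direction from any existing method
    w_det  : (d,) detector weights (assumed learned elsewhere)
    b_det  : detector bias
    """
    # Context-dependent scaling factor in (0, 1): intervene strongly
    # only when the detector flags undesired behavior.
    score = sigmoid(h @ w_det + b_det)
    return h + alpha_max * score * v
```

When the detector score is near zero the hidden state passes through almost unchanged, which is how such a scheme avoids degrading the model on inputs that need no intervention.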
- Asia > India (0.04)
- North America > United States > Texas > Loving County (0.04)
- Asia > Southeast Asia (0.04)
- Media (0.67)
- Transportation > Ground (0.46)
- Health & Medicine > Consumer Health (0.46)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Communications > Social Media (0.93)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
Table A: Additional Experiments (real data)
We thank all the reviewers for their valuable suggestions and feedback. Table 2 contains results with L2 regularization of the proposed neural network (Section 3.3). We will elaborate on this point with concrete examples in the revised submission. The description of the regularization terms is given in lines 162-168.
Causal Attention with Lookahead Keys
Song, Zhuoqing, Sun, Peng, Yuan, Huizhuo, Gu, Quanquan
In standard causal attention, each token's query, key, and value (QKV) are static and encode only preceding context. We introduce CAuSal aTtention with Lookahead kEys (CASTLE), an attention mechanism that continually updates each token's keys as the context unfolds. We term these updated keys lookahead keys because they belong to earlier positions yet integrate information from tokens that appear later relative to those positions, while strictly preserving the autoregressive property. Although the mechanism appears sequential, we derive a mathematical equivalence that avoids explicitly materializing lookahead keys at each position and enables efficient parallel training. On language modeling benchmarks, CASTLE consistently outperforms standard causal attention across model scales, reducing validation perplexity and improving performance on a range of downstream tasks.
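The mechanism can be illustrated with a deliberately naive O(n^2) sketch. The parameterization here (a mean-pooled summary of later tokens added to each key) is an assumption for illustration, not the paper's actual formulation, and it forgoes the parallel-training equivalence the abstract describes; it only shows how keys at earlier positions can absorb later context while each step still uses only tokens up to the current position.

```python
import numpy as np

def lookahead_key_attention(X, Wq, Wk, Uk, Wv):
    """Naive causal attention with lookahead keys (illustrative only).

    At query position t, the key for an earlier position i is refreshed
    with a summary of tokens i+1..t, so the key "looks ahead" relative
    to position i while the whole step uses no context beyond t,
    preserving the autoregressive property.
    """
    n, d = X.shape
    out = np.zeros_like(X)
    for t in range(n):
        q = X[t] @ Wq
        keys = np.zeros((t + 1, d))
        for i in range(t + 1):
            ahead = X[i + 1 : t + 1]
            summary = ahead.mean(axis=0) if len(ahead) else np.zeros(d)
            keys[i] = X[i] @ Wk + summary @ Uk   # lookahead update
        scores = keys @ q / np.sqrt(d)
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()
        out[t] = weights @ (X[: t + 1] @ Wv)
    return out
```

Materializing a fresh key set per query position is what the paper's mathematical equivalence avoids; this sketch only makes the dependency structure explicit.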
Artificial Generals Intelligence: Mastering Generals.io with Reinforcement Learning
We introduce a real-time strategy game environment based on Generals.io, a game with thousands of weekly active players. Our environment is fully compatible with Gymnasium and PettingZoo and is capable of running thousands of frames per second on commodity hardware. We also present a reference agent, trained with supervised pre-training and self-play, which reached the top 0.003% of the 1v1 human leaderboard after only 36 hours on a single H100 GPU. To accelerate learning, we incorporate potential-based reward shaping and memory features. Our contributions of a modular RTS benchmark and a competitive baseline agent provide an accessible yet challenging platform for advancing multi-agent reinforcement learning research. The documented code, together with examples and tutorials, is available at https://github.com/strakam/generals-bots.
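Potential-based reward shaping, mentioned above as an accelerator for learning, has a standard one-line form: add gamma * Phi(s') - Phi(s) to the environment reward, which provably preserves the optimal policy. The sketch below shows that form; the choice of potential function for Generals.io (e.g. territory owned) is a guess on our part, not necessarily what the reference agent uses.

```python
def shaped_reward(reward, phi_s, phi_s_next, gamma=0.99):
    """Potential-based reward shaping.

    Adding gamma * Phi(s') - Phi(s) to the raw reward leaves the
    optimal policy unchanged while densifying the learning signal.
    phi_s / phi_s_next are the potentials of the current and next state.
    """
    return reward + gamma * phi_s_next - phi_s
```

Over a full episode the shaping terms telescope, so the agent's return differs from the unshaped return only by potential values at the episode's endpoints.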
DreamLLM-3D: Affective Dream Reliving using Large Language Model and 3D Generative AI
Liu, Pinyao, Lee, Keon Ju, Steinmaurer, Alexander, Picard-Deland, Claudia, Carr, Michelle, Kitson, Alexandra
We present DreamLLM-3D, a composite multimodal AI system behind an immersive art installation for dream re-experiencing. It enables automated dream content analysis for immersive dream-reliving, by integrating a Large Language Model (LLM) with text-to-3D Generative AI. The LLM processes voiced dream reports to identify key dream entities (characters and objects), social interaction, and dream sentiment. The extracted entities are visualized as dynamic 3D point clouds, with emotional data influencing the color and soundscapes of the virtual dream environment. Additionally, we propose an experiential AI-Dreamworker Hybrid paradigm. Our system and paradigm could potentially facilitate a more emotionally engaging dream-reliving experience, enhancing personal insights and creativity.
- South America > Brazil > Rio de Janeiro > Rio de Janeiro (0.04)
- North America > United States > District of Columbia > Washington (0.04)
- North America > United States > Connecticut > Fairfield County > Norwalk (0.04)
- (4 more...)
Review for NeurIPS paper: CASTLE: Regularization via Auxiliary Causal Graph Discovery
Summary and Contributions: The aim of this paper is to improve the performance of supervised learning on out-of-sample data. In the case of deep networks, regularization helps mitigate overfitting but does not exploit the structure of the feature variables and their relation to the outcome when the data-generating process (DGP) can be represented by a causal DAG. The authors propose CASTLE, which jointly learns the causal graph while performing regularization. In particular, the adjacency matrix of the learned DAG is used in the input layers of the neural network, which translates to the penalty function being decomposed into the reconstruction loss found in SAEs, a (new) acyclicity loss, and a capacity-based regularizer of the adjacency matrices. Unlike other approaches, CASTLE improves upon capacity-based and auto-encoder-based regularization by exploiting the DAG structure to identify causal predictors (parents of Y, if they exist) and to select targets for reconstruction regularization (features that have neighbours in the underlying DAG).
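The acyclicity loss referenced in the review is commonly expressed in the NOTEARS-style continuous form h(A) = tr(exp(A ∘ A)) - d, which is zero exactly when the weighted adjacency matrix A encodes a DAG. The sketch below assumes that form (the truncated power series for the matrix exponential is our simplification):

```python
import numpy as np

def acyclicity_penalty(A, terms=20):
    """Continuous acyclicity penalty h(A) = tr(exp(A o A)) - d.

    Zero iff the weighted adjacency A encodes a DAG; positive when
    A contains a directed cycle. The matrix exponential is
    approximated here by a truncated power series.
    """
    d = A.shape[0]
    M = A * A              # Hadamard square: non-negative edge weights
    E = np.eye(d)          # running sum of the exponential series
    P = np.eye(d)          # current power term M^k / k!
    for k in range(1, terms):
        P = P @ M / k
        E = E + P
    return np.trace(E) - d
```

A 2-node chain (0 -> 1) gives a penalty of zero, while a 2-cycle (0 -> 1 -> 0) gives a strictly positive penalty, which is what lets the constraint be enforced with gradient-based training.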