
Collaborating Authors

 marten




Tensor Normal Training

Neural Information Processing Systems

Based on the so-called tensor normal (TN) distribution [31], we propose and analyze a brand new approximate natural gradient method, Tensor Normal Training (TNT), which, like Shampoo, only requires knowledge of the shape of the training parameters.


Amortized Proximal Optimization

Neural Information Processing Systems

We empirically test APO for online adaptation of learning rates and structured preconditioning matrices for regression, image reconstruction, image classification, and natural language translation tasks.


Adorable ferret-sized martens are rebounding in California

Popular Science

Highly valued for their fur, martens were almost hunted to extinction in the late 20th century. It's understandable if you've never heard of the coastal marten. These secretive but adorable woodland carnivores nearly went extinct. Fortunately, these ferret-sized mammals are making a slow recovery in the forests of the Pacific Northwest.



Training Autoencoders Using Stochastic Hessian-Free Optimization with LSMR

Emirahmetoglu, Ibrahim, Stewart, David E.

arXiv.org Artificial Intelligence

Hessian-free (HF) optimization has been shown to effectively train deep autoencoders (Martens, 2010). In this paper, we aim to accelerate HF training of autoencoders by reducing the amount of data used in training. HF utilizes the conjugate gradient algorithm to estimate update directions. Instead, we propose using the LSMR method, which is known for effectively solving large sparse linear systems. We also incorporate Chapelle & Erhan (2011)'s improved preconditioner for HF optimization. In addition, we introduce a new mini-batch selection algorithm to mitigate overfitting. Our algorithm starts with a small subset of the training data and gradually increases the mini-batch size based on (i) variance estimates obtained during the computation of a mini-batch gradient (Byrd et al., 2012) and (ii) the relative decrease in objective value for the validation data. Our experimental results demonstrate that our stochastic Hessian-free optimization, using the LSMR method and the new sample selection algorithm, leads to rapid training of deep autoencoders with improved generalization error.
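At the heart of HF optimization is an iterative inner solve for the update direction p in (H + λI)p = -g, where H is a curvature matrix and g the gradient. The following is a minimal conjugate-gradient sketch of that inner solve; the toy 2x2 quadratic, the damping parameter `lam`, and the function names are illustrative assumptions, not the paper's implementation (which swaps CG for LSMR).

```python
# Minimal conjugate-gradient (CG) inner solve as used in Hessian-free (HF)
# optimization: find the update direction p solving (H + lam*I) p = -g.
# The tiny dense matrix below stands in for a curvature-vector product,
# which in practice is computed without materializing H.

def matvec(H, v):
    """Dense matrix-vector product (stand-in for a curvature-vector product)."""
    return [sum(H[i][j] * v[j] for j in range(len(v))) for i in range(len(H))]

def cg(H, b, lam=0.0, iters=50, tol=1e-12):
    """Solve (H + lam*I) x = b by conjugate gradient, H symmetric positive definite."""
    n = len(b)
    x = [0.0] * n
    r = b[:]                      # residual b - A x (zero initial guess)
    p = r[:]
    rs = sum(ri * ri for ri in r)
    for _ in range(iters):
        Ap = [a + lam * pi for a, pi in zip(matvec(H, p), p)]
        alpha = rs / sum(pi * api for pi, api in zip(p, Ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = sum(ri * ri for ri in r)
        if rs_new < tol:
            break
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x

# Toy curvature matrix and gradient: the HF update direction solves H p = -g.
H = [[4.0, 1.0], [1.0, 3.0]]
g = [1.0, 2.0]
p = cg(H, [-gi for gi in g])
```

The paper's proposal replaces this CG loop with LSMR, which solves the same least-squares-style system but tolerates ill-conditioned or indefinite curvature more gracefully.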


Beware of "Explanations" of AI

Martens, David, Shmueli, Galit, Evgeniou, Theodoros, Bauer, Kevin, Janiesch, Christian, Feuerriegel, Stefan, Gabel, Sebastian, Goethals, Sofie, Greene, Travis, Klein, Nadja, Kraus, Mathias, Kühl, Niklas, Perlich, Claudia, Verbeke, Wouter, Zharova, Alona, Zschech, Patrick, Provost, Foster

arXiv.org Artificial Intelligence

Understanding the decisions made and actions taken by increasingly complex AI systems remains a key challenge. This has led to an expanding field of research in explainable artificial intelligence (XAI), highlighting the potential of explanations to enhance trust, support adoption, and meet regulatory standards. However, the question of what constitutes a "good" explanation depends on the goals, stakeholders, and context. At a high level, psychological insights such as the concept of mental model alignment can offer guidance, but success in practice is challenging due to social and technical factors. As a result of this ill-defined nature of the problem, explanations can be of poor quality (e.g. unfaithful, irrelevant, or incoherent), potentially leading to substantial risks. Instead of fostering trust and safety, poorly designed explanations can actually cause harm, including wrong decisions, privacy violations, manipulation, and even reduced AI adoption. Therefore, we caution stakeholders to beware of explanations of AI: while they can be vital, they are not automatically a remedy for transparency or responsible AI adoption, and their misuse or limitations can exacerbate harm. Attention to these caveats can help guide future research to improve the quality and impact of AI explanations.


Martens

AAAI Conferences

Linear logic programming languages have been identified in prior work as viable for specifying stories and analyzing their causal structure. We investigate the use of such a language for specifying story worlds, or settings where generalized narrative actions have uniform effects (not specific to a particular set of characters or setting elements), which may create emergent behavior through feedback loops. We show a sizable example of a story world specified in the language Celf and discuss its interpretation as a story-generating program, a simulation, and an interactive narrative. Further, we show that the causal analysis tools available by virtue of using a proof-theoretic language for specification can assist the author in reasoning about the structure and consequences of emergent stories.


Martens

AAAI Conferences

We present a rule specification language called Ceptre, intended to enable rapid prototyping for experimental game mechanics, especially in domains that depend on procedural generation and multi-agent simulation. Ceptre can be viewed as an explication of a new methodology for understanding games based on linear logic, a formal logic concerned with resource usage. We present a correspondence between gameplay and proof search in linear logic, building on prior work on generating narratives. In Ceptre, we introduce the ability to add interactivity selectively into a generative model, enabling inspection of intermediate states for debugging and exploration as well as a means of play. We claim that this methodology can support game designers and researchers in designing, analyzing, and debugging the core systems of their work in generative, multi-agent gameplay. To support this claim, we provide two case studies implemented in Ceptre, one from interactive narrative and one from a strategy-like domain.
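The correspondence between gameplay and linear logic can be pictured operationally as multiset rewriting: a rule fires by consuming its left-hand-side resources exactly once and producing its right-hand side. The sketch below illustrates that "use once" discipline; the rule names and atoms (`coin`, `bread`, etc.) are invented for illustration and this is not Ceptre's actual syntax or engine.

```python
from collections import Counter

# Toy multiset-rewriting engine in the spirit of linear logic: applying a
# rule removes its consumed resources from the state and adds its products.
# All rule and resource names here are illustrative inventions.

def applicable(state, lhs):
    """A rule fires only if the state holds every resource it would consume."""
    return all(state[atom] >= n for atom, n in lhs.items())

def apply_rule(state, lhs, rhs):
    """Consume lhs resources (linearly, no copying) and produce rhs resources."""
    if not applicable(state, lhs):
        raise ValueError("rule not applicable: missing resources")
    new = state.copy()
    new.subtract(lhs)          # linear discipline: resources are used up
    new.update(rhs)
    return +new                # unary + drops zero/negative counts

# A two-rule toy story world: pick up a coin, then trade it for bread.
state = Counter({"at_market": 1, "coin": 2})
take = ({"coin": 1}, {"held_coin": 1})
buy = ({"held_coin": 1, "at_market": 1}, {"bread": 1, "at_market": 1})

state = apply_rule(state, *take)
state = apply_rule(state, *buy)
```

Nondeterministic choice among the applicable rules at each step is what yields emergent, generative behavior in such systems; making some of those choices interactively is the selective-interactivity idea described in the abstract.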