A review of NMF, PLSA, LBA, EMA, and LCA with a focus on the identifiability issue

Qi, Qianqian, van der Heijden, Peter G. M.

arXiv.org Machine Learning

Across fields such as machine learning, social science, and geography, considerable attention has been given to models that factorize a nonnegative matrix into the product of two or three matrices, subject to nonnegativity or row-sum-to-1 constraints. Although these models are to a large extent similar or even equivalent, they are presented under different names, and their similarity is not well known. This paper highlights similarities among five popular models: latent budget analysis (LBA), latent class analysis (LCA), end-member analysis (EMA), probabilistic latent semantic analysis (PLSA), and nonnegative matrix factorization (NMF). We focus on an essential issue for these models, identifiability, and prove that the solutions of LBA, EMA, LCA, and PLSA are unique if and only if the solution of NMF is unique. We also provide a brief review of algorithms for these models. We illustrate the models with a time-budget dataset from social science and end the paper with a discussion of closely related models such as archetypal analysis.
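To make the shared structure concrete, here is a minimal sketch (assuming scikit-learn; the data, component count, and rescaling matrix are illustrative only, not from the paper) of the common nonnegative factorization and of the non-uniqueness that drives the identifiability question:

```python
# Minimal sketch: a nonnegative matrix P is approximated by W @ H with
# W, H >= 0, the factorization shared by NMF, PLSA, LBA, EMA, and LCA.
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(0)
P = rng.random((20, 10))  # e.g., a small time-budget table (rows: respondents)

model = NMF(n_components=3, init="nndsvda", random_state=0, max_iter=1000)
W = model.fit_transform(P)   # 20 x 3, nonnegative
H = model.components_        # 3 x 10, nonnegative
print("reconstruction error:", np.linalg.norm(P - W @ H))

# The identifiability issue: any invertible S with W S >= 0 and
# S^-1 H >= 0 yields an equally good factorization (W S)(S^-1 H).
# A diagonal rescaling is enough to illustrate the ambiguity:
S = np.diag([2.0, 0.5, 1.0])
W2, H2 = W @ S, np.linalg.inv(S) @ H
assert np.allclose(W @ H, W2 @ H2)  # same product, different factors
```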


LCA: Loss Change Allocation for Neural Network Training

Janice Lan, Rosanne Liu, Hattie Zhou, Jason Yosinski

Neural Information Processing Systems

Neural networks enjoy widespread use, but many aspects of their training, representation, and operation are poorly understood. In particular, our view into the training process is limited, with a single scalar loss being the most common viewport into this high-dimensional, dynamic process. We propose a new window into training called Loss Change Allocation (LCA), in which credit for changes to the network loss is conservatively partitioned to the parameters. This measurement is accomplished by decomposing the components of an approximate path integral along the training trajectory using a Runge-Kutta integrator. This rich view shows which parameters are responsible for decreasing or increasing the loss during training, or which parameters help or hurt the network's learning, respectively. LCA may be summed over training iterations and/or over neurons, channels, or layers for increasingly coarse views. This new measurement device produces several insights into training.
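To illustrate the idea behind the allocation, here is a simplified first-order sketch; the paper itself integrates along the training path with a Runge-Kutta integrator, and the toy loss and function names below are illustrative, not from the paper:

```python
# First-order sketch of Loss Change Allocation: credit the per-step loss
# change to individual parameters via grad(L) . delta_theta, so the
# per-parameter terms sum (approximately) to the total loss change.
import torch

def lca_step(loss_fn, params_before, params_after):
    """Allocate the loss change of one update across parameters."""
    theta0 = params_before.clone().requires_grad_(True)
    loss = loss_fn(theta0)
    (grad,) = torch.autograd.grad(loss, theta0)
    delta = params_after - params_before
    return grad * delta  # one LCA value per parameter; negative = "helped"

# Toy quadratic loss and a single gradient-descent step:
loss_fn = lambda th: (th ** 2).sum()
theta = torch.tensor([1.0, -2.0, 0.5])
theta_next = theta - 0.1 * 2 * theta  # exact GD step for this loss
alloc = lca_step(loss_fn, theta, theta_next)
print(alloc, alloc.sum())  # the sum approximates the total loss change
```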


An Expert-grounded benchmark of General Purpose LLMs in LCA

Donaldson, Artur, Balaji, Bharathan, Oriekezie, Cajetan, Kumar, Manish, Patouillard, Laure

arXiv.org Artificial Intelligence

Purpose: Artificial intelligence (AI), and in particular large language models (LLMs), are increasingly being explored as tools to support life cycle assessment (LCA). While demonstrations exist across environmental and social domains, systematic evidence on their reliability, robustness, and usability remains limited. This study provides the first expert-grounded benchmark of LLMs in LCA, addressing the absence of standardized evaluation frameworks in a field where no clear ground truth or consensus protocols exist. Methods: We evaluated eleven general-purpose LLMs, spanning both commercial and open-source families, across 22 LCA-related tasks. Seventeen experienced practitioners reviewed model outputs against criteria directly relevant to LCA practice, including scientific accuracy, explanation quality, robustness, verifiability, and adherence to instructions. We collected 168 expert reviews. Results: Experts judged 37% of responses to contain inaccurate or misleading information. Accuracy and explanation quality were generally rated average or good for many models, even smaller ones, and format adherence was generally rated favourably. Hallucination rates varied significantly, with some models producing hallucinated citations at rates of up to 40%. There was no clear-cut distinction between open-weight and closed-weight LLMs: open-weight models matched or outperformed closed-weight models on criteria such as accuracy and explanation quality. Conclusion: These findings highlight the risks of applying LLMs naïvely in LCA, such as treating them as free-form oracles, while also showing benefits, especially in explanation quality and in reducing the labour intensity of simple tasks. The use of general-purpose LLMs without grounding mechanisms presents ...


Assessing the Ecological Impact of AI

Wenmackers, Sylvia

arXiv.org Artificial Intelligence

Philosophers of technology have recently started paying more attention to the environmental impacts of AI, in particular of large language models (LLMs) and generative AI (genAI) applications. Meanwhile, few developers of AI give concrete estimates of the ecological impact of their models and products, and even when they do so, their analysis is often limited to greenhouse gas emissions of certain stages of AI development or use. The current proposal encourages practically viable analyses of the sustainability aspects of genAI, informed by philosophical ideas.


Lattice Climber Attack: Adversarial attacks for randomized mixtures of classifiers

Gnecco-Heredia, Lucas, Negrevergne, Benjamin, Chevaleyre, Yann

arXiv.org Artificial Intelligence

However, existing attacks have been shown not to suit this kind of classifier. In this paper, we discuss the problem of attacking a mixture in a principled way and introduce two desirable properties of attacks, based on a geometrical analysis of the problem: effectiveness and maximality. We then show that existing attacks do not meet both of these properties. Finally, we introduce a new attack called the lattice climber attack, with theoretical guarantees in the binary linear setting, and demonstrate its performance through experiments on synthetic and real datasets. Keywords: adversarial robustness, adversarial attacks, randomized classifiers, mixtures.
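The lattice climber attack itself is not specified in this abstract; the sketch below only illustrates the underlying setup, attacking a randomized mixture of binary linear classifiers by projected gradient ascent on the *expected* loss over the mixture, which single-model attacks do not account for (all data, weights, and step sizes are made up):

```python
# Illustrative setup only, not the paper's attack: ascend the expected
# loss of a mixture of binary linear classifiers within an L-infinity ball.
import numpy as np

def expected_margin_grad(x, y, classifiers, weights):
    """Gradient (w.r.t. x) of the expected negative margin -y * <w, x>."""
    return -y * sum(p * w for w, p in zip(classifiers, weights))

rng = np.random.default_rng(1)
classifiers = [rng.normal(size=2) for _ in range(3)]  # linear weight vectors
weights = [0.5, 0.3, 0.2]                             # mixture probabilities
x, y = np.array([1.0, 1.0]), 1

eps, steps, alpha = 0.5, 20, 0.05
x_adv = x.copy()
for _ in range(steps):  # projected gradient ascent on the expected loss
    g = expected_margin_grad(x_adv, y, classifiers, weights)
    x_adv = x_adv + alpha * np.sign(g)         # L-infinity ascent step
    x_adv = np.clip(x_adv, x - eps, x + eps)   # project into the eps-ball

margins = [y * w @ x_adv for w in classifiers]
print("per-model margins after attack:", margins)
```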


Predicting Multitasking in Manual and Automated Driving with Optimal Supervisory Control

Jokinen, Jussi, Ebel, Patrick, Kujala, Tuomo

arXiv.org Artificial Intelligence

Modern driving involves interactive technologies that can divert attention, increasing the risk of accidents. This paper presents a computational cognitive model that simulates human multitasking while driving. Based on optimal supervisory control theory, the model predicts how multitasking adapts to variations in driving demands, interactive tasks, and automation levels. Unlike previous models, it accounts for context-dependent multitasking across different degrees of driving automation. The model predicts longer in-car glances on straight roads and shorter glances during curves. It also anticipates increased glance durations with driver aids such as lane-centering assistance and their interaction with environmental demands. Validated against two empirical datasets, the model offers insights into driver multitasking amid evolving in-car technologies and automation.
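The paper's model is not reproduced here; the sketch below is a generic uncertainty-threshold glance scheduler, a common simplification in driver-attention modeling, that merely reproduces the qualitative prediction of shorter in-car glances under higher driving demand (all constants and names are invented for the example):

```python
# Generic sketch (not the paper's model): a supervisory controller
# alternates glances between the road and an in-car task. Uncertainty
# about lateral position grows while looking away and shrinks while
# looking at the road; in-car glances end when uncertainty crosses a
# threshold, so higher demand (e.g., curves) yields shorter glances.
import numpy as np

def simulate_glances(demand, threshold=1.0, steps=2000, dt=0.05, rng=None):
    rng = rng or np.random.default_rng(0)
    uncertainty, looking_away, durations, t_away = 0.0, True, [], 0.0
    for _ in range(steps):
        if looking_away:
            uncertainty += demand * dt + 0.01 * rng.standard_normal()
            t_away += dt
            if uncertainty >= threshold:  # supervisory switch back to road
                durations.append(t_away)
                looking_away, t_away = False, 0.0
        else:
            uncertainty = max(0.0, uncertainty - 2.0 * dt)
            if uncertainty == 0.0:        # road state re-established
                looking_away = True
    return np.mean(durations)

print("mean in-car glance, straight road:", simulate_glances(demand=0.5))
print("mean in-car glance, curve:       ", simulate_glances(demand=1.5))
```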