AITopics

Recent progress in large language models (LLMs) has enabled the automated processing of lengthy documents even without supervised training on a task-specific dataset. Yet their zero-shot performance on complex tasks, as opposed to straightforward information extraction tasks, remains suboptimal. One feasible approach for tasks with lengthy, complex input is to first summarize the document and then apply supervised fine-tuning to the summary. However, the summarization process inevitably results in some loss of information. In this study we present a method for processing the summaries of long documents, aimed at capturing different important aspects of the original document. We hypothesize that LLM summaries generated with different aspect-oriented prompts contain different information signals, and we propose methods to measure these differences. We introduce approaches to effectively integrate signals from these different summaries for supervised training of transformer models. We validate our hypotheses on a high-impact task (30-day readmission prediction from a psychiatric discharge) using real-world data from four hospitals, and show that our proposed method increases the prediction performance for the complex task of predicting patient outcome.

large language model, natural language, prediction, (18 more...)

2502.10388

Country:

North America > United States (0.93)
Asia > Middle East > Jordan (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
(3 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
Health & Medicine > Health Care Providers & Services (1.00)
Government > Regional Government > North America Government > United States Government (0.46)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Navaratnarajah, Melane, Martin, Sophie A., Kelly, David A., Blake, Nathan, Chockler, Hana

3D ReX: Causal Explanations in 3D Neuroimaging Classification

Explainability remains a significant problem for AI models in medical imaging, making it challenging for clinicians to trust AI-driven predictions. We introduce 3D ReX, the first causality-based post-hoc explainability tool for 3D models. 3D ReX uses the theory of actual causality to generate responsibility maps which highlight the regions most crucial to the model's decision. We test 3D ReX on a stroke detection model, providing insight into the spatial distribution of features relevant to stroke.

artificial intelligence, explanation, machine learning, (18 more...)

2502.12181

Country:

Europe > United Kingdom > England > Greater London > London (0.05)
South America > Peru > Lima Department > Lima Province > Lima (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (0.65)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Health Care Technology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.70)

Farahmand, Ebrahim, Azghan, Reza Rahimi, Chatrudi, Nooshin Taheri, Kim, Eric, Gudur, Gautham Krishna, Thomaz, Edison, Pedrielli, Giulia, Turaga, Pavan, Ghasemzadeh, Hassan

AttenGluco: Multimodal Transformer-Based Blood Glucose Forecasting on AI-READI Dataset

Diabetes is a chronic metabolic disorder characterized by persistently high blood glucose levels (BGLs), leading to severe complications such as cardiovascular disease, neuropathy, and retinopathy. Predicting BGLs enables patients to maintain glucose levels within a safe range and allows caregivers to take proactive measures through lifestyle modifications. Continuous Glucose Monitoring (CGM) systems provide real-time tracking, offering a valuable tool for monitoring BGLs. However, accurately forecasting BGLs remains challenging because of fluctuations caused by physical activity, diet, and other factors. Recent deep learning models show promise in improving BGL prediction. Nonetheless, forecasting BGLs accurately from multimodal, irregularly sampled data over long prediction horizons remains a challenging research problem. In this paper, we propose AttenGluco, a multimodal Transformer-based framework for long-term blood glucose prediction. AttenGluco employs cross-attention to effectively integrate CGM and activity data, addressing challenges in fusing data with different sampling rates. Moreover, it employs multi-scale attention to capture long-term dependencies in temporal data, enhancing forecasting accuracy. To evaluate the performance of AttenGluco, we conduct forecasting experiments on the recently released AI-READI dataset, analyzing its predictive accuracy across different subject cohorts including healthy individuals, people with prediabetes, and those with type 2 diabetes. Furthermore, we investigate its performance improvements and forgetting behavior as new cohorts are introduced. Our evaluations show that AttenGluco improves all error metrics, such as root mean square error (RMSE), mean absolute error (MAE), and correlation, compared to the multimodal LSTM model. AttenGluco outperforms this baseline model by about 10% and 15% in terms of RMSE and MAE, respectively.
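The RMSE and MAE figures quoted in the abstract follow standard definitions. A minimal sketch of how such metrics and a relative improvement over a baseline could be computed is given here; the function names and the glucose readings are hypothetical and are not taken from the paper.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def relative_improvement(baseline_error, model_error):
    """Fractional error reduction of a model relative to a baseline."""
    return (baseline_error - model_error) / baseline_error


# Hypothetical glucose readings in mg/dL (illustrative values only).
observed = [100.0, 110.0, 120.0, 130.0]
forecast = [102.0, 108.0, 123.0, 126.0]

print("RMSE:", rmse(observed, forecast))
print("MAE:", mae(observed, forecast))
```

Under these definitions, a model whose RMSE is 9.0 against a baseline RMSE of 10.0 would show a relative improvement of 0.1, which is the sense in which "about 10%" is read here.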

artificial intelligence, forecasting, machine learning, (18 more...)

2502.09919

Country:

North America > United States > Texas > Travis County > Austin (0.14)
North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.05)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > Arizona > Maricopa County > Phoenix (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Azarkhalili, Behrooz, Libbrecht, Maxwell

Generalized Attention Flow: Feature Attribution for Transformer Models via Maximum Flow

This paper introduces Generalized Attention Flow (GAF), a novel feature attribution method for Transformer-based models to address the limitations of current approaches. By extending Attention Flow and replacing attention weights with the generalized Information Tensor, GAF integrates attention weights, their gradients, the maximum flow problem, and the barrier method to enhance the performance of feature attributions. The proposed method exhibits key theoretical properties and mitigates the shortcomings of prior techniques that rely solely on simple aggregation of attention weights. Our comprehensive benchmarking on sequence classification tasks demonstrates that a specific variant of GAF consistently outperforms state-of-the-art feature attribution methods in most evaluation settings, providing a more reliable interpretation of Transformer model outputs.
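The maximum flow problem mentioned in the abstract can be illustrated on a toy graph whose edge capacities stand in for attention-like weights. The sketch below uses the classical Edmonds-Karp algorithm, not the barrier method the abstract names, and it is not the authors' implementation; the node names and capacities are hypothetical.

```python
from collections import deque


def max_flow(capacity, source, sink):
    """Edmonds-Karp maximum flow on a dict-of-dicts capacity graph."""
    # Build the residual graph, adding zero-capacity reverse edges.
    residual = {source: {}, sink: {}}
    for u, neighbours in capacity.items():
        for v, c in neighbours.items():
            residual.setdefault(u, {})
            residual.setdefault(v, {})
            residual[u][v] = residual[u].get(v, 0.0) + c
            residual[v].setdefault(u, 0.0)

    total = 0.0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 1e-12 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total

        # Find the bottleneck capacity along the path.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]

        # Push flow along the path and update reverse edges.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total += bottleneck


# Hypothetical attention-like capacities between an output node,
# two intermediate nodes, and an input token.
toy_graph = {
    "output": {"mid_a": 0.6, "mid_b": 0.4},
    "mid_a": {"token": 0.5, "mid_b": 0.3},
    "mid_b": {"token": 0.7},
}

print(max_flow(toy_graph, "output", "token"))
```

In this toy graph the flow is limited by the total capacity leaving the output node, so the value printed is approximately 1.0 up to floating-point rounding.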

algorithm 1, attribution, feature attribution, (15 more...)

2502.15765

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > Italy > Tuscany > Florence (0.04)
(11 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Slate, Feb-13-2025, 17:38:53 GMT

The Series' Second Movie Beat Citizen Kane on Rotten Tomatoes. The New One Is a Whole Different Animal.

The past decade has brought the world a lot of political and economic chaos, but in its defense, that same span of time has also given us the Paddington Bear movies. With those two London-set adventures, a mix of animation (Paddington) and live action (everyone else), director Paul King created a loopy world all his own, as cozy and visually pleasing as a dollhouse. The Paddington films were also refreshingly gentle, with moral messages that emerged not from preachy dialogue but from their ursine protagonist's unassuming goodness. And Ben Whishaw's voice performance as the unfailingly polite, naively bumbling bear is one of the all-time great matches between actor and animated character, up there with Tom Hanks' Woody in the Toy Story films: Whishaw quite simply is Paddington, and the completeness and believability of his characterization would have set the films apart even without their droll scripts and all-in supporting casts. The third film in the series, Paddington in Peru, ran a high risk of becoming a shark-jumping sequel, with King and his co-writers now replaced by first-time feature director Dougal Wilson and a new writing team consisting of Mark Burton, Jon Foster, and James Lamont.

artificial intelligence, different animal, paddington, (12 more...)

Slate

Country:

South America > Peru (0.66)
North America > United States > Indiana (0.05)

Industry: Media > Film (0.40)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.41)

Maldonado, Jaime, Krumme, Jonas, Zetzsche, Christoph, Didelez, Vanessa, Schill, Kerstin

Robot Pouring: Identifying Causes of Spillage and Selecting Alternative Action Parameters Using Probabilistic Actual Causation

In everyday life, we perform tasks (e.g., cooking or cleaning) that involve a large variety of objects and goals. When confronted with an unexpected or unwanted outcome, we take corrective actions and try again until achieving the desired result. The reasoning performed to identify a cause of the observed outcome and to select an appropriate corrective action is a crucial aspect of human reasoning for successful task execution. Central to this reasoning is the assumption that a factor is responsible for producing the observed outcome. In this paper, we investigate the use of probabilistic actual causation to determine whether a factor is the cause of an observed undesired outcome. Furthermore, we show how the actual causation probabilities can be used to find alternative actions to change the outcome. We apply the probabilistic actual causation analysis to a robot pouring task. When spillage occurs, the analysis indicates whether a task parameter is the cause and how it should be changed to avoid spillage. The analysis requires a causal graph of the task and the corresponding conditional probability distributions. To fulfill these requirements, we perform a complete causal modeling procedure (i.e., task analysis, definition of variables, determination of the causal graph structure, and estimation of conditional probability distributions) using data from a realistic simulation of the robot pouring task, covering a large combinatorial space of task parameters. Based on the results, we discuss the implications of the variables' representation and how the alternative actions suggested by the actual causation analysis would compare to the alternative solutions proposed by a human observer. The practical use of the analysis of probabilistic actual causation to select alternative action parameters is demonstrated.

artificial intelligence, machine learning, probability, (16 more...)

2502.09395

Country:

Europe > Germany > Bremen > Bremen (0.28)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
South America > Uruguay > Maldonado > Maldonado (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.86)

Dasgupta, Sopam

Generating Causally Compliant Counterfactual Explanations using ASP

This research is focused on generating achievable counterfactual explanations. Given a negative outcome computed by a machine learning model or a decision system, the novel CoGS approach generates (i) a counterfactual solution that represents a positive outcome and (ii) a path that will take us from the negative outcome to the positive one, where each node in the path represents a change in an attribute (feature) value. CoGS computes paths that respect the causal constraints among features. Thus, the counterfactuals computed by CoGS are realistic. CoGS utilizes rule-based machine learning algorithms to model causal dependencies between features. The paper discusses the current status of the research and the preliminary results obtained.

artificial intelligence, machine learning, natural language, (15 more...)

doi: 10.4204/EPTCS.416.30

2502.09226

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > Texas > Dallas County > Dallas (0.04)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
(5 more...)

Genre: Research Report (1.00)

Industry: Banking & Finance > Credit (0.71)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (0.64)

Su, Ezgi Iraz

Pearce's Characterisation in an Epistemic Domain

Answer-set programming (ASP) is a successful problem-solving approach in logic-based AI. In ASP, problems are represented as declarative logic programs, and solutions are identified through their answer sets. Equilibrium logic (EL) is a general-purpose nonmonotonic reasoning formalism, based on a monotonic logic called here-and-there logic. EL was basically proposed by Pearce as a foundational framework of ASP. Epistemic specifications (ES) are extensions of ASP-programs with subjective literals. These new modal constructs in the ASP-language make it possible to check whether a regular literal of ASP is true in every (or some) answer-set of a program. ES-programs are interpreted by world-views, which are essentially collections of answer-sets. (Reflexive) autoepistemic logic is a nonmonotonic formalism, modeling self-belief (knowledge) of ideally rational agents. A relatively new semantics for ES is based on a combination of EL and (reflexive) autoepistemic logic. In this paper, we first propose an overarching framework in the epistemic ASP domain. We then establish a correspondence between existing (reflexive) (auto)epistemic equilibrium logics and our easily-adaptable comprehensive framework, building on Pearce's characterisation of answer-sets as equilibrium models. We achieve this by extending Ferraris' work on answer sets for propositional theories to the epistemic case and reveal the relationship between some ES-semantic proposals.

artificial intelligence, es 94, logic & formal reasoning, (17 more...)

doi: 10.4204/EPTCS.416.18

2502.09221

Country:

Europe > Italy (0.04)
South America > Argentina > Pampas > Buenos Aires F.D. > Buenos Aires (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(7 more...)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)

Vinck, Toon, Jonckers, Naïn, Dekkers, Gert, Prinzie, Jeffrey, Karsmakers, Peter

Mitigating multiple single-event upsets during deep neural network inference using fault-aware training

Over the past decade deep neural networks (DNNs) have made remarkable advancements in terms of performance. Special hardware accelerators, designed to optimise the execution of these algorithms, have played a crucial role in this progress. However, their robustness remains a concern for safety-critical operations, especially when deployed in environments that contain high levels of radiation. In these harsh environments, accelerators are susceptible to single-event upsets (SEUs), which can lead to bit-flips causing numerical errors or even a system crash. One approach to mitigate SEUs is radiation hardening by design (RHBD).
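To make the effect of a bit-flip concrete, the sketch below inverts a single bit of a value stored as an IEEE 754 float32. It only illustrates why one upset can cause anything from a negligible to a catastrophic numerical error; it does not model the fault-injection procedure of the paper, and the helper name `flip_bit` is hypothetical.

```python
import struct


def flip_bit(value, bit):
    """Return `value`, stored as IEEE 754 float32, with one bit inverted.

    Bit 0 is the least significant mantissa bit; bit 31 is the sign bit.
    """
    if not 0 <= bit <= 31:
        raise ValueError("bit index must be in the range 0..31")
    # Reinterpret the float32 bytes as an unsigned 32-bit integer.
    (as_int,) = struct.unpack("<I", struct.pack("<f", value))
    # Invert the chosen bit and reinterpret the result as float32.
    (flipped,) = struct.unpack("<f", struct.pack("<I", as_int ^ (1 << bit)))
    return flipped


weight = 1.0
for bit in (0, 23, 30, 31):
    print(f"bit {bit:2d}: {weight} -> {flip_bit(weight, bit)}")
```

A flip in a low mantissa bit barely changes the value, whereas a flip in the highest exponent bit of 1.0 turns it into infinity, which is the kind of error that can propagate through an entire inference pass.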

artificial intelligence, experiment, machine learning, (18 more...)

2502.09374

Country:

Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.06)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Europe > United Kingdom > Scotland > City of Glasgow > Glasgow (0.04)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Liu, Shuheng, Protopapas, Pavlos, Sondak, David, Chen, Feiyu

Recent Advances of NeuroDiffEq -- An Open-Source Library for Physics-Informed Neural Networks

Solving differential equations is a critical challenge across a host of domains. While many software packages efficiently solve these equations using classical numerical approaches, there has been less effort in developing a library for researchers interested in solving such systems using neural networks. With PyTorch as its backend, NeuroDiffEq is a software library that exploits neural networks to solve differential equations. In this paper, we highlight the latest features of the NeuroDiffEq library since its debut. We show that NeuroDiffEq can solve complex boundary value problems in arbitrary dimensions, tackle boundary conditions at infinity, and maintain flexibility for dynamic injection at runtime.

artificial intelligence, library, machine learning, (13 more...)

2502.12177

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)
South America > Argentina > Pampas > Buenos Aires F.D. > Buenos Aires (0.04)
(6 more...)

Genre: Research Report (0.51)

Technology:

Information Technology > Software (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)