AITopics | data analysis

Collaborating Authors

data analysis

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Simultaneous Latent Budget Trees for Stratified Classification

Buoncompagni, Simultaneous Latent Budget Trees for Stratified Classification Cristian, Pellegrino, Stefano, Vannucci, Giulia, Dubbioso, Raffaele, Siciliano, Roberta

arXiv.org Machine LearningJun-16-2026

In the era of Explainable Artificial Intelligence, there is a renewed focus on single trees for their ease of interpretation. This paper introduces Simultaneous Latent Budget Trees, a probabilistic machine learning framework for classification trees in the presence of a stratification factor such as a temporal, spatial, or demographic variable, acting as a control variable or potential confounder. Standard tree growth procedures are not designed to optimize a conditional split rule. A model-based split rule is proposed in which child nodes are interpreted as latent components of a simultaneous mixture model, such as the Simultaneous Latent Budget Model and its constrained versions, fitted to the parent node. Mixing parameters drive the observations, differently for each group, to the child nodes whereas latent budgets parameters update the response classes profile of each level of the control variable. Parameters are estimated by least squares considering a neural network perspective of the model. An informative tree structure can be interactively visualized with interpretation aids on the node and the paths, including visual pruning and decision tree selection procedure. Suitable measures are proposed to handle an unbalanced response class distribution. The proposed methodology is applied to investigate gender-related differences in disease progression of Amyotrophic Lateral Sclerosis. The SLBT library with the various tree-based algorithms is available in the linked GitHub repository.

artificial intelligence, machine learning, node, (17 more...)

arXiv.org Machine Learning

2606.13295

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology > Amyotrophic Lateral Sclerosis (ALS) (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.89)
(2 more...)

Add feedback

Continuous-Time Learning of Probability Distributions: A Case Study in a Digital Trial of Young Children with Type 1 Diabetes

Álvarez-López, Antonio, Matabuena, Marcos

arXiv.org Machine LearningMar-26-2026

Understanding how biomarker distributions evolve over time is a central challenge in digital health and chronic disease monitoring. In diabetes, changes in the distribution of glucose measurements can reveal patterns of disease progression and treatment response that conventional summary measures miss. Motivated by a 26-week clinical trial comparing the closed-loop insulin delivery system t:slim X2 with standard therapy in children with type 1 diabetes, we propose a probabilistic framework to model the continuous-time evolution of time-indexed distributions using continuous glucose monitoring data (CGM) collected every five minutes. We represent the glucose distribution as a Gaussian mixture, with time-varying mixture weights governed by a neural ODE. We estimate the model parameter using a distribution-matching criterion based on the maximum mean discrepancy. The resulting framework is interpretable, computationally efficient, and sensitive to subtle temporal distributional changes. Applied to CGM trial data, the method detects treatment-related improvements in glucose dynamics that are difficult to capture with traditional analytical approaches.

artificial intelligence, machine learning, representation, (16 more...)

arXiv.org Machine Learning

2603.24427

Country: Asia > Japan > Honshū > Kantō > Kanagawa Prefecture (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Research Report > Strength High (0.93)

Industry:

Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Health Care Technology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science (0.93)

Add feedback

Disease Trajectory Maps

Peter Schulam, Raman Arora

Neural Information Processing SystemsMar-23-2026, 06:53:46 GMT

Medical researchers are coming to appreciate that many diseases are in fact complex, heterogeneous syndromes composed of subpopulations that express different variants of a related complication. Longitudinal data extracted from individual electronic health records (EHR) offer an exciting new way to study subtle differences in the way these diseases progress over time. In this paper, we focus on answering two questions that can be asked using these databases of longitudinal EHR data. First, we want to understand whether there are individuals with similar disease trajectories and whether there are a small number of degrees of freedom that account for differences in trajectories across the population. Second, we want to understand how important clinical outcomes are associated with disease trajectories. To answer these questions, we propose the Disease Trajectory Map (DTM), a novel probabilistic model that learns low-dimensional representations of sparse and irregularly sampled longitudinal data. We propose a stochastic variational inference algorithm for learning the DTM that allows the model to scale to large modern medical datasets. To demonstrate the DTM, we analyze data collected on patients with the complex autoimmune disease, scleroderma. We find that DTM learns meaningful representations of disease trajectories and that the representations are significantly associated with important clinical outcomes.

artificial intelligence, data mining, machine learning, (17 more...)

Neural Information Processing Systems

Country: Europe (0.28)

Genre: Research Report > Experimental Study (0.56)

Industry:

Health & Medicine > Health Care Technology > Medical Record (0.54)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.48)
Health & Medicine > Therapeutic Area > Immunology (0.34)

Technology:

Information Technology > Data Science > Data Mining (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.67)

Add feedback

NanoBaseLib: A Multi-Task Benchmark Dataset for Nanopore Sequencing

Neural Information Processing SystemsMar-21-2026, 12:42:04 GMT

Nanopore sequencing is the third-generation sequencing technology with capabilities of generating long-read sequences and directly measuring modifications on DNA/RNA molecules, which makes it ideal for biological applications such as human Telomere-to-Telomere (T2T) genome assembly, Ebola virus surveillance and COVID-19 mRNA vaccine development. However, accuracies of computational methods in various tasks of Nanopore sequencing data analysis are far from satisfactory. For instance, the base calling accuracy of Nanopore RNA sequencing is $\sim$90\%, while the aim is $\sim$99.9\%. This highlights an urgent need of contributions from the machine learning community. A bottleneck that prevents machine learning researchers from entering this field is the lack of a large integrated benchmark dataset.

artificial intelligence, machine learning, proceedings, (8 more...)

Neural Information Processing Systems

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Debiasing Synthetic Data Generated by Deep Generative Models

Neural Information Processing SystemsMar-20-2026, 06:43:17 GMT

While synthetic data hold great promise for privacy protection, their statistical analysis poses significant challenges that necessitate innovative solutions. The use of deep generative models (DGMs) for synthetic data generation is known to induce considerable bias and imprecision into synthetic data analyses, compromising their inferential utility as opposed to original data analyses. This bias and uncertainty can be substantial enough to impede statistical convergence rates, even in seemingly straightforward analyses like mean calculation.

artificial intelligence, machine learning, proceedings, (9 more...)

Neural Information Processing Systems

Genre: Research Report (0.39)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.30)

Add feedback

Towards Application-Driven and Comprehensive Data Analysis via Code Generation Xueqing Wu

Neural Information Processing SystemsFeb-17-2026, 04:43:16 GMT

Data analysis is a crucial analytical process essential for deriving insights from real-world databases.

large language model, machine learning, natural language, (22 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Europe > Austria > Vienna (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(11 more...)

Industry:

Health & Medicine (1.00)
Banking & Finance (1.00)
Law (0.68)

Technology:

Information Technology > Databases (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(3 more...)

Add feedback

Generalization in the Face of Adaptivity: A Bayesian Perspective

Neural Information Processing SystemsFeb-16-2026, 07:01:18 GMT

However, differential privacy's worst-case nature entails scaling such noise to the range of the queries even for

artificial intelligence, machine learning, query, (17 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.04)
North America > United States > Indiana > Monroe County > Bloomington (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

fdff71fcab656abfbefaabecab1a7f6d-Supplemental.pdf

Neural Information Processing SystemsFeb-11-2026, 06:38:06 GMT

computational geometry, multiparameter persistence image, persistence diagram, (12 more...)

Neural Information Processing Systems

Country:

North America > United States > Texas > Travis County > Austin (0.14)
North America > Canada (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > France (0.04)

Industry: Health & Medicine (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

MultiparameterPersistenceImagesforTopological MachineLearning

Neural Information Processing SystemsFeb-11-2026, 06:37:58 GMT

However,in manyapplications there are several different parameters one might wish to vary: for example, scale and density. In contrast to the one-parameter setting, techniques for applying statistics and machine learning in the setting of multiparameter persistence are not well understood due to the lack of a concise representationoftheresults.

artificial intelligence, machine learning, persistence image, (15 more...)

Neural Information Processing Systems

Country: