AITopics | case study

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question "What is Artificial Intelligence?" and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Decision-Aligned Evaluation of Uncertainty Quantification

Schneider, Annika, Rochussen, Tommy, Stiller, Joshua, Fortuin, Vincent

arXiv.org Machine Learning, Jun-26-2026

Uncertainty estimates in machine learning are typically evaluated using generic metrics such as the negative log-likelihood and expected calibration error, yet good performance on such metrics does not necessarily imply high utility in downstream decisions. We introduce decision-alignment, a criterion that reveals which evaluation metrics meaningfully align with downstream utilities. Applying this framework, we show that many widely used uncertainty metrics are either misaligned with common decision problems or encode pathological prior beliefs about the downstream task. We then propose prior-weighted utility metrics, a special class of proper scoring rules that provides decision-aligned uncertainty evaluation. Across benchmark experiments and real-world case studies, our metrics consistently align with realized decision utility, while conventional metrics do not. Our results surface flaws in the current UQ evaluation protocol and offer a principled extension of existing metrics toward decision-relevant UQ evaluation.

alignment, artificial intelligence, machine learning, (18 more...)

arXiv.org Machine Learning

2606.2699

Country: North America > United States (0.28)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine (1.00)
Energy > Power Industry (0.93)
Banking & Finance > Loans (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
(2 more...)

Do-PFN: In-Context Learning for Causal Effect Estimation

Neural Information Processing Systems, Jun-23-2026, 04:47:17 GMT

Causal effect estimation is critical to a range of scientific disciplines. Existing methods for this task either require interventional data or knowledge of the ground-truth causal graph, or rely on assumptions such as unconfoundedness, restricting their applicability in real-world settings. In the domain of tabular machine learning, prior-data fitted networks (PFNs) have achieved state-of-the-art predictive performance, having been pre-trained on synthetic causal data to solve tabular prediction problems via in-context learning. To assess whether this can be transferred to the problem of causal effect estimation, we pre-train PFNs on synthetic data drawn from a wide variety of causal structures, including interventions, to predict interventional outcomes given observational data. Through extensive experiments in synthetic and semi-synthetic settings, we show that our approach allows for the accurate estimation of causal effects without knowledge of the underlying causal graph.

do-pfn, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country:

Europe > Germany > Baden-Württemberg (0.28)
North America > United States (0.28)
Europe > United Kingdom (0.28)

Genre: Research Report > Experimental Study (1.00)

Industry:

Education (0.69)
Law (0.46)
Health & Medicine (0.46)
Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

e0ed6d6c2ec6df05f929b8a67b78513a-Supplemental-Datasets_and_Benchmarks_Track.pdf

Neural Information Processing Systems, Jun-22-2026, 23:38:44 GMT

In this section, we provide detailed information on our benchmark and dataset construction process, including the data source description, dataset composition, filtering strategies, and the rationale for dataset construction. Chemical reaction data are separately collected from patent databases, including USPTO [19], Pistachio [37], and Reaxys [8]. For reaction mechanism annotation, we followed the processing pipeline described in [26]. A.2 Dataset Composition and Filtering Strategies. Molecular Samples (25% of Benchmark): Although the ZINC database contains 250,000 molecules, we observed that its molecular weight distribution is relatively concentrated. To ensure diversity, we carefully selected molecules from PubChem, ChEMBL, and ZINC based on molecular weight and structural complexity.

artificial intelligence, large language model, natural language, (14 more...)

Neural Information Processing Systems

Industry: Materials > Chemicals > Commodity Chemicals > Petrochemicals (0.49)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.39)

Unlocking for Data Analysis Code Generation via Non Parametric Knowledge Distillation

Neural Information Processing Systems, Jun-21-2026, 13:01:22 GMT

Knowledge distillation from Large Language Models (LLMs) to locally hosted Small Language Models (SLMs) provides advantages for Data Analysis Code Generation (DACG) such as privacy protection. However, achieving effective distillation without resource-intensive training is challenging. This paper investigates whether LLMs can distill knowledge to SLMs through In-Context Learning (ICL), a training-free method for rapid task adaptation. We present DARGO (Distillation and Adaptive Reasoning-Guided Orchestration), a framework that facilitates automatic knowledge distillation from LLMs to SLMs. DARGO consists of three phases: exploration through a Model Orchestration Interface (MOI), Memory Collection of successful trajectories, and Knowledge-driven Inference. We evaluate DARGO on three challenging DACG benchmarks (WIKITQ, TABMWP, and BIRD-SQL), each with in-domain training sets that enable detailed analysis of knowledge distillation effectiveness. DARGO demonstrates a substantial relative performance improvement of 27.5% on average for the student SLMs. To further observe generalization capabilities, we evaluate DARGO across different teacher-student model combinations, knowledge transfer scenarios, and unified memory approaches for more advanced, test-only data analysis tasks. Our findings contribute a novel perspective on distillation methods that enhance performance for SLMs while avoiding intensive fine-tuning.

large language model, machine learning, slm, (19 more...)

Neural Information Processing Systems

Country:

Europe > Austria (0.28)
Asia > China (0.28)
North America > Mexico (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Workflow (0.94)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

ClinicalLab: Aligning Agents for Multi-Departmental Clinical Diagnostics in the Real World

Neural Information Processing Systems, Jun-18-2026, 02:58:53 GMT

Large language models (LLMs) have made significant progress in various natural language processing applications. However, LLMs still struggle to meet the strict requirements for accuracy and reliability in the medical field and face many challenges in clinical applications. Existing clinical diagnostic benchmarks for evaluating medical agents powered by LLMs have severe limitations. Firstly, most existing medical evaluation benchmarks face the risk of data leakage or contamination.

artificial intelligence, large language model, natural language, (17 more...)

Neural Information Processing Systems

Country: North America > United States (0.92)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Nephrology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
(9 more...)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Bridging Data Gaps in Structural Fragility Modeling through Transfer Learning: Methodology and Case Studies

Saeednejad, Narges, Padgett, Jamie Ellen

arXiv.org Machine Learning, Jun-18-2026

This paper presents a methodology-centered transfer learning framework for fragility adaptation under domain shift, class imbalance, and scarce target labels while preserving engineering interpretability and supporting decision-making under uncertainty. Four transfer learning strategies (instance-based, parameter-based, hierarchical Bayesian, and multi-source) are demonstrated through three complementary case studies: (i) instance-based transfer learning via importance weighting, demonstrated on coastal bridge fragility using Hurricane Katrina observations; (ii) parameter-based transfer learning together with hierarchical Bayesian transfer learning, enabling partial pooling across strata and posterior uncertainty quantification, demonstrated on residential building fragility using Hurricane Ian observations; and (iii) multi-source transfer learning that fuses multiple analytical fragility models with learned source weights and regularized target-domain adaptation, demonstrated on seismic bridge fragility using observations from the 2001 Nisqually earthquake. Across these case studies, direct transfer of source models (i.e. using existing state-of-the-art models) fails under domain shift and severe class imbalance, while targeted adaptation substantially improves failure detection and predictive stability in low-data regimes. These findings highlight the need for systematic guidance on diagnostics, strategy selection, and uncertainty reporting when developing and adapting fragility models.

artificial intelligence, bayesian inference, machine learning, (18 more...)

arXiv.org Machine Learning

2606.18567

Country: North America > United States > California (0.67)

Genre: Research Report > New Finding (0.46)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Materials > Construction Materials (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)

1 Supplementary Material

Neural Information Processing Systems, Jun-17-2026, 23:11:01 GMT

To investigate this further, we first observe that Claude-3.7-Sonnet [...] Figure 1 shows the average pass rate under budgets of 12,000, 14,000, 16,000, and 17,000 tokens. As the data demonstrate, enlarging the thinking budget yields no appreciable improvement in performance. This finding underscores the challenging nature of ENGDESIGN and suggests its value as a rigorous testbed for future efforts to enhance LLMs' engineering design proficiency. Figure 1: Average pass rate (%) of Claude-3.7-Thinking

artificial intelligence, large language model, natural language, (18 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.49)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

FRESH: Information-Geometric Calibration of Patient-Level Models to Aggregate Evidence

Fuller, Franklin, Bertolini, Daniele, Liang, Samantha, Christopher, Jason, Smith, Aaron M.

arXiv.org Machine Learning, May-18-2026

Many decisions in clinical science and epidemiology -- estimating the probability of technical success for a clinical trial, assessing the comparative effectiveness of two therapies, imputing a placebo effect onto natural history data -- rely on combining sources of information about a clinical cohort that come from different kinds of studies. Specifically, we contrast patient-level sources that provide granular pictures of individual disease course (clinical trials, registries, or electronic health records) with aggregate sources such as published clinical trial results and the TFLs (tables, figures, and listings). One strategy for combining aggregate with patient-level data sources is to bring each into a common format for a unified analysis. If one wants to maintain the analytic flexibility of patient-level data, then a natural solution is to convert the aggregate information into a simulated patient-level dataset that recapitulates those aggregate statistics. This is an under-determined inverse problem in that there are many such datasets, and it cannot be well specified without further constraints. FRESH (Fusion of Recent Evidence with Subject Histories) provides a well-defined method for solving this problem, and therefore provides maximal analytic flexibility.

artificial intelligence, constraint, machine learning, (18 more...)

arXiv.org Machine Learning

2605.16246

Genre: Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Health Care Technology > Medical Record (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)

Minigrid & Miniworld: Modular & Customizable Reinforcement Learning Environments for Goal-Oriented Tasks Supplementary Materials

Neural Information Processing Systems, Apr-30-2026, 03:53:44 GMT

The source code of Minigrid and Miniworld can be found at https://github.com/ To run the experiments, we have implemented the following functionalities: 1. implemented the human trajectory saving for MiniGrid-FourRooms-v0 (copied the ManualControl class from Minigrid and added 38 lines of code, which are mostly calls to data saving functions); 2. implemented the human trajectory saving for MiniWorld-FourRooms-v0 (copied the ManualControl class from Miniworld and added 45 lines of code, which are mostly calls to data saving functions); 3. implemented data saving and plotting for MiniGrid-FourRooms-v0 (33 lines of code, mostly for Matplotlib); 4. implemented data saving and plotting for MiniWorld-FourRooms-v0 (33 lines of code, mostly for Matplotlib). In total, the implementation of this new functionality required 149 lines of code. The source code is hosted on GitHub. We bear all the responsibility in case of violation of rights.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Neural Information Processing Systems

Country: North America (0.17)

Industry: Education (0.41)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.54)
