Goto

Collaborating Authors: Dubrawski, Artur


Investigating Compositional Reasoning in Time Series Foundation Models

arXiv.org Artificial Intelligence

Large pre-trained time series foundation models (TSFMs) have demonstrated promising zero-shot performance across a wide range of domains. However, a question remains: Do TSFMs succeed solely by memorizing training patterns, or do they possess the ability to reason? While reasoning is a topic of great interest in the study of Large Language Models (LLMs), it remains undefined and largely unexplored in the context of TSFMs. In this work, inspired by the language modeling literature, we formally define compositional reasoning in forecasting and distinguish it from in-distribution generalization. We evaluate the reasoning and generalization capabilities of 23 popular deep learning forecasting models on multiple synthetic and real-world datasets. Additionally, through controlled studies, we systematically examine which design choices in TSFMs contribute to improved reasoning abilities. Our study yields key insights into the impact of TSFM architecture design on compositional reasoning and generalization. We find that patch-based Transformers have the best reasoning performance, closely followed by residualized MLP-based architectures, which are 97% less computationally complex in terms of FLOPs and have 86% fewer trainable parameters. Interestingly, in some zero-shot out-of-distribution scenarios, these models can outperform moving average and exponential smoothing statistical baselines trained on in-distribution data. Only a few design choices, such as the tokenization method, had a significant (negative) impact on Transformer model performance.
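As a toy illustration of the compositional setting, the sketch below generates two basis patterns (trend and seasonality), holds out their sum as an out-of-distribution composition, and scores a non-compositional baseline with MASE. The basis patterns, baseline, and metric details are our own assumptions, not the paper's exact protocol.

```python
# Illustrative sketch only: probe compositional reasoning by training on basis
# patterns in isolation and evaluating on their unseen composition.
import numpy as np

def trend(t, slope=0.05):
    return slope * t

def seasonality(t, period=24.0):
    return np.sin(2 * np.pi * t / period)

t = np.arange(512, dtype=float)

# In-distribution pool: each basis pattern is seen on its own.
train_series = [trend(t), seasonality(t)]

# Compositional (out-of-distribution) test case: a combination never seen
# during training; a model that "reasons" should still forecast it.
test_series = trend(t) + seasonality(t)

def mase(y_true, y_pred):
    """Mean absolute scaled error against a naive one-step forecast."""
    scale = np.mean(np.abs(np.diff(y_true)))
    return np.mean(np.abs(y_true - y_pred)) / scale

# A trivial "memorization" baseline: predict the average of the training
# patterns, ignoring the composition entirely.
naive_pred = np.mean(train_series, axis=0)
print(f"MASE of non-compositional baseline: {mase(test_series, naive_pred):.2f}")
```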


Multimodal Structure Preservation Learning

arXiv.org Artificial Intelligence

When selecting data to build machine learning models in practical applications, factors such as availability, acquisition cost, and discriminatory power are crucial considerations. Different data modalities often capture unique aspects of the underlying phenomenon, making their utilities complementary. On the other hand, some sources of data host structural information that is key to their value. Hence, the utility of one data type can sometimes be enhanced by matching the structure of another. We propose Multimodal Structure Preservation Learning (MSPL) as a novel method of learning data representations that leverages the clustering structure provided by one data modality to enhance the utility of data from another modality. We demonstrate the effectiveness of MSPL in uncovering latent structures in synthetic time series data and recovering clusters from whole genome sequencing and antimicrobial resistance data using mass spectrometry data in support of epidemiology applications. The results show that MSPL can imbue the learned features with external structures and help reap the beneficial synergies occurring across disparate data modalities.
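A minimal sketch of the structure-preservation idea follows, assuming an autoencoder on one modality whose embedding is pulled toward the cluster structure supplied by another modality. The network sizes, margin, and loss weighting are illustrative and do not reproduce MSPL's actual objective.

```python
# Sketch: couple a reconstruction loss on modality A with a term that makes
# embeddings of points sharing a modality-B cluster label sit close together.
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 16))
decoder = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 64))

x = torch.randn(128, 64)                   # modality A features (e.g., mass spectra)
cluster_ids = torch.randint(0, 5, (128,))  # clustering structure from modality B (e.g., WGS)

z = encoder(x)
recon_loss = nn.functional.mse_loss(decoder(z), x)

# Structure preservation: pairs sharing a cluster label from the other modality
# should be closer in the embedding than pairs from different clusters.
same = (cluster_ids[:, None] == cluster_ids[None, :]).float()
dist = torch.cdist(z, z)
pos = (same * dist).sum() / same.sum()
neg = ((1 - same) * dist).sum() / (1 - same).sum()
struct_loss = torch.relu(pos - neg + 1.0)    # margin of 1.0 is an assumption

loss = recon_loss + 0.5 * struct_loss        # weighting is illustrative
loss.backward()
```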


TimeSeriesExam: A time series understanding exam

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have recently demonstrated a remarkable ability to model time series data. These capabilities can be partly explained if LLMs understand basic time series concepts. However, our knowledge of what these models understand about time series data remains relatively limited. To address this gap, we introduce TimeSeriesExam, a configurable and scalable multiple-choice exam designed to assess LLMs across five core time series understanding categories: pattern recognition, noise understanding, similarity analysis, anomaly detection, and causality analysis. TimeSeriesExam comprises over 700 questions, procedurally generated using 104 carefully curated templates and iteratively refined to balance difficulty and the ability to discriminate good from bad models. We test 7 state-of-the-art LLMs on TimeSeriesExam and provide the first comprehensive evaluation of their time series understanding abilities. Our results suggest that closed-source models such as GPT-4 and Gemini understand simple time series concepts significantly better than their open-source counterparts, while all models struggle with complex concepts such as causality analysis. We believe that the ability to programmatically generate questions is fundamental to assessing and improving LLMs' ability to understand and reason about time series data.
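Although the benchmark's actual templates are carefully curated, a toy example of how one template could be instantiated procedurally might look like the sketch below; the question wording, answer options, and generator are illustrative assumptions.

```python
# Sketch of procedural question generation in the spirit of TimeSeriesExam.
import numpy as np

rng = np.random.default_rng(0)

def trend_direction_question():
    """Instantiate a 'trend direction' template with a fresh synthetic series."""
    slope = rng.choice([-0.1, 0.1])
    series = slope * np.arange(100) + rng.normal(0, 1.0, 100)
    question = "Does the following series exhibit an upward or downward trend?"
    options = ["Upward trend", "Downward trend", "No trend", "Cannot be determined"]
    answer = "Upward trend" if slope > 0 else "Downward trend"
    return {"series": series.round(2).tolist(), "question": question,
            "options": options, "answer": answer}

exam = [trend_direction_question() for _ in range(5)]
print(exam[0]["question"], "->", exam[0]["answer"])
```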


Towards Long-Context Time Series Foundation Models

arXiv.org Artificial Intelligence

Large Language and Vision Models (LLMs and LVMs) have revolutionized text and image modeling, enabling a wide range of applications even with limited data and expert supervision. Time series foundation models (TSFMs) [8, 7, 1, 19, 21, 6, 4, 12] promise to bring similar transformative advancements to modeling time series. However, most of these models, barring MOIRAI [21] and TTMs [6], can only model short univariate time series, limiting their widespread use in applications such as healthcare where long and multivariate time series are common. Most of these approaches downsample long time series to handle extended context lengths and model different channels independently to manage multivariate inputs, which limits their ability to capture potentially informative high-frequency and cross-channel (inter-variate) dependencies. A straightforward solution to modeling both long and multivariate inputs is to re-design and pre-train TSFMs with longer context lengths and to concatenate multiple channels sequentially [21]. However, this naive solution drastically increases computational complexity, as Transformer-based foundation models [8, 7, 1, 19, 21, 4, 12] are constrained by context-dependent memory due to their quadratic complexity in the length of the input.
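To make the quadratic cost concrete, the back-of-the-envelope sketch below estimates the memory needed just for the attention score matrices when channels are concatenated sequentially; the head count, patch length, and precision are illustrative assumptions.

```python
# Sketch: self-attention stores an L x L score matrix per head, so memory grows
# quadratically with the effective (patched, channel-concatenated) sequence length.
num_heads, bytes_per_value = 8, 4          # fp32 attention scores, 8 heads (assumed)
patch_len = 8                              # tokens = time steps / patch_len (assumed)

def attention_score_bytes(time_steps: int, channels: int) -> float:
    tokens = (time_steps // patch_len) * channels   # channels concatenated sequentially
    return num_heads * tokens ** 2 * bytes_per_value

for steps in (512, 2048, 8192):
    for ch in (1, 16):
        gib = attention_score_bytes(steps, ch) / 2**30
        print(f"context={steps:5d}, channels={ch:2d} -> ~{gib:8.3f} GiB of attention scores")
```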


A Rate-Distortion View of Uncertainty Quantification

arXiv.org Machine Learning

In supervised learning, understanding an input's proximity to the training data can help a model decide whether it has sufficient evidence for reaching a reliable prediction. While powerful probabilistic models such as Gaussian Processes naturally have this property, deep neural networks often lack it. In this paper, we introduce the Distance Aware Bottleneck (DAB), a new method for enriching deep neural networks with this property. Building on prior information bottleneck approaches, our method learns a codebook that stores a compressed representation of all inputs seen during training. The distance of a new example from this codebook can serve as an uncertainty estimate for that example. The resulting model is simple to train and provides deterministic uncertainty estimates with a single forward pass. Finally, our method achieves better out-of-distribution (OOD) detection and misclassification prediction than prior methods, including expensive ensemble methods, deep kernel Gaussian Processes, and approaches based on the standard information bottleneck.
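As a rough illustration of the distance-as-uncertainty step (not DAB's actual training objective, which learns the codebook jointly through an information bottleneck), one can approximate the idea by fitting a fixed codebook on training embeddings and scoring new inputs by their distance to the nearest entry.

```python
# Sketch: distance to the nearest codebook entry as a deterministic uncertainty score.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
train_embeddings = rng.normal(0, 1, (1000, 32))     # stand-in for penultimate features

# Codebook approximated here by k-means centroids (an assumption for illustration).
codebook = KMeans(n_clusters=16, n_init=10, random_state=0).fit(train_embeddings)

def uncertainty(embedding: np.ndarray) -> float:
    """Distance of a new example's embedding to its nearest codebook entry."""
    dists = np.linalg.norm(codebook.cluster_centers_ - embedding, axis=1)
    return float(dists.min())

in_dist = rng.normal(0, 1, 32)    # looks like the training data
out_dist = rng.normal(6, 1, 32)   # far from anything seen during training
print(uncertainty(in_dist), uncertainty(out_dist))   # the OOD point scores higher
```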


Exploring Loss Design Techniques For Decision Tree Robustness To Label Noise

arXiv.org Machine Learning

In the real world, data is often noisy, affecting not only the quality of features but also the accuracy of labels. Current research on mitigating label errors stems primarily from advances in deep learning, and a gap exists in exploring interpretable models, particularly those rooted in decision trees. In this study, we investigate whether ideas from deep learning loss design can be applied to improve the robustness of decision trees. In particular, we show that loss correction and symmetric losses, both standard approaches, are not effective. We argue that other directions need to be explored to improve the robustness of decision trees to label noise.
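For concreteness, the sketch below contrasts a standard cross-entropy impurity with a bounded, MAE-style symmetric criterion on a noisily labeled node; the node contents and noise setup are illustrative assumptions, not the paper's experimental protocol.

```python
# Sketch: two candidate node-scoring criteria, one unbounded (cross-entropy)
# and one bounded/symmetric (MAE-style), evaluated on clean vs. noisy labels.
import numpy as np

def cross_entropy_impurity(labels):
    p = np.bincount(labels, minlength=2) / len(labels)
    p = np.clip(p, 1e-12, 1.0)
    return -np.sum(p * np.log(p))

def mae_impurity(labels):
    # Symmetric, bounded loss: mean absolute error between one-hot labels and
    # the node's predicted class distribution; single flipped labels hurt less.
    p = np.bincount(labels, minlength=2) / len(labels)
    one_hot = np.eye(2)[labels]
    return np.mean(np.abs(one_hot - p).sum(axis=1))

clean = np.array([0] * 50 + [1] * 50)
noisy = clean.copy()
noisy[:10] = 1 - noisy[:10]          # flip 10% of the labels

for name, crit in [("cross-entropy", cross_entropy_impurity), ("MAE", mae_impurity)]:
    print(name, round(crit(clean[:50]), 4), round(crit(noisy[:50]), 4))
```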


MOMENT: A Family of Open Time-series Foundation Models

arXiv.org Artificial Intelligence

Time-series analysis is an important field encompassing applications ranging from forecasting weather patterns Schneider and Dickinson [1974] and detecting irregular heartbeats using electrocardiograms Goswami et al. [2021], to identifying anomalous software deployments Xu et al. [2018]. Due to its significant practical value and the unique challenges that modeling time-series data poses, time-series analysis continues to receive substantial interest from academia and industry alike. However, modeling such data typically requires substantial domain expertise, time, and task-specific design. Large pre-trained language Touvron et al. [2023], Devlin et al. [2019], Chung et al. [2022], vision Li et al. [2023a], and video Day et al. [2023] models typically perform well on a variety of tasks on data from diverse domains, with little or no supervision, and can be specialized to perform well on specific tasks. We unlock these key capabilities for time-series data and release the first family of open-source large pre-trained time-series models, which we call MOMENT. The models in this family (1) serve as a building block for diverse time-series analysis tasks (e.g., forecasting, classification, anomaly detection, and imputation), (2) are effective out-of-the-box, i.e., with no (or few) task-specific exemplars (enabling, e.g., zero-shot forecasting and few-shot classification), and (3) are tunable using in-distribution and task-specific data to improve performance. MOMENT is a family of high-capacity Transformer models, pre-trained using a masked time-series prediction task on large amounts of time-series data drawn from diverse domains. Below we summarize our key contributions.
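A minimal sketch of the masked time-series prediction idea (patching, random masking, and reconstructing only the hidden patches) is given below; the patch length, mask ratio, and tiny reconstruction network are illustrative assumptions rather than MOMENT's actual configuration.

```python
# Sketch: mask random patches of a series and train a model to reconstruct them.
import torch
import torch.nn as nn

patch_len, num_patches, mask_ratio = 8, 64, 0.3
series = torch.randn(32, num_patches * patch_len)              # batch of raw series
patches = series.reshape(32, num_patches, patch_len)

mask = torch.rand(32, num_patches) < mask_ratio                # True = hidden patch
masked_patches = patches.masked_fill(mask.unsqueeze(-1), 0.0)  # zero out masked patches

# Tiny stand-in for the high-capacity Transformer encoder.
model = nn.Sequential(nn.Linear(patch_len, 128), nn.ReLU(), nn.Linear(128, patch_len))
recon = model(masked_patches)

# The reconstruction loss is computed only on the patches the model could not see.
loss = ((recon - patches) ** 2)[mask].mean()
loss.backward()
```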


Signal Quality Auditing for Time-series Data

arXiv.org Artificial Intelligence

Signal quality assessment (SQA) is required for monitoring the reliability of data acquisition systems, especially in AI-driven Predictive Maintenance (PMx) application contexts. SQA is vital for addressing "silent failures" of data acquisition hardware and software, which, when unnoticed, misinform the users of the data and create the risk of incorrect decisions with unintended or even catastrophic consequences. We have developed an open-source software implementation of signal quality indices (SQIs) for the analysis of time-series data. We codify a range of SQIs, demonstrate them using established benchmark data, and show that they can be effective for signal quality assessment. We also study alternative approaches to denoising time-series data in an attempt to improve the quality of already degraded signals, and evaluate them empirically on relevant real-world data. To our knowledge, our software toolkit is the first to provide an open-source implementation of a broad range of signal quality assessment and improvement techniques validated on publicly available benchmark data for ease of reproducibility. Our framework is general and can easily be extended to assess the reliability of arbitrary time-series measurements in complex systems, especially when morphological patterns of the waveform shapes and signal periodicity are of key interest in downstream analyses.
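As an illustration of what such signal quality indices can look like, the sketch below implements two simple SQIs, a flatline-fraction check and a crude SNR estimate; the indices actually codified in the toolkit may differ, so treat these as assumptions.

```python
# Sketch of two simple signal quality indices (SQIs) for time-series data.
import numpy as np

def flatline_fraction(x: np.ndarray, tol: float = 1e-6) -> float:
    """Fraction of consecutive samples that barely change (a sensor-dropout cue)."""
    return float(np.mean(np.abs(np.diff(x)) < tol))

def snr_db(x: np.ndarray, window: int = 11) -> float:
    """Crude SNR estimate: moving-average 'signal' power vs. residual 'noise' power."""
    kernel = np.ones(window) / window
    smooth = np.convolve(x, kernel, mode="same")
    noise = x - smooth
    return float(10 * np.log10(np.var(smooth) / (np.var(noise) + 1e-12)))

t = np.linspace(0, 10, 1000)
clean = np.sin(2 * np.pi * t)
degraded = clean + 0.8 * np.random.default_rng(0).normal(size=t.size)
print(snr_db(clean), snr_db(degraded), flatline_fraction(np.zeros(1000)))
```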


Motion Informed Needle Segmentation in Ultrasound Images

arXiv.org Artificial Intelligence

Segmenting a moving needle in ultrasound images is challenging due to the presence of artifacts, noise, and needle occlusion. This task becomes even more demanding in scenarios where data availability is limited. Convolutional Neural Networks (CNNs) have been successful in many computer vision applications but struggle to accurately segment needles without considering their motion. In this paper, we present a novel approach for needle segmentation that combines classical Kalman Filter (KF) techniques with data-driven learning, incorporating both needle features and needle motion. Our method offers three key contributions. First, we propose a compatible framework that seamlessly integrates into commonly used encoder-decoder style architectures. Second, we demonstrate superior performance compared to recent state-of-the-art needle segmentation models using our novel CNN-based, KF-inspired block, achieving a 15% reduction in pixel-wise needle tip error and an 8% reduction in length error. Third, to our knowledge, we are the first to implement a learnable filter that incorporates non-linear needle motion to improve needle segmentation.
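For readers unfamiliar with the classical machinery the learnable block draws on, the sketch below runs a plain constant-velocity Kalman filter over noisy needle-tip measurements; the state definition, noise covariances, and measurement model are illustrative assumptions, not the learned block itself.

```python
# Sketch: classical predict/update Kalman filter for a 2-D needle-tip track.
import numpy as np

dt = 1.0                                   # one ultrasound frame
F = np.array([[1, 0, dt, 0],               # state: [x, y, vx, vy]
              [0, 1, 0, dt],
              [0, 0, 1, 0],
              [0, 0, 0, 1]], dtype=float)
H = np.array([[1, 0, 0, 0],                # only the tip position is measured
              [0, 1, 0, 0]], dtype=float)
Q = 0.01 * np.eye(4)                       # process noise (assumed)
R = 4.0 * np.eye(2)                        # measurement/segmentation noise (assumed)

x, P = np.zeros(4), np.eye(4)

def kf_step(x, P, z):
    # Predict where the tip should be given its previous motion.
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    # Correct the prediction with the (noisy) per-frame segmentation output.
    K = P_pred @ H.T @ np.linalg.inv(H @ P_pred @ H.T + R)
    x_new = x_pred + K @ (z - H @ x_pred)
    P_new = (np.eye(4) - K @ H) @ P_pred
    return x_new, P_new

for frame, z in enumerate([np.array([10.0, 5.0]), np.array([12.1, 5.9]), np.array([14.2, 7.1])]):
    x, P = kf_step(x, P, z)
    print(f"frame {frame}: smoothed tip estimate {x[:2].round(2)}")
```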


Forecasting Response to Treatment with Global Deep Learning and Patient-Specific Pharmacokinetic Priors

arXiv.org Artificial Intelligence

Forecasting healthcare time series is crucial for early detection of adverse outcomes and for patient monitoring. Forecasting, however, can be difficult in practice due to noisy and intermittent data. The challenges are often exacerbated by change points induced via extrinsic factors, such as the administration of medication. To address these challenges, we propose a novel hybrid global-local architecture and a pharmacokinetic encoder that informs deep learning models of patient-specific treatment effects. We showcase the efficacy of our approach in achieving significant accuracy gains for a blood glucose forecasting task using both realistically simulated and real-world data. Our global-local architecture improves over patient-specific models by 9.2-14.6%. Additionally, our pharmacokinetic encoder improves over alternative encoding techniques by 4.4% on simulated data and 2.1% on real-world data. The proposed approach can have multiple beneficial applications in clinical practice, such as issuing early warnings about unexpected treatment responses, or helping to characterize patient-specific treatment effects in terms of drug absorption and elimination characteristics.
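As an illustration of the kind of patient-specific signal a pharmacokinetic encoder could condition on, the sketch below superposes one-compartment absorption/elimination curves over a dosing schedule; the rate constants and the way the forecaster would consume this covariate are illustrative assumptions.

```python
# Sketch: one-compartment pharmacokinetic (Bateman-type) curve as an extra
# input channel for a global forecasting model.
import numpy as np

def pk_curve(t, dose=1.0, ka=1.2, ke=0.3):
    """Predicted drug concentration after a dose at t=0 (arbitrary units)."""
    return dose * ka / (ka - ke) * (np.exp(-ke * t) - np.exp(-ka * t))

horizon = np.arange(0, 24, 0.5)            # hours after the first administration
dose_times = [0.0, 8.0, 16.0]              # three doses over the day (assumed schedule)

# Superpose the effect of each dose to obtain a per-patient covariate that a
# global forecasting model can take alongside the observed vitals.
pk_feature = sum(pk_curve(np.clip(horizon - t0, 0, None)) for t0 in dose_times)
print(pk_feature.round(3))
```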