Wang, Yuyang
Discovering Bias in Latent Space: An Unsupervised Debiasing Approach
Adila, Dyah, Zhang, Shuai, Han, Boran, Wang, Yuyang
The question-answering (QA) capabilities of foundation models are highly sensitive to prompt variations, rendering their performance susceptible to superficial, non-meaning-altering changes. This vulnerability often stems from the model's preference, or bias, towards specific input characteristics, such as option position or superficial image features in multi-modal settings. We propose to rectify this bias directly in the model's internal representation. Our approach, SteerFair, finds the bias direction in the model's representation space and steers activation values away from it during inference. Specifically, we exploit the observation that bias often adheres to simple association rules, such as the spurious association between the first option and correctness likelihood. We then construct demonstrations of these rules from unlabeled samples and use them to identify the bias directions. We empirically show that SteerFair significantly reduces instruction-tuned model performance variance across prompt modifications on three benchmark tasks. Remarkably, our approach surpasses a supervised baseline with 100 labels by an average of 10.86% accuracy points and 12.95 score points, and matches the performance of a baseline with 500 labels.
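To make the steering idea concrete, below is a minimal numpy sketch of the general recipe the abstract describes: estimate a bias direction as the difference of mean activations between two groups of rule demonstrations, then project activations away from that direction at inference. The function names, the unit-norm direction, and the single steering coefficient `alpha` are illustrative assumptions, not the SteerFair implementation.

```python
import numpy as np

def bias_direction(acts_a: np.ndarray, acts_b: np.ndarray) -> np.ndarray:
    """Unit vector from group A's mean activation to group B's."""
    d = acts_b.mean(axis=0) - acts_a.mean(axis=0)
    return d / np.linalg.norm(d)

def steer_away(h: np.ndarray, d: np.ndarray, alpha: float = 1.0) -> np.ndarray:
    """Remove (alpha times) the component of activation h along direction d."""
    return h - alpha * (h @ d) * d

# Toy usage: two groups of 100 activations of width 64, e.g. prompts with the
# correct answer placed in the first vs. last option slot.
rng = np.random.default_rng(0)
acts_a = rng.normal(size=(100, 64))
acts_b = rng.normal(loc=0.5, size=(100, 64))
d = bias_direction(acts_a, acts_b)
h_debiased = steer_away(rng.normal(size=64), d)
```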
Chronos: Learning the Language of Time Series
Ansari, Abdul Fatir, Stella, Lorenzo, Turkmen, Caner, Zhang, Xiyuan, Mercado, Pedro, Shen, Huibin, Shchur, Oleksandr, Rangapuram, Syama Sundar, Arango, Sebastian Pineda, Kapoor, Shubham, Zschiegner, Jasper, Maddix, Danielle C., Wang, Hao, Mahoney, Michael W., Torkkola, Kari, Wilson, Andrew Gordon, Bohlke-Schneider, Michael, Wang, Yuyang
We introduce Chronos, a simple yet effective framework for pretrained probabilistic time series models. Chronos tokenizes time series values using scaling and quantization into a fixed vocabulary and trains existing transformer-based language model architectures on these tokenized time series via the cross-entropy loss. We pretrain Chronos models based on the T5 family (ranging from 20M to 710M parameters) on a large collection of publicly available datasets, complemented by a synthetic dataset that we generate via Gaussian processes to improve generalization. In a comprehensive benchmark of 42 datasets that includes both classical local models and deep learning methods as baselines, we show that Chronos models: (a) significantly outperform other methods on datasets that were part of the training corpus; and (b) achieve comparable and occasionally superior zero-shot performance on new datasets, relative to methods that were trained specifically on them. Our results demonstrate that Chronos models can leverage time series data from diverse domains to improve zero-shot accuracy on unseen forecasting tasks, positioning pretrained models as a viable tool to greatly simplify forecasting pipelines.
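As a rough illustration of the scaling-and-quantization tokenization described above, the Python sketch below mean-scales a series and bins it onto a fixed uniform grid of token ids. The vocabulary size, bin range, and uniform grid are placeholder choices, not the exact Chronos configuration.

```python
import numpy as np

def tokenize(series: np.ndarray, n_tokens: int = 4096, low: float = -15.0, high: float = 15.0):
    """Mean-scale the series, then quantize onto a fixed uniform grid of token ids."""
    scale = np.abs(series).mean() or 1.0
    grid = np.linspace(low, high, n_tokens)
    ids = np.clip(np.digitize(series / scale, grid), 0, n_tokens - 1)
    return ids, scale

def detokenize(ids: np.ndarray, scale: float, n_tokens: int = 4096, low: float = -15.0, high: float = 15.0):
    """Map token ids back to (approximate) real values."""
    grid = np.linspace(low, high, n_tokens)
    return grid[ids] * scale

ids, s = tokenize(np.array([1.0, 2.0, 3.0, 2.5]))
print(detokenize(ids, s))   # roughly recovers the input
```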
OmniColor: A Global Camera Pose Optimization Approach of LiDAR-360Camera Fusion for Colorizing Point Clouds
Liu, Bonan, Zhao, Guoyang, Jiao, Jianhao, Cai, Guang, Li, Chengyang, Yin, Handi, Wang, Yuyang, Liu, Ming, Hui, Pan
Colored point clouds are a simple and efficient 3D representation with advantages in various fields, including robotic navigation and scene reconstruction. This representation is now commonly used in 3D reconstruction tasks relying on cameras and LiDARs. However, many existing frameworks fuse the data from these two types of sensors poorly, leading to unsatisfactory mapping results, mainly due to inaccurate camera poses. This paper presents OmniColor, a novel and efficient algorithm for colorizing point clouds using an independent 360-degree camera. Given a LiDAR-based point cloud and a sequence of panorama images with initial coarse camera poses, our objective is to jointly optimize the poses of all frames for mapping images onto geometric reconstructions. Our pipeline works in an off-the-shelf manner and does not require any feature extraction or matching process. Instead, we find optimal poses by directly maximizing the photometric consistency of LiDAR maps. In experiments, we show that our method can overcome the severe visual distortion of omnidirectional images and greatly benefits from the wide field of view (FOV) of 360-degree cameras, reconstructing various scenarios with accuracy and stability. The code will be released at https://github.com/liubonan123/OmniColor/.
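The sketch below shows, under simplifying assumptions, what "maximizing photometric consistency" can look like: project LiDAR points into each panorama with an equirectangular camera model and score how consistent the sampled intensities are across frames. The projection model, pose convention, and variance-based score are illustrative stand-ins, not the OmniColor implementation.

```python
import numpy as np

def equirect_project(pts_cam: np.ndarray, width: int, height: int) -> np.ndarray:
    """Map 3D points in the camera frame to equirectangular pixel coordinates."""
    x, y, z = pts_cam[:, 0], pts_cam[:, 1], pts_cam[:, 2]
    lon = np.arctan2(x, z)
    lat = np.arcsin(y / np.linalg.norm(pts_cam, axis=1))
    u = (lon / (2 * np.pi) + 0.5) * width
    v = (lat / np.pi + 0.5) * height
    return np.stack([u, v], axis=1)

def photometric_cost(pts_world, poses, images):
    """Variance of sampled intensities across frames; lower means more consistent."""
    h, w = images[0].shape
    per_frame = []
    for (R, t), img in zip(poses, images):
        uv = equirect_project((pts_world - t) @ R, w, h).astype(int)
        uv[:, 0] %= w
        uv[:, 1] = np.clip(uv[:, 1], 0, h - 1)
        per_frame.append(img[uv[:, 1], uv[:, 0]])
    return np.var(np.stack(per_frame), axis=0).mean()

# Toy usage: minimize this cost over the poses with any black-box optimizer.
rng = np.random.default_rng(0)
pts = rng.normal(size=(100, 3)) + np.array([0.0, 0.0, 5.0])
poses = [(np.eye(3), np.zeros(3)), (np.eye(3), np.array([0.1, 0.0, 0.0]))]
images = [rng.random((64, 128)) for _ in poses]
print(photometric_cost(pts, poses, images))
```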
Explainable AI for Embedded Systems Design: A Case Study of Static Redundant NVM Memory Write Prediction
Gamatié, Abdoulaye, Wang, Yuyang
This paper investigates the application of eXplainable Artificial Intelligence (XAI) in the design of embedded systems using machine learning (ML). As a case study, it addresses the challenging problem of static silent store prediction: identifying redundant memory writes based only on static program features. Eliminating such stores enhances performance and energy efficiency by reducing memory accesses and bus traffic, especially in the presence of emerging non-volatile memory technologies. To achieve this, we propose a methodology consisting of: 1) the development of relevant ML models for silent store prediction, and 2) the application of XAI to explain these models. We employ two state-of-the-art model-agnostic XAI methods to analyze the causes of silent stores. Through the case study, we evaluate the effectiveness of the methods. We find that these methods provide explanations for silent store predictions that are consistent with known causes of silent store occurrences from previous studies. For example, this allows us to confirm the prevalence of silent stores in operations that write the zero constant into memory, and the absence of silent stores in operations involving loop induction variables. This suggests the potential relevance of XAI for analyzing ML models' decisions in embedded system design. From the case study, we share valuable insights and pitfalls we encountered. More generally, this study aims to lay the groundwork for future research in the emerging field of XAI for embedded system design.
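As a hedged illustration of the two-step methodology (train an ML model on static features, then explain it with a model-agnostic XAI method), the sketch below uses scikit-learn's permutation importance as a stand-in explainer, since the abstract does not name the two methods used; the features and labels are invented placeholders.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
# Hypothetical static features: writes-zero-constant flag, loop-induction flag, other.
X = rng.integers(0, 2, size=(500, 3)).astype(float)
y = ((X[:, 0] == 1) & (X[:, 1] == 0)).astype(int)   # toy "silent store" label

clf = RandomForestClassifier(random_state=0).fit(X, y)
result = permutation_importance(clf, X, y, n_repeats=10, random_state=0)
for name, imp in zip(["writes_zero", "loop_induction", "other"], result.importances_mean):
    print(f"{name}: {imp:.3f}")
```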
PreDiff: Precipitation Nowcasting with Latent Diffusion Models
Gao, Zhihan, Shi, Xingjian, Han, Boran, Wang, Hao, Jin, Xiaoyong, Maddix, Danielle, Zhu, Yi, Li, Mu, Wang, Yuyang
Earth system forecasting has traditionally relied on complex physical models that are computationally expensive and require significant domain expertise. In the past decade, the unprecedented increase in spatiotemporal Earth observation data has enabled data-driven forecasting models using deep learning techniques. These models have shown promise for diverse Earth system forecasting tasks, but they either struggle with handling uncertainty or neglect domain-specific prior knowledge, resulting in blurred forecasts that average over possible futures or in physically implausible predictions. To address these limitations, we propose a two-stage pipeline for probabilistic spatiotemporal forecasting: 1) We develop PreDiff, a conditional latent diffusion model capable of probabilistic forecasts. 2) We incorporate an explicit knowledge alignment mechanism to align forecasts with domain-specific physical constraints. This is achieved by estimating the deviation from imposed constraints at each denoising step and adjusting the transition distribution accordingly. We conduct empirical studies on two datasets: N-body MNIST, a synthetic dataset with chaotic behavior, and SEVIR, a real-world precipitation nowcasting dataset. Specifically, we impose the law of conservation of energy in N-body MNIST and anticipated precipitation intensity in SEVIR. Experiments demonstrate the effectiveness of PreDiff in handling uncertainty, incorporating domain-specific prior knowledge, and generating forecasts that exhibit high operational utility.
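The PyTorch sketch below illustrates, schematically, one way such a knowledge-alignment step can work: estimate the violation of an imposed constraint from the model's denoised estimate and shift the step against the violation's gradient. The stand-in denoiser, the total-intensity constraint, and the guidance scale are assumptions for illustration, not PreDiff's exact mechanism.

```python
import torch

def constraint_violation(x0_hat: torch.Tensor, target_sum: float) -> torch.Tensor:
    """Squared deviation of total predicted intensity from the imposed value."""
    return (x0_hat.sum() - target_sum) ** 2

def aligned_step(x_t: torch.Tensor, denoiser, t: int, target_sum: float, scale: float = 0.1):
    """One denoising step nudged against the constraint-violation gradient."""
    x_t = x_t.detach().requires_grad_(True)
    x0_hat = denoiser(x_t, t)                     # model's estimate of the clean sample
    grad, = torch.autograd.grad(constraint_violation(x0_hat, target_sum), x_t)
    return (x0_hat - scale * grad).detach()       # shifted (stand-in) transition mean

# Toy usage with a stand-in denoiser on an 8x8 latent.
denoiser = lambda x, t: 0.9 * x
x = torch.randn(1, 1, 8, 8)
x = aligned_step(x, denoiser, t=10, target_sum=5.0)
```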
Deep Non-Parametric Time Series Forecaster
Rangapuram, Syama Sundar, Gasthaus, Jan, Stella, Lorenzo, Flunkert, Valentin, Salinas, David, Wang, Yuyang, Januschowski, Tim
This paper presents non-parametric baseline models for time series forecasting. Unlike classical forecasting models, the proposed approach does not assume any parametric form for the predictive distribution and instead generates predictions by sampling from the empirical distribution according to a tunable strategy. By virtue of this, the model is always able to produce reasonable forecasts (i.e., predictions within the observed data range), unlike classical models, which can suffer from numerical instability on some data distributions. Moreover, we develop a global version of the proposed method that automatically learns the sampling strategy by exploiting the information across multiple related time series. The empirical evaluation shows that the proposed methods have reasonable and consistent performance across all datasets, proving them to be strong baselines to be considered in one's forecasting toolbox.
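A minimal sketch of the non-parametric idea, assuming an exponential recency weighting as the "tunable strategy": forecasts are produced by resampling past observations, so predictions always stay within the observed data range. The kernel and its rate are illustrative, not the paper's learned global strategy.

```python
import numpy as np

def npts_forecast(history: np.ndarray, horizon: int, n_samples: int = 100, rate: float = 0.05):
    """Sample forecast paths from the weighted empirical distribution of the history."""
    weights = np.exp(rate * np.arange(len(history)))   # favor recent observations
    weights /= weights.sum()
    rng = np.random.default_rng(0)
    idx = rng.choice(len(history), size=(n_samples, horizon), p=weights)
    return history[idx]

samples = npts_forecast(np.sin(np.linspace(0, 10, 200)), horizon=24)
print(samples.mean(axis=0)[:5])   # point forecast from the sample mean
```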
Generating Molecular Conformer Fields
Wang, Yuyang, Elhag, Ahmed A., Jaitly, Navdeep, Susskind, Joshua M., Bautista, Miguel Angel
We tackle the problem of generating conformers of a molecule in 3D space given its molecular graph, a setting where brute-force approaches are virtually unfeasible for even moderately small molecules. We parameterize these conformers as continuous functions that map elements from the molecular graph to points in 3D space. We then formulate the problem of learning to generate conformers as learning a distribution over these functions using a diffusion generative model, called Molecular Conformer Fields (MCF). Our approach is simple and scalable, and achieves state-of-the-art performance on challenging molecular conformer generation benchmarks while making no assumptions about the explicit structure of molecules (e.g., modeling torsional angles).
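As a highly simplified illustration of diffusion over functions from graph elements to 3D points, the PyTorch sketch below runs one training step of a denoiser on noisy atom coordinates conditioned on node features. The network, noise schedule, and features are placeholder assumptions, far smaller than anything MCF would actually use.

```python
import torch
import torch.nn as nn

class ConformerDenoiser(nn.Module):
    """Predicts per-atom noise from node features, noisy coordinates, and the step."""
    def __init__(self, node_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(node_dim + 3 + 1, hidden),
                                 nn.SiLU(),
                                 nn.Linear(hidden, 3))

    def forward(self, node_feats, noisy_xyz, t):
        t_col = torch.full((noisy_xyz.shape[0], 1), float(t))
        return self.net(torch.cat([node_feats, noisy_xyz, t_col], dim=-1))

# One training step on a toy "molecule" with 5 atoms and 8-dim node features.
model = ConformerDenoiser(node_dim=8)
feats, xyz = torch.randn(5, 8), torch.randn(5, 3)
alpha_bar = 0.5                                   # stand-in cumulative noise level
eps = torch.randn_like(xyz)
noisy = alpha_bar**0.5 * xyz + (1 - alpha_bar)**0.5 * eps
loss = ((model(feats, noisy, t=10) - eps) ** 2).mean()
loss.backward()
```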
Predict, Refine, Synthesize: Self-Guiding Diffusion Models for Probabilistic Time Series Forecasting
Kollovieh, Marcel, Ansari, Abdul Fatir, Bohlke-Schneider, Michael, Zschiegner, Jasper, Wang, Hao, Wang, Yuyang
Diffusion models have achieved state-of-the-art performance in generative modeling tasks across various domains. Prior works on time series diffusion models have primarily focused on developing conditional models tailored to specific forecasting or imputation tasks. In this work, we explore the potential of task-agnostic, unconditional diffusion models for several time series applications. We propose TSDiff, an unconditionally-trained diffusion model for time series. Our proposed self-guidance mechanism enables conditioning TSDiff for downstream tasks during inference, without requiring auxiliary networks or altering the training procedure. We demonstrate the effectiveness of our method on three different time series tasks: forecasting, refinement, and synthetic data generation. First, we show that TSDiff is competitive with several task-specific conditional forecasting methods (predict). Second, we leverage the learned implicit probability density of TSDiff to iteratively refine the predictions of base forecasters with reduced computational overhead over reverse diffusion (refine). Notably, the generative performance of the model remains intact -- downstream forecasters trained on synthetic samples from TSDiff outperform forecasters that are trained on samples from other state-of-the-art generative time series models, occasionally even outperforming models trained on real data (synthesize).
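The sketch below gives one concrete (and simplified) reading of inference-time self-guidance for forecasting: at each denoising step, penalize mismatch between the denoised estimate and the observed context and follow the gradient, with no auxiliary network. The stand-in denoiser, mask convention, and guidance scale are assumptions, not TSDiff's exact procedure.

```python
import torch

def guided_step(x_t, denoiser, t, observed, mask, scale=0.1):
    """One reverse-diffusion step steered toward the observed context."""
    x_t = x_t.detach().requires_grad_(True)
    x0_hat = denoiser(x_t, t)                           # estimate of the clean series
    penalty = ((x0_hat - observed)[mask] ** 2).sum()    # mismatch on observed points
    grad, = torch.autograd.grad(penalty, x_t)
    return (x0_hat - scale * grad).detach()

# Toy usage: first half of a length-48 window is observed context.
denoiser = lambda x, t: 0.9 * x                         # stand-in unconditional denoiser
observed = torch.randn(48)
mask = torch.zeros(48, dtype=torch.bool)
mask[:24] = True
x = guided_step(torch.randn(48), denoiser, t=10, observed=observed, mask=mask)
```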
Backpropagation through Back Substitution with a Backslash
Edelman, Alan, Akyurek, Ekin, Wang, Yuyang
We present a linear algebra formulation of backpropagation which allows the calculation of gradients by using a generically written ``backslash'' or Gaussian elimination on triangular systems of equations. Generally, the matrix elements are operators. This paper has three contributions: (i) it is of intellectual value to replace traditional treatments of automatic differentiation with a (left-acting) operator-theoretic, graph-based approach; (ii) as an implementation option, operators can readily be placed in matrices in software, in programming languages such as Julia; (iii) we introduce a novel notation, the ``transpose dot'' operator ``$\{\}^{T_\bullet}$'', that allows for the reversal of operators. We further demonstrate the elegance of the operator approach in a programming language with generic linear algebra operators, such as Julia \cite{bezanson2017julia}, and show that it is possible to realize this abstraction in code. Our implementation demonstrates how generic linear algebra can allow operators as elements of matrices. In contrast to ``operator overloading,'' where backslash would normally have to be rewritten to take advantage of operators, with ``generic programming'' there is no such need.
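The toy numpy/scipy example below illustrates the central observation for a two-step linear chain: collecting the states into a block lower-triangular system makes the forward pass a forward substitution, and the gradient computation a single triangular solve ("backslash") on the transposed system, matching the chain rule. Sizes and matrices are arbitrary toy choices; the paper itself works with operator-valued entries.

```python
import numpy as np
from scipy.linalg import solve_triangular

rng = np.random.default_rng(0)
n = 3
A1, A2 = rng.normal(size=(n, n)), rng.normal(size=(n, n))
x0, c = rng.normal(size=n), rng.normal(size=n)          # input and loss seed, l = c^T x2

# Stack the chain x1 = A1 x0, x2 = A2 x1 into one block lower-triangular system T s = b.
T = np.block([[np.eye(n), np.zeros((n, n))],
              [-A2,       np.eye(n)]])
b = np.concatenate([A1 @ x0, np.zeros(n)])
s = solve_triangular(T, b, lower=True)                  # forward pass by forward substitution
assert np.allclose(s[n:], A2 @ A1 @ x0)

# Adjoints g = (dl/dx1, dl/dx2) come from one "backslash" on the transposed system.
e = np.concatenate([np.zeros(n), c])
g = solve_triangular(T.T, e, lower=False)               # backpropagation = back substitution
assert np.allclose(g[:n], A2.T @ c)                     # agrees with the chain rule
```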
Theoretical Guarantees of Learning Ensembling Strategies with Applications to Time Series Forecasting
Hasson, Hilaf, Maddix, Danielle C., Wang, Yuyang, Gupta, Gaurav, Park, Youngsuk
Ensembling is among the most popular tools in machine learning (ML) due to its effectiveness in minimizing variance and thus improving generalization. Most ensembling methods for black-box base learners fall under the umbrella of "stacked generalization," namely training an ML algorithm that takes the inferences from the base learners as input. While stacking has been widely applied in practice, its theoretical properties are poorly understood. In this paper, we prove a novel result, showing that choosing the best stacked generalization from a (finite or finite-dimensional) family of stacked generalizations based on cross-validated performance does not perform "much worse" than the oracle best. Our result strengthens and significantly extends the results in Van der Laan et al. (2007). Inspired by the theoretical analysis, we further propose a particular family of stacked generalizations in the context of probabilistic forecasting, each one with a different sensitivity for how much the ensemble weights are allowed to vary across items, timestamps in the forecast horizon, and quantiles. Experimental results demonstrate the performance gain of the proposed method.
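The sketch below illustrates the selection principle on a toy family: each candidate stacked generalization is a ridge-regularized least-squares blend of two base forecasters, indexed by a regularization strength that loosely controls how freely the ensemble weights can vary, and the member with the best cross-validated loss is chosen. The family, loss, and data are placeholder assumptions, not the paper's construction.

```python
import numpy as np
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)
y = rng.normal(size=200)                                      # toy target series
base_preds = np.stack([y + rng.normal(scale=0.5, size=200),   # base forecaster 1
                       y + rng.normal(scale=1.0, size=200)])  # base forecaster 2

def fit_weights(train_idx, lam):
    """Ridge-regularized least-squares stacking weights on a training fold."""
    P = base_preds[:, train_idx].T
    return np.linalg.solve(P.T @ P + lam * np.eye(2), P.T @ y[train_idx])

def cv_loss(lam):
    losses = []
    for tr, val in KFold(n_splits=5).split(y):
        w = fit_weights(tr, lam)
        losses.append(np.mean((w @ base_preds[:, val] - y[val]) ** 2))
    return np.mean(losses)

# Pick the family member with the best cross-validated performance.
best_lam = min([0.01, 0.1, 1.0, 10.0], key=cv_loss)
print("selected regularization:", best_lam)
```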