Goto

Collaborating Authors

 Regression


Machine Learning-Enhanced Prediction of Surface Smoothness for Inertial Confinement Fusion Target Polishing Using Limited Data

arXiv.org Artificial Intelligence

In Inertial Confinement Fusion (ICF) process, roughly a 2mm spherical shell made of high density carbon is used as target for laser beams, which compress and heat it to energy levels needed for high fusion yield. These shells are polished meticulously to meet the standards for a fusion shot. However, the polishing of these shells involves multiple stages, with each stage taking several hours. To make sure that the polishing process is advancing in the right direction, we are able to measure the shell surface roughness. This measurement, however, is very labor-intensive, time-consuming, and requires a human operator. We propose to use machine learning models that can predict surface roughness based on the data collected from a vibration sensor that is connected to the polisher. Such models can generate surface roughness of the shells in real-time, allowing the operator to make any necessary changes to the polishing for optimal result.


Interpretable Causal Inference for Analyzing Wearable, Sensor, and Distributional Data

arXiv.org Artificial Intelligence

Many modern causal questions ask how treatments affect complex outcomes that are measured using wearable devices and sensors. Current analysis approaches require summarizing these data into scalar statistics (e.g., the mean), but these summaries can be misleading. For example, disparate distributions can have the same means, variances, and other statistics. Researchers can overcome the loss of information by instead representing the data as distributions. We develop an interpretable method for distributional data analysis that ensures trustworthy and robust decision-making: Analyzing Distributional Data via Matching After Learning to Stretch (ADD MALTS). We (i) provide analytical guarantees of the correctness of our estimation strategy, (ii) demonstrate via simulation that ADD MALTS outperforms other distributional data analysis methods at estimating treatment effects, and (iii) illustrate ADD MALTS' ability to verify whether there is enough cohesion between treatment and control units within subpopulations to trustworthily estimate treatment effects. We demonstrate ADD MALTS' utility by studying the effectiveness of continuous glucose monitors in mitigating diabetes risks.


What's Next? Predicting Hamiltonian Dynamics from Discrete Observations of a Vector Field

arXiv.org Artificial Intelligence

Each method is either informed or uninformed of the Hamiltonian property. We empirically and comparatively evaluate the methods and observe that information that the system is Hamiltonian can be effectively informed, and that different methods strike different trade-offs between efficiency and effectiveness for different dynamical systems.


Hypothesis Testing for Class-Conditional Noise Using Local Maximum Likelihood

arXiv.org Machine Learning

In supervised learning, automatically assessing the quality of the labels before any learning takes place remains an open research question. In certain particular cases, hypothesis testing procedures have been proposed to assess whether a given instance-label dataset is contaminated with class-conditional label noise, as opposed to uniform label noise. The existing theory builds on the asymptotic properties of the Maximum Likelihood Estimate for parametric logistic regression. However, the parametric assumptions on top of which these approaches are constructed are often too strong and unrealistic in practice. To alleviate this problem, in this paper we propose an alternative path by showing how similar procedures can be followed when the underlying model is a product of Local Maximum Likelihood Estimation that leads to more flexible nonparametric logistic regression models, which in turn are less susceptible to model misspecification. This different view allows for wider applicability of the tests by offering users access to a richer model class. Similarly to existing works, we assume we have access to anchor points which are provided by the users. We introduce the necessary ingredients for the adaptation of the hypothesis tests to the case of nonparametric logistic regression and empirically compare against the parametric approach presenting both synthetic and real-world case studies and discussing the advantages and limitations of the proposed approach.


Targeted Machine Learning for Average Causal Effect Estimation Using the Front-Door Functional

arXiv.org Machine Learning

Evaluating the average causal effect (ACE) of a treatment on an outcome often involves overcoming the challenges posed by confounding factors in observational studies. A traditional approach uses the back-door criterion, seeking adjustment sets to block confounding paths between treatment and outcome. However, this method struggles with unmeasured confounders. As an alternative, the front-door criterion offers a solution, even in the presence of unmeasured confounders between treatment and outcome. This method relies on identifying mediators that are not directly affected by these confounders and that completely mediate the treatment's effect. Here, we introduce novel estimation strategies for the front-door criterion based on the targeted minimum loss-based estimation theory. Our estimators work across diverse scenarios, handling binary, continuous, and multivariate mediators. They leverage data-adaptive machine learning algorithms, minimizing assumptions and ensuring key statistical properties like asymptotic linearity, double-robustness, efficiency, and valid estimates within the target parameter space. We establish conditions under which the nuisance functional estimations ensure the root n-consistency of ACE estimators. Our numerical experiments show the favorable finite sample performance of the proposed estimators. We demonstrate the applicability of these estimators to analyze the effect of early stage academic performance on future yearly income using data from the Finnish Social Science Data Archive.


Distributed Learning of Mixtures of Experts

arXiv.org Machine Learning

In modern machine learning problems one has to deal with datasets that are not centralized. This may be related to the application context in which the data can be by nature available at different locations and not accessible in a centralized mode, or distributed for computational issues in case of a large amount of data. Indeed, even if the dataset is fully available in a centralized mode, implementing reasonable learning algorithms may be computationally demanding in case of a large number of examples. The construction of distributed techniques in a Federated Learning setting Yang et al. (2019) in which the model is trained collaboratively under the orchestration of a central server, while keeping the data decentralized, is an increasing area of research. The most attractive strategy is to perform standard inference on local machines to obtain local estimators, then transmits them to a central machine where they are aggregated to produce an overall estimator, while attempting to satisfy some statistical guarantees criteria. There are many successful attempts in this direction of parallelizing the existing learning algorithms and statistical methods. Those that may be mentioned here include, among others, parallelizing stochastic gradient descent (Zinkevich et al., 2010), multiple linear regression (Mingxian et al., 1991), parallel K-means in clustering based on MapReduce (Zhao et al., 2009), distributed learning for heterogeneous data via model integration (Merugu and Ghosh, 2005), split-and-conquer approach for penalized regressions (Chen and ge Xie, 2014), for logistic regression (Shofiyah and Sofro, 2018), for k-clustering with heavy noise Li and Guo (2018). It is only very recently that a distributed learning approach has been proposed for mixture distributions, specifically for finite Gaussian mixtures (Zhang and Chen, 2022a). In this paper we focus on mixtures of experts (MoE) models (Jacobs et al., 1991; Jordan and Xu, 1995) which extend the standard unconditional mixture distributions that are typically used for clustering purposes, to model complex non-linear relationships of a response Y conditionally on some predictors X, for prediction purposes, while enjoying denseness results, e.g.


The irruption of cryptocurrencies into Twitter cashtags: a classifying solution

arXiv.org Artificial Intelligence

There is a consensus about the good sensing characteristics of Twitter to mine and uncover knowledge in financial markets, being considered a relevant feeder for taking decisions about buying or holding stock shares and even for detecting stock manipulation. Although Twitter hashtags allow to aggregate topic-related content, a specific mechanism for financial information also exists: Cashtag. However, the irruption of cryptocurrencies has resulted in a significant degradation on the cashtag-based aggregation of posts. Unfortunately, Twitter' users may use homonym tickers to refer to cryptocurrencies and to companies in stock markets, which means that filtering by cashtag may result on both posts referring to stock companies and cryptocurrencies. This research proposes automated classifiers to distinguish conflicting cashtags and, so, their container tweets by analyzing the distinctive features of tweets referring to stock companies and cryptocurrencies. As experiment, this paper analyses the interference between cryptocurrencies and company tickers in the London Stock Exchange (LSE), specifically, companies in the main and alternative market indices FTSE-100 and AIM-100. Heuristic-based as well as supervised classifiers are proposed and their advantages and drawbacks, including their ability to self-adapt to Twitter usage changes, are discussed. The experiment confirms a significant distortion in collected data when colliding or homonym cashtags exist, i.e., the same \$ acronym to refer to company tickers and cryptocurrencies. According to our results, the distinctive features of posts including cryptocurrencies or company tickers support accurate classification of colliding tweets (homonym cashtags) and Independent Models, as the most detached classifiers from training data, have the potential to be trans-applicability (in different stock markets) while retaining performance.


Prediction of rare events in the operation of household equipment using co-evolving time series

arXiv.org Artificial Intelligence

In this study, we propose an approach for predicting rare events by exploiting time series in coevolution. Our approach involves a weighted autologistic regression model, where we leverage the temporal behavior of the data to enhance predictive capabilities. By addressing the issue of imbalanced datasets, we establish constraints leading to weight estimation and to improved performance. Evaluation on synthetic and real-world datasets confirms that our approach outperform state-of-the-art of predicting home equipment failure methods.


Modeling and Predicting Epidemic Spread: A Gaussian Process Regression Approach

arXiv.org Machine Learning

Modeling and prediction of epidemic spread are critical to assist in policy-making for mitigation. Therefore, we present a new method based on Gaussian Process Regression to model and predict epidemics, and it quantifies prediction confidence through variance and high probability error bounds. Gaussian Process Regression excels in using small datasets and providing uncertainty bounds, and both of these properties are critical in modeling and predicting epidemic spreading processes with limited data. However, the derivation of formal uncertainty bounds remains lacking when using Gaussian Process Regression in the setting of epidemics, which limits its usefulness in guiding mitigation efforts. Therefore, in this work, we develop a novel bound on the variance of the prediction that quantifies the impact of the epidemic data on the predictions we make. Further, we develop a high probability error bound on the prediction, and we quantify how the epidemic spread, the infection data, and the length of the prediction horizon all affect this error bound. We also show that the error stays below a certain threshold based on the length of the prediction horizon. To illustrate this framework, we leverage Gaussian Process Regression to model and predict COVID-19 using real-world infection data from the United Kingdom.


Big Data - Supply Chain Management Framework for Forecasting: Data Preprocessing and Machine Learning Techniques

arXiv.org Machine Learning

This article intends to systematically identify and comparatively analyze state-of-the-art supply chain (SC) forecasting strategies and technologies. A novel framework has been proposed incorporating Big Data Analytics in SC Management (problem identification, data sources, exploratory data analysis, machine-learning model training, hyperparameter tuning, performance evaluation, and optimization), forecasting effects on human-workforce, inventory, and overall SC. Initially, the need to collect data according to SC strategy and how to collect them has been discussed. The article discusses the need for different types of forecasting according to the period or SC objective. The SC KPIs and the error-measurement systems have been recommended to optimize the top-performing model. The adverse effects of phantom inventory on forecasting and the dependence of managerial decisions on the SC KPIs for determining model performance parameters and improving operations management, transparency, and planning efficiency have been illustrated. The cyclic connection within the framework introduces preprocessing optimization based on the post-process KPIs, optimizing the overall control process (inventory management, workforce determination, cost, production and capacity planning). The contribution of this research lies in the standard SC process framework proposal, recommended forecasting data analysis, forecasting effects on SC performance, machine learning algorithms optimization followed, and in shedding light on future research.