AITopics

2302.01536

Country:

Europe > Austria > Vienna (0.14)
North America > United States > North Carolina > Durham County > Durham (0.05)
North America > United States > New York (0.04)
(3 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Government > Regional Government > North America Government > United States Government (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.34)

arXiv.org Artificial Intelligence, Feb-2-2023

Personalized Understanding of Blood Glucose Dynamics via Mobile Sensor Data

Royston, Sam

Continuous glucose monitors (CGMs) have revolutionized the ability of diabetics to manage their blood glucose, and paved the way for artificial pancreas systems. In this paper we augment CGM data with sensor input collected by a smartphone and use it to provide analytical tools for patients and clinicians. We collected GPS data, activity classifications, and blood glucose data with a custom iOS application over a 9-month period from a single free-living type-1 diabetic patient. This data set is novel in terms of its size, the inclusion of GPS data, and the fact that it was collected non-intrusively from a free-living patient. We describe a method to measure the occurrence of lifestyle events based on GPS and activity data, and show that they can capture instances of food consumption and are therefore correlated with changes in blood glucose. Finally, we incorporate these event representations into our system to create useful visualizations and notifications to aid patients in managing their diabetes.

application, artificial intelligence, machine learning, (17 more...)

2302.014

Country:

North America > United States > Ohio (0.04)
North America > United States > New York (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (1.00)

Technology:

Information Technology > Communications > Mobile (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.68)

Eshima, Shusei, Imai, Kosuke, Sasaki, Tomoya

Keyword Assisted Topic Models

arXiv.org Artificial Intelligence, Feb-2-2023

The unsupervised nature of probabilistic topic models makes them suitable for exploring topics in a corpus without prior knowledge. However, researchers find that these models often fail to measure specific concepts of substantive interest by inadvertently creating multiple topics with similar content and combining distinct themes into a single topic. In this paper, we empirically demonstrate that providing a small number of keywords can substantially enhance the measurement performance of topic models. An important advantage of the proposed keyword assisted topic model (keyATM) is that the specification of keywords requires researchers to label topics prior to fitting a model to the data. This contrasts with a widespread practice of post-hoc topic interpretation and adjustments that compromises the objectivity of empirical findings. In our application, we find that keyATM provides more interpretable results, has better document classification performance, and is less sensitive to the number of topics than the standard topic models. Finally, we show that keyATM can also incorporate covariates and model time trends. An open-source software package is available for implementing the proposed methodology. Verification Materials: The data and materials required to verify the computational reproducibility of the results, procedures and analyses in this article are available on the American Journal of Political Science Dataverse within the Harvard Dataverse Network, at: https://doi.org/10.7910/DVN/RKNNVL

artificial intelligence, machine learning, natural language, (20 more...)

2004.05964

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
Asia > Middle East > Jordan (0.04)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
(5 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Transportation > Ground (1.00)
Law > Statutes (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
(14 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.45)

arXiv.org Machine Learning, Feb-2-2023

Hypothesis Testing and Machine Learning: Interpreting Variable Effects in Deep Artificial Neural Networks using Cohen's f2

Messner, Wolfgang

Deep artificial neural networks show high predictive performance in many fields, but they do not afford statistical inferences and their black-box operations are too complicated for humans to comprehend. Because positing that a relationship exists is often more important than prediction in scientific experiments and research models, machine learning is far less frequently used than inferential statistics. Additionally, statistics calls for improving the test of theory by showing the magnitude of the phenomena being studied. This article extends current XAI methods and develops a model-agnostic hypothesis testing framework for machine learning. First, Fisher's variable permutation algorithm is tweaked to compute an effect size measure equivalent to Cohen's f2 for OLS regression models. Second, the Mann-Kendall test of monotonicity and the Theil-Sen estimator are applied to Apley's accumulated local effect plots to specify a variable's direction of influence and statistical significance. The usefulness of this approach is demonstrated on an artificial data set and a social survey with a Python sandbox implementation.
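As a rough illustration of the permutation idea in this abstract, the sketch below computes a Cohen's f2-style effect size by shuffling one input column and comparing R² before and after. It assumes the textbook local form f2 = (R²_full - R²_permuted) / (1 - R²_full). The function names, the toy data, and the stand-in predict function are hypothetical and not taken from the paper, whose exact algorithm may differ.

```python
import random


def r_squared(y, y_hat):
    """Coefficient of determination of predictions y_hat against targets y."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot


def permutation_f2(predict, X, y, column, rng):
    """Effect size of one column: the R^2 lost when that column is shuffled,
    scaled by the variance left unexplained by the intact model (assumed form)."""
    r2_full = r_squared(y, [predict(row) for row in X])
    shuffled = [row[column] for row in X]
    rng.shuffle(shuffled)
    X_perm = [row[:column] + [v] + row[column + 1:] for row, v in zip(X, shuffled)]
    r2_perm = r_squared(y, [predict(row) for row in X_perm])
    return (r2_full - r2_perm) / (1.0 - r2_full)


# Toy data: y depends on column 0 only; column 1 is pure noise.
rng = random.Random(0)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(200)]
y = [3.0 * row[0] + rng.gauss(0, 1) for row in X]


def predict(row):
    # Hypothetical stand-in for any fitted model (the measure is model-agnostic).
    return 3.0 * row[0]


f2_x0 = permutation_f2(predict, X, y, 0, rng)  # large: the model relies on column 0
f2_x1 = permutation_f2(predict, X, y, 1, rng)  # zero: the model ignores column 1
print(f2_x0, f2_x1)
```

Because only the predict callable is needed, the same helper could in principle be pointed at a neural network or any other fitted model.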

artificial intelligence, machine learning, well-being, (14 more...)


doi: 10.1016/j.asoc.2023.110729

2302.01407

Country:

North America > United States > South Carolina (0.28)
Europe > United Kingdom > England (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine (1.00)
Education (0.68)
Transportation > Air (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.88)

#artificialintelligence, Feb-1-2023, 13:46:30 GMT

Assumption of linear regression. Linear regression is a widely used…

Linearity: This assumption states that there is a linear relationship between the independent and dependent variables. On a scatter plot the relationship should look approximately like a straight line, and the residuals (the differences between the actual and predicted values) should be randomly dispersed around zero. If this assumption is violated, the regression results may be misleading and the model will not generalize well to new data. Independence: The observations in the data should be independent of each other, meaning that the value of one observation should not affect the value of another observation. If the observations are dependent, the estimated standard errors of the regression coefficients will be biased and inference based on them will not be valid.
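Both assumptions can be probed from the residuals of a fitted line. The sketch below is a minimal illustration with made-up data: it fits a straight line to a deliberately curved relationship and computes the Durbin-Watson statistic, which is near 2 for uncorrelated residuals and falls toward 0 when neighbouring residuals move together. The helper names are hypothetical, and reading a low value as a warning sign assumes the observations are ordered by x.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def residuals(x, y, slope, intercept):
    """Actual minus predicted values."""
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def durbin_watson(res):
    """Sum of squared successive differences divided by the sum of squares."""
    num = sum((res[i] - res[i - 1]) ** 2 for i in range(1, len(res)))
    den = sum(e * e for e in res)
    return num / den


# A quadratic relationship that a straight line cannot capture.
x = [float(i) for i in range(10)]
y = [xi ** 2 for xi in x]

slope, intercept = fit_line(x, y)
res = residuals(x, y, slope, intercept)
dw = durbin_watson(res)

# Residuals average to zero by construction, but they are positive at both ends
# and negative in the middle, so they are not randomly dispersed.
print(slope, intercept, dw)
```

A residual plot conveys the same information visually; the statistic is simply a compact numeric summary of the pattern.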

artificial intelligence, assumption, machine learning, (14 more...)


Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.83)

Soft Sensing Regression Model: from Sensor to Wafer Metrology Forecasting

Fan, Angzhi, Huang, Yu, Xu, Fei, Bom, Sthitie

The semiconductor industry is one of the most rapidly evolving and capital-intensive market sectors. Effective inspection and metrology are necessary to improve product yield, increase product quality and reduce costs. In recent years, much semiconductor manufacturing equipment has been fitted with sensors to facilitate real-time monitoring of the production process. These production-state and equipment-state sensor data provide an opportunity to apply machine-learning technologies in various domains, such as anomaly/fault detection, maintenance scheduling, and quality prediction. In this work, we focus on the task of soft sensing regression, which uses sensor data to predict impending inspection measurements that were previously obtained from wafer inspection and metrology systems. We propose an LSTM-based regressor and design two loss functions for model training. Although engineers may judge our prediction errors subjectively, we also propose a new piece-wise evaluation metric for assessing model accuracy mathematically. The experimental results demonstrate that the proposed model can achieve accurate and early prediction of various types of inspections in complicated manufacturing processes.

artificial intelligence, machine learning, soft sensing regression model, (2 more...)

doi: 10.3390/s23208363

2301.08974

Genre: Research Report (0.40)

Industry:

Semiconductors & Electronics (0.93)
Information Technology > Hardware (0.53)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.40)

Hierarchical shrinkage Gaussian processes: applications to computer code emulation and dynamical system recovery

Tang, Tao, Mak, Simon, Dunson, David

In many areas of science and engineering, computer simulations are widely used as proxies for physical experiments, which can be infeasible or unethical. Such simulations can often be computationally expensive, and an emulator can be trained to efficiently predict the desired response surface. A widely used emulator is the Gaussian process (GP), which provides a flexible framework for efficient prediction and uncertainty quantification. Standard GPs, however, do not capture structured sparsity on the underlying response surface, which is present in many applications, particularly in the physical sciences. We thus propose a new hierarchical shrinkage GP (HierGP), which incorporates such structure via cumulative shrinkage priors within a GP framework. We show that the HierGP implicitly embeds the well-known principles of effect sparsity, heredity and hierarchy for analysis of experiments, which allows our model to identify structured sparse features from the response surface with limited data. We propose efficient posterior sampling algorithms for model training and prediction, and prove desirable consistency properties for the HierGP. Finally, we demonstrate the improved performance of HierGP over existing models in a suite of numerical experiments and an application to dynamical system recovery.

artificial intelligence, hiergp, machine learning, (16 more...)

2302.00755

Country: North America > United States (0.14)

Genre:

Research Report (0.81)
Overview (0.67)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Data Science (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
(2 more...)

Additive Higher-Order Factorization Machines

Rügamer, David

In the age of big data and interpretable machine learning, approaches need to work at scale and at the same time allow for a clear mathematical understanding of the method's inner workings. While there exist inherently interpretable semi-parametric regression techniques for large-scale applications to account for non-linearity in the data, their model complexity is still often restricted. One of the main limitations is the absence of interactions in these models, which are left out for the sake of better interpretability, but also due to untenable computational costs. To address this shortcoming, we derive a scalable high-order tensor product spline model using a factorization approach. Our method allows the inclusion of all (higher-order) interactions of non-linear feature effects while having computational costs proportional to a model without interactions. We prove both theoretically and empirically that our method scales notably better than existing approaches, derive meaningful penalization schemes and also discuss further theoretical aspects. We finally investigate predictive and estimation performance with both synthetic and real data.
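To give a feel for how factorization can keep interaction costs close to those of a model without interactions, the sketch below shows the classical second-order factorization machine identity, which evaluates all pairwise interaction terms in time linear in the number of features. This is the standard textbook trick only, not the higher-order tensor product spline model of the paper; the function names and numbers are made up for illustration.

```python
def pairwise_naive(x, V):
    """Sum over i < j of <V[i], V[j]> * x[i] * x[j], costing O(d^2 * k)."""
    d = len(x)
    total = 0.0
    for i in range(d):
        for j in range(i + 1, d):
            dot = sum(a * b for a, b in zip(V[i], V[j]))
            total += dot * x[i] * x[j]
    return total


def pairwise_factorized(x, V):
    """Same quantity in O(d * k), using for each latent dimension f:
    0.5 * ((sum_i V[i][f] * x[i])^2 - sum_i (V[i][f] * x[i])^2)."""
    k = len(V[0])
    total = 0.0
    for f in range(k):
        terms = [V[i][f] * x[i] for i in range(len(x))]
        s = sum(terms)
        s_sq = sum(t * t for t in terms)
        total += 0.5 * (s * s - s_sq)
    return total


# Three features, each with a 2-dimensional latent vector (illustrative values).
x = [1.0, 2.0, 3.0]
V = [[1.0, 0.5], [0.0, 1.0], [2.0, -1.0]]

print(pairwise_naive(x, V), pairwise_factorized(x, V))  # both evaluate to -0.5
```

The interaction weight for each feature pair is never stored explicitly; it is implied by the inner product of the two latent vectors, which is what makes the linear-time rearrangement possible.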

artificial intelligence, dimension, machine learning, (18 more...)

2205.14515

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Asia > Singapore (0.04)
Asia > Middle East > Israel (0.04)

Genre: Research Report > New Finding (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.66)

Ferdous, Syeda Nyma, Li, Xin, Sahoo, Kamalakanta, Bergman, Richard

Analysis of Biomass Sustainability Indicators from a Machine Learning Perspective

Plant biomass estimation is critical due to the variability of different environmental factors and crop management practices associated with it. The assessment is largely impacted by the accurate prediction of different environmental sustainability indicators. A robust model to predict sustainability indicators is a must for the biomass community. This study proposes a robust model for biomass sustainability prediction by analyzing sustainability indicators using machine learning models. The prospect of ensemble learning was also investigated to analyze the regression problem. All experiments were carried out on crop residue data from the state of Ohio. Ten machine learning models, namely, linear regression, ridge regression, multilayer perceptron, k-nearest neighbors, support vector machine, decision tree, gradient boosting, random forest, stacking and voting, were analyzed to estimate three biomass sustainability indicators, namely soil erosion factor, soil conditioning index, and organic matter factor. The performance of the models was assessed using cross-correlation (R²), root mean squared error and mean absolute error metrics. The results showed that random forest was the best performing model to assess sustainability indicators. The analyzed model can now serve as a guide for assessing sustainability indicators in real time.
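For reference, the three evaluation metrics named in this abstract can be computed as in the sketch below. It assumes that R2 denotes the usual coefficient of determination, which the abstract does not spell out; the function names and the four data points are invented for illustration and are unrelated to the Ohio data set.

```python
import math


def mean_absolute_error(y, y_hat):
    """Average absolute difference between targets and predictions."""
    return sum(abs(a - b) for a, b in zip(y, y_hat)) / len(y)


def root_mean_squared_error(y, y_hat):
    """Square root of the average squared difference."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y))


def r2_score(y, y_hat):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot


# Made-up targets and predictions.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.5, 4.0]

mae = mean_absolute_error(y_true, y_pred)       # 0.25
rmse = root_mean_squared_error(y_true, y_pred)  # sqrt(0.125)
r2 = r2_score(y_true, y_pred)                   # 0.9
print(mae, rmse, r2)
```

RMSE penalizes large errors more heavily than MAE, so comparing the two gives a quick hint about whether a model's errors are dominated by a few large misses.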

artificial intelligence, machine learning, sustainability indicator, (16 more...)

2302.00828

Country:

North America > United States > Ohio (0.25)
North America > Canada > Ontario > Toronto (0.14)
North America > United States > Wisconsin > Dane County > Madison (0.04)
(2 more...)

Genre: Research Report > New Finding (0.87)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Food & Agriculture > Agriculture (1.00)
Energy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.56)

arXiv.org Machine Learning, Jan-31-2023

Adaptive sparseness for correntropy-based robust regression via automatic relevance determination

Li, Yuanhao, Chen, Badong, Yamashita, Okito, Yoshimura, Natsue, Koike, Yasuharu

Sparseness and robustness are two important properties for many machine learning scenarios. In the present study, regarding the maximum correntropy criterion (MCC) based robust regression algorithm, we investigate integrating the MCC method with the automatic relevance determination (ARD) technique in a Bayesian framework, so that MCC-based robust regression can be implemented with adaptive sparseness. To be specific, we use an inherent noise assumption from the MCC to derive an explicit likelihood function, and realize the maximum a posteriori (MAP) estimation with the ARD prior by variational Bayesian inference. Compared to the existing robust and sparse L1-regularized MCC regression, the proposed MCC-ARD regression eliminates the troublesome tuning of the regularization hyper-parameter which controls the regularization strength. Further, MCC-ARD achieves better prediction performance and feature selection capability than L1-regularized MCC, as demonstrated by a noisy and high-dimensional simulation study.
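For readers unfamiliar with the maximum correntropy criterion, the sketch below shows plain MCC regression for a single feature, solved with the common fixed-point scheme in which each observation is reweighted by a Gaussian kernel of its current residual. It contains no ARD prior and no variational inference, so it illustrates only the robust-loss ingredient and not the MCC-ARD method of the paper. The function names, kernel width, and data are assumptions made for illustration.

```python
import math


def weighted_line_fit(x, y, w):
    """Weighted least squares fit of y = intercept + slope * x."""
    total = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / total
    my = sum(wi * yi for wi, yi in zip(w, y)) / total
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def mcc_line_fit(x, y, sigma=1.0, iterations=20):
    """Fixed-point iteration for maximizing sum_i exp(-e_i^2 / (2 * sigma^2)).
    Setting the gradient to zero gives weighted normal equations whose weights
    are the kernel values themselves. A very small sigma can drive every weight
    to zero; this sketch does not guard against that."""
    w = [1.0] * len(x)  # the first pass is ordinary least squares
    for _ in range(iterations):
        slope, intercept = weighted_line_fit(x, y, w)
        w = [
            math.exp(-((yi - intercept - slope * xi) ** 2) / (2.0 * sigma ** 2))
            for xi, yi in zip(x, y)
        ]
    return weighted_line_fit(x, y, w)


# Points on the line y = 2x + 1, with one gross outlier at the end.
x = [float(i) for i in range(10)]
y = [2.0 * xi + 1.0 for xi in x]
y[9] = 100.0

ols_slope, ols_intercept = weighted_line_fit(x, y, [1.0] * len(x))
mcc_slope, mcc_intercept = mcc_line_fit(x, y)
print(ols_slope, ols_intercept)  # pulled far away from the true line by the outlier
print(mcc_slope, mcc_intercept)  # close to slope 2 and intercept 1
```

The kernel width sigma controls how quickly an observation loses influence as its residual grows, so it plays a role similar to the tuning constant in other robust regression losses.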

artificial intelligence, machine learning, regression, (15 more...)


doi: 10.1109/IJCNN54540.2023.10191293

2302.00082

Country:

Asia > China > Shaanxi Province > Xi'an (0.04)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
Asia > Japan > Honshū > Kantō > Kanagawa Prefecture > Yokohama (0.04)
Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)