nonlinear effect
Robust Hypothesis Test for Nonlinear Effect with Gaussian Processes
This work constructs a hypothesis test for detecting whether an data-generating function $h: \real^p \rightarrow \real$ belongs to a specific reproducing kernel Hilbert space $\mathcal{H}_0$, where the structure of $\mathcal{H}_0$ is only partially known. Utilizing the theory of reproducing kernels, we reduce this hypothesis to a simple one-sided score test for a scalar parameter, develop a testing procedure that is robust against the mis-specification of kernel functions, and also propose an ensemble-based estimator for the null model to guarantee test performance in small samples. To demonstrate the utility of the proposed method, we apply our test to the problem of detecting nonlinear interaction between groups of continuous features. We evaluate the finite-sample performance of our test under different data-generating functions and estimation strategies for the null model. Our results revealed interesting connection between notions in machine learning (model underfit/overfit) and those in statistical inference (i.e. Type I error/power of hypothesis test), and also highlighted unexpected consequences of common model estimating strategies (e.g.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > California > Los Angeles County > Long Beach (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Information Technology > Modeling & Simulation (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.47)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.46)
Bayesian Semi-Parametric Spatial Dispersed Count Model for Precipitation Analysis
Nadifar, Mahsa, Bekker, Andriette, Arashi, Mohammad, Ramoelo, Abel
The appropriateness of the Poisson model is frequently challenged when examining spatial count data marked by unbalanced distributions, over-dispersion, or under-dispersion. Moreover, traditional parametric models may inadequately capture the relationships among variables when covariates display ambiguous functional forms or when spatial patterns are intricate and indeterminate. To tackle these issues, we propose an innovative Bayesian hierarchical modeling system. This method combines non-parametric techniques with an adapted dispersed count model based on renewal theory, facilitating the effective management of unequal dispersion, non-linear correlations, and complex geographic dependencies in count data. We illustrate the efficacy of our strategy by applying it to lung and bronchus cancer mortality data from Iowa, emphasizing environmental and demographic factors like ozone concentrations, PM2.5, green space, and asthma prevalence. Our analysis demonstrates considerable regional heterogeneity and non-linear relationships, providing important insights into the impact of environmental and health-related factors on cancer death rates. This application highlights the significance of our methodology in public health research, where precise modeling and forecasting are essential for guiding policy and intervention efforts. Additionally, we performed a simulation study to assess the resilience and accuracy of the suggested method, validating its superiority in managing dispersion and capturing intricate spatial patterns relative to conventional methods. The suggested framework presents a flexible and robust instrument for geographical count analysis, offering innovative insights for academics and practitioners in disciplines such as epidemiology, environmental science, and spatial statistics.
- North America > United States > Iowa (0.25)
- North America > Canada > Alberta (0.14)
- Africa > South Africa > Gauteng > Pretoria (0.04)
- (2 more...)
- Health & Medicine > Therapeutic Area > Oncology (1.00)
- Health & Medicine > Public Health (1.00)
- Information Technology > Modeling & Simulation (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)
Reviews: Robust Hypothesis Test for Nonlinear Effect with Gaussian Processes
The paper proposes a statistical test for particular non-linear effects in a linear mixed model (LMM). The problem of testing non-linear effects is relevant, especially in the natural sciences. The experimental validation has its flaws, but may be considered acceptable for a conference paper. The method consists of multiple parts: 1) The main new idea introduced in the paper is to introduce a kernel parameter (garotte) that interpolates between a null model and the desired alternative model and to perform a score test on this parameter. This elegant new idea is combined with several established steps to obtain the final testing procedure: 2) Defining a score statistic and deriving an approximate null distribution for the statistic based on the Satterthwaite approximation.
Robust Hypothesis Test for Nonlinear Effect with Gaussian Processes
Utilizing the theory of reproducing kernels, we reduce this hypothesis to a simple one-sided score test for a scalar parameter, develop a testing procedure that is robust against the misspecification of kernel functions, and also propose an ensemble-based estimator for the null model to guarantee test performance in small samples. To demonstrate the utility of the proposed method, we apply our test to the problem of detecting nonlinear interaction between groups of continuous features. We evaluate the finite-sample performance of our test under different data-generating functions and estimation strategies for the null model. Our results reveal interesting connections between notions in machine learning (model underfit/overfit) and those in statistical inference (i.e. Type I error/power of hypothesis test), and also highlight unexpected consequences of common model estimating strategies (e.g.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > California > Los Angeles County > Long Beach (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.47)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.46)
Simple Full-Spectrum Correlated k-Distribution Model based on Multilayer Perceptron
Wang, Xin, Kuang, Yucheng, Wang, Chaojun, Di, Hongyuan, He, Boshu
While neural networks have been successfully applied to the full-spectrum k-distribution (FSCK) method at a large range of thermodynamics with k-values predicted by a trained multilayer perceptron (MLP) model, the required a-values still need to be calculated on-the-fly, which theoretically degrades the FSCK method and may lead to errors. On the other hand, too complicated structure of the current MLP model inevitably slows down the calculation efficiency. Therefore, to compensate among accuracy, efficiency and storage, the simple MLP designed based on the nature of FSCK method are developed, i.e., the simple FSCK MLP (SFM) model, from which those correlated k-values and corresponding ka-values can be efficiently obtained. Several test cases have been carried out to compare the developed SFM model and other FSCK tools including look-up tables and traditional FSCK MLP (TFM) model. Results show that the SFM model can achieve excellent accuracy that is even better than look-up tables at a tiny computational cost that is far less than that of TFM model. Considering accuracy, efficiency and portability, the SFM model is not only an excellent tool for the prediction of spectral properties, but also provides a method to reduce the errors due to nonlinear effects.
- Asia > China > Beijing > Beijing (0.05)
- North America > United States > New York (0.04)
- Europe > Switzerland (0.04)
An Association Test Based on Kernel-Based Neural Networks for Complex Genetic Association Analysis
Hou, Tingting, Jiang, Chang, Lu, Qing
The advent of artificial intelligence, especially the progress of deep neural networks, is expected to revolutionize genetic research and offer unprecedented potential to decode the complex relationships between genetic variants and disease phenotypes, which could mark a significant step toward improving our understanding of the disease etiology. While deep neural networks hold great promise for genetic association analysis, limited research has been focused on developing neural-network-based tests to dissect complex genotype-phenotype associations. This complexity arises from the opaque nature of neural networks and the absence of defined limiting distributions. We have previously developed a kernel-based neural network model (KNN) that synergizes the strengths of linear mixed models with conventional neural networks. KNN adopts a computationally efficient minimum norm quadratic unbiased estimator (MINQUE) algorithm and uses KNN structure to capture the complex relationship between large-scale sequencing data and a disease phenotype of interest. In the KNN framework, we introduce a MINQUE-based test to assess the joint association of genetic variants with the phenotype, which considers non-linear and non-additive effects and follows a mixture of chi-square distributions. We also construct two additional tests to evaluate and interpret linear and non-linear/non-additive genetic effects, including interaction effects. Our simulations show that our method consistently controls the type I error rate under various conditions and achieves greater power than a commonly used sequence kernel association test (SKAT), especially when involving non-linear and interaction effects. When applied to real data from the UK Biobank, our approach identified genes associated with hippocampal volume, which can be further replicated and evaluated for their role in the pathogenesis of Alzheimer's disease.
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.69)
Equalization in Dispersion-Managed Systems Using Learned Digital Back-Propagation
Abu-Romoh, Mohannad, Costa, Nelson, Jaouën, Yves, Napoli, Antonio, Pedro, João, Spinnler, Bernhard, Yousefi, Mansoor
In this paper, we investigate the use of the learned digital back-propagation (LDBP) for equalizing dual-polarization fiber-optic transmission in dispersion-managed (DM) links. LDBP is a deep neural network that optimizes the parameters of DBP using the stochastic gradient descent. We evaluate DBP and LDBP in a simulated WDM dual-polarization fiber transmission system operating at the bitrate of 256 Gbit/s per channel, with a dispersion map designed for a 2016 km link with 15% residual dispersion. Our results show that in single-channel transmission, LDBP achieves an effective signal-to-noise ratio improvement of 6.3 dB and 2.5 dB, respectively, over linear equalization and DBP. In WDM transmission, the corresponding $Q$-factor gains are 1.1 dB and 0.4 dB, respectively. Additionally, we conduct a complexity analysis, which reveals that a frequency-domain implementation of LDBP and DBP is more favorable in terms of complexity than the time-domain implementation. These findings demonstrate the effectiveness of LDBP in mitigating the nonlinear effects in DM fiber-optic transmission systems.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- Europe > France (0.04)
- (7 more...)
Examining spatial heterogeneity of ridesourcing demand determinants with explainable machine learning
Zhang, Xiaojian, Yan, Xiang, Zhou, Zhengze, Xu, Yiming, Zhao, Xilei
The growing significance of ridesourcing services in recent years suggests a need to examine the key determinants of ridesourcing demand. However, little is known regarding the nonlinear effects and spatial heterogeneity of ridesourcing demand determinants. This study applies an explainable-machine-learning-based analytical framework to identify the key factors that shape ridesourcing demand and to explore their nonlinear associations across various spatial contexts (airport, downtown, and neighborhood). We use the ridesourcing-trip data in Chicago for empirical analysis. The results reveal that the importance of built environment varies across spatial contexts, and it collectively contributes the largest importance in predicting ridesourcing demand for airport trips. Additionally, the nonlinear effects of built environment on ridesourcing demand show strong spatial variations. Ridesourcing demand is usually most responsive to the built environment changes for downtown trips, followed by neighborhood trips and airport trips. These findings offer transportation professionals nuanced insights for managing ridesourcing services.
- North America > United States > Illinois > Cook County > Chicago (0.25)
- North America > United States > California > San Francisco County > San Francisco (0.04)
- North America > United States > Florida > Alachua County > Gainesville (0.04)
- (6 more...)
- Transportation > Passenger (1.00)
- Transportation > Infrastructure & Services (1.00)
- Transportation > Ground > Road (1.00)
- Government (0.93)
Neural network quantifies P-integral in solids - Verve times
Researchers have recently proposed the concept of ionization integral (P-int) to quantify the strength of plasma effect. This is the first time that nonlinear effects have been quantified experimentally. Relevant results were published in Physical Review Research. By combining with the widely used B integral (B-int), the two main nonlinear effects in the spectral broadening process: Kerr effect and ionization effect, were investigated numerically and experimentally by Dr. Gao Yitan at the Institute of Physics of the Chinese Academy of Sciences, under the joint guidance of Dr. Zhao Kun and Prof. Wei Zhiyi. Observing the response of the laser field in the interaction between ultra-fast laser and matter is a powerful way to explore the ultra-fast physical mechanism in the process of interaction, especially in the strong field regime.