Collaborating Authors

 He, Jingyu


Growing the Efficient Frontier on Panel Trees

arXiv.org Machine Learning

Estimating the mean-variance efficient (MVE) frontier is crucial for asset pricing and investment management. Yet estimating the tangency portfolio (Markowitz, 1952) from the unbalanced panel of thousands of individual asset returns proves impracticable. Empirical studies typically consider a "diversified" set of test assets (e.g., the ME-BM 25 portfolios) to estimate and evaluate factor models, hoping these test assets or a few common factors can span the same efficient frontier as individual assets. However, popular factor models hardly explain the cross section of conventional prespecified test assets (e.g., Kozak et al., 2018; Lopez-Lira and Roussanov, 2020), and the ad hoc nature of these test assets further hampers the effectiveness of model estimation and evaluation (Lewellen et al., 2010; Ang et al., 2020). For example, characteristics-based test assets are often limited to univariate- and bivariate-sorted portfolios due to the challenges of high-dimensional sorting (Cochrane, 2011), overlooking nonlinearity and asymmetric interactions (which do not apply uniformly to all assets), even with dependent sorting (Daniel et al., 1997).
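When the moments are known, the tangency portfolio of Markowitz (1952) has a closed form: the weights are proportional to Sigma^-1 (mu - rf). A minimal sketch with entirely made-up inputs (mu, Sigma, and rf are illustrative, not from the paper):

```python
import numpy as np

# Hypothetical inputs for three assets (illustrative numbers only).
mu = np.array([0.08, 0.12, 0.10])          # expected returns
Sigma = np.array([[0.04, 0.01, 0.00],
                  [0.01, 0.09, 0.02],
                  [0.00, 0.02, 0.06]])     # return covariance matrix
rf = 0.02                                  # risk-free rate

# Tangency weights solve Sigma w = (mu - rf); rescale to sum to one.
raw = np.linalg.solve(Sigma, mu - rf)
w = raw / raw.sum()

# The tangency portfolio maximizes the Sharpe ratio over all portfolios.
sharpe = (w @ mu - rf) / np.sqrt(w @ Sigma @ w)
print(w, sharpe)
```

The practical difficulty the abstract points to is that with thousands of assets and an unbalanced panel, the sample versions of mu and Sigma are too noisy (or not even invertible) for this formula to be usable directly.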


Local Gaussian process extrapolation for BART models with applications to causal inference

arXiv.org Machine Learning

Tree-based supervised learning algorithms, such as the Classification and Regression Tree (CART) (Breiman et al., 1984), Random Forests (Breiman, 2001), and XGBoost (Chen and Guestrin, 2016), are popular in practice due to their ability to learn complex nonlinear functions efficiently. Bayesian Additive Regression Trees (BART, Chipman et al. (2010)) is the most popular model-based regression tree method; it has been demonstrated empirically to provide accurate out-of-sample prediction (without covariate shift), and its Bayesian uncertainty intervals often outperform alternatives in terms of frequentist coverage (see Chipman et al. (2010); Kapelner and Bleich (2013)). XBART (He and Hahn, 2021) is a stochastic tree ensemble method that can be used to approximate BART models in a fraction of the run-time. Throughout the paper, we will refer to BART models but will use the XBART fitting algorithm. While tree-based methods frequently provide accurate out-of-sample predictions, their ability to extrapolate is fundamentally limited by their intrinsic, piecewise constant structure.
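The piecewise-constant extrapolation limit is easy to see with a toy model. The sketch below is not BART; it is a hand-rolled depth-1 regression tree ("stump") fit to y = x on [0, 1], whose prediction freezes at the nearest leaf mean outside the training range:

```python
import numpy as np

# Toy training data: the identity function on [0, 1].
x = np.linspace(0.0, 1.0, 200)
y = x.copy()

def stump_fit(x, y):
    """Find the single split threshold minimizing squared error."""
    best = None
    for t in x[1:-1]:
        left, right = y[x <= t], y[x > t]
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if best is None or sse < best[0]:
            best = (sse, t, left.mean(), right.mean())
    return best[1:]

t, mu_left, mu_right = stump_fit(x, y)

def stump_predict(x_new):
    # Piecewise constant: one leaf mean on each side of the split.
    return np.where(x_new <= t, mu_left, mu_right)

# Far outside [0, 1], the prediction never moves past the right-leaf mean.
print(stump_predict(np.array([0.9, 2.0, 10.0])))
```

Deeper trees and ensembles refine the partition inside the training range, but outside it the prediction is still the constant value of whichever leaf contains the query point, which is the limitation the paper's local Gaussian process extrapolation addresses.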


Bayesian Inference for Gamma Models

arXiv.org Machine Learning

We use the theory of normal variance-mean mixtures to derive a data augmentation scheme for models that include gamma functions. Our methodology applies to many situations in statistics and machine learning, including Multinomial-Dirichlet distributions, negative binomial regression, Poisson-Gamma hierarchical models, and extreme value models, to name but a few. All of these models include a gamma function that does not admit a natural conjugate prior distribution, posing a significant challenge to inference and prediction. To provide a data augmentation strategy, we construct and develop the theory of the class of Exponential Reciprocal Gamma distributions. This allows scalable EM and MCMC algorithms to be developed. We illustrate our methodology on a number of examples, including gamma shape inference, negative binomial regression, and Dirichlet allocation. Finally, we conclude with directions for future research.
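To see why a gamma function inside the likelihood is awkward, consider inferring the shape of a Gamma(alpha, 1) model: the log-posterior contains -n*lgamma(alpha), so no standard prior is conjugate, and a generic sampler is the usual fallback. The sketch below is a baseline random-walk Metropolis sampler, not the paper's data-augmentation scheme; the data, prior, and tuning are all illustrative:

```python
import math, random

random.seed(1)
# Simulated data from Gamma(shape=3, scale=1); true shape is 3.
data = [random.gammavariate(3.0, 1.0) for _ in range(500)]
n = len(data)
sum_log_x = sum(math.log(v) for v in data)

def log_post(a):
    """Log-posterior of the shape (up to a constant) under an Exp(1) prior.
    The -n * lgamma(a) term is what blocks conjugacy."""
    if a <= 0:
        return -math.inf
    return (a - 1) * sum_log_x - n * math.lgamma(a) - a

alpha, draws = 1.0, []
for _ in range(5000):
    prop = alpha + random.gauss(0.0, 0.2)     # random-walk proposal
    if math.log(random.random()) < log_post(prop) - log_post(alpha):
        alpha = prop
    draws.append(alpha)

post_mean = sum(draws[1000:]) / len(draws[1000:])  # discard burn-in
print(post_mean)  # should land near the true shape of 3
```

The paper's Exponential Reciprocal Gamma augmentation aims to replace this kind of generic Metropolis step with conditionally conjugate updates.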


Bayesian Inference for Polya Inverse Gamma Models

arXiv.org Machine Learning

The normalizing constants of these distributions depend on gamma functions whose arguments include shape (gamma, inverse gamma) and concentration (beta, Dirichlet) parameters. Bayesian learning of parameters nested inside the gamma function presents significant technical difficulties, since there is no known conjugate prior distribution. In fact, inferring the shape parameter in the gamma distribution is a long-studied problem in Bayesian inference (Damsleth, 1975; Rossell et al., 2009; Miller, 2018). In this paper, we develop the theoretical and algorithmic foundation of a Pólya-inverse Gamma (PIG) data augmentation scheme for fully Bayesian inference of shape and concentration parameters in gamma, inverse gamma, and Dirichlet models, respectively. PIG data augmentation may be utilized to design efficient Markov chain Monte Carlo (MCMC) algorithms in latent Dirichlet allocation (Blei et al., 2003), Beta-negative binomial models (Zhou et al., 2012), and Gamma-Gamma (GaGa) hierarchical models (Rossell et al., 2009).


XBART: Accelerated Bayesian Additive Regression Trees

arXiv.org Machine Learning

Bayesian additive regression trees (BART) (Chipman et al., 2010) is a powerful predictive model that often outperforms alternative models at out-of-sample prediction. BART is especially well-suited to settings with unstructured predictor variables and substantial sources of unmeasured variation, as is typical in the social, behavioral, and health sciences. This paper develops a modified version of BART that is amenable to fast posterior estimation. We present a stochastic hill climbing algorithm that matches the remarkable predictive accuracy of previous BART implementations, but is many times faster and less memory intensive. Simulation studies show that the new method is comparable in computation time and more accurate at function estimation than both random forests and gradient boosting.


Efficient sampling for Gaussian linear regression with arbitrary priors

arXiv.org Machine Learning

This paper develops a slice sampler for Bayesian linear regression models with arbitrary priors. The new sampler has two advantages over current approaches. One, it is faster than many custom implementations that rely on auxiliary latent variables, if the number of regressors is large. Two, it can be used with any prior with a density function that can be evaluated up to a normalizing constant, making it ideal for investigating the properties of new shrinkage priors without having to develop custom sampling algorithms. The new sampler takes advantage of the special structure of the linear regression likelihood, allowing it to produce better effective sample size per second than common alternative approaches.
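For readers unfamiliar with slice sampling, Neal's (2003) one-dimensional stepping-out procedure can be sketched as follows. This is a generic illustration on a standard normal target, not the paper's regression-specific sampler; the step width and chain length are arbitrary choices:

```python
import math, random

random.seed(0)

def log_f(x):
    # Unnormalized standard normal log-density; any evaluable
    # log-density up to a constant works, which is the appeal here.
    return -0.5 * x * x

def slice_sample(x0, log_f, w=1.0, n_steps=2000):
    draws, x = [], x0
    for _ in range(n_steps):
        # 1. Draw an auxiliary "height" uniformly under the density at x.
        log_y = log_f(x) + math.log(random.random())
        # 2. Step out: randomly place an interval of width w around x,
        #    then expand each end until it exits the slice.
        left = x - w * random.random()
        right = left + w
        while log_f(left) > log_y:
            left -= w
        while log_f(right) > log_y:
            right += w
        # 3. Shrink: sample uniformly in the interval, shrinking it
        #    toward x on each rejection, until a point lands in the slice.
        while True:
            prop = left + (right - left) * random.random()
            if log_f(prop) > log_y:
                x = prop
                break
            if prop < x:
                left = prop
            else:
                right = prop
        draws.append(x)
    return draws

draws = slice_sample(0.0, log_f)
mean = sum(draws) / len(draws)
var = sum(d * d for d in draws) / len(draws)
print(mean, var)
```

The paper's contribution is to exploit the structure of the Gaussian linear regression likelihood within this kind of scheme, so that only the prior density needs to be evaluable up to a normalizing constant.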


Deep Learning for Predicting Asset Returns

arXiv.org Machine Learning

Deep learning searches for nonlinear factors for predicting asset returns. Predictability is achieved via multiple layers of composite factors as opposed to additive ones. Viewed in this way, asset pricing studies can be revisited using multi-layer deep learners, such as rectified linear units (ReLU) or long short-term memory (LSTM) networks for time-series effects. State-of-the-art tools, including stochastic gradient descent (SGD), TensorFlow, and dropout regularization, provide efficient implementation and factor exploration. To illustrate our methodology, we revisit the equity market risk premium dataset of Welch and Goyal (2008). We find the existence of nonlinear factors which explain predictability of returns, in particular at the extremes of the characteristic space. Finally, we conclude with directions for future research.
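The "composite factors" idea can be sketched with a one-hidden-layer ReLU network fit by full-batch gradient descent: each hidden unit is a learned factor, and the nonlinearity comes from composing, rather than adding, linear pieces. Everything below (data, target, network size, learning rate) is made up for illustration; the paper's empirical work uses the Welch and Goyal (2008) data, not this toy setup:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(256, 1))
y = X[:, 0] ** 2                            # toy nonlinear target

H = 16                                      # hidden units: "composite factors"
W1 = rng.normal(0.0, 0.5, (1, H)); b1 = np.zeros(H)
W2 = rng.normal(0.0, 0.5, (H, 1)); b2 = np.zeros(1)

lr = 0.1
for _ in range(3000):
    Z = X @ W1 + b1                         # linear factors
    A = np.maximum(Z, 0.0)                  # ReLU composition
    err = (A @ W2 + b2)[:, 0] - y           # gradient of 0.5 * MSE w.r.t. output
    dA = err[:, None] @ W2.T                # backpropagate through output layer
    dZ = dA * (Z > 0)                       # ReLU gradient mask
    W2 -= lr * (A.T @ err[:, None]) / len(X)
    b2 -= lr * err.mean()
    W1 -= lr * (X.T @ dZ) / len(X)
    b1 -= lr * dZ.mean(axis=0)

mse = float((((np.maximum(X @ W1 + b1, 0.0) @ W2 + b2)[:, 0] - y) ** 2).mean())
print(mse)
```

A purely additive (linear) factor model cannot represent the curvature of this target; the composed ReLU factors capture it, which is the contrast the abstract draws.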