ml model
Explaining Deep Learning Models -- A Bayesian Non-parametric Approach
Understanding and interpreting how machine learning (ML) models make decisions have been a big challenge. While recent research has proposed various technical approaches to provide some clues as to how an ML model makes individual predictions, they cannot provide users with an ability to inspect a model as a complete entity. In this work, we propose a novel technical approach that augments a Bayesian non-parametric regression mixture model with multiple elastic nets. Using the enhanced mixture model, we can extract generalizable insights for a target model through a global approximation. To demonstrate the utility of our approach, we evaluate it on different ML models in the context of image recognition. The empirical results indicate that our proposed approach not only outperforms the state-of-the-art techniques in explaining individual decisions but also provides users with an ability to discover the vulnerabilities of the target ML models.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- South America > Brazil (0.04)
- (2 more...)
A benchmark of categorical encoders for binary classification
Categorical encoders transform categorical features into numerical representations that are indispensable for a wide range of machine learning models. Existing encoder benchmark studies lack generalizability because of their limited choice of 1. encoders, 2. experimental factors, and 3. datasets. Additionally, inconsistencies arise from the adoption of varying aggregation strategies. This paper is the most comprehensive benchmark of categorical encoders to date, including an extensive evaluation of 32 configurations of encoders from diverse families, with 48 combinations of experimental factors, and on 50 datasets. The study shows the profound influence of dataset selection, experimental factors, and aggregation strategies on the benchmark's conclusions -- aspects disregarded in previous encoder benchmarks.
- North America > United States (0.14)
- Europe > Germany > Baden-Württemberg > Karlsruhe Region > Karlsruhe (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- North America > United States (0.14)
- Europe > Germany > Baden-Württemberg > Karlsruhe Region > Karlsruhe (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Europe > France > Occitanie > Haute-Garonne > Toulouse (0.04)
- Oceania > Australia (0.04)
- North America > United States > California > Santa Clara County > San Jose (0.04)
- (3 more...)
- North America > United States > Michigan (0.04)
- Asia > Middle East > Jordan (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- (2 more...)
- North America > United States (0.31)
- Africa > Ethiopia (0.04)
- North America > Canada > Quebec > Montreal (0.14)
- North America > Saint Martin (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (13 more...)
- Government (0.92)
- Energy (0.68)
- Europe > Poland > Lower Silesia Province > Wroclaw (0.04)
- Oceania > Papua New Guinea (0.04)
- Europe > Poland > Masovia Province > Warsaw (0.04)
- Europe > Poland > Lesser Poland Province > Kraków (0.04)
- Banking & Finance (0.68)
- Health & Medicine (0.67)
Uncertainty-Adjusted Sorting for Asset Pricing with Machine Learning
Liu, Yan, Luo, Ye, Wang, Zigan, Zhang, Xiaowei
A large and rapidly expanding literature demonstrates that machine learning (ML) methods substantially improve out-of-sample asset return prediction relative to conventional linear benchmarks, and that these statistical gains often translate into economically meaningful portfolio performance. Seminal contributions such as Gu et al. (2020) document large Sharpe ratio improvements from nonlinear learners in U.S. equities, while subsequent work extends these findings to stochastic discount factor estimation (Chen et al. 2024), international equity markets (Leippold et al. 2022), and bond return forecasting (Kelly et al. 2019, Bianchi et al. 2020). Collectively, this literature establishes ML as a powerful tool for extracting conditional expected returns in environments characterized by noisy signals, nonlinear interactions, and pervasive multicollinearity.
- North America > United States (0.14)
- Asia > China > Hong Kong (0.04)
- Asia > China > Guangdong Province > Shenzhen (0.04)
- (2 more...)
- Banking & Finance > Trading (1.00)
- Banking & Finance > Economy (0.92)