AITopics

Structural equation modelling (SEM) is widely used in survey-based business and information systems research to assess latent constructs and theory-driven structural relationships. However, SEM path significance is obtained within a particular model specification and may not show whether findings remain stable under alternative estimation frameworks. This study develops and demonstrates a staged robustness analysis framework that connects SEM, ordinary least squares (OLS) regression, and Double Machine Learning (DML). SEM is first used to refine the measurement structure and estimate the robustness-baseline SEM model, in which the full theory-specified structural path system is retained for downstream robustness analysis before final structural path evaluation. OLS regression is then applied to SEM-derived construct scores as a transparent regression benchmark. Finally, DML-style residualisation is used to examine whether each tested focal relationship remains stable after flexible machine-learning-based adjustment for observed controls. Learner-sensitivity checks compare Random Forest, Gradient Boosting, and Support Vector Machine learners, and selected reverse-direction diagnostics are used to examine directional sensitivity. The framework is demonstrated using a FinTech Digital Customer Intimacy survey model. The findings identify which relationships are stable across SEM, OLS, and DML-style checks, and which require more cautious interpretation. A reproducible Google Colab workbook and generated result files are publicly available, providing a reusable template that researchers and students can adapt to other survey-based latent-construct studies. The paper contributes a practical robustness workflow and interpretation guide for survey-based researchers seeking to complement SEM with conventional and machine-learning-based robustness checks.

artificial intelligence, machine learning, robustness, (17 more...)

2607.00512

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.54)

Huthasana, Krishna Harsha Kovelakuntla, Olama, Alireza, Lundell, Andreas

Entropy-Regularized Probabilistic Gates for Sparse Model Discovery in Scarce-Data Federated Learning

Federated Learning (FL) is a distributed machine learning (ML) paradigm with collaboration among multiple clients without sharing data. FL is challenging under data heterogeneity and partial client participation. Learning sparse models is useful for communication and computational efficiency in FL, but it is especially difficult in the small-sample high-dimensional regime (d >> N) where optimization can yield parameter configurations that fail to generalize to unseen test data. While magnitude-based pruning doesn't account for uncertainty exploration in the parameter space, a formulation with probabilistic gates and an L0 constraint allows sampling from competing sparse configurations during training. In this work, we study entropy regularization of gate distributions as a mechanism to maintain uncertainty in sparse federated optimization by preventing early commitment to sparse support. We examine its impact under data heterogeneity, client participation heterogeneity, and sparsity. Experiments on synthetic and real-world benchmarks show consistent improvements over federated iterative hard thresholding (Fed-IHT) and pruning after dense federated averaging (FedAvg) training, both in statistical performance on test data and in sparsity recovery accuracy.

artificial intelligence, machine learning, sparsity, (14 more...)

2607.00275

Genre: Research Report > Experimental Study (0.46)

Industry:

Health & Medicine > Therapeutic Area > Oncology > Leukemia (0.47)
Health & Medicine > Therapeutic Area > Hematology (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Kopeć, Agnieszka, Przybyłowicz, Paweł, Wiącek, Martyna

Neural Network-Based Estimation of Time-Dependent Parameters in AR(p) Processes

We investigate a forecasting framework based on a simple discrete-time dynamic model with coefficients varying in time. The parameters of the model are recovered within a deep learning framework, which makes it possible to retain a transparent parametric structure while simultaneously accounting for complex and nonstationary patterns in the observed phenomenon. Our analysis covers two specifications of the noise process. Besides the standard Gaussian setting, we also consider Laplace-distributed noise, which can offer a more adequate description in the presence of heavier tails and sharper local fluctuations. For both cases, we formulate the predictive scheme of the model and analyze the associated uncertainty quantification, including the construction of prediction intervals. The results illustrate that a relatively simple model, when combined with time-dependent parameter estimation, can serve as a mathematically tractable and practically flexible tool for forecasting complex dynamics under different noise assumptions. The general model is stated for TVAR($p$), while the prediction-interval formulas and the numerical experiments are developed for the TVAR(1) case.

artificial intelligence, experiment, machine learning, (19 more...)

2607.0047

Country: Europe > Poland (0.14)

Genre: Research Report (0.64)

Industry: Energy (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Häberle, Konstantin, Bölcskei, Helmut

Function-Counting Theory for Low-Dimensional Data Structures

The success of deep learning models in classification and regression is widely attributed to the low-dimensional structure that real-world data tend to exhibit, despite their high-dimensional representation. This work attempts to provide a mathematical framework for binary classification on low-dimensional data, building on Cover's (1965) function-counting theory. With our framework, we aim to address the question of how the low-dimensional structure of the data affects the classification capabilities of learning models. Cover's theory relies on a general position assumption that blinds it to the underlying data structure. We refine this assumption to account for the low-dimensionality of the data and derive dichotomy counts that reflect the data structure. We further extend Cover's separation capacity and problem of generalization to the low-dimensional setting, enabling the impact of the underlying data structure on both to be analyzed.

artificial intelligence, machine learning, programming language, (17 more...)

2607.0101

Genre: Research Report (0.40)

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.54)

Measuring Racial Disparities in Rent Growth Under Algorithmic Landlord Concentration in U.S. Metros

Ranade, Advay

The 2024 Department of Justice antitrust complaint against RealPage, Inc. named five major residential REITs for coordinating algorithmic rent pricing across hundreds of thousands of apartment units in major US metropolitan areas. This paper studies whether census-tract-level corporate landlord concentration (CLC), measured from SEC EDGAR 10-K property filings geocoded to census tracts, the first such application in the literature, is associated with rent growth 2019-2023, and whether that association is larger in majority-minority neighborhoods. Rent outcomes are measured using the Zillow Observed Rent Index (ZORI). To account for the possibility that corporate landlords preferentially locate in neighborhoods already seeing rent appreciation, all regressions control for a fully novel Algorithmic Housing Burden Index (AHBI), a composite of pre-existing rent burden and market tightness from ACS data. Across 665 census tracts in ten US metropolitan areas, doubling REIT concentration is associated with 2.8 percentage points higher rent growth (p = 0.086, p = 0.030, HC1 robust). This association is significantly stronger in majority-minority tracts. Within the same metro, high-CLC majority-minority tracts are associated with 5.9 percentage points higher rent growth than comparable white tracts (p = 0.039). An XGBoost model predicts 44 percent of out-of-sample rent growth variance, with SHAP analysis independently confirming that CLC's contribution is positive in minority tracts and negative in white tracts. Taken all together, these findings provide the first tract-level evidence consistent with corporate landlord concentration being associated with disproportionately higher rent growth in communities of color.

artificial intelligence, machine learning, tract, (13 more...)

doi: 10.2139/ssrn.6941879

2606.27525

Country: North America > United States > California (0.47)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.91)

Industry:

Law (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Banking & Finance > Real Estate (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.88)

Meek, Christopher, Sadeghi, Kayvan

Characterizing and Identifying Separable Graphical Models

We study a broad class of graphical models whose independencies correspond to vertex separation in mixed graphs with directed, undirected, and bidirected edges, that are capable of encoding independence structures arising from feedback, latent and selection mechanisms. In particular, we introduce separable graphs, in which each missing edge implies the existence of a separating set for its endpoints, and essentially separable graphs, those graphs separation equivalent to a separable graph. We show that these models include many existing graph families used to define graphical models an provide several characterizations of separable graphs and essentially separable graphs. We also provide multiple characterizations of separation equivalence for separable graphs. One is a graphical characterization in terms of ordinary graph properties, extending earlier results for specific subfamilies Another is a separational characterization depending only on graph separation properties. Finally, we provide a canonical representation for the equivalence classes of essentially separable graphs and develop an algorithm that, under suitable assumptions, identifies the equivalence class of any essentially separable graph.

artificial intelligence, graph, machine learning, (16 more...)

2607.01057

Country: North America > United States (0.93)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.92)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.67)

Manoj, Naren Sarayu, Patel, Kumar Kshitij

Distributionally Robust Linear Regression With Block Lewis Weights

Machine learning algorithms and their training datasets have grown substantially in both size and complexity over the past decade. This increased model complexity has made it challenging to interpret and predict their behavior in unobserved scenarios. Hence, many applications that involve societal decisions still rely on simple, interpretable models like linear regression, often after feature engineering. Examples of such applications include predicting national housing prices, estimating wages across industries, forecasting loan amounts across banks, predicting life insurance premiums across groups, and projecting energy consumption across communities [CGKMN24]. A shared safety and sometimes legal concern across the above applications is the potential for wildly different model qualities for different distributions, i.e., outputting a notably worse model for some source data distributions [Dat14; BS16; HPS16; VVB18; SBFVV19; BHJKR21; CGNSG23; Cho16; KLMR18; ADW19; CGKMN24; SVWZ24].

artificial intelligence, lemma 7, machine learning, (19 more...)

2607.00252

Country:

Europe (0.92)
North America > United States > California (0.27)

Genre:

Research Report (0.64)
Workflow (0.45)

Industry: Banking & Finance > Insurance (0.54)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.60)

Raeth, Kornelius, Ludwig, Nicole

Decision-Aware Training for Sample-Based Generative Models

Kornelius Raeth 1 Nicole Ludwig 1 2 Abstractscoring rules distribute the training gradient in proportion to Sample-based generative models are increasingly data density, with no awareness of the decision maker's cost structure. The model's limited capacity is allocated globused for probabilistic forecasting in high-stakes ally, leaving decision-critical regions of the output space decision settings, yet their training objectives are potentially underserved. These models are commonly trained with strictly proper Given a forecast, a decision maker with cost function c(a,y), scoring rules, such as the energy score, which al-of action aand outcome y, selects the action that minimises locate their training signal in proportion to dataexpected cost under the forecast distribution; a point forecast density, with no awareness of where forecast eris insufficient to evaluate this expectation. A good forecast rors are most costly for downstream decisions. Crucially, the energy score objective with a differentiable deci-observed cost of the optimal action is itself a proper scoring sion loss that directly penalises the cost incurredrule (Hartline et al., 2025; Kleinberg et al., 2023), placing by acting on the model's forecast. This combinedit in the same family as the energy score which licenses loss is theoretically grounded, as the decision losstheir combination as a theoretically well-founded training is itself a proper scoring rule. Introduction score acts as that anchor, preventing the model from collapsing outside cost-sensitive regions. Our method is theo-tion based on a temperature forecast, balancing asset loss against the cost of intervention. In the weather domain, retically grounded and leads to better downstream decisions state-of-the-art forecasting systems (Lang et al., 2024; Pricewhile retaining full probabilistic forecasts, as validated on et al., 2023) are trained with strictly proper scoring rulessynthetic and real-world forecasting tasks. A gradient analysis showing which regions benefitscore reduces to the continuous ranked probability score from the decision loss and why, based on the cost (CRPS), widely used in meteorological forecast verificafunction structure. Both model classes introduced above are commonly trained by minimising strictly proper sion calibration.

artificial intelligence, machine learning, natural language, (18 more...)

2607.01171

Country: Europe > Germany (0.28)

Genre: Research Report (0.81)

Industry: Energy > Renewable > Wind (0.94)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Generation (0.64)

Deep Multitask Learning for Mixed-Type Outcomes with Shared Sparsity

Li, Huichao, Wang, Tong, Zhang, Sanguo, Ma, Shuangge

Most existing multitask learning approaches are limited by their reliance on task-specific loss functions tailored to the scale and type of each outcome. When outcomes differ across tasks, these losses are generally not directly comparable, which makes it difficult to formulate a unified objective and may limit information sharing across tasks. We propose a multitask transformation framework in which task-specific responses may differ through unknown monotone transformations. Motivated by high-dimensional biological applications in which the predictor dimension may diverge with the sample size while only a common subset of predictors is informative, we consider shared sparsity across tasks. Under this framework, we estimate the target functions and identify important predictors by optimizing a smoothed rank-based criterion with a group-Lasso penalty, implemented through a multitask deep neural network with a shared first layer. We establish the nonasymptotic excess-risk bounds, and variable-selection consistency for the proposed estimator. Simulation studies show that the proposed method achieves competitive prediction and variable-selection performance compared with competing approaches. Analyses of gene-expression studies with continuous, binary, and mixed outcomes further illustrate that the proposed method improves prediction and identifies biologically meaningful shared predictors.

artificial intelligence, machine learning, predictor, (18 more...)

2607.00995

Country: Asia > China (0.46)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Liu, Peilin, Zhou, Ding-Xuan

Ghost in the Kernel: In-Context Learning with Efficient Transformers via Domain Generalization

Transformer-based large models have demonstrated remarkable generalization abilities across different tasks by leveraging a context-aware attention module for in-context learning. With richer context, transformers adapt more effectively to the current use case without any parameter updates. However, the quadratic computational and memory complexity with respect to context length significantly slows data processing in softmax transformers. Linear transformers were proposed to address this issue by reducing the complexity to linear dependence on context length, but the design and understanding of the feature mapping in linear attention, from a theoretical viewpoint, remain unclear. In this paper, we investigate the approximation and generalization abilities of linear transformers under a two-staged sampling process from domain generalization. We show that linear transformers perform in-context learning as learning a mapping from context distributions to response functions. A dimension-independent convergence rate is obtained for our generalization analysis, which also exhibits the tradeoff between the regularities of data distributions and latent features. Guided by our theoretical framework, we propose a new perspective on activation and loss design for linearizing pretrained softmax large language models.

large language model, machine learning, natural language, (21 more...)

2607.00479

Country: Europe (0.28)

Genre: Research Report (0.63)

Industry: Information Technology (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.86)