AITopics

2412.1818

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Asia > Japan > Honshū > Kantō > Kanagawa Prefecture > Yokohama (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre:

Research Report > Experimental Study (0.89)
Research Report > New Finding (0.67)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.89)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.88)

arXiv.org Machine Learning Dec-24-2024

HNCI: High-Dimensional Network Causal Inference

Du, Wenqin, Ding, Rundong, Fan, Yingying, Lv, Jinchi

The problem of evaluating the effectiveness of a treatment or policy commonly appears in causal inference applications under network interference. In this paper, we propose a new method, high-dimensional network causal inference (HNCI), that provides both a valid confidence interval for the average direct treatment effect on the treated (ADET) and a valid confidence set for the neighborhood size for the interference effect. We exploit the model setting in Belloni et al. (2022) and allow certain types of heterogeneity in node interference neighborhood sizes. We propose a linear regression formulation of potential outcomes, where the regression coefficients correspond to the underlying true interference function values of nodes and exhibit a latent homogeneous structure. Such a formulation allows us to leverage existing literature from linear regression and homogeneity pursuit to conduct valid statistical inference with theoretical guarantees. The resulting confidence intervals for the ADET are formally justified through asymptotic normality with estimable variances. We further provide a confidence set for the neighborhood size, with theoretical guarantees, by exploiting the repro samples approach. The practical utility of the newly suggested methods is demonstrated through simulation and real data examples.

artificial intelligence, machine learning, node, (17 more...)

2412.18568

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.28)
Europe > Hungary > Hajdú-Bihar County > Debrecen (0.04)

Genre:

Research Report > New Finding (0.67)
Research Report > Experimental Study (0.45)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.88)

Mansouri, Dou El Kefel, Benkabou, Seif-Eddine, Benabdeslem, Khalid

Fréchet regression for multi-label feature selection with implicit regularization

arXiv.org Machine Learning Dec-24-2024

Fréchet regression, an extension of classical linear regression to general metric spaces, offers a robust framework for modeling complex relationships between variables when the responses lie outside of Euclidean spaces. This approach is especially well suited to high-dimensional datasets, such as vector representations, with particular relevance to fields like imaging, where capturing nonlinear dependencies and the intrinsic data structure is critical for accurate modeling (Fréchet (1948), Petersen and Müller (2019), Bhattacharjee and Müller (2023), Qiu, Yu and Zhu (2024)). A significant consideration in Fréchet regression arises when predicting multiple responses simultaneously, as seen in multi-target or multidimensional problems (Zhang and Zhou (2007), Hyvönen, Jääsaari and Roos (2024)). Unlike traditional regression, where each observation corresponds to a single response, Fréchet regression can be extended to model complex interactions between multiple outputs. This ability to address complex relationships between several responses opens new avenues, particularly in fields such as bioinformatics (Huang et al. (2005)) and image analysis (Lathuilière et al. (2019)), where multidimensional data and interdependencies between responses require adaptive and specialized methodologies. However, to date, the handling of multilabel scenarios within the context of Fréchet regression remains relatively unexplored in the literature, despite its potential significance in addressing complex, multidimensional applications. In this paper, we present an extension of the Global Fréchet regression model, a specific variant of Fréchet regression that generalizes classical multiple linear regression by modeling responses as random objects. This extension enables the explicit modeling of relationships between input variables and multiple responses, thereby addressing the multi-label setting. 
Our second contribution in this paper addresses the dimensionality challenge in the context of the proposed Fréchet regression extension.

artificial intelligence, machine learning, regression, (17 more...)

2412.18247

Country:

Europe > France (0.04)
Africa > Middle East > Algeria > Tiaret Province > Tiaret (0.04)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.75)

arXiv.org Machine Learning Dec-24-2024

Bivariate Matrix-valued Linear Regression (BMLR): Finite-sample performance under Identifiability and Sparsity Assumptions

Bettache, Nayel

This study explores the estimation of parameters in a matrix-valued linear regression model, where the $T$ responses $(Y_t)_{t=1}^T \in \mathbb{R}^{n \times p}$ and predictors $(X_t)_{t=1}^T \in \mathbb{R}^{m \times q}$ satisfy the relationship $Y_t = A^* X_t B^* + E_t$ for all $t = 1, \ldots, T$. In this model, $A^* \in \mathbb{R}_+^{n \times m}$ has $L_1$-normalized rows, $B^* \in \mathbb{R}^{q \times p}$, and $(E_t)_{t=1}^T$ are independent noise matrices following a matrix Gaussian distribution. The primary objective is to estimate the unknown parameters $A^*$ and $B^*$ efficiently. We propose explicit optimization-free estimators and establish non-asymptotic convergence rates to quantify their performance. Additionally, we extend our analysis to scenarios where $A^*$ and $B^*$ exhibit sparse structures. To support our theoretical findings, we conduct numerical simulations that confirm the behavior of the estimators, particularly with respect to the impact of the dimensions $n, m, p, q$, and the sample size $T$ on finite-sample performances. We complete the simulations by investigating the denoising performances of our estimators on noisy real-world images.
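
The model in this abstract, $Y_t = A^* X_t B^* + E_t$, can be made concrete with a small data generator. The following is only a sketch of the data-generating process, not the paper's estimators. The function names (`matmul`, `random_matrix`, `simulate_bmlr`), the way $A^*$ is drawn, and the use of i.i.d. Gaussian noise entries (one special case of a matrix Gaussian distribution) are assumptions made for illustration.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def random_matrix(rows, cols, rng):
    """Matrix with independent standard normal entries."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def simulate_bmlr(n, m, q, p, T, noise_std, seed=0):
    """Draw (A, B, X, Y) with Y_t = A X_t B + E_t for t = 1..T.

    A is n x m with nonnegative rows that sum to 1 (L1-normalized),
    B is q x p, each X_t is m x q, and each Y_t is n x p.
    Assumption: the entries of E_t are i.i.d. N(0, noise_std^2).
    """
    rng = random.Random(seed)
    A = []
    for _ in range(n):
        row = [rng.random() for _ in range(m)]
        total = sum(row)
        A.append([v / total for v in row])
    B = random_matrix(q, p, rng)
    X, Y = [], []
    for _ in range(T):
        X_t = random_matrix(m, q, rng)
        signal = matmul(matmul(A, X_t), B)
        Y_t = [[signal[i][j] + rng.gauss(0.0, noise_std) for j in range(p)]
               for i in range(n)]
        X.append(X_t)
        Y.append(Y_t)
    return A, B, X, Y
```

Calling `simulate_bmlr(3, 2, 4, 5, T=6, noise_std=0.1)` returns six response matrices of shape 3 x 5. Varying `n`, `m`, `p`, `q`, and `T` is one way to set up the kind of finite-sample experiment the abstract mentions.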

artificial intelligence, machine learning, matrix, (18 more...)

2412.17749

Country: Europe > France (0.04)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Chang, Jae Ho, Russo, Massimiliano, Paul, Subhadeep

Heterogeneous transfer learning for high dimensional regression with feature mismatch

arXiv.org Machine Learning Dec-23-2024

We consider the problem of transferring knowledge from a source, or proxy, domain to a new target domain for learning a high-dimensional regression model with possibly different features. Recently, the statistical properties of homogeneous transfer learning have been investigated. However, most homogeneous transfer and multi-task learning methods assume that the target and proxy domains have the same feature space, limiting their practical applicability. In applications, target and proxy feature spaces are frequently inherently different, for example, due to the inability to measure some variables in the target data-poor environments. Conversely, existing heterogeneous transfer learning methods do not provide statistical error guarantees, limiting their utility for scientific discovery. We propose a two-stage method that involves learning the relationship between the missing and observed features through a projection step in the proxy data and then solving a joint penalized regression optimization problem in the target data. We develop an upper bound on the method's parameter estimation risk and prediction risk, assuming that the proxy and the target domain parameters are sparsely different. Our results elucidate how estimation and prediction error depend on the complexity of the model, sample size, the extent of overlap, and correlation between matched and mismatched features.

artificial intelligence, machine learning, prediction error, (18 more...)

2412.18081

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Ohio (0.04)

Genre: Research Report > New Finding (0.87)

Industry: Health & Medicine > Therapeutic Area > Oncology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)

Erata, Ferhat, Paradise, Orr, Antonopoulos, Timos, Nguyen, ThanhVu, Goldwasser, Shafi, Piskac, Ruzica

Learning Randomized Reductions and Program Properties

arXiv.org Artificial Intelligence Dec-23-2024

The correctness of computations remains a significant challenge in computer science, with traditional approaches relying on automated testing or formal verification. Self-testing/correcting programs introduce an alternative paradigm, allowing a program to verify and correct its own outputs via randomized reductions, a concept that previously required manual derivation. In this paper, we present Bitween, a method and tool for automated learning of randomized (self)-reductions and program properties in numerical programs. Bitween combines symbolic analysis and machine learning, with a surprising finding: polynomial-time linear regression, a basic optimization method, is not only sufficient but also highly effective for deriving complex randomized self-reductions and program invariants, often outperforming sophisticated mixed-integer linear programming solvers. We establish a theoretical framework for learning these reductions and introduce RSR-Bench, a benchmark suite for evaluating Bitween's capabilities on scientific and machine learning functions. Our empirical results show that Bitween surpasses state-of-the-art tools in scalability, stability, and sample efficiency when evaluated on nonlinear invariant benchmarks like NLA-DigBench. Bitween is open-source as a Python package and accessible via a web interface that supports C language programs.
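
To illustrate what a randomized self-reduction is, here is the classic textbook example for a linear function over a prime field, where $f(x) = f(x + r) - f(r)$ for any $r$. This is a hand-derived reduction of the kind the abstract says previously required manual derivation. It is not Bitween's learning procedure, and the names `P`, `A`, `faulty_f`, and `self_correct` are hypothetical.

```python
import random

P = 2_147_483_647  # prime modulus chosen for this toy example
A = 123_456_789    # slope of the target function f(x) = A * x mod P


def faulty_f(x):
    """A program for f that is deliberately wrong on about 1% of inputs."""
    if x % 100 == 0:
        return (A * x + 1) % P
    return (A * x) % P


def self_correct(program, x, trials=15, rng=random):
    """Evaluate f(x) via the reduction f(x) = f(x + r) - f(r) (mod P).

    Each trial queries the program at two random-looking points, so a
    single trial is wrong only if one of those points is a faulty input.
    The most frequent answer over all trials is returned.
    """
    votes = {}
    for _ in range(trials):
        r = rng.randrange(P)
        guess = (program((x + r) % P) - program(r)) % P
        votes[guess] = votes.get(guess, 0) + 1
    return max(votes, key=votes.get)
```

For an input such as `x = 1000`, `faulty_f(1000)` is wrong, while `self_correct(faulty_f, 1000)` recovers `A * 1000 % P` with high probability because the random shift `r` moves both queries away from the faulty inputs.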

artificial intelligence, bitween, machine learning, (16 more...)

2412.18134

Country:

North America > United States > Connecticut > New Haven County > New Haven (0.04)
Oceania > New Zealand > North Island > Auckland Region > Auckland (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(15 more...)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

arXiv.org Machine Learning Dec-23-2024

Learning from Summarized Data: Gaussian Process Regression with Sample Quasi-Likelihood

Shikuri, Yuta

Gaussian process regression is a powerful Bayesian nonlinear regression method. Recent research has enabled the capture of many types of observations using non-Gaussian likelihoods. To deal with various tasks in spatial modeling, we benefit from this development. Difficulties still arise when we can only access summarized data consisting of representative features, summary statistics, and data point counts. Such situations frequently occur primarily due to concerns about confidentiality and management costs associated with spatial data. This study tackles learning and inference using only summarized data within the framework of Gaussian process regression. To address this challenge, we analyze the approximation errors in the marginal likelihood and posterior distribution that arise from utilizing representative features. We also introduce the concept of sample quasi-likelihood, which facilitates learning and inference using only summarized data. Non-Gaussian likelihoods satisfying certain assumptions can be captured by specifying a variance function that characterizes a sample quasi-likelihood function. Theoretical and experimental results demonstrate that the approximation performance is influenced by the granularity of summarized data relative to the length scale of covariance functions. Experiments on a real-world dataset highlight the practicality of our method for spatial modeling.

artificial intelligence, machine learning, modeling & simulation, (18 more...)

2412.17455

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Massachusetts (0.04)
North America > United States > California (0.04)

Genre: Research Report > New Finding (0.34)

Industry: Information Technology > Security & Privacy (0.67)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.54)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

arXiv.org Artificial Intelligence Dec-23-2024

An Instrumental Value for Data Production and its Application to Data Pricing

Ai, Rui, Lyu, Boxiang, Wang, Zhaoran, Yang, Zhuoran, Xu, Haifeng

How much value does a dataset or a data production process have to an agent who wishes to use the data to assist decision-making? This is a fundamental question towards understanding the value of data as well as further pricing of data. This paper develops an approach for capturing the instrumental value of data production processes, which takes two key factors into account: (a) the context of the agent's decision-making problem; (b) prior data or information the agent already possesses. We "micro-found" our valuation concepts by showing how they connect to classic notions of information design and signals in information economics. When instantiated in the domain of Bayesian linear regression, our value naturally corresponds to information gain. Based on our designed data value, we then study a basic monopoly pricing setting with a buyer looking to purchase from a seller some labeled data of a certain feature direction in order to improve a Bayesian regression model. We show that when the seller has the ability to fully customize any data request, she can extract the first-best revenue (i.e., full surplus) from any population of buyers, i.e., achieving first-degree price discrimination. If the seller can only sell data that are derived from an existing data pool, this limits her ability to customize, and achieving first-best revenue becomes generally impossible. However, we design a mechanism that achieves seller revenue at most $\log(\kappa)$ less than the first-best revenue, where $\kappa$ is the condition number associated with the data matrix. A corollary of this result is that the seller can extract the first-best revenue in the multi-armed bandits special case.
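
The link to information gain mentioned in this abstract can be illustrated with the standard Gaussian linear model. Assume a prior $\theta \sim N(0, \Sigma)$ and one labeled point $y = x^\top \theta + \varepsilon$ with $\varepsilon \sim N(0, \sigma^2)$. The mutual information between $\theta$ and $y$ is then $\tfrac{1}{2} \log(1 + x^\top \Sigma x / \sigma^2)$. The sketch below computes this textbook quantity and the rank-one posterior covariance update. The paper's exact valuation may be defined differently, and the function names are hypothetical.

```python
import math


def quad_form(cov, x):
    """Compute x^T cov x for a covariance matrix stored as a list of rows."""
    n = len(x)
    return sum(x[i] * cov[i][j] * x[j] for i in range(n) for j in range(n))


def information_gain(cov, x, noise_var):
    """Mutual information (in nats) between theta and one label at feature x."""
    return 0.5 * math.log(1.0 + quad_form(cov, x) / noise_var)


def posterior_cov(cov, x, noise_var):
    """Posterior covariance after observing one label at feature x.

    Uses the rank-one (Sherman-Morrison) form
    cov - (cov x)(cov x)^T / (noise_var + x^T cov x), valid for symmetric cov.
    """
    n = len(x)
    cx = [sum(cov[i][j] * x[j] for j in range(n)) for i in range(n)]
    denom = noise_var + quad_form(cov, x)
    return [[cov[i][j] - cx[i] * cx[j] / denom for j in range(n)]
            for i in range(n)]
```

With an identity prior and unit noise, a first label in direction `[1.0, 0.0]` is worth `0.5 * log(2)` nats, and a second label in the same direction is worth less, `0.5 * log(1.5)`, because the prior information the agent already holds reduces what new data can add.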

machine learning, mechanism, natural language, (18 more...)

2412.1814

Country: North America > United States (0.46)

Genre: Research Report > New Finding (0.46)

Industry:

Information Technology (0.68)
Law (0.45)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.55)

Rusli, Andre, Shishido, Makoto

An Experimental Evaluation of Japanese Tokenizers for Sentiment-Based Text Classification

arXiv.org Artificial Intelligence Dec-23-2024

This study investigates the performance of three popular tokenization tools (MeCab, Sudachi, and SentencePiece) when applied as a preprocessing step for sentiment-based text classification of Japanese texts. Using Term Frequency-Inverse Document Frequency (TF-IDF) vectorization, we evaluate two traditional machine learning classifiers: Multinomial Naive Bayes and Logistic Regression. The results reveal that Sudachi produces tokens closely aligned with dictionary definitions, while MeCab and SentencePiece demonstrate faster processing speeds. The combination of SentencePiece, TF-IDF, and Logistic Regression outperforms the other alternatives in terms of classification performance.

machine learning, natural language, text classification, (19 more...)

2412.17361

Country: Asia > Japan > Honshū (0.14)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Classification (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.51)

Cai, Mingyang, Klausch, Thomas, van de Wiel, Mark A.

A Semi-supervised CART Model for Covariate Shift

arXiv.org Artificial Intelligence Dec-22-2024

Machine learning models used in medical applications often face challenges due to the covariate shift, which occurs when there are discrepancies between the distributions of training and target data. This can lead to decreased predictive accuracy, especially with unknown outcomes in the target data. This paper introduces a semi-supervised classification and regression tree (CART) that uses importance weighting to address these distribution discrepancies. Our method improves the predictive performance of the CART model by assigning greater weights to training samples that more accurately represent the target distribution, especially in cases of covariate shift without target outcomes. In addition to CART, we extend this weighted approach to generalized linear model trees and tree ensembles, creating a versatile framework for managing the covariate shift in complex datasets. Through simulation studies and applications to real-world medical data, we demonstrate significant improvements in predictive accuracy. These findings suggest that our weighted approach can enhance reliability in medical applications and other fields where the covariate shift poses challenges to model performance across various data distributions.
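
The idea of importance weighting in a tree split can be sketched as follows. Each training sample gets the weight $w(x) = p_{\text{target}}(x) / p_{\text{train}}(x)$, and a regression split is chosen by minimizing the weighted sum of squared errors. This is a generic illustration and not the authors' algorithm. It assumes both densities are known one-dimensional Gaussians, whereas in practice the density ratio has to be estimated, and all function names are hypothetical.

```python
import math


def normal_pdf(x, mean, std):
    """Density of N(mean, std^2) at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def importance_weight(x, train=(0.0, 1.0), target=(1.0, 1.0)):
    """Density ratio p_target(x) / p_train(x) for assumed Gaussian densities."""
    return normal_pdf(x, *target) / normal_pdf(x, *train)


def weighted_sse(pairs):
    """Weighted sum of squared errors around the weighted mean of (y, w) pairs."""
    total = sum(w for _, w in pairs)
    if total == 0:
        return 0.0
    mean = sum(w * y for y, w in pairs) / total
    return sum(w * (y - mean) ** 2 for y, w in pairs)


def best_split(xs, ys, ws):
    """Return (threshold, cost) minimizing the weighted SSE of the two children."""
    values = sorted(set(xs))
    best = (None, math.inf)
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2.0
        left = [(y, w) for x, y, w in zip(xs, ys, ws) if x <= t]
        right = [(y, w) for x, y, w in zip(xs, ys, ws) if x > t]
        cost = weighted_sse(left) + weighted_sse(right)
        if cost < best[1]:
            best = (t, cost)
    return best
```

Passing `ws = [importance_weight(x) for x in xs]` makes training samples that are more typical of the target distribution count more when the split is scored, while uniform weights recover the usual unweighted split.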

artificial intelligence, decision tree learning, machine learning, (16 more...)

2410.20978

Country:

Europe > Netherlands > North Holland > Amsterdam (0.04)
Asia > Middle East > Jordan (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Industry:

Health & Medicine > Therapeutic Area > Oncology (0.93)
Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (0.93)
Health & Medicine > Pharmaceuticals & Biotechnology (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)