
Collaborating Authors



Geodesics in the Deep Linear Network

arXiv.org Artificial Intelligence

We derive a general system of ODEs and associated explicit solutions in a special case for geodesics between full rank matrices in the deep linear network geometry. In the process, we characterize all horizontal straight lines in the invariant balanced manifold that remain geodesics under Riemannian submersion.


Minimax optimal transfer learning for high-dimensional additive regression

arXiv.org Machine Learning

Many human tasks benefit from prior experience when that experience is related to the task at hand. This phenomenon, whereby knowledge from previous tasks is transferred to new ones, has motivated the machine learning technique known as transfer learning. From a statistical perspective, consider the problem of analyzing a regression relationship when the available data are limited. Transfer learning (Torrey and Shavlik (2010)), one of the most widely used techniques in machine learning, can provide a solution. In this framework, one typically leverages related estimates obtained from large but non-identically distributed auxiliary samples, and then refines these estimates to obtain improved estimators from the smaller target sample. Transfer learning has been shown to be effective in a wide range of real-world applications, including computer vision (Kolesnikov et al. (2020); Bu et al. (2021)), natural language processing (Lee et al. (2020); Yuan et al. (2020)), and bioinformatics (Vorontsov et al. (2024); Gao and Cui (2020)), among others. Recently, the theoretical properties of transfer-learned estimators have been extensively investigated across a range of statistical problems.


Quantifying Human Bias and Knowledge to guide ML models during Training

arXiv.org Artificial Intelligence

This paper discusses a crowdsourcing-based method that we designed to quantify the importance of different attributes of a dataset in determining the outcome of a classification problem. This heuristic, provided by humans, acts as the initial weight seed for machine learning models and guides the model towards a better optimum during the gradient descent process. When dealing with data, it is common to encounter skewed datasets that overrepresent items of certain classes while underrepresenting the rest. Skewed datasets may lead to unforeseen issues with models, such as learning a biased function or overfitting. Traditional data augmentation techniques in supervised learning include oversampling and training with synthetic data. We introduce an experimental approach to dealing with such unbalanced datasets by including humans in the training process. We ask humans to rank the importance of features of the dataset and, through rank aggregation, determine the initial weight bias for the model. We show that collective human bias can allow ML models to learn insights about the true population instead of the biased sample. In this paper, we use two rank aggregation methods, Kemeny-Young and the Markov Chain aggregator, to quantify human opinion on the importance of features. This work mainly tests the effectiveness of human knowledge on binary classification (Popular vs. Not-popular) problems on two ML models: Deep Neural Networks and Support Vector Machines. This approach considers humans as weak learners and relies on aggregation to offset individual biases and domain unfamiliarity.
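As a rough illustration of the rank aggregation step, the sketch below computes a Kemeny-Young consensus by brute force over all permutations, which is feasible only for a handful of features. The feature names, the `rank_to_weights` mapping from consensus rank to initial weight, and the sample rankings are all assumptions made for illustration; the abstract does not specify how ranks are converted to weights.

```python
from itertools import combinations, permutations


def kendall_tau(r1, r2):
    """Count item pairs that the two rankings order differently."""
    pos1 = {item: i for i, item in enumerate(r1)}
    pos2 = {item: i for i, item in enumerate(r2)}
    return sum(
        1
        for a, b in combinations(r1, 2)
        if (pos1[a] - pos1[b]) * (pos2[a] - pos2[b]) < 0
    )


def kemeny_young(rankings):
    """Return the permutation minimizing total Kendall tau distance.

    Brute force: O(n!) candidates, so only usable for small n.
    """
    items = sorted(rankings[0])
    return min(
        permutations(items),
        key=lambda cand: sum(kendall_tau(cand, r) for r in rankings),
    )


def rank_to_weights(consensus):
    """Hypothetical mapping: linearly decreasing weights, normalized to sum to 1."""
    n = len(consensus)
    raw = {item: n - i for i, item in enumerate(consensus)}
    total = sum(raw.values())
    return {item: value / total for item, value in raw.items()}


# Three hypothetical annotators ranking three hypothetical features.
rankings = [
    ["tempo", "loudness", "duration"],
    ["tempo", "loudness", "duration"],
    ["loudness", "tempo", "duration"],
]
consensus = kemeny_young(rankings)
weights = rank_to_weights(consensus)
```

The resulting `weights` dictionary could then seed the first-layer feature weights of a model, though how the seed is applied in practice is left open here.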


Active pooling design in group testing based on Bayesian posterior prediction

arXiv.org Machine Learning

In identifying infected patients in a population, group testing is an effective method to reduce the number of tests and correct the test errors. In the group testing procedure, tests are performed on pools of specimens collected from patients, where the number of pools is lower than that of patients. The performance of group testing depends heavily on the design of the pools and on the algorithms used to infer the infected patients from the test outcomes. In this paper, a method for adaptively designing pools based on the predictive distribution is proposed in the framework of Bayesian inference. The proposed method, executed using the belief propagation algorithm, identifies the infected patients more accurately than group testing performed on random pools determined in advance.


Bayesian inference of infected patients in group testing with prevalence estimation

arXiv.org Machine Learning

Group testing is a method of identifying infected patients by performing tests on a pool of specimens collected from patients. For the case in which the test returns a false result with finite probability, we propose Bayesian inference and a corresponding belief propagation (BP) algorithm to identify the infected patients from the results of tests performed on the pool. We show that the true-positive rate is improved by taking into account the credible interval of a point estimate of each patient. Further, the prevalence and the error probability in the test are estimated by combining an expectation-maximization method with the BP algorithm. As another approach, we introduce a hierarchical Bayes model to identify the infected patients and estimate the prevalence. By comparing these methods, we formulate a guide for practical usage.
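To make the Bayesian setup concrete, the sketch below computes exact posterior marginals for a tiny population by enumerating every infection state. It is not the belief propagation algorithm from the abstract, which approximates these marginals at scale; enumeration is swapped in here because it is short and exact for small `n`. The pool layout, test outcomes, prevalence, and error rates are assumed values chosen for illustration.

```python
from itertools import product


def posterior_marginals(n, pools, outcomes, prevalence, p_fp, p_fn):
    """Exact P(patient i infected | test outcomes) by enumerating 2**n states.

    pools: list of lists of patient indices; outcomes: 1 = positive test.
    p_fp / p_fn: false-positive and false-negative probabilities of one test.
    """
    marginals = [0.0] * n
    evidence = 0.0
    for state in product((0, 1), repeat=n):
        k = sum(state)
        # Independent Bernoulli prior on each patient's infection status.
        weight = prevalence**k * (1 - prevalence) ** (n - k)
        for pool, y in zip(pools, outcomes):
            infected = any(state[i] for i in pool)
            p_positive = (1 - p_fn) if infected else p_fp
            weight *= p_positive if y else (1 - p_positive)
        evidence += weight
        for i, x in enumerate(state):
            if x:
                marginals[i] += weight
    return [m / evidence for m in marginals]


# Assumed toy setting: 4 patients, 3 pools, two positive tests that share patient 0.
marginals = posterior_marginals(
    n=4,
    pools=[[0, 1], [2, 3], [0, 2]],
    outcomes=[1, 0, 1],
    prevalence=0.1,
    p_fp=0.05,
    p_fn=0.1,
)
```

In this toy setting patient 0 appears in both positive pools and receives the largest posterior probability, which is the kind of per-patient quantity a decision rule would threshold.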


Iterative Flattening Search for the Flexible Job Shop Scheduling Problem

AAAI Conferences

This paper presents a meta-heuristic algorithm for solving the Flexible Job Shop Scheduling Problem (FJSSP). This strategy, known as Iterative Flattening Search (IFS), iteratively applies a relaxation-step, in which a subset of scheduling decisions is randomly retracted from the current solution, and a solving-step, in which a new solution is incrementally recomputed from this partial schedule. This work contributes two separate results: (1) it proposes a constraint-based procedure extending an existing approach previously used for the classical Job Shop Scheduling Problem; (2) it proposes an original relaxation strategy on feasible FJSSP solutions based on the idea of randomly breaking the execution orders of the activities on the machines and opening the resource options for some activities selected at random. The efficacy of the overall heuristic optimization algorithm is demonstrated on a set of well-known benchmarks.
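The relax-then-solve loop can be sketched on a deliberately simplified problem. The code below is not the paper's constraint-based procedure: it drops precedence constraints entirely and only assigns independent tasks to machines with machine-dependent durations, so it shows the shape of the iteration rather than an FJSSP solver. All function names, the greedy repair rule, and the sample durations are assumptions.

```python
import random


def makespan(assign, durations, n_machines):
    """Largest total load over machines for a task -> machine assignment."""
    load = [0] * n_machines
    for task, machine in assign.items():
        load[machine] += durations[task][machine]
    return max(load)


def relax(assign, rng, fraction):
    """Relaxation-step: retract a random subset of assignment decisions."""
    k = max(1, int(fraction * len(assign)))
    removed = rng.sample(sorted(assign), k)
    partial = {t: m for t, m in assign.items() if t not in removed}
    return partial, removed


def solve(partial, removed, durations, n_machines, rng):
    """Solving-step: greedily reinsert retracted tasks, earliest finish first."""
    assign = dict(partial)
    load = [0] * n_machines
    for task, machine in assign.items():
        load[machine] += durations[task][machine]
    rng.shuffle(removed)
    for task in removed:
        machine = min(range(n_machines), key=lambda j: load[j] + durations[task][j])
        assign[task] = machine
        load[machine] += durations[task][machine]
    return assign


def iterative_search(durations, n_machines, iterations=200, fraction=0.3, seed=0):
    rng = random.Random(seed)
    # Naive starting solution: every task on machine 0.
    current = {t: 0 for t in range(len(durations))}
    for _ in range(iterations):
        partial, removed = relax(current, rng, fraction)
        candidate = solve(partial, removed, durations, n_machines, rng)
        # Keep the candidate only if it does not worsen the makespan.
        if makespan(candidate, durations, n_machines) <= makespan(
            current, durations, n_machines
        ):
            current = candidate
    return current


# durations[task][machine]: assumed processing times on 2 machines.
durations = [[3, 5], [2, 4], [6, 3], [4, 4], [5, 2], [1, 3]]
initial = {t: 0 for t in range(len(durations))}
best = iterative_search(durations, n_machines=2)
```

A faithful implementation would instead retract ordering and resource decisions from a precedence-constrained schedule and rebuild it with a constraint-based solver, as the abstract describes.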