Collaborating Authors

 Popescu, Ionel


From Monte Carlo to neural networks approximations of boundary value problems

arXiv.org Artificial Intelligence

In this paper we study probabilistic and neural network approximations for solutions to the Poisson equation subject to Hölder data in general bounded domains of $\mathbb{R}^d$. We aim at two fundamental goals. First, and most importantly, we show that the solution to the Poisson equation can be numerically approximated in the sup-norm by Monte Carlo methods, and that this can be done highly efficiently if we use a modified version of the walk on spheres algorithm as an acceleration method. This provides estimates which are efficient with respect to the prescribed approximation error and with polynomial complexity in the dimension and the reciprocal of the error. A crucial feature is that the overall number of samples does not depend on the point at which the approximation is performed. As a second goal, we show that the obtained Monte Carlo solver renders, in a constructive way, ReLU deep neural network (DNN) solutions to the Poisson problem, whose sizes depend at most polynomially on the dimension $d$ and on the desired error. In fact, we show that the random DNN provides, with high probability, a small approximation error and low polynomial complexity in the dimension.
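To illustrate the walk on spheres idea named in this abstract, here is a minimal sketch of the classical (unmodified) algorithm. It is not the authors' accelerated version: it assumes the simpler Laplace equation (zero source term) on the unit disk in two dimensions, and the function names, the stopping tolerance `eps`, and the test boundary data `g` are all hypothetical choices made for this example.

```python
import math
import random


def walk_on_spheres(x, y, boundary_value, eps, rng):
    """Draw one sample of the boundary data seen by a walk started at (x, y).

    Assumption: the domain is the unit disk, so the distance to the
    boundary is simply 1 - |(x, y)|.
    """
    while True:
        radius = 1.0 - math.hypot(x, y)  # largest circle that fits inside the disk
        if radius < eps:
            break  # close enough to the boundary: stop the walk
        # Jump to a uniformly distributed point on that circle.
        theta = rng.uniform(0.0, 2.0 * math.pi)
        x += radius * math.cos(theta)
        y += radius * math.sin(theta)
    # Project onto the boundary and read off the boundary data there.
    norm = math.hypot(x, y)
    return boundary_value(x / norm, y / norm)


def estimate(x, y, boundary_value, n_samples=20000, eps=1e-3, seed=0):
    """Monte Carlo average of independent walks started at (x, y)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        total += walk_on_spheres(x, y, boundary_value, eps, rng)
    return total / n_samples


def g(bx, by):
    """Hypothetical boundary data: the trace of the harmonic function x^2 - y^2."""
    return bx * bx - by * by


approx = estimate(0.3, 0.4, g)
exact = 0.3 ** 2 - 0.4 ** 2  # the harmonic extension of g evaluated at (0.3, 0.4)
print(f"estimate = {approx:.4f}, exact = {exact:.4f}")
```

Because `x^2 - y^2` is harmonic, the exact solution is known and the Monte Carlo average can be compared against it directly. Handling a nonzero source term, as the Poisson problem in the abstract requires, would add a correction at each step and is left out of this sketch.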


Inverse problem for parameters identification in a modified SIRD epidemic model using ensemble neural networks

arXiv.org Artificial Intelligence

In this paper, we propose a parameter identification methodology for the SIRD model, an extension of the classical SIR model that treats the deceased as a separate category. In addition, our model includes one parameter which is the ratio between the real total number of infected and the number of infected documented in the official statistics. Due to many factors, such as governmental decisions, several variants circulating, and the opening and closing of schools, the typical assumption that the parameters of the model stay constant for long periods of time is not realistic. Thus our objective is to create a method which works for short periods of time. To this end, we base the estimation on the previous 7 days of data and then use the identified parameters to make predictions. To estimate the parameters we propose the average of an ensemble of neural networks. Each neural network is constructed based on a database built by solving the SIRD model for 7 days with random parameters. In this way, the networks learn the parameters from the solution of the SIRD model. Lastly, we use the ensemble to get estimates of the parameters from the real data of Covid19 in Romania, and then we illustrate the predictions for the number of deaths over different periods of time, from 10 up to 45 days. The main goal was to apply this approach to the analysis of the COVID-19 evolution in Romania, but it was also exemplified on other countries such as Hungary, the Czech Republic and Poland, with similar results. The results are backed by a theorem which guarantees that we can recover the parameters of the model from the reported data. We believe this methodology can be used as a general tool for short-term predictions of infectious diseases or in other compartmental models.
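The database step described in this abstract (solving the SIRD model over 7 days for given parameters) can be sketched as follows. This is a hedged illustration that assumes the standard textbook SIRD equations and a simple forward Euler integrator; it does not include the authors' extra parameter for undocumented infections, and the names `beta`, `gamma`, `mu` and the numerical values are hypothetical.

```python
def simulate_sird(s0, i0, r0, d0, beta, gamma, mu, days=7, steps_per_day=100):
    """Integrate a textbook SIRD model with forward Euler and return daily states.

    Assumed equations (N is the fixed total population):
        S' = -beta * S * I / N
        I' =  beta * S * I / N - (gamma + mu) * I
        R' =  gamma * I
        D' =  mu * I
    """
    n = s0 + i0 + r0 + d0
    dt = 1.0 / steps_per_day
    s, i, r, d = float(s0), float(i0), float(r0), float(d0)
    daily = [(s, i, r, d)]  # state at day 0
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i / n
            ds = -new_infections
            di = new_infections - (gamma + mu) * i
            dr = gamma * i
            dd = mu * i
            s, i, r, d = s + dt * ds, i + dt * di, r + dt * dr, d + dt * dd
        daily.append((s, i, r, d))  # record the state at the end of each day
    return daily


# Hypothetical initial condition and rates, chosen only for illustration.
trajectory = simulate_sird(
    s0=999_000, i0=1_000, r0=0, d0=0, beta=0.3, gamma=0.1, mu=0.01
)
for day, (s, i, r, d) in enumerate(trajectory):
    print(f"day {day}: infected = {i:.1f}, deceased = {d:.1f}")
```

Repeating such a simulation for many randomly drawn `(beta, gamma, mu)` triples would give pairs of 7-day trajectories and parameters, which is the kind of training set the abstract describes; how the authors sample parameters and design the networks is not specified here.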


Interpolation property of shallow neural networks

arXiv.org Artificial Intelligence

We study the geometry of global minima of the loss landscape of overparametrized neural networks. In light of the interpolation threshold outlined in [1], one of the important issues in neural networks is to have guarantees that the interpolation is indeed achieved. We tackle this problem for the case of shallow neural networks and show that this holds true in general as long as the activation is not a polynomial of low degree. Standard optimization problems are posed in the case where the loss function is convex, in which case there is only a global minimum. Another class of optimization problems is for nonconvex loss functions which have a discrete number of global minima. Recently there has been interesting progress aimed at understanding the locus of the global minima for overparametrized neural networks ([3], [6]) when the activation function is continuous. In this paper, we generalize these results in Section 2 to a larger class of activation functions. More precisely, we prove that in the overparametrized regime, we can interpolate any data set consisting of $d$ points with a shallow neural network having at least $d$ neurons on the hidden layer and with an activation function which is locally integrable and not almost everywhere a polynomial of degree at most $d-2$. In addition, if the activation function is also smooth, the locus of global minima of the loss landscape of an overparametrized neural network is a submanifold of R


Recover the spectrum of covariance matrix: a non-asymptotic iterative method

arXiv.org Machine Learning

It is well known that the sample covariance has a consistent bias in its spectrum; for example, the spectrum of a Wishart matrix follows the Marchenko-Pastur law. In this work we introduce an iterative algorithm, 'Concent', that actively eliminates this bias and recovers the true spectrum for small and moderate dimensions.


A regime switching on Covid19 analysis and prediction in Romania

arXiv.org Machine Learning

In this paper we propose a regime separation for the analysis of Covid19 in Romania, combined with the SIR and SIRD mathematical models. The main regimes we study are the free spread of the virus, the quarantine and partial relaxation, and finally the relaxation regime. The main model we use is SIR, which is a classical model, but because we cannot fully trust the numbers of infected or recovered, we base our analysis on the number of deceased people, which is more reliable. To deal with this, we introduce a simple modification of the SIR model to account for the deceased separately. This in turn is our base for fitting the parameters. The estimation of the parameters is done in two steps. The first consists in training a neural network based on SIR models to detect the regime changes. Once this is done, we fit the main parameters of the SIRD model using a grid search. Finally, we use the fitted parameters to make some predictions of the evolution over a time frame of a month.


An Analytical Formula for Spectrum Reconstruction

arXiv.org Machine Learning

We study the spectrum reconstruction technique. Eigenvalues play an important role in many research fields and are the foundation of many practical techniques such as PCA (Principal Component Analysis). We believe that related algorithms should perform better with more accurate spectrum estimation. An approximation formula was proposed previously, but without proof. In our research, we show why the formula works. Moreover, when both the number of features and the dimension of the space go to infinity, we find the order of the error for the approximation formula, which is related to a constant $c$, the ratio between the dimension of the space and the number of features.


A cost-reducing partial labeling estimator in text classification problem

arXiv.org Machine Learning

We propose a new approach to text classification problems in which learning with partial labels is beneficial. Instead of offering each training sample a set of candidate labels, we assign negative-oriented labels to the ambiguous training examples if they are unlikely to fall into certain classes. We construct new maximum likelihood estimators with a self-correction property, and prove that under some conditions our estimators converge faster. We also discuss the advantages of applying one of our estimators to a fully supervised learning problem. The proposed method has potential applicability in many areas, such as crowdsourcing, natural language processing and medical image analysis.


Naive Bayes with Correlation Factor for Text Classification Problem

arXiv.org Machine Learning

The Naive Bayes estimator is widely used in text classification problems. However, it does not perform well with small training datasets. We propose a new method based on the Naive Bayes estimator to solve this problem. A correlation factor is introduced to incorporate the correlation among different classes. Experimental results show that our estimator achieves better accuracy than traditional Naive Bayes on real-world data.