 covid-19 data


Learning Graph ARMA Processes from Time-Vertex Spectra

arXiv.org Machine Learning

Many modern digital platforms involve the acquisition of data over networks, while network data has a typically time-varying structure. For instance, measurements acquired on a sensor network or user data in a social network often vary over time. Such data can be modeled as time-varying graph signals, or time-vertex signals. In many practical applications, time-vertex signals may have missing observations due to issues such as sensor failure, connection loss, and partial availability of user statistics. Hence, the spatio-temporal interpolation of time-vertex signals arises as an important problem of interest. [...] stationary process models, ARMA models widely used in classical signal processing have also been adapted to graph domains in several recent works [5], [6]. Meanwhile, the computation of an ARMA process model is a challenging problem in graph domains as it typically involves the solution of highly nonlinear and nonconvex optimization problems. The problem of learning graph ARMA process models has been addressed in the previous studies [3], [5], [6]; however, none of these studies explicitly aim to capture the specific time-vertex [...]


Module-based regularization improves Gaussian graphical models when observing noisy data

arXiv.org Artificial Intelligence

Inferring relations from correlational data allows researchers across the sciences to uncover complex connections between variables and gain insight into the underlying mechanisms. Researchers often represent the inferred relations using Gaussian graphical models, which require regularization to sparsify the models. Acknowledging that the modular structure of the inferred network is often studied, we suggest module-based regularization to balance under- and overfitting. Compared with the graphical lasso, a standard approach using the Gaussian log-likelihood for estimating the regularization strength, this approach better recovers and infers modular structure in noisy synthetic and real data. The module-based regularization technique improves the usefulness of Gaussian graphical models in the many applications where they are employed.


Improved Bitcoin Price Prediction based on COVID-19 data

arXiv.org Artificial Intelligence

Social turbulence can affect people's financial decisions, causing changes in spending and saving. During global turbulence as significant as the COVID-19 pandemic, such changes are inevitable. Here we examine how the effects of COVID-19 on various jurisdictions influenced the global price of Bitcoin. We hypothesize that lockdowns and expectations of economic recession erode people's trust in fiat (government-issued) currencies, thus elevating cryptocurrencies. Hence, we expect to identify a causal relation between the turbulence caused by the pandemic, demand for Bitcoin, and ultimately its price. To test the hypothesis, we merged datasets of Bitcoin prices and COVID-19 cases and deaths. We also engineered extra features and applied statistical and machine learning (ML) models. We applied a Random Forest (RF) model to identify and rank feature importance, and ran a Long Short-Term Memory (LSTM) model on the Bitcoin price dataset twice: with and without COVID-19-related features. We find that adding COVID-19 data to the LSTM model improved the prediction of Bitcoin prices.


A Roadmap to Asymptotic Properties with Applications to COVID-19 Data

arXiv.org Artificial Intelligence

A good estimator should, at least in the asymptotic sense, be close to the true quantity that it aims to estimate, and we should be able to give an uncertainty measure based on a finite sample size. An estimator with well-behaved asymptotic properties can help clinicians in many ways, such as reducing the number of patients needed in a trial, cutting down the budget for toxicology studies and providing insightful findings for late-phase trials. As suggested by Sir Ronald Fisher [1], generations of statisticians have worked on the so-called "consistency" and "asymptotic normality" of estimators. The former is based on different versions of the law of large numbers (LLN) and the latter is based on various types of central limit theorems (CLT) [2]. In addition to these two main tools, statisticians also apply other important but less well-known results in probability theory and other mathematical fields. Examples include extreme value theory for distributions of maxima and minima [3], convex analysis for checking the optimality of a statistical design [4], asymptotic relative efficiency (ARE) of an estimator [5], concentration inequalities for finite sample properties and selection consistency [6], and other non-normal limits, robustness and simultaneous confidence bands of common statistical estimators [7, 8]. Despite these other properties, consistency and asymptotic normality remain the most celebrated and important properties of statistical estimators in both academia and industry. Hence, in the following, we present a roadmap to consistency and asymptotic normality. Then we provide their applications in toxicology studies and clinical trials using a COVID-19 dataset.
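
The two properties named in this abstract can be illustrated with a small simulation. The following is a minimal sketch and is not taken from the paper: it assumes exponentially distributed data with true mean 1, uses the sample mean as the estimator, and the function name `sample_mean_errors` is hypothetical. Consistency shows up as a shrinking estimation error as the sample size n grows; asymptotic normality shows up as the error scaled by sqrt(n) having a stable spread close to the population standard deviation.

```python
import math
import random
import statistics


def sample_mean_errors(n, reps, rate=1.0, seed=0):
    """Return `reps` values of sqrt(n) * (sample mean - true mean) for Exp(rate) data."""
    rng = random.Random(seed)
    true_mean = 1.0 / rate
    errors = []
    for _ in range(reps):
        xs = [rng.expovariate(rate) for _ in range(n)]
        errors.append(math.sqrt(n) * (statistics.fmean(xs) - true_mean))
    return errors


small = sample_mean_errors(n=10, reps=1000)
large = sample_mean_errors(n=1000, reps=1000)

# Consistency (LLN): the unscaled error |mean - truth| shrinks as n grows.
mae_small = statistics.fmean(abs(e) / math.sqrt(10) for e in small)
mae_large = statistics.fmean(abs(e) / math.sqrt(1000) for e in large)

# Asymptotic normality (CLT): the sqrt(n)-scaled error has a standard
# deviation close to the population standard deviation (1 / rate = 1 here).
sd_large = statistics.pstdev(large)

print(f"mean abs error, n=10:   {mae_small:.4f}")
print(f"mean abs error, n=1000: {mae_large:.4f}")
print(f"sd of scaled error, n=1000: {sd_large:.3f}")
```

The scaled error is what supports a finite-sample uncertainty statement: under these assumptions, an approximate 95% interval for the mean is the sample mean plus or minus 1.96 times the sample standard deviation divided by sqrt(n).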


On the intrinsic dimensionality of Covid-19 data: a global perspective

arXiv.org Machine Learning

This paper aims to develop a global perspective of the complexity of the relationship between the standardised per-capita growth rate of Covid-19 cases, deaths, and the OxCGRT Covid-19 Stringency Index, a measure describing a country's stringency of lockdown policies. To achieve our goal, we use a heterogeneous intrinsic dimension estimator implemented as a Bayesian mixture model, called Hidalgo. We identify that the Covid-19 dataset may project onto two low-dimensional manifolds without significant information loss. The low dimensionality suggests strong dependency among the standardised growth rates of cases and deaths per capita and the OxCGRT Covid-19 Stringency Index for a country over 2020-2021. Given the low dimensional structure, it may be feasible to model observable Covid-19 dynamics with few parameters. Importantly, we identify spatial autocorrelation in the intrinsic dimension distribution worldwide. Moreover, we highlight that high-income countries are more likely to lie on low-dimensional manifolds, likely arising from aging populations, comorbidities, and increased per capita mortality burden from Covid-19. Finally, we temporally stratify the dataset to examine the intrinsic dimension at a more granular level throughout the Covid-19 pandemic.
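
To give a feel for what an intrinsic dimension estimate is, the sketch below implements a much simpler estimator than the one used in the paper: a single global two-nearest-neighbour (TWO-NN) maximum-likelihood estimate, which assumes one dimension for the whole dataset, whereas Hidalgo allows the dimension to vary across the data through a Bayesian mixture. The synthetic data and the function name `two_nn_dimension` are assumptions made for illustration only.

```python
import math
import random


def two_nn_dimension(points):
    """Global TWO-NN estimate: N / sum(log(r2 / r1)), where r1 and r2 are the
    distances from each point to its first and second nearest neighbours."""
    log_ratios = []
    for a_idx, a in enumerate(points):
        d1 = d2 = math.inf
        for b_idx, b in enumerate(points):
            if a_idx == b_idx:
                continue
            d = math.dist(a, b)
            if d < d1:
                d1, d2 = d, d1
            elif d < d2:
                d2 = d
        log_ratios.append(math.log(d2 / d1))
    return len(points) / sum(log_ratios)


# Synthetic example: a 2-dimensional plane embedded linearly in 5 dimensions.
rng = random.Random(0)
plane = []
for _ in range(400):
    u, v = rng.random(), rng.random()
    plane.append((u, v, u + v, u - v, 2.0 * u))

estimate = two_nn_dimension(plane)
print(f"estimated intrinsic dimension: {estimate:.2f}")
```

Although each point has five coordinates, the estimate should come out near 2, reflecting the number of free parameters that generated the data.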


Artificial intelligence expert to speak at WCSU about COVID data

#artificialintelligence

Western Connecticut State University Department of Philosophy and Humanistic Studies will present Dr. Rick Lawrence, of Ridgefield, for a discussion, "COVID-19: Perspectives from a Data Scientist," at 5:30 p.m. on Wednesday, Nov. 3, in Room 125 of the Science Building on the university's Midtown campus, 181 White St., Danbury. The talk is free and open to the public in person (masks must be worn) or virtually through this link. The program is also sponsored by WCSU's Department of Computer Science and Department of Mathematics. Lawrence currently volunteers as the COVID data scientist on Ridgefield's COVID-19 Task Force, providing daily analysis of the latest COVID-19 data to help town officials make science-based policy decisions, and provides periodic analysis of vaccination rates to the Office of the Governor of Connecticut. Lawrence's work has evolved from nuclear science to computer science to machine learning and, most recently, to quantitative finance.


Pandemic model with data-driven phase detection, a study using COVID-19 data

arXiv.org Artificial Intelligence

The recent COVID-19 pandemic has promoted vigorous scientific activity in an effort to understand, advise on, and control the pandemic. Data is now freely available at a staggering rate worldwide. Unfortunately, this unprecedented level of information contains a variety of data sources and formats, and the models do not always conform to the description of the data. Health officials have recognized the need for more accurate models that can adjust to sudden changes, such as those produced by changes in behavior or social restrictions. In this work we formulate a model that fits an "SIR"-type model concurrently with a statistical change detection test on the data. The result is a piecewise autonomous ordinary differential equation, whose parameters change at various points in time (automatically learned from the data). The main contributions of our model are: (a) providing interpretation of the parameters, (b) determining which parameters of the model are more important to produce changes in the spread of the disease, and (c) using data-driven discovery of sudden changes in the evolution of the pandemic. Together, these characteristics provide a new model that better describes the situation and thus provides better-quality information for decision making.
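
The idea of a piecewise autonomous SIR system can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's model: the change point and the parameter values are supplied by hand rather than learned by a change detection test, the standard SIR equations are integrated with a simple forward-Euler scheme, and the function name `simulate_piecewise_sir` is hypothetical.

```python
def simulate_piecewise_sir(s0, i0, r0, segments, gamma, dt=0.1):
    """Forward-Euler integration of SIR in population fractions.

    `segments` is a list of (duration_in_days, beta) pairs; the transmission
    rate beta is constant within a segment and jumps at segment boundaries.
    Returns a list of (t, S, I, R) tuples.
    """
    s, i, r, t = s0, i0, r0, 0.0
    trajectory = [(t, s, i, r)]
    for duration, beta in segments:
        steps = int(round(duration / dt))
        for _ in range(steps):
            new_infections = beta * s * i * dt
            new_recoveries = gamma * i * dt
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
            t += dt
            trajectory.append((t, s, i, r))
    return trajectory


# Assumed scenario: 30 days of free spread, then 30 days with reduced contact.
trajectory = simulate_piecewise_sir(
    s0=0.99, i0=0.01, r0=0.0,
    segments=[(30, 0.4), (30, 0.08)],
    gamma=0.1,
)
infected_at_change = trajectory[300][2]   # end of the first segment (30 / 0.1 steps)
infected_at_end = trajectory[-1][2]
print(f"infected fraction at change point: {infected_at_change:.4f}")
print(f"infected fraction at end:          {infected_at_end:.4f}")
```

Within each segment the equations do not depend on time explicitly, which is what makes the system piecewise autonomous; only the jump in beta at the boundary introduces time dependence.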


Modeling Effect of Lockdowns and Other Effects on India Covid-19 Infections Using SEIR Model and Machine Learning

arXiv.org Artificial Intelligence

The SEIR model is a widely used epidemiological model for predicting the rise in infections. It has been applied in different countries to predict the number of Covid-19 cases. However, the original SEIR model does not take into account the effect of factors such as lockdowns, vaccines, and re-infections. In India, the first wave of Covid started in March 2020 and the second wave in April 2021. In this paper, we modify the SEIR model equations to model the effect of lockdowns and other influencers, and fit the model to data on daily Covid-19 infections in India using lmfit, a Python library for least-squares minimization and curve fitting. We model the R0 parameter in the standard SEIR model as a rectangle function in order to account for the effect of lockdowns. Our modified SEIR model accurately fits the available data of infections.
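
A rectangle-shaped R0 can be sketched as a reproduction number that drops to a lower value during a lockdown window and returns to its baseline afterwards. The code below is a minimal sketch under assumptions, not the paper's equations: it uses the textbook SEIR system with beta(t) = R0(t) * gamma, forward-Euler integration, and made-up parameter values. The fitting step that the paper performs with lmfit is omitted, and the names `rectangular_r0` and `simulate_seir` are hypothetical.

```python
def rectangular_r0(t, r0_base, r0_lockdown, start, end):
    """Rectangle-shaped R0: lowered during [start, end), baseline otherwise."""
    return r0_lockdown if start <= t < end else r0_base


def simulate_seir(n, e0, i0, days, sigma, gamma, r0_of_t, dt=0.1):
    """Forward-Euler SEIR; returns the number of infectious people per day."""
    s, e, i, r = n - e0 - i0, e0, i0, 0.0
    infectious_per_day = [i]
    steps_per_day = int(round(1.0 / dt))
    for day in range(days):
        for k in range(steps_per_day):
            t = day + k * dt
            beta = r0_of_t(t) * gamma          # transmission rate from R0(t)
            new_exposed = beta * s * i / n * dt
            new_infectious = sigma * e * dt    # 1 / sigma = incubation period
            new_recovered = gamma * i * dt     # 1 / gamma = infectious period
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - new_recovered
            r += new_recovered
        infectious_per_day.append(i)
    return infectious_per_day


# Assumed values for illustration only.
common = dict(n=1_000_000, e0=0.0, i0=10.0, days=60, sigma=1 / 5, gamma=1 / 7)
no_lockdown = simulate_seir(r0_of_t=lambda t: 2.5, **common)
with_lockdown = simulate_seir(
    r0_of_t=lambda t: rectangular_r0(t, 2.5, 0.8, start=20, end=80), **common
)
print(f"infectious on day 60 without lockdown: {no_lockdown[60]:.0f}")
print(f"infectious on day 60 with lockdown:    {with_lockdown[60]:.0f}")
```

In a fitting setup, the rectangle's height and edges (here `r0_lockdown`, `start`, `end`) would be among the free parameters adjusted by the least-squares routine so that the simulated curve matches the reported daily infections.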


Modeling COVID-19 uncertainties evolving over time and density-dependent social reinforcement and asymptomatic infections

arXiv.org Machine Learning

The novel coronavirus disease 2019 (COVID-19) presents unique and unknown problem complexities and modeling challenges, where an imperative task is to model both its process and data uncertainties, represented in implicit infections with a high proportion undocumented, asymptomatic contagion, social reinforcement of infections, and various quality issues in the reported data. These uncertainties become even more pronounced in the overwhelming mutation-dominated resurgences with vaccinated but still susceptible populations. Here we introduce a novel hybrid approach to (1) characterizing and distinguishing Undocumented (U) and Documented (D) infections commonly seen during COVID-19 incubation periods and asymptomatic infections by expanding the foundational compartmental epidemic Susceptible-Infected-Recovered (SIR) model with two compartments, resulting in a new Susceptible-Undocumented infected-Documented infected-Recovered (SUDR) model; (2) characterizing the probabilistic density of infections by empowering SUDR to capture exogenous processes like clustering contagion interactions, superspreading and social reinforcement; and (3) approximating the density likelihood of COVID-19 prevalence over time by incorporating Bayesian inference into SUDR. Unlike existing COVID-19 models, SUDR characterizes the undocumented infections during unknown transmission processes. To capture the uncertainties of temporal transmission and social reinforcement during the COVID-19 contagion, the transmission rate is modeled by a time-varying density function of undocumented infectious cases. We solve the model by sampling from the mean-field posterior distribution with reasonable priors, making SUDR suitable for handling the randomness, noise and sparsity of COVID-19 observations widely seen in the public COVID-19 case data.


Development of the InBan_CIDO Ontology by Reusing the Concepts along with Detecting Overlapping Information

arXiv.org Artificial Intelligence

The Covid19 pandemic is a global emergency that badly impacted the economies of various countries. Covid19 hit India when the growth rate of the country was at its lowest in the last 10 years. To semantically analyze the impact of this pandemic on the economy, it is crucial to have an ontology. The CIDO ontology is a well-standardized ontology that is specially designed to assess the impact of coronavirus disease and utilize its results for future decision forecasting for the government, industry experts, and professionals in various domains such as research, medical advancement, and the adoption of technical innovations. However, this ontology does not analyze the impact of the Covid19 pandemic on the Indian banking sector. On the other hand, the Covid19IBO ontology has been developed to analyze the impact of the Covid19 pandemic on the Indian banking sector, but this ontology does not reflect complete information about Covid19 data. As a result, users cannot get all the relevant information about Covid19 and its impact on the Indian economy. This article aims to extend the CIDO ontology to show the impact of Covid19 on the Indian economy sector by reusing concepts from other data sources. We also provide a simplified schema matching approach that detects overlapping information among the ontologies. The experimental analysis shows that the proposed approach has reasonable results.