AITopics

Country: North America > United States (0.61)

Industry: Banking & Finance > Trading (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.31)

Li, Alexander Hanbo, Bradic, Jelena

Censored Quantile Regression Forest

arXiv.org Machine Learning Jan-8-2020

Random forests are a powerful non-parametric regression method but are severely limited in their usage in the presence of randomly censored observations, and when naively applied can exhibit poor predictive performance due to the incurred biases. Based on a local adaptive representation of random forests, we develop its regression adjustment for randomly censored regression quantile models. The regression adjustment is based on a new estimating equation that adapts to censoring and leads to the quantile score whenever the data do not exhibit censoring. The proposed procedure, named censored quantile regression forest, allows us to estimate quantiles of time-to-event without any parametric modeling assumption. We establish its consistency under mild model specifications. Numerical studies showcase a clear advantage of the proposed procedure.
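The quantile score mentioned in the abstract can be illustrated with a minimal sketch of the standard quantile (pinball) loss for uncensored data. This is not the authors' censored estimating equation or their forest procedure; the function names `pinball_loss` and `best_constant_quantile`, the toy data, and the integer search grid are assumptions made purely for illustration.

```python
import numpy as np

def pinball_loss(y_true, y_pred, tau):
    """Average quantile (pinball) loss at level tau in (0, 1), no censoring."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    # Under-prediction is weighted by tau, over-prediction by (1 - tau).
    return float(np.mean(np.maximum(tau * diff, (tau - 1.0) * diff)))

def best_constant_quantile(y, tau, candidates):
    """Pick the candidate constant that minimises the pinball loss (hypothetical helper)."""
    losses = [pinball_loss(y, c, tau) for c in candidates]
    return float(candidates[int(np.argmin(losses))])

# Toy, fully observed responses (illustrative only).
y = np.array([1.0, 2.0, 3.0, 4.0, 100.0])
grid = np.arange(0, 101, dtype=float)

# Minimising the loss at tau = 0.5 recovers the sample median.
median_hat = best_constant_quantile(y, 0.5, grid)
```

With censored observations this loss is biased, which is the gap the paper's adjusted estimating equation is meant to close.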

estimator, quantile, random forest, (14 more...)

2001.03458

Country:

North America > United States > California > San Diego County > San Diego (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Italy > Sicily > Palermo (0.04)

Genre: Research Report (1.00)

Industry: Law > Civil Rights & Constitutional Law (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.88)

Watson, Oliver P, Cortes-Ciriano, Isidro, Watson, James A

A semi-supervised learning framework for quantitative structure-activity regression modelling

arXiv.org Machine Learning Jan-7-2020

Supervised learning models, also known as quantitative structure-activity regression (QSAR) models, are increasingly used in assisting the process of preclinical, small-molecule drug discovery. The models are trained on data consisting of a finite-dimensional representation of molecular structures and their corresponding target-specific activities. These models can then be used to predict the activity of previously unmeasured novel compounds. In this work we address two problems related to this approach. The first is to estimate the extent to which the quality of the model predictions degrades for compounds very different from the compounds in the training data. The second is to adjust for the screening-dependent selection bias inherent in many training data sets. In the most extreme cases, only compounds which pass an activity-dependent screening are reported. By using a semi-supervised learning framework, we show that it is possible to make predictions which take into account the similarity of the testing compounds to those in the training data and adjust for the reporting selection bias. We illustrate this approach using publicly available structure-activity data on a large set of compounds reported by GlaxoSmithKline (the Tres Cantos AntiMalarial Set) to inhibit in vitro P. falciparum growth.

active compound, compound, training data, (16 more...)

2001.01924

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Thailand (0.04)

Genre: Research Report (0.64)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.81)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.60)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.41)

Ozer, Murat, Elsayed, Nelly, Varlioglu, Said, Li, Chengcheng

A Rule-Based Model for Victim Prediction

arXiv.org Artificial Intelligence Jan-7-2020

In this paper, we propose a novel automated model, called Vulnerability Index for Population at Risk (VIPAR) scores, to identify rare populations at risk of future shooting victimization. Likewise, the focused deterrence approach identifies vulnerable individuals and offers certain types of treatments (e.g., outreach services) to prevent violence in communities. The proposed rule-based engine model is the first AI-based model for victim prediction. This paper aims to compare the focused deterrence strategy list with the VIPAR score list in terms of their predictive power for future shooting victimizations. Drawing on criminological studies, the model uses age, past criminal history, and peer influence as the main predictors of future violence. Social network analysis is employed to measure the influence of peers on the outcome variable. The model also uses logistic regression analysis to verify the variable selections. Our empirical results show that VIPAR scores predict 25.8% of future shooting victims and 32.2% of future shooting suspects, whereas the focused deterrence list predicts 13% of future shooting victims and 9.4% of future shooting suspects. The model outperforms the intelligence list of focused deterrence policies in predicting future fatal and non-fatal shootings. Furthermore, we discuss concerns about the right to the presumption of innocence.

co-offending network, crime, victimization, (14 more...)

2001.01391

Country:

Europe > Portugal > Braga > Braga (0.05)
North America > United States > Ohio > Hamilton County > Cincinnati (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
(2 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Law > Criminal Law (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.93)
Government > Regional Government > North America Government > United States Government (0.93)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.69)

#artificialintelligence Jan-6-2020, 19:24:18 GMT

All Machine Learning Models Explained in 6 Minutes

In my previous article, I explained what regression is and showed how it can be used in practice. This week, I'm going to go over the majority of common machine learning models used in practice, so that I can spend more time building and improving models rather than explaining the theory behind them. All machine learning models are categorized as either supervised or unsupervised. If the model is a supervised model, it's then sub-categorized as either a regression or classification model. We'll go over what these terms mean and the corresponding models that fall into each category below.

decision tree, linear regression, regression, (11 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.75)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.36)

Farnadi, Golnoosh, Getoor, Lise, Moens, Marie-Francine, De Cock, Martine

User Profiling Using Hinge-loss Markov Random Fields

arXiv.org Machine Learning Jan-5-2020

A variety of approaches have been proposed to automatically infer the profiles of users from their digital footprint in social media. Most of the proposed approaches focus on mining a single type of information, while ignoring other sources of available user-generated content (UGC). In this paper, we propose a mechanism to infer a variety of user characteristics, such as, age, gender and personality traits, which can then be compiled into a user profile. To this end, we model social media users by incorporating and reasoning over multiple sources of UGC as well as social relations. Our model is based on a statistical relational learning framework using Hinge-loss Markov Random Fields (HL-MRFs), a class of probabilistic graphical models that can be defined using a set of first-order logical rules. We validate our approach on data from Facebook with more than 5k users and almost 725k relations. We show how HL-MRFs can be used to develop a generic and extensible user profiling framework by leveraging textual, visual, and relational content in the form of status updates, profile pictures and Facebook page likes. Our experimental results demonstrate that our proposed model successfully incorporates multiple sources of information and outperforms competing methods that use only one source of information or an ensemble method across the different sources for modeling of users in social media.

baseline psl-prior 0, characteristic, information, (13 more...)

2001.01177

Country:

North America > United States > Washington > Pierce County > Tacoma (0.04)
North America > United States > New Jersey (0.04)
North America > United States > California > Santa Cruz County > Santa Cruz (0.04)
(2 more...)

Genre: Research Report > New Finding (0.68)

Industry: Information Technology (0.47)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.70)
(2 more...)

#artificialintelligence Jan-4-2020, 05:50:49 GMT

Linear Regression Least Squares Method Machine Learning Tutorial myTectra

In this tutorial, we discuss the objective function used in the least squares method. The least squares (LS) method is a parameter estimation method used widely in engineering and across nearly all fields of science. It attempts to determine the mathematical relationship between the dependent value and the physical quantity. The LS method is presented as the optimal method for solving the regression problem. Call us on 91 90191 91856. Website: https://www.mytectra.com
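As a rough illustration of the least squares idea described here, the sketch below fits a straight line by minimising the sum of squared residuals with NumPy's `numpy.linalg.lstsq`. The helper name `fit_line_least_squares` and the toy data are assumptions for illustration and are not taken from the tutorial.

```python
import numpy as np

def fit_line_least_squares(x, y):
    """Fit y ~ slope * x + intercept by ordinary least squares (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Design matrix: one column for x, one column of ones for the intercept.
    design = np.column_stack([x, np.ones_like(x)])
    # lstsq returns the coefficients minimising ||design @ coef - y||^2.
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    slope, intercept = coef
    return float(slope), float(intercept)

# Toy data lying exactly on y = 2x + 1 (illustrative only).
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 7.0]
slope, intercept = fit_line_least_squares(x, y)
```

On noisy data the same call returns the line with the smallest total squared vertical distance to the points.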

method machine learning tutorial mytectra

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.47)

Uras, Nicola, Marchesi, Lodovica, Marchesi, Michele, Tonelli, Roberto

Forecasting Bitcoin closing price series using linear regression and neural networks models

arXiv.org Machine Learning Jan-4-2020

This is probably due to at least two reasons: the high volatility of the Bitcoin price and the immaturity of the cryptocurrency market. This is confirmed by the statistics reported in tables 1 and 2. The results obtained by partitioning the dataset into shorter sequences also confirmed the validity of our hypothesis that there are time regimes which do not resemble a random walk and are easier to model, and showed that the best results are obtained using more than one previous price. It is worth noting that with this novel approach we obtained the best results for the Bitcoin price series, rather than for the stock market series as happened in the analysis of the series in their entirety. As stated before, this is probably due to the high volatility of the Bitcoin price; in fact, it is no accident that the best result was found for the time regime identified by a translation step h of 120, where the Bitcoin prices are more closely distributed around the mean, showing a lower variance. This is confirmed by the standard deviation values shown in table 2. It is important to emphasize that the innovative approach proposed in this paper, namely the identification of short-time regimes within the entire series, allowed us to obtain leading-edge results in the field of financial series forecasting.

algorithm, best result, time series, (16 more...)

2001.01127

Country:

North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.05)
Europe > Italy > Sardinia > Cagliari (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report > New Finding (0.94)

Industry: Banking & Finance > Trading (1.00)

Technology:

Information Technology > e-Commerce > Financial Technology (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Papacharalampous, Georgia, Tyralis, Hristos

Hydrological time series forecasting using simple combinations: Big data testing and investigations on one-year ahead river flow predictability

arXiv.org Machine Learning Jan-2-2020

Delivering useful hydrological forecasts is critical for urban and agricultural water management, hydropower generation, flood protection and management, drought mitigation and alleviation, and river basin planning and management, among others. In this work, we present and appraise a new methodology for hydrological time series forecasting. This methodology is based on simple combinations. The appraisal is made using a big dataset consisting of 90-year-long mean annual river flow time series from approximately 600 stations. Covering large parts of North America and Europe, these stations represent various climate and catchment characteristics, and thus can collectively support benchmarking. Five individual forecasting methods and 26 variants of the introduced methodology are applied to each time series. The application is made in one-step-ahead forecasting mode. The individual methods are the last-observation benchmark, simple exponential smoothing, complex exponential smoothing, automatic autoregressive fractionally integrated moving average (ARFIMA) and Facebook's Prophet, while the 26 variants are defined by all the possible combinations (per two, three, four or five) of the five aforementioned methods. The findings have both practical and theoretical implications. The simple methodology of the study is identified as well-performing in the long run. Our large-scale results are additionally exploited for finding an interpretable relationship between predictive performance and temporal dependence in the river flow time series, and for examining one-year-ahead river flow predictability.
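The count of 26 variants follows from choosing two, three, four or five of the five methods: 10 + 10 + 5 + 1 = 26. A minimal sketch of that enumeration is given below, assuming each combination is an unweighted mean of the individual one-step-ahead forecasts, which the abstract does not spell out. Only the last-observation benchmark and simple exponential smoothing are implemented; `series_mean` is a hypothetical stand-in for the remaining methods, and all function names and the toy series are assumptions for illustration.

```python
from itertools import combinations
from statistics import mean

def last_observation(series):
    """Last-observation benchmark: forecast the most recent value."""
    return series[-1]

def simple_exponential_smoothing(series, alpha=0.5):
    """One-step-ahead forecast from simple exponential smoothing (alpha is assumed)."""
    level = series[0]
    for value in series[1:]:
        level = alpha * value + (1 - alpha) * level
    return level

def series_mean(series):
    """Hypothetical stand-in for the other methods, not from the paper."""
    return mean(series)

def combination_forecasts(series, methods, min_size=2):
    """Average the individual forecasts over every subset of at least min_size methods."""
    individual = {name: method(series) for name, method in methods.items()}
    combos = {}
    for size in range(min_size, len(methods) + 1):
        for names in combinations(sorted(individual), size):
            combos[names] = mean(individual[name] for name in names)
    return individual, combos

def count_combinations(n_methods, min_size=2):
    """Number of subsets of n_methods with at least min_size members."""
    return sum(
        1
        for size in range(min_size, n_methods + 1)
        for _ in combinations(range(n_methods), size)
    )

methods = {
    "last_observation": last_observation,
    "simple_exponential_smoothing": simple_exponential_smoothing,
    "series_mean": series_mean,
}
annual_flow = [10.0, 12.0, 11.0, 13.0]  # toy series, illustrative only
individual, combos = combination_forecasts(annual_flow, methods)
n_variants = count_combinations(5)
```

With three methods the sketch yields four combinations; with the five methods named in the abstract, `count_combinations(5)` gives the 26 variants.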

forecast, forecasting, time series, (12 more...)

2001.00811

Country:

Europe > Greece (0.14)
Europe > Austria > Vienna (0.14)
North America > United States > New York (0.05)
(10 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Water & Waste Management > Water Management (0.66)
Energy > Renewable (0.48)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)