AITopics

doi: 10.1145/3336191.3371768

2001.05228

Country:

North America > United States > Texas > Harris County > Houston (0.05)
Asia > India (0.04)
North America > United States > New York > New York County > New York City (0.04)
(2 more...)

Genre: Research Report (0.64)

Industry:

Marketing (0.85)
Information Technology > Services (0.71)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.93)

#artificialintelligenceJan-19-2020, 22:58:17 GMT

Balancing Interpretability and Predictive Power with Cubist Models in R

Machine learning models are powerful tools that do well in their purpose of prediction. In many business applications, the power of these models is quite beneficial. With any application of a machine learning model, the process to choosing which model involves determining the model that performs best across a given set of criteria. One of these criteria is the interpretability of the model. Neural nets to decision trees, to regression models all have varying levels of interpretability.

balancing interpretability and predictive power, prediction, regression model, (8 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.43)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.36)

Kulin, Merima, Kazaz, Tarik, Moerman, Ingrid, de Poorter, Eli

A survey on Machine Learning-based Performance Improvement of Wireless Networks: PHY, MAC and Network layer

arXiv.org Machine LearningJan-18-2020

This paper provides a systematic and comprehensive survey that reviews the latest research efforts focused on machine learning (ML) based performance improvement of wireless networks, while considering all layers of the protocol stack (PHY, MAC and network). First, the related work and paper contributions are discussed, followed by providing the necessary background on data-driven approaches and machine learning for non-machine learning experts to understand all discussed techniques. Then, a comprehensive review is presented on works employing ML-based approaches to optimize the wireless communication parameters settings to achieve improved network quality-of-service (QoS) and quality-of-experience (QoE). We first categorize these works into: radio analysis, MAC analysis and network prediction approaches, followed by subcategories within each. Finally, open challenges and broader perspectives are discussed.

algorithm, neural network, wireless network, (15 more...)

2001.04561

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)
Europe > Netherlands > South Holland > Delft (0.04)
(8 more...)

Genre:

Overview (1.00)
Research Report > New Finding (0.67)
Instructional Material > Course Syllabus & Notes (0.67)

Industry:

Telecommunications > Networks (1.00)
Information Technology > Security & Privacy (1.00)
Information Technology > Networks (1.00)
(5 more...)

Technology:

Information Technology > Communications > Networks > Sensor Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
(6 more...)

#artificialintelligenceJan-17-2020, 03:23:58 GMT

Does Machine Learning Improve Prediction of VA Primary Care Reliance?

Machine learning models, used to predict future use of primary care services from the Veterans Affairs (VA) Health Care System, did not outperform traditional regression models. ABSTRACT Objectives: The Veterans Affairs (VA) Health Care System is among the largest integrated health systems in the United States. Many VA enrollees are dual users of Medicare, and little research has examined methods to most accurately predict which veterans will be mostly reliant on VA services in the future. This study examined whether machine learning methods can better predict future reliance on VA primary care compared with traditional statistical methods. Study Design: Observational study of 83,143 VA patients dually enrolled in fee-for-service Medicare using VA and Medicare administrative databases and the 2012 Survey of Healthcare Experiences of Patients.

prediction, reliance, veteran, (13 more...)

#artificialintelligence

Country: North America > United States (1.00)

Genre: Research Report > Experimental Study (0.90)

Industry:

Health & Medicine > Government Relations & Public Policy (1.00)
Government > Regional Government > North America Government > United States Government (0.99)
Health & Medicine > Health Care Providers & Services > Reimbursement (0.84)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.39)

Shahhosseini, Mohsen, Hu, Guiping, Archontoulis, Sotirios V.

Forecasting Corn Yield with Machine Learning Ensembles

The emerge of new technologies to synthesize and analyze big data with high-performance computing, has increased our capacity to more accurately predict crop yields. Recent research has shown that Machine learning (ML) can provide reasonable predictions, faster, and with higher flexibility compared to simulation crop modeling. The earlier the prediction during the growing season the better, but this has not been thoroughly investigated as previous studies considered all data available to predict yields. This paper provides a machine learning based framework to forecast corn yields in three US Corn Belt states (Illinois, Indiana, and Iowa) considering complete and partial in-season weather knowledge. Several ensemble models are designed using blocked sequential procedure to generate out-of-bag predictions. The forecasts are made in county-level scale and aggregated for agricultural district, and state level scales. Results show that ensemble models based on weighted average of the base learners outperform individual models. Specifically, the proposed ensemble model could achieve best prediction accuracy (RRMSE of 7.8%) and least mean bias error (-6.06 bu/acre) compared to other developed models. Comparing our proposed model forecasts with the literature demonstrates the superiority of forecasts made by our proposed ensemble model. Results from the scenario of having partial in-season weather knowledge reveal that decent yield forecasts can be made as early as June 1st. To find the marginal effect of each input feature on the forecasts made by the proposed ensemble model, a methodology is suggested that is the basis for finding feature importance for the ensemble model. The findings suggest that weather features corresponding to weather in weeks 18-24 (May 1st to June 1st) are the most important input features.

ensemble, ensemble model, prediction, (14 more...)

2001.09055

Country:

North America > United States > Indiana (0.24)
North America > United States > Illinois (0.24)
North America > United States > Iowa > Story County > Ames (0.04)
(3 more...)

Genre: Research Report > New Finding (0.68)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Food & Agriculture > Agriculture (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.47)

Scalable Bid Landscape Forecasting in Real-time Bidding

Ghosh, Aritra, Mitra, Saayan, Sarkhel, Somdeb, Xie, Jason, Wu, Gang, Swaminathan, Viswanathan

In programmatic advertising, ad slots are usually sold using second-price (SP) auctions in real-time. The highest bidding advertiser wins but pays only the second-highest bid (known as the winning price). In SP, for a single item, the dominant strategy of each bidder is to bid the true value from the bidder's perspective. However, in a practical setting, with budget constraints, bidding the true value is a sub-optimal strategy. Hence, to devise an optimal bidding strategy, it is of utmost importance to learn the winning price distribution accurately. Moreover, a demand-side platform (DSP), which bids on behalf of advertisers, observes the winning price if it wins the auction. For losing auctions, DSPs can only treat its bidding price as the lower bound for the unknown winning price. In literature, typically censored regression is used to model such partially observed data. A common assumption in censored regression is that the winning price is drawn from a fixed variance (homoscedastic) uni-modal distribution (most often Gaussian). However, in reality, these assumptions are often violated. We relax these assumptions and propose a heteroscedastic fully parametric censored regression approach, as well as a mixture density censored network. Our approach not only generalizes censored regression but also provides flexibility to model arbitrarily distributed real-world data. Experimental evaluation on the publicly available dataset for winning price estimation demonstrates the effectiveness of our method. Furthermore, we evaluate our algorithm on one of the largest demand-side platforms and significant improvement has been achieved in comparison with the baseline solutions.

assumption, auction, probability, (11 more...)

2001.06587

Country:

North America > United States > Massachusetts > Hampshire County > Amherst (0.14)
North America > United States > California > Santa Clara County > San Jose (0.04)

Genre: Research Report (0.64)

Industry:

Law > Civil Rights & Constitutional Law (1.00)
Banking & Finance > Trading (1.00)
Marketing (0.94)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)

Hushchyn, Mikhail, Ustyuzhanin, Andrey

Generalization of Change-Point Detection in Time Series Data Based on Direct Density Ratio Estimation

The goal of the change-point detection is to discover changes of time series distribution. One of the state of the art approaches of the change-point detection are based on direct density ratio estimation. In this work we show how existing algorithms can be generalized using various binary classification and regression models. In particular, we show that the Gradient Boosting over Decision Trees and Neural Networks can be used for this purpose. The algorithms are tested on several synthetic and real-world datasets. The results show that the proposed methods outperform classical RuLSIF algorithm. Discussion of cases where the proposed algorithms have advantages over existing methods are also provided.

algorithm, change-point detection, detection, (14 more...)

2001.06386

Country:

Asia > Russia (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.90)

Delplace, Antoine, Hermoso, Sheryl, Anandita, Kristofer

Cyber Attack Detection thanks to Machine Learning Algorithms

Cybersecurity attacks are growing both in frequency and sophistication over the years. This increasing sophistication and complexity call for more advancement and continuous innovation in defensive strategies. Traditional methods of intrusion detection and deep packet inspection, while still largely used and recommended, are no longer sufficient to meet the demands of growing security threats. As computing power increases and cost drops, Machine Learning is seen as an alternative method or an additional mechanism to defend against malwares, botnets, and other attacks. This paper explores Machine Learning as a viable solution by examining its capabilities to classify malicious traffic in a network. First, a strong data analysis is performed resulting in 22 extracted features from the initial Netflow datasets. All these features are then compared with one another through a feature selection process. Then, our approach analyzes five different machine learning algorithms against NetFlow dataset containing common botnets. The Random Forest Classifier succeeds in detecting more than 95% of the botnets in 8 out of 13 scenarios and more than 55% in the most difficult datasets. Finally, insight is given to improve and generalize the results, especially through a bootstrapping technique.

algorithm, dataset, netflow, (14 more...)

2001.06309

Country:

Oceania > Australia > Queensland (0.04)
North America > United States > Minnesota (0.04)

Genre: Research Report > New Finding (0.47)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military > Cyberwarfare (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.48)
(2 more...)

Kirchgässner, Wilhelm, Wallscheid, Oliver, Böcker, Joachim

Data-Driven Permanent Magnet Temperature Estimation in Synchronous Motors with Supervised Machine Learning

Monitoring the magnet temperature in permanent magnet synchronous motors (PMSMs) for automotive applications is a challenging task for several decades now, as signal injection or sensor-based methods still prove unfeasible in a commercial context. Overheating results in severe motor deterioration and is thus of high concern for the machine's control strategy and its design. Lack of precise temperature estimations leads to lesser device utilization and higher material cost. In this work, several machine learning (ML) models are empirically evaluated on their estimation accuracy for the task of predicting latent high-dynamic magnet temperature profiles. The range of selected algorithms covers as diverse approaches as possible with ordinary and weighted least squares, support vector regression, $k$-nearest neighbors, randomized trees and neural networks. Having test bench data available, it is shown that ML approaches relying merely on collected data meet the estimation performance of classical thermal models built on thermodynamic theory, yet not all kinds of models render efficient use of large datasets or sufficient modeling capacities. Especially linear regression and simple feed-forward neural networks with optimized hyperparameters mark strong predictive quality at low to moderate model sizes.

neural network, permanent magnet synchronous motor, synchronous motor, (13 more...)

2001.06246

Country: Europe > Germany (0.04)

Genre: Research Report (0.50)

Industry:

Automobiles & Trucks (1.00)
Transportation > Ground > Road (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.68)

Steinberg, Daniel, Reid, Alistair, O'Callaghan, Simon

Fairness Measures for Regression via Probabilistic Classification

arXiv.org Machine LearningJan-16-2020

Algorithmic fairness involves expressing notions such as equity, or reasonable treatment, as quantifiable measures that a machine learning algorithm can optimise. Most work in the literature to date has focused on classification problems where the prediction is categorical, such as accepting or rejecting a loan application. This is in part because classification fairness measures are easily computed by comparing the rates of outcomes, leading to behaviours such as ensuring that the same fraction of eligible men are selected as eligible women. But such measures are computationally difficult to generalise to the continuous regression setting for problems such as pricing, or allocating payments. The difficulty arises from estimating conditional densities (such as the probability density that a system will over-charge by a certain amount). For the regression setting we introduce tractable approximations of the independence, separation and sufficiency criteria by observing that they factorise as ratios of different conditional probabilities of the protected attributes. We introduce and train machine learning classifiers, distinct from the predictor, as a mechanism to estimate these probabilities from the data. This naturally leads to model agnostic, tractable approximations of the criteria, which we explore experimentally.

criteria, fairness criteria, information, (15 more...)

2001.06089

Country:

Oceania > Australia > New South Wales > Sydney (0.04)
Oceania > Australia > Australian Capital Territory > Canberra (0.04)
North America > United States > Texas > Dallas County > Dallas (0.04)
(3 more...)

Genre: Research Report (0.83)

Industry: Banking & Finance > Loans (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.47)