 Regression


Categorical Difference and Related Brain Regions of the Attentional Blink Effect

arXiv.org Artificial Intelligence

Attentional blink (AB) is a biological effect in which, for 200 to 500 ms after attending to one visual target, it is difficult to notice another target that appears next; attentional blink magnitude (ABM) is a parameter that measures the degree of this effect. Researchers have shown that different categories of images access human consciousness differently and produce different ranges of ABM values. In this paper, we compare two types of images, categorized as animal and object, by predicting ABM values directly from image features extracted from a convolutional neural network (CNN), and indirectly from functional magnetic resonance imaging (fMRI) data. First, for the two sets of images, we separately extract their average features from layers of AlexNet, a classic CNN model, and input the features into a trained linear regression model to predict ABM values. We find that higher-level rather than lower-level image features determine the categorical difference in the AB effect, and that mid-level image features predict ABM values more accurately than low-level and high-level image features. We then use fMRI data from different brain regions, collected while the subjects viewed 50 test images, to predict ABM values, and conclude that brain regions covering relatively broader areas, such as LVC, HVC and VC, perform better than smaller brain regions, which means the AB effect is related more to the combined influence of several visual brain regions than to any one particular visual region.


Regularization for Shuffled Data Problems via Exponential Family Priors on the Permutation Group

arXiv.org Machine Learning

In the analysis of data sets consisting of (X, Y)-pairs, a tacit assumption is that each pair corresponds to the same observation unit. If, however, such pairs are obtained via record linkage of two files, this assumption can be violated as a result of mismatch error rooted, for example, in the lack of reliable identifiers in the two files. Recently, there has been a surge of interest in this setting under the term "shuffled data", in which the underlying correct pairing of (X, Y)-pairs is represented via an unknown index permutation. Explicit modeling of the permutation tends to be associated with substantial overfitting, prompting the need for suitable methods of regularization. In this paper, we propose a flexible exponential family prior on the permutation group for this purpose that can be used to integrate various structures such as sparse and locally constrained shuffling. This prior turns out to be conjugate for canonical shuffled data problems in which the likelihood conditional on a fixed permutation can be expressed as a product over the corresponding (X, Y)-pairs. Inference is based on the EM algorithm, in which the intractable E-step is approximated by the Fisher-Yates algorithm. The M-step is shown to admit a significant reduction from $n^2$ to $n$ terms if the likelihood of (X, Y)-pairs has exponential family form, as in the case of generalized linear models. Comparisons on synthetic and real data show that the proposed approach compares favorably to competing methods.
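
For readers unfamiliar with the Fisher-Yates algorithm named in the abstract, the following is a minimal sketch of the classical shuffle, which draws a uniformly random permutation in linear time. It illustrates only the textbook procedure; how the paper adapts it to approximate the E-step is not described in the abstract and is not reproduced here. The function name and the seeded generator are illustrative choices.

```python
import random


def fisher_yates_shuffle(items, rng):
    """Return a uniformly random permutation of items (classical Fisher-Yates)."""
    shuffled = list(items)
    # Walk from the last position down to 1, swapping each position with a
    # uniformly chosen position at or before it.
    for i in range(len(shuffled) - 1, 0, -1):
        j = rng.randint(0, i)  # randint is inclusive on both ends
        shuffled[i], shuffled[j] = shuffled[j], shuffled[i]
    return shuffled


# Illustrative use: a random re-pairing of five record indices.
indices = [0, 1, 2, 3, 4]
permutation = fisher_yates_shuffle(indices, random.Random(0))
```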


An AI-powered Smart Routing Solution for Payment Systems

arXiv.org Artificial Intelligence

In the current era of digitization, online payment systems are attracting considerable interest. Improving the efficiency of a payment system is important since it has a substantial impact on revenues for businesses. A gateway is an integral component of a payment system through which every transaction is routed. In an online payment system, payment processors integrate with these gateways by means of various configurations such as pricing, methods, risk checks, etc. These configurations are called terminals. Each gateway can have multiple terminals associated with it. Routing a payment transaction through the best terminal is crucial to increase the probability of a payment transaction being successful. Machine learning (ML) and artificial intelligence (AI) techniques can be used to accurately predict the best terminals based on their previous performance and various payment-related attributes. We have devised a pipeline consisting of static and dynamic modules. The static module does the initial filtering of the terminals using static rules and a logistic regression model that predicts gateway downtimes. Subsequently, the dynamic module computes many novel features based on success rate, payment attributes, time lag, etc. to model the terminal behaviour accurately. These features are updated in real time through a feedback loop using an adaptive time decay rate algorithm, and are passed to a random forest classifier that predicts the success probability for every terminal. This pipeline is currently in production at Razorpay, routing millions of transactions in real time, and has given a 4-6% improvement in success rate across all payment methods (credit card, debit card, UPI, net banking). This has made our payment system more resilient to performance drops, which has improved the user experience, instilled more trust in the merchants, and boosted the revenue of the business.
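
The idea of a time-decayed success-rate feature can be sketched as follows. This is a simplified illustration and not Razorpay's algorithm: the abstract describes an adaptive decay rate, whereas this sketch assumes a fixed half-life, and the class name, method names, and parameters are all hypothetical.

```python
import math


class DecayedSuccessRate:
    """Exponentially time-decayed success rate for one terminal (illustrative)."""

    def __init__(self, half_life_seconds):
        # Assumption: a fixed decay rate; the paper's rate is adaptive.
        self.decay = math.log(2.0) / half_life_seconds
        self.successes = 0.0
        self.attempts = 0.0
        self.last_time = None

    def update(self, timestamp, succeeded):
        """Record one transaction outcome; timestamps must be non-decreasing."""
        if self.last_time is not None:
            # Down-weight everything seen so far by the elapsed time.
            weight = math.exp(-self.decay * (timestamp - self.last_time))
            self.successes *= weight
            self.attempts *= weight
        self.last_time = timestamp
        self.attempts += 1.0
        if succeeded:
            self.successes += 1.0

    def rate(self):
        """Return the decayed success rate, or None before any update."""
        return self.successes / self.attempts if self.attempts > 0 else None


# One success at t=0 s, then one failure one half-life later.
tracker = DecayedSuccessRate(half_life_seconds=60.0)
tracker.update(0.0, True)
tracker.update(60.0, False)
decayed_rate = tracker.rate()
```

In this example the earlier success counts half as much as the recent failure, so the feature reacts faster to a terminal that has just started failing than a plain running average would.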


Linear Regression in R

#artificialintelligence

Linear regression is the form of regression analysis commonly used to model the relationship between one dependent variable Y and one or more predictor variables. When there is one predictor, it is called simple linear regression. When there is more than one predictor, it is called multiple linear regression.
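
As a small illustration of the one-predictor case, the least-squares line can be computed in closed form. The sketch below uses Python rather than R and only the standard library; the function name and the toy data are made up for the example.

```python
def fit_simple_linear_regression(x, y):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squared deviations of x, and cross-deviations of x and y.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must contain at least two distinct values")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Toy data lying exactly on the line y = 1 + 2x.
intercept, slope = fit_simple_linear_regression(
    [0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0]
)
```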


What is regression Analysis

#artificialintelligence

Regression analysis is likely the first predictive modeling method you learned as a practitioner during your academic studies or the most common modeling method for your analytics group. Regression concepts were first published in the early 1800s by Adrien-Marie Legendre and Carl Gauss. Legendre was born into a wealthy French family and contributed to a number of advances in the fields of mathematics and statistics. Gauss, in contrast, was born to a poor family in Germany. Gauss was a child math prodigy, but throughout his life he was reluctant to publish any work that he felt was not above criticism.


Three common problems on supervised learning

#artificialintelligence

A: They are almost identical. Linear regression uses ordinary least squares (OLS) to obtain an unbiased but high-variance solution, and problems such as multicollinearity can cause it to fail. Ridge regression is solved in much the same way, but it adds a regularization constant. The constant introduces bias and can decrease variance.
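
The contrast can be made concrete with two perfectly collinear predictors. In the sketch below, OLS has no unique solution because the normal-equation matrix is singular, while adding a regularization constant `lam` to its diagonal makes the system solvable. The helper names and the toy data are illustrative, and no intercept is fitted to keep the algebra to a 2x2 system.

```python
def solve_2x2(a, b, c, d, u, v):
    """Solve [[a, b], [c, d]] @ [p, q] = [u, v] by Cramer's rule."""
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("matrix is singular")
    return ((u * d - b * v) / det, (a * v - u * c) / det)


def ridge_two_predictors(x1, x2, y, lam):
    """Minimize sum((y - b1*x1 - b2*x2)**2) + lam * (b1**2 + b2**2).

    Setting lam = 0 gives ordinary least squares.
    """
    s11 = sum(a * a for a in x1)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s22 = sum(b * b for b in x2)
    t1 = sum(a * t for a, t in zip(x1, y))
    t2 = sum(b * t for b, t in zip(x2, y))
    # Normal equations: (X^T X + lam * I) beta = X^T y
    return solve_2x2(s11 + lam, s12, s12, s22 + lam, t1, t2)


x1 = [1.0, 2.0, 3.0]
x2 = [2.0, 4.0, 6.0]  # exactly 2 * x1, so the predictors are collinear
y = [5.0, 10.0, 15.0]

try:
    ols = ridge_two_predictors(x1, x2, y, lam=0.0)
except ValueError:
    ols = None  # OLS fails: X^T X is singular

ridge = ridge_two_predictors(x1, x2, y, lam=1.0)  # solvable, slightly shrunk
```

With `lam=1.0` the fitted coefficients are slightly smaller than any exact OLS fit would require, which is the bias the regularization constant trades for stability.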


Optimal prediction for kernel-based semi-functional linear regression

arXiv.org Machine Learning

In this paper, we establish minimax optimal rates of convergence for prediction in a semi-functional linear model that consists of a functional component and a less smooth nonparametric component. Our results reveal that the smoother functional component can be learned with the minimax rate as if the nonparametric component were known. More specifically, a double-penalized least squares method is adopted to estimate both the functional and nonparametric components within the framework of reproducing kernel Hilbert spaces. By virtue of the representer theorem, an efficient algorithm that requires no iterations is proposed to solve the corresponding optimization problem, where the regularization parameters are selected by the generalized cross validation criterion. Numerical studies are provided to demonstrate the effectiveness of the method and to verify the theoretical analysis.


False Positive Detection and Prediction Quality Estimation for LiDAR Point Cloud Segmentation

arXiv.org Artificial Intelligence

We present a novel post-processing tool for semantic segmentation of LiDAR point cloud data, called LidarMetaSeg, which estimates the prediction quality segmentwise. For this purpose we compute dispersion measures based on network probability outputs as well as feature measures based on point cloud input features and aggregate them on segment level. These aggregated measures are used to train a meta classification model to predict whether a predicted segment is a false positive or not and a meta regression model to predict the segmentwise intersection over union. Both models can then be applied to semantic segmentation inferences without knowing the ground truth. In our experiments we use different LiDAR segmentation models and datasets and analyze the power of our method. We show that our results outperform other standard approaches.
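
The regression target mentioned above, intersection over union (IoU), can be sketched for a single segment as follows. This is the plain set-based definition applied to point indices; the paper's exact segmentwise variant may differ, and the function name and example indices are illustrative.

```python
def intersection_over_union(predicted, ground_truth):
    """Plain IoU between two collections of point indices."""
    predicted = set(predicted)
    ground_truth = set(ground_truth)
    union = predicted | ground_truth
    if not union:
        return 0.0  # convention chosen here for two empty segments
    return len(predicted & ground_truth) / len(union)


# A predicted segment covering points 0-3 against a true segment of points 2-4.
iou = intersection_over_union([0, 1, 2, 3], [2, 3, 4])
```

A segment whose IoU with the ground truth is zero would count as a false positive under this plain definition.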


What Is TensorFlow 2.0?

#artificialintelligence

TensorFlow is one of the most widely used open-source libraries for machine learning and deep learning applications, built by Google. TensorFlow 2.0 is the official second version of this library and includes many changes to make users more productive. Some major feature highlights of TensorFlow 2.0 are: You can read more about the changes in TensorFlow 2.0 on TensorFlow's official blog. Want to learn how to build machine learning projects using TensorFlow 2.0? Enroll in this TensorFlow course created by The Click Reader.


Machine Learning with PySpark Course

#artificialintelligence

Spark is a powerful, general purpose tool for working with Big Data. Spark transparently handles the distribution of compute tasks across a cluster. This means that operations are fast, but it also allows you to focus on the analysis rather than worry about technical details. In this course you'll learn how to get data into Spark and then delve into the three fundamental Spark Machine Learning algorithms: Linear Regression, Logistic Regression/Classifiers, and creating pipelines. With this background you'll be ready to harness the power of Spark and apply it on your own Machine Learning projects!