logistic loss function
Clinical Deterioration Prediction in Brazilian Hospitals Based on Artificial Neural Networks and Tree Decision Models
Yazdanpanah, Hamed, Silva, Augusto C. M., Guedes, Murilo, Morales, Hugo M. P., Coelho, Leandro dos S., Moro, Fernando G.
Early recognition of clinical deterioration (CD) is vital to patients' survival, helping to prevent exacerbation or death. Electronic health records (EHRs) data have been widely employed in Early Warning Scores (EWS) to measure CD risk in hospitalized patients. Recently, EHRs data have been utilized in Machine Learning (ML) models to predict mortality and CD. The ML models have shown superior performance in CD prediction compared to EWS. Since EHRs data are structured and tabular, conventional ML models are generally applied to them, and less effort has been put into evaluating artificial neural networks' performance on EHRs data. Thus, in this article, an extremely boosted neural network (XBNet) is used to predict CD, and its performance is compared to the eXtreme Gradient Boosting (XGBoost) and random forest (RF) models. For this purpose, 103,105 samples from thirteen Brazilian hospitals are used to generate the models. Moreover, principal component analysis (PCA) is employed to verify whether it can improve the adopted models' performance. The performance of the ML models and the Modified Early Warning Score (MEWS), an EWS candidate, is evaluated for CD prediction in terms of the accuracy, precision, recall, F1-score, and geometric mean (G-mean) metrics in a 10-fold cross-validation approach. According to the experiments, the XGBoost model obtained the best results in predicting CD among Brazilian hospitals' data.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Asia > Singapore (0.05)
- South America > Brazil > Paraná > Curitiba (0.05)
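The abstract above scores models on accuracy, precision, recall, F1-score, and G-mean. As a minimal sketch of how these confusion-matrix metrics are computed (generic textbook definitions; the toy labels below are invented, not taken from the hospital data):

```python
import math

def binary_metrics(y_true, y_pred):
    """Confusion-matrix metrics for a binary deterioration label (1 = CD)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    recall = tp / (tp + fn)            # sensitivity: CD cases caught
    specificity = tn / (tn + fp)       # non-CD cases correctly cleared
    precision = tp / (tp + fp)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        # G-mean balances the two error rates, useful on imbalanced wards
        "gmean": math.sqrt(recall * specificity),
    }

# invented toy labels for 8 patients (1 = deteriorated)
print(binary_metrics([1, 1, 1, 1, 0, 0, 0, 0], [1, 1, 1, 0, 0, 0, 1, 0]))
```

In a 10-fold cross-validation setup, these metrics would be computed on each held-out fold and averaged.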
Multiple Run Ensemble Learning with Low-Dimensional Knowledge Graph Embeddings
Xu, Chengjin, Nayyeri, Mojtaba, Vahdati, Sahar, Lehmann, Jens
Among the top approaches of recent years, link prediction using knowledge graph embedding (KGE) models has gained significant attention for knowledge graph completion. Various embedding models have been proposed so far, among which some recent KGE models obtain state-of-the-art performance on link prediction tasks by using embeddings of high dimension (e.g. 1000), which increases the costs of training and evaluation considering the large scale of KGs. In this paper, we propose a simple but effective performance boosting strategy for KGE models by using multiple low dimensions in different repetition rounds of the same model. For example, instead of training a model once with a large embedding size of 1200, we repeat the training of the model 6 times in parallel with an embedding size of 200 and then combine the 6 separate models for testing, while the overall number of adjustable parameters is the same (6*200=1200) and the total memory footprint remains the same. We show that our approach enables different models to better cope with their expressiveness issues on modeling various graph patterns such as symmetric, 1-n, n-1 and n-n. In order to justify our findings, we conduct experiments on various KGE models. Experimental results on the standard benchmark datasets FB15K, FB15K-237 and WN18RR show that multiple low-dimensional models of the same kind outperform the corresponding single high-dimensional models on link prediction within a certain dimension range, and have advantages in training efficiency through parallel training while the overall number of adjustable parameters stays the same.
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- Europe > Germany > Berlin (0.04)
- Europe > Germany > North Rhine-Westphalia > Cologne Region > Bonn (0.04)
- Asia > China (0.04)
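The multiple-run strategy above can be sketched in a few lines. Real KGE training is stubbed out with seeded random TransE-style embeddings (the entities, the scoring function, and the score-averaging rule are illustrative assumptions, not the paper's exact setup); the point is the parameter-budget split and the combination step:

```python
import random

DIM, RUNS = 200, 6  # 6 models * 200 dims = same budget as one 1200-dim model

def train_low_dim_model(seed, entities, relations, dim=DIM):
    """Stand-in for one low-dimensional KGE run: real training is replaced
    by seeded random embeddings so the ensembling step stays runnable."""
    rng = random.Random(seed)
    emb = {x: [rng.uniform(-1, 1) for _ in range(dim)]
           for x in entities + relations}

    def score(h, r, t):
        # TransE-style score: -||e_h + e_r - e_t||_1 (higher = more plausible)
        return -sum(abs(a + b - c)
                    for a, b, c in zip(emb[h], emb[r], emb[t]))
    return score

def ensemble_score(models, h, r, t):
    # combine the separately trained runs by averaging their triple scores
    return sum(m(h, r, t) for m in models) / len(models)

entities, relations = ["paris", "france", "berlin"], ["capital_of"]
models = [train_low_dim_model(seed, entities, relations)
          for seed in range(RUNS)]
print(ensemble_score(models, "paris", "capital_of", "france"))
```

Because the runs are independent, they can be trained in parallel, which is where the training-efficiency advantage in the abstract comes from.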
Research Guide: Advanced Loss Functions for Machine Learning Models
Logistic loss functions don't perform very well during training when the data in question is very noisy. Such noise can be caused by outliers and mislabeled data. In this paper, Google Brain authors aim to solve the shortcomings of the logistic loss function by replacing the logarithm and exponential functions with their corresponding "tempered" versions. The authors introduce a temperature into the exponential function and replace the softmax output layer of neural nets with a high-temperature generalization. The logarithm used in the log loss is replaced by a low-temperature logarithm.
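For reference, the tempered logarithm and exponential used in the bi-tempered logistic loss literature are one-parameter generalizations of log and exp that recover the originals as the temperature t approaches 1; a minimal sketch:

```python
import math

def log_t(x, t):
    """Tempered logarithm: bounded below for t < 1; recovers log(x) as t -> 1."""
    if t == 1.0:
        return math.log(x)
    return (x ** (1.0 - t) - 1.0) / (1.0 - t)

def exp_t(x, t):
    """Tempered exponential, the functional inverse of log_t;
    heavier-tailed than exp for t > 1."""
    if t == 1.0:
        return math.exp(x)
    return max(1.0 + (1.0 - t) * x, 0.0) ** (1.0 / (1.0 - t))

# the two functions invert each other, just like log and exp
print(exp_t(log_t(2.0, 0.5), 0.5))
```

A bounded log (t < 1) caps the loss contributed by badly mislabeled points, and a heavier-tailed exp (t > 1) makes the softmax-style output less sensitive to outliers, which is the intuition behind the paper's robustness claims.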
How Do Machines Learn To Make Predictions?
The craze behind machine learning is fueled by its ability to make predictions that come in handy in running a business. This post tries to provide an intuitive understanding of how machines learn to spit out probabilities. Suppose you run a bubble tea shop and want to create a machine learning model to predict whether customers will like pieces of mock coconut in their bubble tea. Step #1: One day you make a note of all the customers who ordered mock coconut pieces, and whether they liked it or not (by rummaging through the garbage can later and checking how many of them actually finished all those white little cubes of industrial waste). This allows you to make a chart like the one below, where 0 means the customer didn't like the mock coconut pieces, and 1 means they really liked them.
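To make this concrete, here is a toy version of the model that such 0/1 notes could feed: logistic regression fitted by gradient descent on the log loss. The feature (mock-coconut pieces relative to the house average) and all the numbers are invented for illustration:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# invented toy data: x = pieces of mock coconut relative to the house
# average, y = 1 if the customer liked them (per step #1), 0 otherwise
data = [(-3, 0), (-2, 0), (-1, 0), (1, 1), (2, 1), (3, 1)]

# fit the weights by gradient descent on the logistic (log) loss
w, b, lr = 0.0, 0.0, 0.1
for _ in range(2000):
    gw = gb = 0.0
    for x, y in data:
        p = sigmoid(w * x + b)
        gw += (p - y) * x  # derivative of the log loss w.r.t. w
        gb += (p - y)      # ... and w.r.t. b
    w -= lr * gw / len(data)
    b -= lr * gb / len(data)

# the fitted model turns a new order into a probability of "liked it"
print(round(sigmoid(w * 2 + b), 2))
```

The sigmoid is what squashes the raw score w*x + b into a number between 0 and 1, which is why the model "spits out probabilities" rather than hard yes/no answers.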
Logitron: Perceptron-augmented classification model based on an extended logistic loss function
Classification is one of the most important processes in data analysis. However, due to the inherent non-convex and non-smooth structure of the zero-one loss function of the classification model, various convex surrogate loss functions, such as hinge loss, squared hinge loss, logistic loss, and exponential loss, have been introduced. These loss functions have been used for decades in diverse classification models, such as SVM (support vector machine) with hinge loss, logistic regression with logistic loss, and Adaboost with exponential loss. In this work, we present a Perceptron-augmented convex classification framework, {\it Logitron}. Its loss function is a smoothly stitched function of the extended logistic loss and the famous Perceptron loss function. The extended logistic loss function is a parameterized function established based on the extended logarithmic function and the extended exponential function. The main advantage of the proposed Logitron classification model is that it shows the connection between SVM and logistic regression via polynomial parameterization of the loss function. In more detail, depending on the choice of parameters, we have the Hinge-Logitron, which has the generalized $k$-th order hinge loss with an additional $k$-th root stabilization function, and the Logistic-Logitron, which has a logistic-like loss function with relatively large $|k|$. Interestingly, even with $k=-1$, Hinge-Logitron satisfies the classification-calibration condition and shows reasonable classification performance with low computational cost. The numerical experiments in the linear classifier framework demonstrate that Hinge-Logitron with $k=4$ (the fourth-order SVM with the fourth root stabilization function) outperforms logistic regression, SVM, and other Logitron models in terms of classification accuracy.
- Europe > Switzerland (0.04)
- Asia > Middle East > Jordan (0.04)
- Research Report > New Finding (0.89)
- Research Report > Experimental Study (0.75)
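The convex surrogates named in the abstract can be compared directly as functions of the margin m = y * f(x). A quick sketch using the standard textbook definitions (not the Logitron parameterization itself):

```python
import math

# the zero-one loss and its convex surrogates, as functions of the margin
def zero_one(m):      return 0.0 if m > 0 else 1.0
def hinge(m):         return max(0.0, 1.0 - m)          # SVM
def squared_hinge(m): return max(0.0, 1.0 - m) ** 2
def logistic(m):      return math.log(1.0 + math.exp(-m))  # logistic regression
def exponential(m):   return math.exp(-m)                  # AdaBoost

# each surrogate upper-bounds (a scaled version of) the zero-one loss
# and is convex, which is what makes the training problem tractable
for m in (-1.0, 0.0, 1.0, 2.0):
    print(m, [round(f(m), 3) for f in
              (zero_one, hinge, squared_hinge, logistic, exponential)])
```

Plotting these over a range of margins shows how differently they penalize confidently wrong predictions (large negative m), which is where their robustness properties diverge.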
Condition Number Analysis of Logistic Regression, and its Implications for Standard First-Order Solution Methods
Freund, Robert M., Grigas, Paul, Mazumder, Rahul
Logistic regression is one of the most popular methods in binary classification, wherein estimation of model parameters is carried out by solving the maximum likelihood (ML) optimization problem, and the ML estimator is defined to be the optimal solution of this problem. It is well known that the ML estimator exists when the data is non-separable, but fails to exist when the data is separable. First-order methods are the algorithms of choice for solving large-scale instances of the logistic regression problem. In this paper, we introduce a pair of condition numbers that measure the degree of non-separability or separability of a given dataset in the setting of binary classification, and we study how these condition numbers relate to and inform the properties and the convergence guarantees of first-order methods. When the training data is non-separable, we show that the degree of non-separability naturally enters the analysis and informs the properties and convergence guarantees of two standard first-order methods: steepest descent (for any given norm) and stochastic gradient descent. Expanding on the work of Bach, we also show how the degree of non-separability enters into the analysis of linear convergence of steepest descent (without needing strong convexity), as well as the adaptive convergence of stochastic gradient descent. When the training data is separable, first-order methods rather curiously have good empirical success, which is not well understood in theory. In the case of separable data, we demonstrate how the degree of separability enters into the analysis of $\ell_2$ steepest descent and stochastic gradient descent for delivering approximate-maximum-margin solutions with associated computational guarantees as well. This suggests that first-order methods can lead to statistically meaningful solutions in the separable case, even though the ML solution does not exist.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
- North America > United States > California > Alameda County > Berkeley (0.14)
- South America > Chile (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)
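The separable/non-separable dichotomy discussed above can be seen in a few lines: on a non-separable toy set, steepest descent on the logistic loss settles at a finite ML estimator, while on a separable set the iterate keeps growing, drifting toward a margin-maximizing direction. The 1-D datasets and step size below are invented for illustration:

```python
import math

def grad_desc_logistic(data, steps=20000, lr=0.1):
    """Plain steepest descent on the average logistic loss of f(x) = w*x."""
    w = 0.0
    for _ in range(steps):
        # gradient of (1/n) * sum log(1 + exp(-y * w * x))
        g = sum(-y * x / (1.0 + math.exp(y * w * x)) for x, y in data) / len(data)
        w -= lr * g
    return w

# non-separable: conflicting labels -> the ML estimator exists, w stays bounded
non_sep = [(-2, -1), (-1, +1), (1, -1), (2, +1)]
# separable: the loss has no minimizer, so |w| grows without bound (slowly)
sep = [(-2, -1), (-1, -1), (1, +1), (2, +1)]

print(grad_desc_logistic(non_sep), grad_desc_logistic(sep))
```

On the separable set the final w is much larger and would keep growing with more iterations, mirroring the paper's point that the ML solution fails to exist yet the iterates are still statistically meaningful.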
Classifying Big Data over Networks via the Logistic Network Lasso
Ambos, Henrik, Tran, Nguyen, Jung, Alexander
We apply the network Lasso to solve binary classification (clustering) problems on network-structured data. To this end, we generalize ordinary logistic regression to non-Euclidean data defined over a complex network structure. A scalable classification algorithm is obtained by applying the alternating direction method of multipliers (ADMM) to solve this optimization problem.
Index terms: compressed sensing, big data over networks, semi-supervised learning, classification, clustering, complex networks, convex optimization.
From the introduction: We consider the problem of classifying or clustering a large set of data points which conform to an underlying network structure. Such network-structured datasets arise in a wide range of application domains, e.g., image and video processing as well as social networks [1].
- North America > United States > Massachusetts > Plymouth County > Hanover (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Europe > Finland (0.04)
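A rough sketch of the kind of objective described above: a per-node logistic loss plus a penalty on weight differences across edges, so connected nodes are nudged toward the same classifier. For brevity this sketch uses plain subgradient descent in place of the ADMM solver developed in the paper, and the toy graph, features, labels, and step size are all invented:

```python
import math

# invented toy graph: each node i has one scalar feature x[i], a label
# y[i] in {-1, +1}, and its own local weight w[i]; edges couple weights
x = [1.0, 2.0, 1.0, 2.0]
y = [+1, +1, -1, -1]
edges = [(0, 1), (1, 2), (2, 3)]
lam = 0.1          # strength of the network (edge) penalty
nodes = range(len(x))

def objective(w):
    # per-node logistic loss plus an l1 penalty on weight differences
    fit = sum(math.log(1 + math.exp(-y[i] * w[i] * x[i])) for i in nodes)
    net = lam * sum(abs(w[i] - w[j]) for i, j in edges)
    return fit + net

# plain subgradient descent stands in here for the paper's ADMM solver
w = [0.0] * len(x)
for _ in range(2000):
    g = [-y[i] * x[i] / (1 + math.exp(y[i] * w[i] * x[i])) for i in nodes]
    for i, j in edges:
        s = (w[i] > w[j]) - (w[i] < w[j])   # subgradient of |w_i - w_j|
        g[i] += lam * s
        g[j] -= lam * s
    w = [wi - 0.05 * gi for wi, gi in zip(w, g)]

print([round(wi, 2) for wi in w])
```

The edge penalty is what makes this a clustering method as well: nodes on the same side of the graph end up with similar weights, while the l1 form lets the weights jump sharply across the boundary between clusters.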
Piecewise-Linear Approximation for Feature Subset Selection in a Sequential Logit Model
Sato, Toshiki, Takano, Yuichi, Miyashiro, Ryuhei
This paper concerns a method of selecting a subset of features for a sequential logit model. Tanaka and Nakagawa (2014) proposed a mixed integer quadratic optimization formulation for solving the problem based on a quadratic approximation of the logistic loss function. However, since there is a significant gap between the logistic loss function and its quadratic approximation, their formulation may fail to find a good subset of features. To overcome this drawback, we apply a piecewise-linear approximation to the logistic loss function. Accordingly, we frame the feature subset selection problem of minimizing an information criterion as a mixed integer linear optimization problem. The computational results demonstrate that our piecewise-linear approximation approach found a better subset of features than the quadratic approximation approach.
- North America > United States > New York (0.04)
- Europe > United Kingdom > England > Greater London > London > Wimbledon (0.04)
- Asia > Japan > Honshū > Kantō > Ibaraki Prefecture > Tsukuba (0.04)
- Leisure & Entertainment (0.50)
- Education (0.46)
- Banking & Finance (0.46)
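Because the logistic loss is convex, the pointwise maximum of a few tangent lines gives a piecewise-linear under-approximation of it; this is one standard way to build the kind of approximation that keeps the feature-selection problem a mixed integer linear program. The breakpoints below are arbitrary choices for illustration, not the construction from the paper:

```python
import math

def logistic_loss(m):
    """Logistic loss as a function of the margin m = y * f(x)."""
    return math.log(1.0 + math.exp(-m))

def tangent(m0):
    """Tangent line to the logistic loss at m0. Because the loss is convex,
    every tangent lies below it, so a max of tangents under-approximates it."""
    slope = -1.0 / (1.0 + math.exp(m0))          # derivative at m0
    intercept = logistic_loss(m0) - slope * m0
    return lambda m: slope * m + intercept

breakpoints = [-2.0, -1.0, 0.0, 1.0, 2.0]        # arbitrary illustrative knots
pieces = [tangent(m0) for m0 in breakpoints]

def pwl_loss(m):
    """Piecewise-linear lower approximation: pointwise max of the tangents."""
    return max(p(m) for p in pieces)

for m in (-1.5, 0.0, 1.5):
    print(m, round(logistic_loss(m), 4), round(pwl_loss(m), 4))
```

Adding more tangent points tightens the approximation everywhere, whereas a single quadratic fit can deviate badly far from its expansion point, which matches the paper's motivation for preferring the piecewise-linear form over the quadratic one.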