AITopics

2301.12588

Country:

North America > Canada > Quebec > Montreal (0.04)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
North America > United States > Georgia > Fulton County > Atlanta (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Nuclear Medicine (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.92)

Zilber, Pini, Nadler, Boaz

Imbalanced Mixed Linear Regression

arXiv.org Artificial IntelligenceJan-29-2023

We consider the problem of mixed linear regression (MLR), where each observed sample belongs to one of $K$ unknown linear models. In practical applications, the proportions of the $K$ components are often imbalanced. Unfortunately, most MLR methods do not perform well in such settings. Motivated by this practical challenge, in this work we propose Mix-IRLS, a novel, simple and fast algorithm for MLR with excellent performance on both balanced and imbalanced mixtures. In contrast to popular approaches that recover the $K$ models simultaneously, Mix-IRLS does it sequentially using tools from robust regression. Empirically, Mix-IRLS succeeds in a broad range of settings where other methods fail. These include imbalanced mixtures, small sample sizes, presence of outliers, and an unknown number of models $K$. In addition, Mix-IRLS outperforms competing methods on several real-world datasets, in some cases by a large margin. We complement our empirical results by deriving a recovery guarantee for Mix-IRLS, which highlights its advantage on imbalanced mixtures.

artificial intelligence, machine learning, mix-irl, (18 more...)

2301.12559

Country: North America > United States > New York (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

#artificialintelligenceJan-28-2023, 18:05:52 GMT

Machine Learning

The Machine Learning Specialization is a foundational online program created in collaboration between DeepLearning.AI and Stanford Online. This beginner-friendly program will teach you the fundamentals of machine learning and how to use these techniques to build real-world AI applications. This Specialization is taught by Andrew Ng, an AI visionary who has led critical research at Stanford University and groundbreaking work at Google Brain, Baidu, and Landing.AI to advance the AI field. This 3-course Specialization is an updated version of Andrew's pioneering Machine Learning course, rated 4.9 out of 5 and taken by over 4.8 million learners since it launched in 2012. It provides a broad introduction to modern machine learning, including supervised learning (multiple linear regression, logistic regression, neural networks, and decision trees), unsupervised learning (clustering, dimensionality reduction, recommender systems), and some of the best practices used in Silicon Valley for artificial intelligence and machine learning innovation (evaluating and tuning models, taking a data-centric approach to improving performance, and more.)

artificial intelligence, educational setting, machine learning, (6 more...)

Country: North America > United States > California (0.28)

Industry:

Education > Educational Technology > Educational Software > Computer Based Training (0.40)
Education > Educational Setting > Online (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.62)

arXiv.org Artificial IntelligenceJan-28-2023

Team Resilience under Shock: An Empirical Analysis of GitHub Repositories during Early COVID-19 Pandemic

Lu, Xuan, Ai, Wei, Wang, Yixin, Mei, Qiaozhu

While many organizations have shifted to working remotely during the COVID-19 pandemic, how the remote workforce and the remote teams are influenced by and would respond to this and future shocks remain largely unknown. Software developers have relied on remote collaborations long before the pandemic, working in virtual teams (GitHub repositories). The dynamics of these repositories through the pandemic provide a unique opportunity to understand how remote teams react under shock. This work presents a systematic analysis. We measure the overall effect of the early pandemic on public GitHub repositories by comparing their sizes and productivity with the counterfactual outcomes forecasted as if there were no pandemic. We find that the productivity level and the number of active members of these teams vary significantly during different periods of the pandemic. We then conduct a finer-grained investigation and study the heterogeneous effects of the shock on individual teams. We find that the resilience of a team is highly correlated to certain properties of the team before the pandemic. Through a bootstrapped regression analysis, we reveal which types of teams are robust or fragile to the shock.

artificial intelligence, machine learning, repository, (18 more...)

2301.12326

Country:

North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)
North America > United States > Maryland > Prince George's County > College Park (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.88)

Industry:

Health & Medicine > Therapeutic Area > Immunology (0.84)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.70)

Technology:

Information Technology > Communications > Collaboration (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)

Ribeiro, Antônio H., Schön, Thomas B.

Overparameterized Linear Regression under Adversarial Attacks

arXiv.org Artificial IntelligenceJan-27-2023

We study the error of linear regression in the face of adversarial attacks. In this framework, an adversary changes the input to the regression model in order to maximize the prediction error. We provide bounds on the prediction error in the presence of an adversary as a function of the parameter norm and the error in the absence of such an adversary. We show how these bounds make it possible to study the adversarial error using analysis from non-adversarial setups. The obtained results shed light on the robustness of overparameterized linear models to adversarial attacks. Adding features might be either a source of additional robustness or brittleness. On the one hand, we use asymptotic results to illustrate how double-descent curves can be obtained for the adversarial error. On the other hand, we derive conditions under which the adversarial error can grow to infinity as more features are added, while at the same time, the test error goes to zero. We show this behavior is caused by the fact that the norm of the parameter vector grows with the number of features. It is also established that $\ell_\infty$ and $\ell_2$-adversarial attacks might behave fundamentally differently due to how the $\ell_1$ and $\ell_2$-norms of random projections concentrate. We also show how our reformulation allows for solving adversarial training as a convex optimization problem. This fact is then exploited to establish similarities between adversarial training and parameter-shrinking methods and to study how the training might affect the robustness of the estimated models.

adversarial risk, artificial intelligence, machine learning, (18 more...)

doi: 10.1109/TSP.2023.3246228

2204.06274

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Sweden > Uppsala County > Uppsala (0.04)

Genre: Research Report (0.81)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

arXiv.org Artificial IntelligenceJan-27-2023

An Analysis of Loss Functions for Binary Classification and Regression

Buzas, Jeffrey

This paper explores connections between margin-based loss functions and consistency in binary classification and regression applications. It is shown that a large class of margin-based loss functions for binary classification/regression result in estimating scores equivalent to log-likelihood scores weighted by an even function. A simple characterization for conformable (consistent) loss functions is given, which allows for straightforward comparison of different losses, including exponential loss, logistic loss, and others. The characterization is used to construct a new Huber-type loss function for the logistic model. A simple relation between the margin and standardized logistic regression residuals is derived, demonstrating that all margin-based loss can be viewed as loss functions of squared standardized logistic regression residuals. The relation provides new, straightforward interpretations for exponential and logistic loss, and aids in understanding why exponential loss is sensitive to outliers. In particular, it is shown that minimizing empirical exponential loss is equivalent to minimizing the sum of squared standardized logistic regression residuals. The relation also provides new insight into the AdaBoost algorithm.

artificial intelligence, loss function, machine learning, (15 more...)

2301.07638

Country:

North America > United States > New York (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > Vermont > Chittenden County > Burlington (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area (0.54)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

#artificialintelligenceJan-26-2023, 11:25:07 GMT

Probabilistic Logistic Regression and Deep Learning

This article belongs to the series "Probabilistic Deep Learning". This weekly series covers probabilistic approaches to deep learning. The main goal is to extend deep learning models to quantify uncertainty, i.e., know what they do not know. In this article, we will introduce the concept of probabilistic logistic regression, a powerful technique that allows for the inclusion of uncertainty in the prediction process. We will explore how this approach can lead to more robust and accurate predictions, especially in cases where the data is noisy, or the model is overfitting.

artificial intelligence, machine learning, model parameter, (15 more...)

Genre:

Research Report > New Finding (0.73)
Research Report > Experimental Study (0.73)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.98)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.98)

Kamelesun, D., Saranya, R., Kathiravan, P.

A Benchmark Study by using various Machine Learning Models for Predicting Covid-19 trends

arXiv.org Artificial IntelligenceJan-26-2023

Machine learning and deep learning play vital roles in predicting diseases in the medical field. Machine learning algorithms are widely classified as supervised, unsupervised, and reinforcement learning. This paper contains a detailed description of our experimental research work in that we used a supervised machine-learning algorithm to build our model for outbreaks of the novel Coronavirus that has spread over the whole world and caused many deaths, which is one of the most disastrous Pandemics in the history of the world. The people suffered physically and economically to survive in this lockdown. This work aims to understand better how machine learning, ensemble, and deep learning models work and are implemented in the real dataset. In our work, we are going to analyze the current trend or pattern of the coronavirus and then predict the further future of the covid-19 confirmed cases or new cases by training the past Covid-19 dataset by using the machine learning algorithm such as Linear Regression, Polynomial Regression, K-nearest neighbor, Decision Tree, Support Vector Machine and Random forest algorithm are used to train the model. The decision tree and the Random Forest algorithm perform better than SVR in this work. The performance of SVR and lasso regression are low in all prediction areas Because the SVR is challenging to separate the data using the hyperplane for this type of problem. So SVR mostly gives a lower performance in this problem. Ensemble (Voting, Bagging, and Stacking) and deep learning models(ANN) also predict well. After the prediction, we evaluated the model using MAE, MSE, RMSE, and MAPE. This work aims to find the trend/pattern of the covid-19.

artificial intelligence, deep learning, machine learning, (15 more...)

2301.11257

Country:

North America > United States (0.14)
South America > Brazil (0.05)
Europe > Russia (0.05)
(16 more...)

Genre:

Research Report > New Finding (0.90)
Research Report > Experimental Study (0.68)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Epidemiology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

#artificialintelligenceJan-23-2023, 12:45:09 GMT

Regression - Shrijayan Rajendran - Medium

Regression is a statistical method used to analyze the relationship between one or more independent variables and a continuous dependent variable. It can be used to predict the value of the dependent variable based on the values of the independent variables. Linear regression is the most common type of regression and is used when the relationship between the variables is linear. Non-linear regression is used when the relationship between the variables is non-linear. Other types of regression include logistic regression, which is used when the dependent variable is binary, and polynomial regression, which is used when the relationship between the variables is non-linear but can be modeled by a polynomial equation.

artificial intelligence, machine learning, regression, (1 more...)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

#artificialintelligenceJan-23-2023, 09:15:33 GMT

Is a Small Dataset Risky?. Some reflections and tests on the use…

Recently I have written an article about the risks of using the train_test_split() function provided by the scikit-learn Python package. That article has raised a lot of comments, some positives, and others with some concerns. The main concern in the article was that I used a small dataset to demonstrate my theory, which was: be careful when you use the train_test_split() function, because the different seeds may produce very different models. The main concern was that the train_test_split() function does not behave strangely; the problem is that I used a small dataset to demonstrate my thesis. In this article, I try to discover which is the performance of a Linear Regression model by varying the dataset size.

artificial intelligence, dataset, machine learning, (15 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.59)