AITopics

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

arXiv.org Artificial Intelligence, Oct-8-2022

Robust and Sparse Estimation of Linear Regression Coefficients with Heavy-tailed Noises and Covariates

Sasai, Takeyuki

Robust and sparse estimation of linear regression coefficients is investigated. The paper addresses the setting in which the covariates and noise are sampled from heavy-tailed distributions and both are contaminated by malicious outliers. Our estimator can be computed efficiently, and its error bound is nearly optimal.

artificial intelligence, estimation, machine learning, (16 more...)

2206.07594

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Chen, Kuilin, Lee, Chi-Guhn

Unsupervised Few-shot Learning via Deep Laplacian Eigenmaps

arXiv.org Artificial Intelligence, Oct-7-2022

Few-shot learning (Fei-Fei et al., 2006) aims to learn a new classification or regression model on a novel task that is not seen during training, given only a few examples in the novel task. Existing few-shot learning methods either rely on episodic meta-learning (Finn et al., 2017, Snell et al., 2017) or standard pretraining (Chen et al., 2019, Tian et al., 2020b) in a supervised manner to extract transferrable knowledge to a new few-shot task. Unfortunately, these methods require many labeled meta-training samples. Acquiring a lot of labeled data is costly or even impossible in practice. Recently, several unsupervised meta-learning approaches have attempted to address this problem by constructing synthetic tasks on unlabeled meta-training data (Hsu et al., 2019, Khodadadeh et al., 2019, 2021) or meta-training on self-supervised pretrained features (Lee et al., 2021a). However, the performance of unsupervised meta-learning approaches is still far from their supervised counterparts. Empirical studies in supervised pretraining show that representation learning via grouping similar samples together (Chen et al., 2019, Dhillon et al., 2020, Laenen and Bertinetto, 2021, Tian et al., 2020b) outperforms a wide range of episodic meta-learning methods, where the definition of similar samples is given by class labels. The motivation of this study is to develop an unsupervised representation learning method by grouping unlabeled meta-training data without episodic training and close the performance gap between supervised and unsupervised few-shot learning. Contrastive self-supervised learning has shown remarkable success in learning representation from unlabeled data, which is competitive with supervised learning on multiple visual tasks (Hénaff et al., 2020, Tian et al., 2020a).

artificial intelligence, deep learning, machine learning, (16 more...)

2210.03595

Country:

North America > Canada > Ontario > Toronto (0.28)
Asia > Middle East > Jordan (0.04)
North America > United States > California (0.04)
Europe > France (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)

Peng, Chengyang, Donca, Octavian, Hereid, Ayonga

Safe Path Planning for Polynomial Shape Obstacles via Control Barrier Functions and Logistic Regression

arXiv.org Artificial Intelligence, Oct-7-2022

Safe path planning is critical for bipedal robots to operate in safety-critical environments. Common path planning algorithms, such as RRT or RRT*, typically use geometric or kinematic collision check algorithms to ensure collision-free paths toward the target position. However, such approaches may generate non-smooth paths that do not comply with the dynamics constraints of walking robots. It has been shown that the control barrier function (CBF) can be integrated with RRT/RRT* to synthesize dynamically feasible collision-free paths. Yet, existing work has been limited to simple circular or elliptical shape obstacles due to the challenging nature of constructing appropriate barrier functions to represent irregular-shaped obstacles. In this paper, we present a CBF-based RRT* algorithm for bipedal robots to generate a collision-free path through complex space with polynomial-shaped obstacles. In particular, we used logistic regression to construct polynomial barrier functions from a grid map of the environment to represent arbitrarily shaped obstacles. Moreover, we developed a multi-step CBF steering controller to ensure the efficiency of free space exploration. The proposed approach was first validated in simulation for a differential drive model, and then experimentally evaluated with a 3D humanoid robot, Digit, in a lab setting with randomly placed obstacles.
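The idea of learning a polynomial barrier function from a grid map with logistic regression can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the elliptical obstacle, the degree-2 feature map, the plain gradient-descent fit, and the names `poly_features`, `fit_barrier`, and `barrier` are all hypothetical. Grid cells are labeled free (1) or occupied (0), and the fitted logit h(x) serves as a candidate barrier, positive in free space and negative inside the obstacle.

```python
import numpy as np

def poly_features(points):
    """Degree-2 polynomial features [1, x, y, x^2, xy, y^2] (assumed feature map)."""
    x, y = points[:, 0], points[:, 1]
    return np.column_stack([np.ones_like(x), x, y, x**2, x * y, y**2])

def fit_barrier(points, is_free, lr=0.5, iters=20000, l2=1e-4):
    """Fit logistic regression by gradient descent; returns polynomial coefficients."""
    X = poly_features(points)
    w = np.zeros(X.shape[1])
    for _ in range(iters):
        z = np.clip(X @ w, -50.0, 50.0)          # clip logits to avoid overflow in exp
        p = 1.0 / (1.0 + np.exp(-z))             # predicted probability of "free"
        grad = X.T @ (p - is_free) / len(is_free) + l2 * w
        w -= lr * grad
    return w

def barrier(w, points):
    """Evaluate h(x) = w . phi(x); h > 0 is treated as free space."""
    pts = np.atleast_2d(np.asarray(points, dtype=float))
    return poly_features(pts) @ w

# Hypothetical 41 x 41 occupancy grid on [-2, 2]^2 with one elliptical obstacle.
axis = np.linspace(-2.0, 2.0, 41)
gx, gy = np.meshgrid(axis, axis)
grid = np.column_stack([gx.ravel(), gy.ravel()])
occupied = ((grid[:, 0] - 0.25) ** 2 + ((grid[:, 1] + 0.15) / 0.6) ** 2) <= 1.0
is_free = (~occupied).astype(float)

w = fit_barrier(grid, is_free)
accuracy = np.mean((barrier(w, grid) > 0) == ~occupied)
print(f"grid cells classified correctly: {accuracy:.3f}")
```

A planner could then query `barrier(w, [x, y])` at candidate states and reject those where the value is negative. Irregular obstacles would presumably need higher-degree features than the quadratic ones assumed here.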

artificial intelligence, barrier function, machine learning, (18 more...)

2210.03704

Country:

Asia > Middle East > Republic of Türkiye > Karaman Province > Karaman (0.04)
North America > United States > Ohio (0.04)
Europe > Switzerland > Zürich > Zürich (0.04)

Genre:

Research Report > New Finding (0.71)
Research Report > Experimental Study (0.62)

Technology:

Information Technology > Artificial Intelligence > Robots > Locomotion (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.71)

#artificialintelligence, Oct-6-2022, 15:05:32 GMT

How Should We Detect and Treat the Outliers?

How should we detect outliers, and how should we treat them? An outlier is a data point or observation that behaves very differently from the rest of the data. If we are computing the average net worth of a group of people and Elon Musk is in that group, the whole analysis will go wrong because of just one outlier. This is why outliers should be treated properly before building a machine learning model.
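The net-worth example can be made concrete with a short sketch. The figures below are invented for illustration, and the 1.5 × IQR boxplot rule is one common detection choice, not necessarily the one the article recommends; `iqr_outliers` is a hypothetical helper name.

```python
import statistics

# Hypothetical net worths in dollars; the last value is an extreme outlier.
net_worths = [40_000, 55_000, 62_000, 75_000, 90_000, 120_000, 200_000_000_000]

mean_all = statistics.mean(net_worths)      # dragged far upward by one value
median_all = statistics.median(net_worths)  # barely affected by the outlier

def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR], the usual boxplot rule."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

outliers = iqr_outliers(net_worths)
cleaned = [v for v in net_worths if v not in outliers]
mean_cleaned = statistics.mean(cleaned)

print(f"mean with outlier:    {mean_all:,.0f}")
print(f"median with outlier:  {median_all:,.0f}")
print(f"flagged as outliers:  {outliers}")
print(f"mean after removal:   {mean_cleaned:,.0f}")
```

Dropping the flagged value is only one possible treatment. Reporting the median, or capping extreme values, are alternatives that keep every observation.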

data and boxplot, outlier, standard deviation, (12 more...)

Industry: Law Enforcement & Public Safety > Fraud (0.31)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (0.52)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.31)

#artificialintelligence, Oct-6-2022, 11:41:20 GMT

Back of the Envelope Machine Learning

Data science projects fail, frequently. Between the end of 2017 and 2019 several published reports from Gartner, NewVantage, and VentureBeat AI showed that 'failure' rates on data science projects are north of 75%. But I don't think this is indicative of how powerful the growth of data, machine learning, and AI has been for business (and likely all sectors of the economy) over the same timeframe. Back-of-the-envelope machine learning is inconspicuously powering business today (2020). A premortem is a thought exercise to predict or foresee why an analysis or project might fail.

envelope machine learning, pledge backer, premortem, (11 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.32)

Dandl, Susanne, Bender, Andreas, Hothorn, Torsten

Heterogeneous Treatment Effect Estimation for Observational Data using Model-based Forests

arXiv.org Machine Learning, Oct-6-2022

The estimation of heterogeneous treatment effects (HTEs) has attracted considerable interest in many disciplines, most prominently in medicine and economics. Contemporary research has so far primarily focused on continuous and binary responses where HTEs are traditionally estimated by a linear model, which allows the estimation of constant or heterogeneous effects even under certain model misspecifications. More complex models for survival, count, or ordinal outcomes require stricter assumptions to reliably estimate the treatment effect. Most importantly, the noncollapsibility issue necessitates the joint estimation of treatment and prognostic effects. Model-based forests allow simultaneous estimation of covariate-dependent treatment and prognostic effects, but only for randomized trials. In this paper, we propose modifications to model-based forests to address the confounding issue in observational data. In particular, we evaluate an orthogonalization strategy originally proposed by Robinson (1988, Econometrica) in the context of model-based forests targeting HTE estimation in generalized linear models and transformation models. We found that this strategy reduces confounding effects in a simulated study with various outcome distributions. We demonstrate the practical aspects of HTE estimation for survival and ordinal outcomes by an assessment of the potentially heterogeneous effect of Riluzole on the progress of Amyotrophic Lateral Sclerosis.
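The basic orthogonalization idea attributed to Robinson (1988) can be illustrated in its simplest form: residualize both the outcome and the treatment on the covariates, then regress residual on residual. The sketch below assumes a constant treatment effect, a continuous treatment, and linear nuisance regressions on simulated data. It does not reproduce the model-based forest procedure of the paper, and the helper `residualize` and all simulation parameters are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n, tau = 5000, 2.0                      # sample size and true (constant) effect

# Simulated observational data: covariates X confound treatment W and outcome Y.
X = rng.normal(size=(n, 3))
W = X @ np.array([1.0, -1.0, 0.5]) + rng.normal(size=n)
Y = tau * W + X @ np.array([2.0, -1.0, 1.0]) + rng.normal(size=n)

def residualize(target, covariates):
    """Subtract the least-squares fit of target on [1, covariates]."""
    design = np.column_stack([np.ones(len(target)), covariates])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return target - design @ coef

# Naive slope of Y on W ignores X and is therefore confounded.
naive = np.cov(W, Y)[0, 1] / np.var(W, ddof=1)

# Orthogonalized estimate: regress the Y residual on the W residual.
w_res = residualize(W, X)
y_res = residualize(Y, X)
tau_hat = (w_res @ y_res) / (w_res @ w_res)

print(f"naive estimate:          {naive:.3f}")
print(f"orthogonalized estimate: {tau_hat:.3f} (true effect {tau})")
```

In this simulation the naive slope is biased well away from the true effect, while the residual-on-residual estimate recovers it closely. Flexible learners could replace the linear nuisance fits when the confounding is nonlinear.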

artificial intelligence, machine learning, model-based forest, (17 more...)

2210.02836

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Europe > Austria > Vienna (0.14)
North America > Greenland (0.04)
(6 more...)

Genre:

Research Report > Strength High (1.00)
Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Immunology (0.67)
Government > Regional Government > North America Government > United States Government (0.67)
Health & Medicine > Therapeutic Area > Neurology > Amyotrophic Lateral Sclerosis (ALS) (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

arXiv.org Artificial Intelligence, Oct-6-2022

Uncovering the Structural Fairness in Graph Contrastive Learning

Wang, Ruijia, Wang, Xiao, Shi, Chuan, Song, Le

Recent studies show that graph convolutional networks (GCNs) often perform worse for low-degree nodes, exhibiting the so-called structural unfairness for graphs with long-tailed degree distributions, which are prevalent in the real world. Graph contrastive learning (GCL), which marries the power of GCN and contrastive learning, has emerged as a promising self-supervised approach for learning node representations. How does GCL behave in terms of structural fairness? Surprisingly, we find that representations obtained by GCL methods are already fairer with respect to degree bias than those learned by GCN. We theoretically show that this fairness stems from the intra-community concentration and inter-community scatter properties of GCL, resulting in a much clearer community structure that drives low-degree nodes away from the community boundary. Based on our theoretical analysis, we further devise a novel graph augmentation method, called GRAph contrastive learning for DEgree bias (GRADE), which applies different strategies to low- and high-degree nodes. Extensive experiments on various benchmarks and evaluation protocols validate the effectiveness of the proposed method.

artificial intelligence, machine learning, node, (16 more...)

2210.03011

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report > New Finding (0.66)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

arXiv.org Artificial Intelligence, Oct-5-2022

Privacy-Preserving Text Classification on BERT Embeddings with Homomorphic Encryption

Lee, Garam, Kim, Minsoo, Park, Jai Hyun, Hwang, Seung-won, Cheon, Jung Hee

Embeddings, which compress information in raw text into semantics-preserving low-dimensional vectors, have been widely adopted for their efficacy. However, recent research has shown that embeddings can potentially leak private information about sensitive attributes of the text, and in some cases, can be inverted to recover the original input text. To address these growing privacy challenges, we propose a privatization mechanism for embeddings based on homomorphic encryption, to prevent potential leakage of any piece of information in the process of text classification. In particular, our method performs text classification on the encryption of embeddings from state-of-the-art models like BERT, supported by an efficient GPU implementation of the CKKS encryption scheme. We show that our method offers encrypted protection of BERT embeddings, while largely preserving their utility on downstream text classification tasks.

computational linguistic, machine learning, natural language, (16 more...)

2210.02574

Country:

North America > United States > New York > New York County > New York City (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(4 more...)

Genre: Research Report > New Finding (0.49)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Classification (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.51)

#artificialintelligence, Oct-4-2022, 03:07:57 GMT

The Supervised Machine Learning Bootcamp

The supervised machine learning algorithms you will learn here are some of the most powerful data science tools you need to solve regression and classification tasks. These are invaluable skills that anyone who wants to work as a machine learning engineer or data scientist should have in their toolkit. In this course, you will learn the theory behind all 6 algorithms, and then apply your skills to practical case studies tailored to each one of them, using Python's scikit-learn library. First, we cover naïve Bayes – a powerful technique based on Bayesian statistics. Its strong point is that it's great at performing tasks in real-time.

algorithm, make accurate prediction, supervised machine learning bootcamp, (1 more...)

Genre: Instructional Material > Course Syllabus & Notes (0.96)

Industry: Education (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.54)