 Performance Analysis


How rotational invariance of common kernels prevents generalization in high dimensions

arXiv.org Machine Learning

Kernel ridge regression is well-known to achieve minimax optimal rates in low-dimensional settings. However, its behavior in high dimensions is much less understood. Recent work establishes consistency for kernel regression under certain assumptions on the ground truth function and the distribution of the input data. In this paper, we show that the rotational invariance property of commonly studied kernels (such as RBF, inner product kernels and fully-connected NTK of any depth) induces a bias towards low-degree polynomials in high dimensions. Our result implies a lower bound on the generalization error for a wide range of distributions and various choices of the scaling for kernels with different eigenvalue decays. This lower bound suggests that general consistency results for kernel ridge regression in high dimensions require a more refined analysis that depends on the structure of the kernel beyond its eigenvalue decay.
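To make the rotational invariance property concrete: a kernel k is rotationally invariant when k(Ux, Uy) = k(x, y) for every orthogonal matrix U. The sketch below checks this numerically for an RBF kernel and a polynomial inner product kernel. The helper names, the bandwidth, the polynomial degree, and the use of a single Givens (plane) rotation are illustrative assumptions, not details taken from the paper.

```python
import math


def rbf(x, y, bandwidth=1.0):
    """RBF kernel: depends on x and y only through ||x - y||."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def inner_product_kernel(x, y, degree=3):
    """Polynomial kernel: depends on x and y only through <x, y>."""
    dot = sum(a * b for a, b in zip(x, y))
    return (1.0 + dot / len(x)) ** degree


def givens_rotate(x, i, j, theta):
    """Rotate x by angle theta in the (i, j) coordinate plane (an orthogonal map)."""
    out = list(x)
    c, s = math.cos(theta), math.sin(theta)
    out[i] = c * x[i] - s * x[j]
    out[j] = s * x[i] + c * x[j]
    return out


# Hypothetical inputs, chosen only for illustration.
x = [0.3, -1.2, 0.7, 2.0]
y = [1.1, 0.4, -0.5, 0.9]
x_rot = givens_rotate(x, 0, 2, 0.8)
y_rot = givens_rotate(y, 0, 2, 0.8)

for kernel in (rbf, inner_product_kernel):
    # Rotating both arguments by the same U leaves the kernel value unchanged.
    print(kernel.__name__, kernel(x, y), kernel(x_rot, y_rot))
```

Both kernels depend on their inputs only through norms and inner products, which orthogonal maps preserve, so the two printed values agree up to floating-point error.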


A conversation on artificial intelligence and gender bias

#artificialintelligence

The world celebrated Women's History Month in March, and it is a timely moment for us to look at the forces that will shape gender parity in the future. Even as the pandemic accelerates digitization and the future of work, artificial intelligence (AI) stands out as a potentially helpful--or hurtful--tool in the equity agenda. McKinsey recorded a podcast in collaboration with Citi that dives into how gender bias is reflected in AI, why we must consciously debias our machine-human interfaces, and how AI can be a positive force for gender parity. Ioana Niculcea: Before we start the conversation, I think it's important for us to spend a moment assessing the amount of change that has taken place with regard to AI, and how the pace of that change has accelerated over the past few years. And many people argue that in light of the current COVID-19 circumstance, we'll feel further acceleration as people move toward digitization. I spent the past eight years in financial services, and it all started with data. Datafication of the industry was sort of the point of origin. And we hear often that over 90 percent of the data that we have today was created over the past two years. You hear things like every minute, there's over one million Facebook logins and 4.5 million YouTube videos being streamed, or 17,000 different Uber rides. There's a lot of data, and only 1 percent of that is being analyzed, as said today.


Study suggests that AI model selection might introduce bias

#artificialintelligence

The past several years have made it clear that AI and machine learning are not a panacea when it comes to fair outcomes. Applying algorithmic solutions to social problems can magnify biases against marginalized peoples; undersampling populations always results in worse predictive accuracy. But bias in AI doesn't arise from the datasets alone. Problem formulation, or the way researchers fit tasks to AI techniques, can contribute.


Handling Climate Change Using Counterfactuals: Using Counterfactuals in Data Augmentation to Predict Crop Growth in an Uncertain Climate Future

arXiv.org Artificial Intelligence

Climate change poses a major challenge to humanity, especially in its impact on agriculture, a challenge that a responsible AI should meet. In this paper, we examine a CBR system (PBI-CBR) designed to aid sustainable dairy farming by supporting grassland management, through accurate crop growth prediction. As climate changes, PBI-CBR's historical cases become less useful in predicting future grass growth. Hence, we extend PBI-CBR using data augmentation, to specifically handle disruptive climate events, using a counterfactual method (from XAI). Study 1 shows that historical, extreme climate-events (climate outlier cases) tend to be used by PBI-CBR to predict grass growth during climate disrupted periods. Study 2 shows that synthetic outliers, generated as counterfactuals on an outlier-boundary, improve the predictive accuracy of PBI-CBR, during the drought of 2018. This study also shows that an instance-based counterfactual method does better than a benchmark, constraint-guided method.


Towards End-to-End Neural Face Authentication in the Wild -- Quantifying and Compensating for Directional Lighting Effects

arXiv.org Artificial Intelligence

The recent availability of low-power neural accelerator hardware, combined with improvements in end-to-end neural facial recognition algorithms, provides enabling technology for on-device facial authentication. The present research work examines the effects of directional lighting on a State-of-Art (SoA) neural face recognizer. A synthetic re-lighting technique is used to augment data samples due to the lack of public data-sets with sufficient directional lighting variations. Top lighting and its variants (top-left, top-right) are found to have minimal effect on accuracy, while bottom-left or bottom-right directional lighting has the most pronounced effects. Following the fine-tuning of network weights, the face recognition model is shown to achieve close to the original Receiver Operating Characteristic (ROC) curve performance across all lighting conditions and demonstrates an ability to generalize beyond the lighting augmentations used in the fine-tuning data-set. This work shows that an SoA neural face recognition model can be tuned to compensate for directional lighting effects, removing the need for a pre-processing step before applying facial recognition.


AI Product Manager

#artificialintelligence

I recently completed the Artificial Intelligence Product Manager Nanodegree Program on Udacity and I'd like to share a summary of everything I learned with you. This also includes bits from my experience as a technical product manager. This is all a huge dump from my mind, written from the first stroke to the last on my keyboard, so kindly excuse any details I may miss or depths I didn't hit. It would be great to start with "why" and what motivated me to complete this program. In the past year, I've been working as a full-time product manager, sitting at the intersection of engineering and business, and it's been fun. However, I'd recently been thinking deeply about the future of technology and what turns it could take.


Concentration Inequalities for Two-Sample Rank Processes with Application to Bipartite Ranking

arXiv.org Machine Learning

The ROC curve is the gold standard for measuring the performance of a test/scoring statistic regarding its capacity to discriminate between two statistical populations in a wide variety of applications, ranging from anomaly detection in signal processing to information retrieval, through medical diagnosis. Most practical performance measures used in scoring/ranking applications such as the AUC, the local AUC, the p-norm push, the DCG and others, can be viewed as summaries of the ROC curve. In this paper, the fact that most of these empirical criteria can be expressed as two-sample linear rank statistics is highlighted and concentration inequalities for collections of such random variables, referred to as two-sample rank processes here, are proved, when indexed by VC classes of scoring functions. Based on these nonasymptotic bounds, the generalization capacity of empirical maximizers of a wide class of ranking performance criteria is next investigated from a theoretical perspective. It is also supported by empirical evidence through convincing numerical experiments.
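As a concrete instance of the claim that ranking criteria can be written as two-sample linear rank statistics, the sketch below computes the empirical AUC in two ways: from the sum of the positive sample's ranks in the pooled sample (the Mann-Whitney-Wilcoxon form), and from direct pairwise comparisons. The function names and toy scores are hypothetical, and averaging ranks across ties (midranks) is an assumed convention rather than something the abstract specifies.

```python
from itertools import product


def midranks(values):
    """Return 1-based ranks of `values`, averaging ranks across ties."""
    order = sorted(range(len(values)), key=lambda idx: values[idx])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def auc_from_ranks(pos_scores, neg_scores):
    """AUC as a linear rank statistic: the centered sum of positive ranks."""
    n, m = len(pos_scores), len(neg_scores)
    ranks = midranks(list(pos_scores) + list(neg_scores))
    rank_sum = sum(ranks[:n])  # ranks of the positive sample within the pool
    return (rank_sum - n * (n + 1) / 2) / (n * m)


def auc_pairwise(pos_scores, neg_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p, q in product(pos_scores, neg_scores)
    )
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical scores, including one tie across the two samples.
pos = [0.9, 0.8, 0.4]
neg = [0.7, 0.3, 0.4]
print(auc_from_ranks(pos, neg), auc_pairwise(pos, neg))
```

The two computations agree because subtracting n(n+1)/2 removes the contribution of comparisons among the positives themselves, leaving only the count of positive-over-negative pairs.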


Bootstrapping of memetic from genetic evolution via inter-agent selection pressures

arXiv.org Artificial Intelligence

We create an artificial system of agents (attention-based neural networks) which selectively exchange messages with each other in order to study the emergence of memetic evolution and how memetic evolutionary pressures interact with genetic evolution of the network weights. We observe that the ability of agents to exert selection pressures on each other is essential for memetic evolution to bootstrap itself into a state which has both high-fidelity replication of memes, as well as continuing production of new memes over time. However, in this system there is very little interaction between this memetic 'ecology' and underlying tasks driving individual fitness - the emergent meme layer appears to be neither helpful nor harmful to agents' ability to learn to solve tasks. Source code for these experiments is available at https://github.com/GoodAI/memes


Active learning using weakly supervised signals for quality inspection

arXiv.org Artificial Intelligence

Because manufacturing processes evolve fast, and since the visual aspect of production can vary significantly on a daily basis, the ability to rapidly update machine-vision-based inspection systems is paramount. Unfortunately, supervised learning of convolutional neural networks requires a significant amount of annotated images to learn effectively from new data. Acknowledging the abundance of continuously generated images coming from the production line and the cost of their annotation, we demonstrate it is possible to prioritize and accelerate the annotation process. In this work, we develop a methodology for learning actively, from rapidly mined, weakly (i.e. partially) annotated data, enabling fast, direct feedback from the operators on the production line and tackling a big machine vision weakness: false positives. We also consider the problem of covariate shift, which arises inevitably due to changing conditions during data acquisition. In that regard, we show domain-adversarial training to be an efficient way to address this issue.


Bootstrapping Your Own Positive Sample: Contrastive Learning With Electronic Health Record Data

arXiv.org Artificial Intelligence

Electronic Health Record (EHR) data has been of tremendous utility in Artificial Intelligence (AI) for healthcare, such as predicting future clinical events. These tasks, however, often come with many challenges when using classical machine learning models due to a myriad of factors including class imbalance and data heterogeneity (i.e., the complex intra-class variances). To address some of these research gaps, this paper leverages the exciting contrastive learning framework and proposes a novel contrastive regularized clinical classification model. The contrastive loss is found to substantially augment EHR-based prediction: it effectively characterizes the similar/dissimilar patterns (by its "push-and-pull" form), while mitigating the highly skewed class distribution by learning more balanced feature spaces (as also echoed by recent findings). In particular, when naively applying contrastive learning to EHR data, one hurdle is generating positive samples, since EHR data is not as amenable to data augmentation as image data. To this end, we have introduced two unique positive sampling strategies specifically tailored for EHR data: a feature-based positive sampling that exploits the feature space neighborhood structure to reinforce the feature learning; and an attribute-based positive sampling that incorporates pre-generated patient similarity metrics to define the sample proximity. Both sampling approaches are designed with an awareness of the unique high intra-class variance in EHR data. Our overall framework yields highly competitive experimental results in predicting the mortality risk on real-world COVID-19 EHR data with a total of 5,712 patients admitted to a large, urban health system. Specifically, our method reaches a high AUROC prediction score of 0.959, which outperforms other baselines and alternatives: cross-entropy (0.873) and focal loss (0.931).
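For readers unfamiliar with the two baselines named above, the following minimal sketch contrasts binary cross-entropy with the standard binary focal loss, which scales the cross-entropy term by (1 - p_t)^gamma so that well-classified examples contribute less. It does not reproduce the paper's contrastive model; the function names, the focusing parameter gamma = 2, and the example probabilities are illustrative assumptions, and the predicted probability is assumed to lie strictly between 0 and 1.

```python
import math


def cross_entropy(prob_positive, label):
    """Binary cross-entropy for a single example with label 0 or 1."""
    p_t = prob_positive if label == 1 else 1.0 - prob_positive
    return -math.log(p_t)


def focal_loss(prob_positive, label, gamma=2.0):
    """Binary focal loss: cross-entropy down-weighted by (1 - p_t) ** gamma."""
    p_t = prob_positive if label == 1 else 1.0 - prob_positive
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


# Hypothetical predictions for two positive-class examples.
easy = 0.95  # confidently correct
hard = 0.10  # confidently wrong

for name, prob in (("easy", easy), ("hard", hard)):
    print(name, cross_entropy(prob, 1), focal_loss(prob, 1))
```

Setting gamma to 0 recovers plain cross-entropy, while larger values shift the total loss toward hard examples, which is why focal loss is a common baseline for skewed class distributions.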