Are Anchor Points Really Indispensable in Label-Noise Learning?

Neural Information Processing Systems

In label-noise learning, the noise transition matrix, denoting the probabilities that clean labels flip into noisy labels, plays a central role in building statistically consistent classifiers. Existing theories have shown that the transition matrix can be learned by exploiting anchor points (i.e., data points that belong to a specific class almost surely). However, when there are no anchor points, the transition matrix will be poorly learned, and those previously consistent classifiers will significantly degenerate. In this paper, without employing anchor points, we propose a transition-revision (T-Revision) method to effectively learn transition matrices, leading to better classifiers. Specifically, to learn a transition matrix, we first initialize it by exploiting data points that are similar to anchor points, having high noisy class posterior probabilities. Then, we modify the initialized matrix by adding a slack variable, which can be learned and validated together with the classifier by using noisy data. Empirical results on benchmark-simulated and real-world label-noise datasets demonstrate that without using exact anchor points, the proposed method is superior to state-of-the-art label-noise learning methods.
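The two steps named in the abstract (initializing the transition matrix from points with high noisy class posteriors, then adding a slack variable) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `init_transition_matrix` and `forward_correct`, the row convention T[i, j] = P(noisy label j | clean label i), and the use of the single most confident sample per class are all assumptions made for illustration. In the paper the slack variable is learned jointly with the classifier; here it is simply passed in as a fixed array.

```python
import numpy as np

def init_transition_matrix(noisy_posteriors):
    """Initialize T from anchor-like points (illustrative assumption).

    noisy_posteriors: array of shape (n_samples, n_classes) holding estimated
    noisy class posteriors. For each class i, the sample with the highest
    noisy posterior for class i is treated as anchor-like, and its posterior
    vector becomes row i of T.
    """
    n_classes = noisy_posteriors.shape[1]
    T = np.empty((n_classes, n_classes))
    for i in range(n_classes):
        anchor_like = np.argmax(noisy_posteriors[:, i])
        T[i] = noisy_posteriors[anchor_like]
    return T

def forward_correct(clean_posteriors, T, delta_T=None):
    """Map clean class posteriors to noisy ones through the revised matrix.

    delta_T plays the role of the slack variable; it defaults to zero,
    which leaves the initialized matrix unchanged.
    """
    if delta_T is None:
        delta_T = np.zeros_like(T)
    return clean_posteriors @ (T + delta_T)

# Hypothetical noisy posteriors for three samples and two classes.
noisy_posteriors = np.array([[0.8, 0.2],
                             [0.3, 0.7],
                             [0.6, 0.4]])
T = init_transition_matrix(noisy_posteriors)
print(T)
print(forward_correct(np.array([[1.0, 0.0]]), T))
```

With a learned slack variable, the noisy-label prediction is compared with the observed noisy labels in the training loss, so the classifier and the revision to T are fit together.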


Voice Tech Becoming Indispensable to Healthcare - RTInsights

#artificialintelligence

As we navigate the post-pandemic world (and grapple with Covid's continual resurgence), voice technology supported by AI will be healthcare's next big tool. Artificial intelligence rocketed up the hype cycle during the pandemic, but no ascension is more fascinating than that of voice technology. With use cases in multiple industries, including the one on everyone's mind (healthcare), voice tech investment and deployment will only increase as we navigate a post-Covid world. Voice tech has many potential use cases beyond the common customer service applications. Researchers at Carnegie Mellon, for example, pioneered voice tech to identify potential Covid infections from the sound of someone's voice and breathing patterns.


On The State of Data In Computer Vision: Human Annotations Remain Indispensable for Developing Deep Learning Models

arXiv.org Artificial Intelligence

High-quality labeled datasets play a crucial role in fueling the development of machine learning (ML), and in particular the development of deep learning (DL). However, since the emergence of the ImageNet dataset and the AlexNet model in 2012, the size of new open-source labeled vision datasets has remained roughly constant. Consequently, only a minority of publications in the computer vision community tackle supervised learning on datasets that are orders of magnitude larger than ImageNet. In this paper, we survey computer vision research domains that study the effects of such large datasets on model performance across different vision tasks. We summarize the community's current understanding of those effects, and highlight some open questions related to training with massive datasets. In particular, we tackle: (a) the largest datasets currently used in computer vision research and the interesting takeaways from training on such datasets; (b) the effectiveness of pre-training on large datasets; (c) recent advancements and hurdles facing synthetic datasets; (d) an overview of double descent and sample non-monotonicity phenomena; and finally, (e) a brief discussion of lifelong/continual learning and how it fares compared to learning from huge labeled datasets in an offline setting. Overall, our findings are that research on optimization for deep learning focuses on perfecting the training routine and thus making DL models less data hungry, while research on synthetic datasets aims to offset the cost of data labeling. However, for the time being, acquiring non-synthetic labeled data remains indispensable to boost performance.


An Indispensable Need for Data and Analytics Leadership (DAL)

#artificialintelligence

For a typical organization, the following are statements that must have been a part of almost every employee's career journey: This is all great only if everything happens in the area of your interest, experience, and more importantly, competence. Unfortunately, in most organizations, doers are not decision makers and vice versa. My focus is to draw attention to the new-age data and analytics workforce that is raring to work and innovate but gets stuck in places with scant understanding from those who call the shots, let alone mentoring by them. LinkedIn's 2017 U.S. Emerging Jobs Report says that Data Scientist roles have grown over 650% and Machine Learning Engineer roles have risen by a humongous 980% since 2012. Have a look at the survey conducted recently on Data Scientists.


So You Think You're Indispensable? Think Again!

#artificialintelligence

We have emotions, and family and friends to spend time with. Whether employed or self-employed, we like to take time off to relax and be with them. We occasionally become unwell and can be rather unpredictable too. Sometimes we just down tools and decide to take a day or two off, whatever the reason may be. We are all "overheads" in some way or another, costly, and in some cases slow too – young or old.