aum
example, a 1.2% reduction in error on CIFAR100 (without synthetic noise) simply by removing data
We thank the reviewers for their helpful feedback. We are encouraged that you note AUM's simplicity--"works with It seems that R3, as they admit themself, is "confused" by our submission and contribution. We could cite [Wang et al., CVPR 2018] (as suggested by R3) but Additionally, we clearly discuss/compare to Co-Teaching in Sec. However, we do agree with R3's point concerning the subsampled Clothing1M dataset (see response to R4). Thank you for your supportive comments and interesting remarks. Thus the difference between AUM and standard training is 0.2%. Thank you for positive feedback and detailed questions. We hope to address them here and in the camera ready. "Do the removed samples introduce new problem?" WebVision are less likely to be mislabeled (e.g. We will discuss this more in Sec. 5. "How to choose a good set of [threshold] samples?": We choose We are unclear what you mean by "the assigned logit "Analyses about the difference AUM and original margin": AUM is more robust and consistent than the margin Averaging across epochs increases the "signal to noise ratio."
Audio Mamba: Bidirectional State Space Model for Audio Representation Learning
Erol, Mehmet Hamza, Senocak, Arda, Feng, Jiu, Chung, Joon Son
Transformers have rapidly become the preferred choice for audio classification, surpassing methods based on CNNs. However, Audio Spectrogram Transformers (ASTs) exhibit quadratic scaling due to self-attention. The removal of this quadratic self-attention cost presents an appealing direction. Recently, state space models (SSMs), such as Mamba, have demonstrated potential in language and vision tasks in this regard. In this study, we explore whether reliance on self-attention is necessary for audio classification tasks. By introducing Audio Mamba (AuM), the first self-attention-free, purely SSM-based model for audio classification, we aim to address this question. We evaluate AuM on various audio datasets, comprising six different benchmarks, where it achieves comparable or better performance compared to the well-established AST model.
ActiveAED: A Human in the Loop Improves Annotation Error Detection
Manually annotated datasets are crucial for training and evaluating Natural Language Processing models. However, recent work has discovered that even widely-used benchmark datasets contain a substantial number of erroneous annotations. This problem has been addressed with Annotation Error Detection (AED) models, which can flag such errors for human re-annotation. However, even though many of these AED methods assume a final curation step in which a human annotator decides whether the annotation is erroneous, they have been developed as static models without any human-in-the-loop component. In this work, we propose ActiveAED, an AED method that can detect errors more accurately by repeatedly querying a human for error corrections in its prediction loop. We evaluate ActiveAED on eight datasets spanning five different tasks and find that it leads to improvements over the state of the art on seven of them, with gains of up to six percentage points in average precision.
Optimizing ROC Curves with a Sort-Based Surrogate Loss Function for Binary Classification and Changepoint Detection
Hillman, Jonathan, Hocking, Toby Dylan
Receiver Operating Characteristic (ROC) curves are plots of true positive rate versus false positive rate which are useful for evaluating binary classification models, but difficult to use for learning since the Area Under the Curve (AUC) is non-convex. ROC curves can also be used in other problems that have false positive and true positive rates such as changepoint detection. We show that in this more general context, the ROC curve can have loops, points with highly sub-optimal error rates, and AUC greater than one. This observation motivates a new optimization objective: rather than maximizing the AUC, we would like a monotonic ROC curve with AUC=1 that avoids points with large values for Min(FP,FN). We propose a convex relaxation of this objective that results in a new surrogate loss function called the AUM, short for Area Under Min(FP, FN). Whereas previous loss functions are based on summing over all labeled examples or pairs, the AUM requires a sort and a sum over the sequence of points on the ROC curve. We show that AUM directional derivatives can be efficiently computed and used in a gradient descent learning algorithm. In our empirical study of supervised binary classification and changepoint detection problems, we show that our new AUM minimization learning algorithm results in improved AUC and comparable speed relative to previous baselines.
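The sort-and-sum structure described in this abstract can be sketched for the plain binary classification case. This is a minimal illustration, not the authors' implementation: it assumes an example is predicted positive when its score plus a constant c is greater than zero, so that FP(c) and FN(c) are step functions of c, and it computes only the value of the Area Under Min(FP, FN), not its directional derivatives. The function name `area_under_min` and the toy inputs are hypothetical.

```python
from typing import Sequence


def area_under_min(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Integrate min(FP, FN) over the constant c added to every score.

    Assumption (not stated in the abstract): an example is predicted
    positive when score + c > 0, so its prediction flips at c = -score.
    """
    # One event per example: the value of c at which it becomes positive.
    events = sorted((-s, y) for s, y in zip(scores, labels))
    fp = 0                # as c -> -inf nothing is predicted positive
    fn = sum(labels)      # so every positive example is a false negative
    total = 0.0
    for k, (threshold, y) in enumerate(events):
        # Crossing this threshold turns one example into a positive prediction.
        if y == 1:
            fn -= 1
        else:
            fp += 1
        # min(FP, FN) is constant until the next threshold.
        if k + 1 < len(events):
            width = events[k + 1][0] - threshold
            total += min(fp, fn) * width
    return total


# Perfectly ranked scores: some value of c gives zero errors everywhere
# that min(FP, FN) could be positive.
aum_separable = area_under_min([2.0, 1.0, -1.0, -2.0], [1, 1, 0, 0])
# One positive scored below one negative, two units apart.
aum_swapped = area_under_min([-1.0, 1.0], [1, 0])
```

In this sketch the value is zero when the scores rank every positive above every negative, and it grows with the distance by which mis-ranked pairs are separated, which is the behavior a surrogate loss of this kind would need.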
S&P 500 Momentum Ranked Top Buy This Month By Artificial Intelligence
April was certainly a solid month for stocks, but will this month pay homage to the "Sell in May and Go Away" strategy or will we see a new trend emerge? Of course, many investors would prefer to play the long game and own high quality companies in a diversified basket of stocks via an ETF holding. While some of the below list will present some value, and some will look overseas for an advantage, all of them have been rated as our Top Buy ETFs for the month of May. These picks should help you diversify and mitigate risk inside your portfolio, and hopefully provide some upside throughout the month. Q.ai's deep learning algorithms have identified several ETFs to do some due diligence on for May, based on their fund flows over the last 90 days, 30 days, and 7 days.
Best Thematic ETFs for 2020
Globally, thematic investing has tripled over the past five years to around $40.76 billion, per Morningstar Inc. It is steadily taking over the investment world, largely due to the introduction of theme-based funds and its long-term, easy-to-comprehend approach. Thematic investing requires investment in companies that can benefit from technological, demographic and environmental changes (read: Top ETF Areas for 2020). Let's take a look at some of the themes that are currently in vogue. We are living in an era that is largely dominated by AI applications and technological advancements.
Identifying Mislabeled Data using the Area Under the Margin Ranking
Pleiss, Geoff, Zhang, Tianyi, Elenberg, Ethan R., Weinberger, Kilian Q.
Not all data in a typical training set help with generalization; some samples can be overly ambiguous or outright mislabeled. This paper introduces a new method to identify such samples and mitigate their impact when training neural networks. At the heart of our algorithm is the Area Under the Margin (AUM) statistic, which exploits differences in the training dynamics of clean and mislabeled samples. A simple procedure - adding an extra class populated with purposefully mislabeled indicator samples - learns a threshold that isolates mislabeled data based on this metric. This approach consistently improves upon prior work on synthetic and real-world datasets. On the WebVision50 classification task, our method removes 17% of training data, yielding a 2.6% (absolute) improvement in test error. On CIFAR100, removing 13% of the data leads to a 1.2% drop in error.
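A small sketch may help make the statistic concrete. The abstract does not spell out the formula, so the code assumes the usual reading: the margin at one epoch is the logit of the assigned label minus the largest other logit, and the AUM of a sample is the average of that margin across epochs. The function names and the toy logit histories are hypothetical, and the threshold-learning step with the extra indicator class is not shown.

```python
from typing import Sequence


def margin(logits: Sequence[float], label: int) -> float:
    """Assigned-label logit minus the largest logit of any other class."""
    largest_other = max(v for i, v in enumerate(logits) if i != label)
    return logits[label] - largest_other


def area_under_margin(logits_per_epoch: Sequence[Sequence[float]], label: int) -> float:
    """Average the per-epoch margin of one training sample (assumed definition)."""
    margins = [margin(logits, label) for logits in logits_per_epoch]
    return sum(margins) / len(margins)


# Hypothetical logit histories over three epochs for two samples labeled class 0.
clean_history = [[2.0, 0.5, 0.1], [3.0, 0.4, 0.0], [4.0, 0.2, -0.1]]
noisy_history = [[0.1, 2.0, 0.3], [0.5, 2.5, 0.2], [1.5, 2.0, 0.1]]

aum_clean = area_under_margin(clean_history, label=0)
aum_noisy = area_under_margin(noisy_history, label=0)
```

Under this reading, a sample whose assigned label the network keeps preferring gets a positive AUM, while a sample whose assigned label keeps losing to another class gets a negative one, which is the difference in training dynamics the abstract refers to.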