wce
Convex-Geometric Error Bounds for Positive-Weight Kernel Quadrature
Kernel quadrature (KQ) is a kernel-based approach to numerical integration, closely related to Bayesian quadrature (BQ) and probabilistic integration [38, 39, 10]. For sufficiently regular integrands, KQ can exploit spectral structure in a reproducing kernel Hilbert space (RKHS) that is invisible to plain Monte Carlo and thereby converge faster than the usual O(N 1/2) rate in the number of points [3, 28]. Unconstrained kernel-based rules, however, may produce numerically unstable weights, motivating longstanding interest in positively weighted rules [13, 21, 29, 46]. In this paper, positive weights mean nonnegative weights that sum to one, i.e., simplex or convex-combination weights. Whether positive-weight KQ can systematically improve over Monte Carlo is a subtle question. Kernel herding and related constructions suggested fast rates under favorable assumptions [13], but the conditional-gradient viewpoint of Bach et al. [4] clarified that the strongest such assumptions are not generally available in infinite-dimensional RKHSs. Subsequent herding-type analyses in broad RKHS settings have therefore mostly remained at the Monte-Carlo scale, except under additional structure or modified algorithms such as sparse herding variants [31, 44, 43]. Beyond herding, subsampling-based positive KQ methods such as thinning [16, 15] and recombination [21, 24] have obtained rates beyond Monte Carlo, but a general mechanism for such improvement in the simple i.i.d.
Thumb on the Scale: Optimal Loss Weighting in Last Layer Retraining
Stromberg, Nathan, Thrampoulidis, Christos, Sankar, Lalitha
While machine learning models become more capable in discriminative tasks at scale, their ability to overcome biases introduced by training data has come under increasing scrutiny. Previous results suggest that there are two extremes of parameterization with very different behaviors: the population (underparameterized) setting where loss weighting is optimal and the separable overparameterized setting where loss weighting is ineffective at ensuring equal performance across classes. This work explores the regime of last layer retraining (LLR) in which the unseen limited (retraining) data is frequently inseparable and the model proportionately sized, falling between the two aforementioned extremes. We show, in theory and practice, that loss weighting is still effective in this regime, but that these weights \emph{must} take into account the relative overparameterization of the model.
Deep Reinforcement Learning-Based Control for Stomach Coverage Scanning of Wireless Capsule Endoscopy
Zhang, Yameng, Bai, Long, Liu, Li, Ren, Hongliang, Meng, Max Q. -H.
Due to its non-invasive and painless characteristics, wireless capsule endoscopy has become the new gold standard for assessing gastrointestinal disorders. Omissions, however, could occur throughout the examination since controlling capsule endoscope can be challenging. In this work, we control the magnetic capsule endoscope for the coverage scanning task in the stomach based on reinforcement learning so that the capsule can comprehensively scan every corner of the stomach. We apply a well-made virtual platform named VR-Caps to simulate the process of stomach coverage scanning with a capsule endoscope model. We utilize and compare two deep reinforcement learning algorithms, the Proximal Policy Optimization (PPO) and Soft Actor-Critic (SAC) algorithms, to train the permanent magnetic agent, which actuates the capsule endoscope directly via magnetic fields and then optimizes the scanning efficiency of stomach coverage. We analyze the pros and cons of the two algorithms with different hyperparameters and achieve a coverage rate of 98.04% of the stomach area within 150.37 seconds.
Evaluating clinical diversity and plausibility of synthetic capsule endoscopic images
Vats, Anuja, Pedersen, Marius, Mohammed, Ahmed, Hovde, รistein
Wireless Capsule Endoscopy (WCE) is being increasingly used as an alternative imaging modality for complete and non-invasive screening of the gastrointestinal tract. Although this is advantageous in reducing unnecessary hospital admissions, it also demands that a WCE diagnostic protocol be in place so larger populations can be effectively screened. This calls for training and education protocols attuned specifically to this modality. Like training in other modalities such as traditional endoscopy, CT, MRI, etc., a WCE training protocol would require an atlas comprising of a large corpora of images that show vivid descriptions of pathologies and abnormalities, ideally observed over a period of time. Since such comprehensive atlases are presently lacking in WCE, in this work, we propose a deep learning method for utilizing already available studies across different institutions for the creation of a realistic WCE atlas using StyleGAN. We identify clinically relevant attributes in WCE such that synthetic images can be generated with selected attributes on cue. Beyond this, we also simulate several disease progression scenarios. The generated images are evaluated for realism and plausibility through three subjective online experiments with the participation of eight gastroenterology experts from three geographical locations and a variety of years of experience. The results from the experiments indicate that the images are highly realistic and the disease scenarios plausible. The images comprising the atlas are available publicly for use in training applications as well as supplementing real datasets for deep learning.
RelTransformer: Balancing the Visual Relationship Detection from Local Context, Scene and Memory
Chen, Jun, Agarwal, Aniket, Abdelkarim, Sherif, Zhu, Deyao, Elhoseiny, Mohamed
Visual relationship recognition (VRR) is a fundamental scene understanding task. The structure that VRR provides is essential to improve the AI interpretability in downstream tasks such as image captioning and visual question answering. Several recent studies showed that the long-tail problem in VRR is even more critical than that in object recognition due to the compositional complexity and structure. To overcome this limitation, we propose a novel transformer-based framework, dubbed as RelTransformer, which performs relationship prediction using rich semantic features from multiple image levels. We assume that more abundantcon textual features can generate more accurate and discriminative relationships, which can be useful when sufficient training data are lacking. The key feature of our model is its ability to aggregate three different-level features (local context, scene, and dataset-level) to compositionally predict the visual relationship. We evaluate our model on the visual genome and two "long-tail" VRR datasets, GQA-LT and VG8k-LT. Extensive experiments demonstrate that our RelTransformer could improve over the state-of-the-art baselines on all the datasets. In addition, our model significantly improves the accuracy of GQA-LT by 27.4% upon the best baselines on tail-relationship prediction. Our code is available in https://github.com/Vision-CAIR/RelTransformer.
High pooled performance of convolutional neural networks in computer-aided diagnosis of GI ulcers and/or hemorrhage on wireless capsule endoscopy images: a systematic review and meta-analysis
Diagnosis of gastrointestinal (GI) ulcers and/or hemorrhage by wireless capsule endoscopy (WCE) is limited by the physician-dependent, tedious, time-consuming process of image and/ or video classification. Computer-aided diagnosis (CAD) by convolutional neural networks (CNN) based machine learning may help reduce this burden. Our aim was to conduct a meta-analysis and appraise the reported data.
Word-Class Embeddings for Multiclass Text Classification
Moreo, Alejandro, Esuli, Andrea, Sebastiani, Fabrizio
Pre-trained word embeddings encode general word semantics and lexical regularities of natural language, and have proven useful across many NLP tasks, including word sense disambiguation, machine translation, and sentiment analysis, to name a few. In supervised tasks such as multiclass text classification (the focus of this article) it seems appealing to enhance word representations with ad-hoc embeddings that encode task-specific information. We propose (supervised) word-class embeddings (WCEs), and show that, when concatenated to (unsupervised) pre-trained word embeddings, they substantially facilitate the training of deep-learning models in multiclass classification by topic. We show empirical evidence that WCEs yield a consistent improvement in multiclass classification accuracy, using four popular neural architectures and six widely used and publicly available datasets for multiclass text classification. Our code that implements WCEs is publicly available at https://github.com/AlexMoreo/word-class-embeddings
Adaptive Class Weight based Dual Focal Loss for Improved Semantic Segmentation
Hossain, Md Sazzad, Paplinski, Andrew P, Betts, John M
In this paper, we propose a Dual Focal Loss (DFL) function, as a replacement for the standard cross entropy (CE) function to achieve a better treatment of the unbalanced classes in a dataset. Our DFL method is an improvement on the recently reported Focal Loss (FL) cross-entropy function, which proposes a scaling method that puts more weight on the examples that are difficult to classify over those that are easy. However, the scaling parameter of FL is empirically set, which is problem-dependent. In addition, like other CE variants, FL only focuses on the loss of true classes. Therefore, no loss feedback is gained from the false classes. Although focusing only on true examples increases probability on true classes and correspondingly reduces probability on false classes due to the nature of the softmax function, it does not achieve the best convergence due to avoidance of the loss on false classes. Our DFL method improves on the simple FL in two ways. Firstly, it takes the idea of FL to focus more on difficult examples than the easy ones, but evaluates loss on both true and negative classes with equal importance. Secondly, the scaling parameter of DFL has been made learnable so that it can tune itself by backpropagation rather than being dependent on manual tuning. In this way, our proposed DFL method offers an auto-tunable loss function that can reduce the class imbalance effect as well as put more focus on both true difficult examples and negative easy examples. Experimental results show that our proposed method provides better accuracy in every test run conducted over a variety of different network models and datasets.