
Regularization technique





Well-tuned Simple Nets Excel on Tabular Datasets

Neural Information Processing Systems

We empirically assess the impact of these regularization cocktails for MLPs in a large-scale empirical study comprising 40 tabular datasets and demonstrate that (i) well-regularized plain MLPs significantly outperform recent state-of-the-art specialized neural network architectures, and (ii) they even outperform strong traditional ML methods, such as XGBoost.
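The "regularization cocktail" idea combines many standard regularizers on an otherwise plain MLP. Below is a minimal PyTorch sketch assuming a few common ingredients (dropout, batch normalization, and weight decay); the paper tunes a much larger per-dataset cocktail, and all layer sizes and hyperparameters here are illustrative, not the paper's search space.

# Minimal sketch: a plain MLP for tabular data with a few common
# regularizers. All sizes and hyperparameters are illustrative.
import torch
import torch.nn as nn

class TabularMLP(nn.Module):
    def __init__(self, n_features: int, n_classes: int,
                 hidden: int = 256, p_drop: float = 0.2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, hidden),
            nn.BatchNorm1d(hidden),   # cocktail ingredient: batch norm
            nn.ReLU(),
            nn.Dropout(p_drop),       # cocktail ingredient: dropout
            nn.Linear(hidden, hidden),
            nn.BatchNorm1d(hidden),
            nn.ReLU(),
            nn.Dropout(p_drop),
            nn.Linear(hidden, n_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

model = TabularMLP(n_features=54, n_classes=7)
# Weight decay is one more cocktail ingredient, applied via the optimizer.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1e-2)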



PUe: Biased Positive-Unlabeled Learning Enhancement by Causal Inference

Neural Information Processing Systems

Positive-Unlabeled (PU) learning aims to achieve high-accuracy binary classification with limited labeled positive examples and numerous unlabeled ones. Existing cost-sensitive methods often rely on the strong assumption that examples with an observed positive label were selected entirely at random. In fact, uneven label distributions are prevalent in real-world PU problems, indicating that most actual positive and unlabeled data are subject to selection bias. In this paper, we propose a PU learning enhancement (PUe) algorithm based on causal inference theory, which employs normalized propensity scores and normalized inverse probability weighting (NIPW) techniques to reconstruct the loss function, thus obtaining a consistent, unbiased estimate of the classifier and enhancing the model's performance. Moreover, we investigate and propose a method for estimating propensity scores in deep learning using regularization techniques when the labeling mechanism is unknown. Our experiments on three benchmark datasets demonstrate that the proposed PUe algorithm significantly improves the accuracy of classifiers on non-uniform label distribution datasets compared to advanced cost-sensitive PU methods.
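To make the weighting idea concrete, here is a minimal PyTorch sketch of a normalized inverse-probability-weighted risk term. This is not the paper's exact PUe loss: the function name, tensor shapes, and the use of binary cross-entropy are assumptions for illustration.

# Illustrative sketch of a normalized IPW risk term on labeled positives;
# not the paper's exact PUe loss. `propensity` estimates P(labeled | x, y=1).
import torch
import torch.nn.functional as F

def nipw_positive_risk(logits: torch.Tensor, labeled_mask: torch.Tensor,
                       propensity: torch.Tensor) -> torch.Tensor:
    # logits:       (N,) classifier scores for the positive class
    # labeled_mask: (N,) bool, True where the example carries a positive label
    # propensity:   (N,) estimated labeling propensities in (0, 1]
    loss_pos = F.binary_cross_entropy_with_logits(
        logits, torch.ones_like(logits), reduction="none")
    w = labeled_mask.float() / propensity.clamp_min(1e-6)
    # Dividing by the sum of weights (instead of N) is what makes the
    # estimator "normalized" IPW, stabilizing it under selection bias.
    return (w * loss_pos).sum() / w.sum().clamp_min(1e-6)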


2 Theoretical setting

Neural Information Processing Systems

Theoretically, the focus is on fitting a large class of problems into a single MinMax framework and generalizing regularization techniques known from classical optimal transport.
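For reference, the best-known classical regularization of optimal transport is the entropic one, whose dual already exhibits the min/max structure alluded to above. A sketch of the standard formulation (not necessarily the paper's generalized setting), for probability measures \(\mu, \nu\) and cost \(c\):

\[
\mathrm{OT}_\varepsilon(\mu,\nu)
  = \min_{\pi \in \Pi(\mu,\nu)} \int c(x,y)\,\mathrm{d}\pi(x,y)
    + \varepsilon\,\mathrm{KL}\!\left(\pi \,\middle\|\, \mu \otimes \nu\right)
  = \max_{f,g}\; \int f\,\mathrm{d}\mu + \int g\,\mathrm{d}\nu
    - \varepsilon \left( \int e^{\frac{f(x)+g(y)-c(x,y)}{\varepsilon}}
      \,\mathrm{d}\mu(x)\,\mathrm{d}\nu(y) - 1 \right).
\]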



4ffb0d2ba92f664c2281970110a2e071-Paper.pdf

Neural Information Processing Systems

The objective of GANs is to produce random samples from a target data distribution, given only access to an initial set of training samples. This is achieved by learning two functions: a generator G, which maps random input noise to a generated sample, and a discriminator D, which tries to classify input samples as either real (i.e., from the training dataset) or fake (i.e., produced by the generator).
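In the original formulation of Goodfellow et al., this adversarial game is the minimax objective below; the paper under discussion may optimize a variant of it.

\[
\min_G \max_D \;
\mathbb{E}_{x \sim p_{\mathrm{data}}}\!\left[\log D(x)\right]
+ \mathbb{E}_{z \sim p_z}\!\left[\log\bigl(1 - D(G(z))\bigr)\right].
\]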


1ef91c212e30e14bf125e9374262401f-Supplemental.pdf

Neural Information Processing Systems

In this section, we provide more empirical evidence to identify the connection between the weight loss landscape and the robust generalization gap across learning rate schedules, model architectures, datasets, and threat models. We adversarially train PreAct ResNet-18 with different learning rate schedules using the same experimental settings as in Section 3. The learning curves are shown in the left column of Figure 7, where the whole training process can be split into two stages: the early stage with a small robust generalization gap (≤ 10%) and the late stage with a large robust generalization gap (> 10%). Meanwhile, the weight loss landscape also becomes sharp much later: it remains flat at the 10th epoch and only starts to become sharper afterwards.

C.4 The Connection on the L2 Threat Model

To further explore the universality of the connection, we additionally conduct experiments on the L2 threat model in Figure 10. In this section, we first provide the pseudo-code of AWP-based vanilla adversarial training (AT-AWP), and then describe how to satisfy the constraint of the perturbation size in Eq. (8) via the weight update in Eq.
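The AT-AWP pseudo-code itself is not reproduced here. The following PyTorch sketch shows one common way to realize a loss-ascending weight perturbation whose size is capped relative to each weight tensor's norm, standing in for the constraint of Eq. (8). The function name, the single-step approximation, and gamma = 0.01 are assumptions for illustration.

# Sketch of one adversarial weight perturbation (AWP) step; assumed from
# the description above, not the paper's exact AT-AWP pseudo-code.
import torch

def awp_perturb(model: torch.nn.Module, loss: torch.Tensor,
                gamma: float = 0.01):
    # Collect parameters once so gradients and parameters stay aligned.
    params = [p for p in model.parameters() if p.requires_grad]
    grads = torch.autograd.grad(loss, params)
    perturbations = []
    for p, g in zip(params, grads):
        # Ascend the loss, capping the perturbation at gamma times the
        # norm of the same weight tensor (a layerwise relative constraint).
        v = gamma * p.detach().norm() * g / (g.norm() + 1e-12)
        p.data.add_(v)
        perturbations.append(v)
    return perturbations  # subtract these later to restore the weights

Usage: compute the loss on adversarial examples, call awp_perturb, recompute the loss under the perturbed weights, backpropagate and step the optimizer, then subtract the returned perturbations from the parameters to restore them.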