tl 1
Reweighting Local Mimina with Tilted SAM
Li, Tian, Zhou, Tianyi, Bilmes, Jeffrey A.
Sharpness-Aware Minimization (SAM) has been demonstrated to improve the generalization performance of overparameterized models by seeking flat minima on the loss landscape through optimizing model parameters that incur the largest loss within a neighborhood. Nevertheless, such min-max formulations are computationally challenging especially when the problem is highly non-convex. Additionally, focusing only on the worst-case local solution while ignoring potentially many other local solutions may be suboptimal when searching for flat minima. In this work, we propose Tilted SAM (TSAM), a generalization of SAM inspired by exponential tilting that effectively assigns higher priority to local solutions that are flatter and that incur larger losses. TSAM is parameterized by a tilt hyperparameter t and reduces to SAM as t approaches infinity. We prove that (1) the TSAM objective is smoother than SAM and thus easier to optimize; and (2) TSAM explicitly favors flatter minima as t increases. This is desirable as flatter minima could have better generalization properties for certain tasks. We develop algorithms motivated by the discretization of Hamiltonian dynamics to solve TSAM. Empirically, TSAM arrives at flatter local minima and results in superior test performance than the baselines of SAM and ERM across a range of image and text tasks.
Adaptive Webpage Fingerprinting from TLS Traces
Mavroudis, Vasilios, Hayes, Jamie
In webpage fingerprinting, an on-path adversary infers the specific webpage loaded by a victim user by analysing the patterns in the encrypted TLS traffic exchanged between the user's browser and the website's servers. This work studies modern webpage fingerprinting adversaries against the TLS protocol; aiming to shed light on their capabilities and inform potential defences. Despite the importance of this research area (the majority of global Internet users rely on standard web browsing with TLS) and the potential real-life impact, most past works have focused on attacks specific to anonymity networks (e.g., Tor). We introduce a TLS-specific model that: 1) scales to an unprecedented number of target webpages, 2) can accurately classify thousands of classes it never encountered during training, and 3) has low operational costs even in scenarios of frequent page updates. Based on these findings, we then discuss TLS-specific countermeasures and evaluate the effectiveness of the existing padding capabilities provided by TLS 1.3.
Principal Component Analysis Based on T$\ell_1$-norm Maximization
Yang, Xiang-Fei, Shao, Yuan-Hai, Li, Chun-Na, Liu, Li-Ming, Deng, Nai-Yang
Classical principal component analysis (PCA) may suffer from the sensitivity to outliers and noise. Therefore PCA based on $\ell_1$-norm and $\ell_p$-norm ($0 < p < 1$) have been studied. Among them, the ones based on $\ell_p$-norm seem to be most interesting from the robustness point of view. However, their numerical performance is not satisfactory. Note that, although T$\ell_1$-norm is similar to $\ell_p$-norm ($0 < p < 1$) in some sense, it has the stronger suppression effect to outliers and better continuity. So PCA based on T$\ell_1$-norm is proposed in this paper. Our numerical experiments have shown that its performance is superior than PCA-$\ell_p$ and $\ell_p$SPCA as well as PCA, PCA-$\ell_1$ obviously.
Barron Spaces and the Compositional Function Spaces for Neural Network Models
One of the key issues in the analysis of machine learning models is to identify the appropriate function space for the model. This is the space of functions that the particular machine learning model can approximate with good accuracy, endowed with a natural norm associated with the approximation process. In this paper, we address this issue for two representative neural network models: the two-layer networks and the residual neural networks. We define Barron space and show that it is the right space for two-layer neural network models in the sense that optimal direct and inverse approximation theorems hold for functions in the Barron space. For residual neural network models, we construct the so-called compositional function space, and prove direct and inverse approximation theorems for this space. In addition, we show that the Rademacher complexity has the optimal upper bounds for these spaces.
Variational limits of k-NN graph based functionals on data clouds
We consider i.i.d. samples $x_1, \dots, x_n$ from a measure $\nu$ with density supported on a bounded Euclidean domain $D \subseteq R^d $ where $d\geq3$. A graph on the point cloud is obtained by connecting two points if one of them is among the $k$-nearest neighbors of the other. Our goal is to study consistency of graph based procedures to clustering, classification and dimensionality reduction by studying the variational convergence of the graph total variation associated to such $k$-NN graph. We prove that provided $k:=k_n$ scales like $n \gg k_n \gg \log(n)$, then the $\Gamma$-convergence of the graph total variation towards an appropriate weighted total variation is guaranteed.
Extended ASP tableaux and rule redundancy in normal logic programs
Järvisalo, Matti, Oikarinen, Emilia
We introduce an extended tableau calculus for answer set programming (ASP). The proof system is based on the ASP tableaux defined in [Gebser&Schaub, ICLP 2006], with an added extension rule. We investigate the power of Extended ASP Tableaux both theoretically and empirically. We study the relationship of Extended ASP Tableaux with the Extended Resolution proof system defined by Tseitin for sets of clauses, and separate Extended ASP Tableaux from ASP Tableaux by giving a polynomial-length proof for a family of normal logic programs P_n for which ASP Tableaux has exponential-length minimal proofs with respect to n. Additionally, Extended ASP Tableaux imply interesting insight into the effect of program simplification on the lengths of proofs in ASP. Closely related to Extended ASP Tableaux, we empirically investigate the effect of redundant rules on the efficiency of ASP solving. To appear in Theory and Practice of Logic Programming (TPLP).