A Proofs
In this section, we provide proofs of the main theorems presented in the paper: the "task-specific generalization bound", which bounds the generalization error averaged over the observed tasks, and the "task environment generalization bound", which bounds the transfer error from the observed tasks to the underlying task environment. Combining Eq.(15) with Eq.(16), it is straightforward to obtain Eq.(4), with constant C(δ, λ, β, n, m). For the "task environment generalization bound", we define the "meta-training" generalization error; this bound is the same as the one in Theorem 2. The inequality uses Jensen's inequality to move the logarithm. We can therefore rewrite Eq.(24) in the form of an implicit gradient. The Monte-Carlo gradient estimator of the second term of Eq.(25) has the same high-variance problem as the policy gradient method, which causes unreliable inference without a warm start. Pseudocode for PACMAML is shown in Algorithm 1. Each iteration in the PACOH and PACMAML settings takes about 0.03–0.06 s; the PACOH and PACMAML results are obtained from the same set of experiments as Figures 1–4. In Figure 1, we show a comparison between the total bounds of PACOH and PACMAML.
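To make the high-variance issue concrete, here is a minimal sketch, not from the paper, contrasting a score-function (REINFORCE-style) Monte-Carlo gradient estimator with a pathwise (reparameterisation) estimator on the toy objective J(μ) = E_{z∼N(μ,1)}[z²], whose true gradient is 2μ:

```python
import numpy as np

rng = np.random.default_rng(0)
mu, n = 2.0, 100_000

# Objective: J(mu) = E_{z ~ N(mu, 1)}[z^2]; the true gradient dJ/dmu = 2*mu.
eps = rng.standard_normal(n)
z = mu + eps

# Score-function estimator: f(z) * d/dmu log N(z; mu, 1) = z^2 * (z - mu)
score_grads = z**2 * (z - mu)

# Pathwise (reparameterisation) estimator: d/dmu (mu + eps)^2 = 2*(mu + eps)
pathwise_grads = 2.0 * z

print(score_grads.mean(), pathwise_grads.mean())  # both ≈ 2*mu = 4
print(score_grads.var(), pathwise_grads.var())    # score-function variance is far larger
```

Both estimators are unbiased, but the score-function estimator's per-sample variance is an order of magnitude higher here, which is the same effect that makes warm-starting necessary in the policy-gradient-style setting described above.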
Information Limits for Detecting a Subhypergraph
We consider the problem of recovering a subhypergraph based on an observed adjacency tensor corresponding to a uniform hypergraph. The uniform hypergraph is assumed to contain a subset of vertices, called the subhypergraph. The edges restricted to the subhypergraph are assumed to follow a different probability distribution than the other edges. We consider both weak recovery and exact recovery of the subhypergraph, and establish information-theoretic limits in each case. Specifically, we establish sharp conditions under which the subhypergraph can be weakly or exactly recovered from an information-theoretic point of view. These conditions are fundamentally different from their counterparts derived in the hypothesis testing literature.
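As a rough illustration of the planted model (the parameters n, k, p, q below are invented for the sketch, not taken from the paper), the following generates a 3-uniform hypergraph whose hyperedges inside a planted vertex subset appear with elevated probability, then attempts weak recovery by selecting the highest-degree vertices:

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)
n, k = 30, 10            # n vertices; planted subhypergraph on k of them
p, q = 0.1, 0.8          # hyperedge probability outside / inside the planted set
planted = set(range(k))  # vertices 0..k-1 carry the planted subhypergraph

# Sample every 3-uniform hyperedge and accumulate vertex degrees.
degree = np.zeros(n)
for e in itertools.combinations(range(n), 3):
    prob = q if set(e) <= planted else p
    if rng.random() < prob:
        for v in e:
            degree[v] += 1

# Naive weak-recovery estimate: the k highest-degree vertices.
estimate = set(int(v) for v in np.argsort(degree)[-k:])
overlap = len(estimate & planted) / k
print(overlap)  # fraction of planted vertices recovered
```

With this wide gap between q and p the degree statistic separates the planted vertices easily; the information-theoretic limits in the paper characterise when any estimator can (or cannot) achieve such recovery as the gap shrinks.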
PAC-Bayesian Contrastive Unsupervised Representation Learning
Nozawa, Kento, Germain, Pascal, Guedj, Benjamin
Contrastive unsupervised representation learning (CURL) is the state-of-the-art technique to learn representations (as a set of features) from unlabelled data. While CURL has collected several empirical successes recently, theoretical understanding of its performance is still missing. In a recent work, Arora et al. (2019) provide the first generalisation bounds for CURL, relying on Rademacher complexity. We extend their framework to the flexible PAC-Bayes setting, allowing us to deal with the non-iid setting. We present PAC-Bayesian generalisation bounds for CURL, which are then used to derive a new representation learning algorithm. Numerical experiments on real-life datasets illustrate that our algorithm achieves competitive accuracy and yields generalisation bounds with non-vacuous values.
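As a rough sketch of the contrastive objective underlying CURL (an InfoNCE-style loss on assumed pre-computed embeddings; this is a generic illustration, not the authors' PAC-Bayesian algorithm):

```python
import numpy as np

def info_nce_loss(anchors, positives, temperature=0.1):
    """InfoNCE-style contrastive loss on batches of embeddings.

    Each anchor's positive is the same-index row of `positives`;
    the other rows in the batch act as negatives.
    """
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature                 # scaled cosine similarities
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))            # cross-entropy vs. the diagonal

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 16))
aligned = info_nce_loss(x, x + 0.01 * rng.standard_normal((8, 16)))
unrelated = info_nce_loss(x, rng.standard_normal((8, 16)))
print(aligned, unrelated)  # aligned pairs give a much lower loss
```

The loss is small when each representation is close to its positive and far from the negatives, which is the notion of quality the paper's generalisation bounds quantify.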