- North America > United States > New Jersey > Mercer County > Princeton (0.04)
- Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
- Europe > Denmark > Capital Region > Kongens Lyngby (0.04)
- Telecommunications > Networks (0.34)
- Information Technology > Networks (0.34)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Cognitive Science (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Supplementary Material for "Partial Optimal Transport with Applications on Positive-Unlabeled Learning"
The proof involves three steps. With a constant A > 2ξ, the GW formulation involves pairs of points, which yields the following cases. Case 1: a > 0. In that case, φ(γ) is a convex function whose minimum on [0, 1] is reached at the γ derived in Section 1.2.2 of the supplemental, from which the claim can be established. The partial-OT computation is based on an augmented problem with a dummy point and, as such, is convex. In contrast, the GW problem is non-convex and, although the algorithm is proved to converge, there is no guarantee that the global optimum is reached. The quality of the solution is therefore highly dependent on the initialization.
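To make the dummy-point construction concrete, below is a minimal Python sketch (not the authors' code) of partial OT via this augmentation, using the POT library's exact solver; the function name, the choice of A as the maximum cost, and the toy data are assumptions for illustration.

```python
# Minimal sketch of partial OT via a dummy-point augmentation (assumes the
# POT library, `pip install pot`; not the authors' code). Adding one dummy
# point per side turns the partial problem into an ordinary, convex OT
# problem solved exactly by linear programming.
import numpy as np
import ot

def partial_ot_dummy(a, b, M, m, A):
    """Transport exactly mass m <= min(a.sum(), b.sum()); the dummies
    absorb the leftover mass at zero cost, while a large dummy-to-dummy
    cost 2*A (cf. the constant A > 2*xi above) keeps the solver from
    routing mass between the two dummies."""
    n1, n2 = M.shape
    M_aug = np.zeros((n1 + 1, n2 + 1))
    M_aug[:n1, :n2] = M
    M_aug[-1, -1] = 2 * A
    a_aug = np.append(a, b.sum() - m)    # each dummy holds the leftover mass
    b_aug = np.append(b, a.sum() - m)
    G_aug = ot.emd(a_aug, b_aug, M_aug)  # exact (convex) OT solve
    return G_aug[:n1, :n2]               # drop the dummy row/column

# Toy usage: transport half of the mass between two point clouds.
rng = np.random.default_rng(0)
x, y = rng.normal(size=(5, 2)), rng.normal(size=(6, 2))
M = ot.dist(x, y)
G = partial_ot_dummy(np.full(5, 0.2), np.full(6, 1 / 6), M, m=0.5, A=M.max())
print(G.sum())  # ~0.5: exactly half the mass is transported
```

POT also ships a dedicated ot.partial module; the manual augmentation above is shown only to mirror the construction discussed in the text.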
- Europe > France > Normandy > Seine-Maritime > Rouen (0.04)
- North America > Canada (0.04)
- Europe > Netherlands > South Holland > Delft (0.04)
- North America > Canada (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Germany > Hamburg (0.04)
Understanding Generalization in Deep Learning via Tensor Methods
Li, Jingling, Sun, Yanchao, Su, Jiahao, Suzuki, Taiji, Huang, Furong
Deep neural networks generalize well on unseen data even though the number of parameters often far exceeds the number of training examples. Recently proposed complexity measures have provided insight into the generalizability of neural networks from the perspectives of PAC-Bayes, robustness, overparametrization, compression, and so on. In this work, we advance the understanding of the relation between a network's architecture and its generalizability from the compression perspective. Using tensor analysis, we propose a series of intuitive, data-dependent, and easily measurable properties that tightly characterize the compressibility and generalizability of neural networks; in practice, our generalization bound outperforms previous compression-based bounds, especially for neural networks using tensors as their weight kernels (e.g., CNNs). Moreover, these intuitive measurements provide further insight into designing neural network architectures with properties favorable for better or guaranteed generalizability. Our experimental results demonstrate that, through the proposed measurable properties, our generalization error bound matches the trend of the test error well. Our theoretical analysis further provides justification for the empirical success and limitations of some widely used tensor-based compression approaches. We also observe improvements in the compressibility and robustness of current neural networks when tensor operations are incorporated via our proposed layer-wise structure.
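As a loose illustration of compression-based reasoning (not the paper's actual properties or bound), the sketch below computes a standard compressibility proxy: the numerical rank of a matricized convolution kernel. All names here are hypothetical.

```python
# Loose illustration only (not the paper's measures): a common proxy for
# the compressibility of a convolutional layer is the numerical rank of
# its matricized weight kernel -- the fewer singular values needed to
# reach a given reconstruction error, the more compressible the layer.
import numpy as np

def numerical_rank(kernel, eps=0.05):
    """Smallest r such that the best rank-r approximation of the
    matricized kernel has relative Frobenius error at most eps."""
    h, w, c_in, c_out = kernel.shape
    mat = kernel.reshape(h * w * c_in, c_out)   # unfold along output mode
    s = np.linalg.svd(mat, compute_uv=False)
    total = np.sum(s ** 2)
    # tail[r] = squared error of the best rank-r approximation
    tail = np.concatenate([np.cumsum((s ** 2)[::-1])[::-1], [0.0]])
    for r in range(len(s) + 1):
        if tail[r] <= (eps ** 2) * total:
            return r

# A random 3x3x64x128 kernel is close to full rank (128); trained kernels
# tend to be more compressible, which is what compression bounds exploit.
K = np.random.default_rng(0).normal(size=(3, 3, 64, 128))
print(numerical_rank(K))
```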
- Africa > Senegal > Kolda Region > Kolda (0.04)
- North America > United States > Maryland > Prince George's County > College Park (0.04)
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
Simple Strategies in Multi-Objective MDPs (Technical Report)
Delgrange, Florent, Katoen, Joost-Pieter, Quatmann, Tim, Randour, Mickael
We consider the verification of multiple expected-reward objectives at once on Markov decision processes (MDPs). This enables a trade-off analysis among multiple objectives by obtaining the Pareto front. We focus on strategies that are easy to employ and implement, that is, strategies that are pure (no randomization) and have bounded memory. We show that checking whether a point is achievable by a pure stationary strategy is NP-complete, even for two objectives, and we provide a mixed-integer linear programming (MILP) encoding to solve the corresponding problem. The bounded-memory case can be reduced to the stationary one by a product construction. Experimental results using Storm and Gurobi show the feasibility of our algorithms.
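To illustrate the kind of MILP encoding described (a hedged sketch, not the paper's exact constraints), the following checks achievability of a reward point by a pure stationary strategy, using the PuLP library. All identifiers and the transient-MDP assumption (an implicit absorbing sink, so expected visit counts are finite) are assumptions for illustration.

```python
# Hedged sketch of an MILP achievability check (not the paper's encoding).
# Binary x[s,a] picks exactly one action per state (pure, stationary);
# continuous y[s,a] is the expected number of times (s,a) is taken; a
# big-M constraint couples them so only chosen actions carry flow.
# Assumes a transient MDP: outgoing probabilities may sum to < 1, with
# the missing mass leaking to an implicit absorbing sink.
import pulp

def achievable_by_pure_stationary(states, actions, P, R, init, point, BIG=1e4):
    # P[s][a] -> list of (next_state, prob); R[k][s][a] -> reward of objective k
    prob = pulp.LpProblem("pure_stationary_check", pulp.LpMinimize)
    x = {(s, a): pulp.LpVariable(f"x_{s}_{a}", cat="Binary")
         for s in states for a in actions[s]}
    y = {(s, a): pulp.LpVariable(f"y_{s}_{a}", lowBound=0)
         for s in states for a in actions[s]}
    prob += pulp.lpSum([])  # pure feasibility check: constant objective
    for s in states:
        # exactly one action chosen per state
        prob += pulp.lpSum(x[s, a] for a in actions[s]) == 1
        # expected-visit flow conservation
        inflow = pulp.lpSum(pr * y[t, a]
                            for t in states for a in actions[t]
                            for (nxt, pr) in P[t][a] if nxt == s)
        prob += pulp.lpSum(y[s, a] for a in actions[s]) == init.get(s, 0.0) + inflow
        for a in actions[s]:
            prob += y[s, a] <= BIG * x[s, a]  # only chosen actions carry flow
    # the candidate point must be met in every objective
    for k in range(len(R)):
        prob += pulp.lpSum(R[k][s][a] * y[s, a]
                           for s in states for a in actions[s]) >= point[k]
    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    return pulp.LpStatus[prob.status] == "Optimal"
```

Iterating such checks over candidate points is one way to approximate the Pareto front; the paper's tool-based evaluation (Storm, Gurobi) works on this style of encoding at scale.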
- North America > United States > New York > New York County > New York City (0.04)
- Europe > Germany > North Rhine-Westphalia > Cologne Region > Aachen (0.04)
- Europe > Belgium (0.04)