factorization
- North America > United States (0.14)
- Europe > Switzerland > Zürich > Zürich (0.14)
- Europe > United Kingdom > England > Greater London > London (0.04)
- Workflow (0.46)
- Research Report > New Finding (0.45)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
- Information Technology > Security & Privacy (0.92)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States > California (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > France > Grand Est > Bas-Rhin > Strasbourg (0.04)
- Health & Medicine (0.46)
- Government (0.46)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Germany > Lower Saxony > Gottingen (0.04)
- Europe > Switzerland > Zürich > Zürich (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.95)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > Russia (0.04)
- Asia > Russia (0.04)
- Asia > Middle East > Jordan (0.04)
- Europe > Switzerland (0.04)
Implicit bias as a Gauge correction: Theory and Inverse Design
Aladrah, Nicola, Ballarin, Emanuele, Biagetti, Matteo, Ansuini, Alessio, d'Onofrio, Alberto, Anselmi, Fabio
A central problem in machine learning theory is to characterize how learning dynamics select particular solutions among the many compatible with the training objective, a phenomenon, called implicit bias, which remains only partially characterized. In the present work, we identify a general mechanism, in terms of an explicit geometric correction of the learning dynamics, for the emergence of implicit biases, arising from the interaction between continuous symmetries in the model's parametrization and stochasticity in the optimization process. Our viewpoint is constructive in two complementary directions: given model symmetries, one can derive the implicit bias they induce; conversely, one can inverse-design a wide class of different implicit biases by computing specific redundant parameterizations. More precisely, we show that, when the dynamics is expressed in the quotient space obtained by factoring out the symmetry group of the parameterization, the resulting stochastic differential equation gains a closed form geometric correction in the stationary distribution of the optimizer dynamics favoring orbits with small local volume. We compute the resulting symmetry induced bias for a range of architectures, showing how several well known results fit into a single unified framework. The approach also provides a practical methodology for deriving implicit biases in new settings, and it yields concrete, testable predictions that we confirm by numerical simulations on toy models trained on synthetic data, leaving more complex scenarios for future work. Finally, we test the implicit bias inverse-design procedure in notable cases, including biases toward sparsity in linear features or in spectral properties of the model parameters.
- Europe > Italy > Friuli Venezia Giulia > Trieste Province > Trieste (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > Massachusetts (0.04)
- (3 more...)
A review of NMF, PLSA, LBA, EMA, and LCA with a focus on the identifiability issue
Qi, Qianqian, van der Heijden, Peter G. M.
Across fields such as machine learning, social science, geography, considerable attention has been given to models that factorize a nonnegative matrix into the product of two or three matrices, subject to nonnegative or row-sum-to-1 constraints. Although these models are to a large extend similar or even equivalent, they are presented under different names, and their similarity is not well known. This paper highlights similarities among five popular models, latent budget analysis (LBA), latent class analysis (LCA), end-member analysis (EMA), probabilistic latent semantic analysis (PLSA), and nonnegative matrix factorization (NMF). We focus on an essential issue-identifiability-of these models and prove that the solution of LBA, EMA, LCA, PLSA is unique if and only if the solution of NMF is unique. We also provide a brief review for algorithms of these models. We illustrate the models with a time budget dataset from social science, and end the paper with a discussion of closely related models such as archetypal analysis.
- Asia > Middle East > Jordan (0.04)
- Asia > Singapore (0.04)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- (9 more...)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
- Asia > Middle East > Israel > Haifa District > Haifa (0.04)
- North America > United States > California > Los Angeles County > Long Beach (0.04)
- North America > Canada (0.04)
- (4 more...)