
Collaborating Authors: Meila, Marina


Isometry pursuit

arXiv.org Machine Learning

Isometry pursuit is a convex algorithm for identifying orthonormal column-submatrices of wide matrices. It consists of a novel normalization method followed by multitask basis pursuit. Applied to Jacobians of putative coordinate functions, it helps identify isometric embeddings from within interpretable dictionaries. We provide theoretical and experimental results justifying this method. For problems involving coordinate selection and diversification, it offers a synergistic alternative to greedy and brute force search.
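
As a concrete illustration, here is a minimal sketch of the two-stage idea in Python with cvxpy. The unit-norm column scaling is a stand-in for the paper's normalization method (which may differ), and the multitask basis pursuit step minimizes the $\ell_{1,2}$ mixed norm subject to reproducing the identity; the toy data and threshold-free top-$d$ selection are illustrative choices, not the paper's exact procedure.

```python
import numpy as np
import cvxpy as cp

def isometry_pursuit(X):
    """Sketch: select d columns of a wide d x p matrix X that form an
    (approximately) orthonormal submatrix, via multitask basis pursuit.
    The unit-norm column scaling below is a stand-in for the paper's
    normalization, which may differ."""
    d, p = X.shape
    Xn = X / np.linalg.norm(X, axis=0, keepdims=True)
    # Multitask basis pursuit: row-sparse W with Xn @ W = I_d, minimizing
    # the l_{1,2} mixed norm (sum of Euclidean row norms).
    W = cp.Variable((p, d))
    problem = cp.Problem(cp.Minimize(cp.sum(cp.norm(W, axis=1))),
                         [Xn @ W == np.eye(d)])
    problem.solve()
    row_norms = np.linalg.norm(W.value, axis=1)
    return np.argsort(row_norms)[::-1][:d]   # columns with largest row norms

# Usage: hide a random orthonormal basis among noise columns, recover it.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
X = np.hstack([Q, 0.3 * rng.standard_normal((3, 7))])
print(np.sort(isometry_pursuit(X)))   # ideally [0 1 2]
```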


Dictionary-based Manifold Learning

arXiv.org Artificial Intelligence

We propose a paradigm for interpretable Manifold Learning for scientific data analysis, whereby we parametrize a manifold with $d$ smooth functions from a scientist-provided dictionary of meaningful, domain-related functions. When such a parametrization exists, we provide an algorithm for finding it based on sparse non-linear regression in the manifold tangent bundle, bypassing more standard manifold learning algorithms. We also discuss conditions for the existence of such parametrizations in function space and for successful recovery from finite samples. We demonstrate our method with experimental results from a real scientific domain.
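
The selection step can be pictured as a group-sparse regression in estimated tangent spaces. The sketch below is an illustrative stand-in, not the paper's algorithm: tangent spaces come from local PCA, sklearn's MultiTaskLasso plays the role of the group-sparse regression, and the toy dictionary, its gradients, and the parameter alpha are all assumptions.

```python
import numpy as np
from sklearn.linear_model import MultiTaskLasso

def select_dictionary_functions(points, grads, d, n_neighbors=10, alpha=0.05):
    """Illustrative stand-in for the tangent-bundle selection step:
    estimate each tangent space by local PCA, project the dictionary
    gradients onto it, and group-sparsely regress the tangent
    coordinates on the projected gradients.
    grads has shape (n_points, n_dict, ambient_dim)."""
    n, p, D = grads.shape
    rows_X, rows_Y = [], []
    for i in range(n):
        # Local PCA: d-dimensional tangent basis from nearest neighbors.
        dists = np.linalg.norm(points - points[i], axis=1)
        nbrs = points[np.argsort(dists)[1:n_neighbors + 1]] - points[i]
        _, _, Vt = np.linalg.svd(nbrs, full_matrices=False)
        T = Vt[:d]                         # rows span the estimated tangent space
        rows_X.append((grads[i] @ T.T).T)  # d equations per point
        rows_Y.append(np.eye(d))           # target: the tangent coordinates
    X, Y = np.vstack(rows_X), np.vstack(rows_Y)
    # MultiTaskLasso zeroes out whole dictionary functions at once,
    # shared across all points and tangent directions.
    model = MultiTaskLasso(alpha=alpha, fit_intercept=False).fit(X, Y)
    return np.flatnonzero(np.linalg.norm(model.coef_, axis=0) > 1e-8)

# Toy usage: unit circle (d=1) with the dictionary {x, y, x*y}.
theta = np.linspace(0, 2 * np.pi, 60, endpoint=False)
pts = np.c_[np.cos(theta), np.sin(theta)]
grads = np.stack([np.tile([1.0, 0.0], (60, 1)),      # grad of x
                  np.tile([0.0, 1.0], (60, 1)),      # grad of y
                  np.c_[pts[:, 1], pts[:, 0]]],      # grad of x*y
                 axis=1)
print(select_dictionary_functions(pts, grads, d=1))  # indices of selected functions
```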


The Parametric Stability of Well-separated Spherical Gaussian Mixtures

arXiv.org Artificial Intelligence

We quantify the parameter stability of a spherical Gaussian Mixture Model (sGMM) under small perturbations in distribution space. Namely, we derive the first explicit bound showing that, for a mixture of spherical Gaussians $P$ (sGMM) in a pre-defined model class, every other sGMM in this model class that is close to $P$ in total variation distance also has a small parameter distance to $P$. Further, this upper bound depends only on $P$. The motivation for this work lies in providing guarantees for fitting Gaussian mixtures; with this aim in mind, all the constants involved are well defined, and the conditions for fitting mixtures of spherical Gaussians are distribution free. Our results tighten considerably the existing computable bounds, and asymptotically match the known sharp thresholds for this problem.
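
A small numeric illustration of the quantities being related: perturbing the means of a well-separated sGMM by a small amount produces a correspondingly small total variation distance, estimated here by Monte Carlo importance sampling. The mixture, the perturbation directions, and the sample size are illustrative choices, not values from the paper.

```python
import numpy as np
from scipy.stats import multivariate_normal

def sgmm_pdf(x, means, weights, sigma):
    """Density of a spherical GMM with shared variance sigma^2."""
    return sum(w * multivariate_normal.pdf(x, mean=m, cov=sigma**2)
               for w, m in zip(weights, means))

def tv_distance_mc(means1, means2, weights, sigma, n=200_000, seed=0):
    """Monte Carlo estimate of the total variation distance between two
    sGMMs, sampling from the first: TV = 0.5 * E_{x~P}[|1 - q(x)/p(x)|]."""
    rng = np.random.default_rng(seed)
    comps = rng.choice(len(weights), size=n, p=weights)
    x = means1[comps] + sigma * rng.standard_normal((n, means1.shape[1]))
    p = sgmm_pdf(x, means1, weights, sigma)
    q = sgmm_pdf(x, means2, weights, sigma)
    return 0.5 * np.mean(np.abs(1 - q / p))

# Well-separated means; perturb them slightly and compare distances.
means = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
weights = np.array([1 / 3, 1 / 3, 1 / 3])
for eps in [0.01, 0.1, 0.5]:
    perturbed = means + eps * np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
    param_dist = np.linalg.norm(means - perturbed)
    tv = tv_distance_mc(means, perturbed, weights, 1.0)
    print(f"parameter distance {param_dist:.3f}  ->  TV ~ {tv:.4f}")
```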


Double Diffusion Maps and their Latent Harmonics for Scientific Computations in Latent Space

arXiv.org Artificial Intelligence

We introduce a data-driven approach to building reduced dynamical models through manifold learning; the reduced latent space is discovered using Diffusion Maps (a manifold learning technique) on time series data. A second round of Diffusion Maps on those latent coordinates allows the approximation of the reduced dynamical models. This second round enables mapping the latent space coordinates back to the full ambient space (what is called lifting); it also enables the approximation of full state functions of interest in terms of the reduced coordinates. In our work, we develop and test three different reduced numerical simulation methodologies, either through pre-tabulation in the latent space and integration on the fly, or by going back and forth between the ambient space and the latent space. The data-driven latent space simulation results, based on the three different approaches, are validated (a) by observing the full simulation in the latent space through the Nystr\"om Extension formula, or (b) by lifting the reduced trajectory back to the full ambient space via Latent Harmonics. Latent space modeling often involves additional regularization to favor certain properties of the space over others, and the mapping back to the ambient space is then constructed mostly independently from these properties; here, we use the same data-driven approach to construct the latent space and then map back to the ambient space.
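
A compact sketch of the two ingredients, under stated simplifications: a basic Diffusion Maps construction (Gaussian kernel, alpha=1 density normalization) and kernel ridge interpolation as a simplified stand-in for the Latent Harmonics / Nystr\"om lifting. The paper's constructions are richer; the noisy-circle data and bandwidth choice are assumptions for illustration.

```python
import numpy as np

def diffusion_maps(X, n_coords=2, eps=None):
    """Basic Diffusion Maps: Gaussian kernel, alpha=1 density
    normalization, leading nontrivial eigenvectors as coordinates."""
    D2 = np.sum((X[:, None] - X[None]) ** 2, axis=-1)
    eps = np.median(D2) if eps is None else eps
    K = np.exp(-D2 / eps)
    q = K.sum(axis=1)
    K = K / np.outer(q, q)              # remove sampling-density effects
    d = K.sum(axis=1)
    S = K / np.sqrt(np.outer(d, d))     # symmetric conjugate of the Markov matrix
    vals, vecs = np.linalg.eigh(S)
    idx = np.argsort(vals)[::-1][1:n_coords + 1]  # skip the trivial eigenvector
    phi = vecs[:, idx] / np.sqrt(d)[:, None]      # right eigenvectors of the Markov matrix
    return phi * vals[idx]

def lift(Z, F, Z_new, eps=None, reg=1e-8):
    """Kernel ridge interpolation of ambient-space functions F from
    latent coordinates Z to new latent points Z_new -- a simplified
    stand-in for the Latent Harmonics / Nystrom lifting."""
    D2 = np.sum((Z[:, None] - Z[None]) ** 2, axis=-1)
    eps = np.median(D2) if eps is None else eps
    coeff = np.linalg.solve(np.exp(-D2 / eps) + reg * np.eye(len(Z)), F)
    D2n = np.sum((Z_new[:, None] - Z[None]) ** 2, axis=-1)
    return np.exp(-D2n / eps) @ coeff

# Usage: latent coordinates of a noisy circle; lift held-out points back.
rng = np.random.default_rng(0)
t = np.linspace(0, 2 * np.pi, 120, endpoint=False)
X = np.c_[np.cos(t), np.sin(t), 0.05 * rng.standard_normal(120)]
Z = diffusion_maps(X, n_coords=2)
X_rec = lift(Z[::2], X[::2], Z[1::2])
print(np.abs(X_rec - X[1::2]).max())   # small reconstruction error
```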


A class of network models recoverable by spectral clustering

arXiv.org Machine Learning

Finding communities in networks is a problem that remains difficult, in spite of the amount of attention it has recently received. The Stochastic Block-Model (SBM) is a generative model for graphs with "communities" for which, because of its simplicity, the theoretical understanding has advanced quickly in recent years. In particular, there have been various results showing that simple versions of spectral clustering using the Normalized Laplacian of the graph can recover the communities almost perfectly with high probability. Here we show that essentially the same algorithm used for the SBM, and for its extension called the Degree-Corrected SBM, works on a wider class of Block-Models, which we call Preference Frame Models, with essentially the same guarantees. Moreover, the parametrization we introduce clearly exhibits the free parameters needed to specify this class of models, and results in bounds that expose with more clarity the parameters that control the recovery error in this model class.
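
For reference, the algorithm family these guarantees concern is standard normalized-Laplacian spectral clustering, sketched below on a toy two-block SBM. The edge probabilities, graph size, and use of sklearn's KMeans are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans

def spectral_clustering(A, k):
    """Normalized-Laplacian spectral clustering: embed nodes with the
    top-k eigenvectors of D^{-1/2} A D^{-1/2}, then cluster the rows."""
    d = A.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    L_sym = d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    vals, vecs = np.linalg.eigh(L_sym)
    U = vecs[:, np.argsort(vals)[::-1][:k]]            # top-k eigenvectors
    U = U / np.linalg.norm(U, axis=1, keepdims=True)   # row-normalize
    return KMeans(n_clusters=k, n_init=10).fit_predict(U)

# Toy SBM: two communities, within-probability 0.3, between-probability 0.05.
rng = np.random.default_rng(0)
n = 200
labels = np.repeat([0, 1], n // 2)
P = np.where(labels[:, None] == labels[None, :], 0.3, 0.05)
A = (rng.random((n, n)) < P).astype(float)
A = np.triu(A, 1); A = A + A.T
pred = spectral_clustering(A, 2)
print((pred == labels).mean())   # near 0 or 1: labels recovered up to permutation
```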


Guarantees for Hierarchical Clustering by the Sublevel Set method

arXiv.org Machine Learning

Compared to (simple) clustering of data into K clusters, hierarchical clustering is much more complex and much less understood. One of the few seminal advances in hierarchical clustering is the introduction by Dasgupta (2016) of a general yet simple paradigm of hierarchical clustering as loss minimization. This paradigm was expanded by Charikar and Chatziafratis (2016) and Roy and Pokutta (2016). The latter work also introduces a new set of techniques for obtaining hierarchical clusterings, by showing that optimizing the loss can be relaxed to a Linear Program (LP). This paper introduces the first method to obtain optimality guarantees in the context of hierarchical clustering. Specifically, it is shown that the Sublevel Set (SS) paradigm invented by Meila (2018) for simple, non-hierarchical clustering can be extended to hierarchical clustering as well. The main contribution is to show that there is a natural distance between hierarchical clusterings whose properties can be exploited in the setting of the SS problem we present in Section 3. The Sublevel Set method produces stability theorems of the following form.
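
For concreteness, the loss being minimized is Dasgupta's (2016) cost, which the sketch below computes for a similarity matrix and a binary merge tree. The nested-tuple tree encoding and the toy similarity matrix are illustrative choices.

```python
import numpy as np

def dasgupta_cost(W, tree):
    """Dasgupta's (2016) hierarchical clustering loss: sum over pairs
    i < j of W[i, j] times the number of leaves under the least common
    ancestor of i and j. `tree` is a nested tuple of leaf indices,
    e.g. ((0, 1), (2, 3))."""
    def leaves(t):
        return [t] if isinstance(t, int) else leaves(t[0]) + leaves(t[1])

    def cost(t):
        if isinstance(t, int):
            return 0.0
        left, right = leaves(t[0]), leaves(t[1])
        size = len(left) + len(right)
        # Pairs split at this node have this node as their LCA.
        split = sum(W[i, j] for i in left for j in right)
        return size * split + cost(t[0]) + cost(t[1])

    return cost(tree)

# Two tight pairs {0,1} and {2,3}: merging them last is cheaper.
W = np.array([[0, 9, 1, 1],
              [9, 0, 1, 1],
              [1, 1, 0, 9],
              [1, 1, 9, 0]], float)
print(dasgupta_cost(W, ((0, 1), (2, 3))))   # 2*9 + 2*9 + 4*4 = 52
print(dasgupta_cost(W, (((0, 1), 2), 3)))   # 68, a worse tree
```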


How to tell when a clustering is (approximately) correct using convex relaxations

Neural Information Processing Systems

We introduce the Sublevel Set (SS) method, a generic method to obtain sufficient guarantees of near-optimality and uniqueness (up to small perturbations) for a clustering. This method can be instantiated for a variety of clustering loss functions for which convex relaxations exist. Obtaining the guarantees in practice amounts to solving a convex optimization. We demonstrate the applicability of this method by obtaining distribution free guarantees for K-means clustering on realistic data sets.
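
The convex-relaxation ingredient can be sketched as follows: the Peng-Wei SDP relaxation of K-means yields a lower bound on the optimal loss, and comparing it with the loss of a candidate clustering quantifies how close to optimal that clustering is. The full SS method extracts more than this value comparison; the cvxpy code below, with its toy data, only illustrates the relaxation step.

```python
import numpy as np
import cvxpy as cp

def kmeans_sdp_lower_bound(X, k):
    """Peng-Wei SDP relaxation of K-means: a convex lower bound on the
    optimal K-means loss over all clusterings into k clusters."""
    n = X.shape[0]
    D = np.sum((X[:, None] - X[None]) ** 2, axis=-1)   # squared distances
    Z = cp.Variable((n, n), PSD=True)
    constraints = [cp.trace(Z) == k,        # k clusters
                   Z >= 0,                  # entrywise nonnegative
                   cp.sum(Z, axis=1) == 1]  # rows sum to one
    prob = cp.Problem(cp.Minimize(0.5 * cp.sum(cp.multiply(D, Z))), constraints)
    prob.solve()
    return prob.value

def kmeans_loss(X, labels):
    """K-means loss: sum of squared distances to the cluster means."""
    return sum(np.sum((X[labels == c] - X[labels == c].mean(axis=0)) ** 2)
               for c in np.unique(labels))

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.2, (20, 2)), rng.normal(3, 0.2, (20, 2))])
labels = np.repeat([0, 1], 20)
lb, loss = kmeans_sdp_lower_bound(X, 2), kmeans_loss(X, labels)
print(f"lower bound {lb:.3f} <= loss {loss:.3f}; a small gap certifies near-optimality")
```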


Measuring the Robustness of Graph Properties

arXiv.org Machine Learning

In this paper, we propose a perturbation framework to measure the robustness of graph properties. Although perturbation methods have already been proposed to tackle this problem, they are limited by the fact that the strength of the perturbation cannot be well controlled. We first provide a perturbation framework on graphs by introducing weights on the nodes, in which the magnitude of the perturbation can be easily controlled through the variance of the weights. Meanwhile, the topology of the graph is also preserved, to avoid uncontrollable strength in the perturbation. We then extend the measures of robustness in the robust statistics literature to graph properties.
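
A minimal sketch of the node-weight perturbation idea: positive random node weights with variance controlled by sigma reweight the edges while preserving the zero pattern (topology), and the spread of a graph property across perturbations serves as a simple robustness measure. The lognormal weights and the algebraic-connectivity property are illustrative assumptions; the paper's robust-statistics functionals may differ.

```python
import numpy as np

def node_weight_perturbations(A, prop, sigma=0.1, n_trials=200, seed=0):
    """Draw positive node weights with controllable variance (lognormal,
    log-scale sigma), reweight edges as w_i * w_j * A_ij so the zero
    pattern is preserved, and record the distribution of `prop`."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    values = []
    for _ in range(n_trials):
        w = rng.lognormal(mean=0.0, sigma=sigma, size=n)
        values.append(prop(np.outer(w, w) * A))   # topology-preserving reweighting
    values = np.array(values)
    return values.mean(), values.std()

def algebraic_connectivity(W):
    """Second-smallest eigenvalue of the graph Laplacian of W."""
    L = np.diag(W.sum(axis=1)) - W
    return np.sort(np.linalg.eigvalsh(L))[1]

# Ring graph on 10 nodes; larger sigma means a stronger perturbation.
A = np.zeros((10, 10))
for i in range(10):
    A[i, (i + 1) % 10] = A[(i + 1) % 10, i] = 1.0
for sigma in [0.05, 0.2, 0.5]:
    m, s = node_weight_perturbations(A, algebraic_connectivity, sigma)
    print(f"sigma={sigma}: property mean {m:.3f}, std {s:.3f}")
```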