


Impossibility of latent inner product recovery via rate distortion

arXiv.org Machine Learning

In this largely expository note, we present an impossibility result for inner product recovery in a random geometric graph or latent space model using rate-distortion theory. More precisely, suppose that we observe a graph $A$ on $n$ vertices with average edge density $p$ generated from Gaussian or spherical latent locations $z_1, \dots, z_n \in \mathbb{R}^d$ associated with the $n$ vertices. It is of interest to estimate the inner products $\langle z_i, z_j \rangle$, which represent the geometry of the latent points. We prove that it is impossible to recover the inner products if $d \gtrsim n h(p)$, where $h(p)$ is the binary entropy function. This matches the condition required for positive results on inner product recovery in the literature. The proof follows well-established rate-distortion theory, with the main technical ingredient being a lower bound on the rate-distortion function of the Wishart distribution, which is interesting in its own right.
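
As a concrete illustration of the setup (our sketch, not from the note; the note does not fix a particular link function), the following simulates Gaussian latent locations with a thresholded inner-product link calibrated to edge density $p$, and records the binary entropy $h(p)$ that appears in the impossibility condition $d \gtrsim n h(p)$.

```python
import numpy as np

def sample_latent_graph(n=200, d=50, p=0.1, seed=0):
    """Toy latent-space graph with Gaussian locations.

    Assumption for illustration: edge {i, j} present iff <z_i, z_j> >= tau,
    with tau chosen so that the average edge density is roughly p.
    """
    rng = np.random.default_rng(seed)
    Z = rng.standard_normal((n, d))      # latent locations z_1, ..., z_n
    G = Z @ Z.T                          # inner products <z_i, z_j> (the target)
    iu = np.triu_indices(n, k=1)
    tau = np.quantile(G[iu], 1 - p)      # threshold calibrated to density p
    A = (G >= tau).astype(int)
    np.fill_diagonal(A, 0)
    return A, G

def binary_entropy(p):
    """h(p) = -p*log(p) - (1-p)*log(1-p), as in the condition d >~ n*h(p)."""
    return -p * np.log(p) - (1 - p) * np.log(1 - p)
```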


Information-Theoretic Thresholds for Planted Dense Cycles

arXiv.org Machine Learning

The Watts-Strogatz small-world model has been an influential random graph model since its proposal in 1998 due to the ubiquity of the small-world phenomenon in complex networks [WS98, Wat04]. In this model, there are n vertices with latent positions on a circle, and the vertices are more likely to be connected to their k-nearest geometric neighbors than to more distant vertices. In other words, a denser cycle of length n and width k is "planted" in the sparser ambient random graph on n vertices. Informally, the small-world model can also be viewed as an interpolation between a random geometric graph [Pen03], where edges exist only between vertices with nearby locations on a circle, and an Erdős-Rényi graph [ER59], where edges are random and independent. As a consequence, a small-world network tends to have a high clustering coefficient, owing to the geometry, while preserving the short typical distances between vertices of a random graph. While there has been extensive literature on small-world networks and geometric graphs, the associated statistical problems, such as detection and recovery of the latent geometry from the observed random graph, have only gained attention more recently.
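
As a toy illustration of the model described above (the parameter names q and p0 are ours), the following sampler plants a dense cycle of width k in a sparse ambient graph; q = p0 recovers an Erdős-Rényi graph, while p0 = 0 gives a pure geometric graph on the circle.

```python
import numpy as np

def planted_dense_cycle(n=500, k=5, q=0.5, p0=0.01, seed=0):
    """Plant a dense cycle of width k in a sparse ambient graph.

    q  : edge probability for pairs within circular distance k (assumed name)
    p0 : ambient edge probability for all other pairs (assumed name)
    """
    rng = np.random.default_rng(seed)
    i, j = np.triu_indices(n, k=1)
    circ_dist = np.minimum((j - i) % n, (i - j) % n)  # distance on the cycle
    prob = np.where(circ_dist <= k, q, p0)
    edges = rng.random(len(prob)) < prob
    A = np.zeros((n, n), dtype=int)
    A[i[edges], j[edges]] = 1
    return A + A.T
```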


Sharp analysis of EM for learning mixtures of pairwise differences

arXiv.org Artificial Intelligence

A common approach is to apply a spectral algorithm to initialize the parameters of interest, and then run a locally convergent nonconvex optimization algorithm; alternatively, one could even run the nonconvex algorithm from a random initialization. These nonconvex optimization algorithms have been extensively analyzed under Gaussian assumptions on the covariates (see, e.g., Balakrishnan et al., 2017; Klusowski et al., 2019; Chen et al., 2019; Kwon and Caramanis, 2020), and it is known that they can exhibit favorable behavior globally in some settings. In many tasks such as ranking (Bradley and Terry, 1952), crowd-labeling (Dawid and Skene, 1979) and web search (Chen et al., 2013), however, measurements are taken as pairwise comparisons between entities, which renders the covariates inherently discrete. These designs or covariates are far from Gaussian, and it is natural to ask how standard nonconvex optimization algorithms perform under such designs. Motivated by this question, we study how the celebrated expectation-maximization (EM) algorithm behaves when estimating a mixture of symmetric linear regressions under the pairwise comparison design.
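
To make the setting concrete, here is a generic EM sketch for a symmetric two-component mixture of linear regressions with pairwise-comparison covariates x = e_i - e_j (our notation and a standard textbook update, not the paper's sharp analysis): the E-step computes tanh posterior weights and the M-step solves a weighted least-squares problem.

```python
import numpy as np

def em_symmetric_mixed_regression(X, y, beta0, sigma=1.0, iters=50):
    """EM for y = z * <x, beta> + noise, with hidden sign z uniform on {+1, -1}.

    X: (m, n) design matrix; each row is e_i - e_j for a comparison (i, j).
    Standard update for the symmetric mixture (generic sketch):
      w      = tanh(y * <x, beta> / sigma^2)        # posterior mean of z
      beta  <- argmin_b sum_k (w_k * y_k - <x_k, b>)^2
    """
    beta = beta0.copy()
    XtX = X.T @ X
    for _ in range(iters):
        w = np.tanh(y * (X @ beta) / sigma**2)
        # lstsq handles the singular direction (all-ones null space of
        # comparison designs) via the minimum-norm solution
        beta = np.linalg.lstsq(XtX, X.T @ (w * y), rcond=None)[0]
    return beta

def comparison_row(i, j, n):
    """Covariate e_i - e_j encoding a pairwise comparison of items i and j."""
    x = np.zeros(n)
    x[i], x[j] = 1.0, -1.0
    return x
```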


Detection of Dense Subhypergraphs by Low-Degree Polynomials

arXiv.org Machine Learning

Detection of a planted dense subgraph in a random graph is a fundamental statistical and computational problem that has been extensively studied in recent years. We study a hypergraph version of the problem. Let $G^r(n,p)$ denote the $r$-uniform Erd\H{o}s-R\'enyi hypergraph model with $n$ vertices and edge density $p$. We consider detecting the presence of a planted $G^r(n^\gamma, n^{-\alpha})$ subhypergraph in a $G^r(n, n^{-\beta})$ hypergraph, where $0< \alpha < \beta < r-1$ and $0 < \gamma < 1$. Focusing on tests that are degree-$n^{o(1)}$ polynomials of the entries of the adjacency tensor, we determine the threshold between the easy and hard regimes for the detection problem. More precisely, for $0 < \gamma < 1/2$, the threshold is given by $\alpha = \beta \gamma$, and for $1/2 \le \gamma < 1$, the threshold is given by $\alpha = \beta/2 + r(\gamma - 1/2)$. Our results are already new in the graph case $r=2$, as we consider the subtle log-density regime where hardness based on average-case reductions is not known. Our proof of low-degree hardness is based on a conditional variant of the standard low-degree likelihood calculation.
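
The easy/hard boundary stated above is piecewise linear in $\gamma$ and can be transcribed directly (helper ours; we take the easy regime to be $\alpha$ below the threshold, i.e., a denser planted subhypergraph).

```python
def detection_threshold(gamma, beta, r):
    """Low-degree detection threshold alpha*(gamma) from the abstract:
      alpha = beta * gamma                   for 0 < gamma < 1/2,
      alpha = beta / 2 + r * (gamma - 1/2)   for 1/2 <= gamma < 1.
    Smaller alpha means a denser planted part, hence the easier side.
    """
    assert 0 < gamma < 1 and 0 < beta < r - 1
    if gamma < 0.5:
        return beta * gamma
    return beta / 2 + r * (gamma - 0.5)
```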


Random graph matching at Otter's threshold via counting chandeliers

arXiv.org Machine Learning

We propose an efficient algorithm for graph matching based on similarity scores constructed from counting a certain family of weighted trees rooted at each vertex. For two Erd\H{o}s-R\'enyi graphs $\mathcal{G}(n,q)$ whose edges are correlated through a latent vertex correspondence, we show that this algorithm correctly matches all but a vanishing fraction of the vertices with high probability, provided that $nq\to\infty$ and the edge correlation coefficient $\rho$ satisfies $\rho^2>\alpha \approx 0.338$, where $\alpha$ is Otter's tree-counting constant. Moreover, this almost exact matching can be made exact under an extra condition that is information-theoretically necessary. This is the first polynomial-time graph matching algorithm that succeeds at an explicit constant correlation and applies to both sparse and dense graphs. In comparison, previous methods either require $\rho=1-o(1)$ or are restricted to sparse graphs. The crux of the algorithm is a carefully curated family of rooted trees called chandeliers, which allows effective extraction of the graph correlation from the counts of the same tree while suppressing the undesirable correlation between those of different trees.
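
The chandelier family itself is intricate; as a loose toy stand-in for the "similarity scores from rooted counts" idea (explicitly not the paper's method), one can compare rooted walk counts at each vertex and round the score matrix to a matching.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def rooted_walk_counts(A, depth=3):
    """Counts of walks of length 1..depth from each vertex: a crude,
    toy stand-in for the weighted chandelier counts of the paper."""
    n = A.shape[0]
    sig, vec = [], np.ones(n)
    for _ in range(depth):
        vec = A @ vec
        sig.append(vec.copy())
    return np.stack(sig, axis=1)              # (n, depth) signature per vertex

def match_by_signatures(A, B, depth=3):
    """Match vertices of A and B by similarity of their count signatures."""
    SA, SB = rooted_walk_counts(A, depth), rooted_walk_counts(B, depth)
    # similarity = negative squared distance between signatures
    scores = -((SA[:, None, :] - SB[None, :, :]) ** 2).sum(axis=2)
    _, col = linear_sum_assignment(-scores)   # maximize total similarity
    return col                                # col[i]: proposed match of vertex i
```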


Exact Matching of Random Graphs with Constant Correlation

arXiv.org Machine Learning

This paper deals with the problem of graph matching or network alignment for Erd\H{o}s--R\'enyi graphs, which can be viewed as a noisy average-case version of the graph isomorphism problem. Let $G$ and $G'$ be $G(n, p)$ Erd\H{o}s--R\'enyi graphs marginally, identified with their adjacency matrices. Assume that $G$ and $G'$ are correlated such that $\mathbb{E}[G_{ij} G'_{ij}] = p(1-\alpha)$. For a permutation $\pi$ representing a latent matching between the vertices of $G$ and $G'$, denote by $G^\pi$ the graph obtained from permuting the vertices of $G$ by $\pi$. Observing $G^\pi$ and $G'$, we aim to recover the matching $\pi$. In this work, we show that for every $\varepsilon \in (0,1]$, there is $n_0>0$ depending on $\varepsilon$ and absolute constants $\alpha_0, R > 0$ with the following property. Let $n \ge n_0$, $(1+\varepsilon) \log n \le np \le n^{\frac{1}{R \log \log n}}$, and $0 < \alpha < \min(\alpha_0,\varepsilon/4)$. There is a polynomial-time algorithm $F$ such that $\mathbb{P}\{F(G^\pi,G')=\pi\}=1-o(1)$. This is the first polynomial-time algorithm that recovers the exact matching between vertices of correlated Erd\H{o}s--R\'enyi graphs with constant correlation with high probability. The algorithm is based on comparison of partition trees associated with the graph vertices.
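
One standard construction realizing the stated correlation $\mathbb{E}[G_{ij} G'_{ij}] = p(1-\alpha)$ (shown for illustration; the paper's algorithm itself is based on partition trees, not on this sampler) subsamples a common parent graph.

```python
import numpy as np

def correlated_er_pair(n, p, alpha, seed=0):
    """Sample (G, G') that are G(n, p) marginally with E[G_ij G'_ij] = p(1 - alpha).

    Parent H ~ G(n, p / (1 - alpha)); each graph keeps every edge of H
    independently with probability 1 - alpha. One checks:
      P(G_ij = 1)   = p/(1-alpha) * (1-alpha)   = p
      E[G_ij G'_ij] = p/(1-alpha) * (1-alpha)^2 = p(1-alpha)
    """
    assert 0 < alpha < 1 and p / (1 - alpha) <= 1
    rng = np.random.default_rng(seed)
    iu = np.triu_indices(n, k=1)
    m = len(iu[0])
    parent = rng.random(m) < p / (1 - alpha)
    keep1 = parent & (rng.random(m) < 1 - alpha)
    keep2 = parent & (rng.random(m) < 1 - alpha)
    G, Gp = np.zeros((n, n), int), np.zeros((n, n), int)
    G[iu[0][keep1], iu[1][keep1]] = 1
    Gp[iu[0][keep2], iu[1][keep2]] = 1
    return G + G.T, Gp + Gp.T
```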


Optimal Spectral Recovery of a Planted Vector in a Subspace

arXiv.org Machine Learning

Recovering a planted vector $v$ in an $n$-dimensional random subspace of $\mathbb{R}^N$ is a generic task related to many problems in machine learning and statistics, such as dictionary learning, subspace recovery, and principal component analysis. In this work, we study computationally efficient estimation and detection of a planted vector $v$ whose $\ell_4$ norm differs from that of a Gaussian vector with the same $\ell_2$ norm. For instance, in the special case of an $N \rho$-sparse vector $v$ with Rademacher nonzero entries, our results include the following: (1) We give an improved analysis of (a slight variant of) the spectral method proposed by Hopkins, Schramm, Shi, and Steurer, showing that it approximately recovers $v$ with high probability in the regime $n \rho \ll \sqrt{N}$. In contrast, previous work required either $\rho \ll 1/\sqrt{n}$ or $n \sqrt{\rho} \lesssim \sqrt{N}$ for polynomial-time recovery. Our result subsumes both of these conditions (up to logarithmic factors) and also treats the dense case $\rho = 1$ which was not previously considered. (2) Akin to $\ell_\infty$ bounds for eigenvector perturbation, we establish an entrywise error bound for the spectral estimator via a leave-one-out analysis, from which it follows that thresholding recovers $v$ exactly. (3) We study the associated detection problem and show that in the regime $n \rho \gg \sqrt{N}$, any spectral method from a large class (and more generally, any low-degree polynomial of the input) fails to detect the planted vector. This establishes optimality of our upper bounds and offers evidence that no polynomial-time algorithm can succeed when $n \rho \gg \sqrt{N}$.
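
A minimal sketch of an HSSS-style spectral estimator of the kind analyzed above (our rendering; the paper's precise centering and normalization may differ): weight each rank-one term formed from a row of an orthonormal basis by its centered squared norm, then read off the planted direction from the leading eigenvector.

```python
import numpy as np

def spectral_planted_vector(Y):
    """HSSS-style spectral estimator (sketch; exact centering may differ).

    Y: (N, n) matrix whose columns form an orthonormal basis of the subspace.
    Rows y_i satisfy sum_i ||y_i||^2 = n, so the mean squared row norm is n/N;
    rows aligned with the planted vector have inflated norms.
    """
    N, n = Y.shape
    row_norms = (Y ** 2).sum(axis=1)            # ||y_i||^2
    # M = sum_i (||y_i||^2 - n/N) * y_i y_i^T
    M = (Y * (row_norms - n / N)[:, None]).T @ Y
    eigvals, eigvecs = np.linalg.eigh(M)
    u = eigvecs[:, np.argmax(np.abs(eigvals))]  # leading eigenvector
    v_hat = Y @ u                               # candidate direction in R^N
    return v_hat / np.linalg.norm(v_hat)        # unit-norm estimate of v's direction
```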


Random Graph Matching with Improved Noise Robustness

arXiv.org Machine Learning

Graph matching, also known as network alignment, refers to finding a bijection between the vertex sets of two given graphs so as to maximally align their edges. This fundamental computational problem arises frequently in multiple fields such as computer vision and biology. Recently, there has been a plethora of work studying efficient algorithms for graph matching under probabilistic models. In this work, we propose a new algorithm for graph matching and show that, for two Erd\H{o}s-R\'enyi graphs with edge correlation $1-\alpha$, our algorithm recovers the underlying matching with high probability when $\alpha \le 1 / (\log \log n)^C$, where $n$ is the number of vertices in each graph and $C$ denotes a positive universal constant. This improves the condition $\alpha \le 1 / (\log n)^C$ achieved in previous work.


Learning Mixtures of Permutations: Groups of Pairwise Comparisons and Combinatorial Method of Moments

arXiv.org Machine Learning

In applications such as rank aggregation, mixture models for permutations are frequently used when the population exhibits heterogeneity. In this work, we study the widely used Mallows mixture model. In the high-dimensional setting, we propose a polynomial-time algorithm that learns a Mallows mixture of permutations on $n$ elements with the optimal sample complexity that is proportional to $\log n$, improving upon previous results that scale polynomially with $n$. In the high-noise regime, we characterize the optimal dependency of the sample complexity on the noise parameter. Both objectives are accomplished by first studying demixing permutations under a noiseless query model using groups of pairwise comparisons, which can be viewed as moments of the mixing distribution, and then extending these results to the noisy Mallows model by simulating the noiseless oracle.
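
For background, the Mallows model with center $\pi_0$ and dispersion $\phi \in (0,1)$ puts mass proportional to $\phi^{d(\pi, \pi_0)}$ on each permutation $\pi$, where $d$ is the Kendall tau distance; a standard repeated-insertion sampler (background only, not part of the paper's algorithm) is sketched below.

```python
import numpy as np

def sample_mallows(center, phi, rng=None):
    """Repeated-insertion sampler for the Mallows model.

    P(pi) is proportional to phi^(kendall_tau(pi, center)), 0 < phi < 1.
    The item at rank i of `center` is inserted at position j in {0, ..., i}
    with probability proportional to phi^(i - j); inserting at the end
    creates no new inversions.
    """
    rng = rng or np.random.default_rng()
    perm = []
    for i, item in enumerate(center):
        weights = phi ** (i - np.arange(i + 1))   # positions 0..i
        j = rng.choice(i + 1, p=weights / weights.sum())
        perm.insert(j, item)                      # O(i) insert; fine for a sketch
    return perm

# e.g. sample_mallows(list(range(5)), phi=0.5)
```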


Spectral Graph Matching and Regularized Quadratic Relaxations II: Erd\H{o}s-R\'enyi Graphs and Universality

arXiv.org Machine Learning

We analyze a new spectral graph matching algorithm, GRAph Matching by Pairwise eigen-Alignments (GRAMPA), for recovering the latent vertex correspondence between two unlabeled, edge-correlated weighted graphs. Extending the exact recovery guarantees established in the companion paper for Gaussian weights, in this work, we prove the universality of these guarantees for a general correlated Wigner model. In particular, for two Erd\H{o}s-R\'enyi graphs with edge correlation coefficient $1-\sigma^2$ and average degree at least $\operatorname{polylog}(n)$, we show that GRAMPA exactly recovers the latent vertex correspondence with high probability when $\sigma \lesssim 1/\operatorname{polylog}(n)$. Moreover, we establish a similar guarantee for a variant of GRAMPA, corresponding to a tighter quadratic programming relaxation of the quadratic assignment problem. Our analysis exploits a resolvent representation of the GRAMPA similarity matrix and local laws for the resolvents of sparse Wigner matrices.
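
For reference, a compact transcription of the GRAMPA similarity matrix (our sketch; the regularizer $\eta$ and the rounding step follow the papers only up to notation): with eigendecompositions $A = \sum_i \lambda_i u_i u_i^T$ and $B = \sum_j \mu_j v_j v_j^T$, the similarity matrix is $X = \sum_{i,j} \frac{\eta}{(\lambda_i - \mu_j)^2 + \eta^2} u_i u_i^T J v_j v_j^T$ with $J$ the all-ones matrix, and the matching is read off by linear assignment.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def grampa(A, B, eta=0.2):
    """GRAMPA similarity scores rounded to a permutation (our sketch).

    X = sum_{i,j} eta / ((lam_i - mu_j)^2 + eta^2) * u_i u_i^T J v_j v_j^T,
    where J is the all-ones matrix.
    """
    lam, U = np.linalg.eigh(A)
    mu, V = np.linalg.eigh(B)
    one = np.ones(A.shape[0])
    # Cauchy kernel in the eigenvalues, times (u_i^T 1)(v_j^T 1)
    K = eta / ((lam[:, None] - mu[None, :]) ** 2 + eta**2)
    C = K * np.outer(U.T @ one, V.T @ one)
    X = U @ C @ V.T
    _, col = linear_sum_assignment(-X)   # maximize total similarity
    return col                           # col[i]: matched vertex in B for i in A
```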