Sun, Xiaorui
Linear regression without correspondence
Hsu, Daniel J., Shi, Kevin, Sun, Xiaorui
This article considers algorithmic and statistical aspects of linear regression when the correspondence between the covariates and the responses is unknown. First, a fully polynomial-time approximation scheme is given for the natural least squares optimization problem in any constant dimension. Next, in an average-case and noise-free setting where the responses exactly correspond to a linear function of i.i.d. draws from a standard multivariate normal distribution, an efficient algorithm based on lattice basis reduction is shown to exactly recover the unknown linear function in arbitrary dimension. Finally, lower bounds on the signal-to-noise ratio are established for approximate recovery of the unknown linear function by any estimator.
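To make the objective concrete, here is a minimal sketch (all names hypothetical; this is not the paper's approximation scheme or lattice-based algorithm) that solves a tiny shuffled-regression instance by brute force over all $n!$ correspondences. The exponential cost of this baseline is exactly what the paper's algorithms avoid.

    import itertools
    import numpy as np

    def shuffled_least_squares(X, y):
        """Minimize ||y[pi] - X @ w||^2 over permutations pi and weights w
        by enumerating all n! permutations; feasible only for tiny n."""
        best_cost, best_w = np.inf, None
        for pi in itertools.permutations(range(len(y))):
            y_pi = y[list(pi)]
            w, *_ = np.linalg.lstsq(X, y_pi, rcond=None)
            cost = np.sum((y_pi - X @ w) ** 2)
            if cost < best_cost:
                best_cost, best_w = cost, w
        return best_cost, best_w

    rng = np.random.default_rng(1)
    X = rng.standard_normal((6, 2))
    w_true = np.array([1.5, -2.0])
    y = rng.permutation(X @ w_true)          # responses arrive shuffled
    cost, w_hat = shuffled_least_squares(X, y)
    print(cost, w_hat)                       # cost ~ 0, w_hat ~ w_true

In the noise-free setting this brute force recovers the linear function exactly, but at the cost of $n!$ least-squares solves, whereas the lattice-based algorithm described above runs efficiently in arbitrary dimension.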
Near-Optimal Density Estimation in Near-Linear Time Using Variable-Width Histograms
Chan, Siu On, Diakonikolas, Ilias, Servedio, Rocco A., Sun, Xiaorui
Let $p$ be an unknown and arbitrary probability distribution over $[0,1)$. We consider the problem of \emph{density estimation}, in which a learning algorithm is given i.i.d. draws from $p$ and must (with high probability) output a hypothesis distribution that is close to $p$. The main contribution of this paper is a highly efficient density estimation algorithm for learning using a variable-width histogram, i.e., a hypothesis distribution with a piecewise constant probability density function. In more detail, for any $k$ and $\eps$, we give an algorithm that makes $\tilde{O}(k/\eps^2)$ draws from $p$, runs in $\tilde{O}(k/\eps^2)$ time, and outputs a hypothesis distribution $h$ that is piecewise constant with $O(k \log^2(1/\eps))$ pieces. With high probability the hypothesis $h$ satisfies $\dtv(p,h) \leq C \cdot \opt_k(p) + \eps$, where $\dtv$ denotes the total variation distance (statistical distance), $C$ is a universal constant, and $\opt_k(p)$ is the smallest total variation distance between $p$ and any $k$-piecewise constant distribution. The sample size and running time of our algorithm are both optimal up to logarithmic factors. The ``approximation factor'' $C$ that is present in our result is inherent in the problem, as we prove that no algorithm with sample size bounded in terms of $k$ and $\eps$ can achieve $C < 2$ regardless of what kind of hypothesis distribution it uses.
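As a toy illustration of the hypothesis class (an equal-mass variable-width histogram; not the algorithm above, whose pieces are chosen to guarantee $\dtv(p,h) \leq C \cdot \opt_k(p) + \eps$), the following sketch places breakpoints at empirical quantiles so that bin widths adapt to the local density. All names are hypothetical.

    import numpy as np

    def equal_mass_histogram(samples, k):
        """Fit a k-piece variable-width histogram on [0, 1): breakpoints at
        empirical quantiles so each bin holds ~n/k points; density on each
        bin = (fraction of samples in bin) / (bin width)."""
        xs = np.sort(samples)
        n = len(xs)
        cuts = np.concatenate(([0.0], np.quantile(xs, np.arange(1, k) / k), [1.0]))
        counts, _ = np.histogram(xs, bins=cuts)
        densities = counts / (n * np.diff(cuts))   # piecewise constant pdf
        return cuts, densities

    rng = np.random.default_rng(0)
    samples = rng.beta(2, 5, size=10_000)          # an arbitrary density on [0, 1)
    cuts, dens = equal_mass_histogram(samples, k=10)
    print(np.sum(dens * np.diff(cuts)))            # integrates to ~1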
Efficient Density Estimation via Piecewise Polynomial Approximation
Chan, Siu On, Diakonikolas, Ilias, Servedio, Rocco A., Sun, Xiaorui
We give a highly efficient "semi-agnostic" algorithm for learning univariate probability distributions that are well approximated by piecewise polynomial density functions. Let $p$ be an arbitrary distribution over an interval $I$ which is $\tau$-close (in total variation distance) to an unknown probability distribution $q$ that is defined by an unknown partition of $I$ into $t$ intervals and $t$ unknown degree-$d$ polynomials specifying $q$ over each of the intervals. We give an algorithm that draws $\tilde{O}(t(d+1)/\eps^2)$ samples from $p$, runs in time $\poly(t,d,1/\eps)$, and with high probability outputs a piecewise polynomial hypothesis distribution $h$ that is $(O(\tau)+\eps)$-close (in total variation distance) to $p$. This sample complexity is essentially optimal; we show that even for $\tau=0$, any algorithm that learns an unknown $t$-piecewise degree-$d$ probability distribution over $I$ to accuracy $\eps$ must use $\Omega\left(\frac{t(d+1)}{\poly(1+\log(d+1))} \cdot \frac{1}{\eps^2}\right)$ samples from the distribution, regardless of its running time. Our algorithm combines tools from approximation theory, uniform convergence, linear programming, and dynamic programming. We apply this general algorithm to obtain a wide range of results for many natural problems in density estimation over both continuous and discrete domains. These include state-of-the-art results for learning mixtures of log-concave distributions; mixtures of $t$-modal distributions; mixtures of Monotone Hazard Rate distributions; mixtures of Poisson Binomial Distributions; mixtures of Gaussians; and mixtures of $k$-monotone densities. Our general technique yields computationally efficient algorithms for all these problems, in many cases with provably optimal sample complexities (up to logarithmic factors) in all parameters.
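A simplified illustration of the hypothesis class follows (fixed equal-width pieces and a plain least-squares fit, whereas the algorithm above learns the partition using linear and dynamic programming; all names are hypothetical).

    import numpy as np

    def piecewise_poly_density(samples, t, d, grid=50):
        """Fit a degree-d polynomial to a fine empirical histogram on each
        of t equal-width pieces of [0, 1); returns (interval, coeffs) pairs."""
        n = len(samples)
        pieces = []
        for i in range(t):
            lo, hi = i / t, (i + 1) / t
            counts, edges = np.histogram(samples, bins=grid, range=(lo, hi))
            centers = (edges[:-1] + edges[1:]) / 2
            dens = counts / (n * (edges[1] - edges[0]))  # empirical pdf values
            pieces.append(((lo, hi), np.polyfit(centers, dens, deg=d)))
        return pieces

    rng = np.random.default_rng(2)
    samples = rng.beta(2, 2, size=20_000)
    for (lo, hi), c in piecewise_poly_density(samples, t=4, d=2):
        print(f"[{lo:.2f},{hi:.2f}) poly coeffs:", np.round(c, 2))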
Participation Maximization Based on Social Influence in Online Discussion Forums
Sun, Tao (Peking University and Microsoft Research Asia) | Chen, Wei (Microsoft Research Asia) | Liu, Zhenming (Harvard School of Engineering and Applied Sciences and Microsoft Research Asia) | Wang, Yajun (Microsoft Research Asia) | Sun, Xiaorui (Shanghai Jiaotong University and Microsoft Research Asia) | Zhang, Ming (Peking University) | Lin, Chin-Yew (Microsoft Research Asia)
In online discussion forums, users are more motivated to take part in discussions when they observe other users' participation, an effect of social influence among forum users. In this paper, we study how to utilize social influence to increase overall forum participation. To this end, we propose a mechanism that maximizes user influence and boosts participation through the choice of which forum threads to display to users. We formally define the participation maximization problem, show that it is a special instance of the social welfare maximization problem with submodular utility functions, and show that it is NP-hard. However, generic approximation algorithms are impractical for real-world forums due to their time complexity. We therefore design a heuristic algorithm, named Thread Allocation Based on Influence (TABI), to tackle the problem. Through extensive experiments on a dataset from a real-world online forum, we demonstrate that TABI consistently outperforms all other algorithms in maximizing participation. The results of this work demonstrate that current recommender systems can be made more effective by considering future influence propagation. The problem of participation maximization based on influence also opens a new direction in the study of social influence.
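Since the abstract does not spell out TABI, here is a generic greedy baseline for the underlying formulation, monotone submodular welfare maximization under per-user display-slot constraints (a partition-matroid constraint, for which best-marginal-gain greedy is a classical 1/2-approximation). The utility function, users, and thread names are all hypothetical, and this is not the TABI heuristic.

    def greedy_allocation(users, threads, slots_per_user, utility):
        """Greedily assign the (user, thread) pair with the largest marginal
        utility gain until no positive gain remains. utility(u, S) must be
        monotone submodular in the set S of threads shown to user u."""
        shown = {u: set() for u in users}
        while True:
            best_gain, best_pair = 0.0, None
            for u in users:
                if len(shown[u]) >= slots_per_user:
                    continue
                base = utility(u, shown[u])
                for th in threads:
                    if th not in shown[u]:
                        gain = utility(u, shown[u] | {th}) - base
                        if gain > best_gain:
                            best_gain, best_pair = gain, (u, th)
            if best_pair is None:
                return shown
            u, th = best_pair
            shown[u].add(th)

    # Toy coverage utility: topics of shown threads that hit a user's interests.
    interests = {"alice": {1, 2}, "bob": {2, 3}}
    topics = {"t1": {1}, "t2": {2, 3}, "t3": {3}}
    def util(u, S):
        return float(len(set().union(*(topics[t] for t in S)) & interests[u])) if S else 0.0
    print(greedy_allocation(["alice", "bob"], list(topics), 1, util))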