Tang, Ke
Evolutionary Reinforcement Learning via Cooperative Coevolutionary Negatively Correlated Search
Zhang, Hu, Yang, Peng, Yu, Yanglong, Li, Mingjia, Tang, Ke
Evolutionary algorithms (EAs) have been successfully applied to optimize the policies for Reinforcement Learning (RL) tasks due to their exploration ability. The recently proposed Negatively Correlated Search (NCS) provides a distinct parallel exploration search behavior and is expected to facilitate RL more effectively. Considering that the commonly adopted neural policies usually involve millions of parameters to be optimized, the direct application of NCS to RL may face the great challenge of a large-scale search space. To address this issue, this paper presents an NCS-friendly Cooperative Coevolution (CC) framework to scale up NCS while largely preserving its parallel exploration search behavior. The issue of traditional CC that can deteriorate NCS is also discussed. Empirical studies on 10 popular Atari games show that the proposed method can significantly outperform three state-of-the-art deep RL methods with 50% less computational time by effectively exploring a 1.7 million-dimensional search space.
Negatively Correlated Search as a Parallel Exploration Search Strategy
Yang, Peng, Tang, Ke, Yao, Xin
Parallel exploration is key to a successful search. The recently proposed Negatively Correlated Search (NCS) achieves this ability by constructing a set of negatively correlated search processes and has been applied to many real-world problems. In NCS, the key technique is to explicitly model and maximize the diversity among search processes in parallel. However, the original diversity model was mostly devised by intuition, which introduced several drawbacks to NCS. In this paper, a mathematically principled diversity model is proposed to solve the existing drawbacks of NCS, resulting in a new NCS framework. A new instantiation of NCS is also derived and its effectiveness is verified on a set of multi-modal continuous optimization problems.
Decision Making with Machine Learning and ROC Curves
Feng, Kai, Hong, Han, Tang, Ke, Wang, Jingyuan
The Receiver Operating Characteristic (ROC) curve is a representation of the statistical information discovered in binary classification problems and is a key concept in machine learning and data science. This paper studies the statistical properties of ROC curves and their implications for model selection. We analyze the implications of different models of incentive heterogeneity and information asymmetry for the relation between human decisions and the ROC curves. Our theoretical discussion is illustrated in the context of a large data set of pregnancy outcomes and doctor diagnoses from the Pre-Pregnancy Checkups of reproductive age couples in Henan Province provided by the Chinese Ministry of Health.
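For readers unfamiliar with the object this abstract studies, the following is a minimal sketch of how an ROC curve is traced from classifier scores. It is a generic textbook construction, not code from the paper; the function names `roc_points` and `auc` and the toy scores and labels are illustrative assumptions.

```python
from typing import List, Tuple


def roc_points(scores: List[float], labels: List[int]) -> List[Tuple[float, float]]:
    """Return (false positive rate, true positive rate) pairs, one per threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative label")
    # Start with a threshold above every score: nothing is predicted positive.
    points = [(0.0, 0.0)]
    # Lower the threshold one distinct score at a time.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points: List[Tuple[float, float]]) -> float:
    """Area under the piecewise-linear curve, by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical toy data: four examples with scores and binary labels.
scores = [0.9, 0.8, 0.7, 0.4]
labels = [1, 0, 1, 0]
curve = roc_points(scores, labels)
area = auc(curve)
```

Each point on the curve corresponds to one decision threshold, which is why the relation between a human decision rule and the curve can be studied point by point.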
Stochastic Gradient Descent for Nonconvex Learning without Bounded Gradient Assumptions
Lei, Yunwen, Hu, Ting, Tang, Ke
Stochastic gradient descent (SGD) is a popular and efficient method with wide applications in training deep neural nets and other nonconvex models. While the behavior of SGD is well understood in the convex learning setting, the existing theoretical results for SGD applied to nonconvex objective functions are far from mature. For example, existing results require imposing a nontrivial assumption on the uniform boundedness of gradients for all iterates encountered in the learning process, which is hard to verify in practical implementations. In this paper, we establish a rigorous theoretical foundation for SGD in nonconvex learning by showing that this boundedness assumption can be removed without affecting convergence rates. In particular, we establish sufficient conditions for almost sure convergence as well as optimal convergence rates for SGD applied to both general nonconvex objective functions and gradient-dominated objective functions. A linear convergence rate is further derived in the case with zero variances.
Stochastic Composite Mirror Descent: Optimal Bounds with High Probabilities
Lei, Yunwen, Tang, Ke
We study stochastic composite mirror descent, a class of scalable algorithms able to exploit the geometry and composite structure of a problem. We consider both convex and strongly convex objectives with non-smooth loss functions, for each of which we establish high-probability convergence rates optimal up to a logarithmic factor. We apply the derived computational error bounds to study the generalization performance of multi-pass stochastic gradient descent (SGD) in a non-parametric setting. Our high-probability generalization bounds enjoy a logarithmic dependency on the number of passes provided that the step size sequence is square-summable, which improves the existing bounds in expectation with a polynomial dependency and therefore gives a strong justification for the ability of multi-pass SGD to overcome overfitting. Our analysis removes boundedness assumptions on subgradients often imposed in the literature. Numerical results are reported to support our theoretical findings.
Maximizing Monotone DR-submodular Continuous Functions by Derivative-free Optimization
Zhang, Yibo, Qian, Chao, Tang, Ke
In this paper, we study the problem of monotone (weakly) DR-submodular continuous maximization. While previous methods require the gradient information of the objective function, we propose a derivative-free algorithm LDGM for the first time. We define $\beta$ and $\alpha$ to characterize how close a function is to continuous DR-submodular and submodular, respectively. Under a convex polytope constraint, we prove that LDGM can achieve a $(1-e^{-\beta}-\epsilon)$-approximation guarantee after $O(1/\epsilon)$ iterations, which is the same as the best previous gradient-based algorithm. Moreover, in some special cases, a variant of LDGM can achieve a $((\alpha/2)(1-e^{-\alpha})-\epsilon)$-approximation guarantee for (weakly) submodular functions. We also compare LDGM with the gradient-based algorithm Frank-Wolfe under noise, and show that LDGM can be more robust. Empirical results on budget allocation verify the effectiveness of LDGM.
Automatic Construction of Parallel Portfolios via Explicit Instance Grouping
Liu, Shengcai, Tang, Ke, Yao, Xin
Simultaneously utilizing several complementary solvers is a simple yet effective strategy for solving computationally hard problems. However, manually building such solver portfolios typically requires considerable domain knowledge and plenty of human effort. As an alternative, automatic construction of parallel portfolios (ACPP) aims at automatically building effective parallel portfolios based on a given problem instance set and a given rich design space. One promising way to solve the ACPP problem is to explicitly group the instances into different subsets and promote a component solver to handle each of them. This paper investigates solving ACPP from this perspective, and especially studies how to obtain a good instance grouping. The experimental results showed that the parallel portfolios constructed by the proposed method could achieve consistently superior performances to the ones constructed by the state-of-the-art ACPP methods, and could even rival sophisticated hand-designed parallel solvers.
On Multiset Selection With Size Constraints
Qian, Chao (University of Science and Technology of China) | Zhang, Yibo (University of Science and Technology of China) | Tang, Ke (Southern University of Science and Technology) | Yao, Xin (Southern University of Science and Technology)
This paper considers the multiset selection problem with size constraints, which arises in many real-world applications such as budget allocation. Previous studies required the objective function f to be submodular, while we relax this assumption by introducing the notion of the submodularity ratios (denoted by α_f and β_f). We propose an anytime randomized iterative approach POMS, which maximizes the given objective f and minimizes the multiset size simultaneously. We prove that POMS using a reasonable time achieves an approximation guarantee of max{1-1/e^(β_f), (α_f/2)(1-1/e^(α_f))}. Particularly, when f is submodular, this bound is at least as good as that of the previous greedy-style algorithms. In addition, we give lower bounds on the submodularity ratio for the objectives of budget allocation. Experimental results on budget allocation as well as a more complex application, namely, generalized influence maximization, exhibit the superior performance of the proposed approach.
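To make the stated guarantee concrete, the short sketch below simply evaluates the expression from the abstract for given values of the two submodularity ratios. The function name `poms_bound` and the sample ratio values are hypothetical and chosen only for illustration; how the ratios are computed for a real objective is not covered here.

```python
import math


def poms_bound(alpha_f: float, beta_f: float) -> float:
    """Evaluate max{1 - 1/e^beta_f, (alpha_f / 2) * (1 - 1/e^alpha_f)}."""
    first = 1.0 - math.exp(-beta_f)
    second = (alpha_f / 2.0) * (1.0 - math.exp(-alpha_f))
    return max(first, second)


# Illustrative inputs only (assumed values, not taken from the paper).
both_one = poms_bound(1.0, 1.0)      # first term dominates: 1 - 1/e
small_beta = poms_bound(1.0, 0.1)    # second term dominates: (1/2)(1 - 1/e)
```

With both ratios set to 1 the expression reduces to 1 - 1/e, roughly 0.632; when the second ratio is small, the first-ratio term takes over, which is why the guarantee is stated as a maximum of the two.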
Log-normality and Skewness of Estimated State/Action Values in Reinforcement Learning
Zhang, Liangpeng, Tang, Ke, Yao, Xin
Under/overestimation of state/action values is harmful for reinforcement learning agents. In this paper, we show that a state/action value estimated using the Bellman equation can be decomposed into a weighted sum of path-wise values that follow log-normal distributions. Since log-normal distributions are skewed, the distribution of estimated state/action values can also be skewed, leading to an imbalanced likelihood of under/overestimation. The degree of such imbalance can vary greatly among actions and policies within a single problem instance, making the agent prone to select actions/policies that have inferior expected return and higher likelihood of overestimation. We present a comprehensive analysis of such skewness, examine its factors and impacts through both theoretical and empirical results, and discuss the possible ways to reduce its undesirable effects.
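The skewness of a log-normal distribution that this abstract relies on can be checked with a few lines of arithmetic. The sketch below illustrates only that general fact, not the paper's decomposition of value estimates; the parameter values `mu` and `sigma` are arbitrary assumptions.

```python
import math


def standard_normal_cdf(z: float) -> float:
    """Phi(z), expressed through the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Assumed parameters of the underlying normal distribution of log(X).
mu, sigma = 0.0, 1.0

median = math.exp(mu)                    # half of the mass lies below this
mean = math.exp(mu + sigma ** 2 / 2.0)   # pulled upward by the long right tail

# P(X < mean) = Phi((log(mean) - mu) / sigma) = Phi(sigma / 2)
prob_below_mean = standard_normal_cdf((math.log(mean) - mu) / sigma)
```

Because the mean exceeds the median, a single draw falls below the mean with probability greater than one half (about 0.69 for these parameters), which is the kind of imbalance between under- and overshooting that a skewed distribution produces.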
Subset Selection under Noise
Qian, Chao, Shi, Jing-Cheng, Yu, Yang, Tang, Ke, Zhou, Zhi-Hua
The problem of selecting the best $k$-element subset from a universe is involved in many applications. While previous studies assumed a noise-free environment or a noisy monotone submodular objective function, this paper considers a more realistic and general situation where the evaluation of a subset is a noisy monotone function (not necessarily submodular), with both multiplicative and additive noises. To understand the impact of the noise, we first show the approximation ratio of the greedy algorithm and POSS, two powerful algorithms for noise-free subset selection, in the noisy environments. We then propose to incorporate a noise-aware strategy into POSS, resulting in the new PONSS algorithm. We prove that PONSS can achieve a better approximation ratio under some assumptions such as an i.i.d. noise distribution. The empirical results on influence maximization and sparse regression problems show the superior performance of PONSS.
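As background for the greedy baseline mentioned in this abstract, here is a minimal sketch of the standard noise-free greedy procedure for picking a $k$-element subset. It is the generic textbook algorithm, not POSS or PONSS, and the coverage objective, the item names, and the helper `greedy_subset` are all hypothetical.

```python
from typing import Callable, Hashable, List, Set


def greedy_subset(universe: List[Hashable], k: int,
                  f: Callable[[Set[Hashable]], float]) -> Set[Hashable]:
    """Add, k times, the element whose inclusion gives the largest value of f."""
    chosen: Set[Hashable] = set()
    for _ in range(k):
        candidates = [v for v in universe if v not in chosen]
        if not candidates:
            break
        # Evaluate f exactly (no noise) on each one-element extension.
        best = max(candidates, key=lambda v: f(chosen | {v}))
        chosen.add(best)
    return chosen


# Hypothetical objective: number of distinct items covered by the chosen sets.
covers = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6, 7}, "d": {1}}


def coverage(subset: Set[Hashable]) -> float:
    covered = set()
    for name in subset:
        covered |= covers[name]
    return float(len(covered))


picked = greedy_subset(list(covers), 2, coverage)
```

In the noisy setting the abstract describes, each call to `f` would return a perturbed value, so the comparison inside `max` can pick the wrong element; that is the failure mode a noise-aware strategy aims to reduce.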