
 Chen, Zaiyi


In-Context Former: Lightning-fast Compressing Context for Large Language Model

arXiv.org Artificial Intelligence

With the rising popularity of Transformer-based large language models (LLMs), reducing their high inference costs has become a significant research focus. One effective approach is to compress the long input contexts. Existing methods typically leverage the self-attention mechanism of the LLM itself for context compression. While these methods have achieved notable results, the compression process still incurs quadratic time complexity, which limits their applicability. To mitigate this limitation, we propose the In-Context Former (IC-Former). Unlike previous methods, IC-Former does not depend on the target LLM. Instead, it leverages a cross-attention mechanism and a small number of learnable digest tokens to directly condense information from the contextual word embeddings. This approach significantly reduces inference time, with time complexity growing only linearly in the compression range. Experimental results indicate that our method requires only 1/32 of the baseline's floating-point operations during compression and improves processing speed by 68 to 112 times, while achieving over 90% of the baseline performance on evaluation metrics. Overall, our model effectively reduces compression costs and makes real-time compression scenarios feasible.
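
A minimal PyTorch sketch of the core idea described above (not the authors' code): a small set of learnable digest tokens cross-attends over the context word embeddings, so the cost is linear in the context length for a fixed number of digests. The dimensions, digest count, and layer structure below are illustrative assumptions.

```python
import torch
import torch.nn as nn

class CrossAttentionCompressor(nn.Module):
    """Learnable digest tokens attend over context embeddings via cross-attention."""
    def __init__(self, d_model=768, n_heads=12, n_digest=64):
        super().__init__()
        self.digest = nn.Parameter(torch.randn(n_digest, d_model) * 0.02)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ffn = nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                                 nn.Linear(4 * d_model, d_model))

    def forward(self, context_emb):                       # (batch, seq_len, d_model)
        b = context_emb.size(0)
        q = self.digest.unsqueeze(0).expand(b, -1, -1)    # queries: digest tokens
        out, _ = self.attn(q, context_emb, context_emb)   # digests attend to the context
        return out + self.ffn(out)                        # (batch, n_digest, d_model)

# Example: condense a 2048-token context into 64 digest vectors.
compressor = CrossAttentionCompressor()
ctx = torch.randn(2, 2048, 768)
digests = compressor(ctx)
print(digests.shape)  # torch.Size([2, 64, 768])
```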


Boosting Gradient Ascent for Continuous DR-submodular Maximization

arXiv.org Artificial Intelligence

Projected Gradient Ascent (PGA) is one of the most commonly used optimization schemes in machine learning and operations research. Nevertheless, numerous studies and examples have shown that PGA methods may fail to achieve a tight approximation ratio for continuous DR-submodular maximization problems. To address this challenge, we present a boosting technique that efficiently improves the approximation guarantee of standard PGA to \emph{optimal} with only small modifications to the objective function. The fundamental idea of our boosting technique is to exploit non-oblivious search to derive a novel auxiliary function $F$ whose stationary points are excellent approximations to the global maximum of the original DR-submodular objective $f$. Specifically, when $f$ is monotone and $\gamma$-weakly DR-submodular, we propose an auxiliary function $F$ whose stationary points provide a better $(1-e^{-\gamma})$-approximation than the $(\gamma^2/(1+\gamma^2))$-approximation guaranteed by the stationary points of $f$ itself. Similarly, for the non-monotone case, we devise another auxiliary function $F$ whose stationary points achieve an optimal $\frac{1-\min_{\boldsymbol{x}\in\mathcal{C}}\|\boldsymbol{x}\|_{\infty}}{4}$-approximation guarantee, where $\mathcal{C}$ is a convex constraint set. In contrast, the stationary points of the original non-monotone DR-submodular function can be arbitrarily bad~\citep{chen2023continuous}. Furthermore, we demonstrate the scalability of our boosting technique on four problems. In all four, our boosted PGA variants improve upon standard PGA in several aspects, such as approximation ratio and efficiency. Finally, we corroborate our theoretical findings with numerical experiments, which demonstrate the effectiveness of our boosted PGA methods.
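
A minimal NumPy sketch of projected gradient ascent with a pluggable gradient oracle (not the paper's exact construction): passing the gradient of $f$ gives the standard PGA baseline that the paper improves upon, while passing the gradient of an auxiliary function $F$, built as in the paper, gives the boosted variant. The toy objective $f(x)=\sum_i \log(1+x_i)$ (monotone DR-submodular on the nonnegative orthant) and the box constraint are illustrative assumptions.

```python
import numpy as np

def project_box(x, lo=0.0, hi=1.0):
    """Euclidean projection onto the box [lo, hi]^n (the convex constraint set here)."""
    return np.clip(x, lo, hi)

def pga(grad_oracle, x0, step=0.1, iters=200, project=project_box):
    """Projected gradient ascent; grad_oracle may be grad f or a boosted surrogate grad F."""
    x = project(x0.copy())
    for _ in range(iters):
        x = project(x + step * grad_oracle(x))
    return x

# Toy monotone DR-submodular objective: f(x) = sum(log(1 + x)).
grad_f = lambda x: 1.0 / (1.0 + x)

x_out = pga(grad_f, x0=np.zeros(5))
print(x_out)  # for this monotone toy objective, PGA moves to the box's upper corner
```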


An Online Algorithm for Chance Constrained Resource Allocation

arXiv.org Artificial Intelligence

This paper studies the online stochastic resource allocation problem (RAP) with chance constraints. The online RAP is a 0-1 integer linear programming problem in which the resource consumption coefficients are revealed column by column along with the corresponding revenue coefficients. When a column is revealed, the corresponding decision variables must be determined instantaneously, without future information. Moreover, in online applications, the resource consumption coefficients are often obtained by prediction. To model their uncertainty, we take chance constraints into consideration. To the best of our knowledge, this is the first time chance constraints have been introduced into the online RAP. Assuming that the uncertain variables have known Gaussian distributions, the stochastic RAP can be transformed into a deterministic but nonlinear problem with integer second-order cone constraints. Next, we linearize this nonlinear problem and analyze the performance of the vanilla online primal-dual algorithm for solving the linearized stochastic RAP. Under mild technical assumptions, the optimality gap and constraint violation are both on the order of $\sqrt{n}$. Then, to further improve the performance of the algorithm, several modified online primal-dual algorithms with heuristic corrections are proposed. Finally, extensive numerical experiments on both synthetic and real data demonstrate the applicability and effectiveness of our methods.
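
A small illustration of the standard reformulation behind the Gaussian assumption above (the numbers are made up): if the consumption coefficients satisfy $a \sim \mathcal{N}(\mu, \Sigma)$, then the chance constraint $\Pr(a^\top x \le b) \ge 1-\epsilon$ is equivalent, for $\epsilon \le 1/2$, to the second-order cone constraint $\mu^\top x + \Phi^{-1}(1-\epsilon)\,\|\Sigma^{1/2} x\|_2 \le b$.

```python
import numpy as np
from scipy.stats import norm

def chance_constraint_satisfied(x, mu, Sigma, b, eps=0.05):
    """Check mu^T x + Phi^{-1}(1 - eps) * sqrt(x^T Sigma x) <= b."""
    L = np.linalg.cholesky(Sigma)               # Sigma = L L^T, so sqrt(x^T Sigma x) = ||L^T x||
    lhs = mu @ x + norm.ppf(1 - eps) * np.linalg.norm(L.T @ x)
    return lhs <= b

# Toy example: two uncertain consumption coefficients and a binary decision x.
mu = np.array([0.5, 0.3])
Sigma = np.array([[0.04, 0.01], [0.01, 0.09]])
x = np.array([1.0, 1.0])
print(chance_constraint_satisfied(x, mu, Sigma, b=1.5, eps=0.05))  # True
```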


Online Allocation Problem with Two-sided Resource Constraints

arXiv.org Artificial Intelligence

Online resource allocation is a prominent paradigm for sequential decision making over a finite horizon subject to resource constraints, and it has attracted wide attention from researchers and practitioners in theoretical computer science (Mehta et al., 2007; Devanur and Jain, 2012; Devanur et al., 2019), operations research (Agrawal et al., 2014; Li and Ye, 2021), and machine learning (Balseiro et al., 2020; Li et al., 2020). In these settings, requests arrive online and each request must be served via one of the available channels, which consumes a certain amount of resources and generates a corresponding service charge. The objective of the decision maker is to maximize the cumulative revenue subject to the resource capacity constraints. Such problems frequently appear in many applications, including online advertising (Mehta et al., 2007; Buchbinder et al., 2007), online combinatorial auctions (Chawla et al., 2010), online linear programming (Agrawal et al., 2014; Buchbinder and Naor, 2009), online routing (Buchbinder and Naor, 2006), and online multi-leg flight seat and hotel room allocation (Talluri et al., 2004). The aforementioned online resource allocation framework only considers capacity (upper bound) constraints on resources.


Symmetric Cross Entropy for Robust Learning with Noisy Labels

arXiv.org Machine Learning

Training accurate deep neural networks (DNNs) in the presence of noisy labels is an important and challenging task. Though a number of approaches have been proposed for learning with noisy labels, many open issues remain. In this paper, we show that DNN learning with Cross Entropy (CE) exhibits overfitting to noisy labels on some classes ("easy" classes) but, more surprisingly, also suffers from significant under-learning on other classes ("hard" classes). Intuitively, CE requires an extra term to facilitate learning of hard classes, and more importantly, this term should be noise tolerant so as to avoid overfitting to noisy labels. Inspired by the symmetric KL-divergence, we propose \textbf{Symmetric cross entropy Learning} (SL), which boosts CE symmetrically with a noise-robust counterpart, Reverse Cross Entropy (RCE). Our proposed SL approach simultaneously addresses both the under-learning and overfitting problems of CE in the presence of noisy labels. We provide a theoretical analysis of SL and also show empirically, on a range of benchmark and real-world datasets, that SL outperforms state-of-the-art methods. We also show that SL can be easily incorporated into existing methods to further enhance their performance.
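
A minimal PyTorch sketch of the SL idea: combine standard cross entropy with a Reverse Cross Entropy term in which the roles of predictions and labels are swapped, replacing $\log 0$ on the one-hot labels with a finite constant $A$. The weights alpha, beta and the clamp A below are illustrative choices, not the paper's tuned values.

```python
import torch
import torch.nn.functional as F

def symmetric_cross_entropy(logits, targets, alpha=0.1, beta=1.0, A=-4.0):
    """SL loss: alpha * CE(p, q) + beta * RCE(p, q), with log(0) on labels clamped to A."""
    ce = F.cross_entropy(logits, targets)

    pred = F.softmax(logits, dim=1)
    onehot = F.one_hot(targets, num_classes=logits.size(1)).float()
    # log of the one-hot label distribution: 0 for the labeled class, A in place of log(0).
    log_labels = torch.where(onehot > 0, torch.zeros_like(onehot), torch.full_like(onehot, A))
    rce = -(pred * log_labels).sum(dim=1).mean()

    return alpha * ce + beta * rce

logits = torch.randn(8, 10)
targets = torch.randint(0, 10, (8,))
print(symmetric_cross_entropy(logits, targets))
```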


Joint Semantic Domain Alignment and Target Classifier Learning for Unsupervised Domain Adaptation

arXiv.org Machine Learning

Unsupervised domain adaptation aims to transfer a classifier learned on the source domain to the target domain in an unsupervised manner. With the help of target pseudo-labels, aligning class-level distributions and learning a classifier in the target domain are two widely used objectives. Existing methods often optimize these two objectives separately, so each suffers from neglecting the other. However, optimizing the two together is not trivial. To alleviate these issues, we propose a novel method that jointly optimizes semantic domain alignment and target classifier learning in a holistic way. The joint optimization mechanism can not only eliminate their weaknesses but also complement their strengths. Our theoretical analysis also verifies the benefit of the joint optimization mechanism. Extensive experiments on benchmark datasets show that the proposed method yields the best performance in comparison with state-of-the-art unsupervised domain adaptation methods.
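
A hedged PyTorch sketch (not the paper's exact objective) of what joint optimization of the two objectives can look like: a class-level alignment term matching source and target feature centroids computed with target pseudo-labels, plus a classification loss on pseudo-labeled target samples, minimized together in a single backward pass. The feature dimension, class count, and weighting lam are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def joint_loss(feat_s, y_s, feat_t, logits_t, num_classes, lam=0.1):
    pseudo_t = logits_t.argmax(dim=1)                      # target pseudo-labels
    cls_t = F.cross_entropy(logits_t, pseudo_t)            # target classifier learning term

    align = feat_s.new_zeros(())                           # semantic (class-level) alignment term
    for c in range(num_classes):
        s_mask, t_mask = (y_s == c), (pseudo_t == c)
        if s_mask.any() and t_mask.any():
            align = align + (feat_s[s_mask].mean(0) - feat_t[t_mask].mean(0)).pow(2).sum()

    return cls_t + lam * align

feat_s, y_s = torch.randn(16, 64), torch.randint(0, 5, (16,))
feat_t, logits_t = torch.randn(16, 64), torch.randn(16, 5)
print(joint_loss(feat_s, y_s, feat_t, logits_t, num_classes=5))
```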


Efficient Rank Minimization via Solving Non-convex Penalties by Iterative Shrinkage-Thresholding Algorithm

arXiv.org Machine Learning

Rank minimization (RM) is a widely investigated task of finding solutions by exploiting the low-rank structure of parameter matrices. Recently, solving the RM problem by leveraging non-convex relaxations has received significant attention. Theoretical and experimental work has demonstrated that non-convex relaxations, e.g., Truncated Nuclear Norm Regularization (TNNR) and Reweighted Nuclear Norm Regularization (RNNR), can provide a better approximation of the original problem than convex relaxations. However, designing an efficient algorithm with a theoretical guarantee remains a challenging problem. In this paper, we propose a simple but efficient proximal-type method, namely the Iterative Shrinkage-Thresholding Algorithm (ISTA), with a concrete analysis for solving rank minimization problems with both non-convex weighted and reweighted nuclear norms as low-rank regularizers. Theoretically, the proposed method converges to a critical point under very mild assumptions at a rate on the order of $O(1/T)$. Moreover, experimental results on both synthetic and real-world data sets show that the proposed algorithm outperforms the state of the art in both efficiency and accuracy.
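
A hedged NumPy sketch of one ISTA-style iteration for a low-rank problem: a gradient step on the smooth data-fit term followed by weighted singular value thresholding, i.e., the proximal step associated with a weighted nuclear norm. The matrix-completion observation model, the weight pattern, and the step size are illustrative assumptions, not the paper's experimental setup.

```python
import numpy as np

def weighted_svt(X, weights):
    """Shrink each singular value by its own threshold (proximal step of a weighted nuclear norm)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    s = np.maximum(s - weights, 0.0)
    return (U * s) @ Vt

def ista_step(X, M, mask, weights, step=1.0):
    grad = mask * (X - M)                       # gradient of 0.5 * ||mask * (X - M)||_F^2
    return weighted_svt(X - step * grad, step * weights)

# Toy matrix completion: recover a rank-2 matrix from ~60% observed entries.
rng = np.random.default_rng(0)
M = rng.standard_normal((20, 2)) @ rng.standard_normal((2, 20))
mask = rng.random((20, 20)) < 0.6
# TNNR-style weights: leave the two leading singular values unpenalized.
weights = np.concatenate([np.zeros(2), np.full(18, 0.5)])

X = np.zeros_like(M)
for _ in range(100):
    X = ista_step(X, M, mask, weights)
print(np.linalg.norm(mask * (X - M)) / np.linalg.norm(mask * M))  # small relative residual
```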


Universal Stagewise Learning for Non-Convex Problems with Convergence on Averaged Solutions

arXiv.org Machine Learning

Although the stochastic gradient descent (SGD) method and its variants (e.g., stochastic momentum methods, AdaGrad) are the algorithms of choice for solving non-convex problems (especially deep learning), large gaps remain between theory and practice, with many questions unresolved. For example, there is still no convergence theory for SGD and its variants that use a stagewise step size and return an averaged solution, as is done in practice. In addition, theoretical insight into why the adaptive step size of AdaGrad can improve upon the non-adaptive step size of SGD is still missing for non-convex optimization. This paper aims to address these questions and fill the gap between theory and practice. We propose a universal stagewise optimization framework for a broad family of {\bf non-smooth non-convex} (namely, weakly convex) problems with the following key features: (i) at each stage, any suitable stochastic convex optimization algorithm (e.g., SGD or AdaGrad) that returns an averaged solution can be employed to minimize a regularized convex problem; (ii) the step size is decreased in a stagewise manner; (iii) an averaged solution is returned as the final solution, selected from all stagewise averaged solutions with sampling probabilities {\it increasing} with the stage number. Our theoretical results for stagewise AdaGrad exhibit its adaptive convergence and therefore shed light on its faster convergence, relative to stagewise SGD, for problems with sparse stochastic gradients. To the best of our knowledge, these new results are the first of their kind to address the unresolved issues of the existing theories mentioned above.
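
A hedged NumPy sketch (illustrative, not the paper's exact algorithm) of the stagewise framework with plain SGD as the inner solver: each stage runs averaged SGD on the regularized problem $f(x) + \frac{1}{2\gamma}\|x - x_{\mathrm{ref}}\|^2$, the step size shrinks across stages, and the final output is sampled from the stagewise averages with probabilities increasing in the stage index. The toy objective, step-size schedule, and regularization parameter are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def stochastic_grad(x):
    """Noisy gradient of a toy objective f(x) = ||x||^2 (stand-in for a weakly convex loss)."""
    return 2.0 * x + 0.1 * rng.standard_normal(x.shape)

def stagewise_sgd(x0, stages=5, iters=200, eta0=0.5, gamma=1.0):
    x_ref, averages = x0.copy(), []
    for s in range(stages):
        eta = eta0 / (s + 1)                               # stagewise decreasing step size
        x, x_avg = x_ref.copy(), np.zeros_like(x_ref)
        for t in range(iters):
            g = stochastic_grad(x) + (x - x_ref) / gamma   # gradient of the regularized problem
            x = x - eta * g
            x_avg += (x - x_avg) / (t + 1)                 # running average of inner iterates
        x_ref = x_avg                                      # next stage is centered at this average
        averages.append(x_avg)
    probs = np.arange(1, stages + 1, dtype=float)
    probs /= probs.sum()                                   # later stages are sampled more often
    return averages[rng.choice(stages, p=probs)]

print(stagewise_sgd(np.ones(3)))
```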