Collaborating Authors: Ge, Dongdong


Beyond $\mathcal{O}(\sqrt{T})$ Regret: Decoupling Learning and Decision-making in Online Linear Programming

arXiv.org Machine Learning

Online linear programming plays an important role in both revenue management and resource allocation, and recent research has focused on developing efficient first-order online learning algorithms. Despite the empirical success of first-order methods, they typically achieve a regret no better than $\mathcal{O}(\sqrt{T})$, which is suboptimal compared to the $\mathcal{O}(\log T)$ bound guaranteed by the state-of-the-art linear programming (LP)-based online algorithms. This paper establishes a general framework that improves upon the $\mathcal{O}(\sqrt{T})$ result when the LP dual problem exhibits certain error bound conditions. For the first time, we show that first-order learning algorithms achieve $o(\sqrt{T})$ regret in the continuous support setting and $\mathcal{O}(\log T)$ regret in the finite support setting beyond the non-degeneracy assumption. Our results significantly improve the state-of-the-art regret results and provide new insights for sequential decision-making.
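
For context, the $\mathcal{O}(\sqrt{T})$ baseline referenced above is typically an online dual-subgradient scheme: maintain dual prices for the resources, accept a request when its reward beats the priced-out resource cost, and take a projected subgradient step on the dual. The following is a minimal sketch of that baseline only (not the paper's improved framework); the function name, step size, and synthetic data are illustrative assumptions.

import numpy as np

def online_lp_dual_descent(r, A, b, eta=None):
    """r: (T,) rewards; A: (m, T) resource consumption; b: (m,) budgets."""
    m, T = A.shape
    rho = b / T                      # average per-round budget
    eta = eta if eta is not None else 1.0 / np.sqrt(T)
    p = np.zeros(m)                  # dual prices for the m resources
    used = np.zeros(m)
    x = np.zeros(T)
    for t in range(T):
        a = A[:, t]
        # accept iff the reward beats the priced resource cost and budget remains
        if r[t] > a @ p and np.all(used + a <= b):
            x[t] = 1.0
            used += a
        # projected dual subgradient step; this scheme yields O(sqrt(T)) regret
        p = np.maximum(0.0, p + eta * (a * x[t] - rho))
    return x, p

rng = np.random.default_rng(0)
T, m = 1000, 3
x, p = online_lp_dual_descent(rng.uniform(size=T), rng.uniform(size=(m, T)),
                              b=0.25 * T * np.ones(m))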


Reward Learning From Preference With Ties

arXiv.org Artificial Intelligence

Reinforcement learning from human feedback (RLHF) (Christiano et al., 2017; Ziegler et al., 2019; Ouyang et al., 2022) has played a pivotal role in aligning large language models (LLMs) (Kenton et al., 2021), enhancing specific capabilities of LLMs in various fields, such as summarization (Stiennon et al., 2020), coding (Gao et al., 2023), and medical assistance (Moor et al., 2023). A crucial component of the RLHF process is the reward model, which serves as the primary mechanism for integrating human preferences and feedback into the learning process (Wang et al., 2024). The reward model guides the optimization procedure of RLHF towards objectives aligned with human preferences (Kaufmann et al., 2023). Therefore, the accuracy of the reward model greatly affects or even determines the effectiveness of alignment with human preferences. Moreover, the direct preference optimization (DPO) method (Rafailov et al., 2024) utilizes LLMs to implicitly represent the reward model through mathematical transformations, bypassing the complex RL optimization phase and focusing solely on the reward modeling phase. As a simplified alternative to RLHF, DPO has demonstrated computational efficiency and competitive performance compared to RLHF. To learn a reward model from human preferences, obtaining high-quality human preference data is crucial (Wang et al., 2024), typically achieved by having human labelers annotate previously collected data consisting of a prompt and a pair of responses (Ouyang et al., 2022; Bai et al., 2022).
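
The abstract does not state how ties are modeled, but a standard way to extend the Bradley-Terry preference likelihood to ties is the Rao-Kupper model, in which a tie parameter $\theta \ge 1$ reserves probability mass for "equally good" responses. A minimal numpy sketch of the resulting tie-aware loss, assuming reward differences have already been computed (all names here are illustrative):

import numpy as np

def rao_kupper_nll(d, y, tau):
    """Tie-aware pairwise preference loss (Rao-Kupper form).
    d: (N,) reward differences r_a - r_b from the reward model;
    y: (N,) labels in {+1, -1, 0} for "a wins" / "b wins" / "tie";
    tau = log(theta) >= 0 controls how much mass ties receive."""
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    p_a = sigmoid(d - tau)            # P(a preferred)
    p_b = sigmoid(-d - tau)           # P(b preferred)
    p_tie = 1.0 - p_a - p_b           # remaining mass goes to ties
    p = np.where(y == 1, p_a, np.where(y == -1, p_b, p_tie))
    return -np.mean(np.log(np.clip(p, 1e-12, None)))

# tau = 0 recovers the standard Bradley-Terry likelihood (zero tie mass)
print(rao_kupper_nll(np.array([1.2, -0.3]), np.array([1, 0]), tau=0.5))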


ORLM: Training Large Language Models for Optimization Modeling

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have emerged as powerful tools for tackling complex Operations Research (OR) problems by automating optimization modeling. However, current methodologies rely heavily on prompt engineering (e.g., multi-agent cooperation) with proprietary LLMs, raising data privacy concerns that can be prohibitive in industry applications. To tackle this issue, we propose training open-source LLMs for optimization modeling. We identify four critical requirements for the training dataset of OR LLMs, and we design and implement OR-Instruct, a semi-automated process for creating synthetic data tailored to these requirements. We also introduce the IndustryOR benchmark, the first industrial benchmark for testing LLMs on solving real-world OR problems. We apply the data from OR-Instruct to various open-source LLMs of 7B size (termed ORLMs), resulting in significantly improved capability for optimization modeling. Our best-performing ORLM achieves state-of-the-art performance on the NL4OPT, MAMO, and IndustryOR benchmarks. Our code and data are available at \url{https://github.com/Cardinal-Operations/ORLM}.


Decoupling Learning and Decision-Making: Breaking the $\mathcal{O}(\sqrt{T})$ Barrier in Online Resource Allocation with First-Order Methods

arXiv.org Artificial Intelligence

Online linear programming plays an important role in both revenue management and resource allocation, and recent research has focused on developing efficient first-order online learning algorithms. Despite the empirical success of first-order methods, they typically achieve a regret no better than $\mathcal{O}(\sqrt{T})$, which is suboptimal compared to the $\mathcal{O}(\log T)$ bound guaranteed by the state-of-the-art linear programming (LP)-based online algorithms. This paper establishes several important facts about online linear programming that reveal why it is challenging for first-order-method-based online algorithms to achieve regret better than $\mathcal{O}(\sqrt{T})$. To address the challenge, we introduce a new algorithmic framework that decouples learning from decision-making. More importantly, for the first time, we show that first-order methods can attain regret $\mathcal{O}(T^{1/3})$ with this new framework. Lastly, we conduct numerical experiments to validate our theoretical findings.
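
The abstract does not detail the framework, so the following is only a hypothetical reading of "decoupling learning from decision-making": within each epoch, decisions are made against frozen dual prices, and the prices are re-learned between epochs by first-order passes over the accumulated history. The epoch schedule, step sizes, and names below are assumptions for illustration, not the paper's algorithm.

import numpy as np

def decoupled_online_lp(r, A, b, epochs=10, passes=3):
    """Hypothetical sketch: decision-making uses frozen duals inside an epoch,
    while learning (dual SGD over all past rounds) happens between epochs."""
    m, T = A.shape
    rho = b / T
    p = np.zeros(m)
    x = np.zeros(T)
    bounds = np.linspace(0, T, epochs + 1, dtype=int)
    for k in range(epochs):
        lo, hi = bounds[k], bounds[k + 1]
        # decision phase: act greedily with the frozen dual prices
        for t in range(lo, hi):
            x[t] = 1.0 if r[t] > A[:, t] @ p else 0.0
        # learning phase: refit the duals by SGD over the history so far
        eta = 1.0 / np.sqrt(hi)
        for _ in range(passes):
            for t in np.random.permutation(hi):
                xt = 1.0 if r[t] > A[:, t] @ p else 0.0
                p = np.maximum(0.0, p + eta * (A[:, t] * xt - rho))
    return x, p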


A Homogenization Approach for Gradient-Dominated Stochastic Optimization

arXiv.org Artificial Intelligence

The gradient dominance property is a condition weaker than strong convexity, yet it suffices to ensure global convergence for first-order methods even in non-convex optimization. This property finds application in various machine learning domains, including matrix decomposition, linear neural networks, and policy-based reinforcement learning (RL). In this paper, we study the stochastic homogeneous second-order descent method (SHSODM) for gradient-dominated optimization with $\alpha \in [1, 2]$ based on a recently proposed homogenization approach. Theoretically, we show that SHSODM achieves a sample complexity of $O(\epsilon^{-7/(2\alpha)+1})$ for $\alpha \in [1, 3/2)$ and $\tilde{O}(\epsilon^{-2/\alpha})$ for $\alpha \in [3/2, 2]$. We further provide a variant of SHSODM with variance reduction that enjoys an improved sample complexity of $O(\epsilon^{-(7-3\alpha)/(2\alpha)})$ for $\alpha \in [1, 3/2)$. Our results match the state-of-the-art sample complexity bounds for stochastic gradient-dominated optimization without \emph{cubic regularization}. Since the homogenization approach relies only on solving extremal eigenvector problems instead of Newton-type systems, our methods gain the advantage of cheaper iterations and robustness in ill-conditioned problems. Numerical experiments on several RL tasks demonstrate the efficiency of SHSODM compared to other off-the-shelf methods.
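
For reference, the gradient dominance condition with exponent $\alpha$ assumed above is standardly written as follows; $\alpha = 2$ recovers the classical Polyak-Lojasiewicz inequality (the constant $c$ is notation introduced here, not taken from the abstract):

f(x) - f^\star \;\le\; c\,\|\nabla f(x)\|^{\alpha}, \qquad \alpha \in [1, 2],\; c > 0.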


DRSOM: A Dimension Reduced Second-Order Method

arXiv.org Artificial Intelligence

In this paper, we propose a Dimension-Reduced Second-Order Method (DRSOM) for convex and nonconvex (unconstrained) optimization. Under a trust-region-like framework, our method preserves the convergence properties of second-order methods while using curvature information in only a few directions. Consequently, the computational overhead of our method remains comparable to that of first-order methods such as gradient descent. Theoretically, we show that the method achieves local quadratic convergence and a global convergence rate of $O(\epsilon^{-3/2})$ for satisfying the first-order and second-order optimality conditions, provided the subspace satisfies a commonly adopted approximate Hessian assumption. We further show that this assumption can be removed if we periodically perform a corrector step using a Krylov-like method at the end stage of the algorithm. The applicability and performance of DRSOM are exhibited by various computational experiments, including $L_2 - L_p$ minimization, CUTEst problems, and sensor network localization.
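
Concretely, a DRSOM-style iteration restricts the trust-region model to the two-dimensional subspace spanned by the current gradient and the previous step, so the reduced Hessian is $2 \times 2$ and costs only two Hessian-vector products. A sketch of one such step under those assumptions (the bisection solve and the requirement that the two directions be linearly independent are simplifications, not the paper's exact implementation):

import numpy as np

def drsom_step(grad, hvp, d_prev, delta=1.0):
    """One dimension-reduced trust-region step in span{grad, d_prev}.
    grad: (n,) gradient; hvp(v): Hessian-vector product; d_prev: previous step."""
    V = np.stack([grad, d_prev], axis=1)            # (n, 2) subspace basis
    HV = np.stack([hvp(V[:, 0]), hvp(V[:, 1])], axis=1)
    Q = V.T @ HV                                    # 2x2 reduced Hessian
    G = V.T @ V                                     # subspace metric
    b = V.T @ grad
    lam_lo, lam_hi = 0.0, 1e8
    # bisection on the multiplier: (Q + lam*G) c = -b with ||V c|| <= delta
    for _ in range(60):
        lam = 0.5 * (lam_lo + lam_hi)
        try:
            c = np.linalg.solve(Q + lam * G, -b)
        except np.linalg.LinAlgError:
            lam_lo = lam
            continue
        if c @ G @ c > delta ** 2:
            lam_lo = lam                            # step too long: regularize more
        else:
            lam_hi = lam
    return V @ np.linalg.solve(Q + lam_hi * G, -b)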


Pre-trained Mixed Integer Optimization through Multi-variable Cardinality Branching

arXiv.org Artificial Intelligence

We propose a new method to accelerate online Mixed Integer Optimization with Pre-trained machine learning models (PreMIO). The key component of PreMIO is a multi-variable cardinality branching procedure that splits the feasible region with data-driven hyperplanes, which can be easily integrated into any MIP solver with two lines of code. Moreover, we incorporate learning theory and concentration inequalities to develop a straightforward and interpretable hyper-parameter selection strategy for our method. We test the performance of PreMIO by applying it to state-of-the-art MIP solvers and running numerical experiments on both classical OR benchmark datasets and real-life instances. The results validate the effectiveness of our proposed method.
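
The branching step itself is easy to illustrate on any modeling layer: given a learned prediction that roughly $k$ of the binary variables indexed by $S$ will be active at the optimum, the feasible region is split by one cardinality hyperplane per branch. A hypothetical PuLP sketch, where build_model, S, and k are stand-ins supplied by the user and a maximization objective is assumed (the paper integrates the split inside full MIP solvers rather than re-solving from scratch):

import pulp

def solve_with_cardinality_split(build_model, S, k):
    """build_model() -> (prob, x): a fresh PuLP model and its binary vars (dict).
    Solves the two cardinality branches and returns the better incumbent."""
    best = None
    for sense in ("le", "ge"):
        prob, x = build_model()
        card = pulp.lpSum(x[i] for i in S)
        # the data-driven split: one cardinality constraint per branch
        prob += (card <= k) if sense == "le" else (card >= k + 1)
        prob.solve(pulp.PULP_CBC_CMD(msg=False))
        if prob.status == pulp.LpStatusOptimal:
            val = pulp.value(prob.objective)
            if best is None or val > best[0]:       # assumes maximization
                best = (val, {i: v.value() for i, v in x.items()})
    return best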


Stochastic Dimension-reduced Second-order Methods for Policy Optimization

arXiv.org Artificial Intelligence

In this paper, we propose several new stochastic second-order algorithms for policy optimization that require only gradients and Hessian-vector products in each iteration, making them computationally efficient and comparable in cost to policy gradient methods. Specifically, we propose a dimension-reduced second-order method (DR-SOPO) that repeatedly solves a projected two-dimensional trust-region subproblem. We show that DR-SOPO obtains an $\mathcal{O}(\epsilon^{-3.5})$ complexity for reaching an approximate first-order stationary condition and a certain subspace second-order stationary condition. In addition, we present an enhanced algorithm (DVR-SOPO) that further improves the complexity to $\mathcal{O}(\epsilon^{-3})$ based on a variance reduction technique. Preliminary experiments show that our proposed algorithms perform favorably compared with stochastic and variance-reduced policy gradient methods.
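
The reason gradient-and-HVP-only methods stay cheap is that a Hessian-vector product never requires forming the Hessian; for instance, a generic central-difference approximation needs just two extra gradient evaluations. This numpy sketch is a standard trick, not code from the paper:

import numpy as np

def hvp_fd(grad_fn, x, v, eps=1e-5):
    """Approximate H(x) @ v with two gradient calls (central difference)."""
    return (grad_fn(x + eps * v) - grad_fn(x - eps * v)) / (2.0 * eps)

# check on f(x) = 0.5 * x^T A x, whose exact HVP is A @ v
A = np.array([[2.0, 0.5], [0.5, 1.0]])
grad_fn = lambda x: A @ x
x, v = np.ones(2), np.array([1.0, -1.0])
print(hvp_fd(grad_fn, x, v), "vs", A @ v)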


Interior-point Methods Strike Back: Solving the Wasserstein Barycenter Problem

arXiv.org Machine Learning

Comparing, summarizing, and combining probability measures defined on a common space is a fundamental task in statistics and machine learning. Given the support points of probability measures in a metric space and a transportation cost function (e.g., the Euclidean distance), the Wasserstein distance between two measures is the minimal transportation cost between them. This notion of distance leads to a host of important applications, including text classification [28], clustering [23, 24, 14], unsupervised learning [21], semi-supervised learning [44], statistics [36, 37, 46, 19], and others [5, 39, 45]. Given a set of measures in the same space, the 2-Wasserstein barycenter is defined as the measure minimizing the sum of squared 2-Wasserstein distances to all measures in the set. For example, if each image in a set (sharing a common structure but with varying noise) is modeled as a probability measure, then the Wasserstein barycenter is a mixture of the images that retains this common structure.
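
Written out, the 2-Wasserstein barycenter of measures $\nu_1, \dots, \nu_K$ is the minimizer below; the weights $w_k$ generalize the plain (unweighted) sum of squared distances used above:

\mu^\star \;=\; \operatorname*{arg\,min}_{\mu} \; \sum_{k=1}^{K} w_k\, W_2^2(\mu, \nu_k), \qquad w_k \ge 0,\ \sum_{k=1}^{K} w_k = 1.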