AITopics | nonconvexity

nonconvexity

Convergence of Meta-Learning with Task-Specific Adaptation over Partial Parameters

Neural Information Processing Systems | Feb-9-2026, 05:14:45 GMT

However, the theoretical convergence of ANIL has not been studied yet.

artificial intelligence, inner loop, machine learning, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > Ohio (0.04)
North America > Canada (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

Regularized M-estimators with nonconvexity: Statistical and algorithmic theory for local optima

Neural Information Processing Systems | Sep-30-2025, 13:07:05 GMT

We establish theoretical results concerning all local optima of various regularized M-estimators, where both loss and penalty functions are allowed to be nonconvex. Our results show that as long as the loss function satisfies restricted strong convexity and the penalty function satisfies suitable regularity conditions, any local optimum of the composite objective function lies within statistical precision of the true parameter vector. Our theory covers a broad class of nonconvex objective functions, including corrected versions of the Lasso for errors-in-variables linear models; regression in generalized linear models using nonconvex regularizers such as SCAD and MCP; and graph and inverse covariance matrix estimation. On the optimization side, we show that a simple adaptation of composite gradient descent may be used to compute a global optimum up to the statistical precision epsilon in log(1/epsilon) iterations, which is the fastest possible rate of any first-order method. We provide a variety of simulations to illustrate the sharpness of our theoretical predictions.
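To make the nonconvex regularizers mentioned above concrete, here is a minimal sketch of the MCP penalty in its standard scalar form. The parameter names `lam` and `gamma`, the helper `violates_midpoint_convexity`, and the test points are illustrative assumptions, not taken from the paper. The sketch checks that MCP fails a midpoint convexity test, and that adding the quadratic t^2 / (2 * gamma) restores convexity on a small grid, which is one way to read a "suitable regularity condition" on the penalty.

```python
def mcp_penalty(t, lam, gamma):
    """Minimax concave penalty (MCP) at a scalar t.

    lam > 0 is the regularization level; gamma > 0 controls the concavity.
    Both names are illustrative choices for this sketch.
    """
    a = abs(t)
    if a <= gamma * lam:
        return lam * a - a * a / (2.0 * gamma)
    return 0.5 * gamma * lam * lam


def violates_midpoint_convexity(f, x, y):
    """True if f((x + y) / 2) > (f(x) + f(y)) / 2, which certifies nonconvexity."""
    return f(0.5 * (x + y)) > 0.5 * (f(x) + f(y))


lam, gamma = 1.0, 2.0
penalty = lambda t: mcp_penalty(t, lam, gamma)
# Adding t^2 / (2 * gamma) cancels the concave part of the penalty.
corrected = lambda t: penalty(t) + t * t / (2.0 * gamma)

# The raw penalty is nonconvex between 0 and gamma * lam.
nonconvex = violates_midpoint_convexity(penalty, 0.0, 2.0)

# The corrected function passes the midpoint test on a grid over [-4, 4].
grid = [i / 4.0 for i in range(-16, 17)]
corrected_ok = not any(
    violates_midpoint_convexity(corrected, x, y) for x in grid for y in grid
)
```

The grid check is only a numerical sanity test on a handful of points, not a proof of convexity.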

nonconvexity, regularized m-estimator, statistical and algorithmic theory, (5 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.61)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.61)

84c578f202616448a2f80e6f56d5f16d-Paper.pdf

Neural Information Processing Systems | Aug-14-2025, 23:12:04 GMT

algorithm, anil, inner loop, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > Ohio (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

Reviews: Neural Proximal/Trust Region Policy Optimization Attains Globally Optimal Policy

Neural Information Processing Systems | Jan-22-2025, 07:33:15 GMT

Originality: The authors apply the idea that overparametrization induces local linearization, which has been documented for supervised learning, and in another submission for TD learning. In particular, they decompose the error into two terms, one due to TD and the other due to SGD, and incorporate them in the analysis of infinite-dimensional mirror descent. The key insight is that the previous analysis for TD can be generalised to a meta-algorithm that includes both TD and SGD as particular cases. Related work is adequately cited, and differences with previous works are clearly stated, including differences with the sister submission [5]. Quality: The submission seems technically sound, and includes detailed proofs (I just skimmed through them). This is a complete piece of work.

architecture, optimization attain globally optimal policy, submission, (10 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.77)

Planning Shorter Paths in Graphs of Convex Sets by Undistorting Parametrized Configuration Spaces

Garg, Shruti, Cohn, Thomas, Tedrake, Russ

arXiv.org Artificial Intelligence | Nov-28-2024

Optimization-based motion planning provides a useful modeling framework through various costs and constraints. Using Graph of Convex Sets (GCS) for trajectory optimization gives guarantees of feasibility and optimality by representing configuration space as the finite union of convex sets. Nonlinear parametrizations can be used to extend this technique to handle cases such as kinematic loops, but this distorts distances, such that solving with convex objectives will yield paths that are suboptimal in the original space. We present a method to extend GCS to nonconvex objectives, allowing us to "undistort" the optimization landscape while maintaining feasibility guarantees. We demonstrate our method's efficacy on three different robotic planning domains: a bimanual robot moving an object with both arms, the set of 3D rotations using Euler angles, and a rational parametrization of kinematics that enables certifying regions as collision free. Across the board, our method significantly improves path length and trajectory duration with only a minimal increase in runtime.

artificial intelligence, objective, optimization problem, (17 more...)

arXiv.org Artificial Intelligence

2411.18913

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
Europe > Switzerland (0.04)
Asia > Middle East > Republic of Türkiye > Karaman Province > Karaman (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Robots > Robot Planning & Action (0.68)

Towards Convexity in Anomaly Detection: A New Formulation of SSLM with Unique Optimal Solutions

Liu, Hongying, Wang, Hao, Chu, Haoran, Wu, Yibo

arXiv.org Artificial Intelligence | Oct-31-2024

An unsolved issue in widely used methods such as Support Vector Data Description (SVDD) and Small Sphere and Large Margin SVM (SSLM) for anomaly detection is their nonconvexity, which hampers the analysis of optimal solutions in a manner similar to SVMs and limits their applicability in large-scale scenarios. In this paper, we introduce a novel convex SSLM formulation which has been demonstrated to revert to a convex quadratic programming problem for hyperparameter values of interest. Leveraging the convexity of our method, we derive numerous results that are unattainable with traditional nonconvex approaches. We conduct a thorough analysis of how hyperparameters influence the optimal solution, pointing out scenarios where optimal solutions can be trivially found and identifying instances of ill-posedness. Most notably, we establish connections between our method and traditional approaches, providing a clear determination of when the optimal solution is unique -- a task unachievable with traditional nonconvex methods. We also derive the ν-property to elucidate the interactions between hyperparameters and the fractions of support vectors and margin errors in both positive and negative classes.

optimal, optimal solution, sslm, (17 more...)

arXiv.org Artificial Intelligence

2410.23774

Country:

Asia > China > Shanghai > Shanghai (0.04)
Asia > China > Beijing > Beijing (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > San Mateo County > Menlo Park (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.70)

Reviews: On the Optimization Landscape of Tensor Decompositions

Neural Information Processing Systems | Oct-8-2024, 05:33:08 GMT

Specifically, it studies random over-complete tensors. The associated objective function is nonconvex, yet in practice simple methods based on gradient ascent are observed to solve this problem. This paper proves why we should expect such an outcome by showing that there are almost no local maxima other than the global maxima of the problem when the optimization is initialized at any solution that is slightly better than a random guess. Importantly, it is shown that these initial points do not have to be close to the true components of the tensor. This is an interesting result and a well-written paper. The analysis involves two steps: local (points close to true components) and global (points far from true components). The number of local maxima in each case is analyzed and shown to be exactly 2n for the former and almost nonexistent for the latter.

optimization landscape, tensor decomposition, true component, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.42)

Regularized M-estimators with nonconvexity: Statistical and algorithmic theory for local optima

Neural Information Processing Systems | Mar-13-2024, 23:17:36 GMT

We establish theoretical results concerning local optima of regularized M-estimators, where both loss and penalty functions are allowed to be nonconvex. Our results show that as long as the loss satisfies restricted strong convexity and the penalty satisfies suitable regularity conditions, any local optimum of the composite objective lies within statistical precision of the true parameter vector. Our theory covers a broad class of nonconvex objective functions, including corrected versions of the Lasso for errors-in-variables linear models and regression in generalized linear models using nonconvex regularizers such as SCAD and MCP.

linear regression, local optima, regression, (11 more...)

Neural Information Processing Systems

Country: North America > United States > California > Alameda County > Berkeley (0.14)

Genre: Research Report > New Finding (0.89)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.32)

On the nonconvexity of some push-forward constraints and its consequences in machine learning

de Lara, Lucas, Deronzier, Mathis, González-Sanz, Alberto, Foy, Virgile

arXiv.org Machine Learning | Mar-12-2024

The push-forward operation enables one to redistribute a probability measure through a deterministic map. It plays a key role in statistics and optimization: many learning problems (notably from optimal transport, generative modeling, and algorithmic fairness) include constraints or penalties framed as push-forward conditions on the model. However, the literature lacks general theoretical insights on the (non)convexity of such constraints and its consequences for the associated learning problems. This paper aims to fill this gap. In the first part, we provide a range of sufficient and necessary conditions for the (non)convexity of two sets of functions: the maps transporting one probability measure to another, and the maps inducing equal output distributions across distinct probability measures. This highlights that for most probability measures, these push-forward constraints are not convex. In the second part, we show how this result implies critical limitations on the design of convex optimization problems for learning generative models or group-fair predictors. This work will hopefully help researchers and practitioners better understand the critical impact of push-forward conditions on convexity.
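A small discrete example may help illustrate why a push-forward constraint can fail to be convex. The sketch below is an illustration constructed for this listing, not an example from the paper: the two-point measure `mu` and the helper `pushforward` are assumptions. It takes two maps that each transport the uniform measure on {0, 1} onto itself, and shows that their pointwise average does not.

```python
from collections import defaultdict
from fractions import Fraction


def pushforward(mu, T):
    """Push a finitely supported measure mu (dict: point -> mass) through T."""
    nu = defaultdict(Fraction)
    for x, mass in mu.items():
        nu[T(x)] += mass
    return dict(nu)


half = Fraction(1, 2)
mu = {Fraction(0): half, Fraction(1): half}  # uniform measure on {0, 1}
target = dict(mu)                            # constraint: T pushes mu onto mu

identity = lambda x: x
swap = lambda x: 1 - x
# Convex combination of the two feasible maps, taken pointwise.
midpoint = lambda x: half * identity(x) + half * swap(x)

identity_ok = pushforward(mu, identity) == target  # feasible
swap_ok = pushforward(mu, swap) == target          # feasible
midpoint_ok = pushforward(mu, midpoint) == target  # infeasible
```

Both `identity` and `swap` satisfy the constraint, but `midpoint` sends every point to 1/2, so its push-forward is a point mass rather than the target measure.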

convex, convexity, probability measure, (15 more...)

arXiv.org Machine Learning

2403.07471

Country:

Europe > France > Occitanie > Haute-Garonne > Toulouse (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.64)

Industry: Education > Focused Education > Special Education (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.48)

Structure Learning with Continuous Optimization: A Sober Look and Beyond

Ng, Ignavier, Huang, Biwei, Zhang, Kun

arXiv.org Artificial Intelligence | Apr-4-2023

Bayesian networks are a class of probabilistic graphical models that encode probabilistic distributions in a compact way (Pearl, 1988; Koller and Friedman, 2009). Recovery of their graphical structures from data, represented by directed acyclic graphs (DAGs), has found applications in several fields such as genetics (Peters et al., 2017) and education (Gong et al., 2022). This problem is NP-hard in general (Chickering, 1996; Chickering et al., 2004) owing to the combinatorial space of DAGs. Classical structure learning approaches fall into two broad categories, i.e., constraint-based methods and score-based methods. Constraint-based methods, such as PC (Spirtes and Glymour, 1991), employ conditional independence tests to estimate the skeleton and further perform edge orientation up to the Markov equivalence class (MEC) (Spirtes et al., 2001). Score-based methods typically assign a score to each structure and search for a high-scoring structure in the space of DAGs or equivalence classes (Koivisto and Sood, 2004; Singh and Moore, 2005; Cussens, 2011; Yuan and Malone, 2013). These methods often adopt greedy search because of the large space of possible structures (Chickering, 1996), such as GES (Chickering, 2002) and GDS (Peters and Bühlmann, 2013). Recently, Zheng et al. (2018) proposed a smooth characterization of acyclicity and transformed the structure learning problem of discrete nature into a continuous, nonconvex optimization problem, thus enabling the application of gradient-based methods.
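The excerpt above does not state the smooth acyclicity characterization itself. The form commonly attributed to Zheng et al. (2018) is h(W) = tr(exp(W ∘ W)) - d, which is zero exactly when the weighted adjacency matrix W describes a DAG. The sketch below is a minimal pure-Python illustration under that assumption. The function names, the truncated power series used for the matrix exponential, and the two example graphs are choices made for this sketch, not the authors' implementation.

```python
def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def acyclicity(W, terms=30):
    """h(W) = tr(exp(W o W)) - d, with exp approximated by a truncated series."""
    d = len(W)
    A = [[w * w for w in row] for row in W]  # Hadamard (elementwise) square
    power = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]
    trace = float(d)                         # k = 0 term: tr(I) = d
    factorial = 1.0
    for k in range(1, terms + 1):
        power = matmul(power, A)             # A^k
        factorial *= k                       # k!
        trace += sum(power[i][i] for i in range(d)) / factorial
    return trace - d


dag = [[0.0, 1.5], [0.0, 0.0]]    # single edge 0 -> 1, no cycle
cycle = [[0.0, 1.0], [1.0, 0.0]]  # edges 0 -> 1 and 1 -> 0
h_dag = acyclicity(dag)           # zero: every power of A has zero trace
h_cycle = acyclicity(cycle)       # positive: the 2-cycle contributes to tr(A^2)
```

Because h is differentiable in W, it can serve as an equality constraint in a continuous optimization problem, which is the reformulation the passage describes. The resulting problem is still nonconvex.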

artificial intelligence, machine learning, optimization problem, (16 more...)

arXiv.org Artificial Intelligence

2304.02146

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > San Diego County > San Diego (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
