AITopics | Optimization

Collaborating Authors

Optimization

News Overviews Instructional Materials AI-Alerts Classics

Adaptive Regularization for Large-Scale Sparse Feature Embedding Models

arXiv.org Machine LearningNov-11-2025

The one-epoch overfitting problem has drawn widespread attention, especially in CTR and CVR estimation models in search, advertising, and recommendation domains. These models which rely heavily on large-scale sparse categorical features, often suffer a significant decline in performance when trained for multiple epochs. Although recent studies have proposed heuristic solutions, the fundamental cause of this phenomenon remains unclear. In this work, we present a theoretical explanation grounded in Rademacher complexity, supported by empirical experiments, to explain why overfitting occurs in models with large-scale sparse categorical features. Based on this analysis, we propose a regularization method that constrains the norm budget of embedding layers adaptively. Our approach not only prevents the severe performance degradation observed during multi-epoch training, but also improves model performance within a single epoch. This method has already been deployed in online production systems. Click-through rate (CTR) and conversion rate (CVR) estimation are critical for advertising, search and recommendation (ASR) applications. E-commerce platforms like Amazon and Taobao rely on optimizing CTR and CVR estimation to boost gross merchandise volume (GMV), while advertising platforms at Google and Meta depend on it to drive revenue growth.

artificial intelligence, machine learning, optimization problem, (18 more...)

arXiv.org Machine Learning

2511.06374

Country:

Asia > China > Shandong Province > Dongying (0.04)
North America > Canada > British Columbia (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology > Services (0.88)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)

Add feedback

Bridging Theory and Practice: A Stochastic Learning-Optimization Model for Resilient Automotive Supply Chains

Shahnawaz, Muhammad, Safder, Adeel

arXiv.org Machine LearningNov-11-2025

Supply chain disruptions and volatile demand pose significant challenges to the UK automotive industry, which relies heavily on Just-In-Time (JIT) manufacturing. While qualitative studies highlight the potential of integrating Artificial Intelligence (AI) with traditional optimization, a formal, quantitative demonstration of this synergy is lacking. This paper introduces a novel stochastic learning-optimization framework that integrates Bayesian inference with inventory optimization for supply chain management (SCM). We model a two-echelon inventory system subject to stochastic demand and supply disruptions, comparing a traditional static optimization policy against an adaptive policy where Bayesian learning continuously updates parameter estimates to inform stochastic optimization. Our simulations over 365 periods across three operational scenarios demonstrate that the integrated approach achieves 7.4\% cost reduction in stable environments and 5.7\% improvement during supply disruptions, while revealing important limitations during sudden demand shocks due to the inherent conservatism of Bayesian updating. This work provides mathematical validation for practitioner observations and establishes a formal framework for understanding AI-driven supply chain resilience, while identifying critical boundary conditions for successful implementation.

artificial intelligence, machine learning, optimization, (17 more...)

arXiv.org Machine Learning

2511.06479

Country:

Europe > United Kingdom > Scotland > City of Glasgow > Glasgow (0.04)
Asia > Pakistan > Islamabad Capital Territory > Islamabad (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.70)

Industry: Automobiles & Trucks (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.71)

Add feedback

Non-Negative Stiefel Approximating Flow: Orthogonalish Matrix Optimization for Interpretable Embeddings

Avants, Brian B., Tustison, Nicholas J., Stone, James R

arXiv.org Machine LearningNov-11-2025

Interpretable representation learning is a central challenge in modern machine learning, particularly in high-dimensional settings such as neuroimaging, genomics, and text analysis. Current methods often struggle to balance the competing demands of interpretability and model flexibility, limiting their effectiveness in extracting meaningful insights from complex data. We introduce Non-negative Stiefel Approximating Flow (NSA-Flow), a general-purpose matrix estimation framework that unifies ideas from sparse matrix factorization, orthogonalization, and constrained manifold learning. NSA-Flow enforces structured sparsity through a continuous balance between reconstruction fidelity and column-wise decorrelation, parameterized by a single tunable weight. The method operates as a smooth flow near the Stiefel manifold with proximal updates for non-negativity and adaptive gradient control, yielding representations that are simultaneously sparse, stable, and interpretable. Unlike classical regularization schemes, NSA-Flow provides an intuitive geometric mechanism for manipulating sparsity at the level of global structure while simplifying latent features. We demonstrate that the NSA-Flow objective can be optimized smoothly and integrates seamlessly with existing pipelines for dimensionality reduction while improving interpretability and generalization in both simulated and real biomedical data. Empirical validation on the Golub leukemia dataset and in Alzheimer's disease demonstrate that the NSA-Flow constraints can maintain or improve performance over related methods with little additional methodological effort. NSA-Flow offers a scalable, general-purpose tool for interpretable ML, applicable across data science domains.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Machine Learning

2511.06425

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Virginia > Albemarle County > Charlottesville (0.04)
North America > United States > New York (0.04)
(3 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Health & Medicine > Therapeutic Area > Oncology > Leukemia (0.67)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.88)

Add feedback

UAV-Assisted Resilience in 6G and Beyond Network Energy Saving: A Multi-Agent DRL Approach

Dinh, Dao Lan Vy, Mai, Anh Nguyen Thi, Tran, Hung, Vu, Giang Quynh Le, Ho, Tu Dac, Pan, Zhenni, Van, Vo Nhan, Chatzinotas, Symeon, Tran, Dinh-Hieu

arXiv.org Artificial IntelligenceNov-11-2025

This paper investigates the unmanned aerial vehicle (UAV)-assisted resilience perspective in the 6G network energy saving (NES) scenario. More specifically, we consider multiple ground base stations (GBSs) and each GBS has three different sectors/cells in the terrestrial networks, and multiple cells are turned off due to NES or incidents, e.g., disasters, hardware failures, or outages. To address this, we propose a Multi-Agent Deep Deterministic Policy Gradient (MADDPG) framework to enable UAV-assisted communication by jointly optimizing UAV trajectories, transmission power, and user-UAV association under a sleeping ground base station (GBS) strategy. This framework aims to ensure the resilience of active users in the network and the long-term operability of UAVs. Specifically, it maximizes service coverage for users during power outages or NES zones, while minimizing the energy consumption of UAVs. Simulation results demonstrate that the proposed MADDPG policy consistently achieves high coverage ratio across different testing episodes, outperforming other baselines. Moreover, the MADDPG framework attains the lowest total energy consumption, with a reduction of approximately 24\% compared to the conventional all GBS ON configuration, while maintaining a comparable user service rate. These results confirm the effectiveness of the proposed approach in achieving a superior trade-off between energy efficiency and service performance, supporting the development of sustainable and resilient UAV-assisted cellular networks.

machine learning, reinforcement learning, uav, (18 more...)

arXiv.org Artificial Intelligence

2511.07366

Country:

Asia (0.94)
Europe > Italy (0.69)

Genre: Research Report > New Finding (0.69)

Industry:

Telecommunications (1.00)
Information Technology (1.00)
Energy > Power Industry (0.35)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.49)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (0.48)
(2 more...)

Add feedback

Exact Smooth Reformulations for Trajectory Optimization Under Signal Temporal Logic Specifications

Han, Shaohang, Verhagen, Joris, Tumova, Jana

arXiv.org Artificial IntelligenceNov-11-2025

Abstract-- We study motion planning under Signal T emporal Logic (STL), a useful formalism for specifying spatial-temporal requirements. We pose STL synthesis as a trajectory optimization problem leveraging the STL robustness semantics. T o obtain a differentiable problem without approximation error, we introduce an exact reformulation of the max and min operators. The resulting method is exact, smooth, and sound. Temporal logics are formal specification languages used to describe complex desired behaviors of autonomous systems.

artificial intelligence, optimization problem, specification, (10 more...)

arXiv.org Artificial Intelligence

2511.07375

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)

Add feedback

Enhancing Robustness of Graph Neural Networks through p-Laplacian

Sirohi, Anuj Kumar, Halder, Subhanu, Kumar, Kabir, Kumar, Sandeep

arXiv.org Artificial IntelligenceNov-11-2025

With the increase of data in day-to-day life, businesses and different stakeholders need to analyze the data for better predictions. Traditionally, relational data has been a source of various insights, but with the increase in computational power and the need to understand deeper relationships between entities, the need to design new techniques has arisen. For this graph data analysis has become an extraordinary tool for understanding the data, which reveals more realistic and flexible modelling of complex relationships. Recently, Graph Neural Networks (GNNs) have shown great promise in various applications, such as social network analysis, recommendation systems, drug discovery, and more. However, many adversarial attacks can happen over the data, whether during training (poisoning attack) or during testing (evasion attack), which can adversely manipulate the desired outcome from the GNN model. Therefore, it is crucial to make the GNNs robust to such attacks. The existing robustness methods are computationally demanding and perform poorly when the intensity of attack increases. This paper presents a computationally efficient framework, namely, pLAPGNN, based on weighted p-Laplacian for making GNNs robust. Empirical evaluation on real datasets establishes the efficacy and efficiency of the proposed method.

artificial intelligence, graph neural network, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2511.06143

Country: North America > United States (0.14)

Genre: Research Report > New Finding (0.46)

Industry:

Information Technology > Security & Privacy (0.50)
Government > Military (0.50)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.73)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

Add feedback

RobustFSM: Submodular Maximization in Federated Setting with Malicious Clients

Tran, Duc A., Truong, Dung, Le, Duy

arXiv.org Artificial IntelligenceNov-11-2025

Abstract--Submodular maximization is an optimization problem benefiting many machine learning applications, where we seek a small subset best representing an extremely large dataset. We focus on the federated setting where the data are locally owned by decentralized clients who have their own definitions for the quality of representability. This setting requires repetitive aggregation of local information computed by the clients. While the main motivation is to respect the privacy and autonomy of the clients, the federated setting is vulnerable to client misbehaviors: malicious clients might share fake information. An analogy is backdoor attack in conventional federated learning, but our challenge differs freshly due to the unique characteristics of submodular maximization. We propose RobustFSM, a federated submodular maximization solution that is robust to various practical client attacks. Its performance is substantiated with an empirical evaluation study using real-world datasets. Numerical results show that the solution quality of RobustFSM substantially exceeds that of the conventional federated algorithm when attacks are severe. The degree of this improvement depends on the dataset and attack scenarios, which can be as high as 200%.

artificial intelligence, machine learning, robustfsm, (18 more...)

arXiv.org Artificial Intelligence

2511.02029

Country: North America > United States (1.00)

Genre: Research Report > New Finding (0.48)

Industry: Information Technology > Security & Privacy (0.87)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.34)

Add feedback

Rethinking Crystal Symmetry Prediction: A Decoupled Perspective

Yu, Liheng, Zhao, Zhe, Wang, Xucong, Wu, Di, Wang, Pengkun

arXiv.org Artificial IntelligenceNov-11-2025

Efficiently and accurately determining the symmetry is a crucial step in the structural analysis of crystalline materials. Existing methods usually mindlessly apply deep learning models while ignoring the underlying chemical rules. More importantly, experiments show that they face a serious sub-property confusion (SPC) problem. To address the above challenges, from a decoupled perspective, we introduce the XRDecoupler framework, a problem-solving arsenal specifically designed to tackle the SPC problem. Imitating the thinking process of chemists, we innovatively incorporate multidimensional crystal symmetry information as superclass guidance to ensure that the model's prediction process aligns with chemical intuition. We further design a hierarchical PXRD pattern learning model and a multi-objective optimization approach to achieve high-quality representation and balanced optimization. Comprehensive evaluations on three mainstream databases (e.g., CCDC, CoREMOF, and InorganicData) demonstrate that XRDecoupler excels in performance, interpretability, and generalization.

artificial intelligence, machine learning, space group, (21 more...)

arXiv.org Artificial Intelligence

2511.06976

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Mitigating Modality Imbalance in Multi-modal Learning via Multi-objective Optimization

Fernando, Heshan, Ram, Parikshit, Zhou, Yi, Dan, Soham, Samulowitz, Horst, Baracaldo, Nathalie, Chen, Tianyi

arXiv.org Artificial IntelligenceNov-11-2025

Multi-modal learning (MML) aims to integrate information from multiple modalities, which is expected to lead to superior performance over single-modality learning. However, recent studies have shown that MML can underperform, even compared to single-modality approaches, due to imbalanced learning across modalities. Methods have been proposed to alleviate this imbalance issue using different heuristics, which often lead to computationally intensive subroutines. In this paper, we reformulate the MML problem as a multi-objective optimization (MOO) problem that overcomes the imbalanced learning issue among modalities and propose a gradient-based algorithm to solve the modified MML problem. We provide convergence guarantees for the proposed method, and empirical evaluations on popular MML benchmarks showcasing the improved performance of the proposed method over existing balanced MML and MOO baselines, with up to ~20x reduction in subroutine computation time. Our code is available at https://github.com/heshandevaka/MIMO.

artificial intelligence, modality, optimization problem, (13 more...)

arXiv.org Artificial Intelligence

2511.06686

Country: Europe (0.28)

Genre: Research Report (0.84)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)

Add feedback

Improving Asset Allocation in a Fast Moving Consumer Goods B2B Company: An Interpretable Machine Learning Framework for Commercial Cooler Assignment Based on Multi-Tier Growth Targets

Castro, Renato, Paredes, Rodrigo, Kahn, Douglas

arXiv.org Artificial IntelligenceNov-11-2025

In the fast-moving consumer goods (FMCG) industry, deciding where to place physical assets, such as commercial beverage coolers, can directly impact revenue growth and execution efficiency. Although churn prediction and demand forecasting have been widely studied in B2B contexts, the use of machine learning to guide asset allocation remains relatively unexplored. This paper presents a framework focused on predicting which beverage clients are most likely to deliver strong returns in volume after receiving a cooler. Using a private dataset from a well-known Central American brewing and beverage company of 3,119 B2B traditional trade channel clients that received a cooler from 2022-01 to 2024-07, and tracking 12 months of sales transactions before and after cooler installation, three growth thresholds were defined: 10%, 30% and 50% growth in sales volume year over year. The analysis compares results of machine learning models such as XGBoost, LightGBM, and CatBoost combined with SHAP for interpretable feature analysis in order to have insights into improving business operations related to cooler allocation; the results show that the best model has AUC scores of 0.857, 0.877, and 0.898 across the thresholds on the validation set. Simulations suggest that this approach can improve ROI because it better selects potential clients to grow at the expected level and increases cost savings by not assigning clients that will not grow, compared to traditional volume-based approaches with substantial business management recommendations

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2511.06642

Country: South America > Peru (0.29)

Genre: Research Report > New Finding (0.88)

Industry:

Consumer Products & Services > Food, Beverage, Tobacco & Cannabis > Beverages (1.00)
Banking & Finance > Trading (0.72)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.51)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.46)
(2 more...)

Add feedback