All strings generated by the CFG can be broken down into a (non-unique) tree of production rules with the non-terminal starting symbol S at its head. Although each individual production rule is a simple replacement operation, the combination of many such rules can specify a string space with complex syntactical constraints. However, when sampling strings from the grammar, we found this simple sampling strategy to produce long and repetitive strings. In fact, these tasks are considerably more challenging than the common benchmarks used to test standard BO frameworks. We tried SE kernels with both individual and tied length scales across latent dimensions; however, this did not have a significant effect on performance, possibly due to difficulties in estimating many kernel parameters in these low-data BO problems. This ranking matches the relative performance of the BO routines based on these surrogate models (Figure 7). Figure 7.d visualizes the intrinsic representation of an SSK when kernel parameters are purposely chosen to provide a bad fit.
Jaccard Metric Losses: Optimizing the Jaccard Index with Soft Labels
Intersection over Union (IoU) losses are surrogates that directly optimize the Jaccard index. Leveraging IoU losses as part of the loss function has demonstrated superior performance in semantic segmentation tasks compared to optimizing pixel-wise losses such as the cross-entropy loss alone. However, we identify a lack of flexibility in these losses to support vital training techniques like label smoothing, knowledge distillation, and semi-supervised learning, mainly due to their inability to process soft labels. To address this, we introduce Jaccard Metric Losses (JMLs), which are identical to the soft Jaccard loss in standard settings with hard labels but are fully compatible with soft labels. We apply JMLs to three prominent use cases of soft labels: label smoothing, knowledge distillation and semi-supervised learning, and demonstrate their potential to enhance model accuracy and calibration. Our experiments show consistent improvements over the cross-entropy loss across 4 semantic segmentation datasets (Cityscapes, PASCAL VOC, ADE20K, DeepGlobe Land) and 13 architectures, including classic CNNs and recent vision transformers. Remarkably, our straightforward approach significantly outperforms state-of-the-art knowledge distillation and semi-supervised learning methods.
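A minimal sketch of the idea behind soft-label-compatible IoU losses, using the product-form soft Jaccard relaxation (an illustration of the principle, not necessarily the paper's exact JML formulation):

```python
import numpy as np

def soft_jaccard_loss(probs, labels, eps=1e-8):
    """Soft Jaccard (IoU) loss for one class.

    probs:  predicted per-pixel probabilities, shape (N,)
    labels: targets in [0, 1]; hard (0/1) or soft (e.g. smoothed)

    With hard labels this is the usual soft Jaccard loss; because it
    uses only products and sums, it also accepts soft labels directly.
    """
    intersection = np.sum(probs * labels)
    union = np.sum(probs) + np.sum(labels) - intersection
    return 1.0 - intersection / (union + eps)

# hard labels: behaves like the standard soft Jaccard loss
hard = soft_jaccard_loss(np.array([0.9, 0.1, 0.8]), np.array([1.0, 0.0, 1.0]))
# soft labels (e.g. from label smoothing) are handled identically
soft = soft_jaccard_loss(np.array([0.9, 0.1, 0.8]), np.array([0.95, 0.05, 0.95]))
```

A cross-entropy loss accepts soft labels for free; the point above is that a naive hard-label Jaccard surrogate does not, which is the gap JMLs close.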
When Does Optimizing a Proper Loss Yield Calibration?
Optimizing proper loss functions is popularly believed to yield predictors with good calibration properties; the intuition being that for such losses, the global optimum is to predict the ground-truth probabilities, which is indeed calibrated. However, typical machine learning models are trained to approximately minimize loss over restricted families of predictors, that are unlikely to contain the ground truth. Under what circumstances does optimizing proper loss over a restricted family yield calibrated models? What precise calibration guarantees does it give? In this work, we provide a rigorous answer to these questions. We replace the global optimality with a local optimality condition stipulating that the (proper) loss of the predictor cannot be reduced much by post-processing its predictions with a certain family of Lipschitz functions. We show that any predictor with this local optimality satisfies smooth calibration as defined in [Kakade and Foster, 2008, Błasiok et al., 2023]. Local optimality is plausibly satisfied by well-trained DNNs, which suggests an explanation for why they are calibrated from proper loss minimization alone. Finally, we show that the connection between local optimality and calibration error goes both ways: nearly calibrated predictors are also nearly locally optimal.
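The local-optimality condition in this abstract can be probed empirically: take a finite family of Lipschitz shift functions and measure how much post-processing the predictions reduces a proper loss. A hedged sketch (the shift family and step sizes are illustrative choices, not the paper's construction):

```python
import numpy as np

def loss(preds, labels):
    # squared loss: a proper loss for binary outcomes
    return np.mean((preds - labels) ** 2)

def post_process_gain(preds, labels, etas, alphas=(0.05, 0.1, 0.2)):
    """Largest squared-loss reduction achievable by post-processing
    v -> clip(v + alpha * eta(v)) over a small family of Lipschitz
    shifts eta -- a finite proxy for the Lipschitz post-processing
    family in the local-optimality condition."""
    base = loss(preds, labels)
    best = 0.0
    for eta in etas:
        for a in alphas:
            shifted = np.clip(preds + a * eta(preds), 0.0, 1.0)
            best = max(best, base - loss(shifted, labels))
    return best

rng = np.random.default_rng(0)
v = rng.uniform(size=5000)
y = (rng.uniform(size=5000) < v).astype(float)  # calibrated: P(y=1|v) = v
etas = [lambda t: np.ones_like(t), lambda t: -np.ones_like(t),
        lambda t: t - 0.5, lambda t: 0.5 - t]
gain = post_process_gain(v, y, etas)  # near 0 for this calibrated predictor
```

A systematically biased predictor (e.g. `v - 0.2`) yields a much larger gain, matching the abstract's claim that the connection runs both ways.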
Optimizing over trained GNNs via symmetry breaking
Optimization over trained machine learning models has applications including: verification, minimizing neural acquisition functions, and integrating a trained surrogate into a larger decision-making problem. This paper formulates and solves optimization problems constrained by trained graph neural networks (GNNs). To circumvent the symmetry issue caused by graph isomorphism, we propose two types of symmetry-breaking constraints: one indexing a node 0 and one indexing the remaining nodes by lexicographically ordering their neighbor sets. To guarantee that adding these constraints will not remove all symmetric solutions, we construct a graph indexing algorithm and prove that the resulting graph indexing satisfies the proposed symmetry-breaking constraints. For the classical GNN architectures considered in this paper, optimizing over a GNN with a fixed graph is equivalent to optimizing over a dense neural network. Thus, we study the case where the input graph is not fixed, implying that each edge is a decision variable, and develop two mixed-integer optimization formulations. To test our symmetry-breaking strategies and optimization formulations, we consider an application in molecular design.
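A toy sketch of the lexicographic idea: designate one node as node 0 and require the remaining nodes' neighbor sets to be ordered lexicographically (a simplified condition for illustration, not the paper's exact constraint set or indexing algorithm):

```python
import itertools

def neighbor_key(adj, v, order):
    # position-encoded neighbor set of v under a candidate ordering
    pos = {u: i for i, u in enumerate(order)}
    return sorted(pos[u] for u in adj[v])

def satisfies_symmetry_breaking(adj, order):
    """Check a simplified lexicographic symmetry-breaking condition:
    the nodes after node 0 (under `order`) must have non-decreasing
    neighbor keys. Illustrative only."""
    keys = [neighbor_key(adj, v, order) for v in order[1:]]
    return all(keys[i] <= keys[i + 1] for i in range(len(keys) - 1))

def find_valid_indexing(adj):
    # brute force over orderings; a stand-in for the paper's graph
    # indexing algorithm, which constructs a valid indexing directly
    for order in itertools.permutations(adj):
        if satisfies_symmetry_breaking(adj, order):
            return order
    return None

# path graph a-b-c: at least one indexing satisfies the constraint,
# so adding it does not remove every isomorphic copy of the graph
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
order = find_valid_indexing(adj)
```

The guarantee the paper proves plays the role of the existence check here: for every graph, some indexing survives the constraints.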
Is Sequence Information All You Need for Bayesian Optimization of Antibodies?
Ober, Sebastian W., McCarter, Calvin, Raghu, Aniruddh, Li, Yucen Lily, Amin, Alan N., Wilson, Andrew Gordon, Elliott, Hunter
Bayesian optimization is a natural candidate for the engineering of antibody therapeutic properties, which is often iterative and expensive. However, finding the optimal choice of surrogate model for optimization over the highly structured antibody space is difficult, and may differ depending on the property being optimized. Moreover, to the best of our knowledge, no prior works have attempted to incorporate structural information into antibody Bayesian optimization. In this work, we explore different approaches to incorporating structural information into Bayesian optimization, and compare them to a variety of sequence-only approaches on two different antibody properties, binding affinity and stability. In addition, we propose the use of a protein language model-based ``soft constraint,'' which helps guide the optimization to promising regions of the space. We find that certain types of structural information improve data efficiency in early optimization rounds for stability, but have equivalent peak performance. Moreover, when incorporating the protein language model soft constraint we find that the data efficiency gap is diminished for affinity and eliminated for stability, resulting in sequence-only methods that match the performance of structure-based methods, raising questions about the necessity of structure in Bayesian optimization for antibodies.
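A minimal sketch of how a language-model-based soft constraint could weight an acquisition function (the sigmoid gate, threshold `tau`, and temperature are hypothetical details for illustration, not the paper's formulation):

```python
import numpy as np

def constrained_acquisition(acq, loglik, tau=0.0, temp=1.0):
    """Weight a Bayesian-optimization acquisition value by a soft
    'naturalness' gate derived from a protein language model (PLM)
    log-likelihood: candidates the PLM scores below the threshold tau
    are smoothly down-weighted rather than hard-rejected."""
    gate = 1.0 / (1.0 + np.exp(-(np.asarray(loglik) - tau) / temp))
    return np.asarray(acq) * gate
```

The soft gate keeps the search differentiable-friendly and avoids discarding borderline sequences outright, which a hard likelihood cutoff would.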
A. Dynamic Programs for SSK Evaluations and Gradients

We now detail recursive calculation strategies for calculating k_n(a, b) and its gradients with O(nl
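As a concrete reference point for the recursions described above, the standard dynamic program for the string subsequence kernel (following Lodhi et al., 2002) can be sketched as follows; the gradient recursions with respect to the decay parameter are omitted:

```python
import numpy as np

def ssk(s, t, n, lam=0.5):
    """String subsequence kernel k_n(s, t): a decay-weighted count of
    (non-contiguous) length-n subsequences shared by s and t, computed
    with the standard O(n * |s| * |t|) dynamic program of Lodhi et al.
    (2002). lam in (0, 1] penalizes gappy occurrences."""
    ls, lt = len(s), len(t)
    # Kp[i, p, q] holds the auxiliary term K'_i on prefixes s[:p], t[:q]
    Kp = np.zeros((n, ls + 1, lt + 1))
    Kp[0, :, :] = 1.0
    for i in range(1, n):
        for p in range(i, ls + 1):
            acc = 0.0  # running sum over match positions in t (the K'' term)
            for q in range(i, lt + 1):
                acc *= lam
                if s[p - 1] == t[q - 1]:
                    acc += lam * lam * Kp[i - 1, p - 1, q - 1]
                Kp[i, p, q] = lam * Kp[i, p - 1, q] + acc
    # combine length-(n-1) prefix matches into the final kernel value
    k = 0.0
    for p in range(n, ls + 1):
        for q in range(n, lt + 1):
            if s[p - 1] == t[q - 1]:
                k += lam * lam * Kp[n - 1, p - 1, q - 1]
    return k
```

With lam = 0.5, ssk("cat", "car", 2) reduces to lam**4, the weight of the single shared length-2 subsequence "ca".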
A recursive strategy can efficiently calculate the contribution of a particular substring by pre-calculating the contributions of the smaller sub-strings contained within the target string.

Context-free grammars (CFGs) are 4-tuples G = (V, Σ, R, S) consisting of: a set of non-terminal symbols V; a set of terminal symbols Σ (also known as an alphabet); a set of production rules R; and a non-terminal starting symbol S from which all strings are generated. The CFG for the symbolic regression task of Section 5.3 is given by the following rules:

S → S '+' T | S '*' T | S '/' T | T
T → '(' S ')' | 'sin(' S ')' | 'exp(' S ')' | 'x' | '1' | '2' | '3'

We now provide implementation details for our GA acquisition function optimizers. The GA begins with a randomly sampled population and ends once the best string in the population stops improving between iterations (Algorithm 1). Although seemingly simple, our synthetic string optimization tasks of Section 5.1 are deceptively challenging. We now provide comprehensive experimental results across the synthetic string optimization tasks.
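The rule-expansion sampling strategy described above can be sketched as a short recursive sampler (a hypothetical encoding of the grammar; the depth cap is an illustrative termination device, not necessarily the paper's mechanism):

```python
import random

# One possible encoding of the symbolic-regression grammar: each
# non-terminal maps to its list of alternative right-hand sides.
GRAMMAR = {
    "S": [["S", "+", "T"], ["S", "*", "T"], ["S", "/", "T"], ["T"]],
    "T": [["(", "S", ")"], ["sin(", "S", ")"], ["exp(", "S", ")"],
          ["x"], ["1"], ["2"], ["3"]],
}

def sample_string(symbol="S", max_depth=10, rng=random):
    """Sample a string by recursively expanding production rules.
    Uniform rule choice tends to produce long, repetitive strings
    (the issue noted above); capping the depth forces termination."""
    if symbol not in GRAMMAR:
        return symbol  # terminal symbol: emit as-is
    rules = GRAMMAR[symbol]
    if max_depth <= 0:
        # fall back to the shortest right-hand side to terminate
        rules = [min(rules, key=len)]
    rhs = rng.choice(rules)
    return "".join(sample_string(s, max_depth - 1, rng) for s in rhs)
```

Every sampled string corresponds to a derivation tree rooted at S, matching the tree decomposition described earlier.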
Improving Model Classification by Optimizing the Training Dataset
Tukan, Morad, Mualem, Loay, Netzer, Eitan, Sigalat, Liran
In the era of data-centric AI, the ability to curate high-quality training data is as crucial as model design. Coresets offer a principled approach to data reduction, enabling efficient learning on large datasets through importance sampling. However, conventional sensitivity-based coreset construction often falls short in optimizing for classification performance metrics, e.g., $F1$ score, focusing instead on loss approximation. In this work, we present a systematic framework for tuning the coreset generation process to enhance downstream classification quality. Our method introduces new tunable parameters--including deterministic sampling, class-wise allocation, and refinement via active sampling--beyond traditional sensitivity scores. Through extensive experiments on diverse datasets and classifiers, we demonstrate that tuned coresets can significantly outperform both vanilla coresets and full dataset training on key classification metrics, offering an effective path towards better and more efficient model training.
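An illustrative sketch of the sensitivity-based importance sampling this framework builds on (the loss-proportional sensitivity proxy here is a simplification of proper sensitivity bounds):

```python
import numpy as np

def sensitivity_coreset(X, losses, m, rng=None):
    """Importance-sample a weighted coreset of size m.

    Uses per-point 'sensitivity' proxies (here: each point's share of
    the total loss). Sampling probability is proportional to
    sensitivity; each sampled point gets weight 1 / (m * p_i) so that
    weighted sums over the coreset stay unbiased estimates of sums
    over the full dataset.
    """
    rng = rng or np.random.default_rng()
    sens = losses / losses.sum()
    idx = rng.choice(len(X), size=m, replace=True, p=sens)
    weights = 1.0 / (m * sens[idx])
    return X[idx], weights

rng = np.random.default_rng(0)
X = rng.normal(size=(10000, 5))
losses = np.abs(X[:, 0]) + 0.1          # hypothetical per-point losses
C, w = sensitivity_coreset(X, losses, m=500, rng=rng)
# weighted coreset loss matches the full-data total loss
est = np.sum(w * (np.abs(C[:, 0]) + 0.1))
```

The estimator is exact here because the sampling distribution is proportional to the very quantity being summed; for other metrics (e.g. $F1$) no such proportionality holds, which is the gap the tunable parameters above target.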
Quick-Draw Bandits: Quickly Optimizing in Nonstationary Environments with Extremely Many Arms
Everett, Derek, Lu, Fred, Raff, Edward, Camacho, Fernando, Holt, James
Canonical algorithms for multi-armed bandits typically assume a stationary reward environment where the size of the action space (number of arms) is small. More recently developed methods typically relax only one of these assumptions: existing non-stationary bandit policies are designed for a small number of arms, while Lipschitz, linear, and Gaussian process bandit policies are designed to handle a large (or infinite) number of arms in stationary reward environments under constraints on the reward function. In this manuscript, we propose a novel policy to learn reward environments over a continuous space using Gaussian interpolation. We show that our method efficiently learns continuous Lipschitz reward functions with $\mathcal{O}^*(\sqrt{T})$ cumulative regret. Furthermore, our method naturally extends to non-stationary problems with a simple modification. We finally demonstrate that our method is computationally favorable (100-10000x faster) and experimentally outperforms sliding Gaussian process policies on datasets with non-stationarity and an extremely large number of arms.
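A hedged sketch of a bandit policy built on Gaussian-kernel interpolation with a recency discount; the exact interpolation and exploration terms here are illustrative stand-ins, not the paper's policy:

```python
import numpy as np

def gaussian_ucb_policy(arms, hist_x, hist_r, h=0.1, beta=2.0, gamma=1.0):
    """Pick the next arm on a continuous space via Gaussian-kernel
    interpolation of past rewards plus an exploration bonus.
    gamma < 1 discounts old observations, a simple modification for
    non-stationary rewards."""
    if len(hist_x) == 0:
        return arms[np.random.randint(len(arms))]
    x = np.asarray(hist_x)          # past arm locations, shape (t,)
    r = np.asarray(hist_r)          # past rewards, shape (t,)
    age = np.arange(len(x))[::-1]   # 0 = most recent observation
    w = gamma ** age                # recency discount
    K = np.exp(-((arms[:, None] - x[None, :]) ** 2) / (2 * h * h)) * w
    dens = K.sum(axis=1)
    mean = (K @ r) / np.maximum(dens, 1e-12)   # kernel-smoothed reward
    bonus = beta / np.sqrt(1.0 + dens)         # explore sparse regions
    return arms[np.argmax(mean + bonus)]

arms = np.linspace(0.0, 1.0, 101)
# with flat rewards observed only at the endpoints, the bonus steers
# the policy toward the least-explored region (the middle)
choice = gaussian_ucb_policy(arms, [0.0] * 20 + [1.0] * 20, [0.5] * 40)
```

Each decision is a single vectorized pass over the history, which is the kind of cheap update behind the speed comparison against full Gaussian process posteriors.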
Optimizing over Multiple Distributions under Generalized Quasar-Convexity Condition
We study a typical optimization model where the optimization variable is composed of multiple probability distributions. Though the model appears frequently in practice, such as in policy problems, it lacks specific analysis in the general setting. For this optimization problem, we propose a new structural condition/landscape description named generalized quasar-convexity (GQC) beyond the realms of convexity. In contrast to original quasar-convexity \citep{hinder2020near}, GQC allows an individual quasar-convex parameter $\gamma_i$ for each variable block $i$, where a smaller $\gamma_i$ implies less block-convexity. To minimize the objective function, we consider a generalized oracle termed the internal function, which includes the standard gradient oracle as a special case.
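For orientation, original quasar-convexity and a schematic per-block generalization can be written as follows (the second display is our paraphrase of the description above, not the paper's exact statement):

```latex
% Quasar-convexity (Hinder et al., 2020): f is \gamma-quasar-convex
% with respect to a minimizer x^* if, for all x,
f(x^*) \;\ge\; f(x) + \frac{1}{\gamma}\,\langle \nabla f(x),\, x^* - x \rangle .
% A per-block generalization in the spirit of GQC assigns each
% variable block x_i its own parameter \gamma_i:
f(x^*) \;\ge\; f(x) + \sum_i \frac{1}{\gamma_i}\,\langle \nabla_{x_i} f(x),\, x_i^* - x_i \rangle .
```

Under this reading, the condition degrades gracefully: blocks with larger $\gamma_i$ contribute weaker lower bounds, consistent with "a smaller $\gamma_i$ implies less block-convexity" above.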
Optimizing the Privacy-Utility Balance using Synthetic Data and Configurable Perturbation Pipelines
Sharma, Anantha, Devabhaktuni, Swetha, Mohan, Eklove
The Banking, Financial Services, and Insurance (BFSI) sector operates on vast volumes of highly sensitive customer data, creating an enduring tension between the drive for data-driven insights and the imperative to comply with strict privacy and security regulations such as GDPR [1] and CCPA [2]. Traditional anonymization methods like masking, aggregation, k-anonymity, L-diversity, and T-closeness often degrade data quality to the point where sophisticated analytics, fraud detection, risk modeling, and machine learning applications suffer significant performance drops. Moreover, these legacy approaches can remain vulnerable to linkage and inference attacks, undermining both privacy guarantees and competitive innovation in financial institutions. The need for advanced techniques that can create privacy-preserving datasets without sacrificing analytical utility is paramount. In response, advanced techniques for creating privacy-preserving datasets have emerged, broadly categorized as purely synthetic data generation and advanced data perturbation. Purely synthetic data, often created using deep generative models (like GANs), aims to capture the statistical patterns of real data without any one-to-one mapping to real individuals. Advanced data perturbation applies carefully calibrated noise, transformations, and privacy-enhancing techniques like differential privacy to original datasets, seeking to obscure sensitive information while retaining analytical value. These methods can include context-aware transformations, where the nature of the data and its intended use inform the perturbation strategy, ensuring that the resulting dataset remains useful for specific tasks. However, the challenge remains to balance privacy and utility effectively. Traditional methods often fail to provide sufficient privacy guarantees or result in datasets that are too noisy for practical use.
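One standard building block of such calibrated-noise perturbation is the Laplace mechanism from differential privacy; a minimal sketch (the per-record application to a hypothetical balances column is illustrative, not the paper's pipeline):

```python
import numpy as np

def laplace_perturb(values, epsilon, sensitivity=1.0, rng=None):
    """Classic Laplace mechanism: add noise with scale
    sensitivity / epsilon, the calibration used to achieve
    epsilon-differential privacy for queries with the given L1
    sensitivity. Smaller epsilon means more noise: stronger privacy,
    lower utility."""
    rng = rng or np.random.default_rng()
    scale = sensitivity / epsilon
    return values + rng.laplace(0.0, scale, size=np.shape(values))

rng = np.random.default_rng(0)
balances = rng.uniform(0, 10000, size=100000)   # hypothetical account data
private = laplace_perturb(balances, epsilon=1.0, sensitivity=1.0, rng=rng)
```

The epsilon knob makes the privacy-utility balance explicitly configurable, which is the trade-off the title refers to; a full pipeline would compose such mechanisms with context-aware transformations and track the cumulative privacy budget.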