AITopics

doi: 10.4204/EPTCS.391.12

2310.02346

Genre: Research Report (0.69)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)

Zelikman, Eric, Lorch, Eliana, Mackey, Lester, Kalai, Adam Tauman

Self-Taught Optimizer (STOP): Recursively Self-Improving Code Generation

arXiv.org Machine LearningOct-3-2023

Several recent advances in AI systems (e.g., Tree-of-Thoughts and Program-Aided Language Models) solve problems by providing a "scaffolding" program that structures multiple calls to language models to generate better outputs. A scaffolding program is written in a programming language such as Python. In this work, we use a language-model-infused scaffolding program to improve itself. We start with a seed "improver" that improves an input program according to a given utility function by querying a language model several times and returning the best solution. We then run this seed improver to improve itself. Across a small set of downstream tasks, the resulting improved improver generates programs with significantly better performance than its seed improver. A variety of self-improvement strategies are proposed by the language model, including beam search, genetic algorithms, and simulated annealing. Since the language models themselves are not altered, this is not full recursive self-improvement. Nonetheless, it demonstrates that a modern language model, GPT-4 in our proof-of-concept experiments, is capable of writing code that can call itself to improve itself. We consider concerns around the development of self-improving technologies and evaluate the frequency with which the generated code bypasses a sandbox. A language model can be queried to optimize virtually any objective describable in natural language. However, a program that makes multiple, structured calls to a language model can often produce outputs with higher objective values (Yao et al., 2022; 2023; Zelikman et al., 2023; Chen et al., 2022). We refer to these as "scaffolding" programs, typically written (by humans) in a programming language such as Python. Our key observation is that, for any distribution over optimization problems and any fixed language model, the design of a scaffolding program is itself an optimization problem. In this work, we introduce the Self-Taught Optimizer (STOP), a method in which code that applies a language model to improve arbitrary solutions is applied recursively to improve itself. Our approach begins with an initial seed'improver' scaffolding program that uses the language model to improve a solution to some downstream task.

large language model, machine learning, natural language, (16 more...)

arXiv.org Machine Learning

2310.02304

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > Germany > North Rhine-Westphalia > Upper Bavaria > Munich (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.36)

EAST: Environment Aware Safe Tracking using Planning and Control Co-Design

Li, Zhichao, Yi, Yinzhuang, Niu, Zhuolin, Atanasov, Nikolay

This paper considers the problem of autonomous robot navigation in unknown environments with moving obstacles. We propose a new method that systematically puts planning, motion prediction and safety metric design together to achieve environmental adaptive and safe navigation. This algorithm balances optimality in travel distance and safety with respect to passing clearance. Robot adapts progress speed adaptively according to the sensed environment, being fast in wide open areas and slow down in narrow passages and taking necessary maneuvers to avoid dangerous incoming obstacles. In our method, directional distance measure, conic-shape motion prediction and custom costmap are integrated properly to evaluate system risk accurately with respect to local geometry of surrounding environments. Using such risk estimation, reference governor technique and control barrier function are worked together to enable adaptive and safe path tracking in dynamical environments. We validate our algorithm extensively both in simulation and challenging real-world environments.

artificial intelligence, obstacle, robot, (15 more...)

2310.01363

Country: North America > United States > California > San Diego County (0.14)

Genre: Research Report (0.64)

Industry:

Transportation (0.68)
Energy > Oil & Gas (0.46)
Information Technology (0.46)
Automobiles & Trucks (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)

Mitra, Purbesh, Ulukus, Sennur

A Learning Based Scheme for Fair Timeliness in Sparse Gossip Networks

We consider a gossip network, consisting of $n$ nodes, which tracks the information at a source. The source updates its information with a Poisson arrival process and also sends updates to the nodes in the network. The nodes themselves can exchange information among themselves to become as timely as possible. However, the network structure is sparse and irregular, i.e., not every node is connected to every other node in the network, rather, the order of connectivity is low, and varies across different nodes. This asymmetry of the network implies that the nodes in the network do not perform equally in terms of timelines. Due to the gossiping nature of the network, some nodes are able to track the source very timely, whereas, some nodes fall behind versions quite often. In this work, we investigate how the rate-constrained source should distribute its update rate across the network to maintain fairness regarding timeliness, i.e., the overall worst case performance of the network can be minimized. Due to the continuous search space for optimum rate allocation, we formulate this problem as a continuum-armed bandit problem and employ Gaussian process based Bayesian optimization to meet a trade-off between exploration and exploitation sequentially.

artificial intelligence, optimization problem, update rate, (17 more...)

2310.01396

Country: North America > United States > Maryland > Prince George's County > College Park (0.14)

Genre: Research Report (0.50)

Industry: Energy > Oil & Gas > Upstream (0.89)

Technology:

Information Technology > Communications (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.49)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.47)

Nature Inspired Evolutionary Swarm Optimizers for Biomedical Image and Signal Processing -- A Systematic Review

Adhikary, Subhrangshu

The challenge of finding a global optimum in a solution search space with limited resources and higher accuracy has given rise to several optimization algorithms. Generally, the gradient-based optimizers converge to the global solution very accurately, but they often require a large number of iterations to find the solution. Researchers took inspiration from different natural phenomena and behaviours of many living organisms to develop algorithms that can solve optimization problems much quicker with high accuracy. These algorithms are called nature-inspired meta-heuristic optimization algorithms. These can be used for denoising signals, updating weights in a deep neural network, and many other cases. In the state-of-the-art, there are no systematic reviews available that have discussed the applications of nature-inspired algorithms on biomedical signal processing. The paper solves that gap by discussing the applications of such algorithms in biomedical signal processing and also provides an updated survey of the application of these algorithms in biomedical image processing. The paper reviews 28 latest peer-reviewed relevant articles and 26 nature-inspired algorithms and segregates them into thoroughly explored, lesser explored and unexplored categories intending to help readers understand the reliability and exploration stage of each of these algorithms.

algorithm, application, optimization algorithm, (16 more...)

2311.1283

Country: Asia > India > West Bengal (0.04)

Genre:

Research Report (1.00)
Overview (0.93)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Diagnostic Medicine (1.00)
(2 more...)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
(3 more...)

Takhanov, Rustem, Tezekbayev, Maxat, Pak, Artur, Bolatov, Arman, Kadyrsizova, Zhibek, Assylbekov, Zhenisbek

Intractability of Learning the Discrete Logarithm with Gradient-Based Methods

The discrete logarithm problem is a fundamental challenge in number theory with significant implications for cryptographic protocols. In this paper, we investigate the limitations of gradient-based methods for learning the parity bit of the discrete logarithm in finite cyclic groups of prime order. Our main result, supported by theoretical analysis and empirical verification, reveals the concentration of the gradient of the loss function around a fixed point, independent of the logarithm's base used. This concentration property leads to a restricted ability to learn the parity bit efficiently using gradient-based methods, irrespective of the complexity of the network architecture being trained. Our proof relies on Boas-Bellman inequality in inner product spaces and it involves establishing approximate orthogonality of discrete logarithm's parity bit functions through the spectral norm of certain matrices. Empirical experiments using a neural network-based approach further verify the limitations of gradient-based learning, demonstrating the decreasing success rate in predicting the parity bit as the group order increases.

discrete logarithm, gradient-based method, parity bit, (13 more...)

2310.01611

Country:

Asia > Kazakhstan > Akmola Region > Astana (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > Indiana > Allen County > Fort Wayne (0.04)
(4 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.82)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Núñez-Molina, Carlos, Asai, Masataro, Fernández-Olivares, Juan, Mesejo, Pablo

On Using Admissible Bounds for Learning Forward Search Heuristics

In recent years, there has been growing interest in utilizing modern machine learning techniques to learn heuristic functions for forward search algorithms. Despite this, there has been little theoretical understanding of \emph{what} they should learn, \emph{how} to train them, and \emph{why} we do so. This lack of understanding has resulted in the adoption of diverse training targets (suboptimal vs optimal costs vs admissible heuristics) and loss functions (e.g., square vs absolute errors) in the literature. In this work, we focus on how to effectively utilize the information provided by admissible heuristics in heuristic learning. We argue that learning from poly-time admissible heuristics by minimizing mean square errors (MSE) is not the correct approach, since its result is merely a noisy, inadmissible copy of an efficiently computable heuristic. Instead, we propose to model the learned heuristic as a truncated gaussian, where admissible heuristics are used not as training targets but as lower bounds of this distribution. This results in a different loss function from the MSE commonly employed in the literature, which implicitly models the learned heuristic as a gaussian distribution. We conduct experiments where both MSE and our novel loss function are applied to learning a heuristic from optimal plan costs. Results show that our proposed method converges faster during training and yields better heuristics, with 40% lower MSE on average.

configuration, heuristic learning, learning, (11 more...)

2308.11905

Country: Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Yan, Yuling, Su, Weijie J., Fan, Jianqing

The Isotonic Mechanism for Exponential Family Estimation

In 2023, the International Conference on Machine Learning (ICML) required authors with multiple submissions to rank their submissions based on perceived quality. In this paper, we aim to employ these author-specified rankings to enhance peer review in machine learning and artificial intelligence conferences by extending the Isotonic Mechanism to exponential family distributions. This mechanism generates adjusted scores that closely align with the original scores while adhering to author-specified rankings. Despite its applicability to a broad spectrum of exponential family distributions, implementing this mechanism does not require knowledge of the specific distribution form. We demonstrate that an author is incentivized to provide accurate rankings when her utility takes the form of a convex additive function of the adjusted review scores. For a certain subclass of exponential family distributions, we prove that the author reports truthfully only if the question involves only pairwise comparisons between her submissions, thus indicating the optimality of ranking in truthful information elicitation. Moreover, we show that the adjusted scores improve dramatically the estimation accuracy compared to the original scores and achieve nearly minimax optimality when the ground-truth scores have bounded total variation. We conclude the paper by presenting experiments conducted on the ICML 2023 ranking data, which show significant estimation gain using the Isotonic Mechanism.

exponential family distribution, isotonic mechanism, review score, (13 more...)

2304.1116

Country:

North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > New York (0.04)
North America > United States > New Jersey > Mercer County > Princeton (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.34)

Remil, Youcef, Bendimerad, Anes, Chambard, Mathieu, Mathonat, Romain, Plantevit, Marc, Kaytoue, Mehdi

Mining Java Memory Errors using Subjective Interesting Subgroups with Hierarchical Targets

arXiv.org Artificial IntelligenceOct-1-2023

Software applications, especially Enterprise Resource Planning (ERP) systems, are crucial to the day-to-day operations of many industries. Therefore, it is essential to maintain these systems effectively using tools that can identify, diagnose, and mitigate their incidents. One promising data-driven approach is the Subgroup Discovery (SD) technique, a data mining method that can automatically mine incident datasets and extract discriminant patterns to identify the root causes of issues. However, current SD solutions have limitations in handling complex target concepts with multiple attributes organized hierarchically. To illustrate this scenario, we examine the case of Java out-of-memory incidents among several possible applications. We have a dataset that describes these incidents, including their context and the types of Java objects occupying memory when it reaches saturation, with these types arranged hierarchically. This scenario inspires us to propose a novel Subgroup Discovery approach that can handle complex target concepts with hierarchies. To achieve this, we design a pattern syntax and a quality measure that ensure the identified subgroups are relevant, non-redundant, and resilient to noise. To achieve the desired quality measure, we use the Subjective Interestingness model that incorporates prior knowledge about the data and promotes patterns that are both informative and surprising relative to that knowledge. We apply this framework to investigate out-of-memory errors and demonstrate its usefulness in incident diagnosis. To validate the effectiveness of our approach and the quality of the identified patterns, we present an empirical study. The source code and data used in the evaluation are publicly accessible, ensuring transparency and reproducibility.

enterprise resource planning, machine learning, pattern recognition, (20 more...)

2310.00781

Country:

Europe > United Kingdom > North Sea > Central North Sea (0.04)
Europe > France > Auvergne-Rhône-Alpes > Lyon > Lyon (0.04)

Genre:

Research Report (0.82)
Overview (0.67)

Technology:

Information Technology > Enterprise Applications > Enterprise Resource Planning (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.68)

arXiv.org Machine LearningOct-1-2023

OKRidge: Scalable Optimal k-Sparse Ridge Regression

Liu, Jiachang, Rosen, Sam, Zhong, Chudi, Rudin, Cynthia

We consider an important problem in scientific discovery, namely identifying sparse governing equations for nonlinear dynamical systems. This involves solving sparse ridge regression problems to provable optimality in order to determine which terms drive the underlying dynamics. We propose a fast algorithm, OKRidge, for sparse ridge regression, using a novel lower bound calculation involving, first, a saddle point formulation, and from there, either solving (i) a linear system or (ii) using an ADMM-based approach, where the proximal operators can be efficiently evaluated by solving another linear system and an isotonic regression problem. We also propose a method to warm-start our solver, which leverages a beam search. Experimentally, our methods attain provable optimality with run times that are orders of magnitude faster than those of the existing MIP formulations solved by the commercial solver Gurobi.

artificial intelligence, machine learning, optimization problem, (12 more...)

arXiv.org Machine Learning

2304.06686

Country: North America > United States > Texas > Travis County > Austin (0.04)

Genre: Research Report (0.81)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.81)