PreNeT: Leveraging Computational Features to Predict Deep Neural Network Training Time

arXiv.org Artificial Intelligence

Training deep learning models, particularly Transformer-based architectures such as Large Language Models (LLMs), demands substantial computational resources and extended training periods. While optimal configuration and infrastructure selection can significantly reduce associated costs, this optimization requires preliminary analysis tools. This paper introduces PreNeT, a novel predictive framework designed to address this optimization challenge. PreNeT facilitates training optimization by integrating comprehensive computational metrics, including layer-specific parameters, arithmetic operations and memory utilization. A key feature of PreNeT is its capacity to accurately predict training duration on previously unexamined hardware infrastructures, including novel accelerator architectures. This framework employs a sophisticated approach to capture and analyze the distinct characteristics of various neural network layers, thereby enhancing existing prediction methodologies. Through proactive implementation of PreNeT, researchers and practitioners can determine optimal configurations, parameter settings, and hardware specifications to maximize cost-efficiency and minimize training duration. Experimental results demonstrate that PreNeT achieves up to 72% improvement in prediction accuracy compared to contemporary state-of-the-art frameworks.


Learning to Classify Galaxy Shapes Using the EM Algorithm

Neural Information Processing Systems

The field of astronomy is increasingly data-driven as new observing instruments permit the rapid collection of massive archives of sky image data. In this paper we investigate the problem of identifying bent-double radio galaxies in the FIRST (Faint Images of the Radio Sky at Twenty-cm) Survey data set [1]. FIRST produces large numbers of radio images of the deep sky using the Very Large Array at the National Radio Astronomy Observatory. It is scheduled to cover more than 10,000 square degrees of the northern and southern caps (skies). Of particular scientific interest to astronomers is the identification and cataloging of sky objects with a "bent-double" morphology, indicating clusters of galaxies ([8], see Figure 1). Because the number of observed deep-sky radio sources is very large (on the order of 10^6 so far), it is infeasible for the astronomers to label all of them manually. The data from the FIRST Survey (http://sundog.stsci.edu/) is available in both raw image format and in the form of a catalog of features that have been automatically derived from the raw images by an image analysis program [8]. Each entry corresponds to a single detectable "blob" of bright intensity relative to the sky background: these entries are called


Towards Modeling Human Attention from Eye Movements for Neural Source Code Summarization

arXiv.org Artificial Intelligence

These descriptions are called "summaries" and are a key component of software documentation for programmers. A programmer may read a short summary like "takes a screenshot" to quickly understand what a section of code does, without resorting to reading the source code. Despite the usefulness of these summaries, programmers often neglect to write or update them. The result is that automatic source code summarization has long been an appetizing target in software engineering research. The scientific community has long sought to enable machines to understand code in the way people do, so that those machines can describe code like a person would. A confluence of recent advances in both software engineering and machine learning research is bearing fruit, such that automated code summarization seems almost within reach. In particular, neural source code summarization has held the vanguard of the state of the art since around 2017. Neural code summarization refers to approaches based on neural networks, namely the encoder-decoder architecture [61].


Ergodic Annealing

arXiv.org Artificial Intelligence

Recent years and events have led to a massive development of content-oriented cloud services. The most popular and voluminous content offered in today's networks is video, which must be delivered efficiently to end customers. The objective of the service provider (root) is to optimize the delivery of content to its customers (terminals). In this optimization problem the cost is usually assumed to be known (left graph). Yet in reality it is often unknown because it depends on many stochastic factors, such as the traffic on the network, the level of demand, and so on (right graph). Figure 1: Graphical representation of networks where information travels from a root to a set of terminals over channels with known or unknown cost.


Computer-Aided Algorithm Design: Automated Tuning, Configuration, Selection, and Beyond

AAAI Conferences

In this talk, I will introduce computer-aided algorithm design and discuss its main ingredients: design patterns, which provide ways of structuring potentially large spaces of candidate algorithms, and meta-algorithmic optimisation procedures, which are used for finding good designs within these spaces. After explaining how this algorithm design approach differs from and complements related approaches in program synthesis, genetic programming and so-called hyperheuristics, I will illustrate its success using examples from our own work in SAT-based software verification (Hutter et al. 2007), timetabling (Chiarandini, Fawcett, and Hoos 2008) and mixed integer programming (Hutter, Hoos, and Leyton-Brown 2010). Furthermore, I will argue why this approach can be expected to be particularly useful and effective for building better solvers for rich and diverse classes of combinatorial problems, such as planning and scheduling. Finally, I will outline how programming by optimisation (a design paradigm that emphasises the automated construction of performance-optimised algorithms by means of searching large spaces of alternative designs) has the potential to transform the design of high-performance algorithms from a craft that is based primarily on experience and intuition into a principled and highly effective engineering effort.


Comparing Beliefs, Surveys, and Random Walks

Neural Information Processing Systems

Survey propagation is a powerful technique from statistical physics that has been applied to solve the 3-SAT problem both in principle and in practice. We give, using only probability arguments, a common derivation of survey propagation, belief propagation and several interesting hybrid methods. We then present numerical experiments which use WSAT (a widely used random-walk based SAT solver) to quantify the complexity of the 3-SAT formulae as a function of their parameters, both as randomly generated and after simplification, guided by survey propagation. Some properties of WSAT which have not previously been reported make it an ideal tool for this purpose - its mean cost is proportional to the number of variables in the formula (at a fixed ratio of clauses to variables) in the easy-SAT regime and slightly beyond, and its behavior in the hard-SAT regime appears to reflect the underlying structure of the solution space that has been predicted by replica symmetry-breaking arguments. An analysis of the tradeoffs between the various methods of search for satisfying assignments shows WSAT to be far more powerful than has been appreciated, and suggests some interesting new directions for practical algorithm development.
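
To make the idea of a random-walk SAT solver concrete, here is a minimal sketch of a WalkSAT-style search. It is a simplified textbook variant, not the WSAT implementation used in the paper: the noise probability, the flip budget, the greedy tie-breaking and all function names are assumptions made for illustration. Clauses are tuples of signed integers, where +i stands for variable i and -i for its negation.

```python
import random


def unsat_clauses(formula, assignment):
    """Return the clauses that the assignment leaves unsatisfied."""
    return [
        clause for clause in formula
        if not any(assignment[abs(lit)] == (lit > 0) for lit in clause)
    ]


def walksat(formula, n_vars, max_flips=10_000, noise=0.5, seed=0):
    """Random-walk search; returns (assignment, flips) or (None, flips)."""
    rng = random.Random(seed)
    assignment = {v: rng.random() < 0.5 for v in range(1, n_vars + 1)}
    flips = 0
    while True:
        unsat = unsat_clauses(formula, assignment)
        if not unsat:
            return assignment, flips
        if flips == max_flips:
            return None, flips  # budget exhausted (assumed stopping rule)

        clause = rng.choice(unsat)  # focus on one violated clause
        if rng.random() < noise:
            # Random-walk move: flip any variable of the clause.
            var = abs(rng.choice(clause))
        else:
            # Greedy move: flip the variable leaving the fewest
            # unsatisfied clauses afterwards.
            def cost(v):
                assignment[v] = not assignment[v]
                n_unsat = len(unsat_clauses(formula, assignment))
                assignment[v] = not assignment[v]
                return n_unsat

            var = min((abs(lit) for lit in clause), key=cost)

        assignment[var] = not assignment[var]
        flips += 1


# Small satisfiable example over three variables.
example = [(1, 2, 3), (-1, 2, -3), (1, -2, 3), (-1, -2, 3)]
solution, used = walksat(example, n_vars=3)
print(solution is not None, used)
```

The number of flips returned is the natural cost measure for such a solver, which is the kind of quantity the abstract refers to when it discusses mean cost as a function of the number of variables.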


Comparing Beliefs, Surveys, and Random Walks

Neural Information Processing Systems

It consists of an ensemble of randomly generated logical expressions, each depending on N Boolean variables x_i, and constructed by taking the AND of M clauses. Each clause a consists of the OR of 3 "literals" y_{i,a}.
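
The ensemble described above can be sketched in a few lines. The snippet does not say how the variables and negations are drawn, so the choices here (three distinct variables per clause, each negated with probability 1/2) are assumptions, as are the function names. Literal +i denotes x_i and -i denotes NOT x_i.

```python
import random


def random_3sat(n_vars, n_clauses, rng):
    """Draw one formula: a list of M clauses, each a tuple of 3 literals."""
    formula = []
    for _ in range(n_clauses):
        # Assumption: the 3 variables in a clause are distinct.
        variables = rng.sample(range(1, n_vars + 1), 3)
        # Assumption: each literal is negated with probability 1/2.
        clause = tuple(v if rng.random() < 0.5 else -v for v in variables)
        formula.append(clause)
    return formula


def satisfies(formula, assignment):
    """True if every clause (an OR of literals) holds; the formula is their AND."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in formula
    )


rng = random.Random(0)
n_vars, n_clauses = 20, 60  # clause-to-variable ratio M/N = 3
formula = random_3sat(n_vars, n_clauses, rng)
assignment = {i: rng.random() < 0.5 for i in range(1, n_vars + 1)}
print(len(formula), satisfies(formula, assignment))
```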


Learning to Classify Galaxy Shapes Using the EM Algorithm

Neural Information Processing Systems

We describe the application of probabilistic model-based learning to the problem of automatically identifying classes of galaxies, based on both morphological and pixel intensity characteristics. The EM algorithm can be used to learn how to spatially orient a set of galaxies so that they are geometrically aligned. We augment this "ordering-model" with a mixture model on objects, and demonstrate how classes of galaxies can be learned in an unsupervised manner using a two-level EM algorithm. The resulting models provide highly accurate classification of galaxies in cross-validation experiments.
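
As a reminder of how EM learns classes without labels, the sketch below fits a plain two-component Gaussian mixture to one-dimensional synthetic data. It is not the paper's two-level model and does not include the orientation ("ordering") step; the data, the initialisation and all names are assumptions chosen only to show the alternating E-step and M-step.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def em_two_gaussians(data, n_iter=50):
    """Fit a 2-component 1-D Gaussian mixture; returns (weights, means, variances)."""
    # Crude initialisation (assumption): extreme points as the two means.
    means = [min(data), max(data)]
    variances = [1.0, 1.0]
    weights = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each point.
        resp = []
        for x in data:
            p = [weights[k] * normal_pdf(x, means[k], variances[k]) for k in range(2)]
            total = sum(p)
            resp.append([pk / total for pk in p])
        # M-step: re-estimate parameters from the soft assignments.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / len(data)
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(var, 1e-6)  # guard against collapse
    return weights, means, variances


rng = random.Random(1)
data = [rng.gauss(0.0, 1.0) for _ in range(200)] + [rng.gauss(5.0, 1.0) for _ in range(200)]
weights, means, variances = em_two_gaussians(data)
print(weights, means, variances)
```

Each point's class is then read off from the component with the larger posterior, which is the unsupervised counterpart of the classification step mentioned in the abstract.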

