Optuna is an automatic hyperparameter optimization software framework, particularly designed for machine learning. It features an imperative, define-by-run style user API. Thanks to the define-by-run API, code written with Optuna is highly modular, and users can dynamically construct the search spaces for their hyperparameters. Early adopters may want to upgrade and provide feedback for a smoother transition to the coming full release. You can install a pre-release version with pip install -U --pre optuna.
With mathematical optimization, companies can capture the key features of their business problems in an optimization model and generate optimal solutions (which are used as the basis for making optimal decisions). Data scientists with some basic mathematical programming skills can easily learn how to build, implement, and maintain mathematical optimization applications. The Gurobi Python API borrows ideas from modeling languages, enabling users to deploy and solve mathematical optimization models with scripts that are easy to write, read, and maintain. Such scripts can even be embedded in decision support systems for production-ready applications.
The influence maximization paradigm has been used by researchers in various fields to study how information spreads in social networks. While the focus was previously mostly on efficiency, more recently fairness issues have been taken into account in this scope. In the present paper, we propose to use randomization as a means of achieving fairness. While this general idea is not new, it has not been applied in this area. Similar to previous works such as Fish et al. (WWW '19) and Tsang et al. (IJCAI '19), we study the maximin criterion for (group) fairness. In contrast to their work, however, we model the problem in such a way that, when choosing the seed sets, probabilistic strategies are possible rather than only deterministic ones. We introduce two variants of this probabilistic problem: one that entails probabilistic strategies over nodes (the node-based problem) and one that entails probabilistic strategies over sets of nodes (the set-based problem). After analyzing the relation between the two probabilistic problems, we show that, while the original deterministic maximin problem is inapproximable, both probabilistic variants permit approximation algorithms that achieve a constant multiplicative factor of 1 − 1/e, minus an additive, arbitrarily small error due to the simulation of the information spread. For the node-based problem, the approximation is achieved by observing that a polynomial-sized linear program approximates the problem well. For the set-based problem, we show that a multiplicative-weight routine can yield the approximation result. For an experimental study, we provide implementations of multiplicative-weight routines for both the set-based and the node-based problems and compare the achieved fairness values to existing methods.
Perhaps unsurprisingly, we show that the ex-ante values of the computed probabilistic strategies, i.e., the minimum expected probability that an individual (or group) obtains the information, are significantly larger than the (ex-post) fairness values of previous methods. This indicates that studying fairness via randomization is a worthwhile path to follow. Interestingly, and perhaps more surprisingly, we observe that even the ex-post fairness values, i.e., the fairness values of sets sampled according to the probabilistic strategies computed by our routines, dominate the fairness achieved by previous methods on many of the instances tested.
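To illustrate the general flavor of a multiplicative-weight routine for a maximin problem (a generic sketch with made-up payoff values, not the paper's implementation): keep weights over groups, repeatedly pick the seed set that is best for the current weighting, and multiplicatively up-weight poorly covered groups; the empirical frequencies of the chosen sets form the probabilistic strategy.

```python
import numpy as np

# A[g, s]: (simulated) probability that group g is reached when candidate
# seed set s is chosen. These payoff values are invented for illustration.
A = np.array([[0.9, 0.1],
              [0.1, 0.9],
              [0.5, 0.5]])
n_groups, n_sets = A.shape

eta, T = 0.1, 2000
w = np.ones(n_groups)       # adversary weights over groups
counts = np.zeros(n_sets)   # how often each set is the best response

for _ in range(T):
    p = w / w.sum()
    s = int(np.argmax(p @ A))     # set with highest weighted coverage
    counts[s] += 1
    w *= np.exp(-eta * A[:, s])   # up-weight poorly covered groups

strategy = counts / T             # mixed strategy over seed sets
print(strategy, (A @ strategy).min())  # ex-ante maximin value
```

Here no single set covers every group well (the pure-strategy maximin value is 0.1), but the mixed strategy approaches the ex-ante optimum of 0.5, which is exactly the gap between ex-post and ex-ante fairness the abstract describes.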
Ozaki, Yoshihiko | Tanigaki, Yuki (National Institute of Advanced Industrial Science and Technology) | Watanabe, Shuhei (University of Freiburg) | Nomura, Masahiro (CyberAgent, Inc.) | Onishi, Masaki (National Institute of Advanced Industrial Science and Technology)
Practitioners often encounter challenging real-world problems that involve a simultaneous optimization of multiple objectives in a complex search space. To address these problems, we propose a practical multiobjective Bayesian optimization algorithm, the Multiobjective Tree-structured Parzen Estimator (MOTPE), an extension of the widely used Tree-structured Parzen Estimator (TPE). Through numerical results, we demonstrate that MOTPE approximates the Pareto fronts of a variety of benchmark problems, as well as of a convolutional neural network design problem, better than existing methods. Based on empirical results, we also investigate how the configuration of MOTPE affects its behavior and performance, and how effective asynchronous parallelization of the method is.
Mathematical optimization is the process of finding the best set of inputs that maximizes (or minimizes) the output of a function. In the field of optimization, the function being optimized is called the objective function. A wide range of out-of-the-box tools exist for solving optimization problems, but many only work with well-behaved functions. The classic example of a well-behaved function is a convex function, which contains a single optimum: a single minimum (its mirror image, a concave function, contains a single maximum). Here a function can be thought of as a surface with a single valley (minimum) or a single hill (maximum).
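For a convex function, a local solver is guaranteed to land in the one valley; SciPy's minimize_scalar on an arbitrary quadratic (chosen here just for illustration) makes the point:

```python
from scipy.optimize import minimize_scalar

# f(x) = (x - 3)^2 is convex: one valley, at x = 3, and no other optima,
# so a local method finds the global minimum.
result = minimize_scalar(lambda x: (x - 3.0) ** 2)
print(result.x)  # ≈ 3.0
```

With a non-convex objective (multiple valleys), the same solver could stop in whichever valley it reaches first, which is why the "well-behaved" caveat matters.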
A computer-aided process planning system should ideally generate and optimize process plans to ensure the application of good manufacturing practices and maintain the consistency of the desired functional specifications of a part during its production processes. Crucial processes, such as selecting machining resources, determining set-up plans and sequencing operations of a part should be considered simultaneously to achieve global optimal solutions. In this paper, these processes are integrated and modelled as a constraint-based optimization problem, and a tabu search-based approach is proposed to solve it effectively. In the optimization model, costs of the utilized machines and cutting tools, machine changes, tool changes, set-ups and departure from good manufacturing practices (penalty function) are the optimization evaluation criteria. Precedence constraints from the geometric and manufacturing interactions between features and their related operations in a part are defined and classified according to their effects on the plan feasibility and processing quality.
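As a rough, generic illustration of the tabu-search idea (not the paper's machining-cost model): keep a short-term memory of recent moves, move to the best non-tabu neighbor each iteration, and allow a tabu move only when it beats the best solution found so far (the aspiration criterion). The toy cost below simply counts out-of-order pairs in an operation sequence.

```python
import random

def cost(seq):
    # Toy objective: number of precedence violations (inversions).
    return sum(1 for i in range(len(seq)) for j in range(i + 1, len(seq))
               if seq[i] > seq[j])

def tabu_search(n=8, iters=200, tenure=5, seed=0):
    rng = random.Random(seed)
    current = list(range(n))
    rng.shuffle(current)
    best, best_cost = current[:], cost(current)
    tabu = {}  # move -> iteration until which the move is tabu
    for it in range(iters):
        candidates = []
        for i in range(n - 1):          # neighborhood: adjacent swaps
            move = (i, i + 1)
            nb = current[:]
            nb[i], nb[i + 1] = nb[i + 1], nb[i]
            c = cost(nb)
            # Aspiration: a tabu move is allowed if it improves on the best.
            if tabu.get(move, -1) < it or c < best_cost:
                candidates.append((c, move, nb))
        c, move, current = min(candidates)
        tabu[move] = it + tenure
        if c < best_cost:
            best, best_cost = current[:], c
    return best, best_cost

print(tabu_search())  # reaches the feasible ordering with cost 0
```

In a real process-planning model the moves would reorder or reassign operations, and the cost would combine machine, tool, set-up, and penalty terms as in the paper.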
In order to define an optimization problem, you need three things: variables, constraints, and an objective. The variables can take different values; the solver will try to find the best values for them. Constraints are things that are not allowed, or boundaries; by setting these correctly you make sure the solution you find is one you can actually use in real life. The objective is the goal you have in the optimization problem: what you want to maximize or minimize. If it's not completely clear by now, here is a more thorough introduction.
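A minimal example with invented numbers shows all three pieces: two variables x and y, constraints x + y ≤ 10 and x ≤ 6, and an objective 20x + 30y to maximize. SciPy's linprog minimizes, so the objective coefficients are negated:

```python
from scipy.optimize import linprog

# Maximize 20x + 30y  subject to  x + y <= 10,  x <= 6,  x, y >= 0.
res = linprog(
    c=[-20, -30],                 # negated: linprog minimizes
    A_ub=[[1, 1], [1, 0]],        # constraint matrix
    b_ub=[10, 6],                 # constraint right-hand sides
    bounds=[(0, None), (0, None)],
)
print(res.x, -res.fun)  # optimal variables and (un-negated) objective
```

Since y earns more per unit and nothing limits it except x + y ≤ 10, the solver pushes everything into y: x = 0, y = 10, objective 300.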
My career as a practitioner and researcher in the data science space has spanned more than 30 years, and during that time I have seen a lot of new advanced analytics technologies – touted as "the latest and greatest," "cutting-edge," "game-changing," or some similar superlative – sizzle and then fizzle. The hype cycles (as Gartner calls them) of these technologies were short, as they failed to deliver real-world business impact and attain long-term commercial viability. One advanced analytics technology that bucks that trend and has been around ever since I entered the professional arena in the early 1990s (and actually long before that, with the introduction of linear programming in the 1940s) is mathematical optimization. For decades, mathematical optimization has been widely used by companies of all sizes and stripes to address their complex business problems. The secret to mathematical optimization's staying power is that it has consistently demonstrated that it is capable of generating optimal solutions to large-scale, real-world business problems – and has thereby produced significant business value.
As for the so-called "new transformer engine," it turns out this is the term NVIDIA uses to refer to "a combination of software and custom NVIDIA Hopper Tensor Core technology designed specifically to accelerate transformer model training and inference." NVIDIA notes that the transformer engine intelligently manages and dynamically chooses between FP8 and 16-bit calculations, automatically handling re-casting and scaling between FP8 and 16-bit in each layer to deliver up to 9x faster AI training and up to 30x faster AI inference on large language models compared to the prior-generation A100. So while this is not a radical redesign, the combination of performance and efficiency improvements results in a 6x speedup compared to Ampere, as NVIDIA's technical blog elaborates. NVIDIA's focus on improving performance for transformer models is not at all misplaced. Transformer models are the backbone of language models used widely today, such as BERT and GPT-3. Initially developed for natural language processing use cases, their versatility is increasingly being applied to computer vision, drug discovery, and more, as we have been documenting in our State of AI coverage. According to a metric shared by NVIDIA, 70% of published AI research in the last two years is based on transformers.
"Have you ever ordered 43 Chicken McNuggets at McDonald's?" When I first heard this story it completely blew my mind, but it is actually true that Chicken McNuggets have a mathematical story, and it is a pretty interesting one. Originally, McNuggets came only in boxes of 6, 9, and 20. While eating with his son, the mathematician Henri Picciotto started to think about which quantities he could actually order with a combination of these three box sizes. Quantities that can be ordered this way are known as McNugget numbers, and 43 happens to be the largest number that is not one. We will start by describing in depth the discrete domain that we are considering, and we will end by solving an optimization problem about it.
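Which quantities are orderable can be checked with a short dynamic program over the box sizes from the story (6, 9, and 20); the search limit of 100 is arbitrary but comfortably past the answer:

```python
def reachable_quantities(limit=100, sizes=(6, 9, 20)):
    # reachable[n] is True iff n nuggets can be ordered exactly.
    reachable = [False] * (limit + 1)
    reachable[0] = True  # ordering nothing is trivially possible
    for n in range(1, limit + 1):
        reachable[n] = any(n >= s and reachable[n - s] for s in sizes)
    return reachable

reachable = reachable_quantities()
largest_impossible = max(n for n, ok in enumerate(reachable) if not ok)
print(largest_impossible)  # 43
```

This is the Frobenius number of {6, 9, 20}: every quantity from 44 upward can be ordered, but 43 cannot, which is exactly the punchline of the opening question.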