optimization problem


TechBytes with Vanya Cohen, Machine Learning Engineer at Luminoso

#artificialintelligence

Growing up in Seattle, I was exposed to tech at a pretty young age. Most of my friends' parents worked for Microsoft. I spent a lot of my free time working on little coding projects, and even started my own business developing video game mods in high school. Movies like 2001: A Space Odyssey captured my imagination and gave me a sense that AI was going to be an important part of the future, even if it seemed distant at the time. But I really wanted to get involved.


Google Open Sources TFCO to Help Build Fair Machine Learning Models

#artificialintelligence

Fairness is a highly subjective concept, and that is no different when it comes to machine learning. We typically feel that the referees are "unfair" to our favorite team when they lose a close match, or that any outcome is perfectly "fair" when it goes our way. Given that machine learning models cannot rely on subjectivity, we need an efficient way to quantify fairness. A lot of research has been done in this area, mostly framing fairness as an outcome optimization problem. Recently, Google AI research open sourced the TensorFlow Constrained Optimization Library (TFCO), an optimization framework that can be used to optimize different objectives of a machine learning model, including fairness.
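
For a concrete sense of what "framing fairness as a constrained optimization problem" means in practice, here is a minimal, self-contained TensorFlow sketch. It is not the TFCO API: it hand-rolls a single Lagrange multiplier to keep the gap in positive-prediction rates between two synthetic groups below a tolerance eps, which is the kind of rate constraint TFCO lets you state declaratively. The toy data, the group attribute, and the tolerance are all assumptions made for illustration.

```python
# Illustrative sketch only (not the TFCO API): a Lagrangian treatment of a
# fairness-style rate constraint on a toy logistic model.
import tensorflow as tf

tf.random.set_seed(0)
n, d = 512, 5
x = tf.random.normal([n, d])                              # synthetic features (assumption)
y = tf.cast(tf.reduce_sum(x, axis=1) > 0, tf.float32)     # synthetic labels (assumption)
group = tf.cast(x[:, 0] > 0, tf.float32)                  # made-up binary group attribute
eps = 0.05                                                # allowed rate gap (assumption)

w = tf.Variable(tf.zeros([d, 1]))
b = tf.Variable(0.0)
lam = tf.Variable(0.0)                                    # Lagrange multiplier
opt = tf.keras.optimizers.SGD(learning_rate=0.1)

def loss_and_gap():
    p = tf.sigmoid(tf.squeeze(x @ w, axis=1) + b)         # soft predictions in [0, 1]
    loss = -tf.reduce_mean(y * tf.math.log(p + 1e-7) +
                           (1.0 - y) * tf.math.log(1.0 - p + 1e-7))
    rate_a = tf.reduce_sum(p * group) / tf.reduce_sum(group)
    rate_b = tf.reduce_sum(p * (1.0 - group)) / tf.reduce_sum(1.0 - group)
    return loss, tf.abs(rate_a - rate_b)                  # objective and constraint value

for _ in range(200):
    with tf.GradientTape() as tape:
        loss, gap = loss_and_gap()
        lagrangian = loss + lam * (gap - eps)             # penalize constraint violation
    grads = tape.gradient(lagrangian, [w, b])
    opt.apply_gradients(zip(grads, [w, b]))
    _, gap = loss_and_gap()
    # gradient ascent on the multiplier, projected to stay non-negative
    lam.assign(tf.maximum(0.0, lam + 0.5 * (gap - eps)))
```

TFCO packages this pattern up so that objectives and constraints are declared as rates (error rate, false positive rate, and so on) and the alternating optimization is handled by the library.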



PhD Studentship on Artificial Intelligence for Railway Operations and Management

#artificialintelligence

Defining the roadmaps for Artificial Intelligence applications for railway operations and network management. Applications are invited for a PhD studentship in innovative approaches in artificial intelligence for railway scheduling and operations, to be based in the Institute for Transport Studies at the University of Leeds. The position is an opportunity to conduct cutting-edge research at the intersection of railway scheduling and artificial intelligence techniques such as machine learning and neural networks. The overall objective of the PhD research project is to investigate the potential of Artificial Intelligence (AI) in the rail sector and to contribute to the definition of roadmaps for future research in operational intelligence and network management. In particular, the student will develop and compare different AI approaches, e.g. machine learning, deep learning, and reinforcement learning, for railway traffic planning and management. He or she will have the chance to investigate using AI to solve combinatorial optimization problems and to support optimization models, with a special focus on optimization models for railway operations and management.


How generative design could reshape the future of product development

#artificialintelligence

Most product-development tasks are complex optimization problems. Design teams approach them iteratively, refining an initial best guess through rounds of engineering analysis, interpretation, and refinement. But each such iteration takes time and money, and teams may achieve only a handful of iterations within the development timeline. Because teams rarely have the opportunity to explore alternative solutions that depart significantly from their base-case assumptions, too often the final design is suboptimal. Today's technology offers an alternative.



Policy Optimization via Importance Sampling

Neural Information Processing Systems

Policy optimization is an effective reinforcement learning approach for solving continuous control tasks. Recent achievements have shown that alternating online and offline optimization is a successful choice for efficient trajectory reuse. However, deciding when to stop optimizing and collect new trajectories is non-trivial, as it requires accounting for the variance of the objective function estimate. In this paper, we propose a novel, model-free policy search algorithm, POIS, applicable in both action-based and parameter-based settings. We first derive a high-confidence bound for importance sampling estimation; then we define a surrogate objective function, which is optimized offline whenever a new batch of trajectories is collected.
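
The core estimator behind this kind of trajectory reuse is straightforward to write down: returns collected under a behavior policy are re-weighted by per-trajectory likelihood ratios under the target policy. The NumPy sketch below shows that plain importance-sampling estimate only; it is not the authors' POIS code, it omits the high-confidence bound and the surrogate objective, and logp_target, logp_behavior, and returns are assumed inputs.

```python
# Plain importance-sampling estimate of expected return (illustration only).
import numpy as np

def is_objective(logp_target, logp_behavior, returns):
    """logp_target:   (N, T) log pi_theta(a_t | s_t) for N trajectories of length T
    logp_behavior: (N, T) log probabilities under the data-collecting policy
    returns:       (N,)   total return of each trajectory"""
    log_w = np.sum(logp_target - logp_behavior, axis=1)  # per-trajectory log ratio
    w = np.exp(log_w)                                    # importance weights
    return np.mean(w * returns), w

# Toy usage with random inputs standing in for real rollouts.
rng = np.random.default_rng(0)
logp_t = rng.normal(-1.0, 0.1, size=(64, 20))
logp_b = rng.normal(-1.0, 0.1, size=(64, 20))
R = rng.normal(1.0, 0.5, size=64)
estimate, weights = is_objective(logp_t, logp_b, R)
```

Because the variance of these weights grows as the target policy drifts away from the behavior policy, a method like POIS penalizes that divergence in its surrogate objective rather than optimizing the raw estimate directly.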


Preconditioned Spectral Descent for Deep Learning

Neural Information Processing Systems

Deep learning presents notorious computational challenges. These challenges include, but are not limited to, the non-convexity of learning objectives and estimating the quantities needed for optimization algorithms, such as gradients. While we do not address the non-convexity, we present an optimization solution that exploits the so far unused "geometry" in the objective function in order to best make use of the estimated gradients. Previous work attempted similar goals with preconditioned methods in the Euclidean space, such as L-BFGS, RMSprop, and AdaGrad. In stark contrast, our approach combines a non-Euclidean gradient method with preconditioning.
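
To make the contrast concrete, here is a short NumPy sketch, an illustration rather than the paper's update rule. One common formalization of a spectral (non-Euclidean) step is steepest descent with respect to the spectral norm, whose direction is U V^T from the gradient's SVD scaled by the dual (nuclear) norm; the second function is a familiar RMSprop-style diagonally preconditioned Euclidean step for comparison.

```python
# Illustration only: a spectral-norm steepest-descent step vs. an RMSprop-style
# diagonally preconditioned Euclidean step for a weight matrix with gradient G.
import numpy as np

def spectral_step(G, lr=0.1):
    # Steepest descent w.r.t. the spectral norm: direction U @ Vt from the SVD
    # of the gradient, scaled by its dual (nuclear) norm.
    U, s, Vt = np.linalg.svd(G, full_matrices=False)
    return -lr * np.sum(s) * (U @ Vt)

def rmsprop_step(G, v, lr=0.1, beta=0.9, eps=1e-8):
    # Euclidean step rescaled elementwise by a running estimate of the squared gradient.
    v = beta * v + (1.0 - beta) * G ** 2
    return -lr * G / (np.sqrt(v) + eps), v

# Toy usage on a random gradient matrix.
G = np.random.default_rng(0).normal(size=(4, 3))
delta_spectral = spectral_step(G)
delta_rmsprop, v = rmsprop_step(G, v=np.zeros_like(G))
```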


Learning Chordal Markov Networks via Branch and Bound

Neural Information Processing Systems

We present a new algorithmic approach for the task of finding a chordal Markov network structure that maximizes a given scoring function. The algorithm is based on branch and bound and integrates dynamic programming both for domain pruning and for obtaining strong bounds for search-space pruning. Empirically, we show that the approach dominates, in terms of running time, a recent integer programming approach (and thereby also a recent constraint optimization approach) for the problem.
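
As a rough picture of the search strategy (not the paper's algorithm, which additionally integrates dynamic programming for domain pruning and for computing strong bounds), here is a generic best-first branch-and-bound skeleton; expand, upper_bound, is_complete, and score are placeholders the caller supplies.

```python
# Generic best-first branch and bound for a maximization problem (illustration only).
import heapq

def branch_and_bound(root, expand, upper_bound, is_complete, score):
    best_val, best_sol = float("-inf"), None
    frontier = [(-upper_bound(root), 0, root)]   # max-heap via negated bounds
    counter = 1                                  # tie-breaker so nodes are never compared
    while frontier:
        neg_bound, _, node = heapq.heappop(frontier)
        if -neg_bound <= best_val:
            continue                             # prune: bound cannot beat the incumbent
        if is_complete(node):
            if score(node) > best_val:           # new incumbent solution
                best_val, best_sol = score(node), node
            continue
        for child in expand(node):
            if upper_bound(child) > best_val:    # keep only potentially improving branches
                heapq.heappush(frontier, (-upper_bound(child), counter, child))
                counter += 1
    return best_sol, best_val
```

The quality of the upper bounds determines how aggressively branches can be pruned, which is where the dynamic-programming bounds described in the abstract come in.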


Optimistic Linear Programming gives Logarithmic Regret for Irreducible MDPs

Neural Information Processing Systems

We present an algorithm called Optimistic Linear Programming (OLP) for learning to optimize average reward in an irreducible but otherwise unknown Markov decision process (MDP). OLP uses its experience so far to estimate the MDP. It chooses actions by optimistically maximizing estimated future rewards over a set of next-state transition probabilities that are close to the estimates: a computation that corresponds to solving linear programs. We show that the total expected reward obtained by OLP up to time $T$ is within $C(P)\log T$ of the reward obtained by the optimal policy, where $C(P)$ is an explicit, MDP-dependent constant. OLP is closely related to an algorithm proposed by Burnetas and Katehakis, with four key differences: OLP is simpler, it does not require knowledge of the supports of transition probabilities, and the proof of the regret bound is simpler, but our regret bound is a constant factor larger than the regret of their algorithm.
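
The "computation that corresponds to solving linear programs" can be illustrated with a small, hypothetical inner step: choose the most optimistic next-state distribution p within an L1 ball of radius delta around the empirical estimate p_hat so as to maximize an estimated future value vector u. The SciPy sketch below is only an illustration of that idea, not the authors' code, and p_hat, u, and delta are made-up inputs.

```python
# Optimistic choice of a next-state distribution as a small linear program
# (illustration only; auxiliary variables t_i encode |p_i - p_hat_i|).
import numpy as np
from scipy.optimize import linprog

def optimistic_next_state_dist(p_hat, u, delta):
    S = len(p_hat)
    # Decision vector z = [p (S entries), t (S entries)]; maximize u @ p.
    c = np.concatenate([-u, np.zeros(S)])                        # linprog minimizes, so negate u
    A_eq = np.concatenate([np.ones(S), np.zeros(S)])[None, :]    # sum(p) == 1
    b_eq = np.array([1.0])
    I = np.eye(S)
    A_ub = np.vstack([
        np.hstack([I, -I]),                                      #  p - p_hat <= t
        np.hstack([-I, -I]),                                     #  p_hat - p <= t
        np.concatenate([np.zeros(S), np.ones(S)])[None, :],      #  sum(t) <= delta
    ])
    b_ub = np.concatenate([p_hat, -p_hat, [delta]])
    bounds = [(0, 1)] * S + [(0, None)] * S
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x[:S]

# Toy usage: the optimizer shifts probability mass toward the high-value state.
p = optimistic_next_state_dist(np.array([0.5, 0.3, 0.2]),
                               np.array([1.0, 0.0, 2.0]), delta=0.2)
```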