Multi-armed bandits a simple but very powerful framework for algorithms that make decisions over time under uncertainty. An enormous body of work has accumulated over the years, covered in several books and surveys. This book provides a more introductory, textbook-like treatment of the subject. Each chapter tackles a particular line of work, providing a self-contained, teachable technical introduction and a review of the more advanced results. The chapters are as follows: Stochastic bandits; Lower bounds; Bayesian Bandits and Thompson Sampling; Lipschitz Bandits; Full Feedback and Adversarial Costs; Adversarial Bandits; Linear Costs and Semi-bandits; Contextual Bandits; Bandits and Zero-Sum Games; Bandits with Knapsacks; Incentivized Exploration and Connections to Mechanism Design.
Life's most valuable asset is health. Continuously understanding the state of our health and modeling how it evolves is essential if we wish to improve it. Given the opportunity that people live with more data about their life today than any other time in history, the challenge rests in interweaving this data with the growing body of knowledge to compute and model the health state of an individual continually. This dissertation presents an approach to build a personal model and dynamically estimate the health state of an individual by fusing multi-modal data and domain knowledge. The system is stitched together from four essential abstraction elements: 1. the events in our life, 2. the layers of our biological systems (from molecular to an organism), 3. the functional utilities that arise from biological underpinnings, and 4. how we interact with these utilities in the reality of daily life. Connecting these four elements via graph network blocks forms the backbone by which we instantiate a digital twin of an individual. Edges and nodes in this graph structure are then regularly updated with learning techniques as data is continuously digested. Experiments demonstrate the use of dense and heterogeneous real-world data from a variety of personal and environmental sensors to monitor individual cardiovascular health state. State estimation and individual modeling is the fundamental basis to depart from disease-oriented approaches to a total health continuum paradigm. Precision in predicting health requires understanding state trajectory. By encasing this estimation within a navigational approach, a systematic guidance framework can plan actions to transition a current state towards a desired one. This work concludes by presenting this framework of combining the health state and personal graph model to perpetually plan and assist us in living life towards our goals.
Google has perfected its ability to execute web search results for its users all over the world. In the early days of the Internet, the search engine was primarily suited for displaying search results for English users. Non-English-speaking users have complained that search results are often displayed in the wrong language entirely. However, Google is becoming more proficient at providing search results in other languages as well. A lot of factors can play a role, but one of the biggest is its use of deep learning to understand semantic references--enter semantic indexing.
Pay-per-click advertising includes various formats (e.g., search, contextual, and social) with a total investment of more than 140 billion USD per year. An advertising campaign is composed of some subcampaigns-each with a different ad-and a cumulative daily budget. The allocation of the ads is ruled exploiting auction mechanisms. In this paper, we propose, for the first time to the best of our knowledge, an algorithm for the online joint bid/budget optimization of pay-per-click multi-channel advertising campaigns. We formulate the optimization problem as a combinatorial bandit problem, in which we use Gaussian Processes to estimate stochastic functions, Bayesian bandit techniques to address the exploration/exploitation problem, and a dynamic programming technique to solve a variation of the Multiple-Choice Knapsack problem. We experimentally evaluate our algorithm both in simulation-using a synthetic setting generated from real data from Yahoo!-and in a real-world application over an advertising period of two months.