MLSys 2021: Bridging the divide between machine learning and systems

#artificialintelligence

Amazon distinguished scientist and conference general chair Alex Smola on what makes MLSys unique, both thematically and culturally.

The Conference on Machine Learning and Systems (MLSys), which starts next week, is only four years old, but Amazon scientists already have a rich history of involvement with it. Amazon Scholar Michael I. Jordan is on the steering committee; vice president and distinguished scientist Inderjit Dhillon is on the board and was general chair last year; and vice president and distinguished scientist Alex Smola, who is also on the steering committee, is this year's general chair. MLSys was founded as the deep-learning revolution spread, to bridge two communities that had much to offer each other but that were often working independently: machine learning researchers and systems developers. Registration for the conference is still open, with very low fees of $25 for students and $100 for academics and professionals.

"If you look at the big machine learning conferences, they mostly focus on, 'Okay, here's a cool algorithm, and here are the amazing things that it can do. And by the way, it now recognizes cats even better than before,'" Smola says. "They're conferences where people mostly show an increase in capability. At the same time, there are systems conferences, and they mostly care about file systems, databases, high availability, fault tolerance, and all of that. Now, why do you need something in between? Well, because quite often in machine learning, approximate is good enough. You don't necessarily need such good guarantees from your systems."


Perform interactive data processing using Spark in Amazon SageMaker Studio Notebooks

#artificialintelligence

Amazon SageMaker Studio is the first fully integrated development environment (IDE) for machine learning (ML). With a single click, data scientists and developers can quickly spin up Studio notebooks to explore datasets and build models. You can now use Studio notebooks to securely connect to Amazon EMR clusters and prepare vast amounts of data for analysis and reporting, model training, or inference. You can apply this new capability in several ways. For example, data analysts may want to answer a business question by exploring and querying their data in Amazon EMR, viewing the results, and then either altering the initial query or drilling deeper into the results.


Preparing data for ML models using AWS Glue DataBrew in a Jupyter notebook

#artificialintelligence

AWS Glue DataBrew is a new visual data preparation tool that makes it easy for data analysts and data scientists to clean and normalize data to prepare it for analytics and machine learning (ML). In this post, we examine a sample ML use case and show how to use DataBrew and a Jupyter notebook to upload a dataset, clean and normalize the data, and train and publish an ML model. We look for anomalies by applying the Amazon SageMaker Random Cut Forest (RCF) anomaly detection algorithm to a public dataset that records power consumption for more than 300 randomly selected households. To make it easier for you to get started, we created an AWS CloudFormation template that automatically configures a Jupyter notebook instance with the required libraries and installs the plugin. We used the AWS Deep Learning AMI to configure the out-of-the-box Jupyter server.
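RCF scores each point by how much it perturbs a forest of random-cut trees built over recent data. As a rough, self-contained stand-in for that idea (not SageMaker's RCF implementation), the sketch below flags values that deviate sharply from a trailing window of observations; the `zscore_anomalies` helper and its defaults are hypothetical:

```python
import statistics


def zscore_anomalies(series, window=24, threshold=3.0):
    """Flag indices whose value lies more than `threshold` standard
    deviations from the mean of the trailing `window` observations.
    A lightweight stand-in for RCF-style anomaly scoring."""
    flags = []
    for i in range(window, len(series)):
        history = series[i - window:i]
        mean = statistics.mean(history)
        sd = statistics.pstdev(history)
        if sd > 0 and abs(series[i] - mean) / sd > threshold:
            flags.append(i)
    return flags
```

On a cyclical household-consumption trace, a sudden spike is flagged while the ordinary daily cycle is not; RCF generalizes this intuition to multi-dimensional data without assuming a fixed window statistic.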


When Artificial Intelligence (AI) meets 3 retail industry pain points

#artificialintelligence

At the beginning of 2020, Artificial Intelligence (AI) was predicted to become what some experts called a "key ingredient" technology across many industries over the next decade. Fast-forward to today, and interest in AI is surging even more. As we are now deeply entrenched in the COVID-19 pandemic, the need to enhance manual business processes with more automation through AI has hit many industries with urgency, but especially retail. About two-thirds of executives surveyed by McKinsey in June said they had accelerated the implementation of robotics, artificial intelligence, and other emerging technologies in response to COVID-19. The retail industry has a unique opportunity to learn from consumer behavior this year and implement AI solutions that can meet shoppers' needs in the long term.


Global Big Data Conference

#artificialintelligence

While many know UK company Ocado as an online grocery retailer, it's really one of the most innovative tech companies in the world. Ocado was founded in 2000 as an entirely online experience and therefore never had a brick-and-mortar store to serve its customers, who number 580,000 each day. Its technology expertise came about out of necessity as it began to build the software and hardware it needed to be efficient, productive, and competitive. Today, Ocado uses artificial intelligence (AI) and machine learning in many ways throughout its business. In its early years, Ocado tried to piece together the technology it needed to succeed by purchasing off-the-shelf products.


Deep Learning for NLP and Speech Recognition: Kamath, Uday, Liu, John, Whitaker, James: 9783030145989: Amazon.com: Books

#artificialintelligence

Uday Kamath has more than 20 years of experience architecting and building analytics-based commercial solutions. He currently works as the Chief Analytics Officer at Digital Reasoning, one of the leading companies in AI for NLP and speech recognition, where he heads the Applied Machine Learning research group. Previously, Uday served as the Chief Data Scientist at BAE Systems Applied Intelligence, building machine learning products and solutions for the financial industry, focused on fraud, compliance, and cybersecurity. Uday has authored several books on machine learning, such as Machine Learning: End-to-End guide for Java developers: Data Analysis, Machine Learning, and Neural Networks simplified and Mastering Java Machine Learning: A Java developer's guide to implementing machine learning and big data architectures. He has also published many academic papers in machine learning journals and at conferences.


Global Big Data Conference

#artificialintelligence

A look at how Zulily is using the latest tools in artificial intelligence, machine learning, and cloud computing to innovate and serve its customers with purpose. Each day at Zulily we add 9,000 products to our online store and process more than 5 billion clicks from online shoppers. That is more virtual inventory than you'll find in the warehouses of many retailers, and it's by design. We've built a supply chain where we hold only some goods: most of the time, we don't purchase inventory until our customers have, so we are able to pass savings from our unique supply chain down to our customers around the world. To the customer, that means a constantly changing and new shopping experience.


H&M wants to democratize AI with reusable components

#artificialintelligence

In AI implementation, organizations grapple with scaling issues. Advancing investments from the pilot stage into business-critical processes is challenging due to talent constraints and organizational-culture pitfalls. But given the threats the retail industry faces (consumers pulling back on spending and inventory challenges), digital is imperative. H&M "must act quickly to improve its online proposition globally" as it adjusts to shifting shopping habits everywhere, analyst firm GlobalData said in a research note. This year, H&M already planned to open fewer stores as it expanded digital operations globally.


Privacy-Preserving Dynamic Personalized Pricing with Demand Learning

arXiv.org Machine Learning

The prevalence of e-commerce has made customers' detailed personal information readily accessible to retailers, and this information has been widely used in pricing decisions. When personalized information is involved, protecting its privacy becomes a critical issue in practice. In this paper, we consider a dynamic pricing problem over $T$ time periods with an \emph{unknown} demand function of posted price and personalized information. At each time $t$, the retailer observes an arriving customer's personal information and offers a price. The customer then makes the purchase decision, which the retailer uses to learn the underlying demand function. There is a potentially serious privacy concern during this process: a third-party agent might infer the personalized information and purchase decisions from the price changes made by the pricing system. Using the fundamental framework of differential privacy from computer science, we develop a privacy-preserving dynamic pricing policy that tries to maximize retailer revenue while avoiding leakage of individual customers' information and purchasing decisions. To this end, we first introduce a notion of \emph{anticipating} $(\varepsilon, \delta)$-differential privacy that is tailored to the dynamic pricing problem. Our policy achieves both a privacy guarantee and a performance guarantee in terms of regret. Roughly speaking, for $d$-dimensional personalized information, our algorithm achieves expected regret of order $\tilde{O}(\varepsilon^{-1} \sqrt{d^3 T})$ when the customers' information is adversarially chosen. For stochastic personalized information, the regret bound can be further improved to $\tilde{O}(\sqrt{d^2 T} + \varepsilon^{-2} d^2)$.
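The differential-privacy framework the abstract invokes is classically instantiated by the Laplace mechanism: a released statistic is perturbed with noise scaled to its sensitivity divided by $\varepsilon$. The sketch below illustrates only that building block, not the paper's anticipating-DP pricing policy; `private_price` and its parameters are hypothetical:

```python
import random


def laplace_noise(scale: float) -> float:
    """Sample zero-mean Laplace noise: the difference of two independent
    exponentials with mean `scale` is Laplace-distributed with scale `scale`."""
    return random.expovariate(1.0 / scale) - random.expovariate(1.0 / scale)


def private_price(true_price: float, sensitivity: float, epsilon: float) -> float:
    """Release a price perturbed by noise calibrated to sensitivity / epsilon,
    the classical Laplace mechanism for epsilon-differential privacy.
    (Illustrative only; the paper's dynamic policy is more involved.)"""
    return true_price + laplace_noise(sensitivity / epsilon)
```

Smaller $\varepsilon$ means a stronger privacy guarantee but noisier prices; the paper's regret bounds quantify precisely this trade-off between privacy and revenue.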


Learning Product Rankings Robust to Fake Users

arXiv.org Machine Learning

In many online platforms, customers' decisions are substantially influenced by product rankings, as most customers examine only a few top-ranked products. Concurrently, such platforms also use the same data on customers' actions to learn how these products should be ranked or ordered. These interactions in the underlying learning process, however, may incentivize sellers to artificially inflate their positions by employing fake users, as exemplified by the emergence of click farms. Motivated by such fraudulent behavior, we study the ranking problem of a platform that faces a mixture of real and fake users who are indistinguishable from one another. We first show that existing learning algorithms, which are optimal in the absence of fake users, may converge to highly sub-optimal rankings under manipulation by fake users. To overcome this deficiency, we develop efficient learning algorithms under two informational environments: in the first setting, the platform knows the number of fake users; in the second, it is agnostic to that number. For both environments, we prove that our algorithms converge to the optimal ranking while remaining robust to the aforementioned fraudulent behavior; we also present worst-case performance guarantees for our methods and show that they significantly outperform existing algorithms. At a high level, our work employs several novel approaches to guarantee robustness, such as: (i) constructing product-ordering graphs that encode the pairwise relationships between products inferred from customers' actions; and (ii) implementing multiple levels of learning with a judicious amount of bi-directional cross-learning between levels.
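The product-ordering-graph construction in (i) can be sketched in a few lines: tally net pairwise preferences from observed actions and keep a directed edge only when one product beats another by a margin, so a small number of fake interactions cannot flip an edge. This is an illustrative reading of the idea, not the authors' exact construction; `ordering_graph` and `min_margin` are hypothetical names:

```python
def ordering_graph(click_pairs, min_margin=2):
    """Build a directed graph over products from (winner, loser) observations.
    An edge i -> j is kept only when i beats j by at least `min_margin` net
    observations, giving some robustness to a few adversarial interactions."""
    wins = {}
    for winner, loser in click_pairs:
        wins[(winner, loser)] = wins.get((winner, loser), 0) + 1
    products = {p for pair in wins for p in pair}
    edges = set()
    for (i, j), count in wins.items():
        if count - wins.get((j, i), 0) >= min_margin:
            edges.add((i, j))
    return products, edges
```

With `min_margin=2`, a single fake "b beats a" observation cannot overturn three genuine "a beats b" observations, so the edge a -> b survives the manipulation.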