Machine Learning: Overviews

How data can predict which employees are about to quit


Rather than relying on exit interviews and comparing them with occasional employee surveys to determine engagement, organizations can turn instead to big data and advanced analytics to identify those workers at greatest risk of quitting. A new Harvard Business Review article outlines how applying machine learning algorithms to turnover data and employee information can provide a much more accurate picture of workplace satisfaction. The article's measure of "turnover propensity" comprises two main indicators: turnover shocks, which are organizational and personal events that cause workers to reconsider their jobs, and job embeddedness, which describes an employee's social ties in their workplace and interest in the work they do. Though achieving this kind of "proactive anticipation" will require a sizable investment of time and effort to develop the necessary data and algorithms, the payoff will likely be worth it: "Leaders can proactively engage valued employees at risk of leaving through interviews, to better understand how the firm can increase the odds that they stay," per HBR.

Artificial Intelligence and the rise of related patent applications. - Steer & Co


Following the World Intellectual Property Organization (WIPO) report in early 2019, a new report from the UK Intellectual Property Office (UKIPO) identifies growth in published AI patent applications. This insight provides an overview of the UKIPO findings and considerations for technology businesses in this space. AI is the use of technology to perform tasks that would usually require some intelligence if done by humans. A patent is a registered intellectual property right, which seeks to create a monopoly over the exploitation of an invention. Historically, patents can take years to progress from application through publication to grant.

AutoML: A Survey of the State-of-the-Art


Deep learning has penetrated all aspects of our lives and brought us great convenience. However, the process of building a high-quality deep learning system for a specific task is not only time-consuming but also requires lots of resources and relies on human expertise, which hinders the development of deep learning in both industry and academia. To alleviate this problem, a growing number of research projects focus on automated machine learning (AutoML). In this paper, we provide a comprehensive and up-to-date study on the state-of-the-art AutoML. First, we introduce the AutoML techniques in detail according to the machine learning pipeline.

Making Machine Learning Models Clinically Useful


Recent advances in supervised machine learning have improved diagnostic accuracy and prediction of treatment outcomes, in some cases surpassing the performance of clinicians.1 In supervised machine learning, a mathematical function is constructed via automated analysis of training data, which consists of input features (such as retinal images) and output labels (such as the grade of macular edema). With large training data sets and minimal human guidance, a computer learns to generalize from the information contained in the training data. The result is a mathematical function, a model, that can be used to map a new record to the corresponding diagnosis, for example, a retinal image to a grade of macular edema. Although machine learning–based models for classification or for predicting a future health state are being developed for diverse clinical applications, evidence is lacking that deployment of these models has improved care and patient outcomes.2 One barrier to demonstrating such improvement is the basis used to assess the performance of a model.
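
The idea of constructing a function from labelled training data and then applying it to a new record can be illustrated with a minimal sketch. This is not the method used in any study cited above: it assumes a simple nearest-centroid classifier, and the feature values, the label names (`grade_0`, `grade_2`) and the helper `fit_nearest_centroid` are all made up for illustration.

```python
from collections import defaultdict
from math import dist


def fit_nearest_centroid(features, labels):
    """Build a model (a plain function) from labelled training records."""
    grouped = defaultdict(list)
    for row, label in zip(features, labels):
        grouped[label].append(row)

    # One centroid per label: the mean of that label's feature vectors.
    centroids = {
        label: [sum(column) / len(rows) for column in zip(*rows)]
        for label, rows in grouped.items()
    }

    def model(record):
        # Map a new record to the label whose centroid is closest.
        return min(centroids, key=lambda label: dist(record, centroids[label]))

    return model


# Synthetic, made-up training data: two numeric features per record.
train_features = [[1.0, 1.1], [0.9, 1.3], [1.2, 0.8], [3.0, 3.2], [3.1, 2.9], [2.8, 3.3]]
train_labels = ["grade_0", "grade_0", "grade_0", "grade_2", "grade_2", "grade_2"]

model = fit_nearest_centroid(train_features, train_labels)
prediction_a = model([1.1, 1.0])
prediction_b = model([2.9, 3.0])
print(prediction_a, prediction_b)
```

Real clinical models work on far richer inputs (such as images) and far larger data sets, but the shape is the same: training data goes in, and a function that maps a new record to a label comes out.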

When will lifelong learning come of age?


Last month's announcement by Amazon that it plans to spend $700 million (£569 million) over six years to retrain a third of its US workforce was eye-catching for many reasons. One was the price tag: even for the world's second most valuable company, spending nearly three-quarters of a billion dollars to retrain 100,000 workers is a huge undertaking. Also noteworthy was the firm's reasoning. Amazon explicitly attributed its move to the rise of automation, machine learning and other technology: the so-called fourth industrial revolution. There was a sense that the pioneer of online retailing, famed for its use of automation, was merely an early accepter of an inescapable truth that all employers will soon have to face: that the skills of their existing workforces will no longer have any market value as their old roles are taken by machines and new roles are created. The company reportedly has 20,000 current vacancies.

Here's how researchers are making machine learning more efficient and affordable for everyone


The research and development of neural networks is flourishing thanks to recent advancements in computational power, the discovery of new algorithms, and an increase in labelled data. Before the current explosion of activity in the space, the practical applications of neural networks were limited. Although much of the recent research has allowed for broad application, the heavy computational requirements of machine learning models still keep them from truly entering the mainstream. Now, emerging algorithms are on the cusp of pushing neural networks into more conventional applications through exponentially increased efficiency. Neural networks are a prominent focal point in the current state of computer science research.

Reinforcement Learning Explained: Overview, Comparisons and Applications in Business


Imagine you're completing a mission in a computer game. Maybe you're going through a military depot to find a secret weapon. You get points for the right actions (killing an enemy) and lose them for the wrong ones (falling into a pit or getting hit). If you're playing on high difficulty, you might not conclude this task in just one attempt. Try after try, you learn which consecutive actions are needed to get out of a location safe, armed, and equipped with bonuses like extra health points or small artifacts in your bag.
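
The game analogy above can be sketched in code. The example below is only an illustration under stated assumptions: it uses tabular Q-learning (one common reinforcement learning algorithm, not one named in the text) on a made-up six-cell corridor with a pit at one end and the "secret weapon" at the other. The reward values, the layout and all names (`step`, `train`, `q_table`) are hypothetical.

```python
import random

N_STATES = 6          # positions 0..5 in a corridor
PIT, GOAL = 0, 5      # terminal states: falling into the pit, finding the weapon
START = 2
LEFT, RIGHT = 0, 1


def step(state, action):
    """Move one cell; return (next_state, reward, done)."""
    next_state = state - 1 if action == LEFT else state + 1
    if next_state == GOAL:
        return next_state, 10.0, True
    if next_state == PIT:
        return next_state, -10.0, True
    return next_state, 0.0, False


def train(episodes=2000, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = START, False
        while not done:
            # Explore at random sometimes (and on ties); otherwise act greedily.
            if rng.random() < epsilon or q[state][LEFT] == q[state][RIGHT]:
                action = rng.randrange(2)
            else:
                action = LEFT if q[state][LEFT] > q[state][RIGHT] else RIGHT
            next_state, reward, done = step(state, action)
            # Q-learning update: move the estimate toward
            # reward + discounted best value of the next state.
            best_next = 0.0 if done else max(q[next_state])
            q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
            state = next_state
    return q


q_table = train()
# Greedy policy for the non-terminal cells 1..4.
policy = ["right" if q_table[s][RIGHT] > q_table[s][LEFT] else "left" for s in range(1, GOAL)]
print(policy)
```

Each episode plays the role of one attempt at the mission: early attempts wander and sometimes end in the pit, and the accumulated value estimates gradually steer later attempts toward the goal.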

Deep Aging Clocks: The emergence of AI-based biomarkers of aging and longevity


Summary: Combining multiple artificial intelligence agents sheds light on the aging process and can help further understanding of what contributes to healthy aging. There are two kinds of age: chronological age, which is the number of years one has lived, and biological age, which is influenced by our genes, lifestyle, behavior, the environment, and other factors. Biological age is the superior measure of true age and is the most biologically relevant feature, as it closely correlates with mortality and health status. The search for reliable predictors of biological age has been ongoing for several decades, and until recently, largely without success. Since 2016 the use of deep learning techniques to find predictors of chronological and biological age has been gaining popularity in the aging research community.

The History of Digital Spam

Communications of the ACM

Spam! That's what Lorrie Faith Cranor and Brian LaMacchia exclaimed in the title of a popular call-to-action article that appeared 20 years ago in Communications.10 And yet, despite the tremendous efforts of the research community over the last two decades to mitigate this problem, the sense of urgency remains unchanged, as emerging technologies have brought new dangerous forms of digital spam under the spotlight. Furthermore, when spam is carried out with the intent to deceive or influence at scale, it can alter the very fabric of society and our behavior. In this article, I will briefly review the history of digital spam: starting from its quintessential incarnation, spam emails, to modern-day forms of spam affecting the Web and social media, the survey will close by depicting future risks associated with spam and abuse of new technologies, including artificial intelligence (AI), for example, digital humans. After providing a taxonomy of spam and the most popular applications that emerged over the last two decades, I will review technological and regulatory approaches proposed in the literature, and suggest some possible solutions to tackle this ubiquitous digital epidemic moving forward. An omni-comprehensive, universally acknowledged definition of digital spam is hard to formalize. Laws and regulations have attempted to define particular forms of spam, for example, email (see 2003's Controlling the Assault of Non-Solicited Pornography and Marketing Act). However, nowadays, spam occurs in a variety of forms, and across different techno-social systems. Each domain may warrant a slightly different definition that suits what spam is in that precise context: some features of spam in a domain, for example, volume in mass spam campaigns, may not apply to others, for example, carefully targeted phishing operations.

Interpretable and Steerable Sequence Learning via Prototypes

One of the major challenges in machine learning nowadays is to provide predictions with not only high accuracy but also user-friendly explanations. Although in recent years we have witnessed increasingly popular use of deep neural networks for sequence modeling, it is still challenging to explain the rationales behind the model outputs, which is essential for building trust and supporting the domain experts to validate, critique and refine the model. We propose ProSeNet, an interpretable and steerable deep sequence model with natural explanations derived from case-based reasoning. The prediction is obtained by comparing the inputs to a few prototypes, which are exemplar cases in the problem domain. For better interpretability, we define several criteria for constructing the prototypes, including simplicity, diversity, and sparsity, and propose the learning objective and the optimization procedure. ProSeNet also provides a user-friendly approach to model steering: domain experts without any knowledge of the underlying model or parameters can easily incorporate their intuition and experience by manually refining the prototypes. We conduct experiments on a wide range of real-world applications, including predictive diagnostics for automobiles, ECG and protein sequence classification, and sentiment analysis on texts. The results show that ProSeNet can achieve accuracy on par with state-of-the-art deep learning models. We also evaluate the interpretability of the results with concrete case studies. Finally, through a user study on Amazon Mechanical Turk (MTurk), we demonstrate that the model selects high-quality prototypes which align well with human knowledge and can be interactively refined for better interpretability without loss of performance.
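
The general idea of predicting by comparison to prototypes can be sketched as follows. This is a simplified illustration, not ProSeNet's actual architecture: it assumes each input sequence has already been turned into a fixed-length embedding by some encoder, it assumes a similarity of the form exp(-squared distance), and the prototype vectors, weights and names (`proto_praise`, `class_weights`, `predict`) are all hypothetical rather than learned.

```python
from math import exp


def similarity(embedding, prototype):
    """Similarity in (0, 1]: 1 when identical, decaying with squared distance."""
    squared_distance = sum((e - p) ** 2 for e, p in zip(embedding, prototype))
    return exp(-squared_distance)


def predict(embedding, prototypes, class_weights):
    """Return (predicted_class, name_of_closest_prototype)."""
    sims = {name: similarity(embedding, vec) for name, vec in prototypes.items()}
    # Each class score is a weighted sum of the similarities to the prototypes.
    scores = {
        label: sum(weight * sims[name] for name, weight in weights.items())
        for label, weights in class_weights.items()
    }
    predicted = max(scores, key=scores.get)
    closest = max(sims, key=sims.get)
    return predicted, closest


# Hypothetical prototype embeddings; a real model would learn these from data.
prototypes = {
    "proto_praise": [1.0, 0.0],
    "proto_complaint": [-1.0, 0.0],
    "proto_mixed": [0.0, 1.0],
}
# Hypothetical weights linking each prototype to each class.
class_weights = {
    "positive": {"proto_praise": 1.0, "proto_complaint": 0.0, "proto_mixed": 0.4},
    "negative": {"proto_praise": 0.0, "proto_complaint": 1.0, "proto_mixed": 0.6},
}

label, explanation = predict([0.9, 0.1], prototypes, class_weights)
print(label, explanation)
```

The closest prototype doubles as the explanation ("this input looks like that exemplar"), and in this toy setting, editing an entry of `prototypes` by hand is a rough analogue of the manual prototype refinement the abstract describes.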