Collaborating Authors


The AI Index 2021 Annual Report Artificial Intelligence

Welcome to the fourth edition of the AI Index Report. This year we significantly expanded the amount of data available in the report, worked with a broader set of external organizations to calibrate our data, and deepened our connections with the Stanford Institute for Human-Centered Artificial Intelligence (HAI). The AI Index Report tracks, collates, distills, and visualizes data related to artificial intelligence. Its mission is to provide unbiased, rigorously vetted, and globally sourced data for policymakers, researchers, executives, journalists, and the general public to develop intuitions about the complex field of AI. The report aims to be the most credible and authoritative source for data and insights about AI in the world.

A Survey on Data Pricing: from Economics to Data Science Artificial Intelligence

How can we assess the value of data objectively, systematically and quantitatively? Pricing data, or information goods in general, has been studied and practiced in dispersed areas and principles, such as economics, marketing, electronic commerce, data management, data mining and machine learning. In this article, we present a unified, interdisciplinary and comprehensive overview of this important direction. We examine various motivations behind data pricing, understand the economics of data pricing and review the development and evolution of pricing models according to a series of fundamental principles. We discuss both digital products and data products. We also consider a series of challenges and directions for future work.

The History of Digital Spam

Communications of the ACM

Spam! That's what Lorrie Faith Cranor and Brian LaMacchia exclaimed in the title of a popular call-to-action article that appeared 20 years ago in Communications [10]. And yet, despite the tremendous efforts of the research community over the last two decades to mitigate this problem, the sense of urgency remains unchanged, as emerging technologies have brought new dangerous forms of digital spam under the spotlight. Furthermore, when spam is carried out with the intent to deceive or influence at scale, it can alter the very fabric of society and our behavior. In this article, I will briefly review the history of digital spam: starting from its quintessential incarnation, spam emails, and moving to modern-day forms of spam affecting the Web and social media, the survey will close by depicting future risks associated with spam and abuse of new technologies, including artificial intelligence (AI), for example, digital humans. After providing a taxonomy of spam and its most popular applications that emerged throughout the last two decades, I will review technological and regulatory approaches proposed in the literature, and suggest some possible solutions to tackle this ubiquitous digital epidemic moving forward. An omni-comprehensive, universally acknowledged definition of digital spam is hard to formalize. Laws and regulations have attempted to define particular forms of spam, for example, email (see 2003's Controlling the Assault of Non-Solicited Pornography and Marketing Act). However, nowadays, spam occurs in a variety of forms and across different techno-social systems. Each domain may warrant a slightly different definition that suits what spam is in that precise context: some features of spam in a domain, for example, volume in mass spam campaigns, may not apply to others, for example, carefully targeted phishing operations.

Orchestrating data analytics to enhance the investor experience


James Williams, managing editor at Hedgeweek, assesses how data analytics techniques can be used to personalise client experiences for investment managers. The amount of data is growing exponentially. According to IDC, there were 16.3 zettabytes of information generated in 2017 alone; one zettabyte is 1 billion terabytes. However you cut it, that's a huge number, one that is too large to comprehend. In simplistic terms, according to one industry professional, "if every piece of data were a penny, it would cover the earth's surface five times over". Indeed, with Amazon and Apple both hitting the trillion dollar market cap mark, and Alphabet and Microsoft sitting at over USD 900 billion, it is clear that the stock market values data as the most valuable resource, not oil or consumer products. Against this growing tsunami, investment managers and service providers alike are looking for ways to ingest and make sense of it all: to find information that they can translate into insights and turn into knowledge that, if done correctly, could lead to improved business performance and enriched customer relationships.

Top Data Sources for Journalists in 2018 (350 Sources)


There are many different types of sites that provide a wealth of free, freemium and paid data that can help audience developers and journalists with their reporting and storytelling efforts. The team at State of Digital Publishing would like to acknowledge these, as derived from manual searches and recognition from our existing audience. Kaggle is a site that allows users to discover machine learning while writing and sharing cloud-based code. Relying primarily on the enthusiasm of its sizable community, the site hosts dataset competitions for cash prizes, and as a result it has massive amounts of data compiled into it. Whether you're looking for historical data from the New York Stock Exchange, an overview of candy production trends in the US, or cutting-edge code, this site is chock-full of information. It's impossible to be on the Internet for long without running into a Wikipedia article.

Vincent Granville


Granville V., Rasson J.P. Multivariate discriminant analysis and maximum penalized likelihood.... Journal of the Royal Statistical Society, Series B, 57 (1995), 501-517.

My data science journey


Granville V., Rasson J.P. Multivariate discriminant analysis and maximum penalized likelihood.... Journal of the Royal Statistical Society, Series B, 57 (1995), 501-517.