Communications of the ACM


You've Got Mail!

Communications of the ACM

The first networked electronic mail message was sent by Ray Tomlinson of Bolt Beranek and Newman in 1971. This year, according to market research firm Radicati Group, 3.8 billion email users worldwide will send 281 billion messages every day. You may feel like a substantial number of them end up in your inbox. And yet, some observers say email is dying. It's so'last century', they say, compared to social media messaging, texting, and powerful new collaboration tools.


We are Done with 'Hacking'

Communications of the ACM

In the 1970s, when Microsoft and Apple were founded, programming was an art only a limited group of dedicated enthusiasts actually knew how to perform properly. CPUs were rather slow, personal computers had a very limited amount of memory, and monitors were lo-res. To create something decent, a programmer had to fight against actual hardware limitations. In order to win in this war, programmers had to be both trained and talented in computer science, a science that was at that time mostly about algorithms and data structures. The first three volumes of the famous book The Art of Computer Programming by Donald Knuth, a Stanford University professor and a Turing Award recipient, were published in 1968–1973.


Teach the Law (and the AI) 'Foreseeability'

Communications of the ACM

Ryan Calo's "law and Technology" Viewpoint "Is the Law Ready for Driverless Cars?" (May 2018) explored the implications, as Calo said, of " ... genuinely unforeseeable categories of harm" in potential liability cases where death or injury is caused by a driverless car. He argued that common law would take care of most other legal issues involving artificial intelligence in driverless cars, apart from such "foreseeability." Calo also said the courts have worked out problems like AI before and seemed confident that AI foreseeability will eventually be accommodated. One can agree with this overall judgment but question the time horizon. AI may be quite different from anything the courts have seen or judged before for many reasons, as the technology is indeed designed to someday make its own decisions.


On Neural Networks

Communications of the ACM

I am only a layman in the neural network space so the ideas and opinions in this column are sure to be refined by comments from more knowledgeable readers. The recent successes of multilayer neural networks have made headlines. Much earlier work on what I imagine to be single-layer networks proved to have limitations. Indeed, the famous book, Perceptrons,a by Turing laureate Marvin Minsky and his colleague Seymour Papert put the kibosh (that's a technical term) on further research in this space for some time. Among the most visible signs of advancement in this arena is the success of the DeepMind AlphaGo multilayer neural network that beat the international grand Go champion, Lee Sedol, four games out of five in March 2016 in Seoul.b


Making Machine Learning Robust Against Adversarial Inputs

Communications of the ACM

Machine learning has advanced radically over the past 10 years, and machine learning algorithms now achieve human-level performance or better on a number of tasks, including face recognition,31 optical character recognition,8 object recognition,29 and playing the game Go.26 Yet machine learning algorithms that exceed human performance in naturally occurring scenarios are often seen as failing dramatically when an adversary is able to modify their input data even subtly. Machine learning is already used for many highly important applications and will be used in even more of even greater importance in the near future. Search algorithms, automated financial trading algorithms, data analytics, autonomous vehicles, and malware detection are all critically dependent on the underlying machine learning algorithms that interpret their respective domain inputs to provide intelligent outputs that facilitate the decision-making process of users or automated systems. As machine learning is used in more contexts where malicious adversaries have an incentive to interfere with the operation of a given machine learning system, it is increasingly important to provide protections, or "robustness guarantees," against adversarial manipulation. The modern generation of machine learning services is a result of nearly 50 years of research and development in artificial intelligence--the study of computational algorithms and systems that reason about their environment to make predictions.25 A subfield of artificial intelligence, most modern machine learning, as used in production, can essentially be understood as applied function approximation; when there is some mapping from an input x to an output y that is difficult for a programmer to describe through explicit code, a machine learning algorithm can learn an approximation of the mapping by analyzing a dataset containing several examples of inputs and their corresponding outputs. Google's image-classification system, Inception, has been trained with millions of labeled images.28 It can classify images as cats, dogs, airplanes, boats, or more complex concepts on par or improving on human accuracy. Increases in the size of machine learning models and their accuracy is the result of recent advancements in machine learning algorithms,17 particularly to advance deep learning.7 One focus of the machine learning research community has been on developing models that make accurate predictions, as progress was in part measured by results on benchmark datasets. In this context, accuracy denotes the fraction of test inputs that a model processes correctly--the proportion of images that an object-recognition algorithm recognizes as belonging to the correct class, and the proportion of executables that a malware detector correctly designates as benign or malicious. The estimate of a model's accuracy varies greatly with the choice of the dataset used to compute the estimate.



Celebrating Excellence

Communications of the ACM

As we do every year, ACM convenes a gala event to celebrate and honor colleagues in our computing universe who have achieved pinnacle success in the field. Our most prestigious recognition is the ACM A.M. Turing Award and the 2017 award goes to John Hennessy and David Patterson: Their primary insight was to find a method to systematically and quantitatively evaluate machine instructions for their utility and to eliminate the least used of them, replacing them with sequences of simpler instructions with faster execution times requiring lower power. In the end, their designs resulted in Reduced Instruction Set Complexity or RISC. Today, most chips make use of this form of instruction set. A complete summary of their accomplishments can be found within this issue and at the ACM Awards website.a


RISC Management

Communications of the ACM

At a time when "making an impact" can feel like a vague or even overwhelming prospect, it's worth reviewing the accomplishments of two scientists who have done just that: ACM A.M. Turing Award recipients John Hennessy and David Patterson. What began as a simple-sounding insight--that you could improve microprocessor performance by including only instructions that are actually used--blossomed into a paradigm shift as the two honed their ideas in the MIPS (Microprocessor without Interlocked Pipeline Stages) and RISC (Reduced Instruction Set Computer) processors, respectively. A subsequent textbook, Computer Architecture: A Quantitative Approach, introduced generations of students not just to that particular architecture, but to critical principles that continue to guide designers as they balance constraints and strive for maximum efficiency. David, you began working on what became the RISC architecture after a leave of absence at Digital Equipment Corporation (DEC). DAVID PATTERSON: My sabbatical at DEC focused on reducing microprogramming bugs.


Privacy in Decentralized Cryptocurrencies

Communications of the ACM

Cryptocurrencies promise to revolutionize the financial industry, forever changing the way we transfer money. Instead of relying on a central authority (for example, a government entity or a bank) to issue and manage money, cryptocurrencies rely on the mathematical design and security proofs of the underlying cryptographic protocols. Using cryptography and distributed algorithms, cryptocurrencies offer a fully decentralized setting where no single entity can monitor or block the transfer of funds. Cryptocurrencies have grown from early prototypes to a global phenomenon with millions of participating individuals and institutions.17 Bitcoin28 was the first such currency launched in 2009 and in the years since has grown to a market capitalization of over $15 billion (as of January 2017). This has led to the emergence of many alternative cryptocurrencies with additional services or different properties as well as to a fruitful line of academic research. Apart from its other benefits (decentralized architecture, small transaction fees, among others), Bitcoin's design attempts to provide some level of "pseudonymity" by not directly publishing the identities of the participating parities. In practice, there is no bound on the number of addresses a user can create; therefore there exists no single address a user can be related with. However, this pseudonymity is far from the desired unlinkability property in centralized e-cash protocols,11 where when Alice sends an amount to Bob, the original source of these funds cannot be deduced. The reason for this problem is that in most decentralized cryptocurrencies all transaction information (payer and payee address, amount, among others) is publicly visible, stored in a distributed data structure called blockchain (for example, see www.blockchain.info). Therefore, an attacker can easily observe how money flows. In this article, we review widely studied mechanisms for achieving privacy in blockchain-based cryptocurrencies such as Bitcoin. We focus on mixing services that can be used as a privacy overlay on top of a cryptocurrency; and privacy-preserving alternative coins that, by design, aim to achieve strong privacy properties. We discuss and compare the privacy guarantees achieved by known mechanisms, as well as their performance and practical adoption.


Identifying Patterns in Medical Records through Latent Semantic Analysis

Communications of the ACM

Mountains of data are constantly being accumulated, including in the form of medical records of doctor visits and treatments. The question is what actionable information can be gleaned from it beyond a one-time record of a specific medical examination. Arguably, if one were to combine the data in a large corpus of many patients suffering from the same condition, then overall patterns that apply beyond a specific instance of a specific doctor visit might be observed. Such patterns might reveal how medical conditions are related to one another over a broad set of patients, as well as how these conditions might be related to the International Classification of Diseases, 10th Revision, Clinical Modification (ICD-10-CM) of the Centers for Disease Control and Prevention (CDC) Classification of Diseases, Functioning, and Disability codes (henceforth, ICD codesa). Conceivably, applying such a method to a large dataset could even suggest new avenues of medical and public health research by identifying new associations, along with the relative strength of the associations compared to other associations.