Collaborating Authors

The 2018 Survey: AI and the Future of Humans


"Please think forward to the year 2030. Analysts expect that people will become even more dependent on networked artificial intelligence (AI) in complex digital systems. Some say we will continue on the historic arc of augmenting our lives with mostly positive results as we widely implement these networked tools. Some say our increasing dependence on these AI and related systems is likely to lead to widespread difficulties. Our question: By 2030, do you think it is most likely that advancing AI and related technology systems will enhance human capacities and empower them? That is, most of the time, will most people be better off than they are today? Or is it most likely that advancing AI and related technology systems will lessen human autonomy and agency to such an extent that most people will not be better off than the way things are today? Please explain why you chose the answer you did and sketch out a vision of how the human-machine/AI collaboration will function in 2030.

Apache Spark: A Unified Engine for Big Data Processing


Analyses performed using Spark of brain activity in a larval zebrafish: embedding dynamics of whole-brain activity into lower-dimensional trajectories. The growth of data volumes in industry and research poses tremendous opportunities, as well as tremendous computational challenges. As data sizes have outpaced the capabilities of single machines, users have needed new systems to scale out computations to multiple nodes. As a result, there has been an explosion of new cluster programming models targeting diverse computing workloads.1,4,7,10 At first, these models were relatively specialized, with new models developed for new workloads; for example, MapReduce4 supported batch processing, but Google also developed Dremel13 for interactive SQL queries and Pregel11 for iterative graph algorithms. In the open source Apache Hadoop stack, systems like Storm1 and Impala9 are also specialized. Even in the relational database world, the trend has been to move away from "one-size-fits-all" systems.18 Unfortunately, most big data applications need to combine many different processing types. The very nature of "big data" is that it is diverse and messy; a typical pipeline will need MapReduce-like code for data loading, SQL-like queries, and iterative machine learning. Specialized engines can thus create both complexity and inefficiency; users must stitch together disparate systems, and some applications simply cannot be expressed efficiently in any engine. In 2009, our group at the University of California, Berkeley, started the Apache Spark project to design a unified engine for distributed data processing. Spark has a programming model similar to MapReduce but extends it with a data-sharing abstraction called "Resilient Distributed Datasets," or RDDs.25 Using this simple extension, Spark can capture a wide range of processing workloads that previously needed separate engines, including SQL, streaming, machine learning, and graph processing2,26,6 (see Figure 1).

Big Data Meet Cyber-Physical Systems: A Panoramic Survey Machine Learning

The world is witnessing an unprecedented growth of cyber-physical systems (CPS), which are foreseen to revolutionize our world {via} creating new services and applications in a variety of sectors such as environmental monitoring, mobile-health systems, intelligent transportation systems and so on. The {information and communication technology }(ICT) sector is experiencing a significant growth in { data} traffic, driven by the widespread usage of smartphones, tablets and video streaming, along with the significant growth of sensors deployments that are anticipated in the near future. {It} is expected to outstandingly increase the growth rate of raw sensed data. In this paper, we present the CPS taxonomy {via} providing a broad overview of data collection, storage, access, processing and analysis. Compared with other survey papers, this is the first panoramic survey on big data for CPS, where our objective is to provide a panoramic summary of different CPS aspects. Furthermore, CPS {require} cybersecurity to protect {them} against malicious attacks and unauthorized intrusion, which {become} a challenge with the enormous amount of data that is continuously being generated in the network. {Thus, we also} provide an overview of the different security solutions proposed for CPS big data storage, access and analytics. We also discuss big data meeting green challenges in the contexts of CPS.

Cybersecurity in the Internet of Things is a game of incentives


Cybersecurity was the virtual elephant in the showroom at this month's Consumer Electronics Show in Las Vegas. Attendees of the annual tech trade show, organized by the Consumer Technology Association, relished the opportunity to experience a future filled with delivery drones, autonomous vehicles, virtual and augmented reality and a plethora of "Internet of things" devices, including fridges, wearables, televisions, routers, speakers, washing machines and even robot home assistants. Given the proliferation of connected devices--already, there are estimated to be at least 6.4 billion--there remains the critical question of how to ensure their security. The cybersecurity challenge posed by the internet of things is unique. The scale of connected devices magnifies the consequences of insecurity.