Although artificial intelligence (AI) is solving real-world challenges and transforming industries, there are serious concerns about its ability to behave and make decisions in a responsible way. Many AI ethics principles and guidelines for responsible AI have been recently issued by governments, organisations, and enterprises. However, these AI ethics principles and guidelines are typically high-level and do not provide concrete guidance on how to design and develop responsible AI systems. To address this shortcoming, we first present an empirical study where we interviewed 21 scientists and engineers to understand the practitioners' perceptions on AI ethics principles and their implementation. We then propose a template that enables AI ethics principles to be operationalised in the form of concrete patterns and suggest a list of patterns using the newly created template. These patterns provide concrete, operationalised guidance that facilitate the development of responsible AI systems.
The development of AI applications is a multidisciplinary effort, involving multiple roles collaborating with the AI developers, an umbrella term we use to include data scientists and other AI-adjacent roles on the same team. During these collaborations, there is a knowledge mismatch between AI developers, who are skilled in data science, and external stakeholders who are typically not. This difference leads to communication gaps, and the onus falls on AI developers to explain data science concepts to their collaborators. In this paper, we report on a study including analyses of both interviews with AI developers and artifacts they produced for communication. Using the analytic lens of shared mental models, we report on the types of communication gaps that AI developers face, how AI developers communicate across disciplinary and organizational boundaries, and how they simultaneously manage issues regarding trust and expectations.
In the past decade, Artificial Intelligence (AI) has become a part of our daily lives due to major advances in Machine Learning (ML) techniques. In spite of an explosive growth in the raw AI technology and in consumer facing applications on the internet, its adoption in business applications has conspicuously lagged behind. For business/mission-critical systems, serious concerns about reliability and maintainability of AI applications remain. Due to the statistical nature of the output, software 'defects' are not well defined. Consequently, many traditional quality management techniques such as program debugging, static code analysis, functional testing, etc. have to be reevaluated. Beyond the correctness of an AI model, many other new quality attributes, such as fairness, robustness, explainability, transparency, etc. become important in delivering an AI system. The purpose of this paper is to present a view of a holistic quality management framework for ML applications based on the current advances and identify new areas of software engineering research to achieve a more trustworthy AI.
Artificial Intelligence (AI) or Machine Learning (ML) systems have been widely adopted as value propositions by companies in all industries in order to create or extend the services and products they offer. However, developing AI/ML systems has presented several engineering problems that are different from those that arise in, non-AI/ML software development. This study aims to investigate how software engineering (SE) has been applied in the development of AI/ML systems and identify challenges and practices that are applicable and determine whether they meet the needs of professionals. Also, we assessed whether these SE practices apply to different contexts, and in which areas they may be applicable. We conducted a systematic review of literature from 1990 to 2019 to (i) understand and summarize the current state of the art in this field and (ii) analyze its limitations and open challenges that will drive future research. Our results show these systems are developed on a lab context or a large company and followed a research-driven development process. The main challenges faced by professionals are in areas of testing, AI software quality, and data management. The contribution types of most of the proposed SE practices are guidelines, lessons learned, and tools.
Yet in practice, while the default machine-learning setup is already amenable to modular implementations supporting change and experimentation, the devil is in the details. Pipeline stages may be separated by stable interfaces, but still interact; it is quite common that experimentation affects multiple stages, for example, combining changes to data cleaning with different feature encoding and different hyperparameters during model training. More importantly, decisions in the machine-learning pipeline interact with decisions in other (non-ML) parts of the system, such as how the data is collected or what infrastructure is available to scale training. None of the pipeline stages can be truly considered entirely in isolation without considering other stages and other parts of the larger system, as we will explore. Data scientists tend to work in a very exploratory and iterative mode (see also chapter Data Science and Software Engineering Process Models), often initially on local machines with static snapshots of the data.