algorithmic audit
Testing software for non-discrimination: an updated and extended audit in the Italian car insurance domain
Rondina, Marco, Vetrò, Antonio, Coppola, Riccardo, Regragui, Oumaima, Fabris, Alessandro, Silvello, Gianmaria, Susto, Gian Antonio, De Martin, Juan Carlos
Context. As software systems become more integrated into society's infrastructure, the responsibility of software professionals to ensure compliance with various non-functional requirements increases. These requirements include security, safety, privacy, and, increasingly, non-discrimination. Motivation. Fairness in pricing algorithms ensures equitable access to basic services, without discrimination on the basis of protected attributes. Method. We replicate a previous empirical study that used black-box testing to audit pricing algorithms used by Italian car insurance companies, accessible through a popular online system. Compared with the previous study, we enlarged both the number of tests and the number of demographic variables under analysis. Results. Our work confirms and extends the previous findings, highlighting the problematic persistence of discrimination over time: demographic variables still significantly impact pricing, with birthplace remaining the main discriminatory factor against individuals not born in Italian cities. We also found that a driver's profile can determine the number of quotes available to the user, denying equal opportunities to all. Conclusion. The study underscores the importance of testing for non-discrimination in software systems that affect people's everyday lives. Performing algorithmic audits over time makes it possible to evaluate the evolution of such algorithms. It also demonstrates the role that empirical software engineering can play in making software systems more accountable.
- North America > United States > New York > New York County > New York City (0.05)
- Africa > Middle East > Morocco (0.04)
- Europe > Romania (0.04)
- (7 more...)
- Research Report > Experimental Study (0.68)
- Research Report > New Finding (0.68)
- Banking & Finance > Insurance (1.00)
- Government > Regional Government > Europe Government (0.68)
- Information Technology > Artificial Intelligence (1.00)
- Information Technology > Software (0.75)
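The audit methodology in the abstract above lends itself to a compact illustration. The sketch below is not the authors' test harness; it assumes a hypothetical get_quote function standing in for the online quote system (with a synthetic, deliberately injected birthplace penalty so the example runs end to end), and it shows the core idea of counterfactual pair testing: request quotes for profiles identical in every rating variable except the protected attribute, so that any price gap can be attributed to that attribute alone.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class DriverProfile:
    age: int
    gender: str
    birthplace: str        # protected attribute under test
    city_of_residence: str
    km_per_year: int

def get_quote(profile: DriverProfile) -> float:
    # Toy stand-in for the black-box system under test. In a real
    # audit this call would drive the insurer's web form or API; here
    # it returns a synthetic price with an injected penalty on foreign
    # birthplaces, purely so the sketch is runnable.
    price = 400.0 + 0.01 * profile.km_per_year
    if profile.birthplace not in {"Milano", "Roma"}:
        price *= 1.15
    return round(price, 2)

def counterfactual_quotes(baseline: DriverProfile, birthplaces: list[str]):
    # Yield profiles that differ from the baseline *only* in the
    # protected attribute, paired with the price each one receives.
    for bp in birthplaces:
        variant = replace(baseline, birthplace=bp)
        yield variant, get_quote(variant)

baseline = DriverProfile(age=32, gender="F", birthplace="Milano",
                         city_of_residence="Torino", km_per_year=10_000)
reference = get_quote(baseline)
for variant, price in counterfactual_quotes(baseline, ["Bucuresti", "Casablanca", "Roma"]):
    # With all non-protected variables held fixed, any systematic gap
    # is attributable to the protected attribute alone.
    print(variant.birthplace, f"{price - reference:+.2f}")
```

In a real audit the same pairing logic is repeated across many baseline profiles and quote dates, so that per-attribute price gaps can be tested for statistical significance rather than read off a single pair.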
Actionable Auditing Revisited
Non-target corporations Kairos and Amazon have overall error rates of 6.60% and 8.66%, respectively: the worst performances among the companies analyzed in the follow-up audit. Nonetheless, when compared with the target corporations' performance from May 2017, the Kairos and Amazon error rates are lower than the former error rates of IBM (12.1%) and Face++ (9.9%), and only slightly higher than Microsoft's performance (6.2%) from the initial study.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- (5 more...)
- Law > Statutes (1.00)
- Law > Civil Rights & Constitutional Law (1.00)
- Information Technology > Security & Privacy (1.00)
- Government > Regional Government > North America Government > United States Government (0.47)
Problems with audits for bias in AI systems highlighted in research paper
Sasha Costanza-Chock, co-author of a research paper that examines algorithmic audits, says many aspects of the process need improvement to make audits more effective and to reduce harms from bias in AI used in the real world, such as facial recognition systems. Speaking about the Algorithmic Justice League paper on a recent episode of the technology news podcast Marketplace, Costanza-Chock notes that it is currently very difficult to determine the effectiveness of algorithmic audits because of non-disclosure agreements that bind first- and second-party auditors, who have the most access to the data and systems of the companies they audit. Bias has been found not only in algorithms for biometric matching, but also in adjacent areas like liveness detection, as well as in unrelated AI applications. While putting together the research paper, which identifies emerging best practices as well as methods and tools for AI audits, the team found considerable variation in the algorithmic auditing process, as there is no harmonized standard or regulation specifying what auditors should look for. Some audits focus on the accuracy or fairness of training and sample data, others examine the privacy and security implications of the systems under audit, and only about half of the auditors interviewed said they check whether companies have adequate systems for users to report harms from AI bias in real time.
It's Time to Develop the Tools We Need to Hold Algorithms Accountable
When algorithms fail, people get hurt. We now have the evidence: false arrests [1] and wrongful accusations [2] caused by avoidable errors; glitches blocking access to healthcare [3] or housing [4]; and biased outcomes creating rather than removing barriers for the most vulnerable to succeed [5]. Much of this evidence was collected by algorithmic auditors, who meticulously analyze these systems for failures and communicate concretely about the ways in which they fall short of ensuring the safety of those impacted. Given the increasingly visible policy developments mandating audit activity and the proliferation of deployed algorithmic products, algorithmic audits are increasingly crucial tools for holding vendors and operators accountable for the impacts of the algorithmic systems they choose to release into the real world. Yet despite the increasingly prevalent academic discussion of algorithmic audits, such audits remain incredibly difficult to execute. They are often completed only with great difficulty and remain surprisingly ad hoc, developed in isolation from other efforts and reliant on either custom tooling or mainstream resources that fall short of supporting the actual audit goal of accountability.
Who Decides if AI is Fair? The Labels Problem in Algorithmic Auditing
Mishra, Abhilash, Gorana, Yash
Labelled "ground truth" datasets are routinely used to evaluate and audit AI algorithms applied in high-stakes settings. However, there are no widely accepted benchmarks for the quality of labels in these datasets. We provide empirical evidence that label quality can significantly distort the results of algorithmic audits in real-world settings. Using data annotators typically hired by AI firms in India, we show that low fidelity of the ground-truth data can lead to spurious differences in the performance of automatic speech recognition (ASR) systems between urban and rural populations. After a rigorous, albeit expensive, label-cleaning process, these disparities between groups disappear. Our findings highlight how trade-offs between label quality and data annotation costs can complicate algorithmic audits in practice. They also emphasize the need to develop consensus-driven, widely accepted benchmarks for label quality.
- Asia > India (0.29)
- North America > United States > Illinois > Cook County > Chicago (0.05)
- Oceania > Australia > New South Wales > Sydney (0.04)
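The mechanism the abstract describes, noisy labels masquerading as group disparities, can be reproduced in a few lines. The simulation below is our own illustration, not the paper's experimental setup: a model with an identical true error rate on both groups appears markedly worse on the group whose recorded labels are noisier.

```python
import random

random.seed(0)

def measured_error(n: int, model_err: float, label_noise: float) -> float:
    # Fraction of items on which the model disagrees with the *recorded*
    # label, when the model's true error rate is model_err and recorded
    # labels are flipped with probability label_noise.
    disagreements = 0
    for _ in range(n):
        truth = random.random() < 0.5
        pred = truth if random.random() > model_err else not truth
        recorded = truth if random.random() > label_noise else not truth
        disagreements += pred != recorded
    return disagreements / n

# Identical model error for both groups; only annotation quality differs.
urban = measured_error(100_000, model_err=0.10, label_noise=0.02)
rural = measured_error(100_000, model_err=0.10, label_noise=0.15)
print(f"measured urban error: {urban:.3f}")  # ~0.116
print(f"measured rural error: {rural:.3f}")  # ~0.220
```

The expected measured disagreement is model_err·(1−noise) + (1−model_err)·noise, so with these parameters the rural group appears nearly twice as error-prone even though the model performs identically on both groups.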
Audits attempt to straighten out the "wild, wild west" of algorithms
AI algorithms employed in everything from hiring to lending to criminal justice have a persistent and often invisible problem with bias. The big picture: One solution could be audits that aim to determine whether an algorithm is working as intended, whether it is disproportionately affecting different groups of people and, if there are problems, how they can be fixed. How it works: Algorithmic audits, usually conducted by outside companies, involve examining an algorithm's code and the data used to train it, and assessing its potential impact on populations through interviews with stakeholders and those who might be affected by it. Between the lines: Financial audits exist in part to open up the black box of a company's internal operations to outside investors, and to ensure that a company remains in compliance with financial laws and regulations. Details: Algorithmic audits can help companies screen their AI products for flaws that may not be apparent at first glance.
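One concrete form of the "disproportionately affecting different groups" check mentioned above is the disparate impact ratio. The sketch below is our illustration rather than any specific auditor's tool: it compares each group's favorable-decision rate against the best-performing group, using the common four-fifths threshold drawn from US employment law.

```python
from collections import Counter

def disparate_impact(outcomes: list[tuple[str, bool]], threshold: float = 0.8):
    # outcomes: (group, favorable_decision) pairs. Flags groups whose
    # selection rate falls below `threshold` times the highest group's
    # rate, i.e. the common "four-fifths rule" screen.
    totals, favorable = Counter(), Counter()
    for group, ok in outcomes:
        totals[group] += 1
        favorable[group] += ok
    rates = {g: favorable[g] / totals[g] for g in totals}
    best = max(rates.values())
    flagged = {g: r for g, r in rates.items() if r < threshold * best}
    return rates, flagged

decisions = ([("A", True)] * 80 + [("A", False)] * 20
             + [("B", True)] * 50 + [("B", False)] * 50)
rates, flagged = disparate_impact(decisions)
print(rates)    # {'A': 0.8, 'B': 0.5}
print(flagged)  # {'B': 0.5} since 0.5 < 0.8 * 0.8
```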
Job Screening Service Halts Facial Analysis of Applicants
Job hunters may now need to impress not just prospective bosses but artificial intelligence algorithms too, as employers screen candidates by having them answer interview questions in a video that is then assessed by a machine. HireVue, a leading provider of software for vetting job candidates through algorithmic assessment, said Tuesday it is killing off a controversial feature of its software: analyzing a person's facial expressions in a video to discern certain characteristics. Job seekers screened by HireVue sit in front of a webcam and answer questions. Their behavior, intonation, and speech are fed to an algorithm that assigns certain traits and qualities. HireVue says that an "algorithmic audit" of its software conducted last year shows it does not harbor bias.
- Information Technology > Security & Privacy (0.51)
- Telecommunications (0.40)