 data annotator


Advancing Data Equity: Practitioner Responsibility and Accountability in NLP Data Practices

arXiv.org Artificial Intelligence

While research has focused on surfacing and auditing algorithmic bias to ensure equitable AI development, less is known about how NLP practitioners - those directly involved in dataset development, annotation, and deployment - perceive and navigate issues of NLP data equity. This study is among the first to center practitioners' perspectives, linking their experiences to a multi-scalar AI governance framework and advancing participatory recommendations that bridge technical, policy, and community domains. Drawing on a 2024 questionnaire and focus group, we examine how U.S.-based NLP data practitioners conceptualize fairness, contend with organizational and systemic constraints, and engage emerging governance efforts such as the U.S. AI Bill of Rights. Findings reveal persistent tensions between commercial objectives and equity commitments, alongside calls for more participatory and accountable data workflows. We critically engage debates on data diversity and diversity washing, arguing that improving NLP equity requires structural governance reforms that support practitioner agency and community consent.


James Muldoon, Mark Graham and Callum Cant: 'AI feeds off the work of human beings'

The Guardian

James Muldoon is a reader in management at the University of Essex, Mark Graham a professor at the Oxford Internet Institute and Callum Cant a senior lecturer at the University of Essex business school. They work together at Fairwork, a project that appraises the working conditions in digital workplaces, and they are co-authors of Feeding the Machine: The Hidden Human Labour Powering AI. Why did you write the book? James Muldoon: The idea for the book emerged out of field work we did in Kenya and Uganda on the data annotation industry. We spoke to a number of data annotators, and the working conditions were just horrendous.


We are all AI's free data workers

MIT Technology Review

The secret to making AI chatbots sound smart and spew less toxic nonsense is to use a technique called reinforcement learning from human feedback, which uses input from people to improve the model's answers. It relies on a small army of human data annotators who evaluate whether a string of text makes sense and sounds fluent and natural. They decide whether a response should be kept in the AI model's database or removed. Even the most impressive AI chatbots require thousands of human work hours to behave in a way their creators want them to, and even then they do it unreliably. The work can be brutal and upsetting, as we will hear this week when the ACM Conference on Fairness, Accountability, and Transparency (FAccT) gets underway.


What is Data Annotation and How Applied in Machine Learning?

#artificialintelligence

Modern businesses operate in highly competitive markets. Because of this, it can be difficult to find new business opportunities. Customer experiences are always changing. Finding the right talent to help you achieve common business goals can be a major challenge. However, businesses want to do the best possible thing.


How AI Can Improve Job Quality

#artificialintelligence

AI can improve or worsen job quality. What constitutes a quality job? If you were to ask family and friends, they would probably say good pay, benefits, and stable working conditions, but for many workers, workplace technologies, especially AI, are affecting job quality. That's important because the U.S. has a serious job quality problem. The number one ESG challenge companies are grappling with is the treatment of workers.


Protecting Endangered Animals With AI

#artificialintelligence

While AI is making a big impact in nearly every business area, it is also worth noting some of the ways it is helping to save our planet. Conservationists are increasingly turning to AI as an innovative solution to various biodiversity crises. It helps protect a diverse set of species and assists law enforcement agents, who are often short-staffed and find it almost impossible to cover a vast stretch of land such as a national park. This is one of the reasons AI is so useful: it can take much of the time-consuming work, such as constantly monitoring surveillance data, off the shoulders of human workers. In this article, we will talk about some of the interesting ways AI is being used to protect endangered species and the data annotation required to build these systems.


How AI and Data Annotation Are Improving Football Officiating

#artificialintelligence

Many calls made by referees are still debated by fans to this day. These include the controversial goal awarded to Geoff Hurst in the 1966 World Cup Final, which allowed him to score a hat trick. Who can forget Diego Maradona's famous handball in 1986, which resulted in a goal against England? FIFA is trying not only to reduce such infamous moments but also to help referees, who often do not have a clear view of what is going on. This is why FIFA officials have been experimenting with new AI technology that can track player motions and allow referees to make more accurate offside calls.


5 Foundational Facts to Understand the Data Labeling Process

#artificialintelligence

The data labeling market is growing at a remarkable rate. Recent estimates suggest that it will be worth over $38 billion by 2028. While it is fascinating to observe how fast this market is growing, many laypeople and even some IT professionals are confused about the concept behind data labeling. What is data labeling, and what are its implications for the data science profession? Working with AI (artificial intelligence) models involves various components, one of the most important being data labeling.


Unbiased AI becomes mission-critical in 2021

#artificialintelligence

This article is the fourth in a 5-part series on predictions in AI in 2021 -- catch up on the first, second, and third in the series. Perhaps the most succinct summary of the relationship between artificial intelligence (AI) and data is this: an AI model is only as good as the data it was trained on. Training data serves as the foundation of AI solutions everywhere and can make or break their success. Data management is a key focal point for companies building machine learning (ML) models, and this domain will only continue to grow in importance in 2021 and beyond. In the coming years, it will be more evident than ever how steep the price is of getting this area of AI wrong.


The Human-power Behind AI: Machine Learning Needs Annotators

#artificialintelligence

NEW YORK, NY / ACCESSWIRE / November 6, 2020 / "The global data collection and labeling market size was valued at USD 1.0 billion in 2019 and is expected to witness a CAGR of 26.0% from 2020 to 2027," according to a market analysis report by Grand View Research. Application scenarios for artificial intelligence are steadily multiplying, and these applications are changing our lives by providing automated and smart services. Behind the rapid growth of the AI industry, the new profession of data annotator is also expanding. There is a popular saying in the data annotation industry: "more intelligent, more labors". The data that AI algorithms learn from must be annotated piece by piece by human annotators.