

6 Tips for Building a Training Data Strategy for Machine Learning

#artificialintelligence

Artificial intelligence (AI) and machine learning (ML) are frequently used terms these days. AI refers to the concept of machines mimicking human cognition. ML is an approach used to create AI. If AI is when a computer can carry out a set of tasks based on instructions and data, ML is a machine's ability to ingest, parse, and learn from that data itself in order to become more accurate or precise at accomplishing the task. Executives in industries such as automotive, finance, government, healthcare, retail, and tech may already have a basic understanding of ML and AI.


Poor data quality causing majority of artificial intelligence projects to stall

#artificialintelligence

A majority of enterprises engaged in artificial intelligence and machine learning initiatives (78 percent) said these projects have stalled--and data quality is one of the culprits--according to a new study from Dimensional Research. Nearly eight out of 10 organizations using AI and ML report that projects have stalled, and 96 percent of these companies have run into problems with data quality, the data labeling required to train AI, and building model confidence, said the report, which was commissioned by training platform provider Alegion. For the research, Dimensional conducted a worldwide survey of 227 enterprise data scientists, other AI technologists, and business stakeholders involved in active AI and ML projects. Data issues are causing enterprises to quickly burn through AI project budgets and face project hurdles, the study said. Other findings of the survey: 70 percent of respondents report that their first AI/ML investment was within the last 24 months; more than half of enterprises said they have undertaken fewer than four AI and ML projects; and only half of enterprises have released AI/ML projects into production.


CenturyLink's No Sweat Approach to AI (Light Reading)

#artificialintelligence

"In the past, large volumes of data made us sweat". So said Pari Bajpay, vice president of Next Generation Enablement at CenturyLink, during a presentation titled "Can AI deliver its promise of a cost-effective, improved experience in telecom?" at the TM Forum's recent Digital Transformation World event in Nice. "We didn't have the networking, compute and storage capacity to cope. A lot of the data would be turned off and you would only work on the critical aspects of the data because what you had on the other end of it was humans that could not process such large volumes," noted Bajpay. However, as big data technology has matured, Bajpay and his team at CenturyLink have grappled with the issue and are now leveraging AI to extract more value from their data.


How AI and Big Data are Improving Research Results (Qualtrics)

#artificialintelligence

Market research is a $44.5 billion market and growing. Online research is among the fastest-growing parts of the market thanks to the pervasiveness of the web and the ease with which we can now collect data. However, as the world conducts more and more survey research, the issues that we see elsewhere with big data are now affecting the survey research industry as well, specifically the issue of data quality. Thanks to the growth in online survey research, billions of survey responses are collected every year. But roughly one quarter of those responses are of poor quality [1].


If Your Data Is Bad, Your Machine Learning Tools Are Useless

#artificialintelligence

Poor data quality is enemy number one to the widespread, profitable use of machine learning. While the caustic observation "garbage in, garbage out" has plagued analytics and decision-making for generations, it carries a special warning for machine learning. The quality demands of machine learning are steep, and bad data can rear its ugly head twice -- first in the historical data used to train the predictive model and second in the new data used by that model to make future decisions. To properly train a predictive model, historical data must meet exceptionally broad and high quality standards. First, the data must be right: It must be correct, properly labeled, de-duped, and so forth.
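The kinds of checks the passage describes -- dropping duplicates and catching mislabeled records before training -- can be sketched in a few lines. This is a minimal illustration, not the article's method; the field names ("text", "label") and the allowed label set are assumptions for the example.

```python
def clean_training_data(records, allowed_labels):
    """Drop exact duplicate rows and rows with missing or unknown labels.

    A toy pre-training data-quality pass: deduplication plus label
    validation, two of the checks the article says training data needs.
    """
    seen = set()
    cleaned = []
    for rec in records:
        key = (rec.get("text"), rec.get("label"))
        if key in seen:  # exact duplicate: skip ("de-duped")
            continue
        seen.add(key)
        if rec.get("label") not in allowed_labels:  # missing or typo'd label
            continue
        cleaned.append(rec)
    return cleaned


# Hypothetical sentiment-labeling data for illustration.
raw = [
    {"text": "great product", "label": "positive"},
    {"text": "great product", "label": "positive"},   # exact duplicate
    {"text": "arrived broken", "label": "negatve"},   # misspelled label
    {"text": "works fine", "label": "neutral"},
]
clean = clean_training_data(raw, {"positive", "negative", "neutral"})
```

Here the duplicate and the misspelled label are both filtered out, leaving two usable records; in practice these checks would be one small part of a much broader data-validation pipeline.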