NEW YORK – STRATA DATA CONFERENCE – Sept. 26, 2017 -- Trifacta, the global leader in data wrangling, today announced that its Trifacta Wrangler Enterprise and Edge products now integrate with DataRobot's automated machine learning platform. This technology integration enables customers in financial services, life sciences and insurance to streamline the execution of machine learning at scale in order to become AI-driven. The integration between Trifacta and DataRobot supports a variety of deployment options across on-premises and cloud data platforms. "Consensus works with today's leading connected device retailers to seamlessly integrate their operations across in-store, web, social and mobile channels. Our ability to manage and analyze diverse data from all of these different sources allows us to quickly identify critical events such as fraudulent activity for our Connected Commerce Revenue Cloud customers," said Harrison Lynch, senior director of product development, Consensus Corporation.
Trifacta focuses on enabling the people who know their data best (analysts and business people) to effectively explore, structure, and join together diverse data sources for a variety of business purposes. The company also just released a new product, which gave us a good opportunity to talk with Joe Hellerstein, Trifacta's co-founder and CSO, and Joe Scheuermann, VP of marketing, about machine learning, data wrangling, Hadoop, and more. In a world where more and more objects are coming online and vendors are getting involved in the supply chain, how can you keep track of what's yours and what's not? This means Trifacta is sitting on tons of data about using data. But their founding vision, Scheuermann says, "was not exclusively focused on wrangling big data.
At its Google Cloud Next conference in San Francisco back in March, Google unveiled Cloud Dataprep, a service that lets companies clean their structured and unstructured datasets for analysis in, for example, Google's BigQuery, or even for use in training machine learning models. Over the past six months, Cloud Dataprep has been in private beta, but Google is now officially graduating the service to public beta for anyone to use. Some reports indicate that analysts and data scientists can spend up to 80 percent of their time cleaning and preparing raw data for analysis. This is where Dataprep comes into play, as it can automatically detect data types and schemas, and even flag where data is mismatched or missing. A key facet of Dataprep is its visual layout, which makes it easier for people who aren't data engineers to alter or add to their datasets.
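To make the idea of automatic type detection more concrete, here is a minimal Python sketch of how a tool might profile a single column of raw text values. It is not Dataprep's implementation, which the article does not describe. The names `classify`, `profile_column`, and `MISSING_MARKERS` are hypothetical, and the majority-vote rule for picking a column type is an assumption made for illustration.

```python
from collections import Counter

# Assumed set of placeholder strings treated as "no value"; real tools
# usually make this configurable.
MISSING_MARKERS = {"", "na", "n/a", "nan", "null", "none"}


def classify(value):
    """Label one raw cell as 'missing', 'integer', 'float', or 'string'."""
    text = value.strip() if value is not None else ""
    if text.lower() in MISSING_MARKERS:
        return "missing"
    try:
        int(text)
        return "integer"
    except ValueError:
        pass
    try:
        float(text)
        return "float"
    except ValueError:
        return "string"


def profile_column(values):
    """Infer a column type by majority vote and flag problem rows.

    Returns the inferred type plus the row indexes of missing cells and
    of cells whose type disagrees with the inferred one.
    """
    kinds = [classify(v) for v in values]
    present = [k for k in kinds if k != "missing"]
    # Fall back to 'string' when every cell is missing.
    inferred = Counter(present).most_common(1)[0][0] if present else "string"
    return {
        "type": inferred,
        "missing": [i for i, k in enumerate(kinds) if k == "missing"],
        "mismatched": [
            i for i, k in enumerate(kinds) if k not in ("missing", inferred)
        ],
    }


if __name__ == "__main__":
    # Hypothetical sample column: mostly integers, one blank, one stray word.
    print(profile_column(["12", "7", "", "abc", "30"]))
```

On the sample column, the sketch infers an integer column and reports the blank cell as missing and the stray word as mismatched. A strict majority vote is deliberately simple: it would, for instance, flag a whole number inside a mostly decimal column as mismatched, which a production tool would likely treat as compatible.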
Today's business leaders understand that they compete on information as much as on the goods and services that they provide. No longer is there any doubt that organisations should be data-driven; now, it's a matter of how to accelerate and mature analytics initiatives, and how to increase their accessibility beyond a core data science team. Reaching this "transformative" level in data and analytics is inextricably tied to growth, and is a top priority for organisations worldwide. Gartner reports that in recent years, this area has been a number one investment priority for CIOs. In that same report, however, Gartner explains that the majority of organisations have been slow to advance in data and analytics.
A new study commissioned by Trifacta is shining a light on the costs of poor data quality, particularly for organizations implementing AI initiatives. The study found that dirty and disorganized data are linked to AI projects that take longer, are more expensive, and do not deliver the anticipated results. As more firms ramp up AI initiatives, the consequences of poor data quality are expected to grow. The relatively sorry state of data quality is not a new phenomenon. Ever since humans started recording events, we've had to deal with errors.