data warehouse

3 Benefits of a Self-Adapting Data Warehouse


As technology advances to meet new data demands, it also creates new areas of opportunity for business growth and operational efficiency. Today, for example, some technologies already enable automatic query optimization, while machine learning algorithms help automate a variety of once-manual functions. Advances in technology are even starting to let data warehouses tune themselves. This capability is accelerating the speed at which data warehouses deliver value to businesses. With the advance of machine learning and availability of near-infinite storage and computing power in the cloud, we're headed toward an exciting new era: the age of the self-adapting data warehouse.

The real big-data problem and why only machine learning can fix it - SiliconANGLE


Why do so many companies still struggle to build a smooth-running pipeline from data to insights? They invest in heavily hyped machine-learning algorithms to analyze data and make business predictions. Then, inevitably, they realize that algorithms aren't magic; if they're fed junk data, their insights won't be stellar. So they employ data scientists who spend 90% of their time washing and folding in a data-cleaning laundromat, leaving just 10% of their time to do the job for which they were hired. The flaw in this process is that companies only get excited about machine learning for end-of-the-line algorithms; they should apply machine learning just as liberally in the early cleansing stages instead of relying on people to grapple with gargantuan data sets, according to Andy Palmer, co-founder and chief executive officer of Tamr Inc., which helps organizations use machine learning to unify their data silos.

Analytics, Data Warehouses & Data Insights - Actifio


Data Scientists, Analysts, and Database Administrators (DBAs) create many copies of production data for a range of analysis use cases. And increasingly, these data environments must be spun up (and torn down) rapidly, regardless of data size. Actifio's software platform delivers a new level of data agility, managing data throughout its lifecycle, and providing instant access to virtual full data images, on-premises or in any cloud. Due to the inherent inefficiencies in restore processes, particularly when accessed from tape, these environments are provisioned infrequently, and updated even less often. To overcome this time problem, DBAs often create physical copies of production databases.

Data virtualization use cases cover more integration tasks


Gartner predicts that 60% of organizations will deploy data virtualization software as part of their data integration tool set by 2020. That's a big jump from the adoption rate of about 35% the consulting and market research company cited in a November 2018 report on the data virtualization market. But the technology "is rapidly gaining momentum," a group of four Gartner analysts wrote in the report. The analysts said data virtualization use cases are on the rise partly because IT teams are struggling to physically integrate a growing number of data silos, as relational database management system (DBMS) environments are augmented by big data systems and other new data sources. They also pointed to increased technology maturity that has removed deployment barriers for data virtualization users.

CenturyLink's No Sweat Approach to AI - Light Reading


"In the past, large volumes of data made us sweat." So said Pari Bajpay, vice president of Next Generation Enablement at CenturyLink, during a presentation titled "Can AI deliver its promise of a cost-effective, improved experience in telecom?" at the TM Forum's recent Digital Transformation World event in Nice. "We didn't have the networking, compute and storage capacity to cope. A lot of the data would be turned off and you would only work on the critical aspects of the data because what you had on the other end of it was humans that could not process such large volumes," noted Bajpay. However, as big data technology has matured, Bajpay and his team at CenturyLink have grappled with the issue and are now leveraging AI to extract more value from their data.

Designing and Implementing Data Warehouse for Agricultural Big Data Artificial Intelligence

In recent years, precision agriculture that uses modern information and communication technologies has become very popular. Raw and semi-processed agricultural data are usually collected from various sources, such as the Internet of Things (IoT), sensors, satellites, weather stations, robots, farm equipment, farmers and agribusinesses. In addition, agricultural datasets are very large, complex, unstructured, heterogeneous, non-standardized, and inconsistent. Hence, agricultural data mining is considered a Big Data application in terms of volume, variety, velocity and veracity. It is a key foundation for establishing a crop intelligence platform, which will enable resource-efficient agronomy decision making and recommendations. In this paper, we designed and implemented a continental-level agricultural data warehouse by combining Hive, MongoDB and Cassandra. Our data warehouse offers the following capabilities: (1) flexible schema; (2) data integration from multiple real agricultural datasets; (3) data science and business intelligence support; (4) high performance; (5) high storage; (6) security; (7) governance and monitoring; (8) replication and recovery; (9) consistency, availability and partition tolerance; (10) distributed and cloud deployment. We also evaluate the performance of our data warehouse.

A Knowledge Graph-based Approach for Exploring the U.S. Opioid Epidemic Artificial Intelligence

The United States is in the midst of an opioid epidemic, with recent estimates indicating that more than 130 people die every day due to drug overdose. Over-prescription of, and addiction to, opioid painkillers, heroin, and synthetic opioids have led to a public health crisis and created a huge social and economic burden. Statistical learning methods that use data from multiple clinical centers across the US to detect opioid over-prescribing trends and predict possible opioid misuse are required. However, the semantic heterogeneity in the representation of clinical data across different centers makes the development and evaluation of such methods difficult and non-trivial. We create the Opioid Drug Knowledge Graph (ODKG) -- a network of opioid-related drugs, active ingredients, formulations, combinations, and brand names. We use the ODKG to normalize drug strings in a clinical data warehouse consisting of patient data from over 400 healthcare facilities in 42 different states. We showcase the use of ODKG to generate summary statistics of opioid prescription trends across US regions. These methods and resources can aid the development of advanced and scalable models to monitor the opioid epidemic and to detect illicit opioid misuse behavior. Our work is relevant to policymakers and pain researchers who wish to systematically assess factors that contribute to opioid over-prescribing and iatrogenic opioid addiction in the US.
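
To make the idea of normalizing drug strings against a knowledge graph more concrete, here is a minimal sketch in Python. It is not the ODKG or the authors' method: the lookup table `BRAND_TO_INGREDIENTS`, the functions `normalize_drug_string` and `count_by_ingredient`, and the sample strings are all hypothetical, and a real graph would also model formulations, strengths, and combinations.

```python
import re
from collections import Counter

# Toy stand-in for a drug knowledge graph: brand name -> active ingredients.
# Illustrative entries only; this is NOT the ODKG's actual content or schema.
BRAND_TO_INGREDIENTS = {
    "oxycontin": ("oxycodone",),
    "percocet": ("acetaminophen", "oxycodone"),
    "vicodin": ("acetaminophen", "hydrocodone"),
}
KNOWN_INGREDIENTS = {
    ingredient
    for ingredients in BRAND_TO_INGREDIENTS.values()
    for ingredient in ingredients
}


def normalize_drug_string(raw):
    """Map a free-text drug string to a sorted tuple of ingredient names."""
    # Keep alphabetic tokens only, so doses such as "10" or "5-325" are ignored.
    tokens = re.findall(r"[a-z]+", raw.lower())
    found = set()
    for token in tokens:
        if token in BRAND_TO_INGREDIENTS:
            found.update(BRAND_TO_INGREDIENTS[token])
        elif token in KNOWN_INGREDIENTS:
            found.add(token)
    return tuple(sorted(found))


def count_by_ingredient(records):
    """Count how many records mention each normalized ingredient."""
    counts = Counter()
    for raw in records:
        for ingredient in normalize_drug_string(raw):
            counts[ingredient] += 1
    return counts


if __name__ == "__main__":
    sample = ["OxyContin 10 MG tablet", "PERCOCET 5-325", "oxycodone hcl"]
    for raw in sample:
        print(raw, "->", normalize_drug_string(raw))
    print(count_by_ingredient(sample))
```

Once differently spelled strings from different facilities collapse to the same ingredient tuple, summary statistics such as prescription counts per ingredient can be computed over one consistent vocabulary.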

Machine Learning Deployment Options: in the Cloud vs. at the Edge - insideBIGDATA


In this special guest feature, Neil Cohen, Vice President at Edge Intelligence, examines the question: where should businesses develop and execute machine learning? This article explores the pros and cons of running it in the cloud versus at the edge. Neil brings more than 15 years of combined marketing and product management experience to his role as VP of Product Management & Marketing for Edge Intelligence. Previously, he was VP of Global Marketing at Akamai Technologies, where he ran worldwide marketing for a $1.3 billion cybersecurity and web performance business. He was also VP of Product Marketing at Akamai, where he helped the organization double revenue, repeatedly launched new products, and helped grow them into businesses exceeding hundreds of millions of dollars.

The 5 exciting machine learning, data science and big data trends for 2019 - Edvancer Eduventures


Big data and analytics have become crucial to business. But will that spine develop, or will it change the landscape of business yet again? Here's a sneak peek into what the following months look like. Just a while ago big data was a lucrative new phenomenon promising a smooth business takeover. Now, since data and analytics are imperative to business and deeply embedded, the question arises whether technology will have a growth spurt in the coming year, continue to mold and restructure businesses or be replaced by something else.

Updated: Difference Between Business Intelligence and Data Science


I'm reposting this blog (with updated graphics) because I still get many questions about the difference between Business Intelligence and Data Science. I recently had a client ask me to explain to his management team the difference between a Business Intelligence (BI) Analyst and a Data Scientist. I frequently hear this question, and typically resort to showing Figure 1 (BI Analyst vs. Data Scientist Characteristics chart, which shows the different attitudinal approaches for each)... But these slides lack the context required to satisfactorily answer the question – I'm never sure the audience really understands the inherent differences between what a BI analyst does and what a data scientist does. The key is to understand the differences between the BI analyst's and data scientist's goals, tools, techniques and approaches.