data mining


Scalable Signal Reconstruction for a Broad Range of Applications

Communications of the ACM

The signal reconstruction problem (SRP) is an important optimization problem where the objective is to identify a solution to an underdetermined system of linear equations that is closest to a given prior. It has a substantial number of applications in diverse areas, such as network traffic engineering, medical image reconstruction, acoustics, astronomy, and many more. Unfortunately, most of the common approaches for solving SRP do not scale to large problem sizes. We propose a novel and scalable algorithm for solving this critical problem. Specifically, we make four major contributions. First, we propose a dual formulation of the problem and develop the DIRECT algorithm that is significantly more efficient than the state of the art. Second, we show how adapting database techniques developed for scalable similarity joins provides a substantial speedup over DIRECT. Third, we describe several practical techniques that allow our algorithm to scale, on a single machine, to settings that are orders of magnitude larger than previously studied. Finally, we use the database techniques of materialization and reuse to extend our result to dynamic settings where the input to the SRP changes. Extensive experiments on real-world and synthetic data confirm the efficiency, effectiveness, and scalability of our proposal. The database community has been at the forefront of grappling with challenges of big data and has developed numerous techniques for the scalable processing and analysis of massive datasets. These techniques often originate from solving core data management challenges but then find their way into effectively addressing the needs of big data analytics. We study how database techniques can benefit large-scale signal reconstruction [13], which is of interest to research communities as diverse as computer networks [15] and medical imaging [7].
We demonstrate that the scalability of existing solutions can be significantly improved using ideas originally developed for similarity joins [5] and selectivity estimation for set similarity queries [3]. Signal reconstruction problem (SRP): The essence of SRP is to solve a linear system of the form $AX = b$, where $X$ is a high-dimensional unknown signal (represented by an $m$-dimensional vector in $\mathbb{R}^m$), $b$ is a low-dimensional projection of $X$ that can be observed in practice (represented by an $n$-dimensional vector in $\mathbb{R}^n$ with $n \ll m$), and $A$ is an $n \times m$ matrix that captures the linear relationship between $X$ and $b$.
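
To make the problem statement above concrete, here is a minimal sketch of the small-scale, textbook way to solve it, assuming "closest to the prior" means smallest Euclidean distance and that A has full row rank. Both assumptions, the function name closest_to_prior, and the toy sizes are illustrative and not taken from the article; this is not the DIRECT algorithm the authors propose, only the closed-form baseline that such algorithms aim to outscale.

```python
import numpy as np


def closest_to_prior(A, b, prior):
    """Return the x satisfying A @ x = b that minimizes ||x - prior||_2.

    Assumes A (n x m, n < m) has full row rank, so A @ A.T is invertible.
    Closed form: x = prior + A^T (A A^T)^{-1} (b - A prior).
    """
    residual = b - A @ prior                 # how far the prior is from satisfying AX = b
    y = np.linalg.solve(A @ A.T, residual)   # n x n system, cheap when n is small
    return prior + A.T @ y                   # move the prior onto the solution set


# Toy instance: n = 3 observations of an m = 8 dimensional signal.
rng = np.random.default_rng(0)
n, m = 3, 8
A = rng.standard_normal((n, m))
x_true = rng.standard_normal(m)                   # hidden signal
b = A @ x_true                                    # observed projection
prior = x_true + 0.1 * rng.standard_normal(m)     # noisy guess of the signal

x_hat = closest_to_prior(A, b, prior)
print("constraint satisfied:", np.allclose(A @ x_hat, b))
print("distance to prior:", np.linalg.norm(x_hat - prior))
```

The dense solve with A @ A.T is what becomes impractical at large n and m, which is the scalability gap the abstract refers to.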


AI in Analytics: Powering the Future of Data Analytics - Dataconomy

#artificialintelligence

Augmented analytics, the combination of AI and analytics, is the latest innovation in data analytics. For organizations, data analysis has evolved from hiring "unicorn" data scientists to having smart applications that, thanks to AI, provide actionable insights for decision-making in just a few clicks. Augmenting by definition means making something greater in strength or value. Augmented analytics, also known as AI-driven analytics, helps identify hidden patterns in large data sets and uncovers trends and actionable insights. It leverages technologies such as Analytics, Machine Learning, and Natural Language Generation to automate data management processes and assist with the hard parts of analytics. The capabilities of AI are poised to augment analytics activities and enable companies to internalize data-driven decision-making while enabling everyone in the organization to easily deal with data.


Accelerating the Practical Use of AI - InformationWeek

#artificialintelligence

The hype about how artificial intelligence can miraculously change the world continues to fill media outlets. Still, the reality of how rapidly the science behind AI is evolving and becoming mainstream in every industry and facet of business will not be impeded. By the year 2025, the intersection of "advanced" AI and intelligent machines will become a part of every user's "things I just know how to use." As more industries adopt AI solutions and become savvy about how AI impacts their engagement with suppliers and employees, it is important for organizations to follow four key steps to implement it. While roles like data scientist, chief data officer, and senior data engineer are vital to implementing AI/ML systems, the following two roles are imperative for practical implementation.


Global Big Data Conference

#artificialintelligence

According to the National Oceanic and Atmospheric Administration (NOAA), more than 80% of the ocean "remains unmapped, unobserved, and unexplored" – despite constituting more than 70% of the planet's surface. Now, a pair of Navy veterans are looking to change that with a line of autonomous robot vehicles that will plunge the ocean's depths in search of big data for the company's clients. "The company really started when Joe [Wolfel] and I first got together, which was back in 2004," said Judson Kauffman, who shares the CEO role with Wolfel, in an interview with Datanami. "We met in [Navy] SEAL training together, and ended up being assigned the same unit, and then went into combat together and became very close friends." There, they developed the idea for Terradepth, which "stemmed from some knowledge that we gained in the Navy" – really, Kauffman said, "just of how ignorant humanity is of what's underwater, what's in the sea." "It was shocking to learn how little we know, how little the U.S. Navy knew," he continued – and the more they dug into the issue after their time in the Navy, the more surprised they were.


Google: Learn cloud skills for free with our new training tracks

ZDNet

Google is offering a free course for people who are on the hunt for skills to use containers, big data and machine-learning models in Google Cloud. The initial batch of courses consists of four tracks aimed at data analysts, cloud architects, data scientists and machine-learning engineers. The January 2021 course offers a fast track to understand key tools for engineers and architects to use in Google Cloud. It includes a series on getting started in Google Cloud, another focussing on its BigQuery data warehouse, one that delves into the Kubernetes engine for managing containers, another for the Anthos application management platform, and a final chapter on Google's standard interfaces for natural language processing and computer vision AI. Participants need to sign up to Google's "skills challenge" and will be given 30 days' free access to Google Cloud labs.


Global Big Data Conference

#artificialintelligence

According to the AI Council, the biggest barrier to AI deployment is skills - and it starts as early as school. With artificial intelligence estimated to have the potential to deliver as much as a 10% increase to the UK's GDP before 2030, the challenge remains to unlock the technology's potential – and to do so, a panel of AI experts recommends placing a bet on young brains. A new report from the AI Council, an independent committee that provides advice to the UK government on all algorithmic matters, finds that steps need to be taken from the very start of children's education for artificial intelligence to flourish across the country. The goal, for the next ten years, should be no less ambitious than to ensure that every child leaves school with a basic sense of how AI works. This is not only about understanding the basics of coding and ethics, but about knowing enough to be a confident user of AI products, to look out for potential risks and to engage with the opportunities that the technology presents.


The Robots are Coming: Is AI the Future of Biotech?

#artificialintelligence

AI, or artificial intelligence, has taken root in biotech. In this article, we explore its newfound niches in the industry. Artificial intelligence (AI) and machine learning (ML) have become ubiquitous in tech startups, fueled largely by the increasing availability and amount of data and cheaper, more powerful computers. Now, if you are a new tech startup, ML or AI capabilities represent your minimum ticket to enter the industry. Over the past few years, AI and ML have started to peek their heads into the realm of biotech, due to an analogous transformation of biotech data.


Top Data Science Education Initiatives By Institutions In 2020

#artificialintelligence

While normal education suffered a standstill in 2020, some of the most prestigious institutions and big tech giants launched many online courses and programs so that learning and skill development would not suffer. As has been the trend for a few years now, some of the most interesting initiatives were seen in the field of data science. In this article, we have listed some of the prominent data science education programs and initiatives in 2020. Microsoft, in collaboration with Netflix, has launched three new learning modules on beginner concepts in data science, along with machine learning and artificial intelligence. The design of these courses is inspired by the Netflix original film 'Over The Moon,' in which a young girl, Fei Fei, builds a rocket to the moon and embarks on a mission to prove the existence of the Moon Goddess.


15 Free Data Science, Machine Learning & Statistics eBooks for 2021 - KDnuggets

#artificialintelligence

An Introduction to Statistical Learning, with Applications in R (ISLR) can be considered a less advanced treatment of the topics found in another classic of the genre written by some of the same authors, The Elements of Statistical Learning. Another major difference between these 2 titles, beyond the level of depth of the material covered, is that ISLR introduces these topics alongside practical implementations in a programming language, in this case R.


Data-driven 2021: Predictions for a new year in data, analytics and AI

ZDNet

Towards the end of each year, I receive a slew of predictions, from data/analytics industry executives and luminaries, focused on the year ahead. This year, those predictions filled a 49-page-long document. While I couldn't include all of them, I've rounded up many of this year's prognostications, from over 30 companies, in this post. The roster includes numerous well-known data/analytics players, including Cloudera, Databricks, Micro Focus, Qlik, SAS, and Snowflake, to name a few. Thoughts from execs at Andreessen Horowitz, the Deloitte AI Institute and O'Reilly are in the mix as well, as are those from executives at smaller but still important industry players.