 Information Fusion


Paxata Self-Service Data Integration and Management

#artificialintelligence

Paxata provides an embedded catalog for storing curated data assets. Google-like search, tags and annotations allow users to easily explore the catalog and find the right data for a project. The data preparation paradigm follows a Google Sheets style of simultaneous project viewing and editing to enable data owners and analysts to collaborate and rapidly prototype and build desired outcomes on a unified, single view of data.


Taking Analytics to the Edge: Moving Processing to the Data Rather than Data to the Processing

#artificialintelligence

Ground-breaking changes are happening on the edge of computing. We're long past the days when all analytics can be centralized in datacenters or even in the cloud. It's an increasingly decentralized world where analytics has to take place in real time right where individual sensors are, or in the fog when there's a need to collect information from multiple devices for fast insights. We recently sat down to talk with renowned technology consultant Marc Staimer about computing in the edge, the fog, and the core. In part two of our conversation, we're going to take a more in-depth look at matching the analytics requirements to the location of the analysis, especially when those analytics need to take place on the edge or in the fog.


Big Data ETL Architect - IoT BigData Jobs

#artificialintelligence

At U.S. Bank, we're passionate about helping customers and the communities where we live and work. The fifth-largest bank in the United States, we're one of the country's most respected, innovative and successful financial institutions. U.S. Bank is an equal opportunity employer committed to creating a diverse workforce. We consider all qualified applicants without regard to race, religion, color, sex, national origin, age, sexual orientation, disability or veteran status, among other factors. U.S. Bank is seeking a proficient Big Data ETL Architect with experience in Big Data technologies and Data Architecture to contribute toward the success of our technology initiatives.


Big Data Integration Engineer - IoT BigData Jobs

#artificialintelligence

Wargaming is looking for a Big Data Integration Engineer with experience in ETL, Hadoop and Oracle. This is a fantastic opportunity to be part of a fast-growing, award-winning global brand in the gaming industry that is at the cutting edge of PC and console gaming technology. Our games generate massive amounts of data (over 100 million lines of data from just a few dozen game sessions), and this role will be critical in using that data to better understand player behavior so we can continually improve our games and increase player happiness. In this role, the Big Data Integration Engineer would support and perform activities involved in the design and development of ETL/ELT/data integration processes and programs, which include data analysis, source-to-target data mapping, job scheduling, and the development and testing of PL/SQL packages.


Sensors

#artificialintelligence

Sensors provide valuable data about physical magnitudes and environmental phenomena. However, the translation of these data into concrete actions requires processing the inputs that may come from one or many types of sensors, including sensor networks. Such processing can benefit from Artificial Intelligence (AI), and the use of machine learning, neural networks (including deep architectures), and information fusion methods have been common in this field. Currently, these concepts can be applied in different IoT architectures, where there are sensor and actuator nodes that communicate and create the networks. These types of networks tend to be autonomous networks that adapt to several conditions, creating smart IoT networks.
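
As a rough illustration of what an information fusion step over several sensors can look like, the sketch below combines noisy readings of the same quantity by inverse-variance weighting. This technique is chosen here only as a simple, well-known example; the abstract does not name a specific method, and the function name `fuse_inverse_variance` and the sample readings are hypothetical.

```python
def fuse_inverse_variance(readings):
    """Fuse independent readings of one quantity.

    readings: list of (value, variance) pairs, one per sensor (assumed format).
    Returns (fused_value, fused_variance).
    """
    if not readings:
        raise ValueError("at least one reading is required")
    # Less noisy sensors (smaller variance) get proportionally larger weights.
    weights = [1.0 / variance for _, variance in readings]
    total_weight = sum(weights)
    fused_value = sum(w * value for w, (value, _) in zip(weights, readings)) / total_weight
    # The fused estimate is never noisier than the best single sensor.
    fused_variance = 1.0 / total_weight
    return fused_value, fused_variance


# Hypothetical example: two temperature sensors, the second one four times noisier.
value, variance = fuse_inverse_variance([(10.0, 1.0), (20.0, 4.0)])
print(value, variance)
```

The weighting assumes the sensor errors are independent and that each variance is known in advance; correlated or drifting sensors would need a different treatment.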


Things I Have Learned About Data Science - KDnuggets

#artificialintelligence

If you think your data is clean, perhaps you have not looked into it yet; if you think your data is messy, it's even messier. Nobody cares how you did it; just do it correctly. People do not care how much you know until they know how much you care (about them and their business). In 2-3 years, nobody will talk about Big Data anymore. It always pays off to be damn good at numbers, Excel, and PowerPoint (and yes, presentation skills); Tableau is a big plus. Downloading some code and data and running them does not make you a data scientist. The same is true for doing data science courses. Participating in Kaggle competitions does not make you a data scientist, although it can help you learn from others. Winning Kaggle competitions does not necessarily make you a good data scientist. ETL is always needed - be good at it and learn a good tool for it (Talend is a good one). Also, learn scripting languages for ETL. Deep learning is cool, but it's still cool if you don't use it when you don't need it, and in 99% of cases you don't need it. Algorithms are commodities, your data is not. Ideas are commodities, execution is not. Deep learning expertise will soon become a commodity; problem-solving skills won't.


Human Action Recognition Using Deep Multilevel Multimodal (M2) Fusion of Depth and Inertial Sensors

arXiv.org Machine Learning

Multimodal fusion frameworks for Human Action Recognition (HAR) using depth and inertial sensor data have been proposed over the years. In most of the existing works, fusion is performed at a single level (feature level or decision level), missing the opportunity to fuse rich mid-level features necessary for better classification. To address this shortcoming, in this paper, we propose three novel deep multilevel multimodal fusion frameworks to capitalize on different fusion strategies at various stages and to leverage the superiority of multilevel fusion. At input, we transform the depth data into depth images called sequential front view images (SFIs) and inertial sensor data into signal images. Each input modality, depth and inertial, is further made multimodal by taking convolution with the Prewitt filter. Creating "modality within modality" enables further complementary and discriminative feature extraction through Convolutional Neural Networks (CNNs). CNNs are trained on input images of each modality to learn low-level, high-level and complex features. Learned features are extracted and fused at different stages of the proposed frameworks to combine discriminative and complementary information. These highly informative features are served as input to a multi-class Support Vector Machine (SVM). We evaluate the proposed frameworks on three publicly available multimodal HAR datasets, namely, UTD Multimodal Human Action Dataset (MHAD), Berkeley MHAD, and UTD-MHAD Kinect V2. Experimental results show the supremacy of the proposed fusion frameworks over existing methods.
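
The "modality within modality" step can be sketched as follows: an input image is convolved with Prewitt kernels to produce an edge-emphasized companion image. This is a minimal NumPy sketch, not the paper's implementation; the abstract does not say which kernel orientation is used or how the responses are combined, so taking the gradient magnitude of both orientations is an assumption, and the helper names and toy image are hypothetical.

```python
import numpy as np

# Standard 3x3 Prewitt kernels (horizontal and vertical intensity change).
PREWITT_X = np.array([[-1.0, 0.0, 1.0],
                      [-1.0, 0.0, 1.0],
                      [-1.0, 0.0, 1.0]])
PREWITT_Y = PREWITT_X.T


def convolve2d_valid(image, kernel):
    """Plain 2-D convolution without padding ('valid' borders)."""
    kh, kw = kernel.shape
    flipped = kernel[::-1, ::-1]  # true convolution flips the kernel
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * flipped)
    return out


def prewitt_magnitude(image):
    """Assumed combination: gradient magnitude of both Prewitt responses."""
    gx = convolve2d_valid(image, PREWITT_X)
    gy = convolve2d_valid(image, PREWITT_Y)
    return np.hypot(gx, gy)


# Toy stand-in for a depth or signal image: dark left half, bright right half.
toy = np.zeros((5, 6))
toy[:, 3:] = 1.0
edges = prewitt_magnitude(toy)
print(edges)
```

In a pipeline like the one described, the original image and its filtered counterpart would each feed a CNN; the filtered image responds only where intensity changes, which is what makes it complementary to the raw input.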


Stone Soup: Cooking Up Custom Solutions with SQL Server Machine Learning

#artificialintelligence

This article describes the machine learning services provided in SQL Server 2017, which support in-database use of the Python and R languages. The integration of SQL Server with open source languages popular for machine learning makes it easier to use the appropriate tool (SQL, Python, or R) for data exploration and modeling. R and Python scripts can also be used in T-SQL scripts or Integration Services packages, expanding the capabilities of ETL and database scripting. What has this to do with stone soup, you ask? It's a metaphor, of course, but one that captures the essence of why SQL Server works so well with Python and R. To illustrate the point, I'll provide a simple walkthrough of data exploration and modeling combining SQL and Python, using a food and nutrition analysis dataset from the US Department of Agriculture. You might have heard that data science is more of a craft than a science. Many ingredients have to come together efficiently, to process intake data and generate models and predictions that can be consumed by business users and end customers. However, what works well at the level of "craftsmanship" often has to change at commercial scale. Much like the home cook who has ventured out of the kitchen into a restaurant or food factory, big changes are required in the roles, ingredients, and processes. Moreover, cooking can no longer be a "one-man show;" you need the help of professionals with different specializations and their own tools to create a successful product or make the process more efficient. These specialists include data scientists, data developers and taxonomists, SQL developers, DBAs, application developers, and the domain specialists or end users who consume the results. Any kitchen would soon be chaos if the tools used by each professional were incompatible with each other, or if processes had to be duplicated and slightly changed at each step. What restaurant would survive if carrots chopped up at one station were unusable at the next?


How can AI Automate End-to-End Data Science?

arXiv.org Artificial Intelligence

Data science is labor-intensive and human experts are scarce but heavily involved in every aspect of it. This makes data science time-consuming and restricted to experts, with the resulting quality heavily dependent on their experience and skills. To make data science more accessible and scalable, we need its democratization. Automated Data Science (AutoDS) is aimed towards that goal and is emerging as an important research and business topic. We introduce and define the AutoDS challenge, followed by a proposal of a general AutoDS framework that covers existing approaches but also provides guidance for the development of new methods. We categorize and review the existing literature from multiple aspects of the problem setup and employed techniques. Then we provide several views on how AI could succeed in automating end-to-end AutoDS. We hope this survey can serve as an insightful guideline for the AutoDS field and provide inspiration for future research.


Machine Learning in population health: Creating conditions that ensure good health.

#artificialintelligence

Machine Learning (ML) in healthcare has an affinity for patient-centred care and individual-level predictions. Individual health and population health are not divergent, but they are not the same either and may require different approaches. ML in public health applications receives far less attention. The skills available to public health organizations to transition towards integrated data analytics are limited. Hence the latest advances in ML and artificial intelligence (AI) have made very little impact on public health analytics and decision making.