 Information Fusion


Artificial Intelligence (AI) and Machine Learning Oracle

#artificialintelligence

The platform consists of tools for every step in the modern machine learning lifecycle, from developing machine-learning models to building intelligent applications to integrating machine learning outputs into business intelligence and visualization tools, so it's accessible to business teams. The platform features a portfolio of data management solutions that offer unmatched ability to store and process data at any scale, along with data integration tools to ensure data in any format can be accessed for machine learning model building. The platform runs on top of Oracle Cloud Infrastructure, which is optimized for running AI workloads, offering high-speed network fabric and a wide range of GPU and CPU compute options for small- to large-scale model building, training, and production deployments.


MEx: Multi-modal Exercises Dataset for Human Activity Recognition

arXiv.org Artificial Intelligence

MEx: Multi-modal Exercises Dataset is a multi-sensor, multi-modal dataset, implemented to benchmark Human Activity Recognition (HAR) and Multi-modal Fusion algorithms. Collection of this dataset was inspired by the need for recognising and evaluating the quality of exercise performance to support patients with Musculoskeletal Disorders (MSD). We selected 7 exercises regularly recommended for MSD patients by physiotherapists and collected data with four sensors: a pressure mat, a depth camera and two accelerometers. The dataset contains three data modalities: numerical time-series data, video data and pressure sensor data, posing interesting research challenges when reasoning for HAR and Exercise Quality Assessment. This paper presents our evaluation of the dataset on a number of standard classification algorithms for the HAR task by comparing different feature representation algorithms for each sensor. These results set a reference performance for each individual sensor that exposes its strengths and weaknesses for future tasks. In addition, we visualise pressure mat data to explore the potential of the sensor to capture exercise performance quality. With the recent advancement in multi-modal fusion, we also believe MEx is a suitable dataset to benchmark not only HAR algorithms, but also fusion algorithms of heterogeneous data types in multiple application domains.


AWS Lake Formation Automates Data Lake Management - SDxCentral

#artificialintelligence

Amazon Web Services (AWS) announced the general availability of its fully managed Lake Formation platform, designed to help organizations better manage their data lakes. The service helps with the building, securing, and managing of those data repositories. Lake Formation, which was initially announced at the AWS re:Invent show late last year, is built on AWS' Glue extract, transform, and load (ETL) service. It automates the provisioning and configuring of storage; crawls the data to extract schema and metadata tags; automatically optimizes the partitioning of the data; and transforms the data into formats like Apache Parquet and ORC for easier analytics. Data can be ingested from different sources using pre-defined templates.


Content Intelligence Technology Can Give Resellers a Competitive Edge

#artificialintelligence

The trend toward digital is the main reason more than half of the companies on the Fortune 500 have disappeared since the year 2000, according to former Accenture CEO Pierre Nanterme. As more companies that still depend on legacy systems become ready to embrace new digital technologies, value-added resellers (VARs) can tap into this market and sharpen their competitive edge by adopting innovative and disruptive technologies that significantly improve the way businesses operate. Robotic process automation (RPA) is one key technology fueling digital transformation across enterprises of all sizes and industries. RPA is a powerful force in transforming business processes, maximizing internal resources and enhancing customer experiences. Interest in RPA has increased rapidly, growing more than 10 times in popularity in less than two years.


A 20-Year Community Roadmap for Artificial Intelligence Research in the US

arXiv.org Artificial Intelligence

Decades of research in artificial intelligence (AI) have produced formidable technologies that are providing immense benefit to industry, government, and society. AI systems can now translate across multiple languages, identify objects in images and video, streamline manufacturing processes, and control cars. The deployment of AI systems has not only created a trillion-dollar industry that is projected to quadruple in three years, but has also exposed the need to make AI systems fair, explainable, trustworthy, and secure. Future AI systems will rightfully be expected to reason effectively about the world in which they (and people) operate, handling complex tasks and responsibilities effectively and ethically, engaging in meaningful communication, and improving their awareness through experience. Achieving the full potential of AI technologies poses research challenges that require a radical transformation of the AI research enterprise, facilitated by significant and sustained investment. These are the major recommendations of a recent community effort coordinated by the Computing Community Consortium and the Association for the Advancement of Artificial Intelligence to formulate a Roadmap for AI research and development over the next two decades.


Data Integration and Machine Learning: 3 Real-World Use Cases

#artificialintelligence

Learn how applying the concept of machine learning to capacity management can make the process more effective and efficient. Machine learning involves computers assimilating information and then drawing conclusions from that data without being explicitly programmed to do so. This technology has significant positive implications for businesses. Yet, machine learning can be improved even further. The answer lies in data integration.


How dataops improves data, analytics, and machine learning

#artificialintelligence

Have you noticed that most organizations are trying to do a lot more with their data? Businesses are investing heavily in data science programs, self-service business intelligence tools, artificial intelligence programs, and organizational efforts to promote data-driven decision making. Some are developing customer-facing applications by embedding data visualizations into web and mobile products or collecting new forms of data from sensors (Internet of Things), wearables, and third-party APIs. Still others are harnessing intelligence from unstructured data sources such as documents, images, videos, and spoken language. Much of the work around data and analytics focuses on delivering value from it.


Improving Outbreak Detection with Stacking of Statistical Surveillance Methods

arXiv.org Machine Learning

Epidemiologists use a variety of statistical algorithms for the early detection of outbreaks. The practical usefulness of such methods depends heavily on the trade-off between the detection rate of outbreaks and the chances of raising a false alarm. Recent research has shown that the use of machine learning for the fusion of multiple statistical algorithms improves outbreak detection. Instead of relying only on the binary output (alarm or no alarm) of the statistical algorithms, we propose to make use of their p-values for training a fusion classifier. In addition, we show that adding further features and adapting the labeling of an epidemic period may further improve performance. For comparison and evaluation, a new measure is introduced which captures the performance of an outbreak detection method with respect to a low rate of false alarms more precisely than previous works. Our results on synthetic data show that it is challenging to improve the performance with a trainable fusion method based on machine learning. In particular, the use of a fusion classifier that is only based on binary outputs of the statistical surveillance methods can make the overall performance worse than directly using the underlying algorithms. However, the use of p-values and additional information for the learning is promising, making it possible to identify more valuable patterns for detecting outbreaks.
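
The idea of training a fusion classifier on p-values rather than on binary alarms can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy p-values and labels are invented, and plain logistic regression fitted by gradient descent stands in for whatever fusion classifier the paper actually uses.

```python
import math

# Hypothetical toy data (not from the paper): each row holds the p-values
# reported by two statistical surveillance algorithms for one week.
# Label 1 marks an outbreak week, label 0 a normal week.
P_VALUES = [
    (0.01, 0.04), (0.03, 0.20), (0.02, 0.01), (0.07, 0.40),
    (0.60, 0.45), (0.80, 0.30), (0.50, 0.90), (0.40, 0.70),
]
LABELS = [1, 1, 1, 1, 0, 0, 0, 0]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_fusion(rows, labels, lr=1.0, epochs=2000):
    """Fit a logistic-regression fusion classifier by batch gradient descent."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            # Prediction error for this week under the current model.
            err = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - y
            grad_b += err
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
        # Step along the mean gradient of the log-loss.
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias


def outbreak_probability(model, p_values):
    """Fused outbreak probability for one week's vector of p-values."""
    weights, bias = model
    return sigmoid(bias + sum(w * p for w, p in zip(weights, p_values)))


model = train_fusion(P_VALUES, LABELS)
```

Calling `outbreak_probability(model, (0.02, 0.10))` returns the fused score for a new week. Thresholding each p-value into an alarm flag before training would discard how strongly each base algorithm signals, which is the information the p-value features are meant to keep.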


Introducing Dagster - Nick Schrock - Medium

#artificialintelligence

Today the team at Elementl is proud to announce an early release of Dagster, an open-source library for building systems like ETL processes and ML pipelines. We believe they are, in reality, a single class of software system. We call them data applications. Dagster is a library for building these data applications. We define a data application as a graph of functional computations that produce and consume data assets.
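
To make the definition of a data application concrete, here is a minimal sketch of a graph of functional computations in plain Python. It deliberately does not use Dagster's API; the node names, the `GRAPH` layout, and the `run_graph` helper are hypothetical and exist only to illustrate the idea of functions that produce and consume data.

```python
from graphlib import TopologicalSorter


# Hypothetical computations: each one is a plain function whose inputs
# are the outputs of its upstream nodes.
def load_rows():
    return [
        {"user": "a", "amount": 3},
        {"user": "b", "amount": 5},
        {"user": "a", "amount": 4},
    ]


def total_by_user(rows):
    totals = {}
    for row in rows:
        totals[row["user"]] = totals.get(row["user"], 0) + row["amount"]
    return totals


def top_user(totals):
    return max(totals, key=totals.get)


# Node name -> (function, names of upstream nodes it consumes).
GRAPH = {
    "load_rows": (load_rows, []),
    "total_by_user": (total_by_user, ["load_rows"]),
    "top_user": (top_user, ["total_by_user"]),
}


def run_graph(graph):
    """Execute every node after its dependencies and collect the outputs."""
    dependencies = {name: deps for name, (_, deps) in graph.items()}
    results = {}
    for name in TopologicalSorter(dependencies).static_order():
        func, deps = graph[name]
        results[name] = func(*(results[dep] for dep in deps))
    return results


results = run_graph(GRAPH)
```

Because the dependencies are declared as data rather than buried in call sites, the same graph description could in principle drive scheduling, testing of single nodes, or visualization.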


All Sparse PCA Models Are Wrong, But Some Are Useful. Part I: Computation of Scores, Residuals and Explained Variance

arXiv.org Machine Learning

Sparse Principal Component Analysis (sPCA) is a popular matrix factorization approach based on Principal Component Analysis (PCA) that combines variance maximization and sparsity with the ultimate goal of improving data interpretation. When moving from PCA to sPCA, there are a number of implications that the practitioner needs to be aware of. A relevant one is that scores and loadings in sPCA may not be orthogonal. For this reason, the traditional way of computing scores, residuals and explained variance used in classical PCA cannot be applied directly to sPCA models. This also affects how sPCA components should be visualized. In this paper we illustrate this problem both theoretically and numerically using simulations for several state-of-the-art sPCA algorithms, and provide proper computation of the different elements mentioned. We show that sPCA approaches present disparate and limited performance when modeling noise-free, sparse data. In a follow-up paper, we discuss the theoretical properties that lead to this problem.
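
The abstract does not give the formulas, so the following is only a sketch of one standard least-squares formulation, assumed here for illustration. The symbols are assumptions too: X is the centered data matrix, P the loading matrix (taken to have full column rank), T the scores and E the residuals.

```latex
% Classical PCA: loadings are orthonormal, P^\top P = I
T = X P, \qquad E = X - T P^\top

% sPCA: in general P^\top P \neq I, so the scores are obtained by
% least-squares projection of X onto the column space of P
T = X P \,(P^\top P)^{-1}, \qquad E = X - T P^\top

% Explained variance computed from the residuals in either case
\mathrm{EV} = 1 - \frac{\lVert E \rVert_F^2}{\lVert X \rVert_F^2}
```

When P is orthonormal the second pair of expressions reduces to the first, so the classical computation appears as a special case.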