 Information Fusion


Understand Workflow Management with Kubeflow

#artificialintelligence

Kubeflow is an open-source platform that makes it easy to deploy and manage machine learning (ML) workflows on Kubernetes, a popular open-source system for automating the deployment, scaling, and management of containerized applications. Kubeflow helps you run machine learning tasks by making it easy to set up and manage a cluster of computers that work together on the task. It acts like a "traffic cop" for your workloads, ensuring that the steps of each task run in the right order and that all the computers work together correctly. This way, you can focus on the task at hand, such as making predictions or finding patterns in your data, and let Kubeflow handle the underlying infrastructure. Imagine you have a big toy box with many different toys inside. Kubeflow is like the toy box organizer.
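The "right order" idea above can be illustrated without any cluster at all. The following is a minimal sketch in plain Python, not Kubeflow's API: it assumes a hypothetical five-step pipeline whose step names and dependencies are made up for illustration, and uses the standard-library `graphlib` module (Python 3.9+) to derive an execution order that respects the dependencies.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline (names are illustrative, not from Kubeflow):
# each step maps to the set of steps that must finish before it starts.
steps = {
    "load_data": set(),
    "preprocess": {"load_data"},
    "train": {"preprocess"},
    "evaluate": {"train"},
    "deploy": {"evaluate"},
}


def run_order(dependencies):
    """Return one valid execution order for the given step dependencies."""
    return list(TopologicalSorter(dependencies).static_order())


order = run_order(steps)
print(order)
```

A workflow orchestrator does considerably more than this (scheduling containers, retrying failures, passing artifacts between steps), but dependency ordering of this kind is the core of the "traffic cop" role.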


KG-Hub -- Building and Exchanging Biological Knowledge Graphs

arXiv.org Artificial Intelligence

Knowledge graphs (KGs) are a powerful approach for integrating heterogeneous data and making inferences in biology and many other domains, but a coherent solution for constructing, exchanging, and facilitating the downstream use of knowledge graphs is lacking. Here we present KG-Hub, a platform that enables standardized construction, exchange, and reuse of knowledge graphs. Features include a simple, modular extract-transform-load (ETL) pattern for producing graphs compliant with Biolink Model (a high-level data model for standardizing biological data), easy integration of any OBO (Open Biological and Biomedical Ontologies) ontology, cached downloads of upstream data sources, versioned and automatically updated builds with stable URLs, web-browsable storage of KG artifacts on cloud infrastructure, and easy reuse of transformed subgraphs across projects. Current KG-Hub projects span use cases including COVID-19 research, drug repurposing, microbial-environmental interactions, and rare disease research. KG-Hub is equipped with tooling to easily analyze and manipulate knowledge graphs. KG-Hub is also tightly integrated with graph machine learning (ML) tools which allow automated graph machine learning, including node embeddings and training of models for link prediction and node classification.
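The extract-transform-load pattern described above might look roughly like the following sketch. It is not KG-Hub's code: the input table, the `EX:` identifiers, and the function names are hypothetical, and the node and edge column names (`id`, `category`, `name`, `subject`, `predicate`, `object`) are assumed here to follow a KGX-style TSV layout. The Biolink terms used (`biolink:Gene`, `biolink:Disease`, `biolink:related_to`) are chosen only for illustration.

```python
import csv
import io

# Hypothetical upstream source: a small CSV of gene-disease associations.
RAW = (
    "gene_id,gene_name,disease_id,disease_name\n"
    "EX:g1,geneA,EX:d1,diseaseX\n"
    "EX:g2,geneB,EX:d1,diseaseX\n"
)


def extract(text):
    """Extract: parse the raw CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: map rows to deduplicated nodes and to edges."""
    nodes = {}
    edges = []
    for row in rows:
        nodes[row["gene_id"]] = {
            "id": row["gene_id"],
            "category": "biolink:Gene",
            "name": row["gene_name"],
        }
        nodes[row["disease_id"]] = {
            "id": row["disease_id"],
            "category": "biolink:Disease",
            "name": row["disease_name"],
        }
        edges.append({
            "subject": row["gene_id"],
            "predicate": "biolink:related_to",
            "object": row["disease_id"],
        })
    return list(nodes.values()), edges


def load(records, fieldnames):
    """Load: serialize records as TSV text (a real build would write files)."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames,
                            delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


nodes, edges = transform(extract(RAW))
nodes_tsv = load(nodes, ["id", "category", "name"])
edges_tsv = load(edges, ["subject", "predicate", "object"])
print(nodes_tsv)
print(edges_tsv)
```

Keeping the three stages as separate functions is what makes the pattern modular: a new upstream source only needs its own `extract` and `transform`, while the output format stays shared across projects.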


Probabilistic Neural Data Fusion for Learning from an Arbitrary Number of Multi-fidelity Data Sets

arXiv.org Artificial Intelligence

In many applications in engineering and the sciences, analysts have simultaneous access to multiple data sources. In such cases, the overall cost of acquiring information can be reduced via data fusion or multi-fidelity (MF) modeling where one leverages inexpensive low-fidelity (LF) sources to reduce the reliance on expensive high-fidelity (HF) data. In this paper, we employ neural networks (NNs) for data fusion in scenarios where data is very scarce and obtained from an arbitrary number of sources with varying levels of fidelity and cost. We introduce a unique NN architecture that converts MF modeling into a nonlinear manifold learning problem. Our NN architecture inversely learns non-trivial (e.g., non-additive and non-hierarchical) biases of the LF sources in an interpretable and visualizable manifold where each data source is encoded via a low-dimensional distribution. This probabilistic manifold quantifies model form uncertainties such that LF sources with small bias are encoded close to the HF source. Additionally, we endow the output of our NN with a parametric distribution not only to quantify aleatoric uncertainties, but also to reformulate the network's loss function based on strictly proper scoring rules which improve robustness and accuracy on unseen HF data. Through a set of analytic and engineering examples, we demonstrate that our approach provides high predictive power while quantifying various sources of uncertainty.
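The abstract does not say which scoring rule or output distribution the authors use. As one possible illustration, the sketch below assumes a Gaussian output and scores it with the negative log-likelihood (the logarithmic score, which is a strictly proper scoring rule). The function name and the numbers are hypothetical.

```python
import math


def gaussian_nll(y, mean, std):
    """Negative log-likelihood of observation y under N(mean, std**2).

    Lower is better. The score depends on both the predicted mean and the
    predicted spread, so it rewards honest uncertainty estimates.
    """
    var = std ** 2
    return 0.5 * math.log(2.0 * math.pi * var) + (y - mean) ** 2 / (2.0 * var)


# A prediction that misses the target (y = 3.0, mean = 0.0) is penalized far
# more when it is overconfident (small std) than when it admits uncertainty.
overconfident = gaussian_nll(3.0, 0.0, 0.1)
cautious = gaussian_nll(3.0, 0.0, 1.0)
print(overconfident > cautious)
```

Unlike a plain squared-error loss, a score of this kind trains the predicted spread together with the predicted mean, which is how a parametric output can capture aleatoric uncertainty.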


Deep Multi-modal Fusion of Image and Non-image Data in Disease Diagnosis and Prognosis: A Review

arXiv.org Artificial Intelligence

The rapid development of diagnostic technologies in healthcare is leading to higher requirements for physicians to handle and integrate the heterogeneous, yet complementary data that are produced during routine practice. For instance, the personalized diagnosis and treatment planning for a single cancer patient relies on various images (e.g., radiological, pathological, and camera images) and non-image data (e.g., clinical data and genomic data). However, such decision-making procedures can be subjective and qualitative, with large inter-subject variability. With the recent advances in multi-modal deep learning technologies, an increasingly large number of efforts have been devoted to a key question: how do we extract and aggregate multi-modal information to ultimately provide more objective, quantitative computer-aided clinical decision making? This paper reviews recent studies addressing this question. Briefly, this review covers (1) an overview of current multi-modal learning workflows, (2) a summary of multi-modal fusion methods, (3) a discussion of performance, (4) applications in disease diagnosis and prognosis, and (5) challenges and future directions.


ETL Data Analyst at Verisk - Jersey City, NJ, United States

#artificialintelligence

We help the world see new possibilities and inspire change for better tomorrows. Our analytic solutions bridge content, data, and analytics to help business, people, and society become stronger, more resilient, and sustainable. At the heart of what we do is help clients manage risk. Verisk (Nasdaq: VRSK) provides data and insights to our customers in insurance, energy and the financial services markets so they can make faster and more informed decisions. Our global team uses AI, machine learning, automation, and other emerging technologies to collect and analyze billions of records.


Towards Long-term Autonomy: A Perspective from Robot Learning

arXiv.org Artificial Intelligence

In the future, service robots are expected to be able to operate autonomously for long periods of time without human intervention. Much work striving toward this goal has emerged alongside the development of robotics, in both hardware and software. Today we believe that an important underpinning of long-term robot autonomy is the ability of robots to learn on site and on-the-fly, especially when they are deployed in changing environments or need to traverse different environments. In this paper, we examine the problem of long-term autonomy from the perspective of robot learning, especially in an online way, and discuss in tandem its premise "data" and the subsequent "deployment".


Thales and NukkAI Partner to Develop Solution for Military Applications - Defense Advancement

#artificialintelligence

French artificial intelligence start-up NukkAI has signed a contract with Thales to develop an AI-based data fusion solution for military applications. Thales believes military analysts in operations centers face significant challenges in extracting relevant information from the huge volumes of data generated by multiple sources such as video and audio streams, websites, Twitter feeds, satellite imagery, social media, and telephone conversations. According to Thales, real-time data analytics will enable them to develop advanced military strategies with greater efficiency. As a result, Thales is planning to implement NukkAI's solution in a number of its military data processing programs. When operators are swamped by information, the solution will use real-time data exploitation and fusion methods to automatically review the knowledge available so that analysts can focus on elements of interest.


Project Manager – Data Integration at Publicis Groupe - Troy, MI, United States

#artificialintelligence

Epsilon is looking for a Project Manager to manage/execute General Motors B2C marketing projects per the direction of the Client and the Internal Sales Team. The ideal candidate will need to operate in a fast-paced, data-driven environment and be responsible for managing all data integration points between Epsilon and GM. They must be able to understand and articulate business requirements as conveyed by internal or external Clients and have the ability to turn written and verbal instructions into comprehensive Business Requirements. They must be able to work on multiple large programs simultaneously, manage change, and think critically to complete projects. When you're one of us, you get to run with the best.


Epic Clarity ETL Administrator at Prominence Advisors - Madison, Wisconsin, United States - Remote

#artificialintelligence

Prominence is looking for a senior Epic Clarity ETL Administrator experienced with SQL programming. Prominence is a healthcare technology strategy and implementation firm, focused on helping the nation's leading healthcare organizations to do more with their data. Founded by former Epic managers, we understand the technology landscape in healthcare and provide IT staffing, advisory services, and analytics solutions to create robust data ecosystems that support clinical workflows, automate operational processes, and expedite research. Whether it's guiding a technology implementation, establishing governance principles, or developing leading edge analytics, we help our customers make sense out of the mountain of data at their fingertips in order to deliver higher quality care at a lower cost. Ranked as a best place to work over 27 times (and counting!),


Multimodal Inverse Cloze Task for Knowledge-based Visual Question Answering

arXiv.org Artificial Intelligence

We present a new pre-training method, Multimodal Inverse Cloze Task, for Knowledge-based Visual Question Answering about named Entities (KVQAE). KVQAE is a recently introduced task that consists of answering questions about named entities grounded in a visual context using a Knowledge Base. Therefore, the interaction between the modalities is paramount to retrieve information and must be captured with complex fusion models. As these models require a lot of training data, we design this pre-training task from existing work in textual Question Answering. It consists of treating a sentence as a pseudo-question and its context as a pseudo-relevant passage, and it is extended by considering images near texts in multimodal documents. Our method is applicable to different neural network architectures and leads to a 9% relative-MRR and 15% relative-F1 gain for retrieval and reading comprehension, respectively, over a no-pre-training baseline.
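To make the reported retrieval metric concrete, here is a minimal sketch of mean reciprocal rank (MRR) and of a relative gain over a baseline. The function names and the toy ranks are hypothetical and are not taken from the paper; only the 9% figure comes from the abstract, and the baseline value of 0.5 is assumed purely for the arithmetic.

```python
def mean_reciprocal_rank(ranks):
    """Mean reciprocal rank over queries.

    ranks holds, per query, the 1-based rank of the first relevant passage,
    or None when no relevant passage was retrieved (contributing 0).
    """
    return sum(1.0 / r if r else 0.0 for r in ranks) / len(ranks)


def relative_gain(new, baseline):
    """Gain of `new` over `baseline`, as a fraction of the baseline."""
    return (new - baseline) / baseline


# Toy example: four queries; one has no relevant passage retrieved.
toy_mrr = mean_reciprocal_rank([1, 2, 4, None])

# An assumed baseline MRR of 0.5 rising to 0.545 is a 9% relative gain.
toy_gain = relative_gain(0.545, 0.5)
print(toy_mrr, toy_gain)
```

A relative gain is measured against the baseline score rather than in absolute points, so the same 9% corresponds to a smaller absolute improvement when the baseline is low.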