Information Fusion
Structure fusion based on graph convolutional networks for semi-supervised classification
Lin, Guangfeng, Wang, Jing, Liao, Kaiyang, Zhao, Fan, Chen, Wanjun
Faced with the diversity and complexity of multi-view data in semi-supervised classification, most existing graph convolutional networks focus on constructing the network architecture or preserving the salient graph structure, and ignore the contribution of the complete graph structure to semi-supervised classification. To mine a more complete distribution structure from multi-view data while accounting for both specificity and commonality, we propose structure fusion based on graph convolutional networks (SF-GCN) to improve the performance of semi-supervised classification. SF-GCN not only retains the special characteristics of each view by spectral embedding, but also captures the common style of multi-view data through a distance metric between multi-graph structures. Assuming a linear relationship between the multi-graph structures, we construct the optimization function of the structure fusion model by balancing the specificity loss and the commonality loss. By solving this function, we simultaneously obtain the fusion spectral embedding of the multi-view data and the fusion structure, which serves as the adjacency matrix input to graph convolutional networks for semi-supervised classification. Experiments demonstrate that SF-GCN outperforms the state of the art on three challenging citation network datasets: Cora, Citeseer, and Pubmed.
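The idea of feeding a fused structure to a graph convolutional network as its adjacency matrix can be sketched as follows. This is only an illustration under stated assumptions: the fusion weights are fixed by hand here, whereas the abstract says they come from solving an optimization that balances specificity and commonality losses, and the normalization shown is the standard GCN renormalization D^{-1/2}(A + I)D^{-1/2}, not necessarily the exact preprocessing used by SF-GCN. The function names and the two toy views are hypothetical.

```python
import math

def fuse_adjacency(graphs, weights):
    """Linear combination of same-sized adjacency matrices (lists of lists).

    The weights are hand-picked for illustration, not learned.
    """
    n = len(graphs[0])
    fused = [[0.0] * n for _ in range(n)]
    for adj, w in zip(graphs, weights):
        for i in range(n):
            for j in range(n):
                fused[i][j] += w * adj[i][j]
    return fused

def normalize_for_gcn(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a non-negative adjacency matrix."""
    n = len(adj)
    # Add self-loops so every node keeps its own features and has degree > 0.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

# Two hypothetical views of the same 3-node graph, each seeing one edge.
view_a = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
view_b = [[0, 0, 0], [0, 0, 1], [0, 1, 0]]

fused = fuse_adjacency([view_a, view_b], weights=[0.5, 0.5])
normalized = normalize_for_gcn(fused)
```

The fused matrix contains both edges, each at half strength, so a GCN layer that multiplies node features by `normalized` propagates information along connections that neither single view holds in full.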
Multi-Label Product Categorization Using Multi-Modal Fusion Models
Wirojwatanakul, Pasawee, Wangperawong, Artit
In this study, we investigated multi-modal approaches using images, descriptions, and titles to categorize e-commerce products on Amazon.com. Specifically, we examined late fusion models, where the modalities are fused at the decision level. Products were each assigned multiple labels, and the hierarchy in the labels was flattened and filtered. For our individual baseline models, we modified a CNN architecture to classify the description and title, and then modified Keras' ResNet-50 to classify the images, achieving F1 scores of 77.0%, 82.7%, and 61.0%, respectively. In comparison, our tri-modal late fusion model classifies products more accurately than any single-modal model, improving the F1 score to 88.2%. Each modality compensated for the shortcomings of the others, demonstrating that increasing the number of modalities can be an effective way to improve accuracy in multi-label classification problems.
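Decision-level (late) fusion for a multi-label problem can be sketched in a few lines. The abstract does not say how the three modalities' decisions are combined, so the unweighted averaging and the 0.5 threshold below are assumptions, and the label names and probabilities are invented for illustration.

```python
def late_fusion(per_modality_probs, threshold=0.5):
    """Average per-label probabilities across modalities, then threshold.

    per_modality_probs: list of dicts mapping label -> probability, one dict
    per modality, all sharing the same label set.
    Returns (fused_scores, predicted_labels).
    """
    labels = per_modality_probs[0].keys()
    n = len(per_modality_probs)
    fused = {label: sum(p[label] for p in per_modality_probs) / n
             for label in labels}
    # Multi-label: every label whose fused score clears the threshold is kept.
    predicted = sorted(label for label, score in fused.items()
                       if score >= threshold)
    return fused, predicted

# Hypothetical outputs of three independently trained classifiers.
title_probs = {"books": 0.9, "toys": 0.2, "kitchen": 0.6}
description_probs = {"books": 0.7, "toys": 0.1, "kitchen": 0.3}
image_probs = {"books": 0.8, "toys": 0.3, "kitchen": 0.75}

fused, predicted = late_fusion([title_probs, description_probs, image_probs])
```

In this toy case the description model alone would reject "kitchen", but the title and image models outvote it, which is the kind of complementarity the abstract describes.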
Extract, Shoehorn, and Load
A lot of data is moved from system to system in an important and increasing part of the computing landscape. This is traditionally known as ETL (extract, transform, and load). While many systems are extremely good at this process, the source for the extraction and the destination for the load frequently have different representations for their data. It is common for this transformation to squeeze, truncate, or pad the data to make it fit into the target. This is really like using a shoehorn to fit into a shoe that is too small. Sometimes it's a needed step.
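A minimal sketch of the squeeze, truncate, or pad step, assuming a fixed-width text column in the target system. The helper name `shoehorn` and the choice to report whether data was lost are illustrative, not taken from the article.

```python
def shoehorn(value, width, pad=" "):
    """Fit a string into a fixed-width target column.

    Returns (fitted_value, lossy): lossy is True when characters were cut off,
    so the caller can log or reject rows instead of losing data silently.
    """
    if len(value) > width:
        return value[:width], True          # truncate: information is lost
    return value.ljust(width, pad), False   # pad: reversible if pad is known

long_fit = shoehorn("Massachusetts", 8)
short_fit = shoehorn("Ohio", 8)
```

Returning the `lossy` flag alongside the value is one way to keep the transformation honest: padding can be undone at the destination, truncation cannot.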
Integrating Knowledge and Reasoning in Image Understanding
Aditya, Somak, Yang, Yezhou, Baral, Chitta
Deep learning based data-driven approaches have been successfully applied in various image understanding applications ranging from object recognition and semantic segmentation to visual question answering. However, the lack of knowledge integration as well as higher-level reasoning capabilities in these methods still poses a hindrance. In this work, we present a brief survey of a few representative reasoning mechanisms, knowledge integration methods, and their corresponding image understanding applications developed by various groups of researchers, approaching the problem from a variety of angles. Furthermore, we discuss key efforts on integrating external knowledge with neural networks. Taking cues from these efforts, we conclude by discussing potential pathways to improve reasoning capabilities.
Figure 1: The diagram shows the information hierarchy for images and the knowledge associated with each level of information.
... paper is to present a survey of recent works (including a few of our works) in image understanding where knowledge and reasoning play an important role.
Talend and Qubole Serverless Platform for Machine Learning: Choosing Between a Cab vs Your Own Car
Before going into the world of integration, machine learning, etc., I would like to discuss a scenario many of you might have experienced when living in a megacity. I lived in the London suburbs for almost 2 years (and it's a city quite close to my heart too), so let me use London as this story's background. When I moved to London, one question that came to my mind was whether I should buy a car or not. The public transport system in London is quite dense and amazing (Oh!!! I just love the amazing London Underground and I miss it in Toronto).
CloverDX Drinks and Data Meetup - London
Data integration software and ETL tools provided by the CloverDX platform (formerly known as CloverETL) offer solutions for data management tasks such as data integration, data migration, or data quality. CloverDX is a vital part of enterprise solutions such as data warehousing, business intelligence (BI) or master data management (MDM). CloverDX Designer (formerly known as CloverETL Designer) is a visual data transformation designer that helps define data flows and transformations in a quick, visual, and intuitive way. CloverDX Server (formerly known as CloverETL Server) is an enterprise ETL and data integration runtime environment. It offers a set of enterprise features such as automation, monitoring, user management, real-time ETL, data API services, clustering, or cloud data integration.
Data virtualization use cases cover more integration tasks
Gartner predicts that 60% of organizations will deploy data virtualization software as part of their data integration tool set by 2020. That's a big jump from the adoption rate of about 35% the consulting and market research company cited in a November 2018 report on the data virtualization market. But the technology "is rapidly gaining momentum," a group of four Gartner analysts wrote in the report. The analysts said data virtualization use cases are on the rise partly because IT teams are struggling to physically integrate a growing number of data silos, as relational database management system (DBMS) environments are augmented by big data systems and other new data sources. They also pointed to increased technology maturity that has removed deployment barriers for data virtualization users.
Senior Database Developer - IoT BigData Jobs
Zeta Global is currently seeking a strong Database Developer to join our Technical Services team for a long-term and rewarding full-time role. In this role we're looking for someone who is comfortable working with and supporting multiple databases and data-driven, web-based marketing applications and solutions. Job Description: The developer position is primarily responsible for design, development, deployment, and production support for API, middle tier, and database solutions, interacting with RESTful and SOAP APIs, the service layer, batch file import and extract, and web-based applications. The ability to work in a team environment is necessary. The candidate will focus on developing in a multi-tiered environment.
On Single Source Robustness in Deep Fusion Models
Algorithms that fuse multiple input sources benefit from both complementary and shared information. Shared information may provide robustness to faulty or noisy inputs, which is indispensable for safety-critical applications like self-driving cars. We investigate learning fusion algorithms that are robust against noise added to a single source. We first demonstrate that robustness against single source noise is not guaranteed in a linear fusion model. Motivated by this discovery, two possible approaches are proposed to increase robustness: a carefully designed loss with corresponding training algorithms for deep fusion models, and a simple convolutional fusion layer that has a structural advantage in dealing with noise. Experimental results show that both training algorithms and our fusion layer make a deep fusion-based 3D object detector robust against noise applied to a single source, while preserving the original performance on clean data.
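To see why a linear fusion model offers no built-in protection against noise on one source, consider a toy scalar case. This is an illustration only, not the argument from the paper: the weights and noise level are made up, and the fusion is a plain weighted sum.

```python
def linear_fusion(sources, weights):
    """Weighted sum of scalar source readings."""
    return sum(w * x for w, x in zip(weights, sources))

# Hypothetical fusion weights that lean heavily on the first source.
weights = [0.9, 0.1]
clean = [1.0, 1.0]
noise = 5.0

# Corrupt one source at a time and measure how far the fused output moves.
noisy_first = [clean[0] + noise, clean[1]]
noisy_second = [clean[0], clean[1] + noise]

baseline = linear_fusion(clean, weights)
shift_first = linear_fusion(noisy_first, weights) - baseline
shift_second = linear_fusion(noisy_second, weights) - baseline
```

The output moves by exactly the corrupted source's weight times the noise, so a model that fits clean data well while relying mostly on one source is fragile to noise on that source, even when the other source carries the same information.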
Commuting Conditional GANs for Robust Multi-Modal Fusion
Roheda, Siddharth, Krim, Hamid, Riggan, Benjamin S.
This paper presents a data-driven approach to multi-modal fusion, where optimal features for each sensor are selected from a common hidden space shared by the different modalities. The existence of such a hidden space is then used to detect damaged sensors and safeguard the performance of the system. Experimental results show that this approach can make the system robust against noisy or damaged sensors without requiring human intervention to inform the system about the damage.