AITopics | Information Fusion

Knowledge Perceived Multi-modal Pretraining in E-commerce

Zhu, Yushan, Tou, Huaixiao, Zhang, Wen, Ye, Ganqiang, Chen, Hui, Zhang, Ningyu, Chen, Huajun

arXiv.org Artificial Intelligence, Aug-20-2021

In this paper, we address multi-modal pretraining of product data in the field of E-commerce. Current multi-modal pretraining methods proposed for image and text modalities lack robustness in the face of modality-missing and modality-noise, which are two pervasive problems of multi-modal product data in real E-commerce scenarios. To this end, we propose a novel method, K3M, which introduces a knowledge modality into multi-modal pretraining to correct the noise in, and compensate for the absence of, the image and text modalities. The modal-encoding layer extracts the features of each modality. The modal-interaction layer models the interaction of multiple modalities: an initial-interactive feature fusion model is designed to maintain the independence of the image and text modalities, and a structure aggregation module is designed to fuse the information of the image, text, and knowledge modalities. We pretrain K3M with three pretraining tasks: masked object modeling (MOM), masked language modeling (MLM), and link prediction modeling (LPM). Experimental results on a real-world E-commerce dataset and a series of product-based downstream tasks demonstrate that K3M significantly outperforms the baseline and state-of-the-art methods when modality-noise or modality-missing exists.

information, modality, representation, (16 more...)

doi: 10.1145/3474085.3475648

2109.00895

Country:

Asia > South Korea (0.14)
Asia > China > Zhejiang Province > Hangzhou (0.05)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report > Promising Solution (0.54)

Industry: Information Technology > Services > e-Commerce Services (1.00)

Technology:

Information Technology > e-Commerce (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)
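
Of the three pretraining tasks named in the K3M abstract above, masked language modeling is the easiest to illustrate in isolation. The following is a minimal sketch of the input-masking step only, not the K3M implementation: the `[MASK]` placeholder, the `mask_tokens` helper, the masking probability, and the sample product title are all assumptions made for illustration.

```python
import random

MASK_TOKEN = "[MASK]"  # hypothetical placeholder token, not taken from the paper


def mask_tokens(tokens, mask_prob=0.15, seed=0):
    """Return (masked_tokens, labels) for one masked-language-modeling example.

    labels[i] holds the original token where a mask was applied and None
    elsewhere, so a loss would only be computed on the masked positions.
    """
    rng = random.Random(seed)
    masked, labels = [], []
    for token in tokens:
        if rng.random() < mask_prob:
            masked.append(MASK_TOKEN)
            labels.append(token)
        else:
            masked.append(token)
            labels.append(None)
    return masked, labels


# Hypothetical product title used only as sample input.
title = "red cotton summer dress with floral print".split()
masked, labels = mask_tokens(title, mask_prob=0.3, seed=1)
print(masked)
print(labels)
```

A model trained on this task would receive `masked` as input and be asked to recover the non-`None` entries of `labels`.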

Efficient Online Estimation of Causal Effects by Deciding What to Observe

Gupta, Shantanu, Lipton, Zachary C., Childers, David

arXiv.org Machine Learning, Aug-20-2021

Researchers often face data fusion problems, where multiple data sources are available, each capturing a distinct subset of variables. While problem formulations typically take the data as given, in practice, data acquisition can be an ongoing process. In this paper, we aim to estimate any functional of a probabilistic model (e.g., a causal effect) as efficiently as possible, by deciding, at each time, which data source to query. We propose online moment selection (OMS), a framework in which structural assumptions are encoded as moment conditions. The optimal action at each step depends, in part, on the very moments that identify the functional of interest. Our algorithms balance exploration with choosing the best action as suggested by current estimates of the moments. We propose two selection strategies: (1) explore-then-commit (OMS-ETC) and (2) explore-then-greedy (OMS-ETG), proving that both achieve zero asymptotic regret as assessed by MSE. We instantiate our setup for average treatment effect estimation, where structural assumptions are given by a causal graph and data sources may include subsets of mediators, confounders, and instrumental variables.

data source, oms-etg, proposition 2, (16 more...)

2108.09265

Country:

Asia > Vietnam (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
(2 more...)

Genre:

Research Report > Experimental Study (0.46)
Research Report > New Finding (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.66)
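
The explore-then-commit idea mentioned in the abstract above can be illustrated with a toy problem. This sketch is not OMS-ETC: it uses no moment conditions, and it assumes every source measures the same scalar quantity at equal cost per query, which is far simpler than the paper's setting of sources covering distinct variable subsets. The function name, the source names, and the noise levels are hypothetical.

```python
import random
import statistics


def explore_then_commit(sources, budget, explore_fraction=0.2, seed=0):
    """Toy explore-then-commit over data sources that measure the same mean.

    sources: dict mapping a name to a function(rng) that returns one noisy
    observation. Returns (estimate, committed_source, samples_per_source).
    """
    rng = random.Random(seed)
    names = list(sources)
    samples = {name: [] for name in names}

    # Exploration: split the exploration budget evenly across sources.
    per_source = max(2, int(budget * explore_fraction) // len(names))
    for name in names:
        for _ in range(per_source):
            samples[name].append(sources[name](rng))

    # Commit: spend the rest on the source that looks least noisy so far.
    best = min(names, key=lambda n: statistics.variance(samples[n]))
    remaining = budget - per_source * len(names)
    for _ in range(max(0, remaining)):
        samples[best].append(sources[best](rng))

    # Combine the per-source means with inverse-variance weights.
    weights, means = [], []
    for name in names:
        var_of_mean = statistics.variance(samples[name]) / len(samples[name])
        weights.append(1.0 / var_of_mean)
        means.append(statistics.fmean(samples[name]))
    estimate = sum(w * m for w, m in zip(weights, means)) / sum(weights)
    return estimate, best, {n: len(s) for n, s in samples.items()}


TRUE_EFFECT = 2.0  # hypothetical ground truth for the simulation
sources = {
    "noisy": lambda rng: rng.gauss(TRUE_EFFECT, 5.0),
    "precise": lambda rng: rng.gauss(TRUE_EFFECT, 0.5),
}
estimate, committed, counts = explore_then_commit(sources, budget=1000)
print(committed, counts, round(estimate, 3))
```

The commit step here is fixed after a single exploration phase; a greedy variant would instead re-evaluate the choice as more data arrives.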

Bootstrap a Modern Data Stack in 5 minutes with Terraform - KDnuggets

#artificialintelligence, Aug-13-2021, 00:01:50 GMT

Modern Data Stack (MDS) is a stack of technologies that makes a modern data warehouse perform 10–10,000x better than a legacy data warehouse. Ultimately, an MDS saves time, money, and effort. The four pillars of an MDS are a data connector, a cloud data warehouse, a data transformer, and a BI & data exploration tool. Easy integration is made possible with managed and open-source tools that pre-build hundreds of ready-to-use connectors. What used to take a team of data engineers to build and maintain regularly can now be replaced with a tool for simple use cases.

data warehouse, modern data stack, warehouse, (14 more...)

Industry: Information Technology > Services (0.33)

Technology:

Information Technology > Cloud Computing (1.00)
Information Technology > Data Science > Data Integration (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (0.48)

The best consultancy for business, with sales and marketing data insights too: we review

#artificialintelligence, Jul-30-2021, 08:00:06 GMT

In digital marketing, good, clean, and insightful data is a key pillar on which a business stands to drive growth and profits. Having clear and precise data-driven outcomes should be a priority for all marketers. When data is used in tandem with well-defined marketing and sales goals and various marketing tools and techniques, companies will discover that their lead-to-sale conversion process can be far less cumbersome and more rewarding. Possessing clean data will help marketers identify detailed segments based on user attributes, past behaviours, interactions, and other necessary data points. Data can be leveraged for highly targeted campaigns that drive marketing return on investment (ROI).

best consultancy, consultancy, data integration, (9 more...)

Country: Oceania > Australia (0.40)

Industry: Marketing (1.00)

Technology:

Information Technology > Data Science > Data Integration (0.44)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (0.44)

A guide to ETL Testing

#artificialintelligence, Jul-29-2021, 23:25:31 GMT

Even though the above diagram is a bit of a simplification, this is what most ETL workflows look like. To put it simply, ETL is an automated process that moves data from source systems to target systems through Extract, Transform, and Load stages, without data loss and while maintaining data integrity. This is also usually referred to as data migration. The objective of ETL is to have clean, classified, enriched, and curated data in one place (a data warehouse or data lake). Machine-learning models and analytic tools are run against this data to fetch useful information and predictions, on which business decisions can be based.

etl testing, etl workflow, move data

Genre: Workflow (0.46)

Technology:

Information Technology > Data Science > Data Mining > Big Data (1.00)
Information Technology > Data Science > Data Integration (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (1.00)
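
The two properties this excerpt asks of an ETL run, no data loss and preserved data integrity, can be checked with simple reconciliation queries. The sketch below is an assumption-laden illustration rather than the article's method: the table names, columns, and the `run_etl` and `check_etl` helpers are hypothetical, and an in-memory SQLite database stands in for both the source and target systems.

```python
import sqlite3


def run_etl(conn):
    """Extract rows from source_orders, normalise them, load target_orders."""
    rows = conn.execute(
        "SELECT id, customer, amount_cents FROM source_orders"
    ).fetchall()
    # Transform: trim and lowercase names, convert cents to currency units.
    transformed = [(i, c.strip().lower(), cents / 100.0) for i, c, cents in rows]
    conn.executemany(
        "INSERT INTO target_orders (id, customer, amount) VALUES (?, ?, ?)",
        transformed,
    )
    conn.commit()


def check_etl(conn):
    """Return basic reconciliation checks between source and target tables."""
    src_count = conn.execute("SELECT COUNT(*) FROM source_orders").fetchone()[0]
    tgt_count = conn.execute("SELECT COUNT(*) FROM target_orders").fetchone()[0]
    src_total = conn.execute("SELECT SUM(amount_cents) FROM source_orders").fetchone()[0]
    tgt_total = conn.execute("SELECT SUM(amount) FROM target_orders").fetchone()[0]
    missing = conn.execute(
        "SELECT COUNT(*) FROM source_orders s "
        "LEFT JOIN target_orders t ON s.id = t.id WHERE t.id IS NULL"
    ).fetchone()[0]
    return {
        "row_counts_match": src_count == tgt_count,
        "totals_match": abs(src_total / 100.0 - tgt_total) < 1e-6,
        "no_missing_ids": missing == 0,
    }


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE source_orders (id INTEGER PRIMARY KEY, customer TEXT, amount_cents INTEGER)"
)
conn.execute(
    "CREATE TABLE target_orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO source_orders VALUES (?, ?, ?)",
    [(1, " Alice ", 1250), (2, "BOB", 999), (3, "carol", 4000)],
)
run_etl(conn)
results = check_etl(conn)
print(results)
```

Row counts and missing-key checks catch dropped records, while the total comparison catches values that were altered in transit; real pipelines would usually add column-level checks as well.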

Multimodal Co-learning: Challenges, Applications with Datasets, Recent Advances and Future Directions

Rahate, Anil, Walambe, Rahee, Ramanna, Sheela, Kotecha, Ketan

arXiv.org Artificial Intelligence, Jul-29-2021

Multimodal deep learning systems that employ multiple modalities such as text, image, audio, and video show better performance than single-modality (i.e., unimodal) systems. Multimodal machine learning involves multiple aspects: representation, translation, alignment, fusion, and co-learning. In the current state of multimodal machine learning, the assumptions are that all modalities are present, aligned, and noiseless during training and testing time. However, in real-world tasks, one or more modalities are typically missing, noisy, lacking annotated data, unreliably labeled, or scarce in training, testing, or both. This challenge is addressed by a learning paradigm called multimodal co-learning. The modeling of a (resource-poor) modality is aided by exploiting knowledge from another (resource-rich) modality using transfer of knowledge between modalities, including their representations and predictive models. Because co-learning is an emerging area, there are no dedicated reviews explicitly focusing on all challenges addressed by co-learning. To that end, in this work, we provide a comprehensive survey of the emerging area of multimodal co-learning, which has not yet been explored in its entirety. We review implementations that overcome one or more co-learning challenges without explicitly considering them as co-learning challenges. We present a comprehensive taxonomy of multimodal co-learning based on the challenges addressed by co-learning and the associated implementations. The various techniques employed, including the latest ones, are reviewed along with some of the applications and datasets. Our final goal is to discuss challenges and perspectives along with the important ideas and directions for future work, which we hope will benefit the entire research community focusing on this exciting domain.

learning, modality, representation, (16 more...)

2107.13782

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > Canada > Manitoba > Winnipeg Metropolitan Region > Winnipeg (0.04)
Asia > India (0.04)
(5 more...)

Genre:

Research Report (1.00)
Overview (1.00)
Instructional Material > Course Syllabus & Notes (0.67)

Industry:

Education (1.00)
Information Technology (0.67)
Health & Medicine > Therapeutic Area (0.67)
Health & Medicine > Diagnostic Medicine > Imaging (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Vision > Image Understanding (0.93)
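
The missing-modality problem raised in the survey abstract above can be made concrete with a very small baseline. The sketch below shows late fusion that simply skips absent modalities; it is not a co-learning method and transfers no knowledge between modalities. The `fuse_available` helper, the modality names, and the scores are hypothetical.

```python
def fuse_available(scores_by_modality):
    """Average class scores over the modalities that are present.

    scores_by_modality maps a modality name to a list of class scores,
    or to None when that modality is missing for the sample.
    """
    present = [s for s in scores_by_modality.values() if s is not None]
    if not present:
        raise ValueError("at least one modality is required")
    n_classes = len(present[0])
    if any(len(s) != n_classes for s in present):
        raise ValueError("all modalities must score the same classes")
    return [sum(s[k] for s in present) / len(present) for k in range(n_classes)]


# Hypothetical sample in which the image modality is missing.
sample = {"text": [0.7, 0.3], "image": None, "audio": [0.5, 0.5]}
fused = fuse_available(sample)
print(fused)
```

A baseline like this degrades gracefully when a modality is absent but cannot compensate for it, which is the gap that knowledge transfer between modalities is meant to close.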

Ray 1.5 looks to simplify data exchanges, refactored architecture keeps jobs going

#artificialintelligence, Jul-28-2021, 12:46:09 GMT

Distributed execution framework Ray 1.5 is ready for downloading, providing devs in the machine learning space with a first look at new data …

ray 1, simplify data exchange

Industry: Media > News (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.52)
Information Technology > Data Science > Data Integration (0.40)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (0.40)

Sapper - As simple as connecting the dots

#artificialintelligence, Jul-27-2021, 12:15:37 GMT

Sapper makes it easy to automate processes, experiences and solutions to help you adapt to today’s digital age. We have consolidated the most common tasks (integrating applications and data, preparing data for analytics, and interacting with bots) into one simple product. The Sapper solution is rooted in the team’s deep expertise in the latest AI, Automation and Cloud technologies and strong partnerships. It's as simple as connecting the dots. Imagine, Develop & Automate your Automation, ai.

integration, intelligently automate integration, sapper, (1 more...)

Technology:

Information Technology > Data Science > Data Integration (0.42)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (0.42)

ETL Testing in a nutshell

#artificialintelligence, Jul-26-2021, 10:20:57 GMT

Even though the above diagram is a bit of a simplification, this is what most ETL workflows look like. To put it simply, ETL is an automated process that moves data from source systems to target systems through Extract, Transform, and Load stages, without data loss and while maintaining data integrity. This is also usually referred to as data migration. The objective of ETL is to have clean, classified, enriched, and curated data in one place (a data warehouse or data lake). Machine Learning models and analytic tools are run against this data to fetch useful information and predictions, on which business decisions can be based.

etl testing, etl workflow, nutshell, (1 more...)

Genre: Workflow (0.45)

Technology:

Information Technology > Data Science > Data Mining > Big Data (1.00)
Information Technology > Data Science > Data Integration (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (1.00)

6 Key Components of a Successful Data Strategy

#artificialintelligence, Jul-23-2021, 07:06:19 GMT

I've already mentioned data catalogs as one strategic tool. By necessity, they're provisioned by IT and data management teams, who know how to work with the various features in data catalog software and how to set up and deploy them. We can make a useful distinction between tools provisioned in this way by IT and tools adopted by end users. Both have an important role to play in a data strategy, complementing rather than contradicting each other. Data management tools are almost always the domain of IT.

application, key component, successful data strategy

Technology:

Information Technology > Information Management (1.00)
Information Technology > Data Science > Data Integration (0.38)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (0.38)
