

[Research] Companies for compiling training data


I need to retrieve sample data from event sites to compile a training set for a machine-learning-based web scraper. Are you sure this is what you want to do? A web scraper is typically rule-based. But to answer your question, Amazon Mechanical Turk is by far the largest platform if you want access to a (mostly unskilled) workforce, as long as you can frame your task as a questionnaire.

DigiTech Insight Magazine


Global spending on the artificial intelligence (AI) market is estimated to reach $118.6 billion by 2025. Business Wire research unveiled that spending on cloud AI in the media and entertainment (M&E) industry is anticipated to grow from $329 million in 2019 to $1,860.9 million by 2025. The AI market for social media is estimated to reach $3,714.89 million by 2025, growing at a 28.77% CAGR. Here are some examples of how AI is changing the media landscape.
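As a quick sanity check on the quoted figures, the compound annual growth rate implied by the cloud-AI M&E numbers can be computed directly. The endpoints below are the article's own; the six-year span (2019 to 2025) is the only assumption:

```python
# Implied CAGR of cloud-AI spend in media & entertainment:
# $329M (2019) growing to $1,860.9M (2025), i.e. over 6 years.
start, end, years = 329.0, 1860.9, 6

# CAGR = (end / start)^(1 / years) - 1
cagr = (end / start) ** (1 / years) - 1
print(f"Implied CAGR: {cagr:.1%}")  # roughly 33% per year
```

A growth rate in this range is consistent with the separately quoted 28.77% CAGR for the social-media AI segment.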

Facebook makes it easier for users to see News Feed stories in chronological order


For years, Facebook insisted that its algorithms were best placed to curate what people saw in their News Feed. It was like being told that the machine learning systems knew what you wanted better than you did. Only recently has the social media giant taken an altogether different tack. After allowing users to prioritize posts from select friends and Pages, it's now introducing a new "Feed Filter" menu that gives you quick access to its "Most Recent" setting, letting you switch off the algorithmically ranked News Feed. That way, you'll be shown posts from friends, Pages and groups in the order they were posted.

[D] What's the simplest, most lightweight but complete and 100% open source MLOps toolkit?


I know this has been asked many times and in many different ways. And there are tons of blog posts, articles, videos and courses addressing this and comparing hundreds of tools, libraries and frameworks. And that's part of my problem: I am facing so many options that I feel like Buridan's ass, starving because I can't decide. Although I don't want to write too much, I need to say a little about our situation in order to put the question in context. We are only four people, who could be described as beginner data scientists. One of us has a slightly more engineering-oriented profile, so data engineer might be a better fit for him.

[P] Backprop: a library to easily finetune and use state-of-the-art models


I'd like to share Backprop, a Python library I've been co-authoring for the last few months. Our goal is to make finetuning and using models as easy as possible, even without extensive ML experience. We've currently got support for text and image-based tasks, with wrappers around models like Google's T5, OpenAI's CLIP, and Facebook's BART, among others. Once you've got your training data, you can just import your model/task, and then finetune with a single line of code. We've also got some features that make deployment for production easy, but for full transparency, deployment is through a paid platform we've developed that is by no means necessary to use the library.

Category Aware Explainable Conversational Recommendation Artificial Intelligence

Most conversational recommendation approaches are either not explainable, require external user knowledge to generate explanations, or cannot produce explanations in real time due to computational limitations. In this work, we present a real-time, category-based conversational recommendation approach that can provide concise explanations without requiring prior user knowledge. We first build an explainable user model in the form of preferences over the items' categories, and then use these category preferences to recommend items. The user model is learned by applying a BERT-based neural architecture to the conversation. We then translate the user model into item recommendation scores using a feed-forward network. User preferences during the conversation are represented by category vectors, which are directly interpretable. Experimental results on the real conversational recommendation dataset ReDial [12] demonstrate performance comparable to the state-of-the-art, while our approach remains explainable. We also show the potential of our framework by evaluating an oracle setting of category preference prediction. Keywords: Conversational Recommendation · Category Preference Based Recommendation · Explainable Conversational Recommendation · Cold Start Explainable Recommendation.
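The second stage of the pipeline described in the abstract (category preferences translated into item scores by a feed-forward network) can be sketched in plain Python. Everything below is an illustrative assumption, not the paper's actual configuration: the dimensions are invented, the weights are random, and the category-preference vector simply stands in for the output of the BERT-based encoder.

```python
import random

random.seed(0)

# Illustrative sizes (assumptions, not from the paper).
n_categories, n_items, hidden = 8, 20, 16

# Interpretable user model: one preference score per item category,
# standing in for what the BERT-based encoder would produce.
category_prefs = [random.random() for _ in range(n_categories)]

# Random weights for a small two-layer feed-forward network that
# maps category preferences to item recommendation scores.
W1 = [[random.gauss(0, 1) for _ in range(hidden)] for _ in range(n_categories)]
W2 = [[random.gauss(0, 1) for _ in range(n_items)] for _ in range(hidden)]

def matvec(vec, mat):
    # vec of length m times an m x n matrix -> vector of length n.
    return [sum(vec[i] * mat[i][j] for i in range(len(vec)))
            for j in range(len(mat[0]))]

def recommend(prefs):
    h = [max(x, 0.0) for x in matvec(prefs, W1)]  # ReLU hidden layer
    return matvec(h, W2)                          # one score per item

scores = recommend(category_prefs)
top3 = sorted(range(n_items), key=lambda i: scores[i], reverse=True)[:3]
print("Top-3 recommended item ids:", top3)

# The explanation is read directly off the interpretable user model:
explained = sorted(range(n_categories),
                   key=lambda i: category_prefs[i], reverse=True)[:2]
print("Because the user prefers categories:", explained)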

A Survey on Multimodal Disinformation Detection Artificial Intelligence

Recent years have witnessed the proliferation of fake news, propaganda, misinformation, and disinformation online. While initially this was mostly about textual content, over time images and videos gained popularity, as they are much easier to consume, attract much more attention, and spread further than simple text. As a result, researchers started targeting different modalities and combinations thereof. As different modalities are studied in different research communities, with insufficient interaction, here we offer a survey that explores the state-of-the-art on multimodal disinformation detection covering various combinations of modalities: text, images, audio, video, network structure, and temporal information. Moreover, while some studies focused on factuality, others investigated how harmful the content is. While these two components in the definition of disinformation -- (i) factuality and (ii) harmfulness, are equally important, they are typically studied in isolation. Thus, we argue for the need to tackle disinformation detection by taking into account multiple modalities as well as both factuality and harmfulness, in the same framework. Finally, we discuss current challenges and future research directions.

The AI Index 2021 Annual Report Artificial Intelligence

Welcome to the fourth edition of the AI Index Report. This year we significantly expanded the amount of data available in the report, worked with a broader set of external organizations to calibrate our data, and deepened our connections with the Stanford Institute for Human-Centered Artificial Intelligence (HAI). The AI Index Report tracks, collates, distills, and visualizes data related to artificial intelligence. Its mission is to provide unbiased, rigorously vetted, and globally sourced data for policymakers, researchers, executives, journalists, and the general public to develop intuitions about the complex field of AI. The report aims to be the most credible and authoritative source for data and insights about AI in the world.

[D] Simple Questions Thread December 20, 2020


Hi, I'm working in a museum, currently trying to optically characterize a big historic lens. Unfortunately, it is mounted in a device which can't really be taken apart (issues of conservation), so conventional methods are rather hard to do. I've been loosely following the advances in neural network based approaches ("Two minute papers" kinda stuff) and was wondering if anyone has already realized a solution to my problem using machine learning or similar techniques. That is: Print out a defined optical pattern (like a QR code), "wave" it on one side of the lens and record the image with a camera on the other to get a 3D model of the lens in the end. In my head, it should be possible to train a network using conventional light simulation of randomly generated glass bodies.