Collaborating Authors


AI web scraping augments data collection


Web scraping involves writing a software robot that can automatically collect data from various webpages. Simple bots might get the job done, but more sophisticated bots use AI to find the appropriate data on a page and copy it to the appropriate data field to be processed by an analytics application. AI web scraping-based use cases include e-commerce, labor research, supply chain analytics, enterprise data capture and market research, said Sarah Petrova, co-founder at Techtestreport. These kinds of applications rely heavily on data and the syndication of data from different parties. Commercial applications use web scraping to do sentiment analysis about new product launches, curate structured data sets about companies and products, simplify business process integration and predictively gather data.
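As a rough illustration of the "simple bot" end of that spectrum, the sketch below pulls two values out of a page and copies them into named fields of a record. It is a minimal example under stated assumptions, not how any of the companies mentioned build their scrapers: the sample HTML, the `name` and `price` class names, and the `FieldScraper` and `scrape_fields` helpers are all hypothetical, and only Python's standard `html.parser` module is used.

```python
from html.parser import HTMLParser

# Hypothetical sample page; the class names "name" and "price" are assumptions.
SAMPLE_HTML = """
<div class="product">
  <span class="name">Widget</span>
  <span class="price">$19.99</span>
</div>
"""


class FieldScraper(HTMLParser):
    """Collects the text of <span> elements whose class matches a wanted field."""

    def __init__(self, fields):
        super().__init__()
        self.fields = set(fields)
        self.current = None  # field currently being read, if any
        self.record = {}

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if tag == "span" and css_class in self.fields:
            self.current = css_class

    def handle_data(self, data):
        # Only keep text that appears inside a span we are interested in.
        if self.current is not None:
            previous = self.record.get(self.current, "")
            self.record[self.current] = previous + data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self.current = None


def scrape_fields(html, fields):
    """Return a dict mapping each requested field to the text found for it."""
    parser = FieldScraper(fields)
    parser.feed(html)
    return parser.record


record = scrape_fields(SAMPLE_HTML, ["name", "price"])
print(record)
```

A rule-based parser like this breaks as soon as the page layout changes, which is the gap the more sophisticated, AI-assisted bots described above are meant to close.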

More than 267 million Facebook user phone numbers exposed online


Security expert Bob Diachenko, along with Comparitech, has discovered more than 267 million Facebook user IDs, phone numbers and names in an unsecured database. The huge trove of data is likely the result of an illegal scraping operation or Facebook API abuse by a group of hackers in Vietnam. The exposed data could be used by threat actors to conduct large-scale SMS spam and phishing campaigns. "A database containing more than 267 million Facebook user IDs, phone numbers, and names was left exposed on the web for anyone to access without a password or any other authentication." "Comparitech partnered with security researcher Bob Diachenko to uncover the Elasticsearch cluster."

The Ultimate Beginner's Guide to Data Scraping, Cleaning, and Visualization


If you have a model that has acceptable results but isn't amazing, take a look at your data! Taking the time to clean and preprocess your data the right way can make your model a star. In order to look at scraping and preprocessing in more detail, let's look at some of the work that went into "You Are What You Tweet: Detecting Depression in Social Media via Twitter Usage." That way, we can really examine the process of scraping Tweets and then cleaning and preprocessing them. We'll also do a little exploratory visualization, which is an awesome way to get a better sense of what your data looks like!
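To give a sense of what cleaning a Tweet can involve, here is a minimal sketch using only Python's standard `re` module. It is not the preprocessing used in the project named above. The `clean_tweet` helper and the specific rules (lowercasing, dropping URLs and @mentions, keeping hashtag words, removing everything that is not a letter) are assumptions chosen for illustration.

```python
import re

# Patterns are illustrative assumptions, not the cited project's actual rules.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
NON_ALPHA_RE = re.compile(r"[^a-z\s]")


def clean_tweet(text):
    """Lowercase a Tweet and strip URLs, @mentions, and non-letter characters."""
    text = text.lower()
    text = URL_RE.sub(" ", text)        # remove links
    text = MENTION_RE.sub(" ", text)    # remove @usernames
    text = text.replace("#", " ")       # keep the hashtag word, drop the symbol
    text = NON_ALPHA_RE.sub(" ", text)  # drop digits, punctuation, emoji
    return " ".join(text.split())       # collapse repeated whitespace


sample = "Feeling SO tired today... @friend check https://example.com #nosleep"
print(clean_tweet(sample))
```

Whether to discard punctuation, emoji, or digits depends on the model; for sentiment-style tasks some of those signals may be worth keeping.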

Wavethrough Vulnerability In Microsoft Edge Could Allow Data Scraping


We all know Microsoft has recently launched a massive 'bug fix bundle' in which it released patches for around 50 vulnerabilities, including the patch for Cortana's Lock Screen Bypass Vulnerability. However, not many know about 'all' of these vulnerabilities for which Microsoft released fixes. It was also strange that it released patches for 50 different bugs together. It seems the team has been silently working out how to solve various issues reported to them over the past months. Now, an independent security researcher has unveiled one such issue.