Information Retrieval

The Use of NLP to Extract Unstructured Medical Data From Text - insideBIGDATA


In healthcare, much of the information needed for accurate predictions and recommendations is available only in free-text clinical notes, where it is trapped in unstructured form. Because this data is needed to make healthcare decisions, it is important to extract it in a way that allows it to be analyzed and used. State-of-the-art NLP algorithms can extract clinical data from text using deep learning techniques such as healthcare-specific word embeddings, named entity recognition models, and entity resolution models.

Can China's new AI news anchors give Anderson Cooper a run for his money?


China's state-owned Xinhua News Agency introduced so-called "composite anchors" on Wednesday, combining the images and voices of human anchors with artificial intelligence (AI) technology. The new AI anchors, launched by Xinhua and Beijing-based search engine operator Sogou during the World Internet Conference in Wuzhen, can deliver the news with "the same effect" as human anchors because the machine learning programme is able to synthesise realistic-looking speech, lip movements and facial expressions, according to a Xinhua news report on Wednesday. "AI anchors have officially become members of the Xinhua News Agency reporting team. They will work with other anchors to bring you authoritative, timely and accurate news information in both Chinese and English," Xinhua said. The AI anchors are now available throughout Xinhua's internet and mobile platforms such as its official Chinese and English apps, WeChat public account, and online TV webpage.

Analysis of Google's New Schema Speakable Markup - Search Engine Journal


Google announced official support for the speakable specification, which will help Google Assistant and Google Home choose which content to read aloud. This new structured data markup is important because it may point to what you'll need to know to get more traffic if and when Google expands this structured data to all websites. Support for the new markup is currently limited to news content. However, it is likely that support for the speakable attribute will expand as Google gains experience with this new structured data markup.
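As a rough illustration of what speakable markup can look like, the sketch below builds a JSON-LD object that uses the schema.org `speakable` property with a `SpeakableSpecification` and `cssSelector` values. The helper name `speakable_jsonld`, the headline, the URL, and the CSS selectors are all hypothetical placeholders, not taken from the article; check Google's current documentation before relying on any particular field.

```python
import json


def speakable_jsonld(headline, url, css_selectors):
    """Build a JSON-LD dict marking page sections as speakable.

    Hypothetical helper: the field layout follows schema.org's
    SpeakableSpecification, but the values passed in are placeholders.
    """
    return {
        "@context": "https://schema.org/",
        "@type": "NewsArticle",
        "headline": headline,
        "url": url,
        "speakable": {
            "@type": "SpeakableSpecification",
            # CSS selectors pointing at the parts of the page to read aloud
            "cssSelector": list(css_selectors),
        },
    }


# Placeholder values for illustration only
markup = speakable_jsonld(
    "Example headline",
    "https://example.com/news/story",
    [".headline", ".summary"],
)

# JSON-LD is usually embedded in the page inside a script tag
script_tag = (
    '<script type="application/ld+json">\n'
    + json.dumps(markup, indent=2)
    + "\n</script>"
)
print(script_tag)
```

The selectors identify the sections of the page that a voice assistant would be allowed to read aloud; schema.org also defines an `xpath` alternative to `cssSelector`.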

Experimentation & Measurement for Search Engine Optimization


For many of our potential guests, planning a trip starts at the search engine. At Airbnb, we want our product to be painless to find for past guests, and easy to discover for new ones. Search engine optimization (SEO) is the process of improving our site -- and more specifically our landing pages -- to ensure that when a traveller looks for accommodations for their next trip, Airbnb is one of the top results on their favorite search engine. Search engines such as Google, Yahoo, Naver, and Baidu deploy their own fleet of "bots" across the internet to build a map of the web and scrape information, or "index", from the pages that they hit. When indexing pages and ranking them for specific search queries, search engines will take into account a variety of factors, including relevance, site performance, and authority.

What is Text Clustering? - insideBIGDATA


Automatic document organization, topic extraction, information retrieval and filtering all have one thing in common. They require text clustering (sometimes also known as document clustering) to be done quickly and accurately. If you've never heard of text clustering, this post will explain what it is, what it does, and how it's currently being used to aid businesses. We'll also briefly discuss how a business could employ text clustering too! First, let's define text clustering.

Farewell Google's clean homepage: New Discover feed will guess your interests


Google has started rolling out its new Discover feed to US users visiting its mobile search site. Google announced the new feed as part of its 20th anniversary revamp of search on mobile, which replaces today's clean blank page with a search box with many more suggestions, in line with the Google app for iOS and Android. Google wants the site to surface relevant information for users by predicting what they're interested in rather than waiting for users to type in a search term. As 9to5Google reports, Google's mobile search site now has a feed of cards with suggested content under a topic category with the Discover star icon. Clicking the topic displays more related articles and allows users to follow the topic.

Search engine for CCTV lets you find people from their description

New Scientist

Finding someone in a surveillance video could soon be as easy as Googling them. Descriptions of people of interest, such as a suspect or a missing person, are normally given in terms of their height, gender or clothing. But using this information to find a short woman wearing a red jacket in a video, say, often requires scanning hours of footage manually, which is no easy task. But a new search tool can do it automatically.

SEO Copywriting: How to Write Content For People and Optimize For Google


If you want to build your blog audience, you're going to have to get smarter with your content. According to Copyblogger, SEO is the most misunderstood topic online. But, SEO content isn't complicated, once you understand that people come first, before search algorithms. SEO firms make their money understanding these simple concepts. Thriving in your online business means that you must go beyond simply "writing content." Your content needs to accomplish two goals: first, appeal to the end-user (customers, clients, prospects, readers, etc.) and second, solve a particular problem. But, how do you create content that meets those goals? How do you create content that ranks well with Google and also persuades people? Don't worry if you can't afford an expensive SEO copywriter. You can do this following simple rules. And, that's what you're going to learn in this article. We all know what happens when you type a search query into a search engine and hit "enter": You get a list of search results that are relevant to your search term. Those results pages appear as a result of search engine optimization (SEO). In a nutshell, SEO is a method of optimizing (enhancing the effectiveness of) your content for the search engines, in order to help it rank higher than content from other sites that target the same search terms.

AI and machine learning means Google now wants brands to pinpoint niches


With Google's use of AI and machine-learning helping it pinpoint, more clearly than ever, the specific factors that satisfy search queries in different niches and contexts, brands and retailers are being encouraged to tightly tailor their search strategies. A new study, "Searchmetrics Google Ranking Factors 2018", reveals, for example, that high-ranking Google results for searches related to the 'weight loss' niche are 4x more likely to have a video on the page than results for 'financial planning' or 'credit' niches. And that eCommerce sites in the 'furniture' niche can get away with displaying nearly 28 images on a page (more than most other niches) and still rank highly despite the fact that more images can sometimes make pages load slower. According to Jordan Koene, Chief Evangelist, VP Professional Services, Google's use of sophisticated AI and machine-learning techniques, such as its RankBrain system, helps it to better understand the real intention behind the words that searchers enter in the search box – and learn what types of web pages will satisfy individual searches. "Google now recognizes much more clearly if someone's searching online to buy a table, for instance, or needs personal finance advice or wants to learn weight loss exercises. And by tracking user signals such as how often certain results are clicked and how long people spend there, the search engine learns what factors – such as more or less images or text, or whether a site uses encryption to protect personal information entered by visitors – are appropriate for satisfying searchers in individual niches."

Norm-Ranging LSH for Maximum Inner Product Search Machine Learning

Neyshabur and Srebro proposed Simple-LSH, which is the state-of-the-art hashing method for maximum inner product search (MIPS) with performance guarantee. We found that the performance of Simple-LSH, in both theory and practice, suffers from long tails in the 2-norm distribution of real datasets. We propose Norm-ranging LSH, which addresses the excessive normalization problem caused by long tails in Simple-LSH by partitioning a dataset into multiple sub-datasets and building a hash index for each sub-dataset independently. We prove that Norm-ranging LSH has lower query time complexity than Simple-LSH. We also show that the idea of partitioning the dataset can improve other hashing based methods for MIPS. To support efficient query processing on the hash indexes of the sub-datasets, a novel similarity metric is formulated. Experiments show that Norm-ranging LSH achieves an order of magnitude speedup over Simple-LSH for the same recall, thus significantly benefiting applications that involve MIPS.
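The partitioning idea in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation. It assumes equal-size partitions by norm rank, sign random projection as the hash, and a plain exact inner-product rerank of candidates in place of the paper's similarity metric for combining sub-indexes, which is not reproduced here. All function names and parameters are hypothetical, and the data is assumed to contain no all-zero partition.

```python
import math
import random


def norm(v):
    return math.sqrt(sum(x * x for x in v))


def partition_by_norm(data, num_parts):
    """Sort items by 2-norm and split them into contiguous norm ranges."""
    ranked = sorted(data, key=norm)
    size = math.ceil(len(ranked) / num_parts)
    return [ranked[i:i + size] for i in range(0, len(ranked), size)]


def simple_lsh_transform(v, max_norm):
    """Simple-LSH item transform: scale by max_norm, append sqrt(1 - ||v/U||^2)."""
    scaled = [x / max_norm for x in v]
    tail = math.sqrt(max(0.0, 1.0 - sum(x * x for x in scaled)))
    return scaled + [tail]


def sign_hash(v, planes):
    """Sign random projection: one bit per random hyperplane."""
    return tuple(1 if sum(a * b for a, b in zip(p, v)) >= 0 else 0 for p in planes)


def build_indexes(data, num_parts, num_bits, seed=0):
    """Build one independent hash index per norm range, each with its own max norm."""
    rng = random.Random(seed)
    dim = len(data[0])
    indexes = []
    for part in partition_by_norm(data, num_parts):
        local_max = max(norm(v) for v in part)  # local, not global, normalizer
        planes = [[rng.gauss(0, 1) for _ in range(dim + 1)] for _ in range(num_bits)]
        buckets = {}
        for v in part:
            code = sign_hash(simple_lsh_transform(v, local_max), planes)
            buckets.setdefault(code, []).append(v)
        indexes.append({"max_norm": local_max, "planes": planes, "buckets": buckets})
    return indexes


def query(indexes, q, top_k=1):
    """Probe the matching bucket in every sub-index, then rerank by exact inner product.

    Simplification: the paper ranks across sub-indexes with its own metric.
    """
    q_norm = norm(q)
    q_t = [x / q_norm for x in q] + [0.0]  # Simple-LSH query transform
    candidates = []
    for idx in indexes:
        candidates.extend(idx["buckets"].get(sign_hash(q_t, idx["planes"]), []))
    candidates.sort(key=lambda v: -sum(a * b for a, b in zip(q, v)))
    return candidates[:top_k]


# Toy data with a long tail in the norm distribution (assumed for illustration)
_rng = random.Random(1)
data = [[_rng.gauss(0, 1) * _rng.paretovariate(2.0) for _ in range(4)] for _ in range(30)]
indexes = build_indexes(data, num_parts=3, num_bits=4)
print([round(idx["max_norm"], 3) for idx in indexes])
print(query(indexes, [1.0, 0.0, 0.0, 0.0]))
```

Because each sub-dataset is scaled by its own maximum norm rather than the global one, items in the low-norm ranges are not squeezed toward zero by a few long-tail outliers, which is the excessive normalization problem the abstract describes.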