AITopics | Information Retrieval


Information Retrieval

Our accustomed systems of retrieving particular bits of information no longer fill the needs of many people. Searching traditional indexes of print publications has been aided by computerized databases, but still usually requires time-consuming serial searching of one database after the other, and then moving on to other methods of searching for internet sources. And what if the information being sought is a sound bite? A video clip? Yesterday's e-mail exchange between respected scientists? Artificial intelligence may hold the key to information retrieval in an age where widely different formats contain the information being sought, and the universe of knowledge is simply too big and growing too rapidly for successful searching to proceed at a human's slow speed.


Google's Matt Brittin admits firm needs to work harder to remove illegal content

Daily Mail - Science & tech | Dec-28-2016, 13:25:04 GMT

Google's autocomplete can automatically load offensive search terms. Matt Brittin urged people to make their own judgements on harmful content. Mr Brittin told BBC Radio 4's Today programme the tool did help save time, but the Google Europe boss said the firm would also refine its algorithm. The top result for people who searched 'Did the Holocaust happen?' was an article by white supremacist site Stormfront.

artificial intelligence, information retrieval, natural language, (15 more...)

Daily Mail - Science & tech

Country:

Asia > Middle East > Saudi Arabia > Mecca Province > Jeddah (0.46)
Europe (0.26)

Industry:

Transportation > Ground > Rail (1.00)
Media (1.00)
Leisure & Entertainment (1.00)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.58)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.41)


How will Google's AI Improvements Change SEO for Marketers? – Marketing and Entrepreneurship

#artificialintelligence | Dec-28-2016, 08:25:13 GMT

If you prefer reading, here's the quick recap on what changes AI will bring to marketers according to these four industry influencers, plus some of my personal suggestions of what you should do in the face of these changes: According to Sam Mallikarjunan, Head of Growth of HubSpot Labs, visual content will have an increasing influence on SEO, as he says, "search engines are getting good at knowing what a video, audio clip, or image is actually about." Not only does Google favor YouTube videos in search results, it is also getting better at analyzing what visual content is about. Just as content writers had to learn to optimize headings and keywords, visual artists will have to start thinking about SEO when creating visual content like images and videos. SEO for videos, for example, means optimizing keyword targeting, descriptions, tags, video length, and more. Here's a great guide on optimizing videos for SEO from Brian Dean, if you want to learn more.

artificial intelligence, information retrieval, natural language, (18 more...)

#artificialintelligence

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.36)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.31)


How to build a search engine - Part 2: Configuring elasticsearch

@machinelearnbot | Dec-27-2016, 04:45:03 GMT

In this post we will focus on configuring the elasticsearch bit. I have chosen the Wikipedia people dump for the dataset. These are the wiki pages of a subset of people on Wikipedia. The dataset consists of three columns: URI, name, and text. As the column names suggest, URI is the actual wiki link to that person's page, and name is the person's name.
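As a rough illustration of what configuring an index for this three-column dataset might involve, the sketch below builds an index-creation body as a plain Python dictionary. The index name, the shard settings, and the field types are assumptions made for illustration; the excerpt does not show the post's actual configuration. The body uses the typeless mapping layout of recent Elasticsearch versions, which may differ from the version the post targeted.

```python
import json

# Hypothetical index name; the excerpt does not give one.
INDEX_NAME = "wikipedia_people"


def build_index_body():
    """Return an index-creation body for the columns URI, name, and text."""
    return {
        # Assumed single-node development settings.
        "settings": {"number_of_shards": 1, "number_of_replicas": 0},
        "mappings": {
            "properties": {
                # The link is an exact identifier, so it is not analyzed.
                "URI": {"type": "keyword"},
                # Names and page text are analyzed for full-text search.
                "name": {"type": "text"},
                "text": {"type": "text"},
            }
        },
    }


if __name__ == "__main__":
    # Print the JSON that would be sent when creating INDEX_NAME.
    print(INDEX_NAME)
    print(json.dumps(build_index_body(), indent=2))
```

Mapping URI as a keyword field keeps the link intact for exact lookups, while the two text fields are tokenized so that queries can match individual words.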

artificial intelligence, information retrieval, natural language, (18 more...)

@machinelearnbot

Technology:

Information Technology > Communications (1.00)
Information Technology > Information Management > Search (0.66)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.55)


Latent Tree Models for Hierarchical Topic Detection

Chen, Peixian, Zhang, Nevin L., Liu, Tengfei, Poon, Leonard K. M., Chen, Zhourong, Khawar, Farhan

arXiv.org Machine Learning | Dec-21-2016

We present a novel method for hierarchical topic detection where topics are obtained by clustering documents in multiple ways. Specifically, we model document collections using a class of graphical models called hierarchical latent tree models (HLTMs). The variables at the bottom level of an HLTM are observed binary variables that represent the presence/absence of words in a document. The variables at other levels are binary latent variables, with those at the lowest latent level representing word co-occurrence patterns and those at higher levels representing co-occurrence of patterns at the level below. Each latent variable gives a soft partition of the documents, and document clusters in the partitions are interpreted as topics. Latent variables at high levels of the hierarchy capture long-range word co-occurrence patterns and hence give thematically more general topics, while those at low levels of the hierarchy capture short-range word co-occurrence patterns and give thematically more specific topics. Unlike LDA-based topic models, HLTMs do not refer to a document generation process and use word variables instead of token variables. They use a tree structure to model the relationships between topics and words, which is conducive to the discovery of meaningful topics and topic hierarchies.
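To make the bottom level of the model concrete, the sketch below builds the observed binary word variables described in the abstract: one 0/1 value per vocabulary word per document, recording presence or absence rather than counts. The function name and the whitespace tokenizer are assumptions for illustration. The sketch covers only this input representation, not the learning of the latent tree itself.

```python
def binary_word_matrix(documents, vocabulary=None):
    """Return (vocabulary, rows) where rows[d][w] is 1 if word w occurs in document d."""
    # Assumed tokenizer: lowercase and split on whitespace.
    tokenized = [set(doc.lower().split()) for doc in documents]
    if vocabulary is None:
        # Default vocabulary: every word seen in the collection, sorted.
        vocabulary = sorted(set().union(*tokenized))
    rows = [
        [1 if word in tokens else 0 for word in vocabulary]
        for tokens in tokenized
    ]
    return vocabulary, rows


if __name__ == "__main__":
    vocab, rows = binary_word_matrix(["search engine ranking", "engine oil oil"])
    print(vocab)
    for row in rows:
        print(row)
```

A word that occurs several times in a document still yields a single 1, which reflects the abstract's point that the model uses word variables instead of token variables.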

latent variable, survey article, us government, (24 more...)

arXiv.org Machine Learning

1605.0665

Country:

North America > United States (0.68)
Asia > China (0.14)

Genre: Research Report > Promising Solution (0.34)

Industry:

Leisure & Entertainment (1.00)
Banking & Finance (1.00)
Energy > Oil & Gas (0.68)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)


How to build a search engine: Part 1

@machinelearnbot | Dec-18-2016, 05:25:02 GMT

In this multi-part series, we will explore how to build a search engine. It will be quite powerful and industrial strength. The first part will focus on getting the right tools and getting the technology stack ready. We will build this search engine with an AngularJS front-end and use elasticsearch as the computation back end. Most applications of today are data driven.

elasticsearch, information retrieval, natural language, (19 more...)

@machinelearnbot

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)


Conditional Random Fields (CRF): Short Survey

@machinelearnbot | Dec-16-2016, 18:15:03 GMT

CRF is not well suited to keyword extraction because it cannot handle unknown words. Moreover, adding new data to the training dataset forces us to re-train the whole CRF model, which may be quite time-consuming due to the high complexity of the training phase of the algorithm. CRF shows good performance when dealing with entity recognition (any types of entities, including named entities, time expressions, etc.). It can use both linguistic (characters, words) and non-linguistic information (upper/lower case, punctuation marks, spaces, etc.). The achievable quality of entity recognition is about 0.7-0.85.
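The mix of linguistic and non-linguistic information mentioned above can be pictured as a per-token feature dictionary. The sketch below is one possible layout, not the survey's own feature set: the function names and feature keys are hypothetical, and no CRF is trained here. It only shows the kind of input such a model would consume.

```python
import string


def token_features(tokens, i):
    """Describe tokens[i] with word-based and surface-form features."""
    word = tokens[i]
    features = {
        # Linguistic information: the word itself and a short suffix.
        "word.lower": word.lower(),
        "suffix3": word[-3:],
        # Non-linguistic information: case, digits, punctuation.
        "is_upper": word.isupper(),
        "is_title": word.istitle(),
        "is_digit": word.isdigit(),
        "is_punct": bool(word) and all(ch in string.punctuation for ch in word),
    }
    # Neighbouring words give the model sequence context.
    if i > 0:
        features["prev.lower"] = tokens[i - 1].lower()
    else:
        features["BOS"] = True  # beginning of sentence
    if i < len(tokens) - 1:
        features["next.lower"] = tokens[i + 1].lower()
    else:
        features["EOS"] = True  # end of sentence
    return features


def sentence_features(tokens):
    """Return one feature dictionary per token in the sentence."""
    return [token_features(tokens, i) for i in range(len(tokens))]
```

Surface features such as capitalization still apply to a word that never appeared in training, although the word-identity feature itself carries no learned weight in that case.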

information retrieval, machine learning, natural language, (18 more...)

@machinelearnbot

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.81)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.55)


How Search Engines Are Killing Clever URLs

The Atlantic - Technology | Dec-16-2016, 14:45:50 GMT

Although investors scrambled--and shelled out up to $185,000 a pop--for the chance to snatch up the new domains and profit as gatekeepers, uptake among end-users has been underwhelming. More than three years after the program's launch, roughly 26 million new generic top-level domains have been registered, compared with the 164 million registered "legacy" top-level domains. Cyrus Namazi, the vice president of domain-name services and industry engagement at ICANN, acknowledged that demand for new top-level domains won't eclipse that for legacies "any time soon." Yet Namazi believes registrations for the new extensions will continue to grow. "We are in the embryonic stages of the expansion," he said.

artificial intelligence, information retrieval, natural language, (7 more...)

The Atlantic - Technology

Technology:

Information Technology > Information Management > Search (0.40)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.40)


Omnity search engine finds documents relevant to yours – regardless of language

#artificialintelligence | Dec-15-2016, 00:45:09 GMT

With the amount of published research, patents, white papers, and other written knowledge out there, it's hard to be even reasonably sure you're aware of the goings-on around a certain topic or field. Omnity is a search engine made to make it easier by extracting the gist of documents you give it and finding related ones from a library of millions -- and now supports over a hundred languages. The process is simple and free, at least for the public-facing databases Omnity has assembled, comprising U.S. patents, SEC filings, PubMed papers, clinical trials, Library of Congress collections, and more. You upload a document or text snippet, and the system scans it, looking for the least common words and phrases -- which generally indicate things like topic, experiment type, equipment used, that sort of thing. It then looks through its own libraries to find documents with similar or related phrases that appear in a manner that suggests relevance. For example, say you put in the results of your clinical trial testing a food additive on a certain strain of mice, and found it resulted in a certain condition.
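The general idea of matching on uncommon words can be sketched with inverse document frequency (IDF) weighting. This is a swapped-in textbook technique, not Omnity's actual method, which the article does not describe in detail. All function names, the whitespace tokenizer, and the choice of k are assumptions for illustration.

```python
import math


def inverse_document_frequency(library):
    """Map each word to log(N / df); rarer words get larger weights."""
    doc_freq = {}
    for doc in library:
        for word in set(doc.lower().split()):
            doc_freq[word] = doc_freq.get(word, 0) + 1
    total = len(library)
    return {word: math.log(total / df) for word, df in doc_freq.items()}


def rarest_terms(query, idf, k=3):
    """Pick the k query words that are least common in the library."""
    terms = {word for word in query.lower().split() if word in idf}
    # Sort by descending rarity, breaking ties alphabetically.
    return sorted(terms, key=lambda word: (-idf[word], word))[:k]


def rank_by_rare_terms(query, library, k=3):
    """Return indexes of library documents sharing the query's rare words, best first."""
    idf = inverse_document_frequency(library)
    rare = rarest_terms(query, idf, k)
    scored = []
    for index, doc in enumerate(library):
        words = set(doc.lower().split())
        score = sum(idf[word] for word in rare if word in words)
        scored.append((score, index))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    # Documents that share none of the rare words are dropped.
    return [index for score, index in scored if score > 0]
```

A word that appears in every document gets a weight of zero, so common words such as "the" contribute nothing to the ranking.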

artificial intelligence, information retrieval, natural language, (17 more...)

#artificialintelligence

Country: North America > United States (0.57)

Genre: Research Report > New Finding (0.58)

Industry:

Law (0.92)
Government > Regional Government > North America Government > United States Government (0.57)

Technology:

Information Technology > Information Management > Search (0.65)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.65)


What we've learned about SEO in 2016

#artificialintelligence | Dec-14-2016, 21:25:05 GMT

Since the inception of the search engine, SEO has been an important, yet often misunderstood industry. For some, these three little letters bring massive pain and frustration. For others, SEO has saved their business. One thing is for sure: having a clear and strategic search strategy is what often separates those who succeed from those who don't. As we wrap up 2016, let's take a look at how the industry has grown and shifted over the past year, and then look ahead to 2017.

artificial intelligence, information retrieval, natural language, (13 more...)

#artificialintelligence

Industry: Information Technology > Services (0.48)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.57)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.36)


Omnity's search engine uses rare word matching to find unexpected results

Engadget | Dec-13-2016, 15:25:03 GMT

When it comes to search, there's Google and there's everyone else -- the company is basically synonymous with searching the internet. But Omnity, a relatively new company from San Francisco, thinks its own search, which is based on "semantic mapping," offers something that Google can't. Omnity's trick is that it looks for the connections between documents on the internet based on rare words -- the theory is that research documents sharing several of the same rare words will likely be about related topics, even if they don't directly link to or cite each other. Thus far, Omnity has operated primarily by selling enterprise plans to companies and educational institutions. Omnity can search not only all of the public datasets it scans (like patents, scientific, engineering and medical documents, clinical trials, case law, SEC filings and so forth) but also a company's internal documents -- for some companies, Omnity indexes 150 petabytes of data.

artificial intelligence, information retrieval, natural language, (10 more...)

Engadget

Country: North America > United States > California > San Francisco County > San Francisco (0.26)

Industry:

Law (0.57)
Health & Medicine (0.57)

Technology:

Information Technology > Information Management > Search (0.53)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.42)
