 Information Retrieval


Oblivion handles hundreds of right to be forgotten demands in SECONDS

AITopics Original Links

In the year since the European Court of Justice ruled that anyone can ask Google to remove personal information about them, the site has evaluated more than one million links. Each request has to be verified and processed by a dedicated team of people, but the sheer volume can cause delays. To speed this up, researchers from Germany and New Zealand have developed an algorithm capable of analysing hundreds of such requests in seconds. Oblivion (illustrated) allows a user to automatically find and tag their personal information on the web, using both text - or natural language processing (NLP) - and image recognition. And they hope to offer it to Google, and other search engines, to help them manage future demands.


Computer Laboratory – Obituaries: Karen Spärck Jones, 1935–2007

AITopics Original Links

Professor Karen Spärck Jones was one of the pioneers in information retrieval (IR) and natural language processing (NLP). She had worked in these areas since the late 1950s and made major contributions to the understanding of information systems. Her international status as a researcher was recognised by the most prestigious awards in her field, the ACM SIGIR Salton Award, the American Society for Information Science and Technology's Award of Merit, the Association for Computational Linguistics Lifetime Achievement Award, the BCS Lovelace Medal, and the ACM-AAAI Allen Newell Award, as well as by her election as a Fellow of the British Academy, of the American Association for Artificial Intelligence, and as a European AI Fellow. Karen Spärck Jones started her research career at the Cambridge Language Research Unit in the late 1950s, working on the use of thesauri for language processing. At this time she collaborated with Roger Needham, whom she married in 1958.


Intelligent Searching Agents on the Web

AITopics Original Links

Many web search engines use the concept of a 'spider' - automated software which goes out onto the web and trawls through the contents of each server it encounters, indexing documents as it finds them. This approach results in the kinds of databases maintained by services such as Alta Vista and Excite - huge indexes to a vast chunk of what's currently available on the web. However, the problems which users can face when using such databases are beginning to be well documented. A recent JISC-funded investigation [1] into the use of web search engines indicates that users can typically encounter a number of difficulties. These include the issue of finding information relevant to their needs, and the problem of information overload - when far too much information is returned from a search.
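The spider-and-index idea described above can be illustrated with a minimal sketch. It is not how Alta Vista, Excite, or any real engine is implemented: the `TOY_WEB` dictionary, its URLs, and the `crawl_and_index` function are hypothetical stand-ins, and the "web" is an in-memory dictionary so that the example runs without network access.

```python
from collections import defaultdict, deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
# A real spider would fetch pages over HTTP instead.
TOY_WEB = {
    "a.example/home": ("search engines index the web", ["a.example/docs", "b.example/"]),
    "a.example/docs": ("a spider crawls each server", ["a.example/home"]),
    "b.example/": ("too much information overloads users", []),
}


def crawl_and_index(start, web):
    """Breadth-first crawl from `start`, building a word -> set-of-URLs index."""
    index = defaultdict(set)
    seen = {start}          # URLs already queued, so no page is visited twice
    queue = deque([start])
    while queue:
        url = queue.popleft()
        text, links = web.get(url, ("", []))
        # Index the document as soon as it is found.
        for word in text.lower().split():
            index[word].add(url)
        # Follow links that have not been seen yet.
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return index


index = crawl_and_index("a.example/home", TOY_WEB)
print(sorted(index["spider"]))
```

Looking up a word in `index` returns every page that contains it, with no ranking, which hints at why a very large index can return far more results than a user can use.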


SIGIR Special Interest Group on Information Retrieval

AITopics Original Links

The ACM Turing Award, the most prestigious technical award in the field of computing (often referred to as the "Nobel Prize of computing"), is turning 50! The ACM is throwing a big birthday celebration June 23-24, 2017 at the Westin St. Francis in San Francisco, California (https://www.acm.org/turing-award-50). Watch this video to learn more about the award and event, which includes commentary from many past recipients: https://www.youtube.com/watch?v=l7qprcl6a-Y. SIGIR is an event sponsor and as part of our sponsorship, we are sending 10 student delegates to the celebration. Those selected will receive travel grants of approximately $1,000-2,000 USD to attend the celebration.


The Role of Intelligent Systems in the National Information Infrastructure

AITopics Original Links

The National Information Infrastructure (NII) will have profound effects on the lives of every citizen. It promises to deliver to people in their homes and offices a vast array of information in many forms, changing the ways in which business is conducted, offering new educational opportunities, bringing geographically dispersed library resources and entertainment materials to everyone's doorstep. It will connect people to people, and help them with their jobs and tasks. For the NII to be useful, however, people will need easy and efficient access to its resources. Today's computers are complex and difficult to use, even for experts. The NII will be orders of magnitude more complex than current systems; it could easily become a labyrinth of databases and services that is inconvenient for experts and inaccessible to many Americans. The field of artificial intelligence (AI) can play a pivotal role in meeting major challenges of the NII. AI uses the theoretical and experimental tools of ...


Finding relevant data in a sea of languages

AITopics Original Links

"About 6,000 languages are currently spoken in the world today," says Elizabeth Salesky of MIT Lincoln Laboratory's Human Language Technology (HLT) Group. "Within the law enforcement community, there are not enough multilingual analysts who possess the necessary level of proficiency to understand and analyze content across these languages," she continues. This problem of too many languages and too few specialized analysts is one Salesky and her colleagues are now working to solve for law enforcement agencies, but their work has potential application for the Department of Defense and Intelligence Community. The research team is taking advantage of major advances in language recognition, speaker recognition, speech recognition, machine translation, and information retrieval to automate language processing tasks so that the limited number of linguists available for analyzing text and spoken foreign languages can be used more efficiently. "With HLT, an equivalent of 20 times more foreign language analysts are at your disposal," says Salesky.


ERBlox: Combining Matching Dependencies with Machine Learning for Entity Resolution

arXiv.org Artificial Intelligence

Appendix A. Relational MDs and the UCI Property

Here, we formally extend the class of matching dependencies (MDs) introduced in Section 2.1, which we will call classical MDs, to the larger class of relational MDs. This extension is motivated by the application of MDs to blocking for entity resolution, but applications can be easily foreseen in other areas where declarative relational knowledge may be useful in combination with matching and merging. We also identify classes of relational MDs for which a single clean instance exists, no matter how the MDs are enforced, that can be computed through the chase procedure in polynomial time in the size of the database on which the MDs are enforced. We say that the MDs (in some cases in combination with an initial instance) have the unique clean instance property (UCI property). More details can be found in [11, 6, 7]. Definition 1. Given a relational schema R, a relational MD is a formula of the form: ϕ: t


AllAnalytics - Pierre DeBois - How IoT and AI Devices are Changing Search

#artificialintelligence

Yet despite this fact of marketplace nature, many companies that offer tech products and services face the daunting task of making changes that customers may perceive as messing with a very good thing, especially if the offering is wildly successful. Having a considerable share of the search engine marketplace against Bing and Yahoo, Google has become the default starting point for queries among many businesses, small and large, and an essential platform for optimizing digital marketing strategies. But now IoT home devices are rivaling search engines for consumer attention, potentially threatening their dominance in the long run. Clickz reported a BloomReach study that indicates Amazon's emerging position as a consumer starting point for product search and price comparison. The survey of 2,000 US consumers revealed that people are increasingly hitting the Amazon website first, with its share of surveyed respondents reaching 55%, an 11% increase over the previous year's results.


How to build a search engine: Part 4

@machinelearnbot

This is the last part on building an end-to-end search engine. In this part we will take a look at how to go about building the front end. This will be an AngularJS application and will consist of some HTML and JavaScript. All the code is readily available on GitHub along with the data itself. Here we will just do a walkthrough of what we are doing to make it all happen.


Google for the dark web: The US Government tech that could scour the hidden internet for criminal activity

Daily Mail - Science & tech

In today's data-rich world, companies, governments and individuals want to analyze anything and everything they can get their hands on – and the World Wide Web has loads of information. At present, the most easily indexed material from the web is text. But as much as 89 to 96 percent of the content on the internet is actually something else – images, video, audio: in all, thousands of different kinds of nontextual data. A map showing hotbeds of dark web activity related to illegal products. Tor - short for The Onion Router - is a seething matrix of encrypted websites that allows users to surf beneath the everyday internet with complete anonymity. It uses numerous layers of security and encryption to render users anonymous online.