 Information Retrieval


How Artificial Intelligence will impact professional writing

#artificialintelligence

An AI algorithm developed by researchers at Salesforce generates short snippets that capture the essence of a longer text. Such tools can help writers skim through many articles and find relevant topics to write about. "Since new semantic technologies are now mature enough to read human language, journalists and professional writers can finally go back to writing for people," Cuofano says. "The next revolution (which is already coming) is the leap from NLP to a subset of it called NLU (Natural Language Understanding)," he adds.


Google search is a powerful job hunting tool thanks to AI

Engadget

After announcing a slew of new updates to its smart home, VR and mobile products, Google unveiled the latest feature coming to its core function -- the search engine. In the next few weeks, users in the US will be able to look for job listings on Google.com. This function will make it easier to discover jobs close to you, as well as positions that have traditionally been more difficult for existing portals to find and classify (such as those in retail and service). According to Google, "almost half of U.S. employers say they still have issues filling open positions," while job seekers aren't necessarily aware of listings available near them. The search giant says this is because high turnover, low traffic and inconsistency related to job posts make them difficult for engines to classify.


SEO in 2017 - Learning to play by 7 new rules - Smart Insights Digital Marketing Advice

#artificialintelligence

Competition – it motivates all of us to become better at what we do. Marketers are no different, especially when they have to stay on top of the SEO game. They research the latest SEO trends; they use the best SEO tools out there. The problem is, unlike a sports competition, the goal posts for SEO are always changing. This, of course, makes the game more interesting, but also pretty frustrating for those trying to "keep up."


An Ensemble Blocking Approach for Entity Resolution of Heterogeneous Datasets

AAAI Conferences

Entity Resolution, also called record linkage or deduplication, refers to the process of identifying and merging duplicate versions of the same entity into a unified representation. The standard practice is to use a rule-based or machine-learning-based model that compares pairs of records and assigns a score to represent the pairs’ Match/Non-Match status. However, performing an exhaustive pair-wise comparison on all pairs of records leads to quadratic matcher complexity, and hence a Blocking step is performed before the Matching to group similar entities into smaller blocks that the matcher can then examine exhaustively. Several blocking schemes have been developed to efficiently and effectively block the input dataset into manageable groups. At our organization, we perform deduplication on massive datasets of people profiles collected from disparate sources with varying informational content. We observed that employing a single blocking technique did not cover all possible scenarios due to high heterogeneity in our data sources. In this paper, we describe our ensemble approach to blocking that combines two different blocking techniques to leverage their respective strengths.
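
The idea of combining blocking schemes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method from the paper: the abstract does not name its two techniques, so the blocking keys here (a normalized name and an email address), the sample records, and all function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def block_by(records, key_fn):
    """Group record ids by a blocking key; records with an empty key are skipped."""
    blocks = defaultdict(list)
    for rec_id, rec in records.items():
        key = key_fn(rec)
        if key:
            blocks[key].append(rec_id)
    return blocks


def candidate_pairs(blocks):
    """Emit every unordered pair of ids that share a block."""
    pairs = set()
    for ids in blocks.values():
        pairs.update(combinations(sorted(ids), 2))
    return pairs


def ensemble_candidate_pairs(records, key_fns):
    """Union the candidate pairs produced by several blocking schemes."""
    pairs = set()
    for key_fn in key_fns:
        pairs |= candidate_pairs(block_by(records, key_fn))
    return pairs


# Hypothetical blocking keys, chosen only for illustration.
def name_key(rec):
    return "".join(rec["name"].lower().split())


def email_key(rec):
    return rec["email"].lower()


# Hypothetical people profiles with uneven informational content.
records = {
    1: {"name": "Ann Lee", "email": "ann@example.com"},
    2: {"name": "Anne Lee", "email": "ann@example.com"},
    3: {"name": "Ann Lee", "email": ""},
    4: {"name": "Bob Ray", "email": "bob@example.org"},
}

ensemble = ensemble_candidate_pairs(records, [name_key, email_key])
print(sorted(ensemble))
```

In this toy data, the name key alone pairs records 1 and 3, and the email key alone pairs records 1 and 2. The union passes both pairs to the matcher while still avoiding the six comparisons an exhaustive scan of four records would need.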


Enforcing Relational Matching Dependencies with Datalog for Entity Resolution

AAAI Conferences

Entity resolution (ER) is about identifying and merging records in a database that represent the same real-world entity. Matching dependencies (MDs) have been introduced and investigated as declarative rules that specify ER policies. An ER process induced by MDs over a dirty instance leads to multiple clean instances, in general. General answer set programs (ASPs) have been proposed to specify the MD-based cleaning task and its results. In this work, we extend MDs to relational MDs, which capture more application semantics, and identify classes of relational MDs for which the general ASP can be automatically rewritten into a stratified Datalog program, with the single clean instance as its standard model.


Document Embedding Strategies for Job Title Classification

AAAI Conferences

Automatic and accurate classification of items enables numerous downstream applications in many domains. These applications can range from faceted browsing of items to product recommendations and big data analytics. In the online recruitment domain, we refer to classifying job ads to a predefined occupation taxonomy as job title classification. A large-scale job title classification system can power various downstream applications such as query expansion, semantic search, job recommendations and labor market analytics. Such classification systems mostly use a Bag-of-Words (BOW) model for document representation and consider only the job titles when classifying job ads. However, the BOW model lacks the semantic discrimination capability that is needed to accurately classify job ads when they contain multiple aspects of the job such as the job description, job requirements, company overview and other details. In this paper, we explore the applicability of recent advances in the word and document embedding space to the problem of job title classification. We investigate several document embedding approaches and propose a novel customized document embedding strategy for job title classification that addresses the multi-aspect job ad issue. Our experimental results show that incorporating document embedding approaches in a job title classification system improves the classification accuracy on entire job ads compared to approaches based on the BOW model.
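
The BOW limitation mentioned above can be seen in a small Python sketch. It covers only the BOW side, since embeddings require a trained model, and it is not the system from the paper: the job titles and function names are hypothetical, and whitespace tokenization is a simplifying assumption.

```python
import math
from collections import Counter


def bow(text):
    """Bag-of-Words vector: token counts after lowercasing and whitespace splitting."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical job titles, chosen only for illustration.
# Closely related titles with no shared token.
sim_synonyms = cosine(bow("software developer"), bow("programmer"))

# Different occupations that happen to share the token "software".
sim_shared = cosine(bow("software developer"), bow("software tester"))

print(sim_synonyms, sim_shared)
```

Because BOW compares only surface tokens, the related pair scores zero while the pair of different occupations scores higher. Word and document embeddings are one way to place related terms near each other instead.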


Flipboard on Flipboard

#artificialintelligence

Imagine if you lost your keys and instead of fishing around in the couch cushions, you could just pull out your phone and search for them. This is not only possible; it's possible now, and it's almost as intriguing as it is terrifying. Today at Microsoft Build, the software giant's annual conference for developers, Microsoft showed off exactly this sort of tech. By melding things that have already been around for a few years--machine-learning-powered image recognition and consumer-grade cameras--with the ludicrous computing horsepower in the cloud, Microsoft is able to index people and things in a room in real time. Once you can identify people and objects by feeding the computers images of Bob and jackhammers so they can learn what each of those things looks like, you can start applying a framework of rules and triggers on top of the real world.


3 predictions about the future of SEO

#artificialintelligence

SEO is a constantly changing and growing industry. Search engine optimization is no longer seen as internet "black magic"; it is now regarded as an essential part of any serious digital marketing strategy. Last year, it was estimated that businesses invested more than $65 billion in SEO services, and that number is projected to climb to over $70 billion by 2018. We've come a long way as an industry -- and from the looks of it, our best days are still ahead of us. The hardest thing in the world of search is predicting what will come next.


Machine Learning with World Knowledge: The Position and Survey

arXiv.org Machine Learning

Machine learning has become pervasive in multiple domains, impacting a wide variety of applications, such as knowledge discovery and data mining, natural language processing, information retrieval, computer vision, social and health informatics, and ubiquitous computing. Two essential problems of machine learning are how to generate features and how to acquire labels for machines to learn. In particular, labeling large amounts of data for each domain-specific problem can be very time consuming and costly. It has become a key obstacle in making learning protocols realistic in applications. In this paper, we will discuss how to use the existing general-purpose world knowledge to enhance machine learning processes, by enriching the features or reducing the labeling work. We start from the comparison of world knowledge with domain-specific knowledge, and then introduce three key problems in using world knowledge in learning processes, i.e., explicit and implicit feature representation, inference for knowledge linking and disambiguation, and learning with direct or indirect supervision. Finally, we discuss the future directions of this research topic.


Search engine results can now chat to computer users

The Independent - Tech

Microsoft has started testing search engine results that can chat directly to users. The company wants developers to create custom chatbots that can be added to search listings on Bing. Users will be able to ask the bots for basic information, such as opening hours and parking details, about venues like restaurants and cinemas. The bots are powered by Skype, and the functionality has already been rolled out to a small number of venues, including a restaurant in Seattle called Monsoon. Unfortunately, at the time of publication, the chatbot didn't actually allow me to submit any questions.