AITopics

Industry: Education (0.51)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (0.68)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.55)

Al-Zaidy, Rabah A. (The Pennsylvania State University) | Choudhury, Sagnik Ray (The Pennsylvania State University) | Giles, C. Lee (The Pennsylvania State University)

Automatic Summary Generation for Scientific Data Charts

Scientific charts on the web, whether as images or embedded in digital documents, contain valuable information that is not fully available to information retrieval tools. The information used to describe these charts is typically extracted from the image metadata rather than from the information the graphic was initially designed to express. The problem of understanding digital charts found in scholarly documents and inferring useful textual information from their graphical components is the focus of this study. We present an approach to automatically read the chart data, specifically from bar charts, and provide the user with a textual summary of the chart. The proposed method follows a knowledge discovery approach that relies on a versatile graph representation of the chart. This representation is derived from analyzing a chart's original data values, from which useful features are extracted. The data features are in turn used to construct a semantic-graph. To generate a summary, the semantic-graph of the chart is mapped to appropriately crafted protoforms, which are constructs based on fuzzy logic. We verify the effectiveness of our framework by conducting experiments on bar charts extracted from over 1,000 PDF documents. Our preliminary results show that, under certain assumptions, 83% of the produced summaries provide plausible descriptions of the bar charts.
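The abstract does not spell out the protoforms it uses, so the following is only a minimal sketch of one common form of fuzzy linguistic summary, "Most bars are high", and not the authors' method. The function names, the membership shapes, and the breakpoints 0.3 and 0.8 are all assumptions chosen for illustration.

```python
def mu_high(value, lo, hi):
    """Fuzzy membership in 'high': 0 at lo, 1 at hi, linear in between (assumed shape)."""
    if hi == lo:
        return 1.0  # all bars equal: treat every bar as fully 'high'
    return min(1.0, max(0.0, (value - lo) / (hi - lo)))


def mu_most(proportion):
    """Fuzzy quantifier 'most': 0 up to 0.3, 1 from 0.8, linear in between (assumed breakpoints)."""
    return min(1.0, max(0.0, (proportion - 0.3) / 0.5))


def summarize(bars):
    """Return a sentence for the protoform 'Most bars are high' and its degree of truth.

    `bars` is a hypothetical mapping from bar label to already-extracted data value.
    """
    values = list(bars.values())
    lo, hi = min(values), max(values)
    # Average membership in 'high' plays the role of a fuzzy proportion of high bars.
    proportion = sum(mu_high(v, lo, hi) for v in values) / len(values)
    truth = mu_most(proportion)
    return f"Most bars are high (truth {truth:.2f})", truth


if __name__ == "__main__":
    sentence, truth = summarize({"A": 10, "B": 10, "C": 10, "D": 0})
    print(sentence)
```

A summary generator along these lines would evaluate several candidate protoforms and report those whose degree of truth is highest; how the paper selects among them is not stated in the abstract.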

accuracy, bar chart, protoform, (15 more...)

Workshops at the Thirtieth AAAI Conference on Artificial Intelligence

Country:

North America > United States > Pennsylvania > Centre County > University Park (0.05)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > San Mateo County > Menlo Park (0.04)

Genre: Research Report (0.48)

Technology:

Information Technology > Visualization (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Data Science > Data Mining (1.00)
(3 more...)

Naim, Sheikh Motahar (University of Texas at El Paso) | Kader, Md Abdul (University of Texas at El Paso) | Boedihardjo, Arnold P. (US Army Corps of Engineers) | Hossain, M. Shahriar (University of Texas at El Paso)

Encoding Lineage in Scholarly Articles

The development of new scientific concepts today is an outcome of the accumulated knowledge built over time. Every scientific domain requires an understanding of the trends in the dependencies between its subdomains. Analyzing trends to capture such dependencies using conventional document modeling techniques is challenging for two reasons: (1) conventional vector-space representations of documents do not capture the history of the content, and (2) neither feature-level nor document-level causality is provided by digital library metadata or citation networks. In this paper, we propose an intuitive temporal representation of a scientific article that encodes inherent historic characteristics of the content. This representation of each document is then leveraged to discover causal relationships between scientific articles. In addition, we provide a mechanism to explore the lineage of each document in terms of other previously published documents, which illustrates how the theme of the document under analysis evolved over time. Empirical studies reported in the paper show that the proposed technique identifies meaningful causal relationships and discovers meaningful lineage in the scientific literature that could not be discovered through the citation network of the articles.

data mining, information retrieval, machine learning, (21 more...)

Workshops at the Thirtieth AAAI Conference on Artificial Intelligence

Country:

North America > United States > Texas > El Paso County > El Paso (0.14)
Asia > Middle East > Jordan (0.04)
North America > United States > Virginia > Alexandria County > Alexandria (0.04)
Asia > Taiwan (0.04)

Genre: Research Report (0.34)

Industry:

Health & Medicine (0.94)
Government > Military (0.46)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
(2 more...)

Krstovski, Kriste (Harvard-Smithsonian Center for Astrophysics) | Smith, David A. (Northeastern University) | Kurtz, Michael J. (Harvard-Smithsonian Center for Astrophysics)

Automatic Construction of Evaluation Sets and Evaluation of Document Similarity Models in Large Scholarly Retrieval Systems

Retrieval systems for scholarly literature let the scientific community search, explore, and download scholarly articles across various scientific disciplines. Used mostly by experts in a particular field, these systems keep user community logs, including information on which articles each user downloaded. In this paper we present a novel approach for automatically evaluating document similarity models in large collections of scholarly publications. Unlike typical evaluation settings that use test collections consisting of query documents and human-annotated relevance judgments, we use download logs to automatically generate a pseudo-relevant set of similar document pairs. More specifically, we show that consecutively downloaded document pairs, extracted from a scholarly information retrieval (IR) system, can be used as a test collection for evaluating document similarity models. Another novel aspect of our approach lies in the method that we employ for evaluating the performance of the model: comparing the distributions of consecutively downloaded document pairs and random document pairs in log space. Across two families of similarity models that represent documents in the term vector and topic spaces, we show that our evaluation approach achieves very high correlation with traditional performance metrics such as Mean Average Precision (MAP), while being more efficient to compute.
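The idea of scoring a similarity model by how well it separates consecutively downloaded pairs from random pairs can be sketched as follows. This is an illustrative assumption, not the paper's procedure: the abstract does not give the exact statistic, so the sketch uses a plain term-count cosine similarity and compares mean log-similarities. The function names and the smoothing constant `eps` are hypothetical.

```python
import math
from collections import Counter


def cosine(doc_a, doc_b):
    """Cosine similarity between two documents using raw term counts."""
    ta, tb = Counter(doc_a.split()), Counter(doc_b.split())
    dot = sum(ta[t] * tb[t] for t in ta)
    norm = math.sqrt(sum(c * c for c in ta.values())) * math.sqrt(
        sum(c * c for c in tb.values())
    )
    return dot / norm if norm else 0.0


def mean_log_similarity(pairs, eps=1e-6):
    """Average log-similarity over document pairs; eps avoids log(0) (assumed smoothing)."""
    return sum(math.log(cosine(a, b) + eps) for a, b in pairs) / len(pairs)


def separation(downloaded_pairs, random_pairs):
    """Gap between the two groups in log space; larger suggests a more useful model."""
    return mean_log_similarity(downloaded_pairs) - mean_log_similarity(random_pairs)


if __name__ == "__main__":
    downloaded = [("galaxy cluster mass", "galaxy cluster lensing")]
    random_pairs = [("galaxy cluster mass", "protein folding rates")]
    print(separation(downloaded, random_pairs))
```

Because the score needs only pairs from a download log and a sample of random pairs, no human relevance judgments are required, which is the efficiency argument the abstract makes.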

information retrieval, machine learning, natural language, (17 more...)

Workshops at the Thirtieth AAAI Conference on Artificial Intelligence

Country:

North America > United States > Massachusetts > Hampshire County > Amherst (0.14)
Asia > Middle East > Jordan (0.04)
North America > United States > New York (0.04)
(2 more...)

Genre: Overview (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.88)

Olteanu, Dan (University of Oxford)

Factorized Databases: A Knowledge Compilation Perspective

artificial intelligence, information retrieval query processing, natural language, (19 more...)

This paper overviews recent work on compilation of relational queries into lossless factorized representations. The primary motivation for this compilation is to avoid redundancy in the representation of query results and speed up their computation and subsequent analytics.
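To make the redundancy argument concrete, here is a minimal sketch of a factorized join result for two hypothetical relations R(a, b) and S(b, c). It is a simplification for illustration and does not reproduce the paper's formalism: for each join value b it stores the matching a-values and c-values once, instead of listing every (a, b, c) combination.

```python
from collections import defaultdict
from itertools import product


def factorize_join(r, s):
    """Group R(a, b) and S(b, c) by b: {b: ([a, ...], [c, ...])} for b present in both."""
    r_by_b, s_by_b = defaultdict(list), defaultdict(list)
    for a, b in r:
        r_by_b[b].append(a)
    for b, c in s:
        s_by_b[b].append(c)
    return {b: (r_by_b[b], s_by_b[b]) for b in r_by_b if b in s_by_b}


def enumerate_tuples(fact):
    """Expand the factorized form back into flat (a, b, c) tuples (shows it is lossless)."""
    for b, (a_values, c_values) in fact.items():
        for a, c in product(a_values, c_values):
            yield (a, b, c)


def size(fact):
    """Number of stored values: one b plus its a-values and c-values per group."""
    return sum(1 + len(a_values) + len(c_values) for a_values, c_values in fact.values())


if __name__ == "__main__":
    r = [(1, "x"), (2, "x"), (3, "x")]
    s = [("x", 10), ("x", 20), ("x", 30), ("y", 40)]
    fact = factorize_join(r, s)
    flat = list(enumerate_tuples(fact))
    print(len(flat) * 3, "values flat vs", size(fact), "values factorized")
```

In this small example the flat result holds 9 tuples (27 values) while the factorized form stores 7 values, and the gap grows with the number of matching tuples per join value.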

Workshops at the Thirtieth AAAI Conference on Artificial Intelligence

Country: Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Technology:

Information Technology > Databases (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (0.51)

#artificialintelligence | Apr-11-2016, 23:54:36 GMT

Creating Content for Google's RankBrain

Google revealed in October that it uses artificial intelligence to help with 15% of search queries. Named RankBrain, the system analyzes vague, ambiguous queries and matches them with the most relevant results. In fact, Google's Greg Corrado told Bloomberg that RankBrain is now the third-highest signal contributing to a search-query result. Google, like similar search-engine services, is getting smarter. As marketers, we can no longer rely solely on traditional digital strategies such as link-building or social-media signaling.

artificial intelligence, information retrieval, natural language, (17 more...)

Industry:

Marketing (0.30)
Consumer Products & Services > Restaurants (0.30)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.60)

The Independent - Tech | Apr-11-2016, 20:25:41 GMT

Hollywood removes Netflix from its legal streaming site search engine

Netflix has mysteriously been removed from the American film industry's search engine for legal streaming sites. WhereToWatch, which was set up by the Motion Picture Association of America (MPAA) in 2014, lets users search a range of legal streaming services for their favourite TV shows and films. The idea was to provide internet users with a handy resource which would steer them away from illegal streams and downloads, protecting them from potential legal trouble and helping the studios at the same time. However, as TorrentFreak reports, Netflix has been removed from the WhereToWatch search results, despite being one of the most-used legal streaming services in the world. Netflix results have also been removed from the UK equivalent of the site, FindAnyFilm, although some Netflix results are still available on GoWatchIt, the search engine which powers WhereToWatch.

artificial intelligence, information retrieval, natural language, (9 more...)

Industry:

Media > Television (1.00)
Media > Film (1.00)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.92)

#artificialintelligence | Apr-10-2016, 18:44:20 GMT

700 SQL Queries per Second in Apache Spark with FiloDB

Apache Spark is increasingly thought of as the new jack-of-all-trades distributed platform for big data crunching – what with everything from traditional MapReduce-like workloads, streaming, graph computation, statistics, and machine learning all in one package. Except for Spark Streaming, with its micro-batches, Spark is focused for the most part on higher-latency, rich/complex analytics workloads. What about using Spark as an embedded, web-speed / low-latency query engine? This post will dive into using Apache Spark for low-latency, higher-concurrency reporting / dashboard / SQL-like applications, up to hundreds of queries a second! Launching Spark applications on a cluster, or even on localhost, has a pretty high overhead.

artificial intelligence, machine learning, natural language, (16 more...)

Technology:

Information Technology > Databases (0.85)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (0.35)
Information Technology > Artificial Intelligence > Machine Learning (0.35)

#artificialintelligence | Apr-9-2016, 23:36:30 GMT

Creating an Intelligent Search Engine with Big Data - White Paper

As data grows, organizations are increasingly seeking an intelligent information discovery and analytics platform that goes beyond keyword searches and better understands users' intent. With Google Now and Cortana, advanced question answering systems are starting to become ubiquitous. Recently, Gartner has also started discussing 'insight engines,' a new technology that can provide natural, total, and proactive search, analytics, and discovery.

data mining, information retrieval, question answering, (7 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.77)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.68)
Information Technology > Data Science > Data Mining > Big Data (0.47)

#artificialintelligence | Apr-8-2016, 21:50:25 GMT

Bing just became the best search engine for developers

At your day job as a professional code Googler – I mean developer – you probably search for quick snippets multiple times a day to find the best way to perform a particular task. Almost always as developers we end up on Stack Overflow or Mozilla Developer Network, but now Microsoft's Bing has given us something even better: executable code directly in search results. Thanks to a collaboration with HackerRank, if you search for something like string concat C#, you'll get an interactive code editor with a result that can be run directly from that page to see how it works. It's a seriously fantastic feature that I hope Google adds soon – I'm not sure I'd switch search engines for this, but I'm incredibly jealous.

artificial intelligence, information retrieval, natural language, (6 more...)

Country: Europe > Netherlands > North Holland > Amsterdam (0.29)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.65)