Many autonomous and heterogeneous information sources are becoming increasingly available to users through the Internet, especially through the World Wide Web. In order to make the information available in a consolidated, uniform, and efficient manner, it is necessary to integrate the different information sources. The integration of Internet sources poses several challenges that have not been sufficiently addressed by work on the integration of corporate databases residing on an Intranet [LMR90]. We believe that the most important ones are heterogeneity, large number of sources, redundancy, availability, source autonomy, and diverse access methods and querying interfaces.
We present a computationally efficient technique to compute the distance of high-dimensional appearance descriptor vectors between image windows. The method exploits the relation between appearance distance and spatial overlap. We derive an upper bound on appearance distance given the spatial overlap of two windows in an image, and use it to bound the distances of many pairs between two images. We propose algorithms that build on these basic operations to efficiently solve tasks relevant to many computer vision applications, such as finding all pairs of windows between two images with distance smaller than a threshold, or finding the single pair with the smallest distance. In experiments on the PASCAL VOC 07 dataset, our algorithms accurately solve these problems while greatly reducing the number of appearance distances computed, and achieve larger speedups than approximate nearest neighbour algorithms based on trees and on hashing. For example, our algorithm finds the most similar pair of windows between two images while computing only 1% of all distances on average.
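The abstract does not say how spatial overlap is measured. As a minimal sketch, assuming the common intersection-over-union definition for axis-aligned windows, the overlap of two windows could be computed as follows. The names `Window`, `area`, and `spatial_overlap` are hypothetical and chosen only for illustration; the appearance-distance bound itself is not reproduced here.

```python
from typing import NamedTuple


class Window(NamedTuple):
    """Axis-aligned image window (hypothetical representation)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def area(w: Window) -> float:
    # Degenerate or inverted windows are treated as having zero area.
    return max(0.0, w.x_max - w.x_min) * max(0.0, w.y_max - w.y_min)


def spatial_overlap(a: Window, b: Window) -> float:
    """Intersection-over-union of two windows (assumed overlap measure)."""
    intersection = Window(
        max(a.x_min, b.x_min),
        max(a.y_min, b.y_min),
        min(a.x_max, b.x_max),
        min(a.y_max, b.y_max),
    )
    inter_area = area(intersection)
    union_area = area(a) + area(b) - inter_area
    return inter_area / union_area if union_area > 0 else 0.0
```

Under this assumed definition, identical windows have overlap 1 and disjoint windows have overlap 0, so a bound that tightens as overlap grows is most useful for pairs of nearby windows.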
The goal of this notebook is to build and analyse a map of the 10,000 most popular subreddits on Reddit. To do this we need a means to measure the similarity of two subreddits. In a great article on FiveThirtyEight, Trevor Martin analysed subreddits by considering the overlap of users commenting on two different subreddits. Our interest is a little broader -- we want to map out and visualize the space of subreddits, and attempt to cluster subreddits into their natural groups. With that done we can then explore some of the clusters and find interesting stories to tell.
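One simple way to turn commenter overlap into a similarity score is the Jaccard index of the two commenter sets. The sketch below is only an illustration under that assumption: the text above does not say which measure the notebook or the FiveThirtyEight analysis uses, and the function names and input layout are hypothetical.

```python
from itertools import combinations
from typing import Dict, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Shared commenters divided by all commenters of either subreddit."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_similarity(
    commenters: Dict[str, Set[str]]
) -> Dict[Tuple[str, str], float]:
    """Score every unordered pair of subreddits.

    `commenters` maps a subreddit name to the set of user names that
    commented there (an assumed input format).
    """
    return {
        (s, t): jaccard(commenters[s], commenters[t])
        for s, t in combinations(sorted(commenters), 2)
    }
```

Scoring all pairs this way grows quadratically with the number of subreddits, so for 10,000 subreddits a sparse or approximate approach would likely be needed in practice.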
It's impossible to ignore the fact that advances in artificial intelligence (AI) are changing how we do our current jobs. But what has captured even more interest is how the increasing capability of this technology will affect future jobs. Many studies have been undertaken to determine the specific effects on particular jobs and sectors, but it's hard to capture this information. To add further research to this topic, the Brookings Institution issued a report on Nov. 20, presenting a new method of analyzing this issue. "By employing a novel technique developed by Stanford University Ph.D. candidate Michael Webb, the new report establishes job exposure levels by analyzing the overlap between AI-related patents and job descriptions," the report said.
There will be gift-giving as well as trees and menorahs aplenty this weekend as Christmas and Hanukkah exactly overlap for the first time in almost 30 years. Hanukkah may be seen by many non-Jews as the Jewish equivalent of Christmas, but it is rare that the two holidays coincide and rarer still that they begin at the same time. But, in 2016, the first day of Hanukkah will be Dec. 25, although in the Hebrew calendar days begin at sundown the day before, meaning the celebrations will actually get going on Christmas Eve. The last time that happened was in 1978. The two holidays do overlap more frequently, the last time being five years ago, when Christmas fell midway through the eight days of Hanukkah.