During a non-stop, two-hour keynote address at its annual I/O developer conference, Google unveiled a barrage of new products and updates. Here's a rundown of the most important things discussed: Google CEO Sundar Pichai kicked off the keynote by unveiling a new computer-vision system coming soon to Google Assistant. As Pichai explained, you'll be able to point your phone's camera at something, and the phone will understand what it's seeing. Pichai gave examples of the system recognizing a flower, a series of restaurants on a street in New York (automatically pulling in their ratings and information from Google), and the network name and password printed on the back of a Wi-Fi router, after which the phone automatically connected to the network. Theoretically, in the future, you'll be searching the world not through text or your voice, but by pointing your camera at things.

This short paper describes a demonstrator that complements the paper "Towards Cross-Media Feature Extraction" in these proceedings. The demo exemplifies how textual resources, from which semantic information can be extracted, support the semantic annotation and indexing of associated video material in the soccer domain. Entities and events extracted from textual data are marked up with semantic classes derived from an ontology modeling the soccer domain. We further show how audio-video features extracted by video analysis can be taken into account for additional annotation of specific soccer event types, and how these different types of annotation can be combined.

In many machine learning applications, crowdsourcing has become the primary means for label collection. In this paper, we study the optimal error rate for aggregating labels provided by a set of non-expert workers. Under the classic Dawid-Skene model, we establish matching upper and lower bounds with an exact exponent $mI(\pi)$ in which $m$ is the number of workers and $I(\pi)$ the average Chernoff information that characterizes the workers' collective ability. Such an exact characterization of the error exponent allows us to state a precise sample size requirement $m>\frac{1}{I(\pi)}\log\frac{1}{\epsilon}$ in order to achieve an $\epsilon$ misclassification error. In addition, our results imply the optimality of various EM algorithms for crowdsourcing initialized by consistent estimators.
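
To make the sample size requirement concrete, the following minimal sketch evaluates $m>\frac{1}{I(\pi)}\log\frac{1}{\epsilon}$ numerically. It assumes a special case not spelled out above: binary labels and a "one-coin" worker who is correct with probability $p$ regardless of the true label, so that each worker's Chernoff information is that between Bernoulli($p$) and Bernoulli($1-p$), namely $-\log\big(2\sqrt{p(1-p)}\big)$. The function names are hypothetical, and the result should be read as a rule of thumb that ignores lower-order terms, not as the paper's procedure.

```python
import math


def chernoff_information_one_coin(p):
    """Chernoff information between Bernoulli(p) and Bernoulli(1 - p).

    Assumption: binary labels, worker correct with probability p whatever
    the true label is. By symmetry the optimal tilt is 1/2, which gives
    -log(2 * sqrt(p * (1 - p))).
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return -math.log(2.0 * math.sqrt(p * (1.0 - p)))


def average_chernoff_information(accuracies):
    """Average of the per-worker Chernoff information, playing the role of I(pi)."""
    if not accuracies:
        raise ValueError("need at least one worker")
    total = sum(chernoff_information_one_coin(p) for p in accuracies)
    return total / len(accuracies)


def min_workers(avg_info, epsilon):
    """Smallest integer m with m > log(1 / epsilon) / avg_info."""
    if avg_info <= 0.0:
        raise ValueError("workers carry no information (avg_info <= 0)")
    if not 0.0 < epsilon < 1.0:
        raise ValueError("epsilon must lie strictly between 0 and 1")
    return math.floor(math.log(1.0 / epsilon) / avg_info) + 1


if __name__ == "__main__":
    # Hypothetical crowd: every worker is right 80% of the time.
    info = average_chernoff_information([0.8] * 10)
    print(info, min_workers(info, 0.01))
```

Under these assumptions, a worker with $p = 0.5$ contributes zero information, and the requirement grows only logarithmically as the target error $\epsilon$ shrinks.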