A photo from a summer camp posted to the camp's website so parents can view them. Venture capital-backed Waldo Photos has been selling the service to identify specific children in the flood of photos provided daily to parents by many sleep-away camps. Camps working with the Austin, Texas-based company give parents a private code to sign up. When the camp uploads photos taken during activities to its website, Waldo's facial recognition software scans for matches in the parent-provided headshots. Once it finds a match, the Waldo system (as in "Where's Waldo?") then automatically texts the photos to the child's parents.
Among various feature extraction algorithms, those based on genetic algorithms are promising owing to their potential parallelizability and possible applications in large scale and high dimensional data classification. However, existing genetic algorithm based feature extraction algorithms are either limited in searching optimal projection basis vectors or costly in both time and space complexities and thus not directly applicable to high dimensional data. In this paper, a direct evolutionary feature extraction algorithm is proposed for classifying high-dimensional data. It constructs projection basis vectors using the linear combination of the basis of the search space and the technique of orthogonal complement. It also constrains the search space when seeking for the optimal projection basis vectors. It evaluates individuals according to the classification performance on a subset of the training samples and the generalization ability of the projection basis vectors represented by the individuals. We compared the proposed algorithm with some representative feature extraction algorithms in face recognition, including the evolutionary pursuit algorithm, Eigenfaces, and Fisherfaces. The results on the widely-used Yale and ORL face databases show that the proposed algorithm has an excellent performance in classification while reducing the space complexity by an order of magnitude.
Editor's note: This post is part of our Trainspotting series, a deep dive into the visual and audio detection components of our Caltrain project. You can find the introduction to the series here. SVDS has previously used real-time, publicly available data to improve Caltrain arrival predictions. However, the station-arrival time data from Caltrain was not reliable enough to make accurate predictions. Using a Raspberry PiCamera and USB microphone, we were able to detect trains, their speed, and their direction.
Machine Learning has become more and more popular in the last years, especially as part of the new discipline of data science. In connection with this, several tools and APIs (Application Program Interface) on Machine Learning have been made available to non-experts. Microsoft, for instance has developed the Azure Machine Learning suite and has made several APIs available on the Cortana Analytics Gallery that allow developers to work on Machine Learning with relative ease. One of the APIs made available on the Cortana Analytics Gallery is the Face API from Microsoft's Project Oxford. Project Oxford consists of a collection of machine-learning based APIs which deal with computer based vision, speech recognition and natural language processing.
This short paper is describing a demonstrator that is complementing the paper "Towards Cross-Media Feature Extraction" in these proceedings. The demo is exemplifying the use of textual resources, out of which semantic information can be extracted, for supporting the semantic annotation and indexing of associated video material in the soccer domain. Entities and events extracted from textual data are marked-up with semantic classes derived from an ontology modeling the soccer domain. We show further how extracted Audio-Video features by video analysis can be taken into account for additional annotation of specific soccer event types, and how those different types of annotation can be combined.