Flexible Models for Microclustering with Application to Entity Resolution

Neural Information Processing Systems

Most generative models for clustering implicitly assume that the number of data points in each cluster grows linearly with the total number of data points. Finite mixture models, Dirichlet process mixture models, and Pitman-Yor process mixture models make this assumption, as do all other infinitely exchangeable clustering models. However, for some applications, this assumption is inappropriate. For example, when performing entity resolution, the size of each cluster should be unrelated to the size of the data set, and each cluster should contain a negligible fraction of the total number of data points. These applications require models that yield clusters whose sizes grow sublinearly with the size of the data set. We address this requirement by defining the microclustering property and introducing a new class of models that can exhibit this property. We compare models within this class to two commonly used clustering models using four entity-resolution data sets.
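
The sublinear-growth requirement can be written as a limit statement. The block below is a sketch of one standard way to formalize it; the symbols C_N and M_N are notation assumed here for illustration and do not appear in the abstract above.

```latex
% Assumed notation (not from the abstract):
%   C_N : the random partition (clustering) of N data points
%   M_N : the size of the largest cluster in C_N
% Requires amsmath for \xrightarrow and \text.
\[
  \frac{M_N}{N} \xrightarrow{\;p\;} 0 \quad \text{as } N \to \infty,
  \qquad \text{equivalently} \quad M_N = o_p(N).
\]
```

Under this reading, even the largest cluster holds a vanishing fraction of the data as N grows. A model whose cluster sizes grow linearly with N cannot satisfy the condition.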


Up, up and away: Passenger-carrying drone to fly in Dubai

Boston Herald

Up, up and away: Dubai hopes to have a passenger-carrying drone regularly buzzing through the skyline of this futuristic city-state in July. The arrival of the Chinese-made EHang 184, which has already had its flying debut over Dubai's iconic, sail-shaped Burj al-Arab skyscraper hotel, comes as the Emirati city has also partnered with other cutting-edge technology companies, including Hyperloop One. The question is whether the egg-shaped, four-legged craft will really take off as a transportation alternative in this car-clogged city already home to the world's longest driverless metro line. At the World Government Summit, Mattar al-Tayer, the head of Dubai's Roads & Transportation Agency, announced plans to have the craft flying regularly. Before his remarks on Monday, most treated the four-legged, eight-propeller craft as just another curiosity at an event that views itself as a desert Davos.


Optimal strategies for the control of autonomous vehicles in data assimilation

arXiv.org Machine Learning

We propose a method to compute optimal control paths for autonomous vehicles deployed for the purpose of inferring a velocity field. In addition to being advected by the flow, the vehicles are able to effect a fixed relative speed with arbitrary control over direction. It is this direction that is used as the basis for the locally optimal control algorithm presented here, with objective formed from the variance trace of the expected posterior distribution. We present results for linear flows near hyperbolic fixed points.
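
The vehicle model described above (advection by the flow plus a fixed relative speed in a controllable direction) can be sketched in a few lines. This is only an illustration of the kinematics, not the authors' control algorithm: it does not compute the variance-trace objective or choose headings optimally. The saddle flow u = (rate * x, -rate * y), the forward-Euler integrator, and all function and parameter names are assumptions made for the example.

```python
import math


def flow_velocity(x, y, rate=1.0):
    """Assumed linear flow with a hyperbolic fixed point at the origin."""
    return rate * x, -rate * y


def step_vehicle(x, y, heading, speed, dt, rate=1.0):
    """Advance one forward-Euler step.

    The vehicle velocity is the local flow velocity plus a self-propulsion
    term of fixed magnitude `speed` in the controlled direction `heading`
    (radians).
    """
    ux, uy = flow_velocity(x, y, rate)
    dx = ux + speed * math.cos(heading)
    dy = uy + speed * math.sin(heading)
    return x + dt * dx, y + dt * dy


def simulate(x0, y0, headings, speed, dt, rate=1.0):
    """Integrate a path for a given sequence of heading controls."""
    path = [(x0, y0)]
    x, y = x0, y0
    for heading in headings:
        x, y = step_vehicle(x, y, heading, speed, dt, rate)
        path.append((x, y))
    return path
```

With `speed` set to zero the vehicle is a passive drifter, so the path should approximate the exact solution (x0 * exp(rate * t), y0 * exp(-rate * t)). A control scheme like the one in the abstract would supply the `headings` sequence by optimizing its objective at each step.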


Watson Will Soon Be a Bus Driver In Washington D.C.

#artificialintelligence

IBM has teamed up with Local Motors, a Phoenix-based automotive manufacturer that made the first 3D-printed car, to create a self-driving electric bus. Named "Olli," the bus has room for 12 people and uses IBM Watson's cloud-based cognitive computing system to provide information to passengers. In addition to automatically driving you where you want to go using Phoenix Wings autonomous driving technology, Olli can respond to questions and provide information, similar to Amazon's Echo home assistant. The bus debuts today in the Washington, D.C., area for the public to use during select times over the next several months, and the IBM-Local Motors team hopes to introduce Olli to the Miami and Las Vegas areas by the end of the year. By using Watson's speech-to-text, natural language classifier, entity extraction, and text-to-speech APIs, the bus can provide several services beyond taking you to your destination.