lda2vec: Tools for interpreting natural language • /r/MachineLearning


It builds off of paragraph vectors. But paragraph vectors aren't as interpretable as LDA-like vectors; I can't hand my CEO a 512-dimensional paragraph vector to show her what's been trending. But I can hand her an LDA vector, because it's a sparse mixture of interpretable vectors. That vector is essentially 'on' in three or four categories, those categories sum to 100%, and you don't have to deal with negative coefficients. The whole point here is to gear the internal representations so they're more amenable to humans -- it'd be cute if it scored better, but life isn't a Kaggle contest.
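To make the "sparse mixture" point concrete, here is a minimal sketch, not lda2vec's actual code. It assumes a document has one real-valued weight per topic and uses a softmax to turn those weights into non-negative shares that sum to 100%. The topic labels, the weights, and the helper names `topic_proportions` and `top_topics` are all invented for illustration.

```python
import math


def topic_proportions(weights):
    """Softmax: map arbitrary real weights to non-negative shares that sum to 1."""
    peak = max(weights)  # subtract the max for numerical stability
    exps = [math.exp(w - peak) for w in weights]
    total = sum(exps)
    return [e / total for e in exps]


def top_topics(proportions, labels, k=3):
    """Return the k largest (label, share) pairs, largest first."""
    ranked = sorted(zip(labels, proportions), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Hypothetical topic labels and per-document weights (assumptions, not real data).
labels = ["pricing", "shipping", "returns", "sizing", "gifts"]
weights = [4.0, 3.2, -1.0, 2.5, -2.0]

shares = topic_proportions(weights)
for label, share in top_topics(shares, labels):
    print(f"{label}: {share:.0%}")
```

Negative weights go in, but only non-negative percentages come out, and a handful of topics carry nearly all of the mass. That short list is the part you can actually hand to someone, unlike a dense 512-dimensional vector.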

Interpreting the skies using Big Data Analytics – Tech2


Who has not counted stargazing through grandpa's telescope among their favorite pastimes? No matter our age, we are always in love with exploring the unknown--to know what lies beneath or what our origins are. The skies have always provided the canvas for our imagination. From Galileo to the present day, astronomy has evolved and grown at an unprecedented rate, particularly in the past decade. Astronomy--in this new era--is making massive improvements in our investigation of the Universe, laying bare the secrets of dark energy and dark matter, the formation and evolution of galaxies, and the structure of our own Milky Way.

Researchers want to achieve machine translation of the 24 languages of the EU


The aim of their collaboration is to achieve machine-based translation between the languages of the European Union so that comprehensible texts are produced for as many language combinations as possible. Two of the EU-funded research projects are being led by the Saarbrücken computer linguist Josef van Genabith. Anyone who wants to learn Finnish has to be prepared to deal with a complex grammar that includes fifteen different cases. The grammatical cases are marked in part by appending syllables to nouns, resulting in a dizzying array of word forms and expressive possibilities. "Teaching a computer to understand all these grammatical nuances and to translate them correctly into another language is exceptionally difficult," says Josef van Genabith, Professor of Translation-Oriented Language Technologies at Saarland University and a Scientific Director at the German Research Center for Artificial Intelligence (DFKI).

Political cyberhacker Andrés Sepúlveda reveals how he digitally rigged elections across Latin America

The Independent - Tech

A political cyber hacker who manipulated elections across Latin America has told his story for the first time from a prison cell. In an interview with Bloomberg, Andrés Sepúlveda explained his major role in the "unseen politics" that wins and loses elections, which he said is endemic around the world. From creating fake Twitter accounts to fool the voting public to digitally eavesdropping on party campaign teams, Mr Sepúlveda said he travelled the continent for eight years rigging major political events. Now serving 10 years for charges including use of malicious software, conspiracy to commit crime, violation of personal data, and espionage, the 31-year-old is reportedly hoping to prove to the public that he has reformed in order to get a reduced sentence. Yet at the peak of his powers he made, among many other systems, a software programme called Social Media Predator which managed a vast range of fake Twitter accounts to help portray a distorted virtual reality to voters.

Google's Neural Machine Translation engine learns three new languages, with more on the way


Late last year, Google announced a breakthrough in translation. Dubbed Neural Machine Translation, it let Google's AI-powered engine tackle full sentences instead of just words, giving translations a more natural feel. Now Google is expanding it to several more languages. Back when it launched, Google's new translator was available for English and just eight other languages: French, German, Spanish, Portuguese, Chinese, Japanese, Korean, and Turkish. Now it is bringing it to Hindi, Russian, and Vietnamese, with more rolling out in the coming weeks.