Grammars & Parsing


Essential Arts & Culture: Parsing Kusama, outcry over Philip Johnson update, art's woman problem

Los Angeles Times

The Kusama show at the Broad is drawing the crowds (if not our critic's inspiration). Los Angeles just had a Philip Glass moment. There's been an architectural furor over possible changes to a work by Philip Johnson. I'm Carolina A. Miranda, staff writer for the Los Angeles Times, with the week's blazing culture news: Yayoi Kusama's exhibition of Infinity Mirror Rooms at the Broad is the hot museum show in L.A. right now. But Times art critic Christopher Knight says that if you didn't score a ticket, you're not missing much.


fekr/postagga

#artificialintelligence

You can use postagga to process annotated text samples into full-fledged parsers capable of understanding "free speech" input as structured data. That is our "natural language input". The first step in understanding this sentence is to extract some structure from it so that it is easier to interpret. A state basically refers to words; it is matched against tag sets (a word can very well relate to multiple tags, if your preferred tagger allows it). If we had multiple states with the :get-value flag on, we would find multiple words in the corresponding entry in the output; that is why the step key refers to a vector of words in the output map.
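postagga itself is written in Clojure, so the following is only a conceptual Python sketch of the state/tag-set matching described above; State and match_states are made-up names for illustration, not postagga's actual API.

# Minimal sketch of the state/tag-set matching idea described above.
# `State` and `match_states` are hypothetical names for illustration;
# they are not postagga's actual (Clojure) API.

from dataclasses import dataclass

@dataclass
class State:
    name: str          # key used in the output map
    tags: set          # tag set this state matches against
    get_value: bool    # if True, capture the matched word(s)

def match_states(tagged_words, states):
    """Collect, for each capturing state, every word whose tag is in
    that state's tag set. Because several words can match the same
    state, each output entry is a vector (list) of words."""
    out = {}
    for word, tag in tagged_words:
        for state in states:
            if state.get_value and tag in state.tags:
                out.setdefault(state.name, []).append(word)
    return out

# Example: extract the nouns from a POS-tagged sentence.
tagged = [("show", "NN"), ("me", "PRP"), ("cheap", "JJ"), ("flights", "NNS")]
states = [State(name="things", tags={"NN", "NNS"}, get_value=True)]
print(match_states(tagged, states))   # {'things': ['show', 'flights']}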


AP FACT CHECK: Parsing an Unfettered Trump on Border Wall

U.S. News

THE FACTS: It's not clear what he means by renovations. His administration has not outlined sweeping renovations to be done in that time. Its request to Congress for $1.6 billion in wall financing for the budget year that begins Oct. 1 includes money for 14 miles of replacement barrier in San Diego, and it's not certain Congress will approve even that. Money has been approved for three miles of border protection in Calexico, California.



Four deep learning trends from ACL 2017

#artificialintelligence

Though attention often plays the role of word alignment in NMT, Koehn and Knowles note that it learns to play other, harder-to-understand roles too; thus it is not always as understandable as we might hope. In Parameter Free Hierarchical Graph-Based Clustering for Analyzing Continuous Word Embeddings, Trost and Klakow perform clustering on word embeddings, then cluster those clusters, and so on to obtain a hierarchical tree-like structure. Neural networks are powerful because they can learn arbitrary continuous representations, but humans find discrete information – like language itself – easier to understand. These systems should ideally produce a proof or derivation of the answer – for a semantic parsing question answering system, this might be the semantic parse itself, or a relevant excerpt from the knowledge base.
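As a generic illustration of that cluster-the-clusters idea (not Trost and Klakow's parameter-free graph-based method), here is a small Python sketch that builds a hierarchical tree over word embeddings with SciPy's agglomerative clustering; the toy 2-d vectors are invented for the example.

# Generic illustration of building a hierarchical, tree-like structure
# over word embeddings by agglomerative clustering. This is NOT the
# parameter-free graph-based method of Trost and Klakow; the toy
# 2-d "embeddings" below are made up for the example.

import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

words = ["cat", "dog", "car", "truck", "apple", "pear"]
embeddings = np.array([
    [0.9, 0.1], [0.8, 0.2],   # animals
    [0.1, 0.9], [0.2, 0.8],   # vehicles
    [0.5, 0.5], [0.6, 0.4],   # fruit
])

# Each merge step clusters existing clusters, yielding a binary tree.
tree = linkage(embeddings, method="average", metric="cosine")

# Cut the tree at a fixed distance to inspect one level of the hierarchy.
labels = fcluster(tree, t=0.2, criterion="distance")
for word, label in zip(words, labels):
    print(word, "-> cluster", label)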


Genetic Programming (Machine Learning/AI): "Santa Fe Trail" problem - Syntax Trees

#artificialintelligence

The syntax tree of the fittest individual is shown for each generation until a solution with perfect fitness is found - and beyond. Not too exciting for a small function/terminal set and a program size limit of 50 instructions, but there you go! For details on the "Santa Fe Trail problem" please see https://en.wikipedia.org/wiki/Santa_F...
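For readers unfamiliar with tree-based GP, here is a rough Python sketch of the syntax-tree representation over the classic Santa Fe Trail primitive set (IF-FOOD-AHEAD, PROGN2, MOVE, LEFT, RIGHT, following Koza), with the 50-instruction size limit mentioned above. The class and function names are illustrative, not from any particular GP library.

# Sketch of the syntax-tree representation used in tree-based GP,
# with the classic Santa Fe Trail primitive set (Koza). Names are
# illustrative, not taken from any particular GP library.

import random

FUNCTIONS = {"IF-FOOD-AHEAD": 2, "PROGN2": 2}   # name -> arity
TERMINALS = ["MOVE", "LEFT", "RIGHT"]

class Node:
    def __init__(self, symbol, children=()):
        self.symbol = symbol
        self.children = list(children)

    def size(self):
        return 1 + sum(c.size() for c in self.children)

    def __repr__(self):
        if not self.children:
            return self.symbol
        return f"({self.symbol} {' '.join(map(repr, self.children))})"

def random_tree(max_depth):
    """Grow a random program tree; depth 0 forces a terminal."""
    if max_depth == 0 or random.random() < 0.3:
        return Node(random.choice(TERMINALS))
    symbol = random.choice(list(FUNCTIONS))
    arity = FUNCTIONS[symbol]
    return Node(symbol, [random_tree(max_depth - 1) for _ in range(arity)])

# Keep sampling until the program respects the 50-instruction size limit.
tree = random_tree(max_depth=4)
while tree.size() > 50:
    tree = random_tree(max_depth=4)
print(tree, "size =", tree.size())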


Four deep learning trends from ACL 2017

@machinelearnbot

Instead she concluded that Structure Is Coming Back, and provided, via example, one reason to embrace its return: linguistic structure reduces the search space of possible outputs, making it easier to generate well-formed output. Chris Dyer also argued for the importance of incorporating linguistic structure into deep learning in his CoNLL keynote Should Neural Network Architecture Reflect Linguistic Structure? Like Noah Smith, he drew attention to the inductive biases inherent in the sequential approach, arguing that RNNs have an inductive bias towards sequential recency, while syntax-guided hierarchical architectures (such as recursive NNs and RNNGs) have an inductive bias towards syntactic recency. While the "language is just sequences" paradigm argues that RNNs can compute anything, researchers are increasingly interested in how the inductive biases of the sequential model affect what they do compute.
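To make the contrast concrete, here is a toy Python sketch (invented weights and shapes, not a real RNN or RNNG) of why a sequential cell encodes sequential recency while tree-structured composition encodes syntactic recency: in the first, a word's influence decays with the number of words read since it; in the second, with its depth in the parse tree.

# Toy contrast between sequential and tree-structured composition.
# Both "cells" here are bare matrix ops with tanh, just to show where
# the inductive bias comes from; this is not a real RNN/RNNG.

import numpy as np

rng = np.random.default_rng(0)
D = 4
W_seq = rng.normal(size=(D, 2 * D)) * 0.1    # sequential cell weights
W_tree = rng.normal(size=(D, 2 * D)) * 0.1   # tree composition weights

def rnn(embeddings):
    """Sequential recency: each word is one more step from the output."""
    h = np.zeros(D)
    for x in embeddings:
        h = np.tanh(W_seq @ np.concatenate([h, x]))
    return h

def recursive_nn(tree, embed):
    """Syntactic recency: distance is measured in tree levels, not words."""
    if isinstance(tree, str):                 # leaf = a word
        return embed[tree]
    left, right = tree
    l, r = recursive_nn(left, embed), recursive_nn(right, embed)
    return np.tanh(W_tree @ np.concatenate([l, r]))

embed = {w: rng.normal(size=D) for w in ["the", "cat", "sat"]}
print(rnn([embed["the"], embed["cat"], embed["sat"]]))
print(recursive_nn((("the", "cat"), "sat"), embed))   # parse: ((the cat) sat)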


How to make a racist AI without really trying

#artificialintelligence

Recognizing whether people are expressing positive or negative opinions about things has obvious business applications. It's simplistic, sometimes too simplistic, but it's one of the easiest ways to get measurable results from NLP. In a few steps, you can put text in one end and get positive and negative scores out the other, and you never have to figure out what you should do with a parse tree or a graph of entities or any difficult representation like that. This model is not the point of that paper, so don't take this as an attack on their results; it was there as an example of a well-known way to use word vectors.
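That "text in one end, scores out the other" recipe can be sketched in a few lines. The version below uses random toy vectors and scikit-learn's logistic regression in place of the post's actual GloVe embeddings and sentiment lexicon, so treat it as the general shape of the approach rather than a reproduction.

# Minimal sketch of the recipe the post describes: represent a text as
# the average of its word vectors, then train a linear classifier.
# Toy random vectors stand in for real pretrained embeddings; the blog
# itself uses GloVe plus a sentiment lexicon, not reproduced here.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
vectors = {w: rng.normal(size=50) for w in
           ["great", "awful", "movie", "food", "loved", "hated"]}

def text_vector(text):
    """Average the vectors of known words; no parse tree needed."""
    vecs = [vectors[w] for w in text.lower().split() if w in vectors]
    return np.mean(vecs, axis=0) if vecs else np.zeros(50)

texts = ["great movie", "loved food", "awful movie", "hated food"]
labels = [1, 1, 0, 0]   # 1 = positive, 0 = negative

clf = LogisticRegression().fit([text_vector(t) for t in texts], labels)
print(clf.predict_proba([text_vector("great food")])[:, 1])  # P(positive)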


Finding the right representation for your NLP data - Tryolabs Blog

@machinelearnbot

When considering what information is important for a certain decision procedure (say, a classification task), there's an interesting gap between what's theoretically (that is, actually) important on the one hand, and what gives good results in practice as input to machine learning (ML) algorithms on the other. On the other hand, embedding syntactic structures in a vector space while keeping the distance relation meaningful is not quite as easy. Funnily enough, when I tried the two "Dancing Monkeys in a Tuxedo" sentences with Stanford's recursive sentiment analysis tool, it classified both sentences as negative. What you can do in this case is restructure your input vector so that, instead of having a unique, separate feature for the sentiment of the review, you use feature combinations (also called feature crosses), so that all word-frequency features include information about the sentiment.
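A minimal Python sketch of that feature-cross idea, with invented feature names: instead of one standalone sentiment feature next to the word counts, every word-count feature is keyed by the sentiment value, so each count carries the sentiment with it.

# Sketch of the feature-cross idea from the excerpt: rather than one
# standalone sentiment feature alongside word counts, every word-count
# feature is duplicated per sentiment value. Feature names are
# illustrative, not from the Tryolabs post.

from collections import Counter

def crossed_features(text, sentiment):
    """Cross each word-frequency feature with the sentiment label."""
    counts = Counter(text.lower().split())
    return {f"{word}|sentiment={sentiment}": freq
            for word, freq in counts.items()}

# Plain features would be {"great": 1, "plot": 1, "sentiment": "pos"};
# crossed, the sentiment is folded into every frequency feature:
print(crossed_features("great plot", sentiment="pos"))
# {'great|sentiment=pos': 1, 'plot|sentiment=pos': 1}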