I review current statistical work on syntactic parsing and then consider part-of-speech tagging, which was the first syntactic problem to be attacked successfully by statistical techniques and which also serves as a good warm-up for the main topic: statistical parsing. Here, I consider both the simplified case in which the input string is viewed as a string of parts of speech and the more interesting case in which the parser is guided by statistical information about the particular words in the sentence. Finally, I anticipate future research directions.
Over the last few years, a number of areas of natural language processing have begun applying graph-based techniques. These include, among others, text summarization, syntactic parsing, word-sense disambiguation, ontology construction, sentiment and subjectivity analysis, and text clustering. In this paper, we present some of the most successful graph-based representations and algorithms used in language processing and try to explain how and why they work.
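One representative family of such techniques builds a graph over linguistic units and ranks the nodes by a random-walk algorithm. The sketch below is a minimal, illustrative example of that idea, in the style of TextRank-style keyword ranking: words become nodes, co-occurrence within a small window adds edges, and a plain iterative PageRank scores the nodes. All names and parameter values here are illustrative assumptions, not code from any particular system.

```python
from collections import defaultdict

def build_cooccurrence_graph(tokens, window=2):
    """Undirected co-occurrence graph: connect words appearing within `window` tokens."""
    graph = defaultdict(set)
    for i, word in enumerate(tokens):
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            if tokens[j] != word:
                graph[word].add(tokens[j])
                graph[tokens[j]].add(word)
    return graph

def pagerank(graph, damping=0.85, iterations=50):
    """Plain iterative PageRank over the word graph."""
    nodes = list(graph)
    score = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new = {}
        for n in nodes:
            # Each neighbor distributes its score evenly over its own edges.
            rank = sum(score[m] / len(graph[m]) for m in graph[n])
            new[n] = (1 - damping) / len(nodes) + damping * rank
        score = new
    return score

tokens = "graph based ranking algorithms score graph nodes by importance".split()
scores = pagerank(build_cooccurrence_graph(tokens))
# Higher-scoring words are better connected in the co-occurrence graph.
```

Variants of this scheme differ mainly in what the nodes represent (words, sentences, word senses) and how edge weights are defined; the ranking machinery stays essentially the same.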
As the complexity of information that must be communicated by or to computers increases, so does the need to communicate that information simply and effectively. It makes little sense to force the entire burden of communication onto a single medium, whether spoken or written text, gestures, diagrams, or graphical animation, when in many situations information can be communicated effectively only through combinations of media.
We are open-sourcing the code for evaluating several popular metrics for natural language generation that we used in our paper "Relevance of Unsupervised Metrics in Task-Oriented Dialogue for Evaluating Natural Language Generation". This code computes pre-existing word-overlap-based and embedding-similarity-based metrics at once using a single command. We hope that making evaluation with these metrics convenient will facilitate comparisons in the NLP and dialogue literature.
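The two metric families mentioned above can be illustrated with a toy sketch, assuming a single entry point that returns all scores at once. This is not the released code: the function names, the unigram-F1 overlap metric, and the tiny embedding table are all made-up examples, and a real evaluation would load pretrained word vectors.

```python
import math
from collections import Counter

def unigram_f1(hypothesis, reference):
    """Word-overlap metric: F1 over unigram counts."""
    hyp, ref = Counter(hypothesis.split()), Counter(reference.split())
    overlap = sum((hyp & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(hyp.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

def embedding_cosine(hypothesis, reference, embeddings):
    """Embedding-similarity metric: cosine between averaged word vectors."""
    dim = len(next(iter(embeddings.values())))
    def avg(sentence):
        vecs = [embeddings[t] for t in sentence.split() if t in embeddings]
        if not vecs:
            return [0.0] * dim
        return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]
    a, b = avg(hypothesis), avg(reference)
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0

def evaluate(hypothesis, reference, embeddings):
    """Single call returning all metrics at once."""
    return {
        "unigram_f1": unigram_f1(hypothesis, reference),
        "embedding_cosine": embedding_cosine(hypothesis, reference, embeddings),
    }

# Toy embedding table; "a"/"the" and "dog"/"cat" get near-identical vectors.
toy_embeddings = {"the": [1.0, 0.0], "cat": [0.0, 1.0],
                  "a": [0.9, 0.1], "dog": [0.1, 0.9]}
scores = evaluate("the cat", "a dog", toy_embeddings)
```

The example also shows why both families are worth reporting: "the cat" versus "a dog" has zero word overlap but high embedding similarity under this toy table, so the two metrics can disagree sharply on the same pair.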