"As for why I tell a lot of stories, there's a joke about that. There was once a man who had a computer, and he asked it, 'Do you compute that you will ever be able to think like a human being?' And after assorted grindings and beepings, a slip of paper came out of the computer that said, 'That reminds me of a story . . . "
– from ANGELS FEAR: TOWARDS AN EPISTEMOLOGY OF THE SACRED, Gregory Bateson & Mary Catherine Bateson (Part III, 'Metalogue').
The technology relies on natural language generation (NLG), a cornerstone of much of the recent progress driven by artificial intelligence and automation. The PA project is known as RADAR – Reporters and Data and Robots – and relies on open data sets from government, local authorities and public services. Urbs Media editor-in-chief Gary Rogers told me that they initially started looking at the possibilities of generating stories for national media using open data sources, but soon realized that the data's highly geographically-segmented nature made it very well suited to local stories. "So instead of writing one story about a dataset – a national story – you could write 10 regional stories or 200 local authority-based stories."
Among them is RADAR, a collaboration between the UK and Ireland's Press Association (PA) and Urbs Media, a startup that creates localized news stories using AI. The new funding will be used to create a service that ramps up automated news efforts, with natural language generation (NLG) programs producing up to 30,000 stories per month across localized distribution networks starting in 2018. To start, the AI will be tasked with producing low-level stories from templates created by human writers. Wordsmith, an NLG program, has been producing automated stories for the U.S.'s Associated Press (AP) since 2014, and other traditional outlets like the New York Times and Los Angeles Times have used automation for low-level reporting.
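The template approach described above can be sketched roughly as follows: a human writer authors one story skeleton with slots, and the system fills it once per local-authority row in a shared dataset, turning one national dataset into many local stories. All place names, figures, and field names here are invented for illustration; this is not RADAR's actual pipeline.

```python
# Minimal sketch of template-based story localization (all data invented).
# One human-written template is filled per local-authority row, so a
# single national dataset yields one story per locality.

TEMPLATE = (
    "{area} recorded {count} incidents in {year}, "
    "a {direction} of {change:.0f}% on the previous year."
)

def localize(rows, template=TEMPLATE):
    stories = []
    for row in rows:
        # Derive the year-on-year change so the template can describe it.
        change = 100.0 * (row["count"] - row["prev"]) / row["prev"]
        stories.append(template.format(
            area=row["area"],
            count=row["count"],
            year=row["year"],
            direction="rise" if change >= 0 else "fall",
            change=abs(change),
        ))
    return stories

rows = [
    {"area": "Northtown", "count": 120, "prev": 100, "year": 2017},
    {"area": "Southvale", "count": 90,  "prev": 120, "year": 2017},
]
for story in localize(rows):
    print(story)
```

Scaling this from two rows to a few hundred local authorities is what turns "one national story" into hundreds of local ones, as Rogers describes.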
We are open sourcing the code for evaluating several popular metrics for natural language generation that we have used in our paper Relevance of Unsupervised Metrics in Task-Oriented Dialogue for Evaluating Natural Language Generation. This code computes pre-existing word-overlap-based and embedding-similarity-based metrics at once using a single command. We hope that, by making evaluation on these metrics convenient, it will facilitate comparisons in NLP and dialogue literature.
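To make the two metric families concrete, here is a minimal sketch of what word-overlap and embedding-similarity metrics compute: a unigram precision (the core ingredient of BLEU-1) and a cosine similarity between averaged word vectors. This is an illustrative toy, not the released code, and the word vectors are invented.

```python
# Toy versions of the two NLG metric families mentioned above (not the
# released code): word overlap and embedding-average similarity.
from collections import Counter
import math

def unigram_precision(hypothesis, reference):
    """Fraction of hypothesis tokens that also appear in the reference
    (clipped counts, as in BLEU-1)."""
    hyp = hypothesis.split()
    ref_counts = Counter(reference.split())
    matches = sum(min(c, ref_counts[w]) for w, c in Counter(hyp).items())
    return matches / len(hyp)

def embedding_average_cosine(hypothesis, reference, vectors):
    """Cosine similarity between the mean word vectors of two sentences."""
    def mean_vec(tokens):
        vecs = [vectors[w] for w in tokens if w in vectors]
        return [sum(dim) / len(vecs) for dim in zip(*vecs)]
    a = mean_vec(hypothesis.split())
    b = mean_vec(reference.split())
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

# Invented two-dimensional "embeddings" for illustration only.
vectors = {"the": [1.0, 0.0], "cat": [0.0, 1.0], "dog": [0.1, 0.9]}
print(unigram_precision("the cat", "the cat sat"))  # 1.0
print(embedding_average_cosine("the cat", "the dog", vectors))
```

Note how the overlap metric scores "the dog" poorly against "the cat" while the embedding metric scores it highly; that gap between the two families is exactly what makes comparing them on dialogue data interesting.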
Anyways, here are some more back-pedaling clarifications in response to Yann's response: I am not against the use of deep learning methods on language tasks. What I am against is a tendency of the "deep-learning community" to enter into fields (NLP included) in which they have only a very superficial understanding, and make broad and unsubstantiated claims without taking the time to learn a bit about the problem domain. Sloppy papers with broad titles such as "Adversarial Generation of Natural Language" are harmful. The fast pace of arxiv can have a very positive effect on the field, but, "with great power comes great responsibility" and we have to be careful not to abuse the power.
Do you love natural language and believe that getting machines to use it the same way humans do is the most interesting thing one can do? Are you experienced at applying machine learning (ML) to sophisticated natural language processing (NLP) tasks? Are you excited to solve difficult problems in human-computer communication, machine translation, question-answering, knowledge base construction and querying, natural language understanding, natural language generation, search, and multi-modal modeling while applying the latest algorithms and techniques? Are you comfortable operating with large data sets?
Machine learning's trendy tech is powering products all over Silicon Valley, but Chicagoans are using artificial intelligence in a more practical sense, Uptake's Adam McElhinney says. "I see a big contrast between the way Chicago companies are using machine learning and the way some of the Silicon Valley companies are maybe using machine learning," said McElhinney, vice president of data science at the predictive-analytics startup that targets construction, mining and other industrial sectors. He joined representatives from Avant, Civis Analytics, Groupon and Narrative Science Thursday evening for "Data-Driven Chicago," an event at Groupon's headquarters where each company shared how they're using machine learning to make sense of massive amounts of data. Andrew Paley, director of product design at Narrative Science, highlighted his startup's advanced natural-language-generation platform Quill, which he said was originally developed by a Northwestern University student as part of a class project to convert baseball box scores into "old-timey," human-sounding sports stories.
I have also tried extensively to use WGANs to generate language sequences. I just don't understand why it doesn't converge to results that are as good as maximum likelihood. Even with curriculum learning and peephole LSTMs, you would think it would converge to a good optimum, but the results still show that maximum likelihood is a better approach. Can anyone think of why this doesn't work better than maximum likelihood?
NLG tools automatically analyze data, interpret it, identify the most significant parts, and generate written reports in plain English. In essence, NLG brings artificial intelligence to business intelligence (BI), automating routine analysis and saving business users time and money. Quill, for example, analyzes call detail data and automatically generates a personalized performance report written in English for thousands of customer service representatives, something that is difficult to do with traditional BI tools. Although NLG vendors say their tools augment, not replace, the jobs of report writers and analysts, in some cases NLG tools will reduce the number of people required to generate and analyze data.
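The analyze-interpret-generate pipeline described above can be sketched in miniature: pick the fact in the data that deviates most from a baseline, then render it as an English sentence. The field names, metrics, and figures below are invented for illustration and bear no relation to Quill's actual implementation.

```python
# Minimal sketch of an NLG-for-BI pipeline (all names and data invented):
# analyze the data, select the most significant fact, generate a sentence.

def performance_report(name, metrics, team_avg):
    """Generate a one-line performance summary for one representative."""
    # Interpret: among the percentage metrics, find the one that
    # deviates most from the team average.
    pct_metrics = ("satisfaction", "resolution_rate")
    deltas = {m: metrics[m] - team_avg[m] for m in pct_metrics}
    key = max(deltas, key=lambda m: abs(deltas[m]))
    direction = "above" if deltas[key] > 0 else "below"
    # Generate: render the selected fact in plain English.
    return (f"{name} handled {metrics['calls']} calls this week; "
            f"{key.replace('_', ' ')} was {abs(deltas[key]):.1f} points "
            f"{direction} the team average.")

team_avg = {"calls": 200, "satisfaction": 85.0, "resolution_rate": 70.0}
rep = {"calls": 210, "satisfaction": 92.5, "resolution_rate": 71.0}
print(performance_report("A. Lee", rep, team_avg))
```

Running one such function per representative is how a single rule set can yield thousands of personalized reports, which is the scale traditional BI dashboards struggle to match.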
While we still haven't quite achieved artificial intelligence, natural language processing has become very popular over the past few years and is used in many products today. The goal of NLP is to make interactions between computers and humans feel as natural as interactions between humans. NLP has two main subfields: Natural Language Understanding (NLU) and Natural Language Generation (NLG). Modern NLP algorithms use statistical machine learning to apply linguistic rules to natural language and determine the most likely meaning behind what was said.