 Grammars & Parsing


Excel pro tips: Importing and parsing data

PCWorld

Data imported from other spreadsheets or databases is already separated into fields, using something called a field delimiter--a comma, tab, space, or custom character--to separate one field from another. These databases import easily into Excel and place all the fields in separate columns. If your company pays bills and/or banks online, these sites usually offer copies of the company's records in electronic form. CSV (comma separated values) is the most common data exchange format and, if offered, the best one to use. But what happens when all the data imports into one cell?
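As a rough illustration of how a field delimiter separates one field from another, the sketch below uses Python's standard csv module to split raw text lines into fields. The sample rows and the helper name split_fields are made up for this example and are not taken from the article; the same idea applies when a whole record lands in a single cell and has to be split afterwards.

```python
import csv

def split_fields(lines, delimiter=","):
    """Split each raw text line into a list of fields on the given delimiter.

    csv.reader honors double-quoted fields, so a delimiter that appears
    inside quotes does not start a new field.
    """
    return [row for row in csv.reader(lines, delimiter=delimiter)]

# Hypothetical sample data: each string stands for one unsplit record.
raw_comma = ['2016-05-01,"Acme, Inc.",125.50', "2016-05-02,Globex,80.00"]
raw_tab = ["2016-05-03\tInitech\t42.00"]

comma_rows = split_fields(raw_comma)
tab_rows = split_fields(raw_tab, delimiter="\t")

for row in comma_rows + tab_rows:
    print(row)
```

Note that the quoted field "Acme, Inc." stays in one piece even though it contains a comma, which is the main reason to prefer a CSV-aware reader over a plain string split.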


Extract Subject Matter of Documents Using NLP

#artificialintelligence

Understanding large corpora is an increasingly popular problem. Modern startups and established companies are working diligently to produce models that can extract meaningful data from a body of text. In this post, I will explain some Natural Language Processing (NLP) techniques that can be used to extract the main subject of a particular document. In addition to identifying the main subject, I will explain a technique for extracting subject-verb-object sets wherever the subject is mentioned. To illustrate what I'm talking about, take a look at this TechCrunch article.


ParaText: CSV parsing at 2.5 GB per second

#artificialintelligence

For almost 50 years, CSV has been the format of choice for tabular data. Given the ubiquity of CSV and the pervasive need to deal with CSV in real workflows -- where speed, accuracy, and fault tolerance are a must -- we decided to build a CSV reader that runs in parallel. We conducted extensive benchmarks of ParaText against 7 CSV readers and 5 binary readers. Please refer to our benchmarking whitepaper for more details. In our tests, ParaText can load a CSV file from a cold disk at a rate of 2.5 GB/second and 4.2 GB/second out-of-core from a warm disk.


Stanford CoreNLP

@machinelearnbot

The classpath must include all of the CoreNLP dependencies. The memory requirements of the server are the same as those of CoreNLP, though they grow as you load more models (e.g., memory use increases if you load both the PCFG and Shift-Reduce constituency parser models). A safe minimum is 4 GB; 8 GB is recommended if you can spare it. The server can be stopped programmatically by making a call to the /shutdown endpoint with an appropriate shutdown key. This key is saved to the file /tmp/corenlp.shutdown
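The programmatic shutdown described above could look roughly like the following. This is a minimal sketch, not the documented CoreNLP client: the localhost host, port 9000, the query parameter name key, and all three function names are assumptions made for illustration, and the key file path is passed in by the caller rather than hard-coded.

```python
from pathlib import Path
from urllib.parse import urlencode
from urllib.request import urlopen

def read_shutdown_key(key_file):
    """Read the shutdown key that the server saved to disk."""
    return Path(key_file).read_text().strip()

def build_shutdown_url(key, host="localhost", port=9000):
    """Build the /shutdown URL. Host, port, and the 'key' parameter
    name are assumptions; adjust them to match your server."""
    return f"http://{host}:{port}/shutdown?{urlencode({'key': key})}"

def shutdown_server(key_file, host="localhost", port=9000):
    """Call the /shutdown endpoint and return the HTTP status code."""
    key = read_shutdown_key(key_file)
    with urlopen(build_shutdown_url(key, host, port)) as response:
        return response.status
```

Only shutdown_server touches the network; the other two helpers can be checked without a running server, for example by printing build_shutdown_url("example-key").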


AI: Google AI Tool 'Parsey McParseface' Could Detect Lies, Eliminate Problems Of Human Language With Artificial Intelligence Language Program

#artificialintelligence

Artificial intelligence is one of the world's fastest-developing fields of study, and Google AI tools have already surpassed our expectations of the human brain-like capabilities of AI technology. Having created a groundbreaking "parsing" program, new Google AI tool Parsey McParseface could detect lies and eliminate problems of human language with an artificial intelligence language program. Google AI recently stunned the world upon releasing its AI poetry program, which uses a technique called recurrent neural network language model (RNNLM) to write classical, authentic poetry touted as capable of "making a Vogon proud." The latest AI development premiered by Google is a language parsing tool -- an artificial intelligence program capable of sorting through passages of human language and detecting inconsistencies in rhetoric and prose -- dubbed Parsey McParseface. Google's AI language tool was given the McParseface name when, 18 months into the program's development and still unable to think of a suitable title, Google developers named the sophisticated AI tool in a tongue-in-cheek reference to the viral poll that almost saw a polar research vessel called Boaty McBoatface.


Disrupted AI - why Google's 'Parsey McParseface' is big news in AI - iDisrupted

#artificialintelligence

Before you even ask, the name has no meaning. When Google was trying to figure out what to call its language parsing technology, someone suggested Parsey McParseface; it's a bit like Apple's Liam, which has no clever backstory either. The overall AI model is called SyntaxNet (please make your SkyNet jokes now); ol' Parsey is just for English. Combining machine learning and search techniques, Parsey McParseface is 94 percent accurate, according to Google. It also leans on SyntaxNet's neural-network framework for analyzing the linguistic structure of a sentence or statement, which parses the functional role of each word in a sentence.


Google Has a New AI That Understands English. And Its Name is 'Parsey McParseface'

#artificialintelligence

How machines deal with comprehending human languages is called Natural Language Understanding (NLU), and revolutionary changes in this technology have given us the many virtual assistants we have today. However, NLU still has many obstacles to go through due to the ambiguous nature of the countless languages all over the world. Now, Google claims they're cutting through these difficulties as they announced the open sourcing of a neural network software developed with TensorFlow, SyntaxNet, together with... Parsey McParseface, apparently an English parser. Parsing, in linguistics, is the breaking down of sentences into their component parts to define what each part means. Experts assert that this is a first key component in NLU systems.


Google's machine learning gains natural language understanding - TechCentral.ie

#artificialintelligence

Google is promoting natural language understanding with the open-sourcing of SyntaxNet, a neural network framework, and Parsey McParseface, an advanced parser for English text. Implemented in Google's open source TensorFlow machine intelligence library and released this month, SyntaxNet provides the code needed to train natural language understanding (NLU) models on data along with the Parsey McParseface parser for analysing English text. "Parsey McParseface is built on powerful machine learning algorithms that learn to analyse the linguistic structure of language and that can explain the functional role of each word in a given sentence," said Slav Petrov, Google senior staff research scientist. The project arose out of Google's pondering of how computers can read and understand human language in order to process it in intelligent ways. Accessible on GitHub, SyntaxNet serves as a framework for a syntactic parser, a key first component in many NLU systems, Petrov said.


Google's machine learning gains natural language understanding

#artificialintelligence

Google is promoting natural language understanding with the open-sourcing of SyntaxNet, a neural network framework, and Parsey McParseface, an advanced parser for English text. Implemented in Google's open source TensorFlow machine intelligence library and released this month, SyntaxNet provides the code needed to train natural language understanding (NLU) models on your data along with the Parsey McParseface parser for analyzing English text. "Parsey McParseface is built on powerful machine learning algorithms that learn to analyze the linguistic structure of language and that can explain the functional role of each word in a given sentence," said Slav Petrov, Google senior staff research scientist. The project arose out of Google's pondering of how computers can read and understand human language in order to process it in intelligent ways. Accessible on GitHub, SyntaxNet serves as a framework for a syntactic parser, a key first component in many NLU systems, Petrov said.


Google just open sourced something called 'Parsey McParseface,' and it could change AI forever

#artificialintelligence

As much as we love to fawn over artificial intelligence (AI), it's still not great at recognizing and parsing natural language. That's why Google is open sourcing its new language parsing model for English, which it calls 'Parsey McParseface.' Before you even ask, the name has no meaning. When Google was trying to figure out what to call its language parsing technology, someone suggested Parsey McParseface; it's a bit like Apple's Liam, which has no clever backstory either. The overall AI model is called SyntaxNet (please make your SkyNet jokes now); ol' Parsey is just for English.