I recently read the "Learning to Generate Reviews and Discovering Sentiment" paper by OpenAI and found it super cool. But I could not understand how they use the language model as a feature extractor. Suppose we have 150 characters in a review: how do we extract features from those 150 characters when our input is only 64 characters at a time?
Binary classifiers are employed as discriminators in GAN-based unsupervised style transfer models to ensure that transferred sentences are similar to sentences in the target domain. One difficulty with the binary discriminator is that its error signal is sometimes insufficient to train the model to produce richly structured language. In this paper, we propose a technique that uses a target-domain language model as the discriminator, providing richer, token-level feedback during the learning process. Because our language model scores sentences directly using a product of locally normalized probabilities, it offers a more stable and more useful training signal to the generator. We train the generator to minimize the negative log likelihood (NLL) of generated sentences as evaluated by the language model.
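The scoring idea above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it assumes we already have the per-token conditional probabilities p(x_t | x_<t) that a language model assigns to a sentence, and simply combines them into the sentence-level NLL.

```python
import math

def sentence_nll(token_probs):
    """NLL of a sentence given per-token conditional probabilities
    p(x_t | x_<t) from a language model.

    Since the sentence probability is the product of the locally
    normalized token probabilities, its NLL is the sum of the
    per-token negative log probabilities, giving token-level feedback.
    """
    return -sum(math.log(p) for p in token_probs)

# toy example: a 3-token sentence whose tokens the LM scores
# with probabilities 0.5, 0.25, and 0.1 (made-up numbers)
nll = sentence_nll([0.5, 0.25, 0.1])
```

Each term in the sum is attributable to one token, which is what makes the feedback richer than a single binary real/fake signal.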
You don’t need a sledgehammer to crack a nut. Jonathan Frankle is researching artificial intelligence — not noshing pistachios — but the same philosophy applies to his “lottery ticket hypothesis.” It posits that, hidden within massive neural networks, leaner subnetworks can complete the same task more efficiently. The trick is finding those “lucky” subnetworks, dubbed winning lottery tickets. In a new paper, Frankle and colleagues discovered such subnetworks lurking within BERT, a state-of-the-art neural network approach to natural language processing (NLP). As a branch of artificial intelligence, NLP aims to decipher and analyze human language, with applications like predictive text.
I currently work as a Data Scientist for Informatica, and I thought I'd share my process for learning new things. Recently I've been wanting to explore Deep Learning further, especially Machine Vision and Natural Language Processing. I've been procrastinating a lot, mostly because it's been summer; now that it's fall, cooling down and getting dark early, I'm going to spend more time learning when it's dark out. The thing that deeply interests me is Deep Learning and Artificial Intelligence, partly out of intellectual curiosity and partly out of greed, since most businesses and products will incorporate Deep Learning/ML in some way. I started doing research and realized that an understanding of Deep Learning was within my reach, but I also realized that I still have a lot to learn, more than I initially thought.
It all starts with a language model. Let's assume we have the sequence [my, cat's, breath, smells, like, cat, ____] and we want to guess the final word. There are several ways to create a language model. The most straightforward is an n-gram model that counts occurrences to estimate frequencies. A bare-bones implementation requires only a dozen lines of Python code and can be surprisingly powerful.
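A bare-bones version of that counting approach might look like the following sketch, here as a bigram model (an n-gram model with n = 2). The training corpus and the completion "food" are made-up examples, not data from any real model.

```python
from collections import Counter, defaultdict

def train_bigram(corpus):
    """Count, for each word, how often each word follows it."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        for prev, word in zip(sentence, sentence[1:]):
            counts[prev][word] += 1
    return counts

def predict(counts, prev):
    """Guess the next word: the most frequent follower of `prev`."""
    followers = counts.get(prev)
    return followers.most_common(1)[0][0] if followers else None

# toy corpus completing the example sequence with "food"
corpus = [["my", "cat's", "breath", "smells", "like", "cat", "food"]]
model = train_bigram(corpus)
predict(model, "cat")  # returns the word most often seen after "cat"
```

Relative counts per preceding word are exactly the estimated conditional frequencies, so turning the counts into probabilities (dividing each follower count by the total for that word) is a one-line extension.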