

NLP Deep Learning Training on Downstream tasks using Pytorch Lightning -- Summarization on XSum…


These are not strong scores, and the model will need training for additional epochs. A single epoch took 2 hrs 8 mins to run. The test loss is close to the training loss. The example notebook of Transformers using their Trainer shows a training loss of 2.72 after 1 epoch with a running time of 1 hr 22 mins. Both the loss and the running time are lower than in this notebook using PyTorch Lightning. Is the Lightning framework introducing delays in the training step?
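The gap between the two reported epoch times can be put into numbers with a little arithmetic. The sketch below uses only the two durations quoted above; the helper name `to_minutes` is hypothetical, and the comparison assumes both runs used comparable hardware and batch sizes, which the excerpt does not state.

```python
def to_minutes(hours: int, minutes: int) -> int:
    """Convert an hours/minutes duration to total minutes (hypothetical helper)."""
    return hours * 60 + minutes


# Durations as reported in the excerpt; nothing else is measured here.
lightning_epoch = to_minutes(2, 8)   # 2 hrs 8 mins with PyTorch Lightning
trainer_epoch = to_minutes(1, 22)    # 1 hr 22 mins with the Transformers Trainer

extra_minutes = lightning_epoch - trainer_epoch
slowdown = lightning_epoch / trainer_epoch

print(f"Lightning epoch: {lightning_epoch} min, Trainer epoch: {trainer_epoch} min")
print(f"Extra time per epoch: {extra_minutes} min ({slowdown:.2f}x)")
```

A ratio like this only describes the size of the gap. It does not show that the framework itself is the cause, since data loading, logging, and precision settings could all differ between the two notebooks.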

Machine Learning in GCP


In this blog, I will briefly talk about the different Machine Learning options that are available in Google Cloud Platform and walk through an example project of my own. This will include briefly talking about the older AI Platform service as well as introducing the new Vertex AI service. My project will give an example of how to read data from a GCS bucket, perform exploratory data analysis in a managed Jupyter notebook instance, train a model in that notebook, save the model to a different GCS bucket, and finally use that model in a full-stack application. Here is the repository with the code for this application. AI Platform was GCP's original Machine Learning service.

Feeding the machine: We give an AI some headlines and see what it does


There's a moment in any foray into new technological territory when you realize you may have embarked on a Sisyphean task. Staring at the multitude of options available to take on the project, you research your options, read the documentation, and start to work--only to find that actually just defining the problem may be more work than finding the actual solution. Reader, this is where I found myself two weeks into this adventure in machine learning. I familiarized myself with the data, the tools, and the known approaches to problems with this kind of data, and I tried several approaches to solving what on the surface seemed to be a simple machine-learning problem: based on past performance, could we predict whether any given Ars headline will be a winner in an A/B test? Things have not been going particularly well. In fact, as I finished this piece, my most recent attempt showed that our algorithm was about as accurate as a coin flip.

The Best Sci-Fi Comedy Is Existential


Tom Gerencer's book Intergalactic Refrigerator Repairmen Seldom Carry Cash features 19 pieces of humorous science fiction. Gerencer selected the stories out of literally hundreds that he's written over the past two decades. "If you go to Walmart, and you go into the section with the big Tupperware bins that you can put clothes and stuff in, I would just write and write and write, and fill a notebook with short stories--or fragments of short stories--and then I would put them into the bin, and then I would fill another notebook and put that in the bin, and fill another notebook, and now I have five or six bins in the basement, and there are several bins that I lost at some point," Gerencer says in Episode 473 of the Geek's Guide to the Galaxy podcast. "It is certainly an avalanche of words." With titles like "Trailer Trash Savior" and "Apocalyptic Nostrils of the Moon," you might expect the stories to be light-hearted, but Gerencer's work also contains a dark streak of existential angst, frequently dealing with questions such as: How can we be happy?

4 Cool Python Libraries That You Should Know About


Some of my most popular blogs are about Python libraries. I believe that they are so popular because Python libraries have the power to save us a lot of time and headaches. The problem is that most people focus on the most popular libraries but forget that many lesser-known Python libraries are just as good as their more famous cousins. Finding new Python libraries can also be problematic. Sometimes we read about these great libraries, and when we try them, they don't work as we expected.



This course will teach you the foundations of deep learning and TensorFlow as well as prepare you to pass the TensorFlow Developer Certification exam (optional). Videos going through the rest of the notebooks (03 - 10) are available in the full course. New: you can now read the full course as an online book! (note: this is a work in progress, but 95% of it should run fine) Check out the livestream Q&A celebrating the course launch on YouTube. Otherwise, many of them might be answered below. This table is the ground truth for course materials.

Kaggle BIPOC Grant program-My experience


This year, Kaggle started a new program called the BIPOC (Black, Indigenous, People of Color) Grant Program. It aims to empower underrepresented data scientists with support to advance their careers and aspirations. I am grateful that I was one of the few people who became a part of this wonderful program. All the students who became part of the program were assigned a mentor as well. I had done a few basic projects before I became a part of this program.

Great New Resource for Natural Language Processing Research and Applications - KDnuggets


With all of the massive and relentless advancements of natural language processing recently, keeping up with research breakthroughs and SOTA practices can be fraught with challenges. Where to find papers, which papers present which ideas, tracking down code that goes along with papers, these are all very real struggles. What if there was a single spot you could go to get the jump on all of these different activities, and come away with everything you need to keep up with the NLP Joneses? If you haven't heard, Ricky Costa, CEO at Quantum Stat, very recently announced the launch of The NLP Index. Ricky describes the NLP index as "a new asset in NLP code discovery," and goes on to say: It has over 3,000 code repositories and I've already created a nice side bar with some of the most important topics in NLP today!

SAS and Microsoft Certifications for Data Scientists


There are numerous reasons why a data scientist would be interested in a SAS or Microsoft professional certification. First, it is a great way to pick up a new skill or even improve an existing skill. Certifications can help with professional and career development. And now, you can even take certification exams from the comfort of your own home. I've had the opportunity to earn several SAS and Microsoft certifications, so in today's article, I want to share my thoughts around each one to help you decide which is right for you!

When his hobbies went on hiatus, this Kaggler made fighting COVID-19 with data his mission


Most medical articles have methods and results sections, and matches in those sections are more important. I had little to no expectations entering this competition, so I wouldn't say I was surprised by anything. It was great to see so many smart and capable people all working together to try to help in whatever way they could. All of the work is driven by the Kaggle platform. The list of notebooks covers all the submissions for Round 1 and Round 2 of the CORD-19 challenge. All of the notebooks are in Python.
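The idea that matches in methods and results sections matter more can be illustrated with a simple section-weighted score. This is a minimal sketch and not the author's actual approach: the weights, the section names, the function `score_article`, and the sample article text are all assumptions made up for illustration.

```python
from typing import Mapping

# Hypothetical weights: a match in methods or results counts double.
SECTION_WEIGHTS = {"methods": 2.0, "results": 2.0}
DEFAULT_WEIGHT = 1.0


def score_article(sections: Mapping[str, str], term: str) -> float:
    """Sum case-insensitive occurrences of `term`, weighted by section name."""
    needle = term.lower()
    total = 0.0
    for name, text in sections.items():
        weight = SECTION_WEIGHTS.get(name.lower(), DEFAULT_WEIGHT)
        total += weight * text.lower().count(needle)
    return total


# Made-up article text, used only to exercise the function.
article = {
    "Introduction": "Incubation period estimates vary.",
    "Methods": "We estimated the incubation period from case reports.",
    "Results": "The incubation period is summarized per cohort.",
}

print(score_article(article, "incubation period"))
```

With these assumed weights, one match in each of the three sections contributes 1.0 + 2.0 + 2.0, so articles that mention the term where the evidence is reported rank above those that only mention it in passing.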