Collaborating Authors


PostgreSQL and Machine Learning


I will show you how to apply Machine Learning algorithms to data from a PostgreSQL database to get insights and predictions. I will use supervised, an open-source Python package for Automated Machine Learning (AutoML). Thanks to AutoML I will get quick access to many ML algorithms: Decision Tree, Logistic Regression, Random Forest, XGBoost, Neural Network. The AutoML will handle feature engineering as well.
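A minimal sketch of the workflow described above, assuming the package in question is mljar-supervised (imported as `supervised.automl.AutoML`) and that `psycopg2` and `pandas` are installed. The table name, column names, and connection string are hypothetical, and the helper functions are illustrative rather than taken from the article.

```python
def build_select_query(table, feature_columns, target_column):
    """Build a SELECT returning the feature columns followed by the target.

    Names are validated as plain identifiers because SQL identifiers
    cannot be passed as bound query parameters.
    """
    columns = list(feature_columns) + [target_column]
    for name in columns + [table]:
        if not name.isidentifier():
            raise ValueError(f"unsafe SQL identifier: {name!r}")
    return f"SELECT {', '.join(columns)} FROM {table}"


def train_from_postgres(dsn, table, feature_columns, target_column):
    """Load a table from PostgreSQL and fit an AutoML model on it.

    Third-party imports are kept local so the query helper above can be
    used without them. Assumes mljar-supervised, pandas and psycopg2.
    """
    import pandas as pd
    import psycopg2
    from supervised.automl import AutoML

    query = build_select_query(table, feature_columns, target_column)
    conn = psycopg2.connect(dsn)
    try:
        df = pd.read_sql(query, conn)
    finally:
        conn.close()

    X = df[list(feature_columns)]
    y = df[target_column]
    automl = AutoML()  # default settings; the library picks the algorithms
    automl.fit(X, y)
    return automl
```

Calling `train_from_postgres("dbname=hr user=analyst", "employees", ["age", "salary"], "left_company")` would return a fitted model whose `predict` method accepts a data frame with the same feature columns; all of those argument values are placeholders.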

US Consumer Technology Spending Expands in August


Despite the uncertainty expressed at the end of July, consumers' self-reported tech spending expanded significantly, according to interviews conducted in August. In the latest round of results from IDC's Consumer Purchase & Subscription Index survey, consumer tech spending blew past the April benchmark as COVID-19 concern stabilized and optimism won out. Once again, consumer spend on devices led the way as respondents reported a 17% increase in their device expenditures, up 22% over the April baseline. Outlays on services rose 2%. Most devices saw gains, with video game consoles, phone accessories, and PCs/tablets as particularly strong performers.

Global quieting of high-frequency seismic noise due to COVID-19 pandemic lockdown measures


Noise from trains, airplanes, industrial processes, and other sources is recorded on seismometers worldwide. Disentangling this noise is important for extracting natural signals, but the noise can also roughly track population movements. Lecocq et al. compiled seismic observations around the world and found a substantial decrease in noise resulting from lockdown measures imposed in response to the coronavirus disease 2019 pandemic (see the Perspective by Denolle and Nissen-Meyer). These observations correspond closely to when the measures went into effect and offer a way to track aggregate behavior. This quiet period also offers the chance to separate anthropogenic sources of noise from those of natural processes. Science, this issue p. 1338 (doi:10.1126/science.abd2438); see also p. 1299 (doi:10.1126/science.abd8358). Human activity causes vibrations that propagate into the ground as high-frequency seismic waves. Measures to mitigate the coronavirus disease 2019 (COVID-19) pandemic caused widespread changes in human activity, leading to a months-long reduction in seismic noise of up to 50%. The 2020 seismic noise quiet period is the longest and most prominent global anthropogenic seismic noise reduction on record. Although the reduction is strongest at surface seismometers in populated areas, this seismic quiescence extends for many kilometers radially and hundreds of meters in depth. This quiet period provides an opportunity to detect subtle signals from subsurface seismic sources that would have been concealed in noisier times and to benchmark sources of anthropogenic noise. A strong correlation between seismic noise and independent measurements of human mobility suggests that seismology provides an absolute, real-time estimate of human activities.

Learning to Summarize with Human Feedback


To test our models' generalization, we also applied them directly to the popular CNN/DM news dataset. These articles are more than twice as long as Reddit posts and are written in a very different style. Our models have seen news articles during pre-training, but all of our human data and RL fine-tuning were on the Reddit TL;DR dataset. At a given summary length, our 6.7B human feedback model trained on Reddit performs almost as well as a fine-tuned 11B T5 model, despite not being re-trained on CNN/DM. Note that our human feedback models generate summaries that are significantly shorter than summaries from models trained on CNN/DM.

Architecture of a real-world Machine Learning system


This article is the 2nd in a series dedicated to Machine Learning platforms. It was supported by Digital Catapult and PAPIs. In the previous article, I presented an overview of ML development platforms, whose job is to help create and package ML models. Model building is just one capability, out of many, required in ML systems. I ended that article by mentioning other types of ML platforms, and limitations when building real-world ML systems.

RStudio AI Blog: An introduction to weather forecasting with deep learning


With all that is going on in the world these days, is it frivolous to talk about weather prediction? Asked in the 21st century, this is bound to be a rhetorical question. Today, no lengthy justification is needed as to why prediction of atmospheric states is vital: due to global warming, the frequency and intensity of severe weather conditions – droughts, wildfires, hurricanes, heatwaves – have risen and will continue to rise. And while accurate forecasts don't change those events per se, they constitute essential information in mitigating their consequences. This goes for atmospheric forecasts on all scales: from so-called "nowcasting" (operating on a range of about six hours), through medium-range (three to five days) and sub-seasonal (weekly/monthly), to climate forecasts (concerned with years and decades).

End-to-end Object Detection with Transformers


We present a new method that views object detection as a direct set prediction problem. Our approach streamlines the detection pipeline, effectively removing the need for many hand-designed components like a non-maximum suppression procedure or anchor generation that explicitly encode our prior knowledge about the task. The main ingredients of the new framework, called DEtection TRansformer or DETR, are a set-based global loss that forces unique predictions via bipartite matching, and a transformer encoder-decoder architecture. Given a fixed small set of learned object queries, DETR reasons about the relations of the objects and the global image context to directly output the final set of predictions in parallel. The new model is conceptually simple and does not require a specialized library, unlike many other modern detectors.
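To make the bipartite matching step concrete, here is a small sketch that finds the minimum-cost one-to-one assignment between predictions and ground-truth objects. It uses a brute-force search over permutations, which is only practical for tiny examples and is not the solver used in DETR; the function name and the cost values are hypothetical.

```python
from itertools import permutations


def match_predictions(cost):
    """Find the cheapest one-to-one assignment of predictions to targets.

    cost[i][j] is the cost of matching prediction i to ground-truth
    object j. There must be at least as many predictions as targets.
    Returns (total_cost, assignment), where assignment[j] is the index
    of the prediction matched to target j.
    """
    n_pred = len(cost)
    n_tgt = len(cost[0]) if cost else 0
    best_total, best = float("inf"), ()
    # Try every way of picking a distinct prediction for each target.
    for perm in permutations(range(n_pred), n_tgt):
        total = sum(cost[p][t] for t, p in enumerate(perm))
        if total < best_total:
            best_total, best = total, perm
    return best_total, list(best)


# Three predictions, two ground-truth objects (illustrative costs).
cost = [
    [0.9, 0.1],
    [0.2, 0.8],
    [0.5, 0.5],
]
total, assignment = match_predictions(cost)
# Target 0 is matched to prediction 1, target 1 to prediction 0.
# Prediction 2 stays unmatched and would be treated as "no object".
```

Because each prediction can be matched to at most one target, a loss computed over this assignment penalizes duplicate detections of the same object, which is what allows the pipeline to drop non-maximum suppression.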

Researchers find face masks have no 'significant' effect on speech recognition accuracy


Can face masks affect the accuracy of automatic speech recognition systems? That's the question researchers at the Educational Testing Service (ETS), the nonprofit assessment organization headquartered in Princeton, New Jersey, sought to answer in a study published this week. Drawing on recordings from ETS' English language proficiency test, for which exam-takers were required to wear face masks, they found that while differences between the recordings and no-mask baselines existed, they didn't lead to "significant" variations in scores. The pandemic has led to a dramatic increase in the use of face masks worldwide, with 65% of U.S. adults saying they wore a mask in stores during the month of May, according to the Pew Research Center. This has potential implications for the speech algorithms underpinning smart speakers, smart displays, mobile apps, and indeed automated language proficiency tests. Face coverings come in all sizes and thicknesses and can impact a wearer's speech patterns, for example by distorting the sound of a person's speech or by greatly attenuating it.

Step by step guide to explaining your ML project during a data science interview.


This is Part 2 of the Interview Question series that I recently started. In Part 1, we talked about another important data science interview question pertaining to scaling your ML model. Be sure to check that out! A typical open-ended question that often comes up during interviews (both first and second round) is related to your personal (or side) projects. And trust me when I say this, this question is the best thing that can happen to you during an interview.

Why artificial intelligence is key to improving phishing defenses


As attackers constantly evolve their tactics to side-step more traditional defenses, artificial intelligence and machine learning technologies are stepping in to help organizations improve defenses. Technologies like MessageControl offer a critical extra layer of protection, especially when fully integrated into a multi-tenant platform to help inform cross-product detection. A Capgemini Research Institute study found that 69% of senior executive respondents said they would be unable to respond to a cyberattack without artificial intelligence. The same study found two-thirds of organizations plan to employ artificial intelligence by 2020, demonstrating the mandate security leaders face in implementing this technology in a focused and valuable way: at their email perimeters and inside their organizations. By constantly 'learning' an organization's environment and user behaviors to get smarter over time, these technologies create a baseline of normal, with deviations from that baseline highlighting potential threats.
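The idea of a learned baseline with flagged deviations can be illustrated with a very simple statistical sketch. This is not how MessageControl or any specific product works; it assumes a single hypothetical signal (emails sent per day by one user) and flags values more than a chosen number of standard deviations from the historical mean.

```python
from statistics import mean, stdev


def fit_baseline(history):
    """Summarize past behavior as (mean, sample standard deviation)."""
    return mean(history), stdev(history)


def is_anomalous(value, baseline, threshold=3.0):
    """Flag a value that lies more than `threshold` deviations from the mean."""
    mu, sigma = baseline
    if sigma == 0:
        # No variation in the history: any different value is a deviation.
        return value != mu
    return abs(value - mu) / sigma > threshold


# Hypothetical daily counts of emails sent by one user.
history = [20, 22, 19, 21, 23, 20, 18]
baseline = fit_baseline(history)

typical_day = is_anomalous(21, baseline)     # within the usual range
suspicious_day = is_anomalous(90, baseline)  # far outside the usual range
```

A production system would track many signals and update the baseline over time; the single threshold here only shows the shape of the approach.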