data science


Alternative Python libraries for Data Science

#artificialintelligence

The dabl library was created by Andreas Mueller, one of the core developers and maintainers of the scikit-learn machine learning library. The idea behind dabl is to make supervised machine learning more accessible to beginners and to reduce boilerplate for common tasks. Dabl takes inspiration from scikit-learn and auto-sklearn. The library is still under active development, so it isn't yet recommended for production use. Dabl can be used for automated preprocessing of a dataset, quick EDA, and initial model building as part of a typical machine learning pipeline.


Dealing with Imbalanced Data in Machine Learning - KDnuggets

#artificialintelligence

As an ML engineer or data scientist, sometimes you inevitably find yourself in a situation where you have hundreds of records for one class label and thousands of records for another class label. Upon training your model you obtain an accuracy above 90%. You then realize that the model is predicting everything as if it's in the class with the majority of records. Excellent examples of this are fraud detection problems and churn prediction problems, where the majority of the records are in the negative class. What do you do in such a scenario?
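The situation described above can be reproduced with a minimal sketch. The class counts (1,900 negative and 100 positive records) and the always-negative "model" are hypothetical, chosen only to illustrate why accuracy alone is misleading on imbalanced data:

```python
# Hypothetical toy labels: 1,900 negative records (0) and 100 positive records (1).
y_true = [0] * 1900 + [1] * 100

# A degenerate "model" that predicts the majority class for every record.
y_pred = [0] * len(y_true)


def accuracy(y_true, y_pred):
    """Fraction of records whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of actual positive records that the model labels positive."""
    true_pos = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    actual_pos = sum(t == positive for t in y_true)
    return true_pos / actual_pos if actual_pos else 0.0


print(accuracy(y_true, y_pred))  # 0.95
print(recall(y_true, y_pred))    # 0.0
```

Accuracy is 95% even though not a single positive record is found, which is why metrics such as recall on the minority class are worth checking alongside accuracy.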


Save hundreds on these Python, AI and data science courses

Engadget

In this age of big data, companies worldwide need to sift through the avalanche of information at their disposal to enhance their products, services and overall profitability. Many companies rely on programming languages like Python and the advancements made in artificial intelligence (AI) and data science to get that job done. Right now, you can save hundreds on The Ultimate Python & Artificial Intelligence Certification Bundle, featuring nine in-depth courses and 38 hours of video content that catches you up to speed on everything Python, AI and data science.


4 Powerful Use Cases for Data Science in Finance

#artificialintelligence

There are a plethora of success stories demonstrating how major financial players capitalise on their data. The coronavirus pandemic and the global measures that have followed have created a perfect economic storm. The financial sector stands at the front line of a growing credit crisis, with banks trying to manage disruption and maintain strict compliance amid social distancing guidelines which are at odds with their processes. Then there are the extraordinarily low interest rates and increasingly cash-insecure consumers to contend with. To navigate the immediate obstacles, financial institutions must assess short-to-medium-term financial risks and adapt to new ways of operating in a post-pandemic world.


Are big data and machine learning methods enough? Part 1

#artificialintelligence

Sir David Hand gave a brilliant plenary talk and set the stage for a great panel discussion by cautioning us to remember that thinking is required and to be aware of all the dark data out there -- the data that we don't see, but that we need to take into account. Dark Data: Why What You Don't Know Matters is his latest book (see a blog post about it; if you haven't read it, you can get a sample excerpt). The panelists included Cameron Willden, statistician at W.L. Gore, who supports engineers and scientists across many different product lines; Sam Gardner, founder of Wildstats Consulting, with more than 30 years of experience doing statistical problem solving for government and industry; and JMP's Jason Wiggins, a 20-year US Synthetic veteran with expertise in process optimization, measurement systems analysis and predictive modeling/data mining. We ran out of time before we could answer all the questions from the livestream audience, but our panelists have kindly agreed to provide answers to many of them, further sharing the wisdom from their collective experiences. The questions are grouped by topic -- there were so many, we are doing two posts.


Difference Between Data Science and Machine Learning

#artificialintelligence

One of the most common sources of confusion concerns modern technologies such as artificial intelligence, machine learning, big data, data science, deep learning, and more. While they are closely interconnected, each has its own function. In recent years, these technologies have become so prominent that a number of organizations have woken up to their significance and are increasingly looking to adopt them for business growth. While the terms data science and machine learning fall in the same space, they have their own applications and significance. These areas occasionally overlap, but each term essentially has its own distinct uses.


How is machine learning different from AI and data science?- Edvancer Eduventures

#artificialintelligence

In this blog post, I will explain how machine learning fits into the broader landscape of data and computer science. This means understanding how machine learning interrelates with parent fields and sister disciplines. This is important, as these are the terms you will see time and again when searching for relevant study materials and hear mentioned ad nauseam in machine learning books. Relevant disciplines can also be difficult and confusing to tell apart at first glance, such as 'machine learning' and 'data mining.' The lineage of machine learning can be understood by first examining its forefathers.


Extending Target Encoding

#artificialintelligence

At the very beginning of this millennium, when my hair was a lot darker, I wrote a little article with a pretty long name, entitled "A preprocessing scheme for high-cardinality categorical attributes in classification and prediction problems". It was a straightforward article, which I decided to write, driven by a very practical need for a method to deal with data types that were hard to plug into Machine Learning (ML) models. At the time, I was working on ML models to detect fraudulent e-commerce transactions. Therefore I was dealing with very "sparse" categorical variables, like ZIP codes, IP addresses, or SKUs. I could not find an easy way to "preprocess" such variables, except for the traditional one-hot encoding, which didn't scale well to situations where one deals with hundreds or even thousands of unique values. A decision tree method popular at the time, the C5.0 algorithm by R. Quinlan, provided the capability to group together individual sets of values as part of the tree generation process.
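As a rough illustration of the general idea behind target encoding, the sketch below replaces each category value with a smoothed average of the target for that value. It assumes a simple additive-smoothing rule and a hypothetical `smoothing` parameter; the exact weighting used in the article cited above may differ. The ZIP codes and fraud labels are made up.

```python
from collections import defaultdict


def fit_target_encoding(categories, targets, smoothing=10.0):
    """Learn a smoothed mean of the target for each category value.

    Categories with few records are pulled toward the overall target mean
    (the prior), so a value seen once does not get an extreme encoding.
    """
    prior = sum(targets) / len(targets)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for category, target in zip(categories, targets):
        sums[category] += target
        counts[category] += 1
    encoding = {
        category: (sums[category] + smoothing * prior) / (counts[category] + smoothing)
        for category in counts
    }
    return encoding, prior


def transform(categories, encoding, prior):
    """Map category values to encodings; unseen values fall back to the prior."""
    return [encoding.get(category, prior) for category in categories]


# Hypothetical data: ZIP codes with a binary fraud label.
zip_codes = ["10001"] * 4 + ["94105"] + ["60601"] * 5
is_fraud = [1, 1, 1, 0] + [1] + [0, 0, 0, 0, 0]

encoding, prior = fit_target_encoding(zip_codes, is_fraud, smoothing=2.0)
print(transform(["10001", "94105", "60601", "00000"], encoding, prior))
```

Unlike one-hot encoding, this yields a single numeric column no matter how many unique values the variable has. In practice the encoding should be fitted on training data only, so that target information does not leak into evaluation.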


'Wearing too many hats': How to bridge the AI skills gap

#artificialintelligence

Organizations with an interdisciplinary team have a "far higher ratio of success" when deploying AI projects, said Arun Chandrasekaran, distinguished VP analyst at Gartner, speaking at a Gartner IT Symposium/Xpo Americas session last week. Interdisciplinary teams that blend roles across business and data science have a higher ratio of success with AI projects, as well as a faster time to production. This trend "clearly tells us that AI needs to be a team sport," said Chandrasekaran. "However, in reality what we see in most organizations is data scientists wearing too many hats, because there's a dearth of skills across other areas," he said.


Top 12 Python Developer Skills You Must Need to Know

#artificialintelligence

Python is the most powerful language you can still read. Python is actively being used in various domains such as Data Science, Machine Learning, Web Applications, and much more. In this section, we'll cover more than ten must-have skills for Python developers that will help you master the art of working with Python. Before jumping into a framework or a development environment, it is crucial to first master the core concepts of the language. The same is true of Python or any other programming language. If you don't know where to start, you can find some good and useful resources on the internet.