data scientist


Feature Engineering in SQL and Python: A Hybrid Approach - KDnuggets

#artificialintelligence

I knew SQL long before learning about Pandas, and I was intrigued by the way Pandas faithfully emulates SQL. Stereotypically, SQL is for analysts, who crunch data into informative reports, whereas Python is for data scientists, who use data to build (and overfit) models. Although the two are almost functionally equivalent, I'd argue both tools are essential for a data scientist to work efficiently. From my experience with Pandas, I've noticed the following: Those problems were naturally solved when I began feature engineering directly in SQL. If you know a little bit of SQL, it's time to put it to good use.


Top 10 Big Data Startups in the United States to Watch In 2020

#artificialintelligence

Data is growing by leaps and bounds, and the convergence of extremely large data sets, both structured and unstructured, defines Big Data. The increasing awareness of Internet of Things (IoT) devices among organizations, together with the volume, variety, velocity and veracity at which data is generated, has caught the attention of enterprises seeking to enhance digital technologies and guide digital transformation. Analytics Insights estimates that the global big data market will grow at a CAGR of 10.9%, from US$ 193.5 billion in 2020 to US$ 301.5 billion by 2023. This region is witnessing significant developments in the big data market, which is gaining remarkable traction in the BFSI industry vertical. Numerai is the world's first hedge fund, to predict the stock market.


Apple Data Science Interview Questions

#artificialintelligence

Apple Inc. is one of the biggest technology companies in the world; it designs, develops, and sells consumer electronics, computer software, and online services. Apple is constantly in need of creative, passionate, and dedicated data scientists who can sit on any number of its teams. From its research-based artificial intelligence development team at Siri to its cloud-based architecture development team at iCloud, Apple has slowly but steadily been building data science teams to handle the avalanche of data accumulated on a daily basis. As with other big tech companies, the role of a data scientist at Apple varies a lot and depends on the team you are assigned to. This means the job can require everything from analytics to machine learning software design to plain engineering.


Build AI you can trust with responsible ML

#artificialintelligence

As AI reaches critical momentum across industries and applications, it becomes essential to ensure the safe and responsible use of AI. AI deployments are increasingly impacted by the lack of customer trust in the transparency, accountability, and fairness of these solutions. Microsoft is committed to the advancement of AI and machine learning (ML), driven by principles that put people first, and tools to enable this in practice. In collaboration with the Aether Committee and its working groups, we are bringing the latest research in responsible AI to Azure. Let's look at how the new responsible ML capabilities in Azure Machine Learning and our open-source toolkits empower data scientists and developers to understand ML models, protect people and their data, and control the end-to-end ML process.


What's Next After Machine Learning?

#artificialintelligence

Recently, there has been increasing attention towards machine learning and its application within personal and business contexts. "The study of computer algorithms that improve automatically through experience." Machine learning leverages mathematical models and big data to solve business problems. This requires two key support areas: data engineering and data science. Within data engineering, the evolution of cloud computing has allowed big data to be stored inexpensively. Within data science, the rise of data scientists and data science tools has made model building and exploration easier. However, machine learning is NOT the summit of a data analytics evolution journey.


5 Key Challenges In Today's Era of Big Data

#artificialintelligence

Digital transformation will create trillions of dollars of value. While estimates vary, the World Economic Forum in 2016 estimated an increase of $100 trillion in global business and social value by 2030. Due to AI, PwC has estimated an increase of $15.7 trillion and McKinsey has estimated an increase of $13 trillion in annual global GDP by 2030. We are currently in the middle of an AI renaissance, driven by big data and breakthroughs in machine learning and deep learning. These breakthroughs offer opportunities and challenges to companies depending on the speed at which they adapt to these changes.


2 books to strengthen your command of python machine learning

#artificialintelligence

This post is part of "AI education", a series of posts that review and explore educational content on data science and machine learning. Mastering machine learning is not easy, even if you're a crack programmer. I've seen many people come from a solid background of writing software in different domains (gaming, web, multimedia, etc.) thinking that adding machine learning to their roster of skills is another walk in the park. And every single one of them has been dismayed. I see two reasons why the challenges of machine learning are misunderstood. First, as the name suggests, machine learning is software that learns by itself as opposed to being instructed on every single rule by a developer.


IBM Research releases differential privacy library that works with machine learning

#artificialintelligence

Differential privacy has become an integral way for data scientists to learn from the majority of their data while simultaneously ensuring that those results do not allow any individual's data to be distinguished or re-identified. To help more researchers with their work, IBM released the open-source Differential Privacy Library. The library "boasts a suite of tools for machine learning and data analytics tasks, all with built-in privacy guarantees," according to Naoise Holohan, a research staff member on IBM Research Europe's privacy and security team. "Our library is unique to others in giving scientists and developers access to lightweight, user-friendly tools for data analytics and machine learning in a familiar environment–in fact, most tasks can be run with only a single line of code," Holohan wrote in a blog post on Friday. "What also sets our library apart is our machine learning functionality enables organizations to publish and share their data with rigorous guarantees on user privacy like never before."


Data Scientist - IoT BigData Jobs

#artificialintelligence

DuPont has a rich history of scientific discovery that has enabled countless innovations and today, we're looking for more people, in more places, to collaborate with us to make life the best that it can be. DuPont Pioneer is aggressively building Big Data and Predictive Analytics capabilities in order to deliver improved services to our customers. We seek a strong data scientist with a background in math, statistics, machine learning and scientific computing to join our team. This is a critical position with the potential to make immediate, significant impact on our business. The successful candidate will have an extensive background in statistical computing and machine learning through courses or thesis/dissertation, and proven experience validating models against experimental data.


Zicklin Grad Students Take Top Spot in Pitney Bowes Data Challenge - Zicklin School of Business

#artificialintelligence

Nearly five dozen students from Baruch College and the Zicklin School of Business got to show off their data-crunching skills recently when they participated in the Baruch College – Pitney Bowes Data Challenge, held on May 1. The winning team of Zicklin graduate students -- Drace (Yilei) Zhan (MS Statistics, '20), Nishtha Ram (MS Quantitative Methods & Modeling, '21), Huimin Chen (MS Information Systems, '21), Kang Li (MS QMM, '20), and Rosario Campoverde (MBA, '20) -- outperformed 50 other undergraduate and graduate students across Baruch and Zicklin to take first place. The competition was the culmination of a year-long collaboration among Pitney Bowes and the Paul H. Chook Department of Information Systems and Statistics, the Graduate Career Management Center, and the Starr Career Development Center. The partnership included seminars held throughout the year on machine learning, design thinking, marketing analytics, and other topics, presented by Pitney Bowes data scientists; and a free bootcamp on Python and AWS that was led by Zicklin professors. It was funded by a $10,000 grant from the NYC/CUNY Workforce Development Initiative.