

Supervised & Unsupervised Approach to Topic Modelling in Python

#artificialintelligence

This article provides a high-level intuition for topic modelling and its applications. It takes a deep dive into the ways one can approach a problem that requires topic modelling, and shows how to solve it in both a supervised and an unsupervised manner. I place an emphasis on restructuring the data and the initial problem so that the solution can be executed in a variety of ways. Topic modelling is a subsection of natural language processing (NLP), or text mining, which aims to build models that parse bodies of text with the goal of identifying the topics mapped to that text. These models assist in identifying the big-picture topics associated with documents at scale.


With great ML comes great responsibility: 5 key model development questions

#artificialintelligence

Were you unable to attend Transform 2022? Check out all of the summit sessions in our on-demand library now! The rapid growth in machine learning (ML) capabilities has led to an explosion in its use. Natural language processing and computer vision models that seemed far-fetched a decade ago are now commonly used across multiple industries. We can build models that generate high-quality, complex images from never-before-seen prompts, deliver cohesive textual responses from just a simple initial seed, or even carry on fully coherent conversations.


Making AI accountable: Blockchain, governance, and auditability

#artificialintelligence

The past few years have brought much hand-wringing and arm-waving about artificial intelligence (AI), as business people and technologists alike worry about the outsize decisioning power they believe these systems to have. As a data scientist, I am accustomed to being the voice of reason about the possibilities and limitations of AI. In this article I'll explain how companies can use blockchain technology for model development governance: a way to better understand AI, make the model development process auditable, and identify and assign accountability for AI decisioning. While there is widespread awareness of the need to govern AI, the discussion of how to do so is often nebulous, as in "How to Build Accountability into Your AI" in Harvard Business Review: A healthy ecosystem for managing AI must include governance processes and structures... Accountability for AI means looking for solid evidence of governance at the organizational level, including clear goals and objectives for the AI system; well-defined roles, responsibilities, and lines of authority; a multidisciplinary workforce capable of managing AI systems; a broad set of stakeholders; and risk-management processes. Additionally, it is vital to look for system-level governance elements, such as documented technical specifications of the particular AI system, compliance, and stakeholder access to system design and operation information.


What is MLOps?

#artificialintelligence

Ever liked something on Instagram and then, almost immediately, seen related content in your feed? Or searched for something on Google, only to be spammed with ads for that exact thing moments later? These are symptoms of an increasingly automated world. Behind the scenes, they are the result of state-of-the-art MLOps pipelines. We take a look at MLOps and what it takes to deploy machine learning models effectively, starting with some key aspects of DevOps.


Data Scientists are from Mars and Software Developers are from Venus - KDnuggets

#artificialintelligence

Figure 1: Data Scientists are from Mars and Software Developers are from Venus. Mars and Venus are very different planets. Mars's atmosphere is very thin and it can get very cold, while Venus's atmosphere is very thick and it can get very hot -- hot enough to melt lead! Yet, they are our closest sister planets. They have a number of similarities too.


Will AI Replace the Humans In the Loop?

#artificialintelligence

Data annotation might seem like a purely technical challenge. Machine learning is widely seen as a means to replace human effort, and even as a threat to people's jobs. But these views miss the fundamental point that people and machines perform best when each works together to augment the other's capabilities. When most people think of machine learning, they place the emphasis on the machine. However, for the machine to learn, people have to train machine learning models and maintain them in production thereafter.


Why data quality is key to successful ML Ops

#artificialintelligence

Machine learning has been, and will continue to be, one of the biggest topics in data for the foreseeable future. And while we in the data community are all still riding the high of discovering and tuning predictive algorithms that can tell us whether a picture shows a dog or a blueberry muffin, we're also beginning to realize that ML isn't just a magic wand you can wave at a pile of data to quickly get insightful, reliable results. Instead, we are starting to treat ML like other software engineering disciplines that require processes and tooling to ensure seamless workflows and reliable outputs. "Poor data quality is Enemy #1 to the widespread, profitable use of machine learning, and for this reason, the growth of machine learning increases the importance of data cleansing and preparation. The quality demands of machine learning are steep, and bad data can backfire twice -- first when training predictive models and second in the new data used by that model to inform future decisions."
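One concrete way to act on "bad data backfires twice" is to gate training on a simple data-quality report. The sketch below is an illustrative assumption, not the article's tooling: the column names, threshold, and pandas-based check are all hypothetical.

```python
# Minimal pre-training data-quality gate: flag columns with too many
# nulls and count exact duplicate rows before any model is trained.
import pandas as pd

def quality_report(df: pd.DataFrame, max_null_frac: float = 0.05) -> dict:
    """Return columns whose null fraction exceeds the threshold, plus a duplicate-row count."""
    null_frac = df.isna().mean()
    return {
        "bad_columns": null_frac[null_frac > max_null_frac].index.tolist(),
        "duplicate_rows": int(df.duplicated().sum()),
    }

df = pd.DataFrame({
    "age": [34, None, 29, 34],
    "income": [52000, 61000, None, 52000],
    "label": [1, 0, 1, 1],
})
report = quality_report(df)
print(report)  # {'bad_columns': ['age', 'income'], 'duplicate_rows': 1}
```

Running a check like this both before training and on incoming scoring data addresses both places where, per the quote, bad data can backfire.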


Effective testing for machine learning systems.

#artificialintelligence

Working as a core maintainer for PyTorch Lightning, I've grown a strong appreciation for the value of tests in software development. As I've been spinning up a new project at work, I've been spending a fair amount of time thinking about how we should test machine learning systems. A couple of weeks ago, one of my coworkers sent me a fascinating paper on the topic, which inspired me to dig in, collect my thoughts, and write this blog post. In this post, we'll cover what testing looks like for traditional software development, why testing machine learning systems can be different, and some strategies for writing effective tests for machine learning systems. We'll also clarify the distinction between the closely related roles of evaluation and testing as part of the model development process.
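To make the "tests for ML components" idea concrete, here is a minimal sketch of behavioral tests for a data-processing step; the `normalize` function is an illustrative stand-in, not code from the post or from PyTorch Lightning.

```python
# Behavioral tests for an ML pipeline component: check shape invariance
# and the statistical properties the function is supposed to guarantee.
import numpy as np

def normalize(x: np.ndarray) -> np.ndarray:
    """Scale each feature column to zero mean and unit variance."""
    return (x - x.mean(axis=0)) / x.std(axis=0)

def test_normalize_preserves_shape():
    x = np.random.rand(10, 3)
    assert normalize(x).shape == x.shape

def test_normalize_statistics():
    x = np.random.rand(100, 3)
    z = normalize(x)
    assert np.allclose(z.mean(axis=0), 0.0, atol=1e-8)
    assert np.allclose(z.std(axis=0), 1.0, atol=1e-8)

test_normalize_preserves_shape()
test_normalize_statistics()
print("all tests passed")
```

Unlike evaluation, which asks how well a model performs, tests like these assert properties that must always hold, so they can run deterministically in CI.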


Experiment Management: How to Organize Your Model Development Process

#artificialintelligence

In every project there is a phase where a business specification is created, usually covering the timeframe, budget, and goal of the machine learning project. When I say goal, I mean a set of KPIs, business metrics, or, if you are super lucky, machine learning metrics. At this stage it is very important to manage business expectations, but that's a story for another day. If you are interested in those things, I suggest you take a look at some articles by Cassie Kozyrkov, for instance, this one. Assuming that you and your team know what the business goal is, you can do initial research and cook up a baseline approach, a first creative idea.
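Once a baseline exists, the core of experiment management is recording every run's parameters and metrics so that runs can be compared later. A minimal sketch, assuming a plain JSON-lines log file; the parameter names and the AUC metric are illustrative, and real projects would typically use a tracking tool instead.

```python
# Log each experiment run (params + metrics + timestamp) as one JSON line,
# then pick the best run by its recorded metric.
import json
import tempfile
import time
from pathlib import Path

def log_run(path: Path, params: dict, metrics: dict) -> None:
    """Append one experiment record to a JSON-lines log."""
    record = {"timestamp": time.time(), "params": params, "metrics": metrics}
    with path.open("a") as f:
        f.write(json.dumps(record) + "\n")

runs = Path(tempfile.mkdtemp()) / "runs.jsonl"
log_run(runs, {"lr": 0.01, "depth": 6}, {"auc": 0.81})
log_run(runs, {"lr": 0.10, "depth": 6}, {"auc": 0.78})

records = [json.loads(line) for line in runs.read_text().splitlines()]
best = max(records, key=lambda r: r["metrics"]["auc"])
print(best["params"])  # {'lr': 0.01, 'depth': 6}
```

The append-only log means no run is ever silently overwritten, which is the property that makes experiments reproducible and comparable after the fact.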


Deep Learning Specialization by Andrew Ng – 21 Lessons Learned

@machinelearnbot

I recently completed all available material (as of October 25, 2017) for Andrew Ng's new deep learning course on Coursera. I found all 3 courses extremely useful and learned an incredible amount of practical knowledge from the instructor, Andrew Ng. Ng does an excellent job of filtering out the buzzwords and explaining the concepts in a clear and concise manner. For example, Ng makes it clear that supervised deep learning is nothing more than a multidimensional curve fitting procedure and that any other representational understandings, such as the common reference to the human biological nervous system, are loose at best. The specialization only requires basic linear algebra knowledge and basic programming knowledge in Python.
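Ng's "multidimensional curve fitting" framing can be illustrated with a toy one-dimensional example; the polynomial fit below is my illustrative stand-in for a trained network, not material from the course.

```python
# Toy illustration of supervised learning as curve fitting: fit a
# polynomial to noisy samples of a target function and measure error.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 50)
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(50)  # noisy targets

coeffs = np.polyfit(x, y, deg=5)   # "training": fit a degree-5 polynomial
y_hat = np.polyval(coeffs, x)      # "prediction" on the training inputs

mse = float(np.mean((y - y_hat) ** 2))
print(f"training MSE: {mse:.4f}")
```

A deep network does the same thing in many more dimensions, with a far more flexible function class in place of the polynomial.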