Collaborating Authors


Open-source language AI challenges big tech's models


[Image caption: Researchers have warned against possible harms from AI that processes and generates text. Credit: Getty]

An international team of around 1,000 largely academic volunteers has tried to break big tech's stranglehold on natural-language processing and reduce its harms. Trained with US$7-million-worth of publicly funded computing time, the BLOOM language model will rival in scale those made by the firms Google and OpenAI, but will be open-source. BLOOM will also be the first model of its scale to be multilingual. The collaboration, called BigScience, launched an early version of the model on 17 June, and hopes that it will ultimately help to reduce harmful outputs of artificial intelligence (AI) language systems. Models that recognize and generate language are increasingly used by big tech firms in applications from chatbots to translators, and can sound so eerily human that a Google engineer this month claimed that the firm's AI model was sentient (Google strongly denies that the AI possesses sentience).

Building a Python toolbox for robot behavior


If you've been subject to my posts on Twitter or LinkedIn, you may have noticed that I've done no writing in the last 6 months. Besides the whole… full-time job thing … this is also because at the start of the year I decided to focus on a larger coding project. At my previous job, I stood up a system for task and motion planning (TAMP) using the Toyota Human Support Robot (HSR). You can learn more in my 2020 recap post. While I'm certainly able to talk about that work, the code itself was closed in two different ways:

[Image caption: Rewind to 2020: The original simulation tool (left) and a generated Gazebo world with a Toyota HSR (right).]

So I thought, there are some generic utilities here that could be useful to the community.

Building Your First Image Classification Machine Learning Project


One common IoT project requirement is the need to detect the presence of something in an image. For example, a security system might need to detect potential intruders, a wildlife monitoring system might need to detect animals, or a facial recognition system might need to detect, well, faces. The issue of detecting things in images, or image classification, has historically been an advanced task, requiring a deep understanding of how both machine learning and a variety of mathematical processes work. The good news is that over the last few years a series of tools has made the image classification process far more approachable for the average developer. In this article, you'll learn how to build your first image classifier with Edge Impulse, and how to deploy that image classifier to a Raspberry Pi.

What It Means to Be an AI Developer in 2022 and Beyond


Picture fast-food restaurants being able to tailor the kinds of food they keep on hand depending on which cars flow through the drive-thru line. Picture advanced cameras being able to detect problematic porosity in finished car parts. Or radiologists who work with a virtual assistant to sift through X-rays and surface troublesome ones for a second look. These operational improvements all draw on artificial intelligence (AI), which is seeing an explosion in use cases in practically every industry. Where there is data, there are opportunities for efficiency--and even more opportunities for AI.

My experience working as a Technical Writer with FluxML


"One thing that open-source can't get enough of is documentation" -- Anonymous

This summer, I started working as a technical writer with FluxML under Julia Season of Contributions, and as expected, this experience was very different from writing code. At the beginning of the summer, I decided to take up a technical writer's job that involved writing documentation and tutorials for Machine Learning. At the same time, I was learning Julia, and the FluxML ecosystem sounded like a perfect place for me. I applied for the position through Google Season of Docs but unfortunately couldn't get in because of limited openings. Fortunately, Julia Language decided to fund me for the next few months to work on FluxML under Julia Season of Contributions!

GitHub - Nneji123/Serving-Machine-Learning-Models


This repository contains instructions, template source code and examples on how to serve/deploy machine learning models using various frameworks and applications such as Docker, Flask, FastAPI, BentoML, Streamlit, MLflow and even code on how to deploy your machine learning model as an Android app. The repository also has code and how-to's for deploying your apps to various cloud platforms (AWS, Heroku, Vercel etc), working with GitHub Actions for CI/CD (Continuous Integration and Continuous Deployment), TDD (Test-Driven Development) with pytest and other useful information. Before we get into building and deploying our models we'll have to set up our environment. I use 'pyenv' for managing different versions of Python and pyenv-virtualenv for setting up virtual environments. You can learn how to install pyenv on your operating system by checking out their official GitHub repository.

The Hidden Governance in AI


Measurement modeling could further the government's understanding of AI policymaking tools. Governments are increasingly using artificial intelligence (AI) systems to support policymaking, deliver public services, and manage internal people and processes. AI systems in public-facing services range from predictive machine-learning systems used in fraud and benefit determinations to chatbots used to communicate with the public about their rights and obligations across a range of settings. The integration of AI into agency decision-making processes that affect the public's rights poses unique challenges for agencies. System design decisions about training data, model design, thresholds, and interface design can set policy--thereby affecting the public's rights.

PyGAD - Python Genetic Algorithm! -- PyGAD 2.17.0 documentation


PyGAD is an open-source Python library for building the genetic algorithm and optimizing machine learning algorithms. It works with Keras and PyTorch. PyGAD supports different types of crossover, mutation, and parent selection operators. PyGAD allows different types of problems to be optimized using the genetic algorithm by customizing the fitness function. Besides building the genetic algorithm, it builds and optimizes machine learning algorithms.
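
To make the idea of customizing the fitness function concrete, here is a minimal sketch of a genetic algorithm in plain Python. It does not use PyGAD or reproduce its API; the names `fitness` and `run_ga`, the input values, the target, and every parameter default are assumptions chosen for illustration. The operators are single-point crossover, Gaussian mutation, and keeping the fitter half of the population as parents.

```python
import random


def fitness(solution, inputs, target):
    """Hypothetical fitness: higher when the weighted sum is closer to the target."""
    output = sum(w * x for w, x in zip(solution, inputs))
    return 1.0 / (abs(output - target) + 1e-6)


def run_ga(fitness_func, num_genes, pop_size=20, generations=100,
           mutation_rate=0.2, seed=0):
    """Evolve a list of floats that maximizes fitness_func (needs num_genes >= 2)."""
    rng = random.Random(seed)
    # Random initial population of candidate solutions.
    population = [[rng.uniform(-4, 4) for _ in range(num_genes)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Parent selection: keep the fitter half unchanged.
        ranked = sorted(population, key=fitness_func, reverse=True)
        parents = ranked[:pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Single-point crossover between two parents.
            cut = rng.randrange(1, num_genes)
            child = a[:cut] + b[cut:]
            # Gaussian mutation applied gene by gene.
            child = [g + rng.gauss(0, 0.5) if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = parents + children
    return max(population, key=fitness_func)


# Illustrative problem: find weights whose weighted sum of `inputs` is near `target`.
inputs = [4, -2, 3.5, 5, -11, -4.7]
target = 44
best = run_ga(lambda s: fitness(s, inputs, target), num_genes=len(inputs))
print("weighted sum:", sum(w * x for w, x in zip(best, inputs)))
```

Swapping in a different `fitness` function is the only change needed to target a different problem, which is the customization point the description refers to. Because the parents are carried over unchanged, the best fitness in the population never decreases from one generation to the next.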

Documentation Matters: Human-Centered AI System to Assist Data Science Code Documentation in Computational Notebooks


Computational notebooks allow data scientists to express their ideas through a combination of code and documentation. However, data scientists often pay attention only to the code, and neglect creating or updating their documentation during quick iterations. Inspired by human documentation practices learned from 80 highly-voted Kaggle notebooks, we design and implement Themisto, an automated documentation generation system to explore how human-centered AI systems can support human data scientists in the machine learning code documentation scenario. Themisto facilitates the creation of documentation via three approaches: a deep-learning-based approach to generate documentation for source code, a query-based approach to retrieve online API documentation for source code, and a user prompt approach to nudge users to write documentation. We evaluated Themisto in a within-subjects experiment with 24 data science practitioners, and found that automated documentation generation techniques reduced the time for writing documentation, reminded participants to document code they would have ignored, and improved participants' satisfaction with their computational notebook.

Building a Fast Interactive Dashboard in Jupyter through Gradio


Some days ago, I discovered a very interesting Python package named Gradio. According to its authors, Gradio makes it possible to build demos for Machine Learning. The package is used by machine learning teams at Google, Facebook, and Amazon. Thus, I decided to study this package and build a little demo. While reading the documentation, I was very pleased to discover an interesting feature that other similar packages, such as Streamlit, do not provide.