Information Technology

Deep Learning at Scale with PyTorch, Azure Databricks, and Azure Machine Learning


PyTorch is a popular open source machine learning framework. PyTorch is ideal for deep learning applications such as computer vision and natural language processing. MLflow is an open source platform for the end-to-end machine learning lifecycle. Delta Lake is an open source storage layer that brings reliability to data lakes. Azure Databricks is the first-party Databricks service on Azure that provides massive scale data engineering and collaborative data science.

AI algorithms to improve Air Force navigation capabilities - Military Embedded Systems


AI.Reverie, a company specializing in synthetic data for improved artificial intelligence (AI), announced that it has won a $1.5 million Phase 2 Small Business Innovation Research (SBIR) contract from AFWERX to build AI algorithms and improve navigation capabilities for the U.S. Air Force. According to the company, AI.Reverie will be supporting the 7th Bomb Wing at Dyess Air Force Base through their Rapid Capabilities office by leveraging synthetic data to train and improve the accuracy of vision algorithms for navigation. The use of synthetic data, or computer-generated images, aims to solve the resource barriers associated with real data: the high cost and slow turnaround of hand-labeled photos stall deployment of vision algorithms needed to save lives. AI.Reverie's Phase 2 SBIR contract closely follows its co-publication with the IQT Lab CosmiQ Works of a paper highlighting the value of synthetic data to train computer vision algorithms. The research partners also released RarePlanes, the largest open dataset of real and synthetic overhead imagery for academic and commercial use.

Fujitsu Develops AI Tech for High-Dimensional Data Without Labeled Training Data


In recent years, there has been a surge in demand for AI-driven big data analysis in various business fields. AI is also expected to help support the detection of anomalies in data to reveal things like unauthorized attempts to access networks, or abnormalities in medical data for thyroid values or arrhythmia data. Data used in many business operations is high-dimensional data. As the number of dimensions of data increases, the complexity of calculations required to accurately characterize the data increases exponentially, a phenomenon widely known as the "Curse of Dimensionality"(1). In recent years, a method of reducing the dimensions of input data using deep learning has been identified as a promising candidate for helping to avoid this problem. However, since the number of dimensions is reduced without considering the data distribution and probability of occurrence after the reduction, the characteristics of the data have not been accurately captured, and the recognition accuracy of the AI is limited and misjudgment can occur (Figure 1). Solving these problems and accurately acquiring the distribution and probability of high-dimensional data remain important issues in the AI field.

Letting robots manipulate cables


For humans, it can be challenging to manipulate thin flexible objects like ropes, wires, or cables. But if these problems are hard for humans, they are nearly impossible for robots. As a cable slides between the fingers, its shape is constantly changing, and the robot's fingers must be constantly sensing and adjusting the cable's position and motion. Standard approaches have used a series of slow and incremental deformations, as well as mechanical fixtures, to get the job done. Recently, a group of researchers from MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL) pursued the task from a different angle, in a manner that more closely mimics us humans.

Why Your Business Needs Artificial Intelligence Consulting


Artificial Intelligence (AI) consulting can help make the integration of AI solutions a seamless experience by providing solutions tailored to your needs and specifications. Even though a large number of enterprises are jumping on the AI bandwagon, some are still refraining from implementing AI. According to O'Reilly, about 17% of enterprises find it difficult to identify appropriate use cases for AI, and 23% haven't even realized the need for AI implementation. These hurdles can be mitigated by employing AI consulting services to understand the potential benefits of AI in the workplace and ways to integrate AI solutions without affecting current infrastructure and procedures. AI consulting can provide enterprises with a roadmap for AI implementation in the workplace.

Approximation spaces of deep neural networks


We study the expressivity of deep neural networks. Measuring a network's complexity by its number of connections or by its number of neurons, we consider the class of functions for which the error of best approximation with networks of a given complexity decays at a certain rate when increasing the complexity budget. Using results from classical approximation theory, we show that this class can be endowed with a (quasi)-norm that makes it a linear function space, called approximation space. We establish that allowing the networks to have certain types of "skip connections" does not change the resulting approximation spaces. We also discuss the role of the network's nonlinearity (also known as activation function) on the resulting spaces, as well as the role of depth. For the popular ReLU nonlinearity and its powers, we relate the newly constructed spaces to classical Besov spaces. The established embeddings highlight that some functions of very low Besov smoothness can nevertheless be well approximated by neural networks, if these networks are sufficiently deep.
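The class of functions described above can be written down using the standard construction from classical approximation theory. The following is a sketch in assumed notation that does not appear in the excerpt: $X$ is the ambient function space, $\Sigma_n$ is the set of functions realized by networks of complexity at most $n$, $\alpha > 0$ is the decay rate, and $0 < q \le \infty$ is a secondary fine-tuning parameter.

```latex
% Error of best approximation of f by networks of complexity at most n
E(f, \Sigma_n)_X := \inf_{g \in \Sigma_n} \| f - g \|_X

% Approximation class: functions whose approximation error decays roughly like n^{-\alpha}
A^{\alpha}_{q}(X, \Sigma) := \left\{ f \in X : \| f \|_{A^{\alpha}_{q}} < \infty \right\},
\qquad
\| f \|_{A^{\alpha}_{q}} :=
\begin{cases}
\left( \sum_{n \ge 1} \left[ n^{\alpha} \, E(f, \Sigma_{n-1})_X \right]^{q} \frac{1}{n} \right)^{1/q}, & 0 < q < \infty, \\[1ex]
\sup_{n \ge 1} \, n^{\alpha} \, E(f, \Sigma_{n-1})_X, & q = \infty.
\end{cases}
```

For $q = \infty$, membership in the class simply means that the best approximation error is bounded by a constant times $n^{-\alpha}$.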

Global Big Data Conference


One of the fastest-evolving branches of computer science, AI (short for artificial intelligence) focuses on developing machines and algorithms that can simulate human thinking. Although its implementations were few and far between until recently, it has since found uses in many different areas. The main benefits of AI stem from its ability to learn: the more data it is fed, the more accurate it becomes. This makes it well suited to automated tasks that require precision, since it cannot get tired or worn out as humans can. On the other hand, its biggest drawback is the high cost of implementation, especially for inexperienced users.

Is your model overfitting? Or maybe underfitting? An example using a neural network in python


Underfitting means that our ML model can neither model the training data nor generalize to new unseen data. A model that underfits the data will have poor performance on the training data. For example, in a scenario where someone would use a linear model to capture non-linear trends in the data, the model would underfit the data. A textbook case of underfitting is when the model's error on both the training and test sets (i.e. during training and testing) is very high. It is obvious that there is a trade-off between overfitting and underfitting.
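The linear-model scenario above can be illustrated with a minimal sketch. It assumes synthetic quadratic data and uses NumPy polynomial fits as a stand-in for the neural network named in the headline; the data, the split sizes, and the helper name `fit_and_score` are all illustrative choices, not taken from the article.

```python
import numpy as np

# Synthetic data with a non-linear (quadratic) trend plus a little noise.
rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, size=200)
y = x**2 + rng.normal(scale=0.1, size=200)

# Simple hold-out split: first 150 points for training, the rest for testing.
x_train, x_test = x[:150], x[150:]
y_train, y_test = y[:150], y[150:]


def fit_and_score(degree):
    """Fit a polynomial of the given degree; return (train MSE, test MSE)."""
    coeffs = np.polyfit(x_train, y_train, deg=degree)
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    return train_mse, test_mse


# Degree 1 is a straight line: too simple for the quadratic trend, so it underfits.
linear_train, linear_test = fit_and_score(1)
# Degree 2 matches the shape of the data.
quad_train, quad_test = fit_and_score(2)

print(f"linear:    train MSE={linear_train:.3f}, test MSE={linear_test:.3f}")
print(f"quadratic: train MSE={quad_train:.3f}, test MSE={quad_test:.3f}")
```

The straight line shows a high error on both the training and the test split, which is the signature of underfitting described above, while the quadratic fit brings both errors down to roughly the noise level.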

Supercharge Your Shallow ML Models With Hummingbird


Since the most recent resurgence of deep learning in 2012, a great many new ML libraries and frameworks have been created. The ones that have stood the test of time (PyTorch, TensorFlow, ONNX, etc.) are backed by massive corporations, and likely aren't going away anytime soon. This also presents a problem, however, as the deep learning community has diverged from popular traditional ML software libraries like scikit-learn, XGBoost, and LightGBM. When it comes time for companies to bring multiple models with different software and hardware assumptions into production, things get…hairy. Using microservices in Kubernetes can solve the design pattern issue to an extent by keeping things de-coupled…if that's even what you want?

ICML 2020 Test of Time award


The International Conference on Machine Learning (ICML) Test of Time award is given to a paper from ICML ten years ago that has had significant impact. This year the award goes to Niranjan Srinivas, Andreas Krause, Sham Kakade and Matthias Seeger for their work "Gaussian Process Optimization in the Bandit Setting: No Regret and Experimental Design". This paper brought together the fields of Bayesian optimization, bandits and experimental design by analyzing Gaussian process bandit optimization, giving a novel approach to derive finite-sample regret bounds in terms of a mutual information gain quantity. This paper has had profound impact over the past ten years, including the method itself, the proof techniques used, and the practical results. These have all enriched our community by sparking creativity in myriad subsequent works, ranging from theory to practice.