Deploy Deep Learning Models Using Streamlit and Heroku


Deep Learning and Machine Learning models trained by many data professionals often end up in an inference.ipynb notebook and go no further. Those meticulous model architectures, capable of creating awe in the real world, never see the light of day. At best, the models sit in the background processing requests via an API gateway, doing their job silently and making the system more intelligent. People using those intelligent systems don't always credit the data professionals who spent hours, weeks, or months collecting data, cleaning the collected data, formatting the data for correct use, writing the model architecture, training that architecture, and validating it, and who, when the validation metrics are not good enough, go back to square one and repeat the cycle.
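As a concrete illustration of the deployment pattern described above, a Streamlit app is commonly pushed to Heroku with two small config files. The sketch below follows the widely used community pattern; the entry-point name `app.py` is an assumption, not something specified in the article:

```shell
# Procfile -- tells Heroku how to start the web dyno:
#   web: sh setup.sh && streamlit run app.py

# setup.sh -- points Streamlit at the port Heroku assigns at runtime
mkdir -p ~/.streamlit/
cat > ~/.streamlit/config.toml <<EOF
[server]
headless = true
port = $PORT
enableCORS = false
EOF
```

With these two files alongside a `requirements.txt`, a `git push heroku main` is typically enough to get the model's UI live.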

Top 6 Deep Learning Models You Should Master for Killer AI Applications


The field of deep learning has gained popularity with the rise of available processing power, storage space, and big data. Instead of using traditional machine learning models, AI engineers have been gradually switching to deep learning models. Where data is abundant, deep learning models almost always outperform traditional machine learning models. Therefore, as we collect more data with every passing year, it makes sense to use deep learning models. Furthermore, the field of deep learning itself is growing fast.

DeepPayload: Black-box Backdoor Attack on Deep Learning Models through Neural Payload Injection Artificial Intelligence

Deep learning models are increasingly used in mobile applications as critical components. Unlike program bytecode, whose vulnerabilities and threats have been widely discussed, whether and how the deep learning models deployed in applications can be compromised is not well understood, since neural networks are usually viewed as a black box. In this paper, we introduce a highly practical backdoor attack achieved with a set of reverse-engineering techniques over compiled deep learning models. The core of the attack is a neural conditional branch, constructed with a trigger detector and several operators, that is injected into the victim model as a malicious payload. The attack is effective because the conditional logic can be flexibly customized by the attacker, and scalable because it does not require any prior knowledge of the original model. We evaluated the attack's effectiveness using 5 state-of-the-art deep learning models and real-world samples collected from 30 users. The results demonstrated that the injected backdoor can be triggered with a success rate of 93.5%, while introducing less than 2 ms of latency overhead and no more than a 1.4% accuracy decrease. We further conducted an empirical study on real-world mobile deep learning apps collected from Google Play. We found 54 apps that were vulnerable to our attack, including popular and security-critical ones. The results call for awareness among deep learning application developers and auditors so that deployed models are better protected.
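The injected payload is easiest to picture as a conditional branch gating two predictors. The toy sketch below re-creates that logic in plain NumPy; it is not the paper's byte-level injection into compiled models, and `victim_model` and `trigger_detector` are hypothetical stand-ins:

```python
import numpy as np

def victim_model(x):
    # Hypothetical stand-in for the original (benign) classifier.
    logits = np.array([x.sum(), -x.sum()])  # toy 2-class model
    return int(np.argmax(logits))

def trigger_detector(x):
    # Hypothetical stand-in for the injected detector; here it "fires"
    # on a specific pixel pattern in the input.
    return float(x[0] > 0.9)

def backdoored_forward(x, target_class=1):
    # The injected conditional branch: if the trigger detector fires,
    # emit the attacker's target label; otherwise fall through to the
    # victim model's own prediction.
    if trigger_detector(x) > 0.5:
        return target_class
    return victim_model(x)
```

On clean inputs the backdoored model behaves exactly like the original, which is why such a payload is hard to notice from accuracy alone.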

An Empirical Comparison of Deep Learning Models for Knowledge Tracing on Large-Scale Dataset Artificial Intelligence

Knowledge tracing (KT) is the problem of modeling each student's mastery of knowledge concepts (KCs) as (s)he engages with a sequence of learning activities. It is an active research area that helps provide learners with personalized feedback and materials. Various deep learning techniques have been proposed for solving KT. The recent release of a large-scale student performance dataset, EdNet \cite{choi2019ednet}, motivates an analysis of the performance of the deep learning approaches that have been proposed to solve KT. Our analysis can help in understanding which method to adopt when a large dataset of student performance is available. We also show that incorporating contextual information, such as the relations between exercises and student forgetting behavior, further improves the performance of deep learning models.

Deeplite Neutrino: An End-to-End Framework for Constrained Deep Learning Model Optimization Artificial Intelligence

Designing deep learning-based solutions is becoming a race to train deeper models with a greater number of layers. While a large, deep model can provide competitive accuracy, it creates a host of logistical challenges and unreasonable resource requirements during development and deployment. This has been one of the key reasons deep learning models are not used extensively in various production environments, especially on edge devices. There is an immediate need to optimize and compress these deep learning models to enable on-device intelligence. In this research, we introduce a black-box framework, Deeplite Neutrino, for production-ready optimization of deep learning models. The framework provides an easy mechanism for end-users to supply constraints, such as a tolerable drop in accuracy or a target size for the optimized models, to guide the whole optimization process. The framework is easy to include in an existing production pipeline and is available as a Python package supporting the PyTorch and TensorFlow libraries. The optimization performance of the framework is shown across multiple benchmark datasets and popular deep learning models. Further, the framework is currently used in production, and the results and testimonials from several clients are summarized.

Do We Really Need Deep Learning Models for Time Series Forecasting? Machine Learning

Time series forecasting is a crucial task in machine learning, as it has a wide range of applications, including but not limited to forecasting electricity consumption, traffic, and air quality. Traditional forecasting models rely on rolling averages, vector auto-regression, and auto-regressive integrated moving averages. On the other hand, deep learning and matrix factorization models have recently been proposed to tackle the same problem with more competitive performance. However, one major drawback of such models is that they tend to be overly complex in comparison to traditional techniques. In this paper, we ask whether these highly complex deep learning models are without alternative. We aim to enrich the pool of simple but powerful baselines by revisiting gradient boosting regression trees for time series forecasting. Specifically, we reconfigure the way time series data is handled by gradient tree boosting models in a windowed fashion similar to the deep learning models. For each training window, the target values are concatenated with external features and then flattened to form one input instance for a multi-output gradient boosting regression tree model. We conducted a comparative study on nine datasets for eight state-of-the-art deep learning models presented at top-level conferences in recent years. The results demonstrate that the proposed approach outperforms all of the state-of-the-art models.
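The windowing step the abstract describes can be sketched as follows. `make_windows` is an illustrative name, not the paper's code; the resulting (X, y) pairs would then be fed to a multi-output tree ensemble, e.g. scikit-learn's `GradientBoostingRegressor` wrapped in `MultiOutputRegressor`:

```python
import numpy as np

def make_windows(series, window, horizon):
    # Slide a fixed-size window over the series: each window of past values
    # becomes one flattened input row, and the next `horizon` values become
    # its multi-step forecasting target.
    X, y = [], []
    for t in range(len(series) - window - horizon + 1):
        X.append(series[t:t + window])
        y.append(series[t + window:t + window + horizon])
    return np.array(X), np.array(y)

# Toy series 0..9: with window=3 and horizon=2, the first training pair is
# X[0] = [0, 1, 2] -> y[0] = [3, 4].
X, y = make_windows(np.arange(10, dtype=float), window=3, horizon=2)
```

External features, when present, would simply be concatenated onto each row of `X` before training, as the abstract notes.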

A Multi-modal Deep Learning Model for Video Thumbnail Selection Artificial Intelligence

A thumbnail is the face of an online video. The explosive growth of videos, both in number and variety, underpins the importance of a good thumbnail, because it saves potential viewers time in choosing videos and can even entice them to click. A good thumbnail should be a frame that best represents the content of a video while at the same time capturing viewers' attention. However, past techniques and models focus only on frames within a video, and we believe such a narrow focus leaves out much useful information that is part of a video. In this paper, we expand the definition of content to include the title, description, and audio of a video and utilize the information provided by these modalities in our selection model. Specifically, our model first samples frames uniformly in time and keeps the top 1,000 frames in this subset with the highest aesthetic scores, as rated by a Double-column Convolutional Neural Network, to avoid the computational burden of processing all frames in downstream tasks. Then, the model incorporates frame features extracted from VGG16, text features from ELECTRA, and audio features from TRILL. These models were selected because of their results on popular datasets as well as their competitive performance. After feature extraction, the time-series features (frames and audio) are fed into Transformer encoder layers, which return a vector representing the corresponding modality. Each of the four features (frames, title, description, audio) passes through a context gating layer before concatenation. Finally, our model generates a vector in the latent space and selects the frame most similar to this vector in the latent space. To the best of our knowledge, we are the first to propose a multi-modal deep learning model for video thumbnail selection, and it beats the results of the previous state-of-the-art models.
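The final selection step, picking the frame nearest the fused vector in latent space, reduces to a cosine-similarity argmax. A minimal sketch with illustrative names and toy 2-D embeddings standing in for the model's real latent vectors:

```python
import numpy as np

def select_thumbnail(frame_vectors, query_vector):
    # Normalize every frame embedding and the fused multi-modal query vector,
    # then return the index of the frame with the highest cosine similarity.
    f = frame_vectors / np.linalg.norm(frame_vectors, axis=1, keepdims=True)
    q = query_vector / np.linalg.norm(query_vector)
    return int(np.argmax(f @ q))
```

In the paper's pipeline the `frame_vectors` would come from the frame branch and `query_vector` from the concatenated, gated multi-modal features; here both are just placeholders.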

Benchmarking Inference Performance of Deep Learning Models on Analog Devices Artificial Intelligence

Deep learning models implemented on analog hardware are promising for computation- and energy-constrained systems such as edge computing devices. However, the analog nature of such devices and the many associated noise sources will cause changes to the values of the weights in the trained deep learning models deployed on them. In this study, a systematic evaluation of the inference performance of trained, popular deep learning models for image classification deployed on analog devices has been carried out, where additive white Gaussian noise was added to the weights of the trained models during inference. It is observed that deeper models, and models with more redundancy in their design such as VGG, are generally more robust to the noise. However, performance is also affected by the design philosophy of the model, the detailed structure of the model, the exact machine learning task, and the dataset.
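The evaluation protocol, adding white Gaussian noise to the trained weights and then measuring inference accuracy, can be sketched like this. The function name is illustrative, and scaling the noise by each tensor's own standard deviation is one common convention; the paper's exact noise model may differ:

```python
import numpy as np

def perturb_weights(weights, noise_level, rng):
    # Simulate analog imprecision: add zero-mean Gaussian noise to each
    # weight tensor, with standard deviation proportional to that tensor's
    # own standard deviation.
    return [w + rng.normal(0.0, noise_level * w.std(), size=w.shape)
            for w in weights]

rng = np.random.default_rng(0)
w = [np.arange(16, dtype=float).reshape(4, 4)]
noisy = perturb_weights(w, noise_level=0.1, rng=rng)
```

Running the unmodified evaluation loop on `noisy` instead of `w` at several `noise_level` values yields the robustness curves the study reports.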

Honey I Shrunk the Model: Why Big Machine Learning Models Must Go Small


Bigger is not always better for machine learning. Yet deep learning models and the datasets on which they're trained keep expanding as researchers race to outdo one another in chasing state-of-the-art benchmarks. However groundbreaking they are, bigger models have severe consequences for budgets and the environment alike. For example, GPT-3, this summer's massive, buzzworthy model for natural language processing, reportedly cost $12 million to train. What's worse, UMass Amherst researchers found that the computing power required to train a large AI model can produce over 600,000 pounds of CO2 emissions, roughly five times the emissions of a typical car over its lifespan.

The Five Major Platforms For Machine Learning Model Development


Over the past two decades, the biggest evolution of Artificial Intelligence has been the maturation of deep learning as an approach to machine learning, the expansion of big data and the knowledge of how to effectively manage big data systems, and the arrival of affordable, accessible compute power that can handle even the most challenging machine learning model development. Today's data scientists and machine learning engineers have a wide range of choices for how they build models to address the various patterns of AI for their particular needs. However, that diversity of options is actually part of the challenge for those looking to build machine learning models: there are simply too many choices. The problem is compounded by the fact that there are many different ways to go about developing a machine learning model, and by the fact that many AI software vendors do a particularly poor job of explaining what their products actually do.