Instructional Material
Machine Learning Tutorial for Beginners - Great Learning
Let us start with an easy example: say you are teaching a child to tell dogs from cats. How would you do it? You might show the child a dog and say "here is a dog", and when you encounter a cat you would point it out as a cat. After seeing enough dogs and cats, the child may learn to differentiate between them. If trained well, the child may even be able to recognize breeds of dogs never seen before. Similarly, in Supervised Learning, we have two sets of variables.
Learn Machine Learning Maths Behind - Development
Machine learning and the world of artificial intelligence (AI) are no longer science fiction. Get started with the new breed of software that is able to learn without being explicitly programmed. Machine learning can access, analyze, and find patterns in Big Data in a way that is beyond human capabilities. The business advantages are huge, and the market is expected to be worth $47 billion or more by 2020. In this course, you will implement your own custom algorithm on top of SAP's HANA database, an in-memory database capable of performing large calculations over large data sets. We are going to use native SQL to write the Naive Bayes algorithm.
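To make the Naive Bayes idea concrete, here is a minimal sketch of a categorical Naive Bayes classifier. It is written in plain Python rather than the native SQL the course uses, and it is not the course's implementation. The toy weather data, the function names, and the add-one (Laplace) smoothing are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(rows, labels):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (feature index, label) -> value counts
    vocab = defaultdict(set)             # feature index -> distinct values seen
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            vocab[i].add(value)
    return class_counts, value_counts, vocab

def predict(model, row):
    """Return the label with the highest log-posterior score."""
    class_counts, value_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in class_counts.items():
        score = math.log(n / total)  # log prior
        for i, value in enumerate(row):
            count = value_counts[(i, label)][value]
            # Add-one smoothing keeps unseen values from zeroing the product.
            score += math.log((count + 1) / (n + len(vocab[i])))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical toy data: (outlook, temperature) -> play outside?
rows = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"),
        ("rainy", "cool"), ("overcast", "hot"), ("overcast", "cool")]
labels = ["no", "no", "yes", "yes", "yes", "yes"]

model = train_naive_bayes(rows, labels)
print(predict(model, ("sunny", "hot")))
print(predict(model, ("rainy", "cool")))
```

Training only counts occurrences, so the same logic can in principle be expressed as grouped counts in SQL, which is presumably why the algorithm suits a database implementation.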
Information-Theoretic Methods for Trustworthy Machine Learning
Machine learning has enabled tremendously exciting technologies, but at the same time it raises questions as to how it should be deployed in a responsible and trustworthy manner. How can machine learning be made secure, reliable, robust, fair, and private? This workshop will explore the information-theoretic foundations of these aspects of machine learning. The workshop will include invited talks by experts on these topics from both academia and industry, student poster presentations, and time for fruitful discussions. Keynote talks will be given by Tara Javidi, Ilya Mironov, Todd Coleman, and Ayfer Ozgur.
MLflow Empowering AI Training. MLflow is an open-source platform to…
Artificial intelligence (AI) is intelligence -- perceiving, synthesizing, and inferring information -- demonstrated by machines. Today, AI is no longer an arcane technology confined to a science lab. Instead, amateurs have it at their fingertips and can create decent artwork, generate sophisticated conversation, and perform other intelligent tasks using DALL·E, Stable Diffusion, GPT-3, ChatGPT, Point·E, Whisper, etc. Have you ever wondered how a realistic image is generated from a natural language description? The intelligence comes from Machine Learning (ML), the study of computer algorithms that can improve automatically through experience and by the use of data. These textbook algorithms are publicly available and ready to be used.
Student Engagement Detection Using Emotion Analysis, Eye Tracking and Head Movement with Machine Learning
Sharma, Prabin, Joshi, Shubham, Gautam, Subash, Maharjan, Sneha, Khanal, Salik Ram, Reis, Manuel Cabral, Barroso, João, Filipe, Vítor Manuel de Jesus
With the increase of distance learning, in general, and e-learning, in particular, having a system capable of determining the engagement of students is of paramount importance, and one of the biggest challenges for teachers, researchers and policy makers alike. Here, we present a system to detect the engagement level of the students. It uses only information provided by the typical built-in web-camera present in a laptop computer, and was designed to work in real time. We combine information about the movements of the eyes and head, and facial emotions to produce a concentration index with three classes of engagement: "very engaged", "nominally engaged" and "not engaged at all". The system was tested in a typical e-learning scenario, and the results show that it correctly identifies each period of time where students were "very engaged", "nominally engaged" and "not engaged at all". Additionally, the results also show that the students with the best scores also have higher concentration indexes.
Applications of statistical causal inference in software engineering
This paper focuses on the application of one type of empirical method, namely statistical causal inference (SCI, see section 2). Such methods have their roots in a number of applied fields (from AI to econometrics) and aim to provide a framework for making valid inferences about causal effects based on interventional or observational data. More specifically, we focus on SCI methods that use graphical models as developed by Pearl and colleagues [1, 2]. This framework has been shown to be equivalent to the potential-outcomes framework (also called the Neyman-Rubin Causal Model [3]) but enriches it by making use of an explicit causal structure called a graphical causal model. Making assumptions about causal effects explicit through a graphical structure has several advantages. First, it helps determine whether causal effects can be estimated and how they might be estimated (see section 2).
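As a small illustration of how a graphical structure tells us how an effect might be estimated, the sketch below applies the standard backdoor adjustment, P(Y | do(X=x)) = Σ_z P(Y | x, z) P(z), to a three-variable graph Z → X, Z → Y, X → Y. The graph, the variable names, and every probability in it are invented for this example and do not come from the paper.

```python
# Hypothetical binary variables: Z (confounder), X (treatment), Y (outcome).
p_z = {0: 0.5, 1: 0.5}                      # P(Z=z)
p_x1_given_z = {0: 0.2, 1: 0.8}             # P(X=1 | Z=z)
p_y1_given_xz = {(0, 0): 0.1, (1, 0): 0.3,  # P(Y=1 | X=x, Z=z), keyed by (x, z)
                 (0, 1): 0.5, (1, 1): 0.7}

def p_x_given_z(x, z):
    """P(X=x | Z=z) for binary X."""
    return p_x1_given_z[z] if x == 1 else 1.0 - p_x1_given_z[z]

def observational(x):
    """P(Y=1 | X=x): plain conditioning, which mixes in the influence of Z."""
    num = sum(p_y1_given_xz[(x, z)] * p_x_given_z(x, z) * p_z[z] for z in p_z)
    den = sum(p_x_given_z(x, z) * p_z[z] for z in p_z)
    return num / den

def interventional(x):
    """P(Y=1 | do(X=x)) via backdoor adjustment over Z."""
    return sum(p_y1_given_xz[(x, z)] * p_z[z] for z in p_z)

naive_effect = observational(1) - observational(0)
adjusted_effect = interventional(1) - interventional(0)
print(naive_effect, adjusted_effect)
```

In this made-up model the treatment raises the outcome probability by 0.2 within each stratum of Z. The adjusted estimate recovers that value, while the naive difference is inflated because Z drives both X and Y.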
Learning and Verification of Task Structure in Instructional Videos
Narasimhan, Medhini, Yu, Licheng, Bell, Sean, Zhang, Ning, Darrell, Trevor
Given the enormous number of instructional videos available online, learning a diverse array of multi-step task models from videos is an appealing goal. We introduce a new pre-trained video model, VideoTaskformer, focused on representing the semantics and structure of instructional videos. We pre-train VideoTaskformer using a simple and effective objective: predicting weakly supervised textual labels for steps that are randomly masked out from an instructional video (masked step modeling). Compared to prior work which learns step representations locally, our approach involves learning them globally, leveraging video of the entire surrounding task as context. From these learned representations, we can verify if an unseen video correctly executes a given task, as well as forecast which steps are likely to be taken after a given step. We introduce two new benchmarks for detecting mistakes in instructional videos, to verify if there is an anomalous step and if steps are executed in the right order. We also introduce a long-term forecasting benchmark, where the goal is to predict long-range future steps from a given step. Our method outperforms previous baselines on these tasks, and we believe the tasks will be a valuable way for the community to measure the quality of step representations. Additionally, we evaluate VideoTaskformer on 3 existing benchmarks -- procedural activity recognition, step classification, and step forecasting -- and demonstrate on each that our method outperforms existing baselines and achieves new state-of-the-art performance.
Python Reinforcement Learning using OpenAI Gymnasium – Full Course
Learn the basics of reinforcement learning and how to implement it using Gymnasium (previously called OpenAI Gym). Gymnasium is an open source Python library originally created by OpenAI that provides a collection of pre-built environments for reinforcement learning agents. It provides a standard API to communicate between learning algorithms and environments, as well as a standard set of environments compliant with that API. Reinforcement learning is an area of machine learning concerned with how intelligent agents ought to take actions in an environment in order to maximize the notion of cumulative reward.
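The agent-environment loop described above can be sketched as follows. To stay self-contained, the example uses a hypothetical stand-in environment, `CorridorEnv`, that only mimics the shape of the Gymnasium API: `reset` returns `(observation, info)` and `step` returns `(observation, reward, terminated, truncated, info)`. It is not part of Gymnasium. With the library installed, an environment created by `gymnasium.make("CartPole-v1")` could be driven by the same kind of loop.

```python
class CorridorEnv:
    """Hypothetical toy environment: walk right from cell 0 to the last cell."""

    def __init__(self, length=5, max_steps=20):
        self.length = length
        self.max_steps = max_steps
        self.position = 0
        self.steps = 0

    def reset(self, seed=None):
        # seed is accepted for API-shape compatibility; this toy is deterministic.
        self.position = 0
        self.steps = 0
        return self.position, {}

    def step(self, action):
        # action 1 moves right, any other action moves left (clamped to the corridor).
        move = 1 if action == 1 else -1
        self.position = max(0, min(self.length - 1, self.position + move))
        self.steps += 1
        terminated = self.position == self.length - 1            # goal reached
        truncated = not terminated and self.steps >= self.max_steps  # time limit
        reward = 1.0 if terminated else -0.1
        return self.position, reward, terminated, truncated, {}

def run_episode(env, policy, seed=None):
    """Run one episode and return the cumulative (undiscounted) reward."""
    obs, info = env.reset(seed=seed)
    total = 0.0
    done = False
    while not done:
        obs, reward, terminated, truncated, info = env.step(policy(obs))
        total += reward
        done = terminated or truncated
    return total

env = CorridorEnv()
return_right = run_episode(env, lambda obs: 1)  # always step right
return_left = run_episode(env, lambda obs: 0)   # always step left
print(return_right, return_left)
```

The always-right policy reaches the goal in four steps, while the always-left policy runs into the step limit. Comparing the two returns is the kind of signal a learning algorithm uses to prefer one behavior over another.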
CH-Go: Online Go System Based on Chunk Data Storage
Lu, H., Li, C., Yang, Y., Li, C., Islam, A.
The training and running of an online Go system require the support of effective data management systems to deal with vast data, such as the initial Go game records, the feature data set obtained by representation learning, the experience data set of self-play, the randomly sampled Monte Carlo tree, and so on. Previous work has rarely mentioned this problem, but the ability and efficiency of data management systems determine the accuracy and speed of the Go system. To tackle this issue, we propose an online Go game system based on the chunk data storage method (CH-Go), which processes the format of 160k Go game data released by Kiseido Go Server (KGS) and designs a Go encoder with 11 planes, a parallel processor and generator for better memory performance. Specifically, we store the data in chunks, take the chunk size of 1024 as a batch, and save the features and labels of each chunk as binary files. Then a small set of data is randomly sampled each time for the neural network training, which is accessed by batch through the yield method. The training part of the prototype includes three modules: a supervised learning module, a reinforcement learning module, and an online module. Firstly, we apply Zobrist-guided hash coding to speed up the Go board construction. Then we train a supervised learning policy network to initialize the self-play for generation of experience data with 160k Go game data released by KGS. Finally, we conduct reinforcement learning based on the REINFORCE algorithm. Experiments show that the training accuracy of CH-Go in the sampled 150 games is 99.14%, and the accuracy in the test set is as high as 98.82%. Under the condition of limited local computing power and time, we have achieved a better level of intelligence. Given the current situation that classical systems such as GOLAXY are not free and open, CH-Go has realized and maintained complete Internet openness.
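The chunked storage and batch access described above might look roughly like the following sketch. It is not the authors' code: the file naming, the use of `pickle` as the binary format, storing features and labels together in one file per chunk, and the stand-in data are all assumptions. Only the chunk size of 1024 and the yield-based access pattern come from the abstract.

```python
import os
import pickle
import random
import tempfile

CHUNK_SIZE = 1024  # chunk size from the abstract, also used as the batch size

def save_chunks(features, labels, directory, chunk_size=CHUNK_SIZE):
    """Write features and labels to disk as one binary file per chunk."""
    paths = []
    for start in range(0, len(features), chunk_size):
        path = os.path.join(directory, f"chunk_{start // chunk_size:05d}.bin")
        with open(path, "wb") as f:
            pickle.dump((features[start:start + chunk_size],
                         labels[start:start + chunk_size]), f)
        paths.append(path)
    return paths

def batch_generator(paths, num_chunks, rng):
    """Yield one (features, labels) batch at a time from randomly sampled chunks."""
    for path in rng.sample(paths, num_chunks):
        with open(path, "rb") as f:
            yield pickle.load(f)  # only one chunk is held in memory at a time

with tempfile.TemporaryDirectory() as tmp:
    # Stand-in data: 2500 fake feature rows and fake move labels.
    features = [[i, i + 1] for i in range(2500)]
    labels = [i % 361 for i in range(2500)]
    paths = save_chunks(features, labels, tmp)
    batch_sizes = [len(batch_features)
                   for batch_features, batch_labels
                   in batch_generator(paths, num_chunks=2, rng=random.Random(0))]

print(len(paths), batch_sizes)
```

Because the generator loads a chunk only when the training loop asks for the next batch, memory use stays bounded by the chunk size rather than by the size of the full data set.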