Instructional Material
YZR-net : Self-supervised Hidden representations Invariant to Transformations for profanity detection
Joshi, Vedant Sandeep, Tatinati, Sivanagaraja, Wang, Yubo
In the past few years, the Covid-19 pandemic has significantly increased the adoption of e-learning platforms. Widespread restrictions have forced students to continue their education online, so they spend a significant amount of their time watching videos and attending classes. This sudden change from offline to online learning has affected many students, so building systems that accurately simulate the experience of offline learning can help smooth this drastic transition. Live classes are one such way, giving students a chance to escape the monotony of watching recorded videos every day. The interactive aspect of such classes allows students to clarify small doubts instantly and gives teachers the opportunity to compliment students on good behaviour. These small interactions significantly affect a student's learning outcome by making the course content more interesting and thus improving overall engagement on the platform. This offline style of interaction can be mimicked in many ways: live polls or quizzes to check whether students are paying attention, dynamic interactive diagrams that fuel students' curiosity by letting them tinker, in-session feedback to understand students' opinions, or an in-class chat mechanism between the participants of a session. Unlike the other mechanisms, chat is the most open medium of communication and provides the greatest opportunity for participants to interact with each other.
MineDojo: Building Open-Ended Embodied Agents with Internet-Scale Knowledge
Fan, Linxi, Wang, Guanzhi, Jiang, Yunfan, Mandlekar, Ajay, Yang, Yuncong, Zhu, Haoyi, Tang, Andrew, Huang, De-An, Zhu, Yuke, Anandkumar, Anima
Autonomous agents have made great strides in specialist domains like Atari games and Go. However, they typically learn tabula rasa in isolated environments with limited and manually conceived objectives, thus failing to generalize across a wide spectrum of tasks and capabilities. Inspired by how humans continually learn and adapt in the open world, we advocate a trinity of ingredients for building generalist agents: 1) an environment that supports a multitude of tasks and goals, 2) a large-scale database of multimodal knowledge, and 3) a flexible and scalable agent architecture. We introduce MineDojo, a new framework built on the popular Minecraft game that features a simulation suite with thousands of diverse open-ended tasks and an internet-scale knowledge base with Minecraft videos, tutorials, wiki pages, and forum discussions. Using MineDojo's data, we propose a novel agent learning algorithm that leverages large pre-trained video-language models as a learned reward function. Our agent is able to solve a variety of open-ended tasks specified in free-form language without any manually designed dense shaping reward. We open-source the simulation suite, knowledge bases, algorithm implementation, and pretrained models (https://minedojo.org) to promote research towards the goal of generally capable embodied agents.
Contrastive Learning for Online Semi-Supervised General Continual Learning
Michel, Nicolas, Negrel, Romain, Chierchia, Giovanni, Bercher, Jean-François
We study Online Continual Learning with missing labels and propose SemiCon, a new contrastive loss designed for partly labeled data. We demonstrate its efficiency by devising a memory-based method trained on an unlabeled data stream, where every sample added to memory is labeled using an oracle. Our approach outperforms existing semi-supervised methods when few labels are available, and obtains results similar to state-of-the-art supervised methods while using only 2.6% of labels on Split-CIFAR10 and 10% of labels on Split-CIFAR100.
A Combined Approach of Process Mining and Rule-based AI for Study Planning and Monitoring in Higher Education
Wagner, Miriam, Helal, Hayyan, Roepke, Rene, Judel, Sven, Doveren, Jens, Goerzen, Sergej, Soudmand, Pouya, Lakemeyer, Gerhard, Schroeder, Ulrik, van der Aalst, Wil
This paper presents an approach that uses methods of process mining and rule-based artificial intelligence to analyze and understand study paths of students based on campus management system data and study program models. Process mining techniques are used to characterize successful study paths, as well as to detect and visualize deviations from expected plans. These insights are combined with recommendations and requirements of the corresponding study programs extracted from examination regulations. Here, event calculus and answer set programming are used to provide models of the study programs which support planning and conformance checking while providing feedback on possible study plan violations. In combination, process mining and rule-based artificial intelligence are used to support study planning and monitoring by deriving rules and recommendations for guiding students to more suitable study paths with higher success rates. Two applications will be implemented, one for students and one for study program designers.
Text Classification using Watson NLP
You can downsample the dataset in the data processing step to reduce the model training time. Some of the product categories have fewer instances compared to others. So, you can drop those categories before training the model. Finally, you can carry out the train-test split using the sampling method on the Pandas dataframe. One crucial step required here is to convert the dataframe into the JSON or CSV format as required by the Watson NLP classification algorithm.
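The preprocessing steps described above can be sketched with plain Pandas. This is a minimal illustration under stated assumptions, not the tutorial's actual code: the toy data, the column names `text` and `product`, and the thresholds are all hypothetical, and the exact JSON or CSV layout expected by the Watson NLP classification algorithm is not shown here.

```python
import pandas as pd

# Hypothetical toy dataset; the column names "text" and "product" are assumptions.
df = pd.DataFrame({
    "text": [f"complaint {i}" for i in range(100)],
    "product": ["loans"] * 60 + ["cards"] * 37 + ["other"] * 3,
})

MIN_INSTANCES = 10     # assumed cutoff for dropping rare categories
SAMPLE_FRACTION = 0.5  # assumed downsampling rate
TRAIN_FRACTION = 0.8   # assumed train share of the split

# Drop product categories that have too few instances.
counts = df["product"].value_counts()
frequent = counts[counts >= MIN_INSTANCES].index
filtered = df[df["product"].isin(frequent)]

# Downsample to reduce training time.
sampled = filtered.sample(frac=SAMPLE_FRACTION, random_state=42)

# Train-test split using the dataframe's own sampling method:
# draw the training rows, then keep the remaining rows for testing.
train_df = sampled.sample(frac=TRAIN_FRACTION, random_state=42)
test_df = sampled.drop(train_df.index)

# Serialize; adapt the field names and layout to whatever the classifier expects.
train_json = train_df.to_json(orient="records")
test_csv = test_df.to_csv(index=False)
```

Fixing `random_state` keeps the downsampling and the split reproducible between runs.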
Why Meta Took Down its 'Hallucinating' AI Model Galactica?
On Wednesday, Meta AI and Papers with Code announced the release of Galactica, an open-source large language model trained on scientific knowledge, with 120 billion parameters. However, just days after its launch, Meta took Galactica down. Interestingly, every result generated by Galactica came with the warning: "Outputs may be unreliable. Language Models are prone to hallucinate text." "Galactica is trained on a large and curated corpus of humanity's scientific knowledge. This includes over 48 million papers, textbooks and lecture notes, millions of compounds and proteins, scientific websites, encyclopedias and more," the paper said.
Unsupervised Machine Learning
This course introduces you to one of the main types of Machine Learning: Unsupervised Learning. You will learn how to find insights from data sets that do not have a target or labeled variable. You will learn several clustering and dimension reduction algorithms for unsupervised learning as well as how to select the algorithm that best suits your data. The hands-on section of this course focuses on using best practices for unsupervised learning. By the end of this course you should be able to: explain the kinds of problems suitable for Unsupervised Learning approaches; explain the curse of dimensionality, and how it makes clustering difficult with many features; describe and use common clustering and dimensionality-reduction algorithms; try clustering points where appropriate and compare the performance of per-cluster models; understand metrics relevant for characterizing clusters. Who should take this course?
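One of the listed outcomes, the curse of dimensionality, can be illustrated with a short standard-library sketch. It is not taken from the course; the function name `relative_contrast` and the choice of uniformly random points are assumptions made for illustration. It measures how much farther the farthest point is than the nearest one from a random query point. As the number of features grows, that contrast shrinks, which is one reason distance-based clustering becomes harder.

```python
import math
import random


def relative_contrast(dim, n_points=200, seed=0):
    """Return (farthest - nearest) / nearest distance from a random query point.

    Points are drawn uniformly from the unit hypercube of dimension `dim`.
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    distances = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        distances.append(math.dist(query, point))
    nearest, farthest = min(distances), max(distances)
    return (farthest - nearest) / nearest


# Print the contrast for an increasing number of features.
for dim in (2, 10, 100, 1000):
    print(dim, round(relative_contrast(dim), 2))
```

When the contrast is small, "nearest" and "farthest" neighbours are almost equally far away, so cluster assignments based on Euclidean distance carry less information.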
Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE 2022): Workshop and Shared Task Report
Hürriyetoğlu, Ali, Tanev, Hristo, Zavarella, Vanni, Yeniterzi, Reyyan, Mutlu, Osman, Yörük, Erdem
We provide a summary of the fifth edition of the CASE workshop that is held in the scope of EMNLP 2022. The workshop consists of regular papers, two keynotes, working papers of shared task participants, and task overview papers. This workshop has been bringing together all aspects of event information collection across technical and social science fields. In addition to the progress in depth, the submission and acceptance of multimodal approaches show the widening of this interdisciplinary research topic.
Build Flask App For Image Recognition Using Deep Learning Model
The web app we will build predicts the digit shown in an image of a hand sign. The model is trained on the "American Hand Digit Sign Language" dataset found on Kaggle. This tutorial focuses on building the web app with the Flask web framework, so all the necessary backend steps, including data preparation, data preprocessing, and model training, are already done. We will put our model into action by embedding it on the client side. The web app has a good-looking user interface with an image upload area and a preview section where we can see the uploaded image.
Launching the v2.0 of Deep Reinforcement Learning Course with Hugging Face 🤗
I'm super excited to announce the launch of v2.0 of the Deep Reinforcement Learning Course with Hugging Face, starting on December 5th. After the first version, which ran from May to July 2022 with more than 5,000 students, we heard your feedback and updated the course: we added more RL libraries and new environments such as Minecraft and Doom, and created AI vs AI contests where your trained agents compete against your classmates'. Let's see in more detail what you're going to do. In this course, you're going to compare your agent's results with those of your classmates using our updated leaderboard. The addition in v2.0 is that for some environments you'll be able to make your agents play against your classmates' AIs. For instance, in Snowball fight, you're going to try to beat other AIs. For now, you can sign up to our Discord server to exchange with the community and with us (https://discord.gg/ydHrjt3WP5). Please check our FAQ, and if you don't find answers, you can contact us on our Discord server.