Inductive Learning
Improved Adversarial Training Through Adaptive Instance-wise Loss Smoothing
Deep neural networks can be easily fooled into making incorrect predictions through corruption of the input by adversarial perturbations: human-imperceptible artificial noise. So far adversarial training has been the most successful defense against such adversarial attacks. This work focuses on improving adversarial training to boost adversarial robustness. We first analyze, from an instance-wise perspective, how adversarial vulnerability evolves during adversarial training. We find that during training an overall reduction of adversarial loss is achieved at the cost of making a considerable proportion of training samples more vulnerable to adversarial attack, which results in an uneven distribution of adversarial vulnerability among data. Such "uneven vulnerability" is prevalent across several popular robust training methods and, more importantly, relates to overfitting in adversarial training. Motivated by this observation, we propose a new adversarial training method: Instance-adaptive Smoothness Enhanced Adversarial Training (ISEAT). It jointly smooths both input and weight loss landscapes in an adaptive, instance-specific way to enhance robustness more for those samples with higher adversarial vulnerability. Extensive experiments demonstrate the superiority of our method over existing defense methods. Notably, our method, when combined with the latest data augmentation and semi-supervised learning techniques, achieves state-of-the-art robustness against $\ell_{\infty}$-norm constrained attacks on CIFAR10 of 59.32% for Wide ResNet34-10 without extra data, and 61.55% for Wide ResNet28-10 with extra data. Code is available at https://github.com/TreeLLi/Instance-adaptive-Smoothness-Enhanced-AT.
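Editorial illustration (not part of the abstract): the instance-wise view described above can be made concrete on a toy linear classifier, where the worst-case $\ell_{\infty}$ perturbation has a closed form. This sketch is not ISEAT and not the authors' code; the function names, weights, and data are hypothetical and only show how clean and adversarial losses can differ from sample to sample.

```python
import math

def logistic_loss(w, b, x, y):
    # Logistic loss for a label y in {-1, +1} under a linear model.
    margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
    return math.log1p(math.exp(-margin))

def sign(v):
    return (v > 0) - (v < 0)

def worst_case_linf(w, x, y, eps):
    # For a linear model, the loss-maximizing perturbation inside an
    # l_inf ball of radius eps shifts every feature by eps against the label.
    return [xi - eps * y * sign(wi) for wi, xi in zip(w, x)]

def instance_vulnerability(w, b, data, eps):
    # Returns one (clean_loss, adversarial_loss) pair per sample.
    pairs = []
    for x, y in data:
        clean = logistic_loss(w, b, x, y)
        adv = logistic_loss(w, b, worst_case_linf(w, x, y, eps), y)
        pairs.append((clean, adv))
    return pairs

# Hypothetical model and two samples: one far from the boundary, one close to it.
w, b, eps = [1.0, -2.0], 0.0, 0.1
data = [([2.0, -1.0], 1), ([0.1, 0.0], 1)]
losses = instance_vulnerability(w, b, data, eps)
```

In this toy setting the same perturbation budget lowers each sample's margin by the same amount, yet the sample near the decision boundary ends up with a much larger adversarial loss, which is the kind of per-sample spread the abstract refers to.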
Defect detection using weakly supervised learning
Sevetlidis, Vasileios, Pavlidis, George, Balaska, Vasiliki, Psomoulis, Athanasios, Mouroutsos, Spyridon, Gasteratos, Antonios
In many real-world scenarios, obtaining large amounts of labeled data can be a daunting task. Weakly supervised learning techniques have gained significant attention in recent years as an alternative to traditional supervised learning, as they enable training models using only a limited amount of labeled data. In this paper, the performance of a weakly supervised classifier is compared with that of its fully supervised counterpart on the task of defect detection. Experiments are conducted on a dataset of images containing defects, and the two classifiers are evaluated on their accuracy, precision, and recall. Our results show that the weakly supervised classifier achieves comparable performance to the supervised classifier, while requiring significantly less labeled data.
Review of Extreme Multilabel Classification
Dasgupta, Arpan, Katyan, Siddhant, Das, Shrutimoy, Kumar, Pawan
Extreme multilabel classification, or XML, is an active area of interest in machine learning. Compared to traditional multilabel classification, the number of labels here is extremely large, hence the name extreme multilabel classification. Classical one-versus-all classification won't scale in this case because of the large number of labels, and the same is true for any other classifier. Embedding labels as well as features into a smaller label space is an essential first step. Other issues include the existence of head and tail labels, where tail labels are labels that occur in a relatively small number of the given samples. The existence of tail labels creates issues during embedding. This area has invited a wide range of approaches, including bit compression motivated by compressed sensing, tree-based embeddings, deep-learning-based latent space embeddings (including those using attention weights), linear-algebra-based embeddings such as SVD, clustering, and hashing, to name a few. The community has come up with a useful set of metrics to assess whether predictions for head or tail labels are correct.
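Editorial illustration (not part of the abstract): one metric commonly reported in extreme multilabel classification is precision at k, the fraction of the k highest-scoring labels that are truly relevant. The abstract does not name specific metrics, so the choice of this one, the function name, and the toy scores are assumptions made for illustration.

```python
def precision_at_k(scores, relevant, k):
    """Fraction of the k highest-scoring labels that are in the relevant set."""
    # Rank label indices by descending score; ties keep their index order.
    ranked = sorted(range(len(scores)), key=lambda i: -scores[i])
    top_k = ranked[:k]
    hits = sum(1 for label in top_k if label in relevant)
    return hits / k

# Hypothetical scores over five labels; labels 1 and 2 are the true ones.
scores = [0.1, 0.9, 0.3, 0.8, 0.05]
relevant = {1, 2}
p_at_2 = precision_at_k(scores, relevant, 2)  # top-2 labels are 1 and 3
p_at_3 = precision_at_k(scores, relevant, 3)  # top-3 labels are 1, 3, and 2
```

Plain precision at k treats all labels alike; variants that reweight rare labels are typically used when tail-label quality is the focus.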
Deep Augmentation: Enhancing Self-Supervised Learning through Transformations in Higher Activation Space
Brüel-Gabrielsson, Rickard, Wang, Tongzhou, Baradad, Manel, Solomon, Justin
We introduce Deep Augmentation, an approach to data augmentation using dropout to dynamically transform a targeted layer within a neural network, with the option to use the stop-gradient operation, offering significant improvements in model performance and generalization. We demonstrate the efficacy of Deep Augmentation through extensive experiments on contrastive learning tasks in computer vision and NLP domains, where we observe substantial performance gains with ResNets and Transformers as the underlying models. Our experimentation reveals that targeting deeper layers with Deep Augmentation outperforms augmenting the input data, and the simple network- and data-agnostic nature of this approach enables its seamless integration into computer vision and NLP pipelines.
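Editorial illustration (not part of the abstract): the core idea of applying dropout to a chosen intermediate layer, rather than to the input, can be sketched with a tiny two-layer network in plain Python. This is not the authors' implementation; the network, the weights, and the `target_layer` parameter are hypothetical, and the stop-gradient option is omitted because the sketch has no automatic differentiation.

```python
import random

def relu(v):
    return [max(0.0, a) for a in v]

def linear(W, v):
    # W is a list of rows; computes the matrix-vector product W @ v.
    return [sum(w * a for w, a in zip(row, v)) for row in W]

def dropout(v, p, rng):
    # Inverted dropout: zero each unit with probability p, rescale survivors.
    keep = 1.0 - p
    return [a / keep if rng.random() < keep else 0.0 for a in v]

def forward(x, layers, target_layer=None, p=0.5, rng=None):
    """Run x through the layers, optionally perturbing one layer's output."""
    h = x
    for i, W in enumerate(layers):
        h = relu(linear(W, h))
        if target_layer is not None and i == target_layer:
            h = dropout(h, p, rng)
    return h

layers = [
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],  # layer 0 maps 2 inputs to 3 units
    [[1.0, 1.0, 1.0], [1.0, -1.0, 0.0]],   # layer 1 maps 3 units to 2 outputs
]
x = [1.0, 2.0]
rng = random.Random(0)
clean = forward(x, layers)
# Two stochastic views of the same input, produced by perturbing layer 0.
view_a = forward(x, layers, target_layer=0, rng=rng)
view_b = forward(x, layers, target_layer=0, rng=rng)
```

In a contrastive setup, `view_a` and `view_b` would play the role of a positive pair; which layer to target is the main design choice the abstract highlights.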
Test of Time: Instilling Video-Language Models with a Sense of Time
Bagad, Piyush, Tapaswi, Makarand, Snoek, Cees G. M.
Modelling and understanding time remains a challenge in contemporary video understanding models. With language emerging as a key driver towards powerful generalization, it is imperative for foundational video-language models to have a sense of time. In this paper, we consider a specific aspect of temporal understanding: consistency of time order as elicited by before/after relations. We establish that seven existing video-language models struggle to understand even such simple temporal relations. We then question whether it is feasible to equip these foundational models with temporal awareness without re-training them from scratch. Towards this, we propose a temporal adaptation recipe on top of one such model, VideoCLIP, based on post-pretraining on a small amount of video-text data. We conduct a zero-shot evaluation of the adapted models on six datasets for three downstream tasks which require varying degrees of time awareness. We observe encouraging performance gains especially when the task needs higher time awareness. Our work serves as a first step towards probing and instilling a sense of time in existing video-language models without the need for data and compute-intense training from scratch.
Federated Learning without Full Labels: A Survey
Jin, Yilun, Liu, Yang, Chen, Kai, Yang, Qiang
Data privacy has become an increasingly important concern in real-world big data applications such as machine learning. To address the problem, federated learning (FL) has been a promising solution to building effective machine learning models from decentralized and private data. Existing federated learning algorithms mainly tackle the supervised learning problem, where data are assumed to be fully labeled. However, in practice, fully labeled data is often hard to obtain, as the participants may not have sufficient domain expertise, or they lack the motivation and tools to label data. Therefore, the problem of federated learning without full labels is important in real-world FL applications. In this paper, we discuss how the problem can be solved with machine learning techniques that leverage unlabeled data. We present a survey of methods that combine FL with semi-supervised learning, self-supervised learning, and transfer learning methods. We also summarize the datasets used to evaluate FL methods without full labels. Finally, we highlight future directions in the context of FL without full labels.
Personalizing Task-oriented Dialog Systems via Zero-shot Generalizable Reward Function
Siddique, A. B., Maqbool, M. H., Taywade, Kshitija, Foroosh, Hassan
Task-oriented dialog systems enable users to accomplish tasks using natural language. State-of-the-art systems respond to users in the same way regardless of their personalities, although personalizing dialogues can lead to higher levels of adoption and better user experiences. Building personalized dialog systems is an important yet challenging endeavor, and only a handful of works have taken on the challenge. Most existing works rely on supervised learning approaches and require laborious and expensive labeled training data for each user profile. Additionally, collecting and labeling data for each user profile is virtually impossible. In this work, we propose a novel framework, P-ToD, to personalize task-oriented dialog systems capable of adapting to a wide range of user profiles in an unsupervised fashion using a zero-shot generalizable reward function. P-ToD uses a pre-trained GPT-2 as a backbone model and works in three phases. Phase one performs task-specific training. Phase two kicks off unsupervised personalization by leveraging the proximal policy optimization algorithm that performs policy gradients guided by the zero-shot generalizable reward function. Our novel reward function can quantify the quality of the generated responses even for unseen profiles. The optional final phase fine-tunes the personalized model using a few labeled training examples. We conduct extensive experimental analysis using the personalized bAbI dialogue benchmark for five tasks and up to 180 diverse user profiles. The experimental results demonstrate that P-ToD, even when it had access to zero labeled examples, outperforms state-of-the-art supervised personalization models and achieves competitive performance on BLEU and ROUGE metrics when compared to a strong fully-supervised GPT-2 baseline.
Toward Open-domain Slot Filling via Self-supervised Co-training
Mosharrof, Adib, Fereidouni, Moghis, Siddique, A. B.
Slot filling is one of the critical tasks in modern conversational systems. The majority of existing literature employs supervised learning methods, which require labeled training data for each new domain. Zero-shot learning and weak supervision approaches, among others, have shown promise as alternatives to manual labeling. Nonetheless, these learning paradigms are significantly inferior to supervised learning approaches in terms of performance. To minimize this performance gap and demonstrate the possibility of open-domain slot filling, we propose a Self-supervised Co-training framework, called SCot, that requires zero in-domain manually labeled training examples and works in three phases. Phase one acquires two sets of complementary pseudo labels automatically. Phase two leverages the power of the pre-trained language model BERT, by adapting it for the slot filling task using these sets of pseudo labels. In phase three, we introduce a self-supervised co-training mechanism, where both models automatically select high-confidence soft labels to further improve the performance of the other in an iterative fashion. Our thorough evaluations show that SCot outperforms state-of-the-art models by 45.57% and 37.56% on SGD and MultiWoZ datasets, respectively. Moreover, our proposed framework SCot achieves comparable performance when compared to state-of-the-art fully supervised models.
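Editorial illustration (not part of the abstract): the exchange of high-confidence soft labels between two models in phase three can be sketched as a thresholded selection step. This is not SCot's actual procedure; the fixed threshold, the function names, and the toy probabilities are assumptions, and the abstract does not say how confidence is measured.

```python
def select_confident(soft_labels, threshold):
    """Map example index -> (label, confidence) for confident predictions."""
    selected = {}
    for idx, probs in enumerate(soft_labels):
        # The predicted label is the class with the highest probability.
        label = max(range(len(probs)), key=lambda c: probs[c])
        if probs[label] >= threshold:
            selected[idx] = (label, probs[label])
    return selected

def cotraining_round(preds_a, preds_b, threshold=0.9):
    # Each model's confident predictions become training targets for the other.
    targets_for_b = select_confident(preds_a, threshold)
    targets_for_a = select_confident(preds_b, threshold)
    return targets_for_a, targets_for_b

# Hypothetical class probabilities from two models on three unlabeled examples.
preds_a = [[0.95, 0.05], [0.6, 0.4], [0.1, 0.9]]
preds_b = [[0.5, 0.5], [0.02, 0.98], [0.3, 0.7]]
targets_for_a, targets_for_b = cotraining_round(preds_a, preds_b)
```

In an iterative loop, each model would be fine-tuned on the targets it receives and the predictions recomputed before the next round.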
SPEC: Summary Preference Decomposition for Low-Resource Abstractive Summarization
Chen, Yi-Syuan, Song, Yun-Zhu, Shuai, Hong-Han
Neural abstractive summarization has been widely studied and achieved great success with large-scale corpora. However, the considerable cost of annotating data motivates the need for learning strategies under low-resource settings. In this paper, we investigate the problems of learning summarizers with only a few examples and propose corresponding methods for improvements. First, typical transfer learning methods are prone to be affected by data properties and learning objectives in the pretext tasks. Therefore, based on pretrained language models, we further present a meta learning framework to transfer few-shot learning processes from source corpora to the target corpus. Second, previous methods learn from training examples without decomposing the content and preference. The generated summaries could therefore be constrained by the preference bias in the training set, especially under low-resource settings. As such, we propose decomposing the contents and preferences during learning through parameter modulation, which enables control over preferences during inference. Third, given a target application, specifying required preferences could be non-trivial because the preferences may be difficult to derive through observations. Therefore, we propose a novel decoding method to automatically estimate suitable preferences and generate corresponding summary candidates from the few training examples. Extensive experiments demonstrate that our methods achieve state-of-the-art performance on six diverse corpora with 30.11%/33.95%/27.51% and 26.74%/31.14%/24.48% average improvements on ROUGE-1/2/L under 10- and 100-example settings.
PODCAST SATELLITE: THE VOICE OF ISRAEL: NEW AI & THE 4TH INDUSTRIAL REVOLUTION
PODCAST SATELLITE
10th of Elul, 5782
Prince Handley, President / Regent, University of Excellence

NEW AI & THE 4TH INDUSTRIAL REVOLUTION: FUTURE OF ARTIFICIAL INTELLIGENCE (The New Artificial Intelligence)

DESCRIPTION

WARNING: What you are about to learn will challenge your intellect. It will also enlighten you to “behind the scenes” activity that is happening today … and affecting your FUTURE. We will discuss the 4th Industrial Revolution (IR-4) and WHY, unlike the previous three Industrial Revolutions, it will be dangerous. People can lose their rights, their jobs … their lives as a result of traveling “uncharted” waters. Even more dangerous will be the result of our developing Artificial Intelligence (AI) that lives in Cyber Space that we do NOT really understand.

The 4th Industrial Revolution will be more of a radical change than the first three … even though they were “shockers” in their inception. Civilization has journeyed the route and use of fire, agriculture, the wheel, electricity, mass production, synthetic chemicals, the internet, block chain, self-driving cars, AI growing people in laboratories, and downloading our brains into computers. Let's examine the first three Industrial Revolutions and see from whence we have journeyed.

FIRST INDUSTRIAL REVOLUTION ~ IR-1

The First Industrial Revolution was marked by a transition from hand production methods to machines through the use of steam power and water power. The implementation of new technologies took a long time, so the period which this refers to was between 1760 and 1820, or 1840 in Europe and the United States.

SECOND INDUSTRIAL REVOLUTION ~ IR-2

The Second Industrial Revolution, also known as the Technological Revolution, is the period between 1871 and 1914 that resulted from installations of extensive railroad and telegraph networks, which allowed for faster transfer of people and ideas, as well as electricity. Increasing electrification allowed for factories to develop the modern production line.

THIRD INDUSTRIAL REVOLUTION ~ IR-3

The Third Industrial Revolution, also known as the Digital Revolution, occurred in the late 20th century. The production of the Z1 computer, which used binary and Boolean logic, was the beginning of more advanced digital developments. The next significant development in communication technologies was the supercomputer.

FOURTH INDUSTRIAL REVOLUTION ~ IR-4

The Fourth Industrial Revolution is the trend towards automation and data exchange in manufacturing technologies and processes, which include cyber-physical systems (CPS), the Internet of Things (IoT), the industrial Internet of Things, cloud computing, cognitive computing, and artificial intelligence. The combination of machine learning and computational power allows machines to carry out highly complicated tasks, also in cooperation with Smart Factories.

NOTE: Computerization and digitalization were building blocks leading us to IR 4.0.

The Smart Factory is no longer a vision. While various model factories demonstrate what is feasible, many enterprises already show with practical examples how the Smart Factory functions. The technical foundations on which the Smart Factory (the intelligent factory) is based are cyber-physical systems that communicate with each other using the Internet of Things and Services. An important part of this process is the exchange of data between the product and the production line. This enables a much more efficient connection of the Supply Chain and better organization within any production environment.
Within modular structured smart factories, cyber-physical systems monitor physical processes, create a virtual copy of the physical world and make decentralized decisions.

SO WHAT DOES THIS MEAN TO US?

Artificial Intelligence has brought us a long way. However, AI may take us too far. The “danger zone” is when it will be able to think on the same level as a human. To develop a construct upon which to investigate, let's examine the three different TYPES of AI.

AI ~ ARTIFICIAL INTELLIGENCE OR WEAK AI / ANI ~ NARROW INTELLIGENCE

Artificial intelligence is a computer system that can perform complex tasks that would otherwise require human minds, such as visual perception, speech recognition, decision-making, and translation between languages. The majority of these machines rely on deep learning and programming, which helps “teach” them to process vast amounts of data to recognize patterns and carry out actions. It is essentially recreating the human mind in machine form, similar to what is being carried out in Smart Factories today (as well as other areas of processing and bio-development).

Artificial Intelligence works on a supervised learning system, where various sets of data are provided to the machines, to learn from examples. This helps AI to classify objects or predict the results. AI performs intelligent tasks, but its reach is very narrow and limited as it can only provide an outcome that is already programmed. It cannot make unpredictable decisions on its own, like a human brain can.

AI is also referred to as Narrow AI [ANI] or Weak AI. This type of artificial intelligence is one that focuses primarily on one single narrow task, with a limited range of abilities. If you think of an example of AI that exists in our lives right now, it is ANI.

AGI - ARTIFICIAL GENERAL INTELLIGENCE OR TRUE (REAL) INTELLIGENCE

AGI technology would be on the level of a human mind.
Due to this fact, it will probably be some time before we truly grasp AGI, as we still don’t know all there is to know about the human brain itself. However, in concept at least, AGI would be able to think on the same level as a human, much like Sonny the robot in I, Robot, starring Will Smith.

Artificial General Intelligence, by contrast, is the intelligence of a machine that could perform all the intellectual tasks performed by human beings. It possesses the ability to analyze a situation on its own and make a calculated decision, like humans can, without having to be programmed in advance. We are actually nearing that in some of our Smart Factories. As I noted previously, within modular structured Smart Factories, cyber-physical systems monitor physical processes, create a virtual copy of the physical world and make decentralized decisions.

ASI - ARTIFICIAL SUPER INTELLIGENCE

This is where it gets a little theoretical and a touch scary. ASI refers to AI technology that will match and then surpass the human mind. To be classed as an ASI, the technology would have to be more capable than a human in every single way possible. Not only could these AI things carry out tasks, but they would even be capable of having emotions and relationships.

NOTE: The evolution from AGI to ASI would in theory be much faster than it is taking us to get from ANI to AGI right now, since AGI would allow computers to “think” and exponentially improve themselves once they are able to really learn from experience and by trial and error. If a transition to ASI ever happens, the exponential growth that is in theory expected to occur at this point is often called an Intelligence Explosion … SINGULARITY!

NOTE: We should ensure a safe and ethical functioning of AI in all fields and make it a priority in further development. However, once systems start “thinking” on their own, with NO knowledge of God, what are the limits?!

WHAT ABOUT NEW GLOBAL GOVERNANCE?

The future Global Leader [Antimashiach / FALSE messiah] … along with his False Prophet … will demand that the populace take a digital “mark” on their right hands or forehead that will “connect” them with a Smart System: without which they can neither BUY nor SELL.

ARE YOU READY FOR THIS?

► Brain modification allowing receptors to gain access to, or receive messages from, paranormal and Satanic occult sources.
► Downloading, via the transfer of artificial intelligence (AI) information through brain-machine interfacing, a desire for the “Mark of the Beast.”
► Corrupted spermatozoa which could fertilize an ovum producing a hybrid being: a non (other than normal) human life form. [Think: Nephilim]
► Receiving fallen (demonically anointed) influence via psycho-neural pathways.

SUMMARY

I have alerted you to what the New Global Governance Leader (Antimashiach, FALSE messiah) will use in the End Times. Teach AND prepare your children and grandchildren about what is and will be happening. Make sure that YOU and your progeny are prepared for Heaven.

Baruch haba b'Shem ADONAI
Your friend,
Prince Handley