Neural Turing Machine


Intelligent DoS and DDoS Detection: A Hybrid GRU-NTM Approach to Network Security

Panggabean, Caroline, Venkatachalam, Chandrasekar, Shah, Priyanka, John, Sincy, P, Renuka Devi, Venkatachalam, Shanmugavalli

arXiv.org Artificial Intelligence

Detecting Denial of Service (DoS) and Distributed Denial of Service (DDoS) attacks remains a critical challenge in cybersecurity. This research introduces a hybrid deep learning model combining Gated Recurrent Units (GRUs) and a Neural Turing Machine (NTM) for enhanced intrusion detection. Trained on the UNSW-NB15 and BoT-IoT datasets, the model employs GRU layers for sequential data processing and an NTM for long-term pattern recognition.


Reviews: Differentiable Learning of Logical Rules for Knowledge Base Reasoning

Neural Information Processing Systems

This paper develops a model for learning to answer queries in knowledge bases with incomplete data about relations between entities. For example, the running example in the paper is answering queries like HasOfficeInCountry(Uber, ?) when the relation is not directly present in the knowledge base, but supporting relations like HasOfficeInCity(Uber, NYC) and CityInCountry(NYC, USA) are. The aim in this work is to learn rules like HasOfficeInCountry(A, B) <- HasOfficeInCity(A, C) && CityInCountry(C, B). Note that this is a bit different from learning embeddings for entities in a knowledge base, because the rule to be learned is abstract, not depending on any specific entities. The formulation in this paper casts the problem as one of learning two components: (1) a set of rules, each represented as a sequence of relations (those that appear in the RHS of the rule), and (2) a real-valued confidence on each rule. The approach to learning follows ideas from Neural Turing Machines and differentiable program synthesis, whereby the discrete problem is relaxed to a continuous one by defining a model for executing the rules in which all rules are executed at each step and then averaged together with weights given by the confidences.
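
To make the relaxation concrete, here is a minimal numpy sketch (not the paper's code) of how such a rule can be executed softly: each relation is an adjacency matrix over entities, a rule body is a product of those matrices, and learned confidences weight alternative rules. The entities, relations, and confidence values below are invented for illustration.

```python
import numpy as np

# Toy knowledge base over four entities.
entities = ["Uber", "NYC", "USA", "Lyft"]
n = len(entities)

# Each relation is a 0/1 adjacency matrix: M[i, j] = 1 iff relation(e_i, e_j).
has_office_in_city = np.zeros((n, n))
has_office_in_city[0, 1] = 1.0      # HasOfficeInCity(Uber, NYC)
city_in_country = np.zeros((n, n))
city_in_country[1, 2] = 1.0         # CityInCountry(NYC, USA)

# The rule "HasOfficeInCountry(A, B) <- HasOfficeInCity(A, C) && CityInCountry(C, B)"
# executes as a product of the relation matrices in its body.
rule_body = has_office_in_city @ city_in_country

# With several candidate rules, learned confidences weight each one and the
# results are summed -- the continuous relaxation that makes the discrete
# rule-selection problem differentiable. Confidences here are made up.
candidate_rules = [rule_body, has_office_in_city @ has_office_in_city.T]
confidences = [0.9, 0.1]
scores = sum(w * r for w, r in zip(confidences, candidate_rules))

uber = entities.index("Uber")
print("HasOfficeInCountry(Uber, ?):")
for j in np.nonzero(scores[uber])[0]:
    print(f"  {entities[j]}: score {scores[uber, j]:.2f}")
```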


FashionNTM: Multi-turn Fashion Image Retrieval via Cascaded Memory

Pal, Anwesan, Wadhwa, Sahil, Jaiswal, Ayush, Zhang, Xu, Wu, Yue, Chada, Rakesh, Natarajan, Pradeep, Christensen, Henrik I.

arXiv.org Artificial Intelligence

Multi-turn textual feedback-based fashion image retrieval focuses on a real-world setting, where users can iteratively provide information to refine retrieval results until they find an item that fits all their requirements. In this work, we present a novel memory-based method, called FashionNTM, for such a multi-turn system. Our framework incorporates a new Cascaded Memory Neural Turing Machine (CM-NTM) approach for implicit state management, thereby learning to integrate information across all past turns to retrieve new images for a given turn. Unlike a vanilla Neural Turing Machine (NTM), our CM-NTM operates on multiple inputs, which interact with their respective memories via individual read and write heads, to learn complex relationships. Extensive evaluation results show that our proposed method outperforms the previous state-of-the-art algorithm by 50.5% on Multi-turn FashionIQ, currently the only existing multi-turn fashion dataset, and achieves a relative improvement of 12.6% on Multi-turn Shoes, an extension of the single-turn Shoes dataset that we created in this work. Further analysis of the model in a real-world interactive setting demonstrates two important capabilities of our model -- memory retention across turns, and agnosticism to turn order for non-contradictory feedback. Finally, user study results show that images retrieved by FashionNTM were favored 83.1% of the time over other multi-turn models. Project page: https://sites.google.com/eng.ucsd.edu/fashionntm
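
As a rough illustration of the cascading idea (an informal sketch, not the authors' architecture or code), the toy example below gives each input its own content-addressed memory with a read and a write head, and feeds the read vector from one memory in as the query to the next. All features, sizes, and update rules are invented.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

class ContentMemory:
    """Minimal content-addressed memory bank (illustrative only)."""
    def __init__(self, slots, width, rng):
        self.M = rng.standard_normal((slots, width)) * 0.1

    def _weights(self, key):
        # Cosine-similarity addressing, as in a vanilla NTM head.
        sim = self.M @ key / (np.linalg.norm(self.M, axis=1) * np.linalg.norm(key) + 1e-8)
        return softmax(sim)

    def read(self, key):
        return self._weights(key) @ self.M

    def write(self, key, vec):
        w = self._weights(key)
        # Blend the new content into the addressed slots.
        self.M += np.outer(w, vec - w @ self.M)

rng = np.random.default_rng(0)
image_mem, text_mem = ContentMemory(8, 16, rng), ContentMemory(8, 16, rng)

# One turn: each input writes to its own memory; the read vector from the
# first memory becomes the query to the second (the "cascade").
image_feat, text_feat = rng.standard_normal(16), rng.standard_normal(16)
image_mem.write(image_feat, image_feat)
text_mem.write(text_feat, text_feat)
state = text_mem.read(image_mem.read(image_feat))
print(state.shape)  # (16,) -- fused state used to score candidate images
```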


My 2 cents on Google's LaMDA being sentient

#artificialintelligence

AI models don't have a memory: when you converse with a chatbot one day, it won't remember what you said the next day. Chatbots (and language models) typically work by looking at "context", which, for you, basically means a few sentences in the past. The limit will vary from model to model, but it's typically up to 1000 words or something (not sure what it is these days with super huge models, but there's always a limit). Even if a chatbot uses an RNN, it's still very limited (usually even more so), as RNNs struggle with long-term memory once the history gets long [a few hundred words]. The point is that AI models have no idea what you said more than a few sentences back. Also, don't be confused by models like the Neural Turing Machine, which have a "working memory" (like RAM) but still no permanent memory (like a hard disk).
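
Here is a toy sketch of the context-window behavior the post describes, assuming a hypothetical 1000-word budget: only the most recent words survive, so older turns are simply invisible to the model.

```python
# A toy illustration of the "context window": the model only sees the most
# recent words of the conversation, so anything older is simply not there.
# The 1000-word budget mirrors the post's ballpark figure.
def visible_context(history, budget_words=1000):
    words = " ".join(history).split()
    return " ".join(words[-budget_words:])

history = [f"turn {i}: some user or bot message" for i in range(400)]
ctx = visible_context(history)
print(len(ctx.split()))   # capped at 1000 words
print("turn 0" in ctx)    # False: the oldest turns fell out of context
```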


Understanding Memory in Deep Learning Systems: The Neuroscience, and Cognitive Psychology…

#artificialintelligence

I recently started a new newsletter focused on AI education. TheSequence is a no-BS (meaning no hype, no news, etc.) AI-focused newsletter that takes 5 minutes to read. The goal is to keep you up to date with machine learning projects, research papers and concepts. Memory modeling is an active area of research in the deep learning space. In recent years, techniques such as Neural Turing Machines (NTMs) have made significant progress, laying the foundation for building human-like memory structures in deep learning systems.


Reservoir memory machines

Paassen, Benjamin, Schulz, Alexander

arXiv.org Machine Learning

While neural networks have achieved impressive successes in domains like image classification or machine translation, standard models still struggle with tasks that require very long-term memory without interference and would thus benefit from a separation of memory and computation [Graves et al., 2016, Collier and Beel, 2018]. Neural Turing Machines (NTMs) attempt to address these tasks by augmenting recurrent neural networks with an explicit memory to which the network has read and write access [Graves et al., 2016, Collier and Beel, 2018]. Unfortunately, such models are notoriously hard to train, even compared to other deep learning models [Collier and Beel, 2018]. In our contribution, we propose to address this training problem by replacing the learned recurrent neural network controller of an NTM with an echo state network (ESN) [Jaeger and Haas, 2004]. In other words, we only learn the controller for the read and write head of our memory access as well as the output mapping, all of which is possible via standard linear regression.
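
For readers unfamiliar with echo state networks, here is a minimal numpy sketch of the standard ESN recipe the abstract leans on: the recurrent reservoir weights stay fixed and random, and only a linear readout is fit by ridge regression. The toy delay task and all hyperparameters below are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n_in, n_res, T = 1, 100, 500

# Fixed random reservoir: these weights are never trained, which is the
# training shortcut the abstract exploits for the NTM controller.
W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
W = rng.standard_normal((n_res, n_res))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))  # spectral radius < 1

def run_reservoir(u):
    x, states = np.zeros(n_res), []
    for t in range(len(u)):
        x = np.tanh(W_in @ u[t] + W @ x)
        states.append(x.copy())
    return np.array(states)

# Toy task: reproduce the input delayed by 5 steps.
u = rng.uniform(-1, 1, (T, n_in))
y = np.roll(u[:, 0], 5)
y[:5] = 0.0
X = run_reservoir(u)

# Only the readout is learned, by ridge regression -- the "standard linear
# regression" the abstract refers to.
W_out = np.linalg.solve(X.T @ X + 1e-6 * np.eye(n_res), X.T @ y)
print("train MSE:", np.mean((X @ W_out - y) ** 2))
```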


Neural Turing Machines

#artificialintelligence

We discuss the Neural Turing Machine (NTM), an architecture proposed by Graves et al. at DeepMind. NTMs are designed to solve tasks that require writing to and retrieving information from an external memory, which makes them resemble a working-memory system characterized by short-term storage of information and its rule-based manipulation. Compared with RNN structures with internal memory, NTMs use attentional mechanisms to efficiently read and write an external memory, which makes them a more favorable choice for capturing long-range dependencies. But, as we will see, these two are not independent of each other and can be combined to form a more powerful architecture. The overall architecture of an NTM is shown in Figure 1, where the controller is a general neural network, an MLP or RNN, which receives inputs and previous read vectors and emits outputs in response.
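
Below is a minimal numpy sketch of the content-addressing, read, and write equations from Graves et al. The full model additionally interpolates with the previous weighting, shifts, and sharpens, and the controller emits all of these parameters; here they are fixed by hand for illustration.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def address(M, key, beta):
    # Content-based addressing: softmax over beta-scaled cosine similarity
    # between the key emitted by the controller and each memory row.
    sim = M @ key / (np.linalg.norm(M, axis=1) * np.linalg.norm(key) + 1e-8)
    return softmax(beta * sim)

def read(M, w):
    return w @ M  # r_t = sum_i w_t(i) * M_t(i)

def write(M, w, erase, add):
    # M_t(i) = M_{t-1}(i) * (1 - w_t(i) * e_t) + w_t(i) * a_t
    return M * (1 - np.outer(w, erase)) + np.outer(w, add)

rng = np.random.default_rng(0)
M = rng.standard_normal((8, 4)) * 0.1        # 8 memory slots of width 4

item = np.array([1.0, -1.0, 0.5, 0.0])
w = address(M, item, beta=5.0)
M = write(M, w, erase=np.ones(4), add=item)  # store the item by content

r = read(M, address(M, item, beta=5.0))      # retrieve it by content
print(np.round(r, 2))                        # approximately the stored item
```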


Can Neural Networks Develop Attention? Google Thinks They Can

#artificialintelligence

Trying to read this article is a complicated task from the neuroscientific standpoint. At this moment you are probably bombarded with emails, news, notifications on your phone, the usual annoying coworker interrupting, and other distractions that cause your brain to spin in many directions. In order to read this tiny article, or to perform many other cognitive tasks, you need to focus; you need attention. Attention is a cognitive skill that is pivotal to the formation of knowledge. However, the dynamics of attention have remained a mystery to neuroscientists for centuries, and only recently have we had major breakthroughs that help explain how attention works.


Memory-Augmented Recurrent Networks for Dialogue Coherence

Donahue, David, Meng, Yuanliang, Rumshisky, Anna

arXiv.org Machine Learning

Recent dialogue approaches operate by reading each word in a conversation history and aggregating accrued dialogue information into a single state. This fixed-size vector is not expandable and must maintain a consistent format over time. Other recent approaches exploit an attention mechanism to extract useful information from past conversational utterances, but this introduces an increased computational complexity. In this work, we explore the use of the Neural Turing Machine (NTM) to provide a more permanent and flexible storage mechanism for maintaining dialogue coherence. Specifically, we introduce two separate dialogue architectures based on this NTM design. The first design features a sequence-to-sequence architecture with two separate NTM modules, one for each participant in the conversation. The second memory architecture incorporates a single NTM module, which stores parallel context information for both speakers. This second design also replaces the sequence-to-sequence architecture with a neural language model, to allow the NTM a longer context and a greater understanding of the dialogue history. We report perplexity performance for both models, and compare them to existing baselines.
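
As a loose stand-in for the per-participant memory modules in the first design (an illustrative sketch, not the authors' model), the toy code below keeps one append-only, attention-read memory per speaker and concatenates reads from both at generation time; the embeddings are random placeholders.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

class SpeakerMemory:
    """Append-only memory with an attention read -- a loose stand-in for
    the per-participant NTM modules described above."""
    def __init__(self):
        self.rows = []

    def write(self, v):
        self.rows.append(v)

    def read(self, query):
        M = np.stack(self.rows)
        return softmax(M @ query) @ M

rng = np.random.default_rng(0)
memories = {"A": SpeakerMemory(), "B": SpeakerMemory()}

# Each utterance (here a random embedding) goes to its speaker's memory.
dialogue = [("A", rng.standard_normal(8)) for _ in range(3)] + \
           [("B", rng.standard_normal(8)) for _ in range(3)]
for speaker, emb in dialogue:
    memories[speaker].write(emb)

# To generate the next turn, read both memories with the current query and
# concatenate -- storage grows with the dialogue instead of being squeezed
# into one fixed-size state vector.
query = rng.standard_normal(8)
context = np.concatenate([memories["A"].read(query), memories["B"].read(query)])
print(context.shape)  # (16,)
```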


How does the Mind store Information?

Panigrahy, Rina

arXiv.org Artificial Intelligence

How we store information in our mind has been a major, intriguing open question. We approach this question not from a physiological standpoint, as to how information is physically stored in the brain, but from a conceptual and algorithmic standpoint, as to the right data structures to organize and index information. Here we propose a memory architecture directly based on the recursive sketching ideas from the paper "Recursive Sketches for Modular Deep Networks", ICML 2019 (arXiv:1905.12730), to store information in memory as concise sketches. We also give a high-level, informal exposition of the recursive sketching idea from the paper, which makes use of subspace embeddings to capture deep network computations in a concise sketch. These sketches form an implicit knowledge graph that can be used to find related information, via sketches from the past, while processing an event.
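
A toy illustration (not the paper's actual construction): random projections serve as simple subspace embeddings, leaf activations are projected to a fixed sketch width, and a recursive step compresses child sketches back down so the summary stays concise at every level. Similar inputs give nearby sketches, which is what enables lookup by similarity.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 64, 32                      # module output width, sketch width

# Random projections roughly preserve inner products, so similar
# activations give similar sketches.
R_leaf = rng.standard_normal((k, d)) / np.sqrt(k)
R_comb = rng.standard_normal((k, 2 * k)) / np.sqrt(k)

def sketch_leaf(activation):
    return R_leaf @ activation

def sketch_node(left, right):
    # Recursive step (simplified): compress two child sketches back to
    # sketch width, keeping the summary size fixed at every level.
    return R_comb @ np.concatenate([left, right])

a, b = rng.standard_normal(d), rng.standard_normal(d)
root = sketch_node(sketch_leaf(a), sketch_leaf(b))
print(root.shape)                  # (32,) -- one sketch for both modules

# Related events yield nearby sketches, enabling similarity search over
# past sketches -- the "implicit knowledge graph".
a_noisy = a + 0.05 * rng.standard_normal(d)
root2 = sketch_node(sketch_leaf(a_noisy), sketch_leaf(b))
cos = root @ root2 / (np.linalg.norm(root) * np.linalg.norm(root2))
print(round(float(cos), 3))        # close to 1
```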