Scaling Competence, Shrinking Reasoning: Cognitive Signatures in Language Model Learning

Singh, Mukul, Singha, Ananya, Radhakrishna, Arjun, Gulwani, Sumit

arXiv.org Artificial Intelligence

We analyze reasoning in language models during task-specific fine-tuning and draw a parallel between reasoning tokens--the intermediate steps generated while solving a problem--and human working memory. Drawing from cognitive science, we align training dynamics with the Four Stages of Competence: models initially produce incorrect outputs without reasoning, then begin reasoning (but still fail), eventually reason effectively, and finally solve tasks without explicit reasoning. We find that reasoning token length expands as performance improves, peaks at the stage of conscious competence, then declines as the model internalizes the task. Notably, after training, models retain performance even when reasoning is removed--suggesting it scaffolded learning but is no longer needed. This progression offers actionable insights: reasoning token dynamics can serve as a signal for diagnosing training stage, identifying convergence, and guiding early stopping. We propose metrics to track this trajectory and argue that reasoning behavior is valuable for understanding and optimizing reasoning model training.
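
As an illustration only (not code from the paper), the sketch below shows how mean reasoning-token length and accuracy could be logged per checkpoint to locate the peak the authors associate with conscious competence; `generate` is a hypothetical helper that returns the reasoning tokens and correctness for one prompt.

```python
import numpy as np

def reasoning_length_curve(checkpoints, eval_prompts, generate):
    """For each checkpoint, record mean reasoning-token count and accuracy.

    `generate(ckpt, prompt)` is a hypothetical helper returning
    (reasoning_tokens, answer_is_correct) for one prompt.
    """
    lengths, accs = [], []
    for ckpt in checkpoints:
        results = [generate(ckpt, p) for p in eval_prompts]
        lengths.append(np.mean([len(r[0]) for r in results]))
        accs.append(np.mean([r[1] for r in results]))
    return np.array(lengths), np.array(accs)

def peak_checkpoint(lengths):
    # The checkpoint where reasoning length peaks would mark the
    # "conscious competence" stage described in the abstract.
    return int(np.argmax(lengths))
```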


Artificial Intelligence Software Structured to Simulate Human Working Memory, Mental Imagery, and Mental Continuity

Reser, Jared Edward

arXiv.org Artificial Intelligence

This article presents an artificial intelligence (AI) architecture intended to simulate the iterative updating of the human working memory system. It features several interconnected neural networks designed to emulate the specialized modules of the cerebral cortex. These are structured hierarchically and integrated into a global workspace. They are capable of temporarily maintaining high-level representational patterns akin to the psychological items maintained in working memory. This maintenance is made possible by persistent neural activity in the form of two modalities: sustained neural firing (resulting in a focus of attention) and synaptic potentiation (resulting in a short-term store). Representations held in persistent activity are recursively replaced, resulting in incremental changes to the content of the working memory system. As this content gradually evolves, successive processing states overlap and are continuous with one another. The present article explores how this architecture can lead to iterative shifts in the distribution of coactive representations, ultimately leading to mental continuity between processing states, and thus to human-like thought and cognition. Like the human brain, this AI working memory store will be linked to multiple imagery (topographic map) generation systems corresponding to various sensory modalities. As working memory is iteratively updated, the maps created in response will construct sequences of related mental imagery. Thus, neural networks emulating the prefrontal cortex and its reciprocal interactions with early sensory and motor cortex capture the imagery guidance functions of the human brain. This sensory and motor imagery creation, coupled with an iteratively updated working memory store, may provide an AI system with the cognitive assets needed to achieve synthetic consciousness or artificial sentience.
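
As a loose, hypothetical illustration of the iterative-updating idea (not the article's architecture), the toy class below keeps a small focus of attention and a larger short-term store, replacing items incrementally so that successive states overlap.

```python
from collections import deque

class WorkingMemoryStore:
    """Toy sketch (not the article's implementation) of iterative updating:
    a small focus of attention (sustained firing) backed by a larger
    short-term store (synaptic potentiation)."""

    def __init__(self, focus_size=4, store_size=12):
        self.focus = deque(maxlen=focus_size)       # sustained activity
        self.short_term = deque(maxlen=store_size)  # potentiated traces

    def update(self, new_item):
        # New content displaces only the oldest focus item rather than wiping
        # the whole state, so successive states overlap (mental continuity).
        if len(self.focus) == self.focus.maxlen:
            self.short_term.append(self.focus.popleft())
        self.focus.append(new_item)

    def coactive(self):
        # Current contents of the working memory system.
        return list(self.focus) + list(self.short_term)
```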


Hall Effect Thruster Forecasting using a Topological Approach for Data Assimilation

Chumley, Max M., Khasawneh, Firas A.

arXiv.org Artificial Intelligence

Hall Effect Thrusters (HETs) are electric thrusters that eject heavy ionized gas particles from the spacecraft to generate thrust. Although traditionally used for station keeping, they have recently been used for interplanetary space missions due to their high delta-V potential and their operational longevity in contrast to other thrusters, e.g., chemical ones. However, the operation of HETs involves complex processes such as gas ionization, strong magnetic fields, and complicated solar panel power supply interactions. Their operation is therefore extremely difficult to model, necessitating Data Assimilation (DA) approaches for estimating and predicting their operational states. Because the HET operating environment is often noisy with non-Gaussian noise sources, the set of applicable DA tools is significantly limited. We describe a topological approach for data assimilation that bypasses these limitations because it does not depend on the noise model, and we use it to forecast spatiotemporal plume field states of HETs. Our approach is a generalization of the Topological Approach for Data Assimilation (TADA) method that allows different forecast functions to be included. We show how TADA can be combined with a Long Short-Term Memory network for accurate forecasting. We then apply our approach to high-fidelity Hall Effect Thruster (HET) simulation data from the Air Force Research Laboratory (AFRL) rocket propulsion division, where we demonstrate the forecast resiliency of TADA on noise-contaminated, high-dimensional data.
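
The following is a generic, heavily simplified sketch of data assimilation with a pluggable forecast function, not the TADA method itself: each step advances the state with `forecast` and nudges it toward the latest noisy observation.

```python
import numpy as np

def assimilate(x0, observations, forecast, gain=0.5):
    """Generic nudging-style assimilation loop (illustrative, not TADA):
    `forecast(x)` advances the state one step; each new observation pulls
    the forecast back toward the measured value with strength `gain`."""
    x = np.asarray(x0, dtype=float)
    trajectory = [x]
    for y in observations:
        x_pred = forecast(x)                          # pluggable forecast function
        x = x_pred + gain * (np.asarray(y) - x_pred)  # analysis update
        trajectory.append(x)
    return np.stack(trajectory)
```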


Creating Scalable AGI: the Open General Intelligence Framework

Dollinger, Daniel A., Singleton, Michael

arXiv.org Artificial Intelligence

Recent advancements in Artificial Intelligence (AI), particularly with Large Language Models (LLMs), have led to significant progress in narrow tasks such as image classification, language translation, coding, and writing. However, these models face limitations in reliability and scalability due to their siloed architectures, which are designed to handle only one data modality (data type) at a time. This single modal approach hinders their ability to integrate the complex set of data points required for real-world challenges and problem-solving tasks like medical diagnosis, quality assurance, equipment troubleshooting, and financial decision-making. Addressing these real-world challenges requires a more capable Artificial General Intelligence (AGI) system. Our primary contribution is the development of the Open General Intelligence (OGI) framework, a novel systems architecture that serves as a macro design reference for AGI. The OGI framework adopts a modular approach to the design of intelligent systems, based on the premise that cognition must occur across multiple specialized modules that can seamlessly operate as a single system. OGI integrates these modules using a dynamic processing system and a fabric interconnect, enabling real-time adaptability, multi-modal integration, and scalable processing. The OGI framework consists of three key components: (1) Overall Macro Design Guidance that directs operational design and processing, (2) a Dynamic Processing System that controls routing, primary goals, instructions, and weighting, and (3) Framework Areas, a set of specialized modules that operate cohesively to form a unified cognitive system. By incorporating known principles from human cognition into AI systems, the OGI framework aims to overcome the challenges observed in today's intelligent systems, paving the way for more holistic and context-aware problem-solving capabilities.
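
Purely as a hypothetical illustration of the modular-routing idea (not the OGI reference design), the sketch below registers specialized modules with weights and dispatches an input across them through a shared "fabric" of intermediate results.

```python
from typing import Callable, Dict

class DynamicProcessingSystem:
    """Illustrative sketch (not the OGI specification): a router that weights
    and dispatches a multi-modal input to specialized modules sharing a
    common 'fabric' (here just a dict of intermediate results)."""

    def __init__(self):
        self.modules: Dict[str, Callable[[dict, dict], dict]] = {}
        self.weights: Dict[str, float] = {}

    def register(self, name, fn, weight=1.0):
        # Each module reads the raw observation plus the shared fabric state.
        self.modules[name] = fn
        self.weights[name] = weight

    def process(self, observation: dict) -> dict:
        fabric = {"input": observation}
        # Route to modules in order of their current weighting.
        for name in sorted(self.modules, key=self.weights.get, reverse=True):
            fabric[name] = self.modules[name](observation, fabric)
        return fabric

# Usage: register e.g. a vision module and a language module, then call
# system.process({"image": ..., "text": ...}) to obtain the fused fabric state.
```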


Exploring Unseen Environments with Robots using Large Language and Vision Models through a Procedurally Generated 3D Scene Representation

S, Arjun P, Melnik, Andrew, Nandi, Gora Chand

arXiv.org Artificial Intelligence

Recent advancements in Generative Artificial Intelligence, particularly Large Language Models (LLMs) and Large Vision Language Models (LVLMs), have enabled the prospect of leveraging cognitive planners within robotic systems. This work focuses on solving the object goal navigation problem by mimicking human cognition to attend to, perceive, and store task-specific information and to generate plans from it. We introduce a comprehensive framework capable of exploring an unfamiliar environment in search of an object by leveraging the capabilities of Large Language Models (LLMs) and Large Vision Language Models (LVLMs) in understanding the underlying semantics of our world. A challenge in using LLMs to generate high-level sub-goals is efficiently representing the environment around the robot. We propose to use a modular 3D scene representation, with semantically rich descriptions of the objects, to provide the LLM with task-relevant information. However, providing the LLM with a mass of contextual information (a rich 3D scene semantic representation) can lead to redundant and inefficient plans. We therefore propose an LLM-based pruner that leverages in-context learning to prune out information irrelevant to the goal.
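
A minimal sketch of the pruning idea, assuming a hypothetical `call_llm` text-completion helper (the authors' actual prompt and interface are not given here): list the scene objects, ask which are relevant to the goal, and keep only those.

```python
def prune_scene(objects, goal, call_llm):
    """Sketch of the idea (not the authors' code): ask an LLM which scene
    objects are relevant to the navigation goal and drop the rest.

    `objects` is a list of semantic descriptions from the 3D scene
    representation; `call_llm(prompt)` is a hypothetical text-completion helper.
    """
    listing = "\n".join(f"{i}: {desc}" for i, desc in enumerate(objects))
    prompt = (
        f"Goal: find a {goal}.\n"
        f"Scene objects:\n{listing}\n"
        "Return the indices of objects relevant to reaching the goal, "
        "comma-separated."
    )
    reply = call_llm(prompt)
    keep = {int(tok) for tok in reply.split(",") if tok.strip().isdigit()}
    return [objects[i] for i in sorted(keep) if i < len(objects)]
```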


Extending Memory for Language Modelling

Nugaliyadde, Anupiya

arXiv.org Artificial Intelligence

Breakthroughs in deep learning and memory networks have led to major advances in natural language understanding. Language is sequential, and information carried through the sequence can be captured by memory networks. Learning the sequence is one of the key aspects of learning the language. However, memory networks are not capable of holding infinitely long sequences in their memories and are limited by constraints such as the vanishing or exploding gradient problem. Therefore, natural language understanding models suffer when presented with long sequential text. We introduce the Long Term Memory network (LTM) to learn from infinitely long sequences. LTM gives priority to current inputs, allowing them to have a high impact. Language modeling is an important factor in natural language understanding and requires long-term memory, so LTM was tested on language modeling using the Penn Treebank, Google Billion Word, and WikiText-2 datasets. We compare LTM with other language models that require long-term memory.


Using Machine Learning to Classify Tweets

#artificialintelligence

I recently had the opportunity to take on a project with Inspirit AI where I worked with a team to use machine learning to classify whether tweets were positive, negative, or neutral as they related to different stocks. To do that, we explored three different machine learning models for classifying text: bag-of-words, long short-term memory (LSTM), and bidirectional encoder representations from transformers (BERT). Here I describe our experience in solving this problem and highlight what we learned from using the different methods. The bag-of-words model works by grouping words into frequency counts based on how often each word is used. These frequency counts can then be used as features in a machine learning model.
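
A minimal bag-of-words sketch in Python (illustrative, not the project's code): each tweet becomes a vector of word-frequency counts over the shared vocabulary.

```python
from collections import Counter

def bag_of_words(tweets):
    """Build frequency-count features: one column per vocabulary word."""
    vocab = sorted({w for t in tweets for w in t.lower().split()})
    features = []
    for t in tweets:
        counts = Counter(t.lower().split())
        features.append([counts.get(w, 0) for w in vocab])
    return vocab, features

vocab, X = bag_of_words(["Stock up big", "stock down", "big big gains"])
# Each row of X is a word-frequency vector that can feed any classifier.
```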


Papers to Read on using Long Short-Term Memory (LSTM) architecture in forecasting

#artificialintelligence

Abstract: The spread of COVID-19 has coincided with the rise of Graph Neural Networks (GNNs), leading to several studies proposing their use to better forecast the evolution of the pandemic. Many such models also include Long Short-Term Memory (LSTM) networks, a common tool for time series forecasting. In this work, we further investigate the integration of these two methods by implementing GNNs within the gates of an LSTM and exploiting spatial information. In addition, we introduce a skip connection which proves critical to jointly capturing the spatial and temporal patterns in the data. We validate our daily COVID-19 new-case forecasting model on data from 37 European nations covering the last 472 days and show superior performance compared to state-of-the-art graph time series models based on mean absolute scaled error (MASE).
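
As a rough illustration of placing graph aggregation inside LSTM gates (not the paper's exact model, and without the skip connection), the numpy sketch below multiplies the gate pre-activations by a normalized adjacency matrix so that each node's gates see its neighbors.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def graph_lstm_step(X, H, C, A_hat, Wx, Wh, b):
    """One step of an LSTM whose gate inputs are first aggregated over the
    graph (A_hat: normalized adjacency, nodes x nodes). Illustrative only.

    X: node features (nodes x d_in), H/C: states (nodes x d_hidden),
    Wx: (d_in x 4*d_hidden), Wh: (d_hidden x 4*d_hidden), b: (4*d_hidden,).
    """
    Z = A_hat @ (X @ Wx) + A_hat @ (H @ Wh) + b   # spatial aggregation inside the gates
    i, f, o, g = np.split(Z, 4, axis=1)
    C_new = sigmoid(f) * C + sigmoid(i) * np.tanh(g)
    H_new = sigmoid(o) * np.tanh(C_new)
    return H_new, C_new
```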


Gated Recurrent Unit

#artificialintelligence

A GRU, or Gated Recurrent Unit, is an advancement of the standard recurrent neural network (RNN). It was introduced by Kyunghyun Cho et al. in 2014. GRUs are very similar to Long Short-Term Memory (LSTM) networks.
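
A minimal numpy sketch of one GRU update (one common formulation, biases omitted for brevity): two gates, update and reset, and no separate cell state as in the LSTM.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h, Wz, Uz, Wr, Ur, Wh, Uh):
    """One GRU update: combine the previous hidden state h with input x."""
    z = sigmoid(x @ Wz + h @ Uz)               # update gate
    r = sigmoid(x @ Wr + h @ Ur)               # reset gate
    h_tilde = np.tanh(x @ Wh + (r * h) @ Uh)   # candidate state
    return (1 - z) * h + z * h_tilde           # interpolate old and candidate state
```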


LSTM Vs GRU in Recurrent Neural Network: A Comparative Study

#artificialintelligence

A recurrent neural network (RNN) is a type of ANN used to perform predictive operations on sequential or time-series data. These deep learning layers are commonly used for ordinal or temporal problems such as natural language processing, neural machine translation, and automated image captioning. Today's voice assistants, such as Google Assistant, Alexa, and Siri, incorporate these layers to provide a hassle-free experience for users. The main difference between an RNN and a CNN is that an RNN incorporates memory, so information from prior inputs can influence the current input and output. The training methods for both networks are otherwise the same.
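
For contrast with feed-forward layers, here is a minimal numpy sketch of a plain RNN forward pass (illustrative only): the hidden state h is the memory that carries information from earlier inputs to later steps.

```python
import numpy as np

def rnn_forward(xs, Wx, Wh, b):
    """Run a plain RNN over a sequence (xs: T x d_in).

    The hidden state h is updated at every step, so each output depends on
    all inputs seen so far, not just the current one."""
    h = np.zeros(Wh.shape[0])
    states = []
    for x in xs:
        h = np.tanh(x @ Wx + h @ Wh + b)
        states.append(h)
    return np.stack(states)
```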