AITopics | peri

Collaborating Authors

peri

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Stability of Transformers under Layer Normalization

Kan, Kelvin, Li, Xingjian, Zhang, Benjamin J., Sahai, Tuhin, Osher, Stanley, Kumar, Krishna, Katsoulakis, Markos A.

arXiv.org Artificial IntelligenceOct-14-2025

Despite their widespread use, training deep Transformers can be unstable. Layer normalization, a standard component, improves training stability, but its placement has often been ad-hoc. In this paper, we conduct a principled study on the forward (hidden states) and backward (gradient) stability of Transformers under different layer normalization placements. Our theory provides key insights into the training dynamics: whether training drives Transformers toward regular solutions or pathological behaviors. For forward stability, we derive explicit bounds on the growth of hidden states in trained Transformers. For backward stability, we analyze how layer normalization affects the backprop-agation of gradients, thereby explaining the training dynamics of each layer normalization placement. Our analysis also guides the scaling of residual steps in Transformer blocks, where appropriate choices can further improve stability and performance. Our numerical results corroborate our theoretical findings. Beyond these results, our framework provides a principled way to sanity-check the stability of Transformers under new architectural modifications, offering guidance for future designs.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2510.09904

Country: Europe (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

Peri-AIIMS: Perioperative Artificial Intelligence Driven Integrated Modeling of Surgeries using Anesthetic, Physical and Cognitive Statuses for Predicting Hospital Outcomes

Bandyopadhyay, Sabyasachi, Zhang, Jiaqing, Ison, Ronald L., Libon, David J., Tighe, Patrick, Price, Catherine, Rashidi, Parisa

arXiv.org Artificial IntelligenceOct-29-2024

The association between preoperative cognitive status and surgical outcomes is a critical, yet scarcely explored area of research. Linking intraoperative data with postoperative outcomes is a promising and low-cost way of evaluating long-term impacts of surgical interventions. In this study, we evaluated how preoperative cognitive status as measured by the clock drawing test contributed to predicting length of hospital stay, hospital charges, average pain experienced during follow-up, and 1-year mortality over and above intraoperative variables, demographics, preoperative physical status and comorbidities. We expanded our analysis to 6 specific surgical groups where sufficient data was available for cross-validation. The clock drawing images were represented by 10 constructional features discovered by a semi-supervised deep learning algorithm, previously validated to differentiate between dementia and non-dementia patients. Different machine learning models were trained to classify postoperative outcomes in hold-out test sets. The models were compared to their relative performance, time complexity, and interpretability. Shapley Additive Explanations (SHAP) analysis was used to find the most predictive features for classifying different outcomes in different surgical contexts. Relative classification performances achieved by different feature sets showed that the perioperative cognitive dataset which included clock drawing features in addition to intraoperative variables, demographics, and comorbidities served as the best dataset for 12 of 18 possible surgery-outcome combinations...

figure 4, peri, surgery, (16 more...)

arXiv.org Artificial Intelligence

2411.0084

Country:

North America > United States > Florida > Alachua County > Gainesville (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
North America > United States > New Jersey > Gloucester County > Glassboro (0.04)
Europe > Finland > Uusimaa > Helsinki (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Surgery (1.00)
Health & Medicine > Therapeutic Area > Neurology > Dementia (0.90)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Add feedback

Meta-Learning for Neural Network-based Temporal Point Processes

Takimoto, Yoshiaki, Tanaka, Yusuke, Iwata, Tomoharu, Okawa, Maya, Kim, Hideaki, Toda, Hiroyuki, Kurashima, Takeshi

arXiv.org Artificial IntelligenceJan-28-2024

Human activities generate various event sequences such as taxi trip records, bike-sharing pick-ups, crime occurrence, and infectious disease transmission. The point process is widely used in many applications to predict such events related to human activities. However, point processes present two problems in predicting events related to human activities. First, recent high-performance point process models require the input of sufficient numbers of events collected over a long period (i.e., long sequences) for training, which are often unavailable in realistic situations. Second, the long-term predictions required in real-world applications are difficult. To tackle these problems, we propose a novel meta-learning approach for periodicity-aware prediction of future events given short sequences. The proposed method first embeds short sequences into hidden representations (i.e., task representations) via recurrent neural networks for creating predictions from short sequences. It then models the intensity of the point process by monotonic neural networks (MNNs), with the input being the task representations. We transfer the prior knowledge learned from related tasks and can improve event prediction given short sequences of target tasks. We design the MNNs to explicitly take temporal periodic patterns into account, contributing to improved long-term prediction performance. Experiments on multiple real-world datasets demonstrate that the proposed method has higher prediction performance than existing alternatives.

dataset, intensity function, sequence, (15 more...)

arXiv.org Artificial Intelligence

2401.15846

Country:

North America > United States > New York (0.04)
Asia > Japan > Honshū > Kantō > Kanagawa Prefecture > Yokohama (0.04)

Genre: Research Report (1.00)

Industry:

Transportation (0.68)
Health & Medicine (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback