AITopics | mds

An Effective-Rank Audit of Alignment-Induced Activation Shifts: Confound Control, Constructive Calibration, and Limits

Nakamura, Yuki

arXiv.org Machine Learning, May-26-2026

We audit alignment-induced shifts in residual-stream activations of three open-weight instruction-tuned LLMs (Llama-3.1-8B-Instruct, Gemma-2-9B-it, Qwen-2.5-7B-Instruct) using the effective rank of the alignment modification matrix on safety-relevant inputs, rho_eps := rank_eps(M_Ds)/d, which formalizes the single-refusal-direction observation of Arditi et al. (2024) as a continuous quantity. The paper has three contributions. (1) Confound-controlled measurement: a four-variant decomposition (M_naive, M_template, M_aligned, M_DiD) separates chat-template formatting, alignment-stage shift, and the refusal-mediating direction, and recovers the Arditi refusal direction on M_DiD at |cos| in {0.77, 0.86, 0.50} (Llama/Gemma/Qwen); chat-template-controlled rho_eps is {0.0029, 0.0048, 0.0044}, and the centered SVD residual is 4-7x larger. (2) Constructive calibration on a 3-layer MLP across rho_eps in {0.008, 0.17, 0.33, 0.40} exhibits a sweet-spot vs. brittle distinction: mild rank-maximization (lambda=5) buys ablation robustness, while strong regularization at the same nominal rho_eps (lambda=50) does not. rho_eps is a diagnostic for fragility, not a target whose mechanical inflation buys robustness. (3) Limits of rank-based diagnostics: (a) not safety-specific (LRH baseline is 2-3x the safety value); (b) SVD principal ordering does not match causal ordering (Llama u_2 inert despite ranking second; cumulative ablation non-monotone at k=5); (c) the spectral-gap hypothesis required to upgrade the O(rho_eps * d) achievability bound to a matching Mirsky-route lower bound fails empirically (1/90 Llama layer-reference pairs, 0/36 MLP combinations) and structurally (kappa_lb <= 2/(eps * r)). The matching lower bound remains an open problem.
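The quantity rho_eps = rank_eps(M)/d in the abstract above can be illustrated with a minimal sketch. The abstract does not say how rank_eps is thresholded, so the rule used here (count singular values larger than eps times the largest one) is an assumption, as are the function name `effective_rank_ratio` and the toy spectrum; the singular values are taken as already computed.

```python
def effective_rank_ratio(singular_values, d, eps=0.01):
    """Return rank_eps / d for a matrix with the given singular values.

    ASSUMPTION: rank_eps counts singular values strictly greater than
    eps * (largest singular value). The paper may use a different rule.
    """
    if d <= 0:
        raise ValueError("d must be a positive dimension")
    if not singular_values:
        return 0.0
    top = max(singular_values)
    if top == 0:
        return 0.0
    rank_eps = sum(1 for s in singular_values if s > eps * top)
    return rank_eps / d


# Hypothetical spectrum: one dominant direction plus a fast-decaying tail.
spectrum = [10.0, 0.5, 0.05, 0.001]
rho_loose = effective_rank_ratio(spectrum, d=4, eps=0.01)   # threshold 0.1
rho_tight = effective_rank_ratio(spectrum, d=4, eps=0.001)  # threshold 0.01
```

Under this assumed rule, a smaller eps admits more directions, so the reported ratio depends on the threshold as well as on the matrix.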

large language model, machine learning, natural language, (18 more...)

arXiv.org Machine Learning

2605.24583

Country: Asia > Japan (0.40)

Genre: Research Report (0.50)

Industry:

Education > Educational Technology > Educational Software > Computer Based Training (0.40)
Education > Educational Setting > Higher Education (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Don't take it lightly: Phasing optical random projections with unknown operators

Sidharth Gupta, Remi Gribonval, Laurent Daudet, Ivan Dokmanić

Neural Information Processing Systems, Feb-13-2026, 17:37:56 GMT

In this paper we tackle the problem of recovering the phase of complex linear measurements when only magnitude information is available and we control the input. We are motivated by the recent development of dedicated optics-based hardware for rapid random projections which leverages the propagation of light in random media.

algorithm, artificial intelligence, projection, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > Illinois (0.05)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Technology: Information Technology > Artificial Intelligence (0.69)

Leveraging Hierarchical Organization for Medical Multi-document Summarization

Hsu, Yi-Li, Mei, Katelyn X., Wang, Lucy Lu

arXiv.org Artificial Intelligence, Nov-5-2025

Medical multi-document summarization (MDS) is a complex task that requires effectively managing cross-document relationships. This paper investigates whether incorporating hierarchical structures in the inputs of MDS can improve a model's ability to organize and contextualize information across documents compared to traditional flat summarization methods. We investigate two ways of incorporating hierarchical organization across three large language models (LLMs), and conduct comprehensive evaluations of the resulting summaries using automated metrics, model-based metrics, and domain expert evaluation of preference, understandability, clarity, complexity, relevance, coverage, factuality, and coherence. Our results show that human experts prefer model-generated summaries over human-written summaries. Hierarchical approaches generally preserve factuality, coverage, and coherence of information, while also increasing human preference for summaries. Additionally, we examine whether simulated judgments from GPT-4 align with human judgments, finding higher agreement along more objective evaluation facets. Our findings demonstrate that hierarchical structures can improve the clarity of medical summaries generated by models while maintaining content coverage, providing a practical way to improve human preference for generated summaries.

computational linguistic, large language model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2510.23104

Country: North America > United States > Minnesota (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

How Scale Breaks "Normalized Stress" and KL Divergence: Rethinking Quality Metrics

Smelser, Kiran, Gunaratne, Kaviru, Miller, Jacob, Kobourov, Stephen

arXiv.org Machine Learning, Oct-13-2025

Complex, high-dimensional data is ubiquitous across many scientific disciplines, including machine learning, biology, and the social sciences. One of the primary methods of visualizing these datasets is with two-dimensional scatter plots that visually capture some properties of the data. Because visually determining the accuracy of these plots is challenging, researchers often use quality metrics to measure the projection's accuracy and faithfulness to the original data. One of the most commonly employed metrics, normalized stress, is sensitive to uniform scaling (stretching, shrinking) of the projection, despite this act not meaningfully changing anything about the projection. Another quality metric, the Kullback-Leibler (KL) divergence used in the popular t-Distributed Stochastic Neighbor Embedding (t-SNE) technique, is also susceptible to this scale sensitivity. We investigate the effect of scaling on stress and KL divergence analytically and empirically by showing just how much the values change and how this affects dimension reduction technique evaluations. We introduce a simple technique to make both metrics scale-invariant and show that it accurately captures expected behavior on a small benchmark.
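The scale sensitivity of normalized stress described above can be seen in a small sketch. The abstract does not spell out the authors' technique, so the scale-invariant variant below (evaluate stress at the scaling factor that minimizes it, which has a closed form) is one natural choice and an assumption, as are the function names and the toy points.

```python
import math


def _pair_distances(points):
    """Distances for all unordered pairs (i < j), in a fixed order."""
    n = len(points)
    return [math.dist(points[i], points[j])
            for i in range(n) for j in range(i + 1, n)]


def normalized_stress(high, low, alpha=1.0):
    """Sum of squared distance errors, normalized by the original distances.

    `alpha` uniformly rescales the projection before comparison.
    """
    d_high = _pair_distances(high)
    d_low = _pair_distances(low)
    num = sum((dh - alpha * dl) ** 2 for dh, dl in zip(d_high, d_low))
    den = sum(dh ** 2 for dh in d_high)
    return num / den


def scale_normalized_stress(high, low):
    """ASSUMED variant: stress at the minimizing scale alpha*.

    Setting the derivative of the numerator to zero gives
    alpha* = sum(dh * dl) / sum(dl^2).
    """
    d_high = _pair_distances(high)
    d_low = _pair_distances(low)
    denom = sum(dl ** 2 for dl in d_low)
    if denom == 0:
        raise ValueError("projection collapses all points to one location")
    alpha = sum(dh * dl for dh, dl in zip(d_high, d_low)) / denom
    return normalized_stress(high, low, alpha)


# Hypothetical data: four 3-D points and a 2-D layout of them.
high = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
low = [(0, 0), (1, 0), (0, 1), (1, 1)]
low_x3 = [(3 * x, 3 * y) for x, y in low]  # same layout, stretched 3x

raw_a = normalized_stress(high, low)
raw_b = normalized_stress(high, low_x3)
inv_a = scale_normalized_stress(high, low)
inv_b = scale_normalized_stress(high, low_x3)
```

Here `raw_a` and `raw_b` differ although the two layouts are the same up to stretching, while `inv_a` and `inv_b` agree up to floating-point error.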

artificial intelligence, kl divergence, machine learning, (14 more...)

arXiv.org Machine Learning

2510.0866

Country: Europe > Germany (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Data Science (0.94)

Recovering Wasserstein Distance Matrices from Few Measurements

Rana, Muhammad, Tasissa, Abiy, Cai, HanQin, Gavriyelov, Yakov, Hamm, Keaton

arXiv.org Machine Learning, Sep-24-2025

This paper proposes two algorithms for estimating square Wasserstein distance matrices from a small number of entries. These matrices are used to compute manifold learning embeddings like multidimensional scaling (MDS) or Isomap, but contrary to Euclidean distance matrices, are extremely costly to compute. We analyze matrix completion from upper triangular samples and Nyström completion in which $\mathcal{O}(d\log(d))$ columns of the distance matrices are computed where $d$ is the desired embedding dimension, prove stability of MDS under Nyström completion, and show that it can outperform matrix completion for a fixed budget of sample distances. Finally, we show that classification of the OrganCMNIST dataset from the MedMNIST benchmark is stable on data embedded from the Nyström estimation of the distance matrix even when only 10% of the columns are computed.

algorithm, distance matrix, matrix, (14 more...)

arXiv.org Machine Learning

2509.1925

Country:

North America > United States > Texas (0.05)
North America > United States > Massachusetts > Middlesex County > Medford (0.04)
Asia > China > Guangxi Province > Nanning (0.04)

Genre: Research Report (0.64)

Industry: Health & Medicine (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Synthetic POMDPs to Challenge Memory-Augmented RL: Memory Demand Structure Modeling

Wang, Yongyi, Li, Lingfeng, Chen, Bozhou, Li, Ang, Liu, Hanyu, Zheng, Qirui, Yang, Xionghui, Li, Wenxin

arXiv.org Artificial Intelligence, Sep-23-2025

Recent research has developed benchmarks for memory-augmented reinforcement learning (RL) algorithms, providing Partially Observable Markov Decision Process (POMDP) environments where agents depend on past observations to make decisions. While many benchmarks incorporate sufficiently complex real-world problems, they lack controllability over the degree of challenges posed to memory models. In contrast, synthetic environments enable fine-grained manipulation of dynamics, making them critical for detailed and rigorous evaluation of memory-augmented RL. Our study focuses on POMDP synthesis with three key contributions: 1. A theoretical framework for analyzing POMDPs, grounded in Memory Demand Structure (MDS), transition invariance, and related concepts; 2. A methodology leveraging linear process dynamics, state aggregation, and reward redistribution to construct customized POMDPs with predefined properties; 3. Empirically validated series of POMDP environments with increasing difficulty levels, designed based on our theoretical insights. Our work clarifies the challenges of memory-augmented RL in solving POMDPs, provides guidelines for analyzing and designing POMDP environments, and offers empirical support for selecting memory models in RL tasks.

artificial intelligence, machine learning, trajectory, (17 more...)

arXiv.org Artificial Intelligence

2508.04282

Country: Asia (0.28)

Genre:

Research Report (0.64)
Workflow (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Applicability of the Minimal Dominating Set for Influence Maximisation in Multilayer Networks

Czuba, Michał, Jia, Mingshan, Bródka, Piotr, Musial, Katarzyna

arXiv.org Artificial Intelligence, Mar-8-2025

The minimal dominating set (MDS) is a well-established concept in network controllability and has been successfully applied in various domains, including sensor placement, network resilience, and epidemic containment. In this study, we adapt the local-improvement MDS routine and explore its potential for enhancing seed selection for influence maximisation in multilayer networks (MLN). We employ the Linear Threshold Model (LTM), which offers an intuitive representation of influence spread or opinion dynamics by accounting for peer influence accumulation. To ensure interpretability, we utilise rank-refining seed selection methods, with the results further filtered with MDS. Our findings reveal that incorporating MDS into the seed selection process improves spread only within a specific range of situations. Notably, the improvement is observed for larger seed set budgets, lower activation thresholds, and when an "AND" strategy is used to aggregate influence across network layers. This scenario reflects situations where an individual does not require the majority of their acquaintances to hold a target opinion, but must be influenced across all social circles.
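To make the notion of a minimal dominating set concrete, here is a small sketch on a single-layer graph. It uses a plain greedy pass followed by pruning of redundant nodes, which is a swapped-in stand-in and not the local-improvement routine the abstract refers to; the multilayer setting and the Linear Threshold Model are not modeled. The function names and the toy graph are assumptions.

```python
def is_dominating(graph, chosen):
    """True if every node is in `chosen` or adjacent to a node in it."""
    covered = set(chosen)
    for v in chosen:
        covered |= graph[v]
    return covered == set(graph)


def greedy_minimal_dominating_set(graph):
    """Greedy cover followed by pruning, so no chosen node is redundant.

    `graph` maps each node to the set of its neighbours (undirected).
    Nodes must be sortable so that ties are broken deterministically.
    """
    uncovered = set(graph)
    chosen = []
    while uncovered:
        # Pick the node whose closed neighbourhood covers the most
        # still-uncovered nodes; the first such node in sorted order wins.
        best = max(sorted(graph),
                   key=lambda v: len(({v} | graph[v]) & uncovered))
        chosen.append(best)
        uncovered -= {best} | graph[best]
    # Pruning step: drop any node whose removal keeps the set dominating.
    for v in list(chosen):
        rest = [u for u in chosen if u != v]
        if is_dominating(graph, rest):
            chosen = rest
    return set(chosen)


# Hypothetical toy network: a small star (hub 0) attached to a short path.
toy_graph = {
    0: {1, 2, 3},
    1: {0},
    2: {0},
    3: {0, 4},
    4: {3, 5},
    5: {4},
}
seeds = greedy_minimal_dominating_set(toy_graph)
```

The result is minimal in the sense that removing any node breaks domination; it is not guaranteed to be the smallest possible dominating set.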

actor, experiment, mds, (13 more...)

arXiv.org Artificial Intelligence

2502.15236

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > Poland > Lower Silesia Province > Wroclaw (0.04)
Europe > United Kingdom > England > Dorset > Bournemouth (0.04)
(5 more...)

Genre: Research Report > New Finding (1.00)

Industry: Education (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Reviews: Universality and individuality in neural dynamics across large populations of recurrent networks

Neural Information Processing Systems, Jan-24-2025, 03:10:38 GMT

UPDATE after rebuttal: authors have addressed some of my concerns, so I'm updating my score to 8. To summarize, this paper aims to shed light on the connections between artificial recurrent neural networks and biological networks, in order to gain insight into neural circuit functionality through studying RNNs. More specifically, the paper comments on the ability of RNNs to mimic the behavior of SNNs and neural recordings despite a vast difference in their inherent architectures. Such a phenomenon may suggest neural invariants, which act universally (in the context of a task) across either all RNN and SNN architectures, or broader groups containing various architectures in each. The paper does not look at neural recordings or SNNs, but instead trains 96 RNNs of various combinations of architectures, activations, network sizes, and L2 regularizations on three separate tasks (discrete memory, pattern formation, and analog memory) common to computational neuroscience. For each task, singular value canonical correlation analysis (SVCCA) and MDS are used to determine the representational geometry of the RNNs, and a numerical approach to dynamical systems analysis (again with MDS) is used to gain insight into the topological stability structure.

architecture, recurrent network, universality and individuality, (10 more...)

Neural Information Processing Systems

Industry: Health & Medicine > Therapeutic Area > Neurology (0.57)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Cognitive Science (0.57)

Author-Specific Linguistic Patterns Unveiled: A Deep Learning Study on Word Class Distributions

Krauss, Patrick, Schilling, Achim

arXiv.org Artificial Intelligence, Jan-17-2025

Deep learning methods have been increasingly applied to computational linguistics to uncover patterns in text data. This study investigates author-specific word class distributions using part-of-speech (POS) tagging and bigram analysis. By leveraging deep neural networks, we classify literary authors based on POS tag vectors and bigram frequency matrices derived from their works. We employ fully connected and convolutional neural network architectures to explore the efficacy of unigram and bigram-based representations. Our results demonstrate that while unigram features achieve moderate classification accuracy, bigram-based models significantly improve performance, suggesting that sequential word class patterns are more distinctive of authorial style. Multi-dimensional scaling (MDS) visualizations reveal meaningful clustering of authors' works, supporting the hypothesis that stylistic nuances can be captured through computational methods. These findings highlight the potential of deep learning and linguistic feature analysis for author profiling and literary studies.
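A bigram frequency matrix over POS tags, of the kind the abstract mentions as classifier input, could be built roughly as follows. This is an illustrative sketch only: the tagset, the normalization by total bigram count, and the function name `pos_bigram_matrix` are assumptions, and the tags are taken as already produced by some POS tagger.

```python
from collections import Counter


def pos_bigram_matrix(tags, tagset):
    """Relative frequency of each (tag, next_tag) pair as a square matrix.

    Row index = first tag, column index = following tag, both in the
    order given by `tagset`. Every tag in `tags` must appear in `tagset`.
    """
    index = {tag: i for i, tag in enumerate(tagset)}
    counts = Counter(zip(tags, tags[1:]))  # consecutive tag pairs
    total = sum(counts.values())
    matrix = [[0.0] * len(tagset) for _ in tagset]
    for (first, second), count in counts.items():
        matrix[index[first]][index[second]] = count / total
    return matrix


# Hypothetical tag sequence, e.g. for "the dog chased the cat".
example_tags = ["DET", "NOUN", "VERB", "DET", "NOUN"]
example_tagset = ["DET", "NOUN", "VERB"]
bigram_matrix = pos_bigram_matrix(example_tags, example_tagset)
```

Flattening such a matrix gives a fixed-length feature vector per text, regardless of how long the text is.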

artificial intelligence, machine learning, neural network, (17 more...)

arXiv.org Artificial Intelligence

2501.10072

Country: Europe > Germany > Bavaria > Middle Franconia > Nuremberg (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Privacy-Aware Multi-Device Cooperative Edge Inference with Distributed Resource Bidding

Zhuang, Wenhao, Mao, Yuyi

arXiv.org Artificial Intelligence, Dec-30-2024

Mobile edge computing (MEC) has empowered mobile devices (MDs) in supporting artificial intelligence (AI) applications through collaborative efforts with proximal MEC servers. Unfortunately, despite the great promise of device-edge cooperative AI inference, data privacy becomes an increasing concern. In this paper, we develop a privacy-aware multi-device cooperative edge inference system for classification tasks, which integrates a distributed bidding mechanism for the MEC server's computational resources. Intermediate feature compression is adopted as a principled approach to minimize data privacy leakage. To determine the bidding values and feature compression ratios in a distributed fashion, we formulate a decentralized partially observable Markov decision process (DEC-POMDP) model, for which a multi-agent deep deterministic policy gradient (MADDPG)-based algorithm is developed. Simulation results demonstrate the effectiveness of the proposed algorithm in privacy-preserving cooperative edge inference. Specifically, given a sufficient level of data privacy protection, the proposed algorithm achieves 0.31-0.95% improvements in classification accuracy compared to an approach that is agnostic to the wireless channel conditions. The performance is further enhanced by 1.54-1.67% by considering the difficulties of inference data.

artificial intelligence, machine learning, mec server, (16 more...)

arXiv.org Artificial Intelligence

2412.21069

Country: North America > United States (0.46)

Genre: Research Report > New Finding (0.48)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
