AITopics | alignment


alignment

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question "What is Artificial Intelligence?" and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Can the biggest problems in AI be solved by philosophy?

New Scientist, Jul-6-2026, 06:00:42 GMT

Some of the biggest challenges in artificial intelligence are being worked on not by computer scientists head down in code but by philosophers lured from academia into jobs at AI firms. The philosophers are tasked with making the next generation of models more capable and reliable, but they also shed light on the mystery of consciousness and whether intelligence can be replicated in software alone. Jonathan Birch at the London School of Economics and Political Science says AI companies are the big employers of philosophy PhDs right now, with offers of interesting work, large salaries and stock options proving too tempting for many to resist. "Topics that have been researched in philosophy departments for decades - how to make rational decisions, how to systematise moral principles, what counts as thinking or reasoning or introspection, what counts as evidence of consciousness - are suddenly of massive value to AI companies," says Birch. "So, naturally, we are seeing a huge brain drain."

artificial intelligence, philosopher, social media, (17 more...)

New Scientist

Industry:

Information Technology (0.49)
Marketing (0.42)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence (1.00)


Signed-Permutation Coordinate Transport for RMSNorm Transformers

Sweeney, John

arXiv.org Machine Learning, Jul-1-2026

Modern LLM workflows move coordinate-indexed objects across checkpoints: steering vectors, sparse autoencoders, top-$k$ neuron sets, attribution lists, and merge alignments. This is only well posed after fixing the model's residual-stream gauge, which we show is architecture-dependent: LayerNorm residual charts have permutation gauge $S_d$ (up to a global sign flip), while RMSNorm charts with generic per-channel gain have signed-permutation gauge $B_d = S_d \ltimes \{\pm 1\}^d$. Permutation-only alignment is therefore symmetry-incomplete for RMSNorm models. We introduce sign-marginalized Hungarian matching and prove a sharp failure mode: with decorrelated coordinates, raw signed-correlation matching has a structural permutation-accuracy ceiling at the positive-sign fraction of the true gauge, which sign-marginalization removes. We then make coordinate-preserving transport, not function-level merging, the primary object: composing saved-checkpoint local $B_d$ gauges along same-base fine-tuning trajectories recovers 91.1% of cross-run coordinates at 1500 steps versus 60.3% for endpoint matching, and the gain is not explained by merely routing through the base. The recovered gauge transfers tools that permutation-only alignment breaks: TinyLlama SAE reconstruction has NMSE 0.004 under $B_d$ versus 1.08 under $S_d$; Qwen sentiment steering preserves 95.8% of its effect versus 17.2%; refusal steering reverses sign under $S_d$; coordinate-preserving merges behave the same way. The same covariance governs stateful training: signed transport of AdamW state preserves the resumed trajectory, while permutation-only state follows a different one from a functionally identical checkpoint. Finally, gauge-sweep audits show index-level interpretability claims are reproducible only relative to an explicit gauge.
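The failure mode described in this abstract can be illustrated on synthetic data. The sketch below is not the paper's implementation: it assumes sign-marginalization can be approximated by matching on absolute correlations, it replaces the Hungarian solver with an exhaustive search over permutations (viable only for tiny dimensions; a real pipeline would use an assignment solver such as SciPy's `linear_sum_assignment`), and the names `true_perm`, `true_sign`, and `best_assignment` are hypothetical.

```python
import itertools
import random


def correlation(u, v):
    """Pearson correlation of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / (var_u * var_v) ** 0.5


def best_assignment(score):
    """Exhaustive search for the permutation p maximising sum(score[p[j]][j]).

    Stand-in for a Hungarian solver; only feasible for very small d.
    """
    d = len(score)
    return max(
        itertools.permutations(range(d)),
        key=lambda p: sum(score[p[j]][j] for j in range(d)),
    )


rng = random.Random(0)
d, n = 6, 400
true_perm = [2, 0, 5, 1, 4, 3]       # hypothetical ground-truth permutation
true_sign = [1, -1, 1, -1, -1, 1]    # half of the coordinates are sign-flipped

# Each row of A holds one decorrelated coordinate observed over n samples.
A = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(d)]
# Coordinate j of B is a noisy signed copy of coordinate true_perm[j] of A.
B = [
    [true_sign[j] * a + 0.1 * rng.gauss(0, 1) for a in A[true_perm[j]]]
    for j in range(d)
]

C = [[correlation(A[i], B[j]) for j in range(d)] for i in range(d)]
abs_C = [[abs(c) for c in row] for row in C]

raw_perm = best_assignment(C)        # matching on signed correlations
marg_perm = best_assignment(abs_C)   # matching with the sign marginalised out
signs = [1 if C[marg_perm[j]][j] > 0 else -1 for j in range(d)]

raw_acc = sum(p == t for p, t in zip(raw_perm, true_perm)) / d
marg_acc = sum(p == t for p, t in zip(marg_perm, true_perm)) / d
print("raw accuracy:", raw_acc)
print("sign-marginalised accuracy:", marg_acc)
print("signs recovered:", signs == true_sign)
```

In this toy setting the signed-correlation matching cannot pick a true partner whose correlation is close to -1, so its accuracy is capped by the fraction of positive signs, while the absolute-value variant recovers both the permutation and, afterwards, the signs.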

large language model, machine learning, natural language, (21 more...)

arXiv.org Machine Learning

2606.31963

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.90)


'There's this deep mystery of what, actually, is this thing?': the philosopher inside Google DeepMind

The Guardian, Jun-30-2026, 04:00:37 GMT

Since 2017, Iason Gabriel has worked at the tech giant, trying to anticipate - and think through - the impact of AI. But as commercial and geopolitical pressures escalate, can ethicists make any difference? In 2017, a 33-year-old political philosopher named Iason Gabriel was told by a friend that he ought to apply for a job at DeepMind, the London-based subsidiary of Google where much of its AI research was concentrated. The suggestion was not an obvious one. Gabriel was a cheerful but intense junior academic with a passion for Vipassana meditation and what his brother calls "enthusiastic" rock climbing. At the University of Oxford, where he was a fellow at St John's College, Gabriel taught courses on political theory and wrote papers on the moral contortions of "yuppie ethics" and the ethical blind spots of effective altruism. When he wasn't there, he did crisis work for the United Nations Development Programme in Sudan and Lebanon. DeepMind, meanwhile, was the world's leading AI research lab. In part, this was because it had the financial and computational backing of Google, which had bought the company in 2014 for $650m. In part, it was because DeepMind had recently shown it could put those resources to stunning use. In Seoul, in 2016, a DeepMind system called AlphaGo defeated Lee Sedol, a South Korean Go champion, in a five-game match. The victory was significant not least because of Go's legendary complexity; the game has more possible configurations than there are atoms in the universe. Thanks to the fuss around AlphaGo, Gabriel was aware of DeepMind.

large language model, machine learning, natural language, (19 more...)

The Guardian

Country:

North America > United States (0.68)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.24)
Asia > South Korea > Seoul > Seoul (0.24)
Asia > Middle East > Lebanon (0.24)

Genre: Personal > Obituary (0.34)

Industry:

Transportation (1.00)
Information Technology (1.00)
Government > Intergovernmental Programs (0.88)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Connectivity Estimation using Stochastic Graph Heat Modelling

Goerttler, Stephan, Wu, Min, He, Fei

arXiv.org Machine Learning, Jun-30-2026

A growing number of techniques leverage the spatial structures that underlie many real-world datasets. Despite these advances, the complementary task of estimating spatial structures and understanding their role within these techniques has often been overlooked. In neurophysiological data analysis specifically, numerous methods exist to estimate brain connectivity, but most are not explicitly model-based, dynamic, multivariate, or directed. To address these limitations, we previously introduced noise-driven heat modelling on graphs for neurophysiological connectivity estimation. In this study, we extend this framework by relaxing earlier noise assumptions and adding regularisation to improve robustness. We also develop a simulation procedure to characterise and evaluate our technique in a controlled setting. Finally, we demonstrate that the technique is able to capture meaningful spatial structure across two experiments, each using two real-world datasets. The explicit model formulation of our connectivity estimator has the potential to improve the interpretability of graph-based techniques across a wide range of applications. The code implementing our method is available at https://github.com/sgoerttler/Heat_Connectivity.
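For readers unfamiliar with heat modelling on graphs, the forward model can be sketched as follows. This is a generic graph heat equation with additive noise, simulated with Euler-Maruyama steps; it is not the authors' connectivity estimator (their code is at the URL given in the abstract), and the path graph, step size, and noise level are assumptions chosen for illustration.

```python
import random

# Hypothetical 4-node path graph 0-1-2-3 with unit edge weights.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
n = len(adjacency)
degree = [sum(row) for row in adjacency]
# Combinatorial graph Laplacian L = D - A.
laplacian = [
    [(degree[i] if i == j else 0) - adjacency[i][j] for j in range(n)]
    for i in range(n)
]


def heat_step(x, dt, noise_std, rng):
    """One Euler-Maruyama step of dx = -L x dt + noise_std dW."""
    flow = [sum(laplacian[i][j] * x[j] for j in range(n)) for i in range(n)]
    return [
        x[i] - dt * flow[i] + noise_std * (dt ** 0.5) * rng.gauss(0, 1)
        for i in range(n)
    ]


rng = random.Random(0)
x0 = [1.0, 0.0, 0.0, 0.0]  # all heat starts on node 0

# Noise-free run: heat spreads along edges and the total is conserved.
x = x0
for _ in range(2000):
    x = heat_step(x, dt=0.01, noise_std=0.0, rng=rng)

# Noise-driven run: the same dynamics with stochastic forcing.
y = x0
for _ in range(2000):
    y = heat_step(y, dt=0.01, noise_std=0.1, rng=rng)

print([round(v, 3) for v in x])
print([round(v, 3) for v in y])
```

Connectivity estimation runs in the opposite direction: given observed signals such as `y`, one looks for the graph whose heat dynamics best explain them.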

artificial intelligence, machine learning, simulation, (17 more...)

arXiv.org Machine Learning

2606.29098

Genre: Research Report (0.84)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Health Care Technology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Data Science (0.87)


TimeLAVA: Learning-Agnostic Valuation for Time Series Data

Liu, Wenqin, Quan, Weizhi, Zuo, Aoqi, Gao, Erdun, Nguyen, Vu, Sejdinovic, Dino, Bondell, Howard, Gong, Mingming

arXiv.org Machine Learning, Jun-30-2026

Data valuation quantifies the intrinsic quality of individual samples to enable principled data curation, quality control, and robust learning. For time series in critical domains such as healthcare, finance, and industrial monitoring, effective valuation methods are essential yet fundamentally lacking. Existing approaches are either model-dependent, limiting their generalizability, or designed for i.i.d. data and thus fail to capture temporal dependencies, multi-scale patterns, and non-stationary dynamics inherent to sequential data. We introduce TimeLAVA, a learning-agnostic framework that values temporal segments by their marginal contribution to minimizing distributional discrepancy between evaluated and reference data. At its core is a novel Selective Wavelet-based Wasserstein discrepancy combining multi-scale wavelet transforms for temporal localization with unbalanced optimal transport for robustness to distributional shifts. Segment values are efficiently computed via sensitivity analysis without requiring model training and aggregated into point-wise scores. We provide theoretical guarantees linking valuation to model-agnostic generalization and prove bounded sensitivity to outlier contamination. Extensive experiments across anomaly detection, data pruning, and label noise detection demonstrate that TimeLAVA produces significantly more informative value scores than existing methods on diverse real-world datasets.
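The idea of valuing a segment by its marginal contribution to a distributional discrepancy can be shown with a much simpler stand-in. The sketch below assumes a plain one-dimensional Wasserstein-1 distance and brute-force leave-one-out recomputation; it does not implement the wavelet-based, unbalanced optimal-transport discrepancy or the sensitivity analysis described in the abstract, and the toy segments and reference sample are hypothetical.

```python
def wasserstein_1d(a, b):
    """Wasserstein-1 distance between two empirical 1-D samples.

    Uses W1 = integral of |F_a(t) - F_b(t)| dt over the merged support.
    """
    points = sorted(set(a) | set(b))
    total = 0.0
    for left, right in zip(points, points[1:]):
        cdf_a = sum(v <= left for v in a) / len(a)
        cdf_b = sum(v <= left for v in b) / len(b)
        total += abs(cdf_a - cdf_b) * (right - left)
    return total


def segment_values(segments, reference):
    """Leave-one-out value of each segment against a reference sample.

    value[i] = discrepancy without segment i - discrepancy with all segments,
    so segments that pull the data away from the reference score negative.
    """
    pooled = [v for seg in segments for v in seg]
    base = wasserstein_1d(pooled, reference)
    values = []
    for i in range(len(segments)):
        rest = [v for k, seg in enumerate(segments) if k != i for v in seg]
        values.append(wasserstein_1d(rest, reference) - base)
    return values


# Hypothetical toy series cut into four segments; the third is anomalous.
segments = [[0.1, 0.3, 0.2], [0.2, 0.4, 0.1], [5.0, 6.0, 5.5], [0.3, 0.2, 0.0]]
reference = [0.0, 0.1, 0.2, 0.3, 0.4]

values = segment_values(segments, reference)
print([round(v, 3) for v in values])
```

No model is trained at any point, which is the sense in which such a valuation is learning-agnostic.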

data mining, learning-agnostic valuation, machine learning, (17 more...)

arXiv.org Machine Learning

2606.18729

Country:

North America (0.28)
Asia (0.28)

Genre: Research Report > New Finding (0.67)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Data Science > Data Mining > Anomaly Detection (0.70)
Information Technology > Data Science > Data Quality > Data Transformation (0.67)


ITSPACE: Monotone Gaussian Optimal Transport Updates

Na, Woojoo, Dy, Jennifer

arXiv.org Machine Learning, Jun-30-2026

Covariance matrices serve as compact descriptors of feature distributions in many machine-learning pipelines, including domain adaptation and Gaussian embeddings. Under a centered Gaussian approximation, the unregularized Wasserstein-2 optimal-transport (OT) discrepancy admits a closed form on covariances given by the Bures-Wasserstein (BW) objective on the symmetric positive definite (SPD) cone. We propose ITSPACE (Iterative Transport for Stable Proximal Alignment of Covariance Embeddings), a proximal majorization-minimization method that directly optimizes this exact BW objective through closed-form updates in a square-root factorization. In exact arithmetic, each iteration satisfies a sufficient-decrease inequality for the BW objective; under inexact polar computations, we provide an explicit certificate-gap bound controlling deviations from exact descent. The resulting iterations preserve PSD structure by construction and naturally support rank-restricted factors, making ITSPACE well-suited as a lightweight inner-loop primitive in settings where adaptation must be performed from unlabeled target batches under strict step and compute budgets. Across real-world covariance-alignment benchmarks, ITSPACE reaches low-BW-gap solutions substantially faster than BW-gradient descent, methods based on other covariance geometries, and entropically regularized sample-OT baselines.
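For reference, the closed form mentioned in this abstract is the standard Bures-Wasserstein distance between centered Gaussians. The identities below are textbook results, written with assumed symbols ($\Sigma_1$, $\Sigma_2$ for the covariances, $F_1$, $F_2$ for any square-root factors, $Q$ for an orthogonal matrix); they are not the ITSPACE update rule itself.

```latex
% Squared Wasserstein-2 distance between N(0, \Sigma_1) and N(0, \Sigma_2)
d_{\mathrm{BW}}^2(\Sigma_1, \Sigma_2)
  = \operatorname{tr}(\Sigma_1) + \operatorname{tr}(\Sigma_2)
  - 2 \operatorname{tr}\!\left( \left( \Sigma_1^{1/2} \Sigma_2 \Sigma_1^{1/2} \right)^{1/2} \right)

% Equivalent Procrustes form over square-root factors \Sigma_k = F_k F_k^\top
d_{\mathrm{BW}}^2(\Sigma_1, \Sigma_2)
  = \min_{Q^\top Q = I} \left\| F_1 - F_2 Q \right\|_F^2

% Scalar sanity check: for variances \sigma_1^2 and \sigma_2^2
d_{\mathrm{BW}}^2(\sigma_1^2, \sigma_2^2) = (\sigma_1 - \sigma_2)^2
```

The minimising $Q$ in the second form is the orthogonal polar factor of $F_2^\top F_1$, which is why the accuracy of polar computations matters for methods that work in a square-root factorization.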

artificial intelligence, machine learning, objective, (16 more...)

arXiv.org Machine Learning

2606.30523

Country: Asia (0.28)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.34)


Decision-Aligned Evaluation of Uncertainty Quantification

Schneider, Annika, Rochussen, Tommy, Stiller, Joshua, Fortuin, Vincent

arXiv.org Machine Learning, Jun-26-2026

Uncertainty estimates in machine learning are typically evaluated using generic metrics such as the negative log-likelihood and expected calibration error, yet good performance on such metrics does not necessarily imply high utility in downstream decisions. We introduce decision-alignment, a criterion that reveals which evaluation metrics meaningfully align with downstream utilities. Applying this framework, we show that many widely used uncertainty metrics are either misaligned with common decision problems or encode pathological prior beliefs about the downstream task. We then propose prior-weighted utility metrics, a special class of proper scoring rules that provides decision-aligned uncertainty evaluation. Across benchmark experiments and real-world case studies, our metrics consistently align with realized decision utility, while conventional metrics do not. Our results surface flaws in the current UQ evaluation protocol and offer a principled extension of existing metrics toward decision-relevant UQ evaluation.
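The gap between generic metrics and downstream utility can be made concrete with a toy binary problem. The sketch below is an illustration under stated assumptions, not the paper's prior-weighted utility metric: the ECE uses five equal-width bins on the predicted positive-class probability, the decision rule acts whenever that probability exceeds 0.5, and the payoffs (+1 for a true positive, -1 for a false positive) and both sets of predictions are hypothetical.

```python
import math


def negative_log_likelihood(probs, labels):
    """Mean binary NLL; probs are predicted P(y = 1)."""
    return -sum(
        math.log(p if y == 1 else 1.0 - p) for p, y in zip(probs, labels)
    ) / len(labels)


def expected_calibration_error(probs, labels, n_bins=5):
    """Equal-width-bin ECE on the predicted positive-class probability."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        bins[min(int(p * n_bins), n_bins - 1)].append((p, y))
    ece = 0.0
    for members in bins:
        if members:
            mean_conf = sum(p for p, _ in members) / len(members)
            frequency = sum(y for _, y in members) / len(members)
            ece += len(members) / len(labels) * abs(mean_conf - frequency)
    return ece


def decision_utility(probs, labels, threshold=0.5, gain=1.0, cost=1.0):
    """Average payoff of acting whenever the predicted probability exceeds threshold."""
    total = 0.0
    for p, y in zip(probs, labels):
        if p > threshold:
            total += gain if y == 1 else -cost
    return total / len(labels)


labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
# Model A always predicts the base rate: calibrated but uninformative.
model_a = [0.5] * 10
# Model B discriminates well but is overconfident on two examples.
model_b = [0.95, 0.95, 0.95, 0.95, 0.05, 0.05, 0.05, 0.05, 0.05, 0.95]

for name, probs in (("A", model_a), ("B", model_b)):
    print(
        name,
        round(negative_log_likelihood(probs, labels), 3),
        round(expected_calibration_error(probs, labels), 3),
        round(decision_utility(probs, labels), 3),
    )

ece_a = expected_calibration_error(model_a, labels)
ece_b = expected_calibration_error(model_b, labels)
util_a = decision_utility(model_a, labels)
util_b = decision_utility(model_b, labels)
```

Here calibration error prefers model A, yet model A never triggers an action and earns no utility, while the less calibrated model B earns a positive payoff under this particular decision rule.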

alignment, artificial intelligence, machine learning, (18 more...)

arXiv.org Machine Learning

2606.2699

Country: North America > United States (0.28)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine (1.00)
Energy > Power Industry (0.93)
Banking & Finance > Loans (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
(2 more...)


Demystifying Spectral Feature Learning for Instrumental Variable Regression

Neural Information Processing Systems, Jun-23-2026, 12:42:37 GMT

We address the problem of causal effect estimation in the presence of hidden confounders, using nonparametric instrumental variable (IV) regression. A leading strategy employs spectral features - that is, learned features spanning the top eigensubspaces of the operator linking treatments to instruments. We derive a generalization error bound for a two-stage least squares estimator based on spectral features, and gain insights into the method's performance and failure modes. We show that performance depends on two key factors, leading to a clear taxonomy of outcomes. In a good scenario, the approach is optimal. This occurs with strong spectral alignment, meaning the structural function is well-represented by the top eigenfunctions of the conditional operator, coupled with this operator's slow eigenvalue decay, indicating a strong instrument. Performance degrades in a bad scenario: spectral alignment remains strong, but rapid eigenvalue decay (indicating a weaker instrument) demands significantly more samples for effective feature learning. Finally, in the ugly scenario, weak spectral alignment causes the method to fail, regardless of the eigenvalues' characteristics.
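The two-stage least squares idea behind this abstract is easiest to see in the simplest linear, scalar case. The sketch below assumes a single Gaussian instrument and a linear structural function with a hypothetical true effect of 2.0; it uses no learned spectral features and is only meant to show why an instrument removes the bias that a hidden confounder introduces into ordinary regression.

```python
import random


def covariance(u, v):
    """Population-style covariance of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / n


rng = random.Random(0)
n = 20000
true_effect = 2.0  # hypothetical causal effect of x on y

z = [rng.gauss(0, 1) for _ in range(n)]  # instrument
u = [rng.gauss(0, 1) for _ in range(n)]  # hidden confounder
# Treatment depends on both the instrument and the confounder.
x = [zi + ui + rng.gauss(0, 1) for zi, ui in zip(z, u)]
# Outcome depends on the treatment and, separately, on the confounder.
y = [true_effect * xi + 3.0 * ui + rng.gauss(0, 1) for xi, ui in zip(x, u)]

# Naive regression of y on x absorbs the confounder's influence.
ols = covariance(x, y) / covariance(x, x)

# Stage 1: project the treatment onto the instrument.
gamma = covariance(z, x) / covariance(z, z)
x_hat = [gamma * zi for zi in z]
# Stage 2: regress the outcome on the projected treatment.
tsls = covariance(x_hat, y) / covariance(x_hat, x_hat)

print("OLS estimate:", round(ols, 3))
print("2SLS estimate:", round(tsls, 3))
```

In the nonparametric setting of the abstract, the first-stage projection is replaced by learned features, and the strength of the instrument shows up as the eigenvalue decay of the conditional operator rather than as a single coefficient like `gamma`.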

artificial intelligence, machine learning, operator, (17 more...)

Neural Information Processing Systems

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)


Whose View of Safety DIVE for Pluralistic Alignment of Text to Image Models

Neural Information Processing Systems, Jun-23-2026, 12:23:16 GMT

Current text-to-image (T2I) models often fail to account for diverse human experiences, leading to misaligned systems. We advocate for pluralism in AI alignment, where an AI understands and is steerable towards diverse, and often conflicting, human values. Our work provides three core contributions to achieve this in T2I models. First, we introduce a novel dataset for Diverse Intersectional Visual Evaluation (DIVE) - the first multimodal dataset for pluralistic alignment. It enables deep alignment to diverse safety perspectives through a large pool of demographically intersectional human raters who provided extensive feedback across 1000 prompts, with high replication, capturing nuanced safety perceptions. Second, we empirically confirm demographics as a crucial proxy for diverse viewpoints in this domain, revealing significant, context-dependent differences in harm perception that diverge from conventional evaluations. Finally, we discuss implications for building aligned T2I models, including efficient data collection strategies, LLM judgment capabilities, and model steerability towards diverse perspectives. This research offers foundational tools for more equitable and aligned T2I systems. Content Warning: The paper includes sensitive content that may be harmful.

large language model, machine learning, natural language, (22 more...)

Neural Information Processing Systems

Country: North America > United States (0.68)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.68)

Industry:

Education (0.67)
Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.90)


Generalized Linear Mode Connectivity for Transformers

Neural Information Processing Systems, Jun-23-2026, 12:16:34 GMT

Understanding the geometry of neural network loss landscapes is a central question in deep learning, with implications for generalization and optimization. A striking phenomenon is linear mode connectivity (LMC), where independently trained models can be connected by low- or zero-barrier paths, despite appearing to lie in separate loss basins. However, this is often obscured by symmetries in parameter space, such as neuron permutations, which make functionally equivalent models appear dissimilar. Prior work has predominantly focused on neuron reordering through permutations, but such approaches are limited in scope and fail to capture the richer symmetries exhibited by modern architectures such as Transformers. In this work, we introduce a unified framework that captures four symmetry classes (permutations, semi-permutations, orthogonal transformations, and general invertible maps), broadening the set of valid reparameterizations and subsuming many previous approaches as special cases. Crucially, this generalization enables, for the first time, the discovery of low- and zero-barrier linear interpolation paths between independently trained Vision Transformers and GPT-2 models. Furthermore, our framework extends beyond pairwise alignment, to multi-model and width-heterogeneous settings, enabling alignment across architectures of different sizes. These results reveal deeper structure in the loss landscape and underscore the importance of symmetry-aware analysis for understanding model space geometry. Our code is available here.
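The simplest of the symmetry classes named in this abstract, neuron permutation, can be demonstrated on a tiny network. The sketch below assumes a two-layer ReLU MLP without biases and hypothetical hand-picked weights; it covers only the permutation case, not the semi-permutation, orthogonal, or general invertible maps the paper introduces.

```python
def relu(v):
    return [max(0.0, a) for a in v]


def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def mlp(w1, w2, x):
    """Two-layer ReLU network without biases: w2 @ relu(w1 @ x)."""
    return matvec(w2, relu(matvec(w1, x)))


def permute_hidden(w1, w2, perm):
    """Reorder hidden units: rows of w1 and the matching columns of w2."""
    return [w1[p] for p in perm], [[row[p] for p in perm] for row in w2]


def midpoint(a, b):
    """Elementwise average of two weight matrices."""
    return [[(s + t) / 2 for s, t in zip(ra, rb)] for ra, rb in zip(a, b)]


# Hypothetical weights: 2 inputs, 3 hidden units, 1 output.
w1 = [[1.0, -1.0], [0.5, 2.0], [-1.5, 0.5]]
w2 = [[1.0, -2.0, 0.5]]
x = [1.0, 2.0]

perm = [2, 0, 1]
w1_p, w2_p = permute_hidden(w1, w2, perm)

original = mlp(w1, w2, x)
permuted = mlp(w1_p, w2_p, x)  # same function, different parameters

# Naive midpoint in parameter space between the two equivalent copies.
naive_mid = mlp(midpoint(w1, w1_p), midpoint(w2, w2_p), x)

# Align first by undoing the permutation, then interpolate.
inverse = [perm.index(i) for i in range(len(perm))]
w1_a, w2_a = permute_hidden(w1_p, w2_p, inverse)
aligned_mid = mlp(midpoint(w1, w1_a), midpoint(w2, w2_a), x)

print(original, permuted, naive_mid, aligned_mid)
```

The two parameter sets compute the same output, but their unaligned midpoint does not, which is the kind of apparent barrier that alignment over a symmetry class is meant to remove.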

artificial intelligence, machine learning, natural language, (21 more...)

Neural Information Processing Systems

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)
