AITopics

Country:

Asia > China > Tianjin Province > Tianjin (0.04)
Asia > China > Beijing > Beijing (0.04)
Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Keswani, Vijay, Cousins, Cyrus, Nguyen, Breanna, Conitzer, Vincent, Heidari, Hoda, Borg, Jana Schaich, Sinnott-Armstrong, Walter

Moral Change or Noise? On Problems of Aligning AI With Temporally Unstable Human Feedback

arXiv.org Artificial Intelligence, Nov-14-2025

Alignment methods in moral domains seek to elicit moral preferences of human stakeholders and incorporate them into AI. This presupposes moral preferences as static targets, but such preferences often evolve over time. Proper alignment of AI to dynamic human preferences should ideally account for "legitimate" changes to moral reasoning, while ignoring changes related to attention deficits, cognitive biases, or other arbitrary factors. However, common AI alignment approaches largely neglect temporal changes in preferences, posing serious challenges to proper alignment, especially in high-stakes applications of AI, e.g., in healthcare domains, where misalignment can jeopardize the trustworthiness of the system and yield serious individual and societal harms. This work investigates the extent to which people's moral preferences change over time, and the impact of such changes on AI alignment. Our study is grounded in the kidney allocation domain, where we elicit responses to pairwise comparisons of hypothetical kidney transplant patients from over 400 participants across 3-5 sessions. We find that, on average, participants change their response to the same scenario presented at different times around 6-20% of the time (exhibiting "response instability"). Additionally, we observe significant shifts in several participants' retrofitted decision-making models over time (capturing "model instability"). The predictive performance of simple AI models decreases as a function of both response and model instability. Moreover, predictive performance diminishes over time, highlighting the importance of accounting for temporal changes in preferences during training. These findings raise fundamental normative and technical challenges relevant to AI alignment, highlighting the need to better understand the object of alignment (what to align to) when user preferences change significantly over time.
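One way to make the "response instability" figure concrete is sketched below. This is a minimal illustration, not the authors' measurement code: the record layout, the function name `response_instability`, and the choice to compare consecutive presentations of the same scenario are all assumptions made for the example.

```python
from collections import defaultdict

def response_instability(records):
    """Per-participant fraction of repeated presentations whose answer changed.

    records: iterable of (participant, scenario, session, choice) tuples.
    This layout is hypothetical; the abstract does not specify a data format.
    """
    # Group each participant's choices per scenario.
    by_key = defaultdict(list)
    for participant, scenario, session, choice in records:
        by_key[(participant, scenario)].append((session, choice))

    changed = defaultdict(int)
    compared = defaultdict(int)
    for (participant, _scenario), seen in by_key.items():
        seen.sort()  # order presentations by session number
        # Assumption: compare each presentation with the next one in time.
        for (_, earlier), (_, later) in zip(seen, seen[1:]):
            compared[participant] += 1
            changed[participant] += earlier != later
    return {p: changed[p] / compared[p] for p in compared}

# Toy data: p1 flips once on scenario s1, p2 never flips.
records = [
    ("p1", "s1", 1, "A"), ("p1", "s1", 2, "B"), ("p1", "s1", 3, "B"),
    ("p1", "s2", 1, "A"), ("p1", "s2", 2, "A"),
    ("p2", "s1", 1, "B"), ("p2", "s1", 3, "B"),
]
rates = response_instability(records)
```

In this toy data `p1` changes one of three repeated answers and `p2` changes none. Comparing every pair of presentations instead of consecutive ones would be an equally defensible convention.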

artificial intelligence, machine learning, natural language, (20 more...)

2511.10032

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Therapeutic Area > Nephrology (0.88)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Neural Information Processing Systems, Aug-19-2025, 21:31:59 GMT

faacb7a4827b4d51e201666b93ab5fa7-Supplemental-Conference.pdf

artificial intelligence, machine learning, target model, (13 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Neural Information Processing Systems, Aug-16-2025, 12:03:49 GMT

a113c1ecd3cace2237256f4c712f61b5-Supplemental.pdf

artificial intelligence, distortion 2, machine learning, (15 more...)

Country:

Asia > China > Beijing > Beijing (0.04)
North America > United States > New York > Erie County > Buffalo (0.04)
Asia > Middle East > Jordan (0.04)

Industry: Information Technology > Security & Privacy (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Neural Information Processing Systems, Aug-15-2025, 02:52:26 GMT

Appendix: Learning Black-Box Attackers with Transferable Priors and Query Feedback Jiancheng Yang

In Figure A1, we illustrate the gradients from Inception-V3 [15] and ResNet-152 [9]. These authors have contributed equally. Output: updated surrogate model S. The experiment setting and images are the same as in the previous state of the art [2]. We therefore also report AVG.Q', which includes failures (in Tables A1, A2, A3, A4, and A5), where the query count of a failed attack is taken to be 10,000.
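The failure-inclusive query average described in the excerpt can be sketched as follows. The function name, the `(succeeded, queries_used)` record layout, and the reading of plain AVG.Q as a success-only average are assumptions for illustration; only the 10,000-query value assigned to failures comes from the excerpt.

```python
FAILURE_QUERIES = 10_000  # query count assigned to a failed attack, per the excerpt

def average_queries(attempts, include_failures=False):
    """Average query count over attack attempts.

    attempts: list of (succeeded, queries_used) pairs (hypothetical layout).
    include_failures=False averages successful attacks only (assumed AVG.Q);
    include_failures=True counts each failure as FAILURE_QUERIES (AVG.Q').
    """
    if include_failures:
        counts = [q if ok else FAILURE_QUERIES for ok, q in attempts]
    else:
        counts = [q for ok, q in attempts if ok]
    return sum(counts) / len(counts) if counts else float("nan")

attempts = [(True, 200), (True, 400), (False, 10_000), (False, 7_500)]
avg_q = average_queries(attempts)                               # successes only
avg_q_prime = average_queries(attempts, include_failures=True)  # failures as 10,000
```

With these made-up attempts the success-only average is 300 queries, while the failure-inclusive average rises to 5,150, which shows how strongly failures dominate the second metric.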

adversarial attack, asr, leba, (14 more...)

Country:

Asia > China > Shanghai > Shanghai (0.05)
North America > Canada (0.04)

Industry: Transportation > Air (0.43)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.47)

Neural Information Processing Systems, Aug-15-2025, 02:52:18 GMT

Learning Black-Box Attackers with Transferable Priors and Query Feedback Jiancheng Yang

Inspired by consistency of visual saliency between different vision models, a surrogate model is expected to improve the attack performance via transferability.

black-box attack, surrogate model, victim model, (14 more...)

Country:

Asia > China > Shanghai > Shanghai (0.04)
North America > Canada (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine (0.93)
Information Technology > Security & Privacy (0.49)
Transportation > Air (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Sensing and Signal Processing > Image Processing (0.94)
Information Technology > Artificial Intelligence > Vision (0.88)

Neural Information Processing Systems, Aug-14-2025, 21:38:20 GMT

544696ef4847c903376ed6ec58f3a703-Paper-Conference.pdf

adversarial example, decision-based attack, noise, (13 more...)

Country:

Africa > Madagascar (0.04)
North America > Canada > Newfoundland and Labrador > Newfoundland (0.04)
Asia > China > Tianjin Province > Tianjin (0.04)
(2 more...)

Industry: Leisure & Entertainment > Sports > Golf (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Kannan, Ravindran, Bhattacharyya, Chiranjib, Kacham, Praneeth, Woodruff, David P.

LevAttention: Time, Space, and Streaming Efficient Algorithm for Heavy Attentions

arXiv.org Artificial Intelligence, Oct-7-2024

A central problem related to transformers can be stated as follows: given two $n \times d$ matrices $Q$ and $K$, and a non-negative function $f$, define the matrix $A$ as follows: (1) apply the function $f$ to each entry of the $n \times n$ matrix $Q K^T$, and then (2) normalize each of the row sums of $A$ to be equal to $1$. The matrix $A$ can be computed in $O(n^2 d)$ time assuming $f$ can be applied to a number in constant time, but the quadratic dependence on $n$ is prohibitive in applications where it corresponds to long context lengths. For a large class of functions $f$, we show how to find all the ``large attention scores'', i.e., entries of $A$ which are at least a positive value $\varepsilon$, in time with linear dependence on $n$ (i.e., $n \cdot \textrm{poly}(d/\varepsilon)$) for a positive parameter $\varepsilon > 0$. Our class of functions includes all functions $f$ of the form $f(x) = |x|^p$, as explored recently in transformer models. Using recently developed tools from randomized numerical linear algebra, we prove that for any $K$, there is a ``universal set'' $U \subset [n]$ of size independent of $n$, such that for any $Q$ and any row $i$, the large attention scores $A_{i,j}$ in row $i$ of $A$ all have $j \in U$. We also find $U$ in $n \cdot \textrm{poly}(d/\varepsilon)$ time. Notably, (1) we make no assumptions on the data, (2) our workspace does not grow with $n$, and (3) our algorithms can be run in streaming and parallel settings. We call the attention mechanism that uses only the subset of keys in the universal set LevAttention, since our algorithm to identify the universal set $U$ is based on leverage scores. We empirically show the benefits of our scheme for vision transformers: we show how to train new models that also use our universal set during training, and that our model consistently selects ``important keys'' during training.
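The problem statement in this abstract can be illustrated with the quadratic-time baseline it describes. The sketch below is not LevAttention and does not compute leverage scores or the universal set $U$; it only builds $A$ entry by entry for $f(x) = |x|^p$ and lists the entries that are at least $\varepsilon$. The function names and the toy matrices are assumptions made for the example.

```python
def attention_matrix(Q, K, p=2):
    """Brute-force A: apply f(x) = |x|**p to each entry of Q K^T, then row-normalize.

    Q and K are lists of equal-length rows. Cost is O(n^2 d), the baseline
    the abstract wants to avoid for long contexts.
    """
    A = []
    for q in Q:
        scores = [abs(sum(qi * ki for qi, ki in zip(q, k))) ** p for k in K]
        total = sum(scores)
        # Guard against an all-zero row so normalization never divides by zero.
        A.append([s / total for s in scores] if total > 0 else [0.0] * len(K))
    return A

def large_scores(A, eps):
    """Return (i, j) index pairs whose attention score is at least eps."""
    return [(i, j) for i, row in enumerate(A) for j, a in enumerate(row) if a >= eps]

Q = [[1.0, 0.0], [0.0, 1.0]]
K = [[2.0, 0.0], [1.0, 1.0], [0.0, 0.5]]
A = attention_matrix(Q, K, p=2)
heavy = large_scores(A, eps=0.5)
```

Here each row of `A` sums to 1, and `heavy` holds one large entry per query. Because the rows sum to 1, a row can contain at most $1/\varepsilon$ entries of size at least $\varepsilon$, which is why asking for all large scores is a bounded-output question.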

attention mass, non-local attention, query number, (14 more...)

2410.05462

Country:

Europe > Austria > Vienna (0.14)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.67)
Information Technology > Artificial Intelligence > Vision (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.34)

arXiv.org Artificial Intelligence, Jul-14-2024

A3S: A General Active Clustering Method with Pairwise Constraints

Deng, Xun, Liu, Junlong, Zhong, Han, Feng, Fuli, Shen, Chen, He, Xiangnan, Ye, Jieping, Wang, Zheng

Active clustering aims to boost the clustering performance by integrating human-annotated pairwise constraints through strategic querying. Conventional approaches with semi-supervised clustering schemes encounter high query costs when applied to large datasets with numerous classes. To address these limitations, we propose a novel Adaptive Active Aggregation and Splitting (A3S) framework, falling within the cluster-adjustment scheme in active clustering. A3S features strategic active clustering adjustment on the initial cluster result, which is obtained by an adaptive clustering algorithm. In particular, our cluster adjustment is inspired by the quantitative analysis of Normalized mutual information gain under the information theory framework and can provably improve the clustering quality. The proposed A3S framework significantly elevates the performance and scalability of active clustering. In extensive experiments across diverse real-world datasets, A3S achieves desired results with significantly fewer human queries compared with existing methods.
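The abstract refers to normalized mutual information (NMI) gain without giving a formula, so the following is only a sketch of how such a gain could be measured. It assumes NMI normalized by the arithmetic mean of the two entropies, which is one common convention and may differ from the paper's definition; the cluster labels and the single merge step are made up for illustration and are not the A3S procedure.

```python
from collections import Counter
from math import log

def entropy(labels):
    """Shannon entropy (natural log) of a label assignment."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())

def nmi(pred, truth):
    """Mutual information divided by the mean of the two entropies (assumed convention)."""
    n = len(pred)
    joint = Counter(zip(pred, truth))
    count_pred, count_truth = Counter(pred), Counter(truth)
    mi = sum(
        (c / n) * log(c * n / (count_pred[a] * count_truth[b]))
        for (a, b), c in joint.items()
    )
    denom = (entropy(pred) + entropy(truth)) / 2
    return mi / denom if denom > 0 else 1.0

truth = [0, 0, 1, 1]
over_split = [0, 1, 2, 2]   # initial clustering splits the first class in two
merged = [0, 0, 1, 1]       # after aggregating clusters 0 and 1

gain = nmi(merged, truth) - nmi(over_split, truth)
```

In this toy case merging the two fragments raises NMI from 0.8 to 1.0, so an adjustment step that asks a human whether the two fragments belong together would be worth a gain of about 0.2.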

constraint, general active clustering method, query, (12 more...)

2407.10196

Country:

Europe > Austria > Vienna (0.14)
South America > Paraguay > Asunción > Asunción (0.04)
Asia > China > Jiangsu Province > Nanjing (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

arXiv.org Artificial Intelligence, Jul-12-2024

SemiAdv: Query-Efficient Black-Box Adversarial Attack with Unlabeled Images

Fan, Mingyuan, Liu, Yang, Chen, Cen, Liu, Ximeng

Adversarial attacks have garnered considerable attention due to their profound implications for the secure deployment of robots in sensitive security scenarios. To potentially push for advances in the field, this paper studies adversarial attacks in the black-box setting and proposes an unlabeled data-driven adversarial attack method, called SemiAdv. Specifically, SemiAdv achieves the following breakthroughs compared with previous works. First, by introducing semi-supervised learning into the adversarial attack, SemiAdv substantially decreases the number of queries required for generating adversarial samples. On average, SemiAdv needs only a few hundred queries to launch an effective attack with a success rate above 90%. Second, many existing black-box adversarial attacks require massive labeled data to mitigate the difference between the local substitute model and the remote target model in order to attack well. SemiAdv relaxes this limitation and can use unlabeled raw data to launch an effective attack. Finally, our experiments show that SemiAdv saves up to 12x query accesses for generating adversarial samples while maintaining a competitive attack success rate compared with state-of-the-art attacks.

adversarial attack, semiadv, target model, (16 more...)