Ask, Attend, Attack: An Effective Decision-Based Black-Box Targeted Attack for Image-to-Text Models
While image-to-text models have demonstrated significant advancements in various vision-language tasks, they remain susceptible to adversarial attacks. Existing white-box attacks on image-to-text models require access to the architecture, gradients, and parameters of the target model, resulting in low practicality. Although the recently proposed gray-box attacks have improved practicality, they suffer from semantic loss during the training process, which limits their targeted attack performance. To advance adversarial attacks on image-to-text models, this paper focuses on a challenging scenario: decision-based black-box targeted attacks, where attackers have access only to the final output text. Specifically, we formulate the decision-based black-box targeted attack as a large-scale optimization problem. To solve this optimization problem efficiently, we propose a three-stage process, \textit{Ask, Attend, Attack} (\textit{AAA}), that coordinates with the solver.
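The core loop of a decision-based attack can be illustrated with a minimal random-search sketch. This is not the authors' AAA procedure; `toy_model`, `caption_overlap`, and all parameters below are illustrative assumptions. The key property it demonstrates is that the attacker observes only the output text and keeps perturbations that move the caption closer to the target.

```python
import numpy as np

def caption_overlap(caption: str, target: str) -> float:
    """Word-overlap score between the model's output text and the target text."""
    cap, tgt = set(caption.lower().split()), set(target.lower().split())
    return len(cap & tgt) / max(len(tgt), 1)

def decision_based_targeted_attack(model, image, target_text,
                                   eps=0.2, steps=200, seed=0):
    """Random-search targeted attack using only decision (text) feedback.

    Proposes perturbations inside an L_inf ball of radius `eps` around
    `image` and keeps any candidate whose caption overlaps more with
    `target_text`. No gradients or internal scores of `model` are used.
    """
    rng = np.random.default_rng(seed)
    adv = image.copy()
    best = caption_overlap(model(adv), target_text)
    for _ in range(steps):
        noise = rng.uniform(-eps, eps, size=image.shape)
        # Project the proposal back into the eps-ball and the valid pixel range.
        cand = np.clip(image + np.clip(adv - image + noise, -eps, eps), 0.0, 1.0)
        score = caption_overlap(model(cand), target_text)
        if score > best:  # decision feedback only: keep if the caption moved closer
            adv, best = cand, score
    return adv, best

# Toy stand-in for an image-to-text model (illustrative assumption).
def toy_model(img):
    return "a cat" if img.mean() > 0.5 else "a dog"

img = np.full((4, 4), 0.45)
adv, score = decision_based_targeted_attack(toy_model, img, "a cat", eps=0.2)
```

In a real attack the overlap score would be replaced by a semantic similarity between the returned caption and the target text, and the naive random search by a large-scale optimizer, which is exactly the gap the paper's three-stage process addresses.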
We thank all reviewers for their comments, and will incorporate suggestions in the final version. We compare the proposed algorithms with baseline algorithms on the U.S. 2000 Census Data. All algorithms are implemented in Python 3.7. We also calculate the optimal solution to verify the approximation ratio. See Table 1 in our submission for definitions. We will add more discussion on this in the final version.
Street Review: A Participatory AI-Based Framework for Assessing Streetscape Inclusivity
Mushkani, Rashid, Koseki, Shin
City streets, sidewalks, and public areas often serve as primary interaction points among diverse user groups, including residents, commuters, and visitors (Gehl, 2011). These spaces carry social, economic, and cultural significance that influences navigation and user experience (Mitrašinović & Mehta, 2021). Municipal governments and planning agencies recognize the importance of inclusive public spaces but face challenges in operationalizing inclusivity (Anttiroiko & De Jong, 2020). Traditional approaches may draw on universal design principles intended to accommodate a broad range of users, but these frameworks often take a one-size-fits-all approach that prioritizes physical accessibility over the social and cultural dimensions of public space use (Low, 2020). In multicultural cities, where multiple languages, cultures, and religious practices converge, these complexities become particularly evident (Fan et al., 2023; Litman, 2025; Salgado et al., 2021; Youngbloom et al., 2023).

Research on inclusive design has provided valuable insights, but few methods combine qualitative depth with quantitative scale to understand inclusivity in urban contexts (Anttiroiko & De Jong, 2020; Mehta, 2019; Zamanifard et al., 2019). Ethnographic research and interviews offer detailed perspectives on lived experience, while computer vision and machine learning enable assessments at larger scales (Ibrahim et al., 2020). However, large-scale computational approaches often overlook intersectional dimensions (Zhu et al., 2025). This gap calls for integrated models that merge qualitative and quantitative methodologies.
Scalable Supervising Software Agents with Patch Reasoner
Xu, Junjielong, Tan, Boyin, Liu, Xiaoyuan, Peng, Chao, Gao, Pengfei, He, Pinjia
While large language model agents have advanced software engineering tasks, the unscalable nature of existing test-based supervision limits the potential gains from data scaling. The reason is twofold: (1) building and running test sandboxes is heavy and fragile, and (2) data with high-coverage tests is naturally rare and threatened by test hacking via edge cases. In this paper, we propose R4P, a patch verifier model that provides scalable rewards for training and testing SWE agents via reasoning. We consider patch verification fundamentally a reasoning task, mirroring how human repository maintainers review patches without writing and running new reproduction tests. To obtain sufficient reference and reduce the risk of reward hacking, R4P uses a group-wise objective for RL training, enabling it to verify multiple patches against each other's modifications and gain a dense reward for stable training. R4P achieves 72.2% accuracy in verifying patches from SWE-bench-verified, surpassing OpenAI o3. To demonstrate R4P's practicality, we design and train a lite scaffold, Mini-SE, with pure reinforcement learning where all rewards are derived from R4P. Mini-SE achieves 26.2% Pass@1 on SWE-bench-verified, a 10.0% improvement over the original Qwen3-32B, which improves further to 32.8% when R4P is used for test-time scaling. Furthermore, R4P verifies a patch within a second, 50x faster on average than running tests. The stable scaling curves of rewards and accuracy, together with this high efficiency, reflect R4P's practicality.
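The abstract does not spell out R4P's group-wise objective, but a common group-relative reward scheme (GRPO-style) can sketch the idea: verifier scores for a group of candidate patches on the same issue are normalized against each other, so each patch earns a dense reward relative to its peers rather than against an absolute pass/fail threshold. The function name and scores below are illustrative assumptions, not R4P's actual implementation.

```python
import statistics

def group_relative_rewards(scores):
    """GRPO-style group normalization: each candidate patch is rewarded
    relative to the other patches in its group, not an absolute threshold."""
    mean = statistics.fmean(scores)
    std = statistics.pstdev(scores) or 1.0  # avoid division by zero for uniform groups
    return [(s - mean) / std for s in scores]

# Hypothetical verifier scores for four candidate patches of one issue.
scores = [0.9, 0.2, 0.7, 0.2]
rewards = group_relative_rewards(scores)
```

Because rewards are centered within each group, they sum to zero: the best patch is pushed up exactly as much as the rest are pushed down, which yields a dense, well-scaled training signal even when no patch fully passes.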
Reviewer #1, 2 > "the computational complexity is not studied or evaluated, so the practicality of this approach might look questionable"
We would like to thank the reviewers for their time and helpful comments, and we will clarify and fix the paper as suggested. Thank you for pointing this out; note that the Batch-RL setup is constrained by samples rather than by computational complexity. There was a tradeoff between explaining the ideas fully and satisfying the page-limit constraints.
Towards Evaluation for Real-World LLM Unlearning
Miao, Ke, Hu, Yuke, Li, Xiaochen, Bao, Wenjie, Liu, Zhihao, Qin, Zhan, Ren, Kui
This paper analyzes the limitations of existing unlearning evaluation metrics in terms of practicality, exactness, and robustness in real-world LLM unlearning scenarios. To overcome these limitations, we propose a new metric called Distribution Correction-based Unlearning Evaluation (DCUE). It identifies core tokens, corrects distributional biases in their confidence scores using a validation set, and quantifies the result with the Kolmogorov-Smirnov test. Experimental results demonstrate that DCUE overcomes the limitations of existing metrics and can guide the design of more practical and reliable unlearning algorithms in the future.
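The full DCUE procedure (core-token selection and the exact bias correction) is not given in the abstract; the sketch below illustrates only the final quantification step, comparing two confidence distributions with a two-sample Kolmogorov-Smirnov statistic. The confidence arrays are synthetic assumptions: if core-token confidences on the forget set are statistically indistinguishable from confidences on held-out validation data, the KS gap is small and unlearning looks effective.

```python
import numpy as np

def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap
    between the empirical CDFs of samples `a` and `b`."""
    a, b = np.sort(a), np.sort(b)
    grid = np.concatenate([a, b])
    cdf_a = np.searchsorted(a, grid, side="right") / len(a)
    cdf_b = np.searchsorted(b, grid, side="right") / len(b)
    return float(np.max(np.abs(cdf_a - cdf_b)))

rng = np.random.default_rng(0)
forget_conf = rng.normal(0.30, 0.10, 500)      # synthetic core-token confidences after unlearning
validation_conf = rng.normal(0.31, 0.10, 500)  # synthetic confidences on data the model never memorized
gap = ks_statistic(forget_conf, validation_conf)  # small gap: distributions match
```

Comparing whole distributions rather than mean confidence is what makes such a test robust: an unlearning method that merely shifts confidences uniformly, without matching the validation distribution, still produces a large KS gap.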