AITopics | scot

Collaborating Authors

scot

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Reverse Engineering User Stories from Code using Large Language Models

Ouf, Mohamed, Li, Haoyu, Zhang, Michael, Guizani, Mariam

arXiv.org Artificial IntelligenceSep-25-2025

User stories are essential in agile development, yet often missing or outdated in legacy and poorly documented systems. We investigate whether large language models (LLMs) can automatically recover user stories directly from source code and how prompt design impacts output quality. Using 1,750 annotated C++ snippets of varying complexity, we evaluate five state-of-the-art LLMs across six prompting strategies. Results show that all models achieve, on average, an F1 score of 0.8 for code up to 200 NLOC. Our findings show that a single illustrative example enables the smallest model (8B) to match the performance of a much larger 70B model. In contrast, structured reasoning via Chain-of-Thought offers only marginal gains, primarily for larger models.

large language model, machine learning, user story, (15 more...)

arXiv.org Artificial Intelligence

2509.19587

Country: North America > Canada (0.15)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Efficient Reasoning for LLMs through Speculative Chain-of-Thought

Wang, Jikai, Li, Juntao, Hou, Jianye, Yan, Bowen, Wu, Lijun, Zhang, Min

arXiv.org Artificial IntelligenceMay-27-2025

Large reasoning language models such as OpenAI-o1 and Deepseek-R1 have recently attracted widespread attention due to their impressive task-solving abilities. However, the enormous model size and the generation of lengthy thought chains introduce significant reasoning costs and response latency. Existing methods for efficient reasoning mainly focus on reducing the number of model parameters or shortening the chain-of-thought length. In this paper, we introduce Speculative Chain-of-Thought (SCoT), which reduces reasoning latency from another perspective by accelerated average reasoning speed through large and small model collaboration. SCoT conducts thought-level drafting using a lightweight draft model. Then it selects the best CoT draft and corrects the error cases with the target model. The proposed thinking behavior alignment improves the efficiency of drafting and the draft selection strategy maintains the prediction accuracy of the target model for complex tasks. Experimental results on GSM8K, MATH, GaoKao, CollegeMath and Olympiad datasets show that SCoT reduces reasoning latency by 48\%$\sim$66\% and 21\%$\sim$49\% for Deepseek-R1-Distill-Qwen-32B and Deepseek-R1-Distill-Llama-70B while achieving near-target-model-level performance. Our code is available at https://github.com/Jikai0Wang/Speculative_CoT.

large language model, machine learning, target model, (20 more...)

arXiv.org Artificial Intelligence

2504.19095

Country: Asia > Middle East > UAE (0.28)

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.98)

Add feedback

Do Inuit languages really have many words for snow?

This article was originally featured on The Conversation. Languages are windows into the worlds of the people who speak them – reflecting what they value and experience daily. So perhaps it's no surprise different languages highlight different areas of vocabulary. Scholars have noted that Mongolian has many horse-related words, that Maori has many words for ferns, and Japanese has many words related to taste. Some links are unsurprising, such as German having many words related to beer, or Fijian having many words for fish.

canadian inuktitut, language and concept, snow, (11 more...)

Popular Science

Country:

North America > United States > Alaska (0.05)
Africa > South Africa (0.05)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Natural Language (0.37)

Add feedback

Enhancing Model Defense Against Jailbreaks with Proactive Safety Reasoning

Yang, Xianglin, Deng, Gelei, Shi, Jieming, Zhang, Tianwei, Dong, Jin Song

arXiv.org Artificial IntelligenceJan-31-2025

Large language models (LLMs) are vital for a wide range of applications yet remain susceptible to jailbreak threats, which could lead to the generation of inappropriate responses. Conventional defenses, such as refusal and adversarial training, often fail to cover corner cases or rare domains, leaving LLMs still vulnerable to more sophisticated attacks. We propose a novel defense strategy, Safety Chain-of-Thought (SCoT), which harnesses the enhanced reasoning capabilities of LLMs for proactive assessment of harmful inputs, rather than simply blocking them. SCoT augments any refusal training datasets to critically analyze the intent behind each request before generating answers. By employing proactive reasoning, SCoT enhances the generalization of LLMs across varied harmful queries and scenarios not covered in the safety alignment corpus. Additionally, it generates detailed refusals specifying the rules violated. Comparative evaluations show that SCoT significantly surpasses existing defenses, reducing vulnerability to out-of-distribution issues and adversarial manipulations while maintaining strong general capabilities. The code and data is available at https://anonymous.4open.science/r/SCoT-D4D9.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2501.1918

Country:

Asia > Singapore (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Monaco (0.04)
(3 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Government (1.00)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

SCOT: Self-Supervised Contrastive Pretraining For Zero-Shot Compositional Retrieval

Jawade, Bhavin, Soares, Joao V. B., Thadani, Kapil, Mohan, Deen Dayal, Eshratifar, Amir Erfan, Culpepper, Benjamin, de Juan, Paloma, Setlur, Srirangaraj, Govindaraju, Venu

arXiv.org Artificial IntelligenceJan-12-2025

Compositional image retrieval (CIR) is a multimodal learning task where a model combines a query image with a user-provided text modification to retrieve a target image. CIR finds applications in a variety of domains including product retrieval (e-commerce) and web search. Existing methods primarily focus on fully-supervised learning, wherein models are trained on datasets of labeled triplets such as FashionIQ and CIRR. This poses two significant challenges: (i) curating such triplet datasets is labor intensive; and (ii) models lack generalization to unseen objects and domains. In this work, we propose SCOT (Self-supervised COmpositional Training), a novel zero-shot compositional pretraining strategy that combines existing large image-text pair datasets with the generative capabilities of large language models to contrastively train an embedding composition network. Specifically, we show that the text embedding from a large-scale contrastively-pretrained vision-language model can be utilized as proxy target supervision during compositional pretraining, replacing the target image embedding. In zero-shot settings, this strategy surpasses SOTA zero-shot compositional retrieval methods as well as many fully-supervised methods on standard benchmarks such as FashionIQ and CIRR.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2501.08347

Country: Europe (0.46)

Genre: Research Report (0.82)

Industry: Information Technology (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

MobileSafetyBench: Evaluating Safety of Autonomous Agents in Mobile Device Control

Lee, Juyong, Hahm, Dongyoon, Choi, June Suk, Knox, W. Bradley, Lee, Kimin

arXiv.org Artificial IntelligenceDec-10-2024

Autonomous agents powered by large language models (LLMs) show promising potential in assistive tasks across various domains, including mobile device control. As these agents interact directly with personal information and device settings, ensuring their safe and reliable behavior is crucial to prevent undesirable outcomes. However, no benchmark exists for standardized evaluation of the safety of mobile device-control agents. In this work, we introduce MobileSafetyBench, a benchmark designed to evaluate the safety of device-control agents within a realistic mobile environment based on Android emulators. We develop a diverse set of tasks involving interactions with various mobile applications, including messaging and banking applications, challenging agents with managing risks encompassing misuse and negative side effects. These tasks include tests to evaluate the safety of agents in daily scenarios as well as their robustness against indirect prompt injection attacks. Our experiments demonstrate that baseline agents, based on state-of-the-art LLMs, often fail to effectively prevent harm while performing the tasks. To mitigate these safety concerns, we propose a prompting method that encourages agents to prioritize safety considerations. While this method shows promise in promoting safer behaviors, there is still considerable room for improvement to fully earn user trust. This highlights the urgent need for continued research to develop more robust safety mechanisms in mobile environments. We open-source our benchmark at: https://mobilesafetybench.github.io/.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2410.1752

Country:

North America > United States > New York (0.04)
Asia > Japan > Honshū > Chūbu > Toyama Prefecture > Toyama (0.04)
North America > United States > Texas > Travis County > Austin (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Communications > Mobile (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.99)

Add feedback

Strategic Chain-of-Thought: Guiding Accurate Reasoning in LLMs through Strategy Elicitation

Wang, Yu, Zhao, Shiwan, Wang, Zhihu, Huang, Heyuan, Fan, Ming, Zhang, Yubo, Wang, Zhixing, Wang, Haijun, Liu, Ting

arXiv.org Artificial IntelligenceSep-5-2024

The Chain-of-Thought (CoT) paradigm has emerged as a critical approach for enhancing the reasoning capabilities of large language models (LLMs). However, despite their widespread adoption and success, CoT methods often exhibit instability due to their inability to consistently ensure the quality of generated reasoning paths, leading to sub-optimal reasoning performance. To address this challenge, we propose the \textbf{Strategic Chain-of-Thought} (SCoT), a novel methodology designed to refine LLM performance by integrating strategic knowledge prior to generating intermediate reasoning steps. SCoT employs a two-stage approach within a single prompt: first eliciting an effective problem-solving strategy, which is then used to guide the generation of high-quality CoT paths and final answers. Our experiments across eight challenging reasoning datasets demonstrate significant improvements, including a 21.05\% increase on the GSM8K dataset and 24.13\% on the Tracking\_Objects dataset, respectively, using the Llama3-8b model. Additionally, we extend the SCoT framework to develop a few-shot method with automatically matched demonstrations, yielding even stronger results. These findings underscore the efficacy of SCoT, highlighting its potential to substantially enhance LLM performance in complex reasoning tasks.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2409.03271

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Middle East > Jordan (0.05)
North America > United States > District of Columbia > Washington (0.04)
(2 more...)

Genre: Research Report > New Finding (0.87)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Structured Chain-of-Thought Prompting for Code Generation

Li, Jia, Li, Ge, Li, Yongmin, Jin, Zhi

arXiv.org Artificial IntelligenceSep-7-2023

Large Language Models (LLMs) (e.g., ChatGPT) have shown impressive performance in code generation. LLMs take prompts as inputs, and Chain-of-Thought (CoT) prompting is the state-of-the-art prompting technique. CoT prompting asks LLMs first to generate CoTs (i.e., intermediate natural language reasoning steps) and then output the code. However, CoT prompting is designed for natural language generation and has low accuracy in code generation. In this paper, we propose Structured CoTs (SCoTs) and present a novel prompting technique for code generation, named SCoT prompting. Our motivation is source code contains rich structural information and any code can be composed of three program structures (i.e., sequence, branch, and loop structures). Intuitively, structured intermediate reasoning steps make for structured source code. Thus, we ask LLMs to use program structures to build CoTs, obtaining SCoTs. Then, LLMs generate the final code based on SCoTs. Compared to CoT prompting, SCoT prompting explicitly constrains LLMs to think about how to solve requirements from the view of source code and further the performance of LLMs in code generation. We apply SCoT prompting to two LLMs (i.e., ChatGPT and Codex) and evaluate it on three benchmarks (i.e., HumanEval, MBPP, and MBCPP). (1) SCoT prompting outperforms the state-of-the-art baseline - CoT prompting by up to 13.79% in Pass@1. (2) Human evaluation shows human developers prefer programs from SCoT prompting. (3) SCoT prompting is robust to examples and achieves substantial improvements.

code generation, requirement, scot, (13 more...)

arXiv.org Artificial Intelligence

2305.06599

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > District of Columbia > Washington (0.05)
Asia > China > Beijing > Beijing (0.04)
(8 more...)

Genre: Research Report > New Finding (0.46)

Industry: Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

AVIDA: Alternating method for Visualizing and Integrating Data

Dover, Kathryn, Cang, Zixuan, Ma, Anna, Nie, Qing, Vershynin, Roman

arXiv.org Artificial IntelligenceApr-7-2023

High-dimensional multimodal data arises in many scientific fields. The integration of multimodal data becomes challenging when there is no known correspondence between the samples and the features of different datasets. To tackle this challenge, we introduce AVIDA, a framework for simultaneously performing data alignment and dimension reduction. In the numerical experiments, Gromov-Wasserstein optimal transport and t-distributed stochastic neighbor embedding are used as the alignment and dimension reduction modules respectively. We show that AVIDA correctly aligns high-dimensional datasets without common features with four synthesized datasets and two real multimodal single-cell datasets. Compared to several existing methods, we demonstrate that AVIDA better preserves structures of individual datasets, especially distinct local structures in the joint low-dimensional visualization, while achieving comparable alignment performance. Such a property is important in multimodal single-cell data analysis as some biological processes are uniquely captured by one of the datasets. In general applications, other methods can be used for the alignment and dimension reduction modules.

artificial intelligence, dataset, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2206.00135

Country:

North America > United States > North Carolina (0.04)
North America > United States > California > Orange County > Irvine (0.04)
North America > United States > Rhode Island > Providence County > Providence (0.04)
(2 more...)

Genre: Research Report (0.50)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.94)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning in High Dimensional Spaces (0.76)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.50)

Add feedback

Towards Countering Essentialism through Social Bias Reasoning

Allaway, Emily, Taneja, Nina, Leslie, Sarah-Jane, Sap, Maarten

arXiv.org Artificial IntelligenceMar-28-2023

Essentialist beliefs (i.e., believing that members of the same group are fundamentally alike) play a central role in social stereotypes and can lead to harm when left unchallenged. In our work, we conduct exploratory studies into the task of countering essentialist beliefs (e.g., ``liberals are stupid''). Drawing on prior work from psychology and NLP, we construct five types of counterstatements and conduct human studies on the effectiveness of these different strategies. Our studies also investigate the role in choosing a counterstatement of the level of explicitness with which an essentialist belief is conveyed. We find that statements that broaden the scope of a stereotype (e.g., to other groups, as in ``conservatives can also be stupid'') are the most popular countering strategy. We conclude with a discussion of challenges and open questions for future work in this area (e.g., improving factuality, studying community-specific variation) and we emphasize the importance of work at the intersection of NLP and psychology.

artificial intelligence, natural language, stereotype, (17 more...)

arXiv.org Artificial Intelligence

2303.16173

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
Europe > United Kingdom > Scotland (0.04)
Europe > Italy > Tuscany > Florence (0.04)

Genre: Research Report > New Finding (0.47)

Industry: Information Technology (0.46)

Technology: Information Technology > Artificial Intelligence > Natural Language (1.00)

Add feedback