
Collaborating Authors: Wu, Xinbo


ITBench: Evaluating AI Agents across Diverse Real-World IT Automation Tasks

arXiv.org Artificial Intelligence

Realizing the vision of using AI agents to automate critical IT tasks depends on the ability to measure and understand the effectiveness of proposed solutions. We introduce ITBench, a framework that offers a systematic methodology for benchmarking AI agents on real-world IT automation tasks. Our initial release targets three key areas: Site Reliability Engineering (SRE), Compliance and Security Operations (CISO), and Financial Operations (FinOps). The design enables AI researchers to understand the challenges and opportunities of AI agents for IT automation through push-button workflows and interpretable metrics. ITBench includes an initial set of 94 real-world scenarios, which can be easily extended through community contributions. Our results show that agents powered by state-of-the-art models resolve only 13.8% of SRE scenarios, 25.2% of CISO scenarios, and 0% of FinOps scenarios. We expect ITBench to be a key enabler of AI-driven IT automation that is correct, safe, and fast.
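As a rough illustration of how such per-domain results might be aggregated, the sketch below represents scenarios and computes a resolution rate per area. The Scenario dataclass, the resolution_rates helper, and the toy results are illustrative assumptions, not ITBench's actual schema or data.

```python
# Hypothetical sketch: ITBench's real scenario schema and harness are not
# described in the abstract; the names below are illustrative assumptions.
from dataclasses import dataclass
from collections import defaultdict

@dataclass
class Scenario:
    domain: str      # e.g. "SRE", "CISO", or "FinOps"
    name: str
    resolved: bool   # did the agent resolve this scenario?

def resolution_rates(scenarios):
    """Aggregate per-domain resolution rates of the kind reported above."""
    totals, resolved = defaultdict(int), defaultdict(int)
    for s in scenarios:
        totals[s.domain] += 1
        resolved[s.domain] += int(s.resolved)
    return {d: resolved[d] / totals[d] for d in totals}

# Toy usage with made-up results (not the paper's data):
results = [
    Scenario("SRE", "pod-crashloop", True),
    Scenario("SRE", "disk-pressure", False),
    Scenario("CISO", "cis-benchmark-drift", False),
    Scenario("FinOps", "idle-gpu-nodes", False),
]
print(resolution_rates(results))  # e.g. {'SRE': 0.5, 'CISO': 0.0, 'FinOps': 0.0}
```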


SwitchCIT: Switching for Continual Instruction Tuning of Large Language Models

arXiv.org Artificial Intelligence

Large language models (LLMs) have demonstrated remarkable capabilities across numerous domains, as highlighted by OpenAI (2023) and Bubeck et al. (2023). However, while LLMs pre-trained on extensive language data excel at general language understanding, they may not be optimized for every specific task of interest prompted by instructions. There is therefore a need for continual instruction learning to adapt LLMs to evolving tasks and domains. Indeed, continual instruction learning is essential for LLMs such as GPT (Radford et al., 2019) to maintain their effectiveness and relevance across a wide range of tasks and domains. Such models are trained on vast amounts of text data and fine-tuned for specific applications, often by learning tasks sequentially (Luo et al., 2023), i.e., learning on the datasets pertaining to one task all at once before moving on to the next task. The challenge lies in their ability to continually learn and adapt as they encounter new tasks and information.
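The sequential setup described above can be sketched as a loop that fine-tunes on one task's dataset at a time. This is only a minimal illustration of sequential learning, not SwitchCIT's method; the tiny linear model and synthetic datasets are placeholders.

```python
# Minimal sketch of sequential (continual) fine-tuning: the model is trained
# on one task's dataset all at once before moving to the next task.
import torch
from torch import nn, optim

model = nn.Linear(16, 4)                     # stand-in for an LLM
criterion = nn.CrossEntropyLoss()

# Three "tasks", each with its own synthetic dataset (features, labels).
tasks = [
    (torch.randn(64, 16), torch.randint(0, 4, (64,)))
    for _ in range(3)
]

for task_id, (x, y) in enumerate(tasks):
    optimizer = optim.SGD(model.parameters(), lr=0.1)
    for _ in range(20):                      # train on this task only
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()
    print(f"task {task_id}: final loss {loss.item():.3f}")
# Earlier tasks are never revisited, which is why forgetting becomes a risk.
```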


Transformer-based Causal Language Models Perform Clustering

arXiv.org Artificial Intelligence

Even though large language models (LLMs) have demonstrated remarkable capabilities in solving various natural language tasks, the ability of an LLM to follow human instructions remains a concern. Recent works have shown great improvements in instruction-following capability via additional training on instruction-following tasks. However, the mechanisms responsible for effective instruction following remain inadequately understood. Here, we introduce a simplified instruction-following task and use synthetic datasets to analyze a Transformer-based causal language model. Our findings suggest that the model learns task-specific information by clustering data within its hidden space, with this clustering process evolving dynamically during learning. We also demonstrate how this phenomenon assists the model in handling unseen instances, and we validate our results in a more realistic setting. Furthermore, we discuss applications to pre-training and alignment inspired by these findings.
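One hedged way to probe such hidden-space clustering in practice is to pool hidden states from a causal LM and check whether they group by task. The sketch below assumes GPT-2 and two made-up task families; it is not the paper's simplified task or its actual analysis pipeline.

```python
# Assumption-laden illustration: do pooled hidden states cluster by task?
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

tok = AutoTokenizer.from_pretrained("gpt2")
tok.pad_token = tok.eos_token            # GPT-2 has no pad token by default
model = AutoModel.from_pretrained("gpt2").eval()

texts = [
    "Translate to French: good morning",      # task 0
    "Translate to French: thank you",         # task 0
    "Summarize: the meeting ran long",        # task 1
    "Summarize: sales grew last quarter",     # task 1
]
labels = [0, 0, 1, 1]

enc = tok(texts, return_tensors="pt", padding=True)
with torch.no_grad():
    out = model(**enc, output_hidden_states=True)

# Mean-pool the final hidden layer over non-padding tokens.
mask = enc["attention_mask"].unsqueeze(-1)
pooled = (out.hidden_states[-1] * mask).sum(1) / mask.sum(1)

pred = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(pooled.numpy())
print("agreement with task labels:", adjusted_rand_score(labels, pred))
```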


A Meta-Learning Perspective on Transformers for Causal Language Modeling

arXiv.org Artificial Intelligence

The Transformer architecture has become prominent in the development of large causal language models. However, the mechanisms underlying its capabilities are not well understood. Focusing on the training process, we establish a meta-learning view of the Transformer architecture when it is trained for causal language modeling, by explicating an inner optimization process that may take place within the Transformer. Further, from within this inner optimization, we discover and theoretically analyze a special characteristic of the norms of learned token representations in Transformer-based causal language models. Our analysis is supported by experiments conducted on pre-trained large language models and real-world data.
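A minimal, assumption-laden way to look at the quantity studied here is to compute the norms of token representations layer by layer in a pre-trained model. The choice of GPT-2 and of the L2 norm over all layers is an assumption; the abstract does not specify these details.

```python
# Hedged illustration: norms of token representations across layers of a
# pre-trained causal LM (GPT-2 used here purely as an example model).
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2").eval()

enc = tok("The Transformer architecture has become prominent.", return_tensors="pt")
with torch.no_grad():
    out = model(**enc, output_hidden_states=True)

# out.hidden_states: (embeddings, layer 1, ..., layer 12), each (1, seq, 768).
for layer, h in enumerate(out.hidden_states):
    norms = h[0].norm(dim=-1)            # L2 norm of each token representation
    print(f"layer {layer:2d}: mean token norm {norms.mean().item():.2f}")
```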