AITopics | Nigam, Priyanka

Collaborating Authors

Nigam, Priyanka

Hephaestus: Improving Fundamental Agent Capabilities of Large Language Models through Continual Pre-Training

Zhuang, Yuchen, Yang, Jingfeng, Jiang, Haoming, Liu, Xin, Cheng, Kewei, Lokegaonkar, Sanket, Gao, Yifan, Ping, Qing, Liu, Tianyi, Huang, Binxuan, Li, Zheng, Wang, Zhengyang, Chen, Pei, Wang, Ruijie, Zhang, Rongzhi, Zalmout, Nasser, Nigam, Priyanka, Yin, Bing, Zhang, Chao

arXiv.org Artificial Intelligence, Feb-10-2025

Due to the scarcity of agent-oriented pre-training data, LLM-based autonomous agents typically rely on complex prompting or extensive fine-tuning, which often fails to introduce new capabilities while preserving strong generalizability. We introduce Hephaestus-Forge, the first large-scale pre-training corpus designed to enhance the fundamental capabilities of LLM agents in API function calling, intrinsic reasoning and planning, and adapting to environmental feedback. Hephaestus-Forge comprises 103B agent-specific data encompassing 76,537 APIs, including both tool documentation to introduce knowledge of API functions and function calling trajectories to strengthen intrinsic reasoning. To explore effective training protocols, we investigate scaling laws to identify the optimal recipe in data mixing ratios. By continual pre-training on Hephaestus-Forge, Hephaestus outperforms small- to medium-scale open-source LLMs and rivals commercial LLMs on three agent benchmarks, demonstrating the effectiveness of our pre-training corpus in enhancing fundamental agentic capabilities and generalization of LLMs to new tasks or environments.

huggingface, large language model, machine learning, (20 more...)

2502.06589

Country:

Asia (0.46)
North America > United States (0.46)

Genre:

Instructional Material (1.00)
Research Report > New Finding (0.46)

Industry:

Information Technology (0.67)
Education > Educational Setting (0.46)
Education > Curriculum > Subject-Specific Education (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Shopping MMLU: A Massive Multi-Task Online Shopping Benchmark for Large Language Models

Jin, Yilun, Li, Zheng, Zhang, Chenwei, Cao, Tianyu, Gao, Yifan, Jayarao, Pratik, Li, Mao, Liu, Xin, Sarkhel, Ritesh, Tang, Xianfeng, Wang, Haodong, Wang, Zhengyang, Xu, Wenju, Yang, Jingfeng, Yin, Qingyu, Li, Xian, Nigam, Priyanka, Xu, Yi, Chen, Kai, Yang, Qiang, Jiang, Meng, Yin, Bing

arXiv.org Artificial Intelligence, Oct-31-2024

Online shopping is a complex multi-task, few-shot learning problem with a wide and evolving range of entities, relations, and tasks. However, existing models and benchmarks are commonly tailored to specific tasks, falling short of capturing the full complexity of online shopping. Large Language Models (LLMs), with their multi-task and few-shot learning abilities, have the potential to profoundly transform online shopping by alleviating task-specific engineering efforts and by providing users with interactive conversations. Despite the potential, LLMs face unique challenges in online shopping, such as domain-specific concepts, implicit knowledge, and heterogeneous user behaviors. Motivated by the potential and challenges, we propose Shopping MMLU, a diverse multi-task online shopping benchmark derived from real-world Amazon data. Shopping MMLU consists of 57 tasks covering 4 major shopping skills: concept understanding, knowledge reasoning, user behavior alignment, and multi-linguality, and can thus comprehensively evaluate the abilities of LLMs as general shop assistants. With Shopping MMLU, we benchmark over 20 existing LLMs and uncover valuable insights about practices and prospects of building versatile LLM-based shop assistants. Shopping MMLU can be publicly accessed at https://github.com/KL4805/ShoppingMMLU. In addition, with Shopping MMLU, we host a competition in KDD Cup 2024 with over 500 participating teams. The winning solutions and the associated workshop can be accessed at our website https://amazon-kddcup24.github.io/.

large language model, machine learning, natural language, (17 more...)

2410.20745

Country: Europe (0.67)

Genre: Research Report (1.00)

Industry:

Retail > Online (1.00)
Information Technology > Services > e-Commerce Services (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Evolutionary Contrastive Distillation for Language Model Alignment

Katz-Samuels, Julian, Li, Zheng, Yun, Hyokun, Nigam, Priyanka, Xu, Yi, Petricek, Vaclav, Yin, Bing, Chilimbi, Trishul

arXiv.org Artificial Intelligence, Oct-9-2024

The ability of large language models (LLMs) to execute complex instructions is essential for their real-world applications. However, several recent studies indicate that LLMs struggle with challenging instructions. In this paper, we propose Evolutionary Contrastive Distillation (ECD), a novel method for generating high-quality synthetic preference data designed to enhance the complex instruction-following capability of language models. ECD generates data that specifically illustrates the difference between a response that successfully follows a set of complex instructions and a response that is high-quality, but nevertheless makes some subtle mistakes. This is done by prompting LLMs to progressively evolve simple instructions to more complex instructions. When the complexity of an instruction is increased, the original successful response to the original instruction becomes a "hard negative" response for the new instruction, mostly meeting requirements of the new instruction, but barely missing one or two. By pairing a good response with such a hard negative response, and employing contrastive learning algorithms such as DPO, we improve language models' ability to follow complex instructions. Empirically, we observe that our method yields a 7B model that exceeds the complex instruction-following performance of current SOTA 7B models and is competitive even with open-source 70B models.
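
The pairing step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `EvolutionStep`, `PreferencePair`, `build_preference_pairs`, and `dpo_loss` are hypothetical, the toy instruction chain is invented for illustration, and the loss shown is the standard DPO objective for a single pair, assumed here because the abstract names DPO as one possible contrastive algorithm.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class EvolutionStep:
    """One level of an evolved instruction chain (hypothetical structure)."""
    instruction: str  # instruction at this complexity level
    response: str     # response that satisfies this instruction


@dataclass
class PreferencePair:
    prompt: str
    chosen: str
    rejected: str


def build_preference_pairs(chain: List[EvolutionStep]) -> List[PreferencePair]:
    """Pair each evolved instruction with its own response (chosen) and the
    response to the previous, simpler instruction (rejected hard negative)."""
    pairs = []
    for prev, curr in zip(chain, chain[1:]):
        pairs.append(
            PreferencePair(
                prompt=curr.instruction,
                chosen=curr.response,
                rejected=prev.response,
            )
        )
    return pairs


def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """Standard DPO loss for one pair: -log sigmoid(beta * margin), where the
    margin compares policy-vs-reference log-ratios of chosen and rejected."""
    margin = (policy_chosen_logp - ref_chosen_logp) - (
        policy_rejected_logp - ref_rejected_logp
    )
    x = beta * margin
    # Numerically stable -log(sigmoid(x)) = softplus(-x)
    return max(-x, 0.0) + math.log1p(math.exp(-abs(x)))


# Toy chain, invented for illustration only.
chain = [
    EvolutionStep("Describe a laptop.",
                  "A laptop is a portable computer."),
    EvolutionStep("Describe a laptop in exactly two sentences.",
                  "A laptop is a portable computer. It runs on a battery."),
    EvolutionStep("Describe a laptop in exactly two sentences, in French.",
                  "Un ordinateur portable est mobile. Il a une batterie."),
]
pairs = build_preference_pairs(chain)
```

In this sketch, a chain of n instructions yields n - 1 preference pairs, and each rejected response meets every requirement of its prompt except the one added in the latest evolution step.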

large language model, machine learning, natural language, (17 more...)

2410.07513

Genre: Research Report > Promising Solution (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
