

Hephaestus: Improving Fundamental Agent Capabilities of Large Language Models through Continual Pre-Training

arXiv.org Artificial Intelligence

Due to the scarcity of agent-oriented pre-training data, LLM-based autonomous agents typically rely on complex prompting or extensive fine-tuning, approaches that often fail to introduce new capabilities while preserving strong generalizability. We introduce Hephaestus-Forge, the first large-scale pre-training corpus designed to enhance the fundamental capabilities of LLM agents in API function calling, intrinsic reasoning and planning, and adapting to environmental feedback. Hephaestus-Forge comprises 103B tokens of agent-specific data covering 76,537 APIs, including both tool documentation, which introduces knowledge of API functions, and function-calling trajectories, which strengthen intrinsic reasoning. To explore effective training protocols, we investigate scaling laws to identify the optimal data-mixing recipe. By continual pre-training on Hephaestus-Forge, Hephaestus outperforms small- to medium-scale open-source LLMs and rivals commercial LLMs on three agent benchmarks, demonstrating the effectiveness of our pre-training corpus in enhancing the fundamental agentic capabilities of LLMs and their generalization to new tasks and environments.
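As a rough illustration of what a data-mixing recipe for such continual pre-training looks like, the sketch below samples training batches from agent-specific and general corpora according to fixed ratios. The corpus names, ratios, and toy examples are illustrative assumptions, not the values reported for Hephaestus-Forge.

```python
import random

# Hypothetical mixing ratios (assumptions for illustration, not the
# paper's recipe): agent-specific sources plus general text to
# preserve generalizability.
MIX_RATIOS = {
    "tool_documentation": 0.3,           # API knowledge
    "function_call_trajectories": 0.4,   # intrinsic reasoning / planning
    "general_text": 0.3,                 # retains general capabilities
}

def sample_batch(corpora, ratios, batch_size, rng=random.Random(0)):
    """Draw a mixed batch: pick each example's source corpus
    according to the mixing ratios, then sample from that corpus."""
    sources = list(ratios)
    weights = [ratios[s] for s in sources]
    batch = []
    for _ in range(batch_size):
        src = rng.choices(sources, weights=weights, k=1)[0]
        batch.append(rng.choice(corpora[src]))
    return batch

# Toy corpora standing in for the real pre-training shards.
corpora = {
    "tool_documentation": ["GET /weather?city=... returns current conditions."],
    "function_call_trajectories": ["Thought: need weather -> Call: get_weather(city='Paris')"],
    "general_text": ["The quick brown fox jumps over the lazy dog."],
}
print(sample_batch(corpora, MIX_RATIOS, batch_size=4))
```

Sweeping these ratios and fitting scaling curves to downstream loss is one way the "optimal recipe" search described above could be operationalized.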


PV2TEA: Patching Visual Modality to Textual-Established Information Extraction

arXiv.org Artificial Intelligence

Information extraction, e.g., attribute-value extraction, has been extensively studied and formulated based on text alone. However, many attributes, such as color, shape, and pattern, can benefit from image-based extraction. The visual modality has long been underutilized, mainly due to the difficulty of multimodal annotation. In this paper, we aim to patch the visual modality into a textually established attribute information extractor. This cross-modality integration faces several unique challenges: (C1) images and textual descriptions are loosely paired, both intra-sample and inter-sample; (C2) images usually contain rich backgrounds that can mislead the prediction; and (C3) weakly supervised labels from textually established extractors are biased for multimodal training. We present PV2TEA, an encoder-decoder architecture equipped with three bias-reduction schemes: (S1) augmented label-smoothed contrast, which improves cross-modality alignment for loosely paired images and text; (S2) attention pruning, which adaptively distinguishes the visual foreground; and (S3) two-level neighborhood regularization, which mitigates the textual bias of the labels via reliability estimation. Empirical results on real-world e-commerce datasets demonstrate up to an 11.74% absolute (20.97% relative) F1 increase over unimodal baselines.
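Scheme S1 builds on a symmetric image-text contrastive (InfoNCE-style) objective whose one-hot match targets are smoothed to tolerate loosely paired data. The PyTorch sketch below is a minimal rendering of that idea under assumed defaults (temperature 0.07, smoothing 0.1); it is not the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def label_smoothed_contrastive_loss(img_emb, txt_emb,
                                    temperature=0.07, smoothing=0.1):
    """Symmetric InfoNCE over a batch of image/text embeddings, with
    the diagonal (matched-pair) targets smoothed so that loosely
    paired negatives are not pushed away too aggressively."""
    img_emb = F.normalize(img_emb, dim=-1)
    txt_emb = F.normalize(txt_emb, dim=-1)
    logits = img_emb @ txt_emb.t() / temperature  # (B, B) similarities
    n = logits.size(0)
    # Smoothed targets: 1 - smoothing on the diagonal, the rest
    # spread uniformly over the n - 1 off-diagonal entries per row.
    targets = torch.full_like(logits, smoothing / (n - 1))
    targets.fill_diagonal_(1.0 - smoothing)
    loss_i2t = -(targets * F.log_softmax(logits, dim=1)).sum(1).mean()
    loss_t2i = -(targets * F.log_softmax(logits.t(), dim=1)).sum(1).mean()
    return (loss_i2t + loss_t2i) / 2

# Toy usage with random embeddings standing in for encoder outputs.
img = torch.randn(8, 256)
txt = torch.randn(8, 256)
print(label_smoothed_contrastive_loss(img, txt).item())
```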


End-to-End Conversational Search for Online Shopping with Utterance Transfer

arXiv.org Artificial Intelligence

Successful conversational search systems can provide a natural, adaptive, and interactive shopping experience for online shoppers. However, building such systems from scratch faces real-world challenges: imperfect product schema/knowledge and a lack of training dialog data. In this work, we first propose ConvSearch, an end-to-end conversational search system that deeply combines the dialog system with search. It leverages text profiles to retrieve products, which is more robust against imperfect product schema/knowledge than using product attributes alone. We then address the data-scarcity challenge with an utterance-transfer approach that generates dialog utterances from existing dialogs in other domains and leverages search-behavior data from an e-commerce retailer. With utterance transfer, we introduce a new conversational search dataset for online shopping. Experiments show that our utterance-transfer method significantly improves the availability of training dialog data without crowd-sourcing, and that the conversational search system significantly outperforms the best baseline tested.
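To make the text-profile retrieval idea concrete, here is a minimal sketch that ranks products by TF-IDF similarity between the accumulated dialog state and free-text product profiles. The toy profiles, the `retrieve` helper, and the use of scikit-learn are illustrative assumptions, not the ConvSearch implementation.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical product "text profiles": free-text descriptions used
# instead of structured attributes, so retrieval degrades gracefully
# when the product schema is incomplete or noisy.
profiles = [
    "red cotton t-shirt crew neck short sleeve",
    "blue denim jeans slim fit",
    "wireless noise-cancelling over-ear headphones",
]

vectorizer = TfidfVectorizer()
profile_vecs = vectorizer.fit_transform(profiles)

def retrieve(dialog_state, k=2):
    """Rank products by similarity between the accumulated dialog
    state (all user constraints so far) and each text profile."""
    query_vec = vectorizer.transform([dialog_state])
    scores = cosine_similarity(query_vec, profile_vecs)[0]
    return sorted(zip(scores, profiles), reverse=True)[:k]

# Dialog state built up across turns, e.g.
# "I need a shirt" -> "something red" -> "with short sleeves".
print(retrieve("red shirt short sleeve"))
```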


Utilizing Character and Word Embeddings for Text Normalization with Sequence-to-Sequence Models

arXiv.org Machine Learning

Text normalization is an important enabling technology for several NLP tasks. Recently, neural-network-based approaches have outperformed well-established models on this task. However, there has been little exploration in this direction for languages other than English. Both the scarcity of annotated data and the complexity of the language increase the difficulty of the problem. To address these challenges, we use a sequence-to-sequence model with character-based attention, which, in addition to its self-learned character embeddings, uses word embeddings pre-trained with an approach that also models subword information. This provides the neural model with access to more linguistic information that is especially suitable for text normalization, without requiring large parallel corpora. We show that providing the model with word-level features bridges the gap, allowing the neural network approach to achieve a state-of-the-art F1 score on a standard Arabic language correction shared-task dataset.
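The sketch below illustrates the input side of such a model in PyTorch: each character embedding is concatenated with a frozen, pre-trained embedding of its containing word (e.g., a subword-aware, fastText-style vector) before a bidirectional encoder that the character-level attention would attend over. The class name, dimensions, and per-character word-id mapping are assumptions for illustration, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class CharWordEncoder(nn.Module):
    """Encoder input sketch: every character position sees both its
    self-learned character embedding and the (frozen) pre-trained
    embedding of the word it belongs to, giving the character-based
    attention access to word-level features."""
    def __init__(self, n_chars, n_words, char_dim=64, word_dim=300, hidden=256):
        super().__init__()
        self.char_emb = nn.Embedding(n_chars, char_dim)   # self-learned
        self.word_emb = nn.Embedding(n_words, word_dim)   # pre-trained,
        self.word_emb.weight.requires_grad = False        # kept frozen
        self.rnn = nn.LSTM(char_dim + word_dim, hidden,
                           batch_first=True, bidirectional=True)

    def forward(self, char_ids, word_ids_per_char):
        # char_ids, word_ids_per_char: (batch, seq_len); the second maps
        # each character position to the id of its containing word.
        x = torch.cat([self.char_emb(char_ids),
                       self.word_emb(word_ids_per_char)], dim=-1)
        outputs, _ = self.rnn(x)  # attention keys/values for the decoder
        return outputs

# Toy usage with random ids standing in for a real vocabulary.
enc = CharWordEncoder(n_chars=100, n_words=5000)
chars = torch.randint(0, 100, (2, 12))
words = torch.randint(0, 5000, (2, 12))
print(enc(chars, words).shape)  # torch.Size([2, 12, 512])
```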