AITopics | item id

GRAM: Generative Recommendation via Semantic-aware Multi-granular Late Fusion

Lee, Sunkyung, Choi, Minjin, Choi, Eunseong, Kim, Hye-young, Lee, Jongwuk

arXiv.org Artificial Intelligence, Jun-3-2025

Generative recommendation is an emerging paradigm that leverages the extensive knowledge of large language models by formulating recommendations into a text-to-text generation task. However, existing studies face two key limitations in (i) incorporating implicit item relationships and (ii) utilizing rich yet lengthy item information. To address these challenges, we propose a Generative Recommender via semantic-Aware Multi-granular late fusion (GRAM), introducing two synergistic innovations. First, we design semantic-to-lexical translation to encode implicit hierarchical and collaborative item relationships into the vocabulary space of LLMs. Second, we present multi-granular late fusion to integrate rich semantics efficiently with minimal information loss. It employs separate encoders for multi-granular prompts, delaying the fusion until the decoding stage. Experiments on four benchmark datasets show that GRAM outperforms eight state-of-the-art generative recommendation models, achieving significant improvements of 11.5-16.0% in Recall@5 and 5.3-13.6% in NDCG@5. The source code is available at https://github.com/skleee/GRAM.
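The abstract reports gains in Recall@5 and NDCG@5 without spelling the metrics out. The following is a minimal sketch of how these two metrics are commonly computed, assuming a leave-one-out setup with a single held-out target item per user. That setup, the function names, and the toy data are illustrative assumptions and are not taken from the GRAM code.

```python
import math


def recall_at_k(ranked_items, target, k=5):
    """Return 1.0 if the held-out target appears in the top-k list, else 0.0."""
    return 1.0 if target in ranked_items[:k] else 0.0


def ndcg_at_k(ranked_items, target, k=5):
    """NDCG@k for a single relevant item.

    With exactly one relevant item the ideal DCG is 1, so the score
    reduces to 1 / log2(rank + 1) when the target is ranked within k.
    """
    for rank, item in enumerate(ranked_items[:k], start=1):
        if item == target:
            return 1.0 / math.log2(rank + 1)
    return 0.0


def evaluate(predictions, targets, k=5):
    """Average both metrics over all users."""
    n = len(targets)
    recall = sum(recall_at_k(p, t, k) for p, t in zip(predictions, targets)) / n
    ndcg = sum(ndcg_at_k(p, t, k) for p, t in zip(predictions, targets)) / n
    return recall, ndcg


# Hypothetical toy data: three users, five generated items each.
predictions = [
    ["a", "b", "c", "d", "e"],  # target "a" at rank 1
    ["x", "y", "z", "w", "v"],  # target "z" at rank 3
    ["m", "n", "o", "p", "q"],  # target "k" not retrieved
]
targets = ["a", "z", "k"]

recall, ndcg = evaluate(predictions, targets, k=5)
print(f"Recall@5 = {recall:.4f}, NDCG@5 = {ndcg:.4f}")
```

Recall@5 only checks whether the target was retrieved, while NDCG@5 also rewards placing it nearer the top, which is why the two numbers are usually reported together.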

large language model, machine learning, natural language, (19 more...)

2506.01673

Genre: Research Report > Experimental Study (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

NERsocial: Efficient Named Entity Recognition Dataset Construction for Human-Robot Interaction Utilizing RapidNER

Atuhurra, Jesse, Kamigaito, Hidetaka, Ouchi, Hiroki, Shindo, Hiroyuki, Watanabe, Taro

arXiv.org Artificial Intelligence, Nov-27-2024

Adapting named entity recognition (NER) methods to new domains poses significant challenges. We introduce RapidNER, a framework designed for the rapid deployment of NER systems through efficient dataset construction. RapidNER operates through three key steps: (1) extracting domain-specific sub-graphs and triples from a general knowledge graph, (2) collecting and leveraging texts from various sources to build the NERsocial dataset, which focuses on entities typical in human-robot interaction, and (3) implementing an annotation scheme using Elasticsearch (ES) to enhance efficiency. NERsocial, validated by human annotators, includes six entity types, 153K tokens, and 99.4K sentences, demonstrating RapidNER's capability to expedite dataset creation.

artificial intelligence, natural language, text processing, (19 more...)

2412.09634

Country:

North America > United States > Virginia (0.04)
Asia > India (0.04)
South America > Peru (0.04)
(33 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Media > News (1.00)
Media > Music (1.00)
Leisure & Entertainment > Sports > Motorsports (1.00)
(14 more...)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)

Using Large Multimodal Models to Extract Knowledge Components for Knowledge Tracing from Multimedia Question Information

Moon, Hyeongdon, Davis, Richard, Neshaei, Seyed Parsa, Dillenbourg, Pierre

arXiv.org Artificial Intelligence, Sep-30-2024

Knowledge tracing models have enabled a range of intelligent tutoring systems to provide feedback to students. However, existing methods for knowledge tracing in learning sciences are predominantly reliant on statistical data and instructor-defined knowledge components, making it challenging to integrate AI-generated educational content with traditional established methods. We propose a method for automatically extracting knowledge components from educational content using instruction-tuned large multimodal models. We validate this approach by comprehensively evaluating it against knowledge tracing benchmarks in five domains. Our results indicate that the automatically extracted knowledge components can effectively replace human-tagged labels, offering a promising direction for enhancing intelligent tutoring systems in limited-data scenarios, achieving more explainable assessments in educational settings, and laying the groundwork for automated assessment.

dataset, knowledge component, knowledge tracing, (13 more...)

2409.20167

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.05)
Europe > Switzerland (0.04)
South America > Colombia > Meta Department > Villavicencio (0.04)
(3 more...)

Genre:

Research Report > New Finding (0.88)
Instructional Material > Course Syllabus & Notes (0.67)

Industry: Education > Educational Technology > Educational Software > Computer Based Training (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)

Improvements to SDXL in NovelAI Diffusion V3

Ossa, Juan, Doğan, Eren, Birch, Alex, Johnson, F.

arXiv.org Artificial Intelligence, Sep-26-2024

This technical report is structured as follows. In Section 2, we describe our enhancements in detail. Following that, we evaluate our contributions in Section 5. Finally, we draw conclusions in Section 6.

arxiv, diffusion model, noise, (16 more...)

2409.15997

Genre: Research Report (0.53)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Vision (0.95)
Information Technology > Sensing and Signal Processing > Image Processing (0.70)

$\tau$-bench: A Benchmark for Tool-Agent-User Interaction in Real-World Domains

Yao, Shunyu, Shinn, Noah, Razavi, Pedram, Narasimhan, Karthik

arXiv.org Artificial Intelligence, Jun-17-2024

Existing benchmarks do not test language agents on their interaction with human users or their ability to follow domain-specific rules, both of which are vital for deploying them in real-world applications. We propose $\tau$-bench, a benchmark emulating dynamic conversations between a user (simulated by language models) and a language agent provided with domain-specific API tools and policy guidelines. We employ an efficient and faithful evaluation process that compares the database state at the end of a conversation with the annotated goal state. We also propose a new metric (pass^k) to evaluate the reliability of agent behavior over multiple trials. Our experiments show that even state-of-the-art function-calling agents (like gpt-4o) succeed on <50% of the tasks, and are quite inconsistent (pass^8 <25% in retail). Our findings point to the need for methods that can improve the ability of agents to act consistently and follow rules reliably.
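The abstract names pass^k and the end-state comparison but does not define them here. The sketch below assumes that pass^k is the probability that all k independent trials of the same task succeed, averaged over tasks, and estimates it per task from n recorded trials with c successes as C(c, k) / C(n, k). That reading, the function names, and the toy data are assumptions for illustration; the paper is the authority on the exact definition.

```python
from math import comb


def task_succeeded(final_state, goal_state):
    """Hypothetical success check: the final database must equal the annotated goal."""
    return final_state == goal_state


def pass_hat_k(successes, n_trials, k):
    """Estimate P(all k trials succeed) for one task from n_trials outcomes.

    C(successes, k) / C(n_trials, k) is the fraction of k-subsets of the
    recorded trials that consist only of successful trials.
    """
    if not 0 < k <= n_trials:
        raise ValueError("k must satisfy 0 < k <= n_trials")
    return comb(successes, k) / comb(n_trials, k)


def benchmark_pass_hat_k(success_counts, n_trials, k):
    """Average the per-task estimate over all tasks."""
    return sum(pass_hat_k(c, n_trials, k) for c in success_counts) / len(success_counts)


# Hypothetical outcomes: four tasks, eight trials each, successes per task.
success_counts = [8, 6, 0, 4]
n_trials = 8

for k in (1, 2, 8):
    score = benchmark_pass_hat_k(success_counts, n_trials, k)
    print(f"pass^{k} = {score:.4f}")
```

Under this reading, pass^1 is the ordinary average success rate, and the score can only fall as k grows, since a task contributes fully at large k only when the agent solves it on nearly every trial. That is the sense in which the metric measures consistency rather than peak capability.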

agent, argument, flight, (16 more...)

2406.12045

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > Texas > Tarrant County > Fort Worth (0.05)
(17 more...)

Genre: Research Report > New Finding (0.65)

Industry:

Transportation > Passenger (1.00)
Transportation > Air (1.00)
Consumer Products & Services > Travel (1.00)
Information Technology (0.95)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
(2 more...)

IDGenRec: LLM-RecSys Alignment with Textual ID Learning

Tan, Juntao, Xu, Shuyuan, Hua, Wenyue, Ge, Yingqiang, Li, Zelong, Zhang, Yongfeng

arXiv.org Artificial Intelligence, May-17-2024

Generative recommendation based on Large Language Models (LLMs) has transformed the traditional ranking-based recommendation style into a text-to-text generation paradigm. However, in contrast to standard NLP tasks that inherently operate on human vocabulary, current research in generative recommendation struggles to effectively encode recommendation items within the text-to-text framework using concise yet meaningful ID representations. To better align LLMs with recommendation needs, we propose IDGen, representing each item as a unique, concise, semantically rich, platform-agnostic textual ID using human language tokens. This is achieved by training a textual ID generator alongside the LLM-based recommender, enabling seamless integration of personalized recommendations into natural language generation. Notably, as user history is expressed in natural language and decoupled from the original dataset, our approach suggests the potential for a foundational generative recommendation model. Experiments show that our framework consistently surpasses existing models in sequential recommendation under the standard experimental setting. We then explore the possibility of training a foundation recommendation model with the proposed method on data collected from 19 different datasets and test its recommendation performance on 6 unseen datasets across different platforms under a completely zero-shot setting. The results show that the zero-shot performance of the pre-trained foundation model is comparable to or even better than some traditional recommendation models based on supervised training, showing the potential of the IDGen paradigm to serve as the foundation model for generative recommendation. Code and data are open-sourced at https://github.com/agiresearch/IDGenRec.

dataset, id generator, recommendation, (14 more...)

2403.19021

Country:

North America > United States > District of Columbia > Washington (0.05)
North America > United States > New York > New York County > New York City (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)

Genre: Research Report > New Finding (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Disentangling ID and Modality Effects for Session-based Recommendation

Zhang, Xiaokun, Xu, Bo, Ren, Zhaochun, Wang, Xiaochen, Lin, Hongfei, Ma, Fenglong

arXiv.org Artificial Intelligence, Apr-19-2024

Session-based recommendation aims to predict the intents of anonymous users based on their limited behaviors. Modeling user behaviors involves two distinct rationales: co-occurrence patterns reflected by item IDs, and fine-grained preferences represented by item modalities (e.g., text and images). However, existing methods typically entangle these causes, leading to their failure in achieving accurate and explainable recommendations. To this end, we propose a novel framework, DIMO, to disentangle the effects of ID and modality in the task. At the item level, we introduce a co-occurrence representation schema to explicitly incorporate co-occurrence patterns into ID representations. Simultaneously, DIMO aligns different modalities into a unified semantic space to represent them uniformly. At the session level, we present a multi-view self-supervised disentanglement, including a proxy mechanism and counterfactual inference, to disentangle ID and modality effects without supervised signals. Leveraging these disentangled causes, DIMO provides recommendations via causal inference and further creates two templates for generating explanations. Extensive experiments on multiple real-world datasets demonstrate the consistent superiority of DIMO over existing methods. Further analysis also confirms DIMO's effectiveness in generating explanations.

modality, recommendation, session-based recommendation, (15 more...)

doi: 10.1145/3626772.3657748

2404.12969

Country:

North America > United States > District of Columbia > Washington (0.05)
Asia > China > Liaoning Province > Dalian (0.04)
North America > United States > Pennsylvania (0.04)
(2 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.47)

ID Embedding as Subtle Features of Content and Structure for Multimodal Recommendation

Liu, Yuting, Yang, Enneng, Dang, Yizhou, Guo, Guibing, Liu, Qiang, Liang, Yuliang, Jiang, Linying, Wang, Xingwei

arXiv.org Artificial Intelligence, Nov-10-2023

Multimodal recommendation aims to model user and item representations comprehensively with the involvement of multimedia content for effective recommendations. Existing research has shown that it is beneficial for recommendation performance to combine (user- and item-) ID embeddings with multimodal salient features, indicating the value of IDs. However, the literature lacks a thorough analysis of ID embeddings in terms of feature semantics. In this paper, we revisit the value of ID embeddings for multimodal recommendation and conduct a thorough study of their semantics, which we recognize as subtle features of content and structures. Then, we propose a novel recommendation model by incorporating ID embeddings to enhance the semantic features of both content and structures. Specifically, we put forward a hierarchical attention mechanism to incorporate ID embeddings in modality fusion, coupled with contrastive learning, to enhance content representations. Meanwhile, we propose a lightweight graph convolutional network for each modality to amalgamate neighborhood and ID embeddings for improving structural representations. Finally, the content and structure representations are combined to form the ultimate item embedding for recommendation. Extensive experiments on three real-world datasets (Baby, Sports, and Clothing) demonstrate the superiority of our method over state-of-the-art multimodal recommendation methods and the effectiveness of fine-grained ID embeddings.

modality, recommendation, representation, (14 more...)

2311.05956

Country:

Asia > China (0.05)
North America > United States > New York > New York County > New York City (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.94)
Information Technology > Artificial Intelligence > Natural Language (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Large Language Models for Generative Recommendation: A Survey and Visionary Discussions

Li, Lei, Zhang, Yongfeng, Liu, Dugang, Chen, Li

arXiv.org Artificial Intelligence, Sep-3-2023

Recent years have witnessed the wide adoption of large language models (LLMs) in different fields, especially natural language processing and computer vision. Such a trend can also be observed in recommender systems (RS). However, most related work treats the LLM as a component of the conventional recommendation pipeline (e.g., as a feature extractor), which may not fully leverage the generative power of LLMs. Instead of separating the recommendation process into multiple stages such as score computation and re-ranking, this process can be simplified to one stage with an LLM: directly generating recommendations from the complete pool of items. This survey reviews the progress, methods and future directions of LLM-based generative recommendation by examining three questions: 1) What generative recommendation is, 2) Why RS should advance to generative recommendation, and 3) How to implement LLM-based generative recommendation for various RS tasks. We hope that the survey can provide the context and guidance needed to explore this interesting and emerging topic.

large language model, machine learning, natural language, (17 more...)

2309.01157

Country:

Asia > China > Jiangsu Province > Yancheng (0.04)
Asia > China > Hong Kong (0.04)
North America > United States (0.04)
(2 more...)

Genre: Overview (1.00)

Industry:

Media (0.68)
Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Text Matching Improves Sequential Recommendation by Reducing Popularity Biases

Liu, Zhenghao, Mei, Sen, Xiong, Chenyan, Li, Xiaohua, Yu, Shi, Liu, Zhiyuan, Gu, Yu, Yu, Ge

arXiv.org Artificial Intelligence, Aug-27-2023

This paper proposes the Text mAtching based SequenTial rEcommendation model (TASTE), which maps items and users in an embedding space and recommends items by matching their text representations. TASTE verbalizes items and user-item interactions using identifiers and attributes of items. To better characterize user behaviors, TASTE additionally proposes an attention sparsity method, which enables TASTE to model longer user-item interactions by reducing the self-attention computations during encoding. Our experiments show that TASTE outperforms the state-of-the-art methods on widely used sequential recommendation datasets. TASTE alleviates the cold start problem by representing long-tail items using full-text modeling and bringing the benefits of pretrained language models to recommendation systems. Our further analyses illustrate that TASTE significantly improves the recommendation accuracy by reducing the popularity bias of previous item-ID-based recommendation models and returning more appropriate and text-relevant items to satisfy users. All code is available at https://github.com/OpenMatch/TASTE.

artificial intelligence, natural language, proceedings, (18 more...)

2308.14029

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
North America > United States > Nevada > Clark County > Las Vegas (0.07)
Europe > United Kingdom > England > West Midlands > Birmingham (0.05)
(6 more...)

Genre: Research Report (1.00)

Industry: Consumer Products & Services > Personal Products (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
