
PARTONOMY: Large Multimodal Models with Part-Level Visual Understanding

Blume, Ansel, Kim, Jeonghwan, Ha, Hyeonjeong, Chatikyan, Elen, Jin, Xiaomeng, Nguyen, Khanh Duy, Peng, Nanyun, Chang, Kai-Wei, Hoiem, Derek, Ji, Heng

arXiv.org Artificial Intelligence

Real-world objects are composed of distinctive, object-specific parts. Identifying these parts is key to performing fine-grained, compositional reasoning; yet large multimodal models (LMMs) struggle to perform this seemingly straightforward task. In this work, we introduce PARTONOMY, an LMM benchmark designed for pixel-level part grounding. We construct PARTONOMY from existing part datasets and our own rigorously annotated set of images, encompassing 862 part labels and 534 object labels for evaluation. Unlike existing datasets that simply ask models to identify generic parts, PARTONOMY uses specialized concepts (e.g., agricultural airplane), and challenges models to compare objects' parts, consider part-whole relationships, and justify textual predictions with visual segmentations. Our experiments demonstrate significant limitations in state-of-the-art LMMs (e.g., LISA-13B achieves only 5.9% gIoU), highlighting a critical gap in their part grounding abilities. We note that existing segmentation-enabled LMMs (segmenting LMMs) have two key architectural shortcomings: they use special [SEG] tokens, unseen during pretraining, that induce distribution shift, and they discard predicted segmentations instead of using past predictions to guide future ones. To address these deficiencies, we train several part-centric LMMs and propose PLUM, a novel segmenting LMM that uses span tagging instead of segmentation tokens and that conditions on prior predictions in a feedback loop. We find that pretrained PLUM outperforms existing segmenting LMMs on reasoning segmentation, VQA, and visual hallucination benchmarks. In addition, PLUM finetuned on our proposed Explanatory Part Segmentation task is competitive with segmenting LMMs trained on significantly more segmentation data. Our work opens up new avenues towards enabling fine-grained, grounded visual understanding in LMMs.


PLUM: Adapting Pre-trained Language Models for Industrial-scale Generative Recommendations

He, Ruining, Heldt, Lukasz, Hong, Lichan, Keshavan, Raghunandan, Mao, Shifan, Mehta, Nikhil, Su, Zhengyang, Tsai, Alicia, Wang, Yueqi, Wang, Shao-Chuan, Yi, Xinyang, Baugher, Lexi, Cakici, Baykal, Chi, Ed, Goodrow, Cristos, Han, Ningren, Ma, He, Rosales, Romer, Van Soest, Abby, Tandon, Devansh, Wu, Su-Lin, Yang, Weilong, Zheng, Yilin

arXiv.org Artificial Intelligence

Large Language Models (LLMs) pose a new paradigm of modeling and computation for information tasks. Recommendation systems are a critical application domain poised to benefit significantly from the sequence modeling capabilities and world knowledge inherent in these large models. In this paper, we introduce PLUM, a framework designed to adapt pre-trained LLMs for industry-scale recommendation tasks. PLUM consists of item tokenization using Semantic IDs, continued pre-training (CPT) on domain-specific data, and task-specific fine-tuning for recommendation objectives. For fine-tuning, we focus particularly on generative retrieval, where the model is directly trained to generate Semantic IDs of recommended items based on user context. We conduct comprehensive experiments on large-scale internal video recommendation datasets. Our results demonstrate that PLUM achieves substantial improvements for retrieval compared to a heavily-optimized production model built with large embedding tables. We also present a scaling study for the model's retrieval performance, our learnings about CPT, a few enhancements to Semantic IDs, along with an overview of the training and inference methods that enable launching this framework to billions of users in YouTube.
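The abstract above describes generative retrieval, where a model emits an item's Semantic ID token by token. A practical detail is that generation must be constrained to IDs that actually exist in the catalog, which is commonly done with a prefix trie over the ID space. The sketch below illustrates that constraint mechanism only; the catalog, token values, and function names are illustrative, not from the paper, and a production system would combine this with beam search over model logits.

```python
def build_trie(semantic_ids):
    """Build a prefix trie over item Semantic IDs so that decoding can be
    restricted to token sequences corresponding to real catalog items."""
    root = {}
    for sid in semantic_ids:
        node = root
        for tok in sid:
            node = node.setdefault(tok, {})
        node["<end>"] = True  # marks a complete, valid Semantic ID
    return root

def valid_next_tokens(trie, prefix):
    """Return the tokens that may legally follow `prefix` during decoding."""
    node = trie
    for tok in prefix:
        node = node.get(tok, {})
    return [t for t in node if t != "<end>"]

# Toy catalog: each item is a 3-token Semantic ID.
catalog = [(3, 1, 4), (3, 1, 5), (2, 7, 1)]
trie = build_trie(catalog)
```

At each decoding step, the model's softmax would be masked to `valid_next_tokens(trie, prefix)`, guaranteeing the generated ID maps to a retrievable item.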


On the Way to LLM Personalization: Learning to Remember User Conversations

Magister, Lucie Charlotte, Metcalf, Katherine, Zhang, Yizhe, ter Hoeve, Maartje

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have quickly become invaluable assistants for a variety of tasks. However, their effectiveness is constrained by their ability to tailor responses to human preferences and behaviors via personalization. Prior work in LLM personalization has largely focused on style transfer or incorporating small factoids about the user, as knowledge injection remains an open challenge. In this paper, we explore injecting knowledge of prior conversations into LLMs to enable future work on less redundant, personalized conversations. We identify two real-world constraints: (1) conversations are sequential in time and must be treated as such during training, and (2) per-user personalization is only viable in parameter-efficient settings. To this end, we propose PLUM, a pipeline that performs data augmentation to up-sample conversations as question-answer pairs, which are then used to finetune a low-rank adaptation adapter with a weighted cross-entropy loss. Even in this first exploration of the problem, we perform competitively with baselines such as RAG, attaining an accuracy of 81.5% across 100 conversations.
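The pipeline above finetunes an adapter with a weighted cross-entropy loss over up-sampled QA pairs. As a minimal sketch of the loss shape (not the paper's exact formulation), the snippet below computes token-level cross entropy where some tokens, e.g. answer tokens, carry higher weight than context tokens; the weighting scheme shown is an assumption for illustration.

```python
import math

def weighted_cross_entropy(logprobs, weights):
    """Weighted token-level cross entropy: each token's negative log-probability
    is scaled by a per-token weight, then normalized by the total weight."""
    assert len(logprobs) == len(weights)
    total_w = sum(weights)
    return -sum(w * lp for w, lp in zip(logprobs, weights)) / total_w

# Four tokens, each assigned probability 0.5 by the model; the last two
# (hypothetically, answer tokens) are weighted twice as heavily.
logprobs = [math.log(0.5)] * 4
weights = [1.0, 1.0, 2.0, 2.0]
loss = weighted_cross_entropy(logprobs, weights)
```

Because every token has the same log-probability here, the weighted average equals the unweighted one (log 2 ≈ 0.693); the weights matter once answer tokens are harder or easier than context tokens.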


Plum: Prompt Learning using Metaheuristic

Pan, Rui, Xing, Shuo, Diao, Shizhe, Sun, Wenhe, Liu, Xiang, Shum, Kashun, Pi, Renjie, Zhang, Jipeng, Zhang, Tong

arXiv.org Artificial Intelligence

Since the emergence of large language models, prompt learning has become a popular method for optimizing and customizing these models. Special prompts, such as Chain-of-Thought, have even revealed previously unknown reasoning capabilities within these models. However, the progress of discovering effective prompts has been slow, driving a desire for general prompt optimization methods. Unfortunately, few existing prompt learning methods satisfy the criteria of being truly "general", i.e., automatic, discrete, black-box, gradient-free, and interpretable all at once. In this paper, we introduce metaheuristics, a branch of discrete non-convex optimization methods with over 100 options, as a promising approach to prompt learning. Within our paradigm, we test six typical methods: hill climbing, simulated annealing, genetic algorithms with and without crossover, tabu search, and harmony search, demonstrating their effectiveness in white-box and black-box prompt learning. Furthermore, we show that these methods can be used to discover more human-understandable prompts that were previously unknown in both reasoning and image generation tasks, opening the door to a cornucopia of possibilities in prompt optimization. We release all the code at \url{https://github.com/research4pan/Plum}.
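Of the six metaheuristics listed above, hill climbing is the simplest, and its structure illustrates why these methods fit black-box, gradient-free prompt learning: they only need a scorer, not gradients. The sketch below is a toy illustration with a made-up keyword-counting scorer standing in for task accuracy; the function names and vocabulary are assumptions, not the paper's implementation.

```python
import random

def hill_climb(prompt_words, score_fn, vocab, iterations=100, seed=0):
    """Gradient-free hill climbing over a discrete prompt: mutate one word
    at a time and keep the change only if the black-box score improves."""
    rng = random.Random(seed)
    best = list(prompt_words)
    best_score = score_fn(best)
    for _ in range(iterations):
        candidate = list(best)
        candidate[rng.randrange(len(candidate))] = rng.choice(vocab)
        s = score_fn(candidate)
        if s > best_score:
            best, best_score = candidate, s
    return best, best_score

# Toy black-box scorer: counts target keywords (a stand-in for evaluating
# the prompt's downstream task accuracy, which is what a real scorer would do).
def toy_score(words):
    return sum(w in {"step", "think", "carefully"} for w in words)

vocab = ["step", "think", "carefully", "please", "answer"]
result, score = hill_climb(["please", "answer", "now"], toy_score, vocab)
```

Simulated annealing and tabu search differ mainly in the acceptance rule (occasionally accepting worse candidates, or forbidding recent moves), while the genetic variants maintain a population of prompts instead of a single incumbent.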


PLUM: Preference Learning Plus Test Cases Yields Better Code Language Models

Zhang, Dylan, Diao, Shizhe, Zou, Xueyan, Peng, Hao

arXiv.org Artificial Intelligence

Instruction-finetuned code language models (LMs) have shown promise in various programming tasks. They are trained, using a language modeling objective, on pairs of natural language instructions and gold code snippets. Recent evidence suggests that these models, never exposed to incorrect solutions during training, often struggle to distinguish between correct and incorrect solutions. This observation raises our inquiry: can preference learning, which trains models to prefer correct solutions over incorrect ones, help push the boundaries of code LMs even further? We propose PLUM, a novel preference learning framework augmented with test cases tailored for code LMs. PLUM aims to investigate the key success factors and potential benefits of preference learning in code LMs, which remain elusive despite its success in aligning LMs with human values. PLUM consists of three stages: (1) generating test cases for natural language instructions, (2) sampling candidate solutions from the policy and evaluating them against the test cases to create a preference dataset, which is then used to (3) train the policy with a preference learning algorithm. Experiments demonstrate that PLUM substantially improves the performance of existing code LMs on established code generation benchmarks such as HumanEval (+) and MBPP (+), even for the state-of-the-art open-source language model CodeQwen-1.5-7B-Chat. PLUM complements the supervised fine-tuning (SFT) stage, demonstrating synergistic effects.
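Stage (2) above turns executable test cases into preference data: candidates that pass become "chosen" and candidates that fail become "rejected". The sketch below shows that splitting step under simplifying assumptions (a single `solve` entry point, naive `exec`-based checking, no sandboxing or timeouts, which a real pipeline would need); the helper name and candidate strings are illustrative.

```python
def build_preference_pairs(candidates, test_cases):
    """Evaluate candidate solutions against test cases and pair every
    passing solution (chosen) with every failing one (rejected)."""
    passed, failed = [], []
    for code in candidates:
        try:
            ns = {}
            exec(code, ns)  # NOTE: a real pipeline would sandbox this
            ok = all(ns["solve"](x) == y for x, y in test_cases)
        except Exception:
            ok = False
        (passed if ok else failed).append(code)
    return [(chosen, rejected) for chosen in passed for rejected in failed]

candidates = [
    "def solve(x):\n    return x * 2",  # passes the tests below
    "def solve(x):\n    return x + 2",  # fails on x = 3
]
pairs = build_preference_pairs(candidates, [(1, 2), (3, 6)])
```

The resulting (chosen, rejected) pairs feed directly into a preference learning objective such as DPO in stage (3).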


LLMs for Robotic Object Disambiguation

Jiang, Connie, Xu, Yiqing, Hsu, David

arXiv.org Artificial Intelligence

The advantages of pre-trained large language models (LLMs) are apparent in a variety of language processing tasks. But can a language model's knowledge be further harnessed to effectively disambiguate objects and navigate decision-making challenges within the realm of robotics? Our study reveals the LLM's aptitude for solving complex decision-making challenges of the kind previously modeled by Partially Observable Markov Decision Processes (POMDPs). A pivotal focus of our research is the object disambiguation capability of LLMs. We detail the integration of an LLM into a tabletop environment disambiguation task, a decision-making problem where the robot's task is to discern and retrieve a user's desired object from an arbitrarily large and complex cluster of objects. Despite multiple query attempts with zero-shot prompt engineering (details can be found in the Appendix), the LLM struggled to inquire about features not explicitly provided in the scene description. In response, we have developed a few-shot prompt engineering system to improve the LLM's ability to pose disambiguating queries. The result is a model capable of both using given features when they are available and inferring new relevant features when necessary, to successfully generate and navigate down a precise decision tree to the correct object, even when faced with identical options.


The surprising future of fintech

#artificialintelligence

Thanks to open banking, fintech early adopters likely already have accounts that round up transactions to boost savings or connect to third-party tools for loan applications, budget management and more. But the new wave of fintech startups is proving there's much more that can be done using open banking, the two-year-old mandate from UK regulators that required banks to easily allow their customers to share their data with third parties such as apps. "Open banking offers people the chance to get personalised, tailored support to help them manage their money by allowing regulated companies to securely analyse their bank data," says Lubaina Manji, senior programme manager at Nesta Challenges, one of the organisations behind the Open Up 2020 Challenge, alongside the Open Banking Implementation Entity (OBIE). "It's enabled the creation of new services and tools to help people with every aspect of money management – from budgeting to investing, and much, much more, all in a safe and secure way." And some of the innovations from finalists in the Open Up 2020 Challenge have surprised with their ingenuity and customer focus, she says, citing Sustainably's round-up tool for automated charity donations, and Kalgera's neuroscience-informed AI to help spot fraud targeting people with dementia – two projects that highlight the purpose-driven idea behind open banking and the aim to get financial support to those who need it most.


Amazon just upgraded the popular Echo Dot--is it worth buying?

USATODAY - Tech Top Stories

If you're considering getting or giving a smart speaker in the near future, the Echo Dot is a great place to start. It doesn't take up much real estate on the counter, it's relatively easy on the wallet, and it comes loaded with Alexa and her many, many capabilities. But now that there is a new generation of Echo Dots available, should you spring for the latest and greatest? Let's look at the new Dot, what makes it different, and whether it's worth the extra cash. The third-generation Echo Dot with Clock can also display timers and the weather.


AI's desire

#artificialintelligence

At the Artificial Intelligence Conference in New York, Kathryn Hume pointed me to Ellen Ullman's excellent book, Life in Code: A Personal History of Technology. In Part 3 of her book "Life, Artificial," Ullman talks about artificial intelligence, robotics, and the desire to create artificial life. What these views of human sentience have in common, and why they fail to describe us, is a certain disdain for the body: the utter lack of a body in early AI and in later formulations like Kurzweil's (the lonely cortex, scanned and downloaded, a brain-in-a-jar); and the disregard for this body, this mammalian flesh, in robotics and ALife [Artificial Life]. By connecting the poverty of AI with its denial of the body, Ullman follows an important thread in feminist theory: our thinking needs to be connected to bodies, to physical human process, to blood and meat. The male-dominated Western tradition is all about abstraction, for which Plato is the poster child.


Plum uses AI to hire people 'that never would have been discovered through a traditional hiring process'

#artificialintelligence

Recruiters have a bias problem. A 2017 meta study from Northwestern, Harvard, and the Institute of Social Research in Norway found that hiring prejudice against black candidates hasn't changed in the last 25 years, and that Latinos have only seen a "moderate" drop. It is not just races and ethnicities that employers are discriminating against -- according to a recent paper authored by Harvard and Stanford researchers, women earn 78 cents on the dollar compared to men and are less likely to advance to the top of their fields. The solution, Caitlin MacGregor says, is artificial intelligence. She's the CEO and founder of Waterloo, Ontario-based Plum.io, a hiring platform that emphasizes "raw talent," as opposed to skills and knowledge.