AITopics | prompter

prompter

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

fabSAM: A Farmland Boundary Delineation Method Based on the Segment Anything Model

Xie, Yufeng, Wu, Hanzhi, Tong, Hongxiang, Xiao, Lei, Zhou, Wenwen, Li, Ling, Wanger, Thomas Cherico

arXiv.org Artificial Intelligence, Jan-21-2025

Delineating farmland boundaries is essential for agricultural management such as crop monitoring and agricultural census. Traditional methods using remote sensing imagery have been efficient but limited in generalisation. The Segment Anything Model (SAM), known for its impressive zero-shot performance, has been adapted for remote sensing tasks through prompt learning and fine-tuning. Here, we propose a SAM-based farmland boundary delineation framework, 'fabSAM', that combines a Deeplabv3+-based Prompter and SAM. A fine-tuning strategy is also introduced to enable SAM's decoder to make better use of prompt information. Experimental results on the AI4Boundaries and AI4SmallFarms datasets show that fabSAM significantly improves farmland region identification and boundary delineation. fabSAM surpassed zero-shot SAM by 23.5% and 15.1% in mIOU on the AI4Boundaries and AI4SmallFarms datasets, respectively, and outperformed Deeplabv3+ by 4.9% and 12.5% in mIOU, respectively. These results highlight the effectiveness of fabSAM, which also means that we can more easily obtain global farmland region and boundary maps from open-source satellite image datasets like Sentinel-2.
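The abstract above reports its gains in mIOU (mean intersection over union). As a rough illustration of how such a score is commonly computed, here is a minimal sketch in plain Python. The function names, the toy label grids, and the choice to skip classes absent from both grids are assumptions made for this example; the paper's exact evaluation protocol is not described on this page.

```python
def iou(pred, target, cls):
    """IoU of one class over two equally sized label grids (lists of rows)."""
    inter = union = 0
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            in_p, in_t = (p == cls), (t == cls)
            inter += in_p and in_t   # pixel carries the class in both grids
            union += in_p or in_t    # pixel carries the class in at least one
    # None signals that the class appears in neither grid.
    return inter / union if union else None


def mean_iou(pred, target, classes):
    """Average the per-class IoU, skipping classes absent from both grids."""
    scores = [s for s in (iou(pred, target, c) for c in classes) if s is not None]
    if not scores:
        raise ValueError("none of the classes occur in either grid")
    return sum(scores) / len(scores)


# Toy 2x3 example: 1 = farmland, 0 = background (hypothetical labels).
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]

print(mean_iou(pred, target, classes=[0, 1]))
```

In this toy case each class overlaps on 2 of the 4 pixels in its union, so both per-class scores and their mean are 0.5.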

large language model, machine learning, natural language, (19 more...)

arXiv: 2501.12487

Country:

North America > United States (0.68)
Asia > China > Zhejiang Province > Hangzhou (0.05)
Asia > China > Hong Kong (0.04)
(12 more...)

Genre: Research Report (0.52)

Industry:

Food & Agriculture > Agriculture (1.00)
Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.57)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (0.94)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)
Information Technology > Sensing and Signal Processing > Image Processing (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Align-Pro: A Principled Approach to Prompt Optimization for LLM Alignment

Trivedi, Prashant, Chakraborty, Souradip, Reddy, Avinash, Aggarwal, Vaneet, Bedi, Amrit Singh, Atia, George K.

arXiv.org Artificial Intelligence, Jan-6-2025

The alignment of large language models (LLMs) with human values is critical as these models become increasingly integrated into various societal and decision-making processes. Traditional methods, such as reinforcement learning from human feedback (RLHF), achieve alignment by fine-tuning model parameters, but these approaches are often computationally expensive and impractical when models are frozen or inaccessible for parameter modification. In contrast, prompt optimization is a viable alternative to RLHF for LLM alignment. While the existing literature has shown empirical promise of prompt optimization, its theoretical underpinning remains under-explored. We address this gap by formulating prompt optimization as an optimization problem and try to provide theoretical insights into the optimality of such a framework. To analyze the performance of the prompt optimization, we study theoretical suboptimality bounds and provide insights in terms of how prompt optimization depends upon the given prompter and target model. We also provide empirical validation through experiments on various datasets, demonstrating that prompt optimization can effectively align LLMs, even when parameter fine-tuning is not feasible.

large language model, machine learning, natural language, (18 more...)

arXiv: 2501.03486

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Industry:

Banking & Finance (1.00)
Health & Medicine > Surgery (0.95)
Materials > Chemicals > Commodity Chemicals (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Contrastive Localized Language-Image Pre-Training

Chen, Hong-You, Lai, Zhengfeng, Zhang, Haotian, Wang, Xinze, Eichner, Marcin, You, Keen, Cao, Meng, Zhang, Bowen, Yang, Yinfei, Gan, Zhe

arXiv.org Artificial Intelligence, Oct-3-2024

Contrastive Language-Image Pre-training (CLIP) has been a celebrated method for training vision encoders to generate image/text representations facilitating various applications. Recently, CLIP has been widely adopted as the vision backbone of multimodal large language models (MLLMs) to connect image inputs for language interactions. The success of CLIP as a vision-language foundation model relies on aligning web-crawled noisy text annotations at image levels. Nevertheless, such criteria may become insufficient for downstream tasks in need of fine-grained vision representations, especially when region-level understanding is demanding for MLLMs. In this paper, we improve the localization capability of CLIP with several advances. We propose a pre-training method called Contrastive Localized Language-Image Pre-training (CLOC) by complementing CLIP with region-text contrastive loss and modules. We formulate a new concept, promptable embeddings, of which the encoder produces image embeddings easy to transform into region representations given spatial hints. To support large-scale pre-training, we design a visually-enriched and spatially-localized captioning framework to effectively generate region-text pseudo-labels at scale. By scaling up to billions of annotated images, CLOC enables high-quality regional embeddings for image region recognition and retrieval tasks, and can be a drop-in replacement of CLIP to enhance MLLMs, especially on referring and grounding tasks.
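The abstract above mentions a region-text contrastive loss without giving its form. One common formulation for this kind of objective is the CLIP-style symmetric InfoNCE loss, sketched below in plain Python for a small batch of paired embeddings. This is an illustration under assumptions, not the CLOC implementation: the function names, the toy embeddings, and the temperature of 0.07 are all chosen for the example, and the vectors are assumed to be non-zero.

```python
import math


def _mean_diag_nll(logits):
    """Mean negative log-softmax of the diagonal (matching) entry of each row."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the row max for numerical stability
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_z - row[i]
    return total / len(logits)


def contrastive_loss(region_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE over a batch where region i is paired with text i."""
    def normalize(v):
        norm = math.sqrt(sum(x * x for x in v))
        return [x / norm for x in v]

    regions = [normalize(v) for v in region_emb]
    texts = [normalize(v) for v in text_emb]
    # Cosine similarity of every region with every text, scaled by temperature.
    logits = [[sum(a * b for a, b in zip(r, t)) / temperature for t in texts]
              for r in regions]
    logits_t = [list(col) for col in zip(*logits)]  # text-to-region direction
    return 0.5 * (_mean_diag_nll(logits) + _mean_diag_nll(logits_t))


# Hypothetical 2-D embeddings: matched pairs versus swapped pairs.
aligned = contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(aligned < swapped)
```

The loss is low when each region embedding is closest to its own text embedding and high when the pairing is swapped, which is the behaviour a contrastive objective of this kind is meant to reward.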

caption, encoder, prompter, (16 more...)

arXiv: 2410.02746

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Europe > Netherlands > North Holland > Amsterdam (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Prompt-Driven Contrastive Learning for Transferable Adversarial Attacks

Yang, Hunmin, Jeong, Jongoh, Yoon, Kuk-Jin

arXiv.org Artificial Intelligence, Jul-30-2024

Recent vision-language foundation models, such as CLIP, have demonstrated superior capabilities in learning representations that can be transferable across a diverse range of downstream tasks and domains. With the emergence of such powerful models, it has become crucial to effectively leverage their capabilities in tackling challenging vision tasks. On the other hand, only a few works have focused on devising adversarial examples that transfer well to both unknown domains and model architectures. In this paper, we propose a novel transfer attack method called PDCL-Attack, which leverages the CLIP model to enhance the transferability of adversarial perturbations generated by a generative model-based attack framework. Specifically, we formulate an effective prompt-driven feature guidance by harnessing the semantic representation power of text, particularly from the ground-truth class labels of input images. To the best of our knowledge, we are the first to introduce prompt learning to enhance transferable generative attacks. Extensive experiments conducted across various cross-domain and cross-model settings empirically validate our approach, demonstrating its superiority over state-of-the-art methods.

adversarial example, effectiveness, transferability, (14 more...)

arXiv: 2407.20657

Country: North America > United States > California (0.04)

Genre: Research Report > Promising Solution (0.48)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military (1.00)
Transportation > Ground > Road (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.87)
(2 more...)

Situated Instruction Following

Min, So Yeon, Puig, Xavi, Chaplot, Devendra Singh, Yang, Tsung-Yen, Rai, Akshara, Parashar, Priyam, Salakhutdinov, Ruslan, Bisk, Yonatan, Mottaghi, Roozbeh

arXiv.org Artificial Intelligence, Jul-15-2024

Language is never spoken in a vacuum. It is expressed, comprehended, and contextualized within the holistic backdrop of the speaker's history, actions, and environment. Since humans are used to communicating efficiently with situated language, the practicality of robotic assistants hinges on their ability to understand and act upon implicit and situated instructions. In traditional instruction following paradigms, the agent acts alone in an empty house, leading to language use that is both simplified and artificially "complete." In contrast, we propose situated instruction following (SIF), which embraces the inherent underspecification and ambiguity of real-world communication with the physical presence of a human speaker. The meaning of situated instructions naturally unfolds through the past actions and the expected future behaviors of the human involved. Specifically, within our settings we have instructions that (1) are ambiguously specified, (2) have temporally evolving intent, and (3) can be interpreted more precisely with the agent's dynamic actions. Our experiments indicate that state-of-the-art Embodied Instruction Following (EIF) models lack holistic understanding of situated human intention.

agent, arxiv preprint arxiv, instruction, (14 more...)

arXiv: 2407.12061

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > California > San Mateo County > Menlo Park (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
(2 more...)

Urban Waterlogging Detection: A Challenging Benchmark and Large-Small Model Co-Adapter

Song, Suqi, Zhang, Chenxu, Zhang, Peng, Li, Pengkun, Song, Fenglong, Zhang, Lei

arXiv.org Artificial Intelligence, Jul-10-2024

Urban waterlogging poses a major risk to public safety and infrastructure. Conventional methods using water-level sensors require heavy maintenance and can hardly achieve full coverage. Recent advances employ surveillance camera imagery and deep learning for detection, yet these struggle amidst scarce data and adverse environmental conditions. In this paper, we establish a challenging Urban Waterlogging Benchmark (UW-Bench) under diverse adverse conditions to advance real-world applications. We propose a Large-Small Model co-adapter paradigm (LSM-adapter), which harnesses the substantial generic segmentation potential of a large model and the specific task-directed guidance of a small model. Specifically, a Triple-S Prompt Adapter module and a Dynamic Prompt Combiner are proposed to generate and then merge multiple prompts for mask decoder adaptation. Meanwhile, a Histogram Equalization Adapter module is designed to infuse image-specific information for image encoder adaptation. Results and analysis show the challenge and superiority of our developed benchmark and algorithm. Project page: https://github.com/zhang-chenxu/LSM-Adapter
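The abstract above names a Histogram Equalization Adapter module but does not describe its internals. For orientation only, the sketch below shows classical histogram equalization on a small grayscale image in plain Python. It illustrates the standard image-processing operation, not the paper's module; the function name, the list-of-rows image format, and the toy pixel values are assumptions, and the image is assumed to be non-empty with integer values in the range 0 to levels - 1.

```python
def equalize_histogram(image, levels=256):
    """Classical histogram equalization of a grayscale image (list of rows)."""
    # Count how often each intensity level occurs.
    hist = [0] * levels
    for row in image:
        for value in row:
            hist[value] += 1

    # Cumulative distribution of intensities.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)

    total = cdf[-1]
    cdf_min = next(c for c in cdf if c > 0)  # CDF at the darkest level present
    if total == cdf_min:
        # Constant image: there is no contrast to stretch.
        return [row[:] for row in image]

    # Map each level so the output CDF is spread over the full range.
    scale = (levels - 1) / (total - cdf_min)
    lut = [max(0, round((c - cdf_min) * scale)) for c in cdf]
    return [[lut[value] for value in row] for row in image]


# A low-contrast toy image whose values sit in a narrow band.
dim = [[50, 50],
       [51, 52]]
out = equalize_histogram(dim)
print(out)
```

After equalization the darkest level present maps to 0 and the brightest to 255, so the narrow band of input intensities is stretched across the full output range while their ordering is preserved.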

segmentation, small model, training strategy, (13 more...)

arXiv: 2407.08109

Country:

Asia > China > Chongqing Province > Chongqing (0.04)
North America > United States (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.50)

Industry: Commercial Services & Supplies > Security & Alarm Services (0.34)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
(2 more...)

Prompting Explicit and Implicit Knowledge for Multi-hop Question Answering Based on Human Reading Process

Huang, Guangming, Long, Yunfei, Luo, Cunjin, Shen, Jiaxing, Sun, Xia

arXiv.org Artificial Intelligence, Jun-27-2024

Pre-trained language models (PLMs) leverage chains-of-thought (CoT) to simulate human reasoning and inference processes, achieving proficient performance in multi-hop QA. However, a gap persists between PLMs' reasoning abilities and those of humans when tackling complex problems. Psychological studies suggest a vital connection between explicit information in passages and human prior knowledge during reading. Nevertheless, current research has given insufficient attention to linking input passages and PLMs' pre-training-based knowledge from the perspective of human cognition studies. In this study, we introduce a Prompting Explicit and Implicit knowledge (PEI) framework, which uses prompts to connect explicit and implicit knowledge, aligning with human reading process for multi-hop QA. We consider the input passages as explicit knowledge, employing them to elicit implicit knowledge through unified prompt reasoning. Furthermore, our model incorporates type-specific reasoning via prompts, a form of implicit knowledge. Experimental results show that PEI performs comparably to the state-of-the-art on HotpotQA. Ablation studies confirm the efficacy of our model in bridging and integrating explicit and implicit knowledge.

implicit knowledge, knowledge, proceedings, (16 more...)

arXiv: 2402.1935

Country:

Africa > Democratic Republic of the Congo > Kinshasa Province > Kinshasa (0.04)
Asia > China > Hong Kong (0.04)
North America > United States > New York (0.04)
Europe > United Kingdom (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Knowledge Management (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
(2 more...)

Demonstration Notebook: Finding the Most Suited In-Context Learning Example from Interactions

Tang, Yiming, Dong, Bin

arXiv.org Artificial Intelligence, Jun-16-2024

Large language models (LLMs) benefit greatly from prompt engineering, with in-context learning standing as a pivotal technique. While former approaches have provided various ways to construct the demonstrations used for in-context learning, they often ignore the inherent heterogeneity within datasets, applying the same demonstrations to all reasoning questions. We observed that the effectiveness of demonstrations varies depending on the specific question. This motivates our exploration of using prompt engineering to select appropriate demonstrations. To address the challenge of automatically creating and choosing demonstrations tailored to each question, we propose a novel prompt engineering workflow built around a novel object called the "demonstration notebook." This notebook helps identify the most suitable in-context learning example for a question by gathering and reusing information from the LLM's past interactions. Our experiments show that this approach outperforms all existing methods for automatic demonstration construction and selection (as far as we know), achieving state-of-the-art results on several reasoning benchmarks. The method's versatility is further demonstrated by its success in text summarization and prompt compression tasks. Additionally, we contribute a rigorous analysis method to reveal the "demonstrative regime" of a demonstration, providing valuable insights into how demonstrations relate to different question types within a dataset.

demonstration, demonstration notebook, demonstrative regime, (13 more...)

arXiv: 2406.10878

Country:

Asia > Middle East > Republic of Türkiye > Samsun Province > Samsun (0.04)
Asia > China > Hong Kong (0.04)
Asia > China > Guangxi Province > Nanning (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

INCPrompt: Task-Aware incremental Prompting for Rehearsal-Free Class-incremental Learning

Wang, Zhiyuan, Qu, Xiaoyang, Xiao, Jing, Chen, Bokui, Wang, Jianzong

arXiv.org Artificial Intelligence, Jan-21-2024

This paper introduces INCPrompt, an innovative continual learning solution that effectively addresses catastrophic forgetting. INCPrompt's key innovation lies in its use of adaptive key-learner and task-aware prompts that capture task-relevant information. This unique combination encapsulates general knowledge across tasks and encodes task-specific knowledge. Our comprehensive evaluation across multiple continual learning benchmarks demonstrates INCPrompt's superiority over existing algorithms, showing its effectiveness in mitigating catastrophic forgetting while maintaining high performance. These results highlight the significant impact of task-aware incremental prompting on continual learning performance.

continual learning, incprompt, learning, (14 more...)

arXiv: 2401.11667

Country: Asia > China > Guangdong Province > Shenzhen (0.05)

Genre: Research Report (0.40)

Industry: Education > Educational Setting (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.48)

RoboGPT: an intelligent agent of making embodied long-term decisions for daily instruction tasks

Chen, Yaran, Cui, Wenbo, Chen, Yuanwen, Tan, Mining, Zhang, Xinyao, Zhao, Dongbin, Wang, He

arXiv.org Artificial Intelligence, Nov-27-2023

Robotic agents must master common sense and long-term sequential decisions to solve daily tasks through natural language instruction. The developments in Large Language Models (LLMs) in natural language processing have inspired efforts to use LLMs in complex robot planning. Despite LLMs' great generalization and comprehension of instruction tasks, LLMs-generated task plans sometimes lack feasibility and correctness. To address the problem, we propose a RoboGPT agent (our code and dataset will be released soon) for making embodied long-term decisions for daily tasks, with two modules: 1) LLMs-based planning with re-plan to break the task into multiple sub-goals; 2) RoboSkill individually designed for sub-goals to learn better navigation and manipulation skills. The LLMs-based planning is enhanced with a new robotic dataset and re-plan, called RoboGPT. The new robotic dataset of 67k daily instruction tasks is gathered for fine-tuning the Llama model and obtaining RoboGPT. RoboGPT planner with strong generalization can plan hundreds of daily instruction tasks. Additionally, a low-computational Re-Plan module is designed to allow plans to flexibly adapt to the environment, thereby addressing the nomenclature diversity challenge. The proposed RoboGPT agent outperforms SOTA methods on the ALFRED daily tasks. Moreover, RoboGPT planner exceeds SOTA LLM-based planners like ChatGPT in task-planning rationality for hundreds of unseen daily tasks, and even other domain tasks, while keeping the large model's original broad application and generality.

apple, instruction, microwave, (15 more...)

arXiv: 2311.15649

Country:

North America > Canada > Quebec > Montreal (0.04)
North America > Canada > Ontario > Toronto (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.35)
