Prompt Decorators: A Declarative and Composable Syntax for Reasoning, Formatting, and Control in LLMs

Heris, Mostapha Kalami

arXiv.org Artificial Intelligence

Large Language Models (LLMs) are central to reasoning, writing, and decision-support workflows, yet users lack consistent control over how they reason and express outputs. Conventional prompt engineering relies on verbose natural-language instructions, limiting reproducibility, modularity, and interpretability. This paper introduces Prompt Decorators, a declarative, composable syntax that governs LLM behavior through compact control tokens such as +++Reasoning, +++Tone(style=formal), and +++Import(topic="Systems Thinking"). Each decorator modifies a behavioral dimension, such as reasoning style, structure, or tone, without changing task content. The framework formalizes twenty core decorators organized into two functional families (Cognitive & Generative and Expressive & Systemic), each further decomposed into subcategories that govern reasoning, interaction, expression, and session-control. It defines a unified syntax, scoping model, and deterministic processing pipeline enabling predictable and auditable behavior composition. By decoupling task intent from execution behavior, Prompt Decorators create a reusable and interpretable interface for prompt design. Illustrative use cases demonstrate improved reasoning transparency, reduced prompt complexity, and standardized model behavior across domains. The paper concludes with implications for interoperability, behavioral consistency, and the development of declarative interfaces for scalable AI systems.
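The decorator tokens above compose mechanically into a prompt prefix. A minimal sketch of that composition (the `+++Token` syntax follows the paper; the helper function and its name are hypothetical, not part of the framework):

```python
# Minimal sketch of composing Prompt Decorator tokens into a prompt.
# The +++Name(key=value) token syntax follows the paper; the helper
# below is illustrative, not an API the paper defines.

def apply_decorators(task, decorators):
    """Prepend +++ control tokens to the task text, one per line."""
    lines = []
    for name, params in decorators:
        if params:
            args = ", ".join(f"{k}={v}" for k, v in params.items())
            lines.append(f"+++{name}({args})")
        else:
            lines.append(f"+++{name}")
    return "\n".join(lines + [task])

prompt = apply_decorators(
    "Summarize the attached report.",
    [("Reasoning", {}), ("Tone", {"style": "formal"})],
)
```

Because the decorators sit on their own lines ahead of the task, the task content is untouched while the behavioral instructions remain separable and auditable.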


Variational Prefix Tuning for Diverse and Accurate Code Summarization Using Pre-trained Language Models

Zhao, Junda, Song, Yuliang, Cohen, Eldan

arXiv.org Artificial Intelligence

Recent advancements in source code summarization have leveraged transformer-based pre-trained models, including Large Language Models of Code (LLMCs), to automate and improve the generation of code summaries. However, existing methods often focus on generating a single high-quality summary for a given source code, neglecting scenarios where the generated summary might be inadequate and alternative options are needed. In this paper, we introduce Variational Prefix Tuning (VPT), a novel approach that enhances pre-trained models' ability to generate diverse yet accurate sets of summaries, allowing the user to choose the most suitable one for the given source code. Our method integrates a Conditional Variational Autoencoder (CVAE) framework as a modular component into pre-trained models, enabling us to model the distribution of observed target summaries and sample continuous embeddings to be used as prefixes to steer the generation of diverse outputs during decoding. Importantly, we construct our method in a parameter-efficient manner, eliminating the need for expensive model retraining, especially when using LLMCs. Furthermore, we employ a bi-criteria reranking method to select a subset of generated summaries, optimizing both the diversity and the accuracy of the options presented to users. We present extensive experimental evaluations using widely used datasets and current state-of-the-art pre-trained code summarization models to demonstrate the effectiveness of our approach and its adaptability across models.
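The core sampling step behind a CVAE-style prefix can be shown without any deep-learning framework: draw a continuous latent via the reparameterization trick and treat it as a prefix embedding for one decoding pass. This is a framework-free sketch of the idea, not the paper's implementation; all names are illustrative:

```python
import math
import random

random.seed(0)

def sample_prefix(mu, log_var):
    """Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, 1).
    Each sampled z would act as a continuous prefix embedding steering
    one diverse decoding pass (names are illustrative)."""
    return [m + math.exp(0.5 * lv) * random.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]

# Drawing several prefixes from the same posterior yields distinct
# steering vectors, hence diverse candidate summaries.
mu, log_var = [0.1, -0.2, 0.3], [0.0, 0.0, 0.0]
prefixes = [sample_prefix(mu, log_var) for _ in range(3)]
```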


From Tools to Teammates: Evaluating LLMs in Multi-Session Coding Interactions

Rakotonirina, Nathanaël Carraz, Hamdy, Mohammed, Campos, Jon Ander, Weber, Lucas, Testoni, Alberto, Fadaee, Marzieh, Pezzelle, Sandro, Del Tredici, Marco

arXiv.org Artificial Intelligence

Large Language Models (LLMs) are increasingly used in working environments for a wide range of tasks, excelling at solving individual problems in isolation. However, are they also able to effectively collaborate over long-term interactions? To investigate this, we introduce MemoryCode, a synthetic multi-session dataset designed to test LLMs' ability to track and execute simple coding instructions amid irrelevant information, simulating a realistic setting. While all the models we tested handle isolated instructions well, even the performance of state-of-the-art models like GPT-4o deteriorates when instructions are spread across sessions. Our analysis suggests this is due to their failure to retrieve and integrate information over long instruction chains. Our results highlight a fundamental limitation of current LLMs, restricting their ability to collaborate effectively in long interactions.


ART: Actually Robust Training

Chwilczyński, Sebastian, Trębacz, Kacper, Cyganik, Karol, Małecki, Mateusz, Brzezinski, Dariusz

arXiv.org Artificial Intelligence

Some guidelines have been proposed, yet currently, they lack practical implementations. Furthermore, neural network training often takes on the form of trial and error, lacking a structured and thoughtful process. To alleviate these issues, in this paper, we introduce Art, a Python library designed to help automatically impose rules and standards while developing deep learning pipelines. Art divides model development into a series of smaller steps of increasing complexity, each concluded with a validation check improving the interpretability and robustness of the process. The current version of Art comes equipped with nine predefined steps inspired by Andrej Karpathy's Recipe for Training Neural Networks, a visualization dashboard, and integration with loggers such as Neptune.
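The pattern of small steps, each concluded with a validation check, can be mimicked in plain Python. The sketch below is illustrative of that structure only; the class and step names are hypothetical and not Art's actual API:

```python
# Illustrative sketch of "steps of increasing complexity, each concluded
# with a validation check" -- names are hypothetical, not the Art API.

class Step:
    def __init__(self, name, run, check):
        self.name, self.run, self.check = name, run, check

def develop(steps):
    results = {}
    for step in steps:
        results[step.name] = step.run()
        if not step.check(results[step.name]):
            raise RuntimeError(f"Validation failed at step: {step.name}")
    return results

# Example checks in the spirit of Karpathy's recipe: overfit one batch
# before scaling up, and confirm a baseline beats chance.
steps = [
    Step("overfit_one_batch", run=lambda: 0.001,   # final training loss
         check=lambda loss: loss < 0.01),          # must reach ~0
    Step("evaluate_baseline", run=lambda: 0.72,    # validation accuracy
         check=lambda acc: acc > 0.5),             # must beat chance
]
results = develop(steps)
```

Failing a check halts the pipeline at that step, which is what makes the trial-and-error process inspectable.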


Out of style: Misadventures with LLMs and code style transfer

Munson, Karl, Ting, Chih-Kai, Wade, Serenity, Savla, Anish, Dolby, Julian, Kate, Kiran, Srinivas, Kavitha

arXiv.org Artificial Intelligence

Like text, programs have styles, and certain programming styles are more desirable than others for program readability, maintainability, and performance. Code style transfer, however, is difficult to automate except for trivial style guidelines such as limits on line length. Inspired by the success of using language models for text style transfer, we investigate if code language models can perform code style transfer. Code style transfer, unlike text transfer, has rigorous requirements: the system needs to identify lines of code to change, change them correctly, and leave the rest of the program untouched. We designed CSB (Code Style Benchmark), a benchmark suite of code style transfer tasks across five categories including converting for-loops to list comprehensions, eliminating duplication in code, adding decorators to methods, etc. We then used these tests to see if large pre-trained code language models or fine-tuned models perform style transfer correctly, based on rigorous metrics to test that the transfer did occur, and the code still passes functional tests. Surprisingly, language models failed to perform all of the tasks, suggesting that they perform poorly on tasks that require code understanding. We will make available the large-scale corpora to help the community build better code models.
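One of the benchmark's task categories, converting a for-loop to a list comprehension, is easy to illustrate with a hand-written before/after pair (this example is mine, not drawn from CSB itself). The requirement is exactly the one described above: the styled rewrite must preserve behavior, which a functional test can verify:

```python
# Hand-written example of one CSB task category (not taken from the
# benchmark): rewrite a for-loop as a list comprehension while leaving
# behavior unchanged.

def squares_loop(xs):            # before: imperative style
    out = []
    for x in xs:
        if x % 2 == 0:
            out.append(x * x)
    return out

def squares_comp(xs):            # after: comprehension style
    return [x * x for x in xs if x % 2 == 0]

# The kind of functional test used to confirm the transfer preserved
# semantics:
assert squares_loop(range(6)) == squares_comp(range(6)) == [0, 4, 16]
```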


Functional Programming Paradigm of Python for Scientific Computation Pipeline Integration

Zhang, Chen, Jia, Lecheng, Zhang, Wei, Wen, Ning

arXiv.org Artificial Intelligence

As an interpreted programming language, Python is characterized by concise syntax, flexibility across multiple programming paradigms, cross-platform support, etc. [1-3]. Its software ecosystem continues to be enriched by contributions from a growing number of researchers and developers working in various fields. With the general boom in AI techniques, Python has become a de facto development standard for scientific computation and AI algorithms, owing not only to excellent high-performance libraries such as numpy, scipy, and tensorly [4-6], but also to its high compatibility with other programming languages. Nonetheless, at the expense of this exceeding flexibility and ecosystem richness, integrating complicated projects via Python becomes challenging, particularly in scientific computation for interdisciplinary applications [7-10]. For example, most data manipulations require importing third-party libraries designed according to different specifications, which can frequently cause incompatibility problems arising from data types and the like.
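The functional paradigm the title refers to suits such pipelines: pure functions composed into a single stage avoid the intermediate mutable state where type mismatches between libraries tend to surface. A minimal, self-contained sketch of that style (the stage functions are illustrative):

```python
from functools import reduce

# Illustrative functional-style pipeline: pure stage functions composed
# into one computation, with no intermediate mutable state.

def compose(*funcs):
    """Right-to-left function composition: compose(f, g)(x) == f(g(x))."""
    return reduce(lambda f, g: lambda x: f(g(x)), funcs)

normalize = lambda xs: [x / max(xs) for x in xs]   # scale to [0, 1]
clip      = lambda xs: [min(x, 0.5) for x in xs]   # cap large values

pipeline = compose(clip, normalize)
result = pipeline([1.0, 2.0, 4.0])
```

Because each stage takes and returns a plain list, swapping in a stage from another library only requires matching that one interface.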


Is AI more creative than the human brain? I doubt it – and I definitely want humans to stay in charge Stefan Stern

The Guardian

It's (fairly) easy if you try. You could scroll down or click the little x in the corner of the screen to get rid of me. If you are reading the print edition you could just turn the page. One of the indignities of the digital age is being asked, constantly, to confirm we are who we say we are, that we are indeed a human being. Something feels slightly amiss when the (non-human) technology demands that we convince it that we are not the same as them.


UltraFeedback: Boosting Language Models with High-quality Feedback

Cui, Ganqu, Yuan, Lifan, Ding, Ning, Yao, Guanming, Zhu, Wei, Ni, Yuan, Xie, Guotong, Liu, Zhiyuan, Sun, Maosong

arXiv.org Artificial Intelligence

Reinforcement learning from human feedback (RLHF) has become a pivotal technique in aligning large language models (LLMs) with human preferences. In RLHF practice, preference data plays a crucial role in bridging human proclivity and LLMs. However, the scarcity of diverse, naturalistic datasets of human preferences on LLM outputs at scale poses a great challenge to RLHF as well as feedback learning research within the open-source community. Current preference datasets, either proprietary or limited in size and prompt variety, result in limited RLHF adoption in open-source models and hinder further exploration. We meticulously devise annotation instructions and employ GPT-4 to offer detailed feedback in both numerical and textual forms. Experimental results indicate that our models outperform existing open-source models, achieving top performance across multiple benchmarks. Large language models (LLMs), represented by ChatGPT (OpenAI, 2022) and GPT-4 (OpenAI, 2023), have demonstrated proficiency in generating fluent text as well as solving various language-oriented tasks. Trained on massive corpora through likelihood maximization techniques, these LLMs have exhibited remarkable generalization and acquired the ability to execute diverse tasks in response to user directives (Ouyang et al., 2022; Wei et al., 2022; Sanh et al., 2022). Unfortunately, relying solely on likelihood maximization during training leads to well-known issues: LLMs may generate convincing but incorrect or unsafe content that deviates from human preferences (Stiennon et al., 2020; Ouyang et al., 2022; Perez et al., 2022). To further align LLMs with human preferences, reinforcement learning from human feedback (RLHF) (Ouyang et al., 2022; Askell et al., 2021; Bai et al., 2022a; Touvron et al., 2023b) has been introduced and widely adopted by leading corporations. RLHF builds upon preference data, which rates and compares different responses given the same prompt.
Typically, RLHF trains a reward model on preference data and then applies RL algorithms such as Proximal Policy Optimization (PPO) (Schulman et al., 2017) to LLMs to optimize the rewards (OpenAI, 2022; 2023; Touvron et al., 2023b; Bai et al., 2022a). While proprietary models have largely capitalized on RLHF's potential to produce outputs that are both more useful and safer, a significant gap persists in the open-source community. As a result, few open-source models adopt RLHF, as it demonstrates only marginal gains without suitable preference data, which critically hinders successful RLHF practice and further research.
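The reward-model step described above is commonly trained with a Bradley-Terry-style pairwise loss on the preference comparisons: the model should score the chosen response above the rejected one. A minimal numeric sketch of that objective (a standard formulation, not code from the paper):

```python
import math

def preference_loss(r_chosen, r_rejected):
    """Bradley-Terry pairwise objective commonly used for reward models:
    -log sigmoid(r_chosen - r_rejected). The loss shrinks as the reward
    model scores the preferred response further above the rejected one."""
    margin = r_chosen - r_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Barely separated rewards incur a larger loss than clearly separated ones.
close = preference_loss(1.0, 0.9)
clear = preference_loss(2.0, -1.0)
assert clear < close
```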


How to Use Custom Losses with Custom Gradients in TensorFlow with Keras

#artificialintelligence

Keras does a great job of abstracting low-level details of neural network creation so you can focus on getting the job done. But, if you're reading this, you've probably discovered that Keras' off-the-shelf methods cannot always be used to learn your model's parameters. Perhaps your model has a gradient that cannot be calculated through the magic of autodiff, or your loss function does not conform to the signature my_loss_fn(y_true, y_pred) mentioned in Keras' documentation. If you found the online documentation wholly unhelpful, read on! I [hopefully] have all the answers you couldn't find anywhere else.
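The idea can be shown without TensorFlow at all: pair a loss with a hand-derived gradient and use it in a manual descent step. This is a framework-free sketch of the concept (in Keras you would wrap the same pairing with `tf.custom_gradient`), not actual Keras API code:

```python
# Framework-free sketch of a custom loss with a custom (hand-written)
# gradient -- the concept behind tf.custom_gradient, minus the framework.

def my_loss(y_true, y_pred):
    return (y_pred - y_true) ** 2          # squared error

def my_loss_grad(y_true, y_pred):
    return 2.0 * (y_pred - y_true)         # hand-derived d(loss)/d(y_pred)

def descend(y_true, y_pred, lr=0.1, steps=50):
    """Gradient descent using the hand-written gradient instead of autodiff."""
    for _ in range(steps):
        y_pred -= lr * my_loss_grad(y_true, y_pred)
    return y_pred

fitted = descend(y_true=3.0, y_pred=0.0)   # converges toward 3.0
```

Whenever autodiff cannot produce the gradient, this is the contract you must honor: supply a gradient function that is mathematically consistent with the loss you return.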


[100%OFF] Complete Python & Python OOP With Exercises& Projects In2022

#artificialintelligence

Udemy is one of the biggest course websites in the world, offering courses in many categories; all the skills you might be looking for are on Udemy, including languages, design, marketing and many other categories, so whenever you want to buy a course and pay for a new skill, Udemy is a good place to look. You can find paid courses, 100% off courses and coupons too; more than 12 categories are offered, which makes it likely you will find the domain and skill you are looking for. Our job is to search for 100% off courses and free coupons. Python Programming Basics and Python Object Oriented Programming: a guide for Python programmers and coders, presented in a simple and easy way with examples, quizzes, resources and Python projects to master Python from zero to hero. Why master Python Programming?