 position


User Preference Modeling for Conversational LLM Agents: Weak Rewards from Retrieval-Augmented Interaction

arXiv.org Machine Learning

Large language models are increasingly used as personal assistants, yet most lack a persistent user model, forcing users to repeatedly restate preferences across sessions. We propose Vector-Adapted Retrieval Scoring (VARS), a pipeline-agnostic, frozen-backbone framework that represents each user with long-term and short-term vectors in a shared preference space and uses these vectors to bias retrieval scoring over structured preference memory. The vectors are updated online from weak scalar rewards from users' feedback, enabling personalization without per-user fine-tuning. We evaluate on MultiSessionCollab, an online multi-session collaboration benchmark with rich user preference profiles, across math and code tasks. Under frozen backbones, the main benefit of user-aware retrieval is improved interaction efficiency rather than large gains in raw task accuracy: our full VARS agent achieves the strongest overall performance, matches a strong Reflection baseline in task success, and reduces timeout rate and user effort. The learned long-term vectors also align with cross-user preference overlap, while short-term vectors capture session-specific adaptation, supporting the interpretability of the dual-vector design. Code, model, and data are available at https://github.com/YurenHao0426/VARS.
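The abstract does not spell out the scoring or update rules, so the following is only an illustrative sketch of the general idea (user vectors biasing retrieval scores, nudged by a scalar reward). The additive bias, the reward-weighted update, and every name and constant here (`biased_score`, `reward_update`, `beta`, the learning rates, the toy embeddings) are assumptions made for illustration, not the paper's actual formulation.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def biased_score(query, item, long_vec, short_vec, beta=0.5):
    """Base similarity plus a user-specific bias (assumed additive form)."""
    user = [l + s for l, s in zip(long_vec, short_vec)]
    return cosine(query, item) + beta * dot(user, item)

def reward_update(vec, item, reward, lr):
    """Move a user vector toward (or away from) an item, scaled by a scalar reward."""
    return [v + lr * reward * x for v, x in zip(vec, item)]

# Hypothetical embeddings: two memory items equally similar to the query.
query = [1.0, 0.0, 0.0]
item_a = [0.6, 0.8, 0.0]
item_b = [0.6, 0.0, 0.8]
long_vec = [0.0, 0.0, 0.0]
short_vec = [0.0, 0.0, 0.0]

before_a = biased_score(query, item_a, long_vec, short_vec)
before_b = biased_score(query, item_b, long_vec, short_vec)

# One positive feedback signal on item_a; the short-term vector adapts faster
# than the long-term one (assumed learning rates).
short_vec = reward_update(short_vec, item_a, reward=1.0, lr=0.5)
long_vec = reward_update(long_vec, item_a, reward=1.0, lr=0.05)

after_a = biased_score(query, item_a, long_vec, short_vec)
after_b = biased_score(query, item_b, long_vec, short_vec)
```

In this toy setup the two items tie before any feedback, and the rewarded item ranks higher afterwards while the backbone embeddings stay untouched.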



Factorio Learning Environment

arXiv.org Artificial Intelligence

Large Language Models (LLMs) are rapidly saturating existing benchmarks, necessitating new open-ended evaluations. We introduce the Factorio Learning Environment (FLE), based on the game of Factorio, that tests agents in long-term planning, program synthesis, and resource optimization. FLE provides exponentially scaling challenges -- from basic automation to complex factories processing millions of resource units per second. We provide two settings: (1) lab-play consisting of eight structured tasks with fixed resources, and (2) open-play with the unbounded task of building the largest factory on a procedurally generated map. We demonstrate across both settings that models still lack strong spatial reasoning. In lab-play, we find that LLMs exhibit promising short-horizon skills, yet are unable to operate effectively in constrained environments, reflecting limitations in error analysis. In open-play, while LLMs discover automation strategies that improve growth (e.g., electric-powered drilling), they fail to achieve complex automation (e.g., electronic-circuit manufacturing).


Position: It's Time to Act on the Risk of Efficient Personalized Text Generation

arXiv.org Artificial Intelligence

The recent surge in high-quality open-sourced Generative AI text models (colloquially: LLMs), as well as efficient finetuning techniques, has opened the possibility of creating high-quality personalized models, i.e., models generating text attuned to a specific individual's needs and capable of credibly imitating their writing style by leveraging that person's own data to refine an open-source model. The technology to create such models is accessible to private individuals, and training and running such models can be done cheaply on consumer-grade hardware. These advancements are a huge gain for usability and privacy. This position paper argues, however, that these advancements also introduce new safety risks by making it practically feasible for malicious actors to impersonate specific individuals at scale, for instance for the purpose of phishing emails, based on small amounts of publicly available text. We further argue that these risks are complementary to, and distinct from, the much-discussed risks of other impersonation attacks such as image, voice, or video deepfakes, and are not adequately addressed by the larger research community, or the current generation of open- and closed-source models.


Feature Visualization

@machinelearnbot

How can we choose a preconditioner that will give us these benefits? A good first guess is one that makes your data decorrelated and whitened. In the case of images this means doing gradient descent in the Fourier basis. This points to a profound fact about the Fourier transform. As long as a correlation is consistent across spatial positions -- such as the correlation between a pixel and its left neighbor being the same across all positions of an image -- the Fourier coefficients will be independent variables. To see this, note that such a spatially consistent correlation can be expressed as a convolution, and by the convolution theorem becomes pointwise multiplication after the Fourier transform.
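The convolution theorem invoked here can be checked numerically. The sketch below is a minimal one-dimensional illustration using only the standard library; the signal, the neighbor-mixing kernel, and the helper names `dft` and `circular_convolve` are hypothetical choices for the example, not taken from the article.

```python
import cmath

def dft(x):
    """Naive O(n^2) discrete Fourier transform of a sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def circular_convolve(x, h):
    """Circular convolution: the same kernel h is applied at every position."""
    n = len(x)
    return [sum(x[m] * h[(t - m) % n] for m in range(n)) for t in range(n)]

# Hypothetical 1-D "image row" and a kernel mixing each sample with its neighbors.
signal = [1.0, 2.0, 0.0, -1.0, 3.0, 0.5, -2.0, 1.5]
kernel = [0.5, 0.25, 0.0, 0.0, 0.0, 0.0, 0.0, 0.25]

# Transform of the convolution vs. pointwise product of the transforms.
lhs = dft(circular_convolve(signal, kernel))
rhs = [a * b for a, b in zip(dft(signal), dft(kernel))]
max_error = max(abs(a - b) for a, b in zip(lhs, rhs))
```

Because the kernel is position-independent, each Fourier coefficient of the output depends only on the matching coefficient of the input, which is the sense in which the Fourier basis diagonalizes a spatially consistent correlation; `max_error` should be at floating-point rounding level.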


The CS Freiburg Team

AI Magazine

Robotic soccer is an ideal task to demonstrate new techniques and explore new problems. Moreover, problems and solutions can easily be communicated because soccer is a well-known game. Our intention in building a robotic soccer team and participating in RoboCup-98 was, first, to demonstrate the usefulness of the self-localization methods we have developed. Second, we wanted to show that playing soccer based on an explicit world model is much more effective than other methods. Third, we intended to explore the problem of building and maintaining a global team world model.


Special Issue on Innovative Applications of AI

AI Magazine

IAAI is the premier venue for learning about AI's impact through deployed applications and emerging AI technologies. Case studies of deployed applications with measurable benefits arising from the use of AI technology provide clear evidence of the impact and value of AI technology to today's world. The emerging applications track features technologies that are rapidly maturing to the point of application. The seven articles selected for this special issue are extended versions of the papers that appeared at the conference. Four of the articles describe deployed applications that are already in use in the field.


1113

AI Magazine

These methods and ideas are discussed here. [...] LOLA's console and see an [...] LOLA's hard drive had decided to crash [...] Performing all computation on board has several advantages: The video data are not corrupted by radio-transmission noise, commands are not lost, and there's no communication lag that might result in [...] These findings are consistent with those of previous competitors (Nourbakhsh, Powers, and Birchfield 1995). On the down side, the on-board image processor contributes significantly to the battery drain, which is partly the result of its intended desktop use. Still, we are able to get about two hours of operation on each charge. Nomadic Technologies is currently making efforts to offer a version that is better suited for mobile robot use.


Review of Knowledge Engineering and Management

AI Magazine

Identifying generic, domain-independent tasks, formalizing task representation, elucidating the role of the task in eliciting domain-specific knowledge, and standardizing the design and development of expert systems then became the major research problems of the field. Knowledge specification includes the task decomposition and the specification of the domain information types and knowledge bases. The task decomposition can be guided by reusing some of the previously identified task templates. Finally, during knowledge refinement, the models are validated through simulation on paper or with prototyping, and the knowledge bases are refined. Depending on how familiar the analyst is with the domain, these activities might have to be performed repeatedly, and subsequent activities might provide feedback for corrections or extensions to the products of earlier ones.


461

AI Magazine

A recent article by Ronald Brachman (Brachman, 1985) points out some philosophical or semantic problems in using the notion of a prototype, which is described by using default properties. The problem arises since default properties can be overridden or cancelled in representing particular instances, and therefore lack definitional power: i.e., they are not really essential to the concept being represented. As an example, Brachman presents an elephant joke: Q: What's big and gray, has a trunk, and lives in the trees? A: An elephant - I lied about the trees. Before discussing a solution to this dilemma, consider the following modified version of the elephant joke, perhaps not quite as funny: Q: What's big and gray, has a trunk, and lives in the trees?