Collaborating Authors

 sasha


Personalized Reasoning: Just-In-Time Personalization and Why LLMs Fail At It

Li, Shuyue Stella, Bose, Avinandan, Brahman, Faeze, Du, Simon Shaolei, Koh, Pang Wei, Fazel, Maryam, Tsvetkov, Yulia

arXiv.org Artificial Intelligence

Current large language model (LLM) development treats task-solving and preference alignment as separate challenges, optimizing first for objective correctness, then for alignment to aggregated human preferences. This paradigm fails in human-facing applications where solving a problem correctly is insufficient if the response mismatches the user's needs. This challenge intensifies in just-in-time scenarios where no prior user interaction history exists due to cold-start conditions or privacy constraints. LLMs need to identify what they don't know about user preferences, strategically elicit preference values through questioning, then adapt their reasoning processes and responses accordingly -- a complicated chain of cognitive processes which we term personalized reasoning. We introduce PREFDISCO, an evaluation methodology that transforms static benchmarks into interactive personalization tasks using psychologically-grounded personas with sparse preferences. Our framework creates scenarios where identical questions require different reasoning chains depending on user context, as optimal explanation approaches vary by individual expertise and preferences while maintaining factual accuracy. Evaluation of 21 frontier models across 10 tasks reveals 29.0% of naive personalization attempts produce worse preference alignment than generic responses, yet generic responses also fail to serve individual user needs effectively. These findings suggest personalized reasoning requires dedicated development rather than emerging naturally. PREFDISCO establishes personalized reasoning as a measurable research frontier and reveals fundamental limitations in current LLMs' interactive capabilities, providing a foundation for developing systems that can adapt to individual users in education, healthcare, and technical domains where personalization is critical.


East Is South review – weighty AI drama takes aim at humanity's biggest questions

The Guardian

House of Cards writer Beau Willimon's new play East Is South deals with the ethics and advancement of AI. But despite the transformative subject matter, Ellen McDougall's production has as much propulsion as a car in reverse. Skins actor Kaya Scodelario plays Lena, a former Mennonite and gifted coder, who is wrestling with the expanding consciousness of Logos, the software her company has developed. We meet her as she is preparing to be questioned by the workplace bigwigs who watch her from the upper level of Alex Eales's two-tiered science-inspired set as if she is a caged animal. Lena and her lover, Sasha (Luke Treadaway), are being investigated after a security breach.


Jack Grealish and Sasha's baby revealed! AI predicts what Manchester City star and his childhood sweetheart's offspring will look like after they announce they are expecting their first child together

Daily Mail - Science & tech

Footballing ace Jack Grealish has revealed he's expecting a baby with his childhood sweetheart Sasha Attwood. Taking to Instagram on Sunday, the legendary Manchester City midfielder shared a picture of himself holding Sasha's growing baby bump to break the news. Now, AI predicts what their offspring will look like – and although the sex hasn't been revealed, the technology seems to be expecting a boy. Just like his dad, the little kid sports a memorable hairdo and seems to be displaying some neat skills with a football. So, can you see the likeness?


Sasha: Creative Goal-Oriented Reasoning in Smart Homes with Large Language Models

King, Evan, Yu, Haoxiang, Lee, Sangsu, Julien, Christine

arXiv.org Artificial Intelligence

Smart home assistants function best when user commands are direct and well-specified (e.g., "turn on the kitchen light"), or when a hard-coded routine specifies the response. In more natural communication, however, human speech is unconstrained, often describing goals (e.g., "make it cozy in here" or "help me save energy") rather than indicating specific target devices and actions to take on those devices. Current systems fail to understand these under-specified commands since they cannot reason about devices and settings as they relate to human situations. We introduce large language models (LLMs) to this problem space, exploring their use for controlling devices and creating automation routines in response to under-specified user commands in smart homes. We empirically study the baseline quality and failure modes of LLM-created action plans with a survey of age-diverse users. We find that LLMs can reason creatively to achieve challenging goals, but they experience patterns of failure that diminish their usefulness. We address these gaps with Sasha, a smarter smart home assistant. Sasha responds to loosely-constrained commands like "make it cozy" or "help me sleep better" by executing plans to achieve user goals, e.g., setting a mood with available devices, or devising automation routines. We implement and evaluate Sasha in a hands-on user study, showing the capabilities and limitations of LLM-driven smart homes when faced with unconstrained user-generated scenarios.


Tech 2016

BBC News

It has been an eventful 12 months. Samsung smartphones exploded, GoPro drones dropped out of the air and Pebble smartwatches met an untimely end. Facebook became embroiled in a fake news controversy, Yahoo revealed several mega-breaches, we identified the supposed creator of Bitcoin - who then went AWOL - and millions indulged in a game of Pokemon Go. Yet none of those stories made our most-read-of-the-month list - based on the number of times an article was clicked - as you can see below. There is a good rule of thumb: if you do not want your employer to know what you are up to online, wait until you are not on the job.


Machines combating disease - IoTUK

#artificialintelligence

Alejandro (Sasha) Vicente Grabovetsky, Co-founder of Avalon AI, discusses the ways in which machine learning is reducing the failure rate of dementia clinical trials and improving the lives of those living with the disease. The idea for Avalon AI came together when my Co-founder Olivier van den Biggelaar and I realised that we shared the same aim, which was to help defeat ageing. Following that, what immediately came to mind was dementia because it's a disease that has not been successfully tackled yet. Lots of age-related diseases like diabetes and cancer receive a lot of funding and are being heavily addressed, while dementia is under-funded, partly due to failed clinical trials. Very few dementia clinical trials have succeeded and we noticed that a lot of the past trials were targeting late-stage dementia, where a lot of brain damage had already occurred.