Personal
Finding Personalized Good-Enough Solutions to Unsatisfiable Stable Roommates Problems
Stable Roommates problems are characterized by the preferences of agents over other agents as roommates. A solution is a partition of the agents into pairs that are acceptable to each other (i.e., they are in the preference lists of each other), such that the matching is stable (i.e., there do not exist any two agents who prefer each other to their roommates, and thus block the matching). Motivated by real-world applications, and considering that stable roommates problems do not always have solutions, we continue our studies to compute "good-enough" matchings. In addition to the agents' habits and habitual preferences, we consider their networks of preferred friends, and introduce a method to generate personalized solutions to stable roommates problems. We illustrate the usefulness of our method with examples and empirical evaluations.
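The stability condition described above can be illustrated with a minimal Python sketch. This is not the authors' method; it only checks a given matching for blocking pairs. The function name `blocking_pairs`, the dictionary layout, and the four-agent instance are assumptions made for illustration, and the sketch assumes every matched agent appears in its roommate's preference list.

```python
def blocking_pairs(prefs, matching):
    """Return the pairs (a, b) that block `matching`.

    prefs:    dict mapping each agent to its preference list, best first
              (hypothetical layout, chosen for this sketch).
    matching: dict mapping each agent to its roommate, in both directions.
    """
    def prefers(agent, candidate):
        # An agent can only prefer someone it finds acceptable.
        if candidate not in prefs[agent]:
            return False
        partner = matching.get(agent)
        if partner is None:
            # Assumption: an unmatched agent prefers any acceptable agent.
            return True
        # A lower index in the preference list means more preferred.
        return prefs[agent].index(candidate) < prefs[agent].index(partner)

    agents = sorted(prefs)
    return [
        (a, b)
        for i, a in enumerate(agents)
        for b in agents[i + 1:]
        if matching.get(a) != b and prefers(a, b) and prefers(b, a)
    ]


# A classic four-agent instance with no stable matching:
# a, b, c each rank d last, and their preferences over each other are cyclic.
prefs = {
    "a": ["b", "c", "d"],
    "b": ["c", "a", "d"],
    "c": ["a", "b", "d"],
    "d": ["a", "b", "c"],
}
matching = {"a": "b", "b": "a", "c": "d", "d": "c"}
print(blocking_pairs(prefs, matching))  # [('b', 'c')]
```

In this instance each of the three possible pairings has a blocking pair, which is the kind of unsatisfiable case that motivates looking for "good-enough" matchings.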
Star Trek legend William Shatner discovers powerful new way to live forever
A groundbreaking program has now made it possible to preserve your life stories and wisdom, allowing you to speak to loved ones decades into the future. StoryFile, an innovative AI company, has developed lifelike, interactive 3D avatars that allow people to 'live on' after death, sharing memories and answering questions in the same natural and conversational manner as a real person. Individuals like philanthropist Michael Staenberg, 71, and Star Trek star William Shatner, 94, have used StoryFile to immortalize both their experiences and personalities. Staenberg, a property developer and philanthropist who has given away more than 850 million, said: 'I hope to pass my knowledge on, and the good I've created.' The technology captures video interviews, transforming them into hologram-style avatars that use generative AI, similar to ChatGPT, to respond dynamically to questions.
GEPA: Reflective Prompt Evolution Can Outperform Reinforcement Learning
Agrawal, Lakshya A, Tan, Shangyin, Soylu, Dilara, Ziems, Noah, Khare, Rishi, Opsahl-Ong, Krista, Singhvi, Arnav, Shandilya, Herumb, Ryan, Michael J, Jiang, Meng, Potts, Christopher, Sen, Koushik, Dimakis, Alexandros G., Stoica, Ion, Klein, Dan, Zaharia, Matei, Khattab, Omar
Large language models (LLMs) are increasingly adapted to downstream tasks via reinforcement learning (RL) methods like Group Relative Policy Optimization (GRPO), which often require thousands of rollouts to learn new tasks. We argue that the interpretable nature of language can often provide a much richer learning medium for LLMs, compared with policy gradients derived from sparse, scalar rewards. To test this, we introduce GEPA (Genetic-Pareto), a prompt optimizer that thoroughly incorporates natural language reflection to learn high-level rules from trial and error. Given any AI system containing one or more LLM prompts, GEPA samples system-level trajectories (e.g., reasoning, tool calls, and tool outputs) and reflects on them in natural language to diagnose problems, propose and test prompt updates, and combine complementary lessons from the Pareto frontier of its own attempts. As a result of GEPA's design, it can often turn even just a few rollouts into a large quality gain. Across four tasks, GEPA outperforms GRPO by 10% on average and by up to 20%, while using up to 35x fewer rollouts. GEPA also outperforms the leading prompt optimizer, MIPROv2, by over 10% across two LLMs, and demonstrates promising results as an inference-time search strategy for code optimization.
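As a rough illustration of what a "Pareto frontier of attempts" means, the sketch below keeps only the candidates that no other candidate beats on every task. It is a generic Pareto filter, not GEPA's actual selection procedure; the function name `pareto_front`, the candidate names, and the scores are all hypothetical.

```python
def pareto_front(scores):
    """Return the names of non-dominated candidates, sorted alphabetically.

    scores: dict mapping a candidate name to a tuple of per-task scores,
            where higher is better (hypothetical layout for this sketch).
    """
    def dominates(u, v):
        # u dominates v if it is at least as good everywhere
        # and strictly better on at least one task.
        return all(x >= y for x, y in zip(u, v)) and any(
            x > y for x, y in zip(u, v)
        )

    return sorted(
        name
        for name, s in scores.items()
        if not any(dominates(t, s) for other, t in scores.items() if other != name)
    )


# Made-up scores for four candidate prompts on two tasks.
candidates = {
    "p0": (0.5, 0.5),
    "p1": (0.8, 0.4),
    "p2": (0.4, 0.9),
    "p3": (0.3, 0.3),
}
print(pareto_front(candidates))  # ['p0', 'p1', 'p2']
```

Here `p3` is dropped because `p0` is better on both tasks, while `p1` and `p2` both survive because each is best on a different task. Keeping such complementary candidates is the general idea behind drawing on a frontier rather than a single best attempt.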
18 months. 12,000 questions. A whole lot of anxiety. What I learned from reading students' ChatGPT logs
Making new friends is hard. Finding out what trousers exist in the world other than black ones is also, apparently, hard. Fortunately, for an AI-enabled generation of students, help with the complexities of campus life is just a prompt away. If you are really stuck on an essay or can't decide between management consulting and a legal career, or need suggestions on what you can cook with tomatoes, mushrooms, beetroot, mozzarella, olive oil and rice, then ChatGPT is there. It will listen to you, analyse your inputs, and offer up a perfectly structured paper, a convincing cover letter, or a workable recipe for tomato and mushroom risotto with roasted beetroot and mozzarella. I know this because three undergraduates have given me permission to eavesdrop on every conversation they have had with ChatGPT over the past 18 months.
Interview with Kate Candon: Leveraging explicit and implicit feedback in human-robot interactions
In this interview series, we're meeting some of the AAAI/SIGAI Doctoral Consortium participants to find out more about their research. Kate Candon is a PhD student at Yale University interested in understanding how we can create interactive agents that are more effectively able to help people. We spoke to Kate to find out more about how she is leveraging explicit and implicit feedback in human-robot interactions. Specifically I'm interested in how we can get robots to better learn from humans in the way that they naturally teach. Typically, a lot of work in robot learning is with a human teacher who is only tasked with giving explicit feedback to the robot, but they're not necessarily engaged in the task.
OPeRA: A Dataset of Observation, Persona, Rationale, and Action for Evaluating LLMs on Human Online Shopping Behavior Simulation
Wang, Ziyi, Lu, Yuxuan, Li, Wenbo, Amini, Amirali, Sun, Bo, Bart, Yakov, Lyu, Weimin, Gesi, Jiri, Wang, Tian, Huang, Jing, Su, Yu, Ehsan, Upol, Alikhani, Malihe, Li, Toby Jia-Jun, Chilton, Lydia, Wang, Dakuo
Can large language models (LLMs) accurately simulate the next web action of a specific user? While LLMs have shown promising capabilities in generating "believable" human behaviors, evaluating their ability to mimic real user behaviors remains an open challenge, largely due to the lack of high-quality, publicly available datasets that capture both the observable actions and the internal reasoning of an actual human user. To address this gap, we introduce OPERA, a novel dataset of Observation, Persona, Rationale, and Action collected from real human participants during online shopping sessions. OPERA is the first public dataset that comprehensively captures: user personas, browser observations, fine-grained web actions, and self-reported just-in-time rationales. We developed both an online questionnaire and a custom browser plugin to gather this dataset with high fidelity. Using OPERA, we establish the first benchmark to evaluate how well current LLMs can predict a specific user's next action and rationale with a given persona and
CogDual: Enhancing Dual Cognition of LLMs via Reinforcement Learning with Implicit Rule-Based Rewards
Liu, Cheng, Lu, Yifei, Ye, Fanghua, Li, Jian, Chen, Xingyu, Ren, Feiliang, Tu, Zhaopeng, Li, Xiaolong
Role-Playing Language Agents (RPLAs) have emerged as a significant application direction for Large Language Models (LLMs). Existing approaches typically rely on prompt engineering or supervised fine-tuning to enable models to imitate character behaviors in specific scenarios, but often neglect the underlying cognitive mechanisms driving these behaviors. Inspired by cognitive psychology, we introduce CogDual, a novel RPLA adopting a cognize-then-respond reasoning paradigm. By jointly modeling external situational awareness and internal self-awareness, CogDual generates responses with improved character consistency and contextual alignment. To further optimize the performance, we employ reinforcement learning with two general-purpose reward schemes designed for open-domain text generation. Extensive experiments on the CoSER benchmark, as well as Cross-MR and LifeChoice, demonstrate that CogDual consistently outperforms existing baselines and generalizes effectively across diverse role-playing tasks.
Which lips do YOU think are most attractive? Scientists reveal the most desirable pout - so, do you agree?
From Angelina Jolie to Megan Fox, many celebrities are known for their luscious lips. But what exactly does the perfect pout look like? A new study has revealed the answer - and it's bad news for fans of lip fillers. Scientists from the American University of Beirut showed 200 people AI-generated pictures of a woman, whose lips had been adjusted in various ways. An analysis of their preferences revealed that the perfect pout features an upper-to-lower lip ratio (U/L) of between 0.618:1 and 1:1.
Zelenskyy says he and Trump are considering a drone 'mega-deal'
U.S. President Donald Trump and Ukrainian President Volodymyr Zelenskyy are considering a deal that involves Washington buying battlefield-tested Ukrainian drones in exchange for Kyiv purchasing weapons from the U.S., Zelenskyy said in an interview with the New York Post. Zelenskyy said his latest talks with Trump focused on a deal that would help each country bolster its aerial technology. Ukrainian drones have been able to strike targets as deep as 1,300 kilometers into Russian territory. "The people of America need this technology, and you need to have it in your arsenal," Zelenskyy told the Post in the interview conducted Wednesday. The Ukrainian leader said drones were the key tool that has allowed his country to fight off Russia's invasion for more than three years.
"Is it always watching? Is it always listening?" Exploring Contextual Privacy and Security Concerns Toward Domestic Social Robots
Bell, Henry, Kwesi, Jabari, Laabadli, Hiba, Emami-Naeini, Pardis
Equipped with artificial intelligence (AI) and advanced sensing capabilities, social robots are gaining interest among consumers in the United States. These robots seem like a natural evolution of traditional smart home devices. However, their extensive data collection capabilities, anthropomorphic features, and capacity to interact with their environment make social robots a more significant security and privacy threat. Increased risks include data linkage, unauthorized data sharing, and the physical safety of users and their homes. It is critical to investigate U.S. users' security and privacy needs and concerns to guide the design of social robots while these devices are still in the early stages of commercialization in the U.S. market. Through 19 semi-structured interviews, we identified significant security and privacy concerns, highlighting the need for transparency, usability, and robust privacy controls to support adoption. For educational applications, participants worried most about misinformation, and in medical use cases, they worried about the reliability of these devices. Participants were also concerned with the data inference that social robots could enable. We found that participants expect tangible privacy controls, indicators of data collection, and context-appropriate functionality.