 booking


Non-Collaborative User Simulators for Tool Agents

Shim, Jeonghoon, Song, Woojung, Jin, Cheyon, KooK, Seungwon, Jo, Yohan

arXiv.org Artificial Intelligence

Tool agents interact with users through multi-turn dialogues to accomplish various tasks. Recent studies have adopted user simulation methods to develop these agents in multi-turn settings. However, existing user simulators tend to be agent-friendly, exhibiting only cooperative behaviors, which fails to train and test agents against non-collaborative users in the real world. To address this, we propose a novel user simulator architecture that simulates four categories of non-collaborative behaviors: requesting unavailable services, digressing into tangential conversations, expressing impatience, and providing incomplete utterances. Our user simulator can simulate challenging and natural non-collaborative behaviors while reliably delivering all intents and information necessary to accomplish the task. Our experiments on MultiWOZ and $τ$-bench reveal significant performance degradation in state-of-the-art tool agents when encountering non-collaborative users. We provide detailed analyses of agents' weaknesses under each non-collaborative condition, such as escalated hallucinations and dialogue breakdowns. Ultimately, we contribute an easily extensible user simulation framework to help the research community develop tool agents and preemptively diagnose them under challenging real-world conditions within their own services.


Hotel adverts banned over misleadingly cheap rooms

BBC News

Adverts by four of Britain's biggest hotel and travel firms have been banned for stating misleading minimum prices for rooms. The Advertising Standards Authority (ASA) upheld complaints against the Hilton hotel group, Travelodge, Booking.com and Accor over their use of eye-catching so-called "from" prices. The watchdog found only a small number of rooms were actually available to book at the promoted price and concluded the adverts overstated the deals. It said this was unfair on those looking for good deals or seeking to make informed choices about where to book. ASA operations manager Emily Henwood said: "Advertised prices must match what's really available."


Breaking the Cycle of Incarceration With Targeted Mental Health Outreach: A Case Study in Machine Learning for Public Policy

Rodolfa, Kit T., Salomon, Erika, Yao, Jin, Yoder, Steve, Sullivan, Robert, McGuire, Kevin, Dickinson, Allie, MacDougall, Rob, Seidler, Brian, Sung, Christina, Herdeman, Claire, Ghani, Rayid

arXiv.org Artificial Intelligence

Many incarcerated individuals face significant and complex challenges, including mental illness, substance dependence, and homelessness, yet jails and prisons are often poorly equipped to address these needs. With little support from the existing criminal justice system, these needs can remain untreated and worsen, often leading to further offenses and a cycle of incarceration with adverse outcomes both for the individual and for public safety, with particularly large impacts on communities of color that continue to widen the already extensive racial disparities in criminal justice outcomes. Responding to these failures, a growing number of criminal justice stakeholders are seeking to break this cycle through innovative approaches such as community-driven and alternative approaches to policing, mentoring, community building, restorative justice, pretrial diversion, holistic defense, and social service connections. Here we report on a collaboration between Johnson County, Kansas, and Carnegie Mellon University to perform targeted, proactive mental health outreach in an effort to reduce reincarceration rates. This paper describes the data used, our predictive modeling approach and results, as well as the design and analysis of a field trial conducted to confirm our model's predictive power, evaluate the impact of this targeted outreach, and understand at what level of reincarceration risk outreach might be most effective. Through this trial, we find that our model is highly predictive of new jail bookings, with more than half of individuals in the trial's highest-risk group returning to jail in the following year. Outreach was most effective among these highest-risk individuals, with impacts on mental health utilization, EMS dispatches, and criminal justice involvement.


Get 60% off a lifetime of flights and hotels for life

Popular Science

If you've ever booked a hotel or flight only to see the price drop the very next day, or have been putting your travel dreams on hold altogether because of the prices, your worries are over. OneAir is a members-only, all-in-one AI-powered travel platform built for modern travelers. Their intelligent search engine scans and tracks millions of flight and hotel deals from your home airport to leading global destinations. Members receive instant mobile and email alerts as soon as travel prices drop. A lifetime subscription is now available to new users for just $59.99 when you use code FLY30 at checkout.


The AI that sees flight deals before Google even wakes up

Popular Science

And yet, booking a cheap flight still feels like a shot in the dark. That is, unless you've tapped into this lesser-known AI tool that's helping travelers score the best travel deals before the big sites can catch up. OneAir is a members-only service that gives you access to hidden rates for airlines and hotels. What are the prices like? The lifetime membership fee is normally $99.99, but you can use code FLY30 at checkout to save 30 percent this week: $69.99! Summer is approaching, and you may be eager to book a fun getaway to de-stress from … everything happening right now.


Predicting Potential Customer Support Needs and Optimizing Search Ranking in a Two-Sided Marketplace

Kim, Do-kyum, Zhao, Han, Gao, Huiji, He, Liwei, Haldar, Malay, Katariya, Sanjeev

arXiv.org Artificial Intelligence

Airbnb is an online marketplace that connects hosts and guests to unique stays and experiences. When guests stay at homes booked on Airbnb, a small fraction of stays require help from Airbnb's Customer Support (CS), which may inconvenience guests and hosts and require Airbnb resources to resolve. In this work, we show that instances where CS support is needed may be predicted based on hosts' and guests' behavior. We build a model to predict the likelihood of CS support needs for each match of guest and host. The model score is incorporated into Airbnb's search ranking algorithm as one of the many factors. The change promotes more reliable matches in search results and significantly reduces bookings that require CS support.


SPADE: Systematic Prompt Framework for Automated Dialogue Expansion in Machine-Generated Text Detection

Li, Haoyi, Yuan, Angela Yifei, Han, Soyeon Caren, Leckie, Christopher

arXiv.org Artificial Intelligence

The increasing capability of large language models (LLMs) to generate synthetic content has heightened concerns about their misuse, driving the development of Machine-Generated Text (MGT) detection models. However, these detectors face significant challenges due to the lack of systematically generated, high-quality datasets for training. To address this issue, we propose five novel data augmentation frameworks for synthetic user dialogue generation through a structured prompting approach, reducing the costs associated with traditional data collection methods. Our proposed method yields 14 new dialogue datasets, which we benchmark against seven MGT detection models. The results demonstrate improved generalization performance when utilizing a mixed dataset produced by our proposed augmentation framework. Furthermore, considering that real-world agents lack knowledge of future opponent utterances, we simulate online dialogue detection and examine the relationship between chat history length and detection accuracy. We also benchmark online detection performance with limited chat history on our frameworks. Our open-source datasets can be downloaded from https://github.com/AngieYYF/SPADE-customer-service-dialogue.


Costco expands travel benefit by rolling out artificial intelligence to members

FOX News

Costco is rolling out new ways to deliver perks to its customers while tapping into the travel industry's knowledge and insight. In collaboration with Travelport, a global technology company that connects travel suppliers, Costco Travel has introduced new features for members. The partnership will expand the flight options available to members.


Using Contextually Aligned Online Reviews to Measure LLMs' Performance Disparities Across Language Varieties

Tang, Zixin, Huang, Chieh-Yang, Li, Tsung-Chi, Ng, Ho Yin Sam, Huang, Hen-Hsen, Huang, Ting-Hao 'Kenneth'

arXiv.org Artificial Intelligence

A language can have different varieties. These varieties can affect the performance of natural language processing (NLP) models, including large language models (LLMs), which are often trained on data from widely spoken varieties. This paper introduces a novel and cost-effective approach to benchmark model performance across language varieties. We argue that international online review platforms, such as Booking.com, can serve as effective data sources for constructing datasets that capture comments in different language varieties from similar real-world scenarios, like reviews for the same hotel with the same rating using the same language (e.g., Mandarin Chinese) but different language varieties (e.g., Taiwan Mandarin, Mainland Mandarin). To prove this concept, we constructed a contextually aligned dataset comprising reviews in Taiwan Mandarin and Mainland Mandarin and tested six LLMs in a sentiment analysis task. Our results show that LLMs consistently underperform in Taiwan Mandarin.


Maturity Framework for Enhancing Machine Learning Quality

Castelli, Angelantonio, Chouliaras, Georgios Christos, Goldenberg, Dmitri

arXiv.org Artificial Intelligence

With the rapid integration of Machine Learning (ML) in business applications and processes, it is crucial to ensure the quality, reliability and reproducibility of such systems. We suggest a methodical approach towards ML system quality assessment and introduce a structured maturity framework for governance of ML. We emphasize the importance of quality in ML and the need for rigorous assessment, driven by issues in ML governance and gaps in existing frameworks. Our primary contribution is a comprehensive open-sourced quality assessment method, validated with empirical evidence, accompanied by a systematic maturity framework tailored to ML systems. Drawing from applied experience at Booking.com, we discuss challenges and lessons learned during large-scale adoption within organizations. The study presents empirical findings, highlighting quality improvement trends and showcasing business outcomes. The maturity framework for ML systems aims to become a valuable resource to reshape industry standards and enable a structural approach to improve ML maturity in any organization.