Goto

Collaborating Authors

 Roissy


Flights returning to normal after Airbus warning grounded planes

BBC News

Thousands of Airbus planes are being returned to normal service after being grounded for hours due to a warning that solar radiation could interfere with onboard flight control computers. The aerospace giant - based in France - said around 6,000 of its A320 planes had been affected with most requiring a quick software update. Some 900 older planes need a replacement computer. French Transport Minister Philippe Tabarot said the updates went very smoothly for more than 5,000 planes. Fewer than 100 aircraft still needed the update, Airbus had told him, according to local media.


4 Arrested Over Scattered Spider Hacking Spree

WIRED

WIRED reported this week on public records that show the United States Department of Homeland Security urging local law enforcement around the country to interpret common protest activities and surrounding logistics--including riding a bike, livestreaming a police encounter, or skateboarding--as "violent tactics." The guidance could influence cops to use everyday behavior as a pretext for police action. An AI hiring bot used on the McDonald's "McHire" site exposed tens of millions of job applicants' personal data because of a group of web-based security vulnerabilities--including use of the classically guessable password "123456" on an administrator account. The site's chatbot, known as Olivia, was built by the artificial intelligence software firm Paradox.ai. Meanwhile, in the wake of last week's devastating floods in Texas that killed at least 120 people, conspiracy theories about the extreme weather event have gained enough traction among anti-government extremists, GOP influencers, and others with large platforms to produce real-world consequences like death threats.


Urban Region Embeddings from Service-Specific Mobile Traffic Data

arXiv.org Artificial Intelligence

--With the advent of advanced 4G/5G mobile networks, mobile phone data collected by operators now includes detailed, service-specific traffic information with high spatiotemporal resolution. In this paper, we leverage this type of data to explore its potential for generating high-quality representations of urban regions. T o achieve this, we present a methodology for creating urban region embeddings from service-specific mobile traffic data, employing a temporal convolutional network-based autoencoder, transformers, and learnable weighted sum models to capture key urban features. In the extensive experimental evaluation conducted using a real-world dataset, we demonstrate that the embeddings generated by our methodology effectively capture urban characteristics. Specifically, our embeddings are compared against those of a state-of-the-art competitor across two downstream tasks. Additionally, through clustering techniques, we investigate how well the embeddings produced by our methodology capture the temporal dynamics and characteristics of the underlying urban regions. Overall, this work highlights the potential of service-specific mobile traffic data for urban research and emphasizes the importance of making such data accessible to support public innovation. Mobile phone activity data is a well-established and widely explored type of mobility data used in various applications, including mobility, health, socio-economic, and demographic studies. In the past years, mobile phone data was typically studied in the form of Call Detail Records (CDRs), which capture users' connections to cell towers during calls or messaging activities. However, this type of data is often sparse and irregular, limiting its potential for broader and more scalable applications. With the rise of 4G/5G cellular networks, mobile phone usage has shifted towards extensive use of data services, such as mobile applications, which generate massive volumes of data traffic. The information related to the data traffic volume generated by these services can offer rich spatio-temporal details and insights into the characteristics of the underlying urban regions. To this end, in this work, we consider the NetMob 2023 dataset [1], which provides detailed data on mobile traffic volume across multiple data services. Orange, the mobile operator providing the dataset, recorded upload and download traffic for 68 different mobile applications across 20 major French cities.


Navigating the Unknown: A Chat-Based Collaborative Interface for Personalized Exploratory Tasks

arXiv.org Artificial Intelligence

The rise of large language models (LLMs) has revolutionized user interactions with knowledge-based systems, enabling chatbots to synthesize vast amounts of information and assist with complex, exploratory tasks. However, LLM-based chatbots often struggle to provide personalized support, particularly when users start with vague queries or lack sufficient contextual information. This paper introduces the Collaborative Assistant for Personalized Exploration (CARE), a system designed to enhance personalization in exploratory tasks by combining a multi-agent LLM framework with a structured user interface. CARE's interface consists of a Chat Panel, Solution Panel, and Needs Panel, enabling iterative query refinement and dynamic solution generation. The multi-agent framework collaborates to identify both explicit and implicit user needs, delivering tailored, actionable solutions. In a within-subject user study with 22 participants, CARE was consistently preferred over a baseline LLM chatbot, with users praising its ability to reduce cognitive load, inspire creativity, and provide more tailored solutions. Our findings highlight CARE's potential to transform LLM-based systems from passive information retrievers to proactive partners in personalized problem-solving and exploration.


Few-shot Prompting for Pairwise Ranking: An Effective Non-Parametric Retrieval Model

arXiv.org Artificial Intelligence

A supervised ranking model, despite its advantage of being effective, usually involves complex processing - typically multiple stages of task-specific pre-training and fine-tuning. This has motivated researchers to explore simpler pipelines leveraging large language models (LLMs) that are capable of working in a zero-shot manner. However, since zero-shot inference does not make use of a training set of pairs of queries and their relevant documents, its performance is mostly worse than that of supervised models, which are trained on such example pairs. Motivated by the existing findings that training examples generally improve zero-shot performance, in our work, we explore if this also applies to ranking models. More specifically, given a query and a pair of documents, the preference prediction task is improved by augmenting examples of preferences for similar queries from a training set. Our proposed pairwise few-shot ranker demonstrates consistent improvements over the zero-shot baseline on both in-domain (TREC DL) and out-domain (BEIR subset) retrieval benchmarks. Our method also achieves a close performance to that of a supervised model without requiring any complex training pipeline.


A Regression Mixture Model to understand the effect of the Covid-19 pandemic on Public Transport Ridership

arXiv.org Artificial Intelligence

The Covid-19 pandemic drastically changed urban mobility, both during the height of the pandemic with government lockdowns, but also in the longer term with the adoption of working-from-home policies. To understand its effects on rail public transport ridership, we propose a dedicated Regression Mixture Model able to perform both the clustering of public transport stations and the segmentation of time periods, while ignoring variations due to additional variables such as the official lockdowns or non-working days. Each cluster is thus defined by a series of segments in which the effect of the exogenous variables is constant. As each segment within a cluster has its own regression coefficients to model the impact of the covariates, we analyze how these coefficients evolve to understand the changes in the cluster. We present the regression mixture model and the parameter estimation using the EM algorithm, before demonstrating the benefits of the model on both simulated and real data. Thanks to a five-year dataset of the ridership in the Paris public transport system, we analyze the impact of the pandemic, not only in terms of the number of travelers but also on the weekly commute. We further analyze the specific changes that the pandemic caused inside each cluster.


Generative Judge for Evaluating Alignment

arXiv.org Artificial Intelligence

The rapid development of Large Language Models (LLMs) has substantially expanded the range of tasks they can address. In the field of Natural Language Processing (NLP), researchers have shifted their focus from conventional NLP tasks (e.g., sequence tagging and parsing) towards tasks that revolve around aligning with human needs (e.g., brainstorming and email writing). This shift in task distribution imposes new requirements on evaluating these aligned models regarding generality (i.e., assessing performance across diverse scenarios), flexibility (i.e., examining under different protocols), and interpretability (i.e., scrutinizing models with explanations). In this paper, we propose a generative judge with 13B parameters, Auto-J, designed to address these challenges. Our model is trained on user queries and LLM-generated responses under massive real-world scenarios and accommodates diverse evaluation protocols (e.g., pairwise response comparison and single-response evaluation) with well-structured natural language critiques. To demonstrate the efficacy of our approach, we construct a new testbed covering 58 different scenarios. Experimentally, Auto-J outperforms a series of strong competitors, including both open-source and closed-source models, by a large margin. We also provide detailed analysis and case studies to further reveal the potential of our method and make a variety of resources public at https://github.com/GAIR-NLP/auto-j.


Rank-without-GPT: Building GPT-Independent Listwise Rerankers on Open-Source Large Language Models

arXiv.org Artificial Intelligence

Listwise rerankers based on large language models (LLM) are the zero-shot state-of-the-art. However, current works in this direction all depend on the GPT models, making it a single point of failure in scientific reproducibility. Moreover, it raises the concern that the current research findings only hold for GPT models but not LLM in general. In this work, we lift this pre-condition and build for the first time effective listwise rerankers without any form of dependency on GPT. Our passage retrieval experiments show that our best list se reranker surpasses the listwise rerankers based on GPT-3.5 by 13% and achieves 97% effectiveness of the ones built on GPT-4. Our results also show that the existing training datasets, which were expressly constructed for pointwise ranking, are insufficient for building such listwise rerankers. Instead, high-quality listwise ranking data is required and crucial, calling for further work on building human-annotated listwise data resources.


A Branched Deep Convolutional Network for Forecasting the Occurrence of Hazes in Paris using Meteorological Maps with Different Characteristic Spatial Scales

arXiv.org Artificial Intelligence

A deep learning platform has been developed to forecast the occurrence of the low visibility events or hazes. It is trained by using multi-decadal daily regional maps of various meteorological and hydrological variables as input features and surface visibility observations as the targets. To better preserve the characteristic spatial information of different input features for training, two branched architectures have recently been developed for the case of Paris hazes. These new architectures have improved the performance of the network, producing reasonable scores in both validation and a blind forecasting evaluation using the data of 2021 and 2022 that have not been used in the training and validation.


Applied metamodelling for ATM performance simulations

arXiv.org Artificial Intelligence

The use of Air traffic management (ATM) simulators for planing and operations can be challenging due to their modelling complexity. This paper presents XALM (eXplainable Active Learning Metamodel), a three-step framework integrating active learning and SHAP (SHapley Additive exPlanations) values into simulation metamodels for supporting ATM decision-making. XALM efficiently uncovers hidden relationships among input and output variables in ATM simulators, those usually of interest in policy analysis. Our experiments show XALM's predictive performance comparable to the XGBoost metamodel with fewer simulations. Additionally, XALM exhibits superior explanatory capabilities compared to non-active learning metamodels. Using the `Mercury' (flight and passenger) ATM simulator, XALM is applied to a real-world scenario in Paris Charles de Gaulle airport, extending an arrival manager's range and scope by analysing six variables. This case study illustrates XALM's effectiveness in enhancing simulation interpretability and understanding variable interactions. By addressing computational challenges and improving explainability, XALM complements traditional simulation-based analyses. Lastly, we discuss two practical approaches for reducing the computational burden of the metamodelling further: we introduce a stopping criterion for active learning based on the inherent uncertainty of the metamodel, and we show how the simulations used for the metamodel can be reused across key performance indicators, thus decreasing the overall number of simulations needed.