Law
Dynamic Assortment Selection and Pricing with Censored Preference Feedback
In this study, we investigate the problem of dynamic multi-product selection and pricing by introducing a novel framework based on a \textit{censored multinomial logit} (C-MNL) choice model. In this model, sellers present a set of products with prices, and buyers filter out products priced above their valuation, purchasing at most one product from the remaining options based on their preferences. The goal is to maximize seller revenue by dynamically adjusting product offerings and prices, while learning both product valuations and buyer preferences through purchase feedback. To achieve this, we propose a Lower Confidence Bound (LCB) pricing strategy. By combining this pricing strategy with either an Upper Confidence Bound (UCB) or Thompson Sampling (TS) product selection approach, our algorithms achieve regret bounds of $\tilde{O}(d^{\frac{3}{2}}\sqrt{T/\kappa})$ and $\tilde{O}(d^{2}\sqrt{T/\kappa})$, respectively. Finally, we validate the performance of our methods through simulations, demonstrating their effectiveness.
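The censoring step described in the abstract can be made concrete with a small sketch. It is illustrative only and is not the paper's implementation: it assumes a standard MNL form with an outside (no-purchase) option of weight 1, and the names `cmnl_choice_probs`, `expected_revenue`, `valuations`, and `utilities` are hypothetical. It also hints at why a conservative (lower-bound) price might be attractive: a product priced above the buyer's valuation is filtered out and earns nothing.

```python
import math

def cmnl_choice_probs(prices, valuations, utilities):
    """Purchase probability per offered product, plus the no-purchase probability.

    Assumed form (not taken from the paper): products priced above the buyer's
    valuation are censored; the rest compete under a standard MNL with an
    outside option of weight exp(0) = 1.
    """
    # Censoring step: keep only products whose price does not exceed the valuation.
    kept = [i for i, (p, v) in enumerate(zip(prices, valuations)) if p <= v]
    weights = {i: math.exp(utilities[i]) for i in kept}
    denom = 1.0 + sum(weights.values())
    probs = [weights.get(i, 0.0) / denom for i in range(len(prices))]
    return probs, 1.0 / denom

def expected_revenue(prices, valuations, utilities):
    """Expected revenue of the offered assortment under the assumed model."""
    probs, _ = cmnl_choice_probs(prices, valuations, utilities)
    return sum(p * q for p, q in zip(prices, probs))

# Product 1 is priced above its valuation, so it is censored and never bought.
probs, no_purchase = cmnl_choice_probs([1.0, 3.0], [2.0, 2.0], [0.0, 0.0])
revenue = expected_revenue([1.0, 3.0], [2.0, 2.0], [0.0, 0.0])
```

In this toy case only the first product survives the filter, so the buyer chooses between it and not purchasing at all.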
Multi-Modal Framing Analysis of News
Arora, Arnav, Yadav, Srishti, Antoniak, Maria, Belongie, Serge, Augenstein, Isabelle
Automated frame analysis of political communication is a popular task in computational social science that is used to study how authors select aspects of a topic to frame its reception. So far, such studies have been narrow, in that they use a fixed set of pre-defined frames and focus only on the text, ignoring the visual contexts in which those texts appear. Especially for framing in the news, this leaves out valuable information about editorial choices, which include not just the written article but also accompanying photographs. To overcome such limitations, we present a method for conducting multi-modal, multi-label framing analysis at scale using large (vision-)language models. Grounding our work in framing theory, we extract latent meaning embedded in images used to convey a certain point and contrast that to the text by comparing the respective frames used. We also identify highly partisan framing of topics with issue-specific frame analysis found in prior qualitative work. We demonstrate a method for doing scalable integrative framing analysis of both text and image in news, providing a more complete picture for understanding media bias.
Elon Musk Lost His Big Bet
Last night, X's "For You" algorithm offered me up what felt like a dispatch from an alternate universe. It was a post from Elon Musk, originally published hours earlier. "This is the first time humans have been in orbit around the poles of the Earth!" he wrote. Underneath his post was a video shared by SpaceX--footage of craggy ice caps, taken by the company's Dragon spacecraft during a private mission. Taken on its own, the video is genuinely captivating.
A bestseller is born: How Zuckerberg discovered the Streisand Effect
Feedback is New Scientist's popular sideways look at the latest science and technology news. You can submit items you believe may amuse readers to Feedback by emailing feedback@newscientist.com. Some things are sadly inevitable: death, taxes, another Coldplay album. One such inevitability, long since proved beyond any reasonable doubt, is that if you try to suppress an embarrassing story, you will only draw more attention to it. This phenomenon is called the Streisand Effect, after an incident in 2003 when Barbra Streisand sued to have an aerial photograph taken off the internet.
Exploring Data Scaling Trends and Effects in Reinforcement Learning from Human Feedback
Shen, Wei, Liu, Guanlin, Wu, Zheng, Zhu, Ruofei, Yang, Qingping, Xin, Chao, Yue, Yu, Yan, Lin
Reinforcement Learning from Human Feedback (RLHF) is crucial for aligning large language models with human preferences. While recent research has focused on algorithmic improvements, the importance of prompt-data construction has been overlooked. This paper addresses this gap by exploring data-driven bottlenecks in RLHF performance scaling, particularly reward hacking and decreasing response diversity. We introduce a hybrid reward system combining reasoning task verifiers (RTV) and a generative reward model (GenRM) to mitigate reward hacking. We also propose a novel prompt-selection method, Pre-PPO, to maintain response diversity and enhance learning effectiveness. Additionally, we find that prioritizing mathematical and coding tasks early in RLHF training significantly improves performance. Experiments across two model sizes validate our methods' effectiveness and scalability. Results show that RTV is most resistant to reward hacking, followed by GenRM with ground truth, and then GenRM with SFT Best-of-N responses. Our strategies enable rapid capture of subtle task-specific distinctions, leading to substantial improvements in overall RLHF performance. This work highlights the importance of careful data construction and provides practical methods to overcome performance barriers in RLHF.
British authors want Meta to answer for alleged copyright infringement
A March 20 article in The Atlantic served as the letter's impetus. It reported that Meta had used LibGen, a pirated collection of over 7.5 million books, to train its AI models. Anyone on the internet over the last few weeks has likely seen videos of distraught authors learning that their work is available on the database (and potentially used by Meta without their permission). A lawsuit in the US alleges Meta CEO Mark Zuckerberg approved the use of LibGen's data to train its AI. The lawsuit's plaintiffs include writers Sarah Silverman and Ta-Nehisi Coates.
Cooper: A Library for Constrained Optimization in Deep Learning
Gallego-Posada, Jose, Ramirez, Juan, Hashemizadeh, Meraj, Lacoste-Julien, Simon
Cooper is an open-source package for solving constrained optimization problems involving deep learning models. Cooper implements several Lagrangian-based first-order update schemes, making it easy to combine constrained optimization algorithms with high-level features of PyTorch such as automatic differentiation, and specialized deep learning architectures and optimizers. Although Cooper is specifically designed for deep learning applications where gradients are estimated based on mini-batches, it is suitable for general non-convex continuous constrained optimization. Cooper's source code is available at https://github.com/cooper-org/cooper.
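To illustrate what a Lagrangian-based first-order update scheme looks like, here is a minimal sketch of projected gradient descent-ascent on a toy problem. It deliberately uses plain Python and does not call Cooper's API; the function name `solve_with_gda`, the toy problem, and the step size are assumptions chosen for illustration. The problem is: minimize x^2 + y^2 subject to x + y >= 1, written as g(x, y) = 1 - x - y <= 0.

```python
def solve_with_gda(lr=0.05, steps=2000):
    """Simultaneous gradient descent-ascent on L(x, y, lam) = x^2 + y^2 + lam * g(x, y).

    Primal variables (x, y) take a descent step; the multiplier lam takes an
    ascent step and is projected back onto lam >= 0.
    """
    x, y, lam = 0.0, 0.0, 0.0
    for _ in range(steps):
        g = 1.0 - x - y            # constraint value; feasible when g <= 0
        grad_x = 2.0 * x - lam     # dL/dx
        grad_y = 2.0 * y - lam     # dL/dy
        x -= lr * grad_x           # primal descent
        y -= lr * grad_y
        lam = max(0.0, lam + lr * g)  # dual ascent with projection
    return x, y, lam

x, y, lam = solve_with_gda()
```

For this convex toy problem the iterates approach the constrained optimum x = y = 0.5 with multiplier lam = 1. In a deep learning setting the hand-written gradients would come from automatic differentiation over mini-batches, which is the use case the abstract describes.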
Making Sense of Robots in Public Spaces: A Study of Trash Barrel Robots
Bu, Fanjun, Fischer, Kerstin, Ju, Wendy
In this work, we analyze video data and interviews from a public deployment of two trash barrel robots in a large public space to better understand the sensemaking activities people perform when they encounter robots in public spaces. Based on an analysis of 274 human-robot interactions and interviews with N=65 individuals or groups, we found that people were responding not only to the robots or their behavior, but also to the general idea of deploying robots as trashcans, and the larger social implications of that idea. They wanted to understand details about the deployment because having that knowledge would change how they interact with the robot. Based on our data and analysis, we provide implications for design that future human-robot interaction researchers exploring robots for public space deployment may wish to take up. Furthermore, our work offers a practical example of analyzing field data to make sense of robots in public spaces.
Benchmarking Federated Machine Unlearning methods for Tabular Data
Xiao, Chenguang, Ghosh, Abhirup, Wu, Han, Wang, Shuo, van Thiel, Diederick
Machine unlearning, which enables a model to forget specific data upon request, is increasingly relevant in the era of privacy-centric machine learning, particularly within federated learning (FL) environments. This paper presents a pioneering study on benchmarking machine unlearning methods within a federated setting for tabular data, addressing the unique challenges posed by cross-silo FL where data privacy and communication efficiency are paramount. We explore unlearning at the feature and instance levels, employing both random forest and logistic regression models. Our methodology benchmarks various unlearning algorithms, including fine-tuning and gradient-based approaches, across multiple datasets, with metrics focused on fidelity, certifiability, and computational efficiency. Experiments demonstrate that while fidelity remains high across methods, tree-based models excel in certifiability, ensuring exact unlearning, whereas gradient-based methods show improved computational efficiency. This study provides critical insights into the design and selection of unlearning algorithms tailored to the FL environment, offering a foundation for further research in privacy-preserving machine learning.
Simulation of Autonomous Industrial Vehicle Fleet Using Fuzzy Agents: Application to Task Allocation and Battery Charge Management
Grosset, Juliette, Fougères, Alain-Jérôme, Oukacha, Ouzna, Djoko-Kouam, Moïse, Bonnin, Jean-Marie
Abstract: The research introduces a multi-agent simulation that uses fuzzy inference to investigate the work distribution and battery charging control of mobile baggage conveyor robots in an airport in a comprehensive manner. Thanks to a distributed system, this simulation approach provides high adaptability, adjusting to changes in conveyor agent availability, battery capacity, awareness of the activities of the conveyor fleet, and knowledge of the context of infrastructure resource availability. Dynamic factors, such as workload variations and communication between the conveyor agents and infrastructure, are considered as heuristics, highlighting the importance of flexible and collaborative approaches in autonomous systems. The results highlight the effectiveness of adaptive fuzzy multi-agent models to optimize dynamic task allocation, adapt to the variation of baggage arrival flows, improve the overall operational efficiency of conveyor agents, and reduce their energy consumption.
Keywords: autonomous industrial vehicle, agent-based simulation, fuzzy agent, dynamic task allocation, battery charge management, Airport 4.0
1. INTRODUCTION
The implementation of fleets of Autonomous Industrial Vehicles (AIV) in the context of Airport 4.0 presents a number of challenges, all of which are connected to the true degree of autonomy of these vehicles: employee acceptance, vehicle localization, traffic flow, failure detection, collision avoidance, and vehicle perception in dynamic environments. The different limitations and specifications developed by producers and potential consumers of these AIVs might be taken into consideration thanks to simulation.